Most Recent S Name Movers

Custom Dataset for Circuit Discovery

Modify copy circuits code

Outline:

Attention Patterns, Scatterplot and Copy Scores

<<<<<<<<<<<<<<<

Attention Patterns

most_recent_S_attn_pat.ipynb

https://colab.research.google.com/drive/1KaqcS92-BI4FZ7m-r8rCW9tIovxA_s93#scrollTo=M93Hy1XdqNKm&line=1&uniqifier=1

For each attention head that matters, inspect its attention pattern to see which tokens the head is active on (dest tokens) and which tokens those attend to (source tokens).
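A minimal sketch of this check, assuming the head's attention pattern has already been extracted as a square matrix indexed [dest][src]. The helper name `top_sources`, the example prompt tokens, and every weight below are made up for illustration; they are not outputs of the notebook or of any real head.

```python
def top_sources(pattern, tokens, dest, k=2):
    """Return the k (source token, weight) pairs that position `dest` attends to most.

    pattern[d][s] is the attention weight from destination position d
    to source position s (each row sums to 1 under softmax attention).
    """
    row = pattern[dest]
    # Causal attention: a destination can only attend to positions <= dest.
    ranked = sorted(range(dest + 1), key=lambda s: row[s], reverse=True)
    return [(tokens[s], row[s]) for s in ranked[:k]]


# Hypothetical prompt with two subjects; "Bob" is the latest one.
tokens = ["Ann", "met", "Bob", "whose", "child", "is"]

# Illustrative (invented) attention weights for a single head.
pattern = [
    [1.00, 0.00, 0.00, 0.00, 0.00, 0.00],
    [0.60, 0.40, 0.00, 0.00, 0.00, 0.00],
    [0.30, 0.20, 0.50, 0.00, 0.00, 0.00],
    [0.10, 0.10, 0.60, 0.20, 0.00, 0.00],
    [0.25, 0.05, 0.50, 0.10, 0.10, 0.00],
    [0.10, 0.05, 0.70, 0.05, 0.05, 0.05],
]

last = len(tokens) - 1
print(tokens[last], "->", top_sources(pattern, tokens, last, k=1))
print(tokens[last - 1], "->", top_sources(pattern, tokens, last - 1, k=2))
```

With a real model, `pattern` would be the [dest, src] slice for one head and one prompt, taken from whatever attention cache the notebook uses.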

L8H11:

The last token “is” (dest) attends to the latest subject (src)

The second-to-last token “child” attends to both the 2nd-latest and the latest subject, but more so to the latest

But in a negative name mover head, the last token “is” (dest) ALSO attends to the latest subject (src), so the attention pattern alone does not tell the two head types apart