test_prompt_most_recent_S.ipynb

https://colab.research.google.com/drive/1vzE3nGJm78E1SVJoCpKgl3D3HowdRXHp#scrollTo=qbjOx0YTJ2FP

most_recent_S_attn_pat.ipynb

https://colab.research.google.com/drive/1KaqcS92-BI4FZ7m-r8rCW9tIovxA_s93#scrollTo=VcFgqbcF4YvI

most_recent_S_name_movers_DRAFT.ipynb

https://colab.research.google.com/drive/1NCBOLPx038FxwEacmHDsCesWIAW1z8kU

Path patching identifies how attention heads move information from the inputs to downstream heads. These prompts take a subject via in-context learning and output that subject. Based on the IOI findings, we expect to find analogous head classes (e.g., name mover heads).

This can be done with just GPT-2-small.
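To make the path-patching idea concrete, here is a minimal toy sketch (scalar stand-ins for heads, not the notebooks' actual TransformerLens code): run the model on a clean and a corrupted input, then patch the corrupted activation into only one edge (head A → head B) while leaving every other path clean, and measure the change in the output.

```python
# Toy path patching: two scalar "heads" stand in for attention heads.
# head_a reads the input; head_b reads both the input and head_a's output.
def head_a(x):
    return 2.0 * x           # toy head: scales the input

def head_b(x, a_out):
    return x + 3.0 * a_out   # toy head: mixes input with head_a's output

def logit(x, a_out=None):
    # If a_out is given, head_b receives a patched activation instead of
    # the one head_a would compute on this input.
    a = head_a(x) if a_out is None else a_out
    return head_b(x, a)

x_clean, x_corr = 1.0, 0.0

clean = logit(x_clean)                         # fully clean run
# Path patch: only the head_a -> head_b edge carries the corrupted
# activation; the direct input -> head_b edge stays clean.
patched = logit(x_clean, a_out=head_a(x_corr))

effect = clean - patched  # contribution of the head_a -> head_b path
```

The point of patching a single edge (rather than ablating head A everywhere) is that it isolates the information flowing along that one path, which is what lets path patching attribute behavior to specific head-to-head connections.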


Rewritten dataset: https://colab.research.google.com/drive/1NCBOLPx038FxwEacmHDsCesWIAW1z8kU#scrollTo=qau6bOQRXcrB&line=12&uniqifier=1

Rewritten copy_scores function:

https://colab.research.google.com/drive/1NCBOLPx038FxwEacmHDsCesWIAW1z8kU#scrollTo=-2rIAnfFqv62&line=5&uniqifier=1
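As a reference point for what a copy_scores-style metric computes, here is a hedged sketch with random toy weights (the real function operates on GPT-2's per-head OV matrices and unembedding; all names and shapes below are illustrative assumptions): push each token's embedding through the head's OV circuit, unembed, and count how often the original token lands in the top-k logits.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_vocab = 64, 50

W_E = rng.normal(size=(d_vocab, d_model))  # toy embedding matrix
W_U = W_E.T                                # tied unembedding (assumption)

def copy_score(W_OV, token_ids, k=5):
    # Fraction of tokens whose embedding, passed through the head's OV
    # circuit and unembedded, places the original token in the top-k logits.
    hits = 0
    for t in token_ids:
        logits = W_E[t] @ W_OV @ W_U
        if t in np.argsort(logits)[-k:]:
            hits += 1
    return hits / len(token_ids)

# An identity OV circuit copies perfectly; a negated one anti-copies.
score_copy = copy_score(np.eye(d_model), range(d_vocab))
score_anti = copy_score(-np.eye(d_model), range(d_vocab))
```

A head with a high copy score is behavioral evidence for a name-mover-like role: whatever it attends to, its output pushes the logits toward that same token.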