test_prompt_most_recent_S.ipynb

https://colab.research.google.com/drive/1vzE3nGJm78E1SVJoCpKgl3D3HowdRXHp#scrollTo=ncE5KnOAOf6a

"The student is John. The pet is Mary. Connor went to the store. The human is"

The model predicts “John” over “Connor”, so it is not ALWAYS doing “most recent subject”. What determines which subject it chooses?
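To make “predicts X over Y” comparable across prompts, one option is to record the logit difference between the candidate names. This is a minimal sketch, assuming the next-token logits have already been pulled out of the model into a plain dict. The helper name `candidate_report` and the numbers in `toy_logits` are hypothetical placeholders, not measurements from the notebook.

```python
import math

def candidate_report(next_token_logits, candidates):
    """Rank candidate names by logit; probabilities are renormalized over the candidates only."""
    missing = [c for c in candidates if c not in next_token_logits]
    if missing:
        raise KeyError(f"no logit for: {missing}")
    logits = {c: next_token_logits[c] for c in candidates}
    # Subtract the max before exponentiating for numerical stability.
    max_logit = max(logits.values())
    exps = {c: math.exp(v - max_logit) for c, v in logits.items()}
    total = sum(exps.values())
    ranked = sorted(candidates, key=lambda c: logits[c], reverse=True)
    return {
        "ranked": ranked,
        "relative_prob": {c: exps[c] / total for c in candidates},
        # Gap between the top two candidates (None if there is only one).
        "logit_diff": logits[ranked[0]] - logits[ranked[1]] if len(ranked) > 1 else None,
    }

# Hypothetical placeholder logits, NOT measured values.
toy_logits = {"John": 3.0, "Mary": 1.0, "Connor": 2.0}
report = candidate_report(toy_logits, ["John", "Mary", "Connor"])
```

Note that `relative_prob` is normalized over the listed names only, so it is not the same as the full-vocabulary probability.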

Idea: generalize IOI (indirect object identification) and “most recent subject” to “Subject Choice Circuits”. There may or may not be a consistent pattern.

<<<

"The student is John. The pet is Mary. The king is Connor. The human is"

This predicts John over Connor again. Why did earlier prompts pick the most recent subject rather than the earliest one?

<<<

"Alice is a teacher. Bob is a student. The child is Bob. Carol is a teacher. David is a student. The child is"

From what we’ve found, the subject in the “source sentence” doesn’t matter: “The child is Bob” and “The child is Alice” give the same result. For prompts of this type the model is very confident in “David” (90%), with the second-place token below 1%.

One difference between these prompts and the ones that predict the earliest subject is the sentence format: these use “[Subject] is a [word]. The [word_2] is”, whereas the earlier ones use “The [word] is [Subject]. The [word_2] is”. Is the model doing this 1) because the to-output sentence has a different format from the in-context sentences, 2) because of whether the subject or the word comes first in the sentence (in-context or to-output), 3) because the [word-desc] is the same, or 4) because the type of subject matters? There are many variations, and combinations of variations, to try (may put these in a table if worth investigating).
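The variations listed above can be enumerated mechanically rather than typed by hand. This is a rough sketch covering only two of the factors (in-context ordering, and whether the queried description is new or repeated). The function names, the labels, and the choice of `facts[0]` as the repeated description are all assumptions made for illustration.

```python
from itertools import product

# (name, description) pairs taken from the prompts in these notes.
FACTS = [("John", "student"), ("Mary", "pet"), ("Connor", "king")]

def context_sentence(name, desc, order):
    """Render one in-context sentence in the requested ordering."""
    if order == "subject_first":
        return f"{name} is a {desc}."
    if order == "description_first":
        return f"The {desc} is {name}."
    raise ValueError(f"unknown order: {order}")

def build_prompt(facts, order, query_desc):
    """Join the context sentences and append the unfinished to-output sentence."""
    context = " ".join(context_sentence(n, d, order) for n, d in facts)
    return f"{context} The {query_desc} is"

def variation_grid(facts, new_desc="human"):
    """Cross in-context ordering with a new vs. repeated query description."""
    orders = ["subject_first", "description_first"]
    query_descs = {"new_description": new_desc, "repeated_description": facts[0][1]}
    return {
        (order, label): build_prompt(facts, order, desc)
        for order, (label, desc) in product(orders, query_descs.items())
    }

grid = variation_grid(FACTS)
for key, prompt in grid.items():
    print(key, "->", prompt)
```

The `("description_first", "new_description")` cell reproduces the “The king is Connor” prompt above. Subject type and other factors would need extra axes.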

Try changing the subject-description ordering in every sentence to match the to-output format (same format throughout):

Get before/after results.
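The “before” and “after” prompts for this check can be produced with a small rewrite step. This is a sketch only: the regex and the function name `to_description_first` are assumptions, and it only handles sentences of the exact form “[Name] is a/an [word].” It generates the prompt pair and does not run the model.

```python
import re

# Matches "[Name] is a [word]." or "[Name] is an [word]." where Name is capitalized.
SUBJECT_FIRST = re.compile(r"\b([A-Z]\w*) is an? (\w+)\.")

def to_description_first(prompt):
    """Rewrite "[Name] is a [word]." as "The [word] is [Name]."; other sentences are untouched."""
    return SUBJECT_FIRST.sub(r"The \2 is \1.", prompt)

before = ("Alice is a teacher. Bob is a student. The child is Bob. "
          "Carol is a teacher. David is a student. The child is")
after = to_description_first(before)

print("before:", before)
print("after: ", after)
```

Both strings can then be fed to the model and compared on the same candidate names.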

Tangent experiment

"The teacher is Alice. The teacher is Bob. The teacher is Alice. The teacher is"

As expected from induction heads or “duplicate identifiers”, this would give “Bob”: Alice was followed by Bob earlier in the prompt, so the repeated Alice sets up Bob as the continuation.
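The expected continuation can be written down as a toy rule over the sequence of names (not over raw tokens). This is only an illustration of the “what followed this name last time” pattern, not a model of what the heads actually compute, and `pattern_continuation` is a hypothetical helper name.

```python
def pattern_continuation(names):
    """Return the name that followed the most recent earlier occurrence of the last name, or None."""
    if not names:
        return None
    last = names[-1]
    # Scan backwards over earlier positions, skipping the final element itself.
    for i in range(len(names) - 2, -1, -1):
        if names[i] == last:
            return names[i + 1]
    return None

# Names in the order they appear in the prompt above.
names_in_prompt = ["Alice", "Bob", "Alice"]
guess = pattern_continuation(names_in_prompt)
print(guess)  # Bob
```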