Steps to start

Be as innovative as possible; accuracy is not as important. I don’t care about how showing you people in the research community how accurate I am. I want to show off how innovative I am.

You must be okay with the innovative approach coded and plotted simply. The refinements of accuracy and optimality and formality come from others later; don’t worry about that, it’s not your job. You add value by innovation in your OWN way; you will not add much value in other ways compared to others.

There is more than one way to code something. Your aim is not to code is optimally good, but to code it in your own way that’s as good as you can get it.

Forget needing to understand well the complex techniques others use; do model editing in your own way that wildly mutates existing approaches in an interesting fashion. It doesn’t need complex math or concepts.

Instead of spending hours just trying to understand more things, spend some time understanding but spend most of the hours thinking creatively using your SUBCONSCIOUS- as if you’re dreaming. Let it flow as dreams naturally arrive in the state akin to sleep.

Being innovative doesn’t mean trying drastically different approaches. It means to take existing approaches and use them in unconventional ways. That way, you have something that works, but you modify it in a way that doesn’t break it or you just apply it to something different. Eg) train a model on activation data (like what SAEs do) or give activation data to chatGPT

Timeline

Exploratory Phase

Explanation of this phase

The aim of this phase is to find one approach that yields high potential. Before running preliminary experiments, we don’t know which ideas can work. Thus, this phase is a breadth-first search to find approaches that do meet minimum viable requirements.

For instance, if we say “Method X should be better than the previous Method Y”, then the preliminary experiments should show that X, on a small dataset on a small model, should be better than Y in significant way. If not, then if we cannot think of another modification to this approach to try, we should move to a completely different approach that tackles a different topic.

During the exploratory phase, we should keep the research aim broad, but the candidate topics should be under the same general topic. As we are not prophets, it is not guaranteed that we can find a method X that beats method Y within the research timeframe- it may not even exist.

Instead of stating a specific aim, we should find a topic to tackle that is in between being too broad and too specific. Then, we generate more specific candidate approaches (aims) within this topic.

For instance, a research topic can be, “To study editing deception in agents in order to improve AI safety”.

Within this research topic, we generate specific candidate approaches to conduct preliminary experiments on:

Approach Preliminary Experiment Prelim Success Conditions
Studying associations

We prioritize the approaches