Evals are a mess, and there is no unified way to run them. Currently, we have a couple forms of evals:

- `PathEvals`, which take the solution coordinates and the generated rollout
  - these are the oldest group
  - there are a couple of things in `rollout_evals()` which belong here
- `LOGIT_ATTRIB_TASKS`, which give the model a specific prompt and have it generate a single token
  - these were made for direct logit attribution, and are of the form "can you predict the origin correctly" or "how well do you do on a random non-endpoint in the path"
- "rollout evals", which take the raw tokens produced by a model and compute things about the validity of those tokens
  - these are in `rollout_evals()`
- "logit evals" as mentioned in #165, but these might really be part of `LOGIT_ATTRIB_TASKS`
  - it would be nice to have a "total probability mass assigned to valid tokens / coordinate tokens" metric, which would be more useful than raw perplexity
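The probability-mass metric from the last bullet could look roughly like this. A minimal sketch only: `valid_token_prob_mass` and its argument layout are hypothetical, not existing repo code, and what counts as a "valid" token id per position would come from the maze tokenizer.

```python
import torch

def valid_token_prob_mass(
    logits: torch.Tensor,        # (seq_len, vocab_size) logits for one rollout
    valid_ids: list[list[int]],  # per-position token ids considered valid
) -> torch.Tensor:
    """Total probability mass assigned to valid tokens at each position.

    Unlike raw perplexity, this credits the model for spreading mass over
    *any* acceptable continuation, not just the single reference token.
    """
    probs = torch.softmax(logits, dim=-1)
    return torch.stack([
        probs[pos, ids].sum()  # sum the probabilities of all valid ids here
        for pos, ids in enumerate(valid_ids)
    ])
```

The same function works for the "coordinate tokens" variant by passing the set of all coordinate-token ids as `valid_ids` at every position.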
These need to be wrapped into a couple of different groups, depending on their inputs. Perhaps:

- `model, mazes` for the task-based evals
- `generated_path, correct_path` as in the older `PathEvals`
- `generated_tokens, maze` as in the "rollout evals"
- `sequence_logits, maze` for the logit evals (#165)
The function `rollout_evals()` is a mess because it exports code that lived in a jupyter notebook for the UniReps submission; there is no time to integrate it with the rest of the code right now.
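One way to wrap the evals into these input-signature groups is a registry per group, so each eval declares up front which arguments it consumes. This is a hypothetical sketch: `EVAL_REGISTRIES`, `register_eval`, and the `exact_match` example are illustrative names, not anything that exists in the repo.

```python
from typing import Callable

# hypothetical sketch: one registry per input signature, so a runner can
# dispatch each group with the right arguments
EVAL_REGISTRIES: dict[str, dict[str, Callable[..., float]]] = {
    "task":    {},  # (model, mazes)                 -- task-based evals
    "path":    {},  # (generated_path, correct_path) -- the older PathEvals
    "rollout": {},  # (generated_tokens, maze)       -- "rollout evals"
    "logit":   {},  # (sequence_logits, maze)        -- logit evals (#165)
}

def register_eval(group: str, name: str):
    """Decorator: file an eval function under its input-signature group."""
    def decorator(fn: Callable[..., float]) -> Callable[..., float]:
        EVAL_REGISTRIES[group][name] = fn
        return fn
    return decorator

@register_eval("path", "exact_match")
def exact_match(generated_path: list, correct_path: list) -> float:
    # trivial example eval: 1.0 iff the generated path equals the solution
    return float(generated_path == correct_path)
```

A runner would then only need to know how to produce each of the four argument tuples once, and could iterate every registered eval in a group.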
Mega-PR, adding a bunch of experiment notebooks and the code required for them.
broad overview:
- added some example models to `examples/`
- reworked eval code, needs big changes -- see #200
- many modifications to mechinterp code
- had to enforce transformerlens 1.6.1 due to tokenizer changes (it tried to get our custom tokenizer from huggingface?)
- exported some code to muutils
- notebooks added:
- `eval_tasks_table.ipynb`: evaluate on a bunch of single-token tasks. should be merged with the other evals notebook
- `appendix_figures.ipynb`: junk and duplicates of code in other notebooks :/
- `generate_rollouts.ipynb`: what the name says, simple notebook
commit history:
* trying to see if wandb model loading is working right
* moved dict shapes to muutils (its on unmerged branch tho)
* better loading of models from wandb
* wip????????????????
* way more testing for loading wandb models
* aaaa
* ???
* hallway run
* update muutils dep to 0.5.3
* updated TL and maze-dataset dep
* type hint
* notebook runs?
* wip runs
* cleared notebooks?
* exported eval plots
* format
* many fixes and changes sorry
* wip
* poetry lock
* minor adjustment to make model names cleaner
* exported single token tasks
* refactored baseline model, allowed return of multiple options
going to be useful for plot_logits
* more baseline model refactor
* format
* dep?
* train_model test was trying to train on 3M samples lol
* separate appendix figures notebooks, better logits plotting
logits plotting now allows for adding other categories to the histogram besides
correct / incorrect, which we can use the baseline model for
* misc
* rename original hallway model
need to fix refs to it later lol
* WE'RE SO BACK, ADJACENCY HEADS ARE HERE
check the dla notebook!!!
* correlation of attention and distance
* misc
* ok no more figures for now
* temp notebooks, for experiments. move these to paper repo later
* eval tasks table
* final before unireps submit
* misc fixes??
* added padding functionality and batched predictions
* wip
* wip
* wip
* added attention animation plotter
* format
* update deps
* transformerlens 1.6.1 due to issues :/
* cleaning up notebooks
latest versions of some were in experiments repo
* fix up some notebooks, eval_model is still broken
* providing hallway model
* fixing eval_model issues with baseline solver
batching was not working at all, had to add a hack to recursively
call .generate() on RandomBaseline
return type was list[str] instead of tensor or list[list[str]] so
had to fix that as well
* update dep to muutils 0.5.5 (poetry not recognizing it yet)
* format
* poetry lock
* changed model used to hallway
* changed model paths, no jirpy
* update embedding structure nb
* updated plot attention for better cbar
* fix up eval tasks table notebook
* fix when cbar is none
* ran notebook