Repository for 'Revenge of the Fallen? Recurrent Models Match Transformers at Predicting Human Language Comprehension Metrics', to be presented at COLM 2024.

jmichaelov/recurrent-vs-transformer-modeling
Revenge of the Fallen? Recurrent Models Match Transformers at Predicting Human Language Comprehension Metrics

Repository for the paper 'Revenge of the Fallen? Recurrent Models Match Transformers at Predicting Human Language Comprehension Metrics', accepted at COLM 2024.

This repository contains all the code and data needed to replicate the results reported in the paper.

Paper Abstract

Transformers have supplanted Recurrent Neural Networks as the dominant architecture for both natural language processing tasks and, despite criticisms of cognitive implausibility, for modelling the effect of predictability on online human language comprehension. However, two recently developed recurrent neural network architectures, RWKV and Mamba, appear to perform natural language tasks comparably to or better than transformers of equivalent scale. In this paper, we show that contemporary recurrent models are now also able to match—and in some cases, exceed—performance of comparably sized transformers at modeling online human language comprehension. This suggests that transformer language models are not uniquely suited to this task, and opens up new directions for debates about the extent to which architectural features of language models make them better or worse models of human language comprehension.

Folder contents

  • cleaned_datasets contains all the N400/reading time datasets.
  • cleaned_stimuli contains the stimuli from the datasets prepared in a form to be input into the language models.
  • code contains the code to calculate surprisals for all language models on all stimuli (see get_surprisals.sh) and to calculate each language model's WikiText perplexity (see get_perplexities.sh). Both scripts, along with the code for shrinking the surprisal output files, can be run using run_experiments.sh.
  • perplexities contains all the perplexities calculated using the Language Model Evaluation Harness with the code in get_perplexities.sh.
  • results contains the surprisals calculated using the language models.
  • statistics contains the statistical analysis code and the code used to generate the plots in the paper. run_analyses.R runs the regressions and calculates their AICs, lms.R fits the ordinary least-squares linear models used to analyze these AICs, and make_plots.R generates the plots included in the paper. The run_analyses_split folder contains the code to run the regressions for each dataset separately using Slurm (via run_all.sh), which can reduce runtime when multiple jobs can run simultaneously. The per-dataset results can then be combined into a single TSV file with combine_AICs.R.
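
For readers unfamiliar with the two quantities that tie these folders together, the following is a minimal Python sketch of (1) surprisal, the negative log-probability a language model assigns to a word in context, and (2) the AIC of a regression that predicts a human comprehension metric from surprisal. It is not the repository's code: the function names, the toy probabilities and reading times, and the use of a plain single-predictor least-squares fit are all assumptions made for illustration, and the actual regressions in run_analyses.R may be specified differently.

```python
import math


def surprisal_bits(prob):
    """Surprisal (in bits) of a word whose conditional probability is `prob`."""
    return -math.log2(prob)


def ols_aic(x, y):
    """Fit y = intercept + slope * x by least squares; return the Gaussian AIC."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Maximized Gaussian log-likelihood, with variance estimated as rss / n.
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    k = 3  # estimated parameters: intercept, slope, residual variance
    return 2 * k - 2 * log_lik


# Hypothetical toy values, NOT taken from the datasets in this repository.
probs_model_a = [0.5, 0.25, 0.125, 0.0625, 0.5, 0.25]
probs_model_b = [0.3, 0.3, 0.2, 0.1, 0.4, 0.2]
reading_times = [210.0, 250.0, 280.0, 330.0, 205.0, 245.0]

surprisals_a = [surprisal_bits(p) for p in probs_model_a]
surprisals_b = [surprisal_bits(p) for p in probs_model_b]

aic_a = ols_aic(surprisals_a, reading_times)
aic_b = ols_aic(surprisals_b, reading_times)
print(f"AIC (model A surprisal): {aic_a:.2f}")
print(f"AIC (model B surprisal): {aic_b:.2f}")
```

A lower AIC indicates a regression that fits the human data better for the same number of parameters, which is why AICs can be used to compare how well different language models' surprisals predict a given metric.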

To cite the code in this repository, please cite the original paper:

@article{michaelov2024revenge,
  title={Revenge of the Fallen? Recurrent Models Match Transformers at Predicting Human Language Comprehension Metrics},
  author={Michaelov, James A. and Arnett, Catherine and Bergen, Benjamin K.},
  journal={arXiv preprint arXiv:2404.19178},
  year={2024},
  url={https://arxiv.org/abs/2404.19178}
}
