Replication code for the paper "Newswire: A Large-Scale Structured Database of a Century of Historical News"

Repo structure:

├── entity
│   └── pipeline.py
├── georeferencing
│   └── georef.py
├── README.md
├── topic_models
│   └── train_topics.py
└── utils
    └── utils.py

Entity (NER, Coreference, Disambiguation)

All relevant functions are in entity/pipeline.py

Georeferencing

All relevant functions are in georeferencing/georef.py

Topic Classification

Code to train the models, along with example hyperparameters, is in topic_models/train_topics.py