Useful ML Data Resources

Resources

US Government Data Sources

  • https://www.bls.gov/data/

    • https://www.bls.gov/developers/
      • "The BLS Public Data API is currently available in two versions."
      • "Version 2.0 requires registration and allows users to access more data more frequently. Users may add calculations and annual averages to requests, and series description information is available for many BLS surveys."
      • "Version 1.0 is a more limited API that does not require registration and is open for public use."
    • https://www.bls.gov/developers/api_sample_code.htm
  • http://congressionalbills.org/about.html

    • The Congressional Bills Project is a relational database of over 400,000 public and private bills introduced in the U.S. House and Senate since 1947. The dataset is primarily intended for use by scholars and students of legislative politics.
    • Trained human coders assign a primary topic (one of 19 major topics and one of 224 subtopics) to each bill based on their readings of either the short description or the title of the bill. Intercoder reliability across the 225 subtopics is very high. Intercoder disagreements can indicate coding errors, but we have found that most of them reflect legitimate disagreements about a bill's primary topic. For example, a bill proposing a health care program for children of illegal immigrants might arguably be coded as an immigration issue (530) or as a child health issue (332).
    • http://congressionalbills.org/download.html
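
As a rough illustration of the two BLS API versions described above, the sketch below builds a request for either version and flattens the JSON reply into rows. It uses only the Python standard library. The helper names (`build_payload`, `parse_series`, `fetch`) are hypothetical and not part of any BLS client, and the choice to switch on the optional v2 features whenever a key is supplied is an assumption made for this example. Check the BLS developer pages linked above for the authoritative request format.

```python
import json
import urllib.request

# Public endpoints for the two API versions (v2 expects a registration key).
BLS_URLS = {
    1: "https://api.bls.gov/publicAPI/v1/timeseries/data/",
    2: "https://api.bls.gov/publicAPI/v2/timeseries/data/",
}


def build_payload(series_ids, start_year, end_year, registration_key=None):
    """Build the JSON body for a time-series request (hypothetical helper)."""
    payload = {
        "seriesid": list(series_ids),
        "startyear": str(start_year),
        "endyear": str(end_year),
    }
    if registration_key:
        # Assumption for this sketch: always request the v2-only extras
        # (calculations and annual averages) when a key is available.
        payload["registrationkey"] = registration_key
        payload["calculations"] = True
        payload["annualaverage"] = True
    return payload


def parse_series(response):
    """Flatten a decoded API response into (series_id, year, period, value) rows."""
    rows = []
    for series in response.get("Results", {}).get("series", []):
        for obs in series.get("data", []):
            rows.append((series["seriesID"], obs["year"], obs["period"], obs["value"]))
    return rows


def fetch(series_ids, start_year, end_year, registration_key=None):
    """POST the request to v2 if a key is given, otherwise to v1."""
    version = 2 if registration_key else 1
    body = json.dumps(
        build_payload(series_ids, start_year, end_year, registration_key)
    ).encode("utf-8")
    request = urllib.request.Request(
        BLS_URLS[version],
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return parse_series(json.load(resp))
```

A call such as `fetch(["CUUR0000SA0"], 2020, 2021)` would request a CPI series through the unregistered v1 endpoint. Values come back as strings, so convert them before doing arithmetic.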

EU Data Sources

Financial Data Resources

Restaurant Data

Image Data

Video Data

NLP focused data

NSF/NIH

  • CRCNS (Collaborative Research in Computational Neuroscience) is a joint program of NSF and NIH that supports integration of theoretical and experimental neuroscience through collaborative research projects.
  • http://crcns.org/NWB/Data_sets

Data Exchange Formats

Medical Data