Inconsistencies in number of columns of doubles files #202

Open
TomKemperNL opened this issue Oct 22, 2022 · 0 comments

I was having some weird issues trying to import the data, which in the end turned out to be caused by an inconsistent number of columns within files.

So I ran a small Python script over all the CSV files in this repository and noticed the following:

atp_matches_amateur.csv
        There are 1 lines with 49 columns
        There are 25001 lines with 40 columns
atp_matches_doubles_2000.csv
        There are 1 lines with 65 columns
        There are 1429 lines with 66 columns
atp_matches_doubles_2001.csv
        There are 1 lines with 65 columns
        There are 1393 lines with 66 columns
atp_matches_doubles_2002.csv
        There are 1 lines with 65 columns
        There are 1332 lines with 66 columns
atp_matches_doubles_2003.csv
        There are 1 lines with 65 columns
        There are 1260 lines with 66 columns
atp_matches_doubles_2004.csv
        There are 1 lines with 65 columns
        There are 1298 lines with 66 columns
atp_matches_doubles_2005.csv
        There are 1 lines with 65 columns
        There are 1260 lines with 66 columns
atp_matches_doubles_2006.csv
        There are 1 lines with 65 columns
        There are 1269 lines with 66 columns
atp_matches_doubles_2007.csv
        There are 1 lines with 65 columns
        There are 1284 lines with 66 columns
atp_matches_doubles_2008.csv
        There are 1 lines with 65 columns
        There are 1280 lines with 66 columns
atp_matches_doubles_2009.csv
        There are 1 lines with 65 columns
        There are 1281 lines with 66 columns
atp_matches_doubles_2010.csv
        There are 1 lines with 65 columns
        There are 1295 lines with 85 columns
atp_matches_doubles_2011.csv
        There are 1 lines with 65 columns
        There are 1281 lines with 85 columns
atp_matches_doubles_2012.csv
        There are 1 lines with 65 columns
        There are 1300 lines with 85 columns
atp_matches_doubles_2013.csv
        There are 1 lines with 65 columns
        There are 1260 lines with 85 columns
atp_matches_doubles_2014.csv
        There are 1 lines with 65 columns
        There are 1275 lines with 85 columns
atp_matches_doubles_2015.csv
        There are 1 lines with 65 columns
        There are 1317 lines with 85 columns
atp_matches_doubles_2016.csv
        There are 1 lines with 65 columns
        There are 1354 lines with 85 columns
atp_matches_doubles_2017.csv
        There are 1 lines with 65 columns
        There are 1313 lines with 85 columns
atp_matches_doubles_2018.csv
        There are 1 lines with 65 columns
        There are 1285 lines with 85 columns
atp_matches_doubles_2019.csv
        There are 1 lines with 65 columns
        There are 1236 lines with 85 columns
        There are 127 lines with 66 columns
atp_matches_doubles_2020.csv
        There are 1 lines with 65 columns
        There are 270 lines with 85 columns

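Apparently, in all the doubles files the data rows are longer than the header row (or the header row is missing columns): 66 instead of 65 columns in the 2000-2009 files, and 85 instead of 65 in most rows from 2010 onward.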
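
For anyone who wants to reproduce the counts, here is a minimal sketch of the kind of check described above. It is not the exact script used for the listing; the function names `column_counts` and `report` are made up for illustration, and it assumes the files sit in the current directory, are comma-separated, and decode as UTF-8.

```python
import csv
from collections import Counter
from pathlib import Path


def column_counts(path):
    """Return a Counter mapping column count -> number of lines with that count."""
    # newline="" lets the csv module handle line endings and quoted fields itself.
    # The UTF-8 encoding is an assumption about the files.
    with open(path, newline="", encoding="utf-8") as handle:
        return Counter(len(row) for row in csv.reader(handle))


def report(directory="."):
    """Print every CSV file whose lines do not all have the same column count."""
    for path in sorted(Path(directory).glob("*.csv")):
        counts = column_counts(path)
        if len(counts) > 1:
            print(path.name)
            # Counter keeps insertion order, so the header's count comes first.
            for columns, lines in counts.items():
                print(f"        There are {lines} lines with {columns} columns")


if __name__ == "__main__":
    report()
```

Using `csv.reader` rather than splitting on commas means quoted fields that contain commas are counted as a single column, so those rows are not misreported.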

TomKemperNL changed the title from "Inconsistencies in number of columns" to "Inconsistencies in number of columns of doubles files" on Oct 22, 2022