Large datasets, mostly CSV files, are currently fetched directly from Git LFS, which induces significant Git LFS bandwidth costs.
Fetching these datasets as pre-compressed release assets will reduce download time and eliminate most GitHub Git LFS bandwidth costs. Thanks to @jvanulde for the idea and @DamonU2 for the pioneering work.
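For illustration, a minimal sketch of what fetching a pre-compressed release asset could look like (the repo, tag, and asset names below are hypothetical placeholders, not actual values from the data source repos):

```sh
# Hypothetical sketch: download a pre-compressed release asset instead of
# pulling the raw CSV through Git LFS. OWNER/DATA_REPO, TAG, and ASSET are placeholders.
REPO="OWNER/DATA_REPO"
TAG="v1.0.0"
ASSET="example-dataset.csv.xz"

curl --fail --location --remote-name \
  "https://github.com/${REPO}/releases/download/${TAG}/${ASSET}"

# Decompress; by default xz removes the .xz file after successful decompression
xz --decompress "${ASSET}"
```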
This, I think, is easier to implement and maintain, and thus more robust and less error-prone, than my previous unimplemented "XZ-compressed copies of repos" idea:
Data source repos:
Scripts that fetch from these repos include (but may not be limited to):
Cf. these commands found in add_data.sh, for example:
XZ or Zstd compression? (compressed file sizes vs. decompression speed)
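To inform that choice, a rough comparison sketch (sample.csv is a placeholder for a representative dataset) that looks at both compressed size and decompression time:

```sh
# Compress with xz (max level) and zstd (high level, multithreaded), keeping the original
xz   -9  --keep sample.csv                  # produces sample.csv.xz
zstd -19 -T0    sample.csv -o sample.csv.zst

# Compare compressed sizes
ls -l sample.csv.xz sample.csv.zst

# Compare decompression speed (zstd typically decompresses much faster,
# while xz typically yields somewhat smaller files)
time xz   --decompress --keep --force sample.csv.xz
time zstd --decompress --stdout sample.csv.zst > /dev/null
```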