A simple JS bookmarklet to download archive.org books (with conversion to PDF)

nazmifr/archive.org_booklet_scraper

Archive.org booklet scraper: a tool to download and browse archive.org books offline

Disclaimer: Borrowed books that you don't hold a personal physical copy of are legally available to you only for the duration of the loan.

As of October 2020, the old Python and Bash scrapers are broken. I made a simple semi-automatic scraper in the meantime; for any urgent need, please contact me at nazmi.fr/contact to have your item sent as a PDF within 24 hours.

Work in progress, available soon

The new JavaScript bookmarklet

  • works in the browser
  • downloads every page at the highest available resolution, with a progress display
  • automatic estimate of the number of pages
  • "from" and "to" page values editable in a prompt before the script starts
  • automatic naming and metadata
  • automatic PDF conversion
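
The bookmarklet itself is not published yet, so the following is only a minimal sketch of how the "from"/"to" prompt and the progress display listed above could be handled. The helper names `parsePageRange` and `formatProgress` are hypothetical and are not taken from the project; the sketch assumes pages are numbered from 1 and that an estimated page count is already known.

```javascript
// Hypothetical helper (not from the project): turn the raw strings typed
// into the "from" and "to" prompts into a valid 1-based page range.
// Blank or non-numeric input falls back to the full estimated range.
function parsePageRange(fromInput, toInput, estimatedPages) {
  const clamp = (n, lo, hi) => Math.min(Math.max(n, lo), hi);
  const from = Number.parseInt(fromInput, 10);
  const to = Number.parseInt(toInput, 10);
  const first = Number.isNaN(from) ? 1 : clamp(from, 1, estimatedPages);
  // The last page can never be lower than the first page.
  const last = Number.isNaN(to) ? estimatedPages : clamp(to, first, estimatedPages);
  return { first, last, count: last - first + 1 };
}

// Hypothetical helper: text for a progress display while pages download.
function formatProgress(done, total) {
  const percent = total === 0 ? 100 : Math.round((done / total) * 100);
  return `${done}/${total} pages (${percent}%)`;
}

// In a browser, the inputs would come from prompt(), for example:
//   const estimate = 120; // assumed page estimate
//   const range = parsePageRange(prompt("From page", "1"),
//                                prompt("To page", String(estimate)),
//                                estimate);
```

Clamping instead of rejecting bad input keeps the prompt flow to a single step, which suits a bookmarklet; a stricter version could ask again when the input is not a number.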

Requirements for the new script

  • A web browser that supports bookmarklets: Firefox, or Chrome and other Chromium-based browsers (Chromium, Brave, Edge, Opera, ...)
  • Optional: a VPS running the online PDF converter suite (PHP, Apache, ImageMagick, Ghostscript). My server is set as the default and can be used to convert the images to PDF in exchange for a small donation of your choice, or for some time spent doing OCR verification on some books.
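
For readers unfamiliar with the first requirement: a bookmarklet is a bookmark whose address is a `javascript:` URL, so the browser runs the script on the current page when the bookmark is clicked. As a rough illustration only, a script can be packaged that way as sketched below. The function name `toBookmarklet` is hypothetical, and this is not necessarily how the project builds its own bookmarklet.

```javascript
// Hypothetical helper (not from the project): wrap a script in an
// immediately invoked function so its variables stay out of the page's
// global scope, then percent-encode it into a javascript: URL.
function toBookmarklet(source) {
  const wrapped = `(function(){${source}})();`;
  return "javascript:" + encodeURIComponent(wrapped);
}

// Example: the resulting string is what goes in the bookmark's URL field.
const url = toBookmarklet("alert(1)");
console.log(url); // javascript:(function()%7Balert(1)%7D)()%3B
```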
