Archive.org booklet scraper: a tool to download and browse archive.org books offline

Disclaimer: Borrowed books that you don't hold a personal physical copy of are legally available to you only for the duration of the loan.

As of October 2020, the old Python and Bash scrapers are broken. I made a simple semi-automatic scraper; for any urgent need, please contact me at nazmi.fr/contact to have your item sent as a PDF within 24 hours.

Work in progress, available soon

The new JavaScript bookmarklet

works in the browser
dynamic download of all highest resolution pages with progress display
automatic page number estimate
from & to values editable as a prompt before the script starts
automatic naming and metadata
automatic PDF conversion
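
The bookmarklet itself is not published yet, so the following is only a minimal sketch of how the editable from/to prompt, the progress display, and the automatic naming listed above could be structured. All function names (`parsePageRange`, `formatProgress`, `pageFileName`), the "from-to" input format, and the `.jpg` extension are assumptions made for illustration, not the actual script.

```javascript
// Hypothetical helpers; none of these names come from the actual bookmarklet.

// Parse a "from-to" string such as "1-120" into a validated page range.
// estimatedTotal is the automatically estimated page count; it is used as
// the default upper bound and as a cap on the requested range.
function parsePageRange(input, estimatedTotal) {
  const match = /^\s*(\d+)\s*-\s*(\d+)\s*$/.exec(String(input));
  if (!match) {
    // Unparseable input (including a cancelled prompt): use the full estimate.
    return { from: 1, to: estimatedTotal };
  }
  const from = Math.max(1, parseInt(match[1], 10));
  const to = Math.min(estimatedTotal, parseInt(match[2], 10));
  if (from > to) {
    throw new RangeError(`Empty page range: ${from}-${to}`);
  }
  return { from, to };
}

// Build the text for a progress display, e.g. "5/20 pages (25%)".
function formatProgress(done, total) {
  const percent = total === 0 ? 100 : Math.round((done / total) * 100);
  return `${done}/${total} pages (${percent}%)`;
}

// Derive a file name from the book title and a zero-padded page number.
// The .jpg extension is an assumption about the downloaded image format.
function pageFileName(title, pageNumber, width = 4) {
  const safeTitle = title
    .trim()
    .replace(/[^A-Za-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");
  return `${safeTitle}_${String(pageNumber).padStart(width, "0")}.jpg`;
}

// In a browser, the range could be requested before the download starts:
// const range = parsePageRange(
//   window.prompt("Pages (from-to):", `1-${estimate}`), estimate);
```

Keeping these helpers free of DOM and network access means they can be checked outside the browser; the actual page download and PDF conversion steps are deliberately left out because their details are not described here.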

Requirements for the new script

A web browser that supports bookmarklets: Firefox, or a Chromium-based browser (Chrome, Chromium, Brave, Edge, Opera, ...)
Optional: a VPS running the online PDF converter suite (PHP, Apache, ImageMagick, Ghostscript). My server is set as the default and can be used to convert the images to PDF for a small donation of your choice, or in exchange for some time spent verifying OCR on some books.
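
For readers unfamiliar with bookmarklets: a bookmarklet is a bookmark whose URL starts with `javascript:` and runs its code on the current page. The sketch below shows the general wrapping technique, not this project's actual build step; `toBookmarklet` is a hypothetical helper name.

```javascript
// Hypothetical helper: wrap a script in an immediately invoked function and
// URL-encode it so it can be saved as a bookmark URL.
function toBookmarklet(source) {
  const wrapped = `(function(){${source}})();`;
  return "javascript:" + encodeURIComponent(wrapped);
}

// Example: the resulting string is pasted into the URL field of a new bookmark.
const bookmarkUrl = toBookmarklet("alert(1)");
```

Wrapping the code in a function keeps its variables out of the page's global scope, and encoding avoids problems with characters that are not valid in a URL.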
