linhnph05/HCMUS-Scraper

HCMUS-News Scraper: a web scraper that crawls news from my university's websites

Inspired by this GitHub repo

Visit the Results page

Websites that I scrape:

Technology:

At first I used Scrapy, but one of the pages I wanted to crawl loads its content dynamically with JavaScript, so I switched to Selenium.
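To illustrate why a browser driver helps with JavaScript-rendered pages, here is a minimal sketch and not the code used in this repo. It assumes Chrome and the `selenium` package are installed; `fetch_rendered_html`, `extract_links`, `LinkCollector`, and the `wait_selector` default are hypothetical names chosen for this example. The page is loaded in headless Chrome, the script waits until an element matching the selector exists, and the rendered HTML is then parsed with the standard library.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect {"title", "url"} dicts from every <a href=...> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the link text.
            title = " ".join("".join(self._text).split())
            if title:
                self.links.append({"title": title, "url": self._href})
            self._href = None


def extract_links(html):
    """Parse already-rendered HTML; works on static pages too."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


def fetch_rendered_html(url, wait_selector="a", timeout=10):
    """Load a page in headless Chrome and return the HTML after JS has run.

    Assumption: the content of interest matches `wait_selector`.
    """
    # Imported here so extract_links() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Block until the dynamically loaded element is present in the DOM.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, wait_selector))
        )
        return driver.page_source
    finally:
        driver.quit()
```

A typical call would be `extract_links(fetch_rendered_html(url))`. Keeping the parsing step separate from the browser step means the same parser can be reused for pages that do not need JavaScript at all.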

What I have learned:

Working with JSON, basic GitHub CI/CD, scraping static and dynamic content, and getting past a website's anti-scraping blocks.
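As an example of the JSON part, the sketch below shows one possible way to keep scraped items in a JSON file without duplicating entries across runs. It is an illustration under assumptions, not the repo's actual storage code: the helper names (`load_items`, `merge_items`, `save_items`) and the item shape (a dict with `title` and `url` keys) are hypothetical.

```python
import json
from pathlib import Path


def load_items(path):
    """Return the list stored in a JSON file, or [] if the file does not exist."""
    path = Path(path)
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        return json.load(f)


def merge_items(existing, scraped):
    """Append newly scraped items, skipping any whose URL is already stored."""
    seen = {item["url"] for item in existing}
    merged = list(existing)
    for item in scraped:
        if item["url"] not in seen:
            seen.add(item["url"])
            merged.append(item)
    return merged


def save_items(path, items):
    """Write items as readable JSON; ensure_ascii=False keeps Vietnamese text intact."""
    with Path(path).open("w", encoding="utf-8") as f:
        json.dump(items, f, ensure_ascii=False, indent=2)
```

A scheduled CI job could then call `save_items(path, merge_items(load_items(path), new_items))` on each run, so the file only grows when new articles appear.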
