Inviting ideas to reduce dependency on 3rd party scrapers #16
Replies: 3 comments
-
@thelazyoxymoron The 5000-request quota from ScrapingRobot probably renews each month rather than being a lifetime limit. I just checked my account after a month, and the request count has been reset to 5000. In any case, I have contacted them to clarify the free limit and will get back to you once they respond. As for alternative methods, it's rough. Maintaining an IP pool for scraping Google is neither cheap nor easy; Google bans an IP whenever it makes an unusual number of search requests in an hour. The alternative to scrapers is using rotating proxy IP services, which can be quite expensive.
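To illustrate the rotating-proxy approach mentioned above: the idea is simply to spread requests across a pool of IPs so no single one trips Google's rate limit. A minimal sketch in Python (the proxy addresses are placeholders, not real endpoints; a paid rotating-proxy service would supply the actual pool):

```python
import itertools

# Hypothetical proxy pool -- these addresses are placeholders.
# A rotating-proxy service would hand you real endpoints like these.
PROXIES = [
    "http://proxy1.example.com:8080",
    "http://proxy2.example.com:8080",
    "http://proxy3.example.com:8080",
]

# Cycle through the pool so requests are spread evenly across IPs.
_pool = itertools.cycle(PROXIES)

def next_proxy() -> dict:
    """Return a requests-style proxies dict, rotating through the pool
    so no single IP absorbs every search request."""
    proxy = next(_pool)
    return {"http": proxy, "https": proxy}

# Usage with the `requests` library (network call left commented out):
# import requests
# resp = requests.get(
#     "https://www.google.com/search",
#     params={"q": "example"},
#     proxies=next_proxy(),
#     timeout=10,
# )
```

Even with rotation, each IP in the pool still needs to stay under Google's per-hour threshold, which is why a usable pool ends up large and therefore expensive.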
-
@thelazyoxymoron Just received a reply from ScrapingRobot: they confirmed the quota renews each month, so it's 5000/month.
-
Great, thanks.
-
The limiting factor in this otherwise great project is the reliance on 3rd party scrapers/proxy services to fetch the results. I found this out the hard way when trying to use ScrapingAnt, which doesn't have a good IP pool for India. Using ScrapingRobot works, but it is severely limiting at 5,000 scrapes/lifetime.
Is there any way we can reduce dependency on 3rd party services, keep it accessible to folks running indie websites, and make it truly self-hostable?