I was surprised to find HTTP clients like python-requests, Go-http-client, wget, curl, etc. included in the crawler list. While I understand that these tools can be abused, in our case a large portion of our legitimate web traffic comes from API requests made with HTTP clients like these.
For now I think I'll need to create an overriding allow list of patterns and remove matches from agents.Crawlers before processing, but it would be great to be able to disambiguate client tools/libraries based on a field in crawler-user-agents.json. Maybe just an is_client boolean, or a more generic tags string array which could contain client or similar? Any thoughts?
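For context, here is a minimal sketch of that workaround in Go. It assumes the package is imported as `agents` from github.com/monperrus/crawler-user-agents and that each entry exposes its regex string as a `Pattern` field; both the import path/alias and the field name are assumptions on my part rather than confirmed API details, and the allow-list entries are just examples:

```go
package main

import (
	"fmt"
	"strings"

	agents "github.com/monperrus/crawler-user-agents"
)

// Substrings identifying HTTP client tools we want to exclude from crawler
// detection. Illustrative only; tune to your own API traffic.
var clientPatterns = []string{
	"python-requests",
	"go-http-client",
	"wget",
	"curl",
}

// filteredCrawlers returns a copy of agents.Crawlers with entries matching
// the client allow list removed.
func filteredCrawlers() []agents.Crawler {
	out := make([]agents.Crawler, 0, len(agents.Crawlers))
	for _, c := range agents.Crawlers {
		isClient := false
		for _, p := range clientPatterns {
			// c.Pattern is assumed to hold the crawler's regex string.
			if strings.Contains(strings.ToLower(c.Pattern), p) {
				isClient = true
				break
			}
		}
		if !isClient {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	fmt.Printf("using %d of %d crawler patterns\n",
		len(filteredCrawlers()), len(agents.Crawlers))
}
```

A field in the JSON itself (the `is_client` boolean or a `tags` array) would make this kind of hard-coded allow list unnecessary, since consumers could filter on it generically.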