diff --git a/robotstxt_test.go b/robotstxt_test.go
index 6ccb730..6cbda57 100644
--- a/robotstxt_test.go
+++ b/robotstxt_test.go
@@ -291,3 +291,38 @@ func newHttpResponse(code int, body string) *http.Response {
ContentLength: int64(len(body)),
}
}
+
+func TestDisallowAll(t *testing.T) {
+	r, err := FromStatusAndBytes(500, nil) // We got a 500 response => Disallow all
+	require.NoError(t, err)
+
+	a := r.TestAgent("/", "*")
+	assert.False(t, a) // Resource access NOT allowed (EXPECTED)
+
+	b := r.FindGroup("*").Test("/")
+	assert.True(t, b) // Resource access allowed (UNEXPECTED)
+
+	assert.Equal(t, a, b) // Fails: the results of the Agent test and the Group test differ
+
+	/*
+		This happens because `disallowAll` is checked by `TestAgent` but not by `Test`.
+
+		`TestAgent` also calls `FindGroup` internally but hides the value of
+		`CrawlDelay`, so users of this library might prefer to use
+		(`FindGroup` + `Test`) to have access to the `CrawlDelay` value when the
+		path is allowed.
+
+		FindGroup -> Test (ok) -> check CrawlDelay
+
+		Unfortunately, the `Test` method does not use the `disallowAll` member set
+		on responses with a status in the range [500; 599]. This behavior is
+		unexpected and can lead to an involuntary politeness policy violation.
+
+		The alternative is to resign ourselves to calling `TestAgent` first and
+		then `FindGroup` to get the `CrawlDelay` value.
+
+		TestAgent (ok) -> FindGroup -> check CrawlDelay
+
+		This way, `FindGroup` is called twice.
+		Is there a way to avoid that without risking a politeness policy violation?
+	*/
+}
Run:
go test ./... -run TestDisallowAll
masonlouchart changed the title from "TestAgent and Test (for the same agent) gives different result in case of temporary error when fetching the robots.txt file" to "TestAgent and Test (for the same user-agent) gives different result in case of temporary error when fetching the robots.txt file" on Nov 21, 2023
masonlouchart changed the title from "TestAgent and Test (for the same user-agent) gives different result in case of temporary error when fetching the robots.txt file" to "TestAgent and Test (for the same user-agent) gives different results in case of temporary error when fetching the robots.txt file" on Nov 21, 2023