
Retry on configurable exception #6991

Open
wants to merge 5 commits into
base: main

YuriyHolinko

@YuriyHolinko YuriyHolinko commented Jan 6, 2025

The set of retryable exceptions in the current logic is very limited, so we lose data whenever any other IO exception (one not listed in the current Java code) occurs.
Because networks differ, we may see different exceptions. In my environment I caught a few exceptions that very likely should be retried but are not listed as retryable in the current code.
Since each environment is different, I suggest adding the ability to configure retryable exceptions.

The change is fully backward compatible and does not change the default behaviour of the library.

@YuriyHolinko YuriyHolinko requested a review from a team as a code owner January 6, 2025 21:07

linux-foundation-easycla bot commented Jan 6, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@YuriyHolinko
Author

Resolves #6962


codecov bot commented Jan 6, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.01%. Comparing base (ccccd1b) to head (aac988d).

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #6991      +/-   ##
============================================
+ Coverage     89.97%   90.01%   +0.04%     
- Complexity     6591     6599       +8     
============================================
  Files           729      729              
  Lines         19852    19856       +4     
  Branches       1953     1954       +1     
============================================
+ Hits          17861    17873      +12     
+ Misses         1396     1387       -9     
- Partials        595      596       +1     


RetryInterceptor::isRetryableException,
e ->
retryPolicy.getRetryExceptionPredicate().test(e)
|| RetryInterceptor.isRetryableException(e),
Member


The OR here is interesting. It means a user can choose to expand the definition of what is retryable but not reduce it. I wonder if there are any cases when you would not want to retry when the default would retry. 🤔

Author

@YuriyHolinko YuriyHolinko Jan 7, 2025


> a user can choose to expand the definition of what is retryable but not reduce it

That's exactly the idea.

> I wonder if there are any cases when you would not want to retry when the default would retry

I would say no 🤔
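
The OR composition discussed in this thread can be sketched roughly as follows. This is a minimal, self-contained illustration rather than the library's code: `RetryPredicateSketch`, `isDefaultRetryable`, and `combined` are hypothetical names, and the default check (retrying `ConnectException` only) is an assumption made for the example.

```java
import java.io.EOFException;
import java.io.IOException;
import java.net.ConnectException;
import java.net.UnknownHostException;
import java.util.function.Predicate;

final class RetryPredicateSketch {
  // Hypothetical stand-in for the built-in check; the real default list may differ.
  static boolean isDefaultRetryable(IOException e) {
    return e instanceof ConnectException;
  }

  // OR composition: the user predicate can add retryable cases,
  // but it can never switch off a case the default already accepts.
  static Predicate<IOException> combined(Predicate<IOException> userPredicate) {
    return e -> userPredicate.test(e) || isDefaultRetryable(e);
  }

  public static void main(String[] args) {
    Predicate<IOException> expanded = combined(e -> e instanceof UnknownHostException);
    System.out.println(expanded.test(new UnknownHostException("example.invalid")));
    System.out.println(expanded.test(new ConnectException("refused")));
    System.out.println(expanded.test(new EOFException()));

    // A user predicate that rejects everything still leaves the default in force.
    Predicate<IOException> rejectAll = combined(e -> false);
    System.out.println(rejectAll.test(new ConnectException("refused")));
  }
}
```

With this shape, a user who wanted to retry less than the default would need a different design, for example a user predicate that replaces the default instead of being OR-ed with it.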

@jack-berg
Member

Thanks for the PR!

> In my environment I caught a few exceptions that very likely need to be retried and they are not listed as retryable in current code. since each environment is different I suggest to have an ability to configure retryable exceptions

Wondering if you could elaborate on these, since it's possible that the errors aren't actually environment-specific and everyone could benefit from them. My initial inclination was that we should just update the static definition of what constitutes a retryable exception, but I'm open to being wrong.

@YuriyHolinko
Author

YuriyHolinko commented Jan 7, 2025

hey @jack-berg

> Wondering if you could elaborate on these, since it's possible that the errors aren't actually environment-specific and everyone could benefit from them. My initial inclination was that we should just update the static definition of what constitutes a retryable exception, but I'm open to being wrong.

Three exceptions from me:

  1. DNS issues. My services run on popular cloud providers and use their DNS services, but sporadically I encounter issues like this:
java.net.UnknownHostException: xxxxxx.com
	at java.base/java.net.InetAddress$CachedAddresses.get(InetAddress.java:801)
	at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1533)
	at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1385)
	at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1306)
	at okhttp3.Dns$Companion$DnsSystem.lookup(Dns.kt:49)
	at okhttp3.internal.connection.RouteSelector.resetNextInetSocketAddress(RouteSelector.kt:164)
  2. java.io.InterruptedIOException: timeout
	at okhttp3.internal.connection.RealCall.timeoutExit(RealCall.kt:398)
  3. java.net.SocketTimeoutException: timeout
	at okio.SocketAsyncTimeout.newTimeoutException(JvmOkio.kt:143)
	at okio.AsyncTimeout.access$newTimeoutException(AsyncTimeout.kt:162)
	at okio.AsyncTimeout$source$1.read(AsyncTimeout.kt:340)
	at okio.RealBufferedSource.indexOf(RealBufferedSource.kt:449)
	at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.kt:333)
	at okhttp3.internal.http1.HeadersReader.readLine(HeadersReader.kt:29)
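
For illustration, a user-supplied predicate covering the exceptions in these traces might look like the sketch below. `ObservedExceptionsSketch` and `RETRY_OBSERVED` are hypothetical names, and matching on the message "timeout" is an assumption taken from the traces above; it is there so that an `InterruptedIOException` caused by a genuine thread interrupt is not retried.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InterruptedIOException;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;
import java.util.function.Predicate;

final class ObservedExceptionsSketch {
  // Hypothetical user predicate built from the traces in this comment.
  static final Predicate<IOException> RETRY_OBSERVED =
      e ->
          // Sporadic DNS resolution failures.
          e instanceof UnknownHostException
              // SocketTimeoutException extends InterruptedIOException, so one check covers both.
              // Assumption: only the "timeout" message seen in the traces is treated as retryable.
              || (e instanceof InterruptedIOException && "timeout".equals(e.getMessage()));

  public static void main(String[] args) {
    System.out.println(RETRY_OBSERVED.test(new UnknownHostException("xxxxxx.com")));
    System.out.println(RETRY_OBSERVED.test(new SocketTimeoutException("timeout")));
    System.out.println(RETRY_OBSERVED.test(new InterruptedIOException("timeout")));
    System.out.println(RETRY_OBSERVED.test(new EOFException()));
  }
}
```

Matching on exception messages is brittle across library versions, so a real configuration may prefer type checks alone and accept the broader match.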

Also, we recently had (retryable) network issues when using SSL. Luckily they were solved by a Java upgrade, so I can disregard them, but handling them might be useful for users on some Java versions.

I don't mind putting all of that into the "static" definition, but the reason I want it configurable is the ability to apply a quick fix when a new retryable exception is discovered. Also, I don't know all the exceptions that other people encounter in their networks, so the list of exceptions is not complete.

So I can add all of these to the static definition alongside the current retryable exceptions, but I want to keep the dynamic config as well.

Let me know your thoughts.

@YuriyHolinko YuriyHolinko marked this pull request as draft January 7, 2025 16:24
@YuriyHolinko YuriyHolinko marked this pull request as ready for review January 7, 2025 16:24
@YuriyHolinko YuriyHolinko requested a review from jack-berg January 7, 2025 16:24
@YuriyHolinko YuriyHolinko changed the title Retry on configurable exception Retry on configurable exception Jan 7, 2025
@YuriyHolinko
Author

There is a flaky test in one of the checks; it's not related to my change because it's in the metrics product 🙈
Could anyone tell me whether I can rerun it somehow without any new commits?

Development

Successfully merging this pull request may close these issues.

Data loss if issues on TCP protocol layer or failures on network link. Retry policy is ignored
2 participants