-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change handling of graceful case of LoadStatsReporting onRemoteClose #37076
Conversation
…tsReporting, and reduce retry rate. Signed-off-by: Brian Sonnenberg <[email protected]>
Signed-off-by: Brian Sonnenberg <[email protected]>
Signed-off-by: Brian Sonnenberg <[email protected]>
Signed-off-by: Brian Sonnenberg <[email protected]>
Signed-off-by: Brian Sonnenberg <[email protected]>
d236307
to
493ca09
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/wait-any
@@ -26,6 +26,7 @@ LoadStatsReporter::LoadStatsReporter(const LocalInfo::LocalInfo& local_info, | |||
} | |||
|
|||
void LoadStatsReporter::setRetryTimer() { | |||
ENVOY_LOG(info, "Load reporter stats stream/connection will retry in {} ms.", RETRY_DELAY_MS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a counter for retry attempts? Otherwise this change will cause a loss of visibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, the "errors" stat tracks all instances of onRemoteClose (which is the caller of retries), regardless of graceful or not. Do you think we should add a separate stat for graceful closes specifically, or just continue including the graceful close in the errors stat? I suppose from a client perspective, a graceful close could be considered an error. I do think the warning log level should be reduced regardless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it may be good to add a stat for establishing a new stream. Otherwise it would be difficult to diagnose the situation where the server keeps closing new streams for some reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/wait-any
@@ -26,6 +26,7 @@ LoadStatsReporter::LoadStatsReporter(const LocalInfo::LocalInfo& local_info, | |||
} | |||
|
|||
void LoadStatsReporter::setRetryTimer() { | |||
ENVOY_LOG(info, "Load reporter stats stream/connection will retry in {} ms.", RETRY_DELAY_MS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it may be good to add a stat for establishing a new stream. Otherwise it would be difficult to diagnose the situation where the server keeps closing new streams for some reason.
Signed-off-by: Brian Sonnenberg <[email protected]>
@briansonnenberg please fix format errors. https://github.com/envoyproxy/envoy/actions/runs/11900732486/job/33162213299#step:16:598 /wait |
Signed-off-by: Brian Sonnenberg <[email protected]>
Head branch was pushed to by a user without write access
Looks like some unrelated test is failing the CI? IpVersionsClientType/NetworkExtensionDiscoveryIntegrationTest https://github.com/envoyproxy/envoy/actions/runs/11919711638/job/33220044570 |
Fixes #34305. |
Reopening this PR
Currently, we treat all remote grpc stream closes as errors, and log warnings and increment failure metrics for every instance. A remote grpc stream close with a 0 code is not a failure, but instead a graceful termination, so it should be a lower log level and should not increment failure metrics.
Commit Message:
Additional Description:
Risk Level: Low
Testing:
Docs Changes:
Release Notes:
Platform Specific Features: