fix: erase persistent terms on app stop #670
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #670      +/-   ##
==========================================
+ Coverage   73.02%   73.08%   +0.05%
==========================================
  Files          61       61
  Lines        1924     1932       +8
==========================================
+ Hits         1405     1412       +7
- Misses        519      520       +1
This would trigger a global GC. What is the purpose of deleting the terms?
If the app is completely stopped, I think it should clean up its data, even if that has some extra cost like a global GC.
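For readers following the trade-off being discussed, here is a minimal sketch of what erasing an application's persistent terms on stop could look like. It is not the code from this PR: the module name, the key prefix, and the `store/2` and `erase_all/0` helpers are all hypothetical, and only the `persistent_term` calls are real OTP functions. Each successful `persistent_term:erase/1` may schedule the global GC mentioned above.

```erlang
-module(pt_cleanup_sketch).
-export([store/2, erase_all/0]).

%% Hypothetical key prefix; the SDK's real key layout is not shown in this thread.
-define(PREFIX, pt_cleanup_sketch).

%% Store a value under a namespaced key so it can be found again at cleanup time.
store(Name, Value) ->
    persistent_term:put({?PREFIX, Name}, Value).

%% Erase every term stored under the prefix and return how many were removed.
%% persistent_term:get/0 returns all {Key, Value} pairs; the generator pattern
%% keeps only keys that carry our prefix.
erase_all() ->
    Keys = [Key || {{?PREFIX, _} = Key, _Value} <- persistent_term:get()],
    length([Key || Key <- Keys, persistent_term:erase(Key)]).
```

An application callback module could call something like `pt_cleanup_sketch:erase_all()` from its `stop/1` callback; the fewer terms stored under the prefix, the fewer erase calls are needed.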
So we actually used to do this but I removed it. I need to dig up the commit/PR to remember why :)
Or no, maybe we never deleted all terms, just set the tracer to |
I agree in principle with "clean up after yourself", but under our guideline of minimizing application impact at runtime, I'm reluctant to trigger a global GC. And we actively discourage attempting to restart the SDK.
If SDK restart is discouraged, it is probably not that bad to clean up persistent terms when it is stopped (let's assume it really needs to be stopped in that case). It would also probably make sense to change the way tracers are stored in persistent terms (currently there are too many terms, e.g. one tracer per application, though essentially they are all the same). Erasing fewer persistent terms should put less stress on the system. Side note: in my personal opinion, if SDK restart is discouraged, two things must be accomplished:
Finally, the global GC impact during cleanup may not be dramatic enough to justify leaving the terms in place even after a full SDK stop. 🤔
Maybe the overhead is acceptable. It is a tough one to be sure about, so we erred on the side of just not causing the GC. I like it being cleaner and helping with testing. I'd say I'm torn on this issue.
Given that the intent is to clean up the persistent terms so that a restart starts fresh, because otherwise there's a bug with straggling data, there are two ways to fix it:
Both the clean-up and the overwrite should result in a global GC, but the overwrite should only GC the terms that changed. This may be less impactful?
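The overwrite alternative can be sketched as follows, assuming a hypothetical `reconfigure/1` helper that receives the full list of `{Key, Value}` pairs at start-up. Nothing here is taken from the SDK; only `persistent_term:get/2` and `persistent_term:put/2` are real OTP calls. The sketch writes only the keys whose value differs from what is already stored, which mirrors the point that unchanged terms need not be touched.

```erlang
-module(pt_overwrite_sketch).
-export([reconfigure/1]).

%% Write each {Key, Value} pair whose value differs from the stored one
%% and return the list of keys that were actually written.
reconfigure(Pairs) ->
    [begin persistent_term:put(Key, Value), Key end
     || {Key, Value} <- Pairs, changed(Key, Value)].

%% A fresh reference is used as the default so that a missing key
%% always counts as changed; it can never equal a stored value.
changed(Key, Value) ->
    persistent_term:get(Key, make_ref()) =/= Value.
```

Calling `pt_overwrite_sketch:reconfigure/1` twice with the same pairs returns an empty list the second time, so a restart with an unchanged configuration would not rewrite any term in this sketch.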
Overwrite doesn't work in the case of the Tracer being cached -- unless we expired the cache on restart somehow. For tests I realized the immediate solution is using a unique TracerProvider for each test suite. The term is keyed in part on the TracerProvider's name and by default the global TracerProvider is used when a user requests a Tracer -- its name is This does not work for SDK configurations like default propagators.
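To illustrate why a unique TracerProvider per test suite sidesteps stale cached terms, here is a sketch that keys a cached value on a provider name. The key shape, module name, and both helpers are assumptions made for illustration and are not the SDK's actual storage scheme or API.

```erlang
-module(pt_provider_key_sketch).
-export([cache_tracer/3, lookup_tracer/2]).

%% Hypothetical key shape: {Module, ProviderName, TracerName}.
%% Cache a tracer value under the given provider and tracer name.
cache_tracer(Provider, Name, Tracer) ->
    persistent_term:put({?MODULE, Provider, Name}, Tracer).

%% Look up a cached tracer; returns undefined when nothing is
%% cached for this provider and tracer name.
lookup_tracer(Provider, Name) ->
    persistent_term:get({?MODULE, Provider, Name}, undefined).
```

With this layout, a value cached under one provider name is invisible to a suite that uses a different provider name, so leftovers from an earlier suite cannot leak into a later one. Settings that are not keyed on the provider would not be isolated this way.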
Technically we are supposed to provide a way to reconfigure the Tracers. So we actually SHOULD add this, with a big warning in the docs to not do it unless you really, really need to and understand the consequences.
One more thing to call out around Fred's comments. AFAIK we've never heard of otel itself crashing, nor of use cases where folks are stopping and starting the app outside of tests. I think Tristan's comments address a few folks' issues where specific tests are being affected but which aren't production issues. I'd rather that get addressed than reworking these particular internals.
No description provided.