Proposal: clarify behavior when retrieving non-existent currently active span #4304

Open
dmathieu opened this issue Sep 13, 2022 · 15 comments
Labels
triage:deciding:community-feedback Open to community discussion. If the community can provide sufficient reasoning, it may be accepted

Comments

@dmathieu
Member

dmathieu commented Sep 13, 2022

For languages that provide an implicitly propagated Context, the API should provide a way to retrieve the currently active span.
See https://github.com/open-telemetry/opentelemetry-specification/blob/2cfad37daf7e0d20851fd8a639a55375c3fc93dd/specification/trace/api.md#context-interaction

However, I am seeing a divergence in behaviour between SDKs, and I believe it would be good to be consistent here.
If there is no current active span, most SDKs will return an invalid/noop span, while others will return undefined.

I believe this is a pretty big difference between those SDKs, as depending on the language being used, folks may get errors if they get a context which unexpectedly doesn't have any span.
Or they may be losing data if they get an invalid span and don't check for it.

My proposal is therefore the following:

  • SDKs that provide a way to retrieve the current span MUST return an invalid or noop span if none was set in the context.
  • SDKs MAY log a debug message if an invalid/noop span was returned.
@Oberon00
Member

What is the difference between noop and invalid span?

@Oberon00
Member

Oberon00 commented Sep 13, 2022

There is this related (but not entirely applicable) point in the error handling guidance: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/error-handling.md#guidance

  1. Whenever API call returns values that is expected to be non-null value - in case of error in processing logic - SDK MUST return a "no-op" or any other "default" object that was (ideally) pre-allocated and readily available. This way API call sites will not crash on attempts to access methods and properties of a null objects.

EDIT: I think this issue may not require an OTEP, if others agree, maybe move this issue to a normal spec issue.

@dmathieu
Member Author

What is the difference between noop and invalid span?

IMHO, their difference is an implementation detail. It's always a span which will not be sending any data once closed.

Thank you for the error handling link. That would definitely point towards returning a noop span rather than nil. I think being able to know when those cases occur (with a warning for example) would be nice, as even though they shouldn't trigger exceptions, they should be catchable as well.

@Flarna
Member

Flarna commented Sep 13, 2022

Regarding always returning a span: how can a user differentiate between a non-sampled trace (which is represented by Noop/NonRecording/... spans) and no trace active at all?

This is relevant, for example, in propagator.inject(), which should not inject a span if there is no trace active. But if getCurrentContext().getSpan() always provides a span, some API on the span is needed to detect whether inject is needed or not.
I guess comparing spanId/traceId against all 0 all the time is a bit of an overhead.

@dmathieu
Member Author

What the Go SDK does is that SpanContext has an IsValid method, which returns false for noop spans.
Then propagators return early if the context is invalid.

@Oberon00
Member

IsValid is part of the spec: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#isvalid and it is unrelated to noop vs. not-noop. Instead it checks if the spancontext has a valid trace & span ID. I think that should satisfy your use case @Flarna, because a not-sampled span will have a valid span & trace ID.
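
To make the distinction concrete, here is a small sketch in Python with hypothetical names (`SpanContext.is_valid`, `inject`, and a `dict` carrier). It assumes validity means non-zero trace and span IDs, as described above, and uses a header laid out like W3C `traceparent` purely for illustration. It is not the code of any SDK's propagator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpanContext:
    trace_id: int = 0
    span_id: int = 0
    sampled: bool = False

    def is_valid(self) -> bool:
        # Validity depends only on the IDs; the sampling flag plays no part.
        return self.trace_id != 0 and self.span_id != 0


def inject(span_context: SpanContext, carrier: dict) -> None:
    """Write a traceparent-style header unless there is no trace at all."""
    if not span_context.is_valid():
        return  # no active trace: leave the carrier untouched
    flags = "01" if span_context.sampled else "00"
    carrier["traceparent"] = (
        f"00-{span_context.trace_id:032x}-{span_context.span_id:016x}-{flags}"
    )


# "No trace active": all-zero IDs, so nothing is injected.
no_trace_carrier: dict = {}
inject(SpanContext(), no_trace_carrier)

# "Trace active but not sampled": IDs are valid, so the header is injected
# with the sampled flag cleared.
unsampled_carrier: dict = {}
inject(SpanContext(trace_id=0xABC, span_id=0x123, sampled=False), unsampled_carrier)
```

Under these assumptions the two cases stay distinguishable without any extra method on the span: the check is one comparison per ID on the span context.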

@Flarna
Member

Flarna commented Sep 13, 2022

Well, for propagators it's fine, but I still don't see the advantage of creating a Noop instance just to return something, or of adding more APIs on span like (IsSampled(), IsDummy(), IsApiOnly(), IsOnTrace(),...).

What's wrong with null/undefined/... in case there is nothing? Assuming here the language in question has something like this.

It reminds me a bit of C++ std::string, which makes no distinction between no string and an empty string, so one needs some extra flag or similar to represent this case.

@dmathieu
Member Author

The difference between nil and a noop span is that a noop span will accept calling all normal methods, while nil will throw exceptions on undefined methods.

From the specifications mentioned above:

This way API call sites will not crash on attempts to access methods and properties of a null objects.
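
As a rough illustration of that difference, the following Python sketch uses a hypothetical `NoopSpan` class and `record_user` helper. The same call site works when handed a noop span and fails when handed `None`, which is Python's counterpart to nil.

```python
class NoopSpan:
    """Hypothetical placeholder span: every normal method exists and does nothing."""

    def set_attribute(self, key, value) -> None:
        pass

    def end(self) -> None:
        pass


def record_user(span, user_id: str) -> None:
    # A typical call site that does not check what it was given.
    span.set_attribute("user.id", user_id)
    span.end()


# With a noop span the call site runs; the data is simply discarded.
record_user(NoopSpan(), "42")

# With None the same call site raises, because None has no set_attribute.
try:
    record_user(None, "42")
except AttributeError as exc:
    failure = exc
```

The trade-off discussed in this thread is visible here: the noop path never crashes but hides that nothing was recorded, while the `None` path fails loudly unless every caller checks first.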

@Flarna
Member

Flarna commented Sep 13, 2022

If you call a non-existent method on a Noop, it will also throw. So, wrong usage results in undefined behavior, as one would expect.

Maybe a bit off topic, but related: should we also return a dummy baggage if none is on the context? And what should DummyBaggage.getEntry() return? At least in JS this returns BaggageEntry | undefined now.

Similarly, what should context.getValue("nonExistingKey") return as a dummy?

@dmathieu
Member Author

I meant undefined methods for nil, not undefined for a proper span object.

@Flarna
Member

Flarna commented Sep 14, 2022

  • SDKs MAY log a warning if an invalid/noop span was returned.

I think we should not issue a warning, as it is perfectly fine that no trace is active. Warning logs indicate that something is wrong, so at most debug/info should be used, in my opinion.

@dmathieu
Member Author

Sure, debug makes sense. I've updated the issue description.

@dmathieu
Member Author

What I'm seeing all other SDKs do is return empty/noop baggage and metrics/tracer providers, which matches the specification statement.

Regarding values, it seems to differ between SDKs.
For example, Go's context returns nil for missing values (but Go's context comes from the standard language library), and Baggage returns an invalid member when retrieving a key which doesn't exist.

@trask
Member

trask commented Nov 19, 2024

(transferred to specification repository as part of #4284)

@svrnm svrnm added the triage:deciding:community-feedback Open to community discussion. If the community can provide sufficient reasoning, it may be accepted label Nov 25, 2024
@svrnm
Member

svrnm commented Nov 25, 2024

This still seems to be a valid concern. Now that it has moved over from the OTEP repo, we should maybe continue the discussion on it, unless there is another existing issue in the spec repo addressing the same concern.

@github-actions github-actions bot added the triage:followup Needs follow up during triage label Dec 10, 2024
@trask trask removed the triage:followup Needs follow up during triage label Dec 10, 2024
5 participants