Make tracing suitable for event-driven architectures #4349

Open
wdonne opened this issue Dec 30, 2024 · 2 comments
Labels
spec:trace Related to the specification/trace directory triage:deciding:needs-info Not enough information. Left open to provide the author with time to add more details

Comments

wdonne commented Dec 30, 2024

The current specification focuses on only one use case: a synchronous service call with a clear beginning and end, both of which are known upfront. Moreover, the logic to produce a consistent trace is left entirely to the emitter of the telemetry. Backends are not required to be able to construct traces from bits and pieces.

For event-driven architectures this cannot work because:

  • There is no logical end to an event trace. The last event that occurred is the end, but you cannot know that at the moment it occurs.
  • The beginning of an event trace is uncertain. It is the first occurrence of an event with a certain trace ID, and that can come from anywhere. In all those places a root span could be generated, but without an end time.
  • An event has no duration. It just marks a moment. With post-processing, the time between an event and some reaction to it may be measured, but there isn't always a reaction.
  • Events often already carry something like a correlation ID, which is propagated. It should therefore be possible to set trace IDs that are derived from such information instead of having only generated IDs.
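
To illustrate the last point, here is a minimal sketch of how a trace ID could be derived from an existing correlation ID. It is not part of the specification or of any OpenTelemetry API; the function name `trace_id_from_correlation_id` and the choice of SHA-256 are assumptions made for this example. The only fixed constraint used is that a trace ID is 16 bytes (32 lowercase hex characters) and must not be all zeros.

```python
import hashlib


def trace_id_from_correlation_id(correlation_id: str) -> str:
    """Map a correlation ID to a 128-bit trace ID (hypothetical helper).

    The same correlation ID always yields the same trace ID, so every
    component that sees the event can compute it independently.
    """
    digest = hashlib.sha256(correlation_id.encode("utf-8")).digest()
    # Keep the first 16 bytes: a trace ID is 128 bits, rendered as 32 hex chars.
    trace_id = digest[:16].hex()
    # An all-zero trace ID is invalid; practically unreachable, but checked anyway.
    if trace_id == "0" * 32:
        raise ValueError("derived trace ID is all zeros")
    return trace_id


if __name__ == "__main__":
    print(trace_id_from_correlation_id("order-42"))
```

Hashing is only one option. If the correlation ID is already a UUID, its 16 bytes could be used directly.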

To fix this, it should be possible to emit a trace as a collection of root spans, all with the same trace ID and no end time. From those, a backend can assemble a consistent trace when it is requested, or update it if it is already stored.
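
As a rough sketch of what that backend-side assembly could look like: the `EventSpan` type and the `assemble_traces` function below are hypothetical and exist only for this example. They are not OpenTelemetry types. Each span carries a single timestamp instead of a start and end time, and the backend groups spans by trace ID and orders them by time whenever the trace is requested.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class EventSpan:
    """A root span for one event: a moment in time, no end time (hypothetical)."""
    trace_id: str
    name: str
    timestamp_ns: int


def assemble_traces(spans: Iterable[EventSpan]) -> Dict[str, List[EventSpan]]:
    """Group event spans by trace ID and order each group by timestamp."""
    grouped: Dict[str, List[EventSpan]] = defaultdict(list)
    for span in spans:
        grouped[span.trace_id].append(span)
    # Sorting at read time means late-arriving events simply extend the trace.
    return {
        trace_id: sorted(group, key=lambda s: s.timestamp_ns)
        for trace_id, group in grouped.items()
    }


if __name__ == "__main__":
    received = [
        EventSpan("a" * 32, "OrderShipped", 300),
        EventSpan("b" * 32, "UserRegistered", 150),
        EventSpan("a" * 32, "OrderPlaced", 100),
    ]
    for trace_id, events in assemble_traces(received).items():
        print(trace_id, [e.name for e in events])
```

Because nothing here depends on knowing which event is first or last, the trace stays open-ended: running the assembly again after more events arrive produces the updated trace.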

wdonne added the spec:trace label Dec 30, 2024
@danielgblanco
Contributor

Hi @wdonne, the tracing specification does contemplate asynchronous behaviour. We have a specific SIG that is working on semantic conventions for messaging. They meet on Thursdays at 8:00 PT and their Slack channel in CNCF is #otel-messaging.

We think some of the aspects you're proposing can be modelled via current functionality or can be discussed further in that group. Has this already been raised there?

danielgblanco added the triage:deciding:needs-info label Jan 6, 2025
@wdonne
Author

wdonne commented Jan 8, 2025

Hi @danielgblanco, event-driven systems are, indeed, asynchronous and often use a messaging system. The semantic conventions for messaging are only about attributes, as is the case for all semantic conventions. Therefore, I wonder whether that is the right place to discuss the constraints in the general specification that inhibit tracing for event-driven architectures. But if you think that SIG is a better option, then I will go there.
