Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Propagate span_id & trace_id in logs records when converting trace events #1394

Conversation

maximebedard
Copy link

@maximebedard maximebedard commented Nov 22, 2023

Fixes #1378.

Changes

Depends on tokio-rs/tracing-opentelemetry#78.

This PR updates the dependency of tracing-opentelemetry to 0.22 and add a support for propagating the trace context when bridging trace events to logs. This is useful when sending otlp log records to tools such as datadog, which would then use the trace_id and span_id to correlate logs and traces.

To do so, I'm pulling the OtelData from the current span and converting it to a TraceContext that can be set on the log record.

Merge requirement checklist

  • CONTRIBUTING guidelines followed
  • Unit tests added/updated (if applicable)
  • Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
  • Changes in public API reviewed (if applicable)

@maximebedard maximebedard requested a review from a team November 22, 2023 18:54
Copy link

linux-foundation-easycla bot commented Nov 22, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

@cijothomas
Copy link
Member

@maximebedard Could you sign the easycla part, which is a pre-req before accepting any contributions!

opentelemetry = { path = "../opentelemetry", features = ["logs"] }
opentelemetry_sdk = { path = "../opentelemetry-sdk", features = ["logs"] }
tracing = { version = "0.1", default-features = false, features = ["std"] }
tracing-opentelemetry = "0.22"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks odd to depend on tracing-opentelemetry!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely. I believe it's necessary since it's where the OtelData is injected into the current span extension and where the TraceId and SpanId is generated. If there's a better way to retrieve the data, I would be happy to update the PR and remove this dependency.

Copy link
Member

@lalitb lalitb Nov 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just copy the OtelData definition from tracing-opentelemetry, doesn't seems to be a deep-rooted definition. And this will help get rid of dependency.

#[derive(Debug, Clone)]
pub struct OtelData {
    /// The parent otel `Context` for the current tracing span.
    pub parent_cx: opentelemetry::Context,

    /// The otel span data recorded during the current tracing span.
    pub builder: opentelemetry::trace::SpanBuilder,
}

In the future, it would be good to have a single subscriber/layer for both traces and logs and so consolidate at that time.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this works as the extension type internally uses a HashMap<TypeId, _>, and the TypeId of the copied definition would happen to be different. I tried an simpler example here and the types are indeed not equal.

Copy link
Member

@lalitb lalitb Nov 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. You are right, copying OtelData definition won't work here. Do you think we can make this dependency optional? And then check for the tracing context only if this dependency is met. This will ensure that the users who are using this crate solely for logging purpose are not paying cost of context lookup when no span is actually created. And this is valid also for users who are directly using the otel tracing API (though it is not recommended).

As a side-note, as of now, when using bothtracing-opentelemetry and opentelemetry-appender-tracing, the logs emitted with tracing macros (trace!, info!, event! etc) are exported both through SpanExporter as span-events, and the LogExporter as logs. This mayn't be desirable. It would be eventually good to integrate both tracing-opentelemetry and opentelemetry-appender-tracing crates.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ewoolsey Can you validate if the span_id and trace_id are populated correctly?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it's working correctly. I've got a project depending on this PR and it's functioning as expected.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, interesting. I think there won't be any error thrown by the Rust compiler for the circular dependency. Rust strong type system just allows that, but still this is a bad design. One option could be to reopen the PR for review to unblock, and follow up the circular dependency problem with a separate GitHub issue. @maximebedard - if you have any comment?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love that personally. This is a pretty critical feature for my project and finding a way around it has been really tough.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dependency issue is specific to the workspace, it would not be an issue if you use a released version of the opentelemetry crates and tracing-opentelemetry. We can see the issue in the CI job that failing here https://github.com/open-telemetry/opentelemetry-rust/actions/runs/7105523258/job/19342898991. It's not really possible to push a fix upstream unless we move the tracing_opentelemetry::OtelData into the current workspace.

I have a forked version that uses this patch, but I noticed that it doesn't fully solve the problem. The trace_id is only set for the root span (see https://github.com/tokio-rs/tracing-opentelemetry/blob/v0.1.x/src/layer.rs#L871-L874), any child span created would be missing the trace_id and not be correlated correctly.

@maximebedard
Copy link
Author

I'm getting the CLA situation sorted out with my employer. I'll provide updates ASAP.

@0xForerunner
Copy link

Thanks for your hard work on this! Seems to be working well for me.

@hdost hdost added the S-blocked-cla Status: Issues blocked on CLA label Nov 29, 2023
@maximebedard maximebedard force-pushed the mb/add-otel-trace-id-log-record branch from efb0426 to 762b99a Compare December 5, 2023 19:05
@maximebedard maximebedard changed the title [WIP] Propagate span_id & trace_id in logs records when converting trace events Propagate span_id & trace_id in logs records when converting trace events Dec 5, 2023
@lalitb lalitb removed the S-blocked-cla Status: Issues blocked on CLA label Dec 7, 2023
@lpj145
Copy link

lpj145 commented Jan 26, 2024

What a amazing situation the one's top protocol to monitor services like open telemetry does't have a good integration with tracing, guys, we need to fix this thing... seriously, if you want to enable and add opentelemtry in another languages eg: tech stacks you just need to install it, configure under 3 lines, but in rust, you need todo a entire conversion just to trace works, but not for logs, by the way in the way that open telemtry follow we need to setup 3 connections of grpc just to have the data..... again... amazing...

@hdost
Copy link
Contributor

hdost commented Feb 18, 2024

@maximebedard 👋🏼 Hey I am curious what made you close the PR?

@lalitb
Copy link
Member

lalitb commented Feb 18, 2024

@maximebedard 👋🏼 Hey I am curious what made you close the PR?

It's summarised here -#1394 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature]: opentelemetry-appender-tracing should respect information inside tracing::Span
6 participants