Spike: Test OTel Standalone Library Instrumentation with NR OTel Span Prototype #2060

Closed
jasonjkeller opened this issue Sep 23, 2024 · 2 comments
Labels: oct-dec qtr (Represents proposed work item for the Oct-Dec quarter)

jasonjkeller commented Sep 23, 2024

Test whether the OTel Span Prototype captures spans from OTel Standalone Library Instrumentation for a framework currently un-instrumented by the New Relic Java agent (e.g. Armeria instrumentation is recommended). What happens when a framework is instrumented by both OTel Standalone Library Instrumentation and a NR Java agent instrumentation (weave module/custom instrumentation)?

https://opentelemetry.io/docs/concepts/instrumentation/libraries/
https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/docs/supported-libraries.md

@jasonjkeller jasonjkeller added the oct-dec qtr Represents proposed work item for the Oct-Dec quarter label Sep 23, 2024
@jasonjkeller jasonjkeller self-assigned this Sep 23, 2024

jasonjkeller commented Oct 24, 2024

The following project contains a blog server and a client that makes simple CRUD requests to the server. Both the server and the client are built using Armeria and both are configured to use the Armeria standalone OTel instrumentation for emitting OTel spans/metrics.

https://github.com/jasonjkeller/armeria-client-server-example

When both the client and the server are instrumented with the NR OTel Span Prototype Java agent, the Armeria OTel spans are detected and are either added to existing NR traces or start new ones.

The client requires the New Relic Java agent @Trace annotation to start a transaction, because the detected OTel SpanKind.CLIENT spans will not start a transaction on their own.

The server does not need Java agent instrumentation or APIs to start a transaction, because the detected OTel SpanKind.SERVER spans cause the agent to start transactions (if there is not already an existing transaction).
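
The client/server difference described above can be sketched as a small toy model. This is not the agent's code: `ToyAgent`, `ToySpanKind`, `startTransaction`, and `onSpan` are hypothetical names invented for illustration, and `startTransaction` merely stands in for calling a method annotated with the agent's `@Trace(dispatcher = true)`. The model only encodes the two behaviors observed in this test: a SERVER span starts a transaction when none exists, and a CLIENT span does not.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical types, not the New Relic agent's or OTel's real classes.
enum ToySpanKind { CLIENT, SERVER }

final class ToyAgent {
    private String currentTransaction;               // null means no transaction is active
    final List<String> recorded = new ArrayList<>(); // spans that ended up in a transaction

    // Stand-in for entering a method annotated with @Trace(dispatcher = true).
    void startTransaction(String name) {
        if (currentTransaction == null) {
            currentTransaction = name;
        }
    }

    // Called when an OTel span is detected. Returns true if the span was recorded.
    boolean onSpan(ToySpanKind kind, String spanName) {
        // Observed behavior: only SERVER spans start a transaction on their own.
        if (currentTransaction == null && kind == ToySpanKind.SERVER) {
            currentTransaction = spanName;
        }
        if (currentTransaction == null) {
            return false; // a CLIENT span with no transaction does not show up
        }
        recorded.add(currentTransaction + " -> " + spanName);
        return true;
    }
}

final class SpanKindDemo {
    public static void main(String[] args) {
        ToyAgent client = new ToyAgent();
        System.out.println(client.onSpan(ToySpanKind.CLIENT, "GET /blogs"));  // no transaction yet
        client.startTransaction("Custom/listBlogs");                          // the @Trace stand-in
        System.out.println(client.onSpan(ToySpanKind.CLIENT, "GET /blogs"));

        ToyAgent server = new ToyAgent();
        System.out.println(server.onSpan(ToySpanKind.SERVER, "GET /blogs"));  // starts its own
    }
}
```

In the real client application the equivalent step is annotating the calling method with `@Trace(dispatcher = true)` from the New Relic Java agent API, so that the OTel client span has a transaction to attach to.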


There are two cases to consider:

Case 1: NR Java agent instrumentation libraries disabled, custom instrumentation and OTel spans enabled

In the first case, all out-of-the-box instrumentation from the NR Java agent has been disabled. Only custom instrumentation and detected OTel spans contribute to traces. The screenshots below show some clear issues with this case.

Client

[Screenshot: armeria-client-standalone-otel-only]

Server

[Screenshot: armeria-server-standalone-otel-only]

Case 2: NR Java agent instrumentation libraries, custom instrumentation, and OTel spans enabled

In the second case, all out-of-the-box instrumentation from the NR Java agent is enabled and contributes to traces along with any custom instrumentation and detected OTel spans. There are some noticeable improvements when the New Relic Java agent instrumentation is working. Specifically, the client and server entities are now properly linked in the trace:

[Screenshot: armeria-client-server-standalone-otel-and-nr]

However, some noticeable problems appear when the two instrumentation sources are combined:

Client

In addition to the trace above showing the client calling the server, the Java agent's Netty instrumentation started the following transaction. It should be part of the trace between the services, but it is a separate transaction because the async token was never successfully linked. Furthermore, the token timed out, resulting in the Truncated segment.
[Screenshot: armeria-client-standalone-otel-and-nr]

Server

In addition to the trace above showing the client calling the server, we see a separate transaction for the OTel server span representing the call to the server's blogs endpoint. This span should be included in the trace with the client calling the server; however, it was not successfully linked to it.
[Screenshot: armeria-server-standalone-otel-and-nr]


In summary, things partially work, but not yet in the correct and cohesive manner expected from an APM agent, and this is for a very simple distributed trace.

When relying on just the OTel span data, the traces are not as complete as desired and we don't see proper entity linking. Also, the OTel client spans still require manually starting transactions via the NR API for them to show up at all.

When the OTel span data is combined with out-of-the-box NR instrumentation, what should be a single complete trace is instead broken up into multiple traces. This is due to missed token linking across threads between NR and OTel spans.

Further work is needed to investigate a solution for this.
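
To illustrate why a missed token link splits a trace, here is a minimal toy model of thread-bound trace context. It is a sketch under stated assumptions, not the agent's implementation: `ToyContext`, `ToyToken`, and `traceSeenBy` are hypothetical names, and the thread-local stands in for whatever per-thread state the agent actually keeps. Work handed to another thread sees the original trace only if a token captured on the first thread is linked on the second; otherwise it behaves as if a new, separate trace had started.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Toy model only: hypothetical types, not the New Relic agent's real classes.
final class ToyContext {
    // Each thread sees its own "current trace id"; worker threads start with none.
    static final ThreadLocal<String> CURRENT_TRACE = new ThreadLocal<>();

    // Hypothetical stand-in for an async token: captures the creating thread's trace id.
    static final class ToyToken {
        private final String traceId = CURRENT_TRACE.get();

        // Binds the captured trace id to whichever thread calls link().
        void link() {
            CURRENT_TRACE.set(traceId);
        }
    }

    // Runs a task on the pool and reports which trace that task believes it belongs to.
    static String traceSeenBy(ExecutorService pool, ToyToken tokenOrNull) {
        Callable<String> task = () -> {
            try {
                if (tokenOrNull != null) {
                    tokenOrNull.link();
                }
                String seen = CURRENT_TRACE.get();
                return seen == null ? "new-trace" : seen; // no context: a separate trace
            } finally {
                CURRENT_TRACE.remove(); // do not leak context into the next task
            }
        };
        try {
            return pool.submit(task).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }
}

final class TokenLinkDemo {
    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            ToyContext.CURRENT_TRACE.set("trace-1");
            ToyContext.ToyToken token = new ToyContext.ToyToken();
            System.out.println(ToyContext.traceSeenBy(pool, null));  // token never linked
            System.out.println(ToyContext.traceSeenBy(pool, token)); // token linked
        } finally {
            pool.shutdown();
            ToyContext.CURRENT_TRACE.remove();
        }
    }
}
```

In the New Relic Java agent API the corresponding pieces are a `Token` obtained from the current transaction and its `link()` method. The open question from this spike is how to make that hand-off happen reliably when one side of the thread hop is NR instrumentation and the other is an OTel span.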
