Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for OTEL tracing #1319

Merged
merged 1 commit into from
Feb 10, 2024

Conversation

saschagrunert
Copy link
Member

What type of PR is this?

/kind feature

What this PR does / why we need it:

crictl now features 3 new CLI parameters:

  • --enable-tracing: Enable OpenTelemetry tracing. (default: false)
  • --tracing-endpoint: Address to which the gRPC tracing collector will send spans to. (default: "0.0.0.0:4317")
  • --tracing-sampling-rate-per-million: Number of samples to collect per million OpenTelemetry spans. Set to 1000000 or -1 to always sample. (default: -1)

The tracer provider will be created on startup and the Shutdown() invocation will ensure that all spans are processed before exiting the binary.

The hack/tracing directory contains scripts for local testing:

> ./hack/tracing/start
…
Everything is ready, open http://localhost:16686 to access jaeger

When now running crictl with --enable-tracing:

> sudo ./build/bin/linux/amd64/crictl --enable-tracing ps

Then jaeger should show collected traces and spans for the 3 RPCs ListContainers, ImageFsInfo as well as Version:

screenshot

Which issue(s) this PR fixes:

Fixes #1317

Special notes for your reviewer:

None

Does this PR introduce a user-facing change?

Added support for OTEL tracing via the new CLI flags `--enable-tracing`, `--tracing-endpoint` and `--tracing-sampling-rate-per-million`.

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Dec 15, 2023
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Dec 15, 2023
@saschagrunert saschagrunert force-pushed the tracing branch 2 times, most recently from 0545993 to db27260 Compare December 15, 2023 08:46
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 15, 2023
@saschagrunert saschagrunert force-pushed the tracing branch 4 times, most recently from 491823a to 425a995 Compare December 15, 2023 09:10
Copy link
Member

@SergeyKanzhelev SergeyKanzhelev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/hold

hold for other maintainers to look at.

Everything looks good, thank you for adding the hack/tracing. I think what can be useful as an additional improvements:

  1. Add a start span/stop span operations on individual crictl commands so commands will group all the calls they are making and we will also see failed commands - when they failed with arguments validation.
  2. I wonder if we may need to hide any sensitive information off the arguments passed. Since it is an optional feature, as-is OK, but maybe later we can add an option to filter out sensitive information
  3. We may want to add a computer name as part of a span so we can group spans by it. This way we can see from jaeger who called crictl command

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 15, 2023
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 15, 2023
@saschagrunert
Copy link
Member Author

saschagrunert commented Dec 18, 2023

  1. Add a start span/stop span operations on individual crictl commands so commands will group all the calls they are making and we will also see failed commands - when they failed with arguments validation.
  2. I wonder if we may need to hide any sensitive information off the arguments passed. Since it is an optional feature, as-is OK, but maybe later we can add an option to filter out sensitive information

Thanks! I'll follow-up on those 👍

  1. We may want to add a computer name as part of a span so we can group spans by it. This way we can see from jaeger who called crictl command

We have included the host name: https://github.com/kubernetes-sigs/cri-tools/pull/1319/files#diff-f1e0d24911164f850e8c1e92b5e73073d417eb5ca067ad3feba01bc6d6211082R34-R43

Or do you have something more specific in mind?

Copy link
Member

@feiskyer feiskyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks adding this!

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 16, 2024
@feiskyer
Copy link
Member

@saschagrunert could you rebase?

@k8s-ci-robot k8s-ci-robot removed lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jan 29, 2024
@saschagrunert saschagrunert force-pushed the tracing branch 2 times, most recently from 66bb77f to 29f9113 Compare January 29, 2024 09:31
@saschagrunert
Copy link
Member Author

@feiskyer done :)

hack/tracing/env Outdated Show resolved Hide resolved
Copy link
Member

@SergeyKanzhelev SergeyKanzhelev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 29, 2024
cmd/crictl/main.go Outdated Show resolved Hide resolved
hack/tracing/env Outdated Show resolved Hide resolved
pkg/tracing/tracing.go Outdated Show resolved Hide resolved
pkg/tracing/tracing.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 30, 2024
@saschagrunert
Copy link
Member Author

Addressed the comments, thank you for the review @kwilczynski @SergeyKanzhelev. PTAL again.

Copy link
Member

@SergeyKanzhelev SergeyKanzhelev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 30, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: feiskyer, saschagrunert, SergeyKanzhelev

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [feiskyer,saschagrunert]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@saschagrunert
Copy link
Member Author

@SergeyKanzhelev can we lift the hold?

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 9, 2024
`crictl` now features 3 new CLI parameters:

- `--enable-tracing`: Enable OpenTelemetry tracing. (default: false)
- `--tracing-endpoint`: Address to which the gRPC tracing collector will send spans to. (default: "0.0.0.0:4317")
- `--tracing-sampling-rate-per-million`: Number of samples to collect per million OpenTelemetry spans. Set to 1000000 or -1 to always sample. (default: -1)

The tracer provider will be created on startup and the `Shutdown()`
invocation will ensure that all spans are processed before exiting the
binary.

The `hack/tracing` directory contains scripts for local testing:

```
> ./hack/tracing/start
…
Everything is ready, open http://localhost:16686 to access jaeger
```

When now running `crictl` with `--enable-tracing`:

```
> sudo ./build/bin/linux/amd64/crictl --enable-tracing ps
```

Then jaeger should show collected traces and spans for the 3 RPCs
`ListContainers`, `ImageFsInfo` as well as `Version`.

Signed-off-by: Sascha Grunert <[email protected]>
@k8s-ci-robot k8s-ci-robot removed lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Feb 9, 2024
@SergeyKanzhelev
Copy link
Member

/unhold

sorry for delay

/lgtm

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Feb 10, 2024
@k8s-ci-robot k8s-ci-robot merged commit 79473a0 into kubernetes-sigs:master Feb 10, 2024
23 checks passed
@saschagrunert saschagrunert deleted the tracing branch February 12, 2024 08:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for OpenTelemetry traces
5 participants