Collector execution hangs inside container #683

Open
machadovilaca opened this issue Oct 1, 2024 · 2 comments
Labels
artifact:docker, bug (Something isn't working)

Comments


machadovilaca commented Oct 1, 2024

Describe the bug

Running a custom collector locally works as expected, but when running inside a container (Podman), execution hangs in the initial setup steps.

Steps to reproduce

  1. Create a collector with ocb
  2. Configure components with a receiver from opentelemetry-collector-contrib
  3. Create Dockerfile
FROM golang:1.23-bullseye

RUN apt-get update && apt-get install -y libvirt-dev

COPY kubevirt-vm-otel-collector /src/kubevirt-vm-otel-collector
COPY opentelemetry-collector-contrib /src/opentelemetry-collector-contrib

WORKDIR /src/kubevirt-vm-otel-collector

RUN go mod download
RUN go build -o kubevirt-vm-otel-collector

ARG USER_UID=10001
USER ${USER_UID}

ENTRYPOINT ["/src/kubevirt-vm-otel-collector/kubevirt-vm-otel-collector"]
CMD ["--config", "/src/kubevirt-vm-otel-collector/config.yaml"]
  4. Build and run image:
➜  podman run <IMG>
2024-10-01T17:56:19.134Z        info    [email protected]/service.go:129 Setting up own telemetry...
2024-10-01T17:56:19.134Z        warn    [email protected]/service.go:196 service::telemetry::metrics::address is being deprecated in favor of service::telemetry::metrics::readers
2024-10-01T17:56:19.134Z        info    [email protected]/telemetry.go:98        Serving metrics {"address": ":8888", "metrics level": "Normal"}
2024-10-01T17:56:19.134Z        info    builders/builders.go:26 Development component. May change in the future.        {"kind": "exporter", "data_type": "metrics", "name": "debug"}
2024-10-01T17:56:19.134Z        debug   builders/builders.go:24 Alpha component. May change in the future.      {"kind": "receiver", "name": "kubevirt_vms_receiver", "data_type": "metrics"}
<exits>

Expected Result

(observed only when running locally)

➜  ./kubevirt-vm-otel-collector --config config.yaml
2024-10-01T19:10:55.839+0100    info    [email protected]/service.go:129 Setting up own telemetry...
2024-10-01T19:10:55.839+0100    warn    [email protected]/service.go:196 service::telemetry::metrics::address is being deprecated in favor of service::telemetry::metrics::readers
2024-10-01T19:10:55.839+0100    info    [email protected]/telemetry.go:98        Serving metrics {"address": ":8888", "metrics level": "Normal"}
2024-10-01T19:10:55.839+0100    info    builders/builders.go:26 Development component. May change in the future.        {"kind": "exporter", "data_type": "metrics", "name": "debug"}
2024-10-01T19:10:55.839+0100    debug   builders/builders.go:24 Alpha component. May change in the future.      {"kind": "receiver", "name": "kubevirt_vms_receiver", "data_type": "metrics"}
2024-10-01T19:10:55.841+0100    info    [email protected]/service.go:213 Starting kubevirt-vm-otel-collector...  {"Version": "0.0.1", "NumCPU": 16}
2024-10-01T19:10:55.841+0100    info    extensions/extensions.go:39     Starting extensions...
2024-10-01T19:10:55.841+0100    info    kubevirtvmreceiver/receiver.go:61       KubeVirt VM Receiver started    {"kind": "receiver", "name": "kubevirt_vms_receiver", "data_type": "metrics"}
2024-10-01T19:10:55.841+0100    info    [email protected]/service.go:239 Everything is ready. Begin running and processing data.
2024-10-01T19:10:55.841+0100    info    localhostgate/featuregate.go:63 The default endpoints for all servers in components have changed to use localhost instead of 0.0.0.0. Disable the feature gate to temporarily revert to the previous default.    {"feature gate ID": "component.UseLocalHostAsDefaultHost"}
2024-10-01T19:11:00.841+0100    debug   kubevirtvmreceiver/receiver.go:90       Collecting metrics      {"kind": "receiver", "name": "kubevirt_vms_receiver", "data_type": "metrics"}
2024-10-01T19:11:00.841+0100    debug   kubevirtvmreceiver/receiver.go:98       Connecting to VM on socket      {"kind": "receiver", "name": "kubevirt_vms_receiver", "data_type": "metrics", "socket": "qemu:///session"}
...

dosubot bot commented Oct 1, 2024

Hey @machadovilaca 👋, I'm here to help you with your bug report. I'm Dosu, a bot that can assist with bug fixes, answer questions, and guide you on becoming a contributor. Let's dive into the issue you're experiencing.

To address the issue of the OpenTelemetry Collector hanging during startup inside a Podman container, consider the following:

  1. Network Configuration: Ensure that the network settings are correctly configured. The default endpoints might be binding to localhost instead of 0.0.0.0, which can cause connectivity issues inside containers. Verify and adjust the configuration if necessary.

  2. User Permissions: The Dockerfile sets a specific USER_UID. Ensure that this user has the necessary permissions to execute the collector and access required resources.

  3. Resource Limits: Check if there are any resource constraints (CPU, memory) imposed by Podman that might be affecting the startup. Adjust these limits if needed.

  4. Null Maps in Configuration: Ensure that your configuration files do not contain null values. Use {} for empty maps to avoid issues.

  5. Debugging Tools: Utilize the debug exporter to gain insights into what might be causing the hang. This can help identify if the issue is related to specific components or configurations.

  6. Internal Telemetry: Enable internal telemetry to monitor the collector's performance and identify potential bottlenecks.

These steps should help in diagnosing and resolving the startup hang issue.

To continue talking to Dosu, mention @dosu.

dosubot bot added the artifact:docker and bug labels Oct 1, 2024
mx-psi (Member) commented Oct 4, 2024

From your code it seems like you are using a custom component. Are you able to reproduce this with the official Docker images and components in contrib?

If not, it is hard for us to help you. You could try attaching a live debugger (e.g. with Delve's dlv attach) and printing the stack trace to see where it is stuck.
