OTLP::Exporter#encode - bignum too big to convert into `unsigned long long' #1771

Open
manasaheggere opened this issue Dec 4, 2024 · 13 comments
Labels
bug Something isn't working

Comments

@manasaheggere

Hi Team,

We are using Ruby 3.2.5 and Rails 7.1.3.3

We have installed the following OpenTelemetry gems:
Gemfile

gem "opentelemetry-sdk"
gem "opentelemetry-exporter-otlp"
gem "opentelemetry-instrumentation-all"

and configured the SDK in opentelemetry.rb:

require 'opentelemetry/sdk'
require 'opentelemetry/instrumentation/all'
require 'opentelemetry-exporter-otlp'

OpenTelemetry::SDK.configure do |c|
  c.service_name = <service_name>
  c.use_all()
  c.add_span_processor(
    OpenTelemetry::SDK::Trace::Export::BatchSpanProcessor.new(
      OpenTelemetry::Exporter::OTLP::Exporter.new
    )
  )
end
MyAppTracer = OpenTelemetry.tracer_provider.tracer(<tracer>)

We have also configured the following environment variables:

- name: OTEL_EXPORTER
  value: 'otlp'
- name: JAEGER_DISABLED
  value: 'true'
- name: JAEGER_SERVICE_NAME
  value: <service_name>
- name: JAEGER_AGENT_HOST
  valueFrom:
    fieldRef:
      apiVersion: v1
      fieldPath: status.hostIP
- name: OTEL_EXPORTER_OTLP_ENDPOINT
  value: http://$(JAEGER_AGENT_HOST):4318
- name: OTEL_SERVICE_NAME
  value: <service_name>

However, we are seeing the following errors in our pre-prod environment:

ERROR -- : OpenTelemetry error: Unable to export 485 spans
ERROR -- : OpenTelemetry error: unexpected error in OTLP::Exporter#encode - bignum too big to convert into `unsigned long long' - /home/circleci/.rubygems/gems/opentelemetry-exporter-otlp-0.29.0/lib/opentelemetry/exporter/otlp/exporter.rb:327:in `initialize'
@manasaheggere manasaheggere added the bug Something isn't working label Dec 4, 2024
@kaylareopelle
Contributor

@manasaheggere, thanks for reaching out! I'm sorry to hear about the export errors.

Were things working on an earlier version of the OTLP exporter?

The line referenced in the error points to the encoding of a timestamp for a span event.

Are you creating span events with your app's tracer using the Span#add_event API?

# Add an Event to a {Span}.
#
# Example:
#
# span.add_event('event', attributes: {'eager' => true})
#
# Note that the OpenTelemetry project
# {https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/data-semantic-conventions.md
# documents} certain "standard event names and keys" which have
# prescribed semantic meanings.
#
# @param [String] name Name of the event.
# @param [optional Hash{String => String, Numeric, Boolean, Array<String, Numeric, Boolean>}] attributes
# One or more key:value pairs, where the keys must be strings and the
# values may be (array of) string, boolean or numeric type.
# @param [optional Time] timestamp Optional timestamp for the event.
#
# @return [self] returns itself
def add_event(name, attributes: nil, timestamp: nil)
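Since the referenced line encodes a timestamp, one way to narrow things down is to check whether a timestamp, once converted to integer nanoseconds, still fits in an unsigned 64-bit value. The sketch below is only an illustration of that range check: `to_unix_nano` and `fits_uint64?` are hypothetical helper names, not part of the exporter, and the conversion is an assumption about how a `Time` would be turned into nanoseconds.

```ruby
# Largest value an unsigned 64-bit integer (C `unsigned long long`) can hold.
UINT64_MAX = (2**64) - 1

# Hypothetical helper: convert a Time (or nil) to integer nanoseconds
# since the Unix epoch, using rational arithmetic to avoid float rounding.
def to_unix_nano(time)
  return nil if time.nil?

  (time.to_r * 1_000_000_000).to_i
end

# Hypothetical helper: true when the value is an Integer in 0..UINT64_MAX.
def fits_uint64?(nanos)
  nanos.is_a?(Integer) && nanos >= 0 && nanos <= UINT64_MAX
end

now_nanos = to_unix_nano(Time.now)

# A present-day timestamp in nanoseconds is well inside the range.
puts "now fits: #{fits_uint64?(now_nanos)}"

# A value that was already in nanoseconds and gets scaled a second time
# (an assumed failure mode, for illustration) no longer fits.
puts "double-scaled fits: #{fits_uint64?(now_nanos * 1_000_000_000)}"
```

If a check like this fails for timestamps your app passes to spans or events, that value would be a likely candidate for the encode error.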

@manasaheggere
Author

Thanks for the response @kaylareopelle

We have only just started using the OTLP exporter; we were not using it before.

No, we have not used span.add_event.

We tried creating spans with the code below. This approach gives the same error:

require "opentelemetry/sdk"

def track_extended_warranty(extended_warranty)
  # Get the current span
  current_span = OpenTelemetry::Trace.current_span

  # And add useful stuff to it!
  current_span.add_attributes({
    "com.extended_warranty.id" => extended_warranty.id,
    "com.extended_warranty.timestamp" => extended_warranty.timestamp
  })
end

require "opentelemetry/sdk"

def do_work
  MyAppTracer.in_span("do_work") do |span|
    # do some work that the 'do_work' span tracks!
  end
end

@kaylareopelle
Contributor

Hi @manasaheggere, thanks for your responses! I'm not able to reproduce the error yet with the provided code.

We'll need to reproduce the error outside of your environment to debug further.

Here's a walkthrough on how to create a minimal, reproducible example: https://stackoverflow.com/help/minimal-reproducible-example

This is the code I've used to test so far: https://gist.github.com/kaylareopelle/ef261c12e3a3e1c2cce59a25050c23f0
I'm running this script, while simultaneously running the OTLP collector available here. To run this collector, clone the opentelemetry-ruby repo, enter the examples/otel-collector directory, and run docker compose up to start the OTel Collector.

Can you update the gist and/or create a different reproduction script that raises the same error?

@manasaheggere
Author

Hi @kaylareopelle

We cloned the open-source opentelemetry-ruby repository and ran it; we see only warnings and no traces.
We have also created a sample application to reproduce the issue.
With the same code, we can see the traces locally in the Jaeger UI.

Even in the actual project, we do not see this issue locally.
When we deploy the same code to pre-prod, we see the errors below.

ERROR -- : OpenTelemetry error: unexpected error in OTLP::Exporter#encode - bignum too big to convert into `unsigned long long' - /home/circleci/.rubygems/gems/opentelemetry-exporter-otlp-0.29.1/lib/opentelemetry/exporter/otlp/exporter.rb:326:in `initialize'
ERROR -- : OpenTelemetry error: Unable to export 61 spans

@arielvalentin
Contributor

Is there anything else you can share with us?

Machine details like CPU architecture?

Are you deploying using containers? If so what is the image you're using?

Can you share the contents of the lock file that is generated and what version of protobuf is being installed?
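The details requested above can be collected from inside the running container with a short script. This is a minimal sketch that uses only the Ruby standard library; `environment_report` is a hypothetical helper name, and the gem versions are reported only when those gems are already loaded in the process (for example, when run from a Rails console), otherwise they show as nil.

```ruby
require 'rbconfig'

# Hypothetical helper: gather CPU architecture, pointer width, and the
# versions of the gems involved in the failing encode call.
def environment_report
  {
    ruby_version: RUBY_VERSION,
    platform: RUBY_PLATFORM,
    host_cpu: RbConfig::CONFIG['host_cpu'],
    # 'J' packs a native pointer-width unsigned integer: 4 or 8 bytes.
    pointer_bits: [0].pack('J').bytesize * 8,
    # nil when the gem has not been loaded in this process.
    protobuf_version: Gem.loaded_specs['google-protobuf']&.version&.to_s,
    otlp_exporter_version: Gem.loaded_specs['opentelemetry-exporter-otlp']&.version&.to_s
  }
end

environment_report.each { |key, value| puts "#{key}: #{value.inspect}" }
```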

@manasaheggere
Author

Hi @arielvalentin

We are deploying our application's Docker image to AWS using ArgoCD and Kubernetes.
We are using protobuf version 4.29.0. Please find the attached Gemfile.lock.
lock_file.zip

@kaylareopelle
Contributor

Hi @manasaheggere, thanks for sharing details around your architecture and your lock file.

We discussed this issue during the SIG and think we may need some additional logging around the exporter to try to get a better sense of what the data that's being rejected looks like. I have some code with extra logging I'd like you to add to your app.

I have two options for how you can add it:

  1. I created a branch I'd like you to use to install the OTLP exporter. To install it, update the line in your Gemfile for opentelemetry-exporter-otlp to:
gem 'opentelemetry-exporter-otlp', github: 'kaylareopelle/opentelemetry-ruby', branch: 'debug-unsigned-long-long', glob: 'exporter/otlp/*.gemspec'
  2. Alternatively, you can monkey patch your exporter by adding the code in this gist to your opentelemetry.rb file before you call OpenTelemetry::SDK.configure.

Could you run the exporter with the additional logging code in the environment where the error is raised and share the logs with us? Please remove any information that might be considered sensitive from the file before you post.

@arielvalentin
Contributor

> Hi @arielvalentin
>
> We are deploying our application docker image in AWS cloud using ArgoCD and kubernites.
>
> We have used protobuf 4.29.0 version. Please find the attached Gemfile.lock.
>
> lock_file.zip

What is the machine architecture? 32 or 64 bit?

@manasaheggere
Author

Hi @kaylareopelle

I followed approach 1 and have attached a screenshot of the log output. There are more log lines similar to those in the screenshot; please let me know if you need any specific ones.
Screenshot 2024-12-18 at 5 13 54 PM

We are also getting one more error:
ERROR -- : OpenTelemetry error: unexpected configuration error due to attribute values must be (array of) strings, integers, floats, or booleans - OpenTelemetry::SDK::ConfigurationError - /home/circleci/.rubygems/gems/opentelemetry-sdk-1.6.0/lib/opentelemetry/sdk.rb:69:in `rescue in configure'

Hi @arielvalentin

Machine architecture is 64bit.
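The second error above says that attribute values must be strings, integers, floats, booleans, or arrays of those. One way to rule out application-supplied attributes (such as the timestamp passed to add_attributes earlier in this thread) is to normalize values before they reach the span. The sketch below is an illustration only: `sanitize_attributes` and its helpers are hypothetical names, it assumes Time values are acceptable as ISO 8601 strings, and it does not enforce that arrays are homogeneous.

```ruby
require 'time'

# Value types named in the error message above.
PRIMITIVE_TYPES = [String, Integer, Float, TrueClass, FalseClass].freeze

def primitive?(value)
  PRIMITIVE_TYPES.any? { |klass| value.is_a?(klass) }
end

# Hypothetical helper: coerce a single value into an allowed type.
def sanitize_attribute(value)
  case value
  when Time then value.utc.iso8601(6) # assumption: a string timestamp is acceptable
  when Array then value.map { |item| primitive?(item) ? item : item.to_s }
  else primitive?(value) ? value : value.to_s
  end
end

# Hypothetical helper: drop nil values, stringify keys, coerce values.
def sanitize_attributes(attributes)
  attributes
    .reject { |_key, value| value.nil? }
    .to_h { |key, value| [key.to_s, sanitize_attribute(value)] }
end

puts sanitize_attributes(
  'com.extended_warranty.id' => 42,
  'com.extended_warranty.timestamp' => Time.at(0)
).inspect
```

With a helper like this, the earlier call could become `current_span.add_attributes(sanitize_attributes(...))`.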

@arielvalentin
Contributor

Can you share your Gemfile.lock?

@manasaheggere
Author

Hi @arielvalentin

Please find the attached Gemfile.lock
gemfile_lock.zip

@manasaheggere
Author

Hi Team,
Is there any update on a fix for this?

@arielvalentin
Contributor

We've run into the end-of-year holidays here, so I don't think anyone has taken a closer look since then.

In the meantime, you may want to add a custom error handler. It should give you, and us, more detailed diagnostic information about what is happening there.
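A minimal sketch of such a handler follows. It assumes the API gem's `OpenTelemetry.error_handler=` setter, which takes a callable invoked with `exception:` and `message:` keyword arguments; `build_error_handler` is a hypothetical helper name, and the 20-line backtrace limit is an arbitrary choice. The assignment is guarded so the snippet also runs where the OpenTelemetry gems are not loaded.

```ruby
require 'logger'

# Hypothetical helper: build a handler that logs the exception class,
# message, and the top of the backtrace alongside the SDK's own message.
def build_error_handler(logger)
  lambda do |exception: nil, message: nil|
    lines = ["OpenTelemetry error: #{message}"]
    if exception
      lines << "#{exception.class}: #{exception.message}"
      # backtrace is nil for exceptions that were never raised.
      lines.concat(Array(exception.backtrace).first(20))
    end
    logger.error(lines.join("\n"))
  end
end

# Install the handler only when the OpenTelemetry API is available.
if defined?(OpenTelemetry) && OpenTelemetry.respond_to?(:error_handler=)
  OpenTelemetry.error_handler = build_error_handler(Logger.new($stderr))
end
```

Placing this in opentelemetry.rb before the call to OpenTelemetry::SDK.configure would let it capture configuration errors as well as export errors.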
