Polling the HTTP /lag endpoint causes strange results #833

Open
alexchowle opened this issue Nov 28, 2024 · 1 comment

alexchowle commented Nov 28, 2024

This may be my limited understanding, but it's worth writing down to see whether I'm misguided or have found an issue. My setup is one Burrow instance monitoring a single Kafka cluster. I have a polling script that periodically calls the /lag endpoint and forwards the retrieved JSON payload to a time-series DB for visualisation and alerting.

After roughly 36 hours, I start to see a large increase in the calculated current_lag value per topic partition. I can find no natural explanation for this lag, such as an increase in messages produced or a slowdown in the consumers within the group. What I have found is:

  • Restarting the poller makes the current_lag reset to zero.
  • Starting a second poller alongside causes different current_lag values to be reported from the first poller.

To be clear: I have no fancy logic in the pollers - they simply perform an HTTP GET on v3/kafka/$CLUSTER/consumer/$CONSUMER_GROUP/lag and report the current_lag per partition.
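
For illustration, a poller of the kind described could look roughly like the sketch below. It is not the actual script: the function names, the base URL, and the assumed response shape (a top-level "status" object holding a "partitions" list whose entries carry "topic", "partition" and "current_lag") are assumptions to check against a real response from your Burrow version. The point is that the poller keeps no state between calls.

```python
import json
import urllib.request
from urllib.parse import quote


def build_lag_url(base_url, cluster, group):
    """Build the consumer group lag URL described in the issue."""
    return (
        f"{base_url.rstrip('/')}/v3/kafka/{quote(cluster, safe='')}"
        f"/consumer/{quote(group, safe='')}/lag"
    )


def extract_partition_lag(payload):
    """Map (topic, partition) to current_lag.

    Assumes the payload shape noted above; entries without a
    current_lag field are skipped rather than guessed at.
    """
    partitions = (payload.get("status") or {}).get("partitions") or []
    return {
        (p["topic"], p["partition"]): p["current_lag"]
        for p in partitions
        if "current_lag" in p
    }


def poll_once(base_url, cluster, group, timeout=10.0):
    """Perform a single stateless HTTP GET and return per-partition lag."""
    url = build_lag_url(base_url, cluster, group)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    return extract_partition_lag(payload)
```

Calling something like poll_once("http://localhost:8000", "my-cluster", "my-group") on a timer (host, port and names here are placeholders) and forwarding the result is all the poller does, so a restart of the poller should not by itself change the numbers Burrow returns.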

Have I misconfigured something?

This is Burrow v1.8.0

alexchowle (Author) commented

It's as if observing Burrow changes the results.
