-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add option for metrics sampling #16
Comments
Hi @raxod502-plaid thanks for opening this issue. Yes, you're right; 50ms is hard-coded in k6 itself and this extension just re-sends everything. This issue is known and somewhat implied in #2. The main problem is that meaningful work in this extension is currently blocked by plans for metrics refactoring in k6. But given not fully clear timeline for that refactoring, perhaps, adding some workaround here would be good. Personally I'd prefer if we had a slightly smarter aggregation, even for a workaround... Would you be willing to add a simple configuration option together with this change? E.g. |
I can work on that - one concern I have though is, after implementing the workaround above, we see the following memory utilization pattern for k6: With this plugin disabled, the memory usage is stable at only a few GB. Do you know what's up with that? Presumably, metrics are somehow not getting freed, or the amount of data churn is so large the GC cannot keep up. |
This is interesting. Going by the snippet you posted before, the logic could definitely be improved by moving the check for |
Okay - I've been working on this integration on and off for many weeks now, so I'm going to take a break, but will make a note of this internally for when the work is picked back up. |
Any news on this issue? |
Nothing on my side. The project has been deprioritized for now, unfortunately. I still think it is a needed feature, though. |
Not sure why GitHub auto-closed that. Reopened. |
Wtf. |
The major issues with this have been resolved. The output aggregates all the samples of the same series seen during the same configured push interval. I keep it open mostly for tracking eventual feedback on the case to configure a different aggregation interval from the push interval. |
We've noticed that this plugin sends metrics to Prometheus that are sampled every 50ms. This results in an absolutely monumental amount of data volume being ingested, which overloads our system. All of our other services have metrics that are sampled at around 30s intervals. Even 1s would be fine, but 50ms is complete overkill since that level of detail will never show up in query results or a dashboard.
We've worked around the problem in our fork of xk6-output-prometheus-remote by applying the following diff:
This is clearly a hack, as it would be better to just set a specific time interval and have the sampling done intelligently, rather than hardcoding to once per remote write. Also, this implementation sends the oldest datapoint per interval, rather than the most recent which is probably what is desired. But, Prometheus simply can't manage with hundreds of thousands of time series generated by a 50ms sample interval, so we needed to do something to get the plugin working.
Would a change to make the sampling interval customizable be accepted upstream?
The text was updated successfully, but these errors were encountered: