Guidance for remote write time > flush period #10
Comments
Hi @jdheyburn, I believe you're right. Other than experimenting with configuration options and the flush period, another thing that should speed up the push of metrics is dropping tags completely, but that does mean losing information.
At the moment, the extension starts to drop samples in this situation. Holding samples in memory is not feasible because they grow very quickly, and writing them to any kind of storage defeats the purpose here. Other approaches can be tried, of course, but I would personally recommend against spending too much time on that at this point, because the current implementation is not as efficient as it could be. This is a known problem that depends on ongoing work in k6 itself and is described in open issues #2 and #3. Hopefully, we'll be able to start resolving this in the next year, once k6 has some new implementations merged in. Hope that helps 🙂
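To make the memory concern concrete, here is a rough back-of-the-envelope sketch. It is not the extension's code: the function name `backlog_after` and the model behind it (one batch of fixed size produced per flush period, one batch sent per remote write call) are assumptions made purely for illustration.

```python
import math


def backlog_after(flushes, samples_per_flush, flush_period_s, write_time_s):
    """Estimate how many samples would sit in memory if nothing were dropped.

    Hypothetical model: every flush period produces one batch of
    `samples_per_flush` samples, and each remote write call sends exactly
    one batch and takes `write_time_s` seconds.
    """
    produced = flushes * samples_per_flush
    elapsed_s = flushes * flush_period_s
    # Number of whole batches the writer could have pushed in the elapsed time.
    batches_sent = math.floor(elapsed_s / write_time_s)
    # The writer can never send more than has been produced.
    sent = min(produced, batches_sent * samples_per_flush)
    return produced - sent
```

Under these assumptions, with a 1s flush period, a 3s write time, and an assumed 1000 samples per flush, `backlog_after(60, 1000, 1, 3)` gives 40000 buffered samples after one minute, and the figure keeps growing for as long as the test runs. When the write time is below the flush period, the backlog stays at zero.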
Sorry for the late reply. Thanks for the insight. I wasn't able to rely on the Prometheus metrics in the end. I am currently using it to load test Redis with approximately 20-30 commands, each with its own latency metrics being captured, and a 1s flush period. I think this is just too many metrics to store, and it defeats the original purpose of the remote_write endpoint. A possibility could be to allow Prometheus to scrape k6 for the metrics itself? That way we could mitigate the ingestion overload. I'm not requesting it, since it would require a redesign, but I'm interested in hearing why the current implementation was picked :)
Hi @jdheyburn, As for the history of Prometheus vs k6, AFAIK, it is a long one and I suggest looking at the following:
Some comments there could help with an answer, but in short: k6 itself cannot be viewed as a server but rather as an instrumentation tool with its own complexities and limitations, and a scraping endpoint is one of those things that falls outside of those limitations, at least for the foreseeable future. Outputs are simply less restrictive and far easier to add in k6 than a scraping endpoint. This extension is essentially a response to the above issue: a way to have native k6 support for Prometheus right now. Granted, it can definitely be improved a lot with further metrics refactoring in k6, such as solving grafana/k6#1831. Hope that helps with understanding the status quo 🙂
Olha, I admire the detail of your replies, not just on this issue but on others I've seen as well. You're an asset to the community and I thank you! The rationale makes complete sense now. 👍🏻 I am not using the xk6-redis extension, since it provides limited commands. I am using a custom Go extension, and my entire load-testing process is inspired by the same work from GitLab. Taken from a tcpdump in production, there are several commands being captured, in fact there are 41, so I imagine there are a lot of metrics to capture. I recently did an Edit: Sorry, I lied - I am using xk6-redis 🙂
I just stumbled across https://github.com/szkiba/xk6-prometheus, which looks like it opens k6 for Prometheus scraping.
Thank you for your kind words, Joseph. Yes,
Just came to say I managed to get metrics out successfully with xk6-prometheus, but I'll keep an eye on this extension too so that I can make a true comparison. I have a
All the major dependencies for this have been resolved. This should now only happen if the server is really experiencing heavy load or in case of network faults. So, if it still happens under low load, then a bug should be reported.
Hey again,
During our load testing we are hitting the "Remote write took Ys while flush period is Xs" log message, and so samples are likely being dropped. In our setup we are writing directly to Prometheus with the remote-write-receiver feature.
I noticed on the README that this sentence refers to the remote_write.queue_config for tuning. However, this configuration can only be applied when Prometheus (or the target agent) is configured for publishing to a remote write endpoint, since queue_config is a subset of remote_write, where remote_write.url is a required field.
Is my understanding of this correct?
For our use case, we don't necessarily need the metrics in real time. Would it be possible to have the k6 metrics inserted sequentially so that if remote write receiver latency > flush period, the extension would keep hold of samples until all are published?
Thanks!
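The sequential behaviour asked about above can be sketched as follows. This is a minimal illustration, not how the extension works: `SequentialPublisher` and the `send` callable are hypothetical names, and the queue is deliberately unbounded, so memory use grows whenever `send` is slower than the flush period.

```python
import queue
import threading


class SequentialPublisher:
    """Hold every flushed batch until it has been sent, in arrival order.

    `send` is a hypothetical callable that pushes one batch to the remote
    write endpoint; it is an assumption for this sketch.
    """

    def __init__(self, send):
        self._send = send
        self._pending = queue.Queue()  # unbounded: nothing is ever dropped
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def flush(self, batch):
        # Called once per flush period; returns immediately.
        self._pending.put(batch)

    def _run(self):
        while True:
            batch = self._pending.get()
            try:
                if batch is None:  # sentinel placed by close()
                    return
                self._send(batch)
            finally:
                self._pending.task_done()

    def close(self):
        # Block until every queued batch has been handed to `send`.
        self._pending.put(None)
        self._pending.join()
        self._worker.join()
```

A single worker thread keeps the batches in order. The trade-off is that `close()` may block for a long time after the test ends, and the backlog held in memory has no upper limit.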