Does s3proxy expose any meaningful metrics? #780

Open
musabshak opened this issue Feb 3, 2025 · 4 comments
Comments

@musabshak

Something to the effect of:

  • data_served
  • num_requests (by method / path)
  • api rx/tx?

I know the answer lies in a quick read-through of the source code, but my Java code-navigation abilities are not super refined at the moment.

@gaul
Owner

gaul commented Feb 3, 2025

I'm happy to add a feature like this but want to follow other servers as a template. Someone proposed adding a /statusz endpoint in #763 so maybe this would be a good place to expose statistics?

@musabshak
Author

follow other servers as a template

By "other servers", I assume you mean other Java applications that expose metrics, i.e. you don't want to reinvent the wheel? That's totally reasonable, and preferred of course. Unfortunately, I can't point you to a Java example off the top of my head.

/statusz endpoint PR

This looks like a simple request for a /healthz endpoint to indicate node liveness, presumably meant to be used by something like a k8s liveness / readiness probe. What I have observed is:

  • Often, apps expose Prometheus metrics at a /metrics endpoint
  • This /metrics endpoint may also be used for the liveness / readiness probe

This is, of course, assuming that you're not doing anything non-trivial to determine "is my application up and healthy?" (for example, probing the GCS endpoint to ensure GCS is healthy or, in other contexts, verifying the health of DB connections etc.)
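
To make the pattern above concrete, here is a minimal sketch of a /metrics endpoint that can also serve as a trivial liveness target. It uses only the JDK's built-in com.sun.net.httpserver package. Everything in it is an assumption for illustration: the class name MetricsEndpointSketch, the metric name sketch_up, and the loopback bind address are made up, and none of it is taken from s3proxy's code.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical standalone sketch; these names do not exist in s3proxy.
class MetricsEndpointSketch {

    // Renders a single gauge in the Prometheus text exposition format.
    // "sketch_up" is an illustrative metric name, not an established one.
    static String render() {
        return "# TYPE sketch_up gauge\n"
                + "sketch_up 1\n";
    }

    // Starts an HTTP server on the loopback interface; port 0 picks a free port.
    static HttpServer start(int port) throws IOException {
        HttpServer server =
                HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        server.createContext("/metrics", exchange -> {
            byte[] body = render().getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set(
                    "Content-Type", "text/plain; version=0.0.4; charset=utf-8");
            // A 200 response is all a simple HTTP liveness probe needs to see.
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }
}
```

Calling MetricsEndpointSketch.start(8081) and fetching http://127.0.0.1:8081/metrics should return the two-line body with status 200. An HTTP-based liveness probe pointed at that path would then pass whenever the process can answer requests, which is exactly the trivial notion of "healthy" described above; it says nothing about whether the backing store is reachable.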

@musabshak
Author

musabshak commented Feb 3, 2025

For some context that may or may not be helpful: I am investigating an exact substitute for the Minio gateway [1]. Until 2022, Minio supported proxying to GCS (and other S3-style APIs). We still use that outdated version of Minio to proxy requests to GCS (serving about 40 TiB/mo of data from GCS -> on-prem).

Minio (in gateway mode) has been mostly flawless over the past 4 or so years. It exposes useful metrics. It has been performant. It has caching.

I am trying to adapt the existing version of s3proxy to try and replace Minio in our infra. Since s3proxy doesn't currently have caching, I'm looking into adding a front-end HTTP cache (Varnish) in front of s3proxy to deal with the caching RFE.

If we end up adopting s3proxy in production, this metrics RFE will be pretty important, for observability reasons.

[1] https://blog.min.io/deprecation-of-the-minio-gateway/

@gaul
Owner

gaul commented Feb 3, 2025

I'm happy to merge a Minio-style /metrics endpoint but won't have the cycles to do this myself. Could you raise a PR for this? The three sets of metrics you suggest sound like a good start.

#140 tracks caching which you already found. A minimal read-only version is easy to implement but I don't know what Minio did.
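
For illustration only, the three suggested metric sets (data served, request counts by method, and rx/tx bytes) could be tracked with a handful of thread-safe counters and rendered in the Prometheus text format that a /metrics endpoint typically returns. This is a rough sketch under stated assumptions, not s3proxy code: the class ProxyMetricsSketch, its record and render methods, and the sketch_* metric names are all hypothetical, and label values are assumed to be plain HTTP method names that need no escaping.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch; class, method, and metric names are illustrative only.
class ProxyMetricsSketch {
    // Sorted map keeps the rendered output in a stable order.
    private final Map<String, LongAdder> requestsByMethod =
            new ConcurrentSkipListMap<>();
    private final LongAdder bytesReceived = new LongAdder();
    private final LongAdder bytesSent = new LongAdder();

    // Called once per completed request with the bytes read from and
    // written to the client.
    void record(String method, long rxBytes, long txBytes) {
        requestsByMethod.computeIfAbsent(method, m -> new LongAdder()).increment();
        bytesReceived.add(rxBytes);
        bytesSent.add(txBytes);
    }

    // Renders all counters in the Prometheus text exposition format.
    // Assumes method names contain no characters that require escaping.
    String render() {
        StringBuilder sb = new StringBuilder();
        sb.append("# TYPE sketch_requests_total counter\n");
        for (Map.Entry<String, LongAdder> e : requestsByMethod.entrySet()) {
            sb.append("sketch_requests_total{method=\"")
              .append(e.getKey())
              .append("\"} ")
              .append(e.getValue().sum())
              .append('\n');
        }
        sb.append("# TYPE sketch_received_bytes_total counter\n");
        sb.append("sketch_received_bytes_total ")
          .append(bytesReceived.sum()).append('\n');
        sb.append("# TYPE sketch_sent_bytes_total counter\n");
        sb.append("sketch_sent_bytes_total ")
          .append(bytesSent.sum()).append('\n');
        return sb.toString();
      }
}
```

In this sketch, "data served" corresponds to sketch_sent_bytes_total. A per-path breakdown is left out on purpose: object keys are unbounded, so using them as label values would likely produce very high label cardinality; a per-bucket label might be a more practical compromise.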
