Missing Metrics starting in 1.4.0? #1686

peakematt · 2024-11-11T20:50:54Z

What steps did you take and what happened:

I'm upgrading from v1.3.1 to v1.4.6. When I performed this upgrade in a development environment, all pods came up cleanly, but I noticed that some metrics, such as total_node_publish_error, disappeared from our monitoring tool. I found the changelog entry for v1.4.0 indicating the metric was renamed, and updated our monitoring tool to expect node_publish_error_total instead of total_node_publish_error. However, even after handling the rename, the metric still isn't available. Other metrics, such as rotation_reconcile_duration_sec, are still reported on both versions.

If I port-forward :8095 on the pod to localhost and view the /metrics page, none of the node_* metrics are shown. To troubleshoot, I tried reverting versions to see where these metrics were lost. The last version where I see these metrics is v1.3.4. It seems like something happened in v1.4.0 that dropped them. Is there a way to get them back?

What did you expect to happen:

The node_publish_total, node_unpublish_total, node_publish_error_total, node_unpublish_error_total, and sync_k8s_secret_total metrics to be available on :8095/metrics.
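
For anyone trying to reproduce this, here is a rough sketch of how the check could be scripted instead of eyeballing the /metrics page. It assumes the port-forward to localhost:8095 described above is running and that the endpoint serves the standard Prometheus text format. The helper names (metric_names, missing_metrics, fetch) are made up for this example and are not part of the driver. It also assumes exact metric names; if the exporter adds a prefix or suffix, the comparison would need to be loosened.

```python
import re
from urllib.request import urlopen

# Metric names taken from this report.
EXPECTED = [
    "node_publish_total",
    "node_unpublish_total",
    "node_publish_error_total",
    "node_unpublish_error_total",
    "sync_k8s_secret_total",
]

# A sample line starts with the metric name, followed by "{labels}" or a space.
_SAMPLE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{|\s)")


def metric_names(exposition_text):
    """Return the set of metric names that have at least one sample line."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and "# HELP" / "# TYPE" comments
        match = _SAMPLE.match(line)
        if match:
            names.add(match.group(1))
    return names


def missing_metrics(exposition_text, expected=EXPECTED):
    """Return the expected names absent from the scrape, in the order given."""
    present = metric_names(exposition_text)
    return [name for name in expected if name not in present]


def fetch(url="http://localhost:8095/metrics"):
    """Fetch the exposition text; assumes the port-forward is active."""
    with urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")
```

With the port-forward running, `print(missing_metrics(fetch()))` lists whichever of the expected metrics have no samples, which makes it easier to compare v1.3.4 against v1.4.x.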

Even if some of these metrics aren't created until there are values to report (i.e. they default to null rather than 0), I would still expect to have at least node_publish_error_total. When I revert to v1.3.4 in our dev environment, this metric is immediately published with value 1 (which is a different issue I need to investigate 😅 ).

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

Which provider are you using:
[e.g. Azure Key Vault, HashiCorp Vault, etc. Have you checked out the provider's repo for more help?]

GCP. I have checked https://github.com/GoogleCloudPlatform/secrets-store-csi-driver-provider-gcp/issues?q=is:issue+metrics and don't see any related issues.

Environment:

Secrets Store CSI Driver version: (use the image tag): v1.4.0
Kubernetes version: (use kubectl version):

Client Version: v1.29.7
Server Version: v1.29.9-gke.1496000

peakematt added the kind/bug Categorizes issue or PR as related to a bug. label Nov 11, 2024
