[receiver/azuremonitorreceiver] feat: Allow to not split result by dimension #36240

celian-garcia · 2024-11-06T15:29:24Z

Description

Currently there is a mechanism allowing to split the result by dimension, thanks to a filter param hack.
This is great that all the dimensions are collected as labels in the metrics, but for some resources types it could be unwanted. In cause some concern about cardinality, or continuity of the queries between one version to another (e.g with prometheus exporter, if one does not do a "sum by ..." and the otelcol version is updated, the query can display different results)

To mitigate that I propose to put an optout for this collection so if we want for some resource types and not for others, we can create two separate receivers for example. Or we completely opt out if we don't want additional labels.

Edit:

After a first review, we agreed on the fact that it could do a bit more like allowing us to specify a list of dimensions for a particular metric.
e.g from added documentation

receivers:
  azuremonitor:
    dimensions:
      enabled: true
      overrides:
        "Microsoft.Network/azureFirewalls":
          # Real example of an Azure limitation here:
          # Dimensions exposed are Reason, Status, Protocol,
          # but when selecting Protocol in the filters, it returns nothing.
          # Note here that the metric display name is ``Network rules hit count`` but it's programmatic value is ``NetworkRuleHit``
          # Ref: https://learn.microsoft.com/en-us/azure/azure-monitor/reference/supported-metrics/microsoft-network-azurefirewalls-metrics
          "NetworkRuleHit": [Reason, Status]

Without this config you won't have the azure_networkrulehit_average_Count metric at all in your results
But with the config we have it with reason and status labels:

# HELP azure_networkrulehit_average_Count
# TYPE azure_networkrulehit_average_Count gauge
azure_networkrulehit_average_Count{azuremonitor_resource_id="/subscriptions/<redacted>/resourceGroups/<redacted>/providers/Microsoft.Network/azureFirewalls/<redacted>",location="<redacted>",metadata_reason="<redacted>",metadata_status="Allow",name="<redacted>",resource_group="<redacted>",type="Microsoft.Network/azureFirewalls"} 21.875
azure_networkrulehit_average_Count{azuremonitor_resource_id="/subscriptions/<redacted>/resourceGroups/<redacted>/providers/Microsoft.Network/azureFirewalls/<redacted>",location="<redacted>",metadata_reason="<redacted>",metadata_status="Deny",name="<redacted>",resource_group="<redacted>",type="Microsoft.Network/azureFirewalls"} 10

Link to tracking issue

Fixes #36611

Testing

Documentation

linux-foundation-easycla · 2024-11-06T15:29:31Z

The committers listed above are authorized under a signed CLA.

✅ login: celian-garcia / name: Célian GARCIA (9709d07)

celian-garcia · 2024-11-18T13:00:16Z

Hey @codeboten, @nslaughter, @jpkrohling. Any opinion on that? I feel it's a no brainer so if you can run the pipeline and have a look, it would be cool!

tesharp · 2024-11-20T11:41:35Z

We have noticed that some metrics do not return any data if you include dimensions that do not have any data, are empty or nil (assumption based on results we have seen). For instance Azure Firewall Network Rule Hits have 3 dimensions Status, Reason and Protocol. If filtering on Status and Reason we get back data, but as soon as you add Protocol it returns no data at all. It returns no data if you only try to filter on Protocol too. Seen same for other metrics that have "empty" dimension values.

Would it maybe make sense to not only opt in / out but set which dimensions you would like to filter on? We would still want Reason and Status but can drop Protocol dimension

celian-garcia · 2024-11-20T13:07:46Z

We have noticed that some metrics do not return any data if you include dimensions that do not have any data, are empty or nil (assumption based on results we have seen). For instance Azure Firewall Network Rule Hits have 3 dimensions Status, Reason and Protocol. If filtering on Status and Reason we get back data, but as soon as you add Protocol it returns no data at all. It returns no data if you only try to filter on Protocol too. Seen same for other metrics that have "empty" dimension values.

Would it maybe make sense to not only opt in / out but set which dimensions you would like to filter on? We would still want Reason and Status but can drop Protocol dimension

That's interesting. Then for that particular metric, using split_by_dimensions config field would allow you to receive data but you would lose the status and reason granularity. I just checked and indeed we have the same result even in the Azure Portal UI.

Selecting Network rules hit count and Split By = Protocol gives no result.
But for some metrics, like Application rules hits count, you have results split by http/https.
Also the SNAT Port utilization, you have results split by tcp/udp.

I believe this is a bug on the Network rules hit count definition that shouldn't propose protocol actually. I'm not sure that only this reason deserve a hack on our side, but still this feature would be interesting to me. Like it would be interesting to exclude a particular dimensions that would have too big cardinality or that we wouldn't be interested in.

Let me propose a config structure:

dimensions:
  # default to true to not introduce breaking change. This would mean that all the available dimensions will be collected, except if an exclusion exist.
  enabled: true | false 
  exclusions:
    "Microsoft.Network/azureFirewalls": # service name
        "Network rules hit count": # metric name
          - "Protocol"

We can also implement it the other way

dimensions:
  enabled: true 
  overrides:
    "Microsoft.Network/azureFirewalls":
        "Network rules hit count": [Reason, Status]

WDYT?

tesharp · 2024-11-20T13:18:00Z

I also think this is a bug in the api not only affecting this metric. We noticed it yesterday at 11:10 when suddenly all Postgresql flexible databases stoped reporting cpu, memory, storage etc. It had been working before that filtering on ServerName, but suddenly doesn't work anymore.

Nevertheless it seems like a good feature when you do not need the granularity with all dimensions. I think then the last option makes more sense to override and specify which dimensions you need.

celian-garcia · 2024-11-20T14:36:12Z

I also think this is a bug in the api not only affecting this metric. We noticed it yesterday at 11:10 when suddenly all Postgresql flexible databases stoped reporting cpu, memory, storage etc. It had been working before that filtering on ServerName, but suddenly doesn't work anymore.

Nevertheless it seems like a good feature when you do not need the granularity with all dimensions. I think then the last option makes more sense to override and specify which dimensions you need.

Okay I will make an update.

ahurtaud · 2024-11-21T03:56:18Z

Nevertheless it seems like a good feature when you do not need the granularity with all dimensions. I think then the last option makes more sense to override and specify which dimensions you need.

I agree, second proposed config format is better to me

celian-garcia · 2024-11-21T14:28:03Z

ping @tesharp I finished and pushed and I confirm that before this new overrides, the network rule hit metric was not even in the result ! Now it works well 💪

Note it's probably worth it that I move these dimensions functions out of scraper.go file. As if I make scraper_batch right after, I will reuse them..

Edit: moved to a dedicated dimensions.go files + added some unit tests

tesharp · 2024-11-22T08:37:52Z

Nice :) Looks good.

MovieStoreGuy · 2024-11-27T03:24:09Z

Please make sure you run the linters :)

celian-garcia · 2024-11-27T08:56:25Z

Please make sure you run the linters :)

Thanks @MovieStoreGuy, sorry for that. It should be fine now.

github-actions · 2024-12-25T05:21:25Z

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions · 2025-01-09T05:20:56Z

This PR was marked stale due to lack of activity. It will be closed in 14 days.

…mension Signed-off-by: Célian Garcia <[email protected]>

celian-garcia requested review from codeboten and a team as code owners November 6, 2024 15:29

github-actions bot assigned jpkrohling Nov 6, 2024

github-actions bot added the receiver/azuremonitor label Nov 6, 2024

github-actions bot requested a review from nslaughter November 6, 2024 15:29

celian-garcia changed the title ~~feat: Allow to not split result by dimension~~ [receiver/azuremonitorreceiver] feat: Allow to not split result by dimension Nov 6, 2024

celian-garcia force-pushed the split-dimensions-optout branch 4 times, most recently from aa46e6e to 4f70e3d Compare November 6, 2024 16:06

celian-garcia mentioned this pull request Nov 20, 2024

[azuremonitorreceiver] Add the ability to pull Resource Metrics using AZ Query Batch API #29593

Closed

celian-garcia force-pushed the split-dimensions-optout branch 2 times, most recently from 1d2a929 to 7bd6cad Compare November 21, 2024 12:57

celian-garcia force-pushed the split-dimensions-optout branch from 7bd6cad to 2191959 Compare November 21, 2024 15:16

celian-garcia force-pushed the split-dimensions-optout branch from 2191959 to 72df93d Compare November 27, 2024 08:42

celian-garcia force-pushed the split-dimensions-optout branch 2 times, most recently from 221eded to 514efd8 Compare December 2, 2024 08:58

celian-garcia force-pushed the split-dimensions-optout branch 2 times, most recently from 438ed36 to 2fad09b Compare December 10, 2024 20:27

github-actions bot added the Stale label Dec 25, 2024

celian-garcia force-pushed the split-dimensions-optout branch from 2fad09b to 3649244 Compare December 25, 2024 10:34

github-actions bot removed the Stale label Dec 26, 2024

github-actions bot added the Stale label Jan 9, 2025

[receiver/azuremonitorreceiver] feat: Allow to not split result by di…

9709d07

…mension Signed-off-by: Célian Garcia <[email protected]>

celian-garcia force-pushed the split-dimensions-optout branch from 3649244 to 9709d07 Compare January 9, 2025 06:20

github-actions bot removed the Stale label Jan 10, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[receiver/azuremonitorreceiver] feat: Allow to not split result by dimension #36240

[receiver/azuremonitorreceiver] feat: Allow to not split result by dimension #36240

celian-garcia commented Nov 6, 2024 •

edited

Loading

linux-foundation-easycla bot commented Nov 6, 2024 •

edited

Loading

celian-garcia commented Nov 18, 2024

tesharp commented Nov 20, 2024

celian-garcia commented Nov 20, 2024

tesharp commented Nov 20, 2024

celian-garcia commented Nov 20, 2024

ahurtaud commented Nov 21, 2024

celian-garcia commented Nov 21, 2024 •

edited

Loading

tesharp commented Nov 22, 2024

MovieStoreGuy commented Nov 27, 2024

celian-garcia commented Nov 27, 2024

github-actions bot commented Dec 25, 2024

github-actions bot commented Jan 9, 2025

[receiver/azuremonitorreceiver] feat: Allow to not split result by dimension #36240

Are you sure you want to change the base?

[receiver/azuremonitorreceiver] feat: Allow to not split result by dimension #36240

Conversation

celian-garcia commented Nov 6, 2024 • edited Loading

Description

Edit:

Link to tracking issue

Testing

Documentation

linux-foundation-easycla bot commented Nov 6, 2024 • edited Loading

celian-garcia commented Nov 18, 2024

tesharp commented Nov 20, 2024

celian-garcia commented Nov 20, 2024

tesharp commented Nov 20, 2024

celian-garcia commented Nov 20, 2024

ahurtaud commented Nov 21, 2024

celian-garcia commented Nov 21, 2024 • edited Loading

tesharp commented Nov 22, 2024

MovieStoreGuy commented Nov 27, 2024

celian-garcia commented Nov 27, 2024

github-actions bot commented Dec 25, 2024

github-actions bot commented Jan 9, 2025

celian-garcia commented Nov 6, 2024 •

edited

Loading

linux-foundation-easycla bot commented Nov 6, 2024 •

edited

Loading

celian-garcia commented Nov 21, 2024 •

edited

Loading