Assert protocol is propagated #3292

Closed
74 changes: 74 additions & 0 deletions cmd/otel-allocator/config/config.go
@@ -24,6 +24,7 @@ import (
"time"

"github.com/go-logr/logr"
monitoringv1 "github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring/v1"
"github.com/prometheus/common/model"
promconfig "github.com/prometheus/prometheus/config"
_ "github.com/prometheus/prometheus/discovery/install"
@@ -58,6 +59,79 @@ type Config struct {
HTTPS HTTPSServerConfig `yaml:"https,omitempty"`
}

func (cfg *Config) GetPrometheus() *monitoringv1.Prometheus {
// we want to use endpointslices by default
serviceDiscoveryRole := monitoringv1.ServiceDiscoveryRole("EndpointSlice")
prom := &monitoringv1.Prometheus{
Spec: monitoringv1.PrometheusSpec{
CommonPrometheusFields: monitoringv1.CommonPrometheusFields{
ScrapeInterval: monitoringv1.Duration(cfg.PrometheusCR.ScrapeInterval.String()),
ServiceMonitorSelector: cfg.PrometheusCR.ServiceMonitorSelector,
PodMonitorSelector: cfg.PrometheusCR.PodMonitorSelector,
ServiceMonitorNamespaceSelector: cfg.PrometheusCR.ServiceMonitorNamespaceSelector,
PodMonitorNamespaceSelector: cfg.PrometheusCR.PodMonitorNamespaceSelector,
ServiceDiscoveryRole: &serviceDiscoveryRole,
},
},
}

// if there's no prom config set, just use the above defaults.
if cfg.PromConfig == nil {
return prom
}

// the prometheus-operator provides no automatic conversion for global fields
if len(cfg.PromConfig.GlobalConfig.ScrapeProtocols) > 0 {
var scrapeProtocols []monitoringv1.ScrapeProtocol
for _, protocol := range cfg.PromConfig.GlobalConfig.ScrapeProtocols {
scrapeProtocols = append(scrapeProtocols, monitoringv1.ScrapeProtocol(protocol))
}
prom.Spec.CommonPrometheusFields.ScrapeProtocols = scrapeProtocols
Contributor:

I don't think this is the correct thing to do. In my view, GlobalConfig should only affect the raw scrape configs, while the respective Prometheus fields (which only affect prometheus-operator CRs) should be separately configured.

The ambiguity of how GlobalConfig should affect the prometheus-operator world is why I was reluctant to include it in the first place.

Contributor Author:

I can see that; however, I think the reason it's done this way is that we don't have the luxury of having it automatically propagated to the Prometheus instance. That is, when a Prometheus instance sets the global config and is configured via the prometheus-operator, any scrape configs generated from the Prometheus CRDs and added to that instance will use the global config defined by said Prometheus instance.

As I was writing this, I started wondering whether this is necessary at all, given that it's the collector that does the scraping. If a user sets the global config on the prometheus receiver in the collector, shouldn't that config be used when scraping a target, overriding the scrape_configs received from the TA?

Contributor Author:

I need to look into this, because if that's the case, this change would be unnecessary, right? Otherwise, I do think we need it.

Contributor:

The way I see it, we inhabit two separate worlds here:

  1. The world of raw Prometheus configurations. A user can put a configuration in their prometheus receiver settings, and expect the TargetAllocator to use it. scrape_configs and global_configs apply here. This has nothing to do with Kubernetes per se, and works without TA as well.

  2. The world of Prometheus CRs in Kubernetes. This is specific to the Target Allocator and is configured via the OpenTelemetryCollector (and in the near future, TargetAllocator) CR. Internally, this is done by passing the configuration to a Prometheus CR and using that to generate scrape configs from ServiceMonitors and such.

My opinion is that world 2 should not be affected by configuration for world 1. If we want to set scrapeProtocols in the same way we would normally do on a Prometheus CR, then we should have a scrapeProtocols field on our CRs for this. See #1934 for reference.

Does that make sense?

Contributor Author:

Yeah, that makes sense to me, and would probably be better in the long run anyway. I had to read through some more code to make sense of this, but I agree that this is probably the way to go.

}
scrapeInterval := monitoringv1.Duration(cfg.PromConfig.GlobalConfig.ScrapeInterval.String())
if len(scrapeInterval) > 0 {
prom.Spec.CommonPrometheusFields.ScrapeInterval = scrapeInterval
}
prom.Spec.CommonPrometheusFields.ScrapeTimeout = monitoringv1.Duration(cfg.PromConfig.GlobalConfig.ScrapeTimeout.String())
if len(cfg.PromConfig.GlobalConfig.ExternalLabels) > 0 {
labels := map[string]string{}
for _, label := range cfg.PromConfig.GlobalConfig.ExternalLabels {
labels[label.Name] = label.Value
}
prom.Spec.CommonPrometheusFields.ExternalLabels = labels
}
if cfg.PromConfig.GlobalConfig.BodySizeLimit > 0 {
bodySizeLimit := monitoringv1.ByteSize(cfg.PromConfig.GlobalConfig.BodySizeLimit.String())
prom.Spec.CommonPrometheusFields.BodySizeLimit = &bodySizeLimit
}
if cfg.PromConfig.GlobalConfig.SampleLimit > 0 {
sampleLimit := uint64(cfg.PromConfig.GlobalConfig.SampleLimit)
prom.Spec.CommonPrometheusFields.SampleLimit = &sampleLimit
}
if cfg.PromConfig.GlobalConfig.TargetLimit > 0 {
targetLimit := uint64(cfg.PromConfig.GlobalConfig.TargetLimit)
prom.Spec.CommonPrometheusFields.TargetLimit = &targetLimit
}
if cfg.PromConfig.GlobalConfig.LabelLimit > 0 {
labelLimit := uint64(cfg.PromConfig.GlobalConfig.LabelLimit)
prom.Spec.CommonPrometheusFields.LabelLimit = &labelLimit
}
if cfg.PromConfig.GlobalConfig.LabelNameLengthLimit > 0 {
labelNameLengthLimit := uint64(cfg.PromConfig.GlobalConfig.LabelNameLengthLimit)
prom.Spec.CommonPrometheusFields.LabelNameLengthLimit = &labelNameLengthLimit
}
if cfg.PromConfig.GlobalConfig.LabelValueLengthLimit > 0 {
labelValueLengthLimit := uint64(cfg.PromConfig.GlobalConfig.LabelValueLengthLimit)
prom.Spec.CommonPrometheusFields.LabelValueLengthLimit = &labelValueLengthLimit
}
if cfg.PromConfig.GlobalConfig.KeepDroppedTargets > 0 {
keepDroppedTargets := uint64(cfg.PromConfig.GlobalConfig.KeepDroppedTargets)
prom.Spec.CommonPrometheusFields.KeepDroppedTargets = &keepDroppedTargets
}

return prom
}

type PrometheusCRConfig struct {
Enabled bool `yaml:"enabled,omitempty"`
PodMonitorSelector *metav1.LabelSelector `yaml:"pod_monitor_selector,omitempty"`
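The conversion above repeats one guard for every scalar limit: the raw global-config value is only propagated to the CR when it was explicitly set (non-zero), because the CR fields are pointers whose nil value means "fall back to the Prometheus default". A minimal standalone sketch of that pattern (the helper name is illustrative, not part of this PR):

```go
package main

import "fmt"

// setIfPositive mirrors the guard used for SampleLimit, TargetLimit, etc.:
// copy the value and take its address only when it was explicitly set,
// leaving the destination nil ("use the Prometheus default") otherwise.
func setIfPositive(dst **uint64, v uint64) {
	if v > 0 {
		*dst = &v
	}
}

func main() {
	var sampleLimit, targetLimit *uint64
	setIfPositive(&sampleLimit, 100) // explicitly configured
	setIfPositive(&targetLimit, 0)   // unset in the global config
	fmt.Println(*sampleLimit)        // 100
	fmt.Println(targetLimit == nil)  // true
}
```

Taking the address of the loop/parameter copy (`&v`) is safe in Go — each call gets its own copy, so the pointers stored in the CR never alias the config struct.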
18 changes: 1 addition & 17 deletions cmd/otel-allocator/watcher/promOperator.go
@@ -71,23 +71,7 @@ func NewPrometheusCRWatcher(ctx context.Context, logger logr.Logger, cfg allocat
return nil, err
}

// we want to use endpointslices by default
serviceDiscoveryRole := monitoringv1.ServiceDiscoveryRole("EndpointSlice")

// TODO: We should make these durations configurable
prom := &monitoringv1.Prometheus{
Spec: monitoringv1.PrometheusSpec{
CommonPrometheusFields: monitoringv1.CommonPrometheusFields{
ScrapeInterval: monitoringv1.Duration(cfg.PrometheusCR.ScrapeInterval.String()),
ServiceMonitorSelector: cfg.PrometheusCR.ServiceMonitorSelector,
PodMonitorSelector: cfg.PrometheusCR.PodMonitorSelector,
ServiceMonitorNamespaceSelector: cfg.PrometheusCR.ServiceMonitorNamespaceSelector,
PodMonitorNamespaceSelector: cfg.PrometheusCR.PodMonitorNamespaceSelector,
ServiceDiscoveryRole: &serviceDiscoveryRole,
},
},
}

prom := cfg.GetPrometheus()
promOperatorLogger := level.NewFilter(log.NewLogfmtLogger(os.Stderr), level.AllowWarn())
promOperatorSlogLogger := slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{Level: slog.LevelWarn}))
generator, err := prometheus.NewConfigGenerator(promOperatorLogger, prom, true)
172 changes: 154 additions & 18 deletions cmd/otel-allocator/watcher/promOperator_test.go
@@ -21,6 +21,7 @@ import (
"testing"
"time"

"github.com/alecthomas/units"
"github.com/go-kit/log"
"github.com/go-kit/log/level"
monitoringv1 "github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring/v1"
@@ -35,6 +36,7 @@ import (
promconfig "github.com/prometheus/prometheus/config"
"github.com/prometheus/prometheus/discovery"
kubeDiscovery "github.com/prometheus/prometheus/discovery/kubernetes"
"github.com/prometheus/prometheus/model/labels"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
v1 "k8s.io/api/core/v1"
@@ -99,6 +101,7 @@ func TestLoadConfig(t *testing.T) {
},
cfg: allocatorconfig.Config{
PrometheusCR: allocatorconfig.PrometheusCRConfig{
ScrapeInterval: model.Duration(30 * time.Second),
ServiceMonitorSelector: &metav1.LabelSelector{},
PodMonitorSelector: &metav1.LabelSelector{},
},
@@ -191,6 +194,7 @@ func TestLoadConfig(t *testing.T) {
},
cfg: allocatorconfig.Config{
PrometheusCR: allocatorconfig.PrometheusCRConfig{
ScrapeInterval: model.Duration(30 * time.Second),
ServiceMonitorSelector: &metav1.LabelSelector{},
PodMonitorSelector: &metav1.LabelSelector{},
},
@@ -259,6 +263,7 @@ func TestLoadConfig(t *testing.T) {
},
cfg: allocatorconfig.Config{
PrometheusCR: allocatorconfig.PrometheusCRConfig{
ScrapeInterval: model.Duration(30 * time.Second),
ServiceMonitorSelector: &metav1.LabelSelector{},
PodMonitorSelector: &metav1.LabelSelector{},
},
@@ -356,6 +361,7 @@ func TestLoadConfig(t *testing.T) {
},
cfg: allocatorconfig.Config{
PrometheusCR: allocatorconfig.PrometheusCRConfig{
ScrapeInterval: model.Duration(30 * time.Second),
ServiceMonitorSelector: &metav1.LabelSelector{},
PodMonitorSelector: &metav1.LabelSelector{},
},
@@ -467,6 +473,7 @@ func TestLoadConfig(t *testing.T) {
},
cfg: allocatorconfig.Config{
PrometheusCR: allocatorconfig.PrometheusCRConfig{
ScrapeInterval: model.Duration(30 * time.Second),
ServiceMonitorSelector: &metav1.LabelSelector{},
PodMonitorSelector: &metav1.LabelSelector{},
},
@@ -557,6 +564,7 @@ func TestLoadConfig(t *testing.T) {
},
cfg: allocatorconfig.Config{
PrometheusCR: allocatorconfig.PrometheusCRConfig{
ScrapeInterval: model.Duration(30 * time.Second),
ServiceMonitorSelector: &metav1.LabelSelector{
MatchLabels: map[string]string{
"testsvc": "testsvc",
@@ -628,6 +636,7 @@ func TestLoadConfig(t *testing.T) {
},
cfg: allocatorconfig.Config{
PrometheusCR: allocatorconfig.PrometheusCRConfig{
ScrapeInterval: model.Duration(30 * time.Second),
PodMonitorSelector: &metav1.LabelSelector{
MatchLabels: map[string]string{
"testpod": "testpod",
@@ -696,6 +705,7 @@ func TestLoadConfig(t *testing.T) {
},
cfg: allocatorconfig.Config{
PrometheusCR: allocatorconfig.PrometheusCRConfig{
ScrapeInterval: model.Duration(30 * time.Second),
ServiceMonitorSelector: &metav1.LabelSelector{},
PodMonitorSelector: &metav1.LabelSelector{},
ServiceMonitorNamespaceSelector: &metav1.LabelSelector{
@@ -766,6 +776,7 @@ func TestLoadConfig(t *testing.T) {
},
cfg: allocatorconfig.Config{
PrometheusCR: allocatorconfig.PrometheusCRConfig{
ScrapeInterval: model.Duration(30 * time.Second),
ServiceMonitorSelector: &metav1.LabelSelector{},
PodMonitorSelector: &metav1.LabelSelector{},
PodMonitorNamespaceSelector: &metav1.LabelSelector{
@@ -802,6 +813,145 @@ func TestLoadConfig(t *testing.T) {
},
},
},
{
name: "simple test (global config)",
serviceMonitors: []*monitoringv1.ServiceMonitor{
{
ObjectMeta: metav1.ObjectMeta{
Name: "simple",
Namespace: "test",
},
Spec: monitoringv1.ServiceMonitorSpec{
JobLabel: "test",
Endpoints: []monitoringv1.Endpoint{
{
Port: "web",
},
},
},
},
},
podMonitors: []*monitoringv1.PodMonitor{
{
ObjectMeta: metav1.ObjectMeta{
Name: "simple",
Namespace: "test",
},
Spec: monitoringv1.PodMonitorSpec{
JobLabel: "test",
PodMetricsEndpoints: []monitoringv1.PodMetricsEndpoint{
{
Port: "web",
},
},
},
},
},
cfg: allocatorconfig.Config{
PromConfig: &promconfig.Config{
GlobalConfig: promconfig.GlobalConfig{
ScrapeInterval: model.Duration(60 * time.Second),
ScrapeTimeout: model.Duration(20 * time.Second),
ScrapeProtocols: []promconfig.ScrapeProtocol{
promconfig.OpenMetricsText1_0_0,
promconfig.OpenMetricsText0_0_1,
promconfig.PrometheusText0_0_4,
promconfig.PrometheusProto,
},
ExternalLabels: []labels.Label{
{
Name: "example",
Value: "value",
},
},
BodySizeLimit: units.Kibibyte,
SampleLimit: 100,
TargetLimit: 100,
LabelLimit: 100,
LabelNameLengthLimit: 100,
LabelValueLengthLimit: 100,
KeepDroppedTargets: 100,
},
},
PrometheusCR: allocatorconfig.PrometheusCRConfig{
ScrapeInterval: model.Duration(30 * time.Second),
ServiceMonitorSelector: &metav1.LabelSelector{},
PodMonitorSelector: &metav1.LabelSelector{},
},
},
want: &promconfig.Config{
ScrapeConfigs: []*promconfig.ScrapeConfig{
{
JobName: "serviceMonitor/test/simple/0",
ScrapeInterval: model.Duration(60 * time.Second),
ScrapeProtocols: []promconfig.ScrapeProtocol{
promconfig.OpenMetricsText1_0_0,
promconfig.OpenMetricsText0_0_1,
promconfig.PrometheusText0_0_4,
promconfig.PrometheusProto,
},
BodySizeLimit: 1024,
SampleLimit: 100,
TargetLimit: 100,
LabelLimit: 100,
LabelNameLengthLimit: 100,
LabelValueLengthLimit: 100,
KeepDroppedTargets: 100,
ScrapeTimeout: model.Duration(20 * time.Second),
HonorTimestamps: true,
HonorLabels: false,
Scheme: "http",
MetricsPath: "/metrics",
ServiceDiscoveryConfigs: []discovery.Config{
&kubeDiscovery.SDConfig{
Role: "endpointslice",
NamespaceDiscovery: kubeDiscovery.NamespaceDiscovery{
Names: []string{"test"},
IncludeOwnNamespace: false,
},
HTTPClientConfig: config.DefaultHTTPClientConfig,
},
},
HTTPClientConfig: config.DefaultHTTPClientConfig,
EnableCompression: true,
},
{
JobName: "podMonitor/test/simple/0",
ScrapeInterval: model.Duration(60 * time.Second),
ScrapeProtocols: []promconfig.ScrapeProtocol{
promconfig.OpenMetricsText1_0_0,
promconfig.OpenMetricsText0_0_1,
promconfig.PrometheusText0_0_4,
promconfig.PrometheusProto,
},
BodySizeLimit: 1024,
SampleLimit: 100,
TargetLimit: 100,
LabelLimit: 100,
LabelNameLengthLimit: 100,
LabelValueLengthLimit: 100,
KeepDroppedTargets: 100,
ScrapeTimeout: model.Duration(20 * time.Second),
HonorTimestamps: true,
HonorLabels: false,
Scheme: "http",
MetricsPath: "/metrics",
ServiceDiscoveryConfigs: []discovery.Config{
&kubeDiscovery.SDConfig{
Role: "pod",
NamespaceDiscovery: kubeDiscovery.NamespaceDiscovery{
Names: []string{"test"},
IncludeOwnNamespace: false,
},
HTTPClientConfig: config.DefaultHTTPClientConfig,
},
},
HTTPClientConfig: config.DefaultHTTPClientConfig,
EnableCompression: true,
},
},
},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
@@ -869,6 +1019,7 @@ func TestNamespaceLabelUpdate(t *testing.T) {

cfg := allocatorconfig.Config{
PrometheusCR: allocatorconfig.PrometheusCRConfig{
ScrapeInterval: model.Duration(30 * time.Second),
ServiceMonitorSelector: &metav1.LabelSelector{},
PodMonitorSelector: &metav1.LabelSelector{},
PodMonitorNamespaceSelector: &metav1.LabelSelector{
@@ -1081,26 +1232,11 @@ func getTestPrometheusCRWatcher(t *testing.T, svcMonitors []*monitoringv1.Servic
}

factory := informers.NewMonitoringInformerFactories(map[string]struct{}{v1.NamespaceAll: {}}, map[string]struct{}{}, mClient, 0, nil)
informers, err := getInformers(factory)
informerMap, err := getInformers(factory)
if err != nil {
t.Fatal(t, err)
}

serviceDiscoveryRole := monitoringv1.ServiceDiscoveryRole("EndpointSlice")

prom := &monitoringv1.Prometheus{
Spec: monitoringv1.PrometheusSpec{
CommonPrometheusFields: monitoringv1.CommonPrometheusFields{
ScrapeInterval: monitoringv1.Duration("30s"),
ServiceMonitorSelector: cfg.PrometheusCR.ServiceMonitorSelector,
PodMonitorSelector: cfg.PrometheusCR.PodMonitorSelector,
ServiceMonitorNamespaceSelector: cfg.PrometheusCR.ServiceMonitorNamespaceSelector,
PodMonitorNamespaceSelector: cfg.PrometheusCR.PodMonitorNamespaceSelector,
ServiceDiscoveryRole: &serviceDiscoveryRole,
},
},
}

prom := cfg.GetPrometheus()
promOperatorLogger := level.NewFilter(log.NewLogfmtLogger(os.Stderr), level.AllowWarn())
promOperatorSlogLogger := slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{Level: slog.LevelWarn}))

@@ -1132,7 +1268,7 @@ func getTestPrometheusCRWatcher(t *testing.T, svcMonitors []*monitoringv1.Servic
return &PrometheusCRWatcher{
kubeMonitoringClient: mClient,
k8sClient: k8sClient,
informers: informers,
informers: informerMap,
nsInformer: nsMonInf,
stopChannel: make(chan struct{}),
configGenerator: generator,
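The "simple test (global config)" case above expects the global external_labels to come through as a map on the generated scrape configs; `GetPrometheus` achieves this with a slice-to-map rewrite. A standalone sketch of that conversion, with a local `label` type standing in for the prometheus model's `labels.Label`:

```go
package main

import "fmt"

// label stands in for github.com/prometheus/prometheus/model/labels.Label,
// which the raw global config stores as an ordered slice.
type label struct {
	Name, Value string
}

// labelsToMap converts the ordered label slice from the raw global config
// into the map[string]string shape the Prometheus CR's ExternalLabels expects.
func labelsToMap(ls []label) map[string]string {
	m := make(map[string]string, len(ls))
	for _, l := range ls {
		m[l.Name] = l.Value
	}
	return m
}

func main() {
	ext := labelsToMap([]label{{Name: "example", Value: "value"}})
	fmt.Println(ext["example"]) // value
}
```

Note the conversion drops ordering (maps are unordered), which is fine here because external labels are applied as a set; duplicate names would silently keep the last value.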