Support monolithic deployment mode #722
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files:
@@            Coverage Diff             @@
##             main     #722      +/-   ##
==========================================
- Coverage   77.76%   76.78%    -0.99%
==========================================
  Files          68       77        +9
  Lines        5155     5733      +578
==========================================
+ Hits         4009     4402      +393
- Misses        949     1110      +161
- Partials      197      221       +24
}

// MonolithicObservabilityMetricsSpec defines the metrics settings of the Tempo deployment.
type MonolithicObservabilityMetricsSpec struct {
Can't we reuse something from the microservices type, e.g. the whole observability spec?
type ObservabilitySpec struct {
That would be inconsistent with the
  <feature>:
    enabled: true
style of the CR. I'd prefer to migrate the TempoStack, maybe in the next CRD version? Shouldn't be too difficult to create a conversion webhook for this.
spec:                  # TempoMonolithicSpec defines the desired state of TempoMonolithic.
  observability:       # Observability defines observability configuration for the Tempo deployment
    metrics:           # Metrics defines the metrics configuration of the Tempo deployment
      prometheusRules: # PrometheusRules defines the PrometheusRule configuration
        enabled: false # Enabled defines if the operator should create PrometheusRules for this Tempo deployment
      serviceMonitors: # ServiceMonitors defines the ServiceMonitor configuration
        enabled: false # Enabled defines if the operator should create ServiceMonitors for this Tempo deployment
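For reference, a minimal Go sketch of structs matching the enabled-style YAML above; the nested type names below MonolithicObservabilityMetricsSpec are placeholders, not necessarily the final API:

type MonolithicObservabilityMetricsSpec struct {
	// ServiceMonitors defines the ServiceMonitor configuration.
	//
	// +kubebuilder:validation:Optional
	ServiceMonitors *MonolithicObservabilityMetricsServiceMonitorsSpec `json:"serviceMonitors,omitempty"`

	// PrometheusRules defines the PrometheusRule configuration.
	//
	// +kubebuilder:validation:Optional
	PrometheusRules *MonolithicObservabilityMetricsPrometheusRulesSpec `json:"prometheusRules,omitempty"`
}

// Placeholder nested types following the <feature>: enabled: true convention.
type MonolithicObservabilityMetricsServiceMonitorsSpec struct {
	// Enabled defines if the operator should create ServiceMonitors for this Tempo deployment.
	Enabled bool `json:"enabled"`
}

type MonolithicObservabilityMetricsPrometheusRulesSpec struct {
	// Enabled defines if the operator should create PrometheusRules for this Tempo deployment.
	Enabled bool `json:"enabled"`
}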
So we agreed on using the enabled field, which is better supported in tools like kustomize and kubectl edit (the empty structs are removed). Are we going to reuse some parts from the monolithic APIs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can reuse a TLS struct, the multitenancy structs, ManagementStateType, LimitSpec and the storage secret.
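Purely as an illustration of that reuse (the field names, json tags, and placement here are assumptions, not the final API), the monolithic spec could reference the existing TempoStack types directly:

// Illustrative only: referencing types that already exist in the TempoStack API.
type TempoMonolithicSpec struct {
	// Management defines the operator management state, reusing ManagementStateType.
	Management ManagementStateType `json:"management,omitempty"`

	// Overrides defines rate limits, reusing LimitSpec from the TempoStack API.
	Overrides *LimitSpec `json:"overrides,omitempty"`
}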
// Default implements webhook.Defaulter so a webhook will be registered for the type.
func (r *TempoMonolithic) Default() {
	log := ctrl.Log.WithName("tempomonolithic-webhook")
How is this rendered in the logs? Isn't it too long?
It'll print this line:
{"level":"debug","ts":"2024-01-15T19:07:08.746058013+01:00","logger":"tempomonolithic-webhook","msg":"running defaulter webhook","name":"sample"}
if debug logs are enabled (go run ./main.go --zap-log-level=debug start).
But as we'll set the defaults in the reconcile loop now, I'm removing this log statement.
I'll keep it there for the validating webhook, which is still in use.
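For reference, a minimal sketch of how that debug line is produced with controller-runtime (assuming the zapr convention where V(1) maps to the "debug" level; the method body here is illustrative, since the defaulting itself now happens in the reconcile loop):

import (
	ctrl "sigs.k8s.io/controller-runtime"
)

// Default implements webhook.Defaulter; the log statement below produces the
// debug line shown above when --zap-log-level=debug is set.
func (r *TempoMonolithic) Default() {
	log := ctrl.Log.WithName("tempomonolithic-webhook")
	log.V(1).Info("running defaulter webhook", "name", r.Name)
}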
// MonolithicTracesStorageSpec defines the traces storage for the Tempo deployment.
type MonolithicTracesStorageSpec struct {
	// Backend defines the backend for storing traces. Default: memory
I am a bit concerned about using in-memory as the default. Upstream uses a PV as the default, and the in-memory backend can easily be overlooked and consume significant resources in the cluster.
If you're only concerned about OOM situations, the memory counts towards the container resource limit:
While tmpfs is very fast be aware that, unlike disks, files you write count against the memory limit of the container that wrote them
https://kubernetes.io/docs/concepts/storage/volumes/#emptydir
I'd prefer to keep memory as the default, as it's great for quick testing/demos/showcases (and is the default for jaeger all-in-one), and changing it to pv is easy (and can/should be mentioned in the docs).
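To illustrate the memory-limit point: a memory backend typically ends up as a tmpfs-backed emptyDir in the pod spec, along the lines of the sketch below (an illustrative helper, not the operator's actual manifest code):

import corev1 "k8s.io/api/core/v1"

// memoryStorageVolume returns a tmpfs-backed emptyDir volume; files written to
// it count against the memory limit of the container, as noted above.
func memoryStorageVolume(name string) corev1.Volume {
	return corev1.Volume{
		Name: name,
		VolumeSource: corev1.VolumeSource{
			EmptyDir: &corev1.EmptyDirVolumeSource{
				Medium: corev1.StorageMediumMemory,
			},
		},
	}
}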
// Observability defines observability configuration for the Tempo deployment
//
// +kubebuilder:validation:Optional
Observability *MonolithicObservabilitySpec `json:"observability,omitempty"`
Asking again, can we reuse some of the structs from the microservices type?
We can; the question is whether we value consistency inside the same CR or consistency between the two CRs more.
spec:
  observability:       # Observability defines observability configuration for the Tempo deployment
    metrics:           # Metrics defines the metrics configuration of the Tempo deployment
      prometheusRules: # PrometheusRules defines the PrometheusRule configuration
        enabled: false # Enabled defines if the operator should create PrometheusRules for this Tempo deployment
      serviceMonitors: # ServiceMonitors defines the ServiceMonitor configuration
        enabled: false
vs
spec:
  observability:                  # ObservabilitySpec defines how telemetry data gets handled.
    grafana:                      # Grafana defines the Grafana configuration for operands.
      createDatasource: false     # CreateDatasource specifies if a Grafana Datasource should be created for Tempo.
      instanceSelector:           # InstanceSelector specifies the Grafana instance where the datasource should be created.
    metrics:                      # Metrics defines the metrics configuration for operands.
      createPrometheusRules: false   # CreatePrometheusRules specifies if Prometheus rules for alerts should be created for Tempo components.
      createServiceMonitors: false   # CreateServiceMonitors specifies if ServiceMonitors should be created for Tempo components.
    tracing:                      # Tracing defines a config for operands.
      jaeger_agent_endpoint: "localhost:6831" # JaegerAgentEndpoint defines the jaeger endpoint data gets sent to.
      sampling_fraction: ""
The first example is consistent with the rest of the Monolithic CR, and allows additional settings for prometheusRules or serviceMonitors in the future.
I would prefer consistency within the same CRD. Just one question: is some sort of consolidation possible in the future, in order to get both? That would imply a breaking change though.
Yes, I was thinking of maybe doing this change in v1alpha2 of TempoStack if we have a consensus.
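A rough sketch of what such a conversion webhook could look like, assuming a hypothetical v1alpha2 with enabled-style metrics fields (none of the v1alpha2 types exist yet, and the import path below is a placeholder):

import (
	"sigs.k8s.io/controller-runtime/pkg/conversion"

	// Hypothetical future API version, placeholder import path.
	v1alpha2 "example.com/tempo-operator/apis/tempo/v1alpha2"
)

// ConvertTo converts this v1alpha1 TempoStack to the hypothetical v1alpha2 hub version.
func (src *TempoStack) ConvertTo(dstRaw conversion.Hub) error {
	dst := dstRaw.(*v1alpha2.TempoStack)
	dst.ObjectMeta = src.ObjectMeta
	dst.Spec.Observability.Metrics.ServiceMonitors.Enabled = src.Spec.Observability.Metrics.CreateServiceMonitors
	dst.Spec.Observability.Metrics.PrometheusRules.Enabled = src.Spec.Observability.Metrics.CreatePrometheusRules
	return nil
}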
// Ingress defines the ingress configuration for Jaeger UI
//
// +kubebuilder:validation:Optional
Ingress *MonolithicJaegerUIIngressSpec `json:"ingress,omitempty"`
Can the ingress spec be reused from the microservices? There are more settings users might want to configure.
This is already available in #755:
spec:
  jaegerui:                # JaegerUI defines the Jaeger UI configuration
    enabled: false         # Enabled defines if the Jaeger UI should be enabled
    ingress:               # Ingress defines the ingress configuration for Jaeger UI
      annotations:         # Annotations defines the annotations of the Ingress object.
        "key": ""
      enabled: false       # Enabled defines if an Ingress object should be created for Jaeger UI
      host: ""             # Host defines the hostname of the Ingress object.
      ingressClassName: "" # IngressClassName is the name of an IngressClass cluster resource. Ingress controller implementations use this field to know whether they should be serving this Ingress resource.
    route:                 # Route defines the route configuration for Jaeger UI
      annotations:         # Annotations defines the annotations of the Ingress object.
        "key": ""
      enabled: false       # Enabled defines if a Route object should be created for Jaeger UI
      host: ""             # Host defines the hostname of the Ingress object.
      ingressClassName: "" # IngressClassName is the name of an IngressClass cluster resource. Ingress controller implementations use this field to know whether they should be serving this Ingress resource.
      termination: "edge"  # Termination specifies the termination type. Default: edge.
vs spec of TempoStack:
jaegerQuery:               # JaegerQuerySpec defines Jaeger Query specific options.
  enabled: false           # Enabled is used to define if Jaeger Query component should be created.
  ingress:                 # Ingress defines Jaeger Query Ingress options.
    annotations:           # Annotations defines the annotations of the Ingress object.
      "key": ""
    host: ""               # Host defines the hostname of the Ingress object.
    ingressClassName: ""   # IngressClassName is the name of an IngressClass cluster resource. Ingress controller implementations use this field to know whether they should be serving this Ingress resource.
    route:                 # Route defines OpenShift Route specific options.
      termination: ""      # Termination specifies the termination type. By default "edge" is used.
    type: ""               # Type defines the type of Ingress for the Jaeger Query UI. Currently ingress, route and none are supported.
(ingressClassName should have been under ingress, as it doesn't apply to route)
Do we value consistency inside the same CR or consistency between the two CRs more?
Same comment, I'd prefer consistency on the same CR
internal/manifests/mutate.go (outdated)
}

func (m *ImmutableErr) Error() string {
	return fmt.Sprintf("update to immutable field %s is forbidden", m.field)
Shouldn't it print the existing and desired values as well?
I did initially, but printing structs with fmt.Sprintf("%v", some_struct) is an unreadable mess if the struct is big.
Shall we then remove the fields that are not used in the error struct?
I updated the message to show the result of cmp.Diff() now. It's still a bit unreadable as it's a single line, but when replacing the \n and \t we see a nice diff in the logs, and we already had a dependency on this library anyway (https://github.com/google/go-cmp).
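A sketch of the resulting error; the existing/desired field names of ImmutableErr and the exact message format are assumptions based on the discussion above:

import (
	"fmt"

	"github.com/google/go-cmp/cmp"
)

// ImmutableErr reports an update to an immutable field.
type ImmutableErr struct {
	field    string
	existing interface{}
	desired  interface{}
}

func (m *ImmutableErr) Error() string {
	// cmp.Diff returns a multi-line diff; it ends up on a single log line
	// until the \n and \t characters are replaced, as noted above.
	return fmt.Sprintf("update to immutable field %s is forbidden, diff: %s",
		m.field, cmp.Diff(m.existing, m.desired))
}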
// ExtraConfig defines any extra (overlay) configuration for components
//
// +kubebuilder:validation:Optional
ExtraConfig *MonolithicExtraConfigSpec `json:"extraConfig,omitempty"`
Any reason for not reusing the same struct as the microservices? The only reason I can think of is that the microservices could include other configs in the future. If that is the reason, I'm OK with it.
I think we worked on the same feature at the same time. I'll check if I can reuse the struct and logic.
I've updated the PR and reused the struct and logic from the TempoStack now.
}

// MonolithicStorageSpec defines the storage for the Tempo deployment.
type MonolithicStorageSpec struct {
One question: why does this have its own spec containing only one structure? Or why not use MonolithicTracesStorageSpec directly? Is this to mimic the Tempo configuration?
Yes, it's to mimic the Tempo configuration. I thought Tempo might have plans to store other things in the future, so I kept the same structure here. But I don't have very strong opinions on this; I could remove that extra layer if you like.
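For reference, the extra layer in question is roughly the following (the Traces field name and json tag are assumptions):

// MonolithicStorageSpec wraps only the traces storage for now, mirroring
// Tempo's own configuration layout.
type MonolithicStorageSpec struct {
	// Traces defines the storage configuration for traces.
	Traces MonolithicTracesStorageSpec `json:"traces"`
}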
// OTLP defines the ingestion configuration for OTLP
//
// +kubebuilder:validation:Optional
OTLP *MonolithicIngestionOTLPSpec `json:"otlp,omitempty"`
Will it be the only supported protocol?
I want to make the gateway a drop-in feature, so the service ports should not change whether the gateway is enabled or not. Therefore I can only support protocols which the gateway also supports. AFAICS the gateway only supports otlp/grpc and otlp/http, right?
With the general move to the OTEL SDK, and using the OTEL collector, I think it's fine to only support OTLP.
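A sketch of an OTLP-only ingestion spec covering the two gateway-supported protocols; the type and field names are assumptions following the enabled-style convention:

type MonolithicIngestionOTLPSpec struct {
	// GRPC defines the OTLP-over-gRPC ingestion settings.
	GRPC *MonolithicIngestionOTLPProtocolsGRPCSpec `json:"grpc,omitempty"`

	// HTTP defines the OTLP-over-HTTP ingestion settings.
	HTTP *MonolithicIngestionOTLPProtocolsHTTPSpec `json:"http,omitempty"`
}

type MonolithicIngestionOTLPProtocolsGRPCSpec struct {
	// Enabled defines if OTLP over gRPC is enabled.
	Enabled bool `json:"enabled"`
}

type MonolithicIngestionOTLPProtocolsHTTPSpec struct {
	// Enabled defines if OTLP over HTTP is enabled.
	Enabled bool `json:"enabled"`
}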
Support Tempo monolithic deployment mode with a new TempoMonolithic CR.

Partially resolves #710.