What happened?
As a Jaeger operator, I want the Elasticsearch indices to be cleaned up stably and reliably, even when the Elasticsearch cluster is shared across multiple Kubernetes clusters.
Steps to reproduce
Deploy the index-cleanup CronJob across multiple Kubernetes clusters that share one Elasticsearch cluster.
Because the Helm charts ship the same CronJob definition, all jobs start at the same time.
Some job instances fail on their first run: they first gather the list of indices to delete, but the subsequent delete fails because another cluster's job has already removed those indices.
The Kubernetes Job itself still succeeds, because it spawns a replacement pod whose index listing no longer contains any delete candidates, but the earlier failed pod can still trigger alerts.
Expected behavior
I'd expect the cleanup job not to rely solely on the retry primitives of a Kubernetes Job: the initial run should itself cope with an index that has already been deleted, since index deletion is an idempotent operation anyway.
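For illustration, assuming the standard Elasticsearch delete index API, a request such as DELETE /jaeger-span-2024-10-25?ignore_unavailable=true succeeds even when the index is already gone, whereas the same request without the flag produces the 404 index_not_found_exception shown in the log output below.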
Relevant log output
{"level":"info","ts":1730501701.360364,"caller":"es-index-cleaner/main.go:89","msg":"Indices before this date will be deleted","date":"2024-10-26T00:00:00Z"}
{"level":"info","ts":1730501701.360454,"caller":"es-index-cleaner/main.go:98","msg":"Queried indices","indices":[{"Index":"jaeger-service-2024-10-26","CreationTime":"2024-10-26T00:00:02.651Z","Aliases":{}},{"Index":"jaeger-service-2024-10-28","CreationTime":"2024-10-28T00:00:00.517Z","Aliases":{}},{"Index":"jaeger-span-2024-10-31","CreationTime":"2024-10-31T00:00:00.55Z","Aliases":{}},{"Index":"jaeger-service-2024-10-31","CreationTime":"2024-10-31T00:00:00.155Z","Aliases":{}},{"Index":"jaeger-span-2024-10-27","CreationTime":"2024-10-27T00:00:00.319Z","Aliases":{}},{"Index":"jaeger-span-2024-10-29","CreationTime":"2024-10-29T00:00:00.237Z","Aliases":{}},{"Index":"jaeger-service-2024-10-27","CreationTime":"2024-10-27T00:00:02.093Z","Aliases":{}},{"Index":"jaeger-service-2024-10-29","CreationTime":"2024-10-29T00:00:03.637Z","Aliases":{}},{"Index":"jaeger-span-2024-10-25","CreationTime":"2024-10-25T00:00:00.24Z","Aliases":{}},{"Index":"jaeger-span-2024-10-28","CreationTime":"2024-10-28T00:00:00.229Z","Aliases":{}},{"Index":"jaeger-span-2024-10-30","CreationTime":"2024-10-30T00:00:00.231Z","Aliases":{}},{"Index":"jaeger-span-2024-11-01","CreationTime":"2024-11-01T00:00:00.142Z","Aliases":{}},{"Index":"jaeger-service-2024-10-25","CreationTime":"2024-10-25T00:00:02.468Z","Aliases":{}},{"Index":"jaeger-service-2024-10-30","CreationTime":"2024-10-30T00:00:01.286Z","Aliases":{}},{"Index":"jaeger-service-2024-11-01","CreationTime":"2024-11-01T00:00:01.143Z","Aliases":{}},{"Index":"jaeger-span-2024-10-26","CreationTime":"2024-10-26T00:00:00.364Z","Aliases":{}}]}
{"level":"info","ts":1730501701.360637,"caller":"es-index-cleaner/main.go:105","msg":"Deleting indices","indices":[{"Index":"jaeger-span-2024-10-25","CreationTime":"2024-10-25T00:00:00.24Z","Aliases":{}},{"Index":"jaeger-service-2024-10-25","CreationTime":"2024-10-25T00:00:02.468Z","Aliases":{}}]}
Error: failed to delete indices: jaeger-span-2024-10-25,jaeger-service-2024-10-25,, request failed, status code: 404, body: {"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index [jaeger-service-2024-10-25]","index_uuid":"56pWYj7FRgiU-YimsljoYg","index":"jaeger-service-2024-10-25"}],"type":"index_not_found_exception","reason":"no such index [jaeger-service-2024-10-25]","index_uuid":"56pWYj7FRgiU-YimsljoYg","index":"jaeger-service-2024-10-25"},"status":404}
Fix for bug jaegertracing#6497. Add the query param ignore_unavailable so the index client does not error when deleting already-deleted indexes.
Signed-off-by: Shreyas Kirtane <[email protected]>
ES has an optional flag to ignore missing indexes. #6502 adds an option to es-index-cleaner so this flag can be set. This should resolve the problem.
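A minimal sketch of the idea, not the actual es-index-cleaner implementation: issue the delete with ignore_unavailable=true so that an index already removed by another cluster's job is not reported as an error. The function name, URL handling, and error handling below are illustrative assumptions.

// Minimal sketch (not the actual es-index-cleaner code): delete an index
// with ignore_unavailable=true so a missing index is not treated as an error.
package main

import (
	"fmt"
	"net/http"
)

func deleteIndex(esURL, index string) error {
	// ignore_unavailable=true tells Elasticsearch not to fail the request
	// if the index has already been deleted by another cleanup job.
	url := fmt.Sprintf("%s/%s?ignore_unavailable=true", esURL, index)
	req, err := http.NewRequest(http.MethodDelete, url, nil)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 400 {
		return fmt.Errorf("delete %s failed with status %d", index, resp.StatusCode)
	}
	return nil
}

func main() {
	if err := deleteIndex("http://localhost:9200", "jaeger-span-2024-10-25"); err != nil {
		fmt.Println(err)
	}
}

With the flag set, concurrent cleanup jobs can all issue the same delete and the slower ones simply become no-ops.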
Screenshot
No response
Additional context
No response
Jaeger backend version
v1.64.0
SDK
No response
Pipeline
No response
Storage backend
Elasticsearch v7.17.26
Operating system
Linux
Deployment model
Kubernetes
Deployment configs