TBS: Expired entries stay much longer than TTL and consume disk space #15121

Open
Tracked by #14931
carsonip opened this issue Jan 3, 2025 · 2 comments
carsonip (Member) commented Jan 3, 2025

#14923 focuses on resolving the state where apm-server is stuck exceeding the storage limit indefinitely. This issue, on the other hand, focuses on the fact that badger DB disk space is not used efficiently: entries generally stay around for longer than TTL. That violates the assumption that badger DB disk usage is proportional to event ingest throughput * TTL.

Compactions in badger DB are only triggered by reaching the level size target, and may take a long time to happen. If the value size is greater than ValueThreshold, the value is generally reclaimed during value log GC, and what lingers in the LSM tree beyond TTL is just the key. But if the value size is smaller than ValueThreshold, both the key and the value live in the LSM tree and must wait for the next compaction.

Potential solution: Lmax-to-Lmax compaction in badger v4 could help clear out expired entries without waiting for the level size target to be reached, but the performance impact is unclear.

axw (Member) commented Jan 6, 2025

Another option would be to create a new database for each time interval (say 1 minute), and delete databases after TTL + interval. (It is necessary to add the DB bucketing interval to the retention period, since an event may be written right at the end of its interval and still needs to survive a full TTL.)

This option has the benefit that it doesn't rely on the TTL feature of Badger, so we could potentially consolidate on Pebble for our LSM needs.

carsonip (Member, author) commented Jan 6, 2025

> Another option would be to create a new database for each time interval (say 1 minute), and delete databases after TTL + interval. (It would be necessary to add the DB bucketing interval in case an event falls right at the end of the interval.)

Forgot to write that down in this issue. I have a PoC of a variant of your idea (time interval = TTL) implemented in carsonip@2868f22 using badger. A new DB is created every TTL, and we keep 2 DBs alive at any given point in time, so that the age of any event across the combined DBs is strictly bounded by 2 * TTL. With the refactoring done in #15112 this can be implemented relatively easily on main.

Pebble is another story, but I agree it would then be easier to adopt, as we would no longer rely on any TTL feature.
