clean up owned assets involving offline participants #212

Draft: wants to merge 3 commits into main

@saketh-are (Contributor) commented Feb 14, 2025

Motivation

If a node wipes its storage for any reason, every previously generated asset whose participant set includes that node becomes unusable. Other nodes that own such assets keep storing them indefinitely and only discover the problem when they try to use one, at which point the computation fails. We need a recovery strategy for this scenario.

Implementation

Once a node has filled its store with the desired quantity of an asset, it will gradually verify its stored assets and discard those that depend on offline participants.

To recover the system, the wiped node would intentionally remain offline for some time. The other nodes would notice its absence and discard any owned assets that depend on it.
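
The gradual verification described above could look roughly like the following. This is a minimal synchronous sketch, not the code in this PR: `OwnedAsset`, `OwnedQueue`, `ParticipantId`, and the signature of `maybe_discard_owned` shown here are all assumptions made for illustration (the real method is async and takes only a count).

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical participant identifier, for illustration only.
pub type ParticipantId = u32;

/// Hypothetical owned asset: an id plus the participants it depends on.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct OwnedAsset {
    pub id: u64,
    pub participants: Vec<ParticipantId>,
}

/// Hypothetical in-memory queue of assets owned by this node.
#[derive(Debug, Default)]
pub struct OwnedQueue {
    assets: VecDeque<OwnedAsset>,
}

impl OwnedQueue {
    pub fn push(&mut self, asset: OwnedAsset) {
        self.assets.push_back(asset);
    }

    pub fn len(&self) -> usize {
        self.assets.len()
    }

    pub fn is_empty(&self) -> bool {
        self.assets.is_empty()
    }

    /// Checks at most `max_checks` assets from the front of the queue.
    /// An asset whose participants are all alive is moved to the back;
    /// any other asset is dropped. Returns the number of dropped assets.
    pub fn maybe_discard_owned(
        &mut self,
        alive: &HashSet<ParticipantId>,
        max_checks: usize,
    ) -> usize {
        // Never check more assets than are stored, so that no asset is
        // examined twice within a single call.
        let checks = max_checks.min(self.assets.len());
        let mut discarded = 0;
        for _ in 0..checks {
            let Some(asset) = self.assets.pop_front() else { break };
            if asset.participants.iter().all(|p| alive.contains(p)) {
                self.assets.push_back(asset);
            } else {
                discarded += 1;
            }
        }
        discarded
    }
}
```

Calling this with a small `max_checks` on every tick spreads the verification cost over time. Note that rotating checked assets to the back changes the queue order, so a sketch like this does not preserve oldest-first selection.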

Visibility

Each owned asset type is tracked using three counters: `available`, `online`, and `offline`.

When the alive participant set changes, both `online` and `offline` are reset to 0, indicating that none of the stored assets have been checked yet against the new participant set.

If the participant set then remains stable, the stored assets will be verified over time. When `available` and `online` are equal, and `offline` is 0, we will know that all assets depending on the offline node have been found and discarded.
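
As an illustration of how the three counters interact, here is a small sketch. The struct and method names are hypothetical, and the exact meaning of `offline` is an assumption: here it counts assets that have been checked, found to depend on an offline participant, and not yet discarded.

```rust
/// Hypothetical counter set for one owned asset type.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct OwnedAssetCounters {
    /// Assets currently stored and owned by this node.
    pub available: usize,
    /// Stored assets verified as usable with the current alive set.
    pub online: usize,
    /// Assumption: stored assets known to depend on an offline
    /// participant that have not been discarded yet.
    pub offline: usize,
}

impl OwnedAssetCounters {
    /// The alive participant set changed: nothing has been checked yet.
    pub fn on_participants_changed(&mut self) {
        self.online = 0;
        self.offline = 0;
    }

    /// A stored asset was checked and all of its participants are alive.
    pub fn mark_online(&mut self) {
        self.online += 1;
    }

    /// A stored asset was checked and depends on an offline participant.
    pub fn mark_offline(&mut self) {
        self.offline += 1;
    }

    /// An asset previously marked offline was removed from the store.
    pub fn discard_offline(&mut self) {
        self.offline = self.offline.saturating_sub(1);
        self.available = self.available.saturating_sub(1);
    }

    /// True once every stored asset has been verified as online and no
    /// offline-dependent asset remains.
    pub fn cleanup_complete(&self) -> bool {
        self.available == self.online && self.offline == 0
    }
}
```

With three stored assets, two of them usable and one depending on the wiped node, `cleanup_complete` stays false until the third asset has been discarded, and becomes false again as soon as the participant set changes.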


Closes #213. See also #207.

A time-based expiration on all assets in the DistributedAssetStore will be implemented separately to prevent unbounded accumulation of unowned assets.

@@ -307,8 +423,10 @@ where
/// owning some of the assets. Each asset has exactly one owner.
///
/// Only the owner of an asset may pick the asset for use in an MPC computation.
/// As the owner, the `take_owned` method removes the oldest asset from the

saketh-are (Contributor Author) commented:

Even before this PR, we did not actually guarantee that we picked the oldest usable asset. This will likely need to be revisited in the future if we implement a strategy for agreement between nodes on their available assets. For now, I don't believe anything is sensitive to the ordering of assets once they are generated.


if !should_generate {
    // After the store is full, slowly discard triples which cannot be used right now
    triple_store.maybe_discard_owned(10).await;

saketh-are (Contributor Author) commented:

This should be set relative to the buffer size such that recovery happens in a reasonable timeframe. At 10 per 0.1s (100 per second), a million buffered triples would be turned over in 10,000 seconds, which is just under 3 hours. Not sure if this is something we'd want to expose in the config.
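
To make the sizing question concrete, the arithmetic can be written out as below. Both helpers are hypothetical and not part of this PR. The first reproduces the estimate above (1,000,000 triples at 10 per 0.1s takes 10,000 seconds); the second inverts it to give the per-tick count needed to hit a target recovery time.

```rust
use std::time::Duration;

/// Time needed to check `buffered` assets when `per_tick` assets are
/// checked every `tick`. Returns `None` if `per_tick` is zero or the
/// result does not fit in a `Duration`.
pub fn turnover_time(buffered: u64, per_tick: u64, tick: Duration) -> Option<Duration> {
    if per_tick == 0 {
        return None;
    }
    // Ceiling division: a partially filled final tick still costs a tick.
    let ticks = buffered / per_tick + u64::from(buffered % per_tick != 0);
    tick.checked_mul(u32::try_from(ticks).ok()?)
}

/// Smallest per-tick count that checks `buffered` assets within `target`.
/// Returns `None` if `tick` is zero or `target` is shorter than one tick.
pub fn per_tick_for_target(buffered: u64, tick: Duration, target: Duration) -> Option<u64> {
    if tick.is_zero() {
        return None;
    }
    let ticks = target.as_nanos() / tick.as_nanos();
    if ticks == 0 {
        return None;
    }
    let buffered = u128::from(buffered);
    // Ceiling division again, so the target is met rather than just missed.
    let per_tick = buffered / ticks + u128::from(buffered % ticks != 0);
    u64::try_from(per_tick).ok()
}
```

For example, under these assumptions a one-hour recovery target with a 0.1s tick and a million buffered triples would need 28 checks per tick instead of 10.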

Successfully merging this pull request may close these issues.

Implement eviction of owned assets involving offline participants