Apply buckets optimization #4634

Open · SirTyson wants to merge 2 commits into master from apply-buckets-optimization

Conversation

SirTyson (Contributor)

Description

Resolves #4633

This PR indexes bucket files during VerifyBucketsWork. Previously we would download Buckets, iterate through all of them to check their hashes, then iterate through all of them again to index them. Since startup is primarily disk bound, reading the entire BucketList twice is expensive. This change performs the hash verification and indexing steps in the same pass, so the BucketList only has to be read once.

This introduces no new DoS vectors. Indexing unverified buckets could in principle enable an OOM-based DoS attack, where a malicious History Archive provider hosts very large malicious buckets. However, such OOM attacks are already possible via a zip bomb, and History Archive providers are fairly trusted, so this is not a significant concern. To mitigate it, I've added an INFO-level log message stating which history archive a given file is being downloaded from. In the event of an attack, these logs would give us enough information to quickly assign blame and remove the attacker from quorum sets.

Rebased on top of #4630.
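
To illustrate the single-pass idea in the abstract (all names below are stand-ins, not stellar-core's actual API):

// Minimal sketch: the stream updates the hasher as it reads, and the
// same entry feeds the index builder, so each bucket file is traversed
// exactly once. Stream, Hasher, and IndexBuilder are illustrative.
template <typename Stream, typename Hasher, typename IndexBuilder>
void
verifyAndIndexOnePass(Stream& in, Hasher& hasher, IndexBuilder& builder)
{
    typename Stream::EntryT e;
    while (in.readOne(e, &hasher)) // hashing happens during the read
    {
        builder.addEntry(e); // indexing happens in the same pass
    }
    // caller compares hasher.finish() against the expected bucket hash
}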

Checklist

  • Reviewed the contributing document
  • Rebased on top of master (no merge commits)
  • Ran clang-format v8.0.0 (via make format or the Visual Studio extension)
  • Compiles
  • Ran all tests
  • If change impacts performance, include supporting evidence per the performance document

@SirTyson requested review from bboston7 and dmkozh on January 29, 2025 03:16
@SirTyson force-pushed the apply-buckets-optimization branch from 838489e to b4dfd1e on January 30, 2025 19:50
@bboston7 (Contributor) left a comment


This is a great optimization! I'm not super familiar with the bucket portion of the codebase, so my review is more focused on standard C++ stuff.

#include "util/types.h"
#include "xdr/Stellar-ledger-entries.h"
#include <algorithm>
#include <asm-generic/errno.h>

Is there a reason to use asm-generic/errno.h instead of the standard errno.h? The asm-generic version adds a dependency on Linux headers, but we technically do support non-Linux hosts. A quick Google search suggests it may be possible to get Linux headers on macOS, though I didn't try it myself. I have no idea about Windows.
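
For reference, the portable spelling (standard headers only, no kernel-header dependency):

// Standard header providing errno and the E* constants on Linux,
// macOS, and Windows alike.
#include <cerrno> // C++ spelling; plain <errno.h> also works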

docs/metrics.md Outdated
bucketlistDB-live.bulk.inflationWinners | timer | time to load inflation winners
bucketlistDB-live.bulk.poolshareTrustlines | timer | time to load poolshare trustlines by accountID and assetID
bucketlistDB-live.bulk.prefetch | timer | time to prefetch
bucketlistDB-<X>.point.<y> | timer | time to load single entry of type <Y> on BucketList <X> (live/hotArchive)

Suggested change:
- bucketlistDB-<X>.point.<y> | timer | time to load single entry of type <Y> on BucketList <X> (live/hotArchive)
+ bucketlistDB-<X>.point.<Y> | timer | time to load single entry of type <Y> on BucketList <X> (live/hotArchive)

Nitpick: match the case of <Y> with the description in the rightmost column.

@@ -235,13 +235,13 @@ MAX_DEX_TX_OPERATIONS_IN_TX_SET = 0
# 0, indiviudal index is always used. Default page size 16 kb.
BUCKETLIST_DB_INDEX_PAGE_SIZE_EXPONENT = 14

# BUCKETLIST_DB_INDEX_CUTOFF (Integer) default 20
# BUCKETLIST_DB_INDEX_CUTOFF (Integer) default 250
# Size, in MB, determining whether a bucket should have an individual
# key index or a key range index. If bucket size is below this value, range
# based index will be used. If set to 0, all buckets are range indexed. If
# BUCKETLIST_DB_INDEX_PAGE_SIZE_EXPONENT == 0, value ingnored and all

Suggested change:
- # BUCKETLIST_DB_INDEX_PAGE_SIZE_EXPONENT == 0, value ingnored and all
+ # BUCKETLIST_DB_INDEX_PAGE_SIZE_EXPONENT == 0, value ignored and all

}
// BucketIndex throws if BucketManager shuts down before index finishes,
// so return empty index instead of partial index
catch (std::runtime_error&)

Is it possible to catch a more specific exception here? Maybe BucketIndex should have its own exception type that could be caught, rather than swallowing all runtime exceptions.
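
A minimal sketch of what such a dedicated exception type could look like (the name and shape are hypothetical):

#include <stdexcept>

// Hypothetical exception thrown by the index when the BucketManager
// shuts down mid-build, so the catch site doesn't have to swallow
// every std::runtime_error.
class BucketIndexShutdownException : public std::runtime_error
{
  public:
    BucketIndexShutdownException()
        : std::runtime_error("BucketManager shut down during indexing")
    {
    }
};

The catch clause above could then read catch (BucketIndexShutdownException const&) and let other runtime errors propagate.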

Comment on lines +1580 to +1602

// Persist a map of indexes so we don't have dangling references in
// VerifyBucketsWork. We don't actually need to use the indexes created by
// VerifyBucketsWork here, so a throwaway static map is fine.
static std::map<int, std::unique_ptr<LiveBucketIndex const>> indexMap;

int i = 0;

I'm a little confused about this. It looks like VerifyBucketWork stores a reference to a unique pointer and this map exists to ensure that by the time that task runs, the pointee hasn't been deallocated due to this function going out of scope. Why not have VerifyBucketWork take ownership of the pointer completely and get rid of this map?


+1, this seems sketchy
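
A sketch of the ownership transfer being suggested, assuming a hypothetical VerifyBucketWork constructor that takes the index by value; the static map then disappears entirely:

#include <memory>
#include <utility>

class LiveBucketIndex; // existing stellar-core type, forward-declared here

// Hypothetical shape: the work owns its index outright, so the pointee
// lives exactly as long as the task and no function-local static map
// is needed to keep it alive.
class VerifyBucketWork
{
    std::unique_ptr<LiveBucketIndex const> mIndex;

  public:
    explicit VerifyBucketWork(std::unique_ptr<LiveBucketIndex const> index)
        : mIndex(std::move(index))
    {
    }
};

At the call site, each freshly built index would be std::move'd into the work instead of being parked in indexMap.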


while (in && in.readOne(be, hasherPtr))
{
// peridocially check if bucket manager is exiting to stop indexing

Suggested change:
- // peridocially check if bucket manager is exiting to stop indexing
+ // periodically check if bucket manager is exiting to stop indexing

std::streamoff pageUpperBound = 0;
typename BucketT::EntryT be;
size_t iter = 0;
[[maybe_unused]] size_t count = 0;

Is [[maybe_unused]] necessary here? It looks like count is used.

Comment on lines 241 to 242
mData.filter =
std::make_unique<BinaryFuseFilter16>(keyHashes, seed);

Is this missing a break or something after the assignment? It looks like this always retries 10 times (albeit with the same seed if it's successful). Also, is there a risk that this fails 10 times in a row and mData.filter remains unset when exiting this loop?
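
Assuming construction failure is reported by exception (how BinaryFuseFilter16 actually signals failure is a guess here), the loop the comment has in mind would break on success and fail loudly once retries are exhausted, roughly:

// Sketch only, reusing mData, keyHashes, and seed from the surrounding
// code: stop retrying once the filter builds, reseed on failure, and
// refuse to exit the loop with mData.filter unset.
bool built = false;
for (int attempt = 0; attempt < 10 && !built; ++attempt)
{
    try
    {
        mData.filter =
            std::make_unique<BinaryFuseFilter16>(keyHashes, seed);
        built = true; // success: no further retries
    }
    catch (std::exception const&)
    {
        seed = nextSeed(seed); // hypothetical reseeding helper
    }
}
if (!built)
{
    throw std::runtime_error("binary fuse filter construction failed");
}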

- Bucket file size, in MB, that determines wether the `IndividualIndex` or
`RangeIndex` is used.
Default value is 20 MB, which indexes the first ~3 levels with the `IndividualIndex`.
- Bucket file size, in MB, that determines wether the Bucket is cached in memory or not.

Suggested change:
- - Bucket file size, in MB, that determines wether the Bucket is cached in memory or not.
+ - Bucket file size, in MB, that determines whether the Bucket is cached in memory or not.

app.getWorkerIOContext(), hasher);
releaseAssertOrThrow(index);

uint256 vHash = hasher->finish();

Nitpick: assert hasher.has_value() before dereferencing here, or use hasher.value().finish().
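
The nitpick in code form, reusing the releaseAssertOrThrow and uint256 utilities already visible above (the two options are alternatives, pick one):

// Option 1: make the precondition explicit before dereferencing.
releaseAssertOrThrow(hasher.has_value());
uint256 vHash = hasher->finish();

// Option 2: std::optional::value() throws std::bad_optional_access on
// a disengaged optional, so the failure is loud rather than undefined.
uint256 vHash = hasher.value().finish();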

@SirTyson force-pushed the apply-buckets-optimization branch from c0577fc to a982355 on February 7, 2025 22:50
@@ -59,8 +60,9 @@ InMemoryIndex::InMemoryIndex(BucketManager const& bm,
std::streamoff lastOffset = 0;
std::optional<std::streamoff> firstOffer;
std::optional<std::streamoff> lastOffer;
SHA256* hasherPtr = hasher.has_value() ? &hasher.value() : nullptr;

nit: Why not pass the pointer all the way through? A raw pointer is still the best approximation we have for the optional<reference> intended here.
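
A sketch of passing the pointer end-to-end (the signature is illustrative), so the optional-to-pointer bridge at this one spot disappears:

#include <string>

class SHA256; // existing stellar-core hasher, forward-declared here

// Illustrative: take SHA256* directly, where nullptr means "no hashing
// requested"; every caller and callee then speaks the same type.
void readAllEntries(std::string const& filename, SHA256* hasher);
// inside the read loop: if (hasher) { hasher->add(...); }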

self->mIndex =
createIndex<LiveBucket>(bm, self->mBucket->getFilename(),
self->mBucket->getHash(), ctx);
std::optional<SHA256> empty{};

Why not pass std::nullopt directly?
