Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat/block rejections timeout heuristic #5731

Merged
merged 44 commits into from
Feb 3, 2025
Merged
Show file tree
Hide file tree
Changes from 35 commits
Commits
Show all changes
44 commits
Select commit Hold shift + click to select a range
ea771ca
added heuristic for block rejections timeout
rdeioris Jan 22, 2025
daffba1
Merge branch 'develop' into feat/block_rejections_heuristic
rdeioris Jan 22, 2025
6ab756b
added rejections_timeout test exposure via BLOCK_REJECTIONS_CURRENT_T…
rdeioris Jan 22, 2025
bfd4793
improved test logic
rdeioris Jan 22, 2025
3cebb35
fixed integration test
rdeioris Jan 22, 2025
db14fba
Merge branch 'develop' into feat/block_rejections_heuristic
rdeioris Jan 22, 2025
c3e163e
fixed comment for BLOCK_REJECTIONS_CURRENT_TIMEOUT
rdeioris Jan 22, 2025
6b751c4
fixed test comment
rdeioris Jan 22, 2025
bb8df99
fmt
rdeioris Jan 22, 2025
280483b
added more comments
rdeioris Jan 22, 2025
05b3f92
fixed formatting
rdeioris Jan 23, 2025
9971d1a
added wanr! when timeout
rdeioris Jan 24, 2025
45ae015
initial prototype for configurable block_rejction timeout steps
rdeioris Jan 28, 2025
3071d36
improved config system
rdeioris Jan 29, 2025
f2ab500
fmt-stacks
rdeioris Jan 29, 2025
5c0805a
fixed default rejections timeout
rdeioris Jan 29, 2025
f22a990
improved comment for rejections timeout
rdeioris Jan 29, 2025
4f4b584
fixed typo
rdeioris Jan 29, 2025
463839f
ensure 0 key is specified for rejections timeout
rdeioris Jan 29, 2025
0e7461c
message typo
rdeioris Jan 29, 2025
420f861
use Counters instead of global test vars
rdeioris Jan 30, 2025
57f80fb
restored original wait_for_block_status
rdeioris Jan 30, 2025
d5abfdc
removed empty line
rdeioris Jan 30, 2025
ed48786
Merge branch 'develop' into feat/block_rejections_heuristic
rdeioris Jan 30, 2025
5269714
updated CHANGELOG
rdeioris Jan 30, 2025
5a70c43
updated CHANGELOG
rdeioris Jan 30, 2025
c424212
Secp256k1PrivateKey::random() instrad of new()
rdeioris Jan 30, 2025
d713290
removed useless test attributes
rdeioris Jan 30, 2025
adaabe6
use u32 for block_rejection_timeout_steps keys
rdeioris Jan 30, 2025
7cfbfdd
refactored block_rejection_timeout_steps percentage setup
rdeioris Jan 30, 2025
d4171d7
do not reset rejections_timer on block_status updates
rdeioris Jan 30, 2025
fbc3b7d
moved block_rejection_timeout_steps computation directly in SignerCoo…
rdeioris Jan 30, 2025
33642c3
Merge branch 'develop' into feat/block_rejections_heuristic
rdeioris Jan 31, 2025
790c1e1
merged with develop, reintroduces old logic for condition variables, …
rdeioris Jan 31, 2025
54aed16
rollback BlockStatus Default and PartialEq
rdeioris Jan 31, 2025
eb883ee
improved error messages on timeout
rdeioris Jan 31, 2025
fe429f4
report rejection step in miner log, ensure signe_test shutdown in int…
rdeioris Jan 31, 2025
6a238a7
added set_miner_current_rejections_timeout and set_miner_current_reje…
rdeioris Jan 31, 2025
8d2bb0d
refactored block_validation_check_rejection_timeout_heuristic test
rdeioris Feb 1, 2025
151c844
use From based cast for u32->u64
rdeioris Feb 1, 2025
b813afa
merged with develop
rdeioris Feb 1, 2025
f66c894
removed useless line
rdeioris Feb 1, 2025
0c86012
Merge branch 'develop' into feat/block_rejections_heuristic
rdeioris Feb 3, 2025
400c806
use u32 for rejections counter
rdeioris Feb 3, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .github/workflows/bitcoin-tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -134,6 +134,7 @@ jobs:
- tests::signer::v0::block_commit_delay
- tests::signer::v0::continue_after_fast_block_no_sortition
- tests::signer::v0::block_validation_response_timeout
- tests::signer::v0::block_validation_check_rejection_timeout_heuristic
- tests::signer::v0::block_validation_pending_table
- tests::signer::v0::new_tenure_while_validating_previous_scenario
- tests::signer::v0::tenure_extend_after_bad_commit
Expand Down
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ and this project adheres to the versioning scheme outlined in the [README.md](RE
### Changed

- Miner will include other transactions in blocks with tenure extend transactions (#5760)
- Add `block_rejection_timeout_steps` to miner configuration for defining rejections-based timeouts while waiting for signers response (#5705)

### Fixed

Expand Down
30 changes: 29 additions & 1 deletion stackslib/src/config/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@

pub mod chain_data;

use std::collections::{HashMap, HashSet};
use std::collections::{BTreeMap, HashMap, HashSet};
use std::net::{Ipv4Addr, SocketAddr, ToSocketAddrs};
use std::path::PathBuf;
use std::str::FromStr;
Expand Down Expand Up @@ -2155,6 +2155,8 @@ pub struct MinerConfig {
pub tenure_extend_poll_secs: Duration,
/// Duration to wait before attempting to issue a tenure extend
pub tenure_timeout: Duration,
/// Define the timeout to apply while waiting for signers responses, based on the amount of rejections
pub block_rejection_timeout_steps: HashMap<u32, Duration>,
}

impl Default for MinerConfig {
Expand Down Expand Up @@ -2193,6 +2195,14 @@ impl Default for MinerConfig {
),
tenure_extend_poll_secs: Duration::from_secs(DEFAULT_TENURE_EXTEND_POLL_SECS),
tenure_timeout: Duration::from_secs(DEFAULT_TENURE_TIMEOUT_SECS),
block_rejection_timeout_steps: {
let mut rejections_timeouts_default_map = HashMap::<u32, Duration>::new();
rejections_timeouts_default_map.insert(0, Duration::from_secs(600));
rejections_timeouts_default_map.insert(10, Duration::from_secs(300));
rejections_timeouts_default_map.insert(20, Duration::from_secs(150));
rejections_timeouts_default_map.insert(30, Duration::from_secs(0));
rejections_timeouts_default_map
},
}
}
}
Expand Down Expand Up @@ -2589,6 +2599,7 @@ pub struct MinerConfigFile {
pub tenure_cost_limit_per_block_percentage: Option<u8>,
pub tenure_extend_poll_secs: Option<u64>,
pub tenure_timeout_secs: Option<u64>,
pub block_rejection_timeout_steps: Option<HashMap<String, u64>>,
}

impl MinerConfigFile {
Expand Down Expand Up @@ -2731,6 +2742,23 @@ impl MinerConfigFile {
tenure_cost_limit_per_block_percentage,
tenure_extend_poll_secs: self.tenure_extend_poll_secs.map(Duration::from_secs).unwrap_or(miner_default_config.tenure_extend_poll_secs),
tenure_timeout: self.tenure_timeout_secs.map(Duration::from_secs).unwrap_or(miner_default_config.tenure_timeout),
block_rejection_timeout_steps: {
if let Some(block_rejection_timeout_items) = self.block_rejection_timeout_steps {
let mut rejection_timeout_durations = HashMap::<u32, Duration>::new();
for (slice, seconds) in block_rejection_timeout_items.iter() {
match slice.parse::<u32>() {
Ok(slice_slot) => rejection_timeout_durations.insert(slice_slot, Duration::from_secs(*seconds)),
Err(e) => panic!("block_rejection_timeout_steps keys must be unsigned integers: {}", e)
};
}
if !rejection_timeout_durations.contains_key(&0) {
panic!("block_rejection_timeout_steps requires a definition for the '0' key/step");
}
rejection_timeout_durations
} else{
miner_default_config.block_rejection_timeout_steps
}
}
})
}
}
Expand Down
94 changes: 87 additions & 7 deletions testnet/stacks-node/src/nakamoto_node/signer_coordinator.rs
Original file line number Diff line number Diff line change
Expand Up @@ -13,9 +13,12 @@
// You should have received a copy of the GNU General Public License
// along with this program. If not, see <http://www.gnu.org/licenses/>.

use std::collections::BTreeMap;
use std::ops::Bound::Included;
use std::sync::atomic::AtomicBool;
use std::sync::{Arc, Mutex};
use std::thread::JoinHandle;
use std::time::{Duration, Instant};

use libsigner::v0::messages::{MinerSlotID, SignerMessage as SignerMessageV0};
use libsigner::{BlockProposal, SignerSession, StackerDBSession};
Expand Down Expand Up @@ -66,6 +69,8 @@ pub struct SignerCoordinator {
/// Rather, this burn block is used to determine whether or not a new
/// burn block has arrived since this thread started.
burn_tip_at_start: ConsensusHash,
/// The timeout configuration based on the percentage of rejections
block_rejection_timeout_steps: BTreeMap<u64, Duration>,
}

impl SignerCoordinator {
Expand Down Expand Up @@ -101,6 +106,14 @@ impl SignerCoordinator {
let miners_contract_id = boot_code_id(MINERS_NAME, is_mainnet);
let miners_session = StackerDBSession::new(&rpc_socket.to_string(), miners_contract_id);

// build a BTreeMap of the various timeout steps
let mut block_rejection_timeout_steps = BTreeMap::<u64, Duration>::new();
for (percentage, duration) in config.miner.block_rejection_timeout_steps.iter() {
let rejections_amount =
((f64::from(listener.total_weight) / 100.0) * f64::from(*percentage)) as u64;
block_rejection_timeout_steps.insert(rejections_amount, *duration);
}

let mut sc = Self {
message_key,
is_mainnet,
Expand All @@ -111,6 +124,7 @@ impl SignerCoordinator {
keep_running,
listener_thread: None,
burn_tip_at_start: burn_tip_at_start.clone(),
block_rejection_timeout_steps,
};

// Spawn the signer DB listener thread
Expand Down Expand Up @@ -293,16 +307,38 @@ impl SignerCoordinator {
sortdb: &SortitionDB,
counters: &Counters,
) -> Result<Vec<MessageSignature>, NakamotoNodeError> {
// the amount of current rejections (used to eventually modify the timeout)
let mut rejections: u64 = 0;
// default timeout (the 0 entry must be always present)
let mut rejections_timeout = self
.block_rejection_timeout_steps
.get(&rejections)
.ok_or_else(|| {
NakamotoNodeError::SigningCoordinatorFailure(
"Invalid rejection timeout step function definition".into(),
)
})?;

// this is used to track the start of the waiting cycle
let rejections_timer = Instant::now();
loop {
// At every iteration wait for the block_status.
// Exit when the amount of confirmations/rejections reaches the threshold (or until timeout)
// Based on the amount of rejections, eventually modify the timeout.
let block_status = match self.stackerdb_comms.wait_for_block_status(
block_signer_sighash,
EVENT_RECEIVER_POLL,
|status| {
status.total_weight_signed < self.weight_threshold
&& status
.total_reject_weight
.saturating_add(self.weight_threshold)
<= self.total_weight
// rejections-based timeout expired?
if rejections_timer.elapsed() > *rejections_timeout {
return false;
}
// number or rejections changed?
if status.total_reject_weight as u64 != rejections {
rdeioris marked this conversation as resolved.
Show resolved Hide resolved
return false;
}
// enough signatures?
return status.total_weight_signed < self.weight_threshold;
},
)? {
Some(status) => status,
Expand Down Expand Up @@ -336,10 +372,46 @@ impl SignerCoordinator {
return Err(NakamotoNodeError::BurnchainTipChanged);
}

if rejections_timer.elapsed() > *rejections_timeout {
warn!("Timed out while waiting for responses from signers";
"elapsed" => rejections_timer.elapsed().as_secs(),
"rejections_timeout" => rejections_timeout.as_secs(),
"rejections" => rejections,
"rejections_threshold" => self.total_weight.saturating_sub(self.weight_threshold)
);
return Err(NakamotoNodeError::SigningCoordinatorFailure(
"Gave up while trying reaching the threshold".into(),
rdeioris marked this conversation as resolved.
Show resolved Hide resolved
));
}

continue;
}
};

if rejections != block_status.total_reject_weight as u64 {
rejections = block_status.total_reject_weight as u64;
let rejections_timeout_tuple = self
.block_rejection_timeout_steps
.range((Included(0), Included(rejections)))
.last()
.ok_or_else(|| {
NakamotoNodeError::SigningCoordinatorFailure(
"Invalid rejection timeout step function definition".into(),
)
})?;
rejections_timeout = rejections_timeout_tuple.1;
info!("Number of received rejections updated, resetting timeout";
"rejections" => rejections,
"rejections_timeout" => rejections_timeout.as_secs(),
"rejections_threshold" => self.total_weight.saturating_sub(self.weight_threshold));
rdeioris marked this conversation as resolved.
Show resolved Hide resolved

Counters::set(
&counters.naka_miner_current_rejections_timeout_secs,
rejections_timeout.as_secs(),
);
Counters::set(&counters.naka_miner_current_rejections, rejections);
}

if block_status
.total_reject_weight
.saturating_add(self.weight_threshold)
Expand All @@ -357,10 +429,18 @@ impl SignerCoordinator {
"block_signer_sighash" => %block_signer_sighash,
);
return Ok(block_status.gathered_signatures.values().cloned().collect());
} else {
} else if rejections_timer.elapsed() > *rejections_timeout {
warn!("Timed out while waiting for responses from signers";
"elapsed" => rejections_timer.elapsed().as_secs(),
"rejections_timeout" => rejections_timeout.as_secs(),
"rejections" => rejections,
"rejections_threshold" => self.total_weight.saturating_sub(self.weight_threshold)
);
return Err(NakamotoNodeError::SigningCoordinatorFailure(
"Unblocked without reaching the threshold".into(),
"Gave up while trying reaching the threshold".into(),
rdeioris marked this conversation as resolved.
Show resolved Hide resolved
));
} else {
continue;
}
}
}
Expand Down
7 changes: 5 additions & 2 deletions testnet/stacks-node/src/run_loop/neon.rs
Original file line number Diff line number Diff line change
Expand Up @@ -118,6 +118,9 @@ pub struct Counters {
pub naka_miner_directives: RunLoopCounter,
pub naka_submitted_commit_last_stacks_tip: RunLoopCounter,

pub naka_miner_current_rejections: RunLoopCounter,
pub naka_miner_current_rejections_timeout_secs: RunLoopCounter,

#[cfg(test)]
pub naka_skip_commit_op: TestFlag<bool>,
}
Expand All @@ -136,12 +139,12 @@ impl Counters {
fn inc(_ctr: &RunLoopCounter) {}

#[cfg(test)]
fn set(ctr: &RunLoopCounter, value: u64) {
pub fn set(ctr: &RunLoopCounter, value: u64) {
ctr.0.store(value, Ordering::SeqCst);
}

#[cfg(not(test))]
fn set(_ctr: &RunLoopCounter, _value: u64) {}
pub fn set(_ctr: &RunLoopCounter, _value: u64) {}
rdeioris marked this conversation as resolved.
Show resolved Hide resolved

pub fn bump_blocks_processed(&self) {
Counters::inc(&self.blocks_processed);
Expand Down
1 change: 1 addition & 0 deletions testnet/stacks-node/src/tests/signer/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -937,6 +937,7 @@ fn setup_stx_btc_node<G: FnMut(&mut NeonConfig)>(
let counters = run_loop.counters();

let coord_channel = run_loop.coordinator_channels();

let run_loop_thread = thread::spawn(move || run_loop.start(None, 0));

// Give the run loop some time to start up!
Expand Down
Loading
Loading