From 64c545da286bb6b965a951f3177ed43ad61c96fb Mon Sep 17 00:00:00 2001
From: Moritz
Date: Tue, 28 Jan 2025 13:27:05 +0100
Subject: [PATCH 1/6] bench(synth-bench): make binary selection more explicit

---
 benchmarks/synth-bm/justfile                                 | 3 ++-
 docs/practices/workflows/benchmarking_synthetic_workloads.md | 4 ++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/benchmarks/synth-bm/justfile b/benchmarks/synth-bm/justfile
index 7645b04974f..bbd458e1801 100644
--- a/benchmarks/synth-bm/justfile
+++ b/benchmarks/synth-bm/justfile
@@ -1,4 +1,5 @@
-neard := "../../target/release/neard"
+# Place the `neard` binary to be benchmarked here.
+neard := "./neard"
 near_localnet_home := ".near/"
 rpc_url := "http://127.0.0.1:3030"

diff --git a/docs/practices/workflows/benchmarking_synthetic_workloads.md b/docs/practices/workflows/benchmarking_synthetic_workloads.md
index 121dc01b549..8d4f6999ede 100644
--- a/docs/practices/workflows/benchmarking_synthetic_workloads.md
+++ b/docs/practices/workflows/benchmarking_synthetic_workloads.md
@@ -78,8 +78,8 @@ Details of bringing up and configuring a network are out of scope for this docum

 ### Build `neard`

-Choose the git commit and cargo features corresponding to what you want to benchmark. Most likely you will want a `--release` build to measure TPS.
-
+Choose the git commit and cargo features corresponding to what you want to benchmark. Most likely you will want a `--release` build to measure TPS. Place the corresponding `neard` binary in the justfile's directory or adjust commands and justfile variables to point to your binary.
+
 ### Initialize the network

 ```command

From 0e101332962914d612cc4cc662336437dc2a26ba Mon Sep 17 00:00:00 2001
From: Moritz
Date: Tue, 28 Jan 2025 13:51:20 +0100
Subject: [PATCH 2/6] add cspell word

---
 docs/practices/workflows/benchmarking_synthetic_workloads.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/practices/workflows/benchmarking_synthetic_workloads.md b/docs/practices/workflows/benchmarking_synthetic_workloads.md
index 8d4f6999ede..77e2fa6e357 100644
--- a/docs/practices/workflows/benchmarking_synthetic_workloads.md
+++ b/docs/practices/workflows/benchmarking_synthetic_workloads.md
@@ -113,4 +113,6 @@ Following these steps so far creates a config that will throttle throughput due
 }
 ```

+cspell:words
+
 Note that as `nearcore` evolves, these steps and `BENCHMARKNET` adjustments might need to be updated to achieve the effect of unlimiting configuration.

From 94f24c3759f0a92f501189ea861a07128c17ba0e Mon Sep 17 00:00:00 2001
From: Moritz
Date: Wed, 29 Jan 2025 08:35:02 +0100
Subject: [PATCH 3/6] fix typo

---
 docs/practices/workflows/benchmarking_synthetic_workloads.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/docs/practices/workflows/benchmarking_synthetic_workloads.md b/docs/practices/workflows/benchmarking_synthetic_workloads.md
index 77e2fa6e357..d2c95a04c18 100644
--- a/docs/practices/workflows/benchmarking_synthetic_workloads.md
+++ b/docs/practices/workflows/benchmarking_synthetic_workloads.md
@@ -113,6 +113,5 @@ Following these steps so far creates a config that will throttle throughput due
 }
 ```

-cspell:words

 Note that as `nearcore` evolves, these steps and `BENCHMARKNET` adjustments might need to be updated to achieve the effect of unlimiting configuration.

From f3a642bfc4ba75799430cc5a29dd090177da74c0 Mon Sep 17 00:00:00 2001
From: Moritz
Date: Wed, 29 Jan 2025 16:04:52 +0100
Subject: [PATCH 4/6] Add binary to .gitignore

---
 benchmarks/synth-bm/.gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/benchmarks/synth-bm/.gitignore b/benchmarks/synth-bm/.gitignore
index 6d244f57f97..d3fa05130b9 100644
--- a/benchmarks/synth-bm/.gitignore
+++ b/benchmarks/synth-bm/.gitignore
@@ -1,3 +1,4 @@
+neard
 .near/
 target/
 user-data/

From 32fb5b5257bf78614d260719dadd42b512fab575 Mon Sep 17 00:00:00 2001
From: Moritz
Date: Mon, 10 Feb 2025 12:01:43 +0100
Subject: [PATCH 5/6] Remove merge conflict leftover

---
 docs/practices/workflows/benchmarking_synthetic_workloads.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/docs/practices/workflows/benchmarking_synthetic_workloads.md b/docs/practices/workflows/benchmarking_synthetic_workloads.md
index 0cc302d0b16..5d7a9f0b14b 100644
--- a/docs/practices/workflows/benchmarking_synthetic_workloads.md
+++ b/docs/practices/workflows/benchmarking_synthetic_workloads.md
@@ -145,7 +145,6 @@ Controls the rate at which transactions are sent. Assuming your hardware is able

 Before an RPC request is sent, the tooling awaits capacity on a buffered channel. Thereby the number of outstanding RPC requests is limited by `channel-buffer-size`. This can slow down the rate at which transactions are sent in case the node is congested. To disable that behavior, set `channel-buffer-size` to a large value, e.g. the total number of transactions to be sent.

-<<<<<<< HEAD
 ## Workflows

 The tooling's [`justfile`](../../../benchmarks/synth-bm/justfile) contains recipes for the most relevant workflows.

From 176bba0b20e61b47e23b3f13d0d6a19f277be181 Mon Sep 17 00:00:00 2001
From: Moritz
Date: Mon, 10 Feb 2025 12:15:00 +0100
Subject: [PATCH 6/6] Remove text that got duplicated after merging upstream

---
 .../benchmarking_synthetic_workloads.md | 149 +-----------------
 1 file changed, 1 insertion(+), 148 deletions(-)

diff --git a/docs/practices/workflows/benchmarking_synthetic_workloads.md b/docs/practices/workflows/benchmarking_synthetic_workloads.md
index 5d7a9f0b14b..7571a7a70d0 100644
--- a/docs/practices/workflows/benchmarking_synthetic_workloads.md
+++ b/docs/practices/workflows/benchmarking_synthetic_workloads.md
@@ -1,151 +1,4 @@
-# Benchmarking synthetic workloads
-
-Benchmarking a synthetic workload starts a new network with empty state. Then state is created and afterwards transactions involving that state are generated. For example, the native token transfer workload creates `n` accounts with NEAR balance and then generates transactions to transfer the native token between accounts.
-
-This approach has the following benefits:
-
-- Relatively simple and quick setup, as there is no state from real work networks involved.
-- Fine grained control over traffic intensity.
-- Enabling the comparison of `neard` performance at different points in time or with different features.
-- Might expose performance bottlenecks.
-
-The main drawbacks of synthetic benchmarks are:
-
-- Drawing conclusions is limited as real world traffic is not homogeneous.
-- Calibrating traffic generation parameters can be cumbersome.
-
-The tooling for synthetic benchmarks is available in [`benchmarks/synth-bm`](../../../benchmarks/synth-bm).
-
-## Workflows
-The tooling's [`justfile`](../../../benchmarks/synth-bm/justfile) contains recipes for the most relevant workflows.
-
-### Benchmark native token transfers
-
-A typical workflow benchmarking the native token transfers using the above `justfile` would be something along the:
-- set up the network
-
-```command
-rm -rf .near && just init_localnet
-# Modify the configuration (see the "Un-limit configuration" section)
-[t1]$ just run_localnet
-[t1]$ just create_subaccounts
-```
-- run the benchmark
-```command
-# set the desired tx rate (`--interval-duration-micros`) and the total volume (`--num-transfers`) in the justfile
-[t2]$ just benchmark_native_transfers
-```
-
-This benchmark generates a native token transfer workload involving the accounts provided in `--user-dada-dir`. Transactions are generated by iterating through these accounts and sending native tokens to a randomly chosen receiver from the same set of accounts. To view all options, run:
-
-```command
-cargo run --release -- benchmark-native-transfers --help
-```
-
-For the native transfer benchmark transactions are sent with `wait_until: None`, meaning the responses the `near_synth_bm` tool receives are basically just an ACK by the RPC confirming it received the transaction.
-Thus the numbers reported by the tool as if in
-```
-[2025-01-27T14:05:12Z INFO near_synth_bm::native_transfer] Sent 200000 txs in 6.50 seconds
-[2025-01-27T14:05:12Z INFO near_synth_bm::rpc] Received 200000 tx responses in 6.49 seconds
-```
-are not directly indicative of the runtime performance and transaction outcomes.
-The number of transactions successfully processed may be obtained by querying the `near_transaction_processed_successfully_total` metric, e.g. with: `http://localhost:3030/metrics | grep transaction_processed`.
-Automatic calculation of transactions per second (TPS) when RPC requests are sent with `wait_until: NONE` is coming up shortly.
-
-### Benchmark calls to the `sign` method of an MPC contract
-
-Assumes the accounts that send the transactions invoking `sign` have been created as described above. Transactions can be sent to a RPC of a network on which an instance of the [`mpc/chain-signatures`](https://github.com/near/mpc/tree/79ec50759146221e7ad8bb04520f13333b75ca07/chain-signatures/contract) is deployed.
-
-Transactions are sent to the RPC with `wait_until: EXECUTED_OPTIMISTIC` as the throughput for `sign` is at a level at which neither the network nor the RPC are expected to be a bottleneck.
-
-All options of the command can be shown with:
-
-```command
-cargo run -- benchmark-mpc-sign --help
-```
-
-## Auxiliary steps
-
-### Network setup and `neard` configuration
-
-Details of bringing up and configuring a network are out of scope for this document. Instead we just give a brief overview of the setup regularly used to benchmark TPS of common workloads in a single-node with a single-shard setup.
-
-### Build `neard`
-
-Choose the git commit and cargo features corresponding to what you want to benchmark. Most likely you will want a `--release` build to measure TPS. Place the corresponding `neard` binary in the justfile's directory or adjust commands and justfile variables to point to your binary.
-
-### Create sub accounts
-
-Creating the state for synthetic benchmarks usually starts with creating accounts. We create sub accounts for the account specified by `--signer-key-path`. This avoids dealing with the registrar, which would be required for creating top level accounts. To view all options, run:
-
-```command
-cargo run --release -- create-sub-accounts --help
-```
-
-### Initialize the network
-
-```command
-./neard --home .near init --chain-id localnet
-```
-
-### Enable memtrie
-
-The configuration generated by the above command does not enable memtrie. However, most benchmarks should run against a node with memtrie enabled, which can be achieved by setting the following in `.near/config.json`:
-
-```
-"load_mem_tries_for_tracked_shards": true
-```
-
-### Un-limit configuration
-Following these steps so far creates a config that will throttle throughput due to various factors related to state witness size, gas/compute limits, and congestion control. In case you want to benchmark a node that fully utilizes its hardware, you can do the following modifications to effectively run with unlimited configuration:
-
-```
-# Modifications in .near/genesis.json
-
-"chain_id": "benchmarknet"
-"gas_limit": 20000000000000000 # increase default by x20
-
-# Modifications in .near/config.json
-"view_client_threads": 8 # increase default by x2
-"load_mem_tries_for_tracked_shards": true # enable memtrie
-"produce_chunk_add_transactions_time_limit": {
- "secs": 0,
- "nanos": 800000000 # increase default by x4
-}
-```
-
-Note that as `nearcore` evolves, these steps and `BENCHMARKNET` adjustments might need to be updated to achieve the effect of un-limiting configuration.
-
-
-Modifications of `genesis.json` need to be applied before initializing the network with `just init_localnet`. Otherwise `just run_localnet` will fail. If you ran the node with default config and want to switch to unlimited config, the required steps are:
-
-```console
-# Remove .near as you will need to initialize localnet again.
-$ rm -rf .near
-$ just init_localnet
-# Modify the configuration
-$ just run_localnet
-```
-
-## Common parameters
-
-The following parameters are common to multiple tasks:
-
-### `rpc-url`
-
-The RPC endpoint to which transactions are sent.
-
-Synthetic benchmarking may create thousands of transactions per second, which can hit network limitations if the RPC is located on a separate machine. In particular sending transactions to nodes running on GCP requires care as it can cause temporary IP address bans. For that scenario it is recommended to run a separate traffic generation vm located in the same GCP zone as the RPC node and send transactions to its `internal IP`.
-
-### `interval-duration-micros`
-
-Controls the rate at which transactions are sent. Assuming your hardware is able to send a request at every interval tick, the number of transactions sent per second equals `1_000_000 / interval-duration-micros`. The rate might be slowed down if `channel-buffer-size` becomes a bottleneck.
-
-### `channel-buffer-size`
-
-Before an RPC request is sent, the tooling awaits capacity on a buffered channel. Thereby the number of outstanding RPC requests is limited by `channel-buffer-size`. This can slow down the rate at which transactions are sent in case the node is congested. To disable that behavior, set `channel-buffer-size` to a large value, e.g. the total number of transactions to be sent.
-
-## Workflows
+# Workflows

 The tooling's [`justfile`](../../../benchmarks/synth-bm/justfile) contains recipes for the most relevant workflows.
