near · mooori · Jan 28, 2025 · Jan 28, 2025 · Jan 29, 2025 · Jan 29, 2025
@@ -1,3 +1,4 @@
+neard
 .near/
 target/
 user-data/
@@ -1,4 +1,5 @@
-neard := "../../target/release/neard"
+# Place the `neard` binary to be benchmarked here.
+neard := "./neard"
 near_localnet_home := ".near/"
 rpc_url := "http://127.0.0.1:3030"
 

@@ -1,56 +1,28 @@
-# Benchmarking synthetic workloads
+# Workflows
 
-Benchmarking a synthetic workload starts a new network with empty state. Then state is created and afterwards transactions involving that state are generated. For example, the native token transfer workload creates `n` accounts with NEAR balance and then generates transactions to transfer the native token between accounts.
-
-This approach has the following benefits:
-
-- Relatively simple and quick setup, as there is no state from real work networks involved.
-- Fine grained control over traffic intensity.
-- Enabling the comparison of `neard` performance at different points in time or with different features.
-- Might expose performance bottlenecks.
-
-The main drawbacks of synthetic benchmarks are:
-
-- Drawing conclusions is limited as real world traffic is not homogeneous.
-- Calibrating traffic generation parameters can be cumbersome.
-
-The tooling for synthetic benchmarks is available in [`benchmarks/synth-bm`](../../../benchmarks/synth-bm).
-
-## Workflows
 The tooling's [`justfile`](../../../benchmarks/synth-bm/justfile) contains recipes for the most relevant workflows.
 
-### Benchmark native token transfers
+### Create sub accounts
+
+Creating the state for synthetic benchmarks usually starts with creating accounts. We create sub accounts for the account specified by `--signer-key-path`. This avoids dealing with the registrar, which would be required for creating top level accounts. To view all options, run:
 
-A typical workflow benchmarking the native token transfers using the above `justfile` would be something along the:
-- set up the network
-<!-- cspell:words subaccounts -->
-```command
-rm -rf .near && just init_localnet
-# Modify the configuration (see the "Un-limit configuration" section)
-[t1]$ just run_localnet
-[t1]$ just create_subaccounts
-```
-- run the benchmark
 ```command
-# set the desired tx rate (`--interval-duration-micros`) and the total volume (`--num-transfers`) in the justfile
-[t2]$ just benchmark_native_transfers
+cargo run --release -- create-sub-accounts --help
 ```
 
-This benchmark generates a native token transfer workload involving the accounts provided in `--user-dada-dir`. Transactions are generated by iterating through these accounts and sending native tokens to a randomly chosen receiver from the same set of accounts. To view all options, run:
+### Benchmark native token transfers
+
+Generates a native token transfer workload involving the accounts provided in `--user-data-dir`. Transactions are generated by iterating through these accounts and sending native tokens to a randomly chosen receiver from the same set of accounts. To view all options, run:
 
 ```command
 cargo run --release -- benchmark-native-transfers --help
 ```
 
-For the native transfer benchmark transactions are sent with `wait_until: None`, meaning the responses the `near_synth_bm` tool receives are basically just an ACK by the RPC confirming it received the transaction.
-Thus the numbers reported by the tool as if in
-```
-[2025-01-27T14:05:12Z INFO  near_synth_bm::native_transfer] Sent 200000 txs in 6.50 seconds
-[2025-01-27T14:05:12Z INFO  near_synth_bm::rpc] Received 200000 tx responses in 6.49 seconds
+Automatic calculation of transactions per second (TPS) when RPC requests are sent with `wait_until: NONE` is coming up shortly. In the meantime, TPS can be calculated manually by querying the `near_transaction_processed_successfully_total` metric, e.g. with:
+
+```command
+http localhost:3030/metrics | grep transaction_processed
 ```
-are not directly indicative of the runtime performance and transaction outcomes.
-The number of transactions successfully processed may be obtained by querying the `near_transaction_processed_successfully_total` metric, e.g. with: `http://localhost:3030/metrics | grep transaction_processed`.
-Automatic calculation of transactions per second (TPS) when RPC requests are sent with `wait_until: NONE` is coming up shortly.
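
A minimal sketch of the manual TPS calculation, assuming the counter is read twice from the `/metrics` endpoint and the time between the two reads is known. The helper names `read_counter` and `tps` are hypothetical and not part of `near_synth_bm`. The parser only handles the plain `name value` and `name{labels} value` line shapes of the Prometheus text format, and whether the metric carries labels is an assumption.

```python
def read_counter(metrics_text: str, name: str) -> float:
    """Sum all samples of counter `name` found in a Prometheus text dump."""
    total = 0.0
    found = False
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and `# HELP` / `# TYPE` comments.
        if not line or line.startswith("#"):
            continue
        if not line.startswith(name):
            continue
        rest = line[len(name):]
        if rest.startswith("{"):
            # Drop the label block; samples of all label sets are summed.
            rest = rest[rest.rindex("}") + 1:]
        elif not rest[:1].isspace():
            # A longer metric name that merely shares the prefix.
            continue
        fields = rest.split()
        if fields:
            total += float(fields[0])
            found = True
    if not found:
        raise KeyError(name)
    return total


def tps(count_start: float, count_end: float, elapsed_seconds: float) -> float:
    """Average transactions per second between two counter readings."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (count_end - count_start) / elapsed_seconds
```

Taking the first reading just before starting the benchmark and the second once the counter stops increasing gives an average over the whole run. Readings taken a fixed interval apart during the run give the rate for that interval only.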
 
 ### Benchmark calls to the `sign` method of an MPC contract
 
@@ -64,24 +36,14 @@ All options of the command can be shown with:
 cargo run -- benchmark-mpc-sign --help
 ```
 
-## Auxiliary steps
-
-### Network setup and `neard` configuration
+## Network setup and `neard` configuration
 
 Details of bringing up and configuring a network are out of scope for this document. Instead we just give a brief overview of the setup regularly used to benchmark TPS of common workloads in a single-node with a single-shard setup.
 
 ### Build `neard`
 
-Choose the git commit and cargo features corresponding to what you want to benchmark. Most likely you will want a `--release` build to measure TPS.
-
-### Create sub accounts
-
-Creating the state for synthetic benchmarks usually starts with creating accounts. We create sub accounts for the account specified by `--signer-key-path`. This avoids dealing with the registrar, which would be required for creating top level accounts. To view all options, run:
-
-```command
-cargo run --release -- create-sub-accounts --help
-```
-
+Choose the git commit and cargo features corresponding to what you want to benchmark. Most likely you will want a `--release` build to measure TPS. Place the corresponding `neard` binary in the justfile's directory or adjust commands and justfile variables to point to your binary.
+
 ### Initialize the network
 
 ```command
@@ -97,51 +59,104 @@ The configuration generated by the above command does not enable memtrie. Howeve
 ```
 
 ### Un-limit configuration
+
 Following these steps so far creates a config that will throttle throughput due to various factors related to state witness size, gas/compute limits, and congestion control. In case you want to benchmark a node that fully utilizes its hardware, you can do the following modifications to effectively run with unlimited configuration:
 
 ```
 # Modifications in .near/genesis.json
 
 "chain_id": "benchmarknet"
-"gas_limit": 20000000000000000               # increase default by x20
+"gas_limit": 20000000000000000 # increase default by x20
 
 # Modifications in .near/config.json
-"view_client_threads": 8                     # increase default by x2
-"load_mem_tries_for_tracked_shards": true    # enable memtrie 
+"view_client_threads": 8 # increase default by x2
+"load_mem_tries_for_tracked_shards": true
 "produce_chunk_add_transactions_time_limit": {
   "secs": 0,
-  "nanos": 800000000                         # increase default by x4
+  "nanos": 800000000 # increase default by x4
 }
 ```
 
-Note that as `nearcore` evolves, these steps and `BENCHMARKNET` adjustments might need to be updated to achieve the effect of un-limiting configuration.
+<!-- cspell:words unlimiting -->
+Note that as `nearcore` evolves, these steps and `BENCHMARKNET` adjustments might need to be updated to achieve the effect of unlimiting configuration.
+||||||| c625d98df
+## Workflows
+
+The tooling's [`justfile`](../../../benchmarks/synth-bm/justfile) contains recipes for the most relevant workflows.
+
+### Create sub accounts
+
+Creating the state for synthetic benchmarks usually starts with creating accounts. We create sub accounts for the account specified by `--signer-key-path`. This avoids dealing with the registrar, which would be required for creating top level accounts. To view all options, run:
+
+```command
+cargo run --release -- create-sub-accounts --help
+```
+
+### Benchmark native token transfers
+
+Generates a native token transfer workload involving the accounts provided in `--user-data-dir`. Transactions are generated by iterating through these accounts and sending native tokens to a randomly chosen receiver from the same set of accounts. To view all options, run:
+
+```command
+cargo run --release -- benchmark-native-transfers --help
+```
+
+Automatic calculation of transactions per second (TPS) when RPC requests are sent with `wait_until: NONE` is coming up shortly. In the meantime, TPS can be calculated manually by querying the `near_transaction_processed_successfully_total` metric, e.g. with:
 
+```command
+http localhost:3030/metrics | grep transaction_processed
+```
+
+### Benchmark calls to the `sign` method of an MPC contract
 
-Modifications of `genesis.json` need to be applied before initializing the network with `just init_localnet`. Otherwise `just run_localnet` will fail. If you ran the node with default config and want to switch to unlimited config, the required steps are:
+Assumes the accounts that send the transactions invoking `sign` have been created as described above. Transactions can be sent to an RPC of a network on which an instance of the [`mpc/chain-signatures`](https://github.com/near/mpc/tree/79ec50759146221e7ad8bb04520f13333b75ca07/chain-signatures/contract) contract is deployed.
 
-```console
-# Remove .near as you will need to initialize localnet again.
-$ rm -rf .near
-$ just init_localnet
-# Modify the configuration
-$ just run_localnet
+Transactions are sent to the RPC with `wait_until: EXECUTED_OPTIMISTIC` as the throughput for `sign` is at a level at which neither the network nor the RPC is expected to be a bottleneck.
+
+All options of the command can be shown with:
+
+```command
+cargo run -- benchmark-mpc-sign --help
 ```
 
-## Common parameters
+## Network setup and `neard` configuration
 
-The following parameters are common to multiple tasks:
+Details of bringing up and configuring a network are out of scope for this document. Instead we just give a brief overview of the setup regularly used to benchmark TPS of common workloads in a single-node, single-shard setup.
 
-### `rpc-url`
+### Build `neard`
 
-The RPC endpoint to which transactions are sent.
+Choose the git commit and cargo features corresponding to what you want to benchmark. Most likely you will want a `--release` build to measure TPS.
 
-Synthetic benchmarking may create thousands of transactions per second, which can hit network limitations if the RPC is located on a separate machine. In particular sending transactions to nodes running on GCP requires care as it can cause temporary IP address bans. For that scenario it is recommended to run a separate traffic generation vm located in the same GCP zone as the RPC node and send transactions to its `internal IP`.
+### Initialize the network
 
-### `interval-duration-micros`
+```command
+./neard --home .near init --chain-id localnet
+```
+
+### Enable memtrie
+
+The configuration generated by the above command does not enable memtrie. However, most benchmarks should run against a node with memtrie enabled, which can be achieved by setting the following in `.near/config.json`:
+
+```
+"load_mem_tries_for_tracked_shards": true
+```
+
+### Un-limit configuration
 
-Controls the rate at which transactions are sent. Assuming your hardware is able to send a request at every interval tick, the number of transactions sent per second equals `1_000_000 / interval-duration-micros`. The rate might be slowed down if `channel-buffer-size` becomes a bottleneck.
+Following the steps so far creates a config that will throttle throughput due to various factors related to state witness size, gas/compute limits, and congestion control. In case you want to benchmark a node that fully utilizes its hardware, you can make the following modifications to effectively run with an unlimited configuration:
+
+```
+# Modifications in .near/genesis.json
 
-### `channel-buffer-size`
+"chain_id": "benchmarknet"
+"gas_limit": 20000000000000000 # increase default by x20
 
-Before an RPC request is sent, the tooling awaits capacity on a buffered channel. Thereby the number of outstanding RPC requests is limited by `channel-buffer-size`. This can slow down the rate at which transactions are sent in case the node is congested. To disable that behavior, set `channel-buffer-size` to a large value, e.g. the total number of transactions to be sent.
+# Modifications in .near/config.json
+"view_client_threads": 8 # increase default by x2
+"load_mem_tries_for_tracked_shards": true
+"produce_chunk_add_transactions_time_limit": {
+  "secs": 0,
+  "nanos": 800000000 # increase default by x4
+}
+```
 
+Note that as `nearcore` evolves, these steps and `BENCHMARKNET` adjustments might need to be updated to achieve the effect of unlimiting configuration.
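
The modifications listed above could be applied with a small script rather than by hand. The following is a sketch only: the helper names `apply_overrides` and `patch_file` are hypothetical, and it assumes all listed keys sit at the top level of `.near/genesis.json` and `.near/config.json`, which should be checked against the files your `neard` version generates.

```python
import json
from pathlib import Path

# Values copied from the un-limit configuration listed in the docs.
GENESIS_OVERRIDES = {
    "chain_id": "benchmarknet",
    "gas_limit": 20000000000000000,
}
CONFIG_OVERRIDES = {
    "view_client_threads": 8,
    "load_mem_tries_for_tracked_shards": True,
    "produce_chunk_add_transactions_time_limit": {"secs": 0, "nanos": 800000000},
}


def apply_overrides(document: dict, overrides: dict) -> dict:
    """Return a copy of `document` with top-level keys replaced by `overrides`."""
    patched = dict(document)
    patched.update(overrides)
    return patched


def patch_file(path: Path, overrides: dict) -> None:
    """Rewrite the JSON file at `path` with `overrides` applied (assumed top-level keys)."""
    document = json.loads(path.read_text())
    path.write_text(json.dumps(apply_overrides(document, overrides), indent=2) + "\n")
```

For a localnet home of `.near/`, this would be used as `patch_file(Path(".near/genesis.json"), GENESIS_OVERRIDES)` followed by `patch_file(Path(".near/config.json"), CONFIG_OVERRIDES)`. Keys not named in the overrides are left unchanged.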