bench(synth-bench): make binary selection more explicit #12817

Closed
wants to merge 8 commits into from
1 change: 1 addition & 0 deletions benchmarks/synth-bm/.gitignore
@@ -1,3 +1,4 @@
neard
.near/
target/
user-data/
3 changes: 2 additions & 1 deletion benchmarks/synth-bm/justfile
@@ -1,4 +1,5 @@
neard := "../../target/release/neard"
# Place the `neard` binary to be benchmarked here.
neard := "./neard"
near_localnet_home := ".near/"
rpc_url := "http://127.0.0.1:3030"

163 changes: 89 additions & 74 deletions docs/practices/workflows/benchmarking_synthetic_workloads.md
@@ -1,56 +1,28 @@
# Benchmarking synthetic workloads

Benchmarking a synthetic workload starts a new network with empty state. Then state is created and afterwards transactions involving that state are generated. For example, the native token transfer workload creates `n` accounts with NEAR balance and then generates transactions to transfer the native token between accounts.

This approach has the following benefits:

- Relatively simple and quick setup, as there is no state from real-world networks involved.
- Fine-grained control over traffic intensity.
- Comparison of `neard` performance at different points in time or with different features.
- Potential to expose performance bottlenecks.

The main drawbacks of synthetic benchmarks are:

- The conclusions that can be drawn are limited, as real-world traffic is not homogeneous.
- Calibrating traffic generation parameters can be cumbersome.

The tooling for synthetic benchmarks is available in [`benchmarks/synth-bm`](../../../benchmarks/synth-bm).

## Workflows

The tooling's [`justfile`](../../../benchmarks/synth-bm/justfile) contains recipes for the most relevant workflows.

### Create sub accounts

Creating the state for synthetic benchmarks usually starts with creating accounts. We create sub accounts for the account specified by `--signer-key-path`. This avoids dealing with the registrar, which would be required for creating top level accounts. To view all options, run:

```command
cargo run --release -- create-sub-accounts --help
```

### Benchmark native token transfers

A typical workflow for benchmarking native token transfers using the above `justfile` would be something along these lines:

- set up the network
<!-- cspell:words subaccounts -->
```command
rm -rf .near && just init_localnet
# Modify the configuration (see the "Un-limit configuration" section)
[t1]$ just run_localnet
[t1]$ just create_subaccounts
```
- run the benchmark
```command
# set the desired tx rate (`--interval-duration-micros`) and the total volume (`--num-transfers`) in the justfile
[t2]$ just benchmark_native_transfers
```

This benchmark generates a native token transfer workload involving the accounts provided in `--user-data-dir`. Transactions are generated by iterating through these accounts and sending native tokens to a randomly chosen receiver from the same set of accounts. To view all options, run:

```command
cargo run --release -- benchmark-native-transfers --help
```

For the native transfer benchmark, transactions are sent with `wait_until: NONE`, meaning the responses the `near_synth_bm` tool receives are basically just an ACK by the RPC confirming it received the transaction. Thus the numbers reported by the tool, as in

```
[2025-01-27T14:05:12Z INFO near_synth_bm::native_transfer] Sent 200000 txs in 6.50 seconds
[2025-01-27T14:05:12Z INFO near_synth_bm::rpc] Received 200000 tx responses in 6.49 seconds
```

are not directly indicative of the runtime performance and transaction outcomes.

Automatic calculation of transactions per second (TPS) when RPC requests are sent with `wait_until: NONE` is coming up shortly. In the meantime, TPS can be calculated manually by querying the `near_transaction_processed_successfully_total` metric, e.g. with:

```command
http localhost:3030/metrics | grep transaction_processed
```
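
The manual TPS calculation can be scripted by sampling the metric twice and dividing the difference by the elapsed time. The following is a minimal sketch, assuming the metric is exposed in the Prometheus text format as a single sample line without labels. The helper names `parse_counter` and `tps` are hypothetical and not part of `near_synth_bm`, and the sample values are made up for illustration.

```python
METRIC = "near_transaction_processed_successfully_total"

# Made-up excerpt of a /metrics response, for illustration only.
SAMPLE_METRICS = """\
# TYPE near_transaction_processed_successfully_total counter
near_transaction_processed_successfully_total 200000
"""


def parse_counter(metrics_text: str, name: str) -> float:
    """Return the value of the first sample line for metric `name`.

    Assumes one sample per line in the form `<name> <value>`.
    """
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.split()
        if len(parts) >= 2 and parts[0] == name:
            return float(parts[1])
    raise KeyError(name)


def tps(count_start: float, count_end: float, elapsed_seconds: float) -> float:
    """Transactions processed per second between two samples of the counter."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (count_end - count_start) / elapsed_seconds
```

To use it, capture the `/metrics` output before and after the benchmark, pass both texts through `parse_counter`, and hand the two values plus the elapsed wall-clock time to `tps`.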

### Benchmark calls to the `sign` method of an MPC contract

Assumes the accounts that send the transactions invoking `sign` have been created as described above. Transactions can be sent to an RPC of a network on which an instance of the [`mpc/chain-signatures`](https://github.com/near/mpc/tree/79ec50759146221e7ad8bb04520f13333b75ca07/chain-signatures/contract) contract is deployed.

Transactions are sent to the RPC with `wait_until: EXECUTED_OPTIMISTIC` as the throughput for `sign` is at a level at which neither the network nor the RPC is expected to be a bottleneck.

All options of the command can be shown with:

```command
cargo run -- benchmark-mpc-sign --help
```

## Network setup and `neard` configuration

Details of bringing up and configuring a network are out of scope for this document. Instead we just give a brief overview of the setup regularly used to benchmark TPS of common workloads in a single-node, single-shard setup.

### Build `neard`

Choose the git commit and cargo features corresponding to what you want to benchmark. Most likely you will want a `--release` build to measure TPS. Place the corresponding `neard` binary in the justfile's directory or adjust commands and justfile variables to point to your binary.

### Initialize the network

```command
./neard --home .near init --chain-id localnet
```

### Enable memtrie

The configuration generated by the above command does not enable memtrie. However, most benchmarks should run against a node with memtrie enabled, which can be achieved by setting the following in `.near/config.json`:

```
"load_mem_tries_for_tracked_shards": true
```

### Un-limit configuration

The steps so far create a config that throttles throughput due to various factors related to state witness size, gas/compute limits, and congestion control. If you want to benchmark a node that fully utilizes its hardware, make the following modifications to effectively run with an unlimited configuration:

```
# Modifications in .near/genesis.json

"chain_id": "benchmarknet"
"gas_limit": 20000000000000000               # increase default by x20

# Modifications in .near/config.json
"view_client_threads": 8                     # increase default by x2
"load_mem_tries_for_tracked_shards": true    # enable memtrie
"produce_chunk_add_transactions_time_limit": {
  "secs": 0,
  "nanos": 800000000                         # increase default by x4
}
```

<!-- cspell:words unlimiting -->
Note that as `nearcore` evolves, these steps and `BENCHMARKNET` adjustments might need to be updated to achieve the effect of unlimiting configuration.

Modifications of `genesis.json` need to be applied before initializing the network with `just init_localnet`. Otherwise `just run_localnet` will fail. If you ran the node with default config and want to switch to unlimited config, the required steps are:

```console
# Remove .near as you will need to initialize localnet again.
$ rm -rf .near
$ just init_localnet
# Modify the configuration
$ just run_localnet
```
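
The "Modify the configuration" step can be scripted so that it is repeatable after every `just init_localnet`. The following is a minimal sketch, assuming all overridden keys live at the top level of `.near/genesis.json` and `.near/config.json`. The helpers `apply_overrides` and `patch_file` are hypothetical and not part of the tooling; the values are the ones listed in the "Un-limit configuration" section.

```python
import json
from pathlib import Path

# Values taken from the "Un-limit configuration" section.
GENESIS_OVERRIDES = {
    "chain_id": "benchmarknet",
    "gas_limit": 20000000000000000,
}
CONFIG_OVERRIDES = {
    "view_client_threads": 8,
    "load_mem_tries_for_tracked_shards": True,
    "produce_chunk_add_transactions_time_limit": {"secs": 0, "nanos": 800000000},
}


def apply_overrides(document: dict, overrides: dict) -> dict:
    """Return a copy of `document` with top-level keys replaced by `overrides`.

    Assumption: every overridden key is a top-level key of the JSON file.
    """
    updated = dict(document)
    updated.update(overrides)
    return updated


def patch_file(path: Path, overrides: dict) -> None:
    """Rewrite the JSON file at `path` with `overrides` applied."""
    document = json.loads(path.read_text())
    patched = apply_overrides(document, overrides)
    path.write_text(json.dumps(patched, indent=2) + "\n")
```

After initializing the localnet, calling `patch_file(Path(".near/genesis.json"), GENESIS_OVERRIDES)` and `patch_file(Path(".near/config.json"), CONFIG_OVERRIDES)` would apply the modifications before the node is started.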

## Common parameters

The following parameters are common to multiple tasks:

### `rpc-url`

The RPC endpoint to which transactions are sent.

Synthetic benchmarking may create thousands of transactions per second, which can hit network limitations if the RPC is located on a separate machine. In particular, sending transactions to nodes running on GCP requires care as it can cause temporary IP address bans. For that scenario it is recommended to run a separate traffic generation VM located in the same GCP zone as the RPC node and send transactions to its internal IP.

### `interval-duration-micros`

Controls the rate at which transactions are sent. Assuming your hardware is able to send a request at every interval tick, the number of transactions sent per second equals `1_000_000 / interval-duration-micros`. The rate might be slowed down if `channel-buffer-size` becomes a bottleneck.
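
The relation between the interval and the send rate can be worked out as follows. This is a small sketch of the formula stated above; the function names are hypothetical, and it assumes the sender keeps up with every interval tick.

```python
def transactions_per_second(interval_duration_micros: int) -> float:
    """Send rate implied by `interval-duration-micros`."""
    if interval_duration_micros <= 0:
        raise ValueError("interval_duration_micros must be positive")
    return 1_000_000 / interval_duration_micros


def interval_for_target_tps(target_tps: float) -> int:
    """Interval (in microseconds, rounded) needed to hit a target send rate."""
    if target_tps <= 0:
        raise ValueError("target_tps must be positive")
    return round(1_000_000 / target_tps)
```

For example, an interval of 250 microseconds corresponds to 4000 transactions sent per second.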

### `channel-buffer-size`

Before an RPC request is sent, the tooling awaits capacity on a buffered channel. Thus the number of outstanding RPC requests is limited by `channel-buffer-size`. This can slow down the rate at which transactions are sent in case the node is congested. To disable that behavior, set `channel-buffer-size` to a large value, e.g. the total number of transactions to be sent.
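
The back-pressure effect can be illustrated with a small simulation. This is an analogy in Python, not the tool's implementation: a semaphore stands in for the buffered channel, `asyncio.sleep` stands in for an RPC round trip, and all names (`send_all`, `fake_rpc_request`) are hypothetical.

```python
import asyncio


async def send_all(num_requests: int, channel_buffer_size: int, request_latency: float) -> int:
    """Send `num_requests` simulated requests and return the peak number in flight."""
    slots = asyncio.Semaphore(channel_buffer_size)
    in_flight = 0
    max_in_flight = 0

    async def fake_rpc_request() -> None:
        nonlocal in_flight
        try:
            await asyncio.sleep(request_latency)  # simulated RPC round trip
        finally:
            in_flight -= 1
            slots.release()  # free capacity for the next request

    tasks = []
    for _ in range(num_requests):
        await slots.acquire()  # wait for capacity before sending
        in_flight += 1
        max_in_flight = max(max_in_flight, in_flight)
        tasks.append(asyncio.create_task(fake_rpc_request()))
    await asyncio.gather(*tasks)
    return max_in_flight
```

With a small buffer the sender stalls whenever all slots are taken, so the number of outstanding requests never exceeds the buffer size. With a buffer at least as large as the total number of requests, the sender never waits.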