Ensuring the correctness of a blockchain system like Mazze is a challenging task. This repository comes with several rigorous testing tools and scripts. It is a submodule of Mazze-Rust at `tests/extra-test-toolkits`.
Note that in some terminals, the default maximum number of open file descriptors may not be enough. This is especially true if you are using a Mac with its default zsh terminal. You will need to raise the limit to a larger number as follows:
$ ulimit -n 22288
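If you want to check the current limit programmatically, here is a small Python sketch using the standard `resource` module (Unix-only); the `22288` threshold simply mirrors the `ulimit` value above.

```python
# check_fd_limit.py -- a small sketch to inspect the open-file-descriptor limit
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open file descriptor limit: soft={soft}, hard={hard}")
if soft < 22288:
    print("limit is low; consider running `ulimit -n 22288` in this shell")
```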
Inside the directory `consensus_fuzzer`, there is a random fuzzing tool for the consensus component. It works as follows. `consensus_fuzzer/gen-random-graph.cpp` is a slow C++ implementation of the Mazze DAG-Embedded Tree Structure (DETS) consensus algorithm, together with a random graph generator that generates random DETS blocks in a special format. `consensus_bench` in the Mazze-Rust repository is capable of processing this input format, running the Mazze consensus, and comparing the results with the slow C++ implementation. `consensus_fuzzer/iter-gen-random.py` is a Python script that iteratively invokes the generation-processing-comparison process. To run this fuzzing tool:
$ cd tests/extra-test-toolkits/consensus_fuzzer
$ g++ -O2 -o gen-random-graph gen-random-graph.cpp
$ ./iter-gen-random.py 10000 3 30 10 10 100
The Python script will not stop until it finds an error or you manually terminate it. If the script finds an error, the `rand.in` file will contain the bug-triggering input for the `consensus_bench` program. The six parameters passed to the Python script correspond to the number of randomly generated blocks per test case, the `TIMER_CHAIN_BLOCK_DIFFICULTY_RATIO` parameter, the `TIMER_CHAIN_BETA` parameter, the `ADAPTIVE_WEIGHT_BETA` parameter, the `HEAVY_BLOCK_DIFFICULTY_RATIO` parameter, and the `ERA_EPOCH_COUNT` parameter, respectively. You can pass any legitimate consensus parameter to the Python script. These defaults are numbers that we have empirically found useful for detecting bugs.
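For intuition, here is a minimal sketch of the generate-process-compare loop that `iter-gen-random.py` automates. The exact way `gen-random-graph` emits `rand.in` and the exact `consensus_bench` invocation below are illustrative assumptions, not the real script's code.

```python
# fuzz_loop_sketch.py -- illustrative only; see iter-gen-random.py for the real loop
import subprocess

PARAMS = ["10000", "3", "30", "10", "10", "100"]  # the default parameters above

while True:
    # Generate a random DETS instance with the slow C++ reference implementation
    # (assumed here to write the special input format to stdout).
    with open("rand.in", "w") as f:
        subprocess.run(["./gen-random-graph", *PARAMS], stdout=f, check=True)
    # Replay the same input through the Rust consensus and compare results;
    # a non-zero exit status is treated as a mismatch (an assumption).
    result = subprocess.run(["consensus_bench", "rand.in"])
    if result.returncode != 0:
        print("bug-triggering input kept in rand.in")
        break
```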
The Python script will also print out the processing speed of the consensus graph in the test. The expected speed is ~1000 blocks per second on a MacBook Pro 2019 laptop and ~350 blocks per second on m5a.xlarge. If the reported speed is significantly lower than expected, it typically indicates a potential performance issue. For every release, we execute this fuzzing for at least one hour using the default parameters.
Note that if you terminate this script brutally (which you will likely do), it leaves two to three temporary directories with the `__` prefix, as well as `sqlite_db`. You should remove these directories manually.
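A small sketch for that manual cleanup, assuming the leftovers live in the current working directory:

```python
# clean_fuzzer_leftovers.py -- removes the "__"-prefixed directories and sqlite_db
import shutil
from pathlib import Path

for d in Path(".").iterdir():
    if d.is_dir() and (d.name.startswith("__") or d.name == "sqlite_db"):
        print("removing", d)
        shutil.rmtree(d)
```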
`tests/extra-test-toolkits/mazze_tracing.py` is a random testing script with failure injection capability. It starts a Mazze network with a fixed number of nodes and injects node crashes, database crashes, and node restarts during the run. While the network is running, it keeps fetching states from different nodes and verifies that these nodes agree on the DAG-Embedded Tree Structure (DETS) and block state. To run Mazze tracing, you need to first compile the release version of the Mazze Rust implementation from the source code. Then you can invoke the script as follows:
$ tests/extra-test-toolkits/mazze_tracing.py run
The python script will then start 10 different instances together with a mock instance. It will run non-stop until it finds an error (inconsistent state or unexpected crash). For every release, we execute this tracing script for at least one hour.
In case of errors, it will generate trace files `snapshot*.json` and `txs*.json` to help diagnose the issue. Note that if you terminate this script brutally (which you will likely do), it also generates these files, so you may want to clean them up manually.
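To illustrate the kind of cross-node check `mazze_tracing.py` performs, here is a minimal sketch; `get_best_block_hash()` is a hypothetical helper standing in for the node RPC query and is not part of this repository.

```python
# consistency_check_sketch.py -- illustrative only
import time

def get_best_block_hash(rpc_url: str) -> str:
    # Hypothetical helper: query the node's RPC for its current best block hash.
    raise NotImplementedError

def check_consensus(rpc_urls):
    while True:
        # Fetch the state from every node and fail loudly on any disagreement.
        hashes = {url: get_best_block_hash(url) for url in rpc_urls}
        if len(set(hashes.values())) > 1:
            raise AssertionError(f"nodes disagree on the DETS tip: {hashes}")
        time.sleep(5)
```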
`tests/extra-test-toolkits/scripts/one_click.sh`, together with the remaining bash scripts in the same directory, provides an automatic deployment of a Mazze network on AWS for testing the simple payment TPS and transaction pool performance. You can run this test as follows:
- First, download and install the AWS CLI tools and properly configure the AWS credentials for the CLI.
- Register your default public key as a named key pair in the us-west-2 region (a quick way to verify this is sketched after this section).
- Decide the branch of the Mazze repo you want to test. Note that this script pulls the source code from a GitHub repo that contains the Mazze Rust implementation and compiles it on the fly. You cannot run your local Mazze copy with this script. If you do not specify the repo/branch name, it will pull from the official Mazze-rust repo on GitHub.
- Run the following command:
$ cd tests/extra-test-toolkits/scripts
$ ./one_click.sh key-pair-name 20 branch-name [repo-name]
This will start 20 instances in the us-west-2 region together with a random transaction generator. It will take roughly 15 minutes to set up the experiment and then 20 minutes to finish the run. In the end, it will report the TPS performance. The expected good TPS number is ~4000 TPS. If you get a TPS number much lower than expected, there is a performance regression in the transaction pool or the storage layer. For every release, we run this script to test its performance.
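If you want to double-check the key pair from the second step before launching, a quick sketch using boto3 (assuming it is installed and your AWS credentials are configured) could look like this:

```python
# check_keypair.py -- lists the named key pairs registered in us-west-2
import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")
names = [kp["KeyName"] for kp in ec2.describe_key_pairs()["KeyPairs"]]
print("key pairs registered in us-west-2:", names)
```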
The storage layer in Mazze is often the performance bottleneck. `core/benchmark/storage` therefore contains a benchmark tool to measure the performance of the storage layer, eliminating other layers from the execution. We also converted Ethereum network history payment transactions (first ~4M blocks) into benchmark traces. Here are the steps to run the storage benchmark test:
- From the AWS S3 `mazze-storage-bench` bucket, download `foundation.json` and `eth_from_0_to_4141811_txs.rlp.tar.gz`.
- Untar the rlp history file to obtain `eth_from_0_to_4141811_txs.rlp`.
- Go to `core/benchmark/storage` and run `cargo build --release` to compile the binary `storage_bench`.
- Create a temporary directory `tmp_storage_db` for holding the blockchain database generated in the experiment.
- Invoke the following command:
$ cd core/benchmark/storage
$ RUST_BACKTRACE=full target/release/storage_bench run -g /path/to/foundation.json -t /path/to/eth_from_0_to_4141811_txs.rlp -d /path/to/tmp_storage_db --txs_to_process 30000000 --skip 1156773812
This command will process the first 30 million transactions from the parsed history file and then quit. It is a good idea to time this command in order to compute the achieved processing throughput of the storage layer. The performance will largely depend on the quality of the underlying disk I/O. On a MacBook Pro 2019, the throughput is 25000-30000 TPS. On m5a.xlarge, the throughput is 15000-20000 TPS. If the performance is lower than expected, it indicates a potential regression in the storage layer. For every release, we run this test to check the storage layer performance.
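A minimal sketch for timing the run and computing the achieved throughput; the paths mirror the command above and should be adjusted for your setup.

```python
# time_storage_bench.py -- times storage_bench and reports the achieved TPS
import os
import subprocess
import time

TXS_TO_PROCESS = 30_000_000
cmd = [
    "target/release/storage_bench", "run",
    "-g", "/path/to/foundation.json",
    "-t", "/path/to/eth_from_0_to_4141811_txs.rlp",
    "-d", "/path/to/tmp_storage_db",
    "--txs_to_process", str(TXS_TO_PROCESS),
    "--skip", "1156773812",
]
start = time.time()
subprocess.run(cmd, check=True, env={**os.environ, "RUST_BACKTRACE": "full"})
elapsed = time.time() - start
print(f"throughput: {TXS_TO_PROCESS / elapsed:.0f} TPS")  # compare against 15000-30000 TPS
```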
The consensus implementation is typically fast and can process close to one thousand blocks per second in normal scenarios. However, if the DAG-Embedded Tree Structure (DETS) is unstable and contains a lot of forks, the consensus component may fall back to slow routines. Its performance under such unstable scenarios is critical because it corresponds to the catch-up speed during DoS attacks. `tests/extra-test-toolkits/attack_bench` contains a list of Python scripts to benchmark the consensus performance under attack scenarios:
- `fork_same_height_merge.py` creates an unstable DAG-Embedded Tree Structure (DETS) with roughly 95000 blocks. The DETS has three branches, and in each branch star-shaped forks are attached at a fixed height. It corresponds to one worst-case scenario for the consensus processing engine. The expected speed is ~70 blocks/s on MacBook Pro 2019 and ~45 blocks/s on m5a.xlarge.
- `fork_same_height_hiding.py` tests the scenario where an attacker tries to actively mine at a fixed height, hides the mined blocks, and releases them together. It measures the block generation capability of the victim in this scenario. The expected generation speed is always faster than 1000 blocks in less than 1 minute.
- `fork_same_height_attack.py` tests a similar attack as 2, but the attacker does not hide the blocks. The expected generation speed is always faster than 100 blocks in less than 10 seconds.
- `fork_chain_hiding.py` tests the scenario where an attacker tries to actively mine a separate chain, hides the mined blocks, and releases them together. The expected generation speed of the victim is always faster than 100 blocks in less than 10 seconds.
- `fork_chain_attack.py` tests a similar attack as 4, but the attacker does not hide the blocks. The expected generation speed of the victim is always faster than 100 blocks in less than 10 seconds.
Note that 2, 3, and 5 are long-running test scripts, and you can terminate the execution after the speed stabilizes. For every release, we run these scripts to make sure that there is no performance regression.
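For intuition about the first scenario, here is an illustrative sketch (not code from the repo) of the "star-shaped forks attached at a fixed height" topology that `fork_same_height_merge.py` stresses: many sibling blocks all referencing the same parent on an otherwise linear chain.

```python
# star_fork_sketch.py -- illustrative topology only
def build_star_forks(chain_len: int, fork_height: int, num_forks: int):
    """Return (block_id, parent_id) pairs; block 0 is the genesis."""
    blocks = [(i, i - 1) for i in range(1, chain_len + 1)]  # the main chain
    next_id = chain_len + 1
    for _ in range(num_forks):
        blocks.append((next_id, fork_height))  # every fork attaches to the same block
        next_id += 1
    return blocks

print(len(build_star_forks(1000, 500, 200)))  # 1200 blocks in total
```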