Zero Bin

A composition of paladin and proof_gen. Given the proof generation protocol as input, it generates a proof. The project is instrumented with paladin and can therefore distribute proof generation across multiple worker machines.

Project layout

ops
├── Cargo.toml
└── src
   └── lib.rs
worker
├── Cargo.toml
└── src
   └── main.rs
leader
├── Cargo.toml
└── src
   └── main.rs
rpc
├── Cargo.toml
└── src
   └── main.rs
verifier
├── Cargo.toml
└── src
   └── main.rs

Ops

Defines the proof operations that can be distributed to workers.

Worker

The worker process. Receives proof operations from the leader, and returns the result.

Leader

The leader process. Receives proof generation requests, and distributes them to workers.

RPC

A binary to generate the block trace format expected by the leader.

Verifier

A binary to verify the correctness of the generated proof.

Leader Usage

The leader has various subcommands for different I/O modes. The leader binary arguments are as follows:

cargo r --release --bin leader -- --help

Usage: leader [OPTIONS] <COMMAND>

Commands:
  stdio    Reads input from stdin and writes output to stdout
  jerigon  Reads input from a Jerigon node and writes output to stdout
  native   Reads input from a native node and writes output to stdout
  http     Reads input from HTTP and writes output to a directory
  help     Print this message or the help of the given subcommand(s)

Options:
  -h, --help
          Print help (see a summary with '-h')
  --version
          Fetch the `evm_arithmetization` package version, build commit hash and build timestamp

Paladin options:
  -t, --task-bus-routing-key <TASK_BUS_ROUTING_KEY>
          Specifies the routing key for publishing task messages. In most cases, the default value should suffice

          [default: task]

  -s, --serializer <SERIALIZER>
          Determines the serialization format to be used

          [default: postcard]
          [possible values: postcard, cbor]

  -r, --runtime <RUNTIME>
          Specifies the runtime environment to use

          [default: amqp]
          [possible values: amqp, in-memory]

  -n, --num-workers <NUM_WORKERS>
          Specifies the number of worker threads to spawn (in memory runtime only)

      --amqp-uri <AMQP_URI>
          Provides the URI for the AMQP broker, if the AMQP runtime is selected

          [env: AMQP_URI=amqp://localhost:5672]

Table circuit sizes:
      --persistence <PERSISTENCE>
          [default: disk]

          Possible values:
          - none: Do not persist the processed circuits
          - disk: Persist the processed circuits to disk

      --arithmetic <CIRCUIT_BIT_RANGE>
          The min/max size for the arithmetic table circuit.

          [env: ARITHMETIC_CIRCUIT_SIZE=16..22]

      --byte-packing <CIRCUIT_BIT_RANGE>
          The min/max size for the byte packing table circuit.

          [env: BYTE_PACKING_CIRCUIT_SIZE=10..22]

      --cpu <CIRCUIT_BIT_RANGE>
          The min/max size for the cpu table circuit.

          [env: CPU_CIRCUIT_SIZE=15..22]

      --keccak <CIRCUIT_BIT_RANGE>
          The min/max size for the keccak table circuit.

          [env: KECCAK_CIRCUIT_SIZE=14..22]

      --keccak-sponge <CIRCUIT_BIT_RANGE>
          The min/max size for the keccak sponge table circuit.

          [env: KECCAK_SPONGE_CIRCUIT_SIZE=9..22]

      --logic <CIRCUIT_BIT_RANGE>
          The min/max size for the logic table circuit.

          [env: LOGIC_CIRCUIT_SIZE=12..22]

      --memory <CIRCUIT_BIT_RANGE>
          The min/max size for the memory table circuit.

          [env: MEMORY_CIRCUIT_SIZE=18..22]

Note that both paladin and plonky2 table circuit sizes are configurable via command line arguments and environment variables. The command line arguments take precedence over the environment variables.

TABLE CIRCUIT SIZES ARE ONLY RELEVANT FOR THE LEADER WHEN RUNNING IN in-memory MODE.

If you want to configure the table circuit sizes when running in a distributed environment, you must configure the table circuit sizes on the worker processes (the command line arguments are the same).
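
For illustration, a worker's table circuit sizes could then be set either through environment variables:

ARITHMETIC_CIRCUIT_SIZE=16..22 MEMORY_CIRCUIT_SIZE=18..22 cargo r --release --bin worker

or through the equivalent command line flags, which take precedence (a sketch, assuming the worker accepts the same flags documented above; the ranges shown are only illustrative):

cargo r --release --bin worker -- --arithmetic 16..22 --memory 18..22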

stdio

The stdio command reads proof input from stdin and writes output to stdout.

cargo r --release --bin leader stdio --help

Reads input from stdin and writes output to stdout

Usage: leader stdio [OPTIONS]

Options:
  -f, --previous-proof <PREVIOUS_PROOF>  The previous proof output
  -h, --help                             Print help

Pull prover input from the rpc binary.

cargo r --release --bin rpc fetch --rpc-url <RPC_URL> -b 6 > ./input/block_6.json

Pipe the block input to the leader binary.

cat ./input/block_6.json | cargo r --release --bin leader -- -r in-memory stdio > ./output/proof_6.json
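
To chain proofs across consecutive blocks, pass the previous proof with -f (a hypothetical follow-up, assuming block_7.json was fetched the same way and proof_6.json was produced by the run above):

cat ./input/block_7.json | cargo r --release --bin leader -- -r in-memory stdio -f ./output/proof_6.json > ./output/proof_7.json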

Jerigon

The Jerigon command reads proof input from a Jerigon node and writes output to stdout.

cargo r --release --bin leader jerigon --help

Reads input from a Jerigon node and writes output to stdout

Usage: leader jerigon [OPTIONS] --rpc-url <RPC_URL> --block-interval <BLOCK_INTERVAL>

Options:
  -u, --rpc-url <RPC_URL>

  -i, --block-interval <BLOCK_INTERVAL>
          The block interval for which to generate a proof
  -c, --checkpoint-block-number <CHECKPOINT_BLOCK_NUMBER>
          The checkpoint block number [default: 0]
  -f, --previous-proof <PREVIOUS_PROOF>
          The previous proof output
  -o, --proof-output-dir <PROOF_OUTPUT_DIR>
          If provided, write the generated proofs to this directory instead of stdout
  -s, --save-inputs-on-error
          If true, save the public inputs to disk on error
  -b, --block-time <BLOCK_TIME>
          Network block time in milliseconds. This value is used to determine the blockchain node polling interval [env: ZERO_BIN_BLOCK_TIME=] [default: 2000]
  -k, --keep-intermediate-proofs
          Keep intermediate proofs. Default action is to delete them after the final proof is generated [env: ZERO_BIN_KEEP_INTERMEDIATE_PROOFS=]
      --backoff <BACKOFF>
          Backoff in milliseconds for retry requests [default: 0]
      --max-retries <MAX_RETRIES>
          The maximum number of retries [default: 0]
  -h, --help
          Print help

Prove a block.

cargo r --release --bin leader -- -r in-memory jerigon -u <RPC_URL> -i 16 > ./output/proof_16.json
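
The generated proofs can also be written to a directory instead of stdout, chaining from a previous proof (a sketch; the checkpoint block number and file paths are only illustrative):

cargo r --release --bin leader -- -r in-memory jerigon -u <RPC_URL> -i 16 -c 15 -f ./output/proof_15.json -o ./output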

Native

The native command reads proof input from a native node and writes output to stdout.

cargo r --release --bin leader native --help

Reads input from a native node and writes output to stdout

Usage: leader native [OPTIONS] --rpc-url <RPC_URL> --block-interval <BLOCK_INTERVAL>

Options:
  -u, --rpc-url <RPC_URL>

  -i, --block-interval <BLOCK_INTERVAL>
          The block interval for which to generate a proof
  -c, --checkpoint-block-number <CHECKPOINT_BLOCK_NUMBER>
          The checkpoint block number [default: 0]
  -f, --previous-proof <PREVIOUS_PROOF>
          The previous proof output
  -o, --proof-output-dir <PROOF_OUTPUT_DIR>
          If provided, write the generated proofs to this directory instead of stdout
  -s, --save-inputs-on-error
          If true, save the public inputs to disk on error
  -b, --block-time <BLOCK_TIME>
          Network block time in milliseconds. This value is used to determine the blockchain node polling interval [env: ZERO_BIN_BLOCK_TIME=] [default: 2000]
  -k, --keep-intermediate-proofs
          Keep intermediate proofs. Default action is to delete them after the final proof is generated [env: ZERO_BIN_KEEP_INTERMEDIATE_PROOFS=]
      --backoff <BACKOFF>
          Backoff in milliseconds for retry requests [default: 0]
      --max-retries <MAX_RETRIES>
          The maximum number of retries [default: 0]
  -h, --help
          Print help

Prove a block.

cargo r --release --bin leader -- -r in-memory native -u <RPC_URL> -i 16 > ./output/proof_16.json
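
Transient RPC failures can be retried, and the public inputs saved to disk on error for debugging (a sketch; the backoff and retry values are only illustrative):

cargo r --release --bin leader -- -r in-memory native -u <RPC_URL> -i 16 -s --backoff 3000 --max-retries 3 > ./output/proof_16.json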

HTTP

The HTTP command reads proof input from HTTP and writes output to a directory.

cargo r --release --bin leader http --help

Reads input from HTTP and writes output to a directory

Usage: leader http [OPTIONS] --output-dir <OUTPUT_DIR>

Options:
  -p, --port <PORT>              The port on which to listen [default: 8080]
  -o, --output-dir <OUTPUT_DIR>  The directory to which output should be written
  -h, --help                     Print help

Pull prover input from the rpc binary.

cargo r --release --bin rpc fetch -u <RPC_URL> -b 6 > ./input/block_6.json

Start the server.

RUST_LOG=debug cargo r --release --bin leader http --output-dir ./output

Note that HTTP mode requires a slightly modified input format from the rest of the commands. In particular, the previous proof is expected to be part of the payload. This is because HTTP mode may handle multiple requests concurrently, so the previous proof cannot reasonably be passed as a command line argument as in the other modes.

Using jq we can merge the previous proof and the block input into a single JSON object.

jq -s '{prover_input: .[0], previous: .[1]}' ./input/block_6.json ./output/proof_5.json | curl -X POST -H "Content-Type: application/json" -d @- http://localhost:8080/prove

Paladin Runtime

Paladin supports both an AMQP and an in-memory runtime. The in-memory runtime emulates a cluster within a single process and is useful for testing. The AMQP runtime is geared toward production environments; it requires a running AMQP broker and one or more worker processes. The AMQP URI can be specified with the --amqp-uri flag or set via the AMQP_URI environment variable.

Starting an AMQP enabled cluster

Start rabbitmq

docker run --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:3-management
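
The broker can also be run in the background by adding Docker's -d flag (a minor variant of the command above):

docker run -d --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:3-management
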
Start worker(s)

Start worker process(es). The default paladin runtime is AMQP, so no additional flags are required to enable it.

RUST_LOG=debug cargo r --release --bin worker

Start leader

Start the leader process with the desired command. The default paladin runtime is AMQP, so no additional flags are required to enable it.

RUST_LOG=debug cargo r --release --bin leader jerigon -u <RPC_URL> -i 16 > ./output/proof_16.json
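
If the broker is not running on localhost, the URI can be supplied explicitly, either through the environment variable or the flag (a sketch; the broker host is a placeholder):

AMQP_URI=amqp://<BROKER_HOST>:5672 RUST_LOG=debug cargo r --release --bin worker

RUST_LOG=debug cargo r --release --bin leader -- --amqp-uri amqp://<BROKER_HOST>:5672 jerigon -u <RPC_URL> -i 16 > ./output/proof_16.json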

Starting an in-memory (single process) cluster

Paladin can emulate a cluster in memory within a single process, which is useful for testing.

cat ./input/block_6.json | cargo r --release --bin leader -- -r in-memory stdio > ./output/proof_6.json

Verifier Usage

A verifier binary is provided to verify the correctness of the generated proof. The verifier expects output in the format generated by the leader. The verifier binary arguments are as follows:

cargo r --bin verifier -- --help

Usage: verifier --file-path <FILE_PATH>

Options:
      --version                    Fetch the `evm_arithmetization` package version, build commit hash and build timestamp
  -f, --file-path <FILE_PATH>      The file containing the proof to verify
  -h, --help                       Print help

Example:

cargo r --release --bin verifier -- -f ./output/proof_16.json

RPC Usage

An rpc binary is provided to generate the block trace format expected by the leader.

cargo r --bin rpc -- --help

Usage: rpc <COMMAND>

Commands:
  fetch   Fetch and generate prover input from the RPC endpoint
  help    Print this message or the help of the given subcommand(s)

Options:
  -h, --help     Print help
      --version  Fetch the `evm_arithmetization` package version, build commit hash and build timestamp

Example:

cargo r --release --bin rpc fetch --start-block <START_BLOCK> --end-block <END_BLOCK> --rpc-url <RPC_URL> > ./output/block-16.json
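
A range of blocks can be fetched in the same way (a sketch; the block numbers and output path are only illustrative):

cargo r --release --bin rpc fetch --start-block 16 --end-block 18 --rpc-url <RPC_URL> > ./output/blocks-16-18.json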

Docker

Docker images are provided for both the leader and worker binaries.

Development Branches

There are three branches that are used for development:

  • main --> Always points to the latest production release
  • develop --> All PRs should be merged into this branch
  • testing --> For testing against the latest changes. Should always point to the develop branch for the zk_evm deps

Testing Blocks

For testing proof generation for blocks, the testing branch should be used.

Proving Blocks

If you want to generate a full block proof, you can use tools/prove_rpc.sh:

./prove_rpc.sh <BLOCK_START> <BLOCK_END> <FULL_NODE_ENDPOINT> <RPC_TYPE> <IGNORE_PREVIOUS_PROOFS>

Which may look like this:

./prove_rpc.sh 17 18 http://127.0.0.1:8545 jerigon false

Which will attempt to generate proofs for blocks 17 & 18 consecutively and incorporate the previous block proof during generation.

A few other notes:

  • Proving blocks is very resource intensive in terms of both CPU and memory. You can also only generate the witness for a block instead (see Generating Witnesses Only) to significantly reduce the CPU and memory requirements.
  • Because incorporating the previous block proof requires a chain of proofs back to the last checkpoint height, you can also disable this requirement by passing true for <IGNORE_PREVIOUS_PROOFS> (which internally just sets the current checkpoint height to the previous block height).
  • When proving multiple blocks concurrently, one may need to increase the system resource usage limit because of the number of RPC connections opened simultaneously, in particular when running a native tracer. For Linux systems, it is recommended to set ulimit to 8192, e.g. as shown below.
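
A minimal sketch, assuming the limit in question is the open file descriptor limit of the current shell:

ulimit -n 8192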

Generating Witnesses Only

If you want to test a block without the high CPU & memory requirements that come with creating a full proof, you can instead generate only the witness using tools/prove_rpc.sh in the test_only mode:

./prove_rpc.sh <START_BLOCK> <END_BLOCK> <FULL_NODE_ENDPOINT> <RPC_TYPE> <IGNORE_PREVIOUS_PROOFS> <BACKOFF> <RETRIES> test_only

Filled in:

./prove_rpc.sh 18299898 18299899 http://34.89.57.138:8545 jerigon true 0 0 test_only

Finally, note that these testing scripts force proof generation to be sequential by allowing only one worker. As a result, they do not give a realistic picture of performance, but they make the debugging logs much easier to follow.

Trace decoder tests

The trace decoder module has some basic regression tests that use the JSON witness data from the trace_decoder/tests/data/witnesses subdirectories. When needed (e.g. when a block exposing a corner case is discovered), additional input witness data should be generated using the following procedure:

  1. Run the rpc tool to fetch the block (or multiple blocks) witness:
cargo run --bin rpc fetch --rpc-url <node_rpc_endpoint> --start-block <start> --end-block <end> > ./b<number>_<network>.json
  2. Download the header file for the block (or range of blocks), producing a JSON array of headers:
file_name="b<number>_<network>_header.json"
echo "[" > $file_name && cast rpc eth_getBlockByNumber "0x<block_number>" 'false' --rpc-url <node_rpc_endpoint> >> $file_name && echo "]" >> $file_name

Move the generated files to the appropriate subdirectory, and they will be automatically included in the test run.
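
For example (the subdirectory name is purely illustrative and depends on the network the witness was generated for):

mv ./b<number>_<network>.json ./b<number>_<network>_header.json trace_decoder/tests/data/witnesses/<network_subdirectory>/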

License

Licensed under either of

  • Apache License, Version 2.0
  • MIT license

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.