Skip to content
This repository has been archived by the owner on Jul 5, 2024. It is now read-only.

fix(zkevm-circuits/begin_tx): add missing constraints #1776

Merged
merged 14 commits into from
Mar 12, 2024

Conversation

curryrasul
Copy link
Member

Description

This PR aims to fix #1475 by adding missing constraints.

Issue Link

#1475

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Questions / Need Help

  1. Meaning of value_prev argument in account_access_list_write_unchecked. Why is it sometimes set to 0, and sometimes to bool values such as is_coinbase_warm or is_caller_callee_equal. What is is_caller_callee_equal for?
  2. There's expression: https://github.com/privacy-scaling-explorations/zkevm-circuits/blob/main/zkevm-circuits/src/evm_circuit/execution/begin_tx.rs#L163
    Why do we have summation - if is_empty_code_hash.expr() is enough (as well as callee_not_exists is enough)?
  3. What's caller_nonce_hash_bytes? Is it Keccak(sender, nonce) and we have to constrain this exact value?

@curryrasul curryrasul marked this pull request as draft February 21, 2024 18:52
@github-actions github-actions bot added the crate-zkevm-circuits Issues related to the zkevm-circuits workspace member label Feb 21, 2024
@curryrasul curryrasul self-assigned this Feb 21, 2024
Copy link
Collaborator

@ChihChengLiang ChihChengLiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick comment for the first batch of questions. I'll comment more next.

zkevm-circuits/src/evm_circuit/execution/begin_tx.rs Outdated Show resolved Hide resolved
zkevm-circuits/src/evm_circuit/execution/begin_tx.rs Outdated Show resolved Hide resolved
zkevm-circuits/src/evm_circuit/execution/begin_tx.rs Outdated Show resolved Hide resolved
zkevm-circuits/src/evm_circuit/execution/begin_tx.rs Outdated Show resolved Hide resolved
cb.keccak_table_lookup(
create.input_rlc(cb),
create.input_length(),
caller_nonce_hash_bytes.to_word(),
);

cb.keccak_table_lookup(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this how we should add a constraint for cb.curr.state.code_hash = keccak(tx.calldata)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I don't think this is correct.
The length should be something like this (via lookup to TxTable):

let length = cb.tx_context(tx_id, TxContextFieldTag::CallDataLength, None);

The input should be the rlc of the TxCalldata, which is dynamic length. Take a look at the SHA3 opcode to see how it calculates the keccak of memory:

cb.copy_table_lookup(
WordLoHi::from_lo_unchecked(cb.curr.state.call_id.expr()),
CopyDataType::Memory.expr(),
WordLoHi::from_lo_unchecked(cb.curr.state.call_id.expr()),
CopyDataType::RlcAcc.expr(),
memory_address.offset(),
memory_address.address(),
0.expr(), // dst_addr for CopyDataType::RlcAcc is 0.
memory_address.length(),
rlc_acc.expr(),
copy_rwc_inc.expr(),
);

An then CALLDATACOPY to see how it uses the copy circuit with TxCalldata as input:

And from understanding those you should make a copy event that has TxCalldata as input and RLC as ouput. Something like this:

            let copy_rwc_inc = cb.query_cell();
            let rlc_acc = cb.query_cell_phase2();
            cb.copy_table_lookup(
                WordLoHi::from_lo_unchecked(tx_id.expr()),
                CopyDataType::TxCalldata.expr(),
                WordLoHi::zero(),
                CopyDataType::RlcAcc.expr(),
                zero,
                length, // The TxCalldata length calculated previously
                0.expr(), // dst_addr for CopyDataType::RlcAcc is 0.
                length, // The TxCalldata length calculated previously
                rlc_acc.expr(),
                copy_rwc_inc.expr(),
            );

Now you can use rlc_acc as input to keccak_table_lookup

cb.keccak_table_lookup(
create.input_rlc(cb),
create.input_length(),
caller_nonce_hash_bytes.to_word(),
);

cb.keccak_table_lookup(
create.code_hash_keccak_rlc(cb),
32.expr(),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that this length is correct. What it should be equals to?

32.expr(),
cb.curr.state.code_hash.to_word(),
);
cb.require_equal_word(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we also need to add this constraint? Cause after creating ContractCreateGadget we checked nonce equivalence, and some other things but not this for example.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is not needed. The code_hash in the create gadget seems to be only for the CREATE2 opcode. The code_hash for creation via BeginTx follows a different pattern (it's hash of tx.calldata).

So here we're using a create.code_hash witness from the ContractCreateGadget to compare it against another code_hash. But the create.code_hash doesn't appear in any constraint inside ContractCreateGadget because we're never calling create.input_rlc with IS_CREATE2 = true. So the create.code_hash will have whichever value the prover assigns, so this equality check is meaningless.

@@ -202,21 +210,23 @@ impl<F: Field> ExecutionGadget<F> for BeginTxGadget<F> {
)
.to_word(),
);
});
Copy link
Member Author

@curryrasul curryrasul Mar 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this; before there were two separate cb.condition with the same condition (that tx.is_create). I think we can merge them.
And actually in the #1475 it is stated that we need to check if callee address is equal to keccak(rlp(sender, nonce)) - but we actually have this constraint already, right? - call callee address equivalence constraint and keccak table lookup for caller_nonce_hash_bytes seems to be enough. Am I missing something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is needed. Notice that here we're checking against call_callee_address:

cb.require_equal_word(
"call callee address equivalence",
call_callee_address.to_word(),
AccountAddress::<F>::new(
caller_nonce_hash_bytes.limbs[0..N_BYTES_ACCOUNT_ADDRESS]
.to_vec()
.try_into()
.unwrap(),
)
.to_word(),
);

And caller_nonce_hash_bytes is constrained via:

cb.keccak_table_lookup(
create.input_rlc(cb),
create.input_length(),
caller_nonce_hash_bytes.to_word(),
);

which calls:
pub(crate) fn input_rlc(&self, cb: &EVMConstraintBuilder<F>) -> Expression<F> {

Now notice how input_rlc uses:

So we must check that they are correct.

This checks the nonce (your deleted code):

        cb.require_equal(
            "tx nonce equivalence",
            tx.nonce.expr(),
            create.caller_nonce(),
        );

And this checks the address:

cb.require_equal_word(
"tx caller address equivalence",
tx.caller_address.to_word(),
create.caller_address(),
);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I just saw that the removed nonce check was moved to

cb.require_equal(
"tx nonce equivalence",
tx.nonce.expr(),
create.caller_nonce(),
);

So I'm a bit confused.

Are you talking about the removal of the nonce check? Or something else?

I think it would be cleaner if all the create fields equality checks are inside the condition:

// 1. Handle contract creation transaction.
cb.condition(tx.is_create.expr(), |cb| {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were two cb.condition(tx.is_create.expr()):

cb.condition(tx.is_create.expr(), |cb| {

and the other one:
cb.condition(tx.is_create.expr(), |cb| {

I think it would be cleaner if all the create fields equality checks are inside the condition:
I think that's what I wanted to do as well, that's why I merged them.


But the second thing I asked is do we still need the constraint from here: #1475 (comment)

The callee address of the new context should be the keccak(RLP([sender_address, sender_nonce]))

I think we already have that, right?
cc. @ed255

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were two cb.condition(tx.is_create.expr()):

cb.condition(tx.is_create.expr(), |cb| {

and the other one:

cb.condition(tx.is_create.expr(), |cb| {

I think it would be cleaner if all the create fields equality checks are inside the condition:
I think that's what I wanted to do as well, that's why I merged them.

Ah, thanks for the clarification! On the constraint level it's the same, but I think in the code level it looks cleaner.

But the second thing I asked is do we still need the constraint from here: #1475 (comment)

The callee address of the new context should be the keccak(RLP([sender_address, sender_nonce]))

I think we already have that, right? cc. @ed255

Ah yes, I see now that this point was already done. I think I just missed this
https://github.com/privacy-scaling-explorations/zkevm-circuits/blob/3bbc757a20d9d9f7759db47c096d5395361bee74/zkevm-circuits/src/evm_circuit/execution/begin_tx.rs#L193C1-L205C12
Due to it being apart from all the logic of the is_create case. So yeah, all is good regarding this point :)

@curryrasul curryrasul marked this pull request as ready for review March 5, 2024 12:20
0.expr(),
length.expr(),
rlc_acc.expr(),
0.expr(),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why the check is failed, but can it be because I made rwc_inc = 0, as we discussed during the call @ed255 ?
In the previous commit it was equal to cb.query_cell().expr()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the errors and noticed there were quite a few things missing! I think it's not trivial to be aware of the missing things without being familiar with the entire system :)

Take a look at this commit where I added the missing things in order to make the tests pass 2291146

Here's a list of changes:

  • The cells calldata_length and calldata_rlc in BeginTx where not being assigned, so I added them to the gadget and assigned them in BeginTx::assign_exec_step
  • The keccak table didn't have the input for tx.call_data when the tx is creation, so I added it via the bus-mapping
  • The copy table didn't have the input for copying the tx.call_data into rlc when the tx is creation, so I added it via the bus-mapping
  • The copy table only contains an entry in case there's at least 1 byte copied. If tx.call_data.len() == 0 there will be no entry in the copy table, so the lookup will fail. I added a conditional on the copy table lookup for the case tx.call_data.len() > 0, and otherwise I directly set call_data_rlc = 0

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I ask - is it necessary add calldata_length, calldata_rlc, etc. to the gadget, and therefore assign them. Why wouldn't it work the way I made it before? Or the cells "were not there" before?

For example what's the fundamental difference between this:

let length = cb.tx_context(tx_id, TxContextFieldTag::CallDataLength, None);

and having a separate calldata_length field/cell in the gadget, and doing the lookup:

cb.tx_context_lookup(
tx_id.expr(),
TxContextFieldTag::CallDataLength,
None,
WordLoHi::from_lo_unchecked(calldata_length.expr()),
);


The copy table only contains an entry in case there's at least 1 byte copied. If tx.call_data.len() == 0 there will be no entry in the copy table, so the lookup will fail. I added a conditional on the copy table lookup for the case tx.call_data.len() > 0, and otherwise I directly set call_data_rlc = 0

Yes, I was going through Scroll's begin_tx and they have similar thing; I forgot to add the same check here, sorry.

Thanks for help, I'll merge branches and push it here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I ask - is it necessary add calldata_length, calldata_rlc, etc. to the gadget, and therefore assign them. Why wouldn't it work the way I made it before? Or the cells "were not there" before?

If you don't add these cells to the gadget, they will not be part of the config, and if they are not part of the config, you don't have the reference to do the witness assignment, and then they will be assigned to 0, so the constraints will fail. It was working before because these two cells didn't exists before.

For example what's the fundamental difference between this:

let length = cb.tx_context(tx_id, TxContextFieldTag::CallDataLength, None);

and having a separate calldata_length field/cell in the gadget, and doing the lookup:

cb.tx_context_lookup(
tx_id.expr(),
TxContextFieldTag::CallDataLength,
None,
WordLoHi::from_lo_unchecked(calldata_length.expr()),
);

The first option is always nicer, but it does two things:

  1. Allocate the cell
  2. Perform the lookup with that cell

In this case, we need to allocate the cell unconditionally because we want to store it in the config to later on use it for witness assignment, but the lookup must be conditional!

Another option would be doing something like this:

let length = cb.condition(tx.is_create.expr(), |cb| {
    // [...]
    let length = cb.tx_context(tx_id, TxContextFieldTag::CallDataLength, None);
    // [...]
    length
};

That would be equivalent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for explanation

@github-actions github-actions bot added the crate-bus-mapping Issues related to the bus-mapping workspace member label Mar 6, 2024
Copy link
Member

@ed255 ed255 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A part from the dbg! line, all looks good!

},
);
}
dbg!((state.tx.id, state.tx.call_data.len(), &state.tx.call_data));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This slipped through! Can you remove it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, missed that. Done!

Copy link
Collaborator

@ChihChengLiang ChihChengLiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍🏽

@curryrasul curryrasul added this pull request to the merge queue Mar 12, 2024
Merged via the queue into main with commit 4996eb6 Mar 12, 2024
15 checks passed
@curryrasul curryrasul deleted the fix/begintx-missingconstraint branch March 12, 2024 08:30
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
crate-bus-mapping Issues related to the bus-mapping workspace member crate-zkevm-circuits Issues related to the zkevm-circuits workspace member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BeginTx missing constraints
3 participants