Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add Linea variant #22

Merged
merged 2 commits into from
Aug 30, 2024
Merged

feat: add Linea variant #22

merged 2 commits into from
Aug 30, 2024

Conversation

gbotrel
Copy link
Contributor

@gbotrel gbotrel commented Aug 21, 2024

Continuing #17

@xJonathanLEI
Copy link
Contributor

Tested and it does seem like it's always hitting state root mismatch. I believe this has to do with the EVM difference between Linea and Ethereum? I'm not familiar with Linea but the state root computation code in rsp is very much correct, as I've tested more than 50,000 latest mainnet blocks and the roots are correct.

@gbotrel
Copy link
Contributor Author

gbotrel commented Aug 22, 2024

Yes I'm hitting the same issue..
State root computation should be identical between Linea (on the L2, not what we prove which uses a separate concurrent state) and Ethereum; (you can actually sync a vanilla geth node).

Main difference I see is Linea currently uses Clique for consensus, and it might have some subtle effect somewhere...
Maybe a config issue (i.e. chainspec) but I don't see how the execution passes. Maybe the initial state or the storage proof should also query the block (clique) signer ..

I'm a bit out of ideas on what to investigate..

@xJonathanLEI
Copy link
Contributor

Yeah I believe that the state root computation logic is the same between the two. If Linea were indeed using a different trie structure, you're most likely going to fail when verifying the storage proofs, which didn't happen and you were able to run the program till the end.

I also think that execution is not too shabby. Apart from state root, rsp also verifies receipt roots, and based on your program execution it doesn't seem to have a problem with that. However, receipts only contain logs and gas usage, so they don't represent the whole state transition.

Given that we agree root computation itself is working as intended, the issue must come from the fact that the resulting state after local block execution differs from what geth thinks it is. Therefore, for starter, it'd be extremely helpful to figure out where it differs.

It's easier said than done though. A comprehensive and fool-proof approach would be to compare the 2 tries side by side to eventually find the leaf that differs. However, this requires good understanding of MPT, and also tracing the alloy-trie crate for recovering trie node preimages.

A more superficial approach would be directly compare the state. For each account modified by local execution, you can compare its new nonce, balance, code hash, and storage root with the ones reported by geth (which can be obtained from the eth_getProof method). Apparently this approach is not fool-proof: if our local block execution misses an account, you wouldn't be able to find out this way. That said, I think it's more likely than not to reveal the issue.

My guess would be that some account ends up having a different storage root than what geth thinks it should. In that case, we just have to further check which specific slots are different, and more importantly, why they're different.

@gbotrel
Copy link
Contributor Author

gbotrel commented Aug 22, 2024

So, I found a suspicious account; if we check say, transition from genesis to block 1 (or even the first few blocks which are simple), I compare the fields of the storage proof (previous and current block from geth) with the execution outcome from reth (which should match the current block from geth), there is always one balance mismatch:

address: 0x0000000000000000000000000000000000000000
balance current mismatch geth: 1 rust: 2000021000000021001

So that might be clique signer issue or gas computation.. 0x000..00 is marked as the "sequencer" of the block (https://lineascan.build/block/1)

@xJonathanLEI
Copy link
Contributor

Interesting finding! This is probably what's causing the issue since it looks like every single block is hitting the problem.

Need to figure out what causes geth to think different tho.

@gbotrel
Copy link
Contributor Author

gbotrel commented Aug 27, 2024

Went to the bottom of it by tracing step by step in local forks and made it work :-).
So, there were 2 related mismatch, due to the clique consensus I believe + config;

  1. reth gave a block reward
  2. reth rewarded the block sequencer instead of the clique signer

I have an... acceptable workaround;

  1. activate Paris hardfork so that reth doesn't do a block reward (seems to work ^^)
  2. dirty; when querying the RPC for a block, when network is linea, we change the header.beneficiary from 0x000...00 to the clique signer on the cloned block before calling the revm execute.

Will do an updated PR tonight or tomorrow 👍

@xJonathanLEI
Copy link
Contributor

Wow that's really awesome!

@gbotrel
Copy link
Contributor Author

gbotrel commented Aug 28, 2024

@xJonathanLEI PR ready for review :) not sure it's the best abstraction but it results in a minimal diff. my archive node is still up if you want to test.

@xJonathanLEI
Copy link
Contributor

Great! Will take a look. As for testing, we now have rsp-tests that's allows running tests against it even after your node is gone. I will make use of that to add integration tests here.

crates/executor/client/src/lib.rs Show resolved Hide resolved
.gitignore Outdated Show resolved Hide resolved
@xJonathanLEI
Copy link
Contributor

Rebased and squashed to resolve conflicts and cleanup spaghetti commit history.

@xJonathanLEI
Copy link
Contributor

Tested this against the original problematic block of 279107, getting these:

  • block_hash: 0x5b954ddd5ad9f544f66b358f692fd08b0995929b0b6d8fd407f95aed6d0c47bd
  • state_root: 0x8e9c4016efb40e35092bf54ec859c4f8e7126edeb4811f280629180d5f646e44

Comparing this against RPC data, it's matching. I think we can proceed to merge :)

Before testing I was kinda expecting to get a correct state root but a mismatched block hash just because how you're manually tweaking the header. To my surprise it turns out to match. I guess under the hood with Clique it's doing the exact same thing.

@xJonathanLEI
Copy link
Contributor

@gbotrel I just added the lock file CI check for rsp-cliean-linea but not the integration test as the rsp-tests cache setup does not currently play nice with trie node preimage recovery, resulting in occasional error under a race condition.

Could you please keep the node around just a little bit longer so that we can add that in as a follow up?

Copy link
Contributor

@xJonathanLEI xJonathanLEI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@xJonathanLEI xJonathanLEI merged commit b1935be into succinctlabs:main Aug 30, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants