Easy wrong forks on op-geth/v1.101304.2 X op-node/v1.4.0 on mainnet?! #468

Open
juno-yu opened this issue Jan 7, 2024 · 7 comments

juno-yu commented Jan 7, 2024

Describe the bug
Nodes easily end up on a wrong fork with op-geth/v1.101304.2 and op-node/v1.4.0 on mainnet. We hit this 3 times in 48 hours, each time at a different height.

To Reproduce
Run op-geth/v1.101304.2 with op-node/v1.4.0 on mainnet as an archive node, syncing from L1.

Expected behavior
Nodes should not finalize a wrong block.

Screenshots
Nodes finalize and stop at wrong blocks that do not match https://optimistic.etherscan.io/, and they cannot be rolled back with debug.setHead on op-geth (the node then wants to roll back to much older heights, which looks like it would take a very long time).

System Specs:

  • OS: linux
  • Package Version (or commit hash): corresponding images on dockerhub for both components

Additional context

  • Rolled back to op-geth/v1.101304.2 with op-node/v1.3.2; since then, multiple nodes have been fine for the last 24 hours.

mslipper commented Jan 7, 2024

Hi Juno,

What do you mean by "nodes finalize and stop at wrong blocks" - are you saying that the node diverges from what's on Etherscan, or that the node halts outright? Or both? Can you please post some logs for us to see?

Lastly, are you running with l1.trustrpc set to true?


juno-yu commented Jan 7, 2024

Hi Juno,

What do you mean by "nodes finalize and stop at wrong blocks" - are you saying that the node diverges from what's on Etherscan, or that the node halts outright? Or both? Can you please post some logs for us to see?

Lastly, are you running with l1.trustrpc set to true?

Yes

This was one of the nodes that went wrong:

 t=2024-01-04T23:07:01+0000 lvl=info msg="no peers ready to handle block requests for more P2P requests for L2 block history" target=114,388,037 end=0x00008fbf237170d75421d65fad1bc435c91d5246aec4b4169b02fa5782f9c143:114393561 current=114,393,330

That one was stuck at 114,388,037.

It seems the node forked off somewhere; the block hash does not match:

  "checkTime": "2024-01-04T23:45:58.456Z",
  "blockHeight": 114388037,
  "blockHash": "0xdf8af4ce7e93aa91b3e3f9a1e667155baab5b34a17b0de7fe16d8704fdf7ab46",
  "blockTime": "2024-01-04T13:27:31.000Z",

https://optimistic.etherscan.io/block/114388037
Expected hash: 0xd915e7daf532a72aa626207501b44727d95af206283237f01386ec084902bc95
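
For anyone wanting to repeat this kind of check, here is a minimal sketch of comparing block hashes between a local node and a trusted reference endpoint, then locating the first height where they differ. It only uses the standard `eth_getBlockByNumber` JSON-RPC method. The helper names (`rpc_call`, `block_hash`, `first_divergence`) and any endpoint URLs are assumptions for illustration, not part of op-node or op-geth. The binary search also assumes the two chains agree below the divergence and disagree at every height from it onward.

```python
import json
import urllib.request


def rpc_call(url, method, params):
    """POST a JSON-RPC 2.0 request to `url` and return its "result" field."""
    payload = json.dumps(
        {"jsonrpc": "2.0", "id": 1, "method": method, "params": params}
    ).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)["result"]


def block_hash(call, height):
    """Return the block hash at `height`; `call(method, params)` does the RPC."""
    block = call("eth_getBlockByNumber", [hex(height), False])
    return block["hash"]


def first_divergence(local_call, ref_call, low, high):
    """Lowest height in [low, high] where the two nodes report different hashes.

    Assumption: hashes match below the divergence and differ from it onward.
    Returns None when both nodes agree at `high`.
    """
    if block_hash(local_call, high) == block_hash(ref_call, high):
        return None
    while low < high:
        mid = (low + high) // 2
        if block_hash(local_call, mid) == block_hash(ref_call, mid):
            low = mid + 1  # still on the shared chain, look higher
        else:
            high = mid  # already diverged, look at or below mid
    return low
```

To use it against real nodes, wrap `rpc_call` with each endpoint, for example `lambda m, p: rpc_call(LOCAL_URL, m, p)`, where `LOCAL_URL` and the reference URL are placeholders you supply.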


juno-yu commented Jan 7, 2024

Merely rolling back by 1 block (the fork was only 1 block deep) did not seem to be the right way out.

I ran debug.setHead to go 1 block back on op-geth on that node and then restarted op-node. It started "Walking back L1Block by hash" through multiple days of data older than the problematic block (I don't know how long it would take or how far it would roll back), so I had to stop it and restore the node another way, since catching up is slow nowadays.
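
For reference, the rollback step described above can be sketched as a JSON-RPC request. This assumes the `debug` namespace is enabled on the op-geth HTTP endpoint; `debug_setHead` is the RPC form of the console's `debug.setHead` and takes the target height as a hex string. The function names and the endpoint URL are placeholders for illustration. The call is destructive, and as reported above it did not by itself get the node back onto the canonical chain.

```python
import json
import urllib.request


def set_head_request(target_height, request_id=1):
    """Build the JSON-RPC body that rewinds the chain head to `target_height`."""
    if target_height < 0:
        raise ValueError("target_height must be non-negative")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "debug_setHead",
        "params": [hex(target_height)],  # geth expects a hex-encoded number
    }


def send(url, body):
    """POST the request body to the node at `url` (placeholder endpoint)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

For the block in this issue, rewinding by one block would mean `set_head_request(114388037 - 1)`.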


juno-yu commented Jan 7, 2024

l1.trustrpc

Not with l1.trustrpc at that time. How does this affect node behavior on fork / rollback?
We point to a private geth (Ethereum L1) node, so it should be safe to enable as well, but it was not enabled at the moment of the fork.

@sebastianst
Member

@juno-yu we've recently received reports that receipts fetching with l1.trustrpc == true could lead to missing receipts during block derivation, probably coming from a temporary problem with the L1 connection, in turn deriving a block with missing user deposit transactions. We are still investigating why receipts fetching could return fewer receipts for a block without an error. In the meantime, we've released an image op-node/v1.4.3-rc.3 that puts back receipts validation even if l1.trustrpc is enabled. We will finalize this release candidate this week. You can safely use this image with l1.trustrpc == true or disable it in your existing setup.
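
To illustrate the failure mode, here is a simplified sketch of a completeness check on fetched receipts. It is not op-node's actual validation; it only checks that an L1 block's receipts line up one-to-one with its transaction hashes. The function name is hypothetical, and the inputs are assumed to be the JSON results of `eth_getBlockByNumber` (with full transactions disabled) and of a receipts query for the same block.

```python
def receipts_look_complete(block, receipts):
    """Return True if `receipts` matches the block's transactions one-to-one.

    `block["transactions"]` is assumed to be a list of transaction hashes,
    and each receipt is assumed to carry a "transactionHash" field.
    """
    tx_hashes = block["transactions"]
    if len(receipts) != len(tx_hashes):
        return False  # fewer (or more) receipts than transactions
    return all(
        receipt["transactionHash"] == tx_hash
        for receipt, tx_hash in zip(receipts, tx_hashes)
    )
```

A check like this would catch a response that silently drops receipts, which is the situation described above, but it does not prove the receipts' contents are correct.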

Can you confirm that block you mentioned above ("blockHeight": 114388037, "blockHash": "0xdf8af4ce7e93aa91b3e3f9a1e667155baab5b34a17b0de7fe16d8704fdf7ab46") was the first diverging block?

If yes, can you access this block's transactions/tx hashes? How many does it contain? Is that fewer than the number of transactions in the correct block? If it is missing some, can you confirm that these are missing user deposits?
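
The questions above can be answered mechanically once both versions of the block are available. A minimal sketch, assuming each block is the JSON result of `eth_getBlockByNumber` with full transaction objects, and that deposit transactions carry the OP Stack deposit type `0x7e`. The function names are hypothetical. Note that the first transaction of every L2 block is itself a deposit (the L1 attributes transaction), so only missing deposits after that position would be user deposits.

```python
DEPOSIT_TX_TYPE = 0x7E  # OP Stack deposit transaction type


def is_deposit(tx):
    """True if the transaction object has the deposit type."""
    return int(tx["type"], 16) == DEPOSIT_TX_TYPE


def missing_transactions(local_block, ref_block):
    """Transactions present in the reference block but absent locally."""
    local_hashes = {tx["hash"] for tx in local_block["transactions"]}
    return [tx for tx in ref_block["transactions"] if tx["hash"] not in local_hashes]


def summarize(local_block, ref_block):
    """Counts and missing hashes, plus whether every missing tx is a deposit."""
    missing = missing_transactions(local_block, ref_block)
    return {
        "local_count": len(local_block["transactions"]),
        "ref_count": len(ref_block["transactions"]),
        "missing_hashes": [tx["hash"] for tx in missing],
        "all_missing_are_deposits": bool(missing) and all(is_deposit(tx) for tx in missing),
    }
```

If `all_missing_are_deposits` comes back true for the first diverging block, that would be consistent with the missing-receipts explanation given earlier in this comment.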


jun0tpyrc commented Jan 9, 2024

Thanks, good to know the cause is being addressed.

I didn't keep the broken datadir, since archive datadirs are 6TB+ nowadays and I needed to recover things for our production workload, so I can't check the boundary of those broken blocks again.

@smartcontracts

@sebastianst what's the status of this?

@tynes tynes transferred this issue from ethereum-optimism/optimism Jun 17, 2024