node: init blockchain state on startup #3069

Merged: 2 commits merged into master from fix/meta-panic on Dec 25, 2024

carpawell
Member

Fixes #3066 on cold starts.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>

codecov bot commented Dec 24, 2024

Codecov Report

Attention: Patch coverage is 0% with 18 lines in your changes missing coverage. Please review.

Project coverage is 22.36%. Comparing base (3a10bc1) to head (b0b1270).
Report is 5 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| cmd/neofs-node/netmap.go | 0.00% | 9 Missing ⚠️ |
| cmd/neofs-node/config.go | 0.00% | 8 Missing ⚠️ |
| cmd/neofs-node/reputation.go | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3069      +/-   ##
==========================================
- Coverage   22.36%   22.36%   -0.01%     
==========================================
  Files         793      793              
  Lines       58698    58713      +15     
==========================================
- Hits        13130    13129       -1     
- Misses      44667    44683      +16     
  Partials      901      901              

Contributor

@cthulhu-rider left a comment

Although the proposed change is really needed, it does not fully rule out division by zero: from the service's point of view, a zero can still arrive. I suggest:

  1. preserve the zero return in implementations
  2. throw a clearer panic in the service

We could also consider falling back to a default.
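
To illustrate the second suggestion, here is a minimal sketch of a service-side check that panics with an explicit message instead of a bare "integer divide by zero". It is not code from this PR: `epochSource`, `fixedSource`, and `epochOf` are hypothetical names, and the panic message is only an example.

```go
package main

import "fmt"

// epochSource is a hypothetical stand-in for whatever supplies the epoch
// duration to the service; per suggestion 1 it may still return zero.
type epochSource interface {
	EpochDuration() uint64
}

// fixedSource is a trivial epochSource used only for this illustration.
type fixedSource uint64

func (f fixedSource) EpochDuration() uint64 { return uint64(f) }

// epochOf returns the index of the epoch that the given block falls into.
// It panics with a descriptive message when the duration is zero.
func epochOf(src epochSource, block uint64) uint64 {
	d := src.EpochDuration()
	if d == 0 {
		panic("epoch duration is zero: misconfigured network or uninitialized state")
	}
	return block / d
}

func main() {
	fmt.Println(epochOf(fixedSource(240), 1000)) // 1000/240 = 4

	// The zero case still panics, but the message now names the cause.
	defer func() { fmt.Println("recovered:", recover()) }()
	epochOf(fixedSource(0), 1000)
}
```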

@cthulhu-rider
Contributor

@carpawell #3066 (comment) what did you mean to fix?

@roman-khimov
Member

roman-khimov commented Dec 25, 2024

Epoch duration can't be 0, but a zero value can result from an operator error, in which case a panic is not the best way to handle the problem. I'd suggest relying on a proper value everywhere, but checking it on every update (including node start) and using the default if it's wrong.

Also, looks like t.objSharedMeta calculations can be omitted if !(t.localNodeInContainer && t.metainfoConsistencyAttr != "").
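
As a rough illustration of that second point (not code from this PR), the calculation can sit behind an early return. Only the field names `localNodeInContainer`, `metainfoConsistencyAttr`, and `objSharedMeta` come from the comment; the `target` type, both methods, the `encode` callback, and the attribute value `"strict"` are assumptions made for the sketch.

```go
package main

import "fmt"

// target mirrors only the three fields named in the comment; the type itself
// and everything else in this file are hypothetical.
type target struct {
	localNodeInContainer    bool
	metainfoConsistencyAttr string
	objSharedMeta           []byte
}

// needsSharedMeta reports whether the shared meta information will be used.
func (t *target) needsSharedMeta() bool {
	return t.localNodeInContainer && t.metainfoConsistencyAttr != ""
}

// prepareMeta calls encode only when its result is actually needed.
func (t *target) prepareMeta(encode func() []byte) {
	if !t.needsSharedMeta() {
		return
	}
	t.objSharedMeta = encode()
}

func main() {
	calls := 0
	encode := func() []byte { calls++; return []byte("meta") }

	// No consistency attribute: the encoding step is skipped.
	skipped := &target{localNodeInContainer: true}
	skipped.prepareMeta(encode)

	// Both conditions hold: the encoding step runs once.
	used := &target{localNodeInContainer: true, metainfoConsistencyAttr: "strict"}
	used.prepareMeta(encode)

	fmt.Println(calls, string(used.objSharedMeta)) // 1 meta
}
```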

@cthulhu-rider
Contributor

firstBlock := (uint64(currentBlock)/currentEpochDuration + 1) * currentEpochDuration
panics for the same reason
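
A small self-contained sketch of why that expression panics. The function body is the quoted expression with the same variable names; the wrappers `nextEpochFirstBlock` and `safely` are hypothetical helpers added only to demonstrate the behavior.

```go
package main

import "fmt"

// nextEpochFirstBlock reproduces the quoted expression: it rounds the current
// block height up to the next multiple of the epoch duration.
func nextEpochFirstBlock(currentBlock uint32, currentEpochDuration uint64) uint64 {
	return (uint64(currentBlock)/currentEpochDuration + 1) * currentEpochDuration
}

// safely runs the calculation and converts a runtime panic into an error.
func safely(currentBlock uint32, d uint64) (first uint64, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	return nextEpochFirstBlock(currentBlock, d), nil
}

func main() {
	fmt.Println(safely(1000, 240)) // (1000/240 + 1) * 240 = 1200, no error
	fmt.Println(safely(1000, 0))   // integer division by zero is reported as an error
}
```

Integer division by zero is a run-time panic in Go, so any expression of this shape fails the same way once the duration is zero, regardless of where the value came from.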

@carpawell
Member Author

> @carpawell #3066 (comment) what did you mean to fix?

@cthulhu-rider, I meant not encoding meta info if it is not requested. That does not relate to the panic directly, so I would like to put this fix into another PR (like #3063 with some title change).

@carpawell
Member Author

> Also, looks like t.objSharedMeta calculations can be omitted if !(t.localNodeInContainer && t.metainfoConsistencyAttr != "").

@roman-khimov, agreed, I will fix it, but IMO not in a quick panic-fix PR.

@carpawell
Member Author

> panics for the same reason

@cthulhu-rider, what do you mean? It shares the same netmap state that is updated right after creation after this PR.

@cthulhu-rider
Contributor

> > panics for the same reason
>
> @cthulhu-rider, what do you mean? It shares the same netmap state that is updated right after creation after this PR.

Just a remark: I ran into this when I hotfixed the original panic with a default to get the tests working. Anyway, these are different places that can potentially panic.

Contributor

@cthulhu-rider left a comment

node/nemtap:

And I still recommend throwing a particular panic in the places where the original one could occur.

Resolved review threads: cmd/neofs-node/reputation.go (outdated), cmd/neofs-node/netmap.go (outdated), cmd/neofs-node/netmap.go
@@ -743,6 +743,11 @@ func initBasics(c *cfg, key *keys.PrivateKey, stateStorage *state.PersistentStor

eDuration, err := nmWrap.EpochDuration()
fatalOnErr(err)
if eDuration == 0 {
Member

It's all in neofs-node, likely you can have some nState.UpdateEpochDuration() that will handle this.
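
A minimal sketch of what such a method could look like, assuming the state keeps the duration in an atomic field. Only the method name `UpdateEpochDuration` and the 240-block default (from this PR's commit message) come from the thread; the `networkState` type, its field, the `EpochDuration` getter, and the boolean return value are hypothetical.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// defaultEpochDuration is the 240-block fallback named in the commit message.
const defaultEpochDuration = 240

// networkState is a hypothetical, trimmed-down stand-in for the node's
// network map state.
type networkState struct {
	epochDuration atomic.Uint64
}

// UpdateEpochDuration stores d, substituting the default for an invalid zero.
// It reports whether the fallback was applied so that the caller can log it.
func (s *networkState) UpdateEpochDuration(d uint64) (fellBack bool) {
	if d == 0 {
		s.epochDuration.Store(defaultEpochDuration)
		return true
	}
	s.epochDuration.Store(d)
	return false
}

// EpochDuration returns the last stored (always non-zero after an update) value.
func (s *networkState) EpochDuration() uint64 { return s.epochDuration.Load() }

func main() {
	var s networkState
	fmt.Println(s.UpdateEpochDuration(0), s.EpochDuration())   // true 240
	fmt.Println(s.UpdateEpochDuration(100), s.EpochDuration()) // false 100
}
```

Keeping the check in one place means that both the startup path and later configuration updates get the same validation, and readers of the state never have to re-check for zero.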

Member Author

ok, degraded logs a little then

@roman-khimov
Member

> throw particular panic in places where the original one could occur

It's like adding dead code, not worth the trouble. This panic is a rather obvious one even when the only thing you see is a line number and "division by zero".

@carpawell force-pushed the fix/meta-panic branch 2 times, most recently from a30919b to 7028440, on December 25, 2024 14:36
Contributor

@cthulhu-rider left a comment

Don't forget about the linter.

Fall back to the default epoch duration of 240 blocks if an unexpected zero value
is received. A zero value is not acceptable, and it is hard to predict how the
system reacts to it (a panic was observed at least once). Refs #3066.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
@carpawell
Member Author

> It's like adding dead code

Same opinion: division by zero is a funny, classic mistake. In Go, IMO, it needs no extra context: devs will always understand what it is about, and admins can do nothing about it.

@roman-khimov merged commit 7ac9947 into master Dec 25, 2024
19 of 22 checks passed
@roman-khimov deleted the fix/meta-panic branch December 25, 2024 15:04
Successfully merging this pull request may close these issues.

panic during object put
3 participants