[Fix] Introduce GC logging on failure, don't break the loop. #3876
Conversation
@@ -163,7 +163,10 @@ impl<DB: Blockstore + GarbageCollectable + Sync + Send + 'static> MarkAndSweep<D
/// using CAR-backed storage with a snapshot, for implementation simplicity.
Is this note still valid? If not, please update it.
```rust
/// NOTE: This currently does not take into account the fact that we might be starting the node
/// using CAR-backed storage with a snapshot, for implementation simplicity.
```
This note will always be valid; it states that the GC does not care about CAR-backed storage blocks. Perhaps I should rephrase it to make that clearer.
Basically, it works as follows: the database the GC has access to is either ParityDB or MemoryDB, depending on the implementation. Since the GC is designed to wait before it starts the mark step, we don't really need to do anything special. Even if the database does not have many records by that time, it will just do a smaller cleanup in the first cycle.
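For illustration, here is a minimal toy sketch of that cycle; the `ToyGc` type, its fields, and the string-keyed store are made up for this example and are not Forest's actual types:

```rust
use std::{collections::HashSet, time::Duration};

// Toy model of the cycle described above. The GC sleeps before marking,
// so a freshly bootstrapped database simply yields a smaller first cleanup.
struct ToyGc {
    store: HashSet<String>, // stands in for ParityDB/MemoryDB keys
    interval: Duration,
}

impl ToyGc {
    async fn run_cycle(&mut self, reachable: &HashSet<String>) {
        tokio::time::sleep(self.interval).await; // wait before the mark step
        // "Mark": record everything still reachable from the chain head.
        let marked: HashSet<String> =
            self.store.intersection(reachable).cloned().collect();
        // "Sweep": drop everything that was not marked.
        self.store.retain(|k| marked.contains(k));
    }
}
```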
In this case, I don't quite understand how `fn filter` works when it takes the parity-db only. Shouldn't `fn unordered_stream_graph` take the entire DB (parity-db + car-db)?
It should work just fine. We only care about cleaning up unreachable nodes. Since all the data stored in ParityDB is in the future relative to any data in car-db, we don't really care about it, unless I'm missing some weird edge case. I'm assuming that the snapshots we export are finalized, in which case there should not be any issues.
The one enhancement I think we should make, based on your comments and my revisiting the code, is specifying the first epoch that's stored in the database as opposed to the car file. Otherwise
```rust
if depth > current_epoch {
    time::sleep(interval).await;
    return anyhow::Ok(());
}
```
is basically useless. Or we can get rid of it altogether and just rely on the initial sleep that we do later, for simplicity.
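A hedged sketch of that enhancement; `first_db_epoch` is hypothetical and would record the first epoch actually persisted to the database, as opposed to the CAR file:

```rust
/// Hypothetical helper: only proceed with GC once the database itself
/// (excluding CAR-backed storage) holds at least `depth` epochs of history.
fn enough_db_history(depth: i64, current_epoch: i64, first_db_epoch: i64) -> bool {
    depth <= current_epoch - first_db_epoch
}
```

The check would then compare `depth` against the span of database-backed epochs rather than against `current_epoch` itself.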
> I'm assuming that the snapshots we export are finalized

I'm not positive about this; as I understand it, this can only be true when we export a snapshot from `latest - finality`. @lemmih could you weigh in on this?

> just rely on initial sleep

I suggest storing the last GC epoch in a parity-db metadata column and calculating the wait time, instead of waiting for 10h on every start. In an edge case, if we have a node that restarts (maybe for getting updates) every 9h, then GC would never get a chance to run.
@lemmih what do you think about the above? Seems reasonable to me.
We could also initialize the last GC epoch as the heaviest tipset we see in car-backed storage when the database is empty.
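A rough sketch of how those two suggestions could combine; every name here (`last_gc_epoch` read from a parity-db metadata column, `heaviest_car_epoch` as the fallback) is an assumption for illustration, not Forest's actual API:

```rust
/// Hypothetical: how many epochs remain until the next GC run, resuming
/// the countdown across restarts instead of resetting it to the full 10h.
fn epochs_until_next_gc(
    current_epoch: u64,
    last_gc_epoch: Option<u64>, // persisted in a parity-db metadata column
    heaviest_car_epoch: u64,    // fallback when the database is empty
    gc_interval_epochs: u64,
) -> u64 {
    let last = last_gc_epoch.unwrap_or(heaviest_car_epoch);
    gc_interval_epochs.saturating_sub(current_epoch.saturating_sub(last))
}
```

Under this scheme a node that restarts every 9h would still eventually reach a zero wait, instead of restarting the 10h countdown each time.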
I'm not too concerned about this.
@@ -185,6 +188,9 @@ impl<DB: Blockstore + GarbageCollectable + Sync + Send + 'static> MarkAndSweep<D
// Make sure we don't run the GC too often.
time::sleep(interval).await;
Is there a way to not sleep for 10h here and make it testable with CI calibnet checks?
We can make it configurable. This `sleep` was introduced at David's request to avoid running the GC too often.
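If we did make it configurable, it could be as small as an optional knob that defaults to the current behaviour; the type and field names below are hypothetical:

```rust
use std::time::Duration;

/// Hypothetical configuration entry; the default preserves the 10h sleep.
struct GcConfig {
    interval: Duration,
}

impl Default for GcConfig {
    fn default() -> Self {
        Self { interval: Duration::from_secs(10 * 60 * 60) }
    }
}
```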
Actually, I don't think it's necessary. The reason being: we still need to wait `finality` epochs to do `filter` and `sweep`, which makes this scenario pretty much untestable using calibnet checks, unless we want to wait 7.5+ hours.
Why is it necessary to wait `finality` epochs, as opposed to keeping `2 * finality` state roots like snapshot export does?
> which makes this scenario pretty much untestable using calibnet checks

Could you elaborate on why it becomes impossible to proactively trigger GC via a CLI event like before? That was useful for e2e tests.
We can make it artificially testable by introducing more abstractions that allow mocking certain parts of the algorithm.
If I'm not wrong, this still does not allow effective manual e2e testing or benchmarking on calibnet or mainnet, which sounds like a deal breaker to me.
Note that the old GC solved this exact problem (running in the background without downtime) without waiting.
It's not ideal; whether or not it's a deal breaker because it's not manually testable, I'm not really sure. We can always benchmark each of the three steps separately; there's no need to benchmark all three at the same time.
@lemmih could you please chip in with your thoughts on the matter? I kind of agree that this is hard to test and impossible to test manually, but it does offer a much simpler codebase and algorithm.
Waiting for chain finality is required for correctness. Due to possible forks, we may not delete data unless (1) it is unreachable from the current HEAD, and (2) it was written to the database more than `chain_finality` epochs ago. This does make end-to-end testing more difficult: how do we quickly test the GC when it is not allowed to garbage collect anything in the first 7.5 hours?
Lowering (or removing) the wait time is not an option. The wait is required for correctness and cannot be any lower than 7.5 hours on mainnet or calibnet.
I posit that end-to-end testing is not important. We can verify with quickcheck that the GC behaves as expected (i.e. deletes what it should delete, doesn't delete what it shouldn't, etc.). And we can statically guarantee that the GC loop will never exit (except for critical failures, which should terminate Forest). Furthermore, the nodes we run in production will show whether there are any GC leaks over time.
In summary: keep the 7.5-hour wait (this is not optional). Keep the quickcheck tests. Don't do end-to-end tests in the CI (as the GC doesn't do anything interesting in that timespan). Prove to the reader that the GC loop will run as long as Forest itself.
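For illustration, the deletion rule from the first paragraph can be written as a single predicate; the struct and field names are made up for this sketch:

```rust
/// Illustrative only: a block may be deleted iff it is unreachable from the
/// current HEAD *and* was written more than `chain_finality` epochs ago,
/// so that no fork can still resurrect it.
struct BlockMeta {
    reachable_from_head: bool,
    written_at_epoch: i64,
}

fn may_delete(meta: &BlockMeta, current_epoch: i64, chain_finality: i64) -> bool {
    !meta.reachable_from_head
        && current_epoch - meta.written_at_epoch > chain_finality
}
```

A quickcheck property over a predicate like this (it never fires for reachable or recently written blocks) is exactly the kind of unit-level guarantee suggested above.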
As noted in the issue comment, one way out of this is providing GC alarms in our long-running node. This is not ideal, given our current release process, but it's better than pure faith. 🙏
I'm currently figuring out why the GC gets stuck. It could be due to the fact that we bootstrap the node with a snapshot (CAR-backed storage) and the parallel graph walk somehow gets stuck due to not having access to data all the way back to genesis. Once that is debugged and resolved, we'll be able to monitor GC via checks and metrics indeed.
@ruseinov Can this be closed? Or should it be merged?
yep, about time.
Summary of changes
Changes introduced in this pull request:
Reference issue to close (if applicable)
Work on #3863
Other information and links
Change checklist