Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Send batch of blocks from the main loop #37

Merged
merged 4 commits into from
Jun 27, 2024
Merged

Conversation

vegarsti
Copy link
Member

@vegarsti vegarsti commented Jun 25, 2024

This PR changes the node indexer from sending one block at a time to sending a batch of blocks. Earlier we implemented concurrent block fetching with buffering (#32). On a configurable interval (defaults to every second), we now check the buffer and send all possible blocks.

We add a flag for specifying whether you wish to skip a failed block or not. It's off by default. This means if all the retries to the RPC node fails for a given block, we will crash. This ensures no block gaps.

Copy link
Member Author

vegarsti commented Jun 25, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @vegarsti and the rest of your teammates on Graphite Graphite

@vegarsti vegarsti force-pushed the main-loop-batch-blocks branch 4 times, most recently from bfb9cb2 to d17408e Compare June 25, 2024 12:57
@vegarsti vegarsti requested a review from msf June 25, 2024 12:59
Copy link
Contributor

@msf msf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code is super clean & nice.
I have nitpicks on the log lines and how to name the env/config variable for defining the "block submit interval".
I also think we need to have a test that shows we sending a batched request instead of a single block..

config/config.go Outdated Show resolved Hide resolved
config/config.go Outdated Show resolved Hide resolved
ingester/mainloop.go Outdated Show resolved Hide resolved
ingester/mainloop.go Outdated Show resolved Hide resolved
ingester/mainloop.go Outdated Show resolved Hide resolved
ingester/mainloop.go Outdated Show resolved Hide resolved
ingester/mainloop_test.go Outdated Show resolved Hide resolved
@vegarsti vegarsti force-pushed the dune-client-batch-blocks branch from 3c2104d to 75cb211 Compare June 27, 2024 05:45
Base automatically changed from dune-client-batch-blocks to main June 27, 2024 05:48
@vegarsti vegarsti force-pushed the main-loop-batch-blocks branch 6 times, most recently from 6399e35 to f0e0be9 Compare June 27, 2024 07:10
Comment on lines +62 to +63
if resp == nil {
log.Warn("Retrying request", "error", err)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw a panic once on this, so safe guarding against that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#38

@vegarsti vegarsti force-pushed the main-loop-batch-blocks branch 2 times, most recently from 2e022fa to 8d7e472 Compare June 27, 2024 07:13
@vegarsti vegarsti requested a review from msf June 27, 2024 07:14
@vegarsti vegarsti force-pushed the main-loop-batch-blocks branch 4 times, most recently from db35c5d to f00a5d4 Compare June 27, 2024 07:37
Comment on lines 151 to 156
if !i.cfg.SkipFailedBlocks {
return err
}
// We need to send an empty block downstream to indicate that this failed
blocks <- models.RPCBlock{BlockNumber: blockNumber}
continue
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized we need to handle the case of failing to get a block from the RPC node. If we just proceed without sending on the channel we will never proceed, since the SendBlocks channel relies on the block numbers. So we need to pass it downstream

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, we need to change the channels to be "blockRPCResp" and that can be RPCBlock OR an error + blockNumber

@vegarsti vegarsti force-pushed the main-loop-batch-blocks branch from f00a5d4 to 11d3299 Compare June 27, 2024 07:39
Comment on lines 200 to 207
if block.Empty() {
// We got an empty block from the RPC client goroutine, either fail or send an empty block downstream
if !i.cfg.SkipFailedBlocks {
i.log.Info("Received empty block, exiting", "number", block.BlockNumber)
return errors.Errorf("empty block received")
}
i.log.Info("Received empty block", "number", block.BlockNumber)
}

nextNumberToSend = i.trySendCompletedBlocks(ctx, blocks, nextNumberToSend)
i.log.Info("SendBlocks: Sent any completed blocks to DuneAPI", "nextNumberToSend", nextNumberToSend)
collectedBlocks[block.BlockNumber] = block
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized we need to handle the case of failing to get a block from the RPC node. If we just proceed without sending on the channel we will never proceed, since the SendBlocks channel relies on the block numbers. So we need to pass it downstream

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait, why don't we have:

if nextNumberToSend == block.Number && block.Empty() {
  log.WARN...
  if !skip { return .. }
  nextNumberToSend +=1 // skip it
}
.. normal logic here, the trySendBlocks will skip that block..

Comment on lines 241 to 243
// Skip block if it's empty and we're configured to skip empty blocks
if i.cfg.SkipFailedBlocks && block.Empty() {
nextNumberToSend++
continue
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only here do we properly skip the empty block

@vegarsti vegarsti force-pushed the main-loop-batch-blocks branch from 11d3299 to 3a457e0 Compare June 27, 2024 07:41
@vegarsti vegarsti force-pushed the main-loop-batch-blocks branch from 3a457e0 to 6730e76 Compare June 27, 2024 09:21
Copy link
Contributor

@msf msf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approving with comments. :-)

Comment on lines 151 to 156
if !i.cfg.SkipFailedBlocks {
return err
}
// We need to send an empty block downstream to indicate that this failed
blocks <- models.RPCBlock{BlockNumber: blockNumber}
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, we need to change the channels to be "blockRPCResp" and that can be RPCBlock OR an error + blockNumber

select {
case <-ctx.Done():
i.log.Info("SendBlocks: Context canceled, stopping")
return ctx.Err()
case block, ok := <-blocksCh:
case block, ok := <-blocks:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

liked the old name more :)

if block.Empty() {
// We got an empty block from the RPC client goroutine, either fail or send an empty block downstream
if !i.cfg.SkipFailedBlocks {
i.log.Info("Received empty block, exiting", "number", block.BlockNumber)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicate log line, move it above the if

Comment on lines 200 to 207
if block.Empty() {
// We got an empty block from the RPC client goroutine, either fail or send an empty block downstream
if !i.cfg.SkipFailedBlocks {
i.log.Info("Received empty block, exiting", "number", block.BlockNumber)
return errors.Errorf("empty block received")
}
i.log.Info("Received empty block", "number", block.BlockNumber)
}

nextNumberToSend = i.trySendCompletedBlocks(ctx, blocks, nextNumberToSend)
i.log.Info("SendBlocks: Sent any completed blocks to DuneAPI", "nextNumberToSend", nextNumberToSend)
collectedBlocks[block.BlockNumber] = block
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait, why don't we have:

if nextNumberToSend == block.Number && block.Empty() {
  log.WARN...
  if !skip { return .. }
  nextNumberToSend +=1 // skip it
}
.. normal logic here, the trySendBlocks will skip that block..

// Outer loop: We might need to send multiple batch requests if our buffer is too big
for _, ok := collectedBlocks[nextNumberToSend]; ok; _, ok = collectedBlocks[nextNumberToSend] {
// Collect a blocks of blocks to send, only send those which are in order
blocks := make([]models.RPCBlock, 0, len(collectedBlocks))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
blocks := make([]models.RPCBlock, 0, len(collectedBlocks))
blocks := make([]models.RPCBlock, 0, maxBatchSize)

for _, block := range blocks {
// We cannot send blocks out of order to DuneAPI
if block.BlockNumber != next {
panic(fmt.Sprintf("expected block %d, got %d", next, block.BlockNumber))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use require.Equal(t, blockNumber, next)


// Test must fail if DuneAPI receives blocks out of order
require.Equal(t, block.BlockNumber, sentBlockNumber+1)
// Fail if we're not sending a batch of blocks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is function screaming to get out.
(aka, this helper function is useful in many tests where you wanna validate what we pass to SendBlocks()

require.ErrorIs(t, err, someRPCError)
}

// TestRunRPCNodeFails shows that we crash if the RPC client fails to fetch a block
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

outdated comment

require.GreaterOrEqual(t, sentBlockNumber, maxBlockNumber)
}

// TestRunRPCNodeFails shows that we crash if the RPC client fails to fetch a block
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is showing we fail if DuneAPI.SendBlocks() fails right? so we shouldchange the name and comment?

@@ -5,3 +5,7 @@ type RPCBlock struct {
// agnostic blob of data that is the block
Payload []byte
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest we add a Error field and use Errored() bool instead of Empty()

@vegarsti vegarsti merged commit a4ba018 into main Jun 27, 2024
1 check passed
@vegarsti vegarsti deleted the main-loop-batch-blocks branch June 27, 2024 12:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants