-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rkuris/streaming iterator from start #346
Conversation
There are several TODOs here, but the code works for the two test cases I created. Both of these avoid making extension nodes, which are going away, and this code may need some tweaks to support values inside branch nodes.
Per review comments
Per review comments
Per review comments
Per review comments
IteratorState is any of: - Start at the beginning of the trie - Start at a specific key value of the trie - Continue after a saved node and parents See the inline comments for more details. TODO: - Implement StartAtBegininng (and tests) - If a key is provided that doesn't exist, start iterating at the next one (if we really need this) - If an error is returned, we should next return None, then start from the beginning. This makes the resumption after an error a little cleaner. We'd need another IteratorState for this. - Remove extension nodes (depends on other diffs)
This was an artifact from an earlier change. Reverted back to original.
Seems like we can just reference self and it works!
Removed unnecessary clone Added additional comments Added TODO for a few things: - possible to return a reference to the value instead of a copy - should handle case of branch nodes with values while traversing Adding a randomized key-value test after extension nodes are removed is also recommended.
Depends on streaming iterator spike PR |
This walks down the leftmost branch every time, recording the path down, then figuring out what the key and value is at the bottom, then resumes. Also refactors get_key_from_parents, as this is now done 3 times in the code, sometimes when we add a leaf key value, and other times when not.
059f9dd
to
8b3648f
Compare
Per review comments
From review comments
Moved the comments about the behavior at the end of the stream to doc comments on the iterator. Per review comments.
Per review comments
Not 100% sure this is what the reviewer asked for, but this is okay too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a couple of stylistic suggestions, I guess those can all be considered nits.
I do, however, think that if it's possible to write code that doesn't panic, we should absolutely be writing code that cannot panic, even in the event of code changes. It's always better to have the compiler uphold your invariants.
As for the tests, splitting case coverage into separate unit tests also makes refactoring and debugging much easier. Unit tests should cover a single case (within reason) such that it's obvious what failure is occurring. Right now, if we wanted to change the behaviour of the iterator starting over from the beginning, we would have to modify a test that covers multiple other cases. We could introduce a bug into our test coverage of a case that we did not want to change and cause a regression.
firewood/src/merkle.rs
Outdated
let found_offset = children[child_position as usize..] | ||
.iter() | ||
.position(|&addr| addr.is_some()); | ||
|
||
if let Some(found_offset) = found_offset { | ||
child_position += found_offset as u8; | ||
} else { | ||
next = parents.pop(); | ||
continue; | ||
} | ||
|
||
let addr = children[child_position as usize].unwrap(); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let found_offset = children[child_position as usize..] | |
.iter() | |
.position(|&addr| addr.is_some()); | |
if let Some(found_offset) = found_offset { | |
child_position += found_offset as u8; | |
} else { | |
next = parents.pop(); | |
continue; | |
} | |
let addr = children[child_position as usize].unwrap(); | |
let found_offset = children | |
.iter() | |
.skip(child_position as usize) | |
.enumerate() | |
.filter_map(|(i, addr)| addr.map(|addr| (i as u8, addr))) | |
.next(); | |
let Some((found_offset, addr)) = found_offset else { | |
next = parents.pop(); | |
continue; | |
}; | |
child_position += found_offset; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is another case of removing the Option being easier. Can we just land #341 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed per the suggestion
|
||
(found_key, value) | ||
} | ||
_ => unreachable!(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a foot gun if anyone changes the code. It would be better to have
let (found_key, value) = match std::mem::take(&mut self.key_state) {
at the top of the poll_next
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't able to get this to work. It's a bit tricky because of what's borrowed and moved. I'll create a task for fixing this in a followup.
Per review comments
Per review comments
You have to add one only if it wasn't None
Added cases of starting in the middle and at the last key
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would strongly have preferred no panicking code because I don't think that #341 should be merged. I will leave a comment in that PR as to why.
Fair point; lets make these changes before merging then. |
Per review comments
The loop to calculate the next found position and address can be a little simpler using enumerate, skip, and filter_map.
Also removed incorrect comment and comment about DiskAddress::is_null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🥇
Thanks for the patience
This walks down the leftmost branch every time, recording the path down, then figuring out what the key and value is at the bottom, then resumes.
Also refactors get_key_from_parents, as this is now done 3 times in the code, sometimes when we add a leaf key value, and other times when not.