"Welcome Basket" for new nodes #350

Open
carver opened this issue Oct 31, 2024 · 2 comments

Comments

@carver
Contributor

carver commented Oct 31, 2024

Problem

Collecting network data near your node ID as a fresh peer is slow. It's been tolerable on history, but gets painfully slow on state.

Previous Proposal: Range Queries

We've talked about ways to address this before. The main proposal I remember is running range queries on peers. But I don't think we overcame the general concern about asking a peer to do arbitrary work. Also, trie-node data is typically not stored with enough info to prove to requesting peers that it's canonical.

New Proposal: Welcome Baskets

We could have every node prepare its own Welcome Basket to offer to new neighbors.

Requesting a Basket

When a peer joins the network and is looking to populate its database, it first works to fill its peer buckets. When it finds peers that are near its own node ID (roughly within its own estimated radius), it asks each of them for a welcome basket.
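As a rough sketch of the "nearby" check, assuming an XOR distance metric over integer node IDs: the function names `xor_distance` and `basket_candidates` are hypothetical, not part of any existing client, and the radius is whatever estimate the new peer currently has.

```python
from typing import Iterable, List


def xor_distance(a: int, b: int) -> int:
    """Distance between two IDs, assuming an XOR metric."""
    return a ^ b


def basket_candidates(own_id: int, own_radius: int, known_peers: Iterable[int]) -> List[int]:
    """Return known peers within own_radius of own_id, closest first.

    These are the peers a fresh node would ask for a welcome basket.
    """
    nearby = [
        peer
        for peer in known_peers
        if peer != own_id and xor_distance(own_id, peer) <= own_radius
    ]
    return sorted(nearby, key=lambda peer: xor_distance(own_id, peer))
```

Asking the closest peers first is a design choice in this sketch, not something the proposal requires.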

What's in the Basket?

The basket is a heterogeneous collection of all types of content keys, values and inclusion proofs for the network.

The basket almost certainly does not include all of a mature peer's content (though it might for a very new one). When selecting which content to include in its basket, a peer should prefer content closest to its own node ID. This way, collections of baskets from different peers should create a wider coverage.
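One way the "prefer content closest to its own node ID" rule could look, as a minimal sketch. It assumes an XOR distance metric and a byte budget for the basket; the `BasketItem` type, the `select_basket` function, and the `max_bytes` parameter are all hypothetical, since the basket size is still TBD.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class BasketItem:
    content_id: int     # hypothetical: ID used for distance calculations
    content_key: bytes
    value: bytes
    proof: bytes        # inclusion proof shipped alongside the value

    def size(self) -> int:
        return len(self.content_key) + len(self.value) + len(self.proof)


def select_basket(own_id: int, stored: Iterable[BasketItem], max_bytes: int) -> List[BasketItem]:
    """Pick the stored items closest to own_id until the byte budget is spent."""
    basket: List[BasketItem] = []
    used = 0
    for item in sorted(stored, key=lambda i: own_id ^ i.content_id):
        if used + item.size() > max_bytes:
            break  # stop at the first item that does not fit, keeping a strict closest-first prefix
        basket.append(item)
        used += item.size()
    return basket
```

A very new peer whose whole store fits in the budget ends up with everything in its basket, which matches the parenthetical above.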

The transmitting peer does not filter the Welcome Basket for each recipient. That work is left up to the receiving peer.
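The receiving side might look something like the sketch below. It assumes the receiver drops anything outside its own radius (XOR distance) and anything whose proof fails to verify. The `filter_basket` name, the entry layout, and the caller-supplied `verify` callback are hypothetical; real proof checking would depend on the content type.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical entry layout: (content_id, value, proof)
Entry = Tuple[int, bytes, bytes]


def filter_basket(
    own_id: int,
    own_radius: int,
    basket: Iterable[Entry],
    verify: Callable[[int, bytes, bytes], bool],
) -> List[Entry]:
    """Keep only basket entries that are in radius and whose proof verifies."""
    kept: List[Entry] = []
    for content_id, value, proof in basket:
        if own_id ^ content_id > own_radius:
            continue  # outside the receiver's radius, not worth storing
        if not verify(content_id, value, proof):
            continue  # unprovable data is discarded
        kept.append((content_id, value, proof))
    return kept
```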

How big should we allow the basket to grow? 🤷🏻‍♂️ TBD

Generating a Basket

Every peer would be responsible for generating a welcome basket. Brand new peers would have an empty one, of course. Baskets should not be created on-demand. They should be pre-generated in a background process, or incrementally generated as new offers come in. The data must be fully provable to other peers.
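Incremental generation could be as simple as a bounded heap that keeps the closest items seen so far. This is only a sketch: it assumes an XOR distance metric and caps the basket by item count (`max_items`), which is a stand-in for whatever size limit is eventually chosen. The `IncrementalBasket` class is hypothetical, it tracks content IDs only, and it does not deduplicate repeated offers.

```python
import heapq
from typing import List, Tuple


class IncrementalBasket:
    """Tracks the max_items content IDs closest to own_id as offers arrive."""

    def __init__(self, own_id: int, max_items: int) -> None:
        self.own_id = own_id
        self.max_items = max_items
        # Max-heap on distance, stored as (-distance, content_id)
        self._heap: List[Tuple[int, int]] = []

    def offer(self, content_id: int) -> bool:
        """Consider one newly stored item; return True if it entered the basket."""
        if self.max_items <= 0:
            return False
        dist = self.own_id ^ content_id
        if len(self._heap) < self.max_items:
            heapq.heappush(self._heap, (-dist, content_id))
            return True
        farthest = -self._heap[0][0]
        if dist < farthest:
            # Evict the farthest item to make room for the closer one
            heapq.heapreplace(self._heap, (-dist, content_id))
            return True
        return False

    def snapshot(self) -> List[int]:
        """Content IDs currently in the basket, closest first."""
        return sorted((cid for _, cid in self._heap), key=lambda c: self.own_id ^ c)
```

Each offer costs O(log n), so the basket stays ready to send without any on-demand work when a new neighbor asks for it.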

The cost of sending a Welcome Basket should be trivial.

Goal

Welcome Baskets can give new nodes a quick jolt of data, so they can instantly start contributing to the network.

It's not a goal to immediately fill up new nodes. This just accelerates the first stages of the onboarding process.

@pipermerriam
Member

How would this actually work for trie-node/archive style state data? If it needs proof matter... then we're talking about nodes storing a significant overhead of proof data that they otherwise wouldn't be holding in storage... the data closest to them is fundamentally likely to be uniformly randomly distributed across the state trie... trie node based state data is the worst.

Overall I like the concept but I'm struggling to figure out how it would get applied to the data we most need it to apply to.

@carver
Contributor Author

carver commented Oct 31, 2024

Right, whether this approach is valuable hinges a lot on the numbers. The more replication there is on the network, the more effective it is. The more spread out the node IDs, the more effective. Probably the hardest to pin down: what's the value of getting a leg up?

I imagine the difference between Welcome Basket and Status Quo like this:
[Image: graph of how a node fills toward its target capacity over time, Welcome Basket vs. Status Quo]

So a node will reach 100% of its target capacity slowly over time in both cases, but the beginning accelerates a bit as it finds Welcome Baskets. I do think there's value there, but I'm not sure how important a goal that is. The relative shape of the graph will vary depending on how big the baskets are, how big the node's target storage is, how fast the data is being bridged in, and how many peers are nearby.

As for the state data, yeah the proofs will be much more efficient for the new latest state format. My intuition is still that something is better than nothing, even when paying a ~10x proving overhead. But I won't be convinced I'm right until we have more numbers on all the above factors.

So I don't think it's worth seriously pursuing a Welcome Baskets spec right now. Just a concept that felt worth drafting now, while we move toward collecting more data. Starting to push the latest archive state onto the network reliably is one example of data I'm keen to collect.
