[Merged by Bors] - ATX v2 (for ATX merge) #5785
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@           Coverage Diff            @@
##           develop    #5785   +/-   ##
==========================================
+ Coverage         0    80.6%   +80.6%
==========================================
  Files            0      289     +289
  Lines            0    29868   +29868
==========================================
+ Hits             0    24103   +24103
- Misses           0     4169    +4169
- Partials         0     1596    +1596

☔ View full report in Codecov by Sentry.
common/types/wire/activation_v2.go (Outdated)

type InnerActivationTxV2 struct {
	PublishEpoch uint32
As we're changing the ATX structure, it would be nice if we could include an (untrusted) LayerID here for sync purposes.
Could you explain what the use case is and what value the LayerID would have? Would it need to be part of InnerActivationTx (for signing)?
It is needed for syncv2 so that better clustering of ATX IDs during sync can be achieved, reducing sync traffic. The LayerID must be signed so it can't be changed by other nodes that process the ATX, which could confuse the sync algorithm by assigning the same ATX to different layers. It is assumed that some nodes may lie in the LayerID field of ATXs they publish, but with the majority of nodes being honest, that should not cause substantial performance degradation of the set reconciliation algorithm.
Would the LayerID field contain the layer in which the ATX was created, or something else?
I think PublishEpoch needs to be the first field in the type, because some decoding logic depends on it and we cannot retroactively change older ATXs to use the LayerID instead.
I'm not fully understanding how LayerID can help improve sync. An ATX isn't published in any layer - protocol-wise it should make no difference if an ATX is published on the first day or the last day of an epoch.
Also, PublishEpoch is part of the PoET challenge. So if it were a LayerID, an honest node would need to know in advance, with 5-minute accuracy, when it will publish its ATX 2 weeks later, which seems very difficult to predict (especially in a multi-smesher setup): there is some delay before the PoET proof becomes available, it could take multiple passes to generate the PoST proof, the node might not be in sync before publishing the ATX, further delaying the publication, etc.
Any sensible node will then just set the LayerID to the last layer before the next PoET round starts to give itself the maximum time to generate the PoST and publish the ATX, causing all ATXs of the same PoET phase to be published in the same layer...
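On the point that "some decoding logic depends on it": a minimal sketch of how a decoder could peek at the first field to decide which ATX version the remaining bytes encode. The helper names, the epoch threshold, and the assumption that the field is a plain 4-byte little-endian uint32 are illustrative only, not the PR's actual decoding code.

```go
package wire

import (
	"encoding/binary"
	"fmt"
)

// versionFromEpoch maps a publish epoch to the ATX wire version active at that
// epoch. The threshold is made up for illustration.
func versionFromEpoch(epoch uint32) int {
	if epoch >= 30 {
		return 2
	}
	return 1
}

// peekVersion reads only the leading PublishEpoch field; because it is the
// first field in every ATX version, the rest of the blob can then be decoded
// with the matching wire type.
func peekVersion(blob []byte) (int, error) {
	if len(blob) < 4 {
		return 0, fmt.Errorf("atx blob too short: %d bytes", len(blob))
	}
	epoch := binary.LittleEndian.Uint32(blob[:4])
	return versionFromEpoch(epoch), nil
}
```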
Still, we need to keep the publish epoch the first field in the encoding for backwards compatibility and so we know which version of the ATX the bytes after that encode.
We would then need to additionally encode the LayerID, which the publishing node has no incentive to be honest about and the receiving node has no way to verify. What happens if someone with a lot of identities just claims they were all published in the same layer when they weren't?
If that's not an issue for a successful sync, then why is it helpful to have it? If it does cause problems, how can we prevent a malicious user from disrupting the network by claiming incorrect publication layers?
Still, we need to keep the publish epoch the first field in the encoding for backwards compatibility and so we know which version of the ATX the bytes after that encode.
👍
@ivan4th Could you motivate and explain adding the LayerID field?
@poszu The range-based set reconciliation algorithm is based on subdividing the ranges of possible key values for which the fingerprints (hashes of keys XORed together) don't match. If keys are just hash values and, say, a peer has 100 hashes that the other one doesn't have, then syncing these 100 keys, which are evenly spread across the key space, may cause 100 subdivisions (roughly speaking), narrowing down to each of the keys individually.
But if the keys include the LayerID and thus can be sorted by time (layer), then the most likely situation is that these 100 keys will be close to one another, and with the algorithm we use it may take only a few subdivisions to sync all of them. Basically, the algorithm quickly narrows down to the 100 new keys, with only small "already synced" gaps between them, then transfers all of the IDs in the time-based range and downloads the missing blobs afterwards.
With such an optimization, the new sync algorithm will be very efficient at handling the "send me all of the recent data" case, which is expected to be the most common one for synced / mostly synced nodes.
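As a rough illustration of the clustering argument (this is not the syncv2 code; the key layout, names, and sizes are assumptions): keys ordered by a layer prefix put recently published ATXs into one contiguous range, so a single range fingerprint mismatch can cover all of them.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"sort"
)

// syncKey is a hypothetical ordering key: the layer comes first, then the ATX
// ID, so lexicographic order is (roughly) chronological order.
type syncKey [36]byte

func makeKey(layer uint32, atxID [32]byte) (k syncKey) {
	binary.BigEndian.PutUint32(k[:4], layer)
	copy(k[4:], atxID[:])
	return k
}

// fingerprint XORs the hashes of all keys in a range; two peers whose range
// fingerprints match can skip that whole range without further subdivision.
func fingerprint(keys []syncKey) (fp [32]byte) {
	for _, k := range keys {
		h := sha256.Sum256(k[:])
		for i := range fp {
			fp[i] ^= h[i]
		}
	}
	return fp
}

func main() {
	var keys []syncKey
	for layer := uint32(0); layer < 100; layer++ {
		id := sha256.Sum256([]byte{byte(layer)})
		keys = append(keys, makeKey(layer, id))
	}
	sort.Slice(keys, func(i, j int) bool { return string(keys[i][:]) < string(keys[j][:]) })
	// The most recent ATXs occupy one contiguous suffix of the sorted key
	// space, so a single differing range fingerprint covers all of them.
	fmt.Printf("fingerprint of recent range: %x\n", fingerprint(keys[90:]))
}
```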
Thanks for the explanation. I don't see any downside to using a publish layer instead of a publish epoch. The ATX handler could translate it to the epoch easily.
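For reference, that translation is a single division by the configured number of layers per epoch; a tiny sketch (the constant below is only an example value, roughly matching 5-minute layers and 2-week epochs):

```go
package wire

// layersPerEpoch is configuration; 4032 corresponds to 2-week epochs with
// 5-minute layers and is used here only as an example.
const layersPerEpoch = 4032

// publishEpoch derives the epoch an ATX belongs to from its publish layer.
func publishEpoch(publishLayer uint32) uint32 {
	return publishLayer / layersPerEpoch
}
```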
I do understand your goal but I still fail to see the benefit.
Any honest node will optimize its PoST proving time to consume most (if not all) of its cycle gap (CG), so most ATXs will be published towards the end of a CG, causing all ATXs to be concentrated in very few layers.
Even if 100% of nodes are honest, the great majority of ATXs will fall within those few layers (the last layers of every CG and the last few layers of the epoch), which makes this optimization much less effective than it would be if ATXs were spread evenly across an epoch.
activation/wire/wire_v2.go (Outdated)

	InnerActivationTxV2

	SmesherID types.NodeID
	// Signature over InnerActivationTxV2
Maybe we should document that it is the root of the merkle tree formed from all fields of InnerActivationTxV2?
Yeah, it's worth documenting it once we decide on the signing scheme.
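Once the scheme is settled, the documented relationship might look roughly like this. This is a sketch only: the leaf ordering, leaf encoding, and pairwise hashing are assumptions, not the final signing schema.

```go
package wire

import (
	"crypto/ed25519"
	"crypto/sha256"
)

// merkleRoot hashes each encoded field into a leaf and then pairs hashes level
// by level; an odd node is promoted unchanged. Illustrative only.
func merkleRoot(encodedFields [][]byte) [32]byte {
	level := make([][32]byte, 0, len(encodedFields))
	for _, f := range encodedFields {
		level = append(level, sha256.Sum256(f))
	}
	for len(level) > 1 {
		next := make([][32]byte, 0, (len(level)+1)/2)
		for i := 0; i < len(level); i += 2 {
			if i+1 == len(level) {
				next = append(next, level[i])
				break
			}
			h := sha256.New()
			h.Write(level[i][:])
			h.Write(level[i+1][:])
			var parent [32]byte
			copy(parent[:], h.Sum(nil))
			next = append(next, parent)
		}
		level = next
	}
	if len(level) == 0 {
		return [32]byte{}
	}
	return level[0]
}

// signInnerATX signs the merkle root of the encoded InnerActivationTxV2 fields
// instead of the flat encoding of the whole struct.
func signInnerATX(priv ed25519.PrivateKey, encodedFields [][]byte) []byte {
	root := merkleRoot(encodedFields)
	return ed25519.Sign(priv, root[:])
}
```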
// Hash serializes the NIPostChallenge and returns its hash.
// The serialized challenge is first prepended with a byte 0x00, and then hashed
// for second preimage resistance of poet membership merkle tree.
func (c *NIPostChallengeV2) Hash() types.Hash32 {
	ncBytes := codec.MustEncode(c)
	return hash.Sum([]byte{0x00}, ncBytes)
}
This should probably also be changed to be the root of a merkle tree formed from the NIPostChallengeV2, to ensure we can prove any field to be a specific value in a malfeasance proof without having to attach the full ATX. We just need to make sure that the NIPostChallengeV2 is attached via its root to the tree, so that each field can be extracted individually.
Regarding pre-image resistance: we need to check whether this is still needed with the new signature scheme.
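To make the malfeasance-proof use concrete, a sketch of checking a single field against such a root using only the encoded field value and its sibling path; the proof layout and hashing are assumptions, not this PR's design.

```go
package wire

import (
	"bytes"
	"crypto/sha256"
)

// proofStep is one sibling hash on the path from a leaf to the root, together
// with the side the sibling sits on.
type proofStep struct {
	sibling [32]byte
	left    bool // true if the sibling is the left child at this level
}

// verifyField recomputes the path from an encoded field value up to the root,
// so a malfeasance proof can pin one field without attaching the full ATX.
func verifyField(root [32]byte, encodedField []byte, path []proofStep) bool {
	cur := sha256.Sum256(encodedField)
	for _, step := range path {
		h := sha256.New()
		if step.left {
			h.Write(step.sibling[:])
			h.Write(cur[:])
		} else {
			h.Write(cur[:])
			h.Write(step.sibling[:])
		}
		copy(cur[:], h.Sum(nil))
	}
	return bytes.Equal(cur[:], root[:])
}
```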
bors merge
Pull request successfully merged into develop. Build succeeded.
Motivation
Proposal of a new ATX version to support ATX merging.
Description 🚧
The new ATX V2 allows identities to marry (join an equivocation set) and publish an ATX together (merge).
💒 Marriage
Marriage happens when an identity selected by the miner publishes an ATX with the Marriages slice filled with marriage certificates. A certificate proves the will of the identity that signed it to be married to the identity that publishes the certificate. For example, if identity Alice wants to marry Bob (and vice versa), the owner of these identities first picks the one that will publish an ATX with a marriage certificate. Say it's Bob. Alice signs Bob's ID and creates a MarriageCertificate, which Bob includes in his next ATX. The marriage is instant ⚡ and, from the moment Bob publishes the ATX, Alice and Bob are forever bound, forming a so-called equivocation set.
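A minimal sketch of what such a certificate could carry; the field names and types are illustrative assumptions, not the wire format added in this PR. The point is that Alice's signature over Bob's ID is what proves her consent.

```go
package wire

import "crypto/ed25519"

type NodeID [32]byte

// MarriageCertificate (illustrative): Alice signs Bob's NodeID; Bob includes
// the certificate in his next ATX, binding both identities into one
// equivocation set.
type MarriageCertificate struct {
	ReferenceID NodeID // the identity joining the set (Alice)
	Signature   []byte // Alice's signature over the publisher's (Bob's) NodeID
}

// certify is what Alice runs so that Bob can publish the marriage.
func certify(alicePriv ed25519.PrivateKey, aliceID, bobID NodeID) MarriageCertificate {
	return MarriageCertificate{
		ReferenceID: aliceID,
		Signature:   ed25519.Sign(alicePriv, bobID[:]),
	}
}
```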
👪 Equivocation set
An equivocation set binds all the identities it contains. If one of them acts maliciously, all IDs in the set become malicious.
Merging
A merge is the act of publishing a single ATX on behalf of many identities. A merged ATX contains non-interactive proofs of space-time (NIPoSTs) for multiple identities. It is signed by the single ID that publishes it and must include a VRF nonce (belonging to the signing ID) that satisfies the combined storage of all included IDs. It becomes active for the target epoch and will participate in the network protocols (e.g. Hare) and receive the accumulated rewards on the selected coinbase account according to the weight of all IDs present in the ATX.
💡 A merged ATX contains a reference to the ATX in which the marriage happened, for easy syntactic verification that the included IDs are married.
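A rough sketch of the shape of a merged ATX and how its weight accumulates; the struct layout, field names, and weight formula are assumptions for illustration, not the PR's wire format.

```go
package wire

type (
	NodeID [32]byte
	ATXID  [32]byte
)

// mergedATX (illustrative): one identity signs and publishes an ATX that
// carries NIPoSTs for several married identities and references the ATX in
// which the marriage happened.
type mergedATX struct {
	PublishEpoch uint32
	MarriageATX  ATXID  // where the equivocation set was formed
	SmesherID    NodeID // the single identity that signs the ATX
	VRFNonce     uint64 // must satisfy the combined storage of all included IDs
	NIPoSTs      []memberPost
}

// memberPost is the per-identity part of a merged ATX.
type memberPost struct {
	ID       NodeID
	NumUnits uint32 // space committed by this identity
}

// totalWeight accumulates the weight of all identities in a merged ATX, which
// drives eligibility and rewards in the target epoch.
func totalWeight(atx mergedATX, unitWeight uint64) uint64 {
	var w uint64
	for _, m := range atx.NIPoSTs {
		w += uint64(m.NumUnits) * unitWeight
	}
	return w
}
```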
Restrictions
1️⃣ One merged ATX per epoch
There can be only one merged ATX per equivocation set. Identities that choose not to merge (for example, because of a software or hardware malfunction, being late with PoST proving, etc.) must publish separate ATXs just for themselves. They cannot form a second merged ATX in the same epoch.
For example, given an equivocation set of IDs A, B, C, D, and E, if C and E malfunction, the node:
2️⃣ There is an upper limit on the equivocation set size
Only up to N identities can marry into a single equivocation set. Marrying more is malfeasance and cancels all IDs already present in the set, as well as the ones that tried to join it.
3️⃣ The equivocation set cannot grow
Identities that are already married cannot marry again.
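A sketch of how the three restrictions could be checked; the registry, the set ID, and the limit value are hypothetical helpers, not this PR's validation code.

```go
package wire

import "errors"

const maxSetSize = 256 // hypothetical value of the upper limit N

type NodeID [32]byte

// registry is a toy view of the state needed to enforce the restrictions.
type registry struct {
	married         map[NodeID]bool // restriction 3: married IDs cannot marry again
	mergedThisEpoch map[uint64]bool // restriction 1: one merged ATX per set per epoch
}

// checkMarriage enforces restrictions 2 and 3 for a marriage ATX forming a new
// equivocation set out of newIDs: the set cannot exceed N identities, and none
// of them may already be married (sets cannot grow).
func (r *registry) checkMarriage(newIDs []NodeID) error {
	if len(newIDs) > maxSetSize {
		return errors.New("equivocation set exceeds the size limit")
	}
	for _, id := range newIDs {
		if r.married[id] {
			return errors.New("identity is already married")
		}
	}
	return nil
}

// checkMerge enforces restriction 1 for a merged ATX from the given set.
func (r *registry) checkMerge(setID uint64) error {
	if r.mergedThisEpoch[setID] {
		return errors.New("equivocation set already published a merged ATX this epoch")
	}
	r.mergedThisEpoch[setID] = true
	return nil
}
```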