Reusable Diffusion Investigation
This document outlines the findings of an investigation into the Cardano diffusion layer and explores how to make it reusable for third-party users. The goal is to transform the current diffusion layer so as to become a library that enables others to build overlay networks, leveraging the existing self-balancing, self-optimizing, and self-healing capabilities of the current Cardano Node network. Users will be able to diffuse any data across their network by running custom protocols, defining their own targets for connectivity, and setting their own governance policies (such as churning and monitoring jobs).
Note that this project is not meant to be a fork of the current `ouroboros-network` repository, but rather a major refactoring of it.
In the current architecture, the Cardano node initializes the consensus layer, which in turn initializes the diffusion layer. The node parses configuration files and passes diffusion parameters to the consensus layer via `RunNodeArgs` and `StdRunNodeArgs`. Consensus then uses these parameters to initialise the diffusion layer.
Additionally, the consensus layer is responsible for providing the diffusion layer with the versioned applications (mini-protocols) bundle, which manages peer connections and promotions. This is important for protocols like the handshake protocol and codecs. The consensus layer also supplies the `LedgerPeersConsensusInterface`, which informs the diffusion layer about the latest slot numbers, ledger state judgments, and the ledger peers.
Currently the consensus layer only depends on the `ouroboros-network` and `ouroboros-network-api` libraries. `ouroboros-network` offers the top-level integration of all network components (i.e. the diffusion layer), while `ouroboros-network-api` provides the shared API between the network and consensus layers.
The goal of a reusable design is to allow third-party users to leverage the `ouroboros-network` diffusion layer for their own applications by decoupling it from Cardano node-specific implementation details. Currently, the only client of `ouroboros-network` is the Cardano node, which, being part of a monolithic stack, has tightly coupled many implementation specifics to the diffusion layer. This coupling makes it difficult to reuse the diffusion layer for other purposes.
flowchart TB
subgraph CN[Cardano Node]
C[Consensus]
subgraph Diffusion["Diffusion (ouroboros-network)"]
direction TB
DN2C[N2C]
DN2N[N2N]
end
C -- "LedgerPeersConsensusInterface" --> Diffusion
end
flowchart TB
subgraph CN["Cardano Node (Haskell)"]
C[Consensus]
subgraph CNDiff["Diffusion (o-network)"]
CNN2C[N2C]
CNN2N[N2N]
end
C -- "LedgerPeersConsensusInterface" --> CNDiff
end
subgraph CA["Custom App (Haskell)"]
subgraph CADiff["Diffusion (o-network)"]
CAN2C[N2C]
CAN2N[N2N]
end
CNN2C -- "LedgerPeersConsensusInterface RPC" --> CAN2C
end
subgraph CAE["Custom App Logic (X Lang)"]
CAE2C[N2C]
end
CAN2C <--> CAE2C
The main objective is to abstract and extend the `LedgerPeersConsensusInterface` so that third-party diffusion layers can interact with a full Cardano node and access all the information they need to operate. This will be facilitated through a node-to-client RPC interface. Users will also be able to configure all diffusion parameters, such as `PeerMetrics` for churn or `PeerSelectionTargets` for the peer selection governor. Additionally, users can introduce their own application-specific parameters to implement custom details as needed.
This section describes and analyses the current stakeholders (i.e. third-party users) that will benefit from using the `ouroboros-network` stack as a reusable API. The decisions in this document are guided by the properties identified in the stakeholders' use cases.
Mithril is a protocol based on a Stake-based Threshold Multi-signature scheme, which leverages Cardano's Stake Pool Operators (SPOs) to certify data from the Cardano blockchain in a trustless manner. Currently, Mithril is used within the Cardano ecosystem to enable fast bootstrapping of full nodes and secure light wallets.
The Mithril protocol coordinates the collection of individual signatures from the signers (run by SPOs), which are then aggregated into multi-signatures and certificates by the aggregators. To achieve full decentralization, Mithril must operate over a decentralized peer-to-peer network. Building such a network from scratch would require substantial time, effort, and investment. Moreover, since SPOs, representing Cardano's active stake, will need to adopt and operate Mithril nodes alongside their Cardano nodes, a more efficient solution is to leverage the existing Cardano network layer. This approach will simplify Mithril's development while minimizing its impact on the Cardano network and reducing the maintenance overhead for SPOs.
Mithril will be an early adopter of the proposed design in this document, serving as a use case and illustrative example.
- mithril-node
- mithril-signer (developed in Rust by the Mithril team)
- cardano-node
architecture-beta
group MithrilNode[Mithril Node]
group OuroborosNetworkMithril[Mithril Ouroboros Network] in MithrilNode
service NodeToClientMithril[Mithril NodeToClient] in OuroborosNetworkMithril
service NodeToNodeMithril[Mithril NodeToNode] in OuroborosNetworkMithril
service NodeToClientCardanoClient[NodeToClient Cardano Client] in OuroborosNetworkMithril
group CardanoNode[Cardano Node]
group OuroborosNetworkCardano[Cardano Ouroboros Network] in CardanoNode
service NodeToClientCardano[Cardano NodeToClient] in OuroborosNetworkCardano
service NodeToNodeCardano[Cardano NodeToNode] in OuroborosNetworkCardano
group MithrilSigner[Mithril Signer]
service NodeToClientMithrilSigner[Mithril NodeToClient] in MithrilSigner
NodeToClientCardanoClient:L -- R:NodeToClientCardano
NodeToClientMithrilSigner:B -- T:NodeToClientMithril
The Mithril node must operate alongside the Cardano node, communicating through UNIX sockets and a custom Node-to-Client (N2C) RPC protocol. This enables the Mithril diffusion layer to access the necessary ledger information to establish a resilient overlay network. Additionally, the Mithril node's diffusion layer will need to support custom protocols to facilitate communication with both other Mithril nodes and the Mithril signer nodes, which provide signatures for diffusion. These signer nodes, or any application-specific logic processes, can be implemented in any suitable programming language, as long as both the Mithril node and the signer node communicate using the same protocol.
Other protocols in the Cardano ecosystem, such as Leios and Peras (and probably other protocols in the future), also need the capability to diffuse messages originating from block producers in a decentralized fashion. However, in the Leios and Peras cases, the Cardano node itself is a producer and consumer of these messages. We have taken into consideration this need for a generic solution in the design.
Currently, the following diffusion-specific parameters can be configured manually, via configuration files, or programmatically (these belong to the `Diffusion.run` function):
- Tracers (Common interface between P2P and Non-P2P)
- Tracers Extra (Additional tracers for P2P diffusion)
- Arguments (Common interface between P2P and Non-P2P)
- Arguments Extra (Additional arguments for P2P diffusion)
- Applications (Common interface for mini-protocol bundles)
- Applications Extra (Additional settings for P2P applications)
NOTE: Once we remove non-P2P, we can merge `Tracers` & `TracersExtra`, `Arguments` & `ArgumentsExtra`, and `Applications` & `ApplicationsExtra`.
1. Tracers: Includes tracers for (Local) Mux, (Local) Handshake, and the diffusion tracer. These tracers are likely to remain unchanged until the Non-P2P stack is fully removed.
2. Tracers Extra: Includes tracers for P2P components such as (Local) Public Root Peers, Ledger Peers, Peer Selection, (Churn) Peer Selection Counters, (Local) Connection Manager, (Local) Server, and (Local) Inbound Governor. Third-party users will need to implement their own tracers to monitor the diagnostics of their diffusion layer.
3. Arguments: Includes settings for IPv4/IPv6 addresses, rate limits, and whether the diffusion layer should be initiated in either initiator-only or initiator-responder mode. These configurations are required from third-party users.
4. Arguments Extra: Covers P2P-specific configurations, including Peer Targets, Consensus Mode, Minimum Big Ledger Peers for Trusted State, Peer Sharing, Protocol Idle Timeouts, TIME_WAIT timeouts, Churn Deadline Interval, and Bulk Churn Interval. It also requires STM actions to handle data such as (Local) Public Root Peers, Bootstrap Peers flags, Ledger Peer Snapshots, and whether to use Ledger Peers. Some parameters here are specific to the Cardano node and may not be relevant to third-party applications, so they should be abstracted from the user.
5. Applications: This includes versioned mini-protocol bundles to run based on the connection mode: Initiator Mode, Initiator-Responder Mode, or Local Responder (Non-P2P). These bundles are polymorphic on the N2N/N2C versions and respective data, enabling third-party users to define their own configurations. However, the user must provide the `LedgerPeersConsensusInterface`, which serves as the only communication point between the diffusion and consensus layers. In the current system, the consensus layer provides this interface by directly accessing the required `TVar`s. To support third-party usage, the diffusion layer will need to communicate with a Cardano node using an N2C (RPC) protocol. A callback informing whether the node is connected to local roots or external peers is also required, though this may not be relevant for third-party users.
6. Applications Extra: This includes additional configuration for application behaviors such as the (Local) Rethrow Policy, Return Policy, Peer Metrics TVar for Peer Selection, Block Fetch Mode, and Peer Sharing Registry. As with other parameters, not all of these settings may be relevant for third-party users.
The following better illustrates the dependencies between components:
flowchart TB
subgraph CN[Cardano Node]
subgraph Diffusion Arguments
Tr[Tracers]
TrE[Tracers Extra]
Args[Arguments]
ArgsE[Arguments Extra]
subgraph Apps[Applications]
LCI[LedgerPeersConsensusInterface]
end
AppsE[Applications Extra]
end
subgraph ON[Ouroboros Network]
OG[Outbound Governor]
CG[Churn Governor]
IG[Inbound Governor]
Etc[Others: Connection Manager, Local Root Peers, Public Root Peers, etc..]
end
OC[Ouroboros Consensus]
end
OC --> LCI
Tr --> Etc
TrE --> OG
TrE --> CG
TrE --> IG
TrE --> Etc
Args --> Etc
Args --> IG
ArgsE --> OG
ArgsE --> CG
ArgsE --> IG
ArgsE --> Etc
Apps --> Etc
AppsE --> OG
AppsE --> CG
AppsE --> Etc
Note that all parameters, except for the `LedgerPeersConsensusInterface`, are static. The `LedgerPeersConsensusInterface` enables dynamic interaction with the consensus layer, providing real-time values that the diffusion layer relies on to function correctly.
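The difference between static parameters and the dynamic consensus interface can be sketched as follows. This is a minimal illustration, not the actual `ouroboros-network` definitions: `StaticArgs`, `ConsensusInterface`, the `Word64` slot type and the use of plain `STM` (rather than `STM m`) are all simplifying assumptions.

```haskell
import Control.Concurrent.STM
  (STM, TVar, atomically, newTVarIO, readTVar, writeTVar)
import Data.Word (Word64)

-- Simplified stand-in for a slot number (assumption, not the real type).
type SlotNo = Word64

-- Static parameters: plain values fixed when the diffusion layer starts.
data StaticArgs = StaticArgs
  { saTargetActivePeers :: Int
  , saChurnIntervalSecs :: Int
  }

-- Dynamic interface: the field is an STM action, so every read observes
-- whatever value the consensus side currently holds.
newtype ConsensusInterface = ConsensusInterface
  { ciGetLatestSlot :: STM (Maybe SlotNo)
  }

-- Back the interface with a TVar owned by the (hypothetical) consensus side.
mkInterface :: TVar (Maybe SlotNo) -> ConsensusInterface
mkInterface var = ConsensusInterface { ciGetLatestSlot = readTVar var }

-- Read the slot, let "consensus" advance it, then read it again.
observeTwice :: IO (Maybe SlotNo, Maybe SlotNo)
observeTwice = do
  var <- newTVarIO Nothing
  let iface = mkInterface var
  before <- atomically (ciGetLatestSlot iface)
  atomically (writeTVar var (Just 42))
  after <- atomically (ciGetLatestSlot iface)
  pure (before, after)
```

The same interface value yields different results over time, which is why a third-party diffusion layer needs a live channel to the node for it, rather than a configuration file.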
To improve the current configuration scheme and provide a cleaner API for third-party users, a new diffusion configuration structure is proposed. The goal is to abstract and hide irrelevant components specific to the Cardano node application (e.g., Block Fetch Mode, Ledger Snapshot, Bootstrap Peers) while allowing users to configure the essential parts of the diffusion layer.
- Tracers: Remains unchanged.
- Tracers Extra: These will need to be made extensible so that third-party users can add their own tracers if needed.
- Arguments: Remains unchanged.
- Arguments Extra: These will be exposed so that third-party users can define and instantiate their own arguments, for example:
      data CardanoArgumentsExtra m = CardanoArgumentsExtra
        { caePeerTargets                      :: ConsensusModePeerTargets
        , caeReadUseBootstrapPeers            :: STM m UseBootstrapPeers
        , caeMinBigLedgerPeersForTrustedState :: MinBigLedgerPeersForTrustedState
        , caeConsensusMode                    :: ConsensusMode
        }

      data ArgumentsExtra extraArgs m = ArgumentsExtra
        { daPeerTargets                       :: PeerSelectionTargets
        , daOwnPeerSharing                    :: PeerSharing
        , daProtocolIdleTimeout               :: DiffTime
        , daTimeWaitTimeout                   :: DiffTime
        , daDeadlineChurnInterval             :: DiffTime
        , daBulkChurnInterval                 :: DiffTime
        , caeReadUseLedgerPeers               :: STM m UseLedgerPeers
        , caeReadUseBootstrapPeers            :: UseBootstrapPeers
        , caeMinBigLedgerPeersForTrustedState :: MinBigLedgerPeersForTrustedState
        , daReadLocalRootPeers                :: STM m (LocalRootPeers.Config RelayAccessPoint)
        , daReadPublicRootPeers               :: STM m (Map RelayAccessPoint PeerAdvertise)
        , daReadLedgerPeerSnapshot            :: STM m (Maybe LedgerPeerSnapshot)
        , daAPIArgs                           :: extraArgs
        }
- Applications: These remain mostly unchanged, except for `LedgerPeersConsensusInterface`, which will need changes to support all RPC methods required by the N2C communication protocols. This will require an implementation of the protocols on the Cardano node side, as well as a new handshake protocol. Since there is currently a Cardano (Genesis) specific callback (`daUpdateOutboundConnectionsState`), it would be best to abstract over extra consensus callbacks, for example:

      data CardanoLedgerPeersConsensusInterface m = CardanoLedgerPeersConsensusInterface
        { clpciGetLedgerStateJudgement        :: STM m LedgerStateJudgement
        , clpciUpdateOutboundConnectionsState :: OutboundConnectionsState -> m ()
        }

      data LedgerPeersConsensusInterface api m = LedgerPeersConsensusInterface
        { lpGetLatestSlot  :: STM m (WithOrigin SlotNo)
        , lpGetLedgerPeers :: STM m [(PoolStake, NonEmpty RelayAccessPoint)]
          -- ... other RPC methods
        , lpExtraAPI       :: api
        }

  Here `daUpdateOutboundConnectionsState` could always be filled with a no-op such as `\_ -> pure ()`, but it is better to avoid leaking Cardano-specific details, and an extension point is also more versatile.
- Applications Extra: The only questionable value in this record is `daBlockFetchMode`, as it is specific to the Block Fetch mini-protocol, and the Churn Governor logic is not decoupled from this and other Cardano Node-specific parameters. If we have two separate implementations for Churn, as mentioned below in Churn Governor, then having an extension point is a viable option.
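The extension-point idea from the list above can be illustrated with a small, self-contained sketch. The record and field names below (`ConsensusInterface`, `CardanoExtra`, `ciExtraAPI`, and so on) are hypothetical and heavily simplified: slots are plain `Int`s and the monad is left abstract instead of using `STM m`.

```haskell
import Data.Functor.Identity (Identity (..))

-- Generic interface with an extension slot `api` (hypothetical names).
data ConsensusInterface api m = ConsensusInterface
  { ciGetLatestSlot :: m (Maybe Int)
  , ciExtraAPI      :: api
  }

-- Cardano-only callbacks live in their own record instead of the generic one.
newtype CardanoExtra m = CardanoExtra
  { ceUpdateOutboundState :: Bool -> m ()
  }

-- A third-party application has nothing extra to provide, so `api` is unit.
thirdPartyInterface :: Applicative m => ConsensusInterface () m
thirdPartyInterface = ConsensusInterface
  { ciGetLatestSlot = pure Nothing
  , ciExtraAPI      = ()
  }

-- The Cardano node fills the slot with its own callbacks.
cardanoInterface :: Applicative m => ConsensusInterface (CardanoExtra m) m
cardanoInterface = ConsensusInterface
  { ciGetLatestSlot = pure (Just 0)
  , ciExtraAPI      = CardanoExtra { ceUpdateOutboundState = \_ -> pure () }
  }

-- Generic code only touches the common fields and works for any `api`.
latestSlot :: ConsensusInterface api Identity -> Maybe Int
latestSlot = runIdentity . ciGetLatestSlot
```

With this shape, generic diffusion code never mentions Cardano types, while Cardano-specific code can still reach its callbacks through `ciExtraAPI`.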
The P2P components that carry the most Cardano Node implementation details, and thus cannot easily be made completely general, are the Outbound Governor, the Churn Governor and the Local/Public Root Peers Provider. The best way to separate concerns and provide a general API for third-party users is to factor the Cardano-specific types and parameters of these components out into separate modules. With a good enough module structure, we can provide a clean and simple API for diffusion instantiation just by importing the right modules.
With this in mind, a good way to do this would be to:
- Add all extension points mentioned above in `Cardano.{Diffusion, Node, PeerSelection}`
- Move P2P diffusion instantiation to one of these folders/modules
- Extend `Ouroboros.Network.Diffusion` to accommodate third-party diffusion instantiations, like:
data P2P = P2P        -- ^ General P2P mode. Can be instantiated with custom
                      -- data types
         | NonP2P     -- ^ Cardano non-P2P mode. Deprecated
         | P2PCardano -- ^ Cardano P2P mode.

data ExtraTracers (p2p :: P2P) extraState extraFlags extraPeers m where
  P2PTracers
    :: Common.TracersExtra
         RemoteAddress NodeToNodeVersion   NodeToNodeVersionData
         LocalAddress  NodeToClientVersion NodeToClientVersionData
         IOException extraState extraState extraFlags extraPeers m
    -> ExtraTracers 'P2P extraState extraFlags extraPeers m

  P2PCardanoTracers
    :: Common.TracersExtra
         RemoteAddress NodeToNodeVersion   NodeToNodeVersionData
         LocalAddress  NodeToClientVersion NodeToClientVersionData
         IOException CardanoPeerSelectionState CardanoPeerSelectionState
         PeerTrustable (CardanoPublicRootPeers RemoteAddress) m
    -> ExtraTracers 'P2PCardano CardanoPeerSelectionState PeerTrustable
                    (CardanoPublicRootPeers RemoteAddress) m

  NonP2PTracers
    :: NonP2P.TracersExtra
    -> ExtraTracers 'NonP2P extraState extraFlags extraPeers m

data ArgumentsExtra (p2p :: P2P) extraArgs extraPeers m where
  P2PArguments
    :: Common.ArgumentsExtra extraArgs extraPeers m
    -> ArgumentsExtra 'P2P extraArgs extraPeers m

  P2PCardanoArguments
    :: Common.ArgumentsExtra (CardanoArgumentsExtra m) PeerTrustable m
    -> ArgumentsExtra 'P2PCardano (CardanoArgumentsExtra m) PeerTrustable m

  NonP2PArguments
    :: NonP2P.ArgumentsExtra
    -> ArgumentsExtra 'NonP2P extraArgs extraPeers m

data Applications (p2p :: P2P) extraAPI m a where
  -- The general P2P case is parameterised over the caller's own extraAPI.
  P2PApplications
    :: Common.Applications
         RemoteAddress NodeToNodeVersion   NodeToNodeVersionData
         LocalAddress  NodeToClientVersion NodeToClientVersionData
         extraAPI m a
    -> Applications 'P2P extraAPI m a

  P2PCardanoApplications
    :: Common.Applications
         RemoteAddress NodeToNodeVersion   NodeToNodeVersionData
         LocalAddress  NodeToClientVersion NodeToClientVersionData
         (CardanoLedgerPeersConsensusInterface m) m a
    -> Applications 'P2PCardano (CardanoLedgerPeersConsensusInterface m) m a

  NonP2PApplications
    :: Common.Applications
         RemoteAddress NodeToNodeVersion   NodeToNodeVersionData
         LocalAddress  NodeToClientVersion NodeToClientVersionData
         () m a
    -> Applications 'NonP2P () m a

data ApplicationsExtra (p2p :: P2P) ntnAddr m a where
  P2PApplicationsExtra
    :: Common.ApplicationsExtra ntnAddr m a
    -> ApplicationsExtra 'P2P ntnAddr m a

  P2PCardanoApplicationsExtra
    :: Common.ApplicationsExtra ntnAddr m a
    -> ApplicationsExtra 'P2PCardano ntnAddr m a

  NonP2PApplicationsExtra
    :: NonP2P.ApplicationsExtra
    -> ApplicationsExtra 'NonP2P ntnAddr m a

run :: forall (p2p :: P2P) extraArgs extraState extraFlags extraPeers extraAPI a.
       Tracers RemoteAddress NodeToNodeVersion LocalAddress NodeToClientVersion IO
    -> ExtraTracers p2p extraState extraFlags extraPeers IO
    -> Arguments IO Socket RemoteAddress LocalSocket LocalAddress
    -> ArgumentsExtra p2p extraArgs extraFlags IO
    -> Applications p2p extraAPI IO a
    -> ApplicationsExtra p2p RemoteAddress IO a
    -> IO ()
run _ (P2PTracers _) _ (P2PArguments _) (P2PApplications _) (P2PApplicationsExtra _) =
  ThirdParty.run ...
run tracers (P2PCardanoTracers tracersExtra)
            args (P2PCardanoArguments argsExtra)
            (P2PCardanoApplications apps)
            (P2PCardanoApplicationsExtra appsExtra) =
  void $ P2P.run tracers tracersExtra args argsExtra apps appsExtra
run tracers (NonP2PTracers tracersExtra)
            args (NonP2PArguments argsExtra)
            (NonP2PApplications apps)
            (NonP2PApplicationsExtra appsExtra) = do
  NonP2P.run tracers tracersExtra args argsExtra apps appsExtra
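The mechanism used by the `run` sketch above, a promoted mode tag shared by several GADTs, can be reduced to a small self-contained example. All names here (`Mode`, `Args`, `Apps`, `CardanoExtra`) are hypothetical stand-ins chosen for illustration, and `run` returns a `String` instead of starting any network component.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

-- Hypothetical mode tag, promoted to the kind level by DataKinds.
data Mode = General | Cardano | Legacy

-- Stand-in for Cardano-specific arguments.
newtype CardanoExtra = CardanoExtra { ceUseBootstrapPeers :: Bool }

-- Each constructor fixes the mode index; the Cardano one also fixes `extra`.
data Args (mode :: Mode) extra where
  GeneralArgs :: extra -> Args 'General extra
  CardanoArgs :: CardanoExtra -> Args 'Cardano CardanoExtra
  LegacyArgs  :: Args 'Legacy extra

data Apps (mode :: Mode) where
  GeneralApps :: String -> Apps 'General
  CardanoApps :: String -> Apps 'Cardano
  LegacyApps  :: String -> Apps 'Legacy

-- Both arguments share `mode`, so only matching pairs type-check and the
-- three equations below cover every well-typed call.
run :: Args mode extra -> Apps mode -> String
run (GeneralArgs _) (GeneralApps name) = "third-party diffusion: " ++ name
run (CardanoArgs e) (CardanoApps name) =
  "cardano diffusion: " ++ name ++ suffix
  where
    suffix = if ceUseBootstrapPeers e then " (bootstrap peers)" else ""
run LegacyArgs (LegacyApps name) = "legacy diffusion: " ++ name
```

A call such as `run (GeneralArgs ()) (CardanoApps "node")` is rejected by the type checker, which is the property the indexed records are meant to provide: a third-party user cannot accidentally mix general arguments with Cardano-specific applications.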
<!-- TOC --><a name="outbound-governor-peer-selection-governor"></a>
#### Outbound Governor (Peer Selection Governor)
The peer selection governor manages the discovery and selection of upstream peers. It operates based on a set of targets (`PeerSelectionTargets`) and attempts to meet them through a series of monitoring actions. For example, if the governor is below its target for established peers, it will select a peer from its known set to connect to. Conversely, if the governor exceeds its target for hot peers, it will demote peers according to predefined metrics.
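The target-driven behaviour described above can be sketched as a pure function from targets and current counts to decisions. This is a deliberately simplified model under stated assumptions: the `Targets`, `Counts` and `Decision` types are hypothetical, only two of the governor's actions are modelled, and the real governor chooses concrete peers via policies rather than just counting.

```haskell
-- Hypothetical, reduced version of the governor's targets.
data Targets = Targets
  { targetEstablished :: Int
  , targetActive      :: Int
  }

-- Current peer counts as seen by the governor (simplified).
data Counts = Counts
  { coldKnown   :: Int  -- known peers we are not connected to
  , established :: Int  -- warm and hot peers
  , active      :: Int  -- hot peers
  }

data Decision
  = PromoteColdToWarm Int  -- connect to this many known peers
  | DemoteHotToWarm Int    -- deactivate this many hot peers
  deriving (Eq, Show)

-- Compare counts against targets and emit the corrective decisions.
decide :: Targets -> Counts -> [Decision]
decide t c =
  [ PromoteColdToWarm n
  | let n = min (targetEstablished t - established c) (coldKnown c)
  , n > 0
  ]
  ++
  [ DemoteHotToWarm n
  | let n = active c - targetActive t
  , n > 0
  ]
```

For example, with an established target of 5 and an active target of 2, a state with 3 established and 4 active peers yields one promotion of 2 peers and one demotion of 2 peers; a state that already meets both targets yields no decisions.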
Currently, the Outbound Governor is coupled with Cardano node-specific parameters like `ConsensusMode`, `ConsensusModePeerTargets`, `LedgerPeerSnapshot`, and `LedgerStateJudgement`. These parameters are scattered across `LocalRootPeers`, `PublicRootPeers`, `PeerSelectionActions`, `PeerSelectionInterfaces`, and `PeerSelectionState`. However, by abstracting these into an additional record, we can separate application-specific logic:
```haskell
data LocalRootPeers extraFlags peeraddr =
LocalRootPeers
-- | Here extraFlags allow 3rd party users to enhance their local peers
(Map peeraddr (PeerAdvertise, PeerTrustable, extraFlags))
[(HotValency, WarmValency, Set peeraddr)]
data PublicRootPeers peeraddr =
PublicRootPeers {
-- | Configuration aka Public Config Peers should not be needed anymore
-- getPublicConfigPeers :: !(Map peeraddr PeerAdvertise)
getLedgerPeers :: !(Set peeraddr)
, getBootstrapPeers :: !(Set peeraddr)
, getBigLedgerPeers :: !(Set peeraddr)
}
-- This moves readUseLedgerPeers from PeerSelectionInterfaces to here
data CardanoPeerSelectionActions m =
CardanoPeerSelectionActions {
cpsaReadLedgerPeerSnapshot :: STM m (Maybe LedgerPeerSnapshot)
, cpsaPeerTargets :: ConsensusModePeerTargets
}
type Config extraLocalRootPeersFlags peeraddr =
[(HotValency, WarmValency, Map peeraddr ( PeerAdvertise, extraLocalRootPeersFlags))]
data PeerSelectionActions extraActions consensusAPI extraLocalRootPeersFlags
peeraddr peerconn m =
PeerSelectionActions {
readPeerSelectionTargets :: STM m PeerSelectionTargets
, readLocalRootPeers :: STM m (LocalRootPeers.Config extraLocalRootPeersFlags
peeraddr)
, readInboundPeers :: m (Map peeraddr PeerSharing)
, peerSharing :: PeerSharing
, peerConnToPeerSharing :: peerconn -> PeerSharing
, requestPublicRootPeers :: LedgerPeersKind
-> Int
-> m ( PublicRootPeers peeraddr , DiffTime)
, requestPeerShare :: PeerSharingAmount -> peeraddr -> m (PeerSharingResult peeraddr)
, peerStateActions :: PeerStateActions peeraddr peerconn m
, getLedgerStateCtx :: LedgerPeersConsensusInterface consensusAPI m
, readUseBootstrapPeers :: STM m UseBootstrapPeers
, readUseLedgerPeers :: STM m UseLedgerPeers
, getExtraActions :: extraActions
}
-- readUseLedgerPeers was moved to CardanoPeerSelectionActions
data PeerSelectionInterfaces peeraddr peerconn m =
PeerSelectionInterfaces {
countersVar :: StrictTVar m PeerSelectionCounters
, publicStateVar :: StrictTVar m (PublicPeerSelectionState peeraddr)
, debugStateVar :: StrictTVar m (PeerSelectionState peeraddr peerconn)
}
data CardanoPeerSelectionState =
CardanoPeerSelectionState {
ledgerStateJudgement :: !LedgerStateJudgement
, consensusMode :: !ConsensusMode
, hasOnlyBootstrapPeers :: !Bool
, ledgerPeerSnapshot :: Maybe LedgerPeerSnapshot
}
data PeerSelectionState extraState extraLocalRootPeersFlags
peeraddr peerconn =
PeerSelectionState {
targets :: !PeerSelectionTargets
, localRootPeers :: !(LocalRootPeers extraLocalRootPeersFlags peeraddr)
, publicRootPeers :: !(PublicRootPeers peeraddr)
, knownPeers :: !(KnownPeers peeraddr)
, establishedPeers :: !(EstablishedPeers peeraddr peerconn)
, activePeers :: !(Set peeraddr)
, publicRootBackoffs :: !Int
, publicRootRetryTime :: !Time
, inProgressPublicRootsReq :: !Bool
, bigLedgerPeerBackoffs :: !Int
, bigLedgerPeerRetryTime :: !Time
, inProgressBigLedgerPeersReq :: !Bool
, inProgressPeerShareReqs :: !Int
, inProgressPromoteCold :: !(Set peeraddr)
, inProgressPromoteWarm :: !(Set peeraddr)
, inProgressDemoteWarm :: !(Set peeraddr)
, inProgressDemoteHot :: !(Set peeraddr)
, inProgressDemoteToCold :: !(Set peeraddr)
, stdGen :: !StdGen
, inboundPeersRetryTime :: !Time
, bootstrapPeersTimeout :: !(Maybe Time)
, bootstrapPeersFlag :: !UseBootstrapPeers
, minBigLedgerPeersForTrustedState :: MinBigLedgerPeersForTrustedState
, extraState :: !extraState
}
data CardanoPeerSelectionArguments =
CardanoPeerSelectionArguments {
cnpsaConsensusMode :: ConsensusMode
}
data PeerSelectionArguments extraArgs extraActions consensusAPI
extraLocalRootPeersFlags peeraddr peerconn m =
PeerSelectionArguments {
psaPeerSelectionTracer :: Tracer m (TracePeerSelection peeraddr)
, psaDebugPeerSelectionTracer :: Tracer m (DebugPeerSelection peeraddr)
, psaPeerSelectionCountersTracer :: Tracer m PeerSelectionCounters
, psaFuzzRng :: StdGen
, psaPeerSelectionActions :: PeerSelectionActions extraActions consensusAPI
extraLocalRootPeersFlags
peeraddr peerconn m
, psaPeerSelectionPolicy :: PeerSelectionPolicy peeraddr m
, psaPeerSelectionInterfaces :: PeerSelectionInterfaces peeraddr peerconn m
, psaMinBigLedgerPeersForTrustedState :: MinBigLedgerPeersForTrustedState
, psaExtraArgs :: extraArgs
}
```
Perhaps the `PeerSelectionPolicy` record should also be extensible, so that third-party users can write and use their own policies.
peerSelectionGovernor
:: ( Alternative (STM m)
, MonadAsync m
, MonadDelay m
, MonadLabelledSTM m
, MonadMask m
, MonadTimer m
, Ord peeraddr
, Show peerconn
, Hashable peeraddr
)
=> PeerSelectionArguments extraArgs extraActions consensusAPI
extraLocalRootPeersFlags peeraddr peerconn m
-> m Void
peerSelectionGovernor
PeerSelectionArguments {
psaPeerSelectionTracer
, psaDebugPeerSelectionTracer
, psaPeerSelectionCountersTracer
, psaFuzzRng
, psaPeerSelectionActions
, psaPeerSelectionPolicy
, psaPeerSelectionInterfaces
, psaMinBigLedgerPeersForTrustedState
, psaExtraArgs
} =
JobPool.withJobPool $ \jobPool ->
peerSelectionGovernorLoop
psaPeerSelectionTracer
psaDebugPeerSelectionTracer
psaPeerSelectionCountersTracer
psaPeerSelectionActions
psaPeerSelectionPolicy
psaPeerSelectionInterfaces
psaMinBigLedgerPeersForTrustedState
jobPool
(emptyPeerSelectionState psaFuzzRng psaExtraArgs)
The Peer Selection Governor consists of a series of guarded decisions known as monitoring actions. These actions, either blocking or non-blocking, guide the governor's decision-making process. The order of execution is crucial. Although the current governor has Cardano-specific actions, third-party users may need to customize their monitoring actions.
Below is a minimal (without Cardano-specific actions) set of monitoring actions, sorted by execution order:
- Blocking decisions:
  1. `connections`
  2. `jobs`
  3. `targetPeers`
  4. `localRoots`
- Non-blocking decisions:
  5. `BigLedgerPeers.belowTarget`
  6. `BigLedgerPeers.aboveTarget`
  7. `RootPeers.belowTarget`
  8. `KnownPeers.belowTarget`
  9. `KnownPeers.aboveTarget`
  10. `EstablishedPeers.belowTarget`
  11. `EstablishedPeers.aboveTarget`
  12. `ActivePeers.belowTarget`
  13. `ActivePeers.aboveTarget`
The minimal set of monitoring actions above is not fully decoupled from Cardano node-specific details. Below is a list of actions that depend on Cardano-specific details:
- Blocking decisions:
  - (3) `targetPeers`: Requires access to `ledgerStateJudgement` and the specific `ConsensusModePeerTargets` (for the Genesis implementation). These help decide which set of targets to switch to.
  - (4) `localRoots`: Requires access to `ledgerStateJudgement` to decide when to skip this action and maintain Genesis invariants regarding local and trusted peers.
- Non-blocking decisions:
  - (5) `BigLedgerPeers.belowTarget`: Requires access to `ledgerStateJudgement` to determine when to skip this action.
  - (8) `KnownPeers.belowTarget`: Requires `ledgerStateJudgement` for action skipping.
  - (10) `EstablishedPeers.belowTarget`: Requires `ledgerStateJudgement` for action skipping.
  - (12) `ActivePeers.belowTarget`: Depends on `ledgerStateJudgement` to manage the `belowTargetBigLedgerPeers` action.
The importance of `targetPeers` and `localRoots` could be reconsidered, allowing third-party users to decide how to manage their targets and local roots, since these two are the only actions whose logic depends on Cardano-specific details. For the other actions, the coupling is only external: `ledgerStateJudgement` is used as a guard that decides whether the action runs at all.
Since the general case is not obvious, providing a clear and simple outbound governor API for third-party users is not easy. One option is to do just enough to support the Cardano node case, which only requires a way to add extra guarding conditions to the `belowTarget` actions. However, if third-party users require finer control over their monitoring actions, a more customizable approach might be beneficial, allowing full control through a simple API:
-- A single monitoring action; users supply a list of these.
type MonitoringAction extraState extraActions consensusAPI
                      extraLocalRootPeersFlags peeraddr peerconn m =
     PeerSelectionPolicy peeraddr m
  -> PeerSelectionActions extraActions consensusAPI extraLocalRootPeersFlags
                          peeraddr peerconn m
  -> PeerSelectionState extraState extraLocalRootPeersFlags peeraddr peerconn
  -> Guarded (STM m) (TimedDecision m peeraddr peerconn)

-- policy, actions and jobPool are assumed to be in scope, as in the governor loop.
guardedDecisions :: PeerSelectionState extraState extraLocalRootPeersFlags
                                       peeraddr peerconn
                 -> [MonitoringAction extraState extraActions consensusAPI
                                      extraLocalRootPeersFlags peeraddr peerconn
                                      m]
                 -> Guarded (STM m) (TimedDecision m peeraddr peerconn)
guardedDecisions st monitoringActions =
     Monitor.jobs jobPool st
  <> foldMap (\a -> a policy actions st) monitoringActions
This API would grant the user full control, with a caveat: they must ensure not to introduce errors, but it would also make debugging easier since the user controls the governor's code. A minimal API could be provided with the essential monitoring actions as a baseline, allowing for customization while avoiding redundancy.
Alternatively, an inversion-of-control approach could be implemented. The minimal set of essential operations would still be defined, but users could extend these by providing their own callbacks without compromising the core actions. Although this would complicate the API, it’s unclear how flexible this approach would be for future needs.
To extend monitoring actions, users could insert them either before or after the essential blocking and non-blocking actions:
data ExtraGuardedDecisions extraState extraActions consensusAPI
extraLocalRootPeersFlags peeraddr peerconn m =
ExtraGuardedDecisions {
-- | This list of guarded decisions will come before all default possibly
-- blocking decisions. The order matters, making the first decisions
-- have priority over the later ones.
--
-- Note that these actions should be blocking.
preBlocking :: [MonitoringAction extraState extraActions
consensusAPI extraLocalRootPeersFlags
peeraddr peerconn m]
-- | This list of guarded decisions will come after all possibly preBlocking
-- and default blocking decisions. The order matters, making the first
-- decisions have priority over the later ones.
--
-- Note that these actions should be blocking.
, postBlocking :: [MonitoringAction extraState extraActions
consensusAPI extraLocalRootPeersFlags
peeraddr peerconn m]
-- | This list of guarded decisions will come before all default non-blocking
-- decisions. The order matters, making the first decisions have priority over
-- the later ones.
--
-- Note that these actions should not be blocking.
, preNonBlocking :: [MonitoringAction extraState extraActions
consensusAPI extraLocalRootPeersFlags
peeraddr peerconn m]
-- | This list of guarded decisions will come after all preNonBlocking and
-- default non-blocking decisions. The order matters, making the first
-- decisions have priority over the later ones.
--
-- Note that these actions should not be blocking.
, postNonBlocking :: [MonitoringAction extraState extraActions
consensusAPI extraLocalRootPeersFlags
peeraddr peerconn m]
}
An example of Cardano node monitoring actions could look like this:
cardanoNodeMonitoringActions
:: ExtraGuardedDecisions extraState extraActions consensusAPI
extraLocalRootPeersFlags peeraddr peerconn m
cardanoNodeMonitoringActions = ExtraGuardedDecisions {
preBlocking = [ \_ psa pst -> monitorBootstrapPeersFlag psa pst
, \_ psa pst -> monitorLedgerStateJudgement psa pst
, \_ _ pst -> waitForSystemToQuiesce pst
]
, postBlocking = [ \_ psa pst -> ledgerPeerSnapshotChange pst psa
]
, preNonBlocking = []
, postNonBlocking = []
}
guardedDecisions :: Time
-> PeerSelectionState extraState extraLocalRootPeersFlags
peeraddr peerconn
-> Map peeraddr PeerSharing
-> ExtraGuardedDecisions extraState extraActions consensusAPI
extraLocalRootPeersFlags peeraddr peerconn m
-> Guarded (STM m) (TimedDecision m peeraddr peerconn)
guardedDecisions blockedAt st inboundPeers ExtraGuardedDecisions {...} =
-- Make sure preBlocking set is in the right place
foldMap (\a -> a policy actions st) preBlocking
<> Monitor.connections actions st
<> Monitor.jobs jobPool st
<> Monitor.targetPeers actions st
<> Monitor.localRoots actions st
-- Make sure postBlocking set is in the right place
<> foldMap (\a -> a policy actions st) postBlocking
-- Make sure preNonBlocking set is in the right place
<> foldMap (\a -> a policy actions st) preNonBlocking
<> BigLedgerPeers.belowTarget actions blockedAt st
<> BigLedgerPeers.aboveTarget policy st
<> RootPeers.belowTarget actions blockedAt st
<> KnownPeers.belowTarget actions blockedAt
inboundPeers policy st
<> KnownPeers.aboveTarget policy st
<> EstablishedPeers.belowTarget actions policy st
<> EstablishedPeers.aboveTarget actions policy st
<> ActivePeers.belowTarget actions policy st
<> ActivePeers.aboveTarget actions policy st
-- Make sure postNonBlocking set is in the right place
foldMap (\a -> a policy actions st) postNonBlocking
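The priority ordering that the four hook lists rely on can be illustrated with a small, self-contained sketch. It is a toy model, not the library's `Guarded` type: decisions are plain strings wrapped in `First` from `Data.Monoid`, so the leftmost action that yields a decision wins. The names `ToyAction`, `ToyExtraDecisions`, `toyGuardedDecisions` and `whenAbove` are hypothetical and exist only for this example.

```haskell
import Data.Monoid (First (..))

-- | Toy monitoring action: inspects the state and may yield a decision.
-- 'First' keeps the leftmost 'Just', which models "earlier entries win".
type ToyAction st = st -> First String

-- | Hypothetical extension record mirroring the four hook lists.
data ToyExtraDecisions st = ToyExtraDecisions
  { preBlocking     :: [ToyAction st]
  , postBlocking    :: [ToyAction st]
  , preNonBlocking  :: [ToyAction st]
  , postNonBlocking :: [ToyAction st]
  }

-- | Interleave the custom hooks with the built-in blocking and
-- non-blocking actions, in the same order as in the text above.
toyGuardedDecisions
  :: ToyExtraDecisions st
  -> [ToyAction st]   -- ^ built-in blocking actions
  -> [ToyAction st]   -- ^ built-in non-blocking actions
  -> st
  -> Maybe String
toyGuardedDecisions extra blocking nonBlocking st =
  getFirst $
       foldMap ($ st) (preBlocking extra)
    <> foldMap ($ st) blocking
    <> foldMap ($ st) (postBlocking extra)
    <> foldMap ($ st) (preNonBlocking extra)
    <> foldMap ($ st) nonBlocking
    <> foldMap ($ st) (postNonBlocking extra)

-- | Toy action that fires when the state exceeds a limit.
whenAbove :: Int -> String -> ToyAction Int
whenAbove limit label n
  | n > limit = First (Just label)
  | otherwise = First Nothing

-- | A custom pre-blocking hook takes priority over every built-in action.
example :: Int -> Maybe String
example =
  toyGuardedDecisions extra
    [whenAbove 5 "builtin-blocking"]
    [whenAbove 0 "builtin-non-blocking"]
  where
    extra = ToyExtraDecisions
      { preBlocking     = [whenAbove 10 "custom-pre-blocking"]
      , postBlocking    = []
      , preNonBlocking  = []
      , postNonBlocking = [whenAbove (-1) "custom-post-non-blocking"]
      }
```

With state `11` every action fires, but the custom pre-blocking hook wins. With state `0` only the post-non-blocking hook fires, so it is chosen last.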
The options presented in the previous section are not conclusive in determining the best course of action. As the project evolves and more information is gathered, particularly regarding the stakeholders and their specific requirements, greater clarity will emerge. With this additional insight, it will become easier to assess the trade-offs and make a more informed decision about the most suitable approach.
The Churn Governor is responsible for rotating peers in the diffusion layer. This process is not modular and has Cardano node-specific dependencies such as BlockFetchMode and ConsensusMode. Ideally, the Churn Governor should be decoupled from these specifics, but it is also possible to offer a simplified API for third-party users who just need basic churning.
Interestingly, PeerMetrics, which is an argument of PeerChurnArgs, is not used in the churn logic. The peer metrics are used by simplePeerSelectionPolicy, which the peer selection governor uses in policyPickHotPeersToDemote, so they only matter for hot demotion and have nothing to do with churn. That field of PeerChurnArgs is never used, so this can and should be refactored too.
Here's an abstraction of the Cardano-specific PeerChurnArguments:
data CardanoPeerChurnArgs m =
CardanoPeerChurnArgs {
cpcaModeVar :: StrictTVar m ChurnMode
, cpcaReadFetchMode :: STM m FetchMode
, cpcaPeerTargets :: ConsensusModePeerTargets
, cpcaReadUseBootstrap :: STM m UseBootstrapPeers
, cpcaConsensusMode :: ConsensusMode
}
data PeerChurnArgs m extraArgs extraDebugState extraFlags extraPeers extraAPI peeraddr =
PeerChurnArgs {
pcaPeerSelectionTracer :: Tracer m (TracePeerSelection extraDebugState extraFlags extraPeers peeraddr)
, pcaChurnTracer :: Tracer m ChurnCounters
, pcaDeadlineInterval :: DiffTime
, pcaBulkInterval :: DiffTime
, pcaPeerRequestTimeout :: DiffTime
, pcaMetrics :: PeerMetrics m peeraddr
, pcaRng :: StdGen
, pcaPeerSelectionVar :: StrictTVar m PeerSelectionTargets
, pcaReadCounters :: STM m PeerSelectionCounters
, getLedgerStateCtx :: LedgerPeersConsensusInterface extraAPI m
, getLocalRootHotTarget :: STM m HotValency
, getExtraArgs :: extraArgs
}
By abstracting the Cardano-specific parameters, we can provide two implementations: one tightly coupled with CardanoPeerChurnArgs and another for basic churn functionality.
-- | Promoted data type.
data ChurnType = BasicChurn | CardanoChurn
-- GADT syntax: each constructor fixes the churnType index.
data PeerChurnArgs (churnType :: ChurnType) m consensusAPI peeraddr where
  BasicChurnArgs   :: PeerChurnArgs' m () consensusAPI peeraddr
                   -> PeerChurnArgs BasicChurn m consensusAPI peeraddr
  CardanoChurnArgs :: PeerChurnArgs' m (CardanoPeerChurnArgs m) consensusAPI peeraddr
                   -> PeerChurnArgs CardanoChurn m consensusAPI peeraddr
peerChurnGovernor :: forall m peeraddr.
( MonadDelay m
, Alternative (STM m)
, MonadTimer m
, MonadCatch m
)
=> PeerChurnArgs churnType m consensusAPI peeraddr
-> m Void
peerChurnGovernor (CardanoChurnArgs PeerChurnArgs { ... }) = ...
peerChurnGovernor (BasicChurnArgs PeerChurnArgs { ... }) = ...
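The type-indexed arguments described above can be tried out in isolation with the following sketch. It assumes heavily reduced stand-ins: `CommonArgs`, `CardanoArgs`, `ChurnArgs` and `describeChurn` are hypothetical names, and the real records carry tracers, STM variables and intervals that are omitted here.

```haskell
{-# LANGUAGE DataKinds      #-}
{-# LANGUAGE GADTs          #-}
{-# LANGUAGE KindSignatures #-}

-- | Promoted to the kind level by DataKinds.
data ChurnType = BasicChurn | CardanoChurn

-- | Hypothetical stand-in for the shared churn arguments.
data CommonArgs extraArgs = CommonArgs
  { deadlineInterval :: Int        -- ^ seconds (toy unit)
  , extraArgs        :: extraArgs  -- ^ extension point
  }

-- | Hypothetical stand-in for the Cardano-only arguments.
newtype CardanoArgs = CardanoArgs { useBootstrapPeers :: Bool }

-- | Each constructor fixes both the index and the extra-arguments type.
data ChurnArgs (churnType :: ChurnType) where
  BasicChurnArgs   :: CommonArgs ()          -> ChurnArgs 'BasicChurn
  CardanoChurnArgs :: CommonArgs CardanoArgs -> ChurnArgs 'CardanoChurn

-- | A single entry point that dispatches on the constructor.
describeChurn :: ChurnArgs churnType -> String
describeChurn (BasicChurnArgs common) =
  "basic churn every " ++ show (deadlineInterval common) ++ "s"
describeChurn (CardanoChurnArgs common) =
  "cardano churn every " ++ show (deadlineInterval common)
    ++ "s, bootstrap peers: " ++ show (useBootstrapPeers (extraArgs common))

-- | Only accepts Cardano arguments; passing 'BasicChurnArgs' is a type error.
cardanoOnly :: ChurnArgs 'CardanoChurn -> Bool
cardanoOnly (CardanoChurnArgs common) = useBootstrapPeers (extraArgs common)
```

The index lets functions such as `cardanoOnly` rule out the basic variant at compile time, at the cost of exposing the promoted `ChurnType` in every signature.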
A more straightforward approach is to have two implementations, a general one and a Cardano Node-specific one, and to select the right one when instantiating diffusion, as mentioned in Cardano Node Implementation Specific Details.
The Peer Selection Actions module provides the withPeerSelectionActions function, which initializes the ledger peers and local root providers threads and creates a PeerSelectionActions record. This record, which includes the requestPublicRootPeers function along with other fields, is primarily used by the Outbound Governor. Notably, requestPublicRootPeers contains Cardano node-specific implementation details that should be abstracted so that it works with any PublicRootPeers parameterised by extraPeers.
One key observation is that the PeerSelectionActionsArgs and PeerSelectionActions records contain overlapping information and are only used within this context, with no other components depending on them. Therefore, they can be merged into a single record:
data PeerSelectionActions extraActions extraPeers extraFlags extraAPI peeraddr peerconn m =
PeerSelectionActions {
readPeerSelectionTargets :: STM m PeerSelectionTargets,
-- | Provides the initial configuration of locally or privately known root peers.
--
-- Sourced from 'ArgumentsExtra' during Diffusion initialization.
--
readOriginalLocalRootPeers :: STM m (LocalRootPeers.Config extraFlags RelayAccessPoint),
readLocalRootPeers :: STM m (LocalRootPeers.Config extraFlags peeraddr),
requestPublicRootPeers :: LedgerPeersKind -> Int -> m (PublicRootPeers extraPeers peeraddr, DiffTime),
peerStateActions :: PeerStateActions peeraddr peerconn m,
getLedgerStateCtx :: LedgerPeersConsensusInterface extraAPI m,
readLedgerPeerSnapshot :: STM m (Maybe LedgerPeerSnapshot),
-- | Extension point for third-party users to incorporate additional actions.
--
extraActions :: extraActions
}
Since requestPublicRootPeers relies on a function provided by ledgerPeerThread, we need to change withPeerSelectionActions so that it takes a callback, which accommodates this dependency and generalizes the implementation:
withPeerSelectionActions
:: forall extraActions extraPeers extraFlags extraAPI peeraddr peerconn resolver exception m a.
( Alternative (STM m)
, MonadAsync m
, MonadDelay m
, MonadThrow m
, Ord peeraddr
, Exception exception
, Eq extraFlags
)
=> Tracer m (TraceLocalRootPeers extraFlags peeraddr exception)
-> StrictTVar m (Config extraFlags peeraddr)
-> PeerActionsDNS peeraddr resolver exception m
-> ( (NumberOfPeers -> LedgerPeersKind -> m (Maybe (Set peeraddr, DiffTime)))
-> PeerSelectionActions extraActions extraPeers extraFlags extraAPI peeraddr peerconn m)
-> WithLedgerPeersArgs extraAPI m
-> ( (Async m Void, Async m Void)
-> PeerSelectionActions extraActions extraPeers extraFlags extraAPI peeraddr peerconn m
-> m a)
-> m a
withPeerSelectionActions
localTracer
localRootsVar
paDNS
getPeerSelectionActions
ledgerPeersArgs
k = do
withLedgerPeers
paDNS
ledgerPeersArgs
(\getLedgerPeers lpThread -> do
let peerSelectionActions@PeerSelectionActions
{ readOriginalLocalRootPeers
} = getPeerSelectionActions getLedgerPeers
withAsync
(localRootPeersProvider
localTracer
paDNS
DNS.defaultResolvConf
readOriginalLocalRootPeers
localRootsVar)
(\lrppThread -> k (lpThread, lrppThread) peerSelectionActions))
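The callback shape used above can be reduced to a few lines. The sketch below is a simplification under stated assumptions: it omits the provider threads, DNS and tracing entirely, and `ToyActions`, `withToyLedgerPeers` and `withToyActions` are hypothetical names. It only shows how the caller builds the actions record from a getter that does not exist until the inner scope has been set up.

```haskell
import Data.IORef (newIORef, readIORef)

-- | Hypothetical, reduced actions record with a single field.
newtype ToyActions = ToyActions
  { requestPeers :: Int -> IO [String] }

-- | Toy stand-in for the ledger peers scope: it owns the peer source and
-- hands a getter to the continuation.
withToyLedgerPeers :: ((Int -> IO [String]) -> IO a) -> IO a
withToyLedgerPeers k = do
  ref <- newIORef ["relay-a", "relay-b", "relay-c"]
  let getLedgerPeers n = take n <$> readIORef ref
  k getLedgerPeers

-- | The caller supplies a function from the getter to the actions record,
-- so the record can depend on something created inside this scope.
withToyActions
  :: ((Int -> IO [String]) -> ToyActions)
  -> (ToyActions -> IO a)
  -> IO a
withToyActions mkActions k =
  withToyLedgerPeers $ \getLedgerPeers ->
    k (mkActions getLedgerPeers)

-- | A third-party instantiation that tags every ledger peer it receives.
exampleRequest :: IO [String]
exampleRequest =
  withToyActions
    (\getLedgerPeers -> ToyActions
        { requestPeers = \n -> map ("ledger:" ++) <$> getLedgerPeers n })
    (\actions -> requestPeers actions 2)
```

Because the record is constructed by the caller, Cardano and third-party users can each decide how `requestPeers` combines the ledger getter with their own peer sources.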
To further increase flexibility, this module should define general getPublicRootPeers and getPeerShare functions, enabling both Cardano and third-party applications to use them for their specific requirements.
getPublicRootPeers
:: ( Monad m
, Monoid extraPeers
)
=> (NumberOfPeers -> m (extraPeers, DiffTime))
-> (NumberOfPeers -> LedgerPeersKind -> m (Maybe (Set peeraddr, DiffTime)))
-> LedgerPeersKind
-> Int
-> m (PublicRootPeers extraPeers peeraddr, DiffTime)
getPublicRootPeers getExtraPeers getLedgerPeers ledgerPeersKind n = do
mbLedgerPeers <- getLedgerPeers (NumberOfPeers $ fromIntegral n) ledgerPeersKind
case mbLedgerPeers of
Nothing -> do
(extraPeers, dt) <- getExtraPeers (NumberOfPeers $ fromIntegral n)
pure (PublicRootPeers.empty extraPeers, dt)
Just (ledgerPeers, dt) ->
case ledgerPeersKind of
AllLedgerPeers ->
pure (PublicRootPeers.fromLedgerPeers ledgerPeers, dt)
BigLedgerPeers ->
pure (PublicRootPeers.fromBigLedgerPeers ledgerPeers, dt)
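To see how a third party might plug its own peer source into this function, here is a self-contained sketch using reduced stand-ins. `ToyPublicRootPeers` and `toyGetPublicRootPeers` are hypothetical, plain `Int` values replace `NumberOfPeers` and `DiffTime`, and the "seed peers" source is invented for illustration. The control flow mirrors the definition above: extra peers are only consulted when no ledger peers are available.

```haskell
import           Data.Functor.Identity (runIdentity)
import           Data.Set (Set)
import qualified Data.Set as Set

data LedgerPeersKind = AllLedgerPeers | BigLedgerPeers

-- | Reduced stand-in for 'PublicRootPeers'.
data ToyPublicRootPeers extraPeers peeraddr = ToyPublicRootPeers
  { ledgerPeers    :: Set peeraddr
  , bigLedgerPeers :: Set peeraddr
  , extraPeers     :: extraPeers
  } deriving (Eq, Show)

toyGetPublicRootPeers
  :: (Monad m, Monoid extraPeers)
  => (Int -> m (extraPeers, Int))
  -> (Int -> LedgerPeersKind -> m (Maybe (Set peeraddr, Int)))
  -> LedgerPeersKind
  -> Int
  -> m (ToyPublicRootPeers extraPeers peeraddr, Int)
toyGetPublicRootPeers getExtraPeers getLedgerPeers kind n = do
  mbLedgerPeers <- getLedgerPeers n kind
  case mbLedgerPeers of
    -- No ledger peers available: fall back to the extra peer source.
    Nothing -> do
      (extra, dt) <- getExtraPeers n
      pure (ToyPublicRootPeers Set.empty Set.empty extra, dt)
    Just (peers, dt) ->
      case kind of
        AllLedgerPeers -> pure (ToyPublicRootPeers peers Set.empty mempty, dt)
        BigLedgerPeers -> pure (ToyPublicRootPeers Set.empty peers mempty, dt)

-- | Third-party instantiation: extra peers are a set of seed addresses.
example :: Bool -> (ToyPublicRootPeers (Set String) String, Int)
example ledgerAvailable =
  runIdentity (toyGetPublicRootPeers getSeeds getLedger AllLedgerPeers 2)
  where
    getSeeds n = pure (Set.fromList (take n ["seed-1", "seed-2", "seed-3"]), 60)
    getLedger n _kind
      | ledgerAvailable =
          pure (Just (Set.fromList (take n ["pool-1", "pool-2", "pool-3"]), 30))
      | otherwise = pure Nothing
```

The `Monoid extraPeers` constraint is what allows the ledger branches to return an empty extra component without knowing anything about its structure.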
The other components initialized by the diffusion layer do not depend on Cardano-specific details and can be customized by providing the diffusion arguments at the top level, so they should stay the same.
To fully decouple the diffusion layer and make it reusable for third-party use cases, several important steps are necessary.
-
Stakeholder Identification and Collaboration
- Objective: Identify key stakeholders and gather input on design requirements. This will help in making decisions that may not be straightforward, such as defining protocol features for compatibility and extensibility.
- Actions: Engage with teams such as Mithril to gather feedback, understand integration requirements, and identify any needed enhancements for both Cardano and third-party compatibility.
-
Documentation and API Specification
- Objective: Before restructuring begins, produce comprehensive documentation that describes the current diffusion layer architecture, including its dependencies, interaction points with the consensus layer, and configurable parameters.
- Actions: Create a detailed API reference for the diffusion layer, covering its dependencies, parameter configurations, and modular extension points. This document will serve as a baseline reference for both Cardano developers and external adopters.
-
Network Reorganization, Unit Testing and Modularity
- Objective: Modularize ouroboros-network to support different configurations and implementations for both Cardano-specific and third-party uses.
-
Actions:
- Refactor ouroboros-network to allow users to specify the parameters with which they want to instantiate their own diffusion.
- Refactor ouroboros-network to allow users to specify which protocols they want to run. Redesign the Node-to-Client (N2C) client to retrieve the data needed for diffusion, expanding the N2C protocol where needed.
- Run the full test suite and ensure modular test coverage in the redesigned network. Testing should guarantee that at least the Cardano node configuration is validated.
- Work closely with the performance team to verify that all structural changes maintain high performance for the Cardano node, addressing any performance bottlenecks as they are identified.
-
Abstracting Consensus-Related Logic
- Objective: Define a flexible, clear interface between the consensus and diffusion layers.
- Actions: Abstract consensus interactions (e.g., the N2C RPC protocol) into a standalone interface, minimizing tight coupling with Cardano’s infrastructure. Work with stakeholders to determine the minimal API that should be required.
-
Mithril-Specific Protocol and Network Development
- Objective: Implement protocols specific to the Mithril Network, to allow Mithril to run independently on the new diffusion layer.
-
Actions:
- Implement a Mithril-specific N2C protocol, including versioning, handshake processes, and the mini-protocol for signature submission, following the designs specified in the CIP.
- Design a Node-to-Node protocol for Mithril that supports custom validation rules and a dedicated mempool for signature submission.
- Develop the Mithril churn metric to monitor peer behavior and activity.
- Implement the executable, covering configuration parsing, KES key integration, and support for custom diffusion configuration options, such as peer targets and topology files.
- Explore reusability of Cardano-node code, potentially exposing customizable libraries with APIs that are extensible to fit third-party node requirements.
-
Performance Validation and Final Integration Testing
- Objective: Ensure that the redesigned diffusion layer and ouroboros-network maintain stable performance and operational reliability for both Cardano-specific and third-party uses.
- Actions: Conduct a comprehensive performance analysis in collaboration with the performance team to validate efficiency under both Cardano and generalized configurations. Once stable, perform final integration testing across the Cardano node and Mithril configurations, addressing any regressions or performance challenges to maintain high standards.