Assessing Peer Selection Optimality for Decentralisation
We need an approach to measure the optimality of the peer selection algorithm. This is not just about averages (though they are important) but also about the distribution.
The literature discusses path length, and our starting point is the Characteristic Path Length [Watts99], but we also need to understand whether the topography of the final connectivity represents a good enough solution to meet the data diffusion liveness goals, i.e. to ensure that data diffusion does not become a significant factor in chain growth quality.
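As a concrete reference point, the characteristic path length of a candidate topology can be computed with a BFS over an unweighted adjacency map. The sketch below is illustrative only; the 5-node ring is a toy graph, not a real topography:

```python
from collections import deque

def shortest_path_lengths(adj, src):
    """BFS hop counts from src over an undirected adjacency dict."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def characteristic_path_length(adj):
    """Mean shortest-path length over all connected ordered pairs [Watts99]."""
    nodes = list(adj)
    total, pairs = 0, 0
    for s in nodes:
        dist = shortest_path_lengths(adj, s)
        for t in nodes:
            if t != s and t in dist:
                total += dist[t]
                pairs += 1
    return total / pairs

# toy 5-node ring: distances from any node are 1, 1, 2, 2 -> CPL = 1.5
ring = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
print(characteristic_path_length(ring))  # 1.5
```

The same quantity is available as `average_shortest_path_length` in graph libraries such as NetworkX; the point here is only that it is cheap to compute for the random graphs discussed below.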
The goal is to create an assessment tool/regime that:
- permits us to quantify the liveness for blocks of a given size in realistic network topographies (topographies are topologies that include measures, on their edges, that capture the causality/performance issues of interest).
- can be used for investigating the peer selection algorithm as a standalone problem, with the aim of using the executable specification directly in the deployed system.
- takes data gathered as part of the deployment both to hone the underlying topographical data and to assess / hone the effectiveness of the peer selection algorithm.
The parameters of the system:
- the set of all nodes under consideration, along with their permitted connectivity (to model firewall-based deployments).
- the one-way ∆Q between all pairs of nodes
- the ability to change (as a function of time) the topography
- simulate major network events (e.g. cable failure)
- simulate nodes failing
- simulate the effect of bearer-level DDoS against particular peers.
- the effect of network capacity (bandwidth) for given peers
- promotion / demotion algorithm (selection of warm / hot peers from the cold / warm sets)
- parameters of the gossip algorithm (e.g. number of peers, random / non-random responses, etc.)
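To make the promotion / demotion parameter concrete, here is one hypothetical, deliberately simple policy (not the deployed algorithm): rank warm peers by a scalar ∆Q score (lower is better), fill the hot set to a target size, then churn by swapping the worst hot peer for the best warm peer when that improves things. The function name, the scalar score, and the churn rule are all assumptions for illustration:

```python
def promote_demote(hot, warm, delta_q, target_hot, churn=1):
    """One step of a hypothetical promotion/demotion policy.

    hot, warm: current peer sets; delta_q: per-peer scalar score
    (lower is better); target_hot: desired hot-set size.
    """
    hot, warm = set(hot), set(warm)
    # fill the hot set up to its target size from the best warm peers
    while len(hot) < target_hot and warm:
        best = min(warm, key=lambda p: delta_q[p])
        warm.discard(best)
        hot.add(best)
    # churn: swap the worst hot peer for the best warm peer if it helps
    for _ in range(churn):
        if not hot or not warm:
            break
        worst_hot = max(hot, key=lambda p: delta_q[p])
        best_warm = min(warm, key=lambda p: delta_q[p])
        if delta_q[best_warm] < delta_q[worst_hot]:
            hot.discard(worst_hot); warm.add(worst_hot)
            warm.discard(best_warm); hot.add(best_warm)
    return hot, warm

scores = {"a": 10, "b": 5, "c": 1, "d": 7}
print(promote_demote({"a"}, {"b", "c", "d"}, scores, target_hot=2))
# ({'b', 'c'}, {'a', 'd'})
```

A real policy would also need hysteresis (to avoid flapping) and some randomness (to keep exploring), which is exactly what the assessment regime above is meant to evaluate.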
The measures of interest:
- Time to store and forward blocks of different sizes between directly connected peers.
- Time to distribute a maximum size block (2MB) through a fully connected graph (every peer is a hot peer). This represents a practically unachievable best case (to give upper bounds).
- Time to distribute various block sizes through randomly selected graphs - looking for any effective correlation between graph theoretic measures (e.g. characteristic path length) and data diffusion measures (CDF of time to diffuse). This gives an estimate of likely starting conditions for any optimisation algorithm.
- Centrality/Isolation of a node - can a node determine (from purely local measures) that it is "well connected" to other nodes (i.e. highly unlikely to receive a block too late to build on it)?
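Under a simple store-and-forward flooding model (every node forwards immediately to all its hot peers), the earliest a block reaches a node is the shortest latency path from the origin, so per-node diffusion times, and hence the diffusion CDF above, fall out of a single Dijkstra pass over the one-way edge latencies. The three-node topography and its latencies below are illustrative only:

```python
import heapq

def diffusion_times(latency, origin):
    """Earliest arrival time of a block at each node, assuming every
    node forwards to all its hot peers as soon as it has the block
    (store-and-forward flooding == shortest path over edge latencies)."""
    times = {origin: 0.0}
    pq = [(0.0, origin)]
    while pq:
        t, u = heapq.heappop(pq)
        if t > times.get(u, float("inf")):
            continue  # stale queue entry
        for v, dt in latency.get(u, {}).items():
            if t + dt < times.get(v, float("inf")):
                times[v] = t + dt
                heapq.heappush(pq, (t + dt, v))
    return times

def diffusion_cdf(times):
    """(time, fraction of nodes reached) points for the diffusion CDF."""
    arr = sorted(times.values())
    return [(t, (i + 1) / len(arr)) for i, t in enumerate(arr)]

# toy asymmetric topography: one-way latencies in ms
net = {"a": {"b": 10, "c": 50}, "b": {"c": 20}, "c": {"a": 5}}
print(diffusion_times(net, "a"))  # {'a': 0.0, 'b': 10.0, 'c': 30.0}
```

This model ignores per-hop serialisation and bandwidth effects, which is precisely where block size and the full ∆Q treatment come back in; it is a lower-bound sketch, not the assessment tool itself.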
The plan:
- Basic PoC of the approach, covering points 1, 2 and 4 in the method above.
- Take the PoC and build an appropriate harness to support the remaining items in the method above.
- Implement the measures
- Design visualisation
- Implement promotion / demotion (in the model gossip)
- Evaluate for initial guess of parameters
- Design optimisation criteria
- Run optimisations to confirm
Inter-region latencies for AWS are to be taken with a pinch of salt: the asymmetry is telling, and the method is probably measuring an upper bound.
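A quick way to surface that asymmetry is to compare forward and reverse entries of a measured latency matrix and flag pairs that differ by more than some tolerance. The 10% threshold and the sample figures below are illustrative, not real AWS measurements:

```python
def asymmetry_report(lat, tol=0.10):
    """Flag node pairs whose forward and reverse one-way latencies
    differ by more than tol (relative to the larger of the two).
    Large asymmetry suggests the measurement method is capturing an
    upper bound rather than pure propagation delay."""
    flagged = []
    for a in lat:
        for b in lat[a]:
            if a < b and b in lat and a in lat[b]:  # each pair once
                fwd, rev = lat[a][b], lat[b][a]
                if abs(fwd - rev) / max(fwd, rev) > tol:
                    flagged.append((a, b, fwd, rev))
    return flagged

# made-up inter-region figures (ms), not real AWS data
sample = {
    "eu": {"us": 80,  "ap": 250},
    "us": {"eu": 120, "ap": 180},
    "ap": {"eu": 255, "us": 182},
}
print(asymmetry_report(sample))  # [('eu', 'us', 80, 120)]
```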