EIP-7594: Decouple network subnets from das-core
Currently we use subnets as the unit of custody in the PeerDAS core
protocol because it doesn't make sense to custody only some of the
columns in a subnet and waste bandwidth downloading columns that the
node doesn't custody.

Since subnets correspond to GossipSub topics, which sit in a layer below
the core protocol, using subnets as the unit of custody couples the core
layer to the network layer too tightly and leaves no room for
flexibility in the network layer.

This commit introduces "custody groups", which are used as the unit of
custody instead of subnets.

The immediate benefit of the decoupling is that we can increase the
number of subnets without changing the expected number of peers needed
to cover all columns, without affecting network stability, and without
touching the core protocol.

The reason we want to increase the number of subnets to match the number
of columns is that columns propagate through the network faster when
each has its own subnet. EIP-4844 works the same way: each blob has its
own subnet because blobs sharing a single subnet would propagate more
slowly.

Since we keep the number of custody groups the same as the previous
number of subnets (32), the expected number of peers you need to cover
all the columns is not changed. In fact, you need only NUMBER_OF_COLUMNS
and NUMBER_OF_CUSTODY_GROUPS to analyze the expected number, which
makes the core protocol completely decoupled from the network layer.
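
As a rough illustration of that last point, the sketch below estimates the
expected number of peers by simulation. It assumes each peer custodies
`custody_group_count` distinct groups chosen uniformly at random and
independently of other peers, which only approximates the hash-based
selection in the spec. The function name, trial count, and seed are
illustrative; the constants mirror the values in this commit. Note that no
subnet count appears anywhere in the calculation.

```python
import random

NUMBER_OF_COLUMNS = 128
NUMBER_OF_CUSTODY_GROUPS = 128
CUSTODY_REQUIREMENT = 4


def estimate_peers_to_cover_all_columns(
    number_of_columns: int = NUMBER_OF_COLUMNS,
    number_of_custody_groups: int = NUMBER_OF_CUSTODY_GROUPS,
    custody_group_count: int = CUSTODY_REQUIREMENT,
    trials: int = 200,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of how many peers are needed to cover every column.

    Assumption: each peer picks its groups uniformly at random. Groups
    partition the columns evenly, so covering all groups covers all columns.
    """
    assert number_of_columns % number_of_custody_groups == 0
    assert 0 < custody_group_count <= number_of_custody_groups

    rng = random.Random(seed)
    total_peers = 0
    for _ in range(trials):
        covered = set()
        peers = 0
        while len(covered) < number_of_custody_groups:
            # One more peer joins with its own random set of custody groups.
            covered.update(
                rng.sample(range(number_of_custody_groups), custody_group_count)
            )
            peers += 1
        total_peers += peers
    return total_peers / trials


if __name__ == "__main__":
    print(estimate_peers_to_cover_all_columns())
```

Changing the number of gossip subnets leaves this estimate untouched, since
only the column count, the group count, and the per-node custody count are
inputs.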
ppopth committed Aug 14, 2024
1 parent 13ac373 commit afbe001
Showing 10 changed files with 119 additions and 74 deletions.
1 change: 1 addition & 0 deletions configs/mainnet.yaml
@@ -161,6 +161,7 @@ WHISK_PROPOSER_SELECTION_GAP: 2

# EIP7594
NUMBER_OF_COLUMNS: 128
NUMBER_OF_CUSTODY_GROUPS: 128
MAX_CELLS_IN_EXTENDED_MATRIX: 768
DATA_COLUMN_SIDECAR_SUBNET_COUNT: 128
MAX_REQUEST_DATA_COLUMN_SIDECARS: 16384
1 change: 1 addition & 0 deletions configs/minimal.yaml
@@ -160,6 +160,7 @@ WHISK_PROPOSER_SELECTION_GAP: 1

# EIP7594
NUMBER_OF_COLUMNS: 128
NUMBER_OF_CUSTODY_GROUPS: 128
MAX_CELLS_IN_EXTENDED_MATRIX: 768
DATA_COLUMN_SIDECAR_SUBNET_COUNT: 128
MAX_REQUEST_DATA_COLUMN_SIDECARS: 16384
73 changes: 41 additions & 32 deletions specs/_features/eip7594/das-core.md
@@ -13,13 +13,13 @@
- [Custom types](#custom-types)
- [Configuration](#configuration)
- [Data size](#data-size)
- [Networking](#networking)
- [Custody setting](#custody-setting)
- [Containers](#containers)
- [`DataColumnSidecar`](#datacolumnsidecar)
- [`MatrixEntry`](#matrixentry)
- [Helper functions](#helper-functions)
- [`get_custody_columns`](#get_custody_columns)
- [`get_custody_groups`](#get_custody_groups)
- [`compute_columns_for_custody_group`](#compute_columns_for_custody_group)
- [`compute_extended_matrix`](#compute_extended_matrix)
- [`recover_matrix`](#recover_matrix)
- [`get_data_column_sidecars`](#get_data_column_sidecars)
@@ -32,8 +32,9 @@
- [Parameters](#parameters)
- [Reconstruction and cross-seeding](#reconstruction-and-cross-seeding)
- [FAQs](#faqs)
- [Row (blob) custody](#row-blob-custody)
- [Subnet stability](#subnet-stability)
- [Why don't nodes custody rows?](#why-dont-nodes-custody-rows)
- [Why don't we rotate custody over time?](#why-dont-we-rotate-custody-over-time)
- [Does having a lot of column subnets make the network unstable?](#does-having-a-lot-of-column-subnets-make-the-network-unstable)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->
<!-- /TOC -->
@@ -54,6 +55,7 @@ The following values are (non-configurable) constants used throughout the specif
| - | - | - |
| `RowIndex` | `uint64` | Row identifier in the matrix of cells |
| `ColumnIndex` | `uint64` | Column identifier in the matrix of cells |
| `CustodyIndex` | `uint64` | Custody group identifier in the matrix of cells |

## Configuration

@@ -64,18 +66,13 @@ The following values are (non-configurable) constants used throughout the specif
| `NUMBER_OF_COLUMNS` | `uint64(CELLS_PER_EXT_BLOB)` (= 128) | Number of columns in the extended data matrix |
| `MAX_CELLS_IN_EXTENDED_MATRIX` | `uint64(MAX_BLOBS_PER_BLOCK * NUMBER_OF_COLUMNS)` (= 768) | The data size of `ExtendedMatrix` |

### Networking

| Name | Value | Description |
| - | - | - |
| `DATA_COLUMN_SIDECAR_SUBNET_COUNT` | `128` | The number of data column sidecar subnets used in the gossipsub protocol |

### Custody setting

| Name | Value | Description |
| - | - | - |
| `SAMPLES_PER_SLOT` | `8` | Number of `DataColumnSidecar` random samples a node queries per slot |
| `CUSTODY_REQUIREMENT` | `4` | Minimum number of subnets an honest node custodies and serves samples from |
| `NUMBER_OF_CUSTODY_GROUPS` | `128` | Number of column groups available for nodes to custody |
| `CUSTODY_REQUIREMENT` | `4` | Minimum number of custody groups an honest node custodies and serves samples from |

### Containers

@@ -103,33 +100,39 @@ class MatrixEntry(Container):

## Helper functions

### `get_custody_columns`
### `get_custody_groups`

```python
def get_custody_columns(node_id: NodeID, custody_subnet_count: uint64) -> Sequence[ColumnIndex]:
assert custody_subnet_count <= DATA_COLUMN_SIDECAR_SUBNET_COUNT
def get_custody_groups(node_id: NodeID, custody_group_count: uint64) -> Sequence[CustodyIndex]:
assert custody_group_count <= NUMBER_OF_CUSTODY_GROUPS

subnet_ids: List[uint64] = []
custody_groups: List[uint64] = []
current_id = uint256(node_id)
while len(subnet_ids) < custody_subnet_count:
subnet_id = (
while len(custody_groups) < custody_group_count:
custody_group = CustodyIndex(
bytes_to_uint64(hash(uint_to_bytes(uint256(current_id)))[0:8])
% DATA_COLUMN_SIDECAR_SUBNET_COUNT
% NUMBER_OF_CUSTODY_GROUPS
)
if subnet_id not in subnet_ids:
subnet_ids.append(subnet_id)
if custody_group not in custody_groups:
custody_groups.append(custody_group)
if current_id == UINT256_MAX:
# Overflow prevention
current_id = NodeID(0)
current_id += 1

assert len(subnet_ids) == len(set(subnet_ids))
assert len(custody_groups) == len(set(custody_groups))
return sorted(custody_groups)
```

### `compute_columns_for_custody_group`

columns_per_subnet = NUMBER_OF_COLUMNS // DATA_COLUMN_SIDECAR_SUBNET_COUNT
```python
def compute_columns_for_custody_group(custody_group: CustodyIndex) -> Sequence[ColumnIndex]:
assert custody_group < NUMBER_OF_CUSTODY_GROUPS
columns_per_group = NUMBER_OF_COLUMNS // NUMBER_OF_CUSTODY_GROUPS
return sorted([
ColumnIndex(DATA_COLUMN_SIDECAR_SUBNET_COUNT * i + subnet_id)
for i in range(columns_per_subnet)
for subnet_id in subnet_ids
ColumnIndex(NUMBER_OF_CUSTODY_GROUPS * i + custody_group)
for i in range(columns_per_group)
])
```
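
The two new helpers can be exercised outside the pyspec with a plain-Python sketch. It assumes that `hash` is SHA-256 and that `uint_to_bytes` and `bytes_to_uint64` use little-endian byte order. The extra keyword parameters and the `custody_columns_for_node` wrapper are illustrative and not part of this diff.

```python
from hashlib import sha256
from typing import List

NUMBER_OF_COLUMNS = 128
NUMBER_OF_CUSTODY_GROUPS = 128
UINT256_MAX = 2**256 - 1


def get_custody_groups(node_id: int, custody_group_count: int,
                       number_of_custody_groups: int = NUMBER_OF_CUSTODY_GROUPS) -> List[int]:
    assert custody_group_count <= number_of_custody_groups

    custody_groups: List[int] = []
    current_id = node_id
    while len(custody_groups) < custody_group_count:
        # Assumption: SHA-256 over the 32-byte little-endian encoding of the id,
        # with the first 8 digest bytes read back as a little-endian integer.
        digest = sha256(current_id.to_bytes(32, "little")).digest()
        custody_group = int.from_bytes(digest[0:8], "little") % number_of_custody_groups
        if custody_group not in custody_groups:
            custody_groups.append(custody_group)
        if current_id == UINT256_MAX:
            # Overflow prevention, mirroring the helper in the diff
            # (after the reset and increment, the next candidate id is 1).
            current_id = 0
        current_id += 1

    return sorted(custody_groups)


def compute_columns_for_custody_group(custody_group: int,
                                      number_of_columns: int = NUMBER_OF_COLUMNS,
                                      number_of_custody_groups: int = NUMBER_OF_CUSTODY_GROUPS) -> List[int]:
    assert custody_group < number_of_custody_groups
    columns_per_group = number_of_columns // number_of_custody_groups
    # Columns of a group are spaced number_of_custody_groups apart.
    return sorted(
        number_of_custody_groups * i + custody_group
        for i in range(columns_per_group)
    )


def custody_columns_for_node(node_id: int, custody_group_count: int) -> List[int]:
    """Hypothetical wrapper: flatten a node's custody groups into column indices."""
    columns: List[int] = []
    for group in get_custody_groups(node_id, custody_group_count):
        columns.extend(compute_columns_for_custody_group(group))
    return sorted(columns)
```

With 128 columns and 32 groups, for example, `compute_columns_for_custody_group(3, 128, 32)` yields columns 3, 35, 67, and 99, so a group with several columns spreads them evenly across the matrix.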

@@ -223,15 +226,17 @@ def get_data_column_sidecars(signed_block: SignedBeaconBlock,

### Custody requirement

Each node downloads and custodies a minimum of `CUSTODY_REQUIREMENT` subnets per slot. The particular subnets that the node is required to custody are selected pseudo-randomly (more on this below).
Columns are grouped into custody groups. Nodes custodying a custody group MUST custody all the columns in that group.

Each node downloads and custodies a minimum of `CUSTODY_REQUIREMENT` custody groups per slot. The particular custody groups that the node is required to custody are selected pseudo-randomly (more on this below).

A node *may* choose to custody and serve more than the minimum honesty requirement. Such a node explicitly advertises a number greater than `CUSTODY_REQUIREMENT` via the peer discovery mechanism -- for example, in their ENR (e.g. `custody_subnet_count: 4` if the node custodies `4` subnets each slot) -- up to a `DATA_COLUMN_SIDECAR_SUBNET_COUNT` (i.e. a super-full node).
A node *may* choose to custody and serve more than the minimum honesty requirement. Such a node explicitly advertises a number greater than `CUSTODY_REQUIREMENT` via the peer discovery mechanism -- for example, in their ENR (e.g. `custody_group_count: 4` if the node custodies `4` groups each slot) -- up to a `NUMBER_OF_CUSTODY_GROUPS` (i.e. a super-full node).

A node stores the custodied columns for the duration of the pruning period and responds to peer requests for samples on those columns.

### Public, deterministic selection

The particular columns that a node custodies are selected pseudo-randomly as a function (`get_custody_columns`) of the node-id and custody size -- importantly this function can be run by any party as the inputs are all public.
The particular columns/groups that a node custodies are selected pseudo-randomly as a function (`get_custody_groups`) of the node-id and custody size -- importantly this function can be run by any party as the inputs are all public.

*Note*: increasing the `custody_size` parameter for a given `node_id` extends the returned list (rather than being an entirely new shuffle) such that if `custody_size` is unknown, the default `CUSTODY_REQUIREMENT` will be correct for a subset of the node's custody.

@@ -249,7 +254,7 @@ In this construction, we extend the blobs using a one-dimensional erasure coding

For each column -- use `data_column_sidecar_{subnet_id}` subnets, where `subnet_id` can be computed with the `compute_subnet_for_data_column_sidecar(column_index: ColumnIndex)` helper. The sidecars can be computed with the `get_data_column_sidecars(signed_block: SignedBeaconBlock, blobs: Sequence[Blob])` helper.

Verifiable samples from their respective column are distributed on the assigned subnet. To custody a particular column, a node joins the respective gossipsub subnet. If a node fails to get a column on the column subnet, a node can also utilize the Req/Resp protocol to query the missing column from other peers.
Verifiable samples from their respective column are distributed on the assigned subnet. To custody columns in a particular custody group, a node joins the respective gossipsub subnets. If a node fails to get columns on the column subnets, a node can also utilize the Req/Resp protocol to query the missing columns from other peers.
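
The mapping from a custody group to the subnets a node joins can be sketched as follows. This is a minimal sketch, assuming `compute_subnet_for_data_column_sidecar` maps a column to `column_index % DATA_COLUMN_SIDECAR_SUBNET_COUNT` (its body is not shown in this diff). `subnets_for_custody_group` and the keyword parameters are hypothetical names used only for illustration.

```python
from typing import List

NUMBER_OF_COLUMNS = 128
NUMBER_OF_CUSTODY_GROUPS = 128
DATA_COLUMN_SIDECAR_SUBNET_COUNT = 128


def compute_columns_for_custody_group(custody_group: int,
                                      number_of_columns: int = NUMBER_OF_COLUMNS,
                                      number_of_custody_groups: int = NUMBER_OF_CUSTODY_GROUPS) -> List[int]:
    assert custody_group < number_of_custody_groups
    columns_per_group = number_of_columns // number_of_custody_groups
    return sorted(
        number_of_custody_groups * i + custody_group
        for i in range(columns_per_group)
    )


def compute_subnet_for_data_column_sidecar(column_index: int,
                                           subnet_count: int = DATA_COLUMN_SIDECAR_SUBNET_COUNT) -> int:
    # Assumption: a simple modulo mapping from column to subnet.
    return column_index % subnet_count


def subnets_for_custody_group(custody_group: int,
                              number_of_columns: int = NUMBER_OF_COLUMNS,
                              number_of_custody_groups: int = NUMBER_OF_CUSTODY_GROUPS,
                              subnet_count: int = DATA_COLUMN_SIDECAR_SUBNET_COUNT) -> List[str]:
    """Hypothetical helper: subnet names a node joins to custody one group."""
    columns = compute_columns_for_custody_group(
        custody_group, number_of_columns, number_of_custody_groups
    )
    # Several columns of a group may share a subnet, so deduplicate.
    subnet_ids = sorted(
        {compute_subnet_for_data_column_sidecar(c, subnet_count) for c in columns}
    )
    return [f"data_column_sidecar_{subnet_id}" for subnet_id in subnet_ids]
```

Because the subnet count enters only in the last step, it can be changed independently of the custody parameters: with 32 groups, group 3 maps to two subnets when there are 64 subnets and to four when there are 128.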

## Reconstruction and cross-seeding

@@ -265,7 +270,7 @@ Once the node obtains a column through reconstruction, the node MUST expose the

## FAQs

### Row (blob) custody
### Why don't nodes custody rows?

In the one-dimension construction, a node samples the peers by requesting the whole `DataColumnSidecar`. In reconstruction, a node can reconstruct all the blobs by 50% of the columns. Note that nodes can still download the row via `blob_sidecar_{subnet_id}` subnets.

@@ -276,6 +281,10 @@ The potential benefits of having row custody could include:

However, for simplicity, we don't assign row custody assignments to nodes in the current design.

### Subnet stability
### Why don't we rotate custody over time?

To start with a simple, stable backbone, for now, we don't shuffle the custody assignments via the deterministic custody selection helper `get_custody_groups`. However, staggered rotation likely needs to happen on the order of the pruning period to ensure subnets can be utilized for recovery. For example, introducing an `epoch` argument allows the function to maintain stability over many epochs.

### Does having a lot of column subnets make the network unstable?

To start with a simple, stable backbone, for now, we don't shuffle the subnet assignments via the deterministic custody selection helper `get_custody_columns`. However, staggered rotation likely needs to happen on the order of the pruning period to ensure subnets can be utilized for recovery. For example, introducing an `epoch` argument allows the function to maintain stability over many epochs.
No, the number of subnets doesn't really matter. What matters for network stability is the number of nodes and the churn rate in the network.
5 changes: 3 additions & 2 deletions specs/_features/eip7594/p2p-interface.md
@@ -49,6 +49,7 @@

| Name | Value | Description |
|------------------------------------------------|------------------------------------------------|---------------------------------------------------------------------------|
| `DATA_COLUMN_SIDECAR_SUBNET_COUNT` | `128` | The number of data column sidecar subnets used in the gossipsub protocol |
| `MAX_REQUEST_DATA_COLUMN_SIDECARS` | `MAX_REQUEST_BLOCKS_DENEB * NUMBER_OF_COLUMNS` | Maximum number of data column sidecars in a single request |
| `MIN_EPOCHS_FOR_DATA_COLUMN_SIDECARS_REQUESTS` | `2**12` (= 4096 epochs, ~18 days) | The minimum epoch range over which a node must serve data column sidecars |

@@ -320,8 +321,8 @@ Requests the MetaData of a peer, using the new `MetaData` definition given above

##### Custody subnet count

A new field is added to the ENR under the key `csc` to facilitate custody data column discovery.
A new field is added to the ENR under the key `cgc` to facilitate custody data column discovery.

| Key | Value |
|--------|------------------------------------------|
| `csc` | Custody subnet count, big endian integer |
| `cgc` | Custody group count, big endian integer |
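
Encoding and decoding the `cgc` value might look like the sketch below. The table only states "big endian integer", so the minimal-length encoding (no leading zero bytes) and the range check against `CUSTODY_REQUIREMENT` and `NUMBER_OF_CUSTODY_GROUPS` are assumptions here, and both function names are hypothetical.

```python
CUSTODY_REQUIREMENT = 4
NUMBER_OF_CUSTODY_GROUPS = 128


def encode_custody_group_count(custody_group_count: int,
                               number_of_custody_groups: int = NUMBER_OF_CUSTODY_GROUPS) -> bytes:
    """Hypothetical encoder for the ENR `cgc` value."""
    assert CUSTODY_REQUIREMENT <= custody_group_count <= number_of_custody_groups
    # Assumption: shortest big-endian representation, no leading zero bytes.
    length = (custody_group_count.bit_length() + 7) // 8
    return custody_group_count.to_bytes(length, "big")


def decode_custody_group_count(value: bytes) -> int:
    """Hypothetical decoder: interpret the ENR value as a big-endian integer."""
    return int.from_bytes(value, "big")
```

A peer that cannot read the value can still fall back to `CUSTODY_REQUIREMENT`, which the custody selection guarantees is correct for a subset of the node's custody.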
@@ -7,21 +7,26 @@
)


def _run_get_custody_columns(spec, rng, node_id=None, custody_subnet_count=None):
def _run_get_custody_columns(spec, rng, node_id=None, custody_group_count=None):
if node_id is None:
node_id = rng.randint(0, 2**256 - 1)

if custody_subnet_count is None:
custody_subnet_count = rng.randint(0, spec.config.DATA_COLUMN_SIDECAR_SUBNET_COUNT)
if custody_group_count is None:
custody_group_count = rng.randint(0, spec.config.NUMBER_OF_CUSTODY_GROUPS)

result = spec.get_custody_columns(node_id, custody_subnet_count)
columns_per_group = spec.config.NUMBER_OF_COLUMNS // spec.config.NUMBER_OF_CUSTODY_GROUPS
groups = spec.get_custody_groups(node_id, custody_group_count)
yield 'node_id', 'meta', node_id
yield 'custody_subnet_count', 'meta', custody_subnet_count
yield 'custody_group_count', 'meta', custody_group_count

result = []
for group in groups:
group_columns = spec.compute_columns_for_custody_group(group)
assert len(group_columns) == columns_per_group
result.extend(group_columns)

assert len(result) == len(set(result))
assert len(result) == (
custody_subnet_count * spec.config.NUMBER_OF_COLUMNS // spec.config.DATA_COLUMN_SIDECAR_SUBNET_COUNT
)
assert len(result) == custody_group_count * columns_per_group
assert all(i < spec.config.NUMBER_OF_COLUMNS for i in result)
python_list_result = [int(i) for i in result]

@@ -31,48 +36,48 @@ def _run_get_custody_columns(spec, rng, node_id=None, custody_subnet_count=None)
@with_eip7594_and_later
@spec_test
@single_phase
def test_get_custody_columns__min_node_id_min_custody_subnet_count(spec):
def test_get_custody_columns__min_node_id_min_custody_group_count(spec):
rng = random.Random(1111)
yield from _run_get_custody_columns(spec, rng, node_id=0, custody_subnet_count=0)
yield from _run_get_custody_columns(spec, rng, node_id=0, custody_group_count=0)


@with_eip7594_and_later
@spec_test
@single_phase
def test_get_custody_columns__min_node_id_max_custody_subnet_count(spec):
def test_get_custody_columns__min_node_id_max_custody_group_count(spec):
rng = random.Random(1111)
yield from _run_get_custody_columns(
spec, rng, node_id=0,
custody_subnet_count=spec.config.DATA_COLUMN_SIDECAR_SUBNET_COUNT)
custody_group_count=spec.config.NUMBER_OF_CUSTODY_GROUPS)


@with_eip7594_and_later
@spec_test
@single_phase
def test_get_custody_columns__max_node_id_min_custody_subnet_count(spec):
def test_get_custody_columns__max_node_id_min_custody_group_count(spec):
rng = random.Random(1111)
yield from _run_get_custody_columns(spec, rng, node_id=2**256 - 1, custody_subnet_count=0)
yield from _run_get_custody_columns(spec, rng, node_id=2**256 - 1, custody_group_count=0)


@with_eip7594_and_later
@spec_test
@single_phase
def test_get_custody_columns__max_node_id_max_custody_subnet_count(spec):
def test_get_custody_columns__max_node_id_max_custody_group_count(spec):
rng = random.Random(1111)
yield from _run_get_custody_columns(
spec, rng, node_id=2**256 - 1,
custody_subnet_count=spec.config.DATA_COLUMN_SIDECAR_SUBNET_COUNT,
custody_group_count=spec.config.NUMBER_OF_CUSTODY_GROUPS,
)


@with_eip7594_and_later
@spec_test
@single_phase
def test_get_custody_columns__max_node_id_max_custody_subnet_count_minus_1(spec):
def test_get_custody_columns__max_node_id_max_custody_group_count_minus_1(spec):
rng = random.Random(1111)
yield from _run_get_custody_columns(
spec, rng, node_id=2**256 - 2,
custody_subnet_count=spec.config.DATA_COLUMN_SIDECAR_SUBNET_COUNT,
custody_group_count=spec.config.NUMBER_OF_CUSTODY_GROUPS,
)


@@ -81,7 +86,7 @@ def test_get_custody_columns__max_node_id_max_custody_subnet_count_minus_1(spec)
@single_phase
def test_get_custody_columns__short_node_id(spec):
rng = random.Random(1111)
yield from _run_get_custody_columns(spec, rng, node_id=1048576, custody_subnet_count=1)
yield from _run_get_custody_columns(spec, rng, node_id=1048576, custody_group_count=1)


@with_eip7594_and_later
43 changes: 28 additions & 15 deletions tests/core/pyspec/eth2spec/test/eip7594/unittests/test_custody.py
@@ -6,48 +6,61 @@
)


def run_get_custody_columns(spec, peer_count, custody_subnet_count):
assignments = [spec.get_custody_columns(node_id, custody_subnet_count) for node_id in range(peer_count)]
def run_get_custody_columns(spec, peer_count, custody_group_count):
assignments = [spec.get_custody_groups(node_id, custody_group_count) for node_id in range(peer_count)]

columns_per_subnet = spec.config.NUMBER_OF_COLUMNS // spec.config.DATA_COLUMN_SIDECAR_SUBNET_COUNT
columns_per_group = spec.config.NUMBER_OF_COLUMNS // spec.config.NUMBER_OF_CUSTODY_GROUPS
for assignment in assignments:
assert len(assignment) == custody_subnet_count * columns_per_subnet
assert len(assignment) == len(set(assignment))
columns = []
for group in assignment:
group_columns = spec.compute_columns_for_custody_group(group)
assert len(group_columns) == columns_per_group
columns.extend(group_columns)

assert len(columns) == custody_group_count * columns_per_group
assert len(columns) == len(set(columns))


@with_eip7594_and_later
@spec_test
@single_phase
def test_get_custody_columns_peers_within_number_of_columns(spec):
peer_count = 10
custody_subnet_count = spec.config.CUSTODY_REQUIREMENT
custody_group_count = spec.config.CUSTODY_REQUIREMENT
assert spec.config.NUMBER_OF_COLUMNS > peer_count
run_get_custody_columns(spec, peer_count, custody_subnet_count)
run_get_custody_columns(spec, peer_count, custody_group_count)


@with_eip7594_and_later
@spec_test
@single_phase
def test_get_custody_columns_peers_more_than_number_of_columns(spec):
peer_count = 200
custody_subnet_count = spec.config.CUSTODY_REQUIREMENT
custody_group_count = spec.config.CUSTODY_REQUIREMENT
assert spec.config.NUMBER_OF_COLUMNS < peer_count
run_get_custody_columns(spec, peer_count, custody_subnet_count)
run_get_custody_columns(spec, peer_count, custody_group_count)


@with_eip7594_and_later
@spec_test
@single_phase
def test_get_custody_columns_maximum_subnets(spec):
def test_get_custody_columns_maximum_groups(spec):
peer_count = 10
custody_subnet_count = spec.config.DATA_COLUMN_SIDECAR_SUBNET_COUNT
run_get_custody_columns(spec, peer_count, custody_subnet_count)
custody_group_count = spec.config.NUMBER_OF_CUSTODY_GROUPS
run_get_custody_columns(spec, peer_count, custody_group_count)


@with_eip7594_and_later
@spec_test
@single_phase
def test_get_custody_columns_custody_size_more_than_number_of_columns(spec):
def test_get_custody_columns_custody_size_more_than_number_of_groups(spec):
node_id = 1
custody_subnet_count = spec.config.DATA_COLUMN_SIDECAR_SUBNET_COUNT + 1
expect_assertion_error(lambda: spec.get_custody_columns(node_id, custody_subnet_count))
custody_group_count = spec.config.NUMBER_OF_CUSTODY_GROUPS + 1
expect_assertion_error(lambda: spec.get_custody_groups(node_id, custody_group_count))


@with_eip7594_and_later
@spec_test
@single_phase
def test_compute_columns_for_custody_group_out_of_bound_custody_group(spec):
expect_assertion_error(lambda: spec.compute_columns_for_custody_group(spec.config.NUMBER_OF_CUSTODY_GROUPS))
3 changes: 2 additions & 1 deletion tests/formats/networking/README.md
@@ -3,4 +3,5 @@
The aim of the networking tests is to set a base-line on what really needs to pass, i.e. the essentials.

Handlers:
- [`get_custody_columns`](./get_custody_columns.md): `get_custody_columns` helper tests
- [`compute_columns_for_custody_group`](./compute_columns_for_custody_group.md): `compute_columns_for_custody_group` helper tests
- [`get_custody_groups`](./get_custody_groups.md): `get_custody_groups` helper tests
13 changes: 13 additions & 0 deletions tests/formats/networking/compute_columns_for_custody_group.md
@@ -0,0 +1,13 @@
# `compute_columns_for_custody_group` tests

`compute_columns_for_custody_group` tests provide sanity checks for the correctness of the `compute_columns_for_custody_group` helper function.

## Test case format

### `meta.yaml`

```yaml
description: string -- optional: description of test case, purely for debugging purposes.
custody_group: int -- argument: the custody group index.
result: list of int -- output: the list of resulting column indices.
```