Migrate to Diem and BCS (#1)
Co-authored-by: Mathieu Baudet <mathieubaudet@fb.com>
ma2bd and ma2bd authored Dec 12, 2020
1 parent 45dcd3e commit 4e8b861
Showing 14 changed files with 98 additions and 94 deletions.
6 changes: 5 additions & 1 deletion CHANGELOG.md
@@ -1,6 +1,10 @@
# Changelog

## [v0.1.1] - 2020-12-11
- Renaming crate into "bcs".

## [v0.1.0] - 2020-11-17
- Initial release.

[v0.1.0]: https://github.com/libra/lcs/releases/tag/v0.1.0
[v0.1.1]: https://github.com/diem/bcs/releases/tag/v0.1.1
[v0.1.0]: https://github.com/diem/bcs/releases/tag/v0.1.0
2 changes: 1 addition & 1 deletion CODE_OF_CONDUCT.md
@@ -1,3 +1,3 @@
# Code of Conduct

The project has adopted a Code of Conduct that we expect project participants to adhere to. Please [read the full text](https://developers.libra.org/docs/policies/code-of-conduct) so that you can understand what actions will and will not be tolerated.
The project has adopted a Code of Conduct that we expect project participants to adhere to. Please [read the full text](https://developers.diem.com/docs/policies/code-of-conduct) so that you can understand what actions will and will not be tolerated.
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -4,9 +4,9 @@ This project welcomes contributions.

## Contributor License Agreement (CLA)

For pull request to be accepted by any Libra projects, a CLA must be [signed](https://libra.org/en-US/cla-sign). You will only need to do this once to work on any of Libra's open source projects.
For pull request to be accepted by any Diem projects, a CLA must be [signed](https://diem.com/en-US/cla-sign). You will only need to do this once to work on any of Diem's open source projects.

When submitting a pull request (PR), the `libra-github-bot` will check your submission for a valid CLA. If one is not found, then you will need to [submit](https://libra.org/en-US/cla-sign) an Individual CLA for yourself or a Corporate CLA for your company.
When submitting a pull request (PR), the `diem-github-bot` will check your submission for a valid CLA. If one is not found, then you will need to [submit](https://diem.com/en-US/cla-sign) an Individual CLA for yourself or a Corporate CLA for your company.

## Issues

14 changes: 7 additions & 7 deletions Cargo.toml
@@ -1,10 +1,10 @@
[package]
name = "libra-canonical-serialization"
version = "0.1.0"
authors = ["Libra Association <opensource@libra.org>"]
description = "Libra Canonical Serialization (LCS)"
documentation = "https://docs.rs/libra-canonical-serialization"
repository = "https://github.com/libra/lcs"
name = "bcs"
version = "0.1.1"
authors = ["Diem <opensource@diem.com>"]
description = "Binary Canonical Serialization (BCS)"
repository = "https://github.com/diem/bcs"
homepage = "https://diem.com"
readme = "README.md"
license = "Apache-2.0"
edition = "2018"
@@ -19,5 +19,5 @@ proptest = "0.10.1"
proptest-derive = "0.2.0"

[[bench]]
name = "lcs_bench"
name = "bcs_bench"
harness = false
46 changes: 23 additions & 23 deletions README.md
@@ -1,13 +1,13 @@
> **Note to readers:** On December 1, 2020, the Libra Association was renamed to Diem Association. The project repos are in the process of being migrated. All projects will remain available for use here until the migration to a new GitHub Organization is complete.
## Libra Canonical Serialization (LCS)
## Binary Canonical Serialization (BCS)

LCS defines a deterministic means for translating a message or data structure into bytes
BCS defines a deterministic means for translating a message or data structure into bytes
irrespective of platform, architecture, or programming language.

### Background

In Libra, participants pass around messages or data structures that often times need to be
In Diem, participants pass around messages or data structures that often times need to be
signed by a prover and verified by one or more verifiers. Serialization in this context refers
to the process of converting a message into a byte array. Many serialization approaches support
loose standards such that two implementations can produce two different byte streams that would
@@ -21,12 +21,12 @@ serialized bytes or risk losing the ability to verify messages. This creates a b
participants to maintain both a copy of the serialized bytes and the deserialized message often
leading to confusion about safety and correctness. While there exist a handful of existing
deterministic serialization formats, there is no obvious choice. To address this, we propose
Libra Canonical Serialization that defines a deterministic means for translating a message into
Diem Canonical Serialization that defines a deterministic means for translating a message into
bytes and back again.

### Specification

LCS supports the following data types:
BCS supports the following data types:

* Booleans
* Signed 8-bit, 16-bit, 32-bit, 64-bit, and 128-bit integers
@@ -42,20 +42,20 @@ LCS supports the following data types:

### General structure

LCS is not a self-describing format and as such, in order to deserialize a message, one must
BCS is not a self-describing format and as such, in order to deserialize a message, one must
know the message type and layout ahead of time.

Unless specified, all numbers are stored in little endian, two's complement format.
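
As a minimal sketch of this rule, the standard library's `to_le_bytes` already produces a little-endian, two's complement layout for fixed-width integers. The helper names `encode_i16` and `encode_u32` are hypothetical and are not part of the crate's API.

```rust
/// Hypothetical helper: little-endian, two's complement bytes of an `i16`.
fn encode_i16(value: i16) -> Vec<u8> {
    value.to_le_bytes().to_vec()
}

/// Hypothetical helper: little-endian bytes of a `u32`.
fn encode_u32(value: u32) -> Vec<u8> {
    value.to_le_bytes().to_vec()
}
```

For instance, `encode_i16(-2)` yields the bytes `0xFE, 0xFF`: the least significant byte comes first, and the negative value is stored in two's complement.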

### Recursion and Depth of LCS Data
### Recursion and Depth of BCS Data

Recursive data-structures (e.g. trees) are allowed. However, because of the possibility of stack
overflow during (de)serialization, the *container depth* of any valid LCS data cannot exceed the constant
overflow during (de)serialization, the *container depth* of any valid BCS data cannot exceed the constant
`MAX_CONTAINER_DEPTH`. Formally, we define *container depth* as the number of structs and enums traversed
during (de)serialization.

This definition aims to minimize the number of operations while ensuring that
(de)serialization of a known LCS format cannot cause arbitrarily large stack allocations.
(de)serialization of a known BCS format cannot cause arbitrarily large stack allocations.
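
The definition can be illustrated with a small sketch over a toy value tree. The `Value` type, `container_depth`, and `check_depth` are hypothetical names introduced only for illustration. The limit is passed as a parameter because the value of `MAX_CONTAINER_DEPTH` is not stated here.

```rust
/// Toy model of a value: either a leaf (integer, string, ...) or a struct
/// holding nested values. Hypothetical type, for illustration only.
enum Value {
    Leaf,
    Struct(Vec<Value>),
}

/// Counts the structs traversed on the deepest path through the value.
fn container_depth(value: &Value) -> usize {
    match value {
        // Leaves such as integers and strings have depth 0.
        Value::Leaf => 0,
        // A struct adds one level on top of its deepest field.
        Value::Struct(fields) => {
            1 + fields.iter().map(container_depth).max().unwrap_or(0)
        }
    }
}

/// Returns true when the value stays within the given depth limit.
fn check_depth(value: &Value, max_depth: usize) -> bool {
    container_depth(value) <= max_depth
}
```

Enums would count the same way as structs under the definition above; they are left out of the toy model to keep it short.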

As an example, if `v1` and `v2` are values of depth `n1` and `n2`,
* a struct value `Foo { v1, v2 }` has depth `1 + max(n1, n2)`;
@@ -81,7 +81,7 @@ All string and integer values have depths `0`.

#### ULEB128-Encoded Integers

The LCS format also uses the [ULEB128 encoding](https://en.wikipedia.org/wiki/LEB128) internally
The BCS format also uses the [ULEB128 encoding](https://en.wikipedia.org/wiki/LEB128) internally
to represent unsigned 32-bit integers in two cases where small values are usually expected:
(1) lengths of variable-length sequences and (2) tags of enum values (see the corresponding
sections below).
@@ -99,15 +99,15 @@ In general, a ULEB128 encoding consists of a little-endian sequence of base-128
digits. Each digit is completed into a byte by setting the highest bit to 1, except for the
last (highest-significance) digit whose highest bit is set to 0.

In LCS, the result of decoding ULEB128 bytes is required to fit into a 32-bit unsigned
In BCS, the result of decoding ULEB128 bytes is required to fit into a 32-bit unsigned
integer and be in canonical form. For instance, the following values are rejected:
* `[808080808001]` (2^35) is too large.
* `[8080808010]` (2^32) is too large.
* `[8000]` is not a minimal encoding of 0.
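
The rules above can be sketched in plain Rust. This is an illustrative encoder and decoder, not the crate's implementation. The function names and error strings are assumptions.

```rust
/// Encodes `value` as ULEB128: base-128 digits, least significant first,
/// with the high bit set on every byte except the last.
fn uleb128_encode(mut value: u32) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let digit = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(digit); // last digit: high bit stays 0
            return out;
        }
        out.push(digit | 0x80); // more digits follow
    }
}

/// Decodes a canonical ULEB128 `u32` from the start of `bytes`.
/// Returns the value and the number of bytes consumed.
fn uleb128_decode(bytes: &[u8]) -> Result<(u32, usize), &'static str> {
    let mut value: u64 = 0;
    for (i, &byte) in bytes.iter().enumerate() {
        // A u32 never needs more than five base-128 digits.
        if i >= 5 {
            return Err("value does not fit in u32");
        }
        let digit = (byte & 0x7f) as u64;
        value |= digit << (7 * i);
        if value > u32::MAX as u64 {
            return Err("value does not fit in u32");
        }
        if byte & 0x80 == 0 {
            // A trailing zero digit means a shorter encoding exists.
            if i > 0 && digit == 0 {
                return Err("non-canonical encoding");
            }
            return Ok((value as u32, i + 1));
        }
    }
    Err("unexpected end of input")
}
```

With this sketch, `uleb128_encode(9487)` yields `[0x8f, 0x4a]`, and the three byte strings listed above are all rejected by `uleb128_decode`.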

#### Optional Data

Optional or nullable data either exists in its full representation or does not. LCS represents
Optional or nullable data either exists in its full representation or does not. BCS represents
this as a single byte representing the presence `0x01` or absence `0x00` of data. If the data
is present then the serialized form of that data follows. For example:

@@ -121,9 +121,9 @@ assert_eq!(to_bytes(&no_data)?, vec![0]);

#### Fixed and Variable Length Sequences

Sequences can be made of up of any LCS supported types (even complex structures) but all
Sequences can be made of up of any BCS supported types (even complex structures) but all
elements in the sequence must be of the same type. If the length of a sequence is fixed and
well known then LCS represents this as just the concatenation of the serialized form of each
well known then BCS represents this as just the concatenation of the serialized form of each
individual element in the sequence. If the length of the sequence can be variable, then the
serialized sequence is length prefixed with a ULEB128-encoded unsigned integer indicating
the number of elements in the sequence. All variable length sequences must be
@@ -142,7 +142,7 @@ assert_eq!(to_bytes(&large_variable_length)?, vec![0x8f, 0x4a]);

#### Strings

Only valid UTF-8 Strings are supported. LCS serializes such strings as a variable length byte
Only valid UTF-8 Strings are supported. BCS serializes such strings as a variable length byte
sequence, i.e. length prefixed with a ULEB128-encoded unsigned integer followed by the byte
representation of the string.
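
As a rough sketch of this layout, the length prefix counts UTF-8 bytes rather than characters. `encode_str` and `uleb128_encode` are hypothetical helpers written for illustration. They are not the crate's API.

```rust
/// Hypothetical ULEB128 encoder for a `u32` length.
fn uleb128_encode(mut value: u32) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let digit = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(digit);
            return out;
        }
        out.push(digit | 0x80);
    }
}

/// Hypothetical helper: ULEB128 byte length followed by the UTF-8 bytes.
fn encode_str(s: &str) -> Vec<u8> {
    // Assumption of this sketch: the byte length must fit in a u32.
    assert!(s.len() <= u32::MAX as usize, "string too long");
    let mut out = uleb128_encode(s.len() as u32);
    out.extend_from_slice(s.as_bytes());
    out
}
```

Because Rust's `&str` is guaranteed to be valid UTF-8, the sketch needs no validity check when encoding. A decoder would have to validate the bytes.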

@@ -161,12 +161,12 @@ assert_eq!(to_bytes(&utf8_str)?, expecting);
Tuples are typed composition of objects: `(Type0, Type1)`

Tuples are considered a fixed length sequence where each element in the sequence can be a
different type supported by LCS. Each element of a tuple is serialized in the order it is
different type supported by BCS. Each element of a tuple is serialized in the order it is
defined within the tuple, i.e. [tuple.0, tuple.1].

```rust
let tuple = (-1i8, "libra");
let expecting = vec![0xFF, 5, b'l', b'i', b'b', b'r', b'a'];
let tuple = (-1i8, "diem");
let expecting = vec![0xFF, 4, b'd', b'i', b'e', b'm'];
assert_eq!(to_bytes(&tuple)?, expecting);
```

@@ -175,7 +175,7 @@ assert_eq!(to_bytes(&tuple)?, expecting);

Structures are fixed length sequences consisting of fields with potentially different types.
Each field within a struct is serialized in the order specified by the canonical structure
definition. Structs can exist within other structs and as such, LCS recurses into each struct
definition. Structs can exist within other structs and as such, BCS recurses into each struct
and serializes them in order. There are no labels in the serialized format, the struct ordering
defines the organization within the serialization stream.

@@ -214,9 +214,9 @@ assert_eq!(w_bytes, expecting);
#### Externally Tagged Enumerations

An enumeration is typically represented as a type that can take one of potentially many
different variants. In LCS, each variant is mapped to a variant index, a ULEB128-encoded 32-bit unsigned
different variants. In BCS, each variant is mapped to a variant index, a ULEB128-encoded 32-bit unsigned
integer, followed by serialized data if the type has an associated value. An
associated type can be any LCS supported type. The variant index is determined based on the
associated type can be any BCS supported type. The variant index is determined based on the
ordering of the variants in the canonical enum definition, where the first variant has an index
of `0`, the second an index of `1`, etc.
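
A hand-written sketch of this layout for a hypothetical `Shape` enum follows. The enum and `encode_shape` are illustrative only. All variant indices here are below 128, so each ULEB128-encoded index occupies a single byte.

```rust
/// Hypothetical enum; variant indices follow declaration order: 0, 1, 2.
enum Shape {
    Empty,
    Side(u16),
    Pair(u8, u8),
}

/// Writes the variant index, then the associated data (if any).
fn encode_shape(shape: &Shape) -> Vec<u8> {
    match shape {
        // Index 0, no associated value.
        Shape::Empty => vec![0],
        // Index 1, followed by the u16 in little-endian order.
        Shape::Side(n) => {
            let mut out = vec![1];
            out.extend_from_slice(&n.to_le_bytes());
            out
        }
        // Index 2, followed by both fields in order.
        Shape::Pair(a, b) => vec![2, *a, *b],
    }
}
```

Reordering the variants in the definition would change every index, and therefore the bytes, which is why the canonical enum definition has to be agreed on by all parties.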

@@ -241,7 +241,7 @@ If you need to serialize a C-style enum, you should use a primitive integer type
#### Maps (Key / Value Stores)

Maps are represented as a variable-length, sorted sequence of (Key, Value) tuples. Keys must be
unique and the tuples sorted by increasing lexicographical order on the LCS bytes of each key.
unique and the tuples sorted by increasing lexicographical order on the BCS bytes of each key.
The representation is otherwise similar to that of a variable-length sequence. In particular,
it is preceded by the number of tuples, encoded in ULEB128.
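
The ordering rule is worth a small sketch, because sorting by serialized key bytes can differ from sorting by key value. `encode_map` is a hypothetical helper for maps from `u16` to `u8`. It assumes fewer than 128 entries, so that the ULEB128 length fits in one byte, and it does not check for duplicate keys.

```rust
/// Hypothetical helper: encodes (u16 key, u8 value) entries as a map.
fn encode_map(entries: &[(u16, u8)]) -> Vec<u8> {
    // Assumption of this sketch: the length fits in one ULEB128 byte.
    assert!(entries.len() < 128, "sketch supports fewer than 128 entries");

    // Serialize every key and value first.
    let mut encoded: Vec<(Vec<u8>, Vec<u8>)> = entries
        .iter()
        .map(|(k, v)| (k.to_le_bytes().to_vec(), vec![*v]))
        .collect();

    // Sort by lexicographic order of the serialized key bytes.
    encoded.sort_by(|a, b| a.0.cmp(&b.0));

    // Length prefix, then each key followed by its value.
    let mut out = vec![encoded.len() as u8];
    for (key, value) in encoded {
        out.extend(key);
        out.extend(value);
    }
    out
}
```

With little-endian `u16` keys, key `256` serializes to `[0x00, 0x01]` and key `1` to `[0x01, 0x00]`, so the entry for `256` is written first even though `1` is numerically smaller.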

@@ -258,7 +258,7 @@ assert_eq!(to_bytes(&map)?, to_bytes(&expecting)?);

### Backwards compatibility

Complex types dependent upon the specification in which they are used. LCS does not provide
Complex types dependent upon the specification in which they are used. BCS does not provide
direct provisions for versioning or backwards / forwards compatibility. A change in an objects
structure could prevent historical clients from understanding new clients and vice-versa.

2 changes: 2 additions & 0 deletions README.tpl
@@ -1,3 +1,5 @@
> **Note to readers:** On December 1, 2020, the Libra Association was renamed to Diem Association. The project repos are in the process of being migrated. All projects will remain available for use here until the migration to a new GitHub Organization is complete.

{{readme}}

## Contributing
4 changes: 2 additions & 2 deletions SECURITY.md
@@ -1,5 +1,5 @@
# Security Policies and Procedures

Please see Libra's
[security policies](https://developers.libra.org/docs/policies/security) and
Please see Diem's
[security policies](https://developers.diem.com/docs/policies/security) and
procedures for reporting vulnerabilities.
8 changes: 4 additions & 4 deletions benches/lcs_bench.rs → benches/bcs_bench.rs
@@ -1,11 +1,11 @@
// Copyright (c) The Libra Core Contributors
// Copyright (c) The Diem Core Contributors
// SPDX-License-Identifier: Apache-2.0

use bcs::to_bytes;
use criterion::{criterion_group, criterion_main, Criterion};
use libra_canonical_serialization::to_bytes;
use std::collections::{BTreeMap, HashMap};

pub fn lcs_benchmark(c: &mut Criterion) {
pub fn bcs_benchmark(c: &mut Criterion) {
let mut btree_map = BTreeMap::new();
let mut hash_map = HashMap::new();
for i in 0u32..2000u32 {
@@ -24,5 +24,5 @@ pub fn lcs_benchmark(c: &mut Criterion) {
});
}

criterion_group!(benches, lcs_benchmark);
criterion_group!(benches, bcs_benchmark);
criterion_main!(benches);
16 changes: 8 additions & 8 deletions src/de.rs
@@ -1,4 +1,4 @@
// Copyright (c) The Libra Core Contributors
// Copyright (c) The Diem Core Contributors
// SPDX-License-Identifier: Apache-2.0

use crate::error::{Error, Result};
@@ -7,13 +7,13 @@ use std::convert::TryFrom;

/// Deserializes a `&[u8]` into a type.
///
/// This function will attempt to interpret `bytes` as the LCS serialized form of `T` and
/// This function will attempt to interpret `bytes` as the BCS serialized form of `T` and
/// deserialize `T` from `bytes`.
///
/// # Examples
///
/// ```
/// use libra_canonical_serialization::from_bytes;
/// use bcs::from_bytes;
/// use serde::Deserialize;
///
/// #[derive(Deserialize)]
@@ -53,7 +53,7 @@ where
deserializer.end().map(move |_| t)
}

/// Deserialization implementation for LCS
/// Deserialization implementation for BCS
struct Deserializer<'de> {
input: &'de [u8],
max_remaining_depth: usize,
@@ -196,7 +196,7 @@ impl<'de> Deserializer<'de> {
impl<'de, 'a> de::Deserializer<'de> for &'a mut Deserializer<'de> {
type Error = Error;

// LCS is not a self-describing format so we can't implement `deserialize_any`
// BCS is not a self-describing format so we can't implement `deserialize_any`
fn deserialize_any<V>(self, _visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
@@ -438,23 +438,23 @@ impl<'de, 'a> de::Deserializer<'de> for &'a mut Deserializer<'de> {
r
}

// LCS does not utilize identifiers, so throw them away
// BCS does not utilize identifiers, so throw them away
fn deserialize_identifier<V>(self, _visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
self.deserialize_bytes(_visitor)
}

// LCS is not a self-describing format so we can't implement `deserialize_ignored_any`
// BCS is not a self-describing format so we can't implement `deserialize_ignored_any`
fn deserialize_ignored_any<V>(self, _visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
Err(Error::NotSupported("deserialize_ignored_any"))
}

// LCS is not a human readable format
// BCS is not a human readable format
fn is_human_readable(&self) -> bool {
false
}
2 changes: 1 addition & 1 deletion src/error.rs
@@ -1,4 +1,4 @@
// Copyright (c) The Libra Core Contributors
// Copyright (c) The Diem Core Contributors
// SPDX-License-Identifier: Apache-2.0

use serde::{de, ser};