Skip to content
This repository has been archived by the owner on Aug 15, 2024. It is now read-only.

Unified byte-level frontend-backend interface #9

Open
shamatar opened this issue Apr 1, 2019 · 5 comments
Open

Unified byte-level frontend-backend interface #9

shamatar opened this issue Apr 1, 2019 · 5 comments

Comments

@shamatar
Copy link
Member

shamatar commented Apr 1, 2019

Looks like in ZKP space there is a good separation like in LLVM toolchain - there is a set of tools to define circuits (R1CS) and calculate witness, and bellman or libsnark can be a proof generation backend. For this purpose there is a need to have a byte-level interface for the following uses:

  • export/import R1CS for purposes of proving/verifying key generation
  • export/import of proving/verifying keys
  • export/import witness for proof generation

A challenge for this is that there is no even a unified way how points or field elements are serialized (there is an official RFC with 0x02, 0x03, 0x04 to indicate compressed/decompressed points, but e.g. bellman uses pairing crate that has it's own definitions).

In the next posts I'll give some concrete proposals, and any external contributions are welcome!

@HarryR
Copy link

HarryR commented Apr 1, 2019

There is also the 'gadget standard' project by QED-it: https://github.com/QED-it/gadget_standard

They use flatbuffers to efficiently pass constraints and variables around. For more information, see the zkinterface specification.

They specify the field_order as bytes, and field elements as bytes, but I don't think they specify the endian-ness of them (I assume... little endian?)

It would be good to be able to extend this with the following:

  • constraint dump format for R1CS, using flatbuffers, specifies field order and number of variables
  • variable dump format - to accompany the constraints dump

The interface for prover modules (say, one for bellman, one for libsnark) would be to:

  1. load constraints dump file
  2. sanity checks
  3. load variables dump file
  4. generate proof
  5. return proof, in which format?

At the moment I'm using JSON to return the proof, with hexadecimal encoded values for coordinates & scalars, and not using point compression. This seems to be easiest to integrate with other programs, and doesn't require you to specify a schema as is required by flatbuffers / protobuf.

@shamatar
Copy link
Member Author

shamatar commented Apr 1, 2019

Yes, it's a great option. I really mean only "constraints dump" and "variable dump", not a gadget-level. Proof export is also an important point, as well as proving/verification keys.

JSON is always an option, but only for a small amounts of data. No-overhead deserialization is not that important for now I'd say. Field element serialization can be byte-level up to BigEndian/LittleEndian, although EC points are more difficult due to compression requirements.

Roughly a spec will look like:

  • Header
    • Curve name
    • Field length
    • Field BE encoding
    • Use of compression
    • ...
  • R1CS
    • Num public variables
    • Num private variables
    • Single constraint in a format of:
      • Num elements in A, B, C
      • arrays as tuples (coeff, Variable_ID)
      • If Variable_id < Num public variables it is a public one, otherwise - private one with index Variable_id - Num public variables

Fixed length data types, such as indexing (u32 or u64) can be either specified in header or made solid by the standard.

@HarryR
Copy link

HarryR commented Apr 1, 2019

Lets convert this to a flatbuffers spec, roughly based on zkinterface.fbs:

To describe the R1CS and variable assignments is something like:

    table Assignment {
        value : [ubyte] ;
    }

    table R1CS {
        constraints    : [Constraint] ;
        field_order    : [ubyte] ;
        num_publics : uint32 ;
        num_secrets : uint32 ;
        num_intermediates : uint32 ;
    }

    table R1CSWithAssignment {
        assignments : [Assignment] ;
        r1cs : R1CS ;
    }

    table Constraint {
        // (A) * (B) = (C)
        linear_combination_a : [Term] ;
        linear_combination_b : [Term] ;
        linear_combination_c : [Term] ;
    }

    table Term {
        variable_idx  :uint32;
        coefficient    :[ubyte];
    }

The R1CS.field_order parameter is implicitly little-endian to make life simpler, as is any big-integer scalar value (represented as an array of ubyte) - everything is little endian.

The constraint system is specific to a field order, it's assumed that any variable assignments for a constraint system will have values modulo the field order. All terms are assumed to have coefficents modulo the field order. A simple check can be performed, len(field_order) == len(x).

You then have 3 variables at the constraint-level:

  • num_publics - Number of public inputs, first n variables will be the public inputs
  • num_secrets - Subsequent n variables will be the 'secret' inputs
  • num_intermediates - Further n remaining variables will be intermediate values

The total number of variables required is num_publics + num_secrets + num_intermediates.

The variable ID is implicit, the R1CS struct allows you to determine the total number of variables, then the assignments : [Assignment] ; gives you a list of scalar values to be assigned to each variable. This avoids a dictionary mapping and allows for direct-index access with known bounds. Public inputs come first, then secret inputs, then intermediate/auxiliary variables fill the rest.

If the number of variable assignments provided doesn't match the number required (publics + secrets + intermediates) it is a fatal error

The flatbuffer spec handles magic & header etc.

If we're only dealing with the R1CS and variable assignments then the only parameter needed for sanity checks is the field-order. The curve name and use of point compression is irrelevant. Personally I think having the option to switch from little to big-endian encoding is also irrelevant, how many of us are natively using big-endian PowerPC or PA-RISC? Ya... none..

@shamatar
Copy link
Member Author

shamatar commented Apr 1, 2019

Just as a side note, while direct access to field in memory is cool, it’s unlikely to be handy for existing libraries.

@HarryR
Copy link

HarryR commented Apr 1, 2019

while direct access to field in memory is cool, it’s unlikely to be handy for existing libraries.

With libsnark you should be able to memcpy the [ubyte] array into the resultant field element class storage, or just cast the bytes as the correct constant type pointer and it will 'just work'.

Why would it not work?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants