Notes on Field Agnosticity #390

Open · DavePearce opened this issue Nov 26, 2024 · 8 comments

DavePearce commented Nov 26, 2024

Overview

The goal here is to allow support for different underlying prime fields. Currently, the prime field is assumed to be the scalar field of BLS12-377. Elements of that field require a minimum of 252 bits of storage, and it can comfortably hold u128 values for addition without overflow. Thus, representing a u256 value (as found in the EVM) requires a vector of two field elements. The challenge with supporting different fields is that many of interest are much smaller:

  • (Goldilocks) Elements of the Goldilocks field fit into a single u64, whose prime p is defined as 2^64 - 2^32 + 1 = 18446744069414584321. This means it can hold the result of adding two u62 values (or four u61 values) without overflow. Thus, in order to represent a u256 value, we would need a vector of at least five field elements.

  • (Baby Bear) Elements of the Baby Bear field fit into a single u32, whose prime p is defined as 15*2^27 + 1 = 2013265921. This means it can hold the result of adding two u29 values without overflow. Thus, in order to represent a u256 value, we would need a vector of at least nine field elements.

The goal here is to give some example implementations of common operations, such as addition, multiplication, normalisation, comparison, etc. This is loosely based on an original set of notes.
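
As a quick sanity check, the minimum number of limbs follows directly from the chosen word size w (in bits):

n = ceil(256 / w)

For example, ceil(256/128) = 2 (BLS12-377), ceil(256/52) = 5 (Goldilocks) and ceil(256/29) = 9 (Baby Bear).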

DavePearce self-assigned this on Nov 26, 2024
DavePearce changed the title from Support Field Agnosticity to Notes on Field Agnosticity on Nov 26, 2024
DavePearce reopened this on Nov 27, 2024
DavePearce commented Nov 28, 2024

Constraints

The existing constraints split u256 values into two u128 limbs, and are hardcoded with the assumption that the underlying field is large enough. For example, a common operation is a byte decomposition, which aims to show that a given value fits into a u256. Let us assume that our value is split into the RES_HI and RES_LO columns. Then, the byte decomposition will require an additional four columns (BYTE_HI, BYTE_LO, ACC_HI, ACC_LO) and, for now, assume it will take 16 rows (since a u128 has 16 bytes). Then, an example decomposition of RES_LO looks like this:

Row RES_LO BYTE_LO ACC_LO
0 0xFFEEDDCCBBAA99887766554433221100 0xFF 0xFF
1 0xFFEEDDCCBBAA99887766554433221100 0xEE 0xFFEE
2 0xFFEEDDCCBBAA99887766554433221100 0xDD 0xFFEEDD
3 0xFFEEDDCCBBAA99887766554433221100 0xCC 0xFFEEDDCC
...
14 0xFFEEDDCCBBAA99887766554433221100 0x11 0xFFEEDDCCBBAA998877665544332211
15 0xFFEEDDCCBBAA99887766554433221100 0x00 0xFFEEDDCCBBAA99887766554433221100

What we see is that ACC_LO accumulates the values of BYTE_LO until it matches the original value of RES_LO. Since this accumulation must be done within 16 rows, it proves that RES_LO does indeed fit within a u128. A similar thing is done for RES_HI. Also, we can optimise this to reduce the number of rows required for small values. For example, if RES_LO actually only holds, say, 0x4c22 then we can do the decomposition in two rows.
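
For concreteness, the accumulator constraints might look something like the following sketch. Several assumptions are baked in here: a counter column CT (not shown in the table above) counting 0..15 within each decomposition window, corset's if-zero semantics (a zero condition selects the first branch, and the second branch may be omitted), and the shift operator for referring to the previous row. Column types are illustrative only.

(defcolumns
  (RES_LO  :i128)
  (BYTE_LO :byte@prove)
  (ACC_LO  :i128)
  (CT      :byte))

(defconstraint accumulate ()
  (if CT
      ;; CT == 0: the accumulator starts with the first byte
      (eq! ACC_LO BYTE_LO)
      ;; CT != 0: shift the accumulator left one byte and add the next
      (eq! ACC_LO (+ (* 256 (shift ACC_LO -1)) BYTE_LO))))

(defconstraint finalise ()
  (if (- CT 15)
      ;; CT == 15: the accumulator must equal the original value
      (eq! ACC_LO RES_LO)))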

DavePearce commented Nov 28, 2024

Word Size

Given a field 𝔽 which can hold at least n bits of information, the question is what word size should be chosen. That is, what the size of each limb making up a u256 should be. For BLS12-377, the word size is u128 and thus two limbs are required for each u256 value. Different word sizes could be chosen, however, and these have different trade-offs. For example, a word size could be chosen to ensure the underlying field element is big enough to hold the result of multiplying two limb values (i.e. two u128 values) without causing overflow. That is quite a strong assumption, and probably not optimal in most cases.

(Configurations) Generally speaking, we want to choose a word size such that the underlying field can hold the result of summing at least m words without overflow. For the Goldilocks field, for example, we could represent a u256 using a vector of five u52 limbs --- this would allow the underlying field to hold the result of summing up to 4095 words without overflow (see the worked bound below). This ensures certain optimisations (see below) are possible. However, in contrast, suppose our prime field was 5 and our word size u4. Then, we could not even hold a single maximal word, let alone the result of summing two words, without overflow --- meaning this configuration is not usable.
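
To make the bound concrete, the maximum number m of w-bit words whose sum is guaranteed not to overflow a field of prime p is:

m = floor((p - 1) / (2^w - 1))

Goldilocks, w = 52:  m = floor((2^64 - 2^32) / (2^52 - 1)) = 4095
p = 5,      w = 4:   m = floor(4 / 15) = 0   (unusable)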

DavePearce commented Nov 28, 2024

Addition

When adding two u52 values together, the result always fits within a u53 (where the most significant bit is also called the overflow bit). From electronic circuits, a common strategy for chaining adders together is the so-called ripple-carry adder. At a high level it looks like this:

[Figure: RippleCarryAdder, a chain of four limb adders where each carry bit feeds the next adder]

Here, we are adding two values X+Y=Z where each argument is split across four limbs. The carry bit cN for each adder is the overflow bit from the addition. We can then implement our adder as follows:

(defcolumns
  (X3 :i16@prove) (X2 :i16@prove) (X1 :i16@prove) (X0 :i16@prove)
  (Y3 :i16@prove) (Y2 :i16@prove) (Y1 :i16@prove) (Y0 :i16@prove)
  (C3 :i1@prove)  (C2 :i1@prove)  (C1 :i1@prove)  (C0 :i1@prove)
  (Z3 :i16)       (Z2 :i16)       (Z1 :i16)       (Z0 :i16))

(defconst OVERFLOW 65536) ;; 2^16, i.e. one beyond the maximum u16 value

(defpurefun (ADDER cin arg1 arg2 out cout)
  ;; out + cout*2^16 == arg1 + arg2 + cin
  (eq! (+ out (* cout OVERFLOW)) (+ arg1 arg2 cin)))

(defconstraint X_Y_Z ()
  (begin
   (ADDER 0 X0 Y0 Z0 C0)
   (ADDER C0 X1 Y1 Z1 C1)
   (ADDER C1 X2 Y2 Z2 C2)
   (ADDER C2 X3 Y3 Z3 C3)))

Here, the columns for X and Y are marked i16@prove to ensure these values fit within a u16. However, if this were part of a larger circuit where X and Y were already known to be valid u16 values, then the @prove annotations could be avoided.
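
For example, adding X = 0x00000000FFFFFFFF to Y = 0x0000000000000001 gives Z = 0x0000000100000000 via the following limb assignments, where the carry ripples through the low two limbs:

Limb  X     Y     C  Z
0     FFFF  0001  1  0000
1     FFFF  0000  1  0000
2     0000  0000  0  0001
3     0000  0000  0  0000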

DavePearce commented Nov 28, 2024

Zero Test

The zero test checks whether a given expression is zero or not. When a given u256 value is split up into n limbs, the test must check that every limb is zero. When the underlying field is large enough to hold the sum of all the limbs without overflow, this can be optimised to a single sum. The following illustrates:

(defcolumns
  (X3 :i16@prove) (X2 :i16@prove) (X1 :i16@prove) (X0 :i16@prove)
  (RES :binary@prove)
)

(defpurefun ((IS-ZERO :binary@loob :force) w3 w2 w1 w0) (+ w3 w2 w1 w0))

(defconstraint c1 ()
  (if (IS-ZERO X3 X2 X1 X0)
      (vanishes! RES) (eq! RES 1)))

In this example, we are assuming that the sum (+ w3 w2 w1 w0) cannot overflow the underlying field element. In that case, the only way for the sum to be 0 is if all of the words w3 .. w0 are themselves 0.
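
When even the full limb sum could overflow (e.g. many small limbs over a tiny field), one hypothetical fallback is to split the test into partial sums which are individually safe. A sketch, assuming each pairwise sum fits within the field:

(defpurefun ((PAIR-ZERO :binary@loob :force) w1 w0) (+ w1 w0))

(defconstraint c2 ()
  (if (PAIR-ZERO X3 X2)
      ;; high pair is zero; outcome decided by the low pair
      (if (PAIR-ZERO X1 X0)
          (vanishes! RES)
          (eq! RES 1))
      ;; high pair is non-zero
      (eq! RES 1)))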

DavePearce commented Nov 28, 2024

Normalisation

The normalisation operator projects a given value X to a binary value Y, such that Y==0 iff X==0 and Y==1 otherwise. Thus, Y tells us whether X was zero or non-zero. As above, when the underlying field is large enough to hold the sum of all the limbs without overflow, this can be optimised to the normalisation of a single sum. The following illustrates:

(defcolumns
  (X3 :i16@prove) (X2 :i16@prove) (X1 :i16@prove) (X0 :i16@prove)
  (RES :binary@prove)
)

(defpurefun ((NORM :binary) w3 w2 w1 w0) (~ (+ w3 w2 w1 w0)))

(defconstraint c1 ()
  (eq! RES (NORM X3 X2 X1 X0)))
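
If the limb sum itself could overflow, a hypothetical alternative is to normalise each limb first: every (~ w) is 0 or 1, so the outer sum is at most four, which any usable field can hold. A sketch:

(defpurefun ((NORM-SAFE :binary) w3 w2 w1 w0)
  (~ (+ (~ w3) (~ w2) (~ w1) (~ w0))))

(defconstraint c2 ()
  (eq! RES (NORM-SAFE X3 X2 X1 X0)))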

DavePearce commented Dec 9, 2024

Constraints

An important question is how we apply all of this to actual constraints. As a simple example, consider:

X + Y = 1 + 2 * Z

And let's assume all columns are 16 bits, and say the underlying field gives 18 bits of information. Some example values:

X1 X0 Y1 Y0 Z1 Z0
01 1a 00 05 00 8f

Then, start by breaking up the constraints:

X0 + Y0 = 1 + Z0 + Z0
X1 + Y1 = 0 + Z1 + Z1

We can see this naive split will fail because X1==1 whilst Z1==0, even though the original constraint holds (0x011a + 0x0005 == 1 + 2*0x008f == 287). We can rephrase this with carries as:

(256 * Cxy) + Txy0 = X0 + Y0
Txy1 = Cxy + X1 + Y1

(256 * Cz) + Tz0 = 1 + Z0 + Z0
Tz1 = Cz + Z1 + Z1

Txy0 == Tz0
Txy1 == Tz1
Cxy < 2
Cz < 2
Txy0 < 256
Txy1 < 256
Tz0 < 256
Tz1 < 256

Then, for our example, we get this:

Row X  Y  Cxy Txy Tz Cz Z
0   1a 05 00  1f  1f 01 8f
1   01 00 00  01  01 00 00
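
Rendered in the corset style used earlier, these split constraints might look as follows. The :byte and :binary column types are assumptions here; the @prove annotations supply the range constraints listed above.

(defcolumns
  (X1 :byte@prove) (X0 :byte@prove)
  (Y1 :byte@prove) (Y0 :byte@prove)
  (Z1 :byte@prove) (Z0 :byte@prove)
  (CXY :binary@prove) (CZ :binary@prove)
  (TXY1 :byte@prove) (TXY0 :byte@prove)
  (TZ1 :byte@prove)  (TZ0 :byte@prove))

;; X + Y split into byte limbs with carry CXY
(defconstraint lo-xy () (eq! (+ (* 256 CXY) TXY0) (+ X0 Y0)))
(defconstraint hi-xy () (eq! TXY1 (+ CXY X1 Y1)))

;; 1 + 2*Z split likewise with carry CZ
(defconstraint lo-z () (eq! (+ (* 256 CZ) TZ0) (+ 1 Z0 Z0)))
(defconstraint hi-z () (eq! TZ1 (+ CZ Z1 Z1)))

;; both sides must agree limb by limb
(defconstraint eq-lo () (eq! TXY0 TZ0))
(defconstraint eq-hi () (eq! TXY1 TZ1))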

Question 1

Can we optimise this to use just one temporary column, or not?

  • The problem is that the sum 1 + Z0 + Z0 - X0 - Y0 might underflow and then fail the range constraints.
  • One possible solution would be to do e.g. (C*256+T = 1+Z0+Z0-X0-Y0) || (C*256+T = X0+Y0-1-Z0-Z0). The problem here is that the high byte constraints must mirror "the flip" presumably.

Question 2

Can we optimise the number of range constraints required?

  • We have to at least partition e.g. Txy from Cxy, otherwise a prover can rebalance them to change the outcome.

DavePearce commented Dec 10, 2024

Suggestion which uses four binary columns:

     X0 + Y0 + C0 * 2**128 =      (1 + Z0 + Z0) + D0 * 2**128
C0 + X1 + Y1 + C1 * 2**128 = D0 + (0 + Z1 + Z1) + D1 * 2**128

This seems a bit iffy, though, as in some cases C0 and D0 are unconstrained. For example, let X0==Z0 and Y0==Z0+1. Then the first constraint reduces to:

Z0 + Z0 + 1 + C0 * 2**128 =      (1 + Z0 + Z0) + D0 * 2**128

And we require that C0 == D0 but otherwise they are unconstrained.

DavePearce commented Dec 10, 2024

A rough strategy starts to appear:

  • Expand constraints fully, rewriting them into their "balanced" form (i.e. without negations, as illustrated above).
  • Split constraints accordingly to fit them into the field. The splitting may not be uniform, as we will customise the word size based on e.g. the number of additions, etc. This requires some kind of range analysis to check for overflows on either side.

There are a couple of problems with this as a basic approach:

  • (Normalisation) It's not clear how to handle normalisation here. Perhaps we can just treat it as a normal column? This might confuse the range analysis though.
  • (Non-Linear) This approach (of multiplying out the constraints) tends to lead to a lot of non-linear constraints (less than ideal).

For example, consider expanding this constraint:

Constraint                  Form
(X = Y) ∨ (X = Y + 1)       Logical
(X - Y) * (X - Y - 1) = 0   Arithmetic
X*X + Y*Y + Y = 2*X*Y + X   Expanded
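
For reference, the arithmetic row corresponds to a vanishing constraint along these lines (a sketch over assumed columns X and Y):

;; holds iff X == Y or X == Y + 1
(defconstraint x-eq-y-or-succ ()
  (vanishes! (* (- X Y) (- (- X Y) 1))))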

(Problem) The expanded form requires several non-linear multiplication gates --- which are presumably expensive to implement?

(Therefore) It makes sense to work on the original logical form. It is actually possible to lower HIR into a logical form (LIR) rather than MIR, and then either go from there to MIR, or straight down to AIR.
