Optimization pass: Use AutoHoG to construct compound multi-input/multi-output gates for CGGI schemes #648

asraa · 2024-04-24T21:56:56Z


Context
AutoHoG is a procedure for generating "enriched" compound gates over FHE that can reduce the computational complexity of boolean FHE circuits, such as those evaluated with the CGGI scheme, compared to using standard cell libraries (e.g. LUT cells, or AND/NOT/XOR/etc. gates). AutoHoG takes Verilog and synthesizes a netlist (as JSON), which is then converted into a DAG (as JSON). AutoHoG produces an optimized DAG, which is given to Iyokan for gate evaluation.

Currently, our tosa-to-boolean-tfhe pipeline lowers standard MLIR circuits to an optimized combinational circuit using either LUT cells or the standard boolean gate cells. This conversion pass internally converts the module into Verilog and calls Yosys as a library. The Yosys passes use ABC to techmap the circuit. The resulting RTLIL (Yosys' internal circuit representation) is converted back to MLIR using a combinational circuit dialect that contains LUT and boolean gate operations.

Proposal
This proposal is to integrate AutoHoG into our pass pipeline as a replacement for the current Yosys Optimizer pass described above.

Design and Discussions

Representing MISO generated gates

The multi-input single-output (MISO) gates generated by AutoHoG specify (1) a linear combination to apply to the inputs and (2) a truth table applied to the result of the linear combination. In our current pass, we read the truth table value from the RTLIL cell's string attribute. If the optimized DAG can contain custom attributes for the cells, we can translate those into MLIR attributes.

In MLIR, we can construct an abstract cell that holds (1) a custom "linear combination" attribute containing (i) an array of integers describing the weights of the linear combination and (ii) an offset to add, and (2) an integer attribute that represents the truth table T (a table of size n can be represented with an n-bit integer). For example:

%3 = comb.generic_gate %0, %1, %2 {linear = <weights = [1, 2, -1], offset=0>, truth_table = 5} : (i1, i1, i1) -> i1
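To make the intended plaintext semantics of such a gate concrete, here is a minimal Python sketch. It is not taken from AutoHoG or HEIR: the function name `eval_generic_gate`, the explicit `table_size` parameter, the LSB-first bit order, and the reduction of the linear combination modulo the table size are all assumptions made for illustration.

```python
from typing import Sequence


def eval_generic_gate(
    inputs: Sequence[bool],
    weights: Sequence[int],
    offset: int,
    truth_table: int,
    table_size: int,
) -> bool:
    """Plaintext model of a MISO gate: linear combination, then table lookup.

    Assumptions (not from the source): the truth table is an integer whose
    bit i is the output for index i, and the linear combination is reduced
    modulo table_size to obtain the index.
    """
    if len(inputs) != len(weights):
        raise ValueError("one weight is required per input")
    # Step (1): weighted sum of the boolean inputs plus the offset.
    acc = offset + sum(w * int(x) for w, x in zip(weights, inputs))
    # Step (2): look up the bit of the truth table selected by the sum.
    index = acc % table_size
    return bool((truth_table >> index) & 1)


if __name__ == "__main__":
    # Two-input gate with weights [1, -1], offset 0, truth_table 3, 4 entries.
    for a in (False, True):
        for b in (False, True):
            print(a, b, eval_generic_gate([a, b], [1, -1], 0, 3, 4))
```

Under these assumptions, the two-input gate with weights [1, -1] and truth table 3 returns false only for the input (false, true). The real encoding used by AutoHoG may differ, in particular in how negative sums are mapped to table entries.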

Representing combination: multi-output MIMO gates

AutoHoG also combines multiple MISO gates into a MIMO gate by merging gates that have the same inputs. This technique can evaluate multiple LUTs on those same inputs for the cost of a single bootstrap. After the blind rotate, we multiply by another polynomial. The biggest concern here is holding this polynomial in memory.

TODO: I'm still working on this one

Considerations when lowering to an exit dialect

The exit dialect we lower to in order to emit library code (before we simply lower the scheme to low-level instructions / LLVM / etc.) will require the ability to evaluate the linear combinations at runtime and to generate the LUTs at runtime. We may be able to encode the linear logic into API calls when lowering to the scheme-specific dialects. For example, the MISO gate below can be lowered to the sequence of ops that follows it:

%3 = comb.generic_gate %0, %1 {linear = <weights = [1, -1], offset=0>, truth_table = 3} : (i1, i1) -> i1

%2 = lwe.sub %0, %1 : !lwe.lwe_ciphertext
%3 = cggi.generate_lut_polynomial 3 : i2 -> !cggi.truth_table
%4 = cggi.bootstrap %2, %3, %params : !lwe.lwe_ciphertext

Clearly, we need to lower to dialects that can support granular ops like generating test polynomials, blind rotate, multiplying by polynomials, etc.

cc @sbian3 @Lavendes

I will continue to update tomorrow with more info.

j2kun · 2024-04-24T22:58:10Z


Some notes:

For the linear combination attribute, we can probably reuse affine_map, since it can express linear combinations (and more).
jaxite can definitely support the lower level API, and we can package up higher level granularity ops if necessary.
Is there any hope of porting the implementation directly to HEIR? The paper abstract says "match-and-replace strategy" which suggests we could implement it via MLIR's pattern-matching engine.
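As an illustration of the affine_map idea in the first note, the sketch below renders a weights/offset pair as text in the style of MLIR's `affine_map` syntax. The helper name `linear_to_affine_map` is hypothetical, and the output is only intended to follow the textual syntax; it has not been checked against an MLIR parser.

```python
from typing import Sequence


def linear_to_affine_map(weights: Sequence[int], offset: int = 0) -> str:
    """Render a linear combination as affine_map-style text (illustrative only)."""
    # One dimension identifier per gate input: d0, d1, ...
    dims = ", ".join(f"d{i}" for i in range(len(weights)))
    # One term per non-zero weight, written uniformly as "dN * w".
    terms = [f"d{i} * {w}" for i, w in enumerate(weights) if w != 0]
    # Append the constant offset if present, or if there are no other terms.
    if offset != 0 or not terms:
        terms.append(str(offset))
    return f"affine_map<({dims}) -> ({' + '.join(terms)})>"


if __name__ == "__main__":
    print(linear_to_affine_map([1, 2, -1]))
```

One possible drawback of this representation is that the gate would need to be paired with a convention for which dimension corresponds to which operand; the sketch simply uses operand order.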

@sbian3 could we get access to a copy of the AutoHoG paper linked above? I don't have access to it. Some questions I have related to the paper:

How does the optimization/synthesis time relate to Yosys? An example runtime for a given input size would suffice.
Do the innovations in this paper impose additional constraints on scheme parameters? For example, in jaxite we chose RLWE parameters with dimension 2 (so that an RLWE sample is (a_1, a_2, b)), which is not common among TFHE implementations.


j2kun Apr 24, 2024
Maintainer

Is a MIMO gate different from just having a list of LUTs in its attribute, and an equivalently-sized list of results? That seems straightforward to model.

Lavendes Apr 29, 2024

Replacement for the ISCAS c1908 circuit (with 880 gates) takes approximately thirty minutes.
In AutoHoG, the generated gates use parameters (a, b), where the dimension n of a is 2048. I believe multidimensional parameters are feasible as long as kn equals 2048.
MIMO gate modeling requires a linear combination of inputs and multiple test vectors, known as table T in our paper. It also necessitates recording the multiple output wires in the circuit.
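A minimal plaintext sketch of the MIMO model described in this thread (one shared linear combination, a list of truth tables, and one output per table) might look like the following. The class name `MimoGate`, its fields, and the modular, LSB-first table indexing are assumptions for illustration and are not taken from the AutoHoG paper or from HEIR.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class MimoGate:
    """Hypothetical plaintext model: one linear combination, many truth tables."""

    weights: Sequence[int]
    offset: int
    truth_tables: Sequence[int]  # one integer-encoded table per output wire
    table_size: int

    def evaluate(self, inputs: Sequence[bool]) -> List[bool]:
        if len(inputs) != len(self.weights):
            raise ValueError("one weight is required per input")
        # The linear combination is computed once and shared by all outputs.
        acc = self.offset + sum(w * int(x) for w, x in zip(self.weights, inputs))
        index = acc % self.table_size
        # Each truth table yields one output bit for the same index.
        return [bool((t >> index) & 1) for t in self.truth_tables]


if __name__ == "__main__":
    # With a + b as the shared sum, table 0b100 acts as AND (carry)
    # and table 0b010 acts as XOR (sum), i.e. a half adder.
    half_adder = MimoGate(weights=[1, 1], offset=0,
                          truth_tables=[0b100, 0b010], table_size=4)
    print(half_adder.evaluate([True, True]))
```

In this model the outputs are returned in the same order as the truth tables, which corresponds to recording the multiple output wires of the gate.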
