High-Level FHE Dialect #53

AlexanderViand-Intel · 2023-06-29T15:28:56Z

AlexanderViand-Intel
Jun 29, 2023
Maintainer

As discussed in the MLIR Dialects Session, I'm setting up three discussions topics, corresponding to the three streams identified.
This is the High-Level FHE Dialect discussion, you can find the Scheme Dialects and Poly/Math Dialects discussions at the links.

The goal of this abstraction/dialect is to enable high-level programs (DSLs, annotated Python, C/C++, etc) to be translated to a scheme/implementation agnostic intermediate representation.

Technical/MLIR Challenges

In addition to the ability to distinguish between "public" and "private" data (most likely via a wrapper-style type along the lines of !fhe.secret<T>, this requires the ability to express computations over encrypted data. Existing FHE MLIR compilers currently use their own "faux arith" dialects for this (e.g., fhe.add(%a, %b)) as the existing arith dialect restricts the types which can be used with its operations. However, there is an opportunity here to suggest a more generic version of arith that is more accepting of different types/semantics, with something like base2 (paper, code, spec) by @KFAFSP for use cases that need to precisely specify arithmetic semantics.

Conceptual Challenges

The scheme/approach-agnostic nature of such an abstraction also requires solving a challenge not yet addressed by the existing tools: Expressing sufficient information to allow conversion to an efficient FHE "circuit", no matter which computational paradigm (binary emulation with gate bootstrapping, precise or approximate large integers, or small(ish) integers with LUTs/PBSs) is used. This in addition to the challenge of tracking encryption noise (the "error" in Learning with Errors), both in the precise setting and in the approximate setting (where noise and message are mixed). While encryption noise can most likely be handled via MLIR analysis tooling, rather than needing to be encoded into the IR, it is less clear how to do this for "program-to-circuit approximation error" that arises when one converts an existing computation to a more efficiently evaluatable form, e.g., by truncating bit width of data types and/or replacing non-arithmetic operations (e.g., comparisons, sigmoid, etc) with either LUTs or polynomial approximations. Especially for the latter, it seems unclear how to express the error introduced by a high-degree approximation as a clean "number of bits". However, this might be possible when combined with sufficient range analysis (i.e., "in [x,y], this approximation has at most b bits of error").

Next steps

We need to solve the conceptual issues around representing approximation tolerance in a scheme-agnostic way, and converge on a technical approach which we could bring back to the Working Group and, with their approval, pitch as an RFC to the MLIR upstream. The expected timeline for this is 1 month until we have a draft RFC ready to present back to the Working Group and, ideally shortly afterwards, the MLIR community.

Tasks

Please register your interest in actively contributing to this abstraction/dialect by commenting below, and suggest possible next task.
Familiarizing ourselves with base2 and identifying prior discussions on the intended direction of arith

j2kun · 2023-06-29T21:36:58Z

j2kun
Jun 29, 2023
Maintainer

I think a good next step is to give a high-level sketch of what each of the possible implementations would look like (and where the challenges might occur), even including the supposedly-naive options like encoding noise in type attributes. I am happy to take a stab at this myself just to give me a chance to think about it.

For the question of "fixing arith" so that it could work with a new container type like secret<i32>, I could use a pointer to any existing material that describes how this would be implemented (as traits, I assume? I haven't tried implementing traits before, so if anyone knows of a good tutorial or upstream "golden" example I could learn from, please post it here)

3 replies

KFAFSP Jun 30, 2023

For the question of "fixing arith" so that it could work with a new container type like secret,

Please note that adding support for new container types is not the intention of arith update movements. In fact, it is widely argued that support for all containers, including the existing ShapedType support shall be removed. There are good arguments to be had for both sides, with the strongest in favor of keeping it probably being constant folding and thus mixing poison.

In your case, as secret<i32> would be an entirely domain-specific term of the FHE abstraction, I would actually adivse maintaining your own set of operations. If this is not to your liking, you may also consider the named requirements RFC I mentioned below as a light-weight polymorphism alternative, though I don't see the value without further information.

j2kun Jun 30, 2023
Maintainer

adding support for new container types is not the intention of arith update movements

Is there somewhere that describes what has been decided about the "arith update movements" that I could read more about?

KFAFSP Jul 1, 2023

Not that I'm aware of... You can gather most of it by reading the discourse board. Apart from that, most of the concrete gripes I also heard at the MLIR Hackathon / EuroLLVM, where I presented my approach.

I'm thinking of starting an RFC post for bit and cyclic next week. I wanted to hold off on that, but maybe it's better to let it "soak". We could then use that as a place to collect all this information.

Note that the consensus was not to deprecate arith, but to keep it around as an "unspecified" interface, which is how current users mostly experience it. Still, some changes are deemed necessary. So, even better would be someone stepping forth and starting and RFC for arith itself. I was not going to do that, because that part is more or less worthless to my work.

lschuetze · 2023-06-30T09:08:41Z

lschuetze
Jun 30, 2023

I welcome this initiative as we just started take a look at the MLIR landscape for FHE. In our use-case we used Proxy Re-Encryption (PRE) to enable a secure computation of data encrypted under different keys from multiple parties. Re-encryption happened to another key (i.e., the key of the party that receives the result, but is not doing the computation and not colluding with the entity that does the computation) whenever two operands of an operation have been encrypted under different keys. We implemented our prototype with PALISADE (now OpenFHE) using the BFV scheme.

At the high-level IR, this required to additionally tag private data with a label, i.e., secret<i32, key1>. Optimizations at this stage ignored the label. In a scheme-level dialect (rather belonging into #54) this is used to track the encrypting key during the computation and to insert additional PRE operations into the circuit, globally minimizing the amount of PRE operations and the multiplicative depth of the circuit. Our tools mapped the labels in the circuit later to the real keys and API calls.

4 replies

AlexanderViand-Intel Jul 1, 2023
Maintainer Author

Thank you so much for chiming in! The need to have secret<T> be relative to a key seems obvious now that you mentioned it but would have been easy to overlook! I think there's something interesting discussion to be had about whether, at this high level, it should really be a "key" or rather a (set of) "owners"/"parties with access" that should be tracked?

The correct answer here might depend quite a bit on the scenario a given compiler is targeting, so on a technical level we might want to go for a very permissively typed attribute (with a sensible "N/A" default + some pretty-printing to make sure this only shows if it's actually being used)?

lschuetze Jul 4, 2023

at this high level, it should really be a "key" or rather a (set of) "owners"/"parties with access" that should be tracked?

Right, in reality each operation unions the potential set of owners of a secret value that is computed. In the High-Level IR we only captured the key labels via the inputs, which gave us some polymorphism w.r.t. real keys and parties, and used those labels.

In our compiler, we computed the potential owners when we did the lowering. The question is, also technically with MLIR, how much of this set of potential owners must be captured already in the type-level of the IR or whether that can be computed later, too.

AlexanderViand-Intel Jul 6, 2023
Maintainer Author

Thanks for expanding on this! I like the idea of computing the sets during a lowering.
"Key" tagging seems like something where a type-alias-ed AnyAttribute might come in handy, to allow different compilers to do very different things.

asraa Jul 6, 2023
Maintainer

Nice! I like the idea of including the key as a parameter that the data is secret "relative" to.

With regards to the idea that "should it be a key or rather a set of owners": is there really a difference? I can imagine maybe we have one key encrypted with a key shared by Alice and Bob, and then another piece of data with Alice, Bob, and Charlie. In this case of simply adding one party (or perhaps the reverse and restricting to a subset), is the PRE operation any cheaper than going from an arbitrary party A to party B? If not, then it seems like keeping track of an opaque key "label" seems to make sense, and that the lowering would be responsible for computing the owners.

KFAFSP · 2023-06-30T12:15:44Z

KFAFSP
Jun 30, 2023

Thanks for pinging me on this. You may have realized already that there haven't been any updates to base2 recently. This is partly because I want to dedicate myself to properly upstreaming the prerequisites. In order, these are:

Named constraints
Poison semantics
bit & cyclic(no RFC opened yet)

While I have a working prototype integrating all of this, it has not been scrutinized by the community yet, and I expect changes to be necessary. Unfortunately, such is the way of moving from research to upstream. Therefore, should you determine that stronger integer semantics are necessary, I would be happy to move on this together during the RFC phase, and avoid waiting on implementation delays.

2 replies

AlexanderViand-Intel Jul 28, 2023
Maintainer Author

I noticed that there's been quite a bit of movement on the poison semantics RFC, and I was wondering if you would you be interested in joining our next discussion meeting (Tuesday, 1st of August at 17:00 CEST) to discuss the state of base2 and whether it and/or the underlying features might have applications to a potential fhe dialect? We'd very much appreciate it!

KFAFSP Aug 2, 2023

Sorry about my late reply, unfortunately I was unavailable at that time. I'm not sure whether I'm the right person for what I think you need, but I'd gladly join the next time around.

Note that Mehdi has now started the discussion about the future of arith on discourse: https://discourse.llvm.org/t/rfc-arithmetic-vs-llvm-dialect/72477/3

I think the sentiment so far supports following up on the ideas presented in this issue so far.

AlexanderViand-Intel · 2023-07-01T13:05:55Z

AlexanderViand-Intel
Jul 1, 2023
Maintainer Author

One of the key goals with this "High-Level FHE Abstraction" is to be able to represent complex programs pre "arithmetization" (i.e., truncation, approximation, LUT-ization, etc). However, while there are obvious ways to represent some "non arithmetic" operations in the IR (e.g., comparisons, divisions, and other "basic" operations), this gets a bit murkier for things such as commonly used activation functions (should there be an fhe.sigmoid?) and doesn't really extend to "arbitrary" functionalities.

Here, I think we should follow an approach similar to what certain frontends already do (concrete comes to mind here), essentially offering an interface that accepts a function/lambda and a range and does the necessary work to figure out, e.g., a k-bit LUT. For this, the high-level dialect should probably just nest the function/lambda in a special region-based operation, to allow different compilers to apply different lowerings (e.g., polynomial approximation instead of LUTs).

This raises a bunch of concrete questions about what information this operation needs to encode (in addition to the function) in order to allow this lowering to be done effectively by compilers following a variety of different approaches. Clearly, the expected input range seems pretty important, as maybe the acceptable error in bits?

7 replies

BourgerieQuentin Jul 11, 2023

On concrete there are no notion of floating point at the compiler level (we dealing only with integers). Concrete Python frontend handle some subgraphs where integers inputs are converted to floating point then convert to integer, this is transformed to a table lookup. This is the only support of floating point we have at the Concrete level. But Concrete-ML support floating points by quantizing floatting point model to integers.
But anyway support of floating point / fixed point / floating point could be added to the FHE high level dialect and the way to support should differs from a compiler to another depending of the actual support of the underlying scheme, maybe could be good to share commons passes for high level transformations (like quantization) that could make sense for several scheme.
About the activation function, it make sense as well to support any function and maybe have some arbitrary function that could be specifically optimized by the compiler, maybe having something like linalg.generic that can express the actual implementation and the "library_call" attribute. One of the challenge to express the function in MLIR and not like a lookup table is the code generation from the frontend, it was really much simpler to evaluate the function from python to convert into a lookup table instead of generate code of the function from python to MLIR. Anyway I think both approach (i.e. MLIR function vs Table lookup) are both interesting but as well the table lookup can be express with an arbitrary MLIR function.

BourgerieQuentin Jul 11, 2023

About the error I'm not sure we want to express that on the apply function operators, maybe it could make sense to express that on the type? From my point of view the error is the probability than the noise lead to an incorrect decryption of an fhe value (secret)

AlexanderViand-Intel Jul 11, 2023
Maintainer Author

Thanks for the detailed update Quentin! I think assuming that there'll be some kind of quantization of floating points is a good approach, and I like the idea of quantization as something that might end up shared between different compilers. For now, I'd suggest we make sure to allow things like fhe.secret<f64> so compilers have a way to express high-level programs pre-quantization but otherwise assume that "canonical" programs in the high-level IR will be relying only on fhe.secret<something-integer-like>.

I really like the comparison to linalg.generic, and also how many of the "custom" operations in the linalg dialect can be considered syntactic sugar for specific instances of linalg.generic, which could provide a nice way to both have a lot of custom operations for commonly used non-linear functionalities (ReLU, sigmoid, sign, comparisons, etc) without requiring compilers to implement all of them even if they don't optimize them, sine we could simply provide optional lowerings from e.g. ReLU to a corresponding generic instance.

At the same time, I absolutely agree that it's much easier to evaluate functions in the frontend, especially for the LUT approach. Maybe this is even true for the polynomial approximation approach? A trivial approach could be to sample a lot of points of the function and then look for a polynomial that is "close enough" (for some definition of that) to all of those points in a given interval. Either way, it seems like there's definitively a need for an fhe dialect operation that allows us to model LUTs/sampled functions. I wonder if this should maybe be at a slightly higher abstraction level than PBS LUTs. For example, it might make sense to allow multivariate LUTs and let the compiler deal with the univarification (pretty sure that's not a word, though) later.

AlexanderViand-Intel Jul 11, 2023
Maintainer Author

From my point of view the error is the probability than the noise lead to an incorrect decryption of an fhe value (secret)

I think we're talking about a different type of error here. In the meeting (meeting notes) we identified three types of error: encryption/LWE noise, encoding/decoding errors (e.g., CKKS or any other fixed-point approximations of reals), and program->circuit approximation error (i.e., error due to quantization and conversion of non-linear things to LUTs/polynomial approx.).
The first two are most likely best handled via analyses (i.e., not explicitly represented in the IR) as they're very much scheme/implementation specific. However, how much "approximation error" a program can tolerate depends a lot on the application, and so needs to be provided in some way by the programmer/frontend and should therefore be encoded in some way or other in the IR.

I'm not sure we want to express that on the apply function operators, maybe it could make sense to express that on the type?

Assuming "bits" is indeed a sufficiently generic measure of "approximation error", then it might indeed be a good idea to encode this in the type system. For example, as an optional parameter along the lines of fhe.secret<i16, tolerance=2>?

However, from a user/programmer perspective, it might be easier to think about "how many bits of error can I tolerate for this comparison/sigmoid/etc" rather than "how many bits of error can I tolerate in this value at this point in the program", as types are associated with SSA values, a lot of which don't have explicit counterparts in high-level programs.

AlexanderViand-Intel Jul 28, 2023
Maintainer Author

I was skimming through ELSAM, a follow-up paper to HECATE, which in turn proposes a more optimal rescaling insertion approach than the waterline approach used in EVA, and it reminded me that we should consider how to handle (fixed-point) scale in the high-level fhe dialect. Clearly, this mostly appears in the context of CKKS and it's ability to rescale, but the underlying notion of fixed-point values and scales seems more general.

Maybe this is actually a lower-level detail, and the compiler should work out the correct scale from some combination of high-level type, range, and "acceptable bits of approximation", but then that opens up the wohle can of worms that is floating point types and what it means to be "k bits off" on a floating point value...

AlexanderViand-Intel · 2023-07-20T06:42:17Z

AlexanderViand-Intel
Jul 20, 2023
Maintainer Author

In the last WG meeting, the idea of "core" and "non-core" operations came up, where "core" operations in a dialect/abstraction level. The non-core operations would come with default "lowerings" to core operations, so that tools only have to implement the core operations.
This is similar to what we discussed above with linalg.generic style things, but would also apply to the lower-level dialects.

While it might seem reasonable to use add/sub and mul as the only core operations for the high-level dialect, I don't think this is reasonable given the expressive power of bit-wise approaches and LUT-based things, and it seems like we'd need some kind of LUT-like op, too.

For example, we might have a (non-core) fhe.cmp operation for comparisons. Tools that have a direct translation for this (e.g., bit-wise encryption) will probably want to offer their own direct lowering (technically, I'm thinking in terms of MLIR Patterns here, not Lowerings/Passes, but I hope the point is clear enough either way).

Alternatively, fhe.cmp could lower to (core) fhe.sub and (non-core) fhe.sign, which would then allow tools that have a good approach for sign to take over. If this isn't the case, the default lowering of fhe.sign would be to the LUT-like core op (details TBD) that could then be realized either via some form of PBS or via polynomial interpolation.

Note that this actually has implications for the information that fhe.sign needs to contain (and I believe this will hold for a wide range of non-core ops): at minimum, we need the range of the input that should be handled, and the precision/step size. However, this assumes the points are evenly distributed in the range which might not be a good fit for the polynomial approach.

I don't have a lot of experience with state-of-the-art polynomial approx/interp approaches for FHE, so it'd love to hear a (more) expert opinion on whether or not this is too much of a constraint.

0 replies

j2kun · 2023-07-31T23:45:10Z

j2kun
Jul 31, 2023
Maintainer

FYI, I've started implementing secret in #78

I also read through the structured operations and transform dialects to see what linalg.generic was all about. I think it would be a good idea to model this, at the very least so that we can implement all our "core" ops (like secret.cmp) in terms of secret.generic. I will aim to get a prototype working in the next week or two, along with a forgetful/cleartext lowering.

2 replies

j2kun Aug 1, 2023
Maintainer

To expand/reframe this a bit better (contradicting some of what I wrote above):

My intention is to have secret be the dialect that represents structured computation on encrypted data, but with no specification of how the data is encrypted. This would not include any of the actual computation performed (like secret.cmp, which would go in a different computation-focused dialect).

How this could work is following in the footsteps of linalg.generic by defining an op, secret.generic, that "lifts" plaintext computation (implicitly) to operate on secret types, with lowerings to a dialect like fhe_math or fhe_arith or whatever that transform a "lift" to supported ops. By further annotating the secret.generic types with a library_call analogous to linalg.generic's, we can also preserve the high level metadata about what operation is being computed, so that those operations can be lowered directly to special ops, and everything else can be lowered generically.

Here's an example of what I mean, which I'm going to prototype in #78 this week:

The input program would look something like:

x : i32 = 7
# or however one gets the data as input, what matters is that the source language
# annotates `y` as private in some way we can ingest in MLIR
y : Secret[i32] = encrypt(input('Enter your value to encrypt'), sk)
z = x * y

The corresponding MLIR might look like:

module {
  func.func @main(%y : !secret.secret<i32>) {
  %X = arith.constant 7 : i32
  %Z = !secret.generic
    ins(%X, %Y : i32, !secret.secret<i32>)) {
    ^bb0(%x: i32, %y: i32) :
      %d = arith.addi %x, %y: i32
      !secret.yield %d : i32
    } -> (!secret.secret<i32>)
  }
}

Here the body of secret.generic operates on the plaintext values inside the corresponding arguments of type secret.secret (and plaintext arguments are passed through. But the result type is a secret.secret<i32>.

A naive lowering would match on this op, and convert arith.addi to the corresponding fhe_arith.plaintext_ciphertext_addi (or whatever the op is) that operates directly on the secret type. If the op is special, like sigmoid, then we'd rely on the input -> secret lowering to add that as a magic string attribute.

#sigmoid_generic = #secret.generic_attr {
  library_name = "sigmoid",
  ...
}
func.func @main(%y : !secret.secret<f32>) {
  %Z = !secret.generic #sigmoid_generic
    ins(%Y : !secret.secret<f32>)) {
    ^bb0(%y f32) {
       <naive implementation of sigmoid using arith>
    }
} -> (!secret.secret<i32>)

If the library_name is recognized, it would lower to the named op in fhe_math. If not, the lowering would be naive: split the secret.generic into single-op secret.generics, and then lower each generic op to the corresponding fhe_math op (via, say, the greedy rewrite engine).

I also think that secret is the right place to start including things like the precision constraints, but I want to give this a fair shake before expanding in that direction.

j2kun Aug 15, 2023
Maintainer

Moving this thread to #96

High-Level FHE Dialect #53

AlexanderViand-Intel Jun 29, 2023 Maintainer

Technical/MLIR Challenges

Conceptual Challenges

Next steps

Tasks

Replies: 6 comments · 18 replies

j2kun Jun 29, 2023 Maintainer

j2kun Jun 30, 2023 Maintainer

AlexanderViand-Intel Jul 1, 2023 Maintainer Author

AlexanderViand-Intel Jul 6, 2023 Maintainer Author

asraa Jul 6, 2023 Maintainer

AlexanderViand-Intel Jul 28, 2023 Maintainer Author

AlexanderViand-Intel Jul 1, 2023 Maintainer Author

AlexanderViand-Intel Jul 11, 2023 Maintainer Author

AlexanderViand-Intel Jul 11, 2023 Maintainer Author

AlexanderViand-Intel Jul 28, 2023 Maintainer Author

AlexanderViand-Intel Jul 20, 2023 Maintainer Author

j2kun Jul 31, 2023 Maintainer

j2kun Aug 1, 2023 Maintainer

j2kun Aug 15, 2023 Maintainer

AlexanderViand-Intel
Jun 29, 2023
Maintainer

Replies: 6 comments 18 replies

j2kun
Jun 29, 2023
Maintainer

j2kun Jun 30, 2023
Maintainer

AlexanderViand-Intel Jul 1, 2023
Maintainer Author

AlexanderViand-Intel Jul 6, 2023
Maintainer Author

asraa Jul 6, 2023
Maintainer

AlexanderViand-Intel Jul 28, 2023
Maintainer Author

AlexanderViand-Intel
Jul 1, 2023
Maintainer Author

AlexanderViand-Intel Jul 11, 2023
Maintainer Author

AlexanderViand-Intel Jul 11, 2023
Maintainer Author

AlexanderViand-Intel Jul 28, 2023
Maintainer Author

AlexanderViand-Intel
Jul 20, 2023
Maintainer Author

j2kun
Jul 31, 2023
Maintainer

j2kun Aug 1, 2023
Maintainer

j2kun Aug 15, 2023
Maintainer