Lean models of various functions and data types from Rust libraries to facilitate Lean-based verification of Rust programs.
This library is intended to be used for verifying Rust programs via Lean. It defines equivalent Lean versions of functions from the Rust standard library.
The Lean definitions are based on the description of the Rust functions, which is published at https://doc.rust-lang.org/std.
The library includes:
- Definitions of Lean functions equivalent to those from the Rust standard library
- Proofs that these definitions are consistent with the description of the Rust functions
- Supporting lemmas and theorems for proof construction
-
Add the following lines to your
lakefile.lean
:require «rust-lean-models» from git "https://github.com/model-checking/rust-lean-models"
-
Change the lean version in your
lean-toolchain
to the one in lean-toolchain -
Run
lake update
in the terminal. -
Import the packages and open the namespaces in your Lean files (see ProofTutorial.lean)
import RustLeanModels.Basic import RustLeanModels.RustString open RustString open Iterator
Replace the Rust built-in functions that are used in your code with the corresponding Lean function in this library.
Tables in RustString.md and Iterators.md give the mapping from Rust types and functions to their Lean equivalents.
We demonstrate the applications of the library and some proof techniques for string programs in ProofTutorial.lean. This tutorial shows the correctness of two simple programs that compute the longest common prefix and longest common substring of the two strings. More examples can be found in ProofExample.lean.
For each Rust function, we provide a recursive Lean function. Implementing
the equivalent functions in Lean recursively enables users to construct
induction proofs conveniently. The Lean function has the same name as the Rust original,
except when the Rust name clashes with a Lean keyword. In case of a clash, a Rust function func_name
is renamed to rust_func_name
in Lean.
Implementing a function recursively may requires some extra parameters.
In those cases, we first define an auxiliary function (name: func_name_aux
) which is defined
recursively with the extra parameters, then we define the function func_name
based on the auxiliary function
with initial values for the parameters.
For example, char_indices
is defined based on char_indices_aux
as
def char_indices (s: Str) := char_indices_aux s 0
For Rust functions that use the Rust Pattern
type, we implement two recursive sub-functions
(name: func_name_char_filter
and func_name_substring
) that replace the Pattern
type
in the input with either a char filter (of typeChar → Bool
) or a string (of type List Char
).
Then we define the function func_name
based on these two sub-functions by matching the
input of Pattern
type. For example, split
is defined by two sub-functions
split_char_filter
and split_substring
as:
def split (s: Str) (p: Pattern) := match p with
| Pattern.SingleChar c => split_char_filter s (fun x => x = c)
| Pattern.ListChar l => split_char_filter s (fun x => x ∈ l)
| Pattern.FilterFunction f => split_char_filter s f
| Pattern.WholeString s1 => split_substring s s1
All recursive implementations are proven to be "correct" in the sense that they are consistent with the descriptions of the Rust versions (see below for more details).
For functions that return Bool
or can be defined based on other functions that have already been proven correct
-
First, we provide a variant Lean definition of the Rust function that we call the definitional version (with name
func_name_def
). This version is intended to match the documented description of the Rust function. Ideally, the definitional version should not contain recursion, except in some trivial cases, but it can make use of the recursive versions of other functions that have been previously proven correct. -
Then, we state and prove a theorem that shows that the recursive and definitional versions of Rust function
func_name
are equivalent. This equivalence theorem is calledfunc_name_EQ
and it has typefunc_name = func_name_def
. The theorem ensures that the function is implemented correctly and allows the two versions to be used interchangeably. In some cases, constructing a non-induction proof using the definitional version is more convenient. -
For example, the function
is_char_boundary
has a definitional version:def is_char_boundary_def (s: Str) (i: Nat) := i ∈ ListCharPos s ∨ i = byteSize s
a recursive definition:
def is_char_boundary (s: Str) (i: Nat) := match s with | [] => i == 0 | h::t => if i = 0 then true else if i < Char.utf8Size h then false else is_char_boundary t (i - Char.utf8Size h)
and an equivalence theorem
theorem is_char_boundary_EQ : is_char_boundary s p = is_char_boundary_def s p
When the description of a Rust function cannot be efficiently expressed in Lean (requires recursions, or is unintuitive), we can:
-
Define the definitional version (similar to Case 1) based on a recursive trivial function, then prove the equivalence theorem. For example, the
byteSize_def
function is based on the functionsum_list_Nat
that computes the sum of a list of natural numbers:def sum_list_Nat (l : List Nat) := List.foldl Nat.add 0 l` def byteSize_def (s : List Char) : Nat := sum_list_Nat (List.map (fun x: Char => Char.utf8Size x) s)
-
Define and prove the correctness of one/some subordinate function(s), then define the definitional version based on them. For example, to define
split_inclusive_char_filter_def
, we first define and prove the correctness of two functions:-
list_char_filter_charIndex (s: Str) (f: Char → Bool)
: returns the list of positions of characters ins
satisfying the filterf
-
split_at_charIndex_list (s: Str) (l: List Nat)
: splits the stringss
at positions inl
then we define
split_inclusive_char_filter_def
as follows:def split_inclusive_char_filter_def (s: Str) (f: Char → Bool) := split_at_charIndex_list s (List.map (fun x => x+1) (list_char_filter_charIndex s f))
-
We state and prove a soundness theorem for the function with
name: func_name_sound
and type: x = func_name input1 input2 ... ↔ properties of x
.
For example, the soundness theorem for the function floor_char_boundary
defined as:
def floor_char_boundary_aux (s: Str) (i: Nat) (k: Nat): Nat := match s with
| [] => k
| h::t => if i < Char.utf8Size h then k else floor_char_boundary_aux t (i - Char.utf8Size h) (k + Char.utf8Size h)
def floor_char_boundary (s: Str) (i: Nat) := floor_char_boundary_aux s i 0
is
theorem floor_char_boundary_sound: flp = floor_char_boundary s p
↔ (is_char_boundary s flp) ∧ (flp ≤ p) ∧ (∀ k, ((is_char_boundary s k) ∧ (k ≤ p)) → k ≤ flp )
- The soundness theorem should cover all the properties in the Rust documentation.
- We make some minor/trivial adjustments when needed to express the properties in Lean.
- The modus ponens (→) direction of the theorem is enough to ensure the correctness of the recursive version if the properties in the right-hand-side ensure that the function in the left-hand-side is deterministic.
- The modus ponens reverse (←) direction ensures that we stated enough properties in right-hand-side such that it can be satisfied by only one function.
If the function returns an option, we separately state and prove two soundness theorems for the two cases
of the return value: func_name_none_sound
and func_name_some_sound
. For example:
```lean
theorem split_at_none_sound : split_at s p = none ↔ ¬ ∃ s1, List.IsPrefix s1 s ∧ byteSize s1 = p
theorem split_at_some_sound : split_at s p = some (s1, s2) ↔ byteSize s1 = p ∧ s = s1 ++ s2
```
For functions involving the Pattern
type, we separately state and prove two equivalent/soundness
theorems for the two sub-functions discussed previously (func_name_char_filter_EQ
and func_name_substring_EQ
)
or (func_name_char_filter_sound
and func_name_substring_sound
). For example:
```lean
theorem contains_char_filter_EQ : contains_char_filter s f = contains_char_filter_def s f
theorem contains_substring_EQ : contains_substring s p = contains_substring_def s p
```
See CONTRIBUTING for more information.
This project is licensed under the Apache-2.0 License.