Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SMT: Arithmetic bounds checking #109

Open
d-xo opened this issue May 12, 2021 · 0 comments
Open

SMT: Arithmetic bounds checking #109

d-xo opened this issue May 12, 2021 · 0 comments
Labels
enhancement New feature or request

Comments

@d-xo
Copy link
Collaborator

d-xo commented May 12, 2021

One of the major differences between the act language and EVM lies in the treatment of arithmetic. EVM arithmetic operates on 256 bit words with overflow, whereas act arithmetic operates on unbounded integers and never overflows.

In order to ensure that properties proved against the act specification are sound wrt to the implementation, we need to ensure that the specifications do not make claims about arithmetic expressions that would overflow if implemented in EVM, but do not when expressed as an act.

This analysis should be performed against the expressions in the storage, creates and returns blocks only. Expressions in the iff, ensures and invariants blocks are making claims about the spec only, and so do not need to be expressible in EVM.

Note that the presence of arbitrary preconditions over the expressions in question (in the iff block) would seem to preclude the possibility of a purely syntactic procedure, meaning we will need to rely on an SMT solver for an automatic analysis. As usual this will probably prove insufficient in the face of complex non-linear arithmetic. We may wish to consider allowing users to export problematic claims to Coq for manual proof if required.

As an example consider the following spec:

constructor of Mixed
interface constructor(int _a, uint _b, int64 _c, uint64 _d)

creates

    int   a := _a
    uint  b := _b
    int64 c := _c

behaviour nested of Mixed
interface nested()

storage

    c => ((c * c) mod 2 ^ 64) - (2 ^ 32)

behaviour nested1 of Mixed
interface nested1()

storage

    c => (c * c) / (2 ^ 64)

One potential analysis would proceed by walking up the tree for each top level integer expression (in this case the expressions on the RHS of the updates to c in nested and nested1), and asserting that the range of possible values at each intermediate node is within the range of a uint256 or an int256. This ensures that all subexpressions can be safely represented within an EVM word. The final assertion for the top node in the tree should check that the whole expression is within the range of the LHS of the assignment / update.

Note that we probably want to introduce syntax that allows us to specify the expected type of the return value.

As a more specific example we can sketch the expected smt-lib that we would want to produce for the update: c => (c * c) / (2 ^ 64):

We would start by declaring all relevant storage variables, all predicates from the iff block, and the bounds derived from the types of the variables involved:

(define-const MAX_UINT_256 Int 115792089237316195423570985008687907853269984665640564039457584007913129639935)
(define-const MIN_INT_256 Int (- 57896044618658097711785492504343953926634992332820282019728792003956564819968))
(define-const MAX_INT_256 Int 57896044618658097711785492504343953926634992332820282019728792003956564819967)
(define-const MIN_INT_64 Int (- 9223372036854775808))
(define-const MAX_INT_64 Int 9223372036854775807)

(define-const pow64 Int 18446744073709551616)

(declare-const c Int)

(assert (and (<= MIN_INT_64 c) (<= c MAX_INT_64)))

And continue by asserting that c * c can be safely represented in EVM:

(push)
(assert (not (or (and (<= 0 (* c c)) (<= (* c c) MAX_UINT_256)) (and (<= MIN_INT_256 (* c c)) (<= (* c c) MAX_INT_256)))))
(check-sat)
(pop)

If this is unsat, then we have a proof that the result of c * c can be safely represented in an evm word. We can then continue by:

  1. asserting that c * c is within the range of an EVM word
  2. checking if c * c / 2 ^ 64 is within the range of an int64 (the type of c)
(push)
(assert (or (and (<= 0 (* c c)) (<= (* c c) MAX_UINT_256)) (and (<= MIN_INT_256 (* c c)) (<= (* c c) MAX_INT_256))))
(assert (not (and (<= MIN_INT_256 (div (* c c) pow64)) (<= (div (* c c) pow64) MAX_INT_256))))
(check-sat)
(pop)

If this second query returns unsat, then we have a proof that the expression as a whole cannot exceed the bounds of it's type (an int64).

One final consideration is expressions involving explicit calls to mod. In these cases, as along as the RHS of the mod would allow the result to fit within the relevant type bounds, then bounds checking can be ignored for the top node of the expression on the LHS of the mod. Note that we cannot safely ignore bounds checking for the nested subexpressions on the LHS as we still need to be sure that the result of each subexpression on the LHS can be safely represented in an EVM word. An example:

constructor of Mod
interface constructor(uint256 _a, uint128 _b)

creates

    uint256 a := _a
    uint128 b := _b

behaviour help of Mod
interface help()

storage

    a => (a ^ a) mod (2 ^ 256)
    b => (a ^ a) mod (2 ^ 256)

In this case the assignment to a is safe (can never overflow), while the assignment to b is unsafe and should result in an error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant