-
Notifications
You must be signed in to change notification settings - Fork 21
Simple proofs
This section gives simple examples of running CBMC on simple programs to prove assertions within those programs. Perhaps most important, this section introduces the concept of proof assumptions, and the importance of understanding the assumptions that a proof is making.
CBMC is a model checker for C. This means that CBMC explores all possible paths through our code on all possible inputs, and checks that all assertions in our code are true in all such paths. If we run CBMC on the program int1.c
#include <assert.h>
int main() {
int x;
assert(x == 0);
return 0;
}
with the command
cbmc int1.c
then CBMC will report an assertion failure with output that includes the lines
[main.assertion.1] line 5 assertion x == 0: FAILURE
VERIFICATION FAILED
The problem is that x
is an uninitialized stack variable, and CBMC considers it can take any value. This is a way for CBMC to consider all possible values that could be left on the stack for x
by previous programs when this program begins. One of those values
is not zero.
We can see what value CBMC discovered by asking CBMC to produce an execution
trace when it finds an assertion failure. We do this by running CBMC with
the flag --trace
. The trace describes one step-by-step
path through the code that leads to the assertion failure. For
example, the output of the command
cbmc --trace int1.c
consists of a single step
State 24 file int1.c function main line 4 thread 0
----------------------------------------------------
x=67108864 (00000100 00000000 00000000 00000000)
that initializes x
to 67108864
(the value of you counter example trace may differ). If we initialize x
to 0
as in the program int2.c
#include <assert.h>
int main() {
int x = 0;
assert(x == 0);
return 0;
}
and run CBMC with the command
cbmc int2.c
then CBMC reports that the verification is successful.
NOTE: The first state in the trace printed above is numbered 24 because CBMC runs through a number of its own internal steps before getting to the our code. CBMC does not print states resulting from its own internal steps, only states resulting from steps of our own code.
CBMC is a bounded model checker for C. CBMC works by unwinding the loops in our code (ie by unrolling/inlining a finite number of iterations of the loop). CBMC needs to know a bound on the number of times it is expected to unwind any particular loop in our code.
Sometimes CBMC can figure this out on its own. Consider the program loop1.c
#include<assert.h>
int main() {
unsigned array[10];
for (int i = 0; i < 10; i++) {
array[i] = 0;
}
for (int i = 0; i < 10; i++) {
assert(array[i] == 0);
}
return 0;
}
and run CBMC with
cbmc loop1.c
and CBMC will print
[main.assertion.1] line 11 assertion array[i] == 0: SUCCESS
VERIFICATION SUCCESSFUL
Sometimes CBMC needs help. Consider the program loop2.c
#include<assert.h>
int main() {
unsigned bound;
unsigned array[bound];
for (int i = 0; i < bound; i++) {
array[i] = 0;
}
for (int i = 0; i < bound; i++) {
assert(array[i] == 0);
}
return 0;
}
and run CBMC with
cbmc loop2.c
and CBMC will unwind the first loop forever. We need to tell CBMC how many times to unwind the loops in the program. Run CBMC with
cbmc --unwind 11 loop2.c
and CBMC will unwind every loop in the program 10 times (not 11 times!) and print
[main.assertion.1] line 11 assertion array[i] == 0: SUCCESS
VERIFICATION SUCCESSFUL
Now, however, we have a problem. CBMC has proved that there is no assertion failure in the program as long as we limit loops to 10 iterations. But how do we know that 10 iterations is enough? What if it is possible for a loop to iterate 11 times, and the error we are looking for occurs on the 11th iteration?
We can ask CBMC to prove that we have unwounded loops enough times: We
can ask CBMC to prove that the loop termination condition is guaranteed
to be true after the specified number of unwindings. We do this by
running CBMC with the flag --unwinding-assertions
. For example, running
CBMC on the first program loop1.c with
cbmc --unwind 11 --unwinding-assertions loop1.c
succeeds with
VERIFICATION SUCCESSFUL
but running CBMC on the second program loop2.c with
cbmc --unwind 11 --unwinding-assertions loop2.c
fails with output that includes
[main.unwind.0] line 7 unwinding assertion loop 0: FAILURE
VERIFICATION FAILED
In hindsight, the problem is obvious: If the initialized variable bound
happens to begin with the value 11
, then unwinding loops only 10 times will
not be enough.
If we are only going to unwind loops 10 times, then we have to restrict
ourselves to arrays of at most 10 elements.
We need to tell CBMC to assume that bound
is at most 10.
We make this assumption with the instrinsic function __CPROVER_assume
.
Consider the program loop2a.c
#include<assert.h>
int main() {
unsigned bound;
unsigned array[bound];
__CPROVER_assume(bound < 11);
for (int i = 0; i < bound; i++) {
array[i] = 0;
}
for (int i = 0; i < bound; i++) {
assert(array[i] == 0);
}
return 0;
}
and run CBMC with the command
cbmc --unwind 11 --unwinding-assertions loop2a.c
and CBMC reports success.
We now come, however, to the harsh reality of mathematical proof:
- Every proof makes assumptions, and the quality of the proof depends on the quality of the assumptions.
If the proof assumptions are true in practice, then the proof is valid in practice. If the assumptions are only sometimes true, however, then the proof is valid only in those cases.
When we write a proof for a function, we generally make three classes of assumptions:
- we may bound the size of input to the function with explicit assumptions
like
bound < 11
, - we may model the environment of the function with a proof harness that we will describe in the next section, and
- we may stub out some of the functions invoked by the function under test.
When reading or writing a proof, it is essential to pay attention to the assumptions being made, and to make these assumptions as explicit and easy-to-find as possible.
So far, we have run CBMC on a program by invoking CBMC directly on a source file. If the program is built from several source files, just list them all on the command line. We can also compile the program into the intermediate representation that CBMC uses for model checking, and run CBMC on that instead.
CBMC has a compiler called goto-cc
that compiles a source
file into the intermediate representation used by CBMC.
This intermediate representation is called a goto program, and is
basically blocks of straight-line code linked together by
goto statements.
The goto-cc
compiler can also take a collection of goto programs and
link them together into a single goto program.
Just like any other compiler.
In fact, goto-cc
is intended to be drop-in replacement for gcc
.
You may be able to build your entire project for CBMC with
just make CC=goto-cc
, and this is something to try just to see
just to see how far goto-cc
can get into your project.
In practice, the resulting goto program is just too enormous
to use in the proof of a single function, but we will get into that later.
We use goto-cc
just like we use gcc
.
With our first program int1.c, we ran CBMC on the
source file with
cbmc int1.goto
We can get the same results by compiling the source file into a goto program and running CBMC on that:
goto-cc -o int1.goto int1.c
cbmc int1.goto