Skip to content

Proposal: Implement lexical block level scoping in Viper

Grigore Rosu edited this page Dec 3, 2017 · 14 revisions

Preamble

VIP: <to be assigned>
Title: Implement lexical block-level scoping in Viper
Author: @yzhang90 @daejunpark
Type: Standard Track
Status: Draft
Created: 2017-12-1

Simple Summary

Viper should correctly implement the block-level scoping semantics in order to avoid misleading and error-prone code.

Abstract

The semantics of variable scoping is very important for a programming language because the scoping determines how a (variable) name binds to the actual object. The current Viper compiler mixes different scoping approaches and makes it very hard to reason about programs. We propose to uniformly use lexical block-level scoping in Viper.

Motivation

The principles behind Viper are security, simplicity and audibility. However, the current scoping in Viper is not clear and sometimes makes the code misleading and error-prone.

In Viper, a local variable is accessible throughout the whole function after it is declared. For example, in the following program, b is equal to 1 if choice is true and 0 otherwise. Even though a is declared and assigned inside if, it is still accessible after the if statement finishes:

def foo(choice: bool):
    if (choice):
        a = 1
    b = a

However, index variables are treated differently from other local variables. The scope of index variables is the for loop and the index variables are not accessible outside the loop. For example, Viper rejects the following program:

def foo():
    for i in range(10):
        pass
    i += 1

What's more, an annotated assignment a:num, which normally also initializes a to 0, is only executed once in the loop and thud b equals 10 in the end:

def foo():
    for i in range(10):
        a:num
        a += 1
    b = a

With the current scoping rules, a program which declares a variable inside the if...then...else branch or the for loop is almost equivalent to declaring the variable before the if statement or the for loop. For instance,

def foo(choice: bool) -> num:
     if choice:
         a = 1
         return a
     else:
         b = 2
         return a + b

is equivalent to

def foo(choice: bool) -> num:
     a: num
     b: num
     if choice:
         a = 1
         return a
     else:
         b = 2
         return a + b

The difference is that for the second program, one can negate the choice variable and swap the then and else branch, but you could not do the same thing for the first program (because a is not defined before expression a + b!).

Situations like the above make it unnecessarily difficult to reason about programs and program equivalence. In general, it is much more convenient to formally analyze/verify programs with block-scoped variables, because that allows for compositional reasoning without modifying the original program (e.g., without moving the variable declarations to the beginning of the method). Compositional verification is very important in practice, because it allows us to decompose large proof obligations into smaller proof tasks and even do those in parallel.

Specification

Throughout the specification, before/after means lexically appears before/after.

Since global variables have their own namespace self.<variable name>, we only specify the lexical block-level scoping rules for local variables inside a function declaration.

Variable declaration and initialization

A variable should be declared and initialized only through <varname>: <type> = <initial_value>. This enforces programmers to declare the type and initialize the variable. By contrast, current Viper allows a:num (which in our opinion should be forbidden) to declare a variable and (implicitly) initialize it to zero and allows a = 1 if a is not declared before (The type of a is inferred by the compiler). The only exceptions are function parameters. They do not need to be initialized.

Block scope

function declaration, if and for can create block scope.

  • function declaration def foo(a:num): stmts introduces a block scope. The block scope is started with all the formal parameters(i.e, a) declared and followed by stmts. The block scope ends when the function declaration finishes.

  • if statement: There are two cases. if(exp) then stmts introduces 1 block scope and all the statements in the then branch is executed inside this block scope. if(exp) then stmts1 else stmts2 introduce 2 block scopes. stmts1 is executed in one block scope and stmts2 is executed in the other. The two block scopes are in parallel.

  • for loop: for i in exp: stmt introduces 2 nested block with the outer block declaring the index variable i and the inner block holding the loop body stmt. Introducing 2 nested blocks is necessary because i should be visible inside the loop body and i should be alive throughout the loop. However, any variable declared inside the loop body is only alive in the current round.

Rules

  • Variables must be lexically declared (and initialized) before being used. For example, the following program should be rejected.
def foo():
    a += 1

The following program should be rejected as well.

def foo():
    a = 1

because a is not declared. The assignment should only look up the variable instead of declaring the variable if it is not declared before.

  • Variables must only be declared once inside a block. For example, the following program should be rejected:
def foo():
    a:num = 1
    a:num = 2

But the following program is valid:

def foo():
    a:num = 1
    a = 2
  • Variables declared in one block must be freed when the block ends. For example, the following program is valid:
def foo(choice: bool):
    if (choice):
        a:num = 0
        a += 1
    else:
        a:num = 1
        a += 2

a should be 1 or 3 depending on the value of choice.

  • Blocks can be nested (i.e., you can declare if inside for loop). Variables declared in the outer-block are visible to inner-block, but not the other way around.

For example,

def foo(choice: bool):
    a:num = 10
    if (choice):
        a += 1

a is 11 if choice is true.

However, the following program is not allowed because a is declared in the if block and it is not visible to the outer-block.

def foo(choice: bool):
    if (choice):
        a:num = 1
    a += 1
  • Shadowing: Block cannot be used for variable shadowing because it is error-prone. For example, the following program should be rejected because a inside the if statement tends to shadow the a outside.
def foo(choice:bool):
    a:num =0
    if (choice):
        a:bool = true

Question: Should the following program be allowed?

def foo(choice:bool):
    if (choice):
        a:bool = true
    a:num = 0

In java, similar program is allowed. But you could not do variable hoisting to lift the declaration to the top of the block.

Backwards Compatibility

If a program declares all the variables in the beginning, the behavior of the program should be the same.

Copyright

Copyright and related rights waived via CC0


Clone this wiki locally