Variables, Data and Scope

jndean edited this page Jan 28, 2020 · 6 revisions

Railway variables are dynamically typed. This may not have been a good idea, but the language embraces runtime checks to enforce many of its properties, so dynamic typing does not ruin the narrative. There are only two data types: numbers and arrays.

Numbers are arbitrary-precision rationals, allowing multiplication and division to be first-class invertible operations alongside addition and subtraction. Floating-point arithmetic is a big no-no for reversible computation because its finite precision means a rounded result cannot always be undone exactly, and making Railway integer-only would have restricted its applications. However, rational numbers do incur additional computational cost during all operations since one must find the irreducible form (coprime numerator and denominator) to prevent unnecessary memory growth.
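
As a rough illustration of the point above, here is a small sketch in Python (not Railway) that uses the standard `fractions.Fraction` type as a stand-in for Railway's numbers. The helper name `apply_then_undo` is made up for this example and is not part of Railway.

```python
from fractions import Fraction

def apply_then_undo(x, a, b):
    """Apply x += a and x *= b, then undo both steps in reverse order (b must be non-zero)."""
    x += a
    x *= b
    x /= b   # inverse of the multiplication
    x -= a   # inverse of the addition
    return x

# With exact rationals the round trip always restores the starting value.
start = Fraction(1, 10)
assert apply_then_undo(start, Fraction(1, 5), Fraction(3, 7)) == start

# Fraction stores every value in lowest terms, which is the extra cost mentioned above.
reduced = Fraction(6, 8)
assert (reduced.numerator, reduced.denominator) == (3, 4)

# The same round trip on floats is not guaranteed to be exact, because each step is rounded.
float_result = apply_then_undo(0.1, 0.2, 3.0)
print(float_result == 0.1)
```

The float comparison is printed rather than asserted because whether it survives the round trip depends on the particular values chosen.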

Arrays are lists of numbers or other arrays, and can contain mixed types. They are mutable, support indexed access / modification, and can change size by pushing to / popping from the end. All of these operations run in constant time, if you ignore the possibility that changing the size of an array may trigger a realloc.

There are 3 kinds of expression for creating arrays.

  1. Array Literals

    Creates an array from an itemised list of contents. There is no aliasing in Railway, so any variables named in the array literal will be copied to the new array (so in the example below parts of A are copied into B twice, and A is not accessible via B).

    Grammar:

    array_literal : '[' expression? ( ',' expression)* ']'
    

    Examples:

    let A = [1, 6, -3/5]
    let B = [[1, 2], 0, [A, A[1]]]
    let C = []
    
  2. Array Range

    Creates an array using a starting number (included), stopping number (not included), and optional stepping number. Array ranges can be evaluated lazily when used to index for loops or try loops.

    Grammar:

    array_range : '[' expression 'to' expression ('by' expression)? ']'
    

    Examples:

    $ Creates [0, 1, 2, 3] $
    let X = [0 to 4]
    
    $ Creates [10, 15/2, 5, 5/2] $
    let Y = [10 to X[2] by -5/2]
    
    $ Lazy element evaluation in for loops avoids creating an array of size 100 $
    for (i in [0 to 100])
       println (i)
    rof
    
  3. Array Tensors

    Creates a tensor (nested array of arrays) with dimensions specified by the second expression (which must be a 1D array of numbers). Places a copy of the first expression at every position in the new tensor.

    Grammar:

    array_tensor : '[' expression 'tensor' expression ']'
    

    Examples:

    $ Create a 3x4 tensor of zeros (array of 3 arrays of 4 zeros each) $
    let Z = [0 tensor [3, 4]]
    
    $ Copies Z into a 2x2 tensor, resulting in a 2x2x3x4 tensor of zeros $
    let points = [Z tensor [2, 2]]
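
The three array expressions above can be modelled with a short Python sketch (not Railway). The helpers `array_range` and `array_tensor` are hypothetical names, Python lists and `fractions.Fraction` stand in for Railway arrays and numbers, and rejecting a zero step is an assumption of the sketch rather than documented Railway behaviour.

```python
import copy
from fractions import Fraction

def array_range(start, stop, step=1):
    """Model of [start to stop by step]: start is included, stop is excluded."""
    start, stop, step = Fraction(start), Fraction(stop), Fraction(step)
    if step == 0:
        raise ValueError("step must be non-zero")  # assumption, not documented behaviour
    out = []
    value = start
    while (value < stop) if step > 0 else (value > stop):
        out.append(value)
        value += step
    return out

def array_tensor(fill, dims):
    """Model of [fill tensor dims]: nested lists with an independent copy of fill in every slot."""
    if not dims:
        return copy.deepcopy(fill)
    return [array_tensor(fill, dims[1:]) for _ in range(dims[0])]

# Array literal: named variables are copied in, so B never aliases A.
A = [1, 6, Fraction(-3, 5)]
B = [[1, 2], 0, [copy.deepcopy(A), A[1]]]
A[0] = 100  # plain Python assignment, used only to show that the copy inside B is independent
assert B[2][0][0] == 1

X = array_range(0, 4)                        # [0, 1, 2, 3]
Y = array_range(10, X[2], Fraction(-5, 2))   # [10, 15/2, 5, 5/2]
Z = array_tensor(0, [3, 4])                  # 3 arrays of 4 zeros each
points = array_tensor(Z, [2, 2])             # 2x2 tensor whose elements are copies of Z
```

The sketch builds every range eagerly; the lazy evaluation described for for loops and try loops is not modelled.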
    

There is no boolean data type in Railway. When expressions are tested for True or False, the number 0 and the empty array evaluate to False and all other data evaluates to True.
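
A minimal Python sketch of this truthiness rule, assuming numbers are modelled as `Fraction` values and arrays as lists (`is_true` is a made-up helper name):

```python
from fractions import Fraction

def is_true(value):
    """Model of the rule above: 0 and the empty array are False, all other data is True."""
    if isinstance(value, list):
        return len(value) > 0
    return value != 0

assert not is_true(Fraction(0))
assert not is_true([])
assert is_true(Fraction(-3, 5))
assert is_true([0])    # a non-empty array counts as True even if it only holds zeros
assert is_true([[]])   # likewise for an array holding an empty array
```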

Initialisation and Scopes

There are two scopes in Railway: global scope and function scope.

Global variables are declared and initialised in global scope using the global keyword at the file level, i.e. outside any functions. Global scope is evaluated when the file is parsed, and global variables are accessible from any function in the file.

Grammar:

global_stmt : 'global' name '=' expression

Local variables are declared and initialised in function scope using the let keyword.

Grammar:

let_stmt : 'let' name '=' expression

Function scopes are flat: local variables declared at any level of 'indentation' in the function belong to the single function scope. Hence in the following example x and array are both in the same scope.

global N = 10

func myfunc()()
    let array = [0 to N]
    if (N > 20)
        let x = 4
        ...

Variables in function scope can shadow global variables (be initialised with the same name). It is not possible to access the global variable from that scope until the local variable has been uninitialised.
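
The flat scope and shadowing rules can be pictured with a small Python model (not Railway). Here a dictionary plays the role of the single function scope, and `lookup`, `GLOBALS` and `local_scope` are names invented for the sketch.

```python
GLOBALS = {"N": 10}

def lookup(name, local_scope):
    """Resolve a name: the single flat local scope wins over global scope."""
    if name in local_scope:
        return local_scope[name]
    return GLOBALS[name]

local_scope = {}  # one flat dictionary per function call, regardless of nesting depth
local_scope["array"] = list(range(lookup("N", local_scope)))  # models: let array = [0 to N]
local_scope["x"] = 4  # models: let x = 4 inside an if (when taken), same scope as array

before = lookup("N", local_scope)  # no local N yet, so the global is visible
local_scope["N"] = 3               # models: let N = 3, which shadows the global
during = lookup("N", local_scope)  # the global N cannot be reached from here
del local_scope["N"]               # models: unlet N = 3
after = lookup("N", local_scope)   # the global is visible again
```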

Variables May Not Go Out Of Scope

This point is so important I gave it its own heading. A variable going out of scope would destroy the information contained therein, which is not invertible; if the code was run in reverse and the variable reappeared in scope, the interpreter would not know how to initialise it. Therefore, if a function returns and there are still local variables in scope (which are not borrowed and not returned), a LeakedInformation error is raised.

Since they cannot go out of scope, all local variables must be uninitialised by value using the unlet keyword. This can be expensive, but ensures correctness. When code is reversed, let statements become unlet statements and vice versa.

let x = 6
x += 5
unlet x = 11

Does that mean I need to know the result of my program before it has run? Sometimes, but mostly no. Constructs like do-yield-undo help us work around this apparent restriction, but we'll get to that.
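
To make the let / unlet rules concrete, here is a toy interpreter in Python (not Railway, and not how the real interpreter is written). It runs a list of `(op, name, value)` statements forwards or backwards. `LeakedInformation` is the error named above; `UnletMismatch` is a hypothetical name for an unlet with the wrong value, and borrowed or returned variables are ignored.

```python
class LeakedInformation(Exception):
    """Raised when locals are still in scope at the end of the run."""

class UnletMismatch(Exception):
    """Hypothetical error name for an unlet whose stated value is wrong."""

INVERSE = {"let": "unlet", "unlet": "let", "+=": "-=", "-=": "+="}

def run(program, backwards=False):
    """Run (op, name, value) statements over one flat scope and return the executed statements."""
    if backwards:
        # Reverse the order and swap every statement for its inverse.
        program = [(INVERSE[op], name, value) for op, name, value in reversed(program)]
    scope = {}
    for op, name, value in program:
        if op == "let":
            scope[name] = value
        elif op == "unlet":
            if scope[name] != value:
                raise UnletMismatch(f"{name} is {scope[name]}, not {value}")
            del scope[name]
        elif op == "+=":
            scope[name] += value
        elif op == "-=":
            scope[name] -= value
    if scope:
        raise LeakedInformation(sorted(scope))
    return program

program = [("let", "x", 6), ("+=", "x", 5), ("unlet", "x", 11)]
run(program)                                     # forwards: x goes from 6 to 11, then is unlet
reversed_program = run(program, backwards=True)  # let x = 11, x -= 5, unlet x = 6
```

Dropping the final unlet from `program` makes `run` raise `LeakedInformation`, which mirrors the rule that a variable may not simply go out of scope.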

Modifying Variables

For the same reason that variables cannot go out of scope, we may not set the value of a variable which has already been initialised. The operation x = 4 would not be invertible, since the previous value in x would be forgotten. This has profound consequences for the way Railway programs are written, but that's not what this section is about. Here we discuss the ways an existing variable can be modified in place in an invertible fashion.

Arithmetic Modifications: There are four invertible arithmetic operations that can be done in place, supporting numeric variables only.

Grammar:

modop_stmt : lookup ('+=' | '-=' | '*=' | '/=') expression

Examples:

x += 1
x -= 2
array[x] *= 3
array[i][j] /= 4

These work as expected, with the possible exception of multiplication. In Railway, multiplying by 0 in place raises a ZeroError at run-time because it is not invertible. When the code runs backwards, that multiplication by 0 would become a division by 0, which should make you uncomfortable.
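
A Python sketch of the four in-place operations and their inverses, assuming `Fraction` as the number type. The function `modop` and the table `INVERSE` are illustrative names. `ZeroError` is the error named above for multiplication; raising the same error for division by zero is an assumption of the sketch.

```python
from fractions import Fraction

class ZeroError(Exception):
    """Raised for an in-place multiplication by zero (and, by assumption, division by zero)."""

INVERSE = {"+=": "-=", "-=": "+=", "*=": "/=", "/=": "*="}

def modop(value, op, operand):
    """Return the result of applying `value op operand`, refusing non-invertible cases."""
    value, operand = Fraction(value), Fraction(operand)
    if op in ("*=", "/=") and operand == 0:
        raise ZeroError(f"{op} 0 cannot be inverted")
    if op == "+=":
        return value + operand
    if op == "-=":
        return value - operand
    if op == "*=":
        return value * operand
    if op == "/=":
        return value / operand
    raise ValueError(f"unknown operator {op}")

# Every permitted operation can be undone by applying its inverse with the same operand.
x = Fraction(7, 3)
for op in INVERSE:
    assert modop(modop(x, op, 4), INVERSE[op], 4) == x
```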

Array Modifications: Aside from modifying the numbers in an array in place using the above arithmetic operations, we can also modify array variables with the push, pop and swap operations.

Grammar:

pop_stmt : 'pop' lookup '=>' name
         | 'pop' name '<=' lookup

push_stmt : 'push' name '=>' lookup
          | 'push' lookup '<=' name

swap_stmt : 'swap' lookup '<=>' lookup

Here name means any legal variable name, and lookup is a name optionally followed by some indices. The pop statement removes the last element of the array specified by the lookup, and assigns it to the name in the current scope. The push statement removes the named variable from the current scope and appends it as an item on the end of the array specified by the lookup. This behaviour ensures that push and pop are mutual inverses, and prevents aliasing by making sure that data pushed to an array cannot still be accessed via its old name. Whenever you see an arrow (=>) in Railway, it means an object is changing ownership, and you will no longer be able to access it under its old name.

Examples:

let X = [1,5,8,0]
pop X => value
$ Now 'X' is [1,5,8], 'value' is 0 $
value += 9
push value => X
$ Now 'value' is removed from scope, 'X' is [1,5,8,9] $
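
The ownership transfer in push and pop can be sketched in Python (not Railway) with a dictionary as the current scope. The functions `pop` and `push` here are illustrative; they handle only top-level array names rather than full lookups with indices, and refusing to pop into a name that already exists is an assumption of the sketch.

```python
def pop(scope, array_name, target_name):
    """Model of `pop array => target`: move the last element into a new name."""
    if target_name in scope:
        # Assumption: overwriting an existing variable would destroy information.
        raise NameError(f"{target_name} is already in scope")
    scope[target_name] = scope[array_name].pop()

def push(scope, source_name, array_name):
    """Model of `push source => array`: the name leaves scope and its data joins the array."""
    scope[array_name].append(scope.pop(source_name))

scope = {"X": [1, 5, 8, 0]}
pop(scope, "X", "value")    # X is [1, 5, 8], value is 0
scope["value"] += 9
push(scope, "value", "X")   # value is gone from scope, X is [1, 5, 8, 9]
assert "value" not in scope
```

Because `push` deletes the name as it appends, no second reference to the pushed data survives, which is the anti-aliasing property described above.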

The swap operation swaps the data in the two specified locations directly, be they numbers or arrays.

Examples:

let X = [99, 99]
let Y = [1, 2, 3, 4]
swap X <=> Y[-1]
unlet X = 4
unlet Y = [1, 2, 3, [99, 99]]

Even though this is an invertible action, it is actually not possible to do in complete generality using only other Railway statements. Consider the way it would be done in another language, Python.

tmp = Y[-1]  # Initialisation of tmp
Y[-1] = X    # Assignment to Y[-1]
X = tmp      # Assignment to X

In Railway we can do initialisation but not assignment because, as discussed above, assignment destroys the information stored in the existing variable and hence is not invertible. Thus, the swap statement is needed to facilitate this operation. In fact, the swap statement is really the only way to insert an item into an existing array; one must swap out the existing element so it can be uninitialised properly. As such the swap statement is more important to the language than one might initially assume.
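
As a sketch of why swap is enough, the Python model below (not Railway) treats swap as a single primitive that exchanges two storage locations, then uses it to replace an array element without any assignment to an initialised variable. The helper `swap` and the dictionary `scope` are names made up for the illustration.

```python
def swap(container_a, key_a, container_b, key_b):
    """Exchange the contents of two locations in one step; applying it twice undoes it."""
    container_a[key_a], container_b[key_b] = container_b[key_b], container_a[key_a]

scope = {"X": [99, 99], "Y": [1, 2, 3, 4]}
swap(scope, "X", scope["Y"], -1)   # models: swap X <=> Y[-1]
assert scope == {"X": 4, "Y": [1, 2, 3, [99, 99]]}

# Replacing Y[0] with 7: initialise the new value, swap it in, uninitialise the old one.
scope["new"] = 7                   # models: let new = 7
swap(scope, "new", scope["Y"], 0)  # models: swap new <=> Y[0]
old = scope.pop("new")             # models: unlet new = 1
assert old == 1
```

The old element is never overwritten; it is moved into `new` and then uninitialised by value, so nothing is forgotten.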

Self-modification and Aliasing

Self-modification is when information from a variable is used to modify that same variable. This is not allowed within a Railway statement (though it is possible in other ways), since it is not in general invertible.

x -= x

Clearly the information in x is destroyed here, so this code cannot be reversed. Below are more types of single-statement self-modification, all of which will be caught at parse-time.

x += array[x-1]
array[i] += array[j]
array[array[i]] += 1

The parser will raise a SelfModification error by checking for uses of the modified variable's name. It is for this reason that there can be no aliasing in Railway, i.e. it should not be possible to refer to a variable using two different names within the same scope, bypassing the parser's checks for self-modification. The 'ownership' system is an overly grand name for the set of rules and behaviours that guarantees aliasing will be detected at parse-time.

If you have ever used Rust, you may be wondering why Railway doesn't implement a proper ownership system with mutable/immutable references to more flexibly guarantee that self-modification can never happen. The answer is that a) that would require a much more sophisticated compilation stage than I have written, and b) too much of the language was rooted in being dynamically typed by that point. I'm certainly interested in exploring the possibility more in future projects.
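
A name-based check of this kind might look roughly like the Python sketch below. It is a simplification that scans the statement text with a regular expression, not the actual Railway parser, and `check_modop` is a hypothetical helper. `SelfModification` is the error named above.

```python
import re

class SelfModification(Exception):
    """Raised when the modified variable's name is also used to compute the modification."""

IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
MODOPS = ("+=", "-=", "*=", "/=")

def check_modop(statement):
    """Return the modified variable's name, or raise SelfModification."""
    for op in MODOPS:
        if op in statement:
            lookup, expression = statement.split(op, 1)
            break
    else:
        raise ValueError("not a modop statement")
    names = IDENT.findall(lookup)
    target = names[0]  # the variable being modified
    # Names used inside the lookup's indices or in the right-hand expression.
    used = set(names[1:]) | set(IDENT.findall(expression))
    if target in used:
        raise SelfModification(statement)
    return target

assert check_modop("array[x] *= 3") == "array"   # allowed: x is only read
for bad in ("x -= x", "x += array[x-1]", "array[i] += array[j]", "array[array[i]] += 1"):
    try:
        check_modop(bad)
    except SelfModification:
        continue
    raise AssertionError(f"{bad} should have been rejected")
```

The check only works because each piece of data has exactly one name; if `y` could alias `x`, then `x -= y` would slip through, which is why aliasing has to be ruled out.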
