Skip to content

Language specification

robsimmons edited this page Feb 22, 2012 · 12 revisions

This document gives a specification of the grammar of L10 programs and talks a little bit about their meaning; some example L10 programs can be found on the examples page.

Lexical conventions

Single-line comments take the form // ... \n, and multi-line comments take the form {- ... -} and may be nested.

The syntax of pragmas is probably too heavyweight and will likely be revised, but it is intended to emphasize that things in pragmas aren't part of the logical specification of logic programming and may be specific to one implementation of L10 or another.

Tokens

  • Lower-case identifiers LCID: any sequence of alphanumeric characters that start with a lower-case identifiers a-z.
  • Upper-case identifiers UCID: any sequence of alphanumeric characters, underscores, and single quotes ' that start with a capital letter A-Z.
  • Natural number literals NAT: 0 or any other numeric character 1-9 followed by other numeric characters 0-9.
  • String literals STRING: the quote character ", followed by a series of alphanumeric characters, underscores, single quotes, and spaces and terminated by another quote.

Reserved identifiers

  • Negation not and existential quantification Ex.
  • Built-in classifiers extensible, rel, type, and world.
  • An underscore is treated as a special upper-case identifier that we do not care about.
  • SML reserved words (abstype, andalso, as, case, and so on) are reserved: the fact that the words true and false are reserved is particularly relevant.

Syntax

There are broadly four categories of top-level definition, which are defined more carefully below.

  • Constant declarations LCID: CLASS.
  • Rule declarations PREMS -> ATOMS.
  • Database declarations LCID = ATOMS.
  • Pragmas #foo ... .. (Case is ignored in pragmas, #foo is just as good as #FOO or #FoO.)

Terms and atomic propositions

Terms and atomic propositions are both formed from the same very basic grammar; the only interesting special case is addition, which computes the addition of two known natural numbers.

TERM ::= UCID | LCID | NAT | STRING | TERM TERM | TERM + TERM

Propositions and rules

Premises and conclusions are basically a series of atomic propositions, though some premises may be negated or be the special comparison propositions. Equality and equality are valid on terms of any type, but inequality can only be used on terms of the nat type of natural numbers.

ATOM  ::= LCID | ATOM TERM
ATOMS ::= ATOM | ATOM, ATOMS
PAT   ::= ATOM | Ex UCID. PREM 
PREM  ::= PAT | not PAT         
        | TERM == TERM | TERM != TERM 
        | TERM < TERM | TERM <= TERM  
        | TERM => TERM | TERM > TERM  
PREMS ::= PREM | PREM, PREMS
RULE  ::= ATOMS | PREMS -> ATOMS

Classifiers, declarations, and the signature

TYPE  ::= LCID                   // Constant type 
        | LCID -> LCID           // Constructor type
        | {UCID : LCID} TYPE     // Named constructor type

CLASS ::= LCID -> CLASS          // Argument
        | {UCID : TYPE} CLASS    // Named argument
        | LCID                   // Atomic type, classifies type constructors
        | type                   // Type, classifies atomic types
        | world                  // World, classifies worlds constructors
        | rel @ ATOM             // Forward-chaining proposition, classifies predicates

DECL  ::= LCID : CLASS.          // New constant/world/type/proposition
        | RULE.                  // New rule
        | LCID = ATOMS.          // Specify a database, no world
        | PRAGMA.                // Implementation-specific pragmas

SIG   ::=                        // Empty signature 
        | DECL SIG               // Declaration

Precedence

Application binds more tightly than anything else.

binds least tightly ---> =   :
                         <- (left associative)
                         -> (right associative)
                         , (left associative)
                         @ 
                         ==   !=   >   <   >=   <= 
                         + (left associative)
binds most tightly --->  (application)

The backwards-facing arrow <- is allowed everywhere the forward facing arrow -> is allowed, but it is syntactic sugar with higher precedence than ->. (Therefore, w <- x -> z <- y, while a very bad idea, is the same as writing (x -> w) -> z -> y.)

Pragmas in the Elton compiler

MODES :=         // None
       | MODES + // Input
       | MODES - // Output

IMPL := transparent // Implement without the 'wizard interface'
      | sealed      // Implemented with a 'wizard interface'
      | external    // Implemented by the user elsewhere

PRAGMA := #query LCID: LCID MODES // QUERY query_name foo + + -
        | #type LCID IMPL         // TYPE tree transparent

Well-formed programs

Well-typed terms

There are three kinds of type in L10.

The built-in types nat and string have constants (like 27 and "hello", respectively), and have some additional built-in operations (currently just t1 + t2, which represents natural number addition).

The extensible types are declared: the declaration t: extensible. means that t is an extensible type. Extensible types are open-ended: you don't ever have to declare constants of extensible type, and if you write a constant where the typechecker expects an extensible type, then L10 will decide that that constant is now a member of that extensible type. This association is permanent: if you have two extensible types t and s and L10 decides that the constant a has type t, then you will be unable to use the constant a where a term of type s is expected. Extensible types are Datalog-ish: they consist only of individual constants, not structured terms like f (g a) b.

The regular types are also declared. A declaration tree: type. means that tree is a regular type. In order to use the type tree, you need to explicitly declare constants. These declarations...

tree: type.
leaf: tree.
node: nat -> tree -> tree -> tree.

...say that tree is a type that can take one of two forms: leaf and node N T1 T2, where N is a natural number and T1 and T2 are trees.

Well-formed declarations

(Declarations of new propositions must be well-scoped.)

Well-formed rules

Range and world restriction on rules ...

Logic programming

...