This is a compiler for the Xlang programming language, developed as the final project of my undergraduate compiler course.
The Xlang compiler follows a typical compiler design, with the following phases:
- Lexical analysis - Using flex, this phase breaks the input code into tokens
- Syntax analysis - Using bison, this phase parses the tokens and generates a syntax tree
- Semantic analysis - This phase performs semantic checking and builds a symbol table
- Intermediate code generation - The syntax tree is traversed to generate intermediate code
- Code optimization - The intermediate code is optimized to improve performance
- Code generation - The optimized intermediate code is translated to target assembly code
The compiler targets the MIPS R2000/R3000 architecture and can generate executable code on MIPS processors.
Xlang is an educational programming language with C/Pascal-like syntax. Key features include:
- Xlang syntax is similar to Pascal/C
- Code is organized into classes and methods
- Supports nested block structures using { }
- ; required to terminate statements
- Built-in types are integer and boolean
- Arrays are supported for these types
- No support for strings, floating point, etc.
- Variables must be declared before use
- Local variables within methods
- Global variables at class scope
- Call-by-value parameter passing
- Arithmetic operators: +, -, *, /, %
- Logical operators: &&, ||, !
- Comparison operators: ==, !=, <, >, <=, >=
- if-else conditional statements
- for loops with 3 parts: init, check, increment
- break and continue keywords
- Functions defined with return type
- void return for procedures
- Recursion is permitted
- Parameters passed by value
- Single inheritance model
- Program entry point is main() method
- Fields and methods defined in class
- // indicates line comment
- /* */ delimits block comment
- No object-oriented features besides classes
- No strings, floats, chars, enums, structs
- Basic type checking but no full type safety
- Limited compile-time error checking
- Small subset of features compared to C or C++
- Input file is broken into a stream of tokens
- Each token has a type (keyword, identifier, operator etc.) and attribute value
- Regular expressions match token patterns
- Handling of comments, whitespace, preprocessor directives
- Context-free grammar defines allowable syntax
- Recursive descent parsing constructs syntax tree
- Syntax tree nodes reflect grammar productions
- Tree represents program structure based on language grammar
- Check types of expressions and assignments
- Resolve variable references using symbol table
- Check function arguments and return types
- Detect undeclared and redefined identifiers
- Insert implicit type conversions where needed
- 3-address code with temporary variables
- Static single assignment form for optimization
- Quadruples represent operations on variables
- Symbols refer to variables and constants
- Common subexpression elimination
- Copy propagation
- Dead code elimination
- Redundant load/store elimination
- Basic block optimizations
- Tree traversal generates MIPS instructions
- Register allocation using graph coloring
- Peephole optimizations combine instructions
- Add directives for data/text segments
- Insert stack management instructions
The information provided above is an English translation of the original project description, which was written in Persian. The original Persian project description PDF can be found in this repository, along with my Yacc and Lex codes.