B

B is a B compiler for i386 written in C using lex and yacc. This repo includes the compiler and a minimalist runtime for Linux (an entry point and a syscall function).

The B program reads B source code from stdin and writes Intel-syntax i386 GAS assembly to stdout.

The main goal of this project was to code-golf a compiler. The compiler builds no AST and works purely by syntax-directed translation. This required a few hacks (e.g. in switch statements and in function calls), but it allows the compiler to output assembly for each line of input immediately.
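To make the idea of AST-free, syntax-directed translation concrete, here is a minimal sketch in C. It is not taken from this repo: it uses a hand-written recursive-descent parser instead of the yacc grammar, handles only integer literals with + and *, and the names translate, emit, expr, term and factor are hypothetical. The point it illustrates is that each instruction is emitted the moment the matching piece of syntax is recognized.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch, not the project's grammar: translate an expression
   such as "1+2*3" into Intel-syntax i386 text without building any tree. */

static const char *src;   /* cursor into the input expression */
static char out[1024];    /* emitted assembly accumulates here */

/* Append one line of assembly to the output buffer. */
static void emit(const char *line)
{
    assert(strlen(out) + strlen(line) + 2 <= sizeof out);
    strcat(out, line);
    strcat(out, "\n");
}

/* factor := NUMBER ; the push is emitted as soon as the number is read */
static void factor(void)
{
    char line[64];
    long value = 0;

    assert(isdigit((unsigned char)*src));
    while (isdigit((unsigned char)*src))
        value = value * 10 + (*src++ - '0');
    snprintf(line, sizeof line, "push %ld", value);
    emit(line);
}

/* term := factor ('*' factor)* */
static void term(void)
{
    factor();
    while (*src == '*') {
        src++;
        factor();
        emit("pop ebx");
        emit("pop eax");
        emit("imul eax, ebx");
        emit("push eax");
    }
}

/* expr := term ('+' term)* */
static void expr(void)
{
    term();
    while (*src == '+') {
        src++;
        term();
        emit("pop ebx");
        emit("pop eax");
        emit("add eax, ebx");
        emit("push eax");
    }
}

/* Translate one expression and return the generated assembly text. */
const char *translate(const char *input)
{
    src = input;
    out[0] = '\0';
    expr();
    return out;
}
```

Calling translate("1+2") yields the two pushes followed by the pop/add/push sequence, in that order. Nothing about the expression is remembered once its code has been emitted, which is what makes constructs that need to look ahead or back (such as switch statements) awkward in this style.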

Installation

dependencies:

debian:

apt install -y make gcc yacc flex
# And if you want to link with C:
dpkg --add-architecture i386
apt update
apt install -y libc6-dev:i386

fedora:

dnf install -y make gcc byacc flex
# And if you want to link with C:
dnf install glibc-devel.i686

compilation:

git clone https://github.com/jgiron42/B.git
cd B
make

usage:

This repo includes a compile.sh script that compiles source files to executables. For example, to compile the sample file:

./compile.sh sample/print_args.b

The compile.sh script also supports linking with C or assembly files. Options can be passed to the linker via the LD_FLAGS variable or by passing a -l option when calling the script (e.g. ./compile.sh main.b file.s file.c -lc).

Technical details

The compiler mostly follows Thompson's original spec, with a few exceptions needed to make it work on i386.

addressing

Since B was designed around word addressing (it originated on the word-addressed PDP-7) and i386 is byte-addressed, this compiler translates the expression

ptr[2]

as

*(ptr + 4 * 2)

rather than

*(ptr + 2)
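The scaling can be checked with a small C sketch. It assumes a 4-byte word, as on i386, and simulates memory as a byte array so that it runs on any host; the names WORD_SIZE, b_element_address and b_load are hypothetical and do not come from the compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WORD_SIZE 4u  /* assumed bytes per B word on i386 */

/* Byte address of element `index` of a vector starting at byte address `base`. */
uint32_t b_element_address(uint32_t base, uint32_t index)
{
    return base + WORD_SIZE * index;
}

/* Load one word from a simulated byte-addressed memory. */
int32_t b_load(const uint8_t *memory, size_t memory_size, uint32_t address)
{
    int32_t word;

    assert((size_t)address + WORD_SIZE <= memory_size);
    memcpy(&word, memory + address, sizeof word);
    return word;
}
```

With base 100, b_element_address(100, 2) gives 108, which matches the ptr + 4 * 2 form above. The unscaled form would give 102, an address in the middle of element 0.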

calling conventions

Since B calls functions through rvalues, all B function symbols are in fact pointers to functions.

C has an awful thing called function-to-pointer decay, which means that a function can be called either directly or through a pointer. A C function symbol names the code itself rather than a word holding its address, so to call a C function from B you need to apply the & operator as follows, turning the lvalue into an rvalue:

extrn exit;
(&exit)();

To call a B function from C, simply declare it as a pointer to function. For example:

extern void (*b_func)(void);
b_func();
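The difference between the two kinds of symbol can be modelled entirely in C. In the sketch below, b_answer stands in for a B function symbol (a variable holding the address of the code) and c_seven is an ordinary C function; both names, and the helper functions, are hypothetical and not part of this repo.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the machine code of a B function (hypothetical). */
static int b_answer_code(void)
{
    return 42;
}

/* How a B function symbol looks from C: a variable that holds the
   address of the code, not the code itself. */
int (*b_answer)(void) = b_answer_code;

/* An ordinary C function: the symbol is the code itself. */
int c_seven(void)
{
    return 7;
}

/* Thanks to function-to-pointer decay, all three call forms are
   equivalent in C. */
int decay_demo(void)
{
    return c_seven() + (&c_seven)() + (*c_seven)();
}

/* Calling through the B-style symbol loads the stored address first. */
int call_b_style(void)
{
    assert(b_answer != NULL);
    return b_answer();
}
```

C hides the distinction because a function name decays to a pointer at the call site. B has no such rule, so a call through a C symbol without & would treat the first bytes of the function's code as an address.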

vectors

In the same way, B vectors are really pointers to vectors. For example:

vec[4];

Would compile to:

vec:
.long vec + 4
.space 4, 0
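The same layout can be pictured as a C struct: one pointer word that refers to the storage placed right after it. This is only a model under assumptions. The cell count is chosen for illustration rather than taken from the compiler's output, the names b_vector, first and cells are hypothetical, and on a 64-bit host the pointer is 8 bytes wide rather than the 4 bytes of an i386 .long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VEC_LEN 4  /* illustrative cell count, not taken from the compiler */

/* Hypothetical C model of a B vector: a pointer word followed by storage. */
struct b_vector {
    int32_t *first;           /* plays the role of ".long vec + 4" */
    int32_t  cells[VEC_LEN];  /* plays the role of the reserved space */
};

/* Point the leading word at the storage and zero the cells. */
void b_vector_init(struct b_vector *v)
{
    assert(v != NULL);
    v->first = v->cells;
    for (int i = 0; i < VEC_LEN; i++)
        v->cells[i] = 0;
}

/* Element access always goes through the pointer word, as in B. */
int32_t b_vector_get(const struct b_vector *v, int index)
{
    assert(v != NULL && index >= 0 && index < VEC_LEN);
    return v->first[index];
}

void b_vector_set(struct b_vector *v, int index, int32_t value)
{
    assert(v != NULL && index >= 0 && index < VEC_LEN);
    v->first[index] = value;
}
```

Because every access dereferences the leading word first, a vector name behaves like any other pointer-valued variable, which is the same convention used for function symbols.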