Skip to content

A very simple computer written in Verilog, with a Python test bench

License

Notifications You must be signed in to change notification settings

EnigmaCurry/SAP

Repository files navigation

Simple As Possible Computer (SAP)

This project implements a loose interpretation of Malvino and Brown's SAP-1 computer architecture (as first described in Digital Computer Electronics (1977)). The end goal is to replicate Ben Eater's TTL Computer. Ben made a fascinating series of Youtube videos where he builds and extends the SAP-1 architecture, using nothing but simple 74xx transistors, on breadboard. For my CPU, instead of hand wiring logic chips, the entire design is defined in software, as Verilog code. This code can be simulated on a workstation, and proven to operate the way it is intended. Eventually, this software could be turned into real hardware on an FPGA board.

This code runs with the Icarus Verilog open source simulation tool on a regular Linux PC. Each component of the system is written and tested in isolation. Each unit is its own separate directory, with its own Makefile to run a Python test suite, written with cocotb. Adhering to this structure means that every unit is fully modular, and easier to comprehend and test, by drawing a box around it.

Synthesis is not yet supported. I don't own an FPGA development board yet, but I plan to check out IceStorm, a fully open source toolchain, once I do. (I hope I haven't written any non-synthesizable verilog yet! FPGAs only work with a subset of the Verilog language.) :/

This same project has been implemented several times, in different ways, by many people:

Writing your own version of SAP-1, from scratch, is a worthwhile project. It can be a very elucidating process, whether it materializes as physical logic chips, on a breadboard, as Hardware Description Language, running in a virtual machine... or in any abstraction, not yet realized. (Could reality itself be composed of sucessive lambda calculus expressions? [a(b)=1; a(1)=a(a(b))=2; a(6)=a(a(a(a(a(a(b))))))=7; find b;]) Actually going down the path, and doing the work yourself, shows you what you did not know before: there is an understanding of how a thing works, and there is a knowledge of how a thing works. These are not the same thing.

The SAP-1

SAP-1 is a simple 8-bit computer architecture with only 128 bits of memory (16 bytes!), 2 program registers, and 5 instructions. A really simple computer. Here's a comparison with a few other 8 bit computers:

CPU Year Address Size Max Ram # registers # instructions # transistors
SAP-1 (Malvino) 1977 4 bits 16 B 2 5 <500
Intel 8008 (the 'first' 8bit CPU) 1972 14 bits 16 KB 7 48 3500
Apple II (MOS 6502, Commodore 64, NES) 1975 16 bits 64 KB 3 56 4237
TRS-80 (Z80) 1977 16 bits 64 KB 17 158 8500

Architecture

Here is a nice ASCII diagram from jk-quantized:

                                ||            --------
                                ||   loNib   |        | <- CO
                                || <-------> |   PC   | <- J
                                ||           |        | <- CE
                                ||            --------
                                ||
                                ||
                                ||
            --------            ||            --------
     MI -> |        |   loNib   ||           |        | <- AI
           |  MAR   | <-------- || <-------> |   A    | <- AO
           |        |           ||           |        |
            --------            ||            --------
               |                ||               |
               | loNib          ||               |
               v                ||               v
            --------            ||            --------
     RI -> |        |           ||           |        | <- ΣO
     R0 -> |  RAM   | <-------> || <-------> |  ALU   | <- SU
           |        |           ||           |        | -> FC
            --------            ||            --------  -> FZ
                                ||               ^
                                ||               |
                                ||               |
            --------            ||            --------
     II -> |        | <-------- ||           |        | <- BI
     IO -> |   IR   |           || --------> |   B    | // <- BO
           |        | --------> ||           |        |
            --------    loNib   ||            --------
               |                ||
               | hiNib          ||
               v                ||
          INSTR DECODER         ||            --------
          AND  CONTROL          ||           |        | <- OI
           SEQUENCER            || --------> |  OUT   |
                                ||           |        |
            --------            ||            --------
     FI -> |        |           ||                |
     FZ -> |   FR   |           ||                |
     FC -> |        |           ||                v
            --------            ||             DISPLAY
                |               ||
                |               ||
                v               ||
              FLAGS             ||
                                ||

The two straight lines, going down the middle, represent a shared bus between components, which is 8 bits wide. The bus can transmit arbitrary binary data, as well as program instructions. In many other CPU designs, there are two busses: one for instructions/addresses, and a separate one for data. The two bus architecture is often called Harvard design. The SAP-1 design is supposed to be simpler, so both data, and instructions, share a single bus, like a time-share.

When the bus is used for binary data, all 8 bits of the bus are used. It's just data.

When the bus is used for instructions, the 8 bits are split in half: using the first/left 4 bits (called hiNib) as the Op Code (the 'what' to do), and the second/right 4 bits (called loNib) as the Operand, which usually specifies a memory address (the 'thing' to do 'what' upon).

Component Name Size / Type Purpose
PC Program Counter 4 bit counting register (loNib) Counts from 0000->1111 (or until HLT), the pointer to the current instruction in memory.
MAR Memory Address Register 4 bit register (loNib) Stores the memory address to actively index in the RAM.
RAM Memory 16 8bit memory cells (full bus width) 16 bytes of addressable memory: 16 entries, 8 bits each, I/O lines for all 8 bits to/from the bus.
IR Instruction Register 4 bit opcode register (hiNib) / 4 bit address register (loNib) Stores the current instruction of execution, separates hiNib and loNib, from the bus.
A Accumulator (Register A) 8 bit register (full bus width) Stores the result of computations.
B Register B 8 bit register (full bus width) Stores the 'other number' for math operations, the number to add/subtract with register A.
ALU Arithemetic Logic Unit 8 bit adder Takes values from register A and B, and performs either add/subtract function, and then puts the result back in A.
OUT Output 8 bit register (full bus width) Stores a snapshot of register A, then shows it to the user.
FR Flags Register 2 bit register flags Flags are stored when an ADD or SUB instruction results in either: 0 (zero flag) or a number that cannot be represented in 8 bits (carry flag)

The other symbols on the diagram with arrows pointing in or out of the units, are the control signals coming from the main Control Sequencer, discussed later.

Instruction set

The original SAP-1 has only 5 instructions:

Instruction Op code (hiNib) Operand (loNib) Description
LDA 0000 4 bit memory address to load Load memory at the given address into the accumulator register
ADD 0001 4 bit memory address to add Add memory at the given address to the accumulator register
SUB 0010 4 bit memory address to subtract Subtract memory at address from accumulator register
OUT 1110 None Send accumulator contents to the output register
HLT 1111 None Halt program

The computer is deterministically controlled by the contents of its RAM. If the RAM contents is changed, then the process of the computer is changed. Conversely, if the same RAM contents are used multiple times, the same process is performed every time. Therefore, if you want the computer to do something, you program the RAM, and the computer will do what is in the RAM.

The user of SAP-1 is expected to program the RAM before the computer starts execution. The machine has two modes: program mode, and execution mode. In program mode, the RAM is disconnected from the bus, and the user can directly edit the contents of the RAM through input switches. Both program and data must be input. In execution mode, these switches are disconnected, and the RAM is connected to the main bus instead. The RAM provides the program and data for the rest of the computer to follow and use.

Example program execution

Here is an example snapshot of the contents of the RAM. The program containing the instructions always starts at memory address 0000, and continues until an HLT instruction is found. In this example, the program is in memory locations 0000 through 0101. Addresses after HLT is where data storage starts, locations 0110 through 1111, and can contain data to be used during the execution of the program, using any of the LDA, ADD, or SUB instructions.

Memory Address Memory Contents (hiNib loNib) Decoded Instruction Description
0 - 0000 0000 1001 LDA 9 Read the contents of memory address 9 (1001) and place the value in register A (16).
1 - 0001 0001 1110 ADD E Read the contents of memory address E (1110) and add it to whats already in register A (16 + 127 = 143).
2 - 0010 0010 1101 SUB D Read the contents of memory address D (1101) and subtract it from whats already in register A (143 - 64 = 79).
3 - 0011 1110 xxxx OUT Read the contents of register A, and write it to the output buffer (LEDs, stdout) (Answer shown: 79)
4 - 0100 0001 1010 ADD A Read the contents of memory address A (1010) and add it to whats already in register A (79 + (-86) = -7).
5 - 0101 1111 xxxx HLT End of Program. Everything after this address is data storage. (Answer still shows 79, because of no 2nd OUT)
6 - 0110 xxxx xxxx data unused
7 - 0111 xxxx xxxx data unused
8 - 1000 xxxx xxxx data unused
9 - 1001 0001 0000 data The number 16 (programmed in before execution)
A - 1010 1010 1010 data The number -86 (programmed in before execution)
B - 1011 xxxx xxxx data unused
C - 1100 xxxx xxxx data unused
D - 1101 0100 0000 data The number 64 (programmed in before execution)
E - 1110 0111 1111 data The number 127 (programmed in before execution)
F - 1111 xxxx xxxx data unused

Once the user has programmed all of this data into RAM, the computer is switched to execution mode. The Program Counter (PC) counts from 0000 to 1111 (or until HLT), which advances through all the memory addresses, one by one. Starting at memory address 0000, the first instruction is read, and then executed (LDA 9). The Program Counter advances to 0001, and the second memory address is read (ADD E), and then executed, and so on. Once memory address 5 (0101) is read, an HLT instruction is parsed, and execution stops.

  • The computer program calculates : 16 + 127 - 64
  • Then outputs: 79
  • Then calculates 79 + (-86), but does not display the answer.
  • Then stops.

Control Sequencer

Since all the components share a single bus, there has to be something that ensures that only one thing writes to the bus at a time. This controller should also coordinate when a unit should read from the bus, and when to ignore it. The letters with arrows, represented in the diagram, show lines coming from a central controller out to each of the individual units of the system. The controller fetches an instruction from RAM, decodes the instruction into the Instruction Register, and based on this input, the Controller knows what components to talk to, to carry out the instruction. It has dedicated wires (not the main bus), running to all the various parts of the system, and ensures things happen in the right sequence.

These are the control signals output from the main controller:

Control Signal Name Description
HLT Halt Stops execution of the running program
MI MAR In The Memory Address Register is told to read from the bus
RI RAM In The Memory (RAM) is told to read from the bus
RO RAM Out The Memory (RAM) is told to write to the bus
IO Instruction Out The Instruction Register is told to write the address (loNib) to the bus. The opcode (hiNib) is always connected to the Controller
II Instruction In The Instruction Register is told to read from the bus, decode, and store the Instruction
AI A Register In Register A is told to read from the bus and store the data
AO A Register Out Register A is told to write to the bus
ΣO Sum Out The ALU is told to write the result of its computation to the bus
SU Subtract The ALU is told to do a subtraction operation, rather than addition
BI Register B In Register B is told to read from the bus and store the data
OI Output Register In The Output Register is told to read from the bus, store and display the data
CI Counter Increment The Program Counter is told to increment
CO Counter Out The Program Counter is told to write to the bus
J Jump The Program Counter is told to read from the bus, and update its count to the loNib value
FI Flags In The Flags register is told to store the current ALU flags (connected outside the bus)

The 'controller' is shorthand for Control Sequencer, as it controls the process of a single instruction over the duration of several sequential clock cycles. For the sake of simplicity, the orginal SAP-1 controller defined every instruction as taking 6 clock cycles (or 6 t-states). (Not every instruction needed all of those 6 cycles, but it is simpler to implement if you just wait out the unused cycles doing nothing.) Each t-state of the instruction is carried out sequentially, with the controller outputing different control signals depending on the current instruction and step. The controller is designed to handle each type of instruction, and carries out the step-sequence of the control signals necessary for the given instruction.

For example, let's examine the execution of a single instruction: LDA. From the perspective of the controller, this requires 6 steps (only 4 used for LDA). Each part of a step happens in parallel:

  1. Do the following Memory Fetch operations, in parallel:

    • Tell the Program Counter (PC) to output the current program pointer to the main bus. This is the next 'line number' of our program to execute, (actually it's the address pointing to it, in memory). Any component could potentially now read this value from the bus, but will not do so unless the Control Sequencer tells it to.

    • The Memory Address Register (MAR) is told to read this value from the main bus. For this moment, the MAR and the PC now contain the same value.

  2. Do the following Memory Fetch operations, in parallel:

    • Tell the Memory (RAM) to lookup the memory stored at the address that the MAR holds, and output the data to the main bus.

    • Tell the Instruction Register (IR) to read from the main bus, to decode, and store the current instruction.

    • The Instruction Register (IR) is always connected to the Control Sequencer, outputting the decoded Opcode from the stored instruction (the hiNib, the 'what to do'; LDA in this example). This Opcode informs the Control Sequencer in its decision for what to do in subsequent steps.

    • Tell the Program Counter (PC) to increment itself, in preparation for the next instruction (which won't occur until all the steps of the current instruction finish).

  3. Do the following LDA specific operations, in parallel:

    • Tell the Instruction Register (IR) to output the instruction to the bus. The memory address operand of the LDA instruction is now on the lower 8 bits (loNib) of the main bus.

    • Tell the Memory Address Register (MAR) to load from the main bus. This is the address to load from memory in the next step.

  4. Do the following LDA specific operations, in parallel:

    • Tell the Memory (RAM) to lookup the data from the MAR address, and output it to the main bus. Now the bus contains the contents of memory specified by the LDA instruction.

    • Tell Register A to read from the main bus. Now Register A contains the data the LDA instruction told it to load. This completes the LDA instruction.

  5. No operation. Just wait.

    • Malvino's SAP-1 always used 6 T-states for instructions, so instructions that don't need these extra T-states just waited out the unused cycles, doing nothing.
  6. No operation. Just wait.

    • (In Verilog, we trim off these unused T-states, because we can.)

The above explanation may seem overly complicated, and expressed in words, it is. Each step can be simplified if written with combinatorial logic, using the Control Signal names as labels. Here are all five of the orignal SAP-1 instructions, each expressed as a control sequence comprised of 6 t-states (not all used):

Instruction T-1 T-2 T-3 T-4 T-5 T-6
LDA MI, CO RO, CI MI, IO RO, AI
ADD MI, CO RO, CI MI, IO RO, BI AI, ΣO, FI
SUB MI, CO RO, CI MI, IO RI, BI AI, ΣO, SU, FI
OUT MI, CO RO, CI AO, OI
HLT MI, CO RO, CI HLT

Notice that for all of the instructions, the first two t-states are identical. T-1 and T-2, combined, are called the Fetch cycle, as it is the same steps that any instruction will need to perform in order to load the next instruction from RAM. Steps T-3 through T-6 are specific to the Instruction type, and are therefore different for each instruction.

These control sequences are implemented in the hardware design of the Control Sequencer. This is easy to do in Verilog, because if you want to modify the architecture, perhaps to add a new type of Instruction, you just modify the code. In Ben's breadboard design, he implemented these control sequences as an eeprom. This makes his design flexible, should he want to add additional instructions (see the next section). To do so, he does not need to change any of the existing wires, just reprogram the eeprom. In our Verilog design, the whole implementation is programmed, so this is an unnecessary complication. We don't need microcode, we can just hardcode the control sequences directly into the Control Sequencer implementation, and recompile when changes are made. (This decision is more conducive to FPGA implementation than ASIC design. An ASIC that supports microcode is software upgradeable. A hardcoded ASIC implementation is not.)

Is SAP-1 really a computer though?

With only five instructions, the only things SAP-1 can do are: Read memory; Add and Subtract 8 bit numbers; Show the result; and Stop. This is a calculator, not a full computer. To make this system useful, more instructions need to be created. (Up to 16 instructions can be created, using 4bit op codes.) As Ben shows, in his video Making a computer Turing complete, its only a handful of additonal instructions that are necessary to make this a universal Turing Machine (still only with 16B of memory). Ben uses a modified instruction set (note the opcode changes):

Instruction Op code (hiNib) Operand (loNib) Description
NOP 0000 None No operation
LDA 0001 4 bit memory address to read Read the contents of the given memory address, store it in the A register.
ADD 0010 4 bit memory address to add Add the contents of the given memory address to the A register.
SUB 0011 4 bit memory address to subtract Subtract the contents of the given memory address from the A register.
STA 0100 4 bit memory address to write Write the contents of the A register into the memory at the given address.
LDI 0101 4 bit number to give to A directly Load Data Immediately into register A, given the operand number (0-15). No RAM read.
JMP 0110 4 bit number to give to P. Counter Jump. Set the Program Counter to the given operand, which will be the next instruction pointer.
JC 0111 4 bit number to give to P. Counter Jump conditionally, on carry flag.
JZ 1000 4 bit number to give to P. Counter Jump conditionally, on zero flag.
OUT 1110 None Send accumulator contents to the output register.
HLT 1111 None Halt program.

The addition of the ability to store data back to RAM, and the three new jump instructions, provide the capabilities of a Turing Machine; a computer that is capable of calculating any calculable problem (within memory constraints).

Setup for development and simulation

# Install dependencies (I use Arch Linux):
pacman -S iverilog

# Clone repository
git clone https://github.com/EnigmaCurry/SAP.git
cd SAP

# Checkout cocotb submodule
git submodule init
git submodule update

# The main integration test, which runs a simple program
# to test how all the parts work together.
make

# Each component has its own Makefile in its own directory.
# Running 'make' will simulate the component by itself, driven 
# by a Python test bench found in the same directory:
cd program_counter
make

See here for example output of the main integration test

About

A very simple computer written in Verilog, with a Python test bench

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published