Introduction

The Damn Simple Architecture

Instruction Set

Overview

Below is an overview of the instruction set and the various operands. This table is non-exhaustive and may be updated as the design changes. Please note that the table spans multiple pages.

Also note that immediate (constant/literal) arguments are 16 bits long in I-type (immediate argument) instructions. For more information on this, refer to instruction encoding.

Type | Description
R | Used when an instruction takes one or more register arguments, but no immediates. This type is also used by shift and rotation operations, as it contains a 5 bit shift amount field.
I | Used when an instruction takes at most two register arguments as well as a halfword immediate argument. This is typically used by immediate arithmetic operations e.g. addi, as well as loads and stores (where a base register and immediate offset are passed). Also used by branching instructions. The operand is a signed offset from the current value of PCX.
J | Used by jumps excluding jr, which uses a register as its argument. Jumps are absolute addresses, but there is a 256MB region around PCX since the argument is 26 bits. Since arguments are always word aligned, we bitshift left twice and set the upper 4 bits to match that of the value in PCX. This then forms a valid word-sized address.

Note: J-type instructions are currently unused.

R-type Instruction Encoding

Bits 31-26 | Bits 25-21 | Bits 20-16 | Bits 15-11 | Bits 10-6 | Bits 5-0
Opcode | Source Reg 1 | Source Reg 2 | Destination Reg | Shift Amount | Unused

The shift amount must be 0 when the opcode does not match a shift instruction or else the CPU will assert an Illegal Instruction exception.

If any register field is not used, it should be set to the special value NOREG, defined in the Registers section of this document. Failure to do so may result in an Illegal Instruction exception as this is undefined for an instruction that does not expect this argument to be provided.
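
The R-type field layout above can be packed with a few shifts. The following is a minimal Python sketch, not the assembler's actual implementation: the helper name encode_r is hypothetical, and the register numbers in the usage line are placeholders, since the numeric register encoding (including NOREG) is defined in the Registers section and not reproduced here.

```python
def encode_r(opcode: int, src1: int, src2: int, dest: int, shamt: int = 0) -> int:
    """Pack an R-type instruction word from its fields (hypothetical helper)."""
    # Field widths follow the R-type table: 6-bit opcode, three 5-bit
    # register fields, a 5-bit shift amount, then 6 unused bits.
    if not 0 <= opcode < (1 << 6):
        raise ValueError("opcode must fit in 6 bits")
    for field in (src1, src2, dest, shamt):
        if not 0 <= field < (1 << 5):
            raise ValueError("register and shift fields must fit in 5 bits")
    return (opcode << 26) | (src1 << 21) | (src2 << 16) | (dest << 11) | (shamt << 6)


# ADD is opcode 0x19; register numbers 1, 2 and 3 are placeholders.
word = encode_r(0x19, 1, 2, 3)
print(f"{word:#010x}")
```

The shift amount defaults to 0 here because the text requires it to be 0 for every non-shift opcode.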

I-type Instruction Encoding

Bits 31-26 | Bits 25-21 | Bits 20-16 | Bits 15-0
Opcode | Source Reg | Dest Reg | 16-bit immediate

I-type instructions are used when 16-bit immediate arguments are desired. This could be for immediate arithmetic instructions (like adding 10 to the value in ACC), or loads and stores, where we may want to access the ith index of an array using an offset.
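
As a rough illustration of the I-type layout, the sketch below packs an opcode, two register fields, and a 16-bit immediate into one word. The helper name encode_i and the register numbers are hypothetical; accepting negative immediates and storing them in two's complement is an assumption based on the signed offsets described in the overview.

```python
def encode_i(opcode: int, src: int, dest: int, imm: int) -> int:
    """Pack an I-type instruction word from its fields (hypothetical helper)."""
    if not 0 <= opcode < (1 << 6):
        raise ValueError("opcode must fit in 6 bits")
    if not (0 <= src < (1 << 5) and 0 <= dest < (1 << 5)):
        raise ValueError("register fields must fit in 5 bits")
    # Assumption: accept any value representable in 16 bits, signed or unsigned.
    if not -(1 << 15) <= imm < (1 << 16):
        raise ValueError("immediate must fit in 16 bits")
    # Masking with 0xFFFF stores negative values in two's complement form.
    return (opcode << 26) | (src << 21) | (dest << 16) | (imm & 0xFFFF)


# IADD is opcode 0x25; register numbers 1 and 2 are placeholders.
print(f"{encode_i(0x25, 1, 2, 10):#010x}")
print(f"{encode_i(0x25, 1, 2, -1):#010x}")
```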

J-type Instruction Encoding

Bits 31-26 | Bits 25-0
Opcode | Address

J-type instructions are used for absolute jumps.

The 26-bit address is converted to a 32-bit address as follows:

  • The 26-bit address field is shifted left by 2 bits (due to word alignment we ignore the 2 least significant bits).
  • The result is combined with the upper 4 bits of the PC to form a 32-bit address (bitwise OR).

The jump range is the 256MB-aligned region that contains the current PC, because the upper 4 bits of the target are taken from the PC. For longer jumps than this, see jr (Jump to word address in register).

To compute this address, the linker should find the address of the label, cut off the top 4 bits, then rightward shift twice. The CPU will then convert this to the actual 32-bit address following the steps outlined above.
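
The two halves of this conversion can be sketched as a pair of Python functions, one for the linker side and one for the CPU side. This is only an illustration under the rules stated above; the function names are hypothetical and the addresses in the usage lines are made up.

```python
def linker_address_field(label_address: int) -> int:
    """Linker side: reduce a 32-bit word-aligned address to the 26-bit field."""
    if label_address % 4 != 0:
        raise ValueError("jump targets must be word aligned")
    # Cut off the top 4 bits, then shift right by 2.
    return (label_address & 0x0FFFFFFF) >> 2


def cpu_jump_target(pc: int, field: int) -> int:
    """CPU side: rebuild the 32-bit target from the field and the current PC."""
    # Shift the field left by 2 and OR in the upper 4 bits of the PC.
    return (pc & 0xF0000000) | ((field & 0x03FFFFFF) << 2)


# Example values only: a label and a PC inside the same 256MB region.
field = linker_address_field(0x1234ABCC)
print(hex(cpu_jump_target(0x10000040, field)))
```

The round trip only recovers the label address when the label and the PC share the same upper 4 bits, which is exactly the range restriction described above.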

Instructions

Hardware Instructions

Hex | Type | Mnemonic | Operands | Description
0x00 | R | NOP | n/a | No operation - a blank line.
0x01 | R | MOV | SrcReg, DestReg | Copies from SrcReg to DestReg.
0x02 | R | MOVS | SrcReg, DestReg | Copies from SrcReg to DestReg, sign extending the value to take up a full word.
0x03 | I | LDB | BaseReg, Offset, DestReg | Loads a byte from memory address (base + offset) into DestReg. The effective address must be byte-aligned.
0x04 | I | LDBS | BaseReg, Offset, DestReg | Loads a sign-extended byte from memory address (base + offset) into DestReg. The effective address must be byte-aligned.
0x05 | I | LDH | BaseReg, Offset, DestReg | Loads a half-word from memory address (base + offset) into DestReg. The effective address must be 2-byte-aligned.
0x06 | I | LDHS | BaseReg, Offset, DestReg | Loads a sign-extended half-word from memory address (base + offset) into DestReg. The effective address must be 2-byte-aligned.
0x07 | I | LDW | BaseReg, Offset, DestReg | Loads a word from memory address (base + offset) into DestReg. The effective address must be 4-byte-aligned.
0x08 | I | STB | SrcReg, BaseReg, Offset | Stores a byte from SrcReg in memory address (base + offset). The effective address must be byte-aligned.
0x09 | I | STH | SrcReg, BaseReg, Offset | Stores a half-word from SrcReg in memory address (base + offset). The effective address must be 2-byte-aligned.
0x0A | I | STW | SrcReg, BaseReg, Offset | Stores a word from SrcReg in memory address (base + offset). The effective address must be 4-byte-aligned.
0x0B | I | LLI | DstReg, Value | Loads a 16-bit literal value into reg, setting the bottom 16 bits of the word. To populate the upper 16 bits, see LUI.
0x0C | I | LUI | DstReg, Value | Loads a 16-bit literal value into reg, setting the top 16 bits of the word. To populate the lower 16 bits, see LLI.
0x0D | I | JMP | DestReg, Offset or Address | Unconditionally jumps to the calculated address or direct address.
0x0E | I | JEQ | DestReg, Offset or Address | Jumps to the calculated address or direct address if equal flag set.
0x0F | I | JNE | DestReg, Offset or Address | Jumps to the calculated address or direct address if the equal flag is not set.
0x10 | I | JGT | DestReg, Offset or Address | Jumps to the calculated address or direct address if greater than flag set.
0x11 | I | JGE | DestReg, Offset or Address | Jumps to the calculated address or direct address if greater than flag or equal flag set.
0x12 | I | JLT | DestReg, Offset or Address | Jumps to the calculated address or direct address if less than flag set.
0x13 | I | JLE | DestReg, Offset or Address | Jumps to the calculated address or direct address if less than flag or equal flag set.
0x14 | R | CMP | Reg1, Reg2 | Compares the value of Reg1 to the value in Reg2. The results of the comparisons are set in the Status register.
0x15 | R | INC | Reg | Increments the value in the given register.
0x16 | R | DEC | Reg | Decrements the value in the given register.
0x17 | R | SHL | Reg, Literal or ValReg | Left shifts the value in Reg by the given amount (either a register, or a literal value).
0x18 | R | SHR | Reg, Literal or ValReg | Right shifts the value in Reg by the given amount (either a register, or a literal value).
0x19 | R | ADD | Src1, Src2, Dest | Adds the value of Src2 to Src1 and writes the result to Dest.
0x1A | R | SUB | Src1, Src2, Dest | Subtracts the value of Src2 from Src1 and writes the result to Dest.
0x1B | R | AND | Src1, Src2, Dest | Performs bitwise AND on Src1 and Src2 storing the result in Dest.
0x1C | R | OR | Src1, Src2, Dest | Performs bitwise OR on Src1 and Src2 storing the result in Dest.
0x1D | R | NOT | Src, Dest | Performs bitwise NOT on Src storing the result in Dest.
0x1E | R | XOR | Src1, Src2, Dest | Performs bitwise XOR on Src1 and Src2 storing the result in Dest.
0x1F | R | NAND | Src1, Src2, Dest | Performs bitwise NAND on Src1 and Src2 storing the result in Dest.
0x20 | R | NOR | Src1, Src2, Dest | Performs bitwise NOR on Src1 and Src2 storing the result in Dest.
0x21 | R | XNOR | Src1, Src2, Dest | Performs bitwise XNOR on Src1 and Src2 storing the result in Dest.
0x22 | I | INT | Literal | Initiates an interrupt with the given 8 bit interrupt code. Triggering an interrupt invokes the following behaviour: The return address is saved to the RET register. The stack base ptr is set to the kernel stack.
0x23 | R | IRT | n/a | Returns from an interrupt.
0x24 | R | HLT | n/a | Halts the processor.
0x25 | I | IADD | Src1, Literal, Dest | An immediate version of addition taking a 16-bit immediate value.
0x26 | I | ISUB | Src1, Literal, Dest | An immediate version of subtraction taking a 16-bit immediate value.

DSA Assembly Language Instruction Reference

Overview

This document provides a comprehensive reference for the DSA (Damn Simple Architecture) assembly language, including all hardware instructions and pseudo-instructions with their syntax variations and usage examples.

Instructions

This section is a complete overview of the assembly language and instructions. It includes both the hardware instructions that translate directly to machine code as well as pseudo instructions and directives that are translated to hardware instructions or directives by the assembler.

Instruction Types

Hardware Instructions

Data Movement Instructions

Mnemonic | Operands | Description
MOV | src_reg, dest_reg | Copy value from source to destination register
MOVS | src_reg, dest_reg | Copy with sign extension

Examples:

mov rg0, rg1        ; Copy rg0 to rg1
movs rg0, rg1       ; Copy rg0 to rg1 with sign extension

Memory Access Instructions

Load Instructions

Mnemonic | Operands | Description
LDB | base_reg, dest_reg [, offset] or label, dest_reg [, offset] | Load byte from memory
LDBS | base_reg, dest_reg [, offset] or label, dest_reg [, offset] | Load byte with sign extension
LDH | base_reg, dest_reg [, offset] or label, dest_reg [, offset] | Load half-word (16-bit)
LDHS | base_reg, dest_reg [, offset] or label, dest_reg [, offset] | Load half-word with sign extension
LDW | base_reg, dest_reg [, offset] or label, dest_reg [, offset] | Load word (32-bit)

Examples:

; Direct register addressing
ldb rg0, rg1        ; Load byte from address in rg0
ldw rg0, rg1, 8     ; Load word from (rg0 + 8)

; Label addressing
ldb buffer, rg2     ; Load byte from label 'buffer'
ldw stack, bpr      ; Load stack address into base pointer

Label Expansions:

; ldb buffer, rg2 expands to:
lli buffer, rg2     ; Load lower 16 bits of buffer address
lui buffer, rg2     ; Load upper 16 bits of buffer address  
ldb rg2, rg2        ; Load byte from address in rg2

; ldw stack, bpr expands to:
lli stack, bpr      ; Load lower 16 bits of stack address
lui stack, bpr      ; Load upper 16 bits of stack address
ldw bpr, bpr        ; Load word from address in bpr
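
To make the load semantics concrete, here is a small Python model of the effective-address calculation and of the sign extension performed by LDBS and LDHS. It is a sketch, not the emulator's code: the function names are hypothetical, and wrapping the address at 32 bits and raising an error on misalignment are assumptions, since the exact fault behaviour is not described here.

```python
def effective_address(base: int, offset: int, size: int) -> int:
    """Return base + offset after checking the alignment for the access size."""
    # Assumption: addresses wrap around at 32 bits.
    address = (base + offset) & 0xFFFFFFFF
    if address % size != 0:
        # Assumption: a misaligned access is an error of some kind.
        raise ValueError(f"address {address:#x} is not {size}-byte aligned")
    return address


def sign_extend(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of value to a 32-bit register value."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & 0xFFFFFFFF


print(hex(effective_address(0x100, 8, 4)))  # like: ldw rg0, rg1, 8
print(hex(sign_extend(0xFF, 8)))            # byte 0xFF as loaded by ldbs
print(hex(sign_extend(0x7F, 8)))            # positive bytes are unchanged
```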

Store Instructions

Mnemonic | Operands | Description
STB | src_reg, base_reg [, offset] or src_reg, label [, offset] | Store byte to memory
STH | src_reg, base_reg [, offset] or src_reg, label [, offset] | Store half-word to memory
STW | src_reg, base_reg [, offset] or src_reg, label [, offset] | Store word to memory

Examples:

; Direct register addressing
stb rg0, rg1        ; Store byte from rg0 to address in rg1
stw rg0, rg1, 12    ; Store word to (rg1 + 12)

; Label addressing
stb acc, buffer     ; Store byte from accumulator to 'buffer'
stw rg1, current    ; Store word to 'current' variable

Label Expansions:

; stb acc, buffer expands to:
lli buffer, rgf     ; Load lower 16 bits of buffer address
lui buffer, rgf     ; Load upper 16 bits of buffer address
stb acc, rgf        ; Store byte from acc to address in rgf

; stw rg1, current expands to:
lli current, rgf    ; Load lower 16 bits of current address
lui current, rgf    ; Load upper 16 bits of current address
stw rg1, rgf        ; Store word from rg1 to address in rgf

Immediate Load Instructions

Mnemonic | Operands | Description
LLI | imm, dest_reg | Load 16-bit immediate into lower 16 bits. Clears upper 16 bits!
LUI | imm, dest_reg | Load 16-bit immediate into upper 16 bits

Usage

Always run LLI before LUI, as LLI clears the upper 16 bits.

Examples:

lli 0x1234, rg0     ; Load 0x1234 into lower 16 bits of rg0
lui 0xABCD, rg0     ; Load 0xABCD into upper 16 bits of rg0
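
The ordering rule can be seen in a small Python model of the two instructions. This is a sketch based only on the descriptions above (LLI writes the lower half and clears the upper half, LUI writes the upper half and keeps the lower half); the function names are hypothetical.

```python
def split_word(value: int) -> tuple:
    """Split a 32-bit value into its (lower, upper) 16-bit halves."""
    return value & 0xFFFF, (value >> 16) & 0xFFFF


def lli(imm: int) -> int:
    """Model of LLI: set the lower 16 bits and clear the upper 16 bits."""
    return imm & 0xFFFF


def lui(reg: int, imm: int) -> int:
    """Model of LUI: set the upper 16 bits and keep the lower 16 bits."""
    return (reg & 0xFFFF) | ((imm & 0xFFFF) << 16)


low, high = split_word(0xABCD1234)
right_order = lui(lli(low), high)   # lli first, then lui
wrong_order = lli(low)              # lui first: the later lli discards the upper half
print(hex(right_order), hex(wrong_order))
```

This is the same split the assembler has to perform when it expands a label operand into an lli/lui pair.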

Jump Instructions

Mnemonic | Operands | Description
JMP | addr [, offset_reg] or imm, offset_reg | Unconditional jump
JEQ | addr [, offset_reg] | Jump if equal flag set
JNE | addr [, offset_reg] | Jump if equal flag not set
JGT | addr [, offset_reg] | Jump if greater than flag set
JGE | addr [, offset_reg] | Jump if greater than flag or equal flag set
JLT | addr [, offset_reg] | Jump if less than flag set
JLE | addr [, offset_reg] | Jump if less than flag or equal flag set

Examples:

jmp start           ; Jump to label 'start'
jmp 4, ret          ; Jump to address (4 + ret register)
jeq end             ; Jump to 'end' if equal flag set
jgt loop            ; Jump to 'loop' if greater than flag set

Arithmetic Instructions

Mnemonic | Operands | Description
ADD | src1_reg, src2_reg, dest_reg | Addition
SUB | src1_reg, src2_reg, dest_reg | Subtraction
IADD | src_reg, imm [, dest_reg] | Immediate addition
ISUB | src_reg, imm [, dest_reg] | Immediate subtraction
INC | reg | Increment register by 1
DEC | reg | Decrement register by 1

Examples:

add rg0, rg1, rg2   ; rg2 = rg0 + rg1
sub rg0, rg1, rg2   ; rg2 = rg0 - rg1
iadd rg0, 10        ; rg0 = rg0 + 10
// or using alternate syntax
addi rg0, 1         ; rg0 = rg0 + 1
inc rg0             ; rg0 = rg0 + 1

Bitwise Operations

Mnemonic | Operands | Description
AND | src1_reg, src2_reg, dest_reg | Bitwise AND
OR | src1_reg, src2_reg, dest_reg | Bitwise OR
XOR | src1_reg, src2_reg, dest_reg | Bitwise XOR
NOT | src_reg, dest_reg | Bitwise NOT
NAND | src1_reg, src2_reg, dest_reg | Bitwise NAND
NOR | src1_reg, src2_reg, dest_reg | Bitwise NOR
XNOR | src1_reg, src2_reg, dest_reg | Bitwise XNOR

Examples:

and rg0, rg1, rg2   ; rg2 = rg0 & rg1
not rg0, rg1        ; rg1 = ~rg0
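
For readers less familiar with the negated operations, the following Python sketch shows what NOT, NAND, NOR and XNOR compute on 32-bit words. The function names are illustrative only; the mask is needed because Python integers are unbounded, whereas DSA registers hold one 32-bit word.

```python
MASK = 0xFFFFFFFF  # keep results within one 32-bit word


def bit_not(a: int) -> int:
    return ~a & MASK


def nand(a: int, b: int) -> int:
    return ~(a & b) & MASK


def nor(a: int, b: int) -> int:
    return ~(a | b) & MASK


def xnor(a: int, b: int) -> int:
    return ~(a ^ b) & MASK


a, b = 0xF0F0F0F0, 0xFF00FF00
for name, op in (("nand", nand), ("nor", nor), ("xnor", xnor)):
    print(name, f"{op(a, b):#010x}")
```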

Shift Operations

Mnemonic | Operands | Description
SHL | reg, shift_amount | Shift left
SHR | reg, shift_amount | Shift right

Examples:

shl rg0, 2          ; Shift rg0 left by 2 bits
shr rg0, 3          ; Shift rg0 right by 3 bits
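
A short Python model of the two shifts on a 32-bit register follows. Two points are assumptions rather than documented behaviour: SHR is modelled as a logical shift (zero fill), and shift amounts are limited to 0-31 to match the 5-bit shift amount field of R-type instructions. The function names are hypothetical.

```python
MASK = 0xFFFFFFFF  # registers hold one 32-bit word


def shl(value: int, amount: int) -> int:
    """Shift left; bits shifted past bit 31 are discarded."""
    if not 0 <= amount < 32:
        raise ValueError("shift amount must fit in 5 bits")
    return (value << amount) & MASK


def shr(value: int, amount: int) -> int:
    """Shift right. Assumption: logical shift, so zeros are shifted in."""
    if not 0 <= amount < 32:
        raise ValueError("shift amount must fit in 5 bits")
    return (value & MASK) >> amount


print(hex(shl(0x80000001, 1)))  # the top bit falls off the end
print(hex(shr(0x80000000, 3)))
```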

Comparison and Control

Mnemonic | Operands | Description
CMP | reg1, reg2 | Compare registers and set flags

Examples:

cmp rg0, zero       ; Compare rg0 with zero register
cmp rg1, rg2        ; Compare rg1 with rg2
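
The interaction between CMP and the conditional jumps can be modelled as below. This is a sketch under assumptions: the comparison is treated as signed, and the flag names (equal, greater, less) are illustrative labels for the equal, greater than, and less than flags mentioned in the jump descriptions, not the actual layout of the Status register.

```python
def to_signed(value: int) -> int:
    """Interpret a 32-bit register value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def cmp_flags(reg1: int, reg2: int) -> dict:
    """Model of CMP. Assumption: the comparison is signed."""
    a, b = to_signed(reg1), to_signed(reg2)
    return {"equal": a == b, "greater": a > b, "less": a < b}


# Which flags each jump consults, following the jump descriptions.
CONDITIONS = {
    "jmp": lambda f: True,
    "jeq": lambda f: f["equal"],
    "jne": lambda f: not f["equal"],
    "jgt": lambda f: f["greater"],
    "jge": lambda f: f["greater"] or f["equal"],
    "jlt": lambda f: f["less"],
    "jle": lambda f: f["less"] or f["equal"],
}


def jump_taken(mnemonic: str, flags: dict) -> bool:
    return CONDITIONS[mnemonic.lower()](flags)


flags = cmp_flags(5, 0)            # like: cmp rg0, zero with rg0 = 5
print(jump_taken("jgt", flags), jump_taken("jle", flags))
```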

System Instructions

Mnemonic | Operands | Description
HLT | - | Halt processor execution
NOP | - | No operation
INT | interrupt_code | Trigger interrupt
IRT | - | Return from interrupt

Examples:

hlt                 ; Stop processor execution
int 0x21            ; Trigger interrupt 0x21

Pseudo Instructions

Stack Operations

Mnemonic | Operands | Description
PUSH | reg | Push register value onto stack
POP | reg | Pop stack value into register

Examples:

push rg0            ; Push rg0 value onto stack
pop ret             ; Pop return address

Memory Access Shortcuts

Mnemonic | Operands | Description
LWI | name, reg | Load address into register

Examples:

lwi string, rg1     ; Load address of 'string' into rg1

Data Directives

Data Definition

Mnemonic | Syntax | Description
DB | name: value1 [, value2, ...] | Define bytes (byte aligned)
DH | name: value1 [, value2, ...] | Define half-words (2 byte aligned)
DW | name: value1 [, value2, ...] | Define words (4 byte aligned)

Examples:

db message: "Hello World", 0, 0x20, 231
dh numbers: 1000, 2000, 3000
dw stack: 0x10000

Notes:

  • All string literals are automatically null-terminated
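
As an illustration of the null-termination rule, the Python sketch below lays out the values of a db directive as bytes. It is not the assembler's implementation: the helper name encode_db is hypothetical, and ASCII encoding with the terminator placed directly after each string literal is an assumption.

```python
def encode_db(values: list) -> bytes:
    """Lay out the values of a db directive as raw bytes (hypothetical helper)."""
    out = bytearray()
    for value in values:
        if isinstance(value, str):
            # Assumption: strings are ASCII and each literal gets its own terminator.
            out += value.encode("ascii") + b"\x00"
        else:
            if not 0 <= value <= 0xFF:
                raise ValueError("db values must fit in one byte")
            out.append(value)
    return bytes(out)


# Mirrors: db message: "Hello World", 0, 0x20, 231
data = encode_db(["Hello World", 0, 0x20, 231])
print(len(data), data)
```

Under these assumptions the explicit 0 after the string in the example is a second zero byte, not the string's terminator.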

Memory Reservation

Mnemonic | Syntax | Description
RESB | name: size | Reserve bytes
RESH | name: size | Reserve half-words
RESW | name: size | Reserve words

Examples:

resb buffer: 256    ; Reserve 256 bytes
resh array: 100     ; Reserve space for 100 half-words
resw heap: 1024     ; Reserve space for 1024 words

Imports

Mnemonic | Syntax | Description
INCLUDE | module_name "path" | Include module symbols. See the Module System section for more details.

Usable Registers

Register | Type | Description
rg0-rgf | General Purpose | General-purpose registers.
acc | Special | Accumulator for calculations and temporary storage - don't use this for variables as pseudo instructions may overwrite this implicitly!
spr | Special | Stack pointer
bpr | Special | Base pointer for stack frames
ret | Special | Return address register
idr | Privileged | Interrupt descriptor table address. on-read/write: protection fault (unless in kernel mode)
mmr | Privileged | Hardware memory map table address. on-read/write: protection fault (unless in kernel mode)
zero | Read-only | Always contains zero. on-read: always returns zero. on-write: value is voided
pcx | Read-only | Program counter. on-write: protection fault
noreg | Placeholder | Indicates absence of register argument. on-read/write: illegal instruction fault

Imports

Module System

Mnemonic | Syntax | Description
INCLUDE | alias[:] "path" | Include module symbols

Import Precedence

Notes:

  • The order of imports may affect the order in which dependencies are placed into the output binary.
  • Circular dependencies are allowed and fully supported.
  • The module name is caller-defined and can be used to create aliases for libraries within the scope of the calling file. This makes namespacing easy.

Examples:

include print "./lib/print.dsa"
include maths "./lib/maths.dsa"

External Symbol Access Convention

External symbols are accessed using the :: operator.

Examples:

include print "./lib/print.dsa"

init:
    // ensure we have a stack setup so we can call functions properly

db string: "Hello world!"

start:
    // load the address of the string into rg1.
    lwi string, rg1
    // push the string address argument
    push rg1
    // call the print function
    call print::print
    // clean up the stack
    pop zero
    hlt

Calling Convention

Step | Responsibility | Action | Description
0 | Caller | Save Current State | Ensure that any registers holding important data are pushed to the stack so that they can be restored later.
1 | Caller | Push arguments | Push exactly n arguments to the stack (in order, last argument pushed first)
2 | Caller | Call function | Execute call namespace::function. This automatically pushes the return address (pcx) and jumps to the function
3 | Function | Set up stack frame | Execute push bpr; mov spr, bpr to establish new stack frame
4 | Function | Access arguments | Read arguments starting at bpr+8, which equals spr+8 immediately after step 3 (first 3 args at offsets 8, 12, 16)
5 | Function | Execute function | Perform the function's operations using the arguments
6 | Function | Store return value | Write return value (if any) to bpr+8 (spr+8 if the stack pointer has not moved since step 3)
7 | Function | Restore stack frame | Execute mov bpr, spr; pop bpr to restore previous stack frame
8 | Function | Return | Execute return pseudo-instruction to return to caller
9 | Caller | Clean up stack | Pop exactly n arguments from the stack to clean up
10 | Caller | Handle unused values | Use pop zero to discard any unused stack values if needed
11 | Caller | Restore State | Pop any registers that were pushed in step 0 (or pop zero if no longer needed)

Notes:

  • The namespace in step 2 is the name assigned in the include statement
  • The call pseudo-instruction automatically handles return address management so long as the callee does not mess with the stack
  • Arguments are accessed by the callee using offsets from the base pointer (bpr)
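
The argument offsets of 8, 12 and 16 can be checked with a small Python model of the stack. This is a sketch under assumptions that are consistent with those offsets but not stated outright: the stack grows downward, every slot is one 4-byte word, and spr points at the most recently pushed word. The class, the function name, and the placeholder addresses are all hypothetical.

```python
class Stack:
    """Minimal downward-growing stack of 32-bit words (assumed layout)."""

    def __init__(self, top: int):
        self.spr = top
        self.memory = {}  # address -> word

    def push(self, value: int) -> None:
        self.spr -= 4
        self.memory[self.spr] = value

    def pop(self) -> int:
        value = self.memory[self.spr]
        self.spr += 4
        return value


def frame_arguments(args: list) -> list:
    """Follow steps 1-4 and read the arguments back through bpr."""
    stack = Stack(0x10000)
    for arg in reversed(args):        # step 1: last argument pushed first
        stack.push(arg)
    stack.push(0xCAFE)                # step 2: call pushes a return address (placeholder)
    stack.push(0x10000)               # step 3: push bpr (placeholder old value)
    bpr = stack.spr                   # step 3: mov spr, bpr
    # step 4: saved bpr sits at bpr+0, the return address at bpr+4,
    # so the first argument is at bpr+8.
    return [stack.memory[bpr + 8 + 4 * i] for i in range(len(args))]


print(frame_arguments([11, 22, 33]))
```

Reading through bpr rather than spr keeps the offsets valid even if the function body pushes temporaries of its own.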

Function Control

Mnemonic | Operands | Description
CALL | namespace::function | Call a function with automatic return address management
RETURN | - | Return from a function to the caller

Examples:

call-local.dsa

// ensure the stack is set up first!

caller:
    push rg0    
    push rg1

    call callee // make call to a local function
    pop rg0     // put result in rg0
    pop zero    // void second return val
    hlt         // stop here so execution does not fall through into callee

callee:
    // setup new stack frame
    push bpr
    mov spr, bpr

    // function body

    // restore the stack frame
    mov bpr, spr
    pop bpr
    return              ; Return from the current function

call-external.dsa

include external "./external.dsa"

// ensure the stack is set up first!
db string: "Hello, world!"
caller:
    // push args
    lwi string, rg0
    push rg0
    call external::callee // do something with the string
    pop zero

external.dsa

callee:
    // set up the stack
    push bpr
    mov spr, bpr

    // function body

    // restore the stack frame
    mov bpr, spr
    pop bpr
    return              ; Return from the current function

Examples

Library Examples

Multiplication Library (multiply.dsa)

// multiply.dsa
// usage:
//
// include multiply "<relative path>"
//
// usage for multiply:
// push (arg1)
// push (arg0)
// call multiply::multiply
// pop (arg0)
// pop (arg1)

multiply:
    push bpr
    mov spr, bpr

    ldw bpr, rg0, 8  // load op 1
    ldw bpr, rg1, 12 // load op 2

    lli 0, acc      // initialize accumulator

    cmp rg1, zero   // guard: skip the loop when op 2 is zero
    jle end

start:	
    add acc, rg0, acc
    dec rg1

    cmp rg1, zero
    jgt start

end:
    stw acc, bpr, 8  // store result for caller
    mov bpr, spr
    pop bpr
    return

Print Library (print.dsa)

// print.dsa
// usage:
//
// include print "<relative path>"
//      
// usage for print:
//      push (register containing address of string)
//      call print::print
//      pop zero
//
// usage for reset:
//      call print::reset

dw display: 0x20000
dw current: 0x20000

// prints the given text to the screen.
print:
    push bpr
    mov spr, bpr

    ldw bpr, rg0, 8    // get string address argument
    ldw current, rg1    // get current display position

print_loop:
    ldb rg0, acc
    stb acc, rg1

    iadd rg0, 1
    iadd rg1, 1

    cmp acc, zero
    jne print_loop
    jmp end

// return
end:
    stw rg1, current

    mov bpr, spr
    pop bpr
    return

// resets the cursor position on the screen
reset:
    push bpr
    mov spr, bpr
    ldw display, rg1
    stw rg1, current
    mov bpr, spr
    pop bpr
    return

Example Program (main.dsa)

include print "./print.dsa"

dw stack: 0x10000
db string: "'To confuse your enemy, you must first confuse yourself' - Probably Sun Tzu."

init:
    // set up a stack.
    ldw stack, bpr
    mov bpr, spr

start:
    lwi string, rg1

    // push string address argument
    push rg1
    // call print function
    call print::print
    // clean up stack
    pop rg1

    hlt

Tooling

Tooling Options

Assembler

  • The assembler is the program that translates assembly code into machine code.
  • It is the only tool required to build DSA assembly language programs.
  • The assembler also works as a library that can be called from applications such as the emulator

Our Tooling:

Assembler

Building the Assembler

Clone the repository

git clone https://git.zxq5.dev/LowLevelDevs/damn_simple_architecture.git
cd damn_simple_architecture

Build the assembler

cd assembler
cargo build --release

Usage

<binary> -i <input_file.dsa> -o <output_file.dsb>

Syntax tooling

Syntax Highlighting

Emulator

  • Our custom emulator has built-in syntax highlighting for the DSA assembly language. All files with the .dsa extension have the syntax applied.

VSCode

DSA Emulator

The DSA Emulator is a visual emulator that allows you to debug and test your programs in a controlled environment. It is composed of a control panel, a memory inspector, and a built-in editor.

The control panel lets you view all of the registers, step through the instructions, and view the current instruction counter.

The memory inspector lets you view any region of memory in the emulator.

The editor contains a built-in assembler instance, so you can edit and assemble your code from the comfort of the emulator.

The loader is responsible for loading your code into memory so that the emulator can run it.

Building the Emulator

Features

Control Panel

Memory Inspector

Stack Inspector

Editor

Loader

Display

Instruction History

DSC - Damn Simple Code

This document is a work in progress!

Nothing is final!

Syntax

  • we aim to make the syntax simple and easy to understand. This has the following benefits:
    • easy to write
    • easy to parse
    • little variation in syntax means we have to handle fewer cases in semantic analysis, so we will be able to create a working compiler more quickly.

Types

  • we should support the following types
    • unsigned integer types (U8, U16, U32)
    • signed integer types (I8, I16, I32)
    • boolean type (Bool)
    • struct types (Struct)
    • dynamic types *(Dyn)

Functions

Other Language Support

Brainf*

Language overview

  • Brainf* instructions are as follows:
Instruction | Description
+ | Increment the current memory cell
- | Decrement the current memory cell
< | Move the data pointer to the left
> | Move the data pointer to the right
. | Output the value of the current memory cell as a character
, | Input a character and store its value in the current memory cell
[ | Jump to the instruction after the matching ] if the value in the current memory cell is zero
] | Jump to the instruction after the matching [ if the value in the current memory cell is non-zero

Implementations

we currently have two implementations of the brainf* esoteric programming language:

Compiler

  • this is the most efficient way to run brainf* programs on the DSA architecture, but of course, still terribly inefficient due to the nature of the language.
  • compiling allows us to calculate the jump addresses at compile time, therefore making each brainf* instruction take at maximum three DSA instructions to execute
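
The idea of resolving jump targets ahead of time can be sketched in Python. The code below is not the DSA compiler: it only precomputes the matching bracket positions once, then uses that table in a small reference interpreter so that each jump is a single lookup. The function names are hypothetical, and the 30,000 wrapping 8-bit cells and the "end of input reads as 0" rule are common Brainf* conventions assumed for the example.

```python
def match_brackets(program: str) -> dict:
    """Map the index of every [ to its matching ] and vice versa."""
    jumps = {}
    open_stack = []
    for index, char in enumerate(program):
        if char == "[":
            open_stack.append(index)
        elif char == "]":
            if not open_stack:
                raise ValueError("unmatched ]")
            start = open_stack.pop()
            jumps[start] = index
            jumps[index] = start
    if open_stack:
        raise ValueError("unmatched [")
    return jumps


def run(program: str, input_text: str = "") -> str:
    """Execute a program using the precomputed jump table."""
    jumps = match_brackets(program)
    cells = [0] * 30000          # assumption: conventional tape size
    pointer = pc = 0
    inputs = iter(input_text)
    output = []
    while pc < len(program):
        char = program[pc]
        if char == "+":
            cells[pointer] = (cells[pointer] + 1) % 256
        elif char == "-":
            cells[pointer] = (cells[pointer] - 1) % 256
        elif char == ">":
            pointer += 1
        elif char == "<":
            pointer -= 1
        elif char == ".":
            output.append(chr(cells[pointer]))
        elif char == ",":
            cells[pointer] = ord(next(inputs, "\0"))
        elif char == "[" and cells[pointer] == 0:
            pc = jumps[pc]       # continue after the matching ]
        elif char == "]" and cells[pointer] != 0:
            pc = jumps[pc]       # continue after the matching [
        pc += 1
    return "".join(output)


print(run("++++++++[>++++++++<-]>+."))
```

Without the table, finding the matching bracket means scanning the program text on every jump, which is the O(n) cost the interpreter pays.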

Interpreter

  • this method is much slower: even jumping to the start of a loop has O(n) time complexity, which, depending on the complexity of the program, can up to double the running time.
  • additionally, interpreting the language means much more logic is required at runtime relative to compiling.
  • from our testing on a few example programs such as a Fibonacci sequence generator, the interpreter is roughly an order of magnitude slower: the Fibonacci generator is about 10 times slower than its compiled equivalent, at around 3.8 million instructions to generate and pretty-print the first 16 Fibonacci numbers, compared to around 350,000 for the compiled version, which we estimate is about as efficient as brainf* can be on our architecture without writing an optimiser.

Usage

Compiling

  • currently the DSA Assembler supports compiling brainf* programs with the following command:
<assembler binary name> -brainf