DSA Assembly Language Reference

Overview

This document is the comprehensive reference for writing DSA assembly programs. It covers assembly syntax, pseudo-instructions, directives, the module system, and calling conventions, and provides complete examples.

Related Documents:

  • For hardware instruction details and encoding: See DSA ISA Specification
  • For build system and toolchain: See project documentation

Assembly Syntax

General Rules

  • Case Insensitive: Mnemonics can be uppercase or lowercase (mov = MOV)
  • Comments: Use // for line comments or /* */ for block comments; the examples in this document also use ; for trailing comments
  • Labels: Identifier followed by colon (e.g., main:, loop:)
  • Whitespace: Flexible spacing between operands
  • Numbers:
    • Decimal: 100, 255
    • Hexadecimal: 0x10, 0xFFFF
    • Binary: 0b1010 (if supported by assembler)
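The three literal forms above can be read with one call in Python. This is only a sketch of how a tool might parse them, not the DSA assembler's actual parser; `parse_literal` is a hypothetical helper name.

```python
def parse_literal(text: str) -> int:
    """Parse a decimal, 0x-prefixed hex, or 0b-prefixed binary literal.

    int(text, 0) infers the base from the prefix. It is more permissive
    than the rules listed above (it also accepts 0o octal and digit
    separators such as 1_000), so a real assembler may reject more.
    """
    return int(text, 0)

print(parse_literal("255"))     # 255
print(parse_literal("0xFFFF"))  # 65535
print(parse_literal("0b1010"))  # 10
```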

Operand Order Convention

DSA assembly uses GAS-style syntax (source → destination):

mov rg0, rg1        ; Copy rg0 TO rg1 (destination is last)
add rg0, rg1, rg2   ; rg2 = rg0 + rg1 (destination is last)

For load/store with immediates:

lli 0x1234, rg0     ; Load immediate 0x1234 INTO rg0
ldw rg0, rg1, 8     ; Load from (rg0+8) INTO rg1
stw rg0, rg1, 8     ; Store rg0 TO address (rg1+8)

Registers

| Register(s) | Type | Description | Usage Notes |
|---|---|---|---|
| rg0-rgf | General | 16 general-purpose registers | Use for variables, temporaries |
| acc | Special | Accumulator | ⚠️ Volatile - pseudo-instructions may overwrite |
| spr | Special | Stack pointer | Points to top of stack |
| bpr | Special | Base pointer | Used for stack frames |
| ret | Special | Return address | Holds return address for functions |
| zero | Read-only | Always zero | Reads return 0, writes discarded |
| pcx | Read-only | Program counter | Cannot be written directly |
| idr | Privileged | Interrupt descriptor table | Kernel mode only |
| mmr | Privileged | Memory map register | Kernel mode only |
| noreg | Placeholder | No register | Used in encoding, triggers fault if accessed |

Register Conventions:

  • acc: Used by pseudo-instructions for temporary values - do not rely on it being preserved
  • rgf: Used by label-addressing pseudo-instructions as a scratch register
  • rg0-rge: Available for general use; calling convention defines which are preserved

Hardware Instructions

This section shows assembly syntax. For encoding details, see the ISA Specification.

Data Movement

mov src_reg, dest_reg       ; Copy value from src_reg to dest_reg
movs src_reg, dest_reg      ; Copy with sign extension

Examples:

mov rg0, rg1                ; rg1 = rg0
movs acc, rg2               ; rg2 = sign_extend(acc)

Memory Load Instructions

ldb base_reg, dest_reg [, offset]      ; Load byte (zero-extend)
ldbs base_reg, dest_reg [, offset]     ; Load byte (sign-extend)
ldh base_reg, dest_reg [, offset]      ; Load halfword (zero-extend)
ldhs base_reg, dest_reg [, offset]     ; Load halfword (sign-extend)
ldw base_reg, dest_reg [, offset]      ; Load word

Offset: Optional signed 16-bit offset (defaults to 0)

Examples:

ldb rg0, rg1                ; Load byte from address in rg0
ldw rg0, rg1, 8             ; Load word from (rg0 + 8)
ldhs rg2, rg3, -4           ; Load signed halfword from (rg2 - 4)
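The difference between the zero-extending loads (ldb, ldh) and the sign-extending loads (ldbs, ldhs) can be modeled in Python. This is a sketch that assumes 32-bit registers; `zero_extend` and `sign_extend` are illustrative helper names, not part of any DSA tool.

```python
def zero_extend(value: int, bits: int) -> int:
    """Keep the low `bits` bits; upper register bits become 0 (ldb/ldh)."""
    return value & ((1 << bits) - 1)

def sign_extend(value: int, bits: int) -> int:
    """Copy the top bit of the loaded value into the upper bits (ldbs/ldhs)."""
    low_mask = (1 << bits) - 1
    value &= low_mask
    if value & (1 << (bits - 1)):          # sign bit set?
        value |= 0xFFFFFFFF & ~low_mask    # fill the upper bits with ones
    return value

byte_in_memory = 0xF0
print(hex(zero_extend(byte_in_memory, 8)))  # 0xf0
print(hex(sign_extend(byte_in_memory, 8)))  # 0xfffffff0
```

A byte holding 0xF0 therefore reads as 240 through ldb but as -16 (in two's complement) through ldbs.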

Alignment Requirements:

  • ldb/ldbs: No alignment required
  • ldh/ldhs: Must be 2-byte aligned
  • ldw: Must be 4-byte aligned
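A quick way to reason about these rules is to compute the effective address and check it against the access size. The sketch below assumes the alignment rule applies to the effective address (base + offset) and that addresses wrap at 32 bits; both are assumptions, and `effective_address` and `is_aligned` are hypothetical helper names.

```python
# Access size in bytes for each load mnemonic listed above.
ACCESS_SIZE = {"ldb": 1, "ldbs": 1, "ldh": 2, "ldhs": 2, "ldw": 4}

def effective_address(base: int, offset: int = 0) -> int:
    """base + signed offset, wrapped to 32 bits (assumed behavior)."""
    return (base + offset) & 0xFFFFFFFF

def is_aligned(mnemonic: str, base: int, offset: int = 0) -> bool:
    """True if the access would satisfy the alignment requirement."""
    return effective_address(base, offset) % ACCESS_SIZE[mnemonic] == 0

print(is_aligned("ldw", 0x1000, 8))    # True
print(is_aligned("ldw", 0x1000, 2))    # False
print(is_aligned("ldhs", 0x1000, -4))  # True
```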

Memory Store Instructions

stb src_reg, base_reg [, offset]       ; Store byte
sth src_reg, base_reg [, offset]       ; Store halfword
stw src_reg, base_reg [, offset]       ; Store word

Examples:

stb rg0, rg1                ; Store byte to address in rg1
stw rg0, rg1, 12            ; Store word to (rg1 + 12)
sth acc, spr, -2            ; Store halfword to (spr - 2)

Alignment Requirements: Same as loads

Immediate Load Instructions

lli immediate, dest_reg     ; Load lower 16 bits (CLEARS upper 16!)
lui immediate, dest_reg     ; Load upper 16 bits (preserves lower 16)

⚠️ CRITICAL: lli clears the upper 16 bits! Always use lli before lui.

Loading 32-bit Constants:

lli 0x1234, rg0             ; rg0 = 0x00001234
lui 0xABCD, rg0             ; rg0 = 0xABCD1234
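The ordering rule can be checked with a small Python model of the two instructions. This is a sketch of the semantics described above, not the hardware implementation; registers are modeled as plain 32-bit integers and the function names simply mirror the mnemonics.

```python
def lli(imm: int) -> int:
    """Model of lli: write the low 16 bits and clear the upper 16."""
    return imm & 0xFFFF

def lui(imm: int, reg: int) -> int:
    """Model of lui: replace the upper 16 bits, keep the lower 16."""
    return ((imm & 0xFFFF) << 16) | (reg & 0xFFFF)

# Correct order: lli first, then lui.
rg0 = lli(0x1234)
rg0 = lui(0xABCD, rg0)
print(hex(rg0))  # 0xabcd1234

# Wrong order: the trailing lli wipes the upper half that lui just set.
rg1 = lui(0xABCD, 0)
rg1 = lli(0x1234)
print(hex(rg1))  # 0x1234
```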

Loading Addresses: See lwi pseudo-instruction

Jump and Branch Instructions

jmp addr [, offset_reg]     ; Unconditional jump
jeq addr [, offset_reg]     ; Jump if equal
jne addr [, offset_reg]     ; Jump if not equal
jgt addr [, offset_reg]     ; Jump if greater than
jge addr [, offset_reg]     ; Jump if greater or equal
jlt addr [, offset_reg]     ; Jump if less than
jle addr [, offset_reg]     ; Jump if less or equal

Jump Modes:

; Absolute jump (using zero register)
jmp label, zero             ; Jump to label address
jmp 0x4000, zero            ; Jump to absolute address 0x4000

; Register-based jump
jmp 0, ret                  ; Jump to address in ret register
jmp 4, ret                  ; Jump to (ret + 4)

; PC-relative (if assembler supports)
jeq loop_start              ; Jump to loop_start if equal flag set

Conditional Jumps: Based on flags set by cmp instruction

Comparison

cmp reg1, reg2              ; Compare reg1 with reg2, set flags

Flags Set:

  • Equal: reg1 == reg2
  • GreaterThan: reg1 > reg2
  • LessThan: reg1 < reg2
  • GreaterThanOrEqual: reg1 >= reg2
  • LessThanOrEqual: reg1 <= reg2

Example:

cmp rg0, zero               ; Compare rg0 with 0
jeq is_zero                 ; Branch if rg0 == 0
jgt is_positive             ; Branch if rg0 > 0
jlt is_negative             ; Branch if rg0 < 0
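The flag semantics can be sketched in Python. The model below assumes cmp treats its operands as signed 32-bit values, which is suggested by the jlt is_negative example but not stated explicitly; `to_signed32` and `cmp_flags` are hypothetical helper names.

```python
def to_signed32(value: int) -> int:
    """Interpret a 32-bit register value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def cmp_flags(reg1: int, reg2: int) -> dict:
    """Return the five flags listed above for cmp reg1, reg2 (signed assumed)."""
    a, b = to_signed32(reg1), to_signed32(reg2)
    return {
        "Equal": a == b,
        "GreaterThan": a > b,
        "LessThan": a < b,
        "GreaterThanOrEqual": a >= b,
        "LessThanOrEqual": a <= b,
    }

flags = cmp_flags(0xFFFFFFFF, 0)  # -1 compared with 0 under the signed assumption
print(flags["LessThan"])          # True
```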

Arithmetic Instructions

add src1, src2, dest        ; dest = src1 + src2
sub src1, src2, dest        ; dest = src1 - src2
iadd src, immediate, dest   ; dest = src + immediate
isub src, immediate, dest   ; dest = src - immediate
inc reg                     ; reg = reg + 1
dec reg                     ; reg = reg - 1

Examples:

add rg0, rg1, rg2           ; rg2 = rg0 + rg1
sub rg0, rg1, rg2           ; rg2 = rg0 - rg1
iadd rg0, 10, rg0           ; rg0 = rg0 + 10
isub rg1, 5, rg2            ; rg2 = rg1 - 5
inc rg0                     ; rg0 = rg0 + 1
dec rg1                     ; rg1 = rg1 - 1

Note: For iadd/isub, destination can be the same as source for in-place operations.

Bitwise Logical Operations

and src1, src2, dest        ; dest = src1 & src2
or src1, src2, dest         ; dest = src1 | src2
xor src1, src2, dest        ; dest = src1 ^ src2
not src, dest               ; dest = ~src
nand src1, src2, dest       ; dest = ~(src1 & src2)
nor src1, src2, dest        ; dest = ~(src1 | src2)
xnor src1, src2, dest       ; dest = ~(src1 ^ src2)

Examples:

and rg0, rg1, rg2           ; rg2 = rg0 & rg1
or rg0, rg1, rg2            ; rg2 = rg0 | rg1
not rg0, rg1                ; rg1 = ~rg0
xor rg0, rg0, rg0           ; rg0 = 0 (XOR register with itself)
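For checking expected results by hand, the inverted operations can be modeled in Python. This sketch assumes 32-bit registers, so every result is masked to 32 bits; the function names mirror the mnemonics (with a trailing underscore where Python reserves the word).

```python
MASK32 = 0xFFFFFFFF  # assumed register width: 32 bits

def not_(a: int) -> int:
    return ~a & MASK32

def nand(a: int, b: int) -> int:
    return ~(a & b) & MASK32

def nor(a: int, b: int) -> int:
    return ~(a | b) & MASK32

def xnor(a: int, b: int) -> int:
    return ~(a ^ b) & MASK32

print(hex(not_(0)))               # 0xffffffff
print(hex(nand(0xF0F0, 0xFF00)))  # 0xffff0fff
print(hex(xnor(0x1234, 0x1234)))  # 0xffffffff
```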

Shift Operations

shl reg, shift_amount       ; Shift left by amount (0-31)
shr reg, shift_amount       ; Shift right by amount (0-31)

Shift Amount:

  • Can be a literal: shl rg0, 2 (shift by 2)
  • Can be a register: shl rg0, rg1 (shift by value in rg1, uses low 5 bits)

Examples:

shl rg0, 2                  ; rg0 = rg0 << 2
shr rg1, 3                  ; rg1 = rg1 >> 3
shl rg0, rg1                ; rg0 = rg0 << (rg1 & 0x1F)

Note: Shift right is logical (zero-fill), not arithmetic
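The shift rules above (low 5 bits of the amount, zero-fill on right shift) can be modeled in Python. This is a sketch assuming 32-bit registers; `shl` and `shr` here are plain functions named after the mnemonics.

```python
MASK32 = 0xFFFFFFFF  # assumed register width: 32 bits

def shl(value: int, amount: int) -> int:
    """Shift left; only the low 5 bits of the amount are used."""
    return (value << (amount & 0x1F)) & MASK32

def shr(value: int, amount: int) -> int:
    """Logical shift right: vacated upper bits are filled with zeros."""
    return (value & MASK32) >> (amount & 0x1F)

print(hex(shl(1, 33)))          # 0x2  (33 & 0x1F == 1)
print(hex(shr(0x80000000, 31))) # 0x1
print(hex(shr(0xFFFFFFF8, 2)))  # 0x3ffffffe
```

The last line shows why shr is not a signed division: 0xFFFFFFF8 is -8 in two's complement, and an arithmetic shift by 2 would give -2 (0xFFFFFFFE), whereas the logical shift produces a large positive value.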

System and Control Instructions

hlt                         ; Halt processor
nop                         ; No operation
int interrupt_code          ; Trigger interrupt (8-bit code)
irt                         ; Return from interrupt

Examples:

hlt                         ; Stop execution
nop                         ; Do nothing (timing/alignment)
int 0x21                    ; Trigger interrupt 0x21
irt                         ; Return from interrupt handler

Pseudo-Instructions

Pseudo-instructions are assembly-level constructs that expand into one or more hardware instructions.

Data Definition Directives

db label: value1 [, value2, ...]    ; Define bytes
dh label: value1 [, value2, ...]    ; Define halfwords (16-bit)
dw label: value1 [, value2, ...]    ; Define words (32-bit)

Examples:

db message: "Hello, World!", 0       ; String with null terminator
db bytes: 0x01, 0x02, 0x03          ; Array of bytes
dh numbers: 1000, 2000, 3000        ; Array of halfwords
dw stack_base: 0x10000              ; Single word value
dw table: 0, 0, 0, 0                ; Array of 4 words

String Encoding: Strings are encoded as byte sequences with escape sequences:

  • \n = newline (0x0A)
  • \t = tab (0x09)
  • \r = carriage return (0x0D)
  • \\ = backslash
  • \" = double quote
  • \0 = null (0x00)
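The escape table can be expressed as a small encoder. This Python sketch assumes ASCII text and rejects escapes not listed above; how the real assembler treats unknown escapes is not specified here, and `encode_string` is a hypothetical helper name.

```python
# Escape character (after the backslash) -> encoded byte, per the list above.
ESCAPES = {"n": 0x0A, "t": 0x09, "r": 0x0D, "\\": 0x5C, '"': 0x22, "0": 0x00}

def encode_string(body: str) -> bytes:
    """Encode the text between the quotes of a db string into bytes."""
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            if i + 1 >= len(body):
                raise ValueError("dangling backslash at end of string")
            code = body[i + 1]
            if code not in ESCAPES:
                raise ValueError(f"unknown escape: \\{code}")
            out.append(ESCAPES[code])
            i += 2
        else:
            out.append(ord(ch))  # assumes ASCII; non-byte code points raise ValueError
            i += 1
    return bytes(out)

data = encode_string(r"Hi\n\0")
print(list(data))  # [72, 105, 10, 0]
```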

Memory Reservation Directives

resb label: size            ; Reserve 'size' bytes
resh label: size            ; Reserve 'size' halfwords
resw label: size            ; Reserve 'size' words

Examples:

resb buffer: 256            ; Reserve 256 bytes
resh array: 100             ; Reserve 100 halfwords (200 bytes)
resw heap: 1024             ; Reserve 1024 words (4096 bytes)

Note: Reserved memory is uninitialized (contents undefined).
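The byte counts in the comments above follow from the unit size of each directive. A minimal Python sketch of that arithmetic, with `reserved_bytes` as a hypothetical helper name:

```python
# Bytes per unit for each reservation directive.
UNIT_BYTES = {"resb": 1, "resh": 2, "resw": 4}

def reserved_bytes(directive: str, count: int) -> int:
    """Total bytes reserved by `directive label: count`."""
    return UNIT_BYTES[directive] * count

print(reserved_bytes("resb", 256))   # 256
print(reserved_bytes("resh", 100))   # 200
print(reserved_bytes("resw", 1024))  # 4096
```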

Stack Operations

push reg                    ; Push register onto stack
pop reg                     ; Pop stack into register

Expansion:

; push rg0 expands to:
iadd spr, 4, spr            ; spr = spr + 4 (stack grows up)
stw rg0, spr, 0             ; Store rg0 to [spr]

; pop rg0 expands to:
ldw spr, rg0, 0             ; Load [spr] into rg0
isub spr, 4, spr            ; spr = spr - 4

Note: DSA stack grows upward (toward higher addresses).
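The two expansions can be traced with a small Python model. This is a sketch that follows the expansions exactly as written above (increment then store for push, load then decrement for pop); the `Stack` class and the starting address 0x10000 are illustrative assumptions.

```python
class Stack:
    """Model of the upward-growing DSA stack as described by the expansions."""

    def __init__(self, base: int):
        self.spr = base   # stack pointer
        self.mem = {}     # sparse memory: address -> 32-bit word

    def push(self, value: int) -> None:
        self.spr += 4                             # iadd spr, 4, spr
        self.mem[self.spr] = value & 0xFFFFFFFF   # stw reg, spr, 0

    def pop(self) -> int:
        value = self.mem[self.spr]                # ldw spr, reg, 0
        self.spr -= 4                             # isub spr, 4, spr
        return value

stack = Stack(0x10000)
stack.push(0xAA)
stack.push(0xBB)
print(hex(stack.spr))    # 0x10008
print(hex(stack.pop()))  # 0xbb
print(hex(stack.pop()))  # 0xaa
print(hex(stack.spr))    # 0x10000
```

In this model spr always points at the most recently pushed word, and the word at the initial base address itself is never written.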

Examples:

push rg0                    ; Save rg0 on stack
push rg1                    ; Save rg1 on stack
; ... do work ...
pop rg1                     ; Restore rg1
pop rg0                     ; Restore rg0

Load Address Pseudo-Instruction

lwi label, dest_reg         ; Load address of label into register

Expansion:

; lwi message, rg0 expands to:
lli message, rg0            ; Load lower 16 bits of address
lui message, rg0            ; Load upper 16 bits of address

Example:

db message: "Hello!", 0
    
lwi message, rg0            ; rg0 = address of message
ldb rg0, rg1                ; rg1 = first byte of message ('H')

Memory Access with Labels

Load and store instructions can use labels directly:

ldb label, dest_reg [, offset]
ldh label, dest_reg [, offset]
ldw label, dest_reg [, offset]
stb src_reg, label [, offset]
sth src_reg, label [, offset]
stw src_reg, label [, offset]

Expansion (uses rgf as scratch):

; ldb buffer, rg2 expands to:
lli buffer, rgf             ; Load lower 16 bits of buffer address
lui buffer, rgf             ; Load upper 16 bits of buffer address
ldb rgf, rg2, 0             ; Load byte from address in rgf

; stw rg1, current expands to:
lli current, rgf            ; Load lower 16 bits of current address
lui current, rgf            ; Load upper 16 bits of current address
stw rg1, rgf, 0             ; Store word to address in rgf

⚠️ Important: These pseudo-instructions use rgf as a scratch register! Do not use rgf for other purposes when using label-based memory access.
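The expansion is mechanical, which the following Python sketch illustrates by rewriting a label-based load or store into its three hardware instructions. It is not the assembler's code: `expand` is a hypothetical helper, it handles only the six mnemonics listed above, and it assumes the caller has already decided that the operand is a label rather than a register.

```python
LABEL_LOADS = {"ldb", "ldh", "ldw"}
LABEL_STORES = {"stb", "sth", "stw"}
SCRATCH = "rgf"  # scratch register used by the expansion

def expand(line: str) -> list[str]:
    """Expand 'ldX label, dest [, off]' or 'stX src, label [, off]'."""
    mnemonic, rest = line.split(None, 1)
    ops = [op.strip() for op in rest.split(",")]
    offset = ops[2] if len(ops) > 2 else "0"
    if mnemonic in LABEL_LOADS:
        label, dest = ops[0], ops[1]
        access = f"{mnemonic} {SCRATCH}, {dest}, {offset}"
    elif mnemonic in LABEL_STORES:
        src, label = ops[0], ops[1]
        access = f"{mnemonic} {src}, {SCRATCH}, {offset}"
    else:
        raise ValueError(f"not a label-based load/store: {mnemonic}")
    # Build the label address in the scratch register, then do the access.
    return [f"lli {label}, {SCRATCH}", f"lui {label}, {SCRATCH}", access]

for instruction in expand("stw rg1, current"):
    print(instruction)
```

Because every expansion overwrites rgf, any value kept there is lost as soon as one of these forms is assembled.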

Examples:

dw counter: 0

ldw counter, rg0            ; Load value of counter
iadd rg0, 1, rg0            ; Increment
stw rg0, counter            ; Store back

Function Call Pseudo-Instructions

call namespace::function    ; Call function from included module
return                      ; Return from function

Expansion:

; call print::print expands to:
lwi print::print, ret       ; Load function address into ret
jmp 0, ret                  ; Jump to the function address held in ret
; (The assembler/linker resolves namespace::function to address)

; return expands to:
jmp 0, ret                  ; Jump to address in ret register

Note: This expansion is simplified and does not show how the return address is saved before the jump. The frame layout in the Calling Convention section reserves a stack slot for it, so the actual sequence depends on the calling convention.

Module System

include namespace "path/to/file.dsa"

Example:

include print "lib/print.dsa"
include math "lib/math.dsa"

; Can now call:
call print::print
call math::multiply

Namespace Resolution:

  • Functions in included modules are accessible via namespace::label
  • Namespace is the identifier before the filename
  • Labels in included files are prefixed with the namespace

Calling Convention

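DSA uses a stack-based calling convention: the caller pushes the arguments and removes them after the call, and the callee sets up and tears down its own stack frame using bpr.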

Stack Frame Layout

Higher Addresses
├─────────────┤
│   Arg N     │  ← spr + (8 + 4*N)
│   ...       │
│   Arg 2     │  ← spr + 16
│   Arg 1     │  ← spr + 12
│   Arg 0     │  ← spr + 8   (first argument)
├─────────────┤
│   Ret Addr  │  ← spr + 4   (return address)
├─────────────┤
│   Old BPR   │  ← spr + 0   (saved base pointer)
├─────────────┤  ← bpr, spr (current frame)
│   Locals    │  (local variables, if any)
Lower Addresses

Calling Sequence

Caller Responsibilities:

  1. Push arguments in reverse order (last argument first):
push arg2
push arg1
push arg0
  2. Call the function:
call namespace::function
  3. Clean up arguments after return:
pop zero                    ; Discard arg0 (or pop into a register to retrieve the return value)
pop zero                    ; Discard arg1
pop zero                    ; Discard arg2

Callee Responsibilities:

  1. Set up stack frame:
function:
    push bpr                ; Save old base pointer
    mov spr, bpr            ; Establish new base pointer
  2. Access arguments:
    ldw bpr, rg0, 8         ; Load arg0 from bpr+8
    ldw bpr, rg1, 12        ; Load arg1 from bpr+12
    ldw bpr, rg2, 16        ; Load arg2 from bpr+16
  3. Execute function body:
    ; Function logic here
    add rg0, rg1, acc       ; Example: acc = arg0 + arg1
  4. Store return value (optional, overwrites arg0):
    stw acc, bpr, 8         ; Store result where arg0 was
  5. Restore stack frame:
    mov bpr, spr            ; Restore stack pointer
    pop bpr                 ; Restore old base pointer
  6. Return to caller:
    return

Complete Example

; Function: add two numbers
; Args: arg0, arg1
; Returns: sum in arg0 position

add_function:
    push bpr                ; Save base pointer
    mov spr, bpr            ; Set up stack frame
    
    ldw bpr, rg0, 8         ; Load arg0
    ldw bpr, rg1, 12        ; Load arg1
    add rg0, rg1, acc       ; acc = arg0 + arg1
    
    stw acc, bpr, 8         ; Store result
    
    mov bpr, spr            ; Restore stack
    pop bpr                 ; Restore base pointer
    return

; Caller:
main:
    ldw stack_base, bpr     ; bpr = value stored at stack_base (0x10000)
    mov bpr, spr
    
    lli 5, rg0
    lli 7, rg1
    
    push rg1                ; Push arg1 (7)
    push rg0                ; Push arg0 (5)
    call local::add_function
    pop rg2                 ; Get result (12)
    pop zero                ; Discard arg1
    
    hlt
    
dw stack_base: 0x10000

Register Usage Conventions

| Register(s) | Usage | Preserved? |
|---|---|---|
| rg0-rg3 | Function arguments, temporaries | No (caller-saved) |
| rg4-rge | Local variables | Yes (callee-saved if used) |
| rgf | Scratch (used by label addressing) | No |
| acc | Temporary calculations | No |
| spr | Stack pointer | Yes (must be restored) |
| bpr | Base pointer | Yes (must be restored) |
| ret | Return address | Managed by call/return |

Notes:

  • Functions should save and restore rg4-rge if they use them
  • rg0-rg3 may be overwritten by called functions
  • acc and rgf are volatile - assume they're overwritten

Complete Examples

Example 1: Multiplication Library

// multiply.dsa
// Multiplies two numbers using repeated addition
//
// Usage:
//   include multiply "multiply.dsa"
//   push arg1
//   push arg0
//   call multiply::multiply
//   pop result
//   pop zero        ; discard second argument

multiply:
    push bpr
    mov spr, bpr

    ldw bpr, rg0, 8         ; Load multiplier
    ldw bpr, rg1, 12        ; Load multiplicand

    lli 0, acc              ; Initialize result to 0

loop_start:
    cmp rg1, zero
    jle loop_end            ; Done when multiplicand <= 0 (so x * 0 yields 0)
    add acc, rg0, acc       ; acc += multiplier
    dec rg1                 ; multiplicand--
    jmp loop_start

loop_end:

    stw acc, bpr, 8         ; Store result for caller
    
    mov bpr, spr
    pop bpr
    return

Example 2: Print Library

// print.dsa
// Prints null-terminated string to display memory
//
// Usage:
//   include print "print.dsa"
//   
//   push string_address
//   call print::print
//   pop zero
//
//   call print::reset     ; Reset cursor (no args)

dw display: 0x20000         ; Display memory base address
dw current: 0x20000         ; Current cursor position

// Print function
print:
    push bpr
    mov spr, bpr

    ldw bpr, rg0, 8         ; Get string address argument
    ldw current, rg1        ; Get current cursor position

print_loop:
    ldb rg0, acc            ; Load character
    stb acc, rg1            ; Store to display

    iadd rg0, 1, rg0        ; Advance string pointer
    iadd rg1, 1, rg1        ; Advance cursor

    cmp acc, zero           ; Check for null terminator
    jne print_loop          ; Continue if not null

    stw rg1, current        ; Save cursor position

    mov bpr, spr
    pop bpr
    return

// Reset cursor function
reset:
    push bpr
    mov spr, bpr
    
    ldw display, rg1        ; Load display base
    stw rg1, current        ; Reset cursor to start
    
    mov bpr, spr
    pop bpr
    return

Example 3: Main Program

// main.dsa
// Demonstrates using included libraries

include print "./print.dsa"

dw stack: 0x10000
db string: "'To confuse your enemy, you must first confuse yourself' - Probably Sun Tzu.", 0

init:
    // Set up stack
    ldw stack, bpr
    mov bpr, spr

start:
    // Load string address
    lwi string, rg1

    // Call print function
    push rg1
    call print::print
    pop rg1                 ; Clean up (rg1 now contains arg we passed)

    hlt

Example 4: Conditional Logic

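// Demonstrates comparisons and branching

include print "./print.dsa"

dw stack: 0x10000
dw value: 42

main:
    // Set up stack (required by the push/call sequence below)
    ldw stack, bpr
    mov bpr, spr

    ldw value, rg0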
    
    cmp rg0, zero
    jeq is_zero
    jgt is_positive
    jlt is_negative

is_zero:
    // Handle zero case
    lwi zero_msg, rg1
    jmp print_and_exit

is_positive:
    // Handle positive case
    lwi positive_msg, rg1
    jmp print_and_exit

is_negative:
    // Handle negative case
    lwi negative_msg, rg1
    jmp print_and_exit

print_and_exit:
    push rg1
    call print::print
    pop zero
    hlt

db zero_msg: "Value is zero", 0
db positive_msg: "Value is positive", 0
db negative_msg: "Value is negative", 0

Example 5: Loop with Counter

// Count from 0 to 9

dw stack: 0x10000

main:
    ldw stack, bpr
    mov bpr, spr
    
    lli 0, rg4              ; Counter = 0 (rg4 is callee-saved, so it survives the call)
    lli 10, rg5             ; Limit = 10

loop:
    // Do something with counter in rg4
    push rg4
    call process_value
    pop zero
    
    inc rg4                 ; Counter++
    cmp rg4, rg5            ; Compare with limit
    jlt loop                ; Loop if counter < limit
    
    hlt

process_value:
    push bpr
    mov spr, bpr
    
    ldw bpr, rg0, 8         ; Get value
    ; Process value here...
    
    mov bpr, spr
    pop bpr
    return

Best Practices

1. Stack Management

  • Always balance push/pop operations
  • Set up stack frame in every function
  • Clean up arguments after function calls
  • Use pop zero to discard unwanted values

2. Register Usage

  • Don't rely on acc being preserved
  • Don't use rgf for variables (used by label addressing)
  • Save callee-saved registers if you modify them
  • Use zero register for zero constants

3. Memory Access

  • Ensure proper alignment for halfword/word access
  • Use label-based addressing for clearer code
  • Check that labels are defined before use

4. Function Design

  • Document calling convention in comments
  • Validate input arguments when appropriate
  • Use consistent parameter order
  • Return values via stack or designated register

5. Code Organization

  • Use meaningful label names
  • Comment complex operations
  • Group related functions in modules
  • Use includes for code reuse

6. Performance

  • Minimize memory accesses (use registers)
  • Avoid unnecessary comparisons
  • Use shifts for multiplication/division by powers of 2
  • Consider instruction pipelining if supported

Assembler Directives

Alignment (if supported)

.align 4                    ; Align to 4-byte boundary
.align 2                    ; Align to 2-byte boundary

Origin (if supported)

.org 0x1000                 ; Set location counter to 0x1000

Section Control (if supported)

.text                       ; Code section
.data                       ; Data section
.bss                        ; Uninitialized data section

Note: Assembler directive support depends on the specific DSA assembler implementation.

Common Patterns

Loading 32-bit Constants

lli lower_16_bits, reg
lui upper_16_bits, reg

Zero a Register

mov zero, reg               ; Method 1
xor reg, reg, reg           ; Method 2
lli 0, reg                  ; Method 3

Copy Memory

ldw src_addr, rg0           ; Load one word from the source label
stw rg0, dest_addr          ; Store it to the destination label (both forms use rgf as scratch)

Multiply by Power of 2

shl reg, 3                  ; Multiply by 8 (2^3)

Divide by Power of 2

shr reg, 2                  ; Divide by 4 (2^2); unsigned values only, since shr is logical

Boolean NOT

cmp reg, zero
jeq was_zero                ; If reg == 0, result is 1
lli 0, reg
jmp done
was_zero:
lli 1, reg
done:

Min/Max

; max(rg0, rg1) -> rg2
mov rg0, rg2                ; Assume rg0 is max
cmp rg0, rg1
jge done
mov rg1, rg2                ; rg1 was larger
done:

Troubleshooting

Common Errors

Alignment Fault:

  • Check that halfword loads/stores use even addresses
  • Check that word loads/stores use addresses divisible by 4

Illegal Instruction:

  • Verify opcode is valid
  • Check that shift amount is 0 for non-shift instructions
  • Ensure you're not using noreg as a source/destination

Stack Corruption:

  • Verify push/pop balance
  • Check that functions restore bpr before returning
  • Ensure caller cleans up arguments

Wrong Results:

  • Verify lli is called before lui when loading constants
  • Check that you're not relying on acc or rgf being preserved
  • Verify signed vs. unsigned loads (ldb vs. ldbs)

Debugging Tips

  1. Add nop instructions as breakpoint markers
  2. Print register values using display memory
  3. Use single-step execution to trace program flow
  4. Verify stack pointer values at function boundaries
  5. Check label addresses in disassembly

Appendix: Instruction Quick Reference

| Category | Instructions |
|---|---|
| Data Movement | mov, movs |
| Memory Load | ldb, ldbs, ldh, ldhs, ldw |
| Memory Store | stb, sth, stw |
| Immediate Load | lli, lui |
| Jump/Branch | jmp, jeq, jne, jgt, jge, jlt, jle |
| Comparison | cmp |
| Arithmetic | add, sub, iadd, isub, inc, dec |
| Logical | and, or, xor, not, nand, nor, xnor |
| Shift | shl, shr |
| System | hlt, nop, int, irt |
| Pseudo | db, dh, dw, resb, resh, resw, push, pop, lwi, call, return, include |

Version History

  • v1.0 - Initial comprehensive reference
    • Combined hardware instructions and pseudo-instructions
    • Added complete calling convention
    • Included practical examples
    • Documented common patterns and best practices