# DSA Assembly Language Reference
## Overview
This document is the comprehensive reference for writing DSA assembly programs. It covers assembly syntax, pseudo-instructions, directives, the module system, and calling conventions, and it closes with complete example programs.
**Related Documents:**
- For hardware instruction details and encoding: See *DSA ISA Specification*
- For build system and toolchain: See project documentation
## Assembly Syntax
### General Rules
- **Case Insensitive:** Mnemonics can be uppercase or lowercase (`mov` = `MOV`)
- **Comments:** Use `//` for line comments or `/* */` for block comments (the examples in this reference also mark trailing comments with `;`)
- **Labels:** Identifier followed by colon (e.g., `main:`, `loop:`)
- **Whitespace:** Flexible spacing between operands
- **Numbers:**
  - Decimal: `100`, `255`
  - Hexadecimal: `0x10`, `0xFFFF`
  - Binary: `0b1010` (if supported by assembler)
### Operand Order Convention
DSA assembly uses **GAS-style syntax** (source → destination):
```asm
mov rg0, rg1 ; Copy rg0 TO rg1 (destination is last)
add rg0, rg1, rg2 ; rg2 = rg0 + rg1 (destination is last)
```
For load/store with immediates:
```asm
lli 0x1234, rg0 ; Load immediate 0x1234 INTO rg0
ldw rg0, rg1, 8 ; Load from (rg0+8) INTO rg1
stw rg0, rg1, 8 ; Store rg0 TO address (rg1+8)
```
## Registers
| Register(s) | Type | Description | Usage Notes |
|-------------|------|-------------|-------------|
| **rg0-rgf** | General | 16 general-purpose registers | Use for variables, temporaries |
| **acc** | Special | Accumulator | ⚠️ Volatile - pseudo-instructions may overwrite |
| **spr** | Special | Stack pointer | Points to top of stack |
| **bpr** | Special | Base pointer | Used for stack frames |
| **ret** | Special | Return address | Holds return address for functions |
| **zero** | Read-only | Always zero | Reads return 0, writes discarded |
| **pcx** | Read-only | Program counter | Cannot be written directly |
| **idr** | Privileged | Interrupt descriptor table | Kernel mode only |
| **mmr** | Privileged | Memory map register | Kernel mode only |
| **noreg** | Placeholder | No register | Used in encoding, triggers fault if accessed |
**Register Conventions:**
- **acc**: Used by pseudo-instructions for temporary values - do not rely on it being preserved
- **rgf**: Used by label-addressing pseudo-instructions as a scratch register
- **rg0-rge**: Available for general use; calling convention defines which are preserved
## Hardware Instructions
This section shows assembly syntax. For encoding details, see the ISA Specification.
### Data Movement
```asm
mov src_reg, dest_reg ; Copy value from src_reg to dest_reg
movs src_reg, dest_reg ; Copy with sign extension
```
**Examples:**
```asm
mov rg0, rg1 ; rg1 = rg0
movs acc, rg2 ; rg2 = sign_extend(acc)
```
### Memory Load Instructions
```asm
ldb base_reg, dest_reg [, offset] ; Load byte (zero-extend)
ldbs base_reg, dest_reg [, offset] ; Load byte (sign-extend)
ldh base_reg, dest_reg [, offset] ; Load halfword (zero-extend)
ldhs base_reg, dest_reg [, offset] ; Load halfword (sign-extend)
ldw base_reg, dest_reg [, offset] ; Load word
```
**Offset:** Optional signed 16-bit offset (defaults to 0)
**Examples:**
```asm
ldb rg0, rg1 ; Load byte from address in rg0
ldw rg0, rg1, 8 ; Load word from (rg0 + 8)
ldhs rg2, rg3, -4 ; Load signed halfword from (rg2 - 4)
```
**Alignment Requirements:**
- `ldb/ldbs`: No alignment required
- `ldh/ldhs`: Must be 2-byte aligned
- `ldw`: Must be 4-byte aligned
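As a sketch of these rules, assume `rg0` holds the address 0x1000 (an arbitrary value chosen for illustration) and that alignment is checked on the effective address, base plus offset. The commented-out lines are the ones that would violate the requirements above:

```asm
lli 0x1000, rg0 ; rg0 = 0x00001000 (divisible by 4)
ldw rg0, rg1, 4 ; OK: 0x1004 is divisible by 4
ldh rg0, rg2, 6 ; OK: 0x1006 is even
ldbs rg0, rg3, 7 ; OK: byte loads have no alignment requirement
; ldw rg0, rg1, 6 ; misaligned: 0x1006 is not divisible by 4
; ldh rg0, rg2, 7 ; misaligned: 0x1007 is odd
```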
### Memory Store Instructions
```asm
stb src_reg, base_reg [, offset] ; Store byte
sth src_reg, base_reg [, offset] ; Store halfword
stw src_reg, base_reg [, offset] ; Store word
```
**Examples:**
```asm
stb rg0, rg1 ; Store byte to address in rg1
stw rg0, rg1, 12 ; Store word to (rg1 + 12)
sth acc, spr, -2 ; Store halfword to (spr - 2)
```
**Alignment Requirements:** Same as loads
### Immediate Load Instructions
```asm
lli immediate, dest_reg ; Load lower 16 bits (CLEARS upper 16!)
lui immediate, dest_reg ; Load upper 16 bits (preserves lower 16)
```
**⚠️ CRITICAL:** `lli` clears the upper 16 bits! Always use `lli` before `lui`.
**Loading 32-bit Constants:**
```asm
lli 0x1234, rg0 ; rg0 = 0x00001234
lui 0xABCD, rg0 ; rg0 = 0xABCD1234
```
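To illustrate why the order matters, the sketch below first shows the reversed sequence and then a constant whose upper half is non-zero. It assumes `lli` and `lui` accept any 16-bit immediate, including `0xFFFF`:

```asm
; Wrong order: lli runs last and clears the upper half
lui 0xABCD, rg0 ; upper 16 bits = 0xABCD, lower 16 bits unchanged
lli 0x1234, rg0 ; rg0 = 0x00001234 (0xABCD is lost)

; Loading 0xFFFFFFFF (-1): lli alone leaves the upper half zero
lli 0xFFFF, rg1 ; rg1 = 0x0000FFFF
lui 0xFFFF, rg1 ; rg1 = 0xFFFFFFFF
```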
**Loading Addresses:** See `lwi` pseudo-instruction
### Jump and Branch Instructions
```asm
jmp addr [, offset_reg] ; Unconditional jump
jeq addr [, offset_reg] ; Jump if equal
jne addr [, offset_reg] ; Jump if not equal
jgt addr [, offset_reg] ; Jump if greater than
jge addr [, offset_reg] ; Jump if greater or equal
jlt addr [, offset_reg] ; Jump if less than
jle addr [, offset_reg] ; Jump if less or equal
```
**Jump Modes:**
```asm
; Absolute jump (using zero register)
jmp label, zero ; Jump to label address
jmp 0x4000, zero ; Jump to absolute address 0x4000
; Register-based jump
jmp 0, ret ; Jump to address in ret register
jmp 4, ret ; Jump to (ret + 4)
; PC-relative (if assembler supports)
jeq loop_start ; Jump to loop_start if equal flag set
```
**Conditional Jumps:** Based on flags set by `cmp` instruction
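The register-based form can also serve as a computed jump. The following is a minimal sketch, assuming any general-purpose register may be used as `offset_reg`; the label `handler` is hypothetical:

```asm
lwi handler, rg0 ; rg0 = address of handler
jmp 0, rg0 ; Jump to (0 + rg0), the same form as `jmp 0, ret`
nop ; Not reached
handler:
hlt
```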
### Comparison
```asm
cmp reg1, reg2 ; Compare reg1 with reg2, set flags
```
**Flags Set:**
- Equal: `reg1 == reg2`
- GreaterThan: `reg1 > reg2`
- LessThan: `reg1 < reg2`
- GreaterThanOrEqual: `reg1 >= reg2`
- LessThanOrEqual: `reg1 <= reg2`
**Example:**
```asm
cmp rg0, zero ; Compare rg0 with 0
jeq is_zero ; Branch if rg0 == 0
jgt is_positive ; Branch if rg0 > 0
jlt is_negative ; Branch if rg0 < 0
```
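Because the flags describe the first operand relative to the second, operand order matters when comparing two registers. A small sketch that clamps `rg0` to an upper limit held in `rg1` (the label `in_range` is hypothetical):

```asm
cmp rg0, rg1 ; Flags describe rg0 relative to rg1
jle in_range ; Taken when rg0 <= rg1
mov rg1, rg0 ; Otherwise clamp: rg0 = rg1
in_range:
```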
### Arithmetic Instructions
```asm
add src1, src2, dest ; dest = src1 + src2
sub src1, src2, dest ; dest = src1 - src2
iadd src, immediate, dest ; dest = src + immediate
isub src, immediate, dest ; dest = src - immediate
inc reg ; reg = reg + 1
dec reg ; reg = reg - 1
```
**Examples:**
```asm
add rg0, rg1, rg2 ; rg2 = rg0 + rg1
sub rg0, rg1, rg2 ; rg2 = rg0 - rg1
iadd rg0, 10, rg0 ; rg0 = rg0 + 10
isub rg1, 5, rg2 ; rg2 = rg1 - 5
inc spr ; spr = spr + 1
dec spr ; spr = spr - 1
```
**Note:** For `iadd`/`isub`, destination can be the same as source for in-place operations.
### Bitwise Logical Operations
```asm
and src1, src2, dest ; dest = src1 & src2
or src1, src2, dest ; dest = src1 | src2
xor src1, src2, dest ; dest = src1 ^ src2
not src, dest ; dest = ~src
nand src1, src2, dest ; dest = ~(src1 & src2)
nor src1, src2, dest ; dest = ~(src1 | src2)
xnor src1, src2, dest ; dest = ~(src1 ^ src2)
```
**Examples:**
```asm
and rg0, rg1, rg2 ; rg2 = rg0 & rg1
or rg0, rg1, rg2 ; rg2 = rg0 | rg1
not rg0, rg1 ; rg1 = ~rg0
xor rg0, rg0, rg0 ; rg0 = 0 (XOR register with itself)
```
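These operations are typically combined with a mask loaded via `lli`. The sketch below extracts the low byte of `rg0` and then sets, toggles, and clears bit 4; the register choices are arbitrary:

```asm
lli 0x00FF, rg1 ; rg1 = 0x000000FF (mask for the low byte)
and rg0, rg1, rg2 ; rg2 = rg0 & 0xFF
lli 0x0010, rg3 ; rg3 = mask for bit 4
or rg0, rg3, rg0 ; Set bit 4 of rg0
xor rg0, rg3, rg0 ; Toggle bit 4 (clears it again here)
not rg3, rg4 ; rg4 = ~rg3 (all bits except bit 4)
and rg0, rg4, rg0 ; Clear bit 4 of rg0 regardless of its state
```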
### Shift Operations
```asm
shl reg, shift_amount ; Shift left by amount (0-31)
shr reg, shift_amount ; Shift right by amount (0-31)
```
**Shift Amount:**
- Can be a literal: `shl rg0, 2` (shift by 2)
- Can be a register: `shl rg0, rg1` (shift by value in rg1, uses low 5 bits)
**Examples:**
```asm
shl rg0, 2 ; rg0 = rg0 << 2
shr rg1, 3 ; rg1 = rg1 >> 3
shl rg0, rg1 ; rg0 = rg0 << (rg1 & 0x1F)
```
**Note:** Shift right is logical (zero-fill), not arithmetic
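A common use of `shl` is scaling an array index to a byte offset. The following sketch reads element 2 of a word array; `table` is a hypothetical label and is assumed to be 4-byte aligned:

```asm
dw table: 10, 20, 30, 40
lwi table, rg0 ; rg0 = address of table[0]
lli 2, rg1 ; rg1 = index 2
shl rg1, 2 ; rg1 = 2 * 4 = 8 (byte offset)
add rg0, rg1, rg0 ; rg0 = address of table[2]
ldw rg0, rg2 ; rg2 = 30
```

Because `shr` fills with zeros, the reverse operation (dividing by a power of 2) gives the expected result only for unsigned values.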
### System and Control Instructions
```asm
hlt ; Halt processor
nop ; No operation
int interrupt_code ; Trigger interrupt (8-bit code)
irt ; Return from interrupt
```
**Examples:**
```asm
hlt ; Stop execution
nop ; Do nothing (timing/alignment)
int 0x21 ; Trigger interrupt 0x21
irt ; Return from interrupt handler
```
## Pseudo-Instructions
Pseudo-instructions are assembly-level constructs that expand into one or more hardware instructions.
### Data Definition Directives
```asm
db label: value1 [, value2, ...] ; Define bytes
dh label: value1 [, value2, ...] ; Define halfwords (16-bit)
dw label: value1 [, value2, ...] ; Define words (32-bit)
```
**Examples:**
```asm
db message: "Hello, World!", 0 ; String with null terminator
db bytes: 0x01, 0x02, 0x03 ; Array of bytes
dh numbers: 1000, 2000, 3000 ; Array of halfwords
dw stack_base: 0x10000 ; Single word value
dw table: 0, 0, 0, 0 ; Array of 4 words
```
**String Encoding:** Strings are encoded as byte sequences with escape sequences:
- `\n` = newline (0x0A)
- `\t` = tab (0x09)
- `\r` = carriage return (0x0D)
- `\\` = backslash
- `\"` = double quote
- `\0` = null (0x00)
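The byte sequences these escapes would produce are sketched below. The labels are hypothetical, and the sketch assumes the assembler adds no implicit terminator, as the explicit `, 0` in the examples above suggests:

```asm
db greeting: "Hi\n", 0 ; Bytes: 0x48 0x69 0x0A 0x00
db quoted: "a\"b\\", 0 ; Bytes: 0x61 0x22 0x62 0x5C 0x00
db packed: "A\0B" ; Bytes: 0x41 0x00 0x42
```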
### Memory Reservation Directives
```asm
resb label: size ; Reserve 'size' bytes
resh label: size ; Reserve 'size' halfwords
resw label: size ; Reserve 'size' words
```
**Examples:**
```asm
resb buffer: 256 ; Reserve 256 bytes
resh array: 100 ; Reserve 100 halfwords (200 bytes)
resw heap: 1024 ; Reserve 1024 words (4096 bytes)
```
**Note:** Reserved memory is uninitialized (contents undefined).
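Since the contents are undefined, code that depends on zeroed memory has to clear it first. A minimal sketch, using the hypothetical labels `scratch` and `clear_loop`:

```asm
resw scratch: 4 ; 4 words (16 bytes), contents undefined
lwi scratch, rg0 ; rg0 = address of the first word
lli 4, rg1 ; rg1 = words remaining
clear_loop:
stw zero, rg0 ; Write 0 to the current word
iadd rg0, 4, rg0 ; Advance to the next word
dec rg1
cmp rg1, zero
jgt clear_loop ; Repeat while words remain
```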
### Stack Operations
```asm
push reg ; Push register onto stack
pop reg ; Pop stack into register
```
**Expansion:**
```asm
; push rg0 expands to:
iadd spr, 4, spr ; spr = spr + 4 (stack grows up)
stw rg0, spr, 0 ; Store rg0 to [spr]
; pop rg0 expands to:
ldw spr, rg0, 0 ; Load [spr] into rg0
isub spr, 4, spr ; spr = spr - 4
```
**Note:** DSA stack grows upward (toward higher addresses).
**Examples:**
```asm
push rg0 ; Save rg0 on stack
push rg1 ; Save rg1 on stack
; ... do work ...
pop rg1 ; Restore rg1
pop rg0 ; Restore rg0
```
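Following the expansion above step by step, and assuming `spr` starts at 0x10000 (an illustrative value), the stack pointer and memory change as follows:

```asm
push rg0 ; spr = 0x10004, word at 0x10004 = rg0
push rg1 ; spr = 0x10008, word at 0x10008 = rg1
pop rg1 ; rg1 = word at 0x10008, spr = 0x10004
pop rg0 ; rg0 = word at 0x10004, spr = 0x10000
```

Values come back in the reverse of the order in which they were pushed, which is why the pops mirror the pushes.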
### Load Address Pseudo-Instruction
```asm
lwi label, dest_reg ; Load address of label into register
```
**Expansion:**
```asm
; lwi message, rg0 expands to:
lli message, rg0 ; Load lower 16 bits of address
lui message, rg0 ; Load upper 16 bits of address
```
**Example:**
```asm
db message: "Hello!", 0
lwi message, rg0 ; rg0 = address of message
ldb rg0, rg1 ; rg1 = first byte of message ('H')
```
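Note that `lwi` yields the address of a label, not the value stored there. The sketch below contrasts it with the label-based `ldw` form described in the next section; `stack_top` is a hypothetical label:

```asm
dw stack_top: 0x10000
lwi stack_top, rg0 ; rg0 = address at which the word is stored
ldw stack_top, rg1 ; rg1 = 0x10000, the stored value (expands via rgf)
ldw rg0, rg2 ; rg2 = 0x10000 as well, loaded through the address in rg0
```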
### Memory Access with Labels
Load and store instructions can use labels directly:
```asm
ldb label, dest_reg [, offset]
ldh label, dest_reg [, offset]
ldw label, dest_reg [, offset]
stb src_reg, label [, offset]
sth src_reg, label [, offset]
stw src_reg, label [, offset]
```
**Expansion (uses rgf as scratch):**
```asm
; ldb buffer, rg2 expands to:
lli buffer, rgf ; Load lower 16 bits of buffer address
lui buffer, rgf ; Load upper 16 bits of buffer address
ldb rgf, rg2, 0 ; Load byte from address in rgf
; stw rg1, current expands to:
lli current, rgf ; Load lower 16 bits of current address
lui current, rgf ; Load upper 16 bits of current address
stw rg1, rgf, 0 ; Store word to address in rgf
```
**⚠️ Important:** These pseudo-instructions use `rgf` as a scratch register! Do not use `rgf` for other purposes when using label-based memory access.
**Examples:**
```asm
dw counter: 0
ldw counter, rg0 ; Load value of counter
iadd rg0, 1, rg0 ; Increment
stw rg0, counter ; Store back
```
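The sketch below shows how to keep a live value out of harm's way when mixing it with label-based accesses. It assumes the value starts in `rg3`; `total` is a hypothetical label:

```asm
dw total: 0
mov rg3, rge ; Keep the running value in rge rather than rgf
ldw total, rg0 ; Expands via rgf, so rgf is overwritten here
add rg0, rge, rg0 ; rge still holds the value
stw rg0, total ; Overwrites rgf again
```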
### Function Call Pseudo-Instructions
```asm
call namespace::function ; Call function from included module
return ; Return from function
```
**Expansion:**
```asm
; call print::print expands to:
lwi print::print, ret ; Load function address into ret
jmp 0, ret ; Jump to function (saves return in pcx)
; (The assembler/linker resolves namespace::function to address)
; return expands to:
jmp 0, ret ; Jump to address in ret register
```
**Note:** This expansion is simplified: it does not show how the return address is saved. The actual return address handling depends on the assembler and the calling convention.
### Module System
```asm
include namespace "path/to/file.dsa"
```
**Example:**
```asm
include print "lib/print.dsa"
include math "lib/math.dsa"
; Can now call:
call print::print
call math::multiply
```
**Namespace Resolution:**
- Functions in included modules are accessible via `namespace::label`
- Namespace is the identifier before the filename
- Labels in included files are prefixed with the namespace
## Calling Convention
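DSA passes arguments and return values on the stack: the caller pushes the arguments and removes them after the call, and the callee sets up and tears down its own stack frame.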
### Stack Frame Layout
```
Higher Addresses
├─────────────┤
│ Arg N │ ← spr + (8 + 4*N)
│ ... │
│ Arg 2 │ ← spr + 16
│ Arg 1 │ ← spr + 12
│ Arg 0 │ ← spr + 8 (first argument)
├─────────────┤
│ Ret Addr │ ← spr + 4 (return address)
├─────────────┤
│ Old BPR │ ← spr + 0 (saved base pointer)
├─────────────┤ ← bpr, spr (current frame)
│ Locals │ (local variables, if any)
Lower Addresses
```
### Calling Sequence
**Caller Responsibilities:**
1. **Push arguments in reverse order** (last argument first):
```asm
push arg2
push arg1
push arg0
```
2. **Call the function:**
```asm
call namespace::function
```
3. **Clean up arguments** after return:
```asm
pop zero ; Discard arg0 (pop into a register instead to retrieve a return value)
pop zero ; Discard arg1
pop zero ; Discard arg2
```
**Callee Responsibilities:**
1. **Set up stack frame:**
```asm
function:
push bpr ; Save old base pointer
mov spr, bpr ; Establish new base pointer
```
2. **Access arguments:**
```asm
ldw bpr, rg0, 8 ; Load arg0 from bpr+8
ldw bpr, rg1, 12 ; Load arg1 from bpr+12
ldw bpr, rg2, 16 ; Load arg2 from bpr+16
```
3. **Execute function body:**
```asm
; Function logic here
add rg0, rg1, acc ; Example: acc = arg0 + arg1
```
4. **Store return value** (optional, overwrites arg0):
```asm
stw acc, bpr, 8 ; Store result where arg0 was
```
5. **Restore stack frame:**
```asm
mov bpr, spr ; Restore stack pointer
pop bpr ; Restore old base pointer
```
6. **Return to caller:**
```asm
return
```
### Complete Example
```asm
; Function: add two numbers
; Args: arg0, arg1
; Returns: sum in arg0 position
add_function:
push bpr ; Save base pointer
mov spr, bpr ; Set up stack frame
ldw bpr, rg0, 8 ; Load arg0
ldw bpr, rg1, 12 ; Load arg1
add rg0, rg1, acc ; acc = arg0 + arg1
stw acc, bpr, 8 ; Store result
mov bpr, spr ; Restore stack
pop bpr ; Restore base pointer
return
; Caller:
main:
ldw stack_base, bpr ; Load the stored value 0x10000 (lwi would load the label's address)
mov bpr, spr
lli 5, rg0
lli 7, rg1
push rg1 ; Push arg1 (7)
push rg0 ; Push arg0 (5)
call local::add_function
pop rg2 ; Get result (12)
pop zero ; Discard arg1
hlt
dw stack_base: 0x10000
```
### Register Usage Conventions
| Register(s) | Usage | Preserved? |
|-------------|-------|------------|
| rg0-rg3 | Function arguments, temporaries | No (caller-saved) |
| rg4-rge | Local variables | Yes (callee-saved if used) |
| rgf | Scratch (used by label addressing) | No |
| acc | Temporary calculations | No |
| spr | Stack pointer | Yes (must be restored) |
| bpr | Base pointer | Yes (must be restored) |
| ret | Return address | Managed by call/return |
**Notes:**
- Functions should save and restore rg4-rge if they use them
- rg0-rg3 may be overwritten by called functions
- acc and rgf are volatile - assume they're overwritten
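On the caller side, these rules mean that a live value in `rg0-rg3` must be saved around a call. A minimal sketch, where `helper::work` is a hypothetical module function taking one argument:

```asm
push rg1 ; Save a live value; rg0-rg3 are caller-saved
push rg0 ; Push arg0
call helper::work
pop rg0 ; Remove arg0 (holds the result if the callee stored one there)
pop rg1 ; Restore the saved value
```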
## Complete Examples
### Example 1: Multiplication Library
```asm
// multiply.dsa
// Multiplies two numbers using repeated addition
//
// Usage:
// include multiply "multiply.dsa"
// push arg1
// push arg0
// call multiply::multiply
// pop result
// pop zero ; discard second argument
multiply:
push bpr
mov spr, bpr
ldw bpr, rg0, 8 ; Load multiplier
ldw bpr, rg1, 12 ; Load multiplicand
lli 0, acc ; Initialize result to 0
loop_start:
cmp rg1, zero
jle loop_done ; Stop when multiplicand <= 0 (so x * 0 = 0)
add acc, rg0, acc ; acc += multiplier
dec rg1 ; multiplicand--
jmp loop_start
loop_done:
stw acc, bpr, 8 ; Store result for caller
mov bpr, spr
pop bpr
return
```
### Example 2: Print Library
```asm
// print.dsa
// Prints null-terminated string to display memory
//
// Usage:
// include print "print.dsa"
//
// push string_address
// call print::print
// pop zero
//
// call print::reset ; Reset cursor (no args)
dw display: 0x20000 ; Display memory base address
dw current: 0x20000 ; Current cursor position
// Print function
print:
push bpr
mov spr, bpr
ldw bpr, rg0, 8 ; Get string address argument
ldw current, rg1 ; Get current cursor position
print_loop:
ldb rg0, acc ; Load character
cmp acc, zero ; Check for null terminator
jeq print_done ; Stop before writing the terminator
stb acc, rg1 ; Store to display
iadd rg0, 1, rg0 ; Advance string pointer
iadd rg1, 1, rg1 ; Advance cursor
jmp print_loop
print_done:
stw rg1, current ; Save cursor position
mov bpr, spr
pop bpr
return
// Reset cursor function
reset:
push bpr
mov spr, bpr
ldw display, rg1 ; Load display base
stw rg1, current ; Reset cursor to start
mov bpr, spr
pop bpr
return
```
### Example 3: Main Program
```asm
// main.dsa
// Demonstrates using included libraries
include print "./print.dsa"
dw stack: 0x10000
db string: "'To confuse your enemy, you must first confuse yourself' - Probably Sun Tzu.", 0
init:
// Set up stack
ldw stack, bpr
mov bpr, spr
start:
// Load string address
lwi string, rg1
// Call print function
push rg1
call print::print
pop rg1 ; Clean up (rg1 now contains arg we passed)
hlt
```
### Example 4: Conditional Logic
```asm
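// Demonstrates comparisons and branching
include print "./print.dsa"
dw stack: 0x10000
dw value: 42
main:
ldw stack, bpr ; Set up the stack before any push/call
mov bpr, spr
ldw value, rg0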
cmp rg0, zero
jeq is_zero
jgt is_positive
jlt is_negative
is_zero:
// Handle zero case
lwi zero_msg, rg1
jmp print_and_exit
is_positive:
// Handle positive case
lwi positive_msg, rg1
jmp print_and_exit
is_negative:
// Handle negative case
lwi negative_msg, rg1
jmp print_and_exit
print_and_exit:
push rg1
call print::print
pop zero
hlt
db zero_msg: "Value is zero", 0
db positive_msg: "Value is positive", 0
db negative_msg: "Value is negative", 0
```
### Example 5: Loop with Counter
```asm
// Count from 0 to 9
dw stack: 0x10000
main:
ldw stack, bpr
mov bpr, spr
lli 0, rg4 ; Counter = 0 (rg4 is callee-saved, so it survives the call)
lli 10, rg5 ; Limit = 10
loop:
// Pass the counter to process_value
push rg4
call process_value
pop zero
inc rg4 ; Counter++
cmp rg4, rg5 ; Compare with limit
jlt loop ; Loop if counter < limit
hlt
process_value:
push bpr
mov spr, bpr
ldw bpr, rg0, 8 ; Get value
; Process value here...
mov bpr, spr
pop bpr
return
```
## Best Practices
### 1. Stack Management
- Always balance push/pop operations
- Set up stack frame in every function
- Clean up arguments after function calls
- Use `pop zero` to discard unwanted values
### 2. Register Usage
- Don't rely on `acc` being preserved
- Don't use `rgf` for variables (used by label addressing)
- Save callee-saved registers if you modify them
- Use `zero` register for zero constants
### 3. Memory Access
- Ensure proper alignment for halfword/word access
- Use label-based addressing for clearer code
- Check that labels are defined before use
### 4. Function Design
- Document calling convention in comments
- Validate input arguments when appropriate
- Use consistent parameter order
- Return values via stack or designated register
### 5. Code Organization
- Use meaningful label names
- Comment complex operations
- Group related functions in modules
- Use includes for code reuse
### 6. Performance
- Minimize memory accesses (use registers)
- Avoid unnecessary comparisons
- Use shifts for multiplication/division by powers of 2
- Consider instruction pipelining if supported
## Assembler Directives
### Alignment (if supported)
```asm
.align 4 ; Align to 4-byte boundary
.align 2 ; Align to 2-byte boundary
```
### Origin (if supported)
```asm
.org 0x1000 ; Set location counter to 0x1000
```
### Section Control (if supported)
```asm
.text ; Code section
.data ; Data section
.bss ; Uninitialized data section
```
**Note:** Assembler directive support depends on the specific DSA assembler implementation.
## Common Patterns
### Loading 32-bit Constants
```asm
lli lower_16_bits, reg
lui upper_16_bits, reg
```
### Zero a Register
```asm
mov zero, reg ; Method 1
xor reg, reg, reg ; Method 2
lli 0, reg ; Method 3
```
### Copy Memory
```asm
ldw src_addr, rg0 ; Load from source
stw rg0, dest_addr ; Store to destination
```
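The same pattern extends to a block of words by walking two pointers. A sketch under the assumption that both regions are 4-byte aligned; `src`, `dst`, and `copy_loop` are hypothetical labels:

```asm
dw src: 1, 2, 3, 4
resw dst: 4
lwi src, rg0 ; rg0 = source pointer
lwi dst, rg1 ; rg1 = destination pointer
lli 4, rg2 ; rg2 = words remaining
copy_loop:
ldw rg0, rg3 ; rg3 = next source word
stw rg3, rg1 ; Store it at the destination
iadd rg0, 4, rg0 ; Advance source pointer
iadd rg1, 4, rg1 ; Advance destination pointer
dec rg2
cmp rg2, zero
jgt copy_loop ; Repeat while words remain
```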
### Multiply by Power of 2
```asm
shl reg, 3 ; Multiply by 8 (2^3)
```
### Divide by Power of 2
```asm
shr reg, 2 ; Divide by 4 (2^2); unsigned values only, since shr is logical
```
### Boolean NOT
```asm
cmp reg, zero
jeq was_zero ; If reg == 0, result is 1
lli 0, reg
jmp done
was_zero:
lli 1, reg
done:
```
### Min/Max
```asm
; max(rg0, rg1) -> rg2
mov rg0, rg2 ; Assume rg0 is max
cmp rg0, rg1
jge done
mov rg1, rg2 ; rg1 was larger
done:
```
## Troubleshooting
### Common Errors
**Alignment Fault:**
- Check that halfword loads/stores use even addresses
- Check that word loads/stores use addresses divisible by 4
**Illegal Instruction:**
- Verify opcode is valid
- Check that shift amount is 0 for non-shift instructions
- Ensure you're not using `noreg` as a source/destination
**Stack Corruption:**
- Verify push/pop balance
- Check that functions restore `bpr` before returning
- Ensure caller cleans up arguments
**Wrong Results:**
- Verify `lli` is executed before `lui` when loading constants
- Check that you're not relying on `acc` or `rgf` being preserved
- Verify signed vs. unsigned loads (ldb vs. ldbs)
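The signed/unsigned difference is easiest to see with a byte whose top bit is set. A short sketch, with `sample` as a hypothetical label:

```asm
db sample: 0xFF
lwi sample, rg0 ; rg0 = address of the byte
ldb rg0, rg1 ; rg1 = 0x000000FF (255, zero-extended)
ldbs rg0, rg2 ; rg2 = 0xFFFFFFFF (-1, sign-extended)
```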
### Debugging Tips
1. Add `nop` instructions as breakpoint markers
2. Print register values using display memory
3. Use single-step execution to trace program flow
4. Verify stack pointer values at function boundaries
5. Check label addresses in disassembly
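As a sketch of tip 2, the snippet below writes a single decimal digit held in `rg0` to the first display cell. It assumes the display base 0x20000 used by the print library in Example 2 and that display memory takes ASCII bytes; both are assumptions about the target system:

```asm
dw display: 0x20000 ; Display base, as assumed in Example 2
ldw display, rg1 ; rg1 = 0x20000
iadd rg0, 0x30, rg2 ; rg2 = '0' + rg0 (only valid for values 0-9)
stb rg2, rg1 ; Write the digit to the first display cell
```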
## Appendix: Instruction Quick Reference
| Category | Instructions |
|----------|-------------|
| **Data Movement** | mov, movs |
| **Memory Load** | ldb, ldbs, ldh, ldhs, ldw |
| **Memory Store** | stb, sth, stw |
| **Immediate Load** | lli, lui |
| **Jump/Branch** | jmp, jeq, jne, jgt, jge, jlt, jle |
| **Comparison** | cmp |
| **Arithmetic** | add, sub, iadd, isub, inc, dec |
| **Logical** | and, or, xor, not, nand, nor, xnor |
| **Shift** | shl, shr |
| **System** | hlt, nop, int, irt |
| **Pseudo** | db, dh, dw, resb, resh, resw, push, pop, lwi, call, return, include |
## Version History
- **v1.0** - Initial comprehensive reference
  - Combined hardware instructions and pseudo-instructions
  - Added complete calling convention
  - Included practical examples
  - Documented common patterns and best practices