DSA Instruction Set Architecture Specification
Overview
The Damn Simple Architecture (DSA) is a 32-bit RISC-style architecture designed for simplicity and educational purposes. This document provides the complete instruction set architecture specification, including all hardware instructions, registers, and encoding formats.
Data Types and Sizes
| Type | Size | Alignment |
|---|---|---|
| Byte | 8 bits | 1-byte aligned |
| Halfword | 16 bits | 2-byte aligned |
| Word | 32 bits | 4-byte aligned |
Note on Endianness:
- Instructions and numeric data in memory: Little-endian
- Data defined via db/dh/dw directives: Big-endian (assembler-specific)
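The two byte orders in the note above can be contrasted with a minimal Python sketch. The function names are hypothetical and exist only for illustration; the big-endian layout for `dw` follows the note and may differ between assemblers.

```python
import struct

def word_to_memory(word: int) -> bytes:
    """Byte layout of a 32-bit instruction or runtime word (little-endian)."""
    return struct.pack("<I", word & 0xFFFFFFFF)

def dw_directive_bytes(value: int) -> bytes:
    """Byte layout assumed for a dw directive (big-endian, assembler-specific)."""
    return struct.pack(">I", value & 0xFFFFFFFF)

print(word_to_memory(0x12345678).hex())      # 78563412
print(dw_directive_bytes(0x12345678).hex())  # 12345678
```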
Registers
DSA's 5-bit register fields encode 32 register indices: 24 programmer-accessible registers (0x00-0x17) plus several internal system registers.
Programmer-Accessible Registers
| Hex | Register | Type | Description |
|---|---|---|---|
| 0x00-0x0F | rg0-rgf | General Purpose | 16 general-purpose registers for variables and temporary values |
| 0x10 | acc | Special | Accumulator for calculations and temporary storage. ⚠️ Volatile: used as scratch by pseudo-instructions |
| 0x11 | spr | Special | Stack pointer - points to top of stack |
| 0x12 | bpr | Special | Base pointer - used for stack frame management |
| 0x13 | ret | Special | Return address register - used for function returns |
| 0x14 | idr | Privileged | Interrupt descriptor table address. Read/write triggers protection fault in user mode |
| 0x15 | mmr | Privileged | Hardware memory map table address. Read/write triggers protection fault in user mode |
| 0x16 | zero | Read-only | Constant zero value. Reads always return 0; writes are discarded |
| 0x17 | noreg | Placeholder | Indicates an unused register field. Read/write triggers illegal instruction fault. Can also be referenced as null |
| 0x18-0x1F | - | Reserved | Not programmer-accessible: 0x18-0x1C hold the internal system registers listed below; 0x1D-0x1F are reserved for future use |
System Registers (indices 0x18-0x1C): These exist in the encoding space but are internal to the CPU implementation:
| Hex | Register | Description |
|---|---|---|
| 0x18 | mar | Memory Address Register (CPU internal) |
| 0x19 | mdr | Memory Data Register (CPU internal) |
| 0x1A | sts | Status Register (CPU internal) |
| 0x1B | cir | Current Instruction Register (CPU internal) |
| 0x1C | pcx | Program Counter (read-only, special access) |
Note on PCX (Program Counter):
- PCX can be read in certain contexts (e.g., stored during CALL)
- Writing to PCX triggers a protection fault
- PCX is automatically updated by jump and branch instructions
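As a convenience for tooling, the register tables above can be transcribed into a lookup table. This is a minimal Python sketch: `REGISTERS`, `PRIVILEGED`, and `register_index` are hypothetical names, and case-insensitive matching is an assumption rather than documented assembler behavior.

```python
# Register name -> 5-bit index, transcribed from the tables above.
REGISTERS = {f"rg{i:x}": i for i in range(16)}  # rg0 .. rgf
REGISTERS.update({
    "acc": 0x10, "spr": 0x11, "bpr": 0x12, "ret": 0x13,
    "idr": 0x14, "mmr": 0x15, "zero": 0x16, "noreg": 0x17,
    "null": 0x17,  # documented alias for noreg
})

# Registers whose access faults in user mode.
PRIVILEGED = {"idr", "mmr"}

def register_index(name: str) -> int:
    """Return the encoding index for a register name (case-insensitive by assumption)."""
    return REGISTERS[name.lower()]
```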
Status Register (STS) Layout
The status register is a 32-bit register with the following flag bits:
| Bit | Name | Description | Boot Value |
|---|---|---|---|
| 0 | Equal | Set if last comparison result was equal | 0 |
| 1 | GreaterThan | Set if last comparison result was greater than | 0 |
| 2 | GreaterThanOrEqual | Set if last comparison was greater than or equal | 0 |
| 3 | LessThan | Set if last comparison result was less than | 0 |
| 4 | LessThanOrEqual | Set if last comparison was less than or equal | 0 |
| 5 | Zero | Set if last arithmetic/logic operation result was zero | 0 |
| 6-31 | - | Reserved | 0 |
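The flag layout maps naturally onto a bit-flag type. The sketch below uses Python's `enum.IntFlag`; the class and member names are illustrative and not part of the specification.

```python
from enum import IntFlag

class Sts(IntFlag):
    """Flag bits of the status register, per the table above."""
    EQUAL = 1 << 0
    GREATER_THAN = 1 << 1
    GREATER_THAN_OR_EQUAL = 1 << 2
    LESS_THAN = 1 << 3
    LESS_THAN_OR_EQUAL = 1 << 4
    ZERO = 1 << 5

def reserved_bits_clear(sts: int) -> bool:
    """Bits 6-31 are reserved and boot as 0."""
    return (sts >> 6) == 0

flags = Sts(0x35)
print(Sts.EQUAL in flags, Sts.LESS_THAN in flags)  # True False
```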
Instruction Encoding Formats
DSA uses three instruction encoding formats:
R-Type (Register) Instructions
Used for operations with register operands only, including shifts.
31-26  | 25-21   | 20-16   | 15-11   | 10-6     | 5-0
-------+---------+---------+---------+----------+-------
Opcode | SrcReg1 | SrcReg2 | DestReg | ShiftAmt | Unused
- Opcode (6 bits): Instruction operation code
- SrcReg1 (5 bits): First source register
- SrcReg2 (5 bits): Second source register
- DestReg (5 bits): Destination register
- ShiftAmt (5 bits): Shift amount (for shift instructions only, must be 0 otherwise)
- Unused (6 bits): Must be 0
Important Rules:
- ShiftAmt must be 0 for non-shift instructions (else illegal instruction fault)
- Unused register fields must be set to noreg (0x17)
- Using registers in unexpected positions may cause an illegal instruction fault
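The R-type layout can be packed with plain shifts and ORs. The following Python sketch is illustrative only: `encode_r` is a hypothetical helper, and it defaults every register field to noreg so that unused fields follow the rule above.

```python
NOREG = 0x17  # placeholder index for unused register fields

def encode_r(opcode: int, src1: int = NOREG, src2: int = NOREG,
             dest: int = NOREG, shamt: int = 0) -> int:
    """Pack an R-type word: opcode | src1 | src2 | dest | shamt | 6 unused zero bits."""
    fields = (("opcode", opcode, 6), ("src1", src1, 5), ("src2", src2, 5),
              ("dest", dest, 5), ("shamt", shamt, 5))
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
    return (opcode << 26) | (src1 << 21) | (src2 << 16) | (dest << 11) | (shamt << 6)

# ADD rg1, rg2, rg3 (opcode 0x19)
print(hex(encode_r(0x19, 1, 2, 3)))  # 0x64221800
# HLT (opcode 0x24): all register fields noreg, ShiftAmt 0
print(hex(encode_r(0x24)))           # 0x92f7b800
```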
I-Type (Immediate) Instructions
Used for operations with a 16-bit immediate value.
31-26  | 25-21  | 20-16   | 15-0
-------+--------+---------+-----------------
Opcode | SrcReg | DestReg | 16-bit Immediate
- Opcode (6 bits): Instruction operation code
- SrcReg (5 bits): Source register (base register for loads and jumps, value register for stores)
- DestReg (5 bits): Destination register (base register for stores; unused and set to noreg for jumps)
- Immediate (16 bits): Signed 16-bit immediate value or offset
Usage:
- Arithmetic: Immediate is a signed value
- Memory access: Immediate is a signed byte offset from base address
- Branches: Immediate is a signed offset added to base register
- Literal loads: Immediate is unsigned 16-bit value
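A minimal Python sketch of I-type packing, together with the sign extension applied to the immediate for arithmetic, memory, and branch instructions. `encode_i` and `sign_extend16` are hypothetical helper names.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def encode_i(opcode: int, src: int, dest: int, imm: int) -> int:
    """Pack an I-type word: opcode | src | dest | 16-bit immediate."""
    if not 0 <= opcode < 64 or not 0 <= src < 32 or not 0 <= dest < 32:
        raise ValueError("opcode or register field out of range")
    if not -0x8000 <= imm <= 0xFFFF:
        raise ValueError("immediate does not fit in 16 bits")
    return (opcode << 26) | (src << 21) | (dest << 16) | (imm & 0xFFFF)

# LDW with base spr (0x11), destination rg0, offset -4
print(hex(encode_i(0x07, 0x11, 0x00, -4)))  # 0x1e20fffc
print(sign_extend16(0xFFFC))                # -4
```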
J-Type (Jump) Instructions
Used for absolute jumps with large address ranges.
31-26  | 25-0
-------+---------------
Opcode | 26-bit Address
- Opcode (6 bits): Jump instruction code
- Address (26 bits): Partial address for jump
Address Calculation:
- Left-shift the 26-bit address by 2 (word alignment)
- OR with upper 4 bits of current PCX
- Result is final 32-bit jump address
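Jump Range: The 256 MB-aligned region that contains the current PCX. Because the upper 4 address bits are copied from PCX, a J-type jump cannot leave that region.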
Note: J-type instructions are defined but currently unused. Use I-type JMP with register addressing for all jumps.
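Although J-type is currently unused, the address calculation is easy to check by hand. A minimal Python sketch, with `j_target` as a hypothetical name:

```python
def j_target(pcx: int, addr26: int) -> int:
    """Combine the upper 4 bits of PCX with the 26-bit field shifted left by 2."""
    return (pcx & 0xF0000000) | ((addr26 & 0x03FFFFFF) << 2)

print(hex(j_target(0x10000040, 0x0000100)))  # 0x10000400
print(hex(j_target(0xABCD1234, 0x3FFFFFF)))  # 0xaffffffc
```

The second call uses the largest possible field value and still lands inside the region selected by the top 4 bits of PCX.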
Hardware Instructions
Data Movement
| Hex | Mnemonic | Type | Operands | Description |
|---|---|---|---|---|
| 0x00 | NOP | R | - | No operation - does nothing |
| 0x01 | MOV | R | SrcReg, DestReg | Copy value from SrcReg to DestReg |
| 0x02 | MOVS | R | SrcReg, DestReg | Copy with sign extension to fill 32 bits |
MOV/MOVS Details:
- MOV performs direct copy (all 32 bits)
- MOVS sign-extends the value (useful after byte/halfword loads)
- Both instructions set the Zero flag if result is zero
Memory Access - Load Instructions
All loads require proper alignment or trigger an alignment fault.
| Hex | Mnemonic | Type | Operands | Description |
|---|---|---|---|---|
| 0x03 | LDB | I | BaseReg, DestReg, Offset | Load byte (8-bit), zero-extend to 32 bits |
| 0x04 | LDBS | I | BaseReg, DestReg, Offset | Load byte (8-bit), sign-extend to 32 bits |
| 0x05 | LDH | I | BaseReg, DestReg, Offset | Load halfword (16-bit), zero-extend to 32 bits |
| 0x06 | LDHS | I | BaseReg, DestReg, Offset | Load halfword (16-bit), sign-extend to 32 bits |
| 0x07 | LDW | I | BaseReg, DestReg, Offset | Load word (32-bit) |
Load Operation:
- Effective address = BaseReg + SignExtend(Offset)
- Offset is a signed 16-bit value
- Alignment requirements:
  - LDB/LDBS: No alignment required (byte-aligned)
  - LDH/LDHS: Must be 2-byte aligned
  - LDW: Must be 4-byte aligned
Encoding Note: In machine code, the order is: BaseReg (SrcReg field), DestReg field, Offset (Immediate field)
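The load rules above (effective address, alignment check, zero or sign extension) can be modelled in a few lines of Python. This is a sketch under stated assumptions: memory is a flat `bytes` object, the effective address wraps modulo 2^32, and `AlignmentFault` and `load` are hypothetical names.

```python
class AlignmentFault(Exception):
    """Stand-in for the architecture's alignment fault (hypothetical name)."""

def sign_extend16(value: int) -> int:
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def load(memory: bytes, base: int, offset: int, size: int, signed: bool) -> int:
    """Model LDB/LDBS (size 1), LDH/LDHS (size 2), and LDW (size 4)."""
    addr = (base + sign_extend16(offset)) & 0xFFFFFFFF  # wraparound is an assumption
    if addr % size != 0:
        raise AlignmentFault(f"address {addr:#x} is not {size}-byte aligned")
    if addr + size > len(memory):
        raise IndexError("access outside modelled memory")
    value = int.from_bytes(memory[addr:addr + size], "little", signed=signed)
    return value & 0xFFFFFFFF  # register view: 32-bit unsigned pattern

mem = bytes([0x80, 0xFF, 0x34, 0x12])
print(hex(load(mem, 0, 0, 1, False)))       # LDB  -> 0x80
print(hex(load(mem, 0, 0, 1, True)))        # LDBS -> 0xffffff80
print(hex(load(mem, 4, 0xFFFC, 4, False)))  # LDW at base 4, offset -4 -> 0x1234ff80
```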
Memory Access - Store Instructions
All stores require proper alignment or trigger an alignment fault.
| Hex | Mnemonic | Type | Operands | Description |
|---|---|---|---|---|
| 0x08 | STB | I | SrcReg, BaseReg, Offset | Store byte (8-bit) to memory |
| 0x09 | STH | I | SrcReg, BaseReg, Offset | Store halfword (16-bit) to memory |
| 0x0A | STW | I | SrcReg, BaseReg, Offset | Store word (32-bit) to memory |
Store Operation:
- Effective address = BaseReg + SignExtend(Offset)
- Offset is a signed 16-bit value
- Only the relevant bits are stored (8, 16, or 32)
- Alignment requirements:
  - STB: No alignment required (byte-aligned)
  - STH: Must be 2-byte aligned
  - STW: Must be 4-byte aligned
Encoding Note: In machine code: SrcReg (SrcReg field), BaseReg (DestReg field), Offset (Immediate field)
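A matching Python sketch for stores, showing that only the low 8, 16, or 32 bits of the source register reach memory and that they are written in little-endian order. Memory is modelled as a `bytearray`; `store` and `AlignmentFault` are hypothetical names, and address wraparound modulo 2^32 is an assumption.

```python
class AlignmentFault(Exception):
    """Stand-in for the architecture's alignment fault (hypothetical name)."""

def sign_extend16(value: int) -> int:
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def store(memory: bytearray, value: int, base: int, offset: int, size: int) -> None:
    """Model STB (size 1), STH (size 2), and STW (size 4)."""
    addr = (base + sign_extend16(offset)) & 0xFFFFFFFF  # wraparound is an assumption
    if addr % size != 0:
        raise AlignmentFault(f"address {addr:#x} is not {size}-byte aligned")
    if addr + size > len(memory):
        raise IndexError("access outside modelled memory")
    truncated = value & ((1 << (8 * size)) - 1)  # keep only the relevant bits
    memory[addr:addr + size] = truncated.to_bytes(size, "little")

mem = bytearray(8)
store(mem, 0xAABBCCDD, 8, 0xFFFC, 4)  # STW to address 8 - 4 = 4
store(mem, 0x000001FF, 0, 1, 1)       # STB keeps only the low byte
print(mem.hex())                      # 00ff0000ddccbbaa
```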
Immediate Load Instructions
| Hex | Mnemonic | Type | Operands | Description |
|---|---|---|---|---|
| 0x0B | LLI | I | Value, DestReg | Load 16-bit value into lower 16 bits. ⚠️ Clears the upper 16 bits |
| 0x0C | LUI | I | Value, DestReg | Load 16-bit value into upper 16 bits; lower 16 bits unchanged |
Usage for 32-bit Values:
LLI 0x1234, rg0 ; rg0 = 0x00001234
LUI 0xABCD, rg0 ; rg0 = 0xABCD1234
⚠️ CRITICAL: Always execute LLI before LUI, as LLI clears the upper 16 bits!
Note on LUI: The assembler may shift the immediate value right by 16 bits when encoding, so specify the upper 16 bits directly (e.g., LUI 0xABCD, rg0 not LUI 0xABCD0000, rg0).
Encoding Note: In machine code: Value (Immediate field), DestReg (SrcReg field for both LLI and LUI)
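The ordering requirement is easy to see in a small model of the two instructions. In this Python sketch, `lli` and `lui` are hypothetical functions that return the new register value:

```python
def lli(value: int) -> int:
    """LLI: write the low 16 bits and clear the upper 16 bits."""
    return value & 0xFFFF

def lui(reg: int, value: int) -> int:
    """LUI: write the upper 16 bits and keep the existing lower 16 bits."""
    return ((value & 0xFFFF) << 16) | (reg & 0xFFFF)

# Correct order: LLI first, then LUI.
rg0 = lli(0x1234)
rg0 = lui(rg0, 0xABCD)
print(hex(rg0))  # 0xabcd1234

# Wrong order: the LLI wipes out what LUI wrote.
rg0 = lui(0, 0xABCD)
rg0 = lli(0x1234)
print(hex(rg0))  # 0x1234
```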
Jump and Branch Instructions
| Hex | Mnemonic | Type | Operands | Description |
|---|---|---|---|---|
| 0x0D | JMP | I | Offset, BaseReg | Unconditional jump to (BaseReg + Offset) |
| 0x0E | JEQ | I | Offset, BaseReg | Jump if Equal flag set |
| 0x0F | JNE | I | Offset, BaseReg | Jump if Equal flag NOT set |
| 0x10 | JGT | I | Offset, BaseReg | Jump if GreaterThan flag set |
| 0x11 | JGE | I | Offset, BaseReg | Jump if GreaterThan OR Equal flag set |
| 0x12 | JLT | I | Offset, BaseReg | Jump if LessThan flag set |
| 0x13 | JLE | I | Offset, BaseReg | Jump if LessThan OR Equal flag set |
Jump Calculation:
- Target address = BaseReg + SignExtend(Offset)
- If BaseReg = zero, this becomes absolute addressing with Offset
- If BaseReg = ret, this becomes return-style addressing
- Conditional jumps check flags in STS register
Common Patterns:
JMP label, zero ; Absolute jump to label address
JMP 0, ret ; Jump to address in ret register
JMP 4, ret ; Jump to (ret + 4)
Encoding Note: In machine code: Offset (Immediate field), BaseReg (SrcReg field) (DestReg unused, set to noreg)
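The jump rules can be sketched as a small Python model. Two assumptions are worth flagging: JGE and JLE are read literally as "either named flag is set", which matches the GreaterThanOrEqual and LessThanOrEqual bits after a CMP, and the target address wraps modulo 2^32. All names here are hypothetical.

```python
EQ, GT, LT = 1 << 0, 1 << 1, 1 << 3  # STS bit positions used below

def sign_extend16(value: int) -> int:
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

CONDITIONS = {
    "JMP": lambda sts: True,
    "JEQ": lambda sts: bool(sts & EQ),
    "JNE": lambda sts: not sts & EQ,
    "JGT": lambda sts: bool(sts & GT),
    "JGE": lambda sts: bool(sts & (GT | EQ)),  # literal reading of the table
    "JLT": lambda sts: bool(sts & LT),
    "JLE": lambda sts: bool(sts & (LT | EQ)),  # literal reading of the table
}

def next_pc(mnemonic: str, sts: int, base: int, offset: int, fallthrough: int) -> int:
    """Return the jump target if the condition holds, else the fall-through address."""
    if CONDITIONS[mnemonic](sts):
        return (base + sign_extend16(offset)) & 0xFFFFFFFF
    return fallthrough

print(hex(next_pc("JMP", 0, 0x1000, 4, 0x2004)))   # JMP 4, ret with ret = 0x1000 -> 0x1004
print(hex(next_pc("JEQ", 0, 0, 0x0040, 0x2004)))   # Equal clear, not taken -> 0x2004
```

Because the offset is sign-extended, using zero as the base reaches only targets whose address fits a signed 16-bit value; larger addresses need a base register.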
Comparison
| Hex | Mnemonic | Type | Operands | Description |
|---|---|---|---|---|
| 0x14 | CMP | R | Reg1, Reg2 | Compare Reg1 with Reg2, set flags in STS |
Flag Setting:
- Equal: Set if Reg1 == Reg2
- GreaterThan: Set if Reg1 > Reg2 (signed)
- GreaterThanOrEqual: Set if Reg1 >= Reg2 (signed)
- LessThan: Set if Reg1 < Reg2 (signed)
- LessThanOrEqual: Set if Reg1 <= Reg2 (signed)
- Zero: Set if (Reg1 - Reg2) == 0 (same as Equal)
Encoding Note: DestReg and ShiftAmt fields unused (set to noreg and 0)
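A minimal Python sketch of the flag computation, treating both operands as signed 32-bit values as listed above. `cmp_flags` and `to_signed32` are hypothetical names; the bit positions follow the STS layout.

```python
def to_signed32(value: int) -> int:
    """Reinterpret a 32-bit pattern as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def cmp_flags(reg1: int, reg2: int) -> int:
    """Return the STS flag bits that CMP Reg1, Reg2 would set."""
    a, b = to_signed32(reg1), to_signed32(reg2)
    flags = 0
    if a == b:
        flags |= (1 << 0) | (1 << 5)  # Equal and Zero
    if a > b:
        flags |= 1 << 1               # GreaterThan
    if a >= b:
        flags |= 1 << 2               # GreaterThanOrEqual
    if a < b:
        flags |= 1 << 3               # LessThan
    if a <= b:
        flags |= 1 << 4               # LessThanOrEqual
    return flags

print(bin(cmp_flags(5, 5)))           # 0b110101
print(bin(cmp_flags(0xFFFFFFFF, 0)))  # -1 < 0 -> 0b11000
```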
Arithmetic Instructions
| Hex | Mnemonic | Type | Operands | Description |
|---|---|---|---|---|
| 0x15 | INC | R | Reg | Increment register by 1 |
| 0x16 | DEC | R | Reg | Decrement register by 1 |
| 0x19 | ADD | R | Src1, Src2, Dest | Dest = Src1 + Src2 |
| 0x1A | SUB | R | Src1, Src2, Dest | Dest = Src1 - Src2 |
| 0x25 | IADD | I | Src, Literal, Dest | Dest = Src + SignExtend(Literal) |
| 0x26 | ISUB | I | Src, Literal, Dest | Dest = Src - SignExtend(Literal) |
Flag Effects:
- Zero flag set if result is zero
- Other flags undefined after arithmetic (use CMP for comparisons)
Encoding Notes:
- INC/DEC: Reg in SrcReg1 field, DestReg set to noreg
- IADD/ISUB: Immediate is signed 16-bit value, all three operands required
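The immediate arithmetic forms can be modelled as follows. This Python sketch assumes results wrap modulo 2^32, which the specification does not state explicitly; `iadd` and `isub` are hypothetical names that return the result together with the Zero flag.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(value: int) -> int:
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def iadd(src: int, literal: int) -> tuple[int, bool]:
    """Dest = Src + SignExtend(Literal); returns (result, zero_flag)."""
    result = (src + sign_extend16(literal)) & MASK32  # wraparound is an assumption
    return result, result == 0

def isub(src: int, literal: int) -> tuple[int, bool]:
    """Dest = Src - SignExtend(Literal); returns (result, zero_flag)."""
    result = (src - sign_extend16(literal)) & MASK32  # wraparound is an assumption
    return result, result == 0

print(iadd(1, 0xFFFF))  # literal is -1 -> (0, True)
print(isub(0, 1))       # (4294967295, False)
```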
Bitwise Logical Operations
| Hex | Mnemonic | Type | Operands | Description |
|---|---|---|---|---|
| 0x1B | AND | R | Src1, Src2, Dest | Dest = Src1 & Src2 (bitwise AND) |
| 0x1C | OR | R | Src1, Src2, Dest | Dest = Src1 | Src2 (bitwise OR) |
| 0x1D | NOT | R | Src, Dest | Dest = ~Src (bitwise NOT) |
| 0x1E | XOR | R | Src1, Src2, Dest | Dest = Src1 ^ Src2 (bitwise XOR) |
| 0x1F | NAND | R | Src1, Src2, Dest | Dest = ~(Src1 & Src2) (bitwise NAND) |
| 0x20 | NOR | R | Src1, Src2, Dest | Dest = ~(Src1 | Src2) (bitwise NOR) |
| 0x21 | XNOR | R | Src1, Src2, Dest | Dest = ~(Src1 ^ Src2) (bitwise XNOR) |
Flag Effects:
- Zero flag set if result is zero
- Other flags undefined
Encoding Note: NOT uses only Src (SrcReg1) and Dest (DestReg); SrcReg2 unused (set to noreg)
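When modelling the inverted operations in a language with unbounded integers, the result has to be masked back to 32 bits. A minimal Python sketch, with hypothetical function names:

```python
MASK32 = 0xFFFFFFFF

def not32(src: int) -> int:
    return ~src & MASK32

def nand32(src1: int, src2: int) -> int:
    return ~(src1 & src2) & MASK32

def nor32(src1: int, src2: int) -> int:
    return ~(src1 | src2) & MASK32

def xnor32(src1: int, src2: int) -> int:
    return ~(src1 ^ src2) & MASK32

print(hex(not32(0)))                         # 0xffffffff
print(hex(nand32(0xFFFFFFFF, 0xFFFFFFFF)))   # 0x0, so the Zero flag would be set
print(hex(xnor32(0xF0F0F0F0, 0x0F0F0F0F)))   # 0x0
```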
Shift Operations
| Hex | Mnemonic | Type | Operands | Description |
|---|---|---|---|---|
| 0x17 | SHL | R | Reg, ShiftAmount | Shift Reg left by ShiftAmount bits, zero-filling from the right |
| 0x18 | SHR | R | Reg, ShiftAmount | Shift Reg right by ShiftAmount bits, zero-filling from the left (logical shift) |
Shift Amount:
- Literal shifts: ShiftAmount is a 5-bit literal (0-31) in assembly
  - Stored in ShiftAmt field of instruction
  - SrcReg2 set to noreg
- Register shifts: ShiftAmount is a register containing shift value
  - Register specified in SrcReg2 field
  - ShiftAmt field must be 0
  - Only low 5 bits of register value used
Note: Current assembler implementation may only support literal shifts. Check assembler documentation.
Flag Effects:
- Zero flag set if result is zero
Encoding Notes:
- Reg in both SrcReg1 and DestReg fields (shifted in place)
- For literal shifts: ShiftAmt field contains shift count, SrcReg2 = noreg
- For register shifts: SrcReg2 contains register, ShiftAmt must be 0
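The two shift encodings and the shift semantics can be sketched in Python as follows. `encode_shift`, `shl`, and `shr` are hypothetical names; the field layout follows the R-type format and the encoding notes above.

```python
NOREG = 0x17
MASK32 = 0xFFFFFFFF

def encode_shift(opcode: int, reg: int, literal=None, amount_reg=None) -> int:
    """Encode SHL (0x17) or SHR (0x18) with a literal or a register shift amount."""
    if (literal is None) == (amount_reg is None):
        raise ValueError("give exactly one of literal or amount_reg")
    if literal is not None:
        if not 0 <= literal <= 31:
            raise ValueError("literal shift amount must be 0-31")
        src2, shamt = NOREG, literal  # literal form: SrcReg2 = noreg
    else:
        src2, shamt = amount_reg, 0   # register form: ShiftAmt must be 0
    # Reg occupies both SrcReg1 and DestReg (shifted in place).
    return (opcode << 26) | (reg << 21) | (src2 << 16) | (reg << 11) | (shamt << 6)

def shl(value: int, amount: int) -> int:
    return (value << (amount & 31)) & MASK32  # only the low 5 bits of amount count

def shr(value: int, amount: int) -> int:
    return (value & MASK32) >> (amount & 31)  # logical shift, zero-fill

print(hex(encode_shift(0x17, 1, literal=4)))     # SHL rg1, 4   -> 0x5c370900
print(hex(encode_shift(0x18, 1, amount_reg=2)))  # SHR rg1, rg2 -> 0x60220800
```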
System and Control Instructions
| Hex | Mnemonic | Type | Operands | Description |
|---|---|---|---|---|
| 0x22 | INT | I | InterruptCode | Trigger interrupt with 8-bit code. Saves return address to ret register and sets bpr to kernel stack |
| 0x23 | IRT | R | - | Return from interrupt. Restores execution context |
| 0x24 | HLT | R | - | Halt processor execution. Stops fetch-decode-execute cycle |
INT Behavior:
- Save current PCX to ret register
- Switch bpr to kernel stack address
- Look up interrupt handler address in interrupt descriptor table (idr)
- Jump to handler at interrupt vector
IRT Behavior:
- Restore previous execution context
- Return to address in ret register
- Restore user stack pointer
Encoding Notes:
- INT: InterruptCode in low 8 bits of Immediate field
- IRT/HLT: All register fields set to noreg, ShiftAmt to 0
Meta Instructions (Assembler/Linker)
These instructions are used by the assembler and linker but may not represent real CPU operations.
| Hex | Mnemonic | Description |
|---|---|---|
| 0x27 | SEGMENT | Segment marker (implementation-specific) |
| 0x3E | DATA | Raw data embedding |
Note: The SEGMENT instruction opcode may vary between implementations (0x27 in assembler, 0x3F in some contexts). Consult your specific toolchain documentation.
Instruction Summary Table
| Opcode | Mnemonic | Type | Category |
|---|---|---|---|
| 0x00 | NOP | R | Control |
| 0x01 | MOV | R | Data Movement |
| 0x02 | MOVS | R | Data Movement |
| 0x03 | LDB | I | Memory Load |
| 0x04 | LDBS | I | Memory Load |
| 0x05 | LDH | I | Memory Load |
| 0x06 | LDHS | I | Memory Load |
| 0x07 | LDW | I | Memory Load |
| 0x08 | STB | I | Memory Store |
| 0x09 | STH | I | Memory Store |
| 0x0A | STW | I | Memory Store |
| 0x0B | LLI | I | Immediate Load |
| 0x0C | LUI | I | Immediate Load |
| 0x0D | JMP | I | Jump |
| 0x0E | JEQ | I | Branch |
| 0x0F | JNE | I | Branch |
| 0x10 | JGT | I | Branch |
| 0x11 | JGE | I | Branch |
| 0x12 | JLT | I | Branch |
| 0x13 | JLE | I | Branch |
| 0x14 | CMP | R | Comparison |
| 0x15 | INC | R | Arithmetic |
| 0x16 | DEC | R | Arithmetic |
| 0x17 | SHL | R | Shift |
| 0x18 | SHR | R | Shift |
| 0x19 | ADD | R | Arithmetic |
| 0x1A | SUB | R | Arithmetic |
| 0x1B | AND | R | Logical |
| 0x1C | OR | R | Logical |
| 0x1D | NOT | R | Logical |
| 0x1E | XOR | R | Logical |
| 0x1F | NAND | R | Logical |
| 0x20 | NOR | R | Logical |
| 0x21 | XNOR | R | Logical |
| 0x22 | INT | I | System |
| 0x23 | IRT | R | System |
| 0x24 | HLT | R | System |
| 0x25 | IADD | I | Arithmetic |
| 0x26 | ISUB | I | Arithmetic |
| 0x27 | SEGMENT | - | Meta |
| 0x3E | DATA | - | Meta |
Exception Conditions
The following conditions trigger exceptions:
| Exception | Trigger Condition |
|---|---|
| Illegal Instruction | Invalid opcode; noreg used as a source or destination; non-zero ShiftAmt in a non-shift instruction; register field violations |
| Protection Fault | Write to pcx register; read or write of idr or mmr in user mode. Writes to the zero register are discarded and do not fault |
| Alignment Fault | LDH/LDHS/STH with an odd address; LDW/STW with an address not divisible by 4 |
| Memory Access Violation | Access to unmapped or protected memory; stack overflow or underflow |
Calling Convention
See the DSA Assembly Language Reference for the complete calling convention and ABI specification.
Notes on Design
- Word Size: All addresses and general computation are 32-bit
- Endianness: Little-endian for instructions and runtime data; assembler data directives may use big-endian
- Stack Growth: Stack grows downward (toward lower addresses) - PUSH decrements SPR
- Alignment: Natural alignment required for halfword and word accesses
- Sign Extension: All immediate values are sign-extended unless noted
- Zero Register: Provides constant zero, writes are legal but discarded
- Reserved Encodings: Opcodes 0x27-0x3D and 0x3F reserved or implementation-specific