DSA Instruction Set Architecture Specification

Overview

The Damn Simple Architecture (DSA) is a 32-bit RISC-style architecture designed for simplicity and educational purposes. This document provides the complete instruction set architecture specification, including all hardware instructions, registers, and encoding formats.

Data Types and Sizes

| Type | Size | Alignment |
|------|------|-----------|
| Byte | 8 bits | 1-byte aligned |
| Halfword | 16 bits | 2-byte aligned |
| Word | 32 bits | 4-byte aligned |

Note on Endianness:

  • Instructions and numeric data in memory: Little-endian
  • Data defined via db/dh/dw directives: Big-endian (assembler-specific)
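
The difference between the two byte orders can be illustrated with a short Python sketch. The helper names `word_in_memory` and `dw_directive_bytes` are hypothetical and exist only for this example; the big-endian behavior of `dw` is assembler-specific, as noted above.

```python
import struct


def word_in_memory(value: int) -> bytes:
    """Bytes of a 32-bit word as stored at runtime (little-endian)."""
    return struct.pack("<I", value & 0xFFFFFFFF)


def dw_directive_bytes(value: int) -> bytes:
    """Bytes a big-endian `dw` directive would emit (assembler-specific)."""
    return struct.pack(">I", value & 0xFFFFFFFF)


print(word_in_memory(0x12345678).hex())      # 78563412
print(dw_directive_bytes(0x12345678).hex())  # 12345678
```

A word emitted by a big-endian `dw` and then read back with LDW would therefore appear byte-swapped unless the program accounts for it.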

Registers

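DSA encodes registers in 5-bit fields, giving 32 register indices. Indices 0x00-0x17 name the 24 programmer-accessible registers; the remaining indices belong to internal system registers or are reserved.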

Programmer-Accessible Registers

| Hex | Register | Type | Description |
|-----|----------|------|-------------|
| 0x00-0x0F | rg0-rgf | General Purpose | 16 general-purpose registers for variables and temporary values |
| 0x10 | acc | Special | Accumulator for calculations and temporary storage. ⚠️ Volatile: used as scratch by pseudo-instructions |
| 0x11 | spr | Special | Stack pointer - points to top of stack |
| 0x12 | bpr | Special | Base pointer - used for stack frame management |
| 0x13 | ret | Special | Return address register - used for function returns |
| 0x14 | idr | Privileged | Interrupt descriptor table address. Read/write triggers protection fault in user mode |
| 0x15 | mmr | Privileged | Hardware memory map table address. Read/write triggers protection fault in user mode |
| 0x16 | zero | Read-only | Constant zero value. Reads always return 0, writes are discarded |
| 0x17 | noreg | Placeholder | Indicates unused register field. Read/write triggers illegal instruction fault. Can also be referenced as null |
| 0x18-0x1C | - | System | Internal system registers, not programmer-accessible |
| 0x1D-0x1F | - | Reserved | Reserved for future use |

System Registers (indices 0x18-0x1C): These exist in the encoding space but are internal to the CPU implementation:

| Hex | Register | Description |
|-----|----------|-------------|
| 0x18 | mar | Memory Address Register (CPU internal) |
| 0x19 | mdr | Memory Data Register (CPU internal) |
| 0x1A | sts | Status Register (CPU internal) |
| 0x1B | cir | Current Instruction Register (CPU internal) |
| 0x1C | pcx | Program Counter (read-only, special access) |

Note on PCX (Program Counter):

  • PCX can be read in certain contexts (e.g., stored during CALL)
  • Writing to PCX triggers a protection fault
  • PCX is automatically updated by jump and branch instructions

Status Register (STS) Layout

The status register is a 32-bit register with the following flag bits:

| Bit | Name | Description | Boot Value |
|-----|------|-------------|------------|
| 0 | Equal | Set if last comparison result was equal | 0 |
| 1 | GreaterThan | Set if last comparison result was greater than | 0 |
| 2 | GreaterThanOrEqual | Set if last comparison was greater than or equal | 0 |
| 3 | LessThan | Set if last comparison result was less than | 0 |
| 4 | LessThanOrEqual | Set if last comparison was less than or equal | 0 |
| 5 | Zero | Set if last arithmetic/logic operation result was zero | 0 |
| 6-31 | - | Reserved | 0 |
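
As a rough illustration of the layout, the following Python sketch turns a raw STS value into the names of the flags that are set. The names `STS_FLAGS` and `decode_sts` are hypothetical helpers for this example, not part of any DSA toolchain.

```python
# Bit position -> flag name, following the STS layout table.
STS_FLAGS = {
    0: "Equal",
    1: "GreaterThan",
    2: "GreaterThanOrEqual",
    3: "LessThan",
    4: "LessThanOrEqual",
    5: "Zero",
}


def decode_sts(sts: int) -> list[str]:
    """Return the names of all flags set in a 32-bit STS value."""
    return [name for bit, name in STS_FLAGS.items() if (sts >> bit) & 1]


print(decode_sts(0))         # [] (boot value: no flags set)
print(decode_sts(0b001010))  # ['GreaterThan', 'LessThan']
```

Bits 6-31 are reserved, so the sketch simply ignores them.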

Instruction Encoding Formats

DSA uses three instruction encoding formats:

R-Type (Register) Instructions

Used for operations with register operands only, including shifts.

 31-26  |  25-21  |  20-16  |  15-11  |   10-6   |  5-0
--------+---------+---------+---------+----------+--------
 Opcode | SrcReg1 | SrcReg2 | DestReg | ShiftAmt | Unused
  • Opcode (6 bits): Instruction operation code
  • SrcReg1 (5 bits): First source register
  • SrcReg2 (5 bits): Second source register
  • DestReg (5 bits): Destination register
  • ShiftAmt (5 bits): Shift amount (for shift instructions only, must be 0 otherwise)
  • Unused (6 bits): Must be 0

Important Rules:

  • ShiftAmt must be 0 for non-shift instructions (else illegal instruction fault)
  • Register fields that an instruction does not use must be set to noreg (0x17)
  • Placing a register in a field the instruction does not expect may cause an illegal instruction fault
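
The R-type layout and rules above can be sketched as a small Python encoder. This is a minimal sketch, not the reference assembler: `encode_r` and `decode_r` are hypothetical names, and defaulting every unused register field to noreg is this example's reading of the rules above.

```python
NOREG = 0x17  # placeholder index for unused register fields


def encode_r(opcode: int, src1: int = NOREG, src2: int = NOREG,
             dest: int = NOREG, shift_amt: int = 0) -> int:
    """Pack R-type fields into a 32-bit word; bits 5-0 stay zero."""
    fields = [("opcode", opcode, 6), ("src1", src1, 5), ("src2", src2, 5),
              ("dest", dest, 5), ("shift_amt", shift_amt, 5)]
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
    return ((opcode << 26) | (src1 << 21) | (src2 << 16)
            | (dest << 11) | (shift_amt << 6))


def decode_r(word: int) -> dict:
    """Split a 32-bit word back into its R-type fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "src1": (word >> 21) & 0x1F,
        "src2": (word >> 16) & 0x1F,
        "dest": (word >> 11) & 0x1F,
        "shift_amt": (word >> 6) & 0x1F,
        "unused": word & 0x3F,
    }


# ADD rg1, rg2, rg3 (opcode 0x19)
print(hex(encode_r(0x19, src1=1, src2=2, dest=3)))  # 0x64221800
```

Keeping the field widths in one list makes the range checks and the bit positions easy to compare against the diagram.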

I-Type (Immediate) Instructions

Used for operations with a 16-bit immediate value.

 31-26  |  25-21  |  20-16  |       15-0
--------+---------+---------+------------------
 Opcode | SrcReg  | DestReg | 16-bit Immediate
  • Opcode (6 bits): Instruction operation code
  • SrcReg (5 bits): Source register (base register for loads and jumps, value register for stores)
  • DestReg (5 bits): Destination register (base register for stores; set to noreg for jumps)
  • Immediate (16 bits): 16-bit immediate value or offset (signed, except for literal loads)

Usage:

  • Arithmetic: Immediate is a signed value
  • Memory access: Immediate is a signed byte offset from base address
  • Branches: Immediate is a signed offset added to base register
  • Literal loads: Immediate is unsigned 16-bit value
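
A minimal Python sketch of I-type packing follows. The function names are hypothetical, and accepting both signed (-32768..32767) and unsigned (0..65535) immediates in one encoder is a simplification made for this example; a real assembler would presumably pick the interpretation per instruction.

```python
def encode_i(opcode: int, src: int, dest: int, imm: int) -> int:
    """Pack I-type fields into a 32-bit word."""
    if not 0 <= opcode < 64:
        raise ValueError("opcode must fit in 6 bits")
    if not (0 <= src < 32 and 0 <= dest < 32):
        raise ValueError("register index must fit in 5 bits")
    if not -0x8000 <= imm <= 0xFFFF:
        raise ValueError("immediate must fit in 16 bits")
    return (opcode << 26) | (src << 21) | (dest << 16) | (imm & 0xFFFF)


def sign_extend16(word: int) -> int:
    """Interpret the low 16 bits of a word as a signed offset."""
    field = word & 0xFFFF
    return field - 0x10000 if field & 0x8000 else field


# LDW with BaseReg = spr (0x11), DestReg = rg0 (0x00), Offset = -4
word = encode_i(0x07, src=0x11, dest=0x00, imm=-4)
print(hex(word))            # 0x1e20fffc
print(sign_extend16(word))  # -4
```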

J-Type (Jump) Instructions

Used for absolute jumps with large address ranges.

 31-26  |         25-0
--------+----------------------
 Opcode |   26-bit Address
  • Opcode (6 bits): Jump instruction code
  • Address (26 bits): Partial address for jump

Address Calculation:

  1. Left-shift the 26-bit address by 2 (word alignment)
  2. OR with upper 4 bits of current PCX
  3. Result is final 32-bit jump address

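Jump Range: Any word-aligned address within the 256MB-aligned region that contains the current PC, that is, the region selected by the upper 4 bits of PCX. The range is not centered on the PC.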

Note: J-type instructions are defined but currently unused. Use I-type JMP with register addressing for all jumps.
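
For reference, the three-step address calculation can be written as a one-line Python function. `j_target` is a hypothetical name used only here; since J-type instructions are currently unused, this is purely illustrative.

```python
def j_target(pcx: int, addr26: int) -> int:
    """Combine the upper 4 bits of PCX with the 26-bit field shifted left by 2."""
    return (pcx & 0xF0000000) | ((addr26 & 0x03FFFFFF) << 2)


print(hex(j_target(0x10000040, 0x100)))  # 0x10000400
```

Because the upper 4 bits always come from PCX, the target cannot leave the 256MB-aligned block that the current instruction is in.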

Hardware Instructions

Data Movement

| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x00 | NOP | R | - | No operation - does nothing |
| 0x01 | MOV | R | SrcReg, DestReg | Copy value from SrcReg to DestReg |
| 0x02 | MOVS | R | SrcReg, DestReg | Copy with sign extension to fill 32 bits |

MOV/MOVS Details:

  • MOV performs direct copy (all 32 bits)
  • MOVS sign-extends the value (useful after byte/halfword loads)
  • Both instructions set the Zero flag if result is zero

Memory Access - Load Instructions

All loads require proper alignment or trigger an alignment fault.

| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x03 | LDB | I | BaseReg, DestReg, Offset | Load byte (8-bit), zero-extend to 32 bits |
| 0x04 | LDBS | I | BaseReg, DestReg, Offset | Load byte (8-bit), sign-extend to 32 bits |
| 0x05 | LDH | I | BaseReg, DestReg, Offset | Load halfword (16-bit), zero-extend to 32 bits |
| 0x06 | LDHS | I | BaseReg, DestReg, Offset | Load halfword (16-bit), sign-extend to 32 bits |
| 0x07 | LDW | I | BaseReg, DestReg, Offset | Load word (32-bit) |

Load Operation:

  • Effective address = BaseReg + SignExtend(Offset)
  • Offset is a signed 16-bit value
  • Alignment requirements:
    • LDB/LDBS: No alignment required (byte-aligned)
    • LDH/LDHS: Must be 2-byte aligned
    • LDW: Must be 4-byte aligned

Encoding Note: In machine code, the order is: BaseReg (SrcReg field), DestReg field, Offset (Immediate field)
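
The load rules can be modeled in a few lines of Python. This is a sketch under stated assumptions: `load`, `LOADS`, and `AlignmentFault` are hypothetical names, memory is a plain `bytes` object, and effective addresses are assumed to wrap at 32 bits, which the specification does not state.

```python
# Access size in bytes and whether the result is sign-extended.
LOADS = {
    "LDB": (1, False), "LDBS": (1, True),
    "LDH": (2, False), "LDHS": (2, True),
    "LDW": (4, False),
}


class AlignmentFault(Exception):
    """Hypothetical stand-in for the hardware alignment fault."""


def load(mem: bytes, mnemonic: str, base: int, offset: int) -> int:
    """Return the 32-bit register value produced by a load."""
    size, signed = LOADS[mnemonic]
    if not -0x8000 <= offset <= 0x7FFF:
        raise ValueError("offset must fit in a signed 16-bit field")
    ea = (base + offset) & 0xFFFFFFFF  # assumption: wrap at 32 bits
    if ea % size:
        raise AlignmentFault(f"{mnemonic} at {ea:#x}")
    if ea + size > len(mem):
        raise IndexError("address outside the modeled memory")
    # Runtime data is little-endian; mask to keep a 32-bit register image.
    return int.from_bytes(mem[ea:ea + size], "little", signed=signed) & 0xFFFFFFFF


mem = bytes([0x80, 0xFF, 0x34, 0x12])
print(hex(load(mem, "LDB", 0, 0)))   # 0x80
print(hex(load(mem, "LDBS", 0, 0)))  # 0xffffff80
print(hex(load(mem, "LDW", 4, -4)))  # 0x1234ff80
```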

Memory Access - Store Instructions

All stores require proper alignment or trigger an alignment fault.

| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x08 | STB | I | SrcReg, BaseReg, Offset | Store byte (8-bit) to memory |
| 0x09 | STH | I | SrcReg, BaseReg, Offset | Store halfword (16-bit) to memory |
| 0x0A | STW | I | SrcReg, BaseReg, Offset | Store word (32-bit) to memory |

Store Operation:

  • Effective address = BaseReg + SignExtend(Offset)
  • Offset is a signed 16-bit value
  • Only the relevant bits are stored (8, 16, or 32)
  • Alignment requirements:
    • STB: No alignment required (byte-aligned)
    • STH: Must be 2-byte aligned
    • STW: Must be 4-byte aligned

Encoding Note: In machine code: SrcReg (SrcReg field), BaseReg (DestReg field), Offset (Immediate field)
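
A matching Python sketch for stores shows that only the low 8, 16, or 32 bits of the source register reach memory, in little-endian order. The names `store` and `STORES` are hypothetical, memory is modeled as a `bytearray`, and a plain `ValueError` stands in for the alignment fault.

```python
STORES = {"STB": 1, "STH": 2, "STW": 4}  # access size in bytes


def store(mem: bytearray, mnemonic: str, value: int, base: int, offset: int) -> None:
    """Write the low bits of `value` to mem at base + offset."""
    size = STORES[mnemonic]
    ea = (base + offset) & 0xFFFFFFFF  # assumption: wrap at 32 bits
    if ea % size:
        raise ValueError(f"alignment fault: {mnemonic} at {ea:#x}")
    if ea + size > len(mem):
        raise IndexError("address outside the modeled memory")
    low_bits = value & ((1 << (8 * size)) - 1)  # keep only the relevant bits
    mem[ea:ea + size] = low_bits.to_bytes(size, "little")


mem = bytearray(8)
store(mem, "STW", 0x12345678, base=0, offset=4)
store(mem, "STB", 0xABCD, base=0, offset=0)
print(mem.hex())  # cd00000078563412
```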

Immediate Load Instructions

| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x0B | LLI | I | Value, DestReg | Load 16-bit value into lower 16 bits. ⚠️ CLEARS upper 16 bits! |
| 0x0C | LUI | I | Value, DestReg | Load 16-bit value into upper 16 bits. Lower 16 bits unchanged |

Usage for 32-bit Values:

LLI 0x1234, rg0    ; rg0 = 0x00001234
LUI 0xABCD, rg0    ; rg0 = 0xABCD1234

⚠️ CRITICAL: Always execute LLI before LUI, as LLI clears the upper 16 bits!

Note on LUI: The assembler may shift the immediate value right by 16 bits when encoding, so specify the upper 16 bits directly (e.g., LUI 0xABCD, rg0 not LUI 0xABCD0000, rg0).

Encoding Note: In machine code: Value (Immediate field), DestReg (SrcReg field for both LLI and LUI)
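
The ordering requirement is easy to see when the two instructions are modeled as pure functions on a register value. The Python below is an illustrative sketch; `lli`, `lui`, and `split_constant` are hypothetical helper names.

```python
def lli(value: int) -> int:
    """LLI: low 16 bits = value, upper 16 bits cleared."""
    return value & 0xFFFF


def lui(reg: int, value: int) -> int:
    """LUI: upper 16 bits = value, lower 16 bits of reg preserved."""
    return ((value & 0xFFFF) << 16) | (reg & 0xFFFF)


def split_constant(constant: int) -> tuple[int, int]:
    """Return the (LLI operand, LUI operand) pair for a 32-bit constant."""
    return constant & 0xFFFF, (constant >> 16) & 0xFFFF


low, high = split_constant(0xABCD1234)
print(hex(lui(lli(low), high)))  # 0xabcd1234 (LLI first, then LUI)

wrong = lui(0, high)             # LUI first ...
wrong = lli(low)                 # ... then LLI wipes the upper half
print(hex(wrong))                # 0x1234
```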

Jump and Branch Instructions

| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x0D | JMP | I | Offset, BaseReg | Unconditional jump to (BaseReg + Offset) |
| 0x0E | JEQ | I | Offset, BaseReg | Jump if Equal flag set |
| 0x0F | JNE | I | Offset, BaseReg | Jump if Equal flag NOT set |
| 0x10 | JGT | I | Offset, BaseReg | Jump if GreaterThan flag set |
| 0x11 | JGE | I | Offset, BaseReg | Jump if GreaterThan OR Equal flag set |
| 0x12 | JLT | I | Offset, BaseReg | Jump if LessThan flag set |
| 0x13 | JLE | I | Offset, BaseReg | Jump if LessThan OR Equal flag set |

Jump Calculation:

  • Target address = BaseReg + SignExtend(Offset)
  • If BaseReg = zero, this becomes absolute addressing with Offset
  • If BaseReg = ret, this becomes return-style addressing
  • Conditional jumps check flags in STS register

Common Patterns:

JMP label, zero     ; Absolute jump to label address
JMP 0, ret          ; Jump to address in ret register
JMP 4, ret          ; Jump to (ret + 4)

Encoding Note: In machine code: Offset (Immediate field), BaseReg (SrcReg field) (DestReg unused, set to noreg)
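
The jump rules can be sketched as a function that returns the next PCX value. This is a rough model only: `next_pcx` and `CONDITIONS` are hypothetical names, and falling through to PCX + 4 assumes each instruction occupies one 32-bit word and that PCX holds the address of the jump itself.

```python
EQ, GT, LT = 1 << 0, 1 << 1, 1 << 3  # STS bits used by the jump table

# mnemonic -> (flag mask, True if any masked flag must be set to jump)
CONDITIONS = {
    "JMP": None,
    "JEQ": (EQ, True),
    "JNE": (EQ, False),
    "JGT": (GT, True),
    "JGE": (GT | EQ, True),
    "JLT": (LT, True),
    "JLE": (LT | EQ, True),
}


def next_pcx(mnemonic: str, sts: int, base: int, offset: int, pcx: int) -> int:
    """Return the jump target if taken, else the fall-through address."""
    cond = CONDITIONS[mnemonic]
    taken = cond is None or (bool(sts & cond[0]) == cond[1])
    if taken:
        return (base + offset) & 0xFFFFFFFF
    return (pcx + 4) & 0xFFFFFFFF  # assumption: next 4-byte instruction


print(hex(next_pcx("JMP", 0, base=0x2000, offset=4, pcx=0x40)))   # 0x2004
print(hex(next_pcx("JEQ", EQ, base=0, offset=0x100, pcx=0x40)))   # 0x100
print(hex(next_pcx("JNE", EQ, base=0, offset=0x100, pcx=0x40)))   # 0x44
```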

Comparison

| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x14 | CMP | R | Reg1, Reg2 | Compare Reg1 with Reg2, set flags in STS |

Flag Setting:

  • Equal: Set if Reg1 == Reg2
  • GreaterThan: Set if Reg1 > Reg2 (signed)
  • GreaterThanOrEqual: Set if Reg1 >= Reg2 (signed)
  • LessThan: Set if Reg1 < Reg2 (signed)
  • LessThanOrEqual: Set if Reg1 <= Reg2 (signed)
  • Zero: Set if (Reg1 - Reg2) == 0 (same as Equal)

Encoding Note: DestReg and ShiftAmt fields unused (set to noreg and 0)
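
A short Python sketch of the flag computation, using the STS bit positions defined earlier. `cmp_flags` and `to_signed32` are hypothetical names; the sketch treats both operands as signed 32-bit values, as the flag list specifies.

```python
def to_signed32(value: int) -> int:
    """Reinterpret a 32-bit register image as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def cmp_flags(reg1: int, reg2: int) -> int:
    """Return the STS bits (0-5) that CMP Reg1, Reg2 would set."""
    a, b = to_signed32(reg1), to_signed32(reg2)
    # Index in this list == bit position in STS.
    conditions = [a == b, a > b, a >= b, a < b, a <= b, a == b]
    return sum(1 << bit for bit, is_set in enumerate(conditions) if is_set)


print(bin(cmp_flags(5, 5)))           # 0b110101
print(bin(cmp_flags(7, 3)))           # 0b110
print(bin(cmp_flags(0xFFFFFFFF, 1)))  # 0b11000 (-1 < 1 when signed)
```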

Arithmetic Instructions

| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x15 | INC | R | Reg | Increment register by 1 |
| 0x16 | DEC | R | Reg | Decrement register by 1 |
| 0x19 | ADD | R | Src1, Src2, Dest | Dest = Src1 + Src2 |
| 0x1A | SUB | R | Src1, Src2, Dest | Dest = Src1 - Src2 |
| 0x25 | IADD | I | Src, Literal, Dest | Dest = Src + SignExtend(Literal) |
| 0x26 | ISUB | I | Src, Literal, Dest | Dest = Src - SignExtend(Literal) |

Flag Effects:

  • Zero flag set if result is zero
  • Other flags undefined after arithmetic (use CMP for comparisons)

Encoding Notes:

  • INC/DEC: Reg in SrcReg1 field, DestReg set to noreg
  • IADD/ISUB: Immediate is signed 16-bit value, all three operands required

Bitwise Logical Operations

| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x1B | AND | R | Src1, Src2, Dest | Dest = Src1 & Src2 (bitwise AND) |
| 0x1C | OR | R | Src1, Src2, Dest | Dest = Src1 \| Src2 (bitwise OR) |
| 0x1D | NOT | R | Src, Dest | Dest = ~Src (bitwise NOT) |
| 0x1E | XOR | R | Src1, Src2, Dest | Dest = Src1 ^ Src2 (bitwise XOR) |
| 0x1F | NAND | R | Src1, Src2, Dest | Dest = ~(Src1 & Src2) (bitwise NAND) |
| 0x20 | NOR | R | Src1, Src2, Dest | Dest = ~(Src1 \| Src2) (bitwise NOR) |
| 0x21 | XNOR | R | Src1, Src2, Dest | Dest = ~(Src1 ^ Src2) (bitwise XNOR) |

Flag Effects:

  • Zero flag set if result is zero
  • Other flags undefined

Encoding Note: NOT uses only Src (SrcReg1) and Dest (DestReg); SrcReg2 unused (set to noreg)

Shift Operations

| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x17 | SHL | R | Reg, ShiftAmount | Shift Reg left by ShiftAmount bits. Zero-fill from right |
| 0x18 | SHR | R | Reg, ShiftAmount | Shift Reg right by ShiftAmount bits. Zero-fill from left (logical shift) |

Shift Amount:

  • Literal shifts: ShiftAmount is a 5-bit literal (0-31) in assembly
    • Stored in ShiftAmt field of instruction
    • SrcReg2 set to noreg
  • Register shifts: ShiftAmount is a register containing shift value
    • Register specified in SrcReg2 field
    • ShiftAmt field must be 0
    • Only low 5 bits of register value used

Note: Current assembler implementation may only support literal shifts. Check assembler documentation.

Flag Effects:

  • Zero flag set if result is zero

Encoding Notes:

  • Reg in both SrcReg1 and DestReg fields (shifted in place)
  • For literal shifts: ShiftAmt field contains shift count, SrcReg2 = noreg
  • For register shifts: SrcReg2 contains register, ShiftAmt must be 0

System and Control Instructions

| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x22 | INT | I | InterruptCode | Trigger interrupt with 8-bit code. Saves return address to ret register. Sets bpr to kernel stack |
| 0x23 | IRT | R | - | Return from interrupt. Restores execution context |
| 0x24 | HLT | R | - | Halt processor execution. Stops fetch-decode-execute cycle |

INT Behavior:

  1. Save current PCX to ret register
  2. Switch bpr to kernel stack address
  3. Look up interrupt handler address in interrupt descriptor table (idr)
  4. Jump to handler at interrupt vector

IRT Behavior:

  1. Restore previous execution context
  2. Return to address in ret register
  3. Restore user stack pointer

Encoding Notes:

  • INT: InterruptCode in low 8 bits of Immediate field
  • IRT/HLT: All register fields set to noreg, ShiftAmt to 0

Meta Instructions (Assembler/Linker)

These instructions are used by the assembler and linker but may not represent real CPU operations.

| Hex | Mnemonic | Description |
|-----|----------|-------------|
| 0x27 | SEGMENT | Segment marker (implementation-specific) |
| 0x3E | DATA | Raw data embedding |

Note: The SEGMENT instruction opcode may vary between implementations (0x27 in assembler, 0x3F in some contexts). Consult your specific toolchain documentation.

Instruction Summary Table

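| Opcode | Mnemonic | Type | Category |
|--------|----------|------|----------|
| 0x00 | NOP | R | Control |
| 0x01 | MOV | R | Data Movement |
| 0x02 | MOVS | R | Data Movement |
| 0x03 | LDB | I | Memory Load |
| 0x04 | LDBS | I | Memory Load |
| 0x05 | LDH | I | Memory Load |
| 0x06 | LDHS | I | Memory Load |
| 0x07 | LDW | I | Memory Load |
| 0x08 | STB | I | Memory Store |
| 0x09 | STH | I | Memory Store |
| 0x0A | STW | I | Memory Store |
| 0x0B | LLI | I | Immediate Load |
| 0x0C | LUI | I | Immediate Load |
| 0x0D | JMP | I | Jump |
| 0x0E | JEQ | I | Branch |
| 0x0F | JNE | I | Branch |
| 0x10 | JGT | I | Branch |
| 0x11 | JGE | I | Branch |
| 0x12 | JLT | I | Branch |
| 0x13 | JLE | I | Branch |
| 0x14 | CMP | R | Comparison |
| 0x15 | INC | R | Arithmetic |
| 0x16 | DEC | R | Arithmetic |
| 0x17 | SHL | R | Shift |
| 0x18 | SHR | R | Shift |
| 0x19 | ADD | R | Arithmetic |
| 0x1A | SUB | R | Arithmetic |
| 0x1B | AND | R | Logical |
| 0x1C | OR | R | Logical |
| 0x1D | NOT | R | Logical |
| 0x1E | XOR | R | Logical |
| 0x1F | NAND | R | Logical |
| 0x20 | NOR | R | Logical |
| 0x21 | XNOR | R | Logical |
| 0x22 | INT | I | System |
| 0x23 | IRT | R | System |
| 0x24 | HLT | R | System |
| 0x25 | IADD | I | Arithmetic |
| 0x26 | ISUB | I | Arithmetic |
| 0x27 | SEGMENT | - | Meta |
| 0x3E | DATA | - | Meta |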

Exception Conditions

The following conditions trigger exceptions:

| Exception | Trigger Condition |
|-----------|-------------------|
| Illegal Instruction | Invalid opcode; noreg read or written as a source/destination; ShiftAmt non-zero for non-shift instruction; register field violations |
| Protection Fault | Write to pcx register; read/write of idr or mmr in user mode |
| Alignment Fault | LDH/LDHS/STH with odd address; LDW/STW with address not divisible by 4 |
| Memory Access Violation | Access to unmapped or protected memory; stack overflow/underflow |

Note: Writes to the zero register are discarded and do not raise a fault.

Calling Convention

See the DSA Assembly Language Reference for the complete calling convention and ABI specification.

Notes on Design

  1. Word Size: All addresses and general computation are 32-bit
  2. Endianness: Little-endian for instructions and runtime data; assembler data directives may use big-endian
  3. Stack Growth: Stack grows downward (toward lower addresses) - PUSH decrements SPR
  4. Alignment: Natural alignment required for halfword and word accesses
  5. Sign Extension: All immediate values are sign-extended unless noted
  6. Zero Register: Provides constant zero, writes are legal but discarded
  7. Reserved Encodings: Opcodes 0x28-0x3D are reserved; 0x27 and 0x3F are implementation-specific (SEGMENT marker, depending on toolchain)