# DSA Instruction Set Architecture Specification
## Overview
The Damn Simple Architecture (DSA) is a 32-bit RISC-style architecture designed for simplicity and educational purposes. This document provides the complete instruction set architecture specification, including all hardware instructions, registers, and encoding formats.
## Data Types and Sizes
| Type | Size | Alignment |
|------|------|-----------|
| Byte | 8 bits | 1-byte aligned |
| Halfword | 16 bits | 2-byte aligned |
| Word | 32 bits | 4-byte aligned |
**Note on Endianness:**
- Instructions and numeric data in memory: Little-endian
- Data defined via `db/dh/dw` directives: Big-endian (assembler-specific)
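As a rough illustration of the byte order described above, the following Python sketch lays out a 32-bit word in memory order. The helper name `word_to_memory_bytes` is hypothetical and not part of any DSA tool. The big-endian call only mirrors what the note says about `db/dh/dw` and is an assumption about how an assembler might emit such data.

```python
import struct

def word_to_memory_bytes(value: int, little_endian: bool = True) -> bytes:
    """Return the 4 bytes of a 32-bit word in increasing address order."""
    fmt = "<I" if little_endian else ">I"
    return struct.pack(fmt, value & 0xFFFFFFFF)

# Instructions and runtime data: least significant byte at the lowest address.
print(word_to_memory_bytes(0x12345678).hex())                       # 78563412
# Big-endian layout, as described for assembler data directives.
print(word_to_memory_bytes(0x12345678, little_endian=False).hex())  # 12345678
```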
## Registers
DSA register fields are 5 bits wide, giving 32 register encodings: 24 programmer-accessible registers (0x00-0x17) plus internal system registers and reserved slots (0x18-0x1F).
### Programmer-Accessible Registers
| Hex | Register | Type | Description |
|-----|----------|------|-------------|
| 0x00-0x0F | **rg0-rgf** | General Purpose | 16 general-purpose registers for variables and temporary values |
| 0x10 | **acc** | Special | Accumulator for calculations and temporary storage<br/>⚠️ Used as scratch by pseudo-instructions - volatile |
| 0x11 | **spr** | Special | Stack pointer - points to top of stack |
| 0x12 | **bpr** | Special | Base pointer - used for stack frame management |
| 0x13 | **ret** | Special | Return address register - used for function returns |
| 0x14 | **idr** | Privileged | Interrupt descriptor table address<br/>Read/write triggers protection fault in user mode |
| 0x15 | **mmr** | Privileged | Hardware memory map table address<br/>Read/write triggers protection fault in user mode |
| 0x16 | **zero** | Read-only | Constant zero value<br/>Reads always return 0, writes are discarded |
| 0x17 | **noreg** | Placeholder | Indicates unused register field<br/>Read/write triggers illegal instruction fault<br/>Can also be referenced as **null** |
| 0x18-0x1F | - | Reserved | Not programmer-accessible: 0x18-0x1C are internal system registers (see below), 0x1D-0x1F are reserved for future use |
**System Registers (indices 0x18-0x1C):**
These exist in the encoding space but are internal to the CPU implementation:
| Hex | Register | Description |
|-----|----------|-------------|
| 0x18 | **mar** | Memory Address Register (CPU internal) |
| 0x19 | **mdr** | Memory Data Register (CPU internal) |
| 0x1A | **sts** | Status Register (CPU internal) |
| 0x1B | **cir** | Current Instruction Register (CPU internal) |
| 0x1C | **pcx** | Program Counter (read-only, special access) |
**Note on PCX (Program Counter):**
- PCX can be read in certain contexts (e.g., stored during CALL)
- Writing to PCX triggers a protection fault
- PCX is automatically updated by jump and branch instructions
### Status Register (STS) Layout
The status register is a 32-bit register with the following flag bits:
| Bit | Name | Description | Boot Value |
|-----|------|-------------|------------|
| 0 | **Equal** | Set if last comparison result was equal | 0 |
| 1 | **GreaterThan** | Set if last comparison result was greater than | 0 |
| 2 | **GreaterThanOrEqual** | Set if last comparison was greater than or equal | 0 |
| 3 | **LessThan** | Set if last comparison result was less than | 0 |
| 4 | **LessThanOrEqual** | Set if last comparison was less than or equal | 0 |
| 5 | **Zero** | Set if last arithmetic/logic operation result was zero | 0 |
| 6-31 | - | Reserved | 0 |
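A small Python sketch of how a debugger or emulator might turn an STS value into flag names, using the bit positions from the table above. The names `STS_FLAGS` and `decode_sts` are illustrative assumptions, not part of the specification.

```python
# Bit positions taken from the STS layout table; bits 6-31 are reserved.
STS_FLAGS = {
    0: "Equal",
    1: "GreaterThan",
    2: "GreaterThanOrEqual",
    3: "LessThan",
    4: "LessThanOrEqual",
    5: "Zero",
}

def decode_sts(sts: int) -> list:
    """Return the names of all flags set in a 32-bit STS value."""
    return [name for bit, name in STS_FLAGS.items() if (sts >> bit) & 1]

# STS after comparing two equal values: bits 0, 2, 4 and 5 are set.
print(decode_sts(0b110101))
```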
## Instruction Encoding Formats
DSA uses three instruction encoding formats:
### R-Type (Register) Instructions
Used for operations with register operands only, including shifts.
```
 31-26  |  25-21  |  20-16  |  15-11  |   10-6   |  5-0
--------+---------+---------+---------+----------+--------
 Opcode | SrcReg1 | SrcReg2 | DestReg | ShiftAmt | Unused
```
- **Opcode** (6 bits): Instruction operation code
- **SrcReg1** (5 bits): First source register
- **SrcReg2** (5 bits): Second source register
- **DestReg** (5 bits): Destination register
- **ShiftAmt** (5 bits): Shift amount (for shift instructions only, must be 0 otherwise)
- **Unused** (6 bits): Must be 0
**Important Rules:**
- ShiftAmt must be 0 for non-shift instructions (else illegal instruction fault)
- Unused register fields must be set to `noreg` (0x17) if not used
- Using registers in unexpected positions may cause illegal instruction fault
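The field layout and rules above can be sketched as a Python encoder. This is a minimal illustration, not the reference assembler: the function name `encode_r` and its defaults are assumptions, chosen so that any register field left unspecified falls back to `noreg` (0x17) and ShiftAmt to 0.

```python
NOREG = 0x17  # placeholder value for unused register fields

def encode_r(opcode: int, src1: int = NOREG, src2: int = NOREG,
             dest: int = NOREG, shift_amt: int = 0) -> int:
    """Pack R-type fields into a 32-bit instruction word."""
    if not 0 <= opcode < 64:
        raise ValueError("opcode must fit in 6 bits")
    for reg in (src1, src2, dest):
        if not 0 <= reg < 32:
            raise ValueError("register index must fit in 5 bits")
    if not 0 <= shift_amt < 32:
        raise ValueError("shift amount must fit in 5 bits")
    # Bits 5-0 are unused and stay 0.
    return (opcode << 26) | (src1 << 21) | (src2 << 16) | (dest << 11) | (shift_amt << 6)

print(hex(encode_r(0x19, 0x00, 0x01, 0x02)))  # ADD rg0, rg1, rg2 -> 0x64011000
print(hex(encode_r(0x24)))                    # HLT, all fields noreg -> 0x92f7b800
```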
### I-Type (Immediate) Instructions
Used for operations with a 16-bit immediate value.
```
 31-26  |  25-21  |  20-16  |       15-0
--------+---------+---------+------------------
 Opcode | SrcReg  | DestReg | 16-bit Immediate
```
- **Opcode** (6 bits): Instruction operation code
- **SrcReg** (5 bits): Source register (base register for loads and jumps, value register for stores)
- **DestReg** (5 bits): Destination register (base register for stores; unused and set to `noreg` for jumps)
- **Immediate** (16 bits): 16-bit immediate value or offset, signed except for literal loads
**Usage:**
- Arithmetic: Immediate is a signed value
- Memory access: Immediate is a signed byte offset from base address
- Branches: Immediate is a signed offset added to base register
- Literal loads: Immediate is unsigned 16-bit value
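A hedged Python sketch of I-type packing follows. The name `encode_i` is hypothetical. It accepts either a signed immediate (-32768 to 32767) or an unsigned one (up to 0xFFFF) and stores the low 16 bits, which matches the signed and unsigned uses listed above.

```python
def encode_i(opcode: int, src: int, dest: int, imm: int) -> int:
    """Pack I-type fields into a 32-bit instruction word."""
    if not 0 <= opcode < 64:
        raise ValueError("opcode must fit in 6 bits")
    if not (0 <= src < 32 and 0 <= dest < 32):
        raise ValueError("register index must fit in 5 bits")
    if not -0x8000 <= imm <= 0xFFFF:
        raise ValueError("immediate must fit in 16 bits")
    # Negative immediates are stored in two's complement form.
    return (opcode << 26) | (src << 21) | (dest << 16) | (imm & 0xFFFF)

BPR, RG0 = 0x12, 0x00
print(hex(encode_i(0x07, BPR, RG0, -4)))  # LDW bpr, rg0, -4 -> 0x1e40fffc
```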
### J-Type (Jump) Instructions
Used for absolute jumps with large address ranges.
```
 31-26  |      25-0
--------+----------------
 Opcode | 26-bit Address
```
- **Opcode** (6 bits): Jump instruction code
- **Address** (26 bits): Partial address for jump
**Address Calculation:**
1. Left-shift the 26-bit address by 2 (word alignment)
2. OR with upper 4 bits of current PCX
3. Result is final 32-bit jump address
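**Jump Range:** Any word-aligned address inside the 256MB-aligned region that contains the current PC. Because the upper 4 bits come from PCX, the range is not ±128MB around the PC.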
**Note:** J-type instructions are defined but currently unused. Use I-type JMP with register addressing for all jumps.
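The address calculation steps can be written out as a short Python sketch. The function name `j_target` is an assumption for illustration. It shows why the target always stays inside the 256MB region selected by the upper 4 bits of PCX.

```python
def j_target(pcx: int, address26: int) -> int:
    """Compute a J-type target from the current PCX and the 26-bit field."""
    offset28 = (address26 & 0x03FFFFFF) << 2  # step 1: shift left by 2
    region = pcx & 0xF0000000                 # step 2: upper 4 bits of PCX
    return region | offset28                  # step 3: final 32-bit address

print(hex(j_target(0x10000040, 0x0000100)))  # 0x10000400
print(hex(j_target(0x1FFFFFFC, 0x3FFFFFF)))  # 0x1ffffffc, top of the same region
```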
## Hardware Instructions
### Data Movement
| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x00 | **NOP** | R | - | No operation - does nothing |
| 0x01 | **MOV** | R | SrcReg, DestReg | Copy value from SrcReg to DestReg |
| 0x02 | **MOVS** | R | SrcReg, DestReg | Copy with sign extension to fill 32 bits |
**MOV/MOVS Details:**
- MOV performs direct copy (all 32 bits)
- MOVS sign-extends the value (useful after byte/halfword loads)
- Both instructions set the Zero flag if result is zero
### Memory Access - Load Instructions
Halfword and word loads must be naturally aligned; a misaligned access triggers an alignment fault. Byte loads have no alignment requirement.
| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x03 | **LDB** | I | BaseReg, DestReg, Offset | Load byte (8-bit), zero-extend to 32 bits |
| 0x04 | **LDBS** | I | BaseReg, DestReg, Offset | Load byte (8-bit), sign-extend to 32 bits |
| 0x05 | **LDH** | I | BaseReg, DestReg, Offset | Load halfword (16-bit), zero-extend to 32 bits |
| 0x06 | **LDHS** | I | BaseReg, DestReg, Offset | Load halfword (16-bit), sign-extend to 32 bits |
| 0x07 | **LDW** | I | BaseReg, DestReg, Offset | Load word (32-bit) |
**Load Operation:**
- Effective address = BaseReg + SignExtend(Offset)
- Offset is a signed 16-bit value
- Alignment requirements:
  - LDB/LDBS: No alignment required (byte-aligned)
  - LDH/LDHS: Must be 2-byte aligned
  - LDW: Must be 4-byte aligned
**Encoding Note:**
In machine code, the order is: BaseReg (SrcReg field), DestReg field, Offset (Immediate field)
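The effective address and alignment rules for loads can be sketched in Python as below. `AlignmentFault`, `ACCESS_SIZE` and `load_address` are hypothetical names, and wrapping the address to 32 bits is an assumption, since the specification does not say what happens on address overflow.

```python
class AlignmentFault(Exception):
    """Hypothetical stand-in for the hardware alignment fault."""

# Access width in bytes for each load mnemonic.
ACCESS_SIZE = {"LDB": 1, "LDBS": 1, "LDH": 2, "LDHS": 2, "LDW": 4}

def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def load_address(mnemonic: str, base: int, offset: int) -> int:
    """Return BaseReg + SignExtend(Offset), or raise on misalignment."""
    address = (base + sign_extend16(offset)) & 0xFFFFFFFF  # 32-bit wrap is assumed
    if address % ACCESS_SIZE[mnemonic] != 0:
        raise AlignmentFault(f"{mnemonic} at {address:#010x}")
    return address

print(hex(load_address("LDW", 0x1000, 0xFFFC)))  # offset is -4 -> 0xffc
```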
### Memory Access - Store Instructions
Halfword and word stores must be naturally aligned; a misaligned access triggers an alignment fault. Byte stores have no alignment requirement.
| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x08 | **STB** | I | SrcReg, BaseReg, Offset | Store byte (8-bit) to memory |
| 0x09 | **STH** | I | SrcReg, BaseReg, Offset | Store halfword (16-bit) to memory |
| 0x0A | **STW** | I | SrcReg, BaseReg, Offset | Store word (32-bit) to memory |
**Store Operation:**
- Effective address = BaseReg + SignExtend(Offset)
- Offset is a signed 16-bit value
- Only the relevant bits are stored (8, 16, or 32)
- Alignment requirements:
  - STB: No alignment required (byte-aligned)
  - STH: Must be 2-byte aligned
  - STW: Must be 4-byte aligned
**Encoding Note:**
In machine code: SrcReg (SrcReg field), BaseReg (DestReg field), Offset (Immediate field)
### Immediate Load Instructions
| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x0B | **LLI** | I | Value, DestReg | Load 16-bit value into lower 16 bits<br/>⚠️ **CLEARS upper 16 bits!** |
| 0x0C | **LUI** | I | Value, DestReg | Load 16-bit value into upper 16 bits<br/>Lower 16 bits unchanged |
**Usage for 32-bit Values:**
```
LLI 0x1234, rg0 ; rg0 = 0x00001234
LUI 0xABCD, rg0 ; rg0 = 0xABCD1234
```
**⚠️ CRITICAL:** Always execute LLI before LUI, as LLI clears the upper 16 bits!
**Note on LUI:** The assembler may shift the immediate value right by 16 bits when encoding, so specify the upper 16 bits directly (e.g., `LUI 0xABCD, rg0` not `LUI 0xABCD0000, rg0`).
**Encoding Note:**
In machine code: Value (Immediate field), DestReg (encoded in the SrcReg field for both LLI and LUI)
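To make the ordering rule concrete, here is a minimal Python model of the two instructions' effect on a register value. The functions `lli` and `lui` are illustrative only and model the semantics described above, not the encoding.

```python
def lli(value: int) -> int:
    """LLI: load the low 16 bits and clear the upper 16 bits."""
    return value & 0xFFFF

def lui(reg: int, value: int) -> int:
    """LUI: replace the upper 16 bits and keep the lower 16 bits of reg."""
    return ((value & 0xFFFF) << 16) | (reg & 0xFFFF)

# Correct order: LLI first, then LUI.
rg0 = lui(lli(0x1234), 0xABCD)
print(hex(rg0))  # 0xabcd1234

# Wrong order: the later LLI wipes the upper half written by LUI.
rg0 = lui(0, 0xABCD)
rg0 = lli(0x1234)
print(hex(rg0))  # 0x1234
```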
### Jump and Branch Instructions
| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x0D | **JMP** | I | Offset, BaseReg | Unconditional jump to (BaseReg + Offset) |
| 0x0E | **JEQ** | I | Offset, BaseReg | Jump if Equal flag set |
| 0x0F | **JNE** | I | Offset, BaseReg | Jump if Equal flag NOT set |
| 0x10 | **JGT** | I | Offset, BaseReg | Jump if GreaterThan flag set |
| 0x11 | **JGE** | I | Offset, BaseReg | Jump if GreaterThan OR Equal flag set |
| 0x12 | **JLT** | I | Offset, BaseReg | Jump if LessThan flag set |
| 0x13 | **JLE** | I | Offset, BaseReg | Jump if LessThan OR Equal flag set |
**Jump Calculation:**
- Target address = BaseReg + SignExtend(Offset)
- If BaseReg = zero, this becomes absolute addressing with Offset (the target must fit in a sign-extended 16-bit value)
- If BaseReg = ret, this becomes return-style addressing
- Conditional jumps check flags in STS register
**Common Patterns:**
```
JMP label, zero ; Absolute jump to label address
JMP 0, ret ; Jump to address in ret register
JMP 4, ret ; Jump to (ret + 4)
```
**Encoding Note:**
In machine code: Offset (Immediate field), BaseReg (SrcReg field) (DestReg unused, set to noreg)
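The branch decision and target calculation can be sketched in Python as follows. This is one reading of the table, not a definitive model: JGE and JLE are taken when either of the two named flags is set, which after a CMP is equivalent to testing the GreaterThanOrEqual and LessThanOrEqual bits. The names `CONDITIONS` and `next_pcx` and the `fall_through` parameter (the address of the following instruction) are assumptions.

```python
EQUAL, GREATER, LESS = 1 << 0, 1 << 1, 1 << 3  # STS bit positions

CONDITIONS = {
    "JMP": lambda sts: True,
    "JEQ": lambda sts: bool(sts & EQUAL),
    "JNE": lambda sts: not sts & EQUAL,
    "JGT": lambda sts: bool(sts & GREATER),
    "JGE": lambda sts: bool(sts & (GREATER | EQUAL)),
    "JLT": lambda sts: bool(sts & LESS),
    "JLE": lambda sts: bool(sts & (LESS | EQUAL)),
}

def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def next_pcx(mnemonic: str, sts: int, base: int, offset: int, fall_through: int) -> int:
    """Return the jump target if the condition holds, else fall_through."""
    if CONDITIONS[mnemonic](sts):
        return (base + sign_extend16(offset)) & 0xFFFFFFFF
    return fall_through

print(hex(next_pcx("JEQ", EQUAL, 0, 0x100, 0x44)))  # taken -> 0x100
print(hex(next_pcx("JNE", EQUAL, 0, 0x100, 0x44)))  # not taken -> 0x44
```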
### Comparison
| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x14 | **CMP** | R | Reg1, Reg2 | Compare Reg1 with Reg2, set flags in STS |
**Flag Setting:**
- Equal: Set if Reg1 == Reg2
- GreaterThan: Set if Reg1 > Reg2 (signed)
- GreaterThanOrEqual: Set if Reg1 >= Reg2 (signed)
- LessThan: Set if Reg1 < Reg2 (signed)
- LessThanOrEqual: Set if Reg1 <= Reg2 (signed)
- Zero: Set if (Reg1 - Reg2) == 0 (same as Equal)
**Encoding Note:**
DestReg and ShiftAmt fields are unused (set to noreg and 0, respectively)
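The flag rules above can be modeled with a short Python sketch that treats both operands as signed 32-bit values. `cmp_flags` and `to_signed32` are hypothetical helper names. The sketch returns only the six defined flag bits and does not model the reserved bits.

```python
def to_signed32(value: int) -> int:
    """Interpret a 32-bit register value as a signed number."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def cmp_flags(reg1: int, reg2: int) -> int:
    """Return the STS flag bits that CMP Reg1, Reg2 would set."""
    a, b = to_signed32(reg1), to_signed32(reg2)
    flags = 0
    if a == b:
        flags |= (1 << 0) | (1 << 5)  # Equal and Zero
    if a > b:
        flags |= 1 << 1               # GreaterThan
    if a >= b:
        flags |= 1 << 2               # GreaterThanOrEqual
    if a < b:
        flags |= 1 << 3               # LessThan
    if a <= b:
        flags |= 1 << 4               # LessThanOrEqual
    return flags

print(bin(cmp_flags(5, 5)))           # 0b110101
print(bin(cmp_flags(0xFFFFFFFF, 1)))  # -1 < 1 when signed -> 0b11000
```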
### Arithmetic Instructions
| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x15 | **INC** | R | Reg | Increment register by 1 |
| 0x16 | **DEC** | R | Reg | Decrement register by 1 |
| 0x19 | **ADD** | R | Src1, Src2, Dest | Dest = Src1 + Src2 |
| 0x1A | **SUB** | R | Src1, Src2, Dest | Dest = Src1 - Src2 |
| 0x25 | **IADD** | I | Src, Literal, Dest | Dest = Src + SignExtend(Literal) |
| 0x26 | **ISUB** | I | Src, Literal, Dest | Dest = Src - SignExtend(Literal) |
**Flag Effects:**
- Zero flag set if result is zero
- Other flags undefined after arithmetic (use CMP for comparisons)
**Encoding Notes:**
- INC/DEC: Reg in SrcReg1 field, DestReg set to noreg
- IADD/ISUB: Immediate is signed 16-bit value, all three operands required
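A minimal Python model of IADD and ISUB follows, returning the result together with the Zero flag. The function names are illustrative. Wrapping the result to 32 bits on overflow is an assumption, since the specification does not define overflow behavior.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def iadd(src: int, literal: int):
    """Dest = Src + SignExtend(Literal); returns (result, zero_flag)."""
    result = (src + sign_extend16(literal)) & 0xFFFFFFFF  # wraparound assumed
    return result, result == 0

def isub(src: int, literal: int):
    """Dest = Src - SignExtend(Literal); returns (result, zero_flag)."""
    result = (src - sign_extend16(literal)) & 0xFFFFFFFF  # wraparound assumed
    return result, result == 0

print(iadd(0, 0xFFFF))  # literal is -1 -> (4294967295, False)
print(isub(1, 1))       # (0, True)
```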
### Bitwise Logical Operations
| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x1B | **AND** | R | Src1, Src2, Dest | Dest = Src1 & Src2 (bitwise AND) |
| 0x1C | **OR** | R | Src1, Src2, Dest | Dest = Src1 \| Src2 (bitwise OR) |
| 0x1D | **NOT** | R | Src, Dest | Dest = ~Src (bitwise NOT) |
| 0x1E | **XOR** | R | Src1, Src2, Dest | Dest = Src1 ^ Src2 (bitwise XOR) |
| 0x1F | **NAND** | R | Src1, Src2, Dest | Dest = ~(Src1 & Src2) (bitwise NAND) |
| 0x20 | **NOR** | R | Src1, Src2, Dest | Dest = ~(Src1 \| Src2) (bitwise NOR) |
| 0x21 | **XNOR** | R | Src1, Src2, Dest | Dest = ~(Src1 ^ Src2) (bitwise XNOR) |
**Flag Effects:**
- Zero flag set if result is zero
- Other flags undefined
**Encoding Note:**
NOT uses only Src (SrcReg1) and Dest (DestReg); SrcReg2 unused (set to noreg)
### Shift Operations
| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x17 | **SHL** | R | Reg, ShiftAmount | Shift Reg left by ShiftAmount bits<br/>Zero-fill from right |
| 0x18 | **SHR** | R | Reg, ShiftAmount | Shift Reg right by ShiftAmount bits<br/>Zero-fill from left (logical shift) |
**Shift Amount:**
- **Literal shifts**: ShiftAmount is a 5-bit literal (0-31) in assembly
  - Stored in ShiftAmt field of instruction
  - SrcReg2 set to noreg
- **Register shifts**: ShiftAmount is a register containing shift value
  - Register specified in SrcReg2 field
  - ShiftAmt field must be 0
  - Only low 5 bits of register value used
**Note:** Current assembler implementation may only support literal shifts. Check assembler documentation.
**Flag Effects:**
- Zero flag set if result is zero
**Encoding Notes:**
- Reg in both SrcReg1 and DestReg fields (shifted in place)
- For literal shifts: ShiftAmt field contains shift count, SrcReg2 = noreg
- For register shifts: SrcReg2 contains register, ShiftAmt must be 0
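The shift semantics can be sketched in Python as below. `shl` and `shr` are illustrative names. Masking the amount to its low 5 bits models the register-shift rule; for literal shifts the amount is already in the range 0-31, so the mask has no effect.

```python
def shl(value: int, amount: int):
    """Logical shift left, zero-fill from the right; returns (result, zero_flag)."""
    result = (value << (amount & 0x1F)) & 0xFFFFFFFF
    return result, result == 0

def shr(value: int, amount: int):
    """Logical shift right, zero-fill from the left; returns (result, zero_flag)."""
    result = (value & 0xFFFFFFFF) >> (amount & 0x1F)
    return result, result == 0

print(shl(0x80000001, 1))   # top bit is shifted out -> (2, False)
print(shr(0x80000000, 31))  # (1, False)
print(shl(1, 33))           # only the low 5 bits of 33 are used -> (2, False)
```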
### System and Control Instructions
| Hex | Mnemonic | Type | Operands | Description |
|-----|----------|------|----------|-------------|
| 0x22 | **INT** | I | InterruptCode | Trigger interrupt with 8-bit code<br/>Saves return address to ret register<br/>Sets bpr to kernel stack |
| 0x23 | **IRT** | R | - | Return from interrupt<br/>Restores execution context |
| 0x24 | **HLT** | R | - | Halt processor execution<br/>Stops fetch-decode-execute cycle |
**INT Behavior:**
1. Save current PCX to ret register
2. Switch bpr to kernel stack address
3. Look up interrupt handler address in interrupt descriptor table (idr)
4. Jump to handler at interrupt vector
**IRT Behavior:**
1. Restore previous execution context
2. Return to address in ret register
3. Restore user stack pointer
**Encoding Notes:**
- INT: InterruptCode in low 8 bits of Immediate field
- IRT/HLT: All register fields set to noreg, ShiftAmt to 0
### Meta Instructions (Assembler/Linker)
These instructions are used by the assembler and linker but may not represent real CPU operations.
| Hex | Mnemonic | Description |
|-----|----------|-------------|
| 0x27 | **SEGMENT** | Segment marker (implementation-specific) |
| 0x3E | **DATA** | Raw data embedding |
**Note:** The SEGMENT instruction opcode may vary between implementations (0x27 in assembler, 0x3F in some contexts). Consult your specific toolchain documentation.
## Instruction Summary Table
| Opcode | Mnemonic | Type | Category |
|--------|----------|------|----------|
| 0x00 | NOP | R | Control |
| 0x01 | MOV | R | Data Movement |
| 0x02 | MOVS | R | Data Movement |
| 0x03 | LDB | I | Memory Load |
| 0x04 | LDBS | I | Memory Load |
| 0x05 | LDH | I | Memory Load |
| 0x06 | LDHS | I | Memory Load |
| 0x07 | LDW | I | Memory Load |
| 0x08 | STB | I | Memory Store |
| 0x09 | STH | I | Memory Store |
| 0x0A | STW | I | Memory Store |
| 0x0B | LLI | I | Immediate Load |
| 0x0C | LUI | I | Immediate Load |
| 0x0D | JMP | I | Jump |
| 0x0E | JEQ | I | Branch |
| 0x0F | JNE | I | Branch |
| 0x10 | JGT | I | Branch |
| 0x11 | JGE | I | Branch |
| 0x12 | JLT | I | Branch |
| 0x13 | JLE | I | Branch |
| 0x14 | CMP | R | Comparison |
| 0x15 | INC | R | Arithmetic |
| 0x16 | DEC | R | Arithmetic |
| 0x17 | SHL | R | Shift |
| 0x18 | SHR | R | Shift |
| 0x19 | ADD | R | Arithmetic |
| 0x1A | SUB | R | Arithmetic |
| 0x1B | AND | R | Logical |
| 0x1C | OR | R | Logical |
| 0x1D | NOT | R | Logical |
| 0x1E | XOR | R | Logical |
| 0x1F | NAND | R | Logical |
| 0x20 | NOR | R | Logical |
| 0x21 | XNOR | R | Logical |
| 0x22 | INT | I | System |
| 0x23 | IRT | R | System |
| 0x24 | HLT | R | System |
| 0x25 | IADD | I | Arithmetic |
| 0x26 | ISUB | I | Arithmetic |
| 0x27 | SEGMENT | - | Meta |
| 0x3E | DATA | - | Meta |
## Exception Conditions
The following conditions trigger exceptions:
| Exception | Trigger Condition |
|-----------|------------------|
| **Illegal Instruction** | - Invalid opcode<br/>- noreg used as source/destination<br/>- ShiftAmt non-zero for non-shift instruction<br/>- Register field violations |
| **Protection Fault** | - Write to pcx register<br/>- Read/write idr or mmr in user mode<br/>Note: writes to the zero register are discarded without a fault, and noreg accesses raise an illegal instruction fault instead |
| **Alignment Fault** | - LDH/LDHS/STH with odd address<br/>- LDW/STW with address not divisible by 4 |
| **Memory Access Violation** | - Access to unmapped or protected memory<br/>- Stack overflow/underflow |
## Calling Convention
See the DSA Assembly Language Reference for the complete calling convention and ABI specification.
## Notes on Design
1. **Word Size:** All addresses and general computation are 32-bit
2. **Endianness:** Little-endian for instructions and runtime data; assembler data directives may use big-endian
3. **Stack Growth:** Stack grows **downward** (toward lower addresses) - PUSH decrements SPR
4. **Alignment:** Natural alignment required for halfword and word accesses
5. **Sign Extension:** All immediate values are sign-extended unless noted
6. **Zero Register:** Provides constant zero, writes are legal but discarded
7. **Reserved Encodings:** Opcodes 0x28-0x3D are reserved; 0x27 and 0x3F are implementation-specific (SEGMENT), and 0x3E is used for DATA