zxq5/damn_simple_architecture

Fork 0

Files

T

zxq5 89762b54e3 updated docs

2026-02-05 01:09:38 +00:00

18 KiB

Raw Blame History

DSA Implementation vs Documentation Discrepancies

Critical Discrepancies

1. Stack Growth Direction ❌ CRITICAL

Documentation states: Stack grows upward (toward higher addresses)

Implementation shows (expand.rs:44-51):

fn expand_push(current: &Node, nodes: &mut Vec<Node>) -> Result<(), AssembleError> {
    // ...
    nodes.extend(vec![
        node!(label, Opcode::SubI, spr, 4, spr),  // spr = spr - 4
        node!(None, Opcode::Stw, reg, spr, 0),
    ]);

Implementation shows (expand.rs:130-137):

fn expand_pop(current: &Node, nodes: &mut Vec<Node>) -> Result<(), AssembleError> {
    // ...
    nodes.extend(vec![
        node!(label, Opcode::Ldw, spr, reg, 0),
        node!(None, Opcode::AddI, spr, 4, spr),  // spr = spr + 4
    ]);

Reality: Stack grows DOWNWARD (toward lower addresses)

PUSH: Decrements SPR by 4, then stores
POP: Loads, then increments SPR by 4

Impact: All documentation examples and calling convention diagrams are backwards!

2. CALL Pseudo-instruction Expansion ❌ CRITICAL

Documentation states (DSA_Assembly_Reference.md):

; call print::print expands to:
lwi print::print, ret       ; Load function address into ret
jmp 0, ret                  ; Jump to function (saves return in pcx)

Implementation shows (expand.rs:109-123):

fn expand_call(current: &Node, nodes: &mut Vec<Node>) -> Result<(), AssembleError> {
    nodes.extend(vec![
        node!(label, Opcode::SubI, spr, 4, spr),    // Decrement stack pointer
        node!(None, Opcode::Stw, pcx, spr, 0),       // Store PCX (return addr) on stack
        node!(None, Opcode::Jmp, addr, zero),        // Jump to function
    ]);

Reality: CALL expansion is:

Decrement SPR by 4
Store PCX (return address) to stack
Jump to function address

Impact: Return address is stored on the STACK, not in RET register!

3. RETURN Pseudo-instruction Expansion ❌ CRITICAL

Documentation states:

; return expands to:
jmp 0, ret                  ; Jump to address in ret register

Implementation shows (expand.rs:125-135):

fn expand_return(current: &Node, nodes: &mut Vec<Node>) {
    nodes.extend(vec![
        node!(label, Opcode::Ldw, spr, ret, 0),     // Load return addr from stack
        node!(None, Opcode::AddI, spr, 4, spr),     // Increment stack pointer
        node!(None, Opcode::Jmp, 4, ret),           // Jump to (ret + 4)
    ]);
}

Reality: RETURN expansion is:

Load return address from stack into RET register
Increment SPR by 4
Jump to (RET + 4)

Why +4? The stored PCX points to the instruction AFTER the call's jump, so we need to add 4 to skip past the stored PCX instruction itself... or this might be a bug in the implementation.

Impact: Return mechanism is completely different from documentation!

4. Calling Convention - Stack Frame Layout ❌ CRITICAL

Documentation states:

Higher Addresses
├─────────────┤
│   Arg N     │  ← spr + (8 + 4*(N-1))
│   ...       │
│   Arg 2     │  ← spr + 16
│   Arg 1     │  ← spr + 12
│   Arg 0     │  ← spr + 8
├─────────────┤
│   Ret Addr  │  ← spr + 4
├─────────────┤
│   Old BPR   │  ← spr + 0
├─────────────┤  ← bpr, spr
│   Locals    │
Lower Addresses

Reality based on implementation: Since stack grows DOWN:

Lower Addresses
├─────────────┤  ← Current SPR/BPR
│   Old BPR   │  ← spr + 0 (immediately above SPR)
├─────────────┤
│   Ret Addr  │  ← spr + 4 (pushed by CALL)
├─────────────┤
│   Arg 0     │  ← spr + 8
│   Arg 1     │  ← spr + 12
│   Arg 2     │  ← spr + 16
│   ...       │
│   Arg N     │  ← spr + (8 + 4*(N-1))
├─────────────┤
Higher Addresses

The diagram needs to be flipped! The offsets are correct, but the direction is wrong.

5. Label-Based Load/Store Scratch Register ⚠️ IMPORTANT

Documentation states: Uses rgf as scratch register

Implementation confirms (expand.rs:138-153):

fn expand_ldx(current: &Node, nodes: &mut Vec<Node>) -> Result<(), AssembleError> {
    // For ldb label, reg:
    nodes.extend(vec![
        node!(current.label(), Opcode::Lli, name, reg),
        node!(None, Opcode::Lui, name, reg),
        node!(None, opcode, reg, reg, offset),
    ]);

Wait! This is WRONG in the implementation!

The load expansion uses the DESTINATION register as scratch:

ldb buffer, rg2  expands to:
    lli buffer, rg2    ; Uses rg2 as destination
    lui buffer, rg2    ; Uses rg2 as destination
    ldb rg2, rg2, 0    ; Uses rg2 as base

Documentation says it should use rgf:

ldb buffer, rg2 expands to:
    lli buffer, rgf    ; Uses rgf as scratch
    lui buffer, rgf    ; Uses rgf as scratch
    ldb rgf, rg2, 0    ; Load from rgf into rg2

For stores (expand.rs:155-176):

fn expand_stx(current: &Node, nodes: &mut Vec<Node>) -> Result<(), AssembleError> {
    // For stb reg, label:
    let temp = Token::Register(Register::Acc);  // Uses ACC, not RGF!
    
    nodes.extend(vec![
        node!(current.label(), Opcode::Lli, dest, temp),
        node!(None, Opcode::Lui, dest, temp),
        node!(None, opcode, base, temp, offset),
    ]);

Reality:

Load pseudo-instructions use the DESTINATION register as scratch
Store pseudo-instructions use the ACC register as scratch, NOT rgf

Impact: Documentation is incorrect about which registers are used!

6. LWI Pseudo-instruction ✅ CORRECT

Documentation and implementation agree:

fn expand_lwi(current: &Node, nodes: &mut Vec<Node>) -> Result<(), AssembleError> {
    nodes.extend(vec![
        node!(current.label(), Opcode::Lli, val, reg),
        node!(None, Opcode::Lui, val, reg),
    ]);

This matches the documented expansion.

7. PUSHA/POPA Pseudo-instructions 📝 UNDOCUMENTED

These exist in implementation but are NOT in documentation!

expand.rs:53-76:

fn expand_pusha(current: &Node, nodes: &mut Vec<Node>) -> Result<(), AssembleError> {
    let count = expect_token!(arg0, Immediate)?;
    let spr = Token::Register(Register::Spr);
    let registers: Vec<Register> = Register::general();

    nodes.push(node!(label, Opcode::SubI, spr, Token::Immediate(count * 4), spr));
    
    nodes.extend((0..count).rev().map(|i| {
        node!(None, Opcode::Stw,
            Token::Register(registers[i as usize]),
            spr,
            Token::Immediate(i * 4)
        )
    }));

expand.rs:78-101:

fn expand_popa(current: &Node, nodes: &mut Vec<Node>) -> Result<(), AssembleError> {
    let count = expect_token!(arg0, Immediate)?;
    
    nodes.extend((0..count).rev().map(|i| {
        node!(
            { if i == 0 { label.clone() } else { None } },
            Opcode::Ldw,
            spr,
            Token::Register(registers[i as usize]),
            Token::Immediate(i * 4)
        )
    }));
    
    nodes.push(node!(None, Opcode::AddI, spr, Token::Immediate(count * 4), spr));

What they do:

pusha N - Push first N general-purpose registers (rg0-rgN) to stack
popa N - Pop first N general-purpose registers from stack

Missing from documentation entirely!

8. Register Index Encoding ⚠️ IMPORTANT

Documentation states: System registers like MAR, MDR, STS, CIR, PCX are "internal" and not accessible

Implementation shows (instructions.rs:148-153):

0x18 => Self::Mar,
0x19 => Self::Mdr,
0x1A => Self::Sts,
0x1B => Self::Cir,
0x1C => Self::Pcx,

Reality: These registers ARE encoded in the instruction format at indices 0x18-0x1C!

However, instructions.rs:186 shows:

"null" => Ok(Self::NoReg),  // Can parse "null" as NoReg

Documentation never mentions "null" as an alternative name for noreg!

9. LUI Immediate Value Handling ⚠️ IMPORTANT

Documentation states:

lui immediate, dest_reg     ; Load immediate into upper 16 bits

Implementation shows (codegen.rs:248-254):

fn build_load_immediate_instruction(...) -> Result<Instruction, AssembleError> {
    // ...
    match opcode {
        Opcode::Lli => {
            let instruction_args = args!(I, immediate: value as u16, r1: dest);
            Ok(Instruction::LoadLowerImmediate(instruction_args))
        }
        Opcode::Lui => {
            let upper_value = value >> 16;  // Shifts right by 16!
            let instruction_args = args!(I, immediate: upper_value as u16, r1: dest);
            Ok(Instruction::LoadUpperImmediate(instruction_args))
        }

Reality: When assembling lui immediate, reg, the assembler:

Takes the immediate value
Shifts it RIGHT by 16 bits
Stores the result in the instruction

This means:

lli 0x1234, rg0    ; Stores 0x1234 in lower 16 bits
lui 0xABCD0000, rg0  ; Right-shifts to 0xABCD, stores in upper 16 bits

Or more likely, the assembler expects:

lli 0x1234, rg0    ; Stores 0x1234 in lower 16 bits
lui 0xABCD, rg0    ; Stores 0xABCD in upper 16 bits (no shift needed)

Documentation needs clarification on what immediate value format LUI expects!

10. Data Definition Encoding ⚠️ IMPORTANT

Implementation (expand.rs:217-267):

fn process_dx_data(args: Vec<Token>, size: usize) -> Result<Vec<u32>, AssembleError> {
    for token in args {
        match token {
            Token::StringLit(mut s) => {
                s.push('\0');  // Automatically adds null terminator!
                for ch in s.chars() {
                    let mut char_buf = [0u8; 4];
                    let char_bytes = ch.encode_utf8(&mut char_buf);
                    buffer.extend_from_slice(char_bytes.as_bytes());
                }
            }
            Token::Immediate(value) => {
                buffer.extend_from_slice(&value.to_be_bytes());  // BIG ENDIAN!
            }

Key findings:

String literals automatically get null terminator appended
Numeric values are stored in BIG ENDIAN format (to_be_bytes)
Documentation says "little-endian byte order" globally

Contradiction: Data definition uses BIG ENDIAN, but doc says LITTLE ENDIAN!

11. Segment Instruction 📝 UNDOCUMENTED

Implementation has a SEGMENT instruction (0x27/0x3F):

Segment(u32) = 0x3F,

This is completely undocumented!

From model.rs:

Self::Segment => write!(f, "[SEGMENT]"),

From codegen.rs:

Opcode::Segment => build_segment_instruction(&args),

Purpose unclear, needs documentation!

12. Data Instruction 📝 UNDOCUMENTED

Implementation has a DATA instruction (0x3E):

Data(u32) = 0x3E,

This appears to be a meta-instruction for embedding raw data, but it's undocumented in the assembly reference!

13. INC/DEC Instruction Encoding ⚠️ MINOR

Implementation (codegen.rs:293-299):

fn build_inc_dec_instruction(opcode: Opcode, args: &[Token]) -> Result<Instruction, AssembleError> {
    let reg = expect_token!(reg_token, Register)?;
    match opcode {
        Opcode::Inc => Ok(Instruction::Increment(args!(R, sr1: reg))),
        Opcode::Dec => Ok(Instruction::Decrement(args!(R, sr1: reg))),

Reality: INC/DEC only set SR1 field, not DR field.

But args.rs shows:

impl RTypeArgs {
    pub fn new(...) -> Self {
        let sr1 = sr1.unwrap_or_default();  // Defaults to NoReg
        let dr = dr.unwrap_or_default();     // Defaults to NoReg

So the DR field gets set to NoReg, which is correct per documentation.

However, the Display impl (instructions.rs:449) shows:

Self::Increment(a) | Self::Decrement(a) => write!(f, " {}", a.sr1),

This is correct - only shows SR1 in disassembly.

14. Shift Instruction Operand Order ⚠️ MINOR

Implementation (codegen.rs:301-312):

fn build_shift_instruction(opcode: Opcode, args: &[Token]) -> Result<Instruction, AssembleError> {
    let reg = expect_token!(reg_token, Register)?;
    let amount = expect_token!(amount_token, Immediate)? as u8;
    
    match opcode {
        Opcode::Shl => Ok(Instruction::ShiftLeft(args!(R, sr1: reg, shamt: amount))),

This only handles LITERAL shift amounts, not REGISTER shift amounts!

Documentation states both are supported:

shl rg0, 2      ; Literal shift
shl rg0, rg1    ; Register shift

The current codegen only handles the literal case!

This is a BUG in the implementation - register shifts aren't properly assembled!

15. Jump Instruction Operand Order ⚠️ CONFUSION

Documentation shows assembly syntax:

jmp addr [, offset_reg]

But implementation (codegen.rs:256-270):

fn build_jump_instruction(opcode: Opcode, args: &[Token]) -> Result<Instruction, AssembleError> {
    let address = expect_token!(address_token, Immediate)?;
    let offset = expect_token!(offset_token, Register)?;
    let instruction_args = args!(I, immediate: address as u16, r1: offset);

This expects:

First arg: immediate (address)
Second arg: register (offset)

So assembly syntax should be:

jmp immediate, offset_register

Example:

jmp 0x1000, zero   ; Jump to 0x1000
jmp 4, ret         ; Jump to (ret + 4)

Documentation syntax is correct, but parameter names are confusing!

The "address" is actually an OFFSET, and the register is the BASE!

Better naming:

jmp offset, base_register
; Target = base_register + offset

16. NOT Instruction Operand Count ✅ MINOR ISSUE

Documentation shows:

not src, dest       ; Two operands

Implementation (instructions.rs:428-429):

Self::Compare(args) | Self::Not(args) => {
    write!(f, " {}, {}", args.sr1, args.sr2)
}

This displays BOTH sr1 and sr2 for NOT!

But codegen.rs:354-362:

fn build_not_instruction(args: &[Token]) -> Result<Instruction, AssembleError> {
    let reg = expect_token!(reg_token, Register)?;
    let dest = expect_token!(dest_token, Register)?;
    Ok(Instruction::Not(args!(R, sr1: reg, dr: dest)))

Sets sr1 and dr, NOT sr1 and sr2!

The Display impl is WRONG - should show sr1 and dr:

Self::Not(args) => write!(f, " {}, {}", args.sr1, args.dr)

This is a display bug in the implementation!

17. Register File Indexing ✅ CORRECT

Documentation and implementation both agree:

0x00-0x0F: rg0-rgf (general purpose)
0x10: acc
0x11: spr
0x12: bpr
0x13: ret
0x14: idr
0x15: mmr
0x16: zero
0x17: noreg

This matches perfectly.

18. Immediate Arithmetic Destination ⚠️ MINOR

Implementation (codegen.rs:314-330):

fn build_arithmetic_immediate_instruction(...) -> Result<Instruction, AssembleError> {
    let reg = expect_token!(reg_token, Register)?;
    let immediate = expect_token!(immediate_token, Immediate)? as u16;
    let dest = expect_token!(dest_token, Register)?;
    let instruction_args = args!(I, immediate: immediate, r1: reg, r2: dest);

This REQUIRES three arguments:

Source register
Immediate value
Destination register

But documentation says destination is optional:

iadd src_reg, imm [, dest_reg]   ; dest optional

Reality: The assembler REQUIRES the destination register!

If you want in-place operation:

iadd rg0, 10, rg0    ; Required to specify rg0 twice

Not:

iadd rg0, 10         ; This won't work!

Documentation is misleading - destination is NOT optional!

19. Memory Instruction Offsets ✅ CORRECT

Implementation correctly handles signed 16-bit offsets:

let offset = expect_token!(offset_token, Immediate)? as u16;

These are stored as u16 but interpreted as signed i16 at runtime.

Documentation is correct about this.

20. Instruction Opcode Values ✅ VERIFIED

Comparing model.rs opcodes with instructions.rs:

Instruction	model.rs	instructions.rs	Match
Nop	0x00	0x0	✅
Mov	0x01	0x1	✅
MovSigned	0x02	0x2	✅
LoadByte	0x03	0x3	✅
...	...	...	✅
AddImmediate	0x25	0x25	✅
SubImmediate	0x26	0x26	✅
Segment	0x27	0x3F	❌ MISMATCH!

CRITICAL: Segment instruction has opcode 0x27 in model.rs but 0x3F in instructions.rs!

Summary of Critical Issues

Must Fix in Documentation:

✅ Stack grows DOWNWARD - flip all diagrams
✅ CALL expansion - uses stack, not ret register directly
✅ RETURN expansion - loads from stack, jumps to ret+4
✅ Stack frame layout - flip diagram vertically
✅ Load pseudo scratch register - uses DEST reg, not rgf
✅ Store pseudo scratch register - uses ACC, not rgf
✅ Add PUSHA/POPA documentation
✅ Add SEGMENT instruction documentation
✅ Add DATA instruction documentation
✅ Clarify LUI immediate value handling
✅ Fix endianness - data definition uses BIG endian
✅ IADD/ISUB destination NOT optional
✅ Add "null" as alias for noreg
✅ Fix Segment opcode - 0x27 or 0x3F?

Potential Implementation Bugs:

⚠️ Shift instruction - doesn't handle register shifts
⚠️ NOT display - shows sr2 instead of dr
⚠️ RETURN +4 offset - why is this needed?
⚠️ Segment opcode mismatch - 0x27 vs 0x3F

Minor Documentation Improvements:

Add explicit examples of stack growth direction
Show complete memory layout diagrams
Document which registers are volatile/preserved
Add troubleshooting section for common mistakes
Clarify jump instruction parameter semantics

18 KiB Raw Blame History

DSA Implementation vs Documentation Discrepancies

Critical Discrepancies

1. Stack Growth Direction ❌ CRITICAL

2. CALL Pseudo-instruction Expansion ❌ CRITICAL

3. RETURN Pseudo-instruction Expansion ❌ CRITICAL

4. Calling Convention - Stack Frame Layout ❌ CRITICAL

5. Label-Based Load/Store Scratch Register ⚠️ IMPORTANT

6. LWI Pseudo-instruction ✅ CORRECT

7. PUSHA/POPA Pseudo-instructions 📝 UNDOCUMENTED

8. Register Index Encoding ⚠️ IMPORTANT

9. LUI Immediate Value Handling ⚠️ IMPORTANT

10. Data Definition Encoding ⚠️ IMPORTANT

11. Segment Instruction 📝 UNDOCUMENTED

12. Data Instruction 📝 UNDOCUMENTED

13. INC/DEC Instruction Encoding ⚠️ MINOR

14. Shift Instruction Operand Order ⚠️ MINOR

15. Jump Instruction Operand Order ⚠️ CONFUSION

16. NOT Instruction Operand Count ✅ MINOR ISSUE

17. Register File Indexing ✅ CORRECT

18. Immediate Arithmetic Destination ⚠️ MINOR

19. Memory Instruction Offsets ✅ CORRECT

20. Instruction Opcode Values ✅ VERIFIED

Summary of Critical Issues

Must Fix in Documentation:

Potential Implementation Bugs:

Minor Documentation Improvements:

18 KiB

Raw Blame History