Compare commits

2 Commits

Author SHA1 Message Date
zxq5 afb2761a2b . 2025-06-27 13:58:21 +01:00
zxq5 e23c20b635 added instruction set doc and brainf* doc 2025-06-27 10:32:33 +01:00
6 changed files with 161 additions and 0 deletions
+3
@@ -2,6 +2,8 @@
[Introduction](README.md)
- [The Damn Simple Architecture](dsa-arch.md)
- [Instruction Set](dsa-arch/instruction_set.md)
- [DSA - Damn Simple Assembly](dsa.md)
- [Instructions](dsa/instructions.md)
- [Hardware Instructions](dsa/instructions/hardware.md)
@@ -25,5 +27,6 @@
- [Display](emulator/features/display.md)
- [Instruction History](emulator/features/instruction_history.md)
- [DSC - Damn Simple Code](dsc.md)
- [Functions](dsc/functions.md)
- [Other Language Support](misc_languages.md)
- [Brainf*](misc_languages/brainf.md)
+2
@@ -0,0 +1,2 @@
# The Damn Simple Architecture
+101
@@ -0,0 +1,101 @@
# Instruction Set
## Overview
Below is an overview of the instruction set and the various operands. This table is non-exhaustive and may be updated as the design changes. *Please note that the table spans multiple pages.*
Also note that immediate (constant/literal) arguments are 16 bits long in I-type (immediate argument) instructions. For more information on this, refer to the instruction encoding sections.
| Type | Description |
| -- | -- |
| R | Used when an instruction takes one or more register arguments, but no immediates. This type is also used by shift and rotation operations, as it contains a 5-bit shift amount field. |
| I | Used when an instruction takes at most two register arguments as well as a halfword immediate argument. This is typically used by immediate arithmetic operations e.g. addi, as well as loads and stores (where a base register and immediate offset are passed). Also used by branching instructions, where the operand is a signed offset from the current value of PCX. |
| J | Used by jumps excluding jr, which uses a register as its argument. Jump targets are absolute addresses, but they are limited to the 256MB region containing PCX since the argument is only 26 bits. Since targets are always word aligned, the argument is shifted left by two bits and the upper 4 bits are set to match those of the value in PCX. This then forms a valid word-sized address. |
**Note:**
J-type instructions are currently unused.
### R-type Instruction Encoding
| Bits 31-26 | Bits 25-21 | Bits 20-16 | Bits 15-11 | Bits 10-6 | Bits 5-0 |
| - | - | - | - | - | - |
| Opcode | Source Reg 1 | Source Reg 2 | Destination Reg | Shift Amount | Unused |
The shift amount must be 0 when the opcode is not a shift instruction; otherwise the CPU raises an Illegal Instruction exception.
If a register field is unused, it should be set to the special value NOREG, defined in the Registers section of this document. Failing to do so may result in an Illegal Instruction exception, since behaviour is undefined when an instruction receives an argument it does not expect.
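The field layout in the table above can be sketched as a pair of pack/unpack helpers. This is a minimal illustration rather than the emulator's actual code: the function names and the register numbers used in the example are hypothetical, and the numeric value of NOREG is not assumed here.

```python
# Hypothetical helpers; only the bit positions come from the R-type table above.
def encode_r(opcode, src1, src2, dest, shamt=0):
    """Pack R-type fields into a 32-bit word. Bits 5-0 are unused and left as 0."""
    assert 0 <= opcode < 64, "opcode is a 6-bit field"
    for field in (src1, src2, dest, shamt):
        assert 0 <= field < 32, "register and shift fields are 5 bits wide"
    return (opcode << 26) | (src1 << 21) | (src2 << 16) | (dest << 11) | (shamt << 6)


def decode_r(word):
    """Split a 32-bit word back into its R-type fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "src1": (word >> 21) & 0x1F,
        "src2": (word >> 16) & 0x1F,
        "dest": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
    }


# ADD (opcode 0x19) using illustrative register numbers 1, 2 and 3.
word = encode_r(0x19, 1, 2, 3)
print(hex(word))  # 0x64221800
```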
### I-type Instruction Encoding
| Bits 31-26 | Bits 25-21 | Bits 20-16 | Bits 15-0 |
| - | - | - | - |
| Opcode | Source Reg | Dest Reg | 16-bit immediate |
I-type instructions are used when 16-bit immediate arguments are desired. This could be for immediate arithmetic instructions (like adding 10 to the value in ACC), or loads and stores, where we may want to access the ith index of an array using an offset.
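As a rough sketch of the I-type layout, the snippet below packs an opcode, two register fields and a 16-bit immediate, and shows how a negative offset would be recovered by sign extension. The helper names and register numbers are hypothetical, and treating the immediate as two's complement is an assumption that the text only confirms for branch offsets.

```python
# Hypothetical helpers; only the bit positions come from the I-type table above.
def encode_i(opcode, src, dest, imm):
    """Pack I-type fields into a 32-bit word, keeping the low 16 bits of imm."""
    assert 0 <= opcode < 64 and 0 <= src < 32 and 0 <= dest < 32
    assert -0x8000 <= imm <= 0xFFFF, "immediate must fit in 16 bits"
    return (opcode << 26) | (src << 21) | (dest << 16) | (imm & 0xFFFF)


def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


# IADD (opcode 0x25) with illustrative register numbers 1 and 2 and immediate -4.
word = encode_i(0x25, 1, 2, -4)
print(hex(word))                     # 0x9422fffc
print(sign_extend16(word & 0xFFFF))  # -4
```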
### J-type Instruction Encoding
| Bits 31-26 | Bits 25-0 |
| - | - |
| Opcode | Address |
J-type instructions are used for absolute jumps.
The CPU converts the 26-bit address field to a 32-bit address as follows:
- The 26-bit address field is shifted left by 2 bits (targets are word aligned, so the 2 least significant bits are always zero and are not stored).
- The result is combined (bitwise OR) with the upper 4 bits of the PC to form a 32-bit address.

The jump range is therefore the 256MB region that shares its upper 4 bits with the current PC. For longer jumps than this, see jr (jump to the word address in a register).
To compute the address field, the linker should find the address of the label, clear the top 4 bits, then shift right by 2 bits. The CPU then converts this back to the actual 32-bit address by following the steps outlined above.
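The linker-side and CPU-side steps described above can be checked against each other with a short round-trip sketch. The function names and example addresses are hypothetical; only the shifts and masks follow the text.

```python
# Hypothetical names; the shifts and masks follow the J-type description above.
def linker_encode_target(label_address):
    """Linker side: drop the top 4 bits, then shift right by 2."""
    assert label_address % 4 == 0, "jump targets must be word aligned"
    return (label_address & 0x0FFFFFFF) >> 2


def cpu_decode_target(field, pc):
    """CPU side: shift the 26-bit field left by 2 and OR in the top 4 bits of PC."""
    return (pc & 0xF0000000) | ((field & 0x03FFFFFF) << 2)


pc = 0x10000040
label = 0x1000ABCC
field = linker_encode_target(label)
print(hex(field))                         # 0x2af3
print(hex(cpu_decode_target(field, pc)))  # 0x1000abcc
```

A label whose upper 4 bits differ from those of the PC does not survive the round trip, which is why such jumps need jr instead.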
## Instructions
### Hardware Instructions
| Hex | Type | Mnemonic | Operands | Description |
|------|------|----------|----------|-------------|
| 0x00 | R | NOP | n/a | No operation - a blank line. |
| 0x01 | R | MOV | SrcReg, DestReg | Copies from SrcReg to DestReg. |
| 0x02 | R | MOVS | SrcReg, DestReg | Copies from SrcReg to DestReg, sign extending the value to take up a full word. |
| 0x03 | I | LDB | BaseReg, Offset, DestReg | Loads a byte from memory address (base + offset) into DestReg. The effective address must be byte-aligned. |
| 0x04 | I | LDBS | BaseReg, Offset, DestReg | Loads a sign-extended byte from memory address (base + offset) into DestReg. The effective address must be byte-aligned. |
| 0x05 | I | LDH | BaseReg, Offset, DestReg | Loads a half-word from memory address (base + offset) into DestReg. The effective address must be 2-byte-aligned. |
| 0x06 | I | LDHS | BaseReg, Offset, DestReg | Loads a sign-extended half-word from memory address (base + offset) into DestReg. The effective address must be 2-byte-aligned. |
| 0x07 | I | LDW | BaseReg, Offset, DestReg | Loads a word from memory address (base + offset) into DestReg. The effective address must be 4-byte-aligned. |
| 0x08 | I | STB | SrcReg, BaseReg, Offset | Stores a byte from SrcReg in memory address (base + offset). The effective address must be byte-aligned. |
| 0x09 | I | STH | SrcReg, BaseReg, Offset | Stores a half-word from SrcReg in memory address (base + offset). The effective address must be 2-byte-aligned. |
| 0x0A | I | STW | SrcReg, BaseReg, Offset | Stores a word from SrcReg in memory address (base + offset). The effective address must be 4-byte-aligned. |
| 0x0B | I | LLI | DstReg, Value | Loads a 16-bit literal value into DstReg, setting the bottom 16 bits of the word. To populate the upper 16 bits, see LUI. |
| 0x0C | I | LUI | DstReg, Value | Loads a 16-bit literal value into DstReg, setting the top 16 bits of the word. To populate the lower 16 bits, see LLI. |
| 0x0D | I | JMP | DestReg, Offset \| Address | Unconditionally jumps to the calculated address or direct address. |
| 0x0E | I | JEQ | DestReg, Offset \| Address | Jumps to the calculated address or direct address if the equal flag is set. |
| 0x0F | I | JNE | DestReg, Offset \| Address | Jumps to the calculated address or direct address if the equal flag is not set. |
| 0x10 | I | JGT | DestReg, Offset \| Address | Jumps to the calculated address or direct address if the greater-than flag is set. |
| 0x11 | I | JGE | DestReg, Offset \| Address | Jumps to the calculated address or direct address if the greater-than flag or the equal flag is set. |
| 0x12 | I | JLT | DestReg, Offset \| Address | Jumps to the calculated address or direct address if the less-than flag is set. |
| 0x13 | I | JLE | DestReg, Offset \| Address | Jumps to the calculated address or direct address if the less-than flag or the equal flag is set. |
| 0x14 | R | CMP | Reg1, Reg2 | Compares the value of Reg1 to the value in Reg2. The results of the comparisons are set in the Status register. |
| 0x15 | R | INC | Reg | Increments the value in the given register. |
| 0x16 | R | DEC | Reg | Decrements the value in the given register. |
| 0x17 | R | SHL | Reg, Literal \| ValReg | Left shifts the value in Reg by the given amount (either a register, or a literal value). |
| 0x18 | R | SHR | Reg, Literal \| ValReg | Right shifts the value in Reg by the given amount (either a register, or a literal value). |
| 0x19 | R | ADD | Src1, Src2, Dest | Adds the value of Src2 to Src1 and writes the result to Dest. |
| 0x1A | R | SUB | Src1, Src2, Dest | Subtracts the value of Src2 from Src1 and writes the result to Dest. |
| 0x1B | R | AND | Src1, Src2, Dest | Performs bitwise AND on Src1 and Src2 storing the result in Dest. |
| 0x1C | R | OR | Src1, Src2, Dest | Performs bitwise OR on Src1 and Src2 storing the result in Dest. |
| 0x1D | R | NOT | Src, Dest | Performs bitwise NOT on Src storing the result in Dest. |
| 0x1E | R | XOR | Src1, Src2, Dest | Performs bitwise XOR on Src1 and Src2 storing the result in Dest. |
| 0x1F | R | NAND | Src1, Src2, Dest | Performs bitwise NAND on Src1 and Src2 storing the result in Dest. |
| 0x20 | R | NOR | Src1, Src2, Dest | Performs bitwise NOR on Src1 and Src2 storing the result in Dest. |
| 0x21 | R | XNOR | Src1, Src2, Dest | Performs bitwise XNOR on Src1 and Src2 storing the result in Dest. |
| 0x22 | I | INT | Literal | Initiates an interrupt with the given 8-bit interrupt code. Triggering an interrupt saves the return address to the RET register and sets the stack base pointer to the kernel stack. |
| 0x23 | R | IRT | n/a | Returns from an interrupt. |
| 0x24 | R | HLT | n/a | Halts the processor. |
| 0x25 | I | IADD | Src1, Literal, Dest | An immediate version of addition taking a 16-bit immediate value. |
| 0x26 | I | ISUB | Src1, Literal, Dest | An immediate version of subtraction taking a 16-bit immediate value. |
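
The LLI and LUI rows above imply that a full 32-bit constant is built from two 16-bit halves. Below is a minimal sketch of that idea, modelling a register as a plain integer. It assumes that each instruction writes only its own half of the register and leaves the other half untouched, which the table does not state explicitly; the function names are hypothetical.

```python
# ASSUMPTION: LLI and LUI each preserve the half of the register they do not write.
def lli(reg, value):
    """Model of LLI: replace the bottom 16 bits of reg with value."""
    return (reg & 0xFFFF0000) | (value & 0xFFFF)


def lui(reg, value):
    """Model of LUI: replace the top 16 bits of reg with value."""
    return (reg & 0x0000FFFF) | ((value & 0xFFFF) << 16)


def load_constant(constant):
    """Build a 32-bit constant with one LLI followed by one LUI."""
    reg = 0
    reg = lli(reg, constant & 0xFFFF)
    reg = lui(reg, (constant >> 16) & 0xFFFF)
    return reg


print(hex(load_constant(0xDEADBEEF)))  # 0xdeadbeef
```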
+19
@@ -1 +1,20 @@
# DSC - Damn Simple Code
# This document is a work in progress!
# Nothing is final!
## Syntax
- we aim to make the syntax simple and easy to understand. This has the following benefits:
  - easy to write
  - easy to parse
  - little variation in syntax means we have to handle fewer cases in semantic analysis, so we will be able to create a working compiler more quickly.
## Types
- we should support the following types:
  - unsigned integer types (U8, U16, U32)
  - signed integer types (I8, I16, I32)
  - boolean type (Bool)
  - struct types (Struct)
  - dynamic types *(Dyn)
+1
@@ -0,0 +1 @@
# Functions
+35
@@ -1 +1,36 @@
# Brainf*
## Language overview
- Brainf* instructions are as follows:
| Instruction | Description |
| --- | --- |
| `+` | Increment the current memory cell |
| `-` | Decrement the current memory cell |
| `<` | Move the data pointer to the left |
| `>` | Move the data pointer to the right |
| `.` | Output the value of the current memory cell as a character |
| `,` | Input a character and store its value in the current memory cell |
| `[` | Jump to the instruction after the matching `]` if the value in the current memory cell is zero |
| `]` | Jump to the instruction after the matching `[` if the value in the current memory cell is non-zero |
## Implementations
We currently have two implementations of the brainf* esoteric programming language:
### Compiler
- this is the most efficient way to run brainf* programs on the DSA architecture, but of course, still terribly inefficient due to the nature of the language.
- compiling allows us to calculate the jump addresses at compile time, so each brainf* instruction takes at most three DSA instructions to execute
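
Calculating jump addresses at compile time amounts to matching every `[` with its `]` once, before any code is emitted. The sketch below shows one common way to do this with a stack; it is an illustration of the idea, not the assembler's actual implementation, and `match_brackets` is a hypothetical name.

```python
# Hypothetical helper illustrating compile-time bracket matching.
def match_brackets(program):
    """Return a dict mapping each bracket position to its partner's position."""
    stack = []   # positions of '[' that have not been closed yet
    pairs = {}
    for pos, ch in enumerate(program):
        if ch == "[":
            stack.append(pos)
        elif ch == "]":
            if not stack:
                raise ValueError(f"unmatched ']' at position {pos}")
            start = stack.pop()
            pairs[start] = pos
            pairs[pos] = start
    if stack:
        raise ValueError(f"unmatched '[' at position {stack[-1]}")
    return pairs


print(match_brackets("+[-[>]<]"))  # {3: 5, 5: 3, 1: 7, 7: 1}
```

With this table available, each loop bracket can be emitted as a jump to a fixed target, so no searching is needed while the program runs.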
### Interpreter
- this method is much slower: even jumping to the start of a loop has O(n) time complexity, which, depending on the program, can as much as double the running time.
- additionally, interpreting the language requires much more logic at runtime than compiling.
- in our testing on a few example programs, the interpreter was roughly an order of magnitude slower. For example, a Fibonacci sequence generator took around 3.8 million instructions to generate and pretty-print the first 16 Fibonacci numbers when interpreted, compared to around 350,000 for the compiled version (about 10 times fewer), which we estimate is about as efficient as brainf* can be on our architecture without writing an optimiser.
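
The O(n) cost of jumping to the start of a loop comes from having to search for the matching bracket at runtime. A minimal sketch of such a search is given below; it assumes a simple backwards scan with a nesting counter, which may differ from how the actual interpreter works, and `find_matching_open` is a hypothetical name.

```python
# Hypothetical helper: a runtime backwards scan for the '[' that matches a ']'.
def find_matching_open(program, close_pos):
    """Walk left from close_pos, tracking nesting depth, until the partner '[' is found."""
    depth = 0
    for pos in range(close_pos, -1, -1):
        if program[pos] == "]":
            depth += 1
        elif program[pos] == "[":
            depth -= 1
            if depth == 0:
                return pos
    raise ValueError(f"no matching '[' for ']' at position {close_pos}")


print(find_matching_open("+[-[>]<]", 7))  # 1
```

Every loop iteration repeats this scan over the loop body, whereas the compiled version jumps straight to a precomputed address.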
## Usage
### Compiling
- currently, [The DSA Assembler](../dsa/tooling/assembler.md) supports compiling brainf* programs with the following command:
```bash
<assembler binary name> -brainf
```