40 Commits

Author SHA1 Message Date
zxq5 c02ecd58e8 add dsx clone command 2026-02-25 22:16:52 +00:00
zxq5 0d54b319f1 dsx/dsx_server repo system first implementation 2026-02-25 14:52:04 +00:00
zxq5 ba4ced6433 added icon and .desktop file to DSA arch package 2026-02-23 23:52:22 +00:00
zxq5 4bee36eb7f - added support for args in Builder trait
- added is_lib arg for compiler to produce non-main assembly outputs
- updated templates to use include_str
2026-02-23 22:06:07 +00:00
zxq5 71b36dc6b5 reoganised project 2026-02-23 20:32:45 +00:00
zxq5 42bc666c11 commented out debug print statements and removed some commented out code that's no longer needed 2026-02-23 19:52:25 +00:00
zxq5 1d38aca523 added serial-out support to emulator + serial lib & command line mode for dsa emulator 2026-02-23 19:51:05 +00:00
zxq5 7ab1ac8842 added logging to builder trait and implemented for compiler and
assembler crates.
2026-02-23 09:04:30 +00:00
zxq5 a1d7b54479 - created dsx and dsx_server in place of dsx-build
- dsx replaces dsx-build and is a build tool/package manager for the DSA
  ecosystem
- dsx_server is the repository/server for dsx packages.

dsx is a WIP.
2026-02-22 21:44:41 +00:00
zxq5 7117b927f3 - updated common with new compiler/builder trait to provide a common
interface for build tools
- updated editor and build tooling to use new system
2026-02-22 21:43:22 +00:00
zxq5 4ed5da259e continued work on new register allocator implementation.
next commit will have it integrated if it works
2026-02-14 20:23:20 +00:00
zxq5 7b18922cc7 Merge pull request 'Merge compiler and emulator progress from last few months into main.' (#11) from compiler into main
Reviewed-on: #11
2026-02-14 11:54:15 +00:00
zxq5 a0b02cb955 Create tasks.json 2026-02-14 11:50:59 +00:00
zxq5 240f0e553f fixed failing tests
TODO: add comprehensive testing to everything
2026-02-14 11:50:54 +00:00
zxq5 25a59a6b19 fixed clippy lints 2026-02-14 11:50:36 +00:00
zxq5 c67217a6b8 Merge branch 'main' into compiler 2026-02-14 11:09:33 +00:00
zxq5 7ccbd9258f - fixed some clippy lints
- updated comments in compiler codegen
- deleted old dsa compiler outputs
- settings for zed
2026-02-14 11:05:15 +00:00
zxq5 201b18069b continued on register allocator rewrite, slow progress as scoping is
proving to be a challenge
2026-02-14 02:46:29 +00:00
zxq5 d66baf6f99 moved loc 2026-02-13 21:42:59 +00:00
zxq5 75ad04cf95 forgot to commit these 2026-02-10 16:37:33 +00:00
zxq5 8361833b1c broken commit, started working on scopes 2026-02-10 16:33:32 +00:00
zxq5 5e575e2cd8 used claude to write a language spec (syntax and simple examples) for
DSC that we can follow as a reference for implementation.
2026-02-10 10:05:13 +00:00
zxq5 931af90789 - renamed assembler_runner to just assembler
- implemented type parsing including custom types and generics (useless
  for now as we do no semantic analysis)
- implemented struct literal parsing
- implemented struct definition parsing (no generics yet)
- implemented tuple parsing
- registers are now allocated starting from zero
- updated to-dos
2026-02-10 10:03:48 +00:00
zxq5 509b3465f1 docs update 2026-02-09 00:10:49 +00:00
zxq5 22241a5633 - implementation of <var> <op> = <expr> type statements such as `x +=
5`
- implementation of logical and shift operations in parser and codegen.
- implementation of sizeof keyword as unary operator

in progress (non functional)
- implementation of prefix and postfix inc/dec operators

- array access by index (implemented, untested as arrays aren't
  implemented yet). essentially just a pointer write with offset.
- struct/member access (parsing implemented, untested.)
2026-02-09 00:10:37 +00:00
zxq5 e2be83414b - updated assembler to support new shift implementation
- updated emulator to support new shift implementation
- updated emulator to rename NoReg to Null as in the common lib
2026-02-09 00:05:45 +00:00
zxq5 f7ed764e96 renamed NoReg to Null in common 2026-02-09 00:04:19 +00:00
zxq5 328741eb51 updated compiler with support for more operators.
(only the unary operators from this are implemented for now)
2026-02-08 20:03:31 +00:00
zxq5 9f35fc9415 block allocator implementation and example 2026-02-08 12:36:49 +00:00
zxq5 828f5bfb2d fixed pointers and stuff. 2026-02-08 11:45:26 +00:00
zxq5 6699333b2c - C frontend broken for now
- If statements work properly now (hopefully)
- still issues with while loops pushing vars to the stack. need scoping
  implemented to fix this!

- refactored registers.rs and fixed faulty logic.
- made register allocation optimisations
2026-02-08 00:14:18 +00:00
nullndvoid 458661b02a misc: add 'profiling' profile. 2025-06-29 04:11:41 +01:00
nullndvoid c41e5328e6 docs: fix failing doctest 2025-06-29 02:21:31 +01:00
zxq5 67ebf48d6f removed old log file 2025-06-29 02:08:06 +01:00
zxq5 98668c681e more optimisations test program ~54MIPS -> ~110MIPS 2025-06-29 02:04:14 +01:00
zxq5 05a25447b2 minor optimisation to reduce unnecessary allocations 2025-06-28 03:43:20 +01:00
zxq5 56d2abe17f - optimised main emulator loop, allowing updates only once every roughly 32,000 instructions.
- optimised memory access patterns, removing unecessary mutability and accesses.
- replaced the standard HashMap with an implementation that uses a faster hashing algorithm.

results:

before:
    - our benchmark program with ~4m instructions would take around for their data to make it to the UI, and a bit over 200ms to actually run

after:
    - our benchmark program with ~4m instructions can run in around 75ms, and the UI receives the update almost instantly.

conclusion:
- emulator performance should be around 2-3x faster than before.
2025-06-28 03:21:46 +01:00
zxq5 eaaefd1b07 added rule to .gitignore 2025-06-27 18:31:52 +01:00
zxq5 5302ad3876 removed junk files 2025-06-27 18:30:53 +01:00
zxq5 2280f1e5d9 updated vscode settings 2025-06-27 18:30:26 +01:00
153 changed files with 13339 additions and 3760 deletions
+4
View File
@@ -5,3 +5,7 @@ rustc-wrapper = "sccache"
[future-incompat-report]
frequency = "always"
[profile.profiling]
inherits = "release"
debug = true
+2
View File
@@ -1,3 +1,5 @@
/target
**/*.env
Cargo.lock
.test/
pkg
+4
View File
@@ -8,4 +8,8 @@
"files.trimTrailingWhitespace": true,
"gitea.owner": "LowLevelDevs",
"gitea.repo": "damn_simple_architecture",
"[markdown]": {
"editor.formatOnSave": true,
"editor.formatOnPaste": true
}
}
+15
View File
@@ -0,0 +1,15 @@
// Folder-specific settings
//
// For a full list of overridable settings, and general information on folder-specific settings,
// see the documentation: https://zed.dev/docs/configuring-zed#settings-files
{
"lsp": {
"rust-analyzer": {
"initialization_options": {
"check": {
"command": "clippy", // rust-analyzer.check.command (default: "check")
},
},
},
},
}
+46
View File
@@ -0,0 +1,46 @@
[
{
"label": "Run Emulator",
"command": "cargo run --bin emulator",
"use_new_terminal": true,
},
{
"label": "Run Compiler",
"command": "cargo run --bin compiler",
"use_new_terminal": true,
},
{
"label": "Run Assembler",
"command": "cargo run --bin assembler",
"use_new_terminal": true,
},
{
"label": "Run Build System (dsx-build)",
"command": "cargo run --bin dsx-build",
"use_new_terminal": true,
},
{
"label": "Check All",
"command": "cargo clippy --all-targets",
"use_new_terminal": false,
},
{
"label": "Build All (Release)",
"command": "cargo build --release",
"use_new_terminal": false,
},
{
"label": "Publish Arch Package",
"command": "sh resources/publish.sh",
},
{
"label": "Run Tests",
"command": "cargo test",
"use_new_terminal": true,
},
{
"label": "Profile Emulator with perf",
"command": "cargo build --profile profiling; perf record -g -F 999 target/profiling/emulator; perf script -F +pid | save test.perf",
"use_new_terminal": true,
},
]
+9 -1
View File
@@ -1,7 +1,11 @@
cargo-features = ["codegen-backend"]
[workspace]
members = ["emulator", "common", "assembler", "dsa_editor", "compiler", "dsx-build"]
members = [
"core/dsa_common", "core/assembler", "core/compiler",
"emulator", "emulator/dsa_editor",
"dsx/dsx_server", "dsx/dsx_common", "dsx/dsx"
]
resolver = "3"
[workspace.package]
@@ -15,3 +19,7 @@ panic = "abort" # Cranelift does not support stack unwinds.
lto = false
debug = true
incremental = false # sccache does not support caching incremental crates.
[profile.release]
debug = true
lto = "fat"
+59
View File
@@ -0,0 +1,59 @@
# Maintainer: zxq5 <zxq5@proton.me>
pkgbase='damn-simple-architecture'
pkgname=('dsa' 'dsx' 'dsa-tools' 'dsx-server')
pkgver=0.1.1
pkgrel=1
startdir='.'
pkgdesc="Damn Simple Architecture"
arch=('x86_64')
url="https://git.zxq5.dev/zxq5/damn-simple-architecture"
license=('MIT')
makedepends=('rust' 'cargo' 'sed')
build() {
cargo build --release \
--bin dsa \
--bin dsx \
--bin dsa-a \
--bin dsa-c \
--bin dsx-server
}
package_dsa() {
pkgdesc="DSA core binary"
depends=()
install -Dm755 "$startdir/target/release/dsa" "$pkgdir/usr/bin/dsa"
install -Dm644 "$startdir/resources/dsa.desktop" \
"$pkgdir/usr/share/applications/dsa.desktop"
install -Dm644 "$startdir/resources/dsa.png" \
"$pkgdir/usr/share/icons/hicolor/256x256/apps/dsa.png"
}
package_dsx() {
pkgdesc="DSX client"
depends=('dsa')
install -Dm755 "$startdir/target/release/dsx" "$pkgdir/usr/bin/dsx"
}
package_dsa-tools() {
pkgdesc="DSA assembler and compiler tools"
depends=()
install -Dm755 "$startdir/target/release/dsa-a" "$pkgdir/usr/bin/dsa-a"
install -Dm755 "$startdir/target/release/dsa-c" "$pkgdir/usr/bin/dsa-c"
}
package_dsx-server() {
pkgdesc="DSX server"
depends=()
install -Dm755 "$startdir/target/release/dsx-server" "$pkgdir/usr/bin/dsx-server"
# Example sed usage — patch config paths for system install
sed -i 's|./templates|/usr/share/dsx-server/templates|g' \
"target/release/dsx-server" 2>/dev/null || true
install -Dm644 "$startdir/dsx_server/templates" "$pkgdir/usr/share/dsx-server/templates" 2>/dev/null || true
}
-50
View File
@@ -1,50 +0,0 @@
#![deny(
clippy::unwrap_used,
clippy::nursery,
clippy::perf,
clippy::pedantic,
clippy::complexity
)]
#![allow(
clippy::cast_possible_truncation,
clippy::missing_panics_doc,
clippy::missing_errors_doc,
clippy::match_wildcard_for_single_variants
)]
pub mod assembler;
pub mod image_builder;
pub mod tooling;
mod util;
pub mod prelude {
pub use crate::assembler::CompilerEngine;
pub use crate::image_builder;
pub use crate::tooling::brainf;
pub use crate::tooling::project;
}
use std::{fs, path::Path};
use num_cpus as _;
use threadpool as _;
use crate::prelude::CompilerEngine;
pub fn assemble_file(input: &str, output: &str) -> Result<(), std::io::Error> {
let mut engine = CompilerEngine::new();
engine.start_compilation(Path::new(input));
let result = engine.wait_for_result().unwrap();
let buffer: Vec<u8> = result
.iter()
.flat_map(|instruction| instruction.encode().to_be_bytes())
.collect();
if let Err(e) = fs::write(output, buffer) {
eprintln!("Failed to write to output file: {e}");
std::process::exit(1);
}
Ok(())
}
-108
View File
@@ -1,108 +0,0 @@
#![allow(dead_code)]
#![allow(unused)]
use std::{fmt, sync::mpsc::Sender};
pub struct Logger {}
impl Logger {
pub const fn new() -> Self {
Self {}
}
pub fn log(&self, message: &str) {
_ = self;
println!("\x1b[32mINFO:\x1b[0m {message}");
}
}
// #[derive(Debug)]=
// pub struct Logger {
// pub sender: Sender<Entry>,
// }
// impl Logger {
// pub fn new(sender: Sender<Entry>) -> Self {
// Self { sender }
// }
// pub fn debug<T: fmt::Display>(&self, message: T) {
// self.sender
// .send(Entry {
// etype: EntryType::Debug,
// message: message.to_string(),
// })
// .unwrap();
// }
// pub fn info<T: fmt::Display>(&self, message: T) {
// self.sender
// .send(Entry {
// etype: EntryType::Info,
// message: message.to_string(),
// })
// .unwrap();
// }
// pub fn warn<T: fmt::Display>(&self, message: T) {
// self.sender
// .send(Entry {
// etype: EntryType::Warn,
// message: message.to_string(),
// })
// .unwrap();
// }
// pub fn error<T: fmt::Display>(&self, message: T) {
// self.sender
// .send(Entry {
// etype: EntryType::Error,
// message: message.to_string(),
// })
// .unwrap();
// }
// pub fn fatal<T: fmt::Display>(&self, message: T) {
// self.sender
// .send(Entry {
// etype: EntryType::Fatal,
// message: message.to_string(),
// })
// .unwrap();
// }
// }
pub struct Entry {
etype: EntryType,
pub message: String,
}
#[derive(Copy, Clone, Eq, PartialEq)]
enum EntryType {
Debug,
Info,
Warn,
Error,
Fatal,
}
impl fmt::Display for EntryType {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(
f,
"{:<5}",
match self {
Self::Debug => "DEBUG",
Self::Info => "INFO",
Self::Warn => "WARN",
Self::Error => "ERROR",
Self::Fatal => "FATAL",
}
)
}
}
impl fmt::Display for Entry {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "{}: {}", self.etype, self.message)
}
}
+1
View File
@@ -0,0 +1 @@
disallowed-types = ["std::collections::HashMap", "std::collections::HashSet"]
-4
View File
@@ -1,4 +0,0 @@
// TODO: Use an actual logging or tracing library for pretty (scoped) output.
pub fn log(message: &str) {
println!("\x1b[32mINFO:\x1b[0m {message}");
}
-738
View File
@@ -1,738 +0,0 @@
use std::collections::HashMap;
use std::sync::atomic::AtomicU32;
use std::time::SystemTime;
use chrono::{DateTime, Local};
use super::registers::RegisterAllocator;
use crate::{block, comment, dsa};
use crate::model::{
BinaryOperator, CompilerError, ConstExpr, Declaration, Dependency, Expression,
Program, Statement, UnaryOperator, Variable,
};
pub struct CodeGenerator {
ast: Program,
imports: HashMap<String, String>,
globals: Vec<String>,
functions: Vec<String>,
symbols: Vec<String>,
allocator: RegisterAllocator,
}
fn import(name: &str, path: &str) -> String {
format!("include {name}: \"{}\"", path)
}
impl CodeGenerator {
const RET: &'static str = "\tjmp _ret";
pub fn new(ast: Program) -> Self {
CodeGenerator {
ast,
imports: HashMap::new(),
globals: Vec::new(),
functions: Vec::new(),
symbols: Vec::new(),
allocator: RegisterAllocator::new(),
}
}
pub fn include(&mut self, name: &str, path: &str) {
self.imports.insert(name.to_string(), path.to_string());
}
fn is_global(&self, name: &str) -> bool {
// Check if this variable is in the globals list
self.globals
.iter()
.any(|g| g.contains(&format!("dw {}:", name)))
}
pub fn generate(&mut self) -> Result<String, CompilerError> {
// always include the print library for debugging!
self.include("print", "./lib/io/print.dsa");
for block in self.ast.clone().declarations {
match block {
Declaration::Variable {
var: Variable { name, .. },
..
} => self.symbols.push(name),
Declaration::Function { name, .. } => self.symbols.push(name),
Declaration::Dependency(Dependency { name, .. }) => {
self.symbols.push(name)
}
}
}
for block in self.ast.clone().declarations {
self.generate_block(block.clone())?;
}
self.generate_layout()
}
fn generate_layout(&mut self) -> Result<String, CompilerError> {
let datetime: DateTime<Local> = SystemTime::now().into();
Ok(dsa![
"",
comment!("GENERATED BY DSC COMPILER"),
comment!(format!(
"Generated at {}",
datetime.format("%Y-%m-%d %H:%M:%S")
)),
"",
// imports
comment!("Imports"),
self.imports
.iter()
.map(|(k, v)| import(k, v))
.collect::<Vec<String>>()
.join("\n"),
"",
// reserved memory
comment!("Globals & Reserved Memory"),
self.globals.join("\n"),
"",
// entry point
comment!("Entry Point"),
"dw stack: 0x10000",
"db message: \"Process Exited with code:\"",
block! [ "_init"
dsa![ldw stack, bpr],
dsa![mov bpr, spr],
dsa![push zero],
dsa![call main],
dsa![call print::print_newline],
dsa![lwi message, rg0],
dsa![push rg0],
dsa![call print::print],
dsa![pop zero],
dsa![call print::print_hex_word],
dsa![pop zero],
dsa![hlt]
],
"",
comment!("Return"),
block! [ "_ret"
dsa![mov bpr, spr],
dsa![pop bpr],
dsa![return]
],
comment!("Compiled Code Starts..."),
// block! [ "main"
// dsa![push bpr],
// dsa![mov spr, bpr],
// dsa![lwi 67, rg1],
// dsa![stw rg1, spr, 8],
// dsa![mov bpr, spr],
// dsa![pop bpr],
// dsa![return]
// ],
self.functions.join("\n"),
])
}
fn generate_global(&mut self, name: &str, init: Option<ConstExpr>) {
self.globals.push(format!(
"dw {}: {}",
name,
init.unwrap_or(ConstExpr::Number(0))
))
}
fn generate_block(&mut self, block: Declaration) -> Result<(), CompilerError> {
match block {
Declaration::Variable { var, init, .. } => {
self.generate_global(&var.name, init)
}
Declaration::Function {
name, params, body, ..
} => {
let func = self.generate_function(&name, &params, &body).join("\n");
self.functions.push(format!("{func}\n"));
}
Declaration::Dependency(Dependency { name, path }) => {
self.imports.insert(name, path);
}
};
Ok(())
}
// Example: Generate code for a function
fn generate_function(
&mut self,
name: &str,
params: &[Variable],
body: &[Statement],
) -> Vec<String> {
let mut code = Vec::new();
// Reset allocator for new function
self.allocator.reset();
// Function prologue
code.push(format!("{}:", name));
code.push("\tpush bpr".to_string());
code.push("\tmov spr, bpr".to_string());
code.push(String::new());
// Allocate parameters to registers or stack locations
for (i, param) in params.iter().enumerate() {
let offset = 8 + (i as i32 * 4); // Parameters start at bpr+8
// Track that this parameter is at a stack location
let (reg, load_code) = self.allocator.alloc_var(&param.name).unwrap();
code.extend(load_code);
code.push(format!("\tldw bpr, {}, {}", reg, offset));
}
// Generate code for function body
for stmt in body {
let stmt_code = self.generate_statement(stmt).unwrap();
code.extend(stmt_code);
}
// automatically return at function end
if let Some(x) = code.last()
&& x == Self::RET
{
} else {
code.push(Self::RET.to_string());
}
code
}
// Example: Generate code for a statement
fn generate_statement(
&mut self,
stmt: &Statement,
) -> Result<Vec<String>, CompilerError> {
let mut code = Vec::new();
match stmt {
Statement::Declaration { var, value } => {
if let Some(expr) = value {
// Evaluate expression
let (result_reg, expr_code) = self.generate_expression(expr, true)?;
code.extend(expr_code);
// Store result in variable
let store_code = self.allocator.store_var(&var.name, &result_reg);
code.extend(store_code);
// Free temporary register
self.allocator.free_temp(&result_reg);
} else {
// Just declaring variable without initialization
self.allocator.alloc_var(&var.name)?;
}
}
Statement::Break => unimplemented!(),
Statement::Continue => unimplemented!(),
Statement::PtrWrite { ptr, value } => {
let (result_reg, expr_code) = self.generate_expression(value, true)?;
code.extend(expr_code);
let (ptr_reg, ptr_code) = self.generate_expression(ptr, true)?;
code.extend(ptr_code);
code.push(format!("\tstw {}, {}", result_reg, ptr_reg));
self.allocator.free_temp(&result_reg);
self.allocator.free_temp(&ptr_reg);
}
Statement::Assign { varname, value } => {
// Evaluate expression
let (result_reg, expr_code) = self.generate_expression(value, true)?;
code.extend(expr_code);
// Check if this is a global variable
if self.is_global(varname) {
// Store to global label
code.push(format!("\tstw {}, {}", result_reg, varname));
} else {
// Store result in local variable
let store_code = self.allocator.store_var(varname, &result_reg);
code.extend(store_code);
}
// Free temporary register
self.allocator.free_temp(&result_reg);
}
Statement::Return(expr) => {
if let Some(e) = expr {
let (result_reg, expr_code) = self.generate_expression(e, true)?;
code.extend(expr_code);
code.push(format!("\tstw {}, bpr, 8", result_reg));
code.push(format!("\tjmp _ret"));
self.allocator.free_temp(&result_reg);
}
}
Statement::If {
condition,
then_stmt,
else_stmt,
} => {
// Generate condition
let (cond_reg, cond_code) = self.generate_expression(condition, true)?;
code.extend(cond_code);
// Compare with zero
code.push(format!("\tcmp {}, zero", cond_reg));
self.allocator.free_temp(&cond_reg);
// Generate unique labels
let then_label = format!("_then_{}", self.get_unique_label());
let else_label = format!("_else_{}", self.get_unique_label());
let end_label = format!("_end_{}", self.get_unique_label());
// Jump to else if condition is false (equal to zero)
code.push(format!("\tjeq {}", else_label));
// Then block
code.push(format!("{}:", then_label));
for s in then_stmt {
code.extend(self.generate_statement(s)?);
}
if then_stmt.len() == 0 {
code.push("\tnop".to_string());
}
code.push(format!("\tjmp {}", end_label));
// Else block
code.push(format!("{}:", else_label));
for s in else_stmt {
code.extend(self.generate_statement(s)?);
}
if else_stmt.len() == 0 {
code.push("\tnop".to_string());
}
code.push(format!("{}:", end_label));
}
Statement::While { condition, body } => {
let loop_start = format!("_while_start_{}", self.get_unique_label());
let loop_end = format!("_while_end_{}", self.get_unique_label());
code.push(format!("{}:", loop_start));
// Generate condition
let (cond_reg, cond_code) = self.generate_expression(condition, true)?;
code.extend(cond_code);
code.push(format!("\tcmp {}, zero", cond_reg));
self.allocator.free_temp(&cond_reg);
code.push(format!("\tjeq {}", loop_end));
// Loop body
for s in body {
code.extend(self.generate_statement(s)?);
}
code.push(format!("\tjmp {}", loop_start));
code.push(format!("{}:", loop_end));
}
Statement::Loop(body) => {
let loop_start = format!("_loop_start_{}", self.get_unique_label());
code.push(format!("{}:", loop_start));
for s in body {
code.extend(self.generate_statement(s)?);
}
code.push(format!("\tjmp {}", loop_start));
}
Statement::Expression { expr } => {
let (result_reg, expr_code) = self.generate_expression(expr, false)?;
code.extend(expr_code);
self.allocator.free_temp(&result_reg);
}
Statement::Block(statements) => {
for s in statements {
code.extend(self.generate_statement(s)?);
}
}
}
Ok(code)
}
// Example: Generate code for an expression
// Returns (register containing result, assembly code)
fn generate_expression(
&mut self,
expr: &Expression,
use_result: bool,
) -> Result<(String, Vec<String>), CompilerError> {
let mut code = Vec::new();
// optimisation to prevent generating dead code!
if expr.is_pure() && !use_result {
return Ok((String::new(), code));
}
match expr {
Expression::StringLiteral(value) => {
let (reg, alloc_code) = self.allocator.alloc_temp()?;
code.extend(alloc_code);
// write string into memory
let uuid = self.get_unique_label();
code.push(format!("\tdb str_{uuid}: \"{value}\""));
// Load pointer to string
code.push(format!("\tlwi str_{uuid}, {reg}"));
Ok((reg, code))
}
Expression::CharLiteral(value) => {
let (reg, alloc_code) = self.allocator.alloc_temp()?;
code.extend(alloc_code);
// Load immediate value
code.push(format!("\tlli {}, {} // '{value}'", *value as u8, reg));
Ok((reg, code))
}
Expression::Number(value) => {
let (reg, alloc_code) = self.allocator.alloc_temp()?;
code.extend(alloc_code);
// Load immediate value
code.push(format!("\tlli {}, {}", value & 0xFFFF, reg));
if *value > 0xFFFF || *value < 0 {
code.push(format!("\tlui {}, {}", (value >> 16) & 0xFFFF, reg));
}
Ok((reg, code))
}
Expression::Variable { name, .. } => {
if self.is_global(&name.name) {
// Allocate a temporary register for the global
let (reg, alloc_code) = self.allocator.alloc_temp()?;
code.extend(alloc_code);
// Load from global label
code.push(format!("\tldw {}, {}", name.name, reg));
Ok((reg, code))
} else {
// Local variable - use existing allocator logic
let (reg, load_code) = self.allocator.load_var(&name.name)?;
code.extend(load_code);
Ok((reg, code))
}
}
Expression::Binary { op, left, right } => {
// Evaluate left operand
let (left_reg, left_code) = self.generate_expression(left, true)?;
code.extend(left_code);
// Evaluate right operand
let (right_reg, right_code) = self.generate_expression(right, true)?;
code.extend(right_code);
// Allocate result register
let (result_reg, result_alloc) = self.allocator.alloc_temp()?;
code.extend(result_alloc);
// Generate operation
match op {
BinaryOperator::Add => {
code.push(format!(
"\tadd {}, {}, {}",
left_reg, right_reg, result_reg
));
}
BinaryOperator::Sub => {
code.push(format!(
"\tsub {}, {}, {}",
left_reg, right_reg, result_reg
));
}
BinaryOperator::Mul => {
self.include("maths", "./lib/maths/core.dsa");
// Call multiply function
code.push(format!("\tpush {}", right_reg));
code.push(format!("\tpush {}", left_reg));
code.push("\tcall maths::multiply".to_string());
code.push(format!("\tpop {}", result_reg));
code.push("\tpop zero".to_string());
}
// Comparison operators - return 1 (true) or 0 (false)
BinaryOperator::Eq => {
code.push(format!("\tcmp {}, {}", left_reg, right_reg));
code.push(format!("\tlli 0, {}", result_reg));
let end_label = format!("_cmp_end_{}", self.get_unique_label());
code.push(format!("\tjne {}", end_label)); // If not equal, skip setting to 1
code.push(format!("\tlli 1, {}", result_reg));
code.push(format!("{}:", end_label));
}
BinaryOperator::Ne => {
code.push(format!("\tcmp {}, {}", left_reg, right_reg));
code.push(format!("\tlli 0, {}", result_reg));
let end_label = format!("_cmp_end_{}", self.get_unique_label());
code.push(format!("\tjeq {}", end_label)); // If equal, skip setting to 1
code.push(format!("\tlli 1, {}", result_reg));
code.push(format!("{}:", end_label));
}
BinaryOperator::Lt => {
code.push(format!("\tcmp {}, {}", left_reg, right_reg));
code.push(format!("\tlli 0, {}", result_reg));
let end_label = format!("_cmp_end_{}", self.get_unique_label());
code.push(format!("\tjge {}", end_label)); // If greater or equal, skip setting to 1
code.push(format!("\tlli 1, {}", result_reg));
code.push(format!("{}:", end_label));
}
BinaryOperator::Le => {
code.push(format!("\tcmp {}, {}", left_reg, right_reg));
code.push(format!("\tlli 0, {}", result_reg));
let end_label = format!("_cmp_end_{}", self.get_unique_label());
code.push(format!("\tjgt {}", end_label)); // If greater than, skip setting to 1
code.push(format!("\tlli 1, {}", result_reg));
code.push(format!("{}:", end_label));
}
BinaryOperator::Gt => {
code.push(format!("\tcmp {}, {}", left_reg, right_reg));
code.push(format!("\tlli 0, {}", result_reg));
let end_label = format!("_cmp_end_{}", self.get_unique_label());
code.push(format!("\tjle {}", end_label)); // If less or equal, skip setting to 1
code.push(format!("\tlli 1, {}", result_reg));
code.push(format!("{}:", end_label));
}
BinaryOperator::Ge => {
code.push(format!("\tcmp {}, {}", left_reg, right_reg));
code.push(format!("\tlli 0, {}", result_reg));
let end_label = format!("_cmp_end_{}", self.get_unique_label());
code.push(format!("\tjlt {}", end_label)); // If less than, skip setting to 1
code.push(format!("\tlli 1, {}", result_reg));
code.push(format!("{}:", end_label));
}
_ => unimplemented!(),
}
// Free operand registers (allocator will protect variables)
self.allocator.free_temp(&left_reg);
self.allocator.free_temp(&right_reg);
Ok((result_reg, code))
}
Expression::Call { name, args } => {
// first evaluate all the args we're going to need
let mut arg_regs = Vec::new();
for arg in args.iter().rev() {
let (arg_reg, arg_code) = self.generate_expression(arg, true)?;
code.extend(arg_code);
arg_regs.push(arg_reg);
}
// Save caller-saved registers and track which ones we saved
// old method, inefficient.
// let saved_regs = self.allocator.get_caller_saved_registers();
// for reg in &saved_regs {
// code.push(format!("\tpush {}", reg));
// }
// Save caller-saved registers and track which ones we saved
let saved_regs = self.allocator.get_caller_saved_registers();
for reg in &saved_regs {
// spill variables to stack
code.extend(self.allocator.spill_register(reg).unwrap());
}
// Evaluate and push arguments in reverse order
for (i, arg_reg) in arg_regs.iter().enumerate() {
code.push(format!(
"\tpush {} // push arg {}",
arg_reg,
args.len() - 1 - i
));
}
// if GLOBAL_METHODS.contains_key(name.name.as_str()) {
// code.push(format!("\tcall {}",
// GLOBAL_METHODS[name.name.as_str()])); } else
if self.symbols.contains(&name.name) {
// Call local function
code.push(format!("\tcall {}", name));
} else if let Some(ns) = name.namespace.clone()
&& self.imports.contains_key(&ns)
{
code.push(format!("\tcall {}", name));
} else {
return Err(CompilerError::Undefined(name.clone()));
}
let result_reg: String;
if use_result {
let (temp_result_reg, result_alloc) = self.allocator.alloc_temp()?;
result_reg = temp_result_reg;
code.extend(result_alloc);
code.push(format!("\tpop {}", result_reg));
// Clean up arguments
if args.len() > 1 {
for _ in 0..(args.len() - 1) {
code.push("\tpop zero".to_string());
}
}
} else {
result_reg = "zero".to_string();
// Clean up arguments
if args.len() > 0 {
for _ in 0..(args.len()) {
code.push("\tpop zero".to_string());
}
}
}
// Restore caller-saved registers in reverse order (LIFO)
// for reg in saved_regs.iter().rev() {
// code.push(format!("\tpop {}", reg));
// }
// Free argument registers
for reg in arg_regs {
self.allocator.free_temp(&reg);
}
Ok((result_reg, code))
}
Expression::Unary { op, operand } => {
let (operand_reg, operand_code) =
self.generate_expression(operand, true)?;
code.extend(operand_code);
let (result_reg, result_alloc) = self.allocator.alloc_temp()?;
code.extend(result_alloc);
match op {
UnaryOperator::Minus => {
// Negate: result = 0 - operand
code.push(format!("\tsub zero, {}, {}", operand_reg, result_reg));
}
UnaryOperator::Plus => {
// Just move
code.push(format!("\tmov {}, {}", operand_reg, result_reg));
}
UnaryOperator::Dereference => {
code.push(format!("\tldw {}, {}", operand_reg, result_reg));
}
UnaryOperator::Reference => {
code.extend(self.allocator.spill_register(&operand_reg)?);
code.push(format!(
"\tsubi bpr {} {}",
-(4 + self.allocator.get_stack_offset()),
result_reg
))
}
}
self.allocator.free_temp(&operand_reg);
Ok((result_reg, code))
}
Expression::Empty => Ok(("zero".to_string(), code)),
}
}
// Helper for generating unique labels
fn get_unique_label(&mut self) -> String {
// You'd implement a counter here
static COUNTER: AtomicU32 = AtomicU32::new(0);
let val = COUNTER.fetch_add(1, std::sync::atomic::Ordering::SeqCst);
(val + 1).to_string()
}
}
/// Build a single string from any number of arguments.
/// Each argument must implement `Display` or be convertible to a string.
#[macro_export]
macro_rules! dsa {
($($arg:expr),* $(,)?) => {{
// Start with an empty String well grow it as we go.
use std::fmt::Write;
let mut s = ::std::string::String::new();
$(
// `write!` is cheaper than `format!` for each element
// because it reuses the same buffer.
write!(s, "{}\n", $arg).expect("write to String failed");
)*
s
}};
}
// ──────────────────────── dsa! ────────────────────────
// A tiny helper that just turns its tokenstream into a string.
// The trailing comma is kept its part of the syntax you want.
#[macro_export]
macro_rules! cmd {
($($tokens:tt)*) => {{
// Well just stringify the tokens and return a String.
format!("{}", concat!(stringify!($tokens), "\n"))
}};
}
// ──────────────────────── block! ────────────────────────
// Usage:
//
// let asm = block![ "name"
// dsa![mov rg0, rg1],
// dsa![add rg1, rg1]
// ];
//
// `asm` is a `&'static str` containing:
//
// name:
// mov rg0, rg1
// add rg1, rg1
//
#[macro_export]
macro_rules! block {
// The first token must be a string literal thats the label.
($label:literal $(dsa![$($ins:tt)*]),* ) => {{
// Build a single string at compile time.
const CODE: &str = concat!(
$label, ":\n",
// Each `dsa!` call yields a string like `"mov rg0, rg1"`.
// We add a newline after each one to get the desired layout.
$(concat!("\t", stringify!($($ins)*), "\n")),*
);
CODE
}};
}
#[macro_export]
macro_rules! comment {
($text:expr) => {{ format!("// {}", $text) }};
}
-9
View File
@@ -1,9 +0,0 @@
use crate::model::{CompilerError, Program};
mod codegen;
mod registers;
pub fn generate_code(ast: &Program) -> Result<String, CompilerError> {
let mut codegen = codegen::CodeGenerator::new(ast.clone());
codegen.generate()
}
-348
View File
@@ -1,348 +0,0 @@
use std::collections::HashMap;
use crate::model::CompilerError;
/// Register allocator for DSA assembly generation
/// Manages general-purpose registers (rg0-rgf) and handles stack spilling
pub struct RegisterAllocator {
/// Available general-purpose registers
available_registers: Vec<String>,
/// Maps variable names to their current location (register or stack offset)
variable_locations: HashMap<String, Location>,
/// Maps registers to the variables they currently hold
register_contents: HashMap<String, String>,
/// Current stack offset for local variables (relative to bpr)
/// Starts at -4 (going downward from base pointer)
stack_offset: i32,
/// Track which registers are currently in use
in_use: HashMap<String, bool>,
}
#[derive(Debug, Clone)]
pub enum Location {
Register(String),
Stack(i32), // offset from bpr
}
impl RegisterAllocator {
pub fn new() -> Self {
// Initialize with available GP registers (rg0-rgf = 16 registers)
let registers = vec![
"rg0", "rg1", "rg2", "rg3", "rg4", "rg5", "rg6", "rg7", "rg8", "rg9", "rga",
"rgb", "rgc", "rgd", "rge", "rgf",
]
.into_iter()
.map(String::from)
.collect();
RegisterAllocator {
available_registers: registers,
variable_locations: HashMap::new(),
register_contents: HashMap::new(),
stack_offset: -4, // Start at -4 (first local below saved bpr)
in_use: HashMap::new(),
}
}
/// Allocate a temporary register for expression evaluation
/// Returns the register name and optionally assembly code to save it
pub fn alloc_temp(&mut self) -> Result<(String, Vec<String>), CompilerError> {
let mut code = Vec::new();
// Try to find an unused register
for reg in &self.available_registers {
if !self.in_use.get(reg).unwrap_or(&false) {
self.in_use.insert(reg.clone(), true);
return Ok((reg.clone(), code));
}
}
// All registers in use - need to spill one
// Choose the first register with a variable we can spill
// Find a register to spill
let reg_to_spill = self
.available_registers
.iter()
.find(|reg| self.register_contents.contains_key(*reg))
.cloned();
if let Some(reg) = reg_to_spill {
// Spill this variable to stack
let spill_code = self.spill_register(&reg)?;
code.extend(spill_code);
self.in_use.insert(reg.clone(), true);
return Ok((reg, code));
}
Err(CompilerError::Generic(
"All registers are used up yet there are no variables to spill to the stack"
.to_string(),
))
}
/// Free a temporary register after use
/// NOTE: This will NOT free registers that contain variables!
/// Variables persist throughout their scope and must not be freed
pub fn free_temp(&mut self, reg: &str) {
// Check if this register contains a variable
if self.register_contents.contains_key(reg) {
// This register holds a variable - don't free it!
// Variables are only freed when they go out of scope via free_var()
return;
}
// This is a true temporary - safe to free
self.in_use.insert(reg.to_string(), false);
}
/// Allocate a register for a named variable
/// Returns the register and any necessary assembly code
pub fn alloc_var(
&mut self,
var_name: &str,
) -> Result<(String, Vec<String>), CompilerError> {
if let Some(location) = self.variable_locations.get(var_name).cloned() {
match location {
Location::Register(reg) => {
return Ok((reg.clone(), Vec::new()));
}
Location::Stack(offset) => {
// Variable was pushed, need to calculate actual position
let (reg, mut code) = self.alloc_temp()?;
// Load from bpr + offset (offset is negative)
code.push(format!("\tsubi bpr {} {}", -(offset + 4), reg));
code.push(format!(
"\tldw {}, {} // bpr{}: {}",
reg,
reg,
offset - 4,
var_name
));
// Update location to register
self.variable_locations
.insert(var_name.to_string(), Location::Register(reg.clone()));
self.register_contents
.insert(reg.clone(), var_name.to_string());
return Ok((reg, code));
}
}
}
// Variable doesn't have a location yet, allocate a new register
let (reg, code) = self.alloc_temp()?;
self.variable_locations
.insert(var_name.to_string(), Location::Register(reg.clone()));
self.register_contents
.insert(reg.clone(), var_name.to_string());
Ok((reg, code))
}
/// Get the current location of a variable
pub fn _get_var_location(&self, var_name: &str) -> Option<&Location> {
self.variable_locations.get(var_name)
}
/// Load a variable into a register (allocating if necessary)
/// Returns the register and assembly code to load it
pub fn load_var(
&mut self,
var_name: &str,
) -> Result<(String, Vec<String>), CompilerError> {
self.alloc_var(var_name)
}
/// Store a value from a register into a variable
/// Updates tracking and returns any necessary assembly code
pub fn store_var(&mut self, var_name: &str, source_reg: &str) -> Vec<String> {
let mut code = Vec::new();
// Check if variable already has a location
if let Some(location) = self.variable_locations.get(var_name) {
match location {
Location::Register(dest_reg) => {
if dest_reg != source_reg {
code.push(format!(
"\tmov {}, {} // var {}",
source_reg, dest_reg, var_name
));
}
}
Location::Stack(offset) => {
code.push(format!(
"\tstw {}, bpr, {} // var {}",
source_reg, offset, var_name
));
}
}
} else {
// Variable doesn't exist yet, we can just use the same reg.
// self.variable_locations.insert(
// var_name.to_string(),
// Location::Register(source_reg.to_string()),
// );
// self.register_contents
// .insert(source_reg.to_string(), var_name.to_string());
// self.in_use.insert(source_reg.to_string(), true);
let source_reg = source_reg.to_string();
// if we can avoid a move, absolutely do that.
if self.available_registers.contains(&source_reg) {
self.variable_locations
.insert(var_name.to_string(), Location::Register(source_reg.clone()));
self.register_contents
.insert(source_reg.clone(), var_name.to_string());
self.in_use.insert(source_reg, true);
} else if let Some(free_reg) = self.find_free_register() {
code.push(format!("\tmov {}, {}", source_reg, free_reg));
self.variable_locations
.insert(var_name.to_string(), Location::Register(free_reg.clone()));
self.register_contents
.insert(free_reg.clone(), var_name.to_string());
self.in_use.insert(free_reg, true);
} else {
// No free registers - allocate on stack
// code.push(format!("\tstw {}, bpr, {}", source_reg, self.stack_offset));
// self.variable_locations
// .insert(var_name.to_string(), Location::Stack(self.stack_offset));
// self.stack_offset -= 4; // Move to next stack slot
//
todo!(
"we should spill other registers and keep this variable on the stack as it's more recent!"
);
}
}
code
}
/// Spill a register to the stack
/// Returns assembly code to perform the spill
pub fn spill_register(&mut self, reg: &str) -> Result<Vec<String>, CompilerError> {
let mut code = Vec::new();
if let Some(var_name) = self.register_contents.get(reg).cloned() {
// PUSH register to stack (spr decrements automatically)
code.push(format!(
"\tpush {} // bpr{}: {}",
reg, self.stack_offset, var_name
));
// Track that we pushed one word
self.stack_offset -= 4;
// Update variable location - it's now at current spr
// Note: We track offset from bpr for consistency
self.variable_locations
.insert(var_name.clone(), Location::Stack(self.stack_offset));
// Remove from register tracking
self.register_contents.remove(reg);
}
Ok(code)
}
/// Find a free register (not currently in use)
fn find_free_register(&self) -> Option<String> {
for reg in &self.available_registers {
if !self.in_use.get(reg).unwrap_or(&false) {
return Some(reg.clone());
}
}
None
}
/// Spill all registers to stack (useful before function calls)
pub fn _spill_all(&mut self) -> Vec<String> {
let mut code = Vec::new();
let regs_to_spill: Vec<String> = self.register_contents.keys().cloned().collect();
for reg in regs_to_spill {
if let Ok(spill_code) = self.spill_register(&reg) {
code.extend(spill_code);
}
}
code
}
/// Get the total stack offset
pub fn get_stack_offset(&self) -> i32 {
self.stack_offset
}
/// Get the total stack space needed for local variables
pub fn _get_stack_size(&self) -> i32 {
-self.stack_offset // Convert negative offset to positive size
}
/// Reset allocator for a new function
pub fn reset(&mut self) {
self.variable_locations.clear();
self.register_contents.clear();
self.stack_offset = -4;
self.in_use.clear();
}
/// Mark a variable as dead (no longer needed)
/// Frees its register if it's in one
pub fn _free_var(&mut self, var_name: &str) {
if let Some(Location::Register(reg)) = self.variable_locations.get(var_name) {
let reg = reg.clone();
self.register_contents.remove(&reg);
self.in_use.insert(reg, false);
}
self.variable_locations.remove(var_name);
}
/// Get list of registers that contain variables and are in use
/// These need to be saved before function calls
pub fn get_caller_saved_registers(&self) -> Vec<String> {
self.register_contents
.iter()
.filter(|(reg, _)| *self.in_use.get(*reg).unwrap_or(&false))
.map(|(reg, _)| reg.clone())
.collect()
}
/// Save caller-saved registers before a function call
/// Returns assembly code to save them
pub fn _save_caller_saved(&mut self) -> Vec<String> {
let mut code = Vec::new();
// For simplicity, save all currently used registers
// In a more sophisticated compiler, you'd only save registers that are live
for (reg, _) in self.register_contents.clone() {
if *self.in_use.get(&reg).unwrap_or(&false) {
code.push(format!("\tpush {}", reg));
}
}
code
}
/// Restore caller-saved registers after a function call
/// Returns assembly code to restore them
pub fn _restore_caller_saved(&mut self, saved_regs: &[String]) -> Vec<String> {
let mut code = Vec::new();
// Restore in reverse order (LIFO)
for reg in saved_regs.iter().rev() {
code.push(format!("\tpop {}", reg));
}
code
}
}
-627
View File
@@ -1,627 +0,0 @@
use std::iter::Peekable;
use std::str::Chars;
#[derive(Debug, PartialEq, Clone)]
pub enum Token {
// Keywords
Fn,
Let,
If,
Else,
Loop,
While,
Break,
Return,
Continue,
Include,
Static,
Const,
// Identifiers and literals
Identifier(Name),
String(String),
Integer(u64),
Char(char),
// Symbols
LeftParen, // (
RightParen, // )
LeftBrace, // {
RightBrace, // }
Semicolon, // ;
Colon, // :
Comma, // ,
// Operators
Plus, // +
Minus, // -
Star, // *
Amphersand, // &
Slash, // /
Assign, // =
EqualEqual, // ==
Bang, // !
BangEqual, // !=
Less, // <
LessEqual, // <=
Greater, // >
GreaterEqual, // >=
RightArrow, // ->
// Special
Eof,
}
use std::fmt;
use crate::model::Name;
impl fmt::Display for Name {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
if let Some(ref ns) = self.namespace {
write!(f, "{}::{}", ns, self.name)
} else {
write!(f, "{}", self.name)
}
}
}
impl Token {
pub fn tt(&self) -> &str {
match self {
Token::Const => "Const",
Token::Static => "Static",
Token::Include => "Include",
Token::Fn => "Fn",
Token::If => "If",
Token::Let => "Let",
Token::Else => "Else",
Token::Loop => "Loop",
Token::While => "While",
Token::Break => "Break",
Token::Return => "Return",
Token::Continue => "Continue",
Token::Identifier(_) => "Identifier",
Token::String(_) => "String",
Token::Integer(_) => "UnsignedInt",
Token::Char(_) => "Char",
Token::LeftParen => "LeftParen",
Token::RightParen => "RightParen",
Token::LeftBrace => "LeftBrace",
Token::RightBrace => "RightBrace",
Token::Semicolon => "Semicolon",
Token::Colon => "Colon",
Token::Comma => "Comma",
Token::RightArrow => "RightArrow",
Token::Plus => "Plus",
Token::Minus => "Minus",
Token::Star => "Star",
Token::Amphersand => "Amphersand",
Token::Slash => "Slash",
Token::Assign => "Assign",
Token::EqualEqual => "EqualEqual",
Token::Bang => "Bang",
Token::BangEqual => "BangEqual",
Token::Less => "Less",
Token::LessEqual => "LessEqual",
Token::Greater => "Greater",
Token::GreaterEqual => "GreaterEqual",
Token::Eof => "Eof",
}
}
}
#[derive(Debug)]
pub struct Lexer<'a> {
chars: Peekable<Chars<'a>>,
current: Option<char>,
line: usize,
}
impl<'a> Lexer<'a> {
pub fn new(input: &'a str) -> Self {
let mut chars = input.chars().peekable();
let current = chars.next();
Lexer {
chars,
current,
line: 1,
}
}
fn advance(&mut self) -> Option<char> {
self.current = self.chars.next();
self.current
}
fn peek(&mut self) -> Option<&char> {
self.chars.peek()
}
fn skip_whitespace(&mut self) {
while let Some(c) = self.current {
if !c.is_whitespace() {
break;
}
if c == '\n' {
self.line += 1;
}
self.advance();
}
}
fn skip_line_comment(&mut self) {
// Skip the two slashes
self.advance(); // first /
self.advance(); // second /
// Skip until newline or EOF
while let Some(c) = self.current {
if c == '\n' {
self.line += 1;
self.advance();
break;
}
self.advance();
}
}
fn skip_block_comment(&mut self) -> Result<(), String> {
// Skip the /*
self.advance(); // /
self.advance(); // *
let start_line = self.line;
// Look for */
while let Some(c) = self.current {
if c == '\n' {
self.line += 1;
}
if c == '*' {
if let Some(&next) = self.peek() {
if next == '/' {
self.advance(); // *
self.advance(); // /
return Ok(());
}
}
}
self.advance();
}
Err(format!(
"Unterminated block comment starting at line {}",
start_line
))
}
fn skip_whitespace_and_comments(&mut self) {
loop {
self.skip_whitespace();
// Check for comments
if let Some('/') = self.current {
if let Some(&next) = self.peek() {
match next {
'/' => {
self.skip_line_comment();
continue;
}
'*' => {
if let Err(e) = self.skip_block_comment() {
eprintln!("Lexer error: {}", e);
}
continue;
}
_ => break,
}
}
}
break;
}
}
fn read_identifier(&mut self) -> String {
let mut ident = String::new();
// Include the current character if it's valid
if let Some(c) = self.current {
if c.is_alphabetic() || c == '_' {
ident.push(c);
}
}
// Read remaining characters
while let Some(&c) = self.peek() {
if c.is_alphanumeric() || c == '_' {
self.advance();
ident.push(c);
} else {
break;
}
}
ident
}
fn keyword_or_identifier(&mut self) -> Token {
let first_ident = self.read_identifier();
// Check if it's a keyword first (keywords can't have namespaces)
let keyword = match first_ident.as_str() {
"fn" => Some(Token::Fn),
"if" => Some(Token::If),
"else" => Some(Token::Else),
"while" => Some(Token::While),
"loop" => Some(Token::Loop),
"break" => Some(Token::Break),
"return" => Some(Token::Return),
"continue" => Some(Token::Continue),
"include" => Some(Token::Include),
"let" => Some(Token::Let),
"const" => Some(Token::Const),
"static" => Some(Token::Static),
_ => None,
};
if let Some(kw) = keyword {
return kw;
}
// Not a keyword - check for namespace separator (::)
// We need to peek TWO characters ahead without consuming anything
if let Some(&':') = self.peek() {
// We see one colon, but we need to check if there's another one after it
// We can't peek two ahead directly, so we need a different approach
// Save the current position by using a temporary peekable iterator
// Actually, we can't do that easily. Instead, let's just check:
// If we see ':', temporarily advance and check the next char
// Create a temporary check
let mut temp_chars = self.chars.clone();
let _ = temp_chars.next(); // This is the ':' we already saw
let second_peek = temp_chars.peek();
if let Some(&':') = second_peek {
// It's :: - consume both colons
self.advance(); // consume first :
self.advance(); // consume second :
// Read the second identifier (the actual name)
let second_ident = self.read_identifier();
// Return namespaced identifier
return Token::Identifier(Name {
namespace: Some(first_ident),
name: second_ident,
});
}
// else: It's a single colon (type annotation) - DON'T consume it
// Just fall through and return the identifier
}
// No namespace separator - just a regular identifier
Token::Identifier(Name {
namespace: None,
name: first_ident,
})
}
fn read_number(&mut self) -> Result<u64, String> {
let current = self.current.unwrap();
// Check for hex (0x) or binary (0b) prefix
if current == '0' {
if let Some(&next_char) = self.peek() {
match next_char {
'x' | 'X' => {
self.advance(); // consume '0'
self.advance(); // consume 'x'
return self.read_hex_number();
}
'b' | 'B' => {
self.advance(); // consume '0'
self.advance(); // consume 'b'
return self.read_binary_number();
}
_ => {}
}
}
}
// Read decimal number
self.read_decimal_number()
}
fn read_decimal_number(&mut self) -> Result<u64, String> {
let mut num_str = String::new();
if let Some(c) = self.current {
num_str.push(c);
}
while let Some(&c) = self.peek() {
if c.is_ascii_digit() {
self.advance();
num_str.push(c);
} else {
break;
}
}
num_str
.parse::<u64>()
.map_err(|_| format!("Invalid decimal number: {}", num_str))
}
fn read_hex_number(&mut self) -> Result<u64, String> {
let mut num_str = String::new();
// Read current character if it's a hex digit
if let Some(c) = self.current {
if c.is_ascii_hexdigit() {
num_str.push(c);
}
}
while let Some(&c) = self.peek() {
if c.is_ascii_hexdigit() {
self.advance();
num_str.push(c);
} else {
break;
}
}
if num_str.is_empty() {
return Err("Invalid hexadecimal number: no digits after 0x".to_string());
}
u64::from_str_radix(&num_str, 16)
.map_err(|_| format!("Invalid hexadecimal number: {}", num_str))
}
fn read_binary_number(&mut self) -> Result<u64, String> {
let mut num_str = String::new();
// Read current character if it's a binary digit
if let Some(c) = self.current {
if c == '0' || c == '1' {
num_str.push(c);
}
}
while let Some(&c) = self.peek() {
if c == '0' || c == '1' {
self.advance();
num_str.push(c);
} else {
break;
}
}
if num_str.is_empty() {
return Err("Invalid binary number: no digits after 0b".to_string());
}
u64::from_str_radix(&num_str, 2)
.map_err(|_| format!("Invalid binary number: {}", num_str))
}
fn read_string(&mut self) -> Result<String, String> {
self.advance(); // Skip the opening quote
let mut s = String::new();
while let Some(c) = self.current {
if c == '"' {
return Ok(s);
}
// Handle escape sequences
if c == '\\' {
self.advance();
if let Some(escaped) = self.current {
let escaped_char = match escaped {
'n' => '\n',
't' => '\t',
'r' => '\r',
'\\' => '\\',
'"' => '"',
_ => escaped, // For now, just use the character as-is
};
s.push(escaped_char);
} else {
return Err("Unexpected end of string after escape".to_string());
}
} else {
s.push(c);
}
self.advance();
}
Err("Unterminated string literal".to_string())
}
fn match_next(&mut self, expected: char) -> bool {
match self.peek() {
Some(&c) if c == expected => {
self.advance();
true
}
_ => false,
}
}
fn scan_single_char_token(&mut self, c: char) -> Option<Token> {
match c {
'(' => Some(Token::LeftParen),
')' => Some(Token::RightParen),
'{' => Some(Token::LeftBrace),
'}' => Some(Token::RightBrace),
';' => Some(Token::Semicolon),
',' => Some(Token::Comma),
'&' => Some(Token::Amphersand),
'+' => Some(Token::Plus),
'*' => Some(Token::Star),
_ => None,
}
}
fn scan_operator(&mut self, c: char) -> Option<Token> {
match c {
'-' => Some(if self.match_next('>') {
Token::RightArrow
} else {
Token::Minus
}),
'!' => Some(if self.match_next('=') {
Token::BangEqual
} else {
Token::Bang
}),
'=' => Some(if self.match_next('=') {
Token::EqualEqual
} else {
Token::Assign
}),
'<' => Some(if self.match_next('=') {
Token::LessEqual
} else {
Token::Less
}),
'>' => Some(if self.match_next('=') {
Token::GreaterEqual
} else {
Token::Greater
}),
':' => {
// Single colon (for type annotations)
// Note: :: is handled in keyword_or_identifier for namespaces
Some(Token::Colon)
}
'/' => {
// Check if it's a comment or division
if let Some(&next) = self.peek() {
if next == '/' || next == '*' {
// It's a comment, don't consume it here
// Let skip_whitespace_and_comments handle it
None
} else {
Some(Token::Slash)
}
} else {
Some(Token::Slash)
}
}
_ => None,
}
}
pub fn next_token(&mut self) -> Token {
self.skip_whitespace_and_comments();
let Some(c) = self.current else {
return Token::Eof;
};
// Try single-character tokens first
if let Some(token) = self.scan_single_char_token(c) {
self.advance();
return token;
}
// Try operators (may be multi-character)
if let Some(token) = self.scan_operator(c) {
self.advance();
return token;
}
// Char literals
if c == '\'' {
let mut value = ' ';
self.advance();
if let Some(ch) = self.current {
value = ch;
self.advance();
}
if self.current == Some('\'') {
self.advance();
return Token::Char(value);
}
eprintln!("Lexer error on line {}: Invalid char literal", self.line);
}
// String literals
if c == '"' {
let token = match self.read_string() {
Ok(s) => Token::String(s),
Err(e) => {
eprintln!("Lexer error on line {}: {}", self.line, e);
// Skip to next quote or end
while let Some(ch) = self.current {
if ch == '"' || ch == '\n' {
break;
}
self.advance();
}
Token::String(String::new())
}
};
self.advance();
return token;
}
// Identifiers and keywords (including namespaced identifiers)
if c.is_alphabetic() || c == '_' {
let token = self.keyword_or_identifier();
self.advance();
return token;
}
// Numbers (decimal, hex, binary)
if c.is_ascii_digit() {
let token = match self.read_number() {
Ok(num) => Token::Integer(num),
Err(e) => {
eprintln!("Lexer error on line {}: {}", self.line, e);
// Skip invalid number
while let Some(&ch) = self.peek() {
if !ch.is_alphanumeric() {
break;
}
self.advance();
}
Token::Integer(0)
}
};
self.advance();
return token;
}
// Unknown character - skip it
eprintln!(
"Lexer warning on line {}: Skipping unknown character '{}'",
self.line, c
);
self.advance();
self.next_token()
}
}
impl<'a> Iterator for Lexer<'a> {
type Item = Token;
fn next(&mut self) -> Option<Self::Item> {
match self.next_token() {
Token::Eof => None,
token => Some(token),
}
}
}
-604
View File
@@ -1,604 +0,0 @@
use super::lexer::Token;
use crate::model::{
BinaryOperator, Block, CompilerError, ConstExpr, Declaration, Dependency, Expression,
Program, Statement, TypeId, UnaryOperator, Variable,
};
use crate::{expect_tt, expect_value};
use std::ops::{ControlFlow, FromResidual, Try};
#[derive(Debug, Clone)]
pub enum ParseResult<T, E> {
Accept(T),
Deny,
Reject(E),
}
pub struct Parser {
tokens: Vec<Token>,
idx: usize,
}
impl Parser {
pub fn new(tokens: Vec<Token>) -> Self {
Self { tokens, idx: 0 }
}
pub fn parse(&mut self) -> ParseResult<Program, CompilerError> {
let mut declarations = Vec::new();
while let ParseResult::Accept(_) = self.peek_next() {
declarations.push(self.parse_declaration()?);
}
ParseResult::Accept(Program { declarations })
}
fn parse_declaration(&mut self) -> ParseResult<Declaration, CompilerError> {
if expect_tt!(self.peek_next()?, Fn).accepted() {
return self.parse_func();
}
if expect_tt!(self.peek_next()?, Include).accepted() {
// expect include keyword
let _ = self.next();
// expect namespace identifier
let name = expect_value!(self.next()?, Identifier)?;
// expect colon
let _ = expect_tt!(self.next()?, Colon)?;
// expect string literal (module path)
let path = expect_value!(self.next()?, String)?;
// expect semicolon
let _ = expect_tt!(self.next()?, Semicolon)?;
return ParseResult::Accept(Declaration::Dependency(Dependency {
name: name.name,
path,
}));
}
if expect_tt!(self.peek_next()?, Const, Static).accepted() {
let is_const = match self.next()? {
Token::Const => true,
Token::Static => false,
_ => {
return ParseResult::Reject(CompilerError::Generic(String::from(
"This can't happen!",
)));
}
};
let var = self.parse_var_decl()?;
let _ = expect_tt!(self.next()?, Assign)?;
let value = self.next()?;
let init = match value {
Token::String(x) => Some(ConstExpr::String(x)),
Token::Integer(x) => Some(ConstExpr::Number(x as i32)),
_ => {
return ParseResult::Reject(CompilerError::UnexpectedToken(
value.tt().to_string(),
));
}
};
let _ = expect_tt!(self.next()?, Semicolon)?;
return ParseResult::Accept(Declaration::Variable {
var,
init,
is_const,
});
}
ParseResult::Reject(CompilerError::UnexpectedEndOfInput)
}
fn parse_func(&mut self) -> ParseResult<Declaration, CompilerError> {
// expect function keyword
let _ = expect_tt!(self.next()?, Fn);
// expect function name
let name = expect_value!(self.next()?, Identifier)?;
// expect left paren
let _ = expect_tt!(self.next()?, LeftParen)?;
let mut params = Vec::new();
while expect_tt!(self.peek_next()?, Identifier).accepted() {
let arg = self.parse_var_decl()?;
params.push(arg);
if expect_tt!(self.peek_next()?, Comma).accepted() {
self.next()?;
} else {
break;
}
}
// expect right paren
let _ = expect_tt!(self.next()?, RightParen)?;
// see if we can parse the return type!
let mut return_type = TypeId::Void;
if expect_tt!(self.peek_next()?, RightArrow).accepted() {
let _ = self.next();
return_type = self.parse_type()?;
}
// expect vald block
if expect_tt!(self.peek_next()?, LeftBrace).accepted() {
ParseResult::Accept(Declaration::Function {
name: name.name,
params,
return_type,
body: self.parse_block()?,
})
} else {
ParseResult::Reject(CompilerError::UnexpectedToken(
self.peek_next()?.tt().to_string(),
))
}
}
fn parse_block(&mut self) -> ParseResult<Block, CompilerError> {
// expect left brace
let _ = expect_tt!(self.next()?, LeftBrace)?;
let mut block = Vec::new();
while !expect_tt!(self.peek_next()?, RightBrace).accepted() {
block.push(self.parse_statement()?);
}
// expect right brace
let _ = expect_tt!(self.next()?, RightBrace)?;
ParseResult::Accept(block)
}
fn parse_statement(&mut self) -> ParseResult<Statement, CompilerError> {
// handle if statements
if expect_tt!(self.peek_next()?, If).accepted() {
self.next()?;
let condition = self.parse_expression()?;
let then_stmt = self.parse_block()?;
if !expect_tt!(self.peek_next()?, Else).accepted() {
return ParseResult::Accept(Statement::If {
condition,
then_stmt,
else_stmt: vec![],
});
}
let _ = expect_tt!(self.next()?, Else)?;
let else_stmt = self.parse_block()?;
return ParseResult::Accept(Statement::If {
condition,
then_stmt,
else_stmt,
});
}
// handle while loops
if expect_tt!(self.peek_next()?, While).accepted() {
self.next()?;
// expect valid expression
let expr = self.parse_expression()?;
// expect valid block after
let block = self.parse_block()?;
// return result
return ParseResult::Accept(Statement::While {
condition: expr,
body: block,
});
}
// handle indefinite loops
if expect_tt!(self.peek_next()?, Loop).accepted() {
self.next()?;
// parse the inner block
return ParseResult::Accept(Statement::Loop(self.parse_block()?));
}
if expect_tt!(self.peek_next()?, Return).accepted() {
self.next()?;
// handle case where nothing is returned
if expect_tt!(self.peek_next()?, Semicolon).accepted() {
return ParseResult::Accept(Statement::Return(None));
}
let expr = self.parse_expression()?;
expect_tt!(self.next()?, Semicolon)?;
return ParseResult::Accept(Statement::Return(Some(expr)));
}
if expect_tt!(self.peek_next()?, Break).accepted() {
self.next()?;
// expect semicolon
expect_tt!(self.next()?, Semicolon)?;
// return result
return ParseResult::Accept(Statement::Break);
}
if expect_tt!(self.peek_next()?, Continue).accepted() {
self.next()?;
// expect semicolon
expect_tt!(self.next()?, Semicolon)?;
// return result
return ParseResult::Accept(Statement::Continue);
}
// handle writes to pointers!
if expect_tt!(self.peek_next()?, Star).accepted() {
self.next()?;
let left = if expect_tt!(self.peek_next()?, Identifier).accepted() {
let identifier = expect_value!(self.next()?, Identifier)?;
Expression::Variable {
name: identifier,
expr_type: None,
}
} else if expect_tt!(self.peek_next()?, LeftParen).accepted() {
self.next()?;
let expr = self.parse_expression()?;
let _ = expect_tt!(self.next()?, RightParen).accepted();
expr
} else {
return ParseResult::Reject(CompilerError::UnexpectedToken(
self.peek_next()?.tt().to_string(),
));
};
let _ = expect_tt!(self.next()?, Assign)?;
let right = self.parse_expression()?;
// expect semicolon
expect_tt!(self.next()?, Semicolon)?;
// return result
return ParseResult::Accept(Statement::PtrWrite {
ptr: left,
value: right,
});
}
// handle let statements (declarations)
if expect_tt!(self.peek_next()?, Let).accepted() {
self.next();
// expect variable name and type.
let name = self.parse_var_decl()?;
// handle uninitialised variable case
if expect_tt!(self.peek_next()?, Semicolon).accepted() {
self.next();
return ParseResult::Accept(Statement::Declaration {
var: name,
value: None,
});
}
// handle initialised case
// expect equals
let _ = expect_tt!(self.next()?, Assign)?;
// expect a valid expression
let expr = self.parse_expression()?;
let _ = expect_tt!(self.next()?, Semicolon);
// return statement
return ParseResult::Accept(Statement::Declaration {
var: name,
value: Some(expr),
});
}
// handle assignment without "let"
let name = expect_value!(self.peek_next()?, Identifier);
if name.accepted() {
let varname = name?;
if expect_tt!(self.peek(1)?, LeftParen).accepted() {
let expr = self.parse_expression()?; // a function call expr
let _ = expect_tt!(self.next()?, Semicolon)?;
return ParseResult::Accept(Statement::Expression { expr });
}
self.next()?;
let _ = expect_tt!(self.next()?, Assign)?;
let value = self.parse_expression()?;
let _ = expect_tt!(self.next()?, Semicolon);
return ParseResult::Accept(Statement::Assign {
varname: varname.name,
value,
});
}
ParseResult::Reject(CompilerError::UnexpectedToken(
self.peek_next()?.tt().to_string(),
))
}
fn parse_expression(&mut self) -> ParseResult<Expression, CompilerError> {
self.parse_comparison()
}
fn parse_comparison(&mut self) -> ParseResult<Expression, CompilerError> {
let mut expr = self.parse_additive()?;
while let Some(op) = match self.peek_next()? {
Token::EqualEqual => Some(BinaryOperator::Ne),
Token::BangEqual => Some(BinaryOperator::Ne),
Token::Less => Some(BinaryOperator::Lt),
Token::Greater => Some(BinaryOperator::Gt),
Token::LessEqual => Some(BinaryOperator::Le),
Token::GreaterEqual => Some(BinaryOperator::Ge),
_ => None,
} {
self.next()?;
let right = Box::new(self.parse_additive()?);
expr = Expression::Binary {
op,
left: Box::new(expr),
right,
}
}
ParseResult::Accept(expr)
}
fn parse_additive(&mut self) -> ParseResult<Expression, CompilerError> {
let left = self.parse_multiplicative()?;
let op = match self.peek_next()? {
Token::Plus => BinaryOperator::Add,
Token::Minus => BinaryOperator::Sub,
_ => return ParseResult::Accept(left),
};
self.next()?;
ParseResult::Accept(Expression::Binary {
op,
left: Box::new(left),
right: Box::new(self.parse_additive()?),
})
}
fn parse_multiplicative(&mut self) -> ParseResult<Expression, CompilerError> {
let left = self.parse_unary()?;
let op = match self.peek_next()? {
Token::Star => BinaryOperator::Mul,
Token::Slash => BinaryOperator::Div,
_ => return ParseResult::Accept(left),
};
self.next()?;
ParseResult::Accept(Expression::Binary {
op,
left: Box::new(left),
right: Box::new(self.parse_multiplicative()?),
})
}
fn parse_unary(&mut self) -> ParseResult<Expression, CompilerError> {
let op = match self.peek_next()? {
Token::Plus => UnaryOperator::Plus,
Token::Minus => UnaryOperator::Minus,
Token::Star => UnaryOperator::Dereference,
Token::Amphersand => UnaryOperator::Reference,
_ => return ParseResult::Accept(self.parse_primary()?),
};
self.next()?;
let operand = Box::new(self.parse_unary()?);
ParseResult::Accept(Expression::Unary { op, operand })
}
fn parse_primary(&mut self) -> ParseResult<Expression, CompilerError> {
match self.peek_next()? {
Token::Integer(value) => {
self.next()?;
ParseResult::Accept(Expression::Number(value as isize))
}
Token::String(value) => {
self.next()?;
ParseResult::Accept(Expression::StringLiteral(value))
}
Token::Identifier(_) => {
let name = expect_value!(self.next()?, Identifier)?;
if matches!(self.peek_next()?, Token::LeftParen) {
// Function call
self.next()?;
let mut args = Vec::new();
if !matches!(self.peek_next()?, Token::RightParen) {
args.push(self.parse_expression()?);
while matches!(self.peek_next()?, Token::Comma) {
self.next()?;
args.push(self.parse_expression()?);
}
}
let _ = expect_tt!(self.next()?, RightParen)?;
ParseResult::Accept(Expression::Call { name, args })
} else {
ParseResult::Accept(Expression::Variable {
name,
expr_type: None,
})
}
}
Token::LeftParen => {
self.next()?;
let expr = self.parse_expression()?;
let _ = expect_tt!(self.next()?, RightParen)?;
ParseResult::Accept(expr)
}
_ => ParseResult::Reject(CompilerError::UnexpectedToken(
self.peek_next()?.tt().to_string(),
)),
}
}
fn parse_var_decl(&mut self) -> ParseResult<Variable, CompilerError> {
let name = expect_value!(self.next()?, Identifier)?;
let _ = expect_tt!(self.next()?, Colon)?;
let type_id = self.parse_type()?;
ParseResult::Accept(Variable {
name: name.name,
type_id,
})
}
fn parse_type(&mut self) -> ParseResult<TypeId, CompilerError> {
// get the type name incl namespace
let typename = expect_value!(self.next()?, Identifier)?;
match typename.name.as_str() {
"u32" => ParseResult::Accept(TypeId::U32),
"u16" => ParseResult::Accept(TypeId::U16),
"u8" => ParseResult::Accept(TypeId::U8),
"i32" => ParseResult::Accept(TypeId::I32),
"i16" => ParseResult::Accept(TypeId::I16),
"i8" => ParseResult::Accept(TypeId::I8),
"void" => ParseResult::Accept(TypeId::Void),
"char" => ParseResult::Accept(TypeId::Char),
"str" => ParseResult::Accept(TypeId::Ptr(Box::new(TypeId::Char))),
_ => todo!("Implement parsing for other types!!"),
}
}
fn next(&mut self) -> ParseResult<Token, CompilerError> {
if self.idx >= self.tokens.len() {
ParseResult::Reject(CompilerError::UnexpectedEndOfInput)
} else {
let token = self.tokens[self.idx].clone();
self.idx += 1;
ParseResult::Accept(token)
}
}
fn peek_next(&self) -> ParseResult<Token, CompilerError> {
if self.idx >= self.tokens.len() {
ParseResult::Reject(CompilerError::UnexpectedEndOfInput)
} else {
ParseResult::Accept(self.tokens[self.idx].clone())
}
}
fn peek(&self, offset: usize) -> ParseResult<Token, CompilerError> {
if self.idx + offset >= self.tokens.len() {
ParseResult::Reject(CompilerError::UnexpectedEndOfInput)
} else {
ParseResult::Accept(self.tokens[self.idx + offset].clone())
}
}
}
impl<T, E> ParseResult<T, E> {
pub fn accepted(&self) -> bool {
matches!(self, ParseResult::Accept(_))
}
}
pub enum ParseResultResidual<T> {
Deny,
Reject(T),
}
impl<T, E> Try for ParseResult<T, E> {
type Output = T;
type Residual = ParseResultResidual<E>;
fn from_output(output: T) -> Self {
ParseResult::Accept(output)
}
fn branch(self) -> ControlFlow<Self::Residual, Self::Output> {
match self {
ParseResult::Accept(v) => ControlFlow::Continue(v),
ParseResult::Deny => ControlFlow::Break(ParseResultResidual::Deny),
ParseResult::Reject(e) => ControlFlow::Break(ParseResultResidual::Reject(e)),
}
}
}
impl<T, E> FromResidual for ParseResult<T, E> {
fn from_residual(residual: ParseResultResidual<E>) -> Self {
match residual {
ParseResultResidual::Deny => ParseResult::Deny,
ParseResultResidual::Reject(e) => ParseResult::Reject(e),
}
}
}
#[macro_export]
macro_rules! expect_tt {
($token:expr, $($variant:ident),+) => {{
let token = $token.clone();
let tt = token.tt().to_string();
let mut vs = String::new();
$(
let s = stringify!($variant);
vs.push_str(s);
vs.push_str("|");
)+
match tt.as_str() {
$(
stringify!($variant) => ParseResult::Accept(token),
)+
_ => {
// let expected = format!("[{}]", vec![$(stringify!($variant)),+].join(" | "));
ParseResult::Reject(CompilerError::UnexpectedToken(tt))
}
}
}};
}
#[macro_export]
macro_rules! expect_value {
($expr:expr, $variant:ident) => {{
let tok = $expr;
match tok.clone() {
Token::$variant(value) => ParseResult::Accept(value),
_ => {
ParseResult::Reject(CompilerError::UnexpectedToken(tok.tt().to_string()))
}
}
}};
}
@@ -1,13 +0,0 @@
use crate::model::{CompilerError, Program};
pub struct Analyser;
impl Analyser {
pub fn new() -> Self {
Self
}
pub fn analyse(&self, _ast: Program) -> Result<(), CompilerError> {
Ok(())
}
}
-15
View File
@@ -1,15 +0,0 @@
use crate::model::{CompilerError, Program};
mod c;
mod dsc;
pub fn compiler_frontend(ext: &str, data: &str) -> Result<Program, CompilerError> {
match ext {
"dsc" => Ok(dsc::generate_ast(&data)?),
"c" => Ok(c::generate_ast(&data)?),
_ => Err(CompilerError::Generic(format!(
"File type {} not supported",
ext
))),
}
}
-70
View File
@@ -1,70 +0,0 @@
#![feature(try_trait_v2)]
use std::path::Path;
use common::logging::log;
use crate::specialised::build_specialised;
mod backend;
mod frontend;
mod model;
mod specialised;
pub fn compile_file(
input_path: &Path,
output_path: &Path,
) -> Result<(), Box<dyn std::error::Error>> {
let input = std::fs::read_to_string(input_path).expect("Failed to read input file");
let input_ext = input_path
.extension()
.and_then(|s| s.to_str())
.unwrap_or("");
// check if we're using a specialised compiler
if let Some(output) = build_specialised(input_ext, &input) {
let result = match output {
Ok(output) => output,
Err(err) => return Err(format!("Compilation failed: {err:?}").into()),
};
std::fs::write(output_path, &result).expect("Failed to write output");
log(&format!(
"Compilation Successful ✅ \n\tSource: {}\n\tOutput: {}\n",
input_path.display(),
output_path.display(),
));
return Ok(());
}
// Parse the input using the frontend, providing the file extension and data.
let ast = match frontend::compiler_frontend(input_ext, &input) {
Ok(ast) => ast,
Err(err) => return Err(format!("Compilation failed: {err:?}").into()),
};
let output_ext = output_path
.extension()
.and_then(|s| s.to_str())
.unwrap_or("");
// Generate the output using the backend with the parsed result.
let result = match backend::compiler_backend(output_ext, &ast) {
Ok(result) => result,
Err(err) => return Err(format!("Compilation failed: {err:?}").into()),
};
// println!("{result}");
std::fs::write(output_path, &result).expect("Failed to write output");
log(&format!(
"Compilation Successful ✅ \n\tSource: {}\n\tOutput: {}\n",
input_path.display(),
output_path.display(),
));
Ok(())
}
-21
View File
@@ -1,21 +0,0 @@
use std::path::Path;
use compiler;
fn main() {
// read from input file: syntax "c_compiler <src.c> [output.dsa]"
let args: Vec<String> = std::env::args().collect();
if args.len() < 2 {
eprintln!("Usage: c_compiler <src.dsc> [output.dsa]");
return;
}
let input_file = &args[1];
let output_file = if args.len() > 2 {
&args[2]
} else {
"output.dsa"
};
compiler::compile_file(Path::new(input_file), Path::new(output_file)).unwrap();
}
-213
View File
@@ -1,213 +0,0 @@
use core::fmt;
#[allow(unused)]
#[derive(Debug, Clone)]
pub enum CompilerError {
UnexpectedToken(String),
UnexpectedEndOfInput,
UnexpectedCharacter(char),
Undefined(Name),
InvalidSyntax(String),
Generic(String),
}
#[derive(Debug, PartialEq, Clone)]
pub struct Name {
pub name: String,
pub namespace: Option<String>,
}
#[derive(Debug, Clone)]
pub struct Program {
pub declarations: Vec<Declaration>,
}
#[allow(unused)]
#[derive(Debug, Clone)]
pub enum Declaration {
Function {
name: String,
return_type: TypeId,
params: Vec<Variable>,
body: Block,
},
Variable {
var: Variable,
init: Option<ConstExpr>,
is_const: bool,
},
Dependency(Dependency),
}
#[derive(Debug, Clone)]
pub struct Dependency {
pub name: String,
pub path: String,
}
#[allow(unused)]
#[derive(Debug, Clone)]
pub enum TypeId {
U8,
U16,
U32,
I8,
I16,
I32,
Char,
Void,
Ptr(Box<TypeId>),
Ref(Box<TypeId>),
Array(Box<TypeId>, usize),
Struct { name: Name, fields: Vec<Variable> },
}
pub type Block = Vec<Statement>;
#[allow(unused)]
#[derive(Debug, Clone)]
pub struct Variable {
pub name: String,
pub type_id: TypeId,
}
#[allow(unused)]
#[derive(Debug, Clone)]
pub enum Statement {
Block(Block),
Declaration {
var: Variable,
value: Option<Expression>,
},
Assign {
varname: String,
value: Expression,
},
PtrWrite {
ptr: Expression,
value: Expression,
},
Expression {
expr: Expression,
},
If {
condition: Expression,
then_stmt: Block,
else_stmt: Block,
},
While {
condition: Expression,
body: Vec<Statement>,
},
Loop(Block),
Break,
Continue,
Return(Option<Expression>),
}
#[derive(Debug, Clone)]
pub enum ConstExpr {
Number(i32),
String(String),
}
impl fmt::Display for ConstExpr {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self {
ConstExpr::Number(n) => write!(f, "{}", n),
ConstExpr::String(s) => write!(f, "\"{}\"", s),
}
}
}
#[allow(unused)]
#[derive(Debug, Clone)]
pub enum Expression {
Empty,
Binary {
op: BinaryOperator,
left: Box<Expression>,
right: Box<Expression>,
},
Unary {
op: UnaryOperator,
operand: Box<Expression>,
},
Variable {
name: Name,
expr_type: Option<TypeId>,
},
Call {
name: Name,
args: Vec<Expression>,
},
Number(isize),
StringLiteral(String),
CharLiteral(char),
}
impl Expression {
pub fn is_pure(&self) -> bool {
match self {
Expression::Number(_) => true,
Expression::StringLiteral(_) => true,
Expression::CharLiteral(_) => true,
Expression::Call { .. } => false,
Expression::Binary { left, right, .. } => left.is_pure() && right.is_pure(),
Expression::Unary { operand, .. } => operand.is_pure(),
Expression::Empty => true,
Expression::Variable { .. } => true,
}
}
}
#[allow(unused)]
#[derive(Debug, Clone, PartialEq)]
pub enum BinaryOperator {
Add,
Sub,
Mul,
Div,
Eq,
Ne,
Lt,
Gt,
Le,
Ge,
}
impl fmt::Display for BinaryOperator {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self {
BinaryOperator::Add => write!(f, "+"),
BinaryOperator::Sub => write!(f, "-"),
BinaryOperator::Mul => write!(f, "*"),
BinaryOperator::Div => write!(f, "/"),
BinaryOperator::Eq => write!(f, "=="),
BinaryOperator::Ne => write!(f, "!="),
BinaryOperator::Lt => write!(f, "<"),
BinaryOperator::Gt => write!(f, ">"),
BinaryOperator::Le => write!(f, "<="),
BinaryOperator::Ge => write!(f, ">="),
}
}
}
#[derive(Debug, Clone, PartialEq)]
pub enum UnaryOperator {
Plus,
Minus,
Reference,
Dereference,
}
impl fmt::Display for UnaryOperator {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self {
UnaryOperator::Plus => write!(f, "+"),
UnaryOperator::Minus => write!(f, "-"),
UnaryOperator::Dereference => write!(f, "*"),
UnaryOperator::Reference => write!(f, "&"),
}
}
}
@@ -5,7 +5,7 @@ edition.workspace = true
authors.workspace = true
[[bin]]
name = "assembler_runner"
name = "dsa-a"
path = "src/main.rs"
[lib]
@@ -13,6 +13,6 @@ name = "assembler"
path = "src/lib.rs"
[dependencies]
common = { path = "../common" }
common = { path = "../dsa_common" }
num_cpus = "1.17.0"
threadpool = "1.8.1"
@@ -8,7 +8,7 @@ use std::{
use crate::assembler::{AssembleError, Token, expand_pseudo_ops, lexer, quick_hash};
use crate::assembler::{Node, Parser, resolve_dependencies};
use crate::util::logging::Logger;
// use crate::util::logging::Logger;
// pub fn new_assemble(path: &Path) {
// let program = Program::new();
@@ -144,7 +144,8 @@ impl Module {
}
pub fn build(path: PathBuf, program: ProgramRef) -> Result<Task, AssembleError> {
// Spawn a thread that creates the main function and executes the lexer and parser.
// Spawn a thread that creates the main function and executes the lexer and
// parser.
let handle = thread::spawn(move || {
let mut module =
Self::new(path.clone(), quick_hash(&path), Vec::new(), program.clone());
@@ -223,19 +223,31 @@ fn build_shift_instruction(
opcode: Opcode,
args: &[crate::assembler::model::Token],
) -> Result<Instruction, AssembleError> {
let Some(reg_token) = args.first() else {
let Some(src_reg) = args.first() else {
return Err(AssembleError::MissingArgument(0));
};
let Some(amount_token) = args.get(1) else {
let Some(r_shamt) = args.get(1) else {
return Err(AssembleError::MissingArgument(0));
};
let Some(i_shamt) = args.get(2) else {
return Err(AssembleError::MissingArgument(1));
};
let Some(dest_reg) = args.get(3) else {
return Err(AssembleError::MissingArgument(1));
};
let reg = expect_token!(reg_token, Register)?;
let amount = expect_token!(amount_token, Immediate)? as u8;
let src = expect_token!(src_reg, Register)?;
let r_shamt = expect_token!(r_shamt, Register)?;
let i_shamt = expect_token!(i_shamt, Immediate)? as u8;
let dest = expect_token!(dest_reg, Register)?;
match opcode {
Opcode::Shl => Ok(Instruction::ShiftLeft(args!(R, sr1: reg, shamt: amount))),
Opcode::Shr => Ok(Instruction::ShiftRight(args!(R, sr1: reg, shamt: amount))),
Opcode::Shl => Ok(Instruction::ShiftLeft(
args!(R, sr1: src, sr2: r_shamt, shamt: i_shamt, dr: dest),
)),
Opcode::Shr => Ok(Instruction::ShiftRight(
args!(R, sr1: src, sr2: r_shamt, shamt: i_shamt, dr: dest),
)),
_ => unreachable!(),
}
}
@@ -4,7 +4,8 @@ use crate::assembler::model::{Node, Opcode, Symbol, Token};
/// Parse DSA assembly code with optional formatting
///
/// # Examples
/// ```
/// ```rs
/// use assembler::macros::dsa;
/// // With formatting:
/// let nodes = dsa!(hash, "mov r1, {}", 42)?;
///
@@ -5,19 +5,25 @@ use std::{
fmt, fs,
hash::{DefaultHasher, Hash, Hasher},
path::{Path, PathBuf},
sync::{Arc, Mutex, mpsc},
sync::{
Arc, Mutex,
mpsc::{self, Receiver, Sender},
},
thread,
};
pub use common::logging::log;
use common::prelude::Instruction;
use common::{
build::{BuildError, Builder},
logging::{LogReceiver, Logger},
prelude::Instruction,
};
// Module declarations
#[macro_use]
pub mod macros;
#[allow(clippy::module_inception)]
pub mod assembler;
// pub mod assembler;
pub mod codegen;
pub mod expand;
pub mod lexer;
@@ -35,45 +41,68 @@ pub use self::{
resolver::{create_sections, resolve_dependencies, resolve_symbols},
};
use crate::util::logging::{Entry, Logger};
pub struct CompilerEngine {
result_tx: mpsc::Sender<Result<Vec<Instruction>, AssembleError>>,
result_rx: Option<mpsc::Receiver<Result<Vec<Instruction>, AssembleError>>>,
pub struct Assembler {
src_path: PathBuf,
result_tx: mpsc::Sender<Result<Vec<u8>, AssembleError>>,
result_rx: Option<mpsc::Receiver<Result<Vec<u8>, AssembleError>>>,
is_running: bool,
logs_rx: LogReceiver,
}
impl CompilerEngine {
impl From<AssembleError> for BuildError {
fn from(err: AssembleError) -> Self {
Self::Generic(err.to_string())
}
}
impl Builder for Assembler {
type Output = Vec<u8>;
type Args = ();
fn logs(&self) -> Vec<String> {
self.logs_rx.logs()
}
#[must_use]
pub fn new() -> Self {
fn new(src_path: impl Into<PathBuf>) -> Self {
let (tx, rx) = mpsc::channel();
Self {
src_path: src_path.into(),
result_tx: tx,
result_rx: Some(rx),
is_running: false,
// for logging
logs_rx: LogReceiver::new(true),
}
}
/// Start the compilation process in a separate thread
pub fn start_compilation(&mut self, src: &Path) {
fn start(&mut self, args: ()) {
if self.is_running {
return;
}
let src = src.to_path_buf();
let src = self.src_path.clone();
let tx = self.result_tx.clone();
let logger = self.logs_rx.logger();
thread::spawn(move || {
let result = assemble(&src);
tx.send(result)
if let Ok(res) = assemble(&src, &logger) {
let buffer: Vec<u8> = res
.iter()
.flat_map(|instruction| instruction.encode().to_be_bytes())
.collect();
tx.send(Ok(buffer))
.expect("Failed to send compilation result from worker thread");
}
});
self.is_running = true;
}
/// Check if compilation is complete and get the result
pub fn try_get_result(&mut self) -> Option<Result<Vec<Instruction>, AssembleError>> {
fn poll(&mut self) -> Option<Result<Self::Output, common::build::BuildError>> {
if !self.is_running {
return None;
}
@@ -86,22 +115,20 @@ impl CompilerEngine {
{
Ok(result) => {
self.is_running = false;
Some(result)
Some(result.map_err(std::convert::Into::into))
}
Err(mpsc::TryRecvError::Empty) => None,
Err(mpsc::TryRecvError::Disconnected) => {
self.is_running = false;
Some(Err(AssembleError::Generic))
Some(Err(BuildError::Generic(String::from(
"Compilation terminated before a result was returned",
))))
}
}
}
/// Block until compilation is complete and return the result
pub fn wait_for_result(&mut self) -> Result<Vec<Instruction>, AssembleError> {
if !self.is_running {
return Err(AssembleError::Generic);
}
fn output(&mut self) -> Result<Self::Output, common::build::BuildError> {
if let Ok(result) = self
.result_rx
.take()
@@ -109,15 +136,19 @@ impl CompilerEngine {
.recv()
{
self.is_running = false;
result
result.map_err(std::convert::Into::into)
} else {
self.is_running = false;
Err(AssembleError::Generic)
Err(BuildError::Generic(String::from(
"Compilation terminated before a result was returned",
)))
}
}
}
fn assemble(src: &Path) -> Result<Vec<Instruction>, AssembleError> {
impl Assembler {}
fn assemble(src: &Path, logger: &Logger) -> Result<Vec<Instruction>, AssembleError> {
let mut modules = HashSet::new();
let mut program = Program::new();
@@ -127,31 +158,26 @@ fn assemble(src: &Path) -> Result<Vec<Instruction>, AssembleError> {
return Ok(vec![]);
}
prepare_dependency(src, &mut modules, &mut program)?;
prepare_dependency(src, &mut modules, &mut program, &logger)?;
let mut nodes = program.nodes.clone();
create_sections(&mut nodes)?;
resolve_symbols(&mut nodes)?;
log("Generating assembly output...");
logger.info("Generating assembly output...");
let instructions = codegen(nodes)?;
log("Compilation Successful");
logger.info("Compilation Successful");
Ok(instructions)
}
impl Default for CompilerEngine {
fn default() -> Self {
Self::new()
}
}
fn prepare_dependency(
path: &Path,
modules: &mut HashSet<u64>,
program: &mut Program,
logger: &Logger,
) -> Result<(), AssembleError> {
let filename = path
.file_name()
@@ -159,7 +185,7 @@ fn prepare_dependency(
.expect("Failed to get file name from path");
if let Ok(path) = path.canonicalize() {
log(&format!(
logger.info(&format!(
"{:20} {:20} [{}]",
"Building",
filename,
@@ -171,13 +197,13 @@ fn prepare_dependency(
.map_err(|_| AssembleError::InvalidFile(path.to_path_buf()))?;
let file_hash = quick_hash(path);
log(&format!("{:20} {:20}", "Tokenising", filename));
logger.info(&format!("{:20} {:20}", "Tokenising", filename));
let tokens = lexer::lexer(src, file_hash)?;
log(&format!("{:20} {:20}", "Parsing", filename));
logger.info(&format!("{:20} {:20}", "Parsing", filename));
let parsed = Parser::parse_nodes(tokens)?;
log(&format!("{:20} {:20}", "Resolving Deps", filename));
logger.info(&format!("{:20} {:20}", "Resolving Deps", filename));
// Get the parent directory of the source file to use as the base directory
let base_dir = path
.parent()
@@ -187,7 +213,7 @@ fn prepare_dependency(
let deps = Parser::get_dependencies(&nodes, path)?;
log(&format!("{:20} {:20}", "Expanding Pseudo-ops", filename));
logger.info(&format!("{:20} {:20}", "Expanding Pseudo-ops", filename));
// add a section instruction
nodes.insert(
@@ -202,7 +228,7 @@ fn prepare_dependency(
program.add_module(nodes);
for dep in deps {
log(&format!(
logger.info(&format!(
"{:20} {:20}",
"Including",
dep.file_name()
@@ -212,7 +238,7 @@ fn prepare_dependency(
let dep_hash = quick_hash(&dep);
if modules.insert(dep_hash) {
prepare_dependency(dep.as_path(), modules, program)?;
prepare_dependency(dep.as_path(), modules, program, logger)?;
}
}
@@ -184,11 +184,11 @@ pub enum Token {
impl fmt::Display for Token {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
Self::Symbol(symbol) => write!(f, "{}", symbol),
Self::Register(register) => write!(f, "{}", register),
Self::Immediate(immediate) => write!(f, "{}", immediate),
Self::StringLit(string_lit) => write!(f, "{}", string_lit),
Self::Opcode(opcode) => write!(f, "{}", opcode),
Self::Symbol(symbol) => write!(f, "{symbol}"),
Self::Register(register) => write!(f, "{register}",),
Self::Immediate(immediate) => write!(f, "{immediate}",),
Self::StringLit(string_lit) => write!(f, "{string_lit}",),
Self::Opcode(opcode) => write!(f, "{opcode}",),
}
}
}
@@ -1,5 +1,6 @@
use std::path::{Path, PathBuf};
use crate::assembler::TokenType;
use crate::{assembler::AssembleError, expect_token, expect_type, node};
use crate::assembler::model::{Node, Opcode, Token};
@@ -100,6 +101,7 @@ impl Parser {
let opcode = expect_token!(self.next()?, Opcode)?;
let args: Vec<Token>;
#[allow(clippy::match_same_arms)]
match opcode {
// R-type instructions
Opcode::Mov | Opcode::Movs => {
@@ -112,22 +114,25 @@ impl Parser {
let base = expect_type!(self.next()?, Register, Symbol)?;
let dest = expect_type!(self.next()?, Register)?;
let mut offset = Token::Immediate(0);
if let Ok(next) = self.peek_next()
&& expect_type!(next, Immediate).is_ok() {
offset = self.next()?;
let offset = match self.peek_next() {
Ok(next) if expect_type!(next.clone(), Immediate).is_ok() => {
self.next()?
}
_ => Token::Immediate(0),
};
args = vec![base, dest, offset];
}
Opcode::Stb | Opcode::Sth | Opcode::Stw => {
let base = expect_type!(self.next()?, Register)?;
let dest = expect_type!(self.next()?, Register, Symbol)?;
let mut offset = Token::Immediate(0);
if let Ok(next) = self.peek_next()
&& expect_type!(next, Immediate).is_ok() {
offset = self.next()?;
let offset = match self.peek_next() {
Ok(next) if expect_type!(next.clone(), Immediate).is_ok() => {
self.next()?
}
_ => Token::Immediate(0),
};
args = vec![base, dest, offset];
}
@@ -146,15 +151,49 @@ impl Parser {
}
Opcode::Not | Opcode::Cmp => {
let reg1 = expect_type!(self.next()?, Register, Symbol)?;
let reg2 = expect_type!(self.next()?, Register, Symbol)?;
args = vec![reg1, reg2];
let src = expect_type!(self.next()?, Register, Symbol)?;
let dest = expect_type!(self.next()?, Register, Symbol)?;
args = vec![src, dest];
}
Opcode::Shl | Opcode::Shr => {
let reg = expect_type!(self.next()?, Register, Symbol)?;
let num = expect_type!(self.next()?, Immediate)?;
args = vec![reg, num];
let src = expect_type!(self.next()?, Register, Symbol)?;
// First operand after src: could be immediate or register
let first = self.next()?;
let (r_shamt, i_shamt) = match first {
Token::Register(_) => (
first,
if let Ok(tok) = self.peek_next() {
if expect_type!(tok, Immediate).is_ok() {
self.next()?
} else {
Token::Immediate(0)
}
} else {
Token::Immediate(0)
},
),
Token::Immediate(_) => (Token::Register(Register::Zero), first),
_ => {
return Err(AssembleError::UnexpectedToken(
first,
TokenType::Immediate,
));
}
};
let dest = if let Ok(tok) = self.peek_next() {
if expect_type!(tok, Register).is_ok() {
self.next()?
} else {
src.clone() // Default to src if no dest specified
}
} else {
src.clone() // Default to src if no dest specified
};
args = vec![src, r_shamt, i_shamt, dest];
}
Opcode::Inc | Opcode::Dec => {
@@ -6,11 +6,8 @@ use std::{
use common::prelude::Register;
use crate::assembler::model::{Module, Node, Opcode, Symbol, Token};
use crate::assembler::quick_hash;
use crate::assembler::{
log,
model::{Module, Node, Opcode, Symbol, Token},
};
use crate::{assembler::AssembleError, node};
pub fn resolve_symbols(nodes: &mut [Node]) -> Result<(), AssembleError> {
+25
View File
@@ -0,0 +1,25 @@
#![deny(
clippy::unwrap_used,
clippy::nursery,
clippy::perf,
clippy::pedantic,
clippy::complexity
)]
#![allow(
clippy::cast_possible_truncation,
clippy::missing_panics_doc,
clippy::missing_errors_doc,
clippy::match_wildcard_for_single_variants
)]
pub mod assembler;
pub mod image_builder;
pub mod tooling;
mod util;
pub mod prelude {
pub use crate::assembler::Assembler;
pub use crate::image_builder;
pub use crate::tooling::brainf;
pub use crate::tooling::project;
}
@@ -1,9 +1,6 @@
use common as _;
use num_cpus as _;
use threadpool as _;
use common::{self as _, build::Builder};
use assembler::{
assemble_file,
prelude::*,
tooling::{brainf, project},
};
@@ -47,5 +44,13 @@ fn main() {
let input_path = &args[2];
let output_path = &args[4];
assemble_file(input_path, output_path).unwrap();
let mut engine = Assembler::new(PathBuf::from(input_path));
engine.start(());
let result = engine.output().expect("assembler failed.");
if let Err(e) = fs::write(output_path, result) {
eprintln!("Failed to write to output file: {e}");
std::process::exit(1);
}
}
+39
View File
@@ -0,0 +1,39 @@
#![allow(dead_code)]
#![allow(unused)]
use std::{fmt, sync::mpsc::Sender};
// pub struct Entry {
// etype: EntryType,
// pub message: String,
// }
// #[derive(Copy, Clone, Eq, PartialEq)]
// enum EntryType {
// Debug,
// Info,
// Warn,
// Error,
// Fatal,
// }
// impl fmt::Display for EntryType {
// fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
// write!(
// f,
// "{:<5}",
// match self {
// Self::Debug => "DEBUG",
// Self::Info => "INFO",
// Self::Warn => "WARN",
// Self::Error => "ERROR",
// Self::Fatal => "FATAL",
// }
// )
// }
// }
// impl fmt::Display for Entry {
// fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
// write!(f, "{}: {}", self.etype, self.message)
// }
// }
@@ -4,6 +4,12 @@ version.workspace = true
edition.workspace = true
authors.workspace = true
[[bin]]
name = "dsa-c"
path = "src/main.rs"
[dependencies]
chrono = "0.4.43"
common = { path = "../common" }
common = { path = "../dsa_common" }
uuid = { version = "1.20.0", features = ["v4"] }
+964
View File
@@ -0,0 +1,964 @@
use std::collections::HashMap;
use std::sync::atomic::AtomicU32;
use std::time::SystemTime;
use chrono::{DateTime, Local};
use super::registers::RegisterAllocator;
use crate::backend::dsa::instruction::{InsBlock as IB, Instruction as I, Label};
use crate::backend::dsa::registers::Register;
use crate::model::{
AssignmentOperator, BinaryOperator, Call, CompilerError, ConstExpr, Declaration,
Dependency, Expression, Number, Program, Statement, TypeId, UnaryOperator, Variable,
};
pub struct CodeGenerator {
ast: Program,
imports: HashMap<String, I>,
globals: HashMap<String, I>,
functions: Vec<IB>,
symbols: Vec<String>,
allocator: RegisterAllocator,
is_library: bool,
}
impl CodeGenerator {
pub fn new(ast: Program, is_lib: bool) -> Self {
CodeGenerator {
ast,
imports: HashMap::new(),
globals: HashMap::new(),
functions: Vec::new(),
symbols: Vec::new(),
allocator: RegisterAllocator::new(),
is_library: is_lib,
}
}
pub fn include(&mut self, name: impl Into<String>, path: impl Into<String>) {
let name = name.into();
self.imports.insert(name.clone(), I::include(name, path));
}
fn is_global(&self, name: &str) -> bool {
// Check if this variable is in the globals list
self.globals.contains_key(name)
}
pub fn generate(&mut self) -> Result<String, CompilerError> {
// always include the print library for debugging!
if !self.is_library {
self.include("print", "./lib/print.dsa");
}
for block in self.ast.clone().declarations {
match block {
Declaration::Variable {
var: Variable { name, .. },
..
} => self.symbols.push(name),
Declaration::Function { name, .. } => self.symbols.push(name),
Declaration::Dependency(Dependency { name, .. }) => {
self.symbols.push(name)
}
Declaration::Struct { .. } => {} /* we can't do any code generation for
* a struct yet. we may need to later
* once these become class-like
* objects with implementations */
}
}
for block in self.ast.clone().declarations {
self.generate_block(block.clone())?;
}
let assembly = self.generate_layout()?;
Ok(assembly
.iter()
.map(|i| i.to_string())
.collect::<Vec<_>>()
.join("\n"))
}
fn generate_layout(&mut self) -> Result<IB, CompilerError> {
let datetime: DateTime<Local> = SystemTime::now().into();
let mut block = IB::new();
block.extend(vec![
I::global_comment(format!(
"GENERATED BY DSC COMPILER
Generated at {}",
datetime.format("%Y-%m-%d %H:%M:%S")
)),
I::Newline,
I::global_comment("Imports"),
]);
block.extend(self.imports.values().cloned().collect::<Vec<_>>());
block.extend(vec![
I::Newline,
I::global_comment("Globals & Reserved Memory"),
]);
block.extend(self.globals.values().cloned().collect::<Vec<_>>());
if !self.is_library {
block.extend(vec![
I::Newline,
I::global_comment("Entry Point"),
I::db_word("stack", 0x10000),
I::db_string("message", "Process Exited with code:"),
// init function for stack setup.
I::label("_init"),
I::ldw_label("stack", Register::Bpr),
I::mov(Register::Bpr, Register::Spr),
I::push(Register::Zero),
I::call("main"),
I::call("print::print_newline"),
I::lwi_label("message", Register::Rg0),
I::push(Register::Rg0),
I::call("print::print"),
I::pop(Register::Zero),
I::call("print::print_hex_word"),
I::pop(Register::Zero),
I::Hlt,
]);
}
block.extend(vec![
I::Newline,
// default return block boilerplate
I::global_comment("Return"),
I::label("_ret"),
I::mov(Register::Bpr, Register::Spr),
I::pop(Register::Bpr),
I::Return,
]);
for function in self.functions.iter() {
block.extend(function.iter().cloned());
}
Ok(block)
}
fn generate_global(&mut self, name: &str, init: Option<ConstExpr>) {
let init = init.unwrap_or(ConstExpr::Number(0));
match init {
ConstExpr::Number(value) => {
self.globals
.insert(name.to_string(), I::db_word(name, value as u32));
}
ConstExpr::String(str) => {
self.globals
.insert(name.to_string(), I::db_string(name, str));
}
}
}
fn generate_block(&mut self, block: Declaration) -> Result<(), CompilerError> {
match block {
Declaration::Variable { var, init, .. } => {
self.generate_global(&var.name, init)
}
Declaration::Function {
name,
params,
body,
return_type,
} => {
let func = self.generate_function(&name, &params, &body, return_type);
self.functions.push(func);
}
Declaration::Dependency(Dependency { name, path }) => {
self.include(name, path);
}
Declaration::Struct { .. } => {} /* can't do any codegen for these yet,
* they're just types. */
};
Ok(())
}
// Example: Generate code for a function
fn generate_function(
&mut self,
name: &str,
params: &[Variable],
body: &[Statement],
return_type: TypeId,
) -> IB {
let mut code = IB::new();
// Reset allocator for new function
self.allocator.reset();
let fmtparams = params
.iter()
.map(|p| format!("{}: {}", p.name, p.type_id))
.collect::<Vec<String>>()
.join(", ");
code.extend(vec![
I::global_comment(format!("fn {name}({fmtparams}) -> {return_type}")),
I::label(name),
I::push(Register::Bpr),
I::mov(Register::Spr, Register::Bpr),
]);
// Allocate parameters to registers or stack locations
for (i, param) in params.iter().enumerate() {
let offset = 8 + (i as i32 * 4); // Parameters start at bpr+8
// Track that this parameter is at a stack location
let (reg, load_code) = self.allocator.alloc_var(&param.name).unwrap();
code.append(load_code);
code.push(I::ldw_reg_offset(Register::Bpr, reg, offset));
}
// Generate code for function body
for stmt in body {
let stmt_code = self.generate_statement(stmt, &mut code).unwrap();
code.append(stmt_code);
}
// automatically return at function end
if let Some(x) = code.iter().last()
&& let I::Jmp { target: Label(val) } = x
&& val == "_ret"
{
} else {
code.push(I::jmp("_ret"));
}
code.insert(0, I::Newline);
code
}
// Example: Generate code for a statement
fn generate_statement(
&mut self,
stmt: &Statement,
func_body: &mut IB,
) -> Result<IB, CompilerError> {
let mut code = IB::new();
match stmt {
Statement::Declaration { var, value } => {
if let Some(expr) = value {
// Evaluate expression
let (result_reg, expr_code) =
self.generate_expression(expr, true, func_body)?;
code.append(expr_code);
// Store result in variable
let store_code = self.allocator.store_var(&var.name, &result_reg);
code.append(store_code);
// Free temporary register
self.allocator.free_temp(result_reg);
} else {
// Just declaring variable without initialization
self.allocator.alloc_var(&var.name)?;
}
}
Statement::Break => unimplemented!("need scope tracking first!"),
Statement::Continue => unimplemented!("need scope tracking first!"),
Statement::Defer(_func) => unimplemented!("we need scope tracking first!"),
Statement::PtrWrite { ptr, value } => {
let (result_reg, expr_code) =
self.generate_expression(value, true, func_body)?;
code.append(expr_code);
let (ptr_reg, ptr_code) =
self.generate_expression(ptr, true, func_body)?;
code.append(ptr_code);
code.push(I::stw_reg(result_reg, ptr_reg));
self.allocator.free_temp(result_reg);
self.allocator.free_temp(ptr_reg);
}
Statement::Assign {
varname,
value,
operator,
} => {
// Evaluate expression
let (result_reg, expr_code) =
self.generate_expression(value, true, func_body)?;
code.append(expr_code);
if *operator == AssignmentOperator::Assign {
// Check if this is a global variable
if self.is_global(varname) {
// Store to global label
code.push(I::stw_label(result_reg, varname.clone()))
} else {
// Store result in local variable
let store_code = self.allocator.store_var(varname, &result_reg);
code.append(store_code);
}
// Free temporary register
self.allocator.free_temp(result_reg);
return Ok(code);
}
// for more complex assignment cases we need an intermediate register.
let (temp_reg, temp_code) = self.allocator.alloc_temp()?;
code.append(temp_code);
// fetch the value of the variable
let var_reg = if self.is_global(varname) {
let instruction = I::ldw_label(varname.clone(), temp_reg);
code.push(instruction);
temp_reg
} else {
let (rg, block) = self.allocator.load_var(varname)?;
code.append(block);
rg
};
let assign_code = match operator {
AssignmentOperator::Assign => {
unreachable!("assignment was already checked earlier.")
}
AssignmentOperator::AddAssign => {
I::add(var_reg, result_reg, temp_reg)
}
AssignmentOperator::SubAssign => {
I::sub(var_reg, result_reg, temp_reg)
}
AssignmentOperator::MulAssign => {
return Err(CompilerError::Unimplemented(
"TODO: implement multiplication for assignment".to_string(),
));
}
AssignmentOperator::DivAssign => {
return Err(CompilerError::Unimplemented(
"TODO: write proper div function for DSA".to_string(),
));
}
AssignmentOperator::ModAssign => {
return Err(CompilerError::Unimplemented(
"TODO: write proper mod function for DSA".to_string(),
));
}
AssignmentOperator::AndAssign => {
I::and(var_reg, result_reg, temp_reg)
}
AssignmentOperator::OrAssign => I::or(var_reg, result_reg, temp_reg),
AssignmentOperator::XorAssign => {
I::xor(var_reg, result_reg, temp_reg)
}
AssignmentOperator::LeftShiftAssign => {
// this is only useful if we optimise out the register allocation
// inside value.
// if let Expression::Number { value, .. } = *value {
// I::shl(var_reg, value, temp_reg)
// }
I::shl(var_reg, result_reg, 0, temp_reg)
}
AssignmentOperator::RightShiftAssign => {
// this is only useful if we optimise out the register allocation
// if let Expression::Number { value, .. } = *value {
// I::shr(var_reg, value, temp_reg)
// }
I::shr(var_reg, result_reg, 0, temp_reg)
}
};
code.push(assign_code);
// Check if this is a global variable
if self.is_global(varname) {
// Store to global label
code.push(I::stw_label(temp_reg, varname.clone()))
} else {
// Store result in local variable
let store_code = self.allocator.store_var(varname, &temp_reg);
code.append(store_code);
}
self.allocator.free_temp(result_reg);
self.allocator.free_temp(temp_reg);
}
Statement::Return(expr) => {
if let Some(e) = expr {
let (result_reg, expr_code) =
self.generate_expression(e, true, func_body)?;
code.append(expr_code);
code.push(I::stw_reg_offset(result_reg, Register::Bpr, 8));
code.push(I::jmp("_ret"));
self.allocator.free_temp(result_reg);
}
}
Statement::If {
condition,
then_stmt,
else_stmt,
} => {
// Generate condition
let (cond_reg, cond_code) =
self.generate_expression(condition, true, func_body)?;
code.append(cond_code);
// Compare with zero
code.push(I::cmp(cond_reg, Register::Zero));
self.allocator.free_temp(cond_reg);
// Generate unique labels
let then_label = format!("_then_{}", self.get_unique_label());
let else_label = format!("_else_{}", self.get_unique_label());
let end_label = format!("_end_{}", self.get_unique_label());
// Jump to else if condition is false (equal to zero)
code.push(I::jeq(else_label.clone()));
// Then block
code.push(I::label(then_label));
for s in then_stmt {
code.append(self.generate_statement(s, func_body)?);
}
if then_stmt.is_empty() {
code.push(I::Nop);
}
code.push(I::jmp(end_label.clone()));
// Else block
code.push(I::label(else_label));
for s in else_stmt {
code.append(self.generate_statement(s, func_body)?);
}
if else_stmt.is_empty() {
code.push(I::Nop);
}
code.push(I::label(end_label));
}
Statement::While { condition, body } => {
let loop_start = format!("_while_start_{}", self.get_unique_label());
let loop_end = format!("_while_end_{}", self.get_unique_label());
code.push(I::label(&loop_start));
// Generate condition
let (cond_reg, cond_code) =
self.generate_expression(condition, true, func_body)?;
code.append(cond_code);
code.push(I::cmp(cond_reg, Register::Zero));
self.allocator.free_temp(cond_reg);
code.push(I::jeq(loop_end.clone()));
// Loop body
for s in body {
code.append(self.generate_statement(s, func_body)?);
}
code.push(I::jmp(loop_start));
code.push(I::label(loop_end));
}
Statement::Loop(body) => {
let loop_start = format!("_loop_start_{}", self.get_unique_label());
code.push(I::label(&loop_start));
for s in body {
code.append(self.generate_statement(s, func_body)?);
}
code.push(I::jmp(loop_start));
}
Statement::Expression { expr } => {
let (result_reg, expr_code) =
self.generate_expression(expr, false, func_body)?;
code.append(expr_code);
self.allocator.free_temp(result_reg);
}
Statement::Block(statements) => {
for s in statements {
code.append(self.generate_statement(s, func_body)?);
}
}
}
Ok(code)
}
// Example: Generate code for an expression
// Returns (register containing result, assembly code)
fn generate_expression(
&mut self,
expr: &Expression,
use_result: bool,
func_body: &mut IB,
) -> Result<(Register, IB), CompilerError> {
let mut code = IB::new();
match expr {
Expression::Empty => Ok((Register::Null, code)),
Expression::Number(n) => match n {
Number::Signed(value, _) => {
let (reg, alloc_code) = self.allocator.alloc_temp()?;
code.append(alloc_code);
// Load immediate value
code.push(I::lwi(*value as u32, reg));
Ok((reg, code))
}
Number::Unsigned(value, _) => {
let (reg, alloc_code) = self.allocator.alloc_temp()?;
code.append(alloc_code);
// Load immediate value
code.push(I::lwi(*value as u32, reg));
Ok((reg, code))
}
},
Expression::CharLiteral(value) => {
let (reg, alloc_code) = self.allocator.alloc_temp()?;
code.append(alloc_code);
// Load immediate value
code.push(I::comment(format!("char literal '{value}'")));
code.push(I::lwi(*value as u32, reg));
Ok((reg, code))
}
Expression::StringLiteral(value) => {
let (reg, alloc_code) = self.allocator.alloc_temp()?;
code.append(alloc_code);
// write string into memory
let uuid = self.get_unique_label();
func_body.insert(0, I::db_string(format!("str_{uuid}"), value));
// Load pointer to string
code.push(I::lwi_label(format!("str_{uuid}"), reg));
Ok((reg, code))
}
Expression::ArrayLiteral { elements, type_id } => todo!(),
Expression::StructLiteral {
name,
fields,
type_id,
} => todo!(),
Expression::Variable { name, .. } => {
if self.is_global(&name.name) {
// Allocate a temporary register for the global
let (reg, alloc_code) = self.allocator.alloc_temp()?;
code.append(alloc_code);
// Load from global label
code.push(I::ldw_label(name.name.clone(), reg));
Ok((reg, code))
} else {
// Local variable - use existing allocator logic
let (reg, load_code) = self.allocator.load_var(&name.name)?;
code.append(load_code);
Ok((reg, code))
}
}
Expression::Binary {
op, left, right, ..
} => {
// Evaluate left operand
let (left_reg, left_code) =
self.generate_expression(left, true, func_body)?;
code.append(left_code);
// Evaluate right operand
let (right_reg, right_code) =
self.generate_expression(right, true, func_body)?;
code.append(right_code);
// Allocate result register
let (result_reg, result_alloc) = self.allocator.alloc_temp()?;
code.append(result_alloc);
// Generate operation
match op {
BinaryOperator::Add => {
code.push(I::add(left_reg, right_reg, result_reg));
}
BinaryOperator::Sub => {
code.push(I::sub(left_reg, right_reg, result_reg));
}
BinaryOperator::Mul => {
self.include("maths", "./lib/maths/core.dsa");
// Call multiply function
code.push(I::push(right_reg));
code.push(I::push(left_reg));
code.push(I::call("maths::multiply"));
code.push(I::pop(result_reg));
code.push(I::pop(Register::Zero));
}
BinaryOperator::Div => {
return Err(CompilerError::Unimplemented(
"TODO: write proper div function for DSA".to_string(),
));
// self.include("maths", "./lib/maths/core.dsa");
// // Call divide function
// code.push(format!("\tpush {}", right_reg));
// code.push(format!("\tpush {}", left_reg));
// code.push("\tcall maths::divide".to_string());
// code.push(format!("\tpop {}", result_reg));
// code.push("\tpop zero".to_string());
}
BinaryOperator::Mod => {
return Err(CompilerError::Unimplemented(
"TODO: write proper mod function for DSA".to_string(),
));
// self.include("maths", "./lib/maths/core.dsa");
// // Call modulo function
// code.push(format!("\tpush {}", right_reg));
// code.push(format!("\tpush {}", left_reg));
// code.push("\tcall maths::modulo".to_string());
// code.push(format!("\tpop {}", result_reg));
// code.push("\tpop zero".to_string());
}
BinaryOperator::BitwiseAnd => {
code.push(I::and(left_reg, right_reg, result_reg));
}
BinaryOperator::BitwiseOr => {
code.push(I::or(left_reg, right_reg, result_reg));
}
BinaryOperator::BitwiseXor => {
code.push(I::xor(left_reg, right_reg, result_reg));
}
BinaryOperator::LogicalAnd => {
return Err(CompilerError::Unimplemented(
"assembler/ISA does not yet support logical and!".to_string(),
));
}
BinaryOperator::LogicalOr => {
return Err(CompilerError::Unimplemented(
"assembler/ISA does not yet support logical or!".to_string(),
));
}
BinaryOperator::LeftShift => {
code.push(I::shl(left_reg, right_reg, 0, result_reg));
}
BinaryOperator::RightShift => {
code.push(I::shr(left_reg, right_reg, 0, result_reg));
}
// Comparison operators - return 1 (true) or 0 (false)
BinaryOperator::Equal => {
code.push(I::cmp(left_reg, right_reg));
code.push(I::lwi(1, result_reg));
let end_label = format!("_cmp_end_{}", self.get_unique_label());
code.push(I::jeq(end_label.clone()));
code.push(I::lwi(0, result_reg));
code.push(I::label(end_label));
}
BinaryOperator::NotEqual => {
code.push(I::cmp(left_reg, right_reg));
code.push(I::lwi(1, result_reg));
let end_label = format!("_cmp_end_{}", self.get_unique_label());
code.push(I::Jne {
target: Label(end_label.clone()),
});
code.push(I::lwi(0, result_reg));
code.push(I::label(&end_label));
}
BinaryOperator::LessThan => {
code.push(I::cmp(left_reg, right_reg));
code.push(I::lwi(1, result_reg));
let end_label = format!("_cmp_end_{}", self.get_unique_label());
code.push(I::Jlt {
target: Label(end_label.clone()),
});
code.push(I::lwi(0, result_reg));
code.push(I::label(&end_label));
}
BinaryOperator::LessOrEqual => {
code.push(I::cmp(left_reg, right_reg));
code.push(I::lwi(1, result_reg));
let end_label = format!("_cmp_end_{}", self.get_unique_label());
code.push(I::Jle {
target: Label(end_label.clone()),
});
code.push(I::lwi(0, result_reg));
code.push(I::label(&end_label));
}
BinaryOperator::GreaterThan => {
code.push(I::cmp(left_reg, right_reg));
code.push(I::lwi(1, result_reg));
let end_label = format!("_cmp_end_{}", self.get_unique_label());
code.push(I::Jgt {
target: Label(end_label.clone()),
});
code.push(I::lwi(0, result_reg));
code.push(I::label(&end_label));
}
BinaryOperator::GreaterOrEqual => {
code.push(I::cmp(left_reg, right_reg));
code.push(I::lwi(1, result_reg));
let end_label = format!("_cmp_end_{}", self.get_unique_label());
code.push(I::Jge {
target: Label(end_label.clone()),
});
code.push(I::lwi(0, result_reg));
code.push(I::label(&end_label));
} // _ => unimplemented!(),
}
// Free operand registers (allocator will protect variables)
self.allocator.free_temp(left_reg);
self.allocator.free_temp(right_reg);
Ok((result_reg, code))
}
Expression::UnaryPostfix { op, operand, .. } => {
let (operand_reg, operand_code) =
self.generate_expression(operand, true, func_body)?;
code.append(operand_code);
let (result_reg, result_alloc) = self.allocator.alloc_temp()?;
code.append(result_alloc);
match op {
UnaryOperator::Increment => {
// postfix increment - return old value
code.push(I::mov(operand_reg, result_reg));
}
UnaryOperator::Decrement => {
// postfix decrement - return old value
code.push(I::mov(operand_reg, result_reg));
}
_ => {
return Err(CompilerError::Generic(format!(
"{op} is prefix only!"
)));
}
}
self.allocator.free_temp(operand_reg);
Ok((result_reg, code))
}
Expression::Unary { op, operand, .. } => {
let (operand_reg, operand_code) =
self.generate_expression(operand, true, func_body)?;
code.append(operand_code);
let (result_reg, result_alloc) = self.allocator.alloc_temp()?;
code.append(result_alloc);
match op {
UnaryOperator::Minus => {
// Negate: result = 0 - operand
code.push(I::sub(Register::Zero, operand_reg, result_reg));
}
UnaryOperator::Plus => {
// Just move
code.push(I::mov(operand_reg, result_reg));
}
UnaryOperator::Dereference => {
code.push(I::ldw_reg(operand_reg, result_reg));
}
UnaryOperator::AddressOf => {
// ensure the referenced variable is on the stack and return its
// address.
let (offset, alloc_code) =
self.allocator.free_register(&operand_reg)?;
code.push(alloc_code);
code.push(I::iadd_dest(
Register::Spr,
offset - self.allocator.get_stack_offset(),
result_reg,
));
}
UnaryOperator::SizeOf => {
if let Ok(id) = operand.type_id() {
let size = id.size();
code.push(I::lwi(size as u32, result_reg));
}
}
UnaryOperator::Increment => {
// prefix increment
code.push(I::mov(operand_reg, result_reg));
code.push(I::iadd_dest(operand_reg, 1, result_reg));
}
UnaryOperator::Decrement => {
// prefix decrement
code.push(I::mov(operand_reg, result_reg));
code.push(I::iadd_dest(operand_reg, -1, result_reg));
}
UnaryOperator::BitwiseNot => {
code.push(I::not(operand_reg, result_reg));
}
UnaryOperator::LogicalNot => {
return Err(CompilerError::Unimplemented(
"Assembler/ISA does not yet support logical not".to_string(),
));
}
_ => {
return Err(CompilerError::Generic(format!(
"{op} is postfix only!"
)));
}
}
self.allocator.free_temp(operand_reg);
Ok((result_reg, code))
}
Expression::Call {
func: Call { name, args },
..
} => {
// first evaluate all the args we're going to need
let mut arg_regs = Vec::new();
for arg in args.iter().rev() {
let (arg_reg, arg_code) =
self.generate_expression(arg, true, func_body)?;
code.append(arg_code);
arg_regs.push(arg_reg);
}
// Save caller-saved registers and track which ones we saved
let saved_regs = self.allocator.get_caller_saved_registers();
for reg in &saved_regs {
// spill variables to stack
code.push(self.allocator.free_register(reg).unwrap().1);
}
// Evaluate and push arguments in reverse order
for (i, arg_reg) in arg_regs.iter().enumerate() {
code.push(I::comment(format!("push arg {}", args.len() - 1 - i)));
code.push(I::push(*arg_reg));
}
if self.symbols.contains(&name.name) {
// Call local function
code.push(I::call(name.to_string()));
} else if let Some(ns) = name.namespace.clone()
&& self.imports.contains_key(&ns)
{
code.push(I::call(name.to_string()));
} else {
return Err(CompilerError::Undefined(name.clone()));
}
let result_reg: Register;
if use_result {
let (temp_result_reg, result_alloc) = self.allocator.alloc_temp()?;
result_reg = temp_result_reg;
code.append(result_alloc);
code.push(I::pop(result_reg));
// Clean up arguments
if args.len() > 1 {
for _ in 0..(args.len() - 1) {
code.push(I::pop(Register::Zero));
}
}
} else {
result_reg = Register::Zero;
// Clean up arguments
if args.len() > 0 {
for _ in 0..(args.len()) {
code.push(I::pop(Register::Zero));
}
}
}
// Free argument registers
for reg in arg_regs {
self.allocator.free_temp(reg);
}
Ok((result_reg, code))
}
Expression::IndexAccess {
expr,
index,
type_id,
} => {
let (expr_reg, expr_alloc) =
self.generate_expression(expr, true, func_body)?;
code.append(expr_alloc);
let (index_reg, index_alloc) =
self.generate_expression(index, true, func_body)?;
code.append(index_alloc);
let (result_reg, result_alloc) = self.allocator.alloc_temp()?;
code.append(result_alloc);
// add the expr pointer to the index to get the final address.
code.push(I::add(expr_reg, index_reg, result_reg));
// load the value at the address.
code.push(I::ldw_reg(result_reg, result_reg));
self.allocator.free_temp(expr_reg);
self.allocator.free_temp(index_reg);
Ok((result_reg, code))
}
Expression::MemberAccess {
expr,
field_name,
type_id,
} => Err(CompilerError::Unimplemented(
"Structs are not yet implemented!".to_string(),
)),
Expression::TypeCast {
expr,
target_type,
type_id,
} => {
let (expr_reg, expr_code) =
self.generate_expression(expr, true, func_body)?;
// not sure if we actually need to do anything here.
// for now we just return the previous expression.
Ok((expr_reg, expr_code))
}
}
}
// Helper for generating unique labels
fn get_unique_label(&mut self) -> String {
// You'd implement a counter here
static COUNTER: AtomicU32 = AtomicU32::new(0);
let val = COUNTER.fetch_add(1, std::sync::atomic::Ordering::SeqCst);
(val + 1).to_string()
}
}
@@ -0,0 +1,797 @@
use std::fmt;
use crate::backend::dsa::registers::Register;
pub struct InsBlock {
instructions: Vec<Instruction>,
}
impl InsBlock {
pub fn new() -> Self {
Self {
instructions: vec![],
}
}
pub fn insert(&mut self, index: usize, instr: Instruction) {
self.instructions.insert(index, instr);
}
pub fn push(&mut self, instr: Instruction) {
self.instructions.push(instr);
}
pub fn append(&mut self, mut other: Self) {
self.instructions.append(&mut other.instructions);
}
pub fn extend(&mut self, instrs: impl IntoIterator<Item = Instruction>) {
self.instructions.extend(instrs);
}
pub fn is_empty(&self) -> bool {
self.instructions.is_empty()
}
pub fn len(&self) -> usize {
self.instructions.len()
}
pub fn iter(&self) -> impl Iterator<Item = &Instruction> {
self.instructions.iter()
}
}
impl From<Vec<Instruction>> for InsBlock {
fn from(instructions: Vec<Instruction>) -> Self {
Self { instructions }
}
}
impl From<Instruction> for InsBlock {
fn from(instr: Instruction) -> Self {
Self {
instructions: vec![instr],
}
}
}
#[derive(Debug, Clone)]
pub enum Instruction {
// Labels and comments
Label(Label),
Comment {
text: String,
top_level: bool,
},
Newline,
// Data Directives
Db {
label: String,
data: Vec<u8>,
},
Dh {
label: String,
data: Vec<u16>,
},
Dw {
label: String,
data: Vec<u32>,
},
DString {
// alias for db.
label: String,
data: String,
},
Resx {
label: String,
size: u32,
},
// Include
Include {
name: String,
path: String,
},
// Data movement
Mov {
src: Register,
dest: Register,
},
Movs {
src: Register,
dest: Register,
},
// Memory operations
Ldb {
src: MemOperand,
dest: Register,
},
Ldh {
src: MemOperand,
dest: Register,
},
Ldw {
src: MemOperand,
dest: Register,
},
Stb {
src: Register,
dest: MemOperand,
},
Sth {
src: Register,
dest: MemOperand,
},
Stw {
src: Register,
dest: MemOperand,
},
// Immediate loads
Lli {
imm: Imm,
dest: Register,
},
Lui {
imm: Imm,
dest: Register,
},
Lwi {
imm: Imm,
dest: Register,
},
LwiLabel {
label: String,
dest: Register,
},
// Arithmetic
Add {
src1: Register,
src2: Register,
dest: Register,
},
Sub {
src1: Register,
src2: Register,
dest: Register,
},
IAdd {
src: Register,
imm: Imm,
dest: Option<Register>,
},
ISub {
src: Register,
imm: Imm,
dest: Option<Register>,
},
Inc {
reg: Register,
},
Dec {
reg: Register,
},
// Bitwise
And {
src1: Register,
src2: Register,
dest: Register,
},
Or {
src1: Register,
src2: Register,
dest: Register,
},
Xor {
src1: Register,
src2: Register,
dest: Register,
},
Not {
src: Register,
dest: Register,
},
Nand {
src1: Register,
src2: Register,
dest: Register,
},
Nor {
src1: Register,
src2: Register,
dest: Register,
},
Xnor {
src1: Register,
src2: Register,
dest: Register,
},
// Shifts
Shl {
src1: Register,
r_shamt: Register,
i_shamt: u16,
dest: Register,
},
Shr {
src1: Register,
r_shamt: Register,
i_shamt: u16,
dest: Register,
},
// Comparison
Cmp {
reg1: Register,
reg2: Register,
},
// Jumps
Jmp {
target: Label,
},
Jeq {
target: Label,
},
Jne {
target: Label,
},
Jgt {
target: Label,
},
Jge {
target: Label,
},
Jlt {
target: Label,
},
Jle {
target: Label,
},
// Stack
Push {
reg: Register,
},
Pop {
reg: Register,
},
// Function calls
Call {
target: String,
}, // namespace::function
Return,
// System
Hlt,
Nop,
Int {
code: u8,
},
}
pub enum DataDirective {
U8(Vec<u8>),
U16(Vec<u16>),
U32(Vec<u32>),
String(String),
Char(char),
}
impl fmt::Display for Instruction {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self {
Self::Label(l) => write!(f, "{}:", l),
Self::Newline => write!(f, ""), /* empty string as newlines are inserted */
// automatically.
Self::Comment { text, top_level } => write!(
f,
"{}",
text.lines()
.map(|line| format!(
"{}// {}",
if *top_level { "" } else { " " },
line.trim(),
))
.collect::<Vec<String>>()
.join("\n")
),
Self::Include { name, path } => write!(f, "include {name}: \"{}\"", path),
Self::Db { label, data } => write!(
f,
"db {}: {}",
label,
data.iter()
.map(|&b| format!("{:#04X}", b))
.collect::<Vec<String>>()
.join(", ")
),
Self::Dh { label, data } => write!(
f,
"dh {}: {}",
label,
data.iter()
.map(|&b| format!("{:#06X}", b))
.collect::<Vec<String>>()
.join(", ")
),
Self::Dw { label, data } => write!(
f,
"dw {}: {}",
label,
data.iter()
.map(|&b| format!("{:#08X}", b))
.collect::<Vec<String>>()
.join(", ")
),
Self::DString { label, data } => write!(f, "db {}: \"{}\"", label, data),
Self::Resx { label, size } => write!(f, "resx {}: {}", label, size),
Self::Mov { src, dest } => write!(f, " mov {}, {}", src, dest),
Self::Movs { src, dest } => write!(f, " movs {}, {}", src, dest),
Self::Ldb { src: addr, dest } => {
let (reg, offset) = reg_and_offset(addr);
write!(f, " ldb {}, {}, {}", reg, dest, offset)
}
Self::Ldh { src: addr, dest } => {
let (reg, offset) = reg_and_offset(addr);
write!(f, " ldh {}, {}, {}", reg, dest, offset)
}
Self::Ldw { src, dest } => {
let (reg, offset) = reg_and_offset(src);
write!(f, " ldw {}, {}, {}", reg, dest, offset)
}
// Self::Ldbs { addr, dest } => {
// write!(f, " ldbs {}, {}", format_mem_operand(addr), dest)
// }
// Self::Ldhs { addr, dest } => {
// write!(f, " ldhs {}, {}", format_mem_operand(addr), dest)
// }
// Self::Ldws { addr, dest } => {
// write!(f, " ldws {}, {}", format_mem_operand(addr), dest)
// }
Self::Stb { src, dest: addr } => {
let (reg, offset) = reg_and_offset(addr);
write!(f, " stb {}, {}, {}", src, reg, offset)
}
Self::Sth { src, dest: addr } => {
let (reg, offset) = reg_and_offset(addr);
write!(f, " sth {}, {}, {}", src, reg, offset)
}
Self::Stw { src, dest: addr } => {
let (reg, offset) = reg_and_offset(addr);
write!(f, " stw {}, {}, {}", src, reg, offset)
}
Self::Lli { imm, dest } => write!(f, " lli {}, {}", imm, dest),
Self::Lui { imm, dest } => write!(f, " lui {}, {}", imm, dest),
Self::Lwi { imm, dest } => write!(f, " lwi {}, {}", imm, dest),
Self::LwiLabel { label, dest } => write!(f, " lwi {}, {}", label, dest),
// arithmetic
Self::Add { src1, src2, dest } => {
write!(f, " add {}, {}, {}", src1, src2, dest)
}
Self::Sub { src1, src2, dest } => {
write!(f, " sub {}, {}, {}", src1, src2, dest)
}
Self::And { src1, src2, dest } => {
write!(f, " and {}, {}, {}", src1, src2, dest)
}
Self::Or { src1, src2, dest } => {
write!(f, " or {}, {}, {}", src1, src2, dest)
}
Self::Nand { src1, src2, dest } => {
write!(f, " nand {}, {}, {}", src1, src2, dest)
}
Self::Xor { src1, src2, dest } => {
write!(f, " xor {}, {}, {}", src1, src2, dest)
}
Self::Nor { src1, src2, dest } => {
write!(f, " nor {}, {}, {}", src1, src2, dest)
}
Self::Not { src, dest } => {
write!(f, " not {} {}", src, dest)
}
Self::Xnor { src1, src2, dest } => {
write!(f, " xnor {}, {}, {}", src1, src2, dest)
}
Self::IAdd { src, imm, dest } => {
if let Some(d) = dest {
write!(f, " addi {}, {}, {}", src, imm, d)
} else {
write!(f, " addi {}, {}", src, imm)
}
}
Self::ISub { src, imm, dest } => {
if let Some(d) = dest {
write!(f, " subi {}, {}, {}", src, imm, d)
} else {
write!(f, " subi {}, {}", src, imm)
}
}
// shift instructions
Self::Shl {
src1,
r_shamt,
i_shamt,
dest,
} => {
write!(f, " shl {}, {}, {}, {}", src1, r_shamt, i_shamt, dest)
}
Self::Shr {
src1,
r_shamt,
i_shamt,
dest,
} => {
write!(f, " shl {}, {}, {}, {}", src1, r_shamt, i_shamt, dest)
}
// increment instructions
Self::Inc { reg } => write!(f, " inc {}", reg),
Self::Dec { reg } => write!(f, " dec {}", reg),
Self::Cmp { reg1, reg2 } => write!(f, " cmp {}, {}", reg1, reg2),
// jump instructions
Self::Jmp { target } => write!(f, " jmp {}", target),
Self::Jeq { target } => write!(f, " jeq {}", target),
Self::Jne { target } => write!(f, " jne {}", target),
Self::Jgt { target } => write!(f, " jgt {}", target),
Self::Jge { target } => write!(f, " jge {}", target),
Self::Jlt { target } => write!(f, " jlt {}", target),
Self::Jle { target } => write!(f, " jle {}", target),
// stack pseudoinstructions
Self::Push { reg } => write!(f, " push {}", reg),
Self::Pop { reg } => write!(f, " pop {}", reg),
// call & return pseudoinstructions
Self::Call { target } => write!(f, " call {}", target),
Self::Return => write!(f, " return"),
// misc instructions
Self::Int { code } => write!(f, " int {}", code),
Self::Hlt => write!(f, " hlt"),
Self::Nop => write!(f, " nop"),
}
}
}
impl Instruction {
// data directives
pub fn db_string(label: impl Into<String>, data: impl Into<String>) -> Self {
Self::DString {
label: label.into(),
data: data.into(),
}
}
pub fn db_word(label: impl Into<String>, data: u32) -> Self {
Self::Dw {
label: label.into(),
data: vec![data],
}
}
pub fn db_bytes(label: impl Into<String>, data: &[u8]) -> Self {
Self::Db {
label: label.into(),
data: data.to_vec(),
}
}
// Movement
pub fn mov<R1, R2>(src: R1, dest: R2) -> Self
where
R1: Into<Register>,
R2: Into<Register>,
{
Self::Mov {
src: src.into(),
dest: dest.into(),
}
}
// Memory loads
pub fn ldw_reg<R>(base: R, dest: Register) -> Self
where
R: Into<Register>,
{
Self::Ldw {
src: MemOperand::RegIndirect(base.into()),
dest,
}
}
pub fn ldw_reg_offset<R>(base: R, dest: Register, offset: i32) -> Self
where
R: Into<Register>,
{
Self::Ldw {
src: MemOperand::RegOffset(base.into(), offset),
dest,
}
}
pub fn ldw_label(label: impl Into<Label>, dest: Register) -> Self {
Self::Ldw {
src: MemOperand::Label(label.into()),
dest,
}
}
// Memory stores
pub fn stw_reg<R>(src: Register, base: R) -> Self
where
R: Into<Register>,
{
Self::Stw {
src,
dest: MemOperand::RegIndirect(base.into()),
}
}
pub fn stw_reg_offset<R>(src: Register, base: R, offset: i32) -> Self
where
R: Into<Register>,
{
Self::Stw {
src,
dest: MemOperand::RegOffset(base.into(), offset),
}
}
pub fn stw_label(src: Register, label: impl Into<Label>) -> Self {
Self::Stw {
src,
dest: MemOperand::Label(label.into()),
}
}
// Arithmetic
pub fn add(src1: Register, src2: Register, dest: Register) -> Self {
Self::Add { src1, src2, dest }
}
pub fn sub(src1: Register, src2: Register, dest: Register) -> Self {
Self::Sub { src1, src2, dest }
}
pub fn and(src1: Register, src2: Register, dest: Register) -> Self {
Self::And { src1, src2, dest }
}
pub fn or(src1: Register, src2: Register, dest: Register) -> Self {
Self::Or { src1, src2, dest }
}
pub fn xor(src1: Register, src2: Register, dest: Register) -> Self {
Self::Xor { src1, src2, dest }
}
pub fn not(src: Register, dest: Register) -> Self {
Self::Not { src, dest }
}
pub fn shl(src1: Register, r_shamt: Register, i_shamt: u16, dest: Register) -> Self {
Self::Shl {
src1,
r_shamt,
i_shamt,
dest,
}
}
pub fn shr(src1: Register, r_shamt: Register, i_shamt: u16, dest: Register) -> Self {
Self::Shr {
src1,
r_shamt,
i_shamt,
dest,
}
}
pub fn iadd(src: Register, value: i64) -> Self {
let imm = Imm(value.unsigned_abs() as u32);
if value < 0 {
Self::ISub {
src,
imm,
dest: None,
}
} else {
Self::IAdd {
src,
imm,
dest: None,
}
}
}
pub fn iadd_dest(src: Register, value: i32, dest: Register) -> Self {
let imm = Imm(value.unsigned_abs());
if value < 0 {
Self::ISub {
src,
imm,
dest: Some(dest),
}
} else {
Self::IAdd {
src,
imm,
dest: Some(dest),
}
}
}
pub fn inc(reg: Register) -> Self {
Self::Inc { reg }
}
pub fn dec(reg: Register) -> Self {
Self::Dec { reg }
}
// Immediate loads
pub fn lwi(value: u32, dest: Register) -> Self {
if value > 0xFFFF {
Self::Lwi {
imm: Imm(value),
dest,
}
} else {
Self::Lli {
imm: Imm(value),
dest,
}
}
}
pub fn lwi_label(label: impl Into<String>, dest: Register) -> Self {
Self::LwiLabel {
label: label.into(),
dest,
}
}
// Control flow
pub fn label(name: impl Into<String>) -> Self {
Self::Label(Label(name.into()))
}
pub fn jmp(target: impl Into<Label>) -> Self {
Self::Jmp {
target: target.into(),
}
}
pub fn jeq(target: impl Into<Label>) -> Self {
Self::Jeq {
target: target.into(),
}
}
pub fn cmp(reg1: Register, reg2: Register) -> Self {
Self::Cmp { reg1, reg2 }
}
// Stack
pub fn push(reg: Register) -> Self {
Self::Push { reg }
}
pub fn pop(reg: Register) -> Self {
Self::Pop { reg }
}
// Functions
pub fn call(target: impl Into<String>) -> Self {
Self::Call {
target: target.into(),
}
}
pub fn int(code: u8) -> Self {
Self::Int { code }
}
pub fn ret() -> Self {
Self::Return
}
// Utilities
pub fn comment(text: impl Into<String>) -> Self {
Self::Comment {
text: text.into(),
top_level: false,
}
}
pub fn global_comment(text: impl Into<String>) -> Self {
Self::Comment {
text: text.into(),
top_level: true,
}
}
pub fn include(name: impl Into<String>, path: impl Into<String>) -> Self {
Self::Include {
name: name.into(),
path: path.into(),
}
}
}
// Convenience trait for Label conversion
impl From<String> for Label {
fn from(s: String) -> Self {
Label(s)
}
}
impl From<&str> for Label {
fn from(s: &str) -> Self {
Label(s.to_string())
}
}
fn reg_and_offset(op: &MemOperand) -> (String, i32) {
match op {
MemOperand::RegIndirect(reg) => (reg.to_string(), 0),
MemOperand::RegOffset(reg, offset) => (reg.to_string(), *offset),
MemOperand::Label(label) => (label.to_string(), 0),
MemOperand::LabelOffset(label, offset) => (label.to_string(), *offset),
}
}
/// Memory operand for loads/stores
#[derive(Debug, Clone)]
pub enum MemOperand {
/// Register indirect: [reg]
RegIndirect(Register),
/// Register with offset: [reg + offset]
RegOffset(Register, i32),
/// Label: [label]
Label(Label),
/// Label with offset: [label + offset]
LabelOffset(Label, i32),
}
/// Immediate value (16-bit or 32-bit)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Imm(pub u32);
impl fmt::Display for Imm {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "{}", self.0)
}
}
/// Label reference
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Label(pub String);
impl fmt::Display for Label {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "{}", self.0)
}
}
+11
View File
@@ -0,0 +1,11 @@
use crate::model::{CompilerError, Program};
mod codegen;
mod instruction;
mod registers;
mod scope;
pub fn generate_code(ast: &Program, is_lib: bool) -> Result<String, CompilerError> {
let mut codegen = codegen::CodeGenerator::new(ast.clone(), is_lib);
codegen.generate()
}
+560
View File
@@ -0,0 +1,560 @@
use std::{collections::HashMap, fmt};
use crate::{
backend::dsa::instruction::{InsBlock, Instruction},
model::CompilerError,
};
/// Register allocator for DSA assembly generation
/// Manages general-purpose registers (rg0-rgf) and handles stack spilling
pub struct RegisterAllocator {
/// Available general-purpose registers
/// Maps variable names to their current location (register or stack offset)
variable_locations: HashMap<String, Location>,
/// Maps registers to the variables they currently hold
register_contents: HashMap<Register, String>,
/// Current stack offset for local variables (relative to bpr)
/// Starts at -4 (going downward from base pointer)
stack_offset: i32,
/// Track which registers are currently in use
in_use: Vec<(Register, bool)>,
}
#[derive(Debug, Clone)]
pub struct Location {
register: Option<Register>,
stack: Option<i32>,
}
impl Location {
pub fn stack(offset: i32) -> Self {
Location {
register: None,
stack: Some(offset),
}
}
pub fn register(register: Register) -> Self {
Location {
register: Some(register),
stack: None,
}
}
}
impl RegisterAllocator {
pub fn new() -> Self {
// Initialize with available GP registers (rg0-rgf = 16 registers)
let in_use = vec![
Register::Rg0,
Register::Rg1,
Register::Rg2,
Register::Rg3,
Register::Rg4,
Register::Rg5,
Register::Rg6,
Register::Rg7,
Register::Rg8,
Register::Rg9,
Register::Rga,
Register::Rgb,
Register::Rgc,
Register::Rgd,
Register::Rge,
Register::Rgf,
]
.iter()
.map(|&reg| (reg, false))
.collect();
RegisterAllocator {
// available_registers: registers,
variable_locations: HashMap::new(),
register_contents: HashMap::new(),
stack_offset: -4, // Start at -4 (first local below saved bpr)
in_use,
}
}
/// Allocate a temporary register for expression evaluation
/// Returns the register name and optionally assembly code to save it
pub fn alloc_temp(&mut self) -> Result<(Register, InsBlock), CompilerError> {
// Try to find an unused register
// println!("finding! {:#?}", self.in_use);
if let Some(reg) = self.find_free_register() {
self.in_use[reg as usize].1 = true;
return Ok((reg, InsBlock::new()));
}
// All registers in use - need to spill one
// Choose the first register with a variable we can spill
// Find a register to spill
// let reg_to_spill = self
// .available_registers
// .iter()
// .find(|reg| self.register_contents.contains_key(*reg))
// .cloned();
// if let Some(reg) = reg_to_spill {
// // Spill this variable to stack
// let spill_code = self.spill_register(&reg)?;
// code.extend(spill_code);
// self.in_use.insert(reg.clone(), true);
// return Ok((reg, code));
// }
todo!("an efficient stack spilling algorithm. needs scope awareness.");
Err(CompilerError::Generic(
"All registers are used up yet there are no variables to spill to the stack"
.to_string(),
))
}
// fn set_in_use(&mut self, reg: Register, in_use: bool) {
// self.in_use[reg as usize].1 = in_use;
// }
/// Free a temporary register after use
/// NOTE: This will NOT free registers that contain variables!
/// Variables persist throughout their scope and must not be freed
pub fn free_temp(&mut self, reg: Register) {
// Check if this register contains a variable
if self.register_contents.contains_key(&reg) {
// This register holds a variable - don't free it!
// Variables are only freed when they go out of scope via free_var()
return;
}
// This is a true temporary - safe to free
if !matches!(reg, Register::Zero | Register::Null) {
self.in_use[reg as usize].1 = false;
}
}
pub fn free_var(&mut self, var: &str) {
// Check if this variable is in a register
if let Some(location) = self.variable_locations.get(var).cloned() {
if let Some(reg) = location.register
&& !matches!(reg, Register::Zero | Register::Null)
{
self.register_contents.remove(&reg);
self.in_use[reg as usize].1 = false;
}
self.variable_locations.remove(var);
}
}
/// Allocate a register for a named variable
/// Returns the register and any necessary assembly code
pub fn alloc_var(
&mut self,
var_name: &str,
) -> Result<(Register, InsBlock), CompilerError> {
if let Some(mut location) = self.variable_locations.get(var_name).cloned() {
// if the var is in a register we can use it already.
if let Some(reg) = location.register {
return Ok((reg, InsBlock::new()));
}
// if the variable is on the stack only, we need to get it in a register.
if let Some(offset) = location.stack {
// Variable was pushed, need to calculate actual position and update its
// location.
let (reg, mut code) = self.alloc_temp()?;
// acknowledge var is now in a reg as well.
location.register = Some(reg);
// Load from bpr + offset (offset is negative)
// code.push(format!("\tsubi bpr {} {}", -(offset + 4), reg));
code.push(Instruction::ldw_reg_offset(
Register::Spr,
reg,
offset - self.stack_offset,
));
// Update location to register
self.variable_locations
.insert(var_name.to_string(), location);
self.register_contents.insert(reg, var_name.to_string());
return Ok((reg, code));
}
}
// Variable doesn't have a location yet, allocate a new register
let (reg, code) = self.alloc_temp()?;
self.variable_locations
.insert(var_name.to_string(), Location::register(reg));
self.register_contents.insert(reg, var_name.to_string());
Ok((reg, code))
}
/// Get the current location of a variable
pub fn _get_var_location(&self, var_name: &str) -> Option<&Location> {
self.variable_locations.get(var_name)
}
/// Load a variable into a register (allocating if necessary)
/// Returns the register and assembly code to load it
pub fn load_var(
&mut self,
var_name: &str,
) -> Result<(Register, InsBlock), CompilerError> {
self.alloc_var(var_name)
}
/// Store a value from a register into a variable
/// Updates tracking and returns any necessary assembly code
pub fn store_var(&mut self, var_name: &str, source_reg: &Register) -> InsBlock {
let mut block = InsBlock::new();
// Check if variable already has a location
if let Some(location) = self.variable_locations.get(var_name) {
// if the variable exists in a register we write to that.
match location.register {
Some(reg) if reg == *source_reg => {
block.push(Instruction::mov(*source_reg, reg));
return block;
}
_ => (),
}
// if the variable exists on the stack but not a register we write here.
if let Some(offset) = location.stack {
block.push(Instruction::stw_reg_offset(
*source_reg,
Register::Spr,
offset - self.stack_offset,
));
return block;
}
}
// Variable doesn't exist yet, we can just use the same reg.
// if we can avoid a move, absolutely do that.
// if this is true then there's no permanent variable here so it's safe to use.
if !self.register_contents.contains_key(source_reg) {
self.variable_locations
.insert(var_name.to_string(), Location::register(*source_reg));
self.register_contents
.insert(*source_reg, var_name.to_string());
self.in_use[*source_reg as usize].1 = true;
return block;
}
// if current register isn't free, (eg is another variable) we assign somewhere
// else.
if let Some(free_reg) = self.find_free_register() {
self.variable_locations
.insert(var_name.to_string(), Location::register(free_reg));
self.register_contents
.insert(free_reg, var_name.to_string());
self.in_use[free_reg as usize].1 = true;
block.push(Instruction::mov(*source_reg, free_reg));
return block;
}
// No free registers - allocate on stack
// code.push(format!("\tstw {}, bpr, {}", source_reg, self.stack_offset));
// self.variable_locations
// .insert(var_name.to_string(), Location::Stack(self.stack_offset));
// self.stack_offset -= 4; // Move to next stack slot
//
todo!("an efficient stack spilling algorithm. needs scope awareness.");
}
/// spill a register to the stack (WITHOUT FREEING)
/// DO NOT USE this if it's for a pointer!!!!
pub fn _spill_register(&mut self, reg: &Register) -> Result<InsBlock, CompilerError> {
let mut code = InsBlock::new();
// check if the variable is declared.
if let Some(var_name) = self.register_contents.get(reg).cloned()
&& let Some(location) = self.variable_locations.get_mut(&var_name)
{
// check if var is on the stack
if let Some(offset) = location.stack {
code.push(Instruction::stw_reg_offset(
*reg,
Register::Spr,
offset - self.stack_offset,
));
return Ok(code);
}
// Track that we pushed one word
self.stack_offset -= 4;
// if the variable is not on the stack:
// push register to stack (spr decrements automatically)
let offset = self.stack_offset;
code.push(Instruction::push(*reg));
// Update variable location - it's now at current spr
// Note: We track offset from bpr for consistency
location.stack = Some(offset);
Ok(code)
} else {
Err(CompilerError::Generic(format!(
"Register {} does not contain a variable to spill!",
reg
)))
}
}
/// free a register by spilling it to the stack.
/// Returns assembly code to perform the spill
pub fn free_register(
&mut self,
reg: &Register,
) -> Result<(i32, Instruction), CompilerError> {
// check if the variable is declared.
if let Some(var_name) = self.register_contents.get(reg).cloned()
&& let Some(location) = self.variable_locations.get_mut(&var_name)
{
// check if var name is on the stack
if let Some(offset) = location.stack {
// store current register value in stack location
let code = Instruction::stw_reg_offset(
*reg,
Register::Spr,
offset - self.stack_offset,
);
// free the register.
location.register = None;
self.register_contents.remove(reg);
return Ok((offset, code));
}
// Track that we pushed one word
self.stack_offset -= 4;
let offset = self.stack_offset;
let code = Instruction::push(*reg);
// Update variable location
// Note: We track offset from bpr for consistency
location.stack = Some(offset);
location.register = None;
self.register_contents.remove(reg);
Ok((offset, code))
} else {
Err(CompilerError::Generic(format!(
"Register {} does not contain a variable to spill!",
reg
)))
}
}
/// Find a free register (not currently in use)
fn find_free_register(&self) -> Option<Register> {
self.in_use
.iter()
.filter(|(_, in_use)| !*in_use)
.map(|(reg, _)| *reg)
.next()
}
/// Spill all registers to stack (useful before function calls)
pub fn _spill_all(&mut self) -> InsBlock {
let mut code = InsBlock::new();
let regs_to_spill: Vec<Register> =
self.register_contents.keys().cloned().collect();
for reg in regs_to_spill {
if let Ok(spill_code) = self.free_register(&reg) {
code.push(spill_code.1);
}
}
code
}
/// Get the total stack offset
pub fn get_stack_offset(&self) -> i32 {
self.stack_offset
}
/// Get the total stack space needed for local variables
pub fn _get_stack_size(&self) -> i32 {
-self.stack_offset // Convert negative offset to positive size
}
/// Reset allocator for a new function
pub fn reset(&mut self) {
self.variable_locations.clear();
self.register_contents.clear();
self.stack_offset = -4;
self.in_use = vec![
Register::Rg0,
Register::Rg1,
Register::Rg2,
Register::Rg3,
Register::Rg4,
Register::Rg5,
Register::Rg6,
Register::Rg7,
Register::Rg8,
Register::Rg9,
Register::Rga,
Register::Rgb,
Register::Rgc,
Register::Rgd,
Register::Rge,
Register::Rgf,
]
.iter()
.map(|&reg| (reg, false))
.collect();
}
/// Get list of registers that contain variables and are in use
/// These need to be saved before function calls
pub fn get_caller_saved_registers(&self) -> Vec<Register> {
self.register_contents
.iter()
.filter(|(reg, _)| {
self.in_use
.get(**reg as usize)
.unwrap_or(&(Register::Null, false))
.1
})
.map(|(reg, _)| *reg)
.collect()
}
}
#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash)]
pub enum Register {
// general purpose
Rg0 = 0,
Rg1 = 1,
Rg2 = 2,
Rg3 = 3,
Rg4 = 4,
Rg5 = 5,
Rg6 = 6,
Rg7 = 7,
Rg8 = 8,
Rg9 = 9,
Rga = 10,
Rgb = 11,
Rgc = 12,
Rgd = 13,
Rge = 14,
Rgf = 15,
// special
Bpr,
Spr,
Ret,
Acc,
// read only
Pcx,
Zero,
// null
Null,
}
impl Register {
pub fn get_gp() -> [Register; 16] {
[
Register::Rg0,
Register::Rg1,
Register::Rg2,
Register::Rg3,
Register::Rg4,
Register::Rg5,
Register::Rg6,
Register::Rg7,
Register::Rg8,
Register::Rg9,
Register::Rga,
Register::Rgb,
Register::Rgc,
Register::Rgd,
Register::Rge,
Register::Rgf,
]
}
pub fn is_gp(&self) -> bool {
(*self as u8) < 16
}
pub fn from_index(idx: usize) -> Register {
match idx {
0 => Register::Rg0,
1 => Register::Rg1,
2 => Register::Rg2,
3 => Register::Rg3,
4 => Register::Rg4,
5 => Register::Rg5,
6 => Register::Rg6,
7 => Register::Rg7,
8 => Register::Rg8,
9 => Register::Rg9,
10 => Register::Rga,
11 => Register::Rgb,
12 => Register::Rgc,
13 => Register::Rgd,
14 => Register::Rge,
15 => Register::Rgf,
_ => unreachable!("this function shouldn't ever be called with idx>15"),
}
}
}
impl fmt::Display for Register {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
Self::Rg0 => write!(f, "rg0"),
Self::Rg1 => write!(f, "rg1"),
Self::Rg2 => write!(f, "rg2"),
Self::Rg3 => write!(f, "rg3"),
Self::Rg4 => write!(f, "rg4"),
Self::Rg5 => write!(f, "rg5"),
Self::Rg6 => write!(f, "rg6"),
Self::Rg7 => write!(f, "rg7"),
Self::Rg8 => write!(f, "rg8"),
Self::Rg9 => write!(f, "rg9"),
Self::Rga => write!(f, "rga"),
Self::Rgb => write!(f, "rgb"),
Self::Rgc => write!(f, "rgc"),
Self::Rgd => write!(f, "rgd"),
Self::Rge => write!(f, "rge"),
Self::Rgf => write!(f, "rgf"),
Self::Acc => write!(f, "acc"),
Self::Ret => write!(f, "ret"),
Self::Bpr => write!(f, "bpr"),
Self::Spr => write!(f, "spr"),
Self::Zero => write!(f, "zero"),
Self::Pcx => write!(f, "pcx"),
Self::Null => write!(f, "null"),
}
}
}
+394
View File
@@ -0,0 +1,394 @@
use std::{cell::RefCell, collections::HashMap, ops::Deref, rc::Rc};
use crate::{
backend::dsa::{
instruction::{InsBlock, Instruction},
registers::{Register, RegisterAllocator},
},
error,
model::{CompilerError, Name, TypeId},
};
/// scope object
pub struct Scope<'a> {
/// outer scope, for a function this will be the global scope.
parent: Option<&'a mut Scope<'a>>,
alloc: Rc<RefCell<Allocator>>,
/// is the scope a function body or just a loop?
/// depending on the type, ending a scope will have different behaviour
r#type: ScopeType,
/// variables
variables: HashMap<String, Variable>,
entry_stack_offset: i32,
}
impl<'a> Scope<'a> {
pub fn new() -> Scope<'a> {
let alloc = Rc::new(RefCell::new(Allocator::new()));
let entry_stack_offset = alloc.borrow().get_stack_offset();
Self {
alloc,
entry_stack_offset,
parent: None,
r#type: ScopeType::Function,
variables: HashMap::new(),
}
}
pub fn new_from(parent: &'a mut Scope<'a>, r#type: ScopeType) -> Scope<'a> {
let alloc = Rc::clone(&parent.alloc);
let entry_stack_offset = alloc.borrow().get_stack_offset();
Self {
alloc,
entry_stack_offset,
parent: Some(parent),
r#type,
variables: HashMap::new(),
}
}
pub fn create_var(
&mut self,
name: String,
r#type: TypeId,
) -> Result<(), CompilerError> {
let mut var = Variable::new(name, r#type.clone());
if r#type.size() > 4 {
let slot = self.alloc.borrow_mut().allocate_stack_slot(r#type.size());
var.stack_slot = Some(slot);
} else {
let reg = self.alloc.borrow_mut().allocate_var()?;
var.register = Some(reg);
}
self.variables.insert(var.name.clone(), var);
Ok(())
}
pub fn alloc_temp(&mut self) -> Result<TempReg, CompilerError> {
self.alloc.borrow_mut().allocate_temp()
}
pub fn free_temp(&mut self, temp: &TempReg) {
self.alloc.borrow_mut().free_temp(temp)
}
pub fn free_var(&mut self, reg: &AssignedReg) {
self.alloc.borrow_mut().free_var(reg);
}
pub fn close(&mut self) {
// tell the allocator that this scope is closing
// this reverts the stack offset to what it was before this scope was created.
self.alloc.clone().borrow_mut().destroy_scope(self);
for var in self.variables.clone().values() {
if let Some(reg) = var.register {
self.free_var(&reg);
}
}
}
fn get_var(&mut self, var: &str) -> Option<&mut Variable> {
self.variables.get_mut(var)
}
pub fn offset_read(
&mut self,
var: &str,
offset: i32,
) -> Result<(TempReg, Instruction), CompilerError> {
if let Some(var) = self.get_var(var) {
let slot = var.stack_slot.ok_or_else(|| {
error("Attempt to read from a var without a stack slot!")
})?;
return self.alloc.borrow_mut().offset_read(&slot, offset);
}
Err(CompilerError::Undefined(Name::new(var, None)))
}
pub fn offset_write(
&mut self,
reg: &TempReg,
var: &str,
offset: i32,
) -> Result<Instruction, CompilerError> {
if let Some(var) = self.get_var(var) {
let slot = var.stack_slot.ok_or_else(|| {
error("Attempt to write to a var without a stack slot!")
})?;
return Ok(self.alloc.borrow_mut().offset_write(reg, &slot, offset));
}
Err(CompilerError::Undefined(Name::new(var, None)))
}
pub fn load_var(
&mut self,
var: &str,
) -> Result<(AssignedReg, Instruction), CompilerError> {
if let Some(v) = self.get_var(var).cloned()
&& let Some(slot) = v.stack_slot
{
let res = self.alloc.borrow_mut().load_var(&slot)?;
self.get_var(var).unwrap().register = Some(res.0);
return Ok(res);
}
panic!("e")
}
pub fn spill_var(&mut self, var: &str) -> Result<Instruction, CompilerError> {
if let Some(v) = self.get_var(var).cloned()
&& let Some(rg) = v.register
{
let mut slot = v.stack_slot;
let res = self.alloc.borrow_mut().spill_var(&rg, &mut slot);
self.get_var(var).unwrap().stack_slot = slot;
return res;
}
Err(CompilerError::Undefined(Name::new(var, None)))
}
}
impl Drop for Scope<'_> {
fn drop(&mut self) {
self.close()
}
}
#[derive(PartialEq, Copy, Clone, Debug)]
pub enum ScopeType {
Function,
IfBlock,
LoopBlock,
}
#[derive(Clone)]
pub struct Variable {
pub name: String,
/// the type of the variable.
r#type: TypeId,
/// size taken up in bytes.
/// if size > 4, value must be stored on the stack.
pub size: usize,
pub stack_slot: Option<StackSlot>,
pub register: Option<AssignedReg>,
}
impl Variable {
pub fn new(name: String, r#type: TypeId) -> Self {
Self {
name,
size: r#type.size(),
r#type,
stack_slot: None,
register: None,
}
}
}
pub struct Allocator {
stack_offset: i32,
in_use: [(Register, bool); 16],
}
impl Allocator {
pub fn new() -> Self {
let mut in_use = [(Register::Null, false); 16];
in_use.copy_from_slice(&Register::get_gp().map(|r| (r, false))[0..16]);
Self {
stack_offset: 0,
in_use,
}
}
pub fn get_stack_offset(&self) -> i32 {
self.stack_offset
}
pub fn destroy_scope(&mut self, scope: &mut Scope) {
self.stack_offset = scope.entry_stack_offset;
for var in scope.variables.drain() {
if let Some(assigned) = var.1.register {
self.free_var(&assigned);
}
}
}
// what we need:
// - create var in register from temporary register. free temp and use it.
//
// - create var on stack from struct/array literal. return stack offset to write to.
//
// - spill var from register to stack. return stack offset to write to.
//
// - read/write var from stack+offset into register to use while preserving the stack
// slot.
//
// - read / write bytes from the stack+offset in a larger variable into a register.
pub fn offset_read(
&mut self,
slot: &StackSlot,
offset: i32,
) -> Result<(TempReg, Instruction), CompilerError> {
let register = self.allocate_temp()?;
// instruction: reg = *(&var + offset)
Ok((
register.clone(),
Instruction::ldw_reg_offset(
Register::Spr,
*register,
(**slot + offset) - self.stack_offset,
),
))
}
pub fn offset_write(
&mut self,
reg: &TempReg,
slot: &StackSlot,
offset: i32,
) -> Instruction {
// instruction: *(&var + offset) = reg
Instruction::stw_reg_offset(
**reg,
Register::Spr,
(**slot + offset) - self.stack_offset,
)
}
pub fn load_var(
&mut self,
slot: &StackSlot,
) -> Result<(AssignedReg, Instruction), CompilerError> {
let reg = self.allocate_var()?;
Ok((
reg.clone(),
Instruction::ldw_reg_offset(Register::Spr, *reg, **slot - self.stack_offset),
))
}
pub fn spill_var(
&mut self,
reg: &AssignedReg,
slot: &mut Option<StackSlot>,
// var: &mut Variable,
) -> Result<Instruction, CompilerError> {
if let Some(slot) = &slot {
let block = Instruction::stw_reg_offset(
**reg,
Register::Spr,
**slot - self.stack_offset,
);
self.free_var(reg);
return Ok(block);
}
// var doesn't have a stack slot so we need to create one
let new_slot = self.allocate_stack_slot(4); // alloc 4 bytes for reg value.
let block = Instruction::push(**reg);
self.free_var(reg);
*slot = Some(new_slot);
Ok(block)
}
pub fn allocate_stack_slot(&mut self, size: usize) -> StackSlot {
self.stack_offset -= size as i32;
let offset = self.stack_offset;
StackSlot(offset)
}
pub fn allocate_var(&mut self) -> Result<AssignedReg, CompilerError> {
if let Some(reg) = self.find_free_register() {
Ok(AssignedReg(reg))
} else {
Err(CompilerError::Generic(
"No free registers available".to_string(),
))
}
}
pub fn allocate_temp(&mut self) -> Result<TempReg, CompilerError> {
// allocates a temporary register
if let Some(reg) = self.find_free_register() {
Ok(TempReg(reg))
} else {
todo!("an efficient stack spilling algorithm. needs scope awareness.");
}
}
pub fn free_temp(&mut self, temp: &TempReg) {
// frees a temporary register.
self.in_use[**temp as usize].1 = false;
}
pub fn free_var(&mut self, reg: &AssignedReg) {
// frees a register.
self.in_use[**reg as usize].1 = false;
}
// if we have register(s) free, return the first one.
fn find_free_register(&mut self) -> Option<Register> {
self.in_use.iter_mut().find_map(|(reg, used)| {
if !*used {
*used = true;
Some(*reg)
} else {
None
}
})
}
}
#[derive(Clone, Copy, Debug)]
pub struct TempReg(Register);
#[derive(Clone, Copy, Debug)]
pub struct AssignedReg(Register);
#[derive(Clone, Copy, Debug)]
pub struct StackSlot(i32);
impl Deref for TempReg {
type Target = Register;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl Deref for AssignedReg {
type Target = Register;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl Deref for StackSlot {
type Target = i32;
fn deref(&self) -> &Self::Target {
&self.0
}
}
@@ -2,9 +2,13 @@ use crate::model::{CompilerError, Program};
mod dsa;
pub fn compiler_backend(ext: &str, ast: &Program) -> Result<String, CompilerError> {
pub fn compiler_backend(
ext: &str,
ast: &Program,
is_lib: bool,
) -> Result<String, CompilerError> {
match ext {
"dsa" => Ok(dsa::generate_code(ast)?),
"dsa" => Ok(dsa::generate_code(ast, is_lib)?),
_ => Err(CompilerError::Generic(format!(
"File type {} not supported",
ext
@@ -351,6 +351,7 @@ impl Parser {
op,
left: Box::new(expr),
right,
type_id: None,
};
}
@@ -371,6 +372,7 @@ impl Parser {
op,
left: Box::new(expr),
right,
type_id: None,
};
}
@@ -391,6 +393,7 @@ impl Parser {
op,
left: Box::new(expr),
right,
type_id: None,
};
}
@@ -407,7 +410,11 @@ impl Parser {
if let Some(op) = op {
self.advance();
let operand = Box::new(self.parse_unary()?);
return Ok(Expression::Unary { op, operand });
return Ok(Expression::Unary {
op,
operand,
type_id: None,
});
}
self.parse_primary()
@@ -418,7 +425,10 @@ impl Parser {
TokenType::Number(n) => {
let value = *n;
self.advance();
Ok(Expression::Number(value as isize))
Ok(Expression::Number {
value: value as isize,
type_id: None,
})
}
TokenType::Identifier(name) => {
let name = name.clone();
@@ -445,6 +455,7 @@ impl Parser {
namespace: None,
},
args,
type_id: None,
})
} else {
Ok(Expression::Variable {
File diff suppressed because it is too large Load Diff
@@ -1,21 +1,21 @@
use common::logging::log;
use crate::model::{CompilerError, Program};
use common::logging::Logger;
use parser::{ParseResult, Parser};
use semantic_analyser::Analyser;
// use semantic_analyser::Analyser;
pub mod lexer;
pub mod parser;
pub mod semantic_analyser;
// pub mod semantic_analyser;
pub fn generate_ast(input: &str) -> Result<Program, CompilerError> {
log("Tokenising Input...");
pub fn generate_ast(input: &str, logger: &Logger) -> Result<Program, CompilerError> {
logger.info("Tokenising Input...");
let lexer = lexer::Lexer::new(&input);
let tokens = lexer.collect::<Vec<_>>();
// println!("{tokens:?}");
log(&format!("Parsing {} Tokens...", tokens.len()));
// println!("{tokens:#?}");
logger.info(&format!("Parsing {} Tokens...", tokens.len()));
let mut parser = Parser::new(tokens);
let ast = match parser.parse() {
@@ -27,12 +27,12 @@ pub fn generate_ast(input: &str) -> Result<Program, CompilerError> {
};
// println!("{ast:#?}");
log("Analyzing AST...");
log("Checking Type Information...");
logger.info("Analyzing AST...");
logger.info("Checking Type Information...");
let analyser = Analyser::new();
analyser.analyse(ast.clone()).unwrap();
// let mut analyser = Analyser::new();
// analyser.analyse(ast.clone()).unwrap();
log("Type Checking Complete...");
logger.info("Type Checking Complete...");
Ok(ast)
}
+990
View File
@@ -0,0 +1,990 @@
use super::lexer::Token;
use crate::model::{
AssignmentOperator, BinaryOperator, Block, Call, CompilerError, ConstExpr,
Declaration, Dependency, Expression, Number, Program, Statement, TypeId,
UnaryOperator, Variable,
};
use crate::{expect_tt, expect_value};
use std::ops::{ControlFlow, FromResidual, Try};
#[derive(Debug, Clone)]
pub enum ParseResult<T, E> {
Accept(T),
Deny,
Reject(E),
}
pub struct Parser {
tokens: Vec<Token>,
idx: usize,
}
impl Parser {
pub fn new(tokens: Vec<Token>) -> Self {
Self { tokens, idx: 0 }
}
pub fn parse(&mut self) -> ParseResult<Program, CompilerError> {
let mut declarations = Vec::new();
while let ParseResult::Accept(_) = self.peek_next() {
declarations.push(self.parse_declaration()?);
}
ParseResult::Accept(Program { declarations })
}
fn parse_declaration(&mut self) -> ParseResult<Declaration, CompilerError> {
if expect_tt!(self.peek_next()?, Fn).accepted() {
return self.parse_func();
}
if expect_tt!(self.peek_next()?, Struct).accepted() {
return self.parse_struct();
}
if expect_tt!(self.peek_next()?, Include).accepted() {
// expect include keyword
let _ = self.next();
// expect namespace identifier
let name = expect_value!(self.next()?, Identifier)?;
// expect colon
let _ = expect_tt!(self.next()?, Colon)?;
// expect string literal (module path)
let path = expect_value!(self.next()?, String)?;
// expect semicolon
let _ = expect_tt!(self.next()?, Semicolon)?;
return ParseResult::Accept(Declaration::Dependency(Dependency {
name: name.name,
path,
}));
}
if expect_tt!(self.peek_next()?, Const, Static).accepted() {
let is_const = match self.next()? {
Token::Const => true,
Token::Static => false,
_ => {
return ParseResult::Reject(CompilerError::Generic(String::from(
"This can't happen!",
)));
}
};
let var = self.parse_var_decl()?;
let _ = expect_tt!(self.next()?, Assign)?;
let value = self.next()?;
let init = match value {
Token::String(x) => Some(ConstExpr::String(x)),
Token::SignedInt(x, _) => Some(ConstExpr::Number(x)),
Token::UnsignedInt(x, _) => Some(ConstExpr::Number(x as i32)),
_ => {
return ParseResult::Reject(CompilerError::UnexpectedToken(
value.tt().to_string(),
));
}
};
let _ = expect_tt!(self.next()?, Semicolon)?;
return ParseResult::Accept(Declaration::Variable {
var,
init,
is_const,
});
}
ParseResult::Reject(CompilerError::UnexpectedEndOfInput)
}
fn parse_struct(&mut self) -> ParseResult<Declaration, CompilerError> {
let _ = expect_tt!(self.next()?, Struct)?;
let name = expect_value!(self.next()?, Identifier)?;
let _ = expect_tt!(self.next()?, LeftBrace)?;
let mut fields = Vec::new();
while expect_tt!(self.peek_next()?, Identifier).accepted() {
let arg = self.parse_var_decl()?;
fields.push(arg);
if expect_tt!(self.peek_next()?, Comma).accepted() {
self.next()?;
} else {
break;
}
}
let _ = expect_tt!(self.next()?, RightBrace)?;
ParseResult::Accept(Declaration::Struct { name, fields })
}
fn parse_func(&mut self) -> ParseResult<Declaration, CompilerError> {
// expect function keyword
let _ = expect_tt!(self.next()?, Fn);
// expect function name
let name = expect_value!(self.next()?, Identifier)?;
// expect left paren
let _ = expect_tt!(self.next()?, LeftParen)?;
let mut params = Vec::new();
while expect_tt!(self.peek_next()?, Identifier).accepted() {
let arg = self.parse_var_decl()?;
params.push(arg);
if expect_tt!(self.peek_next()?, Comma).accepted() {
self.next()?;
} else {
break;
}
}
// expect right paren
let _ = expect_tt!(self.next()?, RightParen)?;
// see if we can parse the return type!
let mut return_type = TypeId::Void;
if expect_tt!(self.peek_next()?, RightArrow).accepted() {
let _ = self.next();
return_type = self.parse_type()?;
}
// expect vald block
if expect_tt!(self.peek_next()?, LeftBrace).accepted() {
ParseResult::Accept(Declaration::Function {
name: name.name,
params,
return_type,
body: self.parse_block()?,
})
} else {
ParseResult::Reject(CompilerError::UnexpectedToken(
self.peek_next()?.tt().to_string(),
))
}
}
fn parse_block(&mut self) -> ParseResult<Block, CompilerError> {
// expect left brace
let _ = expect_tt!(self.next()?, LeftBrace)?;
let mut block = Vec::new();
while !expect_tt!(self.peek_next()?, RightBrace).accepted() {
block.push(self.parse_statement()?);
}
// expect right brace
let _ = expect_tt!(self.next()?, RightBrace)?;
ParseResult::Accept(block)
}
fn parse_statement(&mut self) -> ParseResult<Statement, CompilerError> {
// handle if statements
if expect_tt!(self.peek_next()?, If).accepted() {
self.next()?;
let condition = self.parse_expression()?;
let then_stmt = self.parse_block()?;
if !expect_tt!(self.peek_next()?, Else).accepted() {
return ParseResult::Accept(Statement::If {
condition,
then_stmt,
else_stmt: vec![],
});
}
let _ = expect_tt!(self.next()?, Else)?;
let else_stmt = self.parse_block()?;
return ParseResult::Accept(Statement::If {
condition,
then_stmt,
else_stmt,
});
}
// handle while loops
if expect_tt!(self.peek_next()?, While).accepted() {
self.next()?;
// expect valid expression
let expr = self.parse_expression()?;
// expect valid block after
let block = self.parse_block()?;
// return result
return ParseResult::Accept(Statement::While {
condition: expr,
body: block,
});
}
// handle indefinite loops
if expect_tt!(self.peek_next()?, Loop).accepted() {
self.next()?;
// parse the inner block
return ParseResult::Accept(Statement::Loop(self.parse_block()?));
}
if expect_tt!(self.peek_next()?, Return).accepted() {
self.next()?;
// handle case where nothing is returned
if expect_tt!(self.peek_next()?, Semicolon).accepted() {
return ParseResult::Accept(Statement::Return(None));
}
let expr = self.parse_expression()?;
expect_tt!(self.next()?, Semicolon)?;
return ParseResult::Accept(Statement::Return(Some(expr)));
}
if expect_tt!(self.peek_next()?, Break).accepted() {
self.next()?;
// expect semicolon
expect_tt!(self.next()?, Semicolon)?;
// return result
return ParseResult::Accept(Statement::Break);
}
if expect_tt!(self.peek_next()?, Continue).accepted() {
self.next()?;
// expect semicolon
expect_tt!(self.next()?, Semicolon)?;
// return result
return ParseResult::Accept(Statement::Continue);
}
// handle writes to pointers!
if expect_tt!(self.peek_next()?, Star).accepted() {
self.next()?;
let left = if expect_tt!(self.peek_next()?, Identifier).accepted() {
let identifier = expect_value!(self.next()?, Identifier)?;
Expression::Variable {
name: identifier,
expr_type: None,
}
} else if expect_tt!(self.peek_next()?, LeftParen).accepted() {
self.next()?;
let expr = self.parse_expression()?;
let _ = expect_tt!(self.next()?, RightParen).accepted();
expr
} else {
return ParseResult::Reject(CompilerError::UnexpectedToken(
self.peek_next()?.tt().to_string(),
));
};
let _ = expect_tt!(self.next()?, Assign)?;
let right = self.parse_expression()?;
// expect semicolon
expect_tt!(self.next()?, Semicolon)?;
// return result
return ParseResult::Accept(Statement::PtrWrite {
ptr: left,
value: right,
});
}
// handle let statements (declarations)
if expect_tt!(self.peek_next()?, Let).accepted() {
self.next();
// expect variable name and type.
let name = self.parse_var_decl()?;
// handle uninitialised variable case
if expect_tt!(self.peek_next()?, Semicolon).accepted() {
self.next();
return ParseResult::Accept(Statement::Declaration {
var: name,
value: None,
});
}
// handle initialised case
// expect equals
let _ = expect_tt!(self.next()?, Assign)?;
// expect a valid expression
let expr = self.parse_expression()?;
let _ = expect_tt!(self.next()?, Semicolon);
// return statement
return ParseResult::Accept(Statement::Declaration {
var: name,
value: Some(expr),
});
}
// handle an in-place function call
if let ParseResult::Accept(name) = expect_value!(self.peek_next()?, Identifier)
&& let ParseResult::Accept(operator) = expect_tt!(
self.peek(1)?,
Assign,
PlusEqual,
MinusEqual,
StarEqual,
SlashEqual,
PercentEqual,
AndEqual,
OrEqual,
XorEqual,
ShlEqual,
ShrEqual
)
{
// consume name token
self.next()?;
// pattern match to find operator
let operator = match operator {
Token::Assign => AssignmentOperator::Assign,
Token::PlusEqual => AssignmentOperator::AddAssign,
Token::MinusEqual => AssignmentOperator::SubAssign,
Token::StarEqual => AssignmentOperator::MulAssign,
Token::SlashEqual => AssignmentOperator::DivAssign,
Token::PercentEqual => AssignmentOperator::ModAssign,
Token::AndEqual => AssignmentOperator::AndAssign,
Token::OrEqual => AssignmentOperator::OrAssign,
Token::XorEqual => AssignmentOperator::XorAssign,
Token::ShlEqual => AssignmentOperator::LeftShiftAssign,
Token::ShrEqual => AssignmentOperator::RightShiftAssign,
_ => {
return ParseResult::Reject(CompilerError::UnexpectedToken(
self.peek_next()?.tt().to_string(),
));
}
};
// consume operator token
self.next()?;
let value = self.parse_expression()?;
let _ = expect_tt!(self.next()?, Semicolon);
return ParseResult::Accept(Statement::Assign {
varname: name.name,
operator,
value,
});
}
// parse an expression and a semicolon
let expr = self.parse_expression()?;
let _ = expect_tt!(self.next()?, Semicolon)?;
ParseResult::Accept(Statement::Expression { expr })
}
fn parse_expression(&mut self) -> ParseResult<Expression, CompilerError> {
self.parse_logical_or()
}
fn parse_logical_or(&mut self) -> ParseResult<Expression, CompilerError> {
let left = self.parse_logical_and()?;
let op = match self.peek_next()? {
Token::LogicalOr => BinaryOperator::LogicalOr,
_ => return ParseResult::Accept(left),
};
self.next()?;
ParseResult::Accept(Expression::Binary {
op,
left: Box::new(left),
right: Box::new(self.parse_logical_or()?),
type_id: Some(TypeId::U32),
})
}
fn parse_logical_and(&mut self) -> ParseResult<Expression, CompilerError> {
let left = self.parse_bitwise_or()?;
let op = match self.peek_next()? {
Token::LogicalAnd => BinaryOperator::LogicalAnd,
_ => return ParseResult::Accept(left),
};
self.next()?;
ParseResult::Accept(Expression::Binary {
op,
left: Box::new(left),
right: Box::new(self.parse_logical_and()?),
type_id: Some(TypeId::U32),
})
}
fn parse_bitwise_or(&mut self) -> ParseResult<Expression, CompilerError> {
let left = self.parse_bitwise_xor()?;
let op = match self.peek_next()? {
Token::Pipe => BinaryOperator::BitwiseOr,
_ => return ParseResult::Accept(left),
};
self.next()?;
ParseResult::Accept(Expression::Binary {
op,
left: Box::new(left),
right: Box::new(self.parse_bitwise_or()?),
type_id: Some(TypeId::U32),
})
}
fn parse_bitwise_xor(&mut self) -> ParseResult<Expression, CompilerError> {
let left = self.parse_bitwise_and()?;
let op = match self.peek_next()? {
Token::Caret => BinaryOperator::BitwiseXor,
_ => return ParseResult::Accept(left),
};
self.next()?;
ParseResult::Accept(Expression::Binary {
op,
left: Box::new(left),
right: Box::new(self.parse_bitwise_xor()?),
type_id: Some(TypeId::U32),
})
}
fn parse_bitwise_and(&mut self) -> ParseResult<Expression, CompilerError> {
let left = self.parse_comparison()?;
let op = match self.peek_next()? {
Token::Ampersand => BinaryOperator::BitwiseAnd,
_ => return ParseResult::Accept(left),
};
self.next()?;
ParseResult::Accept(Expression::Binary {
op,
left: Box::new(left),
right: Box::new(self.parse_bitwise_and()?),
type_id: Some(TypeId::U32),
})
}
fn parse_comparison(&mut self) -> ParseResult<Expression, CompilerError> {
let left = self.parse_shift()?;
let op = match self.peek_next()? {
Token::EqualEqual => BinaryOperator::Equal,
Token::BangEqual => BinaryOperator::NotEqual,
Token::Less => BinaryOperator::LessThan,
Token::Greater => BinaryOperator::GreaterThan,
Token::LessEqual => BinaryOperator::LessOrEqual,
Token::GreaterEqual => BinaryOperator::GreaterOrEqual,
_ => return ParseResult::Accept(left),
};
self.next()?;
ParseResult::Accept(Expression::Binary {
op,
left: Box::new(left),
right: Box::new(self.parse_comparison()?),
type_id: Some(TypeId::Bool),
})
}
fn parse_shift(&mut self) -> ParseResult<Expression, CompilerError> {
let left = self.parse_additive()?;
let op = match self.peek_next()? {
Token::LeftShift => BinaryOperator::LeftShift,
Token::RightShift => BinaryOperator::RightShift,
_ => return ParseResult::Accept(left),
};
self.next()?;
ParseResult::Accept(Expression::Binary {
op,
left: Box::new(left),
right: Box::new(self.parse_shift()?),
type_id: Some(TypeId::U32),
})
}
fn parse_additive(&mut self) -> ParseResult<Expression, CompilerError> {
let left = self.parse_multiplicative()?;
let op = match self.peek_next()? {
Token::Plus => BinaryOperator::Add,
Token::Minus => BinaryOperator::Sub,
_ => return ParseResult::Accept(left),
};
self.next()?;
ParseResult::Accept(Expression::Binary {
op,
left: Box::new(left),
right: Box::new(self.parse_additive()?),
type_id: Some(TypeId::U32),
})
}
fn parse_multiplicative(&mut self) -> ParseResult<Expression, CompilerError> {
let left = self.parse_unary()?;
let op = match self.peek_next()? {
Token::Star => BinaryOperator::Mul,
Token::Slash => BinaryOperator::Div,
_ => return ParseResult::Accept(left),
};
self.next()?;
ParseResult::Accept(Expression::Binary {
op,
left: Box::new(left),
right: Box::new(self.parse_multiplicative()?),
type_id: None,
})
}
fn parse_unary(&mut self) -> ParseResult<Expression, CompilerError> {
let op = match self.peek_next()? {
// prefix inc/dec
Token::PlusPlus => UnaryOperator::Increment,
Token::MinusMinus => UnaryOperator::Decrement,
// arithmetic
Token::Plus => UnaryOperator::Plus,
Token::Minus => UnaryOperator::Minus,
// pointer
Token::Star => UnaryOperator::Dereference,
Token::Ampersand => UnaryOperator::AddressOf,
// boolean
Token::Bang => UnaryOperator::LogicalNot,
Token::Tilde => UnaryOperator::BitwiseNot,
Token::SizeOf => UnaryOperator::SizeOf,
_ => {
let expr = self.parse_primary()?;
return self.parse_postfix(expr);
}
};
self.next()?;
let operand = Box::new(self.parse_unary()?);
ParseResult::Accept(Expression::Unary {
op,
operand,
type_id: None,
})
}
fn parse_postfix(
&mut self,
mut expr: Expression,
) -> ParseResult<Expression, CompilerError> {
loop {
match self.peek_next()? {
// Type cast: expr as Type
Token::As => {
self.next()?; // consume 'as'
let target_type = self.parse_type()?;
expr = Expression::TypeCast {
expr: Box::new(expr),
target_type,
type_id: None,
};
}
// Postfix increment/decrement
Token::PlusPlus => {
self.next()?;
expr = Expression::UnaryPostfix {
op: UnaryOperator::Increment,
operand: Box::new(expr),
type_id: None,
};
}
Token::MinusMinus => {
self.next()?;
expr = Expression::UnaryPostfix {
op: UnaryOperator::Decrement,
operand: Box::new(expr),
type_id: None,
};
}
// Array indexing: expr[index]
Token::LeftBracket => {
self.next()?; // consume '['
let index = Box::new(self.parse_expression()?);
let _ = expect_tt!(self.next()?, RightBracket)?;
expr = Expression::IndexAccess {
expr: Box::new(expr),
index,
type_id: None,
};
}
// Function call: expr(args...)
Token::LeftParen => {
self.next()?; // consume '('
let mut args = Vec::new();
if !matches!(self.peek_next()?, Token::RightParen) {
loop {
args.push(self.parse_expression()?);
if !matches!(self.peek_next()?, Token::Comma) {
break;
}
self.next()?; // consume comma
}
}
let _ = expect_tt!(self.next()?, RightParen)?;
if let Expression::Variable { name, .. } = expr {
expr = Expression::Call {
func: Call { name, args },
type_id: None,
};
}
}
// Member access: expr.member (if you support structs)
Token::Dot => {
self.next()?;
let field_name = expect_value!(self.next()?, Identifier)?;
expr = Expression::MemberAccess {
expr: Box::new(expr),
field_name,
type_id: None,
};
}
// No more postfix operations
_ => break,
}
}
ParseResult::Accept(expr)
}
fn parse_primary(&mut self) -> ParseResult<Expression, CompilerError> {
match self.peek_next()? {
Token::UnsignedInt(value, type_id) => {
self.next()?;
ParseResult::Accept(Expression::Number(Number::Unsigned(value, type_id)))
}
Token::SignedInt(value, type_id) => {
self.next()?;
ParseResult::Accept(Expression::Number(Number::Signed(value, type_id)))
}
Token::String(value) => {
self.next()?;
ParseResult::Accept(Expression::StringLiteral(value))
}
Token::Char(value) => {
self.next()?;
ParseResult::Accept(Expression::CharLiteral(value))
}
Token::Identifier(name) => {
self.next()?;
// if the next token isn't the beginning of a struct literal this is just
// an identifier.
// if !expect_tt!(self.peek_next()?, LeftBrace).accepted() {
// return ParseResult::Accept(Expression::Variable {
// name,
// expr_type: None,
// });
// }
//
ParseResult::Accept(Expression::Variable {
name,
expr_type: None,
})
// let _ = self.next()?;
// let mut fields = Vec::new();
// while !expect_tt!(self.peek_next()?, RightBrace).accepted() {
// let name = expect_value!(self.next()?, Identifier)?;
// let _ = expect_tt!(self.next()?, Colon)?;
// let expr = self.parse_expression()?;
// fields.push((name, expr));
// if expect_tt!(self.peek_next()?, Comma).accepted() {
// self.next()?;
// } else {
// break;
// }
// }
// let _ = expect_tt!(self.next()?, RightBrace)?;
// ParseResult::Accept(Expression::StructLiteral {
// name,
// fields,
// type_id: None,
// })
}
Token::LeftBracket => {
self.next()?; // consume '['
let mut elements = Vec::new();
if !matches!(self.peek_next()?, Token::RightBracket) {
loop {
elements.push(self.parse_expression()?);
if !matches!(self.peek_next()?, Token::Comma) {
break;
}
self.next()?; // consume comma
}
}
expect_tt!(self.next()?, RightBracket)?;
ParseResult::Accept(Expression::ArrayLiteral {
elements,
type_id: None,
})
}
Token::LeftParen => {
self.next()?;
let expr = self.parse_expression()?;
let _ = expect_tt!(self.next()?, RightParen)?;
ParseResult::Accept(expr)
}
_ => ParseResult::Reject(CompilerError::UnexpectedToken(
self.peek_next()?.tt().to_string(),
)),
}
}
fn parse_var_decl(&mut self) -> ParseResult<Variable, CompilerError> {
let name = expect_value!(self.next()?, Identifier)?;
let _ = expect_tt!(self.next()?, Colon)?;
let type_id = self.parse_type()?;
ParseResult::Accept(Variable {
name: name.name,
type_id,
})
}
fn parse_type(&mut self) -> ParseResult<TypeId, CompilerError> {
// parse primitive or named type
if expect_tt!(self.peek_next()?, Identifier).accepted() {
return self.parse_type_identifier();
}
// parse array type
if expect_tt!(self.peek_next()?, LeftBracket).accepted() {
let _ = self.next()?;
let internal_type = self.parse_type()?;
let _ = expect_tt!(self.next()?, Semicolon)?;
let size = expect_value!(self.next()?, UnsignedInt)?;
let _ = expect_tt!(self.next()?, RightBracket)?;
return ParseResult::Accept(TypeId::Array {
r#type: Box::new(internal_type),
size: size as usize,
});
}
// parse tuple type
if expect_tt!(self.peek_next()?, LeftParen).accepted() {
let _ = self.next()?;
let mut types = Vec::new();
while !expect_tt!(self.peek_next()?, RightParen).accepted() {
types.push(self.parse_type()?);
if !expect_tt!(self.peek_next()?, Comma).accepted() {
break;
}
let _ = self.next()?;
}
let _ = expect_tt!(self.next()?, RightParen)?;
return ParseResult::Accept(TypeId::Tuple(types));
}
ParseResult::Reject(CompilerError::Generic(format!(
"Parsing type but no valid type was detected: {:?}",
self.peek_next()?
)))
}
fn parse_type_identifier(&mut self) -> ParseResult<TypeId, CompilerError> {
// get the type name incl namespace
let name = expect_value!(self.next()?, Identifier)?;
let type_id = match name.name.as_str() {
"u32" => TypeId::U32,
"u16" => TypeId::U16,
"u8" => TypeId::U8,
"i32" => TypeId::I32,
"i16" => TypeId::I16,
"i8" => TypeId::I8,
"void" => TypeId::Void,
"char" => TypeId::Char,
"str" => TypeId::Ptr(Box::new(TypeId::Char)),
_ => {
let mut generics = Vec::new();
if expect_tt!(self.peek_next()?, Less).accepted() {
let _ = self.next()?;
// loop until we find the closing '>'
while !expect_tt!(self.peek_next()?, Greater).accepted() {
generics.push(self.parse_type()?);
if !expect_tt!(self.peek_next()?, Comma).accepted() {
break;
}
let _ = self.next()?;
}
let _ = expect_tt!(self.next()?, Greater)?;
}
TypeId::UnknownCustom { name, generics }
}
};
ParseResult::Accept(type_id)
}
fn next(&mut self) -> ParseResult<Token, CompilerError> {
if self.idx >= self.tokens.len() {
ParseResult::Reject(CompilerError::UnexpectedEndOfInput)
} else {
let token = self.tokens[self.idx].clone();
self.idx += 1;
ParseResult::Accept(token)
}
}
fn peek_next(&self) -> ParseResult<Token, CompilerError> {
if self.idx >= self.tokens.len() {
ParseResult::Reject(CompilerError::UnexpectedEndOfInput)
} else {
ParseResult::Accept(self.tokens[self.idx].clone())
}
}
fn peek(&self, offset: usize) -> ParseResult<Token, CompilerError> {
if self.idx + offset >= self.tokens.len() {
ParseResult::Reject(CompilerError::UnexpectedEndOfInput)
} else {
ParseResult::Accept(self.tokens[self.idx + offset].clone())
}
}
}
impl<T, E> ParseResult<T, E> {
pub fn accepted(&self) -> bool {
matches!(self, ParseResult::Accept(_))
}
}
pub enum ParseResultResidual<T> {
Deny,
Reject(T),
}
impl<T, E> Try for ParseResult<T, E> {
type Output = T;
type Residual = ParseResultResidual<E>;
fn from_output(output: T) -> Self {
ParseResult::Accept(output)
}
fn branch(self) -> ControlFlow<Self::Residual, Self::Output> {
match self {
ParseResult::Accept(v) => ControlFlow::Continue(v),
ParseResult::Deny => ControlFlow::Break(ParseResultResidual::Deny),
ParseResult::Reject(e) => ControlFlow::Break(ParseResultResidual::Reject(e)),
}
}
}
impl<T, E> FromResidual for ParseResult<T, E> {
fn from_residual(residual: ParseResultResidual<E>) -> Self {
match residual {
ParseResultResidual::Deny => ParseResult::Deny,
ParseResultResidual::Reject(e) => ParseResult::Reject(e),
}
}
}
#[macro_export]
macro_rules! expect_tt {
($token:expr, $($variant:ident),+) => {{
let token = $token.clone();
let tt = token.tt().to_string();
let mut vs = String::new();
$(
let s = stringify!($variant);
vs.push_str(s);
vs.push_str("|");
)+
match tt.as_str() {
$(
stringify!($variant) => ParseResult::Accept(token),
)+
_ => {
// let expected = format!("[{}]", vec![$(stringify!($variant)),+].join(" | "));
ParseResult::Reject(CompilerError::UnexpectedToken(tt))
}
}
}};
}
#[macro_export]
macro_rules! expect_value {
($expr:expr, $variant:ident) => {{
let tok = $expr;
match tok.clone() {
Token::$variant(first, ..) => ParseResult::Accept(first),
_ => {
ParseResult::Reject(CompilerError::UnexpectedToken(tok.tt().to_string()))
}
}
}};
}
@@ -0,0 +1,226 @@
use std::collections::HashMap;
use crate::model::{
BinaryOperator, // You'll need to add this to your imports
CompilerError,
Declaration,
Dependency,
Expression,
Program,
TypeId,
UnaryOperator,
};
pub struct Analyser {
symbol_table: HashMap<String, Declaration>,
}
const NUMERIC_TYPES: &[TypeId] = &[
TypeId::U32,
TypeId::I32,
TypeId::I16,
TypeId::U16,
TypeId::I8,
TypeId::U8,
];
impl Analyser {
pub fn new() -> Self {
Self {
symbol_table: HashMap::new(),
}
}
pub fn analyse(&mut self, ast: Program) -> Result<(), CompilerError> {
// build table of global symbols.
for dec in ast.declarations {
let name = match dec.clone() {
Declaration::Function { name, .. } => name,
Declaration::Variable { var, .. } => var.name,
Declaration::Dependency(Dependency { name, .. }) => name,
};
self.symbol_table.insert(name, dec);
}
Ok(())
}
fn match_type(
actual: TypeId,
expected: Option<TypeId>,
) -> Result<TypeId, CompilerError> {
match expected {
Some(id) => {
if id != actual {
Err(CompilerError::TypeMismatch(id, actual))
} else {
Ok(actual)
}
}
None => Ok(actual),
}
}
fn get_type(
&mut self, // Changed from &self to &mut self since we modify expr
expr: &mut Expression,
expected_type: Option<TypeId>,
) -> Result<TypeId, CompilerError> {
match expr {
// Correct IFF we're expecting a void type
Expression::Empty => Self::match_type(TypeId::Void, expected_type),
// Correct IFF we're expecting a char type
Expression::CharLiteral(_) => Self::match_type(TypeId::Char, expected_type),
// Correct IFF we're expecting a string slice type
Expression::StringLiteral(_) => {
Self::match_type(TypeId::Ptr(Box::new(TypeId::Char)), expected_type)
}
Expression::Variable { name, expr_type } => {
let actual = expr_type.clone().ok_or(CompilerError::UnknownType)?;
Self::match_type(actual, expected_type)
}
Expression::Number { value, type_id } => {
// If we already know the TypeId
if let Some(id) = type_id {
return Self::match_type(id.clone(), expected_type);
}
// If we're expecting a type id, check it's numeric.
// TODO: add checks to make sure it's valid for its size eg u8 cant be
// more than 255
if let Some(expected) = expected_type {
if NUMERIC_TYPES.contains(&expected) {
*type_id = Some(expected.clone());
return Ok(expected);
} else {
return Err(CompilerError::TypeMismatch(expected, TypeId::U32));
}
}
// Default to i32 if no type information is available
*type_id = Some(TypeId::I32);
Ok(TypeId::I32)
}
Expression::Binary {
op,
left,
right,
type_id,
} => {
// For binary operations, both operands should have compatible types
// and the result type depends on the operation
let left_type = self.get_type(left, None)?;
let right_type = self.get_type(right, Some(left_type.clone()))?;
// For numeric operations, result has the same type as operands
if NUMERIC_TYPES.contains(&left_type)
&& NUMERIC_TYPES.contains(&right_type)
{
*type_id = Some(left_type);
Self::match_type(left_type, expected_type)
} else {
Err(CompilerError::TypeMismatch(left_type, right_type))
}
}
Expression::Unary {
op,
operand,
type_id,
} => {
match op {
UnaryOperator::Plus | UnaryOperator::Minus => {
// Unary +/- require numeric operands
let inner_type = self.get_type(operand, None)?;
if NUMERIC_TYPES.contains(&inner_type) {
*type_id = Some(inner_type.clone());
Self::match_type(inner_type, expected_type)
} else {
Err(CompilerError::TypeMismatch(inner_type, TypeId::I32))
}
}
UnaryOperator::Dereference => {
// For dereference (*ptr), the operand must be a pointer
// and the result type is what the pointer points to
let inner_type = self.get_type(operand, None)?;
match inner_type {
TypeId::Ptr(inner) => {
let deref_type = *inner;
*type_id = Some(deref_type.clone());
Self::match_type(deref_type, expected_type)
}
_ => Err(CompilerError::Generic(format!(
"Cannot dereference non-pointer type: {:?}",
inner_type
))),
}
}
UnaryOperator::Reference => {
// For reference (&var), we need to determine what we're taking
// a reference to, then wrap it in a Ptr
// If expected_type is Ptr(T), then operand should have type T
let expected_inner = match expected_type.clone() {
Some(TypeId::Ptr(inner)) => Some(*inner),
_ => None,
};
let inner_type = self.get_type(operand, expected_inner)?;
let ref_type = TypeId::Ptr(Box::new(inner_type));
*type_id = Some(ref_type.clone());
Self::match_type(ref_type, expected_type)
}
}
}
Expression::Call {
name,
args,
type_id,
} => match self.symbol_table.get(&name.name) {
Some(Declaration::Function {
params,
return_type,
..
}) => {
// check that we've given the right number of arguments.
if args.len() != params.len() {
return Err(CompilerError::Generic(format!(
"Function {} expected {} arguments but received {}",
name.name,
params.len(),
args.len()
)));
}
for (arg, param) in args.iter_mut().zip(params.iter()) {
// check that the argument type matches the parameter type.
let provided_type = self.get_type(arg, Some(param.type_id))?;
if provided_type != param.type_id {
return Err(CompilerError::TypeMismatch(
param.type_id,
provided_type,
));
}
}
*type_id = Some(return_type.clone());
Self::match_type(return_type.clone(), expected_type)
}
_ => Err(CompilerError::Generic(format!(
"Function {} not found in symbol table",
name.name
))),
},
}
}
}
+21
View File
@@ -0,0 +1,21 @@
use common::logging::Logger;
use crate::model::{CompilerError, Program};
// mod c;
mod dsc;
pub fn compiler_frontend(
ext: &str,
data: &str,
logger: &Logger,
) -> Result<Program, CompilerError> {
match ext {
"dsc" => Ok(dsc::generate_ast(&data, &logger)?),
// "c" => Ok(c::generate_ast(&data)?),
_ => Err(CompilerError::Generic(format!(
"File type {} not supported",
ext
))),
}
}
+94
View File
@@ -0,0 +1,94 @@
#![feature(try_trait_v2)]
use std::path::{Path, PathBuf};
use common::{
build::{BuildError, Builder},
logging::LogReceiver,
};
use crate::{model::CompilerError, specialised::build_specialised};
mod backend;
mod frontend;
mod model;
mod specialised;
pub struct Compiler {
src_path: PathBuf,
result: Option<Result<String, BuildError>>,
logger: LogReceiver,
}
impl Compiler {
fn build(&mut self, is_lib: bool) -> Result<String, Box<dyn std::error::Error>> {
let input =
std::fs::read_to_string(&self.src_path).expect("Failed to read input file");
let input_ext = self
.src_path
.extension()
.and_then(|s| s.to_str())
.unwrap_or("");
// check if we're using a specialised compiler
if let Some(output) = build_specialised(input_ext, &input) {
return output.map_err(|err| format!("Compilation failed: {err:?}").into());
}
// Parse the input using the frontend, providing the file extension and data.
let ast =
match frontend::compiler_frontend(input_ext, &input, &self.logger.logger()) {
Ok(ast) => ast,
Err(err) => return Err(format!("Compilation failed: {err:?}").into()),
};
// println!("Parsed AST: {:#?}", ast);
// Generate the output using the backend with the parsed result.
let result = match backend::compiler_backend("dsa", &ast, is_lib) {
Ok(result) => result,
Err(err) => return Err(format!("Compilation failed: {err:?}").into()),
};
Ok(result)
}
}
impl Builder for Compiler {
type Output = String;
type Args = bool;
fn new(src_path: impl Into<PathBuf>) -> Self {
Self {
src_path: src_path.into(),
result: None,
logger: LogReceiver::new(true),
}
}
fn start(&mut self, args: bool) {
match self.build(args) {
Ok(x) => self.result = Some(Ok(x)),
Err(err) => self.result = Some(Err(err.into())),
}
}
fn poll(&mut self) -> Option<Result<Self::Output, BuildError>> {
self.result.take()
}
fn output(&mut self) -> Result<Self::Output, BuildError> {
self.result.clone().ok_or(BuildError::Generic(String::from(
"Compiler was never started",
)))?
}
fn logs(&self) -> Vec<String> {
todo!()
}
}
pub fn error(msg: impl Into<String>) -> CompilerError {
CompilerError::Generic(msg.into())
}
+35
View File
@@ -0,0 +1,35 @@
use std::path::{Path, PathBuf};
use common::{build::Builder, logging::info};
use compiler::Compiler;
fn main() {
// read from input file: syntax "c_compiler <src.c> [output.dsa]"
let args: Vec<String> = std::env::args().collect();
if args.len() < 2 {
eprintln!("Usage: c_compiler [--lib | --bin] <src.dsc> [output.dsa]");
return;
}
let input_file = &args[1];
let output_file = if args.len() > 2 {
&args[2]
} else {
"output.dsa"
};
{
let is_lib = args.contains(&"--lib".to_string());
let mut builder = Compiler::new(PathBuf::from(input_file));
builder.start(is_lib);
let result = builder.output().unwrap();
std::fs::write(output_file, &result).expect("Failed to write output");
info(&format!(
"Compilation Successful ✅ \n\tSource: {}\n\tOutput: {}\n",
input_file, output_file,
));
}
}
+503
View File
@@ -0,0 +1,503 @@
use core::fmt;
use common::build::BuildError;
#[allow(unused)]
#[derive(Debug, Clone)]
pub enum CompilerError {
UnexpectedToken(String),
UnexpectedEndOfInput,
UnexpectedCharacter(char),
Undefined(Name),
InvalidSyntax(String),
Generic(String),
UnknownType,
TypeMismatch(TypeId, TypeId),
Unimplemented(String),
}
impl From<CompilerError> for BuildError {
fn from(err: CompilerError) -> Self {
BuildError::Generic(format!("{:?}", err))
}
}
#[derive(Debug, PartialEq, Eq, Clone)]
pub struct Name {
pub name: String,
pub namespace: Option<String>,
}
impl Name {
pub fn new(name: impl Into<String>, namespace: Option<String>) -> Self {
Self {
name: name.into(),
namespace,
}
}
}
#[derive(Debug, Clone)]
pub struct Program {
pub declarations: Vec<Declaration>,
}
#[allow(unused)]
#[derive(Debug, Clone)]
pub enum Declaration {
Function {
name: String,
return_type: TypeId,
params: Vec<Variable>,
body: Block,
},
Variable {
var: Variable,
init: Option<ConstExpr>,
is_const: bool,
},
Dependency(Dependency),
Struct {
name: Name,
fields: Vec<Variable>,
},
}
#[derive(Debug, Clone)]
pub struct Dependency {
pub name: String,
pub path: String,
}
#[allow(unused)]
#[derive(Debug, Clone, PartialEq)]
pub enum TypeId {
U8,
U16,
U32,
I8,
I16,
I32,
Bool,
Char,
Void,
Ptr(Box<TypeId>),
Ref(Box<TypeId>),
Tuple(Vec<TypeId>),
Array {
r#type: Box<TypeId>,
size: usize,
},
UnknownCustom {
name: Name,
generics: Vec<TypeId>,
},
Struct {
name: Name,
fields: Vec<TypeId>,
generics: Vec<TypeId>,
},
}
impl TypeId {
pub fn size(&self) -> usize {
match self {
Self::U8 => 1,
Self::U16 => 2,
Self::U32 => 4,
Self::I8 => 1,
Self::I16 => 2,
Self::I32 => 4,
Self::Bool => 1,
Self::Char => 1,
Self::Void => 0,
Self::Ptr(t) => t.size(),
Self::Ref(t) => t.size(),
Self::Tuple(types) => types.iter().map(|t| t.size()).sum(),
Self::Array { r#type, size } => r#type.size() * size,
Self::UnknownCustom { .. } => 1, /* TODO: calculate type size during */
// semantic analysis
Self::Struct { fields, .. } => fields.iter().map(|t| t.size()).sum(),
}
}
}
impl fmt::Display for TypeId {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Self::U8 => write!(f, "u8"),
Self::U16 => write!(f, "u16"),
Self::U32 => write!(f, "u32"),
Self::I8 => write!(f, "i8"),
Self::I16 => write!(f, "i16"),
Self::I32 => write!(f, "i32"),
Self::Bool => write!(f, "bool"),
Self::Char => write!(f, "char"),
Self::Void => write!(f, "void"),
Self::Ptr(t) => write!(f, "*{}", t),
Self::Ref(t) => write!(f, "&{}", t),
Self::Tuple(elems) => write!(
f,
"({})",
elems
.iter()
.map(|t| t.to_string())
.collect::<Vec<String>>()
.join(", ")
),
Self::Array { r#type, size } => write!(f, "[{}; {}]", r#type, size),
Self::UnknownCustom { name, generics } => {
write!(
f,
"{}<{}>",
name,
generics
.iter()
.map(|t| t.to_string())
.collect::<Vec<String>>()
.join(", ")
)
}
Self::Struct {
name,
fields,
generics,
} => write!(
f,
"struct<{}> {} {{{}}}",
generics
.iter()
.map(|t| t.to_string())
.collect::<Vec<String>>()
.join(", "),
name,
fields
.iter()
.map(|t| t.to_string())
.collect::<Vec<String>>()
.join(", ")
),
}
}
}
pub type Block = Vec<Statement>;
#[allow(unused)]
#[derive(Debug, Clone, PartialEq)]
pub struct Variable {
pub name: String,
pub type_id: TypeId,
}
#[allow(unused)]
#[derive(Debug, Clone)]
pub enum Statement {
Block(Block),
Declaration {
var: Variable,
value: Option<Expression>,
},
Assign {
varname: String,
operator: AssignmentOperator,
value: Expression,
},
PtrWrite {
ptr: Expression,
value: Expression,
},
Expression {
expr: Expression,
},
If {
condition: Expression,
then_stmt: Block,
else_stmt: Block,
},
While {
condition: Expression,
body: Vec<Statement>,
},
Loop(Block),
Defer(Call),
Break,
Continue,
Return(Option<Expression>),
}
#[derive(Debug, Clone)]
pub enum ConstExpr {
Number(i32),
String(String),
}
impl fmt::Display for ConstExpr {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self {
ConstExpr::Number(n) => write!(f, "{}", n),
ConstExpr::String(s) => write!(f, "\"{}\"", s),
}
}
}
#[allow(unused)]
#[derive(Debug, Clone)]
pub enum Expression {
Empty,
Binary {
op: BinaryOperator,
left: Box<Expression>,
right: Box<Expression>,
// Post-Semantic Analysis
type_id: Option<TypeId>,
},
Unary {
op: UnaryOperator,
operand: Box<Expression>,
// Post-Semantic Analysis
type_id: Option<TypeId>,
},
UnaryPostfix {
op: UnaryOperator,
operand: Box<Expression>,
// Post-Semantic Analysis
type_id: Option<TypeId>,
},
Variable {
name: Name,
expr_type: Option<TypeId>,
},
TypeCast {
expr: Box<Expression>,
target_type: TypeId,
// Post-Semantic Analysis
type_id: Option<TypeId>,
},
IndexAccess {
expr: Box<Expression>,
index: Box<Expression>,
// Post-Semantic Analysis
type_id: Option<TypeId>,
},
MemberAccess {
expr: Box<Expression>,
field_name: Name,
// Post-Semantic Analysis
type_id: Option<TypeId>,
},
Call {
func: Call,
// Post-Semantic Analysis
type_id: Option<TypeId>,
},
Number(Number),
StringLiteral(String),
CharLiteral(char),
ArrayLiteral {
elements: Vec<Expression>,
type_id: Option<TypeId>,
},
StructLiteral {
name: Name,
fields: Vec<(Name, Expression)>,
type_id: Option<TypeId>,
},
}
#[derive(Debug, Clone)]
pub enum Number {
Signed(i32, Option<TypeId>),
Unsigned(u32, Option<TypeId>),
}
#[derive(Debug, Clone)]
pub struct Call {
pub name: Name,
pub args: Vec<Expression>,
}
impl Expression {
pub fn is_pure(&self) -> bool {
match self {
Expression::Number { .. } => true,
Expression::StringLiteral(_) => true,
Expression::CharLiteral(_) => true,
Expression::Call { .. } => false,
Expression::Binary { left, right, .. } => left.is_pure() && right.is_pure(),
Expression::Unary { operand, .. } => operand.is_pure(),
Expression::UnaryPostfix { operand, .. } => operand.is_pure(),
Expression::Empty => true,
Expression::Variable { .. } => true,
Expression::TypeCast { expr, .. } => expr.is_pure(),
Expression::IndexAccess { expr, index, .. } => {
expr.is_pure() && index.is_pure()
}
Expression::MemberAccess { expr, .. } => expr.is_pure(),
Expression::ArrayLiteral { elements, .. } => {
elements.iter().all(|element| element.is_pure())
}
Expression::StructLiteral { fields, .. } => {
fields.iter().all(|(_, expr)| expr.is_pure())
}
}
}
pub fn type_id(&self) -> Result<TypeId, CompilerError> {
match self {
Expression::Number(
Number::Signed(_, type_id) | Number::Unsigned(_, type_id),
) => type_id.clone().ok_or(CompilerError::UnknownType),
Expression::StringLiteral(_) => Ok(TypeId::Ptr(Box::new(TypeId::Char))),
Expression::CharLiteral(_) => Ok(TypeId::Char),
Expression::Call { type_id, .. } => {
type_id.clone().ok_or(CompilerError::UnknownType)
}
Expression::Binary { type_id, .. } => {
type_id.clone().ok_or(CompilerError::UnknownType)
}
Expression::Unary { type_id, .. } => {
type_id.clone().ok_or(CompilerError::UnknownType)
}
Expression::UnaryPostfix { type_id, .. } => {
type_id.clone().ok_or(CompilerError::UnknownType)
}
Expression::Empty => Ok(TypeId::Void),
Expression::Variable { expr_type, .. } => {
expr_type.clone().ok_or(CompilerError::UnknownType)
}
Expression::TypeCast { type_id, .. } => {
type_id.clone().ok_or(CompilerError::UnknownType)
}
Expression::IndexAccess { expr, .. } => expr.type_id(),
Expression::MemberAccess { expr, .. } => expr.type_id(),
Expression::ArrayLiteral { elements, .. } => {
let element_type = elements
.first()
.map_or(TypeId::Void, |e| e.type_id().unwrap_or(TypeId::Void));
Ok(TypeId::Array {
r#type: Box::new(element_type),
size: elements.len(),
})
}
Expression::StructLiteral { name, fields, .. } => {
let fields = fields
.iter()
.map(|(_, expr)| expr.type_id())
.collect::<Result<Vec<_>, _>>()?;
Ok(TypeId::Struct {
name: name.clone(),
fields,
generics: Vec::new(),
})
}
}
}
}
#[derive(Debug, Clone, PartialEq)]
pub enum AssignmentOperator {
Assign,
AddAssign,
SubAssign,
MulAssign,
DivAssign,
ModAssign,
AndAssign,
OrAssign,
XorAssign,
LeftShiftAssign,
RightShiftAssign,
}
#[allow(unused)]
#[derive(Debug, Clone, PartialEq)]
pub enum BinaryOperator {
// arithmetic
Add,
Sub,
Mul,
Div,
Mod,
// comparison
Equal,
NotEqual,
LessThan,
GreaterThan,
LessOrEqual,
GreaterOrEqual,
// bitwise
BitwiseAnd,
BitwiseOr,
BitwiseXor,
// logical
LogicalAnd,
LogicalOr,
// shift
LeftShift,
RightShift,
}
impl fmt::Display for BinaryOperator {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self {
Self::Add => write!(f, "+"),
Self::Sub => write!(f, "-"),
Self::Mul => write!(f, "*"),
Self::Div => write!(f, "/"),
Self::Mod => write!(f, "%"),
Self::Equal => write!(f, "=="),
Self::NotEqual => write!(f, "!="),
Self::LessThan => write!(f, "<"),
Self::GreaterThan => write!(f, ">"),
Self::LessOrEqual => write!(f, "<="),
Self::GreaterOrEqual => write!(f, ">="),
Self::BitwiseAnd => write!(f, "&"),
Self::BitwiseOr => write!(f, "|"),
Self::BitwiseXor => write!(f, "^"),
Self::LogicalAnd => write!(f, "&&"),
Self::LogicalOr => write!(f, "||"),
Self::LeftShift => write!(f, "<<"),
Self::RightShift => write!(f, ">>"),
}
}
}
#[derive(Debug, Clone, PartialEq)]
pub enum UnaryOperator {
Plus,
Minus,
AddressOf,
Dereference,
BitwiseNot,
LogicalNot,
Increment,
Decrement,
SizeOf,
}
impl fmt::Display for UnaryOperator {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self {
Self::Increment => write!(f, "++"),
Self::Decrement => write!(f, "--"),
Self::Plus => write!(f, "+"),
Self::Minus => write!(f, "-"),
Self::Dereference => write!(f, "*"),
Self::AddressOf => write!(f, "&"),
Self::BitwiseNot => write!(f, "~"),
Self::LogicalNot => write!(f, "!"),
Self::SizeOf => write!(f, "sizeof"),
}
}
}
+55
View File
@@ -0,0 +1,55 @@
use std::{
fmt,
path::{Path, PathBuf},
};
#[derive(Debug, Clone)]
pub enum BuildError {
IoError(String),
Generic(String),
}
impl From<std::io::Error> for BuildError {
fn from(err: std::io::Error) -> Self {
Self::IoError(err.to_string())
}
}
impl From<Box<dyn std::error::Error>> for BuildError {
fn from(err: Box<dyn std::error::Error>) -> Self {
Self::Generic(err.to_string())
}
}
impl fmt::Display for BuildError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
Self::IoError(err) => write!(f, "IO Error: {err}"),
Self::Generic(err) => write!(f, "Generic Error: {err}"),
}
}
}
pub trait Builder {
type Output: Clone + std::convert::AsRef<[u8]>;
type Args: Clone;
fn new(src_path: impl Into<PathBuf>) -> Self;
// starts compilation
fn start(&mut self, args: Self::Args);
// non-blocking function, returns output if completed
fn poll(&mut self) -> Option<Result<Self::Output, BuildError>>;
// blocking function, returns output when completed.
fn output(&mut self) -> Result<Self::Output, BuildError>;
fn write_result(&mut self, path: impl AsRef<Path>) -> Result<(), BuildError> {
let output = self.output()?;
std::fs::write(path.as_ref(), output)
.map_err(|e| BuildError::IoError(e.to_string()))
}
fn logs(&self) -> Vec<String>;
}
@@ -40,7 +40,7 @@ pub enum InstructionType {
Immediate,
}
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
#[derive(Copy, Clone, Debug, PartialEq, Eq, Default)]
#[non_exhaustive]
pub enum Register {
// general purpose registers
@@ -69,7 +69,9 @@ pub enum Register {
Idr,
Mmr,
Zero,
NoReg,
#[default]
Null, // Invalid - Triggers a fault if accessed
// system registers - can't be written to by instructions.
Mar,
@@ -104,12 +106,6 @@ impl Register {
}
}
impl Default for Register {
fn default() -> Self {
Self::NoReg
}
}
impl TryFrom<u8> for Register {
type Error = RegisterParseError;
@@ -144,7 +140,7 @@ impl TryFrom<u8> for Register {
0x14 => Self::Idr,
0x15 => Self::Mmr,
0x16 => Self::Zero,
0x17 => Self::NoReg,
0x17 => Self::Null,
0x18 => Self::Mar,
0x19 => Self::Mdr,
0x1A => Self::Sts,
@@ -183,7 +179,7 @@ impl TryFrom<&str> for Register {
"idr" => Ok(Self::Idr),
"mmr" => Ok(Self::Mmr),
"zero" => Ok(Self::Zero),
"null" => Ok(Self::NoReg),
"null" => Ok(Self::Null),
"pcx" => Ok(Self::Pcx),
_ => Err(RegisterParseError::InvalidName(value.to_string())),
}
@@ -216,7 +212,7 @@ impl std::fmt::Display for Register {
Self::Idr => write!(f, "idr"),
Self::Mmr => write!(f, "mmr"),
Self::Zero => write!(f, "zero"),
Self::NoReg => write!(f, "noreg"),
Self::Null => write!(f, "null"),
Self::Mar => write!(f, "mar"),
Self::Mdr => write!(f, "mdr"),
Self::Sts => write!(f, "sts"),
@@ -8,9 +8,9 @@ pub trait Encode {
/// Encodes a zero argument instruction.
fn encode_no_args(opcode: u8) -> u32 {
let opcode = u32::from(opcode);
let sr1 = Register::NoReg as u32;
let sr2 = Register::NoReg as u32;
let dr = Register::NoReg as u32;
let sr1 = Register::Null as u32;
let sr2 = Register::Null as u32;
let dr = Register::Null as u32;
let shamt = 0;
(opcode << 26) | (sr1 << 21) | (sr2 << 16) | (dr << 11) | (shamt << 6)
@@ -2,7 +2,7 @@ use crate::prelude::*;
#[test]
fn test_encode_nop() {
let no_reg = Register::NoReg as u32;
let no_reg = Register::Null as u32;
let no_op = u32::from(Instruction::Nop.opcode());
let expected = (no_op << 26) | (no_reg << 21) | (no_reg << 16) | (no_reg << 11);
@@ -15,7 +15,7 @@ fn test_encode_nop() {
fn test_encode_mov() {
let rg0 = Register::Rg0 as u32;
let rg1 = Register::Rg1 as u32;
let no_reg = Register::NoReg as u32;
let no_reg = Register::Null as u32;
let instruction = Instruction::Mov(RTypeArgs::new(
Some(Register::Rg0),
@@ -53,7 +53,7 @@ fn test_encode_load_byte() {
#[test]
fn test_encode_shift_left_shamt() {
let rg0 = Register::Rg0 as u32;
let no_reg = Register::NoReg as u32;
let no_reg = Register::Null as u32;
let shift_amount = 5;
@@ -80,7 +80,7 @@ fn test_encode_shift_left_shamt() {
fn test_encode_shift_left_reg() {
let rg0 = Register::Rg0 as u32;
let rg1 = Register::Rg1 as u32;
let no_reg = Register::NoReg as u32;
let no_reg = Register::Null as u32;
let instruction = Instruction::ShiftLeft(RTypeArgs::new(
Some(Register::Rg0),
@@ -12,6 +12,7 @@
clippy::match_wildcard_for_single_variants
)]
pub mod build;
pub mod instructions;
pub mod logging;
+65
View File
@@ -0,0 +1,65 @@
use std::sync::{Arc, mpsc};
pub fn info(message: &str) {
println!("\x1b[32mINFO:\x1b[0m {message}");
}
#[derive(Debug)]
pub struct LogReceiver {
logs_rx: mpsc::Receiver<String>,
sender: Logger,
use_stdio: bool,
}
#[derive(Debug, Clone)]
pub struct Logger {
use_stdio: bool,
logs_tx: Arc<mpsc::Sender<String>>,
}
impl Logger {
#[must_use]
pub fn new(logs_tx: mpsc::Sender<String>, use_stdio: bool) -> Self {
Self {
use_stdio,
logs_tx: Arc::new(logs_tx),
}
}
pub fn info(&self, message: &str) {
let res = format!("\x1b[32mINFO:\x1b[0m {message}");
if self.use_stdio {
println!("{res}");
}
self.logs_tx.send(res).expect("Failed to send log message");
}
}
impl LogReceiver {
#[must_use]
pub fn new(use_stdio: bool) -> Self {
let (logs_tx, logs_rx) = mpsc::channel();
Self {
use_stdio,
logs_rx,
sender: Logger::new(logs_tx, use_stdio),
}
}
#[must_use]
pub fn logs(&self) -> Vec<String> {
self.logs_rx.try_iter().collect()
}
#[must_use]
pub fn logger(&self) -> Logger {
self.sender.clone()
}
}
impl Default for LogReceiver {
fn default() -> Self {
Self::new(true)
}
}
+4
View File
@@ -0,0 +1,4 @@
- we definitely need to be able to use registers for shift operations.
- we need logical boolean operations in addition to the bitwise ones.
- better conditionals.
File diff suppressed because it is too large Load Diff
+26
View File
@@ -0,0 +1,26 @@
# General TODO's
# Bugfixes
- [x] [EASY] Investigate logical and operator not compiling - either a lexer or parser issue.
- **note**: this was a parser issue.
# Missing features
- [x] [MEDIUM] Get shift operations working correctly.
- [ ] [MEDIUM] proper prefix/postfix inc/dec implementation. slightly more complex as we need to check for a variable and modify it in place
- [ ] [EASY] Add multiply and divide operations to code generation
- **note**: very easy to do but our division algorithm is hopelessly slow so not worth doing for now.
# Performance Improvements
- [ ] [MEDIUM] implement a proper div/mod library that's not slow af.
- [ ] [HARD] Immediate operations for values that support it (up to +/- u16::max for addi and subi respectively)
- this requires significant complexity in code generation as we need to traverse down the tree when we come across these operations to prevent additional register allocations.
# Compiler optimisations
# Codegen improvements
- [ ] [MEDIUM / time consuming] Add scoping to code generation
- [ ] [MEDIUM / time consuming] Rewrite entire codegen to imrpove code quality and make the code more readable.
- [ ] type-safe instruction builder
- [ ] Instruction & Register enums
- [ ] Instruction builder helper fns eg `fn add(left: &Register, right: &Register, dest: &Register) -> Instruction`
- [ ] Instruction Block types.
-10
View File
@@ -1,10 +0,0 @@
[package]
name = "dsx-build"
version.workspace = true
edition.workspace = true
authors.workspace = true
[dependencies]
compiler = { path = "../compiler" }
assembler = { path = "../assembler" }
chrono = "0.4.43"
-200
View File
@@ -1,200 +0,0 @@
use std::process::{Command, Stdio};
use std::{
env, fs,
path::{Path, PathBuf},
};
use crate::templates::{Dsa, Dsc, Template};
mod templates;
/// Run a command and exit on failure.
fn run(cmd: &mut Command) {
let status = cmd.status().expect("failed to execute command");
if !status.success() {
std::process::exit(1);
}
}
fn main() {
// Very small CLI only three subcommands.
let args: Vec<String> = env::args().collect();
if args.len() < 2 {
eprintln!("Usage: dsx-build <new|build|package> [options]");
std::process::exit(1);
}
match args[1].as_str() {
"new" => cmd_new(&args[2..]),
"build" => cmd_build(),
"package" => todo!("Package manager stub not implemented yet."),
_ => {
eprintln!("Unknown command: {}", args[1]);
std::process::exit(1);
}
}
}
// ---------- new project ----------------------------------------------------
fn cmd_new(args: &[String]) {
let mut lang = "dsa";
for i in 0..args.len() {
if args[i] == "--lang" && i + 1 < args.len() {
lang = &args[i + 1];
}
}
let lib = args.contains(&"--lib".to_string());
// Determine project root: a subdirectory named after the supplied --name argument.
let mut name_opt = None;
for i in 0..args.len() {
if args[i] == "--name" && i + 1 < args.len() {
name_opt = Some(&args[i + 1]);
break;
}
}
let project_name = match name_opt {
Some(name) => name.to_string(),
None => {
eprintln!("Error: --name argument required");
std::process::exit(1);
}
};
let cwd = env::current_dir().unwrap();
let src_path = cwd.join(&project_name).join("src");
fs::create_dir_all(&src_path).expect("Failed to create project directory");
match lang {
"dsa" => {
// Minimal DSA binary template.
let path = src_path.join(format!("main.dsa"));
let template = Dsa::create(&project_name, lib);
fs::write(path, template).expect("Unable to write DSA file");
}
"dsc" => {
let path = src_path.join(format!("main.dsc"));
let template = Dsc::create(&project_name, lib);
fs::write(path, template).expect("Unable to write DSC file");
}
_ => {
eprintln!("Unsupported language: {}", lang);
std::process::exit(1);
}
}
fs::create_dir_all(src_path.join("lib")).expect("Failed to create lib directory");
fs::write(
src_path.join("lib/print.dsa"),
templates::create_print_lib(),
)
.expect("Failed to create print.dsa");
fs::write(
src_path.join("lib/maths.dsa"),
templates::create_maths_lib(),
)
.expect("Failed to create maths.dsa");
println!(
"Created new {} project in {}.",
lang,
src_path.parent().unwrap().display()
);
}
// ---------- build ----------------------------------------------------------
fn cmd_build() {
let cwd = env::current_dir().unwrap();
// Detect .dsc or .dsa files in current directory.
let mut has_dsc = false;
let mut has_dsa = false;
for entry in fs::read_dir(&cwd.join("src")).expect("unable to read dir") {
if let Ok(entry) = entry {
let path = entry.path();
if path.extension().and_then(|s| s.to_str()) == Some("dsc") {
has_dsc = true;
} else if path.extension().and_then(|s| s.to_str()) == Some("dsa") {
has_dsa = true;
}
}
}
if !has_dsc && !has_dsa {
eprintln!("No .dsc or .dsa source found in src directory.");
std::process::exit(1);
}
// Assemble main.dsa to a dsb binary.
println!("Assembling Project to a DSB binary...");
let build_dir = cwd.join("build");
fs::create_dir_all(&build_dir).expect("Failed to create build directory");
// Copy everything from `cwd/src` to the build directory.
fn copy_recursively(src: &Path, dst: &Path) {
if src.is_file() {
fs::create_dir_all(dst.parent().unwrap())
.expect("Failed to create parent directory");
fs::copy(src, dst).expect("Failed to copy file");
} else if src.is_dir() {
for entry in fs::read_dir(src).expect("Unable to read source dir") {
let entry = entry.expect("Failed to read entry");
let child_src = entry.path();
let child_dst = dst.join(entry.file_name());
copy_recursively(&child_src, &child_dst);
}
}
}
let src_dir = cwd.join("src");
if src_dir.exists() {
copy_recursively(&src_dir, &build_dir);
}
// Change current working directory to the build directory.
env::set_current_dir(&build_dir).expect("Failed to change to build directory");
if has_dsc {
println!("Compiling DSC to DSA...");
fn compile_recursive(path: &Path) {
if path.is_dir() {
for entry in fs::read_dir(path).expect("unable to read dir") {
let entry = entry.expect("failed to read entry");
compile_recursive(&entry.path());
}
} else if path.extension().and_then(|s| s.to_str()) == Some("dsc") {
let input_path = path;
let output_path = path.with_extension("dsa");
compiler::compile_file(&input_path, &output_path).unwrap_or_else(|e| {
eprintln!("Failed to compile {:?}: {}", input_path, e);
std::process::exit(1);
});
}
}
compile_recursive(&build_dir);
}
// Replace .dsc with .dsa only in include statements, recursively for each file.
let mut sed_cmd = Command::new("bash");
sed_cmd.args(&[
"-c",
&format!(
"find \"{}\" -type f -name '*.dsa' -exec sed -i '/^include/ s/\\.dsc/.dsa/g' {{}} +",
build_dir.display()
),
]);
run(&mut sed_cmd);
fs::create_dir_all(&cwd.join("artifacts")).expect("Failed to create build directory");
assembler::assemble_file("./main.dsa", "../artifacts/out.dsb").unwrap_or_else(|e| {
eprintln!("Failed to assemble {:?}: {}", "./main.dsa", e);
std::process::exit(1);
});
println!("Build finished. Binary at {}/main.dsb", build_dir.display());
}
+20
View File
@@ -0,0 +1,20 @@
[package]
name = "dsx"
version.workspace = true
edition.workspace = true
authors.workspace = true
[[bin]]
name = "dsx"
path = "src/main.rs"
[dependencies]
compiler = { path = "../../core/compiler" }
assembler = { path = "../../core/assembler" }
common = { path = "../../core/dsa_common" }
dsx_common = { path = "../dsx_common" }
toml = "1.0.3"
chrono = "0.4.44"
reqwest = { version = "0.13.2", default-features = false, features = ["blocking", "native-tls"] }
tar = "0.4.44"
flate2 = "1.1.9"
+120
View File
@@ -0,0 +1,120 @@
use std::{env, fs, path::PathBuf, process::Command};
use dsx_common::builder::{self, BuildContext};
use reqwest::Url;
pub mod new;
pub mod repo;
pub mod templates;
fn main() {
// Very small CLI only three subcommands.
let args: Vec<String> = env::args().collect();
if args.len() < 2 {
eprintln!("Usage: dsx-build <new|build|package> [options]");
std::process::exit(1);
}
match args[1].as_str() {
"new" => new::new_project(&args[2..]),
"build" => {
if let Some(dir) = find_project_root() {
let ctx = BuildContext {
project_dir: dir.clone(),
build_dir: dir.join("build"),
artifact_dir: dir.join("artifacts"),
};
builder::build_project(ctx).expect("Build failed!");
} else {
eprintln!("No Dsx.toml found");
std::process::exit(1);
}
}
"clean" => {
if let Some(dir) = find_project_root() {
fs::remove_dir_all(dir.join("build"))
.expect("failed to remove build dir");
println!("Build directory cleaned...");
fs::remove_dir_all(dir.join("artifacts"))
.expect("failed to remove artifacts dir");
println!("Artifacts directory cleaned...");
} else {
eprintln!("No Dsx.toml found");
std::process::exit(1);
}
}
"run" => {
if let Some(dir) = find_project_root() {
let ctx = BuildContext {
project_dir: dir.clone(),
build_dir: dir.join("build"),
artifact_dir: dir.join("artifacts"),
};
builder::build_project(ctx).expect("Run failed!");
// start process and call emulator
let mut child = Command::new("dsa")
.arg("--cli")
.arg("--bin")
.arg(dir.join("./artifacts/out.dsb"))
.spawn()
.expect("Failed to start emulator");
// wait for emulator to finish
child.wait().expect("Failed to wait for emulator");
} else {
eprintln!("No Dsx.toml found");
std::process::exit(1);
}
}
"clone" => {
let url: Url = Url::parse(&args[2]).unwrap();
let name = url.path_segments().unwrap().next_back().unwrap();
let repo = env::current_dir().unwrap().join(name);
fs::create_dir_all(&repo).unwrap();
repo::clone(&repo, &url).expect("Failed to clone repository");
}
"push" => {
if let Some(dir) = find_project_root() {
repo::push(&dir).expect("Failed to push repository");
} else {
eprintln!("No Dsx.toml found");
std::process::exit(1);
}
}
"pull" => {
if let Some(dir) = find_project_root() {
repo::pull(&dir).expect("Failed to pull repository");
} else {
eprintln!("No Dsx.toml found");
std::process::exit(1);
}
}
"package" => todo!("Package manager stub not implemented yet."),
_ => {
eprintln!("Unknown command: {}", args[1]);
std::process::exit(1);
}
}
}
fn find_project_root() -> Option<PathBuf> {
// check if current dir has Dsx.toml otherwise check parent dir recursively
let mut cwd = env::current_dir().unwrap();
loop {
let dsx_toml = cwd.join("Dsx.toml");
if dsx_toml.exists() {
return Some(cwd);
}
if let Some(parent) = cwd.parent() {
cwd = parent.to_path_buf();
} else {
break;
}
}
None
}
+107
View File
@@ -0,0 +1,107 @@
use std::{env, fmt, fs};
use dsx_common::config::DsxConfig;
use crate::templates::{self, Dsa, Dsc, Template};
// ---------- new project ----------------------------------------------------
pub fn new_project(args: &[String]) {
// get project details from args.
let lang = Language::from_args(args);
let lib = args.contains(&"--lib".to_string());
let name = project_name(args).unwrap_or_else(|| {
eprintln!("Error: --name argument required");
std::process::exit(1);
});
let project_path = env::current_dir().unwrap().join(name);
let src_path = project_path.join("src");
fs::create_dir_all(&src_path).expect("Failed to create project directory");
let config_template = DsxConfig::new(name);
fs::write(
project_path.join("Dsx.toml"),
toml::to_string(&config_template).unwrap(),
)
.expect("Unable to write default config");
let (path, template) = match lang {
Language::Unknown | Language::Dsa => {
(src_path.join("main.dsa"), Dsa::create(name, lib))
}
Language::Dsc => (src_path.join("main.dsc"), Dsc::create(name, lib)),
};
fs::write(path, template).expect("Unable to write DSA file");
fs::create_dir_all(src_path.join("lib")).expect("Failed to create lib directory");
fs::write(
src_path.join("./lib/print.dsa"),
include_str!("../templates/dsa/print.dsa"),
)
.expect("Failed to create print.dsa");
fs::write(
src_path.join("./lib/maths.dsa"),
include_str!("../templates/dsa/maths.dsa"),
)
.expect("Failed to create maths.dsa");
fs::write(
src_path.join("./lib/serial.dsa"),
include_str!("../templates/dsa/serial.dsa"),
)
.expect("Failed to create maths.dsa");
println!(
"Created new {} project in {}.",
lang,
src_path.parent().unwrap().display()
);
}
// helpers
enum Language {
Unknown,
Dsa,
Dsc,
}
impl Language {
fn from_args(args: &[String]) -> Self {
let mut lang = Language::Unknown;
for i in 0..args.len() {
if args[i] == "--lang" && i + 1 < args.len() {
match args[i + 1].as_str() {
"dsa" => lang = Language::Dsa,
"dsc" => lang = Language::Dsc,
_ => {
eprintln!("Error: Invalid language argument");
std::process::exit(1);
}
}
}
}
lang
}
}
impl fmt::Display for Language {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
Self::Unknown => write!(f, "Unknown"),
Self::Dsa => write!(f, "Dsa"),
Self::Dsc => write!(f, "Dsc"),
}
}
}
pub fn project_name(args: &[String]) -> Option<&str> {
for i in 0..args.len() {
if args[i] == "--name" && i + 1 < args.len() {
return Some(&args[i + 1]);
}
}
None
}
+145
View File
@@ -0,0 +1,145 @@
use dsx_common::config::DsxConfig;
use flate2::read::GzDecoder;
use flate2::{Compression, write::GzEncoder};
use reqwest::Url;
use std::fs::{self, File};
use std::path::Path;
use tar::Builder;
pub fn push(repo_dir: &Path) -> Result<(), DsxError> {
let config_file = fs::read_to_string(repo_dir.join("Dsx.toml"))?;
let config: DsxConfig = toml::from_str(&config_file).expect("Failed to parse config");
let mut repo_url =
Url::parse(&config.remote_url.expect(
"Repository URL is not set in Dsx.toml, set it with the key 'remote'",
))
.unwrap();
repo_url.path_segments_mut().unwrap().push("push");
let client = reqwest::blocking::Client::new();
let response = client
.post(repo_url)
.body(pack_tarball(repo_dir)?)
.header("Content-Type", "application/octet-stream")
.send()
.map_err(|e| {
DsxError::with_context(
format!("failed to stream to client: {}", e),
ErrorType::IoError,
)
})?;
if response.status().is_success() {
Ok(())
} else {
Err(DsxError::with_context(
response.text().unwrap(),
ErrorType::IoError,
))
}
}
pub fn pull(repo_dir: &Path) -> Result<(), DsxError> {
let config_file = fs::read_to_string(repo_dir.join("Dsx.toml"))?;
let config: DsxConfig = toml::from_str(&config_file).expect("Failed to parse config");
let repo_url =
Url::parse(&config.remote_url.expect(
"Repository URL is not set in Dsx.toml, set it with the key 'remote'",
))
.unwrap();
clone(repo_dir, &repo_url)
}
pub fn clone(repo_dir: &Path, url: &Url) -> Result<(), DsxError> {
let mut url = url.clone();
url.path_segments_mut().unwrap().push("pull");
let client = reqwest::blocking::Client::new();
let response = client.get(url).send().map_err(|e| {
DsxError::with_context(
format!("failed to stream from client: {e}"),
ErrorType::IoError,
)
})?;
if !response.status().is_success() {
return Err(DsxError::with_context(
response.text().unwrap(),
ErrorType::IoError,
));
}
let tmp_file = std::env::temp_dir().join("dsx-pull.tar.gz");
std::fs::write(&tmp_file, response.bytes().unwrap())
.map_err(|e| DsxError::with_context(e.to_string(), ErrorType::IoError))?;
unpack_tarball(&tmp_file, repo_dir)?;
fs::remove_file(tmp_file)?;
Ok(())
}
fn unpack_tarball(
archive: &std::path::Path,
dest: &std::path::Path,
) -> Result<(), DsxError> {
let file = File::open(archive)
.map_err(|e| DsxError::with_context(e.to_string(), ErrorType::IoError))?;
fs::create_dir_all(dest)?;
let gz = GzDecoder::new(file);
let mut tar = tar::Archive::new(gz);
tar.unpack(dest)
.map_err(|e| DsxError::with_context(e.to_string(), ErrorType::TarError))?;
Ok(())
}
fn pack_tarball(src_dir: &std::path::Path) -> Result<Vec<u8>, DsxError> {
let buf = Vec::new();
let gz = GzEncoder::new(buf, Compression::default());
let mut tar = Builder::new(gz);
tar.append_dir_all(".", src_dir)?;
let gz = tar.into_inner()?;
Ok(gz.finish()?)
}
#[derive(Debug)]
pub struct DsxError {
pub message: String,
pub r#type: ErrorType,
}
impl DsxError {
pub fn new(message: impl AsRef<str>) -> Self {
DsxError {
message: message.as_ref().to_string(),
r#type: ErrorType::Generic,
}
}
pub fn with_context(message: impl AsRef<str>, r#type: ErrorType) -> Self {
DsxError {
message: message.as_ref().to_string(),
r#type,
}
}
}
impl From<std::io::Error> for DsxError {
fn from(err: std::io::Error) -> Self {
Self::with_context(err.to_string(), ErrorType::IoError)
}
}
#[derive(Default, Debug)]
pub enum ErrorType {
BuildFailed,
IoError,
#[default]
Generic,
TarError,
}
+142
View File
@@ -0,0 +1,142 @@
use std::path::PathBuf;
pub trait Template {
fn lib(project: &str) -> String;
fn bin(project: &str) -> String;
fn create(project: &str, lib: bool) -> String {
if lib {
Self::lib(project)
} else {
Self::bin(project)
}
}
}
pub struct Dsa;
pub struct Dsc;
impl Template for Dsa {
fn lib(project: &str) -> String {
format!(
r#"//
lib.dsa
// usage:
//
// include {project} "<relative path>"
//
// usage for {project}_main:
// push (arg1)
// push (arg0)
// call {project}::{project}_main
// pop (arg0)
// pop (arg1)
// Example data declarations
// dw example_data: 0x0000
// Main function template
{project}_main:
// the correct way to start a function as defined by the calling convention
push bpr
mov spr, bpr
// explanation of how to access args
ldw bpr, rg0, 8 // arg 0
ldw bpr, rg0, 12 // arg 1
// your code goes here
// Example: load example_data into rg1
// ldw example_data, rg1
// the correct way to end a function as defined by the calling convention
mov bpr, spr
pop bpr
return
"#,
)
}
fn bin(project: &str) -> String {
format!(
r#"
// GENERATED BY DSX-BUILD
// Generated at: {timestamp}
// Project name: {project}
// Imports
include print: "./lib/print.dsa"
// Globals & Reserved Memory
dw stack: 0x10000
db message: "Process Exited with code:"
// Entry Point
_init:
ldw stack, bpr
mov bpr, spr
push zero
call main
call print::print_newline
lwi message, rg0
push rg0
call print::print
pop zero
call print::print_hex_word
pop zero
hlt
main:
push bpr
mov spr, bpr
// Your code goes here
// Return zero
stw zero, bpr, 8
mov bpr, spr
pop bpr
return"#,
timestamp = chrono::Utc::now().format("%Y-%m-%d %H:%M:%S")
)
}
}
impl Template for Dsc {
fn lib(project: &str) -> String {
format!(
r#"
// GENERATED BY DSX-BUILD
// Generated at: {timestamp}
// Project name: {project}
// Imports
include print: "./lib/print.dsa";
// Main Function
fn {project}_main() -> u32 {{
return 0;
}}"#,
timestamp = chrono::Utc::now().format("%Y-%m-%d %H:%M:%S")
)
}
fn bin(project: &str) -> String {
format!(
r#"
// GENERATED BY DSX-BUILD
// Generated at: {timestamp}
// Project name: {project}
// Imports
include print: "./lib/print.dsa";
// Main Function
fn main() -> u32 {{
return 0;
}}"#,
timestamp = chrono::Utc::now().format("%Y-%m-%d %H:%M:%S")
)
}
}
+104
View File
@@ -0,0 +1,104 @@
// multiply.dsa
// usage:
//
// include multiply "<relative path>"
//
// usage for multiply:
// push (arg1)
// push (arg0)
// call multiply::multiply
// pop (arg0)
// pop (arg1)
multiply:
push bpr
mov spr, bpr
ldw bpr, rg0, 8 // load op 2
ldw bpr, rg1, 12 // load op 1
lwi 0, rg2 // initialise rg2 to zero
_multiply_loop:
add rg2, rg0, rg2
dec rg1
cmp rg1, zero
jgt _multiply_loop
_multiply_end:
stw rg2, bpr, 8
mov bpr, spr
pop bpr
return
divmod:
push bpr
mov spr, bpr
ldw bpr, rg1, 8 // load op 2
ldw bpr, rg0, 12 // load op 1
lli 0, rg3
_divmod_loop:
cmp rg0, rg1
jlt _divmod_end
sub rg0, rg1, rg0
inc rg3
jmp _divmod_loop
_divmod_end:
// store div in first arg
// store mod in second arg
stw rg3, bpr, 8
stw rg0, bpr, 12
mov bpr, spr
pop bpr
return
// multiply.dsa - improved version
// Multiplies two 32-bit numbers using shift-and-add
//
// Usage:
// push operand2 (multiplier)
// push operand1 (multiplicand)
// call multiply::multiply
// pop result
// pop zero (discard second argument)
new_multiply:
push bpr
mov spr, bpr
ldw bpr, rg0, 8 // rg0 = multiplicand
ldw bpr, rg1, 12 // rg1 = multiplier
lli 0, rg2 // rg2 = result (accumulator)
lli 32, rg3 // rg3 = bit counter
mult_loop:
// Check if lowest bit of multiplier is 1
lli 1, acc
and rg1, acc, acc // acc = rg1 & 1
cmp acc, zero
jeq skip_add // if (rg1 & 1) == 0, skip addition
// Add multiplicand to result
add rg2, rg0, rg2
skip_add:
shl rg0, 1 // shift multiplicand left
shr rg1, 1 // shift multiplier right
dec rg3
cmp rg3, zero
jgt mult_loop
stw rg2, bpr, 8 // store result
mov bpr, spr
pop bpr
return
+331
View File
@@ -0,0 +1,331 @@
// lib:
// print.dsa
// usage:
//
// include print "<relative path>""
//
// usage for print:
// push (register containing address of string)
// push pcx
// jmp print::print
//
// usage for reset:
// push pcx
// jmp print::reset
//
// usage for clear:
// push pcx
// jmp print::clear
//
// usage for print_byte:
// push (register containing byte)
// push pcx
// jmp print::print_byte
//
// usage for print_word:
// push (register containing word)
// push pcx
// jmp print::print_word
//
// usage for print_num:
// push (register containing number to print in decimal)
// push pcx
// jmp print::print_num
//
include maths "./maths.dsa"
dw display: 0x20000
dw current: 0x20000
// ------------------------------------------
// prints the string at addr(arg[0]) to the screen. (no trailing whitespace unless explicitly provided)
print:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
ldw current, rg1
_print_loop:
ldb rg0, acc
cmp acc, zero
jeq _end
stb acc, rg1
addi rg0, 1
addi rg1, 1
jmp _print_loop
// ------------------------------------------
println:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
ldw current, rg1
_println_loop:
ldb rg0, acc
cmp acc, zero
jeq _println_end
stb acc, rg1
addi rg0, 1
addi rg1, 1
jmp _println_loop
_println_end:
call print_newline
jmp _end
// ------------------------------------------
// prints the value of arg[0] to the screen.
print_word:
// initialise
push bpr
mov spr, bpr
// load byte into acc
ldw bpr, rg0, 8
ldw current, rg1
addi rg1, 3
stb rg0, rg1
subi rg1, 1
shr rg0, 8
stb rg0, rg1
subi rg1, 1
shr rg0, 8
stb rg0, rg1
subi rg1, 1
shr rg0, 8
stb rg0, rg1
addi rg1, 4
jmp _end
// ------------------------------------------
// prints the last byte of arg[0] to the screen.
print_byte:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
ldw current, rg1
stb rg0, rg1
addi rg1, 1
jmp _end
// ------------------------------------------
// prints the value of arg[0] to the screen in hex.
print_hex_word:
push bpr
mov spr, bpr
ldw current, rg1
ldb bpr, rg0, 8
push rg0
call _print_hex_byte
addi spr, 4
ldb bpr, rg0, 9
push rg0
call _print_hex_byte
addi spr, 4
ldb bpr, rg0, 10
push rg0
call _print_hex_byte
addi spr, 4
ldb bpr, rg0, 11
push rg0
call _print_hex_byte
addi spr, 4
jmp _end
// ------------------------------------------
// prints the last byte of arg[0] to the screen in hex.
print_hex_byte:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
ldw current, rg1
call _print_hex_byte
jmp _end
// function body
_print_hex_byte:
// mask to get lower nibble
lli 0xF, rg2
// save rg0 state
push rg0
shr rg0, 4
and rg0, rg2, rg0
call _print_hex_nibble
pop rg0
and rg0, rg2, rg0
call _print_hex_nibble
return
// print a hex digit
_print_hex_nibble:
lli 10, rg3
cmp rg0, rg3
jlt _print_hex_nibble_number
addi rg0, 0x37, rg0
stb rg0, rg1
addi rg1, 1
return
// helper function.
_print_hex_nibble_number:
addi rg0, 0x30, rg0
stb rg0, rg1
addi rg1, 1
return
// ------------------------------------------
// print whitespace
print_whitespace:
push bpr
mov spr, bpr
ldw current, rg1
lli 0x20, rg0
stb rg0, rg1
addi rg1, 1
jmp _end
// ------------------------------------------
// print newline
print_newline:
push bpr
mov spr, bpr
// load variables into registers
ldw display, rg0
ldw current, rg1
// get the offset from the display base
sub rg1, rg0, rg0
lwi 80, rg2
pusha 3
push rg0
push rg2
call maths::divmod
pop zero // result
pop rg3 // remainder
popa 3
sub rg1, rg3, rg2
addi rg2, 80, rg1
// _end saves the display state
jmp _end
// ------------------------------------------
// prints arg[0] as a decimal number to the screen.
print_num:
push bpr
mov spr, bpr
ldw bpr, rg0, 8 // load number to print
lli 0, rg5 // rg5 = digit counter
// check if number is zero
cmp rg0, zero
jne _print_num_extract_digits
// special case: print '0' for zero
lli 0x30, rg6
push rg6 // push digit to stack buffer
lli 1, rg5 // we have 1 digit
jmp _print_num_output
_print_num_extract_digits:
// divide by 10 repeatedly to get digits
cmp rg0, zero
jeq _print_num_output
// call divmod(rg0, 10)
push rg0 // dividend
lli 10, rg1
push rg1 // divisor (10)
call maths::divmod
pop rg0 // quotient (continue dividing this)
pop rg1 // remainder (the digit)
// convert digit to ASCII and push to stack buffer
addi rg1, 0x30, rg6 // convert to ASCII
push rg6 // push digit to stack
inc rg5 // increment digit counter
jmp _print_num_extract_digits
_print_num_output:
// now print digits (pop them off in reverse order)
ldw current, rg1 // get display pointer
_print_num_output_loop:
// check if we've printed all digits
cmp rg5, zero
jeq _print_num_done
// pop digit and print it
pop rg6
stb rg6, rg1
addi rg1, 1
dec rg5
jmp _print_num_output_loop
_print_num_done:
jmp _end
// ------------------------------------------
// resets the cursor position on the screen to 0x20000. (0,0)
reset:
push bpr
mov spr, bpr
ldw display, rg1
jmp _end
// ------------------------------------------
// clears the screen
clear:
push bpr
mov spr, bpr
// display size = 2000 bytes / 500 words
lli 500 rg0
ldw display, rg1
_clear_loop:
dec rg0
stw zero, rg1
addi rg1, 4
cmp rg0, zero
jgt _clear_loop
jmp _end
// ------------------------------------------
// return
_end:
stw rg1, current
mov bpr, spr
pop bpr
return
+244
View File
@@ -0,0 +1,244 @@
// lib:
// print_serial.dsa
// usage:
//
// include print_serial "<relative path>"
//
// usage for print:
// push (register containing address of string)
// push pcx
// jmp print_serial::print
//
// usage for print_byte:
// push (register containing byte)
// push pcx
// jmp print_serial::print_byte
//
// usage for print_word:
// push (register containing word)
// push pcx
// jmp print_serial::print_word
//
// usage for print_hex_byte:
// push (register containing byte)
// push pcx
// jmp print_serial::print_hex_byte
//
// usage for print_hex_word:
// push (register containing word)
// push pcx
// jmp print_serial::print_hex_word
//
// usage for print_num:
// push (register containing number to print in decimal)
// push pcx
// jmp print_serial::print_num
//
// usage for println:
// push (register containing address of string)
// push pcx
// jmp print_serial::println
//
include maths "../maths.dsa"
dw serial: 0x207D0 // 0x20000 + 2000
// ------------------------------------------
// prints the string at addr(arg[0]) to the serial port.
print:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
lwi 0x207D0, rg1
_print_loop:
ldb rg0, acc
cmp acc, zero
jeq _end
stb acc, rg1
addi rg0, 1
jmp _print_loop
// ------------------------------------------
// prints the string at addr(arg[0]) followed by a newline to the serial port.
println:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
lwi 0x207D0, rg1
_println_loop:
ldb rg0, acc
cmp acc, zero
jeq _println_end
stb acc, rg1
addi rg0, 1
jmp _println_loop
_println_end:
lli 0x0A, rg2 // newline character
stb rg2, rg1
jmp _end
// ------------------------------------------
// prints the word in arg[0] as 4 raw bytes to the serial port.
print_word:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
lwi 0x207D0, rg1
stb rg0, rg1
shr rg0, 8
stb rg0, rg1
shr rg0, 8
stb rg0, rg1
shr rg0, 8
stb rg0, rg1
jmp _end
// ------------------------------------------
// prints the last byte of arg[0] to the serial port.
print_byte:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
lwi 0x207D0, rg1
stb rg0, rg1
jmp _end
// ------------------------------------------
// prints the value of arg[0] to the serial port in hex.
print_hex_word:
push bpr
mov spr, bpr
lwi 0x207D0, rg1
ldb bpr, rg0, 8
push rg0
call _print_hex_byte
addi spr, 4
ldb bpr, rg0, 9
push rg0
call _print_hex_byte
addi spr, 4
ldb bpr, rg0, 10
push rg0
call _print_hex_byte
addi spr, 4
ldb bpr, rg0, 11
push rg0
call _print_hex_byte
addi spr, 4
jmp _end
// ------------------------------------------
// prints the last byte of arg[0] to the serial port in hex.
print_hex_byte:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
lwi 0x207D0, rg1
call _print_hex_byte
jmp _end
// function body
_print_hex_byte:
lli 0xF, rg2
push rg0
shr rg0, 4
and rg0, rg2, rg0
call _print_hex_nibble
pop rg0
and rg0, rg2, rg0
call _print_hex_nibble
return
// print a hex digit
_print_hex_nibble:
lli 10, rg3
cmp rg0, rg3
jlt _print_hex_nibble_number
addi rg0, 0x37, rg0
stb rg0, rg1
return
_print_hex_nibble_number:
addi rg0, 0x30, rg0
stb rg0, rg1
return
// ------------------------------------------
// prints arg[0] as a decimal number to the serial port.
print_num:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
lli 0, rg5
cmp rg0, zero
jne _print_num_extract_digits
lli 0x30, rg6
push rg6
lli 1, rg5
jmp _print_num_output
_print_num_extract_digits:
cmp rg0, zero
jeq _print_num_output
push rg0
lli 10, rg1
push rg1
call maths::divmod
pop rg0
pop rg1
addi rg1, 0x30, rg6
push rg6
inc rg5
jmp _print_num_extract_digits
_print_num_output:
lwi 0x207D0, rg1
_print_num_output_loop:
cmp rg5, zero
jeq _print_num_done
pop rg6
stb rg6, rg1
dec rg5
jmp _print_num_output_loop
_print_num_done:
jmp _end
// ------------------------------------------
// return
_end:
mov bpr, spr
pop bpr
return
+15
View File
@@ -0,0 +1,15 @@
[package]
name = "dsx_common"
version.workspace = true
edition.workspace = true
authors.workspace = true
[dependencies]
compiler = { path = "../../core/compiler" }
assembler = { path = "../../core/assembler" }
common = { path = "../../core/dsa_common" }
chrono = "0.4.44"
serde = { version = "1.0.228", features = ["derive"] }
toml = "1.0.3"
walkdir = "2.5.0"
+168
View File
@@ -0,0 +1,168 @@
use std::{
env, fs, io,
path::{Path, PathBuf},
process::Command,
};
use crate::config::DsxConfig;
use assembler::prelude::Assembler;
use common::build::{BuildError, Builder};
use compiler::Compiler;
pub struct BuildContext {
pub project_dir: PathBuf,
pub build_dir: PathBuf,
pub artifact_dir: PathBuf,
}
// ---------- build ----------------------------------------------------------
pub fn build_project(ctx: BuildContext) -> Result<(), BuildError> {
// variables
let binary_path = ctx.artifact_dir.join("out.dsb");
let main_path = ctx.build_dir.join("main.dsa");
let config_path = ctx.project_dir.join("Dsx.toml");
let src_dir = ctx.project_dir.join("src");
let config: DsxConfig =
toml::from_str(&fs::read_to_string(&config_path)?).map_err(|deser_err| {
io::Error::new(io::ErrorKind::InvalidData, deser_err.to_string())
})?;
if !src_dir.exists() {
return Err(BuildError::Generic(String::from(
"Source Directory does not exist",
)));
}
// make sure there's a main file to assemble later.
if !main_exists(&src_dir)? {
return Err(BuildError::Generic(String::from(
"No main.dsa or main.dsc file found in top level of src directory.",
)));
}
// detect src.
let (has_dsa, has_dsc) = detect_source_language(&src_dir);
// create a build dir and copy all files across
fs::create_dir_all(&ctx.build_dir)?;
copy_recursively(&src_dir, &ctx.build_dir)?;
if has_dsc {
build_all_dsc(&ctx.build_dir)?;
}
// Replace .dsc with .dsa only in include statements, recursively for each file.
let status = Command::new("bash").args([
"-c",
&format!(
"find \"{}\" -type f -name '*.dsa' -exec sed -i '/^include/ s/\\.dsc/.dsa/g' {{}} +",
ctx.build_dir.display()
),
]).status()?;
if !status.success() {
return Err(BuildError::IoError(String::from(
"Failed to execute build command command",
)));
}
// assemble result
{
fs::create_dir_all(&ctx.artifact_dir)?;
let mut asm = Assembler::new(&main_path);
asm.start(());
asm.write_result(&binary_path)?;
}
println!("Build finished. Binary at {}", binary_path.display());
Ok(())
}
// ----- Helpers -------------------------------
struct BuildStep;
impl BuildStep {
pub fn compiling(path: &Path) {
println!("Compiling {}", path.display());
}
pub fn assembling(path: &Path) {
println!("Assembling {}", path.display());
}
}
/// Checks what source languages are used in the project.
fn detect_source_language(src_dir: &Path) -> (bool, bool) {
let mut contains_dsc = false;
let mut contains_dsa = false;
for entry in walkdir::WalkDir::new(src_dir).into_iter().flatten() {
match entry.path().extension().and_then(|s| s.to_str()) {
Some("dsc") => contains_dsc = true,
Some("dsa") => contains_dsa = true,
_ => {}
}
}
(contains_dsa, contains_dsc)
}
// Checks if either main.dsa or main.dsc exist in the source directory
fn main_exists(src_dir: &Path) -> Result<bool, std::io::Error> {
for entry in fs::read_dir(src_dir).into_iter().flatten() {
match entry?.path().file_name().and_then(|s| s.to_str()) {
Some("main.dsc") => return Ok(true),
Some("main.dsa") => return Ok(true),
_ => {}
}
}
Ok(false)
}
// Copy contents of one directory to another
fn copy_recursively(src: &Path, dst: &Path) -> Result<(), std::io::Error> {
if src.is_file() {
fs::create_dir_all(dst.parent().unwrap())?;
fs::copy(src, dst)?;
return Ok(());
}
if src.is_dir() {
for entry in fs::read_dir(src)? {
let entry = entry?;
let child_src = entry.path();
let child_dst = dst.join(entry.file_name());
copy_recursively(&child_src, &child_dst)?;
}
}
Ok(())
}
fn build_all_dsc(path: &Path) -> Result<(), BuildError> {
if path.is_dir() {
for entry in fs::read_dir(path)? {
let entry = entry?;
build_all_dsc(&entry.path())?;
}
return Ok(());
}
if path.extension().and_then(|s| s.to_str()) == Some("dsc") {
let input_path = path;
let output_path = path.with_extension("dsa");
let is_lib = !(input_path.file_stem().unwrap().to_str().unwrap() == "main");
let mut compiler = Compiler::new(input_path);
compiler.start(is_lib);
compiler.write_result(output_path.clone())?;
}
Ok(())
}
@@ -140,7 +140,7 @@ fn main() -> u32 {{
}
pub fn create_print_lib() -> String {
format!(
String::from(
r#"
// lib:
// print.dsa
@@ -473,12 +473,12 @@ _end:
mov bpr, spr
pop bpr
return
"#
"#,
)
}
pub fn create_maths_lib() -> String {
format!(
String::from(
r#"
// multiply.dsa
// usage:
@@ -584,6 +584,6 @@ skip_add:
mov bpr, spr
pop bpr
return
"#
"#,
)
}
+45
View File
@@ -0,0 +1,45 @@
use serde::{Deserialize, Serialize};
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DsxConfig {
pub name: String,
#[serde(default)]
pub description: Option<String>,
#[serde(default)]
#[serde(rename = "remote")]
pub remote_url: Option<String>,
#[serde(default)]
pub binaries: Vec<Binary>,
// todo!
// #[serde(default)]
// pub libraries: Vec<Library>,
}
impl DsxConfig {
pub fn new(name: &str) -> Self {
Self {
name: name.to_string(),
description: None,
remote_url: None,
binaries: Vec::new(),
}
}
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Binary {
pub name: String,
pub path: String,
}
impl Binary {
pub fn new(name: &str, path: &str) -> Self {
Self {
name: name.to_string(),
path: path.to_string(),
}
}
}
+2
View File
@@ -0,0 +1,2 @@
pub mod builder;
pub mod config;
@@ -0,0 +1,4 @@
id = "test"
latest_build_date = "2026-02-25 14:39:49"
latest_build_status = "success"
latest_build_id = "test"
@@ -0,0 +1,3 @@
name = "test"
binaries = []
remote = "http://localhost:8000/api/pkg/test"
@@ -0,0 +1,77 @@
// Arena Allocator
// Supports multiple arenas that can be destroyed independently
// Much more practical than a simple bump allocator
// Global heap management
static heap_start: u32 = 0x30000;
static heap_end: u32 = 0x40000;
static heap_current: u32 = 0x30000;
// Arena structure (stored at the start of each arena):
// [0-3]: start_address (u32)
// [4-7]: current_position (u32)
// [8-11]: end_address (u32)
// Total header size: 12 bytes
// Create a new arena with given size
// Returns pointer to arena handle (or 0 if failed)
fn new(size: u32) -> u32 {
let total_size: u32 = size + 12;
let arena_ptr: u32 = heap_current;
let new_current: u32 = arena_ptr + total_size;
// Check if we have space
if new_current > heap_end {
return 0;
}
// Calculate arena data region
let data_start: u32 = arena_ptr + 12;
let data_end: u32 = arena_ptr + total_size;
// Initialize arena header
// Note: In real implementation, you'd use pointer writes here
// For now, using placeholder comments:
*arena_ptr = data_start; // start_address
*(arena_ptr + 4) = data_start; // current_position
*(arena_ptr + 8) = data_end; // end_address
heap_current = new_current;
return arena_ptr;
}
// Allocate from an arena
// Returns pointer to allocated memory (or 0 if failed)
fn alloc(arena: u32, size: u32) -> u32 {
// Read current position from arena
let current: u32 = *(arena + 4);
let end: u32 = *(arena + 8);
let new_current: u32 = current + size;
// Check if arena has space
if new_current > end {
return 0;
}
// Update current position in arena
*(arena + 4) = new_current;
return current;
}
// Destroy an arena (in bump allocator, this is a no-op)
// In a real allocator, you'd mark the memory as free
fn destroy(arena: u32) {
// In a true allocator, mark memory as reusable
// For bump allocator, we can't reclaim memory
// unless we destroy ALL arenas and reset
return 0;
}
// Reset entire heap (destroys ALL arenas)
fn reset_all() {
heap_current = heap_start;
return 0;
}
@@ -0,0 +1,104 @@
// multiply.dsa
// usage:
//
// include multiply "<relative path>"
//
// usage for multiply:
// push (arg1)
// push (arg0)
// call multiply::multiply
// pop (arg0)
// pop (arg1)
multiply:
push bpr
mov spr, bpr
ldw bpr, rg0, 8 // load op 2
ldw bpr, rg1, 12 // load op 1
lwi 0, rg2 // initialise rg2 to zero
_multiply_loop:
add rg2, rg0, rg2
dec rg1
cmp rg1, zero
jgt _multiply_loop
_multiply_end:
stw rg2, bpr, 8
mov bpr, spr
pop bpr
return
divmod:
push bpr
mov spr, bpr
ldw bpr, rg1, 8 // load op 2
ldw bpr, rg0, 12 // load op 1
lli 0, rg3
_divmod_loop:
cmp rg0, rg1
jlt _divmod_end
sub rg0, rg1, rg0
inc rg3
jmp _divmod_loop
_divmod_end:
// store div in first arg
// store mod in second arg
stw rg3, bpr, 8
stw rg0, bpr, 12
mov bpr, spr
pop bpr
return
// multiply.dsa - improved version
// Multiplies two 32-bit numbers using shift-and-add
//
// Usage:
// push operand2 (multiplier)
// push operand1 (multiplicand)
// call multiply::multiply
// pop result
// pop zero (discard second argument)
new_multiply:
push bpr
mov spr, bpr
ldw bpr, rg0, 8 // rg0 = multiplicand
ldw bpr, rg1, 12 // rg1 = multiplier
lli 0, rg2 // rg2 = result (accumulator)
lli 32, rg3 // rg3 = bit counter
mult_loop:
// Check if lowest bit of multiplier is 1
lli 1, acc
and rg1, acc, acc // acc = rg1 & 1
cmp acc, zero
jeq skip_add // if (rg1 & 1) == 0, skip addition
// Add multiplicand to result
add rg2, rg0, rg2
skip_add:
shl rg0, 1 // shift multiplicand left
shr rg1, 1 // shift multiplier right
dec rg3
cmp rg3, zero
jgt mult_loop
stw rg2, bpr, 8 // store result
mov bpr, spr
pop bpr
return
@@ -0,0 +1,331 @@
// lib:
// print.dsa
// usage:
//
// include print "<relative path>""
//
// usage for print:
// push (register containing address of string)
// push pcx
// jmp print::print
//
// usage for reset:
// push pcx
// jmp print::reset
//
// usage for clear:
// push pcx
// jmp print::clear
//
// usage for print_byte:
// push (register containing byte)
// push pcx
// jmp print::print_byte
//
// usage for print_word:
// push (register containing word)
// push pcx
// jmp print::print_word
//
// usage for print_num:
// push (register containing number to print in decimal)
// push pcx
// jmp print::print_num
//
include maths "./maths.dsa"
dw display: 0x20000
dw current: 0x20000
// ------------------------------------------
// prints the string at addr(arg[0]) to the screen. (no trailing whitespace unless explicitly provided)
print:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
ldw current, rg1
_print_loop:
ldb rg0, acc
cmp acc, zero
jeq _end
stb acc, rg1
addi rg0, 1
addi rg1, 1
jmp _print_loop
// ------------------------------------------
println:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
ldw current, rg1
_println_loop:
ldb rg0, acc
cmp acc, zero
jeq _println_end
stb acc, rg1
addi rg0, 1
addi rg1, 1
jmp _println_loop
_println_end:
call print_newline
jmp _end
// ------------------------------------------
// prints the value of arg[0] to the screen.
print_word:
// initialise
push bpr
mov spr, bpr
// load byte into acc
ldw bpr, rg0, 8
ldw current, rg1
addi rg1, 3
stb rg0, rg1
subi rg1, 1
shr rg0, 8
stb rg0, rg1
subi rg1, 1
shr rg0, 8
stb rg0, rg1
subi rg1, 1
shr rg0, 8
stb rg0, rg1
addi rg1, 4
jmp _end
// ------------------------------------------
// prints the last byte of arg[0] to the screen.
print_byte:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
ldw current, rg1
stb rg0, rg1
addi rg1, 1
jmp _end
// ------------------------------------------
// prints the value of arg[0] to the screen in hex.
print_hex_word:
push bpr
mov spr, bpr
ldw current, rg1
ldb bpr, rg0, 8
push rg0
call _print_hex_byte
addi spr, 4
ldb bpr, rg0, 9
push rg0
call _print_hex_byte
addi spr, 4
ldb bpr, rg0, 10
push rg0
call _print_hex_byte
addi spr, 4
ldb bpr, rg0, 11
push rg0
call _print_hex_byte
addi spr, 4
jmp _end
// ------------------------------------------
// prints the last byte of arg[0] to the screen in hex.
print_hex_byte:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
ldw current, rg1
call _print_hex_byte
jmp _end
// function body
_print_hex_byte:
// mask to get lower nibble
lli 0xF, rg2
// save rg0 state
push rg0
shr rg0, 4
and rg0, rg2, rg0
call _print_hex_nibble
pop rg0
and rg0, rg2, rg0
call _print_hex_nibble
return
// print a hex digit
_print_hex_nibble:
lli 10, rg3
cmp rg0, rg3
jlt _print_hex_nibble_number
addi rg0, 0x37, rg0
stb rg0, rg1
addi rg1, 1
return
// helper function.
_print_hex_nibble_number:
addi rg0, 0x30, rg0
stb rg0, rg1
addi rg1, 1
return
// ------------------------------------------
// print whitespace
print_whitespace:
push bpr
mov spr, bpr
ldw current, rg1
lli 0x20, rg0
stb rg0, rg1
addi rg1, 1
jmp _end
// ------------------------------------------
// print newline
print_newline:
push bpr
mov spr, bpr
// load variables into registers
ldw display, rg0
ldw current, rg1
// get the offset from the display base
sub rg1, rg0, rg0
lwi 80, rg2
pusha 3
push rg0
push rg2
call maths::divmod
pop zero // result
pop rg3 // remainder
popa 3
sub rg1, rg3, rg2
addi rg2, 80, rg1
// _end saves the display state
jmp _end
// ------------------------------------------
// prints arg[0] as a decimal number to the screen.
print_num:
push bpr
mov spr, bpr
ldw bpr, rg0, 8 // load number to print
lli 0, rg5 // rg5 = digit counter
// check if number is zero
cmp rg0, zero
jne _print_num_extract_digits
// special case: print '0' for zero
lli 0x30, rg6
push rg6 // push digit to stack buffer
lli 1, rg5 // we have 1 digit
jmp _print_num_output
_print_num_extract_digits:
// divide by 10 repeatedly to get digits
cmp rg0, zero
jeq _print_num_output
// call divmod(rg0, 10)
push rg0 // dividend
lli 10, rg1
push rg1 // divisor (10)
call maths::divmod
pop rg0 // quotient (continue dividing this)
pop rg1 // remainder (the digit)
// convert digit to ASCII and push to stack buffer
addi rg1, 0x30, rg6 // convert to ASCII
push rg6 // push digit to stack
inc rg5 // increment digit counter
jmp _print_num_extract_digits
_print_num_output:
// now print digits (pop them off in reverse order)
ldw current, rg1 // get display pointer
_print_num_output_loop:
// check if we've printed all digits
cmp rg5, zero
jeq _print_num_done
// pop digit and print it
pop rg6
stb rg6, rg1
addi rg1, 1
dec rg5
jmp _print_num_output_loop
_print_num_done:
jmp _end
// ------------------------------------------
// resets the cursor position on the screen to 0x20000. (0,0)
reset:
push bpr
mov spr, bpr
ldw display, rg1
jmp _end
// ------------------------------------------
// clears the screen
clear:
push bpr
mov spr, bpr
// display size = 2000 bytes / 500 words
lli 500 rg0
ldw display, rg1
_clear_loop:
dec rg0
stw zero, rg1
addi rg1, 4
cmp rg0, zero
jgt _clear_loop
jmp _end
// ------------------------------------------
// return
_end:
stw rg1, current
mov bpr, spr
pop bpr
return
@@ -0,0 +1,274 @@
// lib:
// print_serial.dsa
// usage:
//
// include print_serial "<relative path>"
//
// usage for print:
// push (register containing address of string)
// push pcx
// jmp print_serial::print
//
// usage for print_byte:
// push (register containing byte)
// push pcx
// jmp print_serial::print_byte
//
// usage for print_word:
// push (register containing word)
// push pcx
// jmp print_serial::print_word
//
// usage for print_hex_byte:
// push (register containing byte)
// push pcx
// jmp print_serial::print_hex_byte
//
// usage for print_hex_word:
// push (register containing word)
// push pcx
// jmp print_serial::print_hex_word
//
// usage for print_whitespace:
// push pcx
// jmp print_serial::print_whitespace
//
// usage for print_newline:
// push pcx
// jmp print_serial::print_newline
//
// usage for print_num:
// push (register containing number to print in decimal)
// push pcx
// jmp print_serial::print_num
//
// usage for println:
// push (register containing address of string)
// push pcx
// jmp print_serial::println
//
include maths "./maths.dsa"
dw serial: 0x207D0 // 0x20000 + 2000
// ------------------------------------------
// prints the string at addr(arg[0]) to the serial port.
print:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
lwi 0x207D0, rg1
_print_loop:
ldb rg0, acc
cmp acc, zero
jeq _end
stb acc, rg1
addi rg0, 1
jmp _print_loop
// ------------------------------------------
// prints the string at addr(arg[0]) followed by a newline to the serial port.
println:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
lwi 0x207D0, rg1
_println_loop:
ldb rg0, acc
cmp acc, zero
jeq _println_end
stb acc, rg1
addi rg0, 1
jmp _println_loop
_println_end:
lli 0x0A, rg2 // newline character
stb rg2, rg1
jmp _end
// ------------------------------------------
// prints the word in arg[0] as 4 raw bytes to the serial port.
print_word:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
lwi 0x207D0, rg1
stb rg0, rg1
shr rg0, 8
stb rg0, rg1
shr rg0, 8
stb rg0, rg1
shr rg0, 8
stb rg0, rg1
jmp _end
// ------------------------------------------
// prints the last byte of arg[0] to the serial port.
print_byte:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
lwi 0x207D0, rg1
stb rg0, rg1
jmp _end
// ------------------------------------------
// prints the value of arg[0] to the serial port in hex.
print_hex_word:
push bpr
mov spr, bpr
lwi 0x207D0, rg1
ldb bpr, rg0, 8
push rg0
call _print_hex_byte
addi spr, 4
ldb bpr, rg0, 9
push rg0
call _print_hex_byte
addi spr, 4
ldb bpr, rg0, 10
push rg0
call _print_hex_byte
addi spr, 4
ldb bpr, rg0, 11
push rg0
call _print_hex_byte
addi spr, 4
jmp _end
// ------------------------------------------
// prints the last byte of arg[0] to the serial port in hex.
print_hex_byte:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
lwi 0x207D0, rg1
call _print_hex_byte
jmp _end
// function body
_print_hex_byte:
lli 0xF, rg2
push rg0
shr rg0, 4
and rg0, rg2, rg0
call _print_hex_nibble
pop rg0
and rg0, rg2, rg0
call _print_hex_nibble
return
// print a hex digit
_print_hex_nibble:
lli 10, rg3
cmp rg0, rg3
jlt _print_hex_nibble_number
addi rg0, 0x37, rg0
stb rg0, rg1
return
_print_hex_nibble_number:
addi rg0, 0x30, rg0
stb rg0, rg1
return
// print a single space
print_whitespace:
push bpr
mov spr, bpr
lli 0x20, rg0
ldw serial, rg1
stb rg0, rg1
jmp _end
// print a single space
print_newline:
push bpr
mov spr, bpr
lli 0x0A, rg0
ldw serial, rg1
stb rg0, rg1
jmp _end
// ------------------------------------------
// prints arg[0] as a decimal number to the serial port.
print_num:
push bpr
mov spr, bpr
ldw bpr, rg0, 8
lli 0, rg5
cmp rg0, zero
jne _print_num_extract_digits
lli 0x30, rg6
push rg6
lli 1, rg5
jmp _print_num_output
_print_num_extract_digits:
cmp rg0, zero
jeq _print_num_output
push rg0
lli 10, rg1
push rg1
call maths::divmod
pop rg0
pop rg1
addi rg1, 0x30, rg6
push rg6
inc rg5
jmp _print_num_extract_digits
_print_num_output:
lwi 0x207D0, rg1
_print_num_output_loop:
cmp rg5, zero
jeq _print_num_done
pop rg6
stb rg6, rg1
dec rg5
jmp _print_num_output_loop
_print_num_done:
jmp _end
// ------------------------------------------
// return
_end:
mov bpr, spr
pop bpr
return
@@ -0,0 +1,30 @@
include serial: "./lib/serial.dsa";
include print: "./lib/print.dsa";
include arena: "./lib/arena_alloc.dsc";
fn main() {
let x: u32 = 0;
let y: u32 = &x;
let alloc: u32 = arena::new(512);
let ptr1: u32 = arena::alloc(alloc, 32);
let ptr2: u32 = arena::alloc(alloc, 32);
serial::print_hex_word(alloc);
serial::print_newline();
serial::print_hex_word(ptr1);
serial::print_newline();
serial::print_hex_word(ptr2);
serial::print_newline();
serial::print_num(*ptr2);
serial::print_newline();
*ptr2 = 42;
serial::print_hex_word(ptr2);
serial::print_whitespace();
serial::print_num(*ptr2);
serial::print_newline();
serial::println("end");
return 0;
}
+28
View File
@@ -0,0 +1,28 @@
[package]
name = "dsx_server"
version.workspace = true
edition.workspace = true
authors.workspace = true
[[bin]]
name = "dsx-server"
path = "src/main.rs"
[dependencies]
compiler = { path = "../../core/compiler" }
assembler = { path = "../../core/assembler" }
common = { path = "../../core/dsa_common" }
dsx_common = { path = "../dsx_common" }
dotenv = "0.15.0"
rocket = { version = "0.5.1", features = ["json"] }
rocket_dyn_templates = { version = "0.2.0", features = ["tera"] }
serde = { version = "1.0.228", features = ["derive"] }
toml = "1.0.3"
tracing = "0.1.44"
tracing-subscriber = { version = "0.3.22", features = ["env-filter"] }
chrono = "0.4.43"
tar = "0.4.44"
flate2 = "1.1.9"
walkdir = "2.5.0"
tokio = { version = "1.49.0", features = ["rt"] }
+106
View File
@@ -0,0 +1,106 @@
default_job = "check"
env.CARGO_TERM_COLOR = "always"
[jobs.check]
command = ["cargo", "check"]
need_stdout = false
[jobs.check-all]
command = ["cargo", "check", "--all-targets"]
need_stdout = false
# Run clippy on the default target
[jobs.clippy]
command = ["cargo", "clippy"]
need_stdout = false
[jobs.clippy-all]
command = ["cargo", "clippy", "--all-targets"]
need_stdout = false
# Run clippy in pedantic mode
# The 'dismiss' feature may come handy
[jobs.pedantic]
command = [
"cargo", "clippy",
"--",
"-W", "clippy::pedantic",
]
need_stdout = false
# This job lets you run
# - all tests: bacon test
# - a specific test: bacon test -- config::test_default_files
# - the tests of a package: bacon test -- -- -p config
[jobs.test]
command = ["cargo", "test"]
need_stdout = true
[jobs.nextest]
command = [
"cargo", "nextest", "run",
"--hide-progress-bar", "--failure-output", "final"
]
need_stdout = true
analyzer = "nextest"
[jobs.doc]
command = ["cargo", "doc", "--no-deps"]
need_stdout = false
# If the doc compiles, then it opens in your browser and bacon switches
# to the previous job
[jobs.doc-open]
command = ["cargo", "doc", "--no-deps", "--open"]
need_stdout = false
on_success = "back" # so that we don't open the browser at each change
# You can run your application and have the result displayed in bacon,
# if it makes sense for this crate.
[jobs.run]
command = [
"cargo", "run", "--bin", "dsx-server"
# put launch parameters for your program behind a `--` separator
]
need_stdout = true
allow_warnings = true
background = true
# Run your long-running application (eg server) and have the result displayed in bacon.
# For programs that never stop (eg a server), `background` is set to false
# to have the cargo run output immediately displayed instead of waiting for
# program's end.
# 'on_change_strategy' is set to `kill_then_restart` to have your program restart
# on every change (an alternative would be to use the 'F5' key manually in bacon).
# If you often use this job, it makes sense to override the 'r' key by adding
# a binding `r = job:run-long` at the end of this file .
# A custom kill command such as the one suggested below is frequently needed to kill
# long running programs (uncomment it if you need it)
[jobs.run-long]
command = [
"cargo", "run",
# put launch parameters for your program behind a `--` separator
]
need_stdout = true
allow_warnings = true
background = false
on_change_strategy = "kill_then_restart"
# kill = ["pkill", "-TERM", "-P"]
# This parameterized job runs the example of your choice, as soon
# as the code compiles.
# Call it as
# bacon ex -- my-example
[jobs.ex]
command = ["cargo", "run", "--example"]
need_stdout = true
allow_warnings = true
# You may define here keybindings that would be specific to
# a project, for example a shortcut to launch a specific job.
# Shortcuts to internal functions (scrolling, toggling, etc.)
# should go in your personal global prefs.toml file instead.
[keybindings]
# alt-m = "job:my-job"
c = "job:clippy-all" # comment this to have 'c' run clippy on only the default target
p = "job:pedantic"
+20
View File
@@ -0,0 +1,20 @@
# Endpoints
let n be the repo name.
## Web view
GET /packages/ # home page listing packages - simple search bar.
GET /packages?q=<query> # search for a package
GET /packages/<name> # main page for a repository, shows status, files, name etc.
GET /packages/<name>/~repo/<path> # path for a file within a repo
GET /packages/<name>/~repo?q=<query> # search within a package's files
GET /packages/<name>/~artifact/ # page listing repo artifacts by date
GET /packages/<name>/~artifact/<id> # page for a specific artifact and status/logs
## API
POST /api/pkg # create repo
GET /api/pkg/<name> # repo status/metadata
POST /api/pkg/<name>/push # upload source tarball
GET /api/pkg/<name>/pull # download source tarball
GET /api/pkg/<name>/artifact # download compiled binary
+15
View File
@@ -0,0 +1,15 @@
# Folder structure
data/
repos/
<repo_name>/
repo/
Dsx.toml
README.md
src/
artifacts/
<repo_name>.dsb
<repo_name-lib>.dsb
docs/
<repo_name-lib>.md
index/

Some files were not shown because too many files have changed in this diff Show More