dragan-mitrasinovic/RiscToolchain


RiscToolchain


A complete toolchain for assembling, linking, and emulating programs written in a custom assembly language for a simplified RISC-like CPU architecture.

🎓 Developed as a project for System Programming (13E113SS) at the University of Belgrade, School of Electrical Engineering

🔍 Overview

This project implements a complete system software toolchain consisting of three main components:

  1. Assembler - Translates assembly source code into relocatable object files
  2. Linker - Combines multiple object files into a single executable with resolved symbols and relocated addresses
  3. Emulator - Simulates the execution of the linked program on a virtual processor

The implementation demonstrates core concepts in systems programming including:

  • Two-pass assembly
  • Symbol resolution and relocation
  • Memory management
  • Instruction encoding and decoding
  • Interrupt handling
  • Stack-based calling conventions

✨ Features

Assembler

  • Two-pass assembly for forward symbol references
  • Lexical analysis using Flex
  • Syntax parsing using Bison
  • Symbol table management with global and external symbols
  • Section-based code organization
  • Literal and symbol pools for efficient addressing
  • Relocation records generation
  • Support for multiple addressing modes
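
To illustrate why two passes make forward references unproblematic, here is a minimal sketch. It is not taken from the project's sources: the `Line` struct, the fixed 4-byte size per line, and the function names `collectLabels` and `resolveOperands` are all assumptions made for the example.

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified source line: an optional label definition and an
// optional symbol used as an operand. Every line is assumed to occupy 4 bytes.
struct Line {
    std::string label;    // label defined on this line ("" if none)
    std::string operand;  // symbol referenced by this line ("" if none)
};

// Pass 1: walk the lines with a location counter and record each label's offset.
std::map<std::string, std::uint32_t> collectLabels(const std::vector<Line>& lines) {
    std::map<std::string, std::uint32_t> symbols;
    std::uint32_t locationCounter = 0;
    for (const Line& line : lines) {
        if (!line.label.empty()) symbols[line.label] = locationCounter;
        locationCounter += 4;
    }
    return symbols;
}

// Pass 2: every label is known by now, so a forward reference resolves
// exactly like a backward one. Lines without an operand yield 0.
std::vector<std::uint32_t> resolveOperands(
        const std::vector<Line>& lines,
        const std::map<std::string, std::uint32_t>& symbols) {
    std::vector<std::uint32_t> resolved;
    for (const Line& line : lines) {
        if (line.operand.empty()) { resolved.push_back(0); continue; }
        auto it = symbols.find(line.operand);
        if (it == symbols.end())
            throw std::runtime_error("undefined symbol: " + line.operand);
        resolved.push_back(it->second);
    }
    return resolved;
}
```

In a real assembler, a symbol that is still unknown after the first pass would more likely become a relocation record or an external reference than an error; the exception here only keeps the sketch short.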

Linker

  • Symbol resolution across multiple object files
  • Section placement at specified memory addresses using -place option
  • Relocation processing to fix up addresses
  • Hex file output format for emulator
  • Undefined symbol detection
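
Relocation processing can be pictured as patching a 32-bit slot once the final symbol address is known. The sketch below assumes a hypothetical `Relocation` record layout (offset, resolved symbol address, addend); the project's actual object-file format may differ. Bytes are written least significant first, matching the little-endian memory model of the architecture.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical relocation record: patch the 4 bytes at `offset` inside a
// section with (symbol address + addend).
struct Relocation {
    std::size_t offset;
    std::uint32_t symbolAddress;  // final address, known after section placement
    std::int32_t addend;
};

// Write the relocated 32-bit value into the section, least significant byte first.
void applyRelocation(std::vector<std::uint8_t>& section, const Relocation& rel) {
    if (rel.offset + 4 > section.size())
        throw std::out_of_range("relocation outside section");
    const std::uint32_t value =
        rel.symbolAddress + static_cast<std::uint32_t>(rel.addend);
    for (std::size_t i = 0; i < 4; ++i) {
        section[rel.offset + i] =
            static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF);
    }
}
```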

Emulator

  • 16 registers (r0-r15): r0-r13 are general-purpose, r14 is the stack pointer, and r15 is the program counter
  • 3 control and status registers (CSR)
  • Memory-mapped architecture
  • Interrupt support (software, timer, terminal)
  • Stack operations (push/pop)
  • Rich instruction set including:
    • Arithmetic operations (add, sub, mul, div)
    • Logical operations (and, or, xor, not)
    • Shift operations (shl, shr)
    • Control flow (call, ret, jmp, branches)
    • Memory operations (ld, st)
    • System operations (halt, int, iret)

🏗️ Architecture

The simulated CPU architecture features:

  • Register Set:

    • r0-r13: General-purpose registers
    • r14 (sp): Stack pointer
    • r15 (pc): Program counter
    • %status, %handler, %cause: Control and status registers
  • Memory Model:

    • 32-bit addressing
    • Little-endian byte order
    • Unordered memory map (no fixed memory regions)
  • Addressing Modes:

    • Immediate: $literal
    • Register direct: %reg
    • Register indirect: [%reg]
    • Register with offset: [%reg + offset]
    • Memory direct: symbol or literal
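
The register set and memory model above could be modeled roughly as follows. This is a sketch under assumptions, not the emulator's actual code: the `Cpu` and `Memory` types are hypothetical, memory is assumed to be a sparse byte map (one reading of "no fixed memory regions"), and unwritten bytes are assumed to read as 0.

```cpp
#include <array>
#include <cstdint>
#include <unordered_map>

// Hypothetical CPU state mirroring the register set described above.
struct Cpu {
    std::array<std::uint32_t, 16> r{};  // r0-r13 general purpose, r14 = sp, r15 = pc
    std::uint32_t status = 0, handler = 0, cause = 0;  // control and status registers
    std::uint32_t& sp() { return r[14]; }
    std::uint32_t& pc() { return r[15]; }
};

// Sparse byte-addressable memory: only touched addresses are stored.
// Unwritten bytes read as 0 (an assumption of this sketch).
class Memory {
public:
    // Store a 32-bit value in little-endian order.
    void write32(std::uint32_t address, std::uint32_t value) {
        for (std::uint32_t i = 0; i < 4; ++i)
            bytes_[address + i] = static_cast<std::uint8_t>(value >> (8 * i));
    }
    // Reassemble a 32-bit value from four little-endian bytes.
    std::uint32_t read32(std::uint32_t address) const {
        std::uint32_t value = 0;
        for (std::uint32_t i = 0; i < 4; ++i)
            value |= static_cast<std::uint32_t>(read8(address + i)) << (8 * i);
        return value;
    }
    std::uint8_t read8(std::uint32_t address) const {
        auto it = bytes_.find(address);
        return it == bytes_.end() ? 0 : it->second;
    }
private:
    std::unordered_map<std::uint32_t, std::uint8_t> bytes_;
};
```

A map keyed by address keeps widely separated sections (for example code at 0x40000000 and 0xF0000000) cheap to hold in memory, at the cost of slower access than a flat array.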

📁 Project Structure

RiscToolchain/
├── inc/                      # Header files
│   ├── assembler.hpp        # Assembler classes and structures
│   ├── linker.hpp           # Linker classes and structures
│   └── emulator.hpp         # Emulator classes and structures
├── src/                      # Implementation files
│   ├── assembler.cpp        # Assembler implementation
│   ├── linker.cpp           # Linker implementation
│   └── emulator.cpp         # Emulator implementation
├── misc/                     # Parser and lexer definitions
│   ├── lexer.l              # Flex lexer specification
│   └── parser.y             # Bison parser specification
├── test/                     # Example assembly programs
│   ├── main.s               # Main program entry point
│   ├── math.s               # Math functions library
│   ├── handler.s            # Interrupt handler dispatcher
│   ├── isr_timer.s          # Timer interrupt service routine
│   ├── isr_terminal.s       # Terminal interrupt service routine
│   └── isr_software.s       # Software interrupt service routine
├── makefile                  # Build configuration
├── start.sh                 # Example build and run script
└── README.md                # This file

🔧 Prerequisites

  • C++ Compiler (g++ with C++11 support or later)
  • GNU Make
  • Flex (Fast Lexical Analyzer)
  • Bison (GNU Parser Generator)

🔨 Building

To build all components:

make all

To build individual components:

make assembler    # Build only the assembler
make linker       # Build only the linker
make emulator     # Build only the emulator

To clean build artifacts:

make clean

🚀 Usage

Assembler

The assembler translates assembly source files (.s) into relocatable object files (.o).

./assembler -o <output.o> <input.s>

Example:

./assembler -o main.o test/main.s
./assembler -o math.o test/math.s

Output format: Object files contain sections, symbol tables, and relocation information.

Linker

The linker combines multiple object files into a single executable hex file.

./linker -hex -place=<section>@<address> -o <output.hex> <input1.o> <input2.o> ...

Options:

  • -hex: Generate hex output format (required)
  • -place=<section>@<address>: Place section at specific memory address
  • -o <output.hex>: Specify output file name

Example:

./linker -hex \
  -place=my_code@0x40000000 \
  -place=math@0xF0000000 \
  -o program.hex \
  handler.o math.o main.o isr_terminal.o isr_timer.o isr_software.o
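
For readers curious how an argument of the form -place=<section>@<address> might be split apart, here is a small sketch. The `Placement` struct and `parsePlaceOption` function are hypothetical names introduced for illustration; the linker's real argument handling may look different.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Result of parsing one -place=<section>@<address> argument.
struct Placement {
    std::string section;
    std::uint32_t address;
};

// Parse e.g. "-place=my_code@0x40000000". Malformed input raises
// std::invalid_argument (std::stoul may also raise std::out_of_range).
Placement parsePlaceOption(const std::string& arg) {
    const std::string prefix = "-place=";
    if (arg.compare(0, prefix.size(), prefix) != 0)
        throw std::invalid_argument("not a -place option: " + arg);
    const std::size_t at = arg.find('@', prefix.size());
    if (at == std::string::npos || at == prefix.size() || at + 1 == arg.size())
        throw std::invalid_argument("expected <section>@<address>: " + arg);
    const std::string addressText = arg.substr(at + 1);
    std::size_t consumed = 0;
    // Base 0 lets std::stoul accept 0x-prefixed hexadecimal as well as decimal.
    const unsigned long value = std::stoul(addressText, &consumed, 0);
    if (consumed != addressText.size() || value > 0xFFFFFFFFul)
        throw std::invalid_argument("bad address: " + arg);
    return Placement{arg.substr(prefix.size(), at - prefix.size()),
                     static_cast<std::uint32_t>(value)};
}
```

Calling `parsePlaceOption("-place=math@0xF0000000")` yields the section name `math` and the address `0xF0000000`.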

Emulator

The emulator loads and executes the hex file.

./emulator <program.hex>

Example:

./emulator program.hex

The emulator will execute the program until a halt instruction is encountered, then print the final processor state showing all register values.

📝 Assembly Language Syntax

Directives

.global symbol1, symbol2, ...    # Export symbols
.extern symbol1, symbol2, ...    # Import external symbols
.section section_name            # Define a section
.word value                      # Define 4-byte word (literal or symbol)
.skip n                          # Skip n bytes
.end                             # End of file

Instructions

# Arithmetic
add %rs, %rd        # rd = rd + rs
sub %rs, %rd        # rd = rd - rs
mul %rs, %rd        # rd = rd * rs
div %rs, %rd        # rd = rd / rs

# Logical
not %rd             # rd = ~rd
and %rs, %rd        # rd = rd & rs
or %rs, %rd         # rd = rd | rs
xor %rs, %rd        # rd = rd ^ rs

# Shift
shl %rs, %rd        # rd = rd << rs
shr %rs, %rd        # rd = rd >> rs

# Control flow
call operand        # Call subroutine
ret                 # Return from subroutine
jmp operand         # Unconditional jump
beq %r1, %r2, dst   # Branch if equal
bne %r1, %r2, dst   # Branch if not equal
bgt %r1, %r2, dst   # Branch if greater than

# Memory
ld operand, %rd     # Load to register
st %rs, operand     # Store from register
push %rs            # Push register to stack
pop %rd             # Pop from stack to register

# System
halt                # Stop execution
int                 # Software interrupt
iret                # Return from interrupt

# CSR operations
csrrd %csr, %rd     # Read CSR to register
csrwr %rs, %csr     # Write register to CSR

# Other
xchg %r1, %r2       # Exchange register contents
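
Because the destination register is the second operand, the operand order is easy to get backwards for sub, div, and the shifts. The sketch below spells out the semantics from the comments above as a C++ function. The name `executeAlu` is hypothetical, and unsigned 32-bit arithmetic, the division-by-zero error, and the masking of shift counts are assumptions rather than documented behavior of the emulator.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Apply a two-operand instruction "op %rs, %rd" and return the new value of rd.
// The destination is the second operand and also acts as the left-hand input.
std::uint32_t executeAlu(const std::string& op, std::uint32_t rs, std::uint32_t rd) {
    if (op == "add") return rd + rs;
    if (op == "sub") return rd - rs;
    if (op == "mul") return rd * rs;
    if (op == "div") {
        // How the real CPU reacts to a zero divisor is not documented; this is an assumption.
        if (rs == 0) throw std::domain_error("division by zero");
        return rd / rs;
    }
    if (op == "and") return rd & rs;
    if (op == "or")  return rd | rs;
    if (op == "xor") return rd ^ rs;
    // Shift counts are masked to 0-31 because shifting a 32-bit value by 32
    // or more is undefined behavior in C++.
    if (op == "shl") return rd << (rs & 31);
    if (op == "shr") return rd >> (rs & 31);
    throw std::invalid_argument("unknown operation: " + op);
}
```

For example, `sub %r1, %r2` with r1 = 3 and r2 = 10 corresponds to `executeAlu("sub", 3, 10)`, which leaves 7 in r2.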

Labels and Comments

label_name:         # Define a label
# This is a comment

💡 Example

A complete example is provided in the test/ directory and can be run using:

./start.sh

This script:

  1. Assembles all test files
  2. Links them with specific memory layout
  3. Runs the emulator on the resulting program

The example demonstrates:

  • Function calls across different sections
  • Interrupt handling
  • Stack operations
  • Arithmetic operations
