Fateful is a CLI tool for working with my homebrew CPU, F8ful.
It contains an emulator and an assembler, as well as a full test suite.
Fateful can be installed via cargo, but it currently does not compile on Rust 1.80.0 or later, because that release broke the time crate used by shadow-rs:
cargo install --git https://github.com/commonkestrel/fateful

There is also a Logisim Evolution circuit simulation,
which is located in the logisim directory.
Running a program has two steps: assembly and emulation.
To assemble a program, run the assemble or asm command:
fateful assemble <program>.asm -o <program>.bin

If this is successful, you can emulate the program with the emulate or emu command:
fateful emulate <program>.bin

The emulator is a REPL that contains various commands explained below.
The most important command for emulating a program is RUN.
Inputting RUN 0 will run the assembly program as fast as possible until a halt is detected.
Here's a demo of the screen.asm example:
The assembler can be used with the fateful asm or fateful assemble command to assemble Fateful assembly into F8ful machine code.
The input and output are both optional, and default to stdin and stdout respectively.
Input is positional, being the first argument, and the output can be specified with the -o or --output flag.
Fateful assembly contains just 16 instructions,
including arithmetic, memory operations, a hardware stack,
and a jnz (jump if not zero) instruction, making the CPU Turing complete.
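As a quick reference, the sixteen opcodes from the instruction reference in this section can be collected into a lookup table. This is only a sketch: the names `opcode`, `mnemonic`, and `opcode_of` are made up for illustration and are not taken from the Fateful source, and nothing here says where the opcode sits inside an encoded instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The sixteen opcodes, numbered as in the instruction reference. */
enum opcode {
    OP_ADD = 0x0, OP_SUB, OP_ADC, OP_SBB, OP_NAND, OP_OR, OP_CMP, OP_MV,
    OP_LD, OP_ST, OP_LDA, OP_LPM, OP_PUSH, OP_POP, OP_JNZ, OP_HALT
};

static const char *const MNEMONICS[16] = {
    "add", "sub", "adc", "sbb", "nand", "or", "cmp", "mv",
    "ld", "st", "lda", "lpm", "push", "pop", "jnz", "halt"
};

/* Returns the mnemonic for a 4-bit opcode, or NULL if it is out of range. */
const char *mnemonic(uint8_t opcode) {
    return opcode < 16 ? MNEMONICS[opcode] : NULL;
}

/* Returns the opcode for a mnemonic, or -1 if it is not a native instruction. */
int opcode_of(const char *name) {
    assert(name != NULL);
    for (int i = 0; i < 16; i++) {
        if (strcmp(MNEMONICS[i], name) == 0) return i;
    }
    return -1;
}
```

Note that `opcode_of("jmp")` returns -1 in this model, since jmp is a built-in macro rather than one of the sixteen native instructions.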
Machine Code: 0x0
Operation: Adds the first and second operand, storing the result in the first operand.
Syntax: add <register>, <register/imm8>

Machine Code: 0x1
Operation: Subtracts the second operand from the first, storing the result in the first operand.
Syntax: sub <register>, <register/imm8>

Machine Code: 0x2
Operation: Adds the first and second operands, plus an additional 1 if the carry bit is set, storing the result in the first operand.
Syntax: adc <register>, <register/imm8>

Machine Code: 0x3
Operation: Subtracts the second operand from the first, subtracting an additional 1 if the carry bit is set, storing the result in the first operand.
Syntax: sbb <register>, <register/imm8>

Machine Code: 0x4
Operation: Performs a bitwise NAND on the first and second operands, storing the result in the first operand.
Syntax: nand <register>, <register/imm8>

Machine Code: 0x5
Operation: Performs a bitwise OR on the first and second operands, storing the result in the first operand.
Syntax: or <register>, <register/imm8>

Machine Code: 0x6
Operation: Compares the first and second operands, storing the comparison results in the status register.
Syntax: cmp <register>, <register/imm8>

Machine Code: 0x7
Operation: Copies the second operand into the first operand.
Syntax: mv <register>, <register/imm8>

Machine Code: 0x8
Operation: Loads the byte at either the RAM address provided, or the RAM address stored in the HL registers if none is provided, into the first operand.
Syntax: ld <register>, [address]

Machine Code: 0x9
Operation: Stores the first operand into RAM at either the address provided, or the address stored in the HL registers if none is provided.
Syntax: st [address,] <register>

Machine Code: 0xA
Operation: Loads the provided 16-bit address into the HL registers.
Syntax: lda <address>

Machine Code: 0xB
Operation: Loads the byte at either the ROM address provided, or the ROM address stored in the HL registers if none is provided, into the first operand.
Syntax: lpm <register>, [address]

Machine Code: 0xC
Operation: Stores the first operand to the RAM location currently pointed to by the stack pointer, then decrements the stack pointer.
Syntax: push <register/imm8>

Machine Code: 0xD
Operation: Increments the stack pointer, then loads the RAM location currently pointed to by the stack pointer into the first operand.
Syntax: pop <register>

Machine Code: 0xE
Operation: Jumps to the address pointed to by the HL registers only if the first operand is not zero.
Syntax: jnz <register/imm8>

Machine Code: 0xF
Operation: Sets the H bit in the status register, halting the CPU.
Syntax: halt

Labels help make blocks of your program easily accessible.
They consist of an identifier followed by a colon (:).
Labels with an identifier beginning with . are considered local to the most recent global label.
Local labels can also be accessed globally via <parent>.<local>.
Example:
parent:
; assembly code
.local1:
jmp [.local1]
.local2:
jmp [parent.local2]

Both integer and string literals are valid in Fateful assembly.
Integer literals can be written in several ways.
An immediate integer can be decimal (100),
hexadecimal (0x64), octal (0o144), or binary (0b01100100).
Integers can also be defined with a character, surrounded by single-quotes (').
Characters support the same escape sequences as strings.
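The literal forms above all denote the same value. The following sketch shows one way such literals could be recognized. The function `parse_literal` is hypothetical and is not the assembler's actual implementation; it handles only unescaped character literals and relies on the C library's `strtol` for the digits.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Parses an integer or character literal.
 * Returns 0 on success and -1 on malformed input. */
int parse_literal(const char *text, long *out) {
    assert(text != NULL && out != NULL);
    size_t len = strlen(text);

    /* Unescaped character literal such as 'd'. */
    if (len == 3 && text[0] == '\'' && text[2] == '\'' && text[1] != '\\') {
        *out = (unsigned char)text[1];
        return 0;
    }

    /* Pick the base from the 0x / 0o / 0b prefix; default to decimal. */
    int base = 10;
    const char *digits = text;
    if (text[0] == '0' && text[1] != '\0') {
        switch (text[1]) {
        case 'x': base = 16; digits = text + 2; break;
        case 'o': base = 8;  digits = text + 2; break;
        case 'b': base = 2;  digits = text + 2; break;
        default: break;
        }
    }
    if (*digits == '\0') return -1;

    char *end;
    *out = strtol(digits, &end, base);
    return *end == '\0' ? 0 : -1;
}
```

With this model, "100", "0x64", "0o144", "0b01100100", and "'d'" all parse to the value 100.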
Expressions can be used to make it more clear where a value comes from.
Expressions must be able to be evaluated at compile-time, and are surrounded in parentheses.
Expressions support adding, subtracting, multiplying, dividing, modulus, common bit-wise operators,
and boolean expressions (with true as 1 and false as 0).
There is also a special symbol that can be used in place of an integer: $.
The $ symbol represents the value of the program counter at the start of the current instruction.
This can be very useful for calculating relative jumps, and can be used anywhere an integer literal can.
Strings can only be used with the @str directive, and are surrounded in double-quotes (").
The compiler will automatically append a null-byte to every string literal.
Strings support a variety of escape sequences:
| Escape Sequence | Result |
|---|---|
| `\n` | Line Feed (◙ in code-page 737) |
| `\\` | `\` |
| `\"` | `"` |
| `\'` | `'` |
| `\0` | Null character |
| `\v` | Vertical Tab (♂ in code-page 737) |
| `\t` | Horizontal Tab (○ in code-page 737) |
| `\r` | Carriage return (♪ in code-page 737) |
| `\a` | Bell (• in code-page 737) |
| `\b` | Backspace (◘ in code-page 737) |
| `\f` | Form Feed (♀ in code-page 737) |
| `\xFF` | 8-bit character code (exactly 2 digits) |
| `\o777` | 8-bit character code (exactly 3 digits) |

There are a variety of C-style preprocessor directives included in the assembler, indicated with a preceding @.
These directives can apply conditional transformations to the source before compilation.
Macros are processed in top-down order, meaning if a @define is placed below an @ifdef in the file, the define will not be in scope during the check.
The define macro links an identifier to a group of tokens. Before compiling, each instance of this identifier is replaced with the specified tokens.
Unlike C, this does not support function-style definitions, meaning no arguments are allowed.
Syntax:
@define <identifier> <value>

The @undef macro removes (undefines) the current definition of the given identifier.
Consequently, subsequent occurrences of the identifier are ignored by the preprocessor.
Syntax:
@undef <identifier>

The @error directive emits a user-specified error message before terminating the assembly.
Syntax:
@error "error message"

The @if directive controls compilation of portions of a source file.
If the expression you write after the @if is greater than 0, the block following the @if is retained for assembly.
Syntax:
@if <expr>
...
@endif
The @elif directive is only allowed as part of an @if block,
and is only evaluated if the previously evaluated blocks' check evaluates to 0.
Similar to the @if directive, if the expression you write after the @elif is greater than 0, the block following the @elif is retained for assembly.
Syntax:
@if <expr>
...
@elif <expr>
...
@endif
The @else directive is only allowed at the end of an @if block.
If the expression of the previously evaluated block's check evaluates to 0,
then the block following the @else is retained for assembly.
Syntax:
@if <expr>
...
@else
...
@endif
The @ifdef directive is functionally the same as @if 1 if the identifier has been defined,
and @if 0 when the identifier hasn't been defined, or has been undefined by the @undef directive.
Syntax:
@ifdef <identifier>
...
@endif
The @ifndef directive is functionally the same as @if 0 if the identifier has been defined,
and @if 1 when the identifier hasn't been defined, or has been undefined by the @undef directive.
Syntax:
@ifndef <identifier>
...
@endif
The include macro pastes a stream of tokens from another file. The file must be located in a package, and can be indexed by filepath relative to the root of the package.
A package is linked to an identifier through a rich comment, and can be either a local directory or a remote git repository.
Syntax:
/// <package> = <path/git repository>
@include <<package>/<file path>>

Example:
/// error = https://github.com/commonkestrel/f8ful_os
@include <error/error.asm>

The assembly is divided into segments, specified with the @cseg and @dseg directives,
and organized by the @org directive.
The @org directive will apply to the current segment, and can only be specified once per segment.
Segments can be used to organize blocks of data and code throughout the address space.
Code segments, signified by the @cseg directive,
are where all of your assembly instructions are located.
Each assembly program starts in an initial code segment.
Data segments, signified by the @dseg directive,
are blocks in RAM rather than ROM.
These segments allow you to reserve blocks for global variables.
The variables defined in a data segment will reserve a RAM address while automatically avoiding collision.
Variables can be defined with a few directives. These directives allow reserving blocks of variable size, specified here:
| Directive | Size |
|---|---|
| `@byte <identifier>` | 1 |
| `@double <identifier>` | 2 |
| `@quad <identifier>` | 4 |
| `@var <size> <identifier>` | size |

These variables will resolve to an address at assembly, and can be accessed via $identifier.
Segments are automatically arranged to avoid collision,
but they can be manually organized with the @org directive.
This directive will place the origin of a segment at the address specified.
This is especially useful to make sure that the origin of your code is placed at 0x0000,
since the program starts running from this point.
Unfortunately, manual organization can result in collisions, so you must be careful to avoid these if manually organizing two or more segments of the same type.
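When placing segments by hand, the collision condition is a plain interval-overlap test. The sketch below is illustrative only: the `segment` struct and `segments_collide` function are hypothetical and do not reflect how the assembler tracks segments internally. Sizes are assumed to be non-zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A segment occupying the addresses [origin, origin + size). */
struct segment {
    uint16_t origin;
    uint16_t size;
};

/* Two segments collide when each one starts before the other ends. */
bool segments_collide(struct segment a, struct segment b) {
    assert(a.size > 0 && b.size > 0);
    /* Widen to 32 bits so origin + size cannot wrap around. */
    uint32_t a_end = (uint32_t)a.origin + a.size;
    uint32_t b_end = (uint32_t)b.origin + b.size;
    return a.origin < b_end && b.origin < a_end;
}
```

For example, a 0x100-byte segment at 0x0000 collides with a segment placed at 0x0080, but not with one placed at 0x0100.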
You can place raw data within the program as well.
This data can be accessed with lpm.
Similar to variables in the data segment, these are placed with the following directives:
- @byte <imm8>
- @double <imm16>
- @quad <imm32>
- @str <string>

This data is often used in conjunction with a label in order to make it easily locatable.
Example:
hello:
@str "hello world"

Macros are an incredibly powerful part of this assembler,
and are defined with the @macro directive.
They are similar to C's function-style #define macros,
but with optionally typed parameters and multiple definitions for different parameters.
Each parameter can have an accepted type, or multiple accepted types with the | operator.
Parameter identifiers must begin with a %.
These types are listed here:
- reg: Matches a register input (A, B, C, D, E, F, H, L)
- imm: Matches an immediate integer
- addr: Matches a RAM address
- label: Matches a ROM address
- ident: Matches any identifier
- str: Matches a string literal
- any: Matches any of the previous

Syntax:
; This syntax defines a single signature
@macro <identifier> (<parameters>) {
; assembly code
}
; This syntax allows for the definition of multiple signatures
@macro <identifier> {
(<parameters>) {
}
(<parameters>) {
}
}

This may be a little confusing, so we can use an example from the built-in macros (we'll get to these later):
@macro jmp {
() {
jnz 1
}
(%location:label) {
lda %location
jmp
}
}

What are we even looking at here?
Well, this macro contains two signatures:
one with an empty parameter list and one with a ROM address bound to the %location parameter.
As you can see, the second signature contains another jmp instruction,
showcasing the fact that these macros are evaluated recursively.
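Choosing between signatures can be pictured as matching the argument count and then checking each argument against the types its parameter accepts. The following is a rough model, not the assembler's real algorithm: the bitmask encoding, the `signature` struct, and the "first match wins" rule in `select_signature` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One bit per parameter type, so `reg|imm` becomes TY_REG | TY_IMM. */
enum arg_type {
    TY_REG = 1 << 0, TY_IMM = 1 << 1, TY_ADDR = 1 << 2,
    TY_LABEL = 1 << 3, TY_IDENT = 1 << 4, TY_STR = 1 << 5,
    TY_ANY = (1 << 6) - 1
};

#define MAX_PARAMS 6

struct signature {
    size_t arity;                  /* number of parameters */
    unsigned accepted[MAX_PARAMS]; /* bitmask of accepted types per parameter */
};

/* Returns the index of the first signature that matches the arguments,
 * or -1 when no signature fits. */
int select_signature(const struct signature *sigs, size_t sig_count,
                     const enum arg_type *args, size_t arg_count) {
    assert(arg_count <= MAX_PARAMS);
    for (size_t s = 0; s < sig_count; s++) {
        if (sigs[s].arity != arg_count) continue;
        size_t i = 0;
        while (i < arg_count && (sigs[s].accepted[i] & (unsigned)args[i])) i++;
        if (i == arg_count) return (int)s;
    }
    return -1;
}
```

Under this model, the jmp macro has one signature with arity 0 and one with a single `label` parameter, so a call with no arguments selects the first and a call with a label selects the second.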
Macros are used just like normal instructions.
For example, the jmp macro can be used like jmp [foo],
which expands to this:
lda [foo]
jnz 1

Built-in macros are a group of macros included by default in every program. The details of each macro can be found in [src/assembler/macros.asm](./src/assembler/macros.asm).
- PUSH
- POP
- PUSHA
- POPA
- JMP
- JNZ
- JLT
- JLE
- JGT
- JGE
- JEQ
- JZ
- CALL
- RET
- MV16
- ADD16
- SUB16
- INC
- DEC
- NOT
- AND
- XOR
- SHL
- NOP
- USE
push r0: reg|imm, r1: reg|imm
Pushes two values to the stack in ascending parameter order.

push r0: reg|imm, r1: reg|imm, r2: reg|imm
Pushes three values to the stack in ascending parameter order.

push r0: reg|imm, r1: reg|imm, r2: reg|imm, r3: reg|imm
Pushes four values to the stack in ascending parameter order.

push r0: reg|imm, r1: reg|imm, r2: reg|imm, r3: reg|imm, r4: reg|imm
Pushes five values to the stack in ascending parameter order.

push r0: reg|imm, r1: reg|imm, r2: reg|imm, r3: reg|imm, r4: reg|imm, r5: reg|imm
Pushes six values to the stack in ascending parameter order.

pop r0: reg|imm, r1: reg|imm
Pops two values from the stack in ascending parameter order.

pop r0: reg|imm, r1: reg|imm, r2: reg|imm
Pops three values from the stack in ascending parameter order.

pop r0: reg|imm, r1: reg|imm, r2: reg|imm, r3: reg|imm
Pops four values from the stack in ascending parameter order.

pop r0: reg|imm, r1: reg|imm, r2: reg|imm, r3: reg|imm, r4: reg|imm
Pops five values from the stack in ascending parameter order.

pop r0: reg|imm, r1: reg|imm, r2: reg|imm, r3: reg|imm, r4: reg|imm, r5: reg|imm
Pops six values from the stack in ascending parameter order.

pusha
Pushes all six general-purpose registers to the stack in ascending order.
Designed to be paired with the popa macro.

popa
Pops the top 6 values on the stack into the six general-purpose registers in descending order.
Designed to be paired with the pusha macro.
jmp
Jumps to the location pointed to by the HL register unconditionally.

jmp location: label
Jumps to location unconditionally.

jnz condition: reg|imm, location: label
Jumps to location if condition is not zero.

jlt
Jumps to the address pointed to by the HL registers if the L flag in the status register is set.

jlt location: label
Jumps to location if the L flag in the status register is set.

jlt x: reg, y: reg|imm
Jumps to the location pointed to by the HL registers if x < y.

jlt x: reg, y: reg|imm, location: label
Jumps to location if x < y.

jle
Jumps to the address pointed to by the HL registers if the L or E flags in the status register are set.

jle location: label
Jumps to location if the L or E flags in the status register are set.

jle x: reg, y: reg|imm
Jumps to the location pointed to by the HL registers if x <= y.

jle x: reg, y: reg|imm, location: label
Jumps to location if x <= y.

jgt
Jumps to the address pointed to by the HL registers if the G flag in the status register is set.

jgt location: label
Jumps to location if the G flag in the status register is set.

jgt x: reg, y: reg|imm
Jumps to the location pointed to by the HL registers if x > y.

jgt x: reg, y: reg|imm, location: label
Jumps to location if x > y.

jge
Jumps to the address pointed to by the HL registers if the G or E flags in the status register are set.

jge location: label
Jumps to location if the G or E flags in the status register are set.

jge x: reg, y: reg|imm
Jumps to the location pointed to by the HL registers if x >= y.

jge x: reg, y: reg|imm, location: label
Jumps to location if x >= y.

jeq
Jumps to the address pointed to by the HL registers if the E flag in the status register is set.

jeq location: label
Jumps to location if the E flag in the status register is set.

jeq x: reg, y: reg|imm
Jumps to the location pointed to by the HL registers if x == y.

jeq x: reg, y: reg|imm, location: label
Jumps to location if x == y.

jz condition: reg|imm, location: label
Jumps to location if condition is 0.

call
Pushes the return address to the stack and jumps to the address pointed to by the HL registers.
Designed to be paired with the ret macro.

call location: label
Pushes the return address to the stack and jumps to location.
Designed to be paired with the ret macro.

ret
Jumps to the address stored at the top of the stack.

mv16 high: reg, low: reg, imm: imm
Moves a 16-bit immediate integer into the provided registers.

add16 h0: reg, l0: reg, h1: reg|imm, l1: reg|imm
Adds two 16-bit integers. h0 and l0 make up the high and low bytes of the first operand, with h1 and l1 making up the high and low bytes of the second operand.

sub16 h0: reg, l0: reg, h1: reg|imm, l1: reg|imm
Subtracts two 16-bit integers. h0 and l0 make up the high and low bytes of the first operand, with h1 and l1 making up the high and low bytes of the second operand.

inc reg: reg
Adds 1 to the value contained in reg, storing the result back in reg.

inc high: reg, low: reg
Adds 1 to the 16-bit value contained in high and low, storing the result back in high and low.

dec reg: reg
Subtracts 1 from the value contained in reg, storing the result back in reg.

dec high: reg, low: reg
Subtracts 1 from the 16-bit value contained in high and low, storing the result back in high and low.

not reg: reg
Performs a bitwise NOT operation on reg, storing the result back in reg.

and x: reg, y: reg|imm
Performs a bitwise AND operation on x and y, storing the result in x.

xor x: reg, y: reg|imm
Performs a bitwise XOR operation on x and y, storing the result in x.

shl reg: reg
Performs a logical shift left on reg, storing the result back in reg.

nop
Performs an operation that has no effect, taking 4 clock cycles (the same as ADD).

use label: label|ident
Eliminates the warning: unused label definition message for label.

The f8ful emulator simulates each individual clock cycle, using the CPU's microcode to determine what to do on each pulse. An upside to this is that you can dump the CPU at any time and see the microcode and program counter for each clock pulse. A downside, however, is that the emulator is much slower than it could be, since it has to check every microcode flag for every clock pulse.
The emulator contains a REPL with a few useful commands:
Syntax: SET <register>, <value>
Sets the value of register to value.
Syntax: GET <register>
Prints the data stored in register.
Syntax: PEEK <address>
Prints the data stored in memory at address.
Syntax: POKE <address>, <value>
Sets the memory at address to value.
Syntax: RUN <speed>
Runs the emulator clock at speed in Hz. If speed is zero, the emulator clock will run uncapped.
Syntax: LOAD <path>, <address>
Attaches the peripheral located at path to address.
Syntax: DROP <address>
Drops the peripheral attached to address if there is one.
Syntax: DUMP
Prints the current machine state, including the program counter, stack pointer, status register, ALU registers, and general-purpose registers.
Syntax: STEP
Steps the emulator clock by one pulse. Only works if the emulator clock is stopped.
Syntax: RESET
Resets the CPU state to the initial state.
Syntax: STOP
Stops the emulator clock. Only works if the emulator clock is running.
Syntax: QUIT
Drops all peripherals and quits the emulator.
Syntax: HELP
Prints a help message detailing the REPL's commands.
There are several locations in memory with mapped IO. These memory-mapped addresses allow programs to interact with hardware directly. The top 48 memory addresses are reserved for various peripherals, with two implemented in the emulator.
- 0xFFFF is where the status register (SREG) resides.
- 0xFFFE is the low byte of the stack pointer.
- 0xFFFD is the high byte of the stack pointer.

Below these reserved addresses, the address range 0xF000 through 0xFFCF is reserved for video memory.
This address range is functionally similar to VGA text mode in x86 processors,
with an 85x20 character screen.
The low (character) byte is the code point. The VGA text follows code-page 737 seen below:
The second byte is the attribute or modifier byte, describing the foreground and background colors. The lower nibble describes the foreground color, while the upper nibble describes the background color.
An example of all foreground and background colors
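To make the cell layout concrete, here is a sketch of writing one character to the screen. The nibble packing of the attribute byte comes from the description above. The row-major ordering of cells and the two-byte stride per cell are assumptions, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define VIDEO_BASE 0xF000u
#define COLUMNS    85u
#define ROWS       20u

/* Packs colors into an attribute byte: background in the upper nibble,
 * foreground in the lower nibble. */
uint8_t attribute(uint8_t foreground, uint8_t background) {
    return (uint8_t)(((background & 0x0F) << 4) | (foreground & 0x0F));
}

/* Address of the character byte of a cell; the attribute byte follows it.
 * Assumes cells are stored row by row, two bytes per cell. */
uint16_t cell_address(unsigned row, unsigned column) {
    assert(row < ROWS && column < COLUMNS);
    return (uint16_t)(VIDEO_BASE + 2u * (row * COLUMNS + column));
}

/* Writes a code point and its attribute byte into video memory. */
void put_char(uint8_t *memory, unsigned row, unsigned column,
              uint8_t code_point, uint8_t attr) {
    uint16_t addr = cell_address(row, column);
    memory[addr] = code_point;
    memory[addr + 1] = attr;
}
```

Under these assumptions the last cell of the screen starts at 0xFD46, which stays inside the 0xF000 through 0xFFCF range.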
Fateful has a built-in test suite that can make it easy to make sure
your program actually does what it is supposed to.
You can run tests on assembly programs with the test command:
fateful test <program>.asm

The test command will check the contents of registers after halting if specified in the program.
These checks are specified in rich comments (///) similar to libraries.
For example, these are the checks included in the fib.asm example:
/// a: 0x0D
/// b: 0x15
/// c: 0x00
/// d: 0x15
These are only read by the test suite, and will be checked after the emulator halts.
In this example, the test-runner will assert that the content of the A register is 0x0D,
the B register is 0x15, the C register is 0x00, and the D register is 0x15.
If these assertions fail, the test is marked as failing.
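The check comments can be thought of as small register/value pairs. Below is a sketch of parsing one such line and comparing it with a register's final contents. The `check` struct and both functions are hypothetical and say nothing about how the real test-runner is implemented; `sscanf` with `%i` is used here so that the 0x prefix is accepted.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>

struct check {
    char reg;  /* register name, lower-cased */
    int value; /* expected value once the emulator halts */
};

/* Parses a line such as "/// a: 0x0D"; returns false for any other line. */
bool parse_check(const char *line, struct check *out) {
    assert(line != NULL && out != NULL);
    char reg;
    int value;
    if (sscanf(line, "/// %c: %i", &reg, &value) != 2) return false;
    out->reg = (char)tolower((unsigned char)reg);
    out->value = value;
    return true;
}

/* Compares the parsed expectation with the register's final contents. */
bool check_passes(const struct check *check, unsigned char actual) {
    return check->value == actual;
}
```

A package declaration such as `/// error = https://github.com/commonkestrel/f8ful_os` does not match the pattern, so this model leaves it for the include machinery.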
The test command also includes a --timeout flag, which defaults to 500ms.
If the emulator does not detect a halt in this time,
the emulator will exit and the test will be marked as failing.
Peripherals are a way to extend the emulator, simulating a memory-mapped peripheral. This is done through the use of dynamic library loading, so you can create a peripheral in any language that supports the C ABI. Peripherals can be attached to one or more slots in the top 48 bytes of RAM.
Stateless peripherals are the simplest form of peripheral, with state being managed by the peripheral rather than the emulator.
Stateless peripherals are initialized by a function with the signature int init(unsigned char).
The return value of this is explained in Errors.
The input parameter provides the number of slots that the peripheral has been attached to.
Peripherals can be written to through a function with the signature void write(unsigned char, unsigned char).
This function is called whenever the CPU writes to the given address or the address is POKEd.
The first parameter is the slot index that is being written to, and the second parameter is the value being written.
Peripherals can be read from with a function with the signature unsigned char read(unsigned char).
This function is called whenever the CPU reads from the given address or the address is PEEKed at.
The input parameter is the slot index that is being read from, and the return value should be the value at the slot.
Peripherals are dropped through a function with the signature void drop().
This function is called when every address this peripheral is attached to is DROPped, or the emulator is quit.
This function must clean up any separate threads before returning or the emulator will crash.
Peripherals are reset through a function with the signature void reset().
This function is called whenever the emulator resets the CPU.
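Putting the five functions together, a minimal stateless peripheral might look like the sketch below: a scratch-memory device that stores one byte per attached slot. The function signatures are the ones listed above; the storage scheme, the `MAX_SLOTS` limit, and the choice to return 0 for unattached slots are illustrative assumptions. It would be compiled as a shared library for the emulator to LOAD.

```c
#include <assert.h>
#include <string.h>

#define MAX_SLOTS 48 /* peripherals live in the top 48 bytes of RAM */

static unsigned char slots[MAX_SLOTS]; /* one byte of storage per slot */
static unsigned char slot_count;       /* number of slots attached */

/* Called on initialization; a non-zero return reports an error. */
int init(unsigned char n) {
    if (n == 0 || n > MAX_SLOTS) return 1;
    slot_count = n;
    memset(slots, 0, sizeof slots);
    return 0;
}

/* Called when the CPU writes to a slot or the address is POKEd. */
void write(unsigned char slot, unsigned char value) {
    if (slot < slot_count) slots[slot] = value;
}

/* Called when the CPU reads from a slot or the address is PEEKed. */
unsigned char read(unsigned char slot) {
    return slot < slot_count ? slots[slot] : 0;
}

/* Called whenever the emulator resets the CPU. */
void reset(void) {
    memset(slots, 0, sizeof slots);
}

/* Called when the peripheral is detached; no threads to clean up here. */
void drop(void) {
    slot_count = 0;
}
```

Because all state lives in file-level statics, loading the same library twice would share that state, which is one motivation for the stateful variant.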
Stateful peripherals are a way to offload managing state to the emulator. They essentially allow a peripheral to hand off a heap-allocated pointer to that peripheral's state. This is especially important if you are spawning multiple threads and need to rejoin the main thread, or if you want to have multiple instances of the same peripheral.
There is a Rust crate - fateful_peripheral -
that handles all of the pointer magic behind the scenes, allowing you to stay completely in the safe world.
When a stateful peripheral is initialized through a function with the signature void *stateful_init(char),
it must return a pointer to the state's location.
Warning: This pointer must point to heap-allocated memory,
otherwise the emulator will segfault.
Other than the pointer return,
the rest of the function should be the same as its stateless counterpart.
If an error occurs during the setup, the process for reporting these is explained in Errors.
Reading, writing, dropping, and resetting stateful peripherals is functionally the same as stateless,
but each function has an extra stateful_ prepended to its identifier,
as well as a pointer parameter (void *) at the start of each function's parameters.
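A stateful version of a simple scratch-memory peripheral could be sketched as follows. Several details here are assumptions rather than documented facts: the prefixed function names are taken to follow the pattern of stateful_init (stateful_read, stateful_write, stateful_reset, stateful_drop), the `state` struct is hypothetical, and the text above does not say whether the emulator or the peripheral releases the state, so this sketch frees it in stateful_drop.

```c
#include <assert.h>
#include <stdlib.h>

struct state {
    unsigned char slot_count;
    unsigned char values[48];
};

/* Returns heap-allocated state, or NULL to signal an initialization error. */
void *stateful_init(char n) {
    if (n <= 0 || n > 48) return NULL;
    struct state *state = calloc(1, sizeof *state);
    if (state != NULL) state->slot_count = (unsigned char)n;
    return state;
}

void stateful_write(void *ptr, unsigned char slot, unsigned char value) {
    struct state *state = ptr;
    if (slot < state->slot_count) state->values[slot] = value;
}

unsigned char stateful_read(void *ptr, unsigned char slot) {
    struct state *state = ptr;
    return slot < state->slot_count ? state->values[slot] : 0;
}

void stateful_reset(void *ptr) {
    struct state *state = ptr;
    for (int i = 0; i < 48; i++) state->values[i] = 0;
}

/* Assumption: the peripheral owns the allocation and frees it here. */
void stateful_drop(void *ptr) {
    free(ptr);
}
```

Since every call receives its own state pointer, two instances of this peripheral attached at different addresses would not interfere with each other.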
Errors are only checked upon initialization, after both init and stateful_init.
If either init returns a non-zero value or stateful_init returns a null pointer,
the emulator will check for a function with the signature int last_error_length().
If this function exists, the emulator will then check for a function with the signature char *last_error().
last_error_length should return the length of the ASCII string pointed to by the result of last_error.
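One way to support this protocol is to keep the most recent error message in a static buffer. The sketch below uses the function signatures described above; the buffer size, the `set_error` helper, and the rule that this peripheral accepts exactly one slot are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static char error_message[128]; /* most recent error, ASCII */

/* Hypothetical helper: records a message for the emulator to fetch later. */
static void set_error(const char *message) {
    snprintf(error_message, sizeof error_message, "%s", message);
}

/* Fails with a non-zero value unless attached to exactly one slot. */
int init(unsigned char n) {
    if (n != 1) {
        set_error("this peripheral expects exactly one slot");
        return 1;
    }
    return 0;
}

/* Length of the string returned by last_error. */
int last_error_length(void) {
    return (int)strlen(error_message);
}

/* Pointer to the most recent error message. */
char *last_error(void) {
    return error_message;
}
```

Keeping the message in static storage means the pointer returned by last_error stays valid after init has returned.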
Peripherals can optionally have a name that will be displayed when the emulator is DUMPed.
This must be supplied by a function with the following signature: char *name().
The returned pointer must point to a null-terminated ASCII string.

