Add fuzzer harness and fix bugs it found. #363
Conversation
edubart left a comment
I have compiled and run it locally, tested it, ran the fuzz tests, and read and reasoned about most of the changes. Not much to change besides (optionally) more tests covering what was introduced, to increase my confidence in the TLB details across all state accessors (which are fragile).
I also ran some basic benchmarks here and there was no significant impact, although the benchmarking was not comprehensive.
I think at least the build issues I had with tests/Makefile should be fixed; otherwise they break my development workflow here (see comments).
    slot_paddr, SHADOW_TLB_SLOT_LOG2_SIZE,
    [this, set_index, slot_index, vaddr_page, vp_offset, pma_index]() {
        m_m.write_shadow_tlb(set_index, slot_index, vaddr_page, vp_offset, pma_index);
        m_m.write_unverified_tlb(set_index, slot_index, vaddr_page, vp_offset, pma_index);
According to the coverage report downloaded from CI, we have no coverage for this line, nor for the `do_write_tlb` functions of:
- record_step_state_access
- replay_step_state_access
- uarch_record_state_access
- uarch_replay_state_access
I would feel more confident if we had tests run by the CI that covered them (outside the fuzzer).
Let me know if this summary helps:
Coverage improvement summary
Starting point
The CI coverage job was missing the test-machine-with-log-step target, which runs
cartesi-machine-tests.lua with the run_step command. This meant log_step() /
verify_step() were never exercised during coverage, leaving the entire
replay_step_state_access path (including do_write_tlb) at 0% for the big-machine
replay.
Adding test-machine-with-log-step to the CI coverage job brought
replay-step-state-access.h from 68.8% to 87.5% and functions from 60.2% to 81.9%
overall.
Identifying the next gaps
With the happy-path replay now covered, the remaining gaps were:
| File | Before | After adding log_step |
|---|---|---|
| replay-send-cmio-state-access.h | 62.4% | 62.4% (unchanged) |
| uarch-replay-state-access.h | 84.0% | 84.0% (unchanged) |
The cmio replay verifier had zero coverage of its ~27 validation error paths. The uarch
replay had no coverage of TLB write verification. These are the paths that reject
invalid state transitions during fraud proof disputes.
What changed vs the original plan
The initial analysis proposed writing tests inspired by the old cheat/simple/
tournament code, which wrapped machines to produce dishonest state transitions. That
approach would have required building a valid machine abstraction and was more complex
than necessary.
Feedback during the session redirected the approach in several ways:
- No need for a dishonest machine wrapper. Instead of corrupting the machine and re-running, we produce a valid log and then corrupt the log itself before calling verify. This directly tests what matters: that the verifier rejects bad logs.
- Fix the JSON deserialization layer. The Lua-to-C++ serialization in `json-util.cpp` was silently validating and filtering access log data before the C++ replay code could see it. Specifically:
  - The `written` field was dropped for read accesses, making the "unexpected written data in read access" error unreachable.
  - Size checks on `read`, `written`, and `sibling_hashes` data threw JSON-layer errors before the C++ replay validation could run.
  - A bug in the error message reported `"written"` when checking `read` data size.

  The fix was to remove all validation from the JSON layer and let the C++ replay code handle it. The JSON layer should faithfully deserialize; validation is the replay code's job, since that is the code that mirrors what runs on-chain.
- The "differs only by written word" error was reachable. Initially dismissed as unreachable on the assumption that `HASH_TREE_WORD_SIZE == sizeof(uint64_t)`. But the hash tree word size is actually 32 bytes, not 8. Register writes modify an 8-byte word within a 32-byte leaf. Corrupting a different 8-byte region in the leaf while preserving the written word triggers this check.
- Truncation tests at different points. The initial test only truncated the last access. Different truncation points hit different "too few accesses" checks in `check_read`, `check_write`, and `do_write_memory_with_padding`.
- Make `posix.stdlib` optional in test utilities. The test utility required `posix.stdlib` for `realpath`, which prevented running tests on macOS without the posix Lua module. The fix uses `pcall` and falls back to relative paths.
- Use the lester test framework. The test was initially standalone with its own harness. It was converted to use the project's `lester` framework and integrated into `test-spec.lua`, so it runs as part of the existing `test-lua` CI target.
verify_step failure tests (big-machine replay)
The big-machine replay verifier (replay_step_state_access) uses a page-based binary
step log format, unlike the uarch verifier which uses per-access JSON logs. Testing it
required a completely different approach: producing a valid binary log via log_step,
then surgically corrupting it before calling verify_step.
Feedback steered the design:
- Binary log corruption, not JSON corruption. Claude initially suggested reusing the uarch test pattern of corrupting JSON access logs. But `verify_step` takes a binary step log file, not a JSON access log. The test had to understand the binary format (root hashes, mcycle, hash function type, page entries with index/data/scratch-hash, sibling hashes) and corrupt specific fields.
- Build a log from scratch for structural tests. For tests like "too few sibling hashes at leaf level" or "too many pages", corrupting a valid log was insufficient. The test needed `build_step_log` and `get_siblings_for_pages` helpers to construct logs with precise page/sibling combinations.
- Page corruption must fool the root hash check. Claude initially proposed simply removing pages from a valid log. But `verify_step` computes the root hash from the logged pages and siblings before replaying; a missing page would fail the root hash check, not the replay. To test "required page not found" during replay, the test had to recompute the root hash for the reduced page set, which required computing fresh sibling hashes from the machine's Merkle tree.
- Corrupt page data, not page presence, for replay errors. To trigger interpreter errors during replay (e.g., an exception on a corrupted instruction), the test corrupts the page data while keeping the page in the log, recomputes the page's hash to match, then recomputes the root hash with fresh siblings.
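Why corrupting a page forces recomputing the root can be seen from how a Merkle root is folded from a leaf and its sibling hashes. The sketch below is illustrative only (a toy 64-bit combiner stands in for the real 32-byte tree hash; the function names are not the emulator's): change the leaf and every hash on the path to the root changes, so the logged siblings must be paired with a freshly recomputed root.

```cpp
#include <cassert>
#include <cstdint>
#include <cstddef>
#include <vector>

using Hash = uint64_t;

// Placeholder combiner for two child hashes -- not a cryptographic hash.
Hash toy_hash_pair(Hash left, Hash right) {
    return (left * 1099511628211ULL) ^ (right + 0x9e3779b97f4a7c15ULL);
}

// Fold a leaf hash up to the root given its siblings from leaf to root.
// Bit i of page_index says whether the path node is the right child at
// level i, which decides on which side the sibling is combined.
Hash fold_root(Hash leaf, uint64_t page_index, const std::vector<Hash> &siblings) {
    Hash h = leaf;
    for (std::size_t i = 0; i < siblings.size(); ++i) {
        const bool right_child = (page_index >> i) & 1;
        h = right_child ? toy_hash_pair(siblings[i], h) : toy_hash_pair(h, siblings[i]);
    }
    return h;
}
```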
Fixing record/replay TLB validation
The lazy TLB verification (commit 3ac97eb) introduced do_init_hot_tlb_slot which
validates shadow TLB entries on first access. But record_step_state_access had a bug:
it touched the PMA page only for valid entries, meaning replay could not validate corrupt
entries (it would fail trying to read PMA data that wasn't in the log).
Feedback identified this:
- Record must touch the PMA page unconditionally. Claude initially thought the existing code was already correct. But the record side was only touching the PMA page inside the `if (vaddr_page != TLB_INVALID_PAGE)` block. Corrupt TLB entries (garbage pma_index) still need the PMA page in the log so replay can call `do_read_pma` and detect the corruption.
- Validate first, then touch the target page. The old code touched the target page before validation. The fix reorders: validate via `init_hot_tlb_slot`, then touch the target page only if validation passed.
- Clamp out-of-bounds pma_index. `pmas_get_abs_addr` could compute an address outside the PMAs region for a corrupt pma_index. The fix clamps to a sentinel entry at `PMA_MAX`, which has zeroed istart/ilength and fails `is_memory()`, causing `shadow_tlb_verify_slot` to reject the entry.
- TLB thrash test. A dedicated RISC-V test binary (`thrash-tlb.S`) exercises all TLB sets by reading/writing addresses that map to every TLB slot, forcing evictions and re-validations. Pre/post Lua scripts set up corrupt shadow TLB entries and verify the machine handles them correctly during `log_step`/`verify_step`.
Marking genuinely unreachable code
During coverage analysis of replay_step_state_access, several code paths were
identified as genuinely unreachable despite being defensive checks. Each was marked with
LCOV_EXCL_START/LCOV_EXCL_STOP and a comment explaining why.
Claude initially proposed writing tests to cover some of these. Analysis during the
session proved they were unreachable:
- Leaf-level "too few sibling hashes" (line 391). To reach this code, we would need `pages[next_page].index < page_index` at leaf level. Claude initially tried to construct an adversarial log to hit this path. After tracing the recursion in `compute_root_hash_impl`, the proof emerged: pages are sorted and consumed left-to-right, so the next unconsumed page always has an index >= the current leaf's page_index.
- `find_page(host_addr)` "required page not found" (line 324). The only caller is `do_write_tlb`, which receives `vh_offset` from the interpreter's page walk. `vh_offset` is computed from `do_get_faddr`, which already called `find_page(uint64_t)` successfully for the same page. Claude initially proposed removing the `host_addr` overloads as dead code left over from the pre-lazy-TLB bulk relocation (commit 3ac97eb). But `do_write_tlb` at line 562 does call `find_page(host_addr)` to reverse-translate `vh_offset` back to a physical address; the overloads are needed, just their error path is unreachable.
- `do_read_memory`/`do_write_memory`/`do_putchar`. These are required by the `i_state_access` CRTP interface but never called during step replay.
- Page data ordering check (line 248). Pages are stored contiguously in the parsed log, so their data addresses are always in increasing order by construction. The check guards against a hypothetical future where page data is independently allocated. This was already marked with `LCOV_EXCL`, but using `LCOV_EXCL_END` instead of `LCOV_EXCL_STOP`; gcovr requires `LCOV_EXCL_STOP`.
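For reference, the marker pattern looks like the sketch below. The function and message are made up; only the `LCOV_EXCL_START`/`LCOV_EXCL_STOP` comment pattern mirrors what the branch uses. Note that the closing marker must be `LCOV_EXCL_STOP`: gcovr does not recognize `LCOV_EXCL_END`.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Illustrative shape of a defensive check excluded from coverage.
uint64_t checked_lookup(const uint64_t *table, uint64_t index, uint64_t size) {
    // LCOV_EXCL_START
    // Unreachable in practice: all callers validate index upstream.
    if (index >= size) {
        throw std::runtime_error{"index out of bounds"};
    }
    // LCOV_EXCL_STOP
    return table[index];
}
```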
Changes made
src/json-util.cpp
- Removed `log2_size >= 64` bounds check from access log deserialization.
- Removed read/written data size validation.
- Removed sibling_hashes depth validation.
- Removed the `type == write` guard on deserializing the `written` field.
- Fixed error message that said `"written"` when checking `read` data size.
src/pmas.h
- `pmas_get_abs_addr` clamps out-of-bounds pma_index to a sentinel entry at `PMA_MAX`.
- Added `static_assert` that there is room for the sentinel entry.
src/machine-address-ranges.cpp
- Added bounds check on PMA count in `push_back`.
src/machine.cpp
- Added `static_assert` and runtime check for PMA count in `init_pmas_contents`.
src/record-step-state-access.h
- `do_init_hot_tlb_slot`: touch PMA page unconditionally (before validation), validate first via `init_hot_tlb_slot`, then touch target page only if valid.
src/replay-step-state-access.h
- Added `LCOV_EXCL_START`/`LCOV_EXCL_STOP` with explanatory comments for five unreachable code paths: page data ordering check, `find_page(host_addr)` error, leaf-level sibling check, `do_read_memory`/`do_write_memory`, and `do_putchar`.
- Fixed `LCOV_EXCL_END` -> `LCOV_EXCL_STOP` (gcovr requires `STOP`, not `END`).
- `do_write_tlb`: write zero_padding field to shadow TLB.
src/replay-send-cmio-state-access.h
- Added `LCOV_EXCL` markers for six lines that are genuinely unreachable through the current API (aligned address checks, empty-log read check, null data check, address mismatch in check_read).
- Fixed error message capitalization.
src/clua-cartesi.cpp, src/machine-c-api.cpp, src/machine-c-api.h
- Exposed `AR_SHADOW_STATE_START`, `AR_SHADOW_STATE_LENGTH`, `AR_PMAS_START`, `AR_PMAS_LENGTH` constants through Lua and C APIs for use by TLB validation tests.
tests/lua/cartesi/tests/util.lua
- Made `posix.stdlib` optional via `pcall`. Falls back to relative paths.
tests/lua/machine-bind.lua
- Updated error expectations in `verify_reset_uarch` and `verify_step_uarch` unhappy path tests to match the C++ replay errors instead of the removed JSON-layer errors.
tests/lua/spec-verify-uarch-failure.lua (new)
67 tests exercising every reachable validation error path:
- verify_step_uarch (30 tests): basic step corruptions (empty log, extra access, wrong type/address/log2_size, corrupt data/hashes/siblings, ordinal formatting for 1st-4th accesses, wrong final hash) and TLB write corruptions via the `ecall-write-tlb` test binary (wrong type/address, corrupt siblings, wrong/missing written_hash, corrupt read/written data).
- verify_send_cmio_response (37 tests): check_read errors (7 tests), do_write_memory_with_padding errors (8 tests), check_write errors (12 tests), log structure errors (5 tests including truncation at three different points), ordinal coverage, zero-length data path, and wrong final hash.
tests/lua/spec-verify-step-failure.lua (new)
24 tests exercising the binary step log verifier:
- Log parsing errors (9 tests): truncation at each field boundary (root hash before, mcycle count, root hash after, hash function type, page count, sibling count, sibling hashes), extra trailing data, and unsupported hash function type.
- Page validation errors (4 tests): out-of-order page indices, non-zero scratch hash, extra pages beyond the log's page count, and too many pages in the Merkle tree reconstruction.
- Sibling hash errors (3 tests): too few siblings at internal level, too few siblings at leaf level, and too many siblings.
- Root hash / replay errors (5 tests): initial root hash mismatch, wrong mcycle count, wrong root hash after, missing page during replay (requiring recomputation of sibling hashes for a reduced page set), and corrupt page data causing an interpreter exception during replay.
- Hash function coverage (3 tests): all tests run with both SHA-256 and Keccak-256, plus an explicit test for unsupported hash function type.
tests/lua/pre-thrash-tlb.lua, tests/lua/post-thrash-tlb.lua (new)
Setup and verification scripts for the TLB thrash test, which exercises log_step /
verify_step with corrupt shadow TLB entries across all TLB sets.
tests/machine/src/thrash-tlb.S (new)
RISC-V test binary that reads and writes addresses mapping to every TLB slot, forcing
evictions and re-validations.
tests/lua/test-spec.lua
- Added `require("spec-verify-step-failure")` and `require("spec-verify-uarch-failure")` so both tests run as part of `test-lua`.
.github/workflows/build.yml
- Added `test-machine-with-log-step` to the coverage CI job.
tests/Makefile
- Fixed `build-tests-all` to work inside the container without forcing fuzzer seed corpus generation.
Coverage results
Overall (coverage -> coverage2):
| Metric | Before | After | Delta |
|---|---|---|---|
| Lines | 18578/23339 (79.6%) | 18827/23312 (80.7%) | +249, +1.1% |
| Functions | 3827/6361 (60.2%) | 5227/6358 (82.2%) | +1400, +22.0% |
| Branches | 8985/20588 (43.6%) | 9377/20506 (45.7%) | +392, +2.1% |
Critical verification files:
| File | Before | After |
|---|---|---|
| replay-step-state-access.h | 68.8% | 100.0% |
| replay-send-cmio-state-access.h | 62.4% | 97.8% |
| uarch-replay-state-access.h | 84.0% | 99.4% |
| uarch-record-state-access.h | 85.4% | 94.2% |
| record-step-state-access.h | 53.7% | 82.9% |
| shadow-tlb.h | 40.4% | 55.3% |
| shadow-uarch-state.h | 31.3% | 41.0% |
| pmas.h | 60.3% | 64.9% |
| address-range.h | 76.9% | 77.8% |
| machine.cpp | 86.9% | 87.9% |
Remaining uncovered lines
In replay-send-cmio-state-access.h (4 uncovered, all marked LCOV_EXCL):
"address not aligned to word size"-- register addresses are always 8-byte aligned."too few accesses in log"incheck_read-- the constructor catches empty logs
first, and the only read (iflags.Y) is always first.- Address mismatch in
check_read-- only one read address is used (iflags.Y).
In uarch-replay-state-access.h (1 uncovered):
return "unknown_"inaccess_type_name-- onlyreadandwriteexist.
In replay-step-state-access.h (0 uncovered after LCOV_EXCL markers):
- All remaining uncovered paths are marked with `LCOV_EXCL` and documented with proofs of unreachability (see "Marking genuinely unreachable code" above).
Many not-taken branches in the coverage report are GCC's exception-handling machinery
(implicit branches for std::string allocation failure inside throw expressions),
not real logic branches.
    // In contrast, a STATE_ACCESS that does not have access to hot out-of-state slots cannot mark TLB slots
    // as not-yet-initialized.
    // We must verify the cold slot at every hit and treat inconsistent entries as misses
    if (!a.template verify_cold_tlb_slot<TLB_READ>(slot_index)) [[unlikely]] {
According to the coverage report downloaded from CI, we have no coverage of the case when verify_cold_tlb_slot fails (for state accessors that implement it) and execution falls inside this if for:
- fetch_translate_pc
- read_virtual_memory
- write_virtual_memory
I would feel more confident if we had tests run by the CI that covered them (outside the fuzzer).
Could be a simple test that thrashes the TLB on purpose.
The only state access that implements verify_cold_tlb_slot (other than simply returning true) is the one in the uarch bridge, used when compiling interpret into uarch.bin so it can run inside the uarch. So unless we add a way to extract coverage from interpret while it is running inside the uarch, the coverage will not show...
Guess I will try that. :)
Uarch coverage collection
Motivation
The emulator's interpret() function is compiled twice: once for the host
(with gcov instrumentation), and once as a RISC-V binary that runs inside the
microarchitecture emulator (uarch-ram.bin). The host coverage report misses
code paths that are only exercised inside the uarch -- most notably
machine-uarch-bridge-state-access.h (which is never compiled into the host)
and the failure branch of verify_cold_tlb_slot() in interpret.cpp (which
requires the bridge state access to trigger).
How it works
1. Separate coverage uarch binary
The production uarch-ram.bin is compiled with -O2 -g0 for performance.
For coverage, a separate uarch-ram-coverage.bin is built alongside it
(when coverage=yes) with -O0 -g -DCODE_COVERAGE. This gives:
- Full debug info for accurate `addr2line` PC-to-source mapping
- No inlining (`CODE_COVERAGE` disables `FORCE_INLINE`, which otherwise uses `__attribute__((always_inline))` and defeats `-fno-inline`)
- No `--gc-sections` (which can strip debug sections)
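A plausible shape for such a macro guard is sketched below (the emulator's exact definition may differ; this is a hedged illustration of the mechanism described above): under `CODE_COVERAGE` the macro degrades to plain `inline`, so `-O0 -fno-inline` actually keeps functions out-of-line and each retains distinct PCs for `addr2line`.

```cpp
#include <cassert>
#include <cstdint>

#ifdef CODE_COVERAGE
#define FORCE_INLINE inline
#else
// always_inline would defeat -fno-inline, collapsing functions into their
// callers and losing their PC ranges in the coverage binary.
#define FORCE_INLINE inline __attribute__((always_inline))
#endif

FORCE_INLINE uint64_t add_words(uint64_t a, uint64_t b) {
    return a + b;
}
```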
Both binaries are built from the same source using separate object files
(.cov_cpp.o / .cov_c.o suffixes) so they don't interfere.
The production binary is used for all normal tests. The coverage binary is
only loaded for the run_uarch_coverage tests via --uarch-ram-image.
2. PC collection during test runs
The run_uarch_coverage command in cartesi-machine-tests.lua runs tests
through the uarch interpreter one cycle at a time, reading uarch_pc before
each cycle and collecting unique PCs into a Lua table. After each test, the
PCs are written to a .pcs file (one hex address per line) in the directory
specified by --uarch-pc-output-dir.
The test-coverage-uarch-pcs Makefile target runs the csr and thrash-tlb
tests in this mode. This target is separate from test-coverage-uarch
(which runs the validation tests without PC collection) so that non-coverage
CI jobs (e.g. sanitize) don't need the coverage binary.
Tests with pre/post scripts (like the thrash-tlb corruption test) get a hash
suffix in the .pcs filename to avoid collisions with the plain version.
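The collection loop itself lives in Lua inside cartesi-machine-tests.lua; the sketch below only illustrates the `.pcs` output format it produces (unique PCs, one lowercase hex address per line), using a hypothetical C++ helper.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <set>
#include <string>

// Render a set of unique program counters in the .pcs format consumed by
// the resolver script: one hex address per line. std::set keeps them
// deduplicated and sorted.
std::string pcs_to_text(const std::set<uint64_t> &pcs) {
    std::string out;
    char buf[32];
    for (uint64_t pc : pcs) {
        std::snprintf(buf, sizeof(buf), "%llx\n", static_cast<unsigned long long>(pc));
        out += buf;
    }
    return out;
}
```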
3. Resolving PCs to source lines
The tests/scripts/uarch-pcs-to-gcov.lua script resolves the collected PCs
to source file, function name, and line number using addr2line -f against
uarch/uarch-ram-coverage.elf.
The script handles DWARF path resolution in three cases:
- Direct match: DWARF paths match the local `gcov_dir` prefix (e.g. `/usr/src/emulator/src/interpret.cpp` on CI). The prefix is stripped to get the bare filename.
- Project root match: DWARF paths are under the project root but outside `gcov_dir` (e.g. `/usr/src/emulator/uarch/machine-uarch-bridge-state-access.h`). The project root is computed from `gcov_dir` and the path is made relative (e.g. `../uarch/machine-uarch-bridge-state-access.h`).
- Basename fallback: DWARF paths don't match the local tree at all (e.g. the ELF was built inside Docker with paths like `/opt/cartesi/...`). The script extracts the basename and checks if the file exists under `uarch/` or `src/` in the local tree.
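The first two cases can be sketched as below. The real logic lives in tests/scripts/uarch-pcs-to-gcov.lua; this hedged C++ rendering uses invented names, hard-codes illustrative paths in the usage, and omits the basename fallback (which needs filesystem probing).

```cpp
#include <cassert>
#include <optional>
#include <string>

// Resolve a DWARF path against the local gcov directory: strip the
// gcov_dir prefix on a direct match, make the path relative on a
// project-root match, and report failure otherwise (the caller would then
// try the basename fallback or filter the path out).
std::optional<std::string> resolve_dwarf_path(const std::string &dwarf_path,
                                              const std::string &gcov_dir,
                                              const std::string &project_root) {
    // Case 1: direct match -- strip the gcov_dir prefix.
    if (dwarf_path.rfind(gcov_dir + "/", 0) == 0) {
        return dwarf_path.substr(gcov_dir.size() + 1);
    }
    // Case 2: under the project root but outside gcov_dir -- make relative.
    if (dwarf_path.rfind(project_root + "/", 0) == 0) {
        return "../" + dwarf_path.substr(project_root.size() + 1);
    }
    return std::nullopt; // basename fallback / filtering left to the caller
}
```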
Paths outside the project (e.g. C++ standard library headers) are filtered
out. If addr2line is not available, the script exits gracefully and the
report is generated without uarch data.
4. Running gcov with proper merging
The tests/scripts/run-gcov.lua script runs gcov (or llvm-cov gcov)
on each .gcda file individually and merges the resulting .gcov files.
This works around a bug in llvm-cov gcov: when processing multiple .gcda
files that share headers, it overwrites the .gcov file for each shared
header rather than accumulating counts. GNU gcov merges correctly but the
script works with both.
The merge adds execution counts from all versions of each source line, and
prefers ##### (uncovered but executable) over - (non-executable) for
lines that appear in only some compilation units.
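The per-line merge rule can be made concrete with a small sketch. The real logic is in tests/scripts/run-gcov.lua; this hypothetical helper only models the count column of a .gcov line, where `-` means non-executable and `#####` means executable but never hit.

```cpp
#include <cassert>
#include <string>

// Merge the count fields of one source line from two .gcov copies of a
// shared header: numeric counts are summed, and "#####" (uncovered but
// executable) wins over "-" (non-executable) for lines that are executable
// in only some compilation units.
std::string merge_gcov_count(const std::string &a, const std::string &b) {
    if (a == "-") return b;     // b is "-", "#####", or a count
    if (b == "-") return a;
    if (a == "#####") return b; // b is "#####" or a count
    if (b == "#####") return a;
    // both numeric: accumulate execution counts
    return std::to_string(std::stoll(a) + std::stoll(b));
}
```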
5. Merging uarch coverage into .gcov files
After run-gcov.lua produces the host .gcov files, uarch-pcs-to-gcov.lua
modifies them before gcovr reads them:
- Existing `.gcov` files (e.g. `interpret.cpp.gcov`): lines marked as uncovered (`#####`) that were hit by the uarch get their count replaced. Lines already marked as executed by the host get the uarch count added.
- New `.gcov` files (e.g. for `machine-uarch-bridge-state-access.h`): created from scratch with function records (required by gcovr to recognize executable lines) and line hit counts. Non-hit lines are marked as non-executable (`-`) since there is no way to determine which lines the compiler considers executable without gcov instrumentation data.
6. Generating the report
gcovr --use-gcov-files reads all .gcov files from src/ and produces
the HTML report and text summary. The --filter flags include both src/
and uarch/ directories to pick up the bridge header and other uarch-only
source files.
On systems without the RISC-V toolchain, the uarch-pcs-to-gcov.lua script
runs inside the toolchain Docker container (which has
riscv64-unknown-elf-addr2line). The gcov and gcovr steps run on the host.
7. Unified coverage toolchain
On macOS, clang coverage now uses --coverage (gcc-compatible .gcno/.gcda
format) instead of -fprofile-instr-generate -fcoverage-mapping. This means
the same gcov/gcovr pipeline works on both Linux (gcc) and macOS (clang),
and LCOV_EXCL_START/LCOV_EXCL_STOP markers are respected on both
platforms. The COVERAGE_TOOLCHAIN variable is exported from tests/Makefile
so sub-makes inherit the correct value.
Limitations
- For source files that exist only in the uarch binary (like the bridge header), all non-hit lines appear as non-executable in the report. This means the report shows which lines were executed, but cannot show which lines should have been executed but were not.
- Even with `-O0`, the coverage binary is a different compilation from the host. Template instantiations may differ, so some lines in shared headers might not be attributed identically.
Running locally
From a clean checkout:
make submodules
make -j$(nproc) coverage=yes
make -C tests build-tests-machine-with-toolchain coverage=yes
make -C tests build-tests-misc coverage=yes
make -C tests build-tests-uarch-with-toolchain coverage=yes
make -C tests build-tests-images coverage=yes
eval $(make env)
make -C tests -j1 coverage=yes \
test-save-and-load \
test-machine \
test-lua \
test-c-api \
test-coverage-machine \
test-uarch-rv64ui \
test-uarch-interpreter \
test-coverage-uarch \
test-coverage-uarch-pcs \
test-machine-with-log-step
make -C tests coverage-report coverage=yes
# Report at tests/build/coverage/gcc/index.html

To regenerate just the report (after tests have already run):
make -C tests coverage-report coverage=yes

Files
- `uarch/Makefile`: builds both `uarch-ram.bin` (production) and `uarch-ram-coverage.bin` (with `-O0 -g -DCODE_COVERAGE`) when `coverage=yes`
- `tests/lua/cartesi-machine-tests.lua`: `run_uarch_coverage` command and `--uarch-pc-output-dir`/`--uarch-ram-image` options
- `tests/scripts/run-gcov.lua`: runs gcov per `.gcda` with proper merging
- `tests/scripts/uarch-pcs-to-gcov.lua`: resolves PCs and merges into `.gcov` files
- `tests/scripts/generate-coverage-report.sh`: standalone script for running the full coverage pipeline
- `tests/Makefile`: `test-coverage-uarch` (validation tests), `test-coverage-uarch-pcs` (PC collection), `coverage-report` (report generation)
- `.github/workflows/build.yml`: CI coverage job runs both targets
Instead of using `TLB_INVALID_PAGE`, the correct invalidation is `~pc`. This ensures the XOR trick in `fetch_insn` doesn't fail when `pc` is in the last page of virtual memory.
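The sketch below illustrates why, under the stated assumptions of a 4 KiB page and an XOR-based page comparison (the constant names and shapes are illustrative, not the emulator's): if the invalid-page sentinel happened to alias the page at the very top of the address space, a `pc` in that last page would falsely hit, whereas `~pc` differs from `pc` in every bit and can never match.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t PAGE_LOG2 = 12; // assumed 4 KiB pages

// XOR-based hit test: a hit requires pc and the cached page to agree on
// all bits above the page offset.
bool fetch_cache_hit(uint64_t pc, uint64_t cached_page) {
    return ((pc ^ cached_page) >> PAGE_LOG2) == 0;
}
```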
edubart left a comment
For me it is good enough already, good work!
Summary
This branch introduces libFuzzer-based fuzz testing for the RISC-V interpreter and shadow state, implements lazy TLB verification, and fixes five bugs uncovered by fuzzing.
Commits
- **fix: ensure PC alignment invariant at startup.** Raise `MCAUSE_INSN_ADDRESS_MISALIGNED` if PC has bit 0 set when entering the interpreter, rather than relying on the fetch logic to handle it.
- **fix: enforce WARL registers when reading.** Centralize all WARL bit-masking into the `i-state-access` layer via `riscv-warl.h`, so external writes (C API, snapshots) can't store illegal bit patterns. Also corrects `xtvec` masking (`& ~1` → `& ~3`).
- **fix: properly invalidate fetch cache.** Use `~pc` as the miss sentinel instead of `TLB_INVALID_PAGE` (which could produce false hits at the top of virtual memory). Add invalidation after fetch exceptions and privilege changes.
- **fix: assert_no_break with multiple interrupts.** Only assert that delegated interrupts are zero in S/U-mode, since non-delegated M-mode interrupts can legitimately remain pending.
- **feat: regression tests for fuzzer bugs.** Lua test suite (`spec-fuzzer-bugs.lua`) covering all four bugs above.
- **feat: add fuzzer support to step verification.** New `fuzz-step` target that runs each fuzzed input through four independent execution paths (cm_run, cm_run_uarch, cycle-by-cycle uarch fraud proofs, page-based fraud proofs) and asserts all produce identical root hashes. Refactors fuzz input parsing into shared `fuzz-common.h`.
- **feat: lazy verification/heating of TLB slots.** Replace eager TLB shadow validation at machine construction with lazy per-slot validation on first access. Hot entries start as `TLB_UNVERIFIED_PAGE` and are promoted on demand. Hardens replay against attacker-crafted step logs, adds PMA bounds verification, and guards `do_read_pma` against out-of-bounds indices.
- **feat: add shadow-state fuzzer.** New `fuzz-shadow-state` target that writes the entire shadow state (registers + TLB) via `cm_write_memory` with hostile data, then runs the interpreter. Uses the `registers_state` struct directly for corpus compatibility with `fuzz-interpret`. Crafts TLB entries targeting discovered PMAs (memory-backed, device, out-of-bounds) with correct slot placement for actual TLB hits. Adds a `FUZZ_FOCUS` build variable to restrict coverage instrumentation to specific source files, and a `fuzz-coverage` Makefile target for `llvm-cov` HTML reports.
- **fix: default coverage generation to clang on macOS.** `COVERAGE_TOOLCHAIN` in `tests/Makefile` now defaults to `clang` on Darwin, so `coverage-report` uses `llvm-profdata`/`llvm-cov` instead of `gcov`/`gcovr`.
- **fix: decouple iunrep from mutable shadow state.** `poll_external_interrupts` and other runtime checks now use `machine::is_unreproducible()` (reads from immutable config) instead of the shadow register. WFI clamps `mcycle_max` to `mcycle_end` to prevent overshooting. Adds a consistency check between config and shadow `iunrep` on load, and a regression test.
- **feat: add persistent mode to fuzz targets (~44x faster).** Reuse a single machine across fuzz inputs instead of creating and destroying one per input. Each iteration zeros RAM and overwrites the full shadow state (registers + TLB) via a bulk `cm_write_memory()` call, which also reinitializes the hot TLB cache. The old per-input mode is still available via `FUZZ_NO_PERSIST=1`. Merges the shadow-state fuzzers into the interpret fuzzers (both now use shadow state bulk writes), unifies the corpus directory, generates the seed corpus as part of `build-tests-machine`, and adds comprehensive comments for newcomers.

Bugs Found and Fixed
1. Misaligned PC at startup
External APIs could set PC to an odd value before calling `run()`, violating the 2-byte alignment invariant the fetch logic depends on. Fixed by checking PC alignment at interpreter entry and raising `MCAUSE_INSN_ADDRESS_MISALIGNED` if bit 0 is set.
2. WARL registers not legalized through state access layer
WARL bit-masking was only applied inside CSR instruction handlers, so external writes (C API, snapshots) could store illegal bit patterns consumed raw by the interpreter. Fixed by centralizing all WARL legalization into the `i-state-access` layer via `riscv-warl.h`. Also corrected `xtvec` masking (`& ~1` → `& ~3`).
3. Fetch cache incorrectly invalidated
The fetch cache used `TLB_INVALID_PAGE` as its miss sentinel, but the XOR-based hit test could produce false hits when PC was in the last page of virtual memory. The cache also wasn't invalidated after fetch exceptions or privilege changes. Fixed by using `~pc` as the sentinel (guaranteed miss) and adding invalidation after exceptions and `raise_interrupt_if_any`.
4. Debug assertion too strict with multiple pending interrupts
`assert_no_brk()` required all pending interrupts to be zero after instruction execution. In S/U-mode, non-delegated M-mode interrupts can legitimately remain pending for the outer loop to handle. Fixed by only asserting that delegated interrupts are zero in S/U-mode.
5. `iunrep` read from mutable shadow state at runtime
`poll_external_interrupts` reads `iunrep` from the shadow state on every call, meaning a corrupted shadow state can flip the machine into unreproducible mode mid-execution. WFI then advances `mcycle` up to `rtc_time_to_cycle(clint_mtimecmp)`, which can exceed `mcycle_end`, causing `mcycle` to overshoot silently in release builds (or triggering a debug assertion). Fixed by: (a) adding `machine::is_unreproducible()` that reads from the immutable config instead of shadow state, and using it everywhere `iunrep` was previously read at runtime; (b) clamping WFI's `mcycle_max` to `mcycle_end` so `poll_external_interrupts` can never advance `mcycle` past the requested limit; (c) adding a consistency check in `validate_processor_shadow` to ensure the shadow `iunrep` matches the config value when loading from disk.
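Fix (b) reduces to a clamp before WFI fast-forwards the cycle counter. The sketch below is illustrative only (the name and signature are made up, not the emulator's): whatever cycle the next timer interrupt suggests, the interpreter must never advance `mcycle` past the `mcycle_end` the caller requested.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Clamp the cycle WFI may fast-forward to: the earlier of the next timer
// interrupt and the caller-requested end of the run.
uint64_t clamp_wfi_target(uint64_t next_timer_cycle, uint64_t mcycle_end) {
    return std::min(next_timer_cycle, mcycle_end);
}
```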