Update MCU SoC post-P&R test data by github-actions[bot] · Pull Request #44 · ChipFlow/Jacquard

github-actions · 2026-02-27T05:53:30Z

Automated rebuild of tests/mcu_soc/data/ using librelane.

Trigger: workflow_dispatch

The mcu-soc-metal CI job will validate simulation.

Add --timing-vcd flag that produces timing-accurate VCD output where signal transitions are offset from clock edges by their computed arrival times. The GPU kernel already computes per-gate arrival times for setup/hold checking; this feature writes them to global memory so the host can produce sub-cycle-accurate output.

Changes:
- GPU kernels (Metal/CUDA): write shared_writeout_arrival to global memory at arrival_state_offset when enabled
- FlattenedScriptV1: add timing_arrivals_enabled, arrival_state_offset fields; update effective_state_size() for 3-section layout
- vcd_io: add expand_states_for_arrivals(), split_arrival_states(), write_output_vcd_timed() with ps-to-timescale conversion
- loom CLI: wire --timing-vcd flag, SimParams.arrival_state_offset, and timed VCD writer dispatch

Co-developed-by: Claude Code v2.1.44 (claude-opus-4-6)
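To illustrate the idea of offsetting transitions from clock edges, here is a minimal Python sketch. It is not the vcd_io implementation: the function names, the round-to-nearest policy, and the 10000ps clock period are assumptions made for illustration. Only the 350ps and 1323ps arrival values come from the inv_chain figures quoted in this PR.

```python
def ps_to_ticks(time_ps: int, timescale_ps: int) -> int:
    """Convert picoseconds to VCD ticks, rounding to the nearest tick.

    Round-to-nearest is an assumption; the real writer may truncate.
    """
    if timescale_ps <= 0:
        raise ValueError("timescale_ps must be positive")
    return (time_ps + timescale_ps // 2) // timescale_ps


def timed_events(cycle: int, clock_period_ps: int, arrivals_ps: dict,
                 timescale_ps: int = 1) -> list:
    """Place each signal transition at clock edge + arrival time.

    Returns (tick, signal_name) pairs sorted by tick, which is the order
    a VCD writer would need to emit them in.
    """
    edge_ps = cycle * clock_period_ps
    events = [
        (ps_to_ticks(edge_ps + arrival, timescale_ps), name)
        for name, arrival in arrivals_ps.items()
    ]
    return sorted(events)


if __name__ == "__main__":
    # Hypothetical 10 ns clock, second cycle, 1 ps timescale.
    print(timed_events(2, 10_000, {"q": 350, "out": 1323}))
```

With a timescale coarser than 1ps, distinct arrivals can collapse onto the same tick, which is why the conversion step matters for sub-cycle accuracy.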

Add detailed section to Known Issues explaining why Loom only supports edge-triggered DFFs, why CVC's test suite can't be reused as reference tests (NAND-latch flip-flops), and what would be needed to add latch support (new DriverType, two-phase evaluation, GPU kernel changes). Co-developed-by: Claude Code v2.1.44 (claude-opus-4-6)

- Change IdCode::from(0) to IdCode(0) for vcd_ng tuple struct API
- Make write_output_vcd_timed generic over W: Write for testability
- Remove writer.flush() calls (vcd_ng::Writer has no flush method)
- Add 8 comprehensive tests for expand/split/write timing arrivals

Co-developed-by: Claude Code v2.1.44 (claude-opus-4-6)

The Metal kernel uses a double-buffered read pattern where t4_5 holds the current stage's data while the next stage's data is pre-loaded. The gate_delay extraction was incorrectly placed AFTER the t4_5 overwrite, causing it to read the next stage's padding slot instead of the current one. For single-stage designs (like inv_chain), this read garbage/zeros.

Fix: extract gate_delay from t4_5.c4 before overwriting t4_5.

Also fix arrival tracking to add gate_delay even for pass-through positions (orb == 0xFFFFFFFF) across all hierarchy levels, since pass-throughs can represent physical cells (e.g., inverter chains) with accumulated delays.

Also fix load_timing_from_sdf to iterate all cell origins per AIG pin instead of only the first, enabling correct delay accumulation for inverter chains collapsed to a single AIG wire.

Verified: inv_chain test produces correct 1323ps arrival delay matching the analytical SDF sum (CLK→Q=350ps + 16 inverters=973ps).

Co-developed-by: Claude Code v2.1.62 (claude-opus-4-6)
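The ordering bug described above can be reproduced in a toy model. This Python sketch is only an analogy for the Metal kernel's double-buffered loop: the dictionary fields, the padding value of 0, and the function name are hypothetical, and the delay values are arbitrary.

```python
# Toy model of a double-buffered stage loop. Field names are hypothetical;
# the real kernel is a Metal shader, not Python.
PADDING = {"payload": None, "gate_delay": 0}  # what lies past the last stage


def read_gate_delays(stages, read_before_overwrite=True):
    """Return the gate_delay observed for each stage."""
    if not stages:
        return []
    current = stages[0]  # first stage is pre-loaded before the loop
    observed = []
    for i in range(len(stages)):
        # Pre-load the next stage (or padding after the last one).
        upcoming = stages[i + 1] if i + 1 < len(stages) else PADDING
        if read_before_overwrite:
            delay = current["gate_delay"]  # fixed order: read, then overwrite
            current = upcoming
        else:
            current = upcoming  # buggy order: overwrite first...
            delay = current["gate_delay"]  # ...so this reads the next slot
        observed.append(delay)
    return observed


if __name__ == "__main__":
    single = [{"payload": "stage0", "gate_delay": 42}]
    print(read_gate_delays(single))                               # fixed order
    print(read_gate_delays(single, read_before_overwrite=False))  # buggy order
```

In the single-stage case the buggy order sees only the padding slot, which mirrors the garbage/zero reads reported for inv_chain.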

Suppress unused variable warnings (staged, num_srams, num_ios, num_dup, part_end) and remove dead assignments (offset before break, script_pi before break) that were cluttering build output. Co-developed-by: Claude Code v2.1.62 (claude-opus-4-6)

- tb_cvc.v: CVC testbench with SDF annotation for inv_chain timing validation (expected total delay: 1323ps)
- inv_chain_stimulus.vcd: Input stimulus for timing VCD tests
- compare_vcd.py: VCD comparison script for Loom vs CVC output
- watchlist.json: Signal watchlist for timing_sim_cpu tracing
- CI workflow: CVC reference simulation job for automated validation

Co-developed-by: Claude Code v2.1.62 (claude-opus-4-6)

Dockerfile builds CVC (open-src-cvc) from source on linux/amd64 with gcc/binutils for its native code compilation. run_cvc.sh builds the image, runs the inv_chain testbench with SDF back-annotation, and compares against Loom's timing output.

Results: CVC reports 1235ps total delay vs Loom's 1323ps, an 88ps (7.1%) conservative overestimate. This is expected: Loom uses max(rise, fall) per cell since the GPU kernel processes 32 packed signals and cannot track per-signal transition direction. CVC tracks actual rise/fall transitions through the inverter chain.

The 88ps decomposes as:
- 8 inverter stages × 10ps IOPATH rise/fall asymmetry = 80ps
- 8 interconnect wires × 1ps rise/fall asymmetry = 8ps

Usage: bash tests/timing_test/cvc/run_cvc.sh

Co-developed-by: Claude Code v2.1.62 (claude-opus-4-6)
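The following Python sketch shows why a max(rise, fall) model overestimates an inverter chain relative to a direction-tracking simulator. The 50ps/60ps per-cell delays are hypothetical values chosen only to give a 10ps asymmetry; they are not taken from the inv_chain SDF. The sketch also assumes the transition direction alternates at every inverter, so half of the 16 cells take the faster edge.

```python
def chain_delay_tracking(cells, first_output_rises=True):
    """Sum delays while tracking the transition direction through inverters."""
    total = 0
    rising = first_output_rises
    for rise_ps, fall_ps in cells:
        total += rise_ps if rising else fall_ps
        rising = not rising  # each inverter flips the edge direction
    return total


def chain_delay_max(cells):
    """Conservative model: always charge max(rise, fall) per cell."""
    return sum(max(rise_ps, fall_ps) for rise_ps, fall_ps in cells)


if __name__ == "__main__":
    # Hypothetical cell delays: 50 ps rise, 60 ps fall, 16 inverters.
    cells = [(50, 60)] * 16
    tracked = chain_delay_tracking(cells)
    conservative = chain_delay_max(cells)
    print(conservative - tracked)  # 8 cells x 10 ps asymmetry

    # Figures reported in this PR: Loom 1323 ps, CVC 1235 ps.
    gap_ps = 1323 - 1235
    print(gap_ps, round(100 * gap_ps / 1235, 1))
```

Under these assumptions the gap is 8 × 10ps = 80ps for the cells alone, matching the IOPATH term in the decomposition; the remaining 8ps comes from the interconnect term.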

Add detailed section to timing-simulation.md covering the three independent sources of timing overestimation:

1. max(rise, fall) per cell: GPU can't track transition direction across 32 packed signals (80ps / 6.5% for inv_chain)
2. max wire delay across multi-input pins: single wire delay per cell regardless of which input is critical (8ps for inv_chain)
3. max arrival across 32 packed signals per thread: mitigated by timing-aware bit packing (0ps for inv_chain, larger in practice)

Documents CVC reference validation: Loom 1323ps vs CVC 1235ps (88ps / 7.1% conservative overestimate) for the inv_chain design. Updates implementation phases to reflect completed GPU arrival tracking and timing-aware VCD output.

Co-developed-by: Claude Code v2.1.62 (claude-opus-4-6)

40 outputs at 5 logic depths (3, 5, 9, 13, 17) exercise Source 3 overestimation in timing-aware bit packing. CVC reference shows distinct arrival times per group (513ps to 1286ps), confirming the conservative timing model. Includes hand-crafted SDF, stimulus VCD, CVC testbench, and Docker runner script. Co-developed-by: Claude Code v2.1.44 (claude-opus-4-6)
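A rough Python sketch of the Source 3 effect follows. It assumes, as a simplification, that each pack of signals is reported at the worst arrival in the pack and that timing-aware packing amounts to sorting signals by arrival before grouping; Loom's actual packing heuristic may differ. The arrival values (100ps to 500ps) and the pack size of 8 used in the demo are hypothetical, chosen so that 40 signals in 5 groups of 8 divide evenly.

```python
def pack_overestimate(arrivals_ps, pack_size=32, sort_first=False):
    """Total overestimate when every signal in a pack gets the pack's max.

    sort_first=True models a timing-aware packing that groups signals
    with similar arrival times (an assumption about the heuristic).
    """
    order = sorted(arrivals_ps) if sort_first else list(arrivals_ps)
    total = 0
    for start in range(0, len(order), pack_size):
        pack = order[start:start + pack_size]
        worst = max(pack)
        total += sum(worst - arrival for arrival in pack)
    return total


if __name__ == "__main__":
    # 40 hypothetical outputs: 5 distinct arrival times, interleaved.
    arrivals = [100, 200, 300, 400, 500] * 8
    print(pack_overestimate(arrivals, pack_size=8))                   # naive order
    print(pack_overestimate(arrivals, pack_size=8, sort_first=True))  # grouped
```

When signals of different depths share a pack, the shallow ones inherit the deepest arrival; grouping by arrival removes that penalty in this toy case.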

The previous fallback logic used `find | sort -r | head -1` which grabbed a pre-PnR SDF (step 08) alphabetically instead of the post-PnR SDF from STAPostPNR (step 51) that includes interconnect delays. Now explicitly searches for stapostpnr nom_tt SDF first. Co-developed-by: Claude Code v2.1.44 (claude-opus-4-6)
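The selection change can be sketched in Python as below. The directory names are hypothetical and do not reproduce librelane's real run layout; the point is only that a reverse alphabetical pick is unrelated to flow order, whereas an explicit match on "stapostpnr" and "nom_tt" is not.

```python
def pick_sdf(paths):
    """Prefer a post-PnR nom_tt SDF; fall back to the last path alphabetically."""
    preferred = [
        p for p in paths
        if "stapostpnr" in p.lower() and "nom_tt" in p.lower()
    ]
    if preferred:
        return sorted(preferred)[-1]
    return sorted(paths)[-1] if paths else None


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    paths = [
        "runs/synth_08/top.sdf",              # pre-PnR, no interconnect delays
        "runs/51-stapostpnr/nom_tt/top.sdf",  # post-PnR
    ]
    # Equivalent of `find | sort -r | head -1`: picks by name, not by step.
    print(sorted(paths, reverse=True)[0])
    print(pick_sdf(paths))
```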

robtaylor · 2026-03-04T18:38:33Z

Superseded by #55 (PnR netlist data only) + #49 (code changes, already merged to main). The code changes in this PR were duplicates from the timing-vcd-readback branch.

github-actions bot added the automated label Feb 27, 2026

github-actions bot force-pushed the auto/update-mcu-soc-data branch from ed4b23f to bdd3e71 Compare February 27, 2026 06:19

robtaylor mentioned this pull request Feb 27, 2026

Fix wrap_openframe.py: use ~ instead of ! for OEB inversion #45

Merged

2 tasks

github-actions bot force-pushed the auto/update-mcu-soc-data branch from bdd3e71 to 8ec3348 Compare February 27, 2026 08:47

This was referenced Feb 27, 2026

Fix wrap_openframe.py: parenthesize OEB inversion for parser #46

Closed

Remove unary operators from wrapper for parser compat #47

Merged

Fix wrap_openframe.py: parenthesize OEB inversion for parser #48

Merged

github-actions bot force-pushed the auto/update-mcu-soc-data branch from f43e281 to 7fdcde9 Compare February 27, 2026 12:32

robtaylor and others added 10 commits February 27, 2026 12:46

Update MCU SoC test data from librelane rebuild

6f2a2bd

github-actions bot force-pushed the auto/update-mcu-soc-data branch from 7fdcde9 to 6f2a2bd Compare February 27, 2026 19:29

robtaylor mentioned this pull request Mar 4, 2026

Update 6_final.v with PnR-produced netlist (includes output/clock buffers) #55

Merged

2 tasks

robtaylor closed this Mar 4, 2026

github-actions[bot] wants to merge 11 commits into main from auto/update-mcu-soc-data
