Add coredump #417

Draft
rust-cooker wants to merge 4 commits into main from feat/coredump-code-review

@rust-cooker
Contributor

Summary

Implement a complete on-device coredump subsystem that generates ELF ET_CORE
files on fatal faults, supporting all 4 kernel architectures (RISC-V 32/64,
AArch64, Cortex-M).

Key components

  • ELF builder (elf.rs): Hand-written ELF32/ELF64 #[repr(C)] structs,
    5-phase builder (header → PHDRs → notes → segments → backfill)
  • Register capture (arch/): Per-architecture register dumps following
    Linux user_regs_struct layout
    • RISC-V: 32 × usize (pc, ra, sp, gp, tp, t0-t6, fp, a0-a7, s1-s11)
    • AArch64: 34 × 8 bytes (x0-x30, sp, pc, pstate)
    • Cortex-M: 18 × 4 bytes (r0-r15, cpsr, orig_r0)
  • Signal mapping (signal.rs): Maps arch-specific exception codes to POSIX
    signals (SIGSEGV, SIGBUS, SIGILL, SIGFPE, SIGTRAP, SIGABRT)
  • Memory regions (regions.rs): Collects BSS, system stack, and thread
    stacks via GlobalQueueVisitor
  • Three backends (backend.rs):
    • MemoryBackend: Static buffer for post-mortem retrieval
    • FileBackend: VFS file output (/tmp/blueos.core.<pid>)
    • LoggingBackend: Hex-encoded serial/semihosting output
  • Integration test: Runs in QEMU with .checker assertion validation
  • Parser tool: coredump_parser.py reassembles hex output → .core ELF
  • Usage docs: README.md with backend selection, Kconfig options, and tool usage
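
The signal mapping listed above can be sketched for RISC-V as follows. This is a minimal illustration, not the code in signal.rs: the function name `mcause_to_signo_sketch`, the signal chosen for each cause, and the use of Linux signal numbers are assumptions. Only the exception cause codes are taken from the RISC-V privileged specification.

```rust
// Linux signal numbers (the generic numbering shared by riscv, arm, and aarch64).
pub const SIGILL: u32 = 4;
pub const SIGTRAP: u32 = 5;
pub const SIGABRT: u32 = 6;
pub const SIGBUS: u32 = 7;
pub const SIGSEGV: u32 = 11;

/// Hypothetical mapping from a RISC-V `mcause` value to a POSIX signal number.
/// The PR's `riscv_mcause_to_signo` may choose different signals.
pub fn mcause_to_signo_sketch(mcause: usize) -> u32 {
    // The most significant bit of mcause marks an interrupt rather than an
    // exception. An interrupt is not a fault, so fall back to SIGABRT.
    let interrupt_bit = 1usize << (usize::BITS - 1);
    if mcause & interrupt_bit != 0 {
        return SIGABRT;
    }
    match mcause {
        2 => SIGILL,             // illegal instruction
        3 => SIGTRAP,            // breakpoint
        0 | 4 | 6 => SIGBUS,     // instruction/load/store address misaligned
        1 | 5 | 7 => SIGSEGV,    // instruction/load/store access fault
        12 | 13 | 15 => SIGSEGV, // instruction/load/store page fault
        _ => SIGABRT,            // ecall, reserved, or unknown cause
    }
}
```

Keeping a catch-all arm means an unrecognized cause still yields a valid signal number in the NT_PRSTATUS note instead of aborting the dump.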

Kconfig options

  • ENABLE_COREDUMP (bool) — global switch
  • COREDUMP_BUF_SIZE (int) — buffer size per platform (64KB Cortex-M,
    256KB 32-bit, 4MB 64-bit)
  • COREDUMP_MEM_BACKEND (bool) — MemoryBackend selection

Code quality

  • #[inline(never)] on all exception entry points (capture_regs,
    capture_current_regs, dump())
  • Debug/Clone derives on all public types
  • // SAFETY: comments on every unsafe block
  • No unwrap() in core logic

Test plan

  • Compile on all 4 architectures (riscv64, riscv32, aarch64, cortex-m)
  • Integration test passes in QEMU (debug config)
  • All 27 board/config defconfigs updated with CONFIG_ENABLE_COREDUMP=y
  • Release/coverage builds verified

🤖 Generated with Claude Code

rust-cooker and others added 4 commits May 12, 2026 15:00
…end support

Add full coredump subsystem for post-mortem debugging across all architectures:
- ELF32/ELF64 ET_CORE builder with PT_LOAD, PT_NOTE, NT_PRSTATUS/PRPSINFO/SIGINFO
- Three compile-time backends: MemoryBackend (static buffer), FileBackend (VFS),
  LoggingBackend (hex-encoded serial output)
- Per-architecture register capture: RISC-V (32 regs), AArch64 (34 regs), ARM Cortex-M (18 regs)
- Fault-to-signal mapping: aarch64_ec_to_signo, riscv_mcause_to_signo, arm_cfsr_to_signo
- Memory region collection: BSS, system stack, thread stacks via GlobalQueueVisitor
- Kconfig-gated: ENABLE_COREDUMP, COREDUMP_BUF_SIZE, COREDUMP_MEM_BACKEND
- .coredump_bss linker section to prevent self-collection
- FileBackend PID-specific path (/tmp/blueos.core.<pid>)
- Integration test (coredump_test.rs) with QEMU checker validation
- Host-side hex-to-ELF parser tool (coredump_parser.py)
- Coredump usage documentation (README.md)
- Coding standards compliance: #[inline(never)] on exception entry points,
  Debug derives on public types, no unwrap() in core logic

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

Add missing trailing newlines to all 10 coredump .rs files and revert
README.md to English per review feedback.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
@rust-cooker rust-cooker marked this pull request as draft May 13, 2026 06:01
/// overflow since thread stacks (64 KB) are much smaller than the
/// coredump buffer.
#[link_section = ".coredump_bss"]
static mut ELF_BUF: [u8; COREDUMP_BUF_SIZE] = [0u8; COREDUMP_BUF_SIZE];

Why use the BSS segment for memory space that is never used at runtime?

A coredump may not be triggered by a signal when running in RTOS mode.

Comment thread kernel/src/Kconfig
endchoice
endmenu # os adapter configuration

rsource "coredump/Kconfig"

Please also add this path to _kconfig_files; otherwise the build system cannot track changes to coredump/Kconfig.

The assertion condition is missing. How is a failure determined?

@xuchang-vivo xuchang-vivo May 13, 2026

Putting this functionality into arch might be a better choice.

        collect_all_stacks(&mut col);
    }
    DumpMode::Full => {
        collect_all_stacks(&mut col);

When the mode is ALL, will the stack of the current thread definitely be recorded?

hex_buf[n] = b'\n';
// Use semihosting eprint which is always available
let s = unsafe { core::str::from_utf8_unchecked(&hex_buf[..n + 1]) };
semihosting::eprint!("{}", s);

Semihosting is only used in QEMU tests, so it is not suitable here.

let n = hex_encode(chunk, &mut hex_buf);
hex_buf[n] = b'\n';
// Use semihosting eprint which is always available
let s = unsafe { core::str::from_utf8_unchecked(&hex_buf[..n + 1]) };

Calling from_utf8_unchecked on hex_buf will result in undefined behavior.
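
One way to sidestep the question is to drop the unsafe call altogether. A minimal sketch, assuming hex_encode writes only ASCII hex digits and returns the number of bytes written (the body below is a hypothetical stand-in, not the PR's implementation, and `hex_line` is an illustrative name): core::str::from_utf8 validates the bytes and returns a Result, so malformed output surfaces as an error value.

```rust
/// Hypothetical stand-in for the PR's `hex_encode`: writes two lowercase hex
/// digits per input byte into `out` and returns the number of bytes written.
/// Panics if `out` is shorter than `2 * src.len()`.
pub fn hex_encode(src: &[u8], out: &mut [u8]) -> usize {
    const DIGITS: &[u8; 16] = b"0123456789abcdef";
    for (i, &b) in src.iter().enumerate() {
        out[2 * i] = DIGITS[(b >> 4) as usize];
        out[2 * i + 1] = DIGITS[(b & 0x0f) as usize];
    }
    2 * src.len()
}

/// Encodes `chunk` as one newline-terminated hex line and returns it as a
/// `&str` through the checked conversion, with no `unsafe` block.
/// `buf` must hold at least `2 * chunk.len() + 1` bytes.
pub fn hex_line<'a>(
    chunk: &[u8],
    buf: &'a mut [u8],
) -> Result<&'a str, core::str::Utf8Error> {
    let n = hex_encode(chunk, buf);
    buf[n] = b'\n';
    core::str::from_utf8(&buf[..n + 1])
}
```

The validation pass costs one extra scan over a short buffer per chunk, which is likely negligible next to the serial output itself.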

A better approach is to use write_data8, which is part of Has8bitDataReg. See the implementation of kearly_printk for reference.

In fact, using kearly_print is not a good method either, because it would mix the output of the two channels. The coredump needs a separate channel. I suggest leaving this blank with a TODO or FIXME note. We can come back and fix it once I find an elegant way to connect to the second serial port channel.
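
Until that second channel exists, the backend could be written against a small byte-sink trait so the transport can be swapped later without touching the encoder. A minimal sketch; `ByteSink`, `TodoSink`, and `emit_hex_line` are hypothetical names, not part of the kernel:

```rust
/// Hypothetical transport abstraction for coredump output.
pub trait ByteSink {
    fn write_byte(&mut self, byte: u8);
}

/// Placeholder sink that discards output and only counts the bytes it saw.
#[derive(Debug, Default)]
pub struct TodoSink {
    pub bytes_dropped: usize,
}

impl ByteSink for TodoSink {
    fn write_byte(&mut self, _byte: u8) {
        // TODO: forward to a dedicated serial channel once one is available.
        self.bytes_dropped += 1;
    }
}

/// Writes `chunk` to `sink` as lowercase hex digits followed by a newline.
pub fn emit_hex_line<S: ByteSink>(sink: &mut S, chunk: &[u8]) {
    const DIGITS: &[u8; 16] = b"0123456789abcdef";
    for &b in chunk {
        sink.write_byte(DIGITS[(b >> 4) as usize]);
        sink.write_byte(DIGITS[(b & 0x0f) as usize]);
    }
    sink.write_byte(b'\n');
}
```

Writing byte by byte also removes the need to build a &str at all, which makes the from_utf8_unchecked call above unnecessary.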

fn finalize(&mut self, mode: DumpMode) {
    // Emit a trailer line with metadata
    let size_kb = self.chunks * 32 / 1024;
    semihosting::eprint!(

Ditto.


3 participants