Add coredump #417
…end support
Add full coredump subsystem for post-mortem debugging across all architectures:
- ELF32/ELF64 ET_CORE builder with PT_LOAD, PT_NOTE, NT_PRSTATUS/PRPSINFO/SIGINFO
- Three compile-time backends: MemoryBackend (static buffer), FileBackend (VFS), LoggingBackend (hex-encoded serial output)
- Per-architecture register capture: RISC-V (32 regs), AArch64 (34 regs), ARM Cortex-M (18 regs)
- Fault-to-signal mapping: aarch64_ec_to_signo, riscv_mcause_to_signo, arm_cfsr_to_signo
- Memory region collection: BSS, system stack, thread stacks via GlobalQueueVisitor
- Kconfig-gated: ENABLE_COREDUMP, COREDUMP_BUF_SIZE, COREDUMP_MEM_BACKEND
- .coredump_bss linker section to prevent self-collection
- FileBackend PID-specific path (/tmp/blueos.core.<pid>)
- Integration test (coredump_test.rs) with QEMU checker validation
- Host-side hex-to-ELF parser tool (coredump_parser.py)
- Coredump usage documentation (README.md)
- Coding standards compliance: #[inline(never)] on exception entry points, Debug derives on public types, no unwrap() in core logic
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Add missing trailing newlines to all 10 coredump .rs files and revert README.md to English per review feedback. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
| /// overflow since thread stacks (64 KB) are much smaller than the | ||
| /// coredump buffer. | ||
| #[link_section = ".coredump_bss"] | ||
| static mut ELF_BUF: [u8; COREDUMP_BUF_SIZE] = [0u8; COREDUMP_BUF_SIZE]; |
Why use the BSS segment for memory space that is never used at runtime?
A coredump may not be triggered by a signal when running in RTOS mode.
| endchoice | ||
| endmenu # os adapter configuration | ||
|
|
||
| rsource "coredump/Kconfig" |
Please also add the path to _kconfig_files; otherwise the build system cannot track changes to coredump/Kconfig.
The assertion condition is missing. How is the failure condition determined?
Putting this functionality into arch might be a better choice.
| collect_all_stacks(&mut col); | ||
| } | ||
| DumpMode::Full => { | ||
| collect_all_stacks(&mut col); |
When the mode is ALL, will the stack of the current thread definitely be recorded?
| hex_buf[n] = b'\n'; | ||
| // Use semihosting eprint which is always available | ||
| let s = unsafe { core::str::from_utf8_unchecked(&hex_buf[..n + 1]) }; | ||
| semihosting::eprint!("{}", s); |
Semihosting is only used in QEMU tests, so it is not suitable here.
| let n = hex_encode(chunk, &mut hex_buf); | ||
| hex_buf[n] = b'\n'; | ||
| // Use semihosting eprint which is always available | ||
| let s = unsafe { core::str::from_utf8_unchecked(&hex_buf[..n + 1]) }; |
Calling from_utf8_unchecked on hex_buf will result in undefined behavior.
A better approach is to use write_data8, which is part of Has8bitDataReg. See the implementation of kearly_printk for reference.
In fact, though, using kearly_print is not a good method either, as it would mix the output of the two channels together. The coredump output should get a separate channel. I suggest leaving this blank and writing a TODO or FIXME note. We can come back and fix it once I find an elegant way to connect to the second serial port channel.
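As a side note on the from_utf8_unchecked concern in this thread: whichever output channel is chosen, the unsafe conversion can be avoided. The following is a minimal sketch, not the PR's code. The hex_encode signature and the encode_line helper are assumptions made for illustration. It uses only core, so it should also fit a no_std build.

```rust
/// Hex-encode `src` into `dst` using lowercase ASCII digits.
/// Returns the number of bytes written, or `None` if `dst` is too small.
/// (Hypothetical signature; the PR's `hex_encode` may differ.)
fn hex_encode(src: &[u8], dst: &mut [u8]) -> Option<usize> {
    const DIGITS: &[u8; 16] = b"0123456789abcdef";
    let needed = src.len().checked_mul(2)?;
    if dst.len() < needed {
        return None;
    }
    for (i, &byte) in src.iter().enumerate() {
        dst[2 * i] = DIGITS[(byte >> 4) as usize];
        dst[2 * i + 1] = DIGITS[(byte & 0x0f) as usize];
    }
    Some(needed)
}

/// Encode one chunk, append a newline, and view the result as `&str`
/// through the checked `core::str::from_utf8`, so no `unsafe` is needed.
/// (Hypothetical helper, not part of the PR.)
fn encode_line<'a>(chunk: &[u8], hex_buf: &'a mut [u8]) -> Option<&'a str> {
    let n = hex_encode(chunk, hex_buf)?;
    // `get_mut` returns `None` instead of panicking when there is no room
    // left for the trailing newline.
    *hex_buf.get_mut(n)? = b'\n';
    core::str::from_utf8(&hex_buf[..n + 1]).ok()
}
```

The checked conversion costs one validation pass per line, which is likely negligible next to serial output, and the bounds checks turn an undersized buffer into a None rather than a panic inside the fault path.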
| fn finalize(&mut self, mode: DumpMode) { | ||
| // Emit a trailer line with metadata | ||
| let size_kb = self.chunks * 32 / 1024; | ||
| semihosting::eprint!( |
Summary
Implement a complete on-device coredump subsystem that generates ELF ET_CORE
files on fatal faults, supporting all 4 kernel architectures (RISC-V 32/64,
AArch64, Cortex-M).
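For readers unfamiliar with the ET_CORE layout, here is a minimal sketch of what a hand-written ELF64 file header can look like. It is not taken from elf.rs. The struct name Elf64Ehdr and the core_header function are hypothetical, and little-endian output is assumed. The field order and constant values follow the ELF specification.

```rust
/// ELF64 file header, fields in the order defined by the ELF specification.
#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct Elf64Ehdr {
    pub e_ident: [u8; 16],
    pub e_type: u16,
    pub e_machine: u16,
    pub e_version: u32,
    pub e_entry: u64,
    pub e_phoff: u64,
    pub e_shoff: u64,
    pub e_flags: u32,
    pub e_ehsize: u16,
    pub e_phentsize: u16,
    pub e_phnum: u16,
    pub e_shentsize: u16,
    pub e_shnum: u16,
    pub e_shstrndx: u16,
}

pub const ET_CORE: u16 = 4; // object file type: core dump
pub const EM_AARCH64: u16 = 183;
pub const EM_RISCV: u16 = 243;
const ELFCLASS64: u8 = 2;
const ELFDATA2LSB: u8 = 1; // assumption: little-endian target
const EV_CURRENT: u8 = 1;
const EHDR_SIZE: u16 = 64; // size of Elf64Ehdr
const PHDR_SIZE: u16 = 56; // size of one ELF64 program header

/// Build a core-file header whose program headers start right after it.
/// (Hypothetical helper for illustration.)
pub fn core_header(machine: u16, phnum: u16) -> Elf64Ehdr {
    let mut e_ident = [0u8; 16];
    e_ident[..4].copy_from_slice(b"\x7fELF");
    e_ident[4] = ELFCLASS64;
    e_ident[5] = ELFDATA2LSB;
    e_ident[6] = EV_CURRENT;
    Elf64Ehdr {
        e_ident,
        e_type: ET_CORE,
        e_machine: machine,
        e_version: EV_CURRENT as u32,
        e_entry: 0,
        e_phoff: EHDR_SIZE as u64,
        e_shoff: 0, // core files need no section headers
        e_flags: 0,
        e_ehsize: EHDR_SIZE,
        e_phentsize: PHDR_SIZE,
        e_phnum: phnum,
        e_shentsize: 0,
        e_shnum: 0,
        e_shstrndx: 0,
    }
}
```

A builder that writes headers first and fills in counts and offsets later (the "backfill" phase mentioned in this description) would typically write such a header with placeholder values and patch e_phnum once the segments are known.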
Key components
- elf.rs: Hand-written ELF32/ELF64 #[repr(C)] structs, 5-phase builder (header → PHDRs → notes → segments → backfill)
- arch/: Per-architecture register dumps following Linux user_regs_struct layout
- signal.rs: Maps arch-specific exception codes to POSIX signals (SIGSEGV, SIGBUS, SIGILL, SIGFPE, SIGTRAP, SIGABRT)
- regions.rs: Collects BSS, system stack, and thread stacks via GlobalQueueVisitor
- backend.rs:
  - MemoryBackend: Static buffer for post-mortem retrieval
  - FileBackend: VFS file output (/tmp/blueos.core.<pid>)
  - LoggingBackend: Hex-encoded serial/semihosting output
- .checker assertion validation
- coredump_parser.py reassembles hex output → .core ELF
- README.md with backend selection, Kconfig options, and tool usage
Kconfig options
- ENABLE_COREDUMP (bool): global switch
- COREDUMP_BUF_SIZE (int): buffer size per platform (64KB Cortex-M, 256KB 32-bit, 4MB 64-bit)
- COREDUMP_MEM_BACKEND (bool): MemoryBackend selection
Code quality
- #[inline(never)] on all exception entry points (capture_regs, capture_current_regs, dump())
- Debug/Clone derives on all public types
- // SAFETY: comments on every unsafe block
- No unwrap() in core logic
Test plan
- CONFIG_ENABLE_COREDUMP=y
🤖 Generated with Claude Code
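To illustrate the fault-to-signal mapping described for signal.rs, here is a minimal sketch for RISC-V. It is not the PR's riscv_mcause_to_signo. The function name, the choice of signal for each cause, and the 64-bit mcause width are assumptions. The exception codes come from the RISC-V privileged specification and the signal numbers are the Linux values.

```rust
// Linux signal numbers.
pub const SIGILL: i32 = 4;
pub const SIGTRAP: i32 = 5;
pub const SIGABRT: i32 = 6;
pub const SIGBUS: i32 = 7;
pub const SIGSEGV: i32 = 11;

/// Map a RISC-V `mcause` value to a POSIX signal number.
/// Returns `None` for interrupts, which are not faults.
/// (Hypothetical mapping; assumes RV64, where bit 63 flags an interrupt.)
pub fn mcause_to_signo_sketch(mcause: u64) -> Option<i32> {
    const INTERRUPT_BIT: u64 = 1 << 63;
    if mcause & INTERRUPT_BIT != 0 {
        return None;
    }
    let signo = match mcause {
        // Misaligned instruction fetch, load, or store address.
        0 | 4 | 6 => SIGBUS,
        // Access faults and page faults for fetch, load, and store.
        1 | 5 | 7 | 12 | 13 | 15 => SIGSEGV,
        // Illegal instruction.
        2 => SIGILL,
        // Breakpoint.
        3 => SIGTRAP,
        // Anything else (for example an unexpected ecall) is reported
        // as an abort in this sketch.
        _ => SIGABRT,
    };
    Some(signo)
}
```

Returning an Option keeps the interrupt case explicit, so a caller building the NT_PRSTATUS or NT_SIGINFO note has to decide what to record when no signal applies.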