From f2a1dcc267ef620f51e7618fab5183a6479f9862 Mon Sep 17 00:00:00 2001 From: Oliver Smith-Denny Date: Sun, 29 Jun 2025 21:50:42 -0700 Subject: [PATCH] Don't inline zero_page function There is a current crash in patina on x86 when zero_page gets inlined in a certain way, small changes to the code generation cause no crash to occur. However, when zero_page gets inlined on x86, rdi in the outer function can get clobbered and not restored, which causes undefined behavior. In the example seen, the root page of the page table is supposed to be 0x50000, stored in rdi, but rdi gets used to track incrementing the zeroing writes to this page and ends up at 0x51000, which gets stored as the page table root, causing a write because there is a mismatch when this gets used later. There is a greater investigation needed here in two parts: - Whether this function is needed to be written in assembly. It was written so to avoid the compiler from optimizing out the memory write that it thinks may not be used, but after discussion with the compiler team, this can be avoided another way. However, that move is considered risky, as if the compiler decides to throw away a zeroing of the page table, we will have garbage entries, causing undefined behavior. - Whether this function should be written in a separate asm file as a global_asm import. Doing this avoids the use of inline asm and puts all register usage in our control. However, it also removes the niceties of inline_asm, namely only doing asm for the parts required to be so and doing the rest in Rust. Also, this is a larger change and so considered risky. In order to mitigate the current crash, the simplest change is made first: don't allow the zero_page functions to be inlined. This was seen to boot and under a debugger the proper push/pop of rdi was occurring. This is done for AARCH64 as well, even though it does not use explicit registers (x86 must in order to use the rep instruction) in order to keep the implementation similar and avoid similar potential problems. Other AARCH64 asm was reviewed and determined to not use explicit general purpose registers, and so left alone. Finally, the x86 function was incorrectly declaring that it preserves_flags, when it in fact explicitly clears the direction flag, so that is removed. There is a task filed in patina to review all usage of inline asm in patina and associated repos because there is a high chance that it has been done incorrectly; there are many common pitfalls that the Rust compiler can take when special care is not given to inline asm. --- src/aarch64/reg.rs | 3 +++ src/x64.rs | 6 +++++- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/src/aarch64/reg.rs b/src/aarch64/reg.rs index e131cf0..6042f42 100644 --- a/src/aarch64/reg.rs +++ b/src/aarch64/reg.rs @@ -360,10 +360,13 @@ pub(crate) fn is_this_page_table_active(page_table_base: PhysicalAddress) -> boo /// 1. Ensure that the compiler does not optimize out the zeroing /// 2. Ensure that the zeroing is done as quickly as possible as without this, the zero takes a long time on /// non-optimized builds +/// 3. This function must not be inlined to ensure that the register reads and writes don't affect the caller's +/// registers. /// /// # Safety /// This function is unsafe because it operates on raw pointers. It requires the caller to ensure the VA passed in /// is mapped. +#[inline(never)] pub(crate) unsafe fn zero_page(page: u64) { // If the MMU is diabled, invalidate the cache so that any stale data does // not get later evicted to memory. diff --git a/src/x64.rs b/src/x64.rs index 6ba7cd7..67c126b 100644 --- a/src/x64.rs +++ b/src/x64.rs @@ -89,6 +89,10 @@ impl PageTableHal for PageTableArchX64 { type PTE = X64PageTableEntry; const DEFAULT_ATTRIBUTES: MemoryAttributes = MemoryAttributes::empty(); + // This function must not be inlined to ensure that the register reads and writes don't affect the + // caller's registers. It has been viewed that this function is inlined several layers up the stack and has + // corrupted the rdi register, causing a crash. + #[inline(never)] unsafe fn zero_page(base: VirtualAddress) { let _page: u64 = base.into(); #[cfg(all(not(test), target_arch = "x86_64"))] @@ -99,7 +103,7 @@ impl PageTableHal for PageTableArchX64 { in("rcx") 0x200, // we write 512 qwords (4096 bytes) in("rdi") _page, // start at the page in("rax") 0, // store 0 - options(nostack, preserves_flags) + options(nostack) ); } }