Is it possible to *ever* preserve padding bytes? #599

@clarfonthey

Related to, but not strictly the same as #518. Specifically, this shows the need for a padding-preserving type while not requiring MaybeUninit to be that type. This is mentioned in that thread but since this is a particular use case justifying it, it felt worthwhile to create a dedicated issue for it.

Context: I'm trying to create a library that can force certain alignments on types for testing purposes. Forcing alignment is trivial, but I also have a non-trivial case that is currently running into issues: forcing misalignment. Specifically, the library ensures that no alignment above align_of::<T>() and up to some arbitrary alignment N is satisfied, by writing T at an offset of N - align_of::<T>() inside the struct. (Example: misaligning a [u16; 2] to 512 bytes results in a 512-aligned struct holding 510 junk bytes, the [u16; 2], and then another 510 junk bytes. This means that no alignment from 4 up to 512 is satisfied, even though the minimum alignment of 2 still is.)
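
The offset rule can be checked with a little arithmetic. The following is a minimal sketch; the helper names misaligned_offset and satisfied_alignments are made up for illustration and are not part of the library:

```rust
/// Byte offset at which the value is placed inside the N-aligned wrapper,
/// following the rule from the text: N - align_of::<T>().
/// (Hypothetical helper, only covers align_of::<T>() <= N.)
pub fn misaligned_offset(align_of_t: usize, n: usize) -> usize {
    assert!(align_of_t.is_power_of_two() && n.is_power_of_two());
    assert!(align_of_t <= n, "this sketch only covers align_of::<T>() <= N");
    n - align_of_t
}

/// Powers of two up to `n` that the address `base + offset` satisfies,
/// assuming `base` itself is a multiple of `n`.
pub fn satisfied_alignments(offset: usize, n: usize) -> Vec<usize> {
    let mut result = Vec::new();
    let mut align = 1;
    while align <= n {
        if offset % align == 0 {
            result.push(align);
        }
        align *= 2;
    }
    result
}
```

For the [u16; 2] example, misaligned_offset(2, 512) is 510 and satisfied_alignments(510, 512) contains only 1 and 2, which matches the claim that 4 through 512 are all unsatisfied.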

The library I'm working on is public if you'd like to dig deeper into what I'm trying to accomplish.

The way I accomplish this is by using repr(C) and placing an aligned [u8; N] of padding before a MaybeUninit<T>, then writing the value at the correct offset to ensure the misalignment. Because there is room for T at the end, we should be able to shift the value into the padding, since it's just regular bytes. As long as you consistently access T at the right offset and implement Drop correctly, it should work. But it doesn't, because of padding in T.
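
For a type without internal padding, the approach can be sketched as follows. This is a simplified, non-generic illustration and not the library's actual code: the concrete choice of N = 8, the Value alias, and the method names are all assumptions made for the example.

```rust
use std::mem::{align_of, MaybeUninit};

// Assumed for illustration: misalign a [u16; 2] relative to 8 bytes.
const N: usize = 8;
type Value = [u16; 2];

// The fields are only accessed through raw pointers, hence the allow.
#[allow(dead_code)]
#[repr(C, align(8))]
pub struct Misaligned {
    padding: MaybeUninit<[u8; N]>,
    value: MaybeUninit<Value>,
}

impl Misaligned {
    /// N - align_of::<T>(), here 8 - 2 = 6.
    const OFFSET: usize = N - align_of::<Value>();

    pub fn new(value: Value) -> Self {
        let mut this = Misaligned {
            padding: MaybeUninit::uninit(),
            value: MaybeUninit::uninit(),
        };
        // SAFETY: bytes OFFSET..OFFSET + 4 lie inside the struct, and
        // base + 6 is 2-aligned because the struct is 8-aligned.
        unsafe { this.slot_mut().write(value) };
        this
    }

    fn slot_mut(&mut self) -> *mut Value {
        let base = (self as *mut Self).cast::<u8>();
        // SAFETY: OFFSET is within the bounds of the struct.
        unsafe { base.add(Self::OFFSET).cast::<Value>() }
    }

    pub fn as_ptr(&self) -> *const Value {
        let base = (self as *const Self).cast::<u8>();
        // SAFETY: OFFSET is within the bounds of the struct.
        unsafe { base.add(Self::OFFSET).cast::<Value>() }
    }

    pub fn get(&self) -> Value {
        // SAFETY: `new` always initializes the slot, and Value is Copy.
        unsafe { self.as_ptr().read() }
    }
}
```

Here the value straddles the end of `padding` and the start of `value`. Neither field's type has padding of its own, so every byte the value occupies is an ordinary data byte. The trouble described in this issue only appears once T itself contains padding.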

For example, take this struct designed to test the code I wrote:

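use std::sync::atomic::{AtomicUsize, Ordering};

#[repr(C)] // for consistency of argument
struct MyTest<'a>(u16, &'a AtomicUsize);
impl Drop for MyTest<'_> {
    fn drop(&mut self) {
        // Only count the drop if the sentinel survived intact.
        if self.0 == 0x1122 {
            self.1.fetch_add(1, Ordering::Relaxed);
        }
    }
}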

The test is simple: the Drop implementation first verifies that a sentinel value is present and then increments an atomic counter. If our misaligned struct is correct, then creating and dropping the misaligned MyTest struct should still successfully increase the counter.
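
As a baseline, the same check can be run on a plain, correctly aligned MyTest. This is a minimal sketch; the helper name drops_counted is hypothetical and only exists for this example:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

#[repr(C)]
struct MyTest<'a>(u16, &'a AtomicUsize);

impl Drop for MyTest<'_> {
    fn drop(&mut self) {
        // Only count the drop if the sentinel is present.
        if self.0 == 0x1122 {
            self.1.fetch_add(1, Ordering::Relaxed);
        }
    }
}

/// Creates and drops one `MyTest` with the given sentinel value and
/// returns the counter afterwards. (Hypothetical helper.)
pub fn drops_counted(sentinel: u16) -> usize {
    let counter = AtomicUsize::new(0);
    drop(MyTest(sentinel, &counter));
    counter.load(Ordering::Relaxed)
}
```

With the sentinel 0x1122 the counter ends at 1, and with any other value it stays at 0. A correct misaligned wrapper should reproduce the first result.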

On a 64-bit system, this is stored as the following, where xx represents data and __ represents padding:

xx xx __ __ __ __ __ __ xx xx xx xx xx xx xx xx
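
This layout can be inspected with size_of, align_of and offset_of!. The sketch below uses a named-field mirror of MyTest (the name MyTestLayout and its field names are assumptions for the example) so that the field offset can be queried directly:

```rust
use std::mem::{align_of, offset_of, size_of};
use std::sync::atomic::AtomicUsize;

// Named-field stand-in for MyTest with the same field types and repr(C).
// The fields are never read, only measured, hence the allow.
#[allow(dead_code)]
#[repr(C)]
pub struct MyTestLayout {
    sentinel: u16,
    counter: &'static AtomicUsize,
}

/// Returns (size, alignment, offset of the reference field).
pub fn layout() -> (usize, usize, usize) {
    (
        size_of::<MyTestLayout>(),
        align_of::<MyTestLayout>(),
        offset_of!(MyTestLayout, counter),
    )
}

/// Number of padding bytes between the u16 and the reference.
pub fn padding_bytes() -> usize {
    offset_of!(MyTestLayout, counter) - size_of::<u16>()
}
```

On a typical 64-bit target, layout() is (16, 8, 8) and padding_bytes() is 6, which corresponds to the six __ bytes in the diagram.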

Now, let's "misalign" this struct up to 16 bytes. This means our struct will now be equivalent to:

#[repr(C, align(16))]
struct Misaligned<'a> {
    padding: MaybeUninit<[u8; 16]>,
    value: MaybeUninit<MyTest<'a>>,
}

The below shows what bytes can be initialized in this struct:

xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx __ __ __ __ __ __ xx xx xx xx xx xx xx xx

However, with the misalignment, we will be actually writing the MyTest value to an offset of 8, which creates this issue:

xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx xx __ __ __ __ __ __ xx xx xx xx xx xx xx xx
__ __ __ __ __ __ __ __ xx xx __ __ __ __ __ __ xx xx xx xx xx xx xx xx __ __ __ __ __ __ __ __

As you can see, the problem is that the pointer gets written to the same location as the u16 and its padding in the struct's declared layout, so part of the pointer lands in bytes that are not guaranteed to be preserved. Implementing this type doesn't seem possible unless we can specify a struct with "room for T, but with all bytes preserved."
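
The overlap can also be computed rather than read off the diagram. This sketch hard-codes the 64-bit layout shown above; the function names are made up for the example:

```rust
/// Which bytes of the declared 32-byte `Misaligned` layout are data (true)
/// rather than padding (false), copied from the 64-bit diagram.
pub fn declared_data_mask() -> [bool; 32] {
    let mut mask = [true; 32];
    // Bytes 18..24 are the padding inside `value: MaybeUninit<MyTest>`.
    mask[18..24].fill(false);
    mask
}

/// Data bytes of a `MyTest` written at `offset`:
/// the u16 at +0..2 and the pointer at +8..16.
pub fn shifted_data_bytes(offset: usize) -> Vec<usize> {
    assert!(offset + 16 <= 32, "value must fit inside the struct");
    (offset..offset + 2).chain(offset + 8..offset + 16).collect()
}

/// Data bytes of the shifted value that fall on padding of the
/// declared layout and therefore may not be preserved.
pub fn bytes_at_risk(offset: usize) -> Vec<usize> {
    let mask = declared_data_mask();
    shifted_data_bytes(offset)
        .into_iter()
        .filter(|&byte| !mask[byte])
        .collect()
}
```

At offset 8, bytes 18 through 23 are at risk: six of the pointer's eight bytes sit where the declared layout has padding. At offset 16, the undisturbed position, nothing is at risk.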


As an honourable mention, it looks like some flavour of min_generic_const_args might be able to solve this, if we could ever define the following struct:

struct Writeable<T>([u8; size_of::<T>()]);

But this does not seem to be possible with the current iteration of min_generic_const_args, and it's not clear whether it will be for a very long time. Offering some flavour of this type in the standard library might be useful regardless, so it's worth pointing out as another case where preserving padding bytes would be useful.
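
For comparison, a rough approximation that compiles on stable Rust today is to make the caller supply the size as an explicit const parameter and check it at compile time. This is only a sketch under that assumption, not something proposed in this issue: the second parameter N, the zero-length [T; 0] field used to borrow T's alignment, and the method names are all illustrative, and whether such a byte array addresses every concern raised here is not settled by this example.

```rust
use std::mem::{size_of, MaybeUninit};

/// Storage with the size and alignment of T, held as individual bytes.
/// Unlike the desired `Writeable<T>`, the size N must be spelled out.
#[repr(C)]
pub struct Writeable<T, const N: usize> {
    // Zero-sized field that gives the struct the alignment of T.
    _align: [T; 0],
    bytes: [MaybeUninit<u8>; N],
}

impl<T, const N: usize> Writeable<T, N> {
    // Evaluated at compile time for each (T, N) that calls `new`.
    const SIZE_MATCHES: () =
        assert!(N == size_of::<T>(), "N must equal size_of::<T>()");

    pub fn new(value: T) -> Self {
        let () = Self::SIZE_MATCHES;
        let mut this = Writeable {
            _align: [],
            bytes: [MaybeUninit::uninit(); N],
        };
        // SAFETY: `bytes` sits at offset 0 of a struct aligned for T
        // and is exactly size_of::<T>() bytes long.
        unsafe { this.bytes.as_mut_ptr().cast::<T>().write(value) };
        this
    }

    pub fn into_inner(self) -> T {
        // SAFETY: `new` always writes a valid T, and `self` is consumed,
        // so the value is read out exactly once.
        unsafe { self.bytes.as_ptr().cast::<T>().read() }
    }
}
```

A Writeable::<u32, 4> has size 4 and alignment 4. Dropping a Writeable without calling into_inner leaks the T inside, which a real implementation would need to handle.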

Similarly, there is a potential issue with making this "misalignment" struct generic: if align_of::<T>() is greater than the desired alignment N, the byte array is not large enough to cover the padding that gets inserted before T. For example, take the simple case of "misaligning" u16 to 1 byte:

struct Misaligned {
    padding: MaybeUninit<[u8; 1]>,
    value: MaybeUninit<u16>,
}

Here, we can't write value to the beginning of the struct because of the extra padding byte between the two fields. However, this particular issue is solved completely by making the padding a union with T, since T then always fits when written to this position. So the apparent need for an array of max(N, size_of::<T>()) bytes is not actually an issue:

union Padding {
    padding: [u8; 1],
    value: u16,
}
struct Misaligned {
    padding: MaybeUninit<Padding>,
    value: MaybeUninit<u16>,
}

(Technically, there's wasted space, but since this is all for testing purposes, that doesn't really matter. There's going to be space wasted regardless in the original configuration, since e.g. misaligning u16 to 512 bytes results in a 1024-byte struct, when 512 bytes would suffice.)
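
The effect of the union on the layout can be checked directly. In this sketch both types are marked repr(C), which is an assumption added here so that the field order and offsets are well defined; the function name sizes is made up for the example:

```rust
use std::mem::{offset_of, size_of, MaybeUninit};

// The fields are never read, only measured, hence the allows.
#[allow(dead_code)]
#[repr(C)]
pub union Padding {
    padding: [u8; 1],
    value: u16,
}

#[allow(dead_code)]
#[repr(C)]
pub struct Misaligned {
    padding: MaybeUninit<Padding>,
    value: MaybeUninit<u16>,
}

/// Returns (size of the union, size of the struct, offset of `value`).
pub fn sizes() -> (usize, usize, usize) {
    (
        size_of::<Padding>(),
        size_of::<Misaligned>(),
        offset_of!(Misaligned, value),
    )
}
```

The union is as large as its largest field, so sizes() gives (2, 4, 2): the first two bytes are now fully covered by the `padding` field, with no implicit padding byte between the fields.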
