Re: [PATCH 3/3] rust: add abstraction for `struct page`

From: Alice Ryhl
Date: Thu Feb 08 2024 - 08:46:35 EST


On Thu, Feb 1, 2024 at 7:02 AM Trevor Gross <tmgross@xxxxxxxxx> wrote:
>
> On Wed, Jan 24, 2024 at 6:22 AM Alice Ryhl <aliceryhl@xxxxxxxxxx> wrote:
> > +/// A pointer to a page that owns the page allocation.
> > +///
> > +/// # Invariants
> > +///
> > +/// The pointer points at a page, and has ownership over the page.
> > +pub struct Page {
> > + page: NonNull<bindings::page>,
> > +}
>
> Shouldn't this be UnsafeCell / Opaque? Since `struct page` contains locks.

That only matters when we use a reference. Here, it's behind a raw pointer.

> > +// SAFETY: It is safe to transfer page allocations between threads.
> > +unsafe impl Send for Page {}
> > +
> > +// SAFETY: Calling `&self` methods on this type in parallel is safe. It might
> > +// allow you to perform a data race on bytes stored in the page, but we treat
> > +// this like data races on user pointers.
> > +unsafe impl Sync for Page {}
>
> These races should probably be in the Page docs, rather than pointing
> to user pointers.

New safety comment:

SAFETY: As long as the safety requirements for `&self` methods on this
type are followed, there is no problem with calling them in parallel.

> > +impl Page {
> > + /// Allocates a new set of contiguous pages.
>
> "set of contiguous page" -> "page"?

Thanks, done.

> > + pub fn new() -> Result<Self, AllocError> {
> > + // SAFETY: These are the correct arguments to allocate a single page.
> > + let page = unsafe {
> > + bindings::alloc_pages(
> > + bindings::GFP_KERNEL | bindings::__GFP_ZERO | bindings::__GFP_HIGHMEM,
> > + 0,
> > + )
> > + };
> > +
> > + match NonNull::new(page) {
> > + // INVARIANT: We checked that the allocation above succeeded.
> > + Some(page) => Ok(Self { page }),
> > + None => Err(AllocError),
> > + }
>
> Optionally:
>
> let page = NonNull::new(page).ok_or(AllocError)?;
> Ok(Self { page })

Done.

> > + }
> > +
> > + /// Returns a raw pointer to the page.
>
> Maybe add ", valid for PAGE_SIZE" or similar to make this obvious.

This is a pointer to the `struct page`, not the actual page data.

> > + pub fn as_ptr(&self) -> *mut bindings::page {
> > + self.page.as_ptr()
> > + }
> > +
> > + /// Runs a piece of code with this page mapped to an address.
>
> Maybe ", then immediately unmaps the page" to make the entire operation clear.

Ok.

> > + /// It is up to the caller to use the provided raw pointer correctly.
> > + pub fn with_page_mapped<T>(&self, f: impl FnOnce(*mut c_void) -> T) -> T {
>
> If there is exclusive access into the page, this signature could be:
>
> FnOnce(&mut [u8; PAGE_SIZE]) -> T
>
> Otherwise possibly
>
> FnOnce(*mut [u8; PAGE_SIZE]) -> T
>
> But based on the thread with Boqun it seems there is no synchronized
> access here. In this case, "use the provided raw pointer correctly" or
> the type level docs should clarify what you can and can't rely on with
> pointers into a page.
>
> E.g. if I'm understanding correctly, you can never construct a &T or
> &mut T anywhere in this page unless T is Sync.

We discussed this in the meeting and concluded that we should use *mut u8 here.

> > + /// Runs a piece of code with a raw pointer to a slice of this page, with
> > + /// bounds checking.
> > + ///
> > + /// If `f` is called, then it will be called with a pointer that points at
> > + /// `off` bytes into the page, and the pointer will be valid for at least
> > + /// `len` bytes. The pointer is only valid on this task, as this method uses
> > + /// a local mapping./
> > + ///
> > + /// If `off` and `len` refers to a region outside of this page, then this
> > + /// method returns `EINVAL` and does not call `f`.
> > + pub fn with_pointer_into_page<T>(
> > + &self,
> > + off: usize,
> > + len: usize,
> > + f: impl FnOnce(*mut u8) -> Result<T>,
> > + ) -> Result<T> {
>
> Same question about exclusive access
>
> impl FnOnce(&mut [u8]) -> Result<T>

We discussed this in the meeting. Slices raise all sorts of cans of
worms with uninit and exclusivity, so the raw methods won't use them.

> > + let bounds_ok = off <= PAGE_SIZE && len <= PAGE_SIZE && (off + len) <= PAGE_SIZE;
> > +
> > + if bounds_ok {
> > + self.with_page_mapped(move |page_addr| {
> > + // SAFETY: The `off` integer is at most `PAGE_SIZE`, so this pointer offset will
> > + // result in a pointer that is in bounds or one off the end of the page.
> > + f(unsafe { page_addr.cast::<u8>().add(off) })
> > + })
> > + } else {
> > + Err(EINVAL)
> > + }
> > + }
> > +
> > + /// Maps the page and reads from it into the given buffer.
> > + ///
> > + /// # Safety
> > + ///
> > + /// Callers must ensure that `dest` is valid for writing `len` bytes.
> > + pub unsafe fn read(&self, dest: *mut u8, offset: usize, len: usize) -> Result {
>
> Is there a reason not to use a slice just for a destination to read into?

Ditto.

> > + self.with_pointer_into_page(offset, len, move |from_ptr| {
>
> Nit: do the names from_ptr/to_ptr come from existing binder? src/dst
> seems more common (also dst vs. dest).

Renamed everything to use src/dst

> > + self.with_pointer_into_page(offset, len, move |to_ptr| {
> > + // SAFETY: If `with_pointer_into_page` calls into this closure, then
> > + // it has performed a bounds check and guarantees that `to_ptr` is
> > + // valid for `len` bytes.
> > + unsafe { ptr::copy(src, to_ptr, len) };
> > + Ok(())
> > + })
> > + }
> > +
> > + /// Maps the page and zeroes the given slice.
>
> Mention that this will error with the same conditions as with_pointer_into_page.

That method is private. I will add documentation for this that doesn't
reference with_pointer_into_page.

> > + pub fn fill_zero(&self, offset: usize, len: usize) -> Result {
> > + self.with_pointer_into_page(offset, len, move |to_ptr| {
> > + // SAFETY: If `with_pointer_into_page` calls into this closure, then
> > + // it has performed a bounds check and guarantees that `to_ptr` is
> > + // valid for `len` bytes.
> > + unsafe { ptr::write_bytes(to_ptr, 0u8, len) };
> > + Ok(())
> > + })
> > + }
> > +
> > + /// Copies data from userspace into this page.
> > + pub fn copy_into_page(
> > + &self,
> > + reader: &mut UserSlicePtrReader,
> > + offset: usize,
> > + len: usize,
> > + ) -> Result {
>
> Maybe copy_from_user_slice or something that includes "user", since
> as-is it sounds like copying a page into another page.
>
> Also, docs should point out the error condition.

Done.

Thanks,
Alice