Re: [PATCH v3] rust: str: add {make,to}_{upper,lower}case() to CString

From: Alice Ryhl
Date: Fri Feb 16 2024 - 04:10:17 EST


On Thu, Feb 15, 2024 at 5:51 PM Boqun Feng <boqun.feng@xxxxxxxxx> wrote:
>
> On Thu, Feb 15, 2024 at 10:38:07AM +0100, Alice Ryhl wrote:
> > On Thu, Feb 15, 2024 at 2:18 AM Boqun Feng <boqun.feng@xxxxxxxxx> wrote:
> > >
> > > On Wed, Feb 14, 2024 at 08:59:06PM +0100, Alice Ryhl wrote:
> > > > On 2/14/24 20:27, Boqun Feng wrote:
> > > > > On Wed, Feb 14, 2024 at 06:24:10PM +0100, Danilo Krummrich wrote:
> > > > > > --- a/rust/kernel/str.rs
> > > > > > +++ b/rust/kernel/str.rs
> > > > > > @@ -5,7 +5,7 @@
> > > > > > use alloc::alloc::AllocError;
> > > > > > use alloc::vec::Vec;
> > > > > > use core::fmt::{self, Write};
> > > > > > -use core::ops::{self, Deref, Index};
> > > > > > +use core::ops::{self, Deref, DerefMut, Index};
> > > > > > use crate::{
> > > > > > bindings,
> > > > > > @@ -143,6 +143,19 @@ pub const fn from_bytes_with_nul(bytes: &[u8]) -> Result<&Self, CStrConvertError
> > > > > > unsafe { core::mem::transmute(bytes) }
> > > > > > }
> > > > > > + /// Creates a mutable [`CStr`] from a `[u8]` without performing any
> > > > > > + /// additional checks.
> > > > > > + ///
> > > > > > + /// # Safety
> > > > > > + ///
> > > > > > + /// `bytes` *must* end with a `NUL` byte, and should only have a single
> > > > > > + /// `NUL` byte (or the string will be truncated).
> > > > > > + #[inline]
> > > > > > + pub const unsafe fn from_bytes_with_nul_unchecked_mut(bytes: &mut [u8]) -> &mut CStr {
> > > > > > + // SAFETY: Properties of `bytes` guaranteed by the safety precondition.
> > > > > > + unsafe { &mut *(bytes as *mut [u8] as *mut CStr) }
> > > > >
> > > > > > First, `.cast::<[u8]>().cast::<CStr>()` is preferred over `as`. Besides,
> > > > > > I think the dereference (or reborrow) is only safe if `CStr` is
> > > > > > `#[repr(transparent)]`, i.e.
> > > > >
> > > > > #[repr(transparent)]
> > > > > pub struct CStr([u8]);
> > > > >
> > > > > with that you can implement the function as (you can still use `cast()`
> > > > > implementation, but I sometimes find `transmute` is more simple).
> > > > >
> > > > > pub const unsafe fn from_bytes_with_nul_unchecked_mut(bytes: &mut [u8]) -> &mut CStr {
> > > > > // SAFETY: `CStr` is transparent to `[u8]`, so the transmute is
> > > > > // safe to do, and per the function safety requirement, `bytes`
> > > > > // is a valid `CStr`.
> > > > > unsafe { core::mem::transmute(bytes) }
> > > > > }
> > > > >
> > > > > but this is just my thought, better wait for others' feedback as well.
> > > >
> > > > Transmuting references is generally frowned upon. It's better to use a
> > > > pointer cast.
> > > >
> > >
> > > Ok, but honestly, I don't think the pointer casting is better ;-) What
> > > wants to be done here is simply converting a `&mut [u8]` to `&mut CStr`,
> > > adding two levels of pointer casting is kinda noise. (Also
> > > `from_bytes_with_nul` uses `transmute` as well).
> >
> > Here's my logic for preferring pointer casts: Transmute raises
> > questions about the layout of fat pointers, whereas pointer casts are
> > obviously okay.
> >
>
> But in this case, eventually you need to worry about fat pointer layout
> when you dereference the `*mut CStr`, right? In other words, the
> dereference is only safe if `*mut [u8]` has the same fat pointer layout
> as `*mut CStr`. I prefer to transmute here because it's a newtype
> paradigm, and transmute kinda makes that clear.

No, if the `*mut CStr` and `*mut [u8]` types disagreed on whether the
data pointer or the vtable pointer comes first in the layout, then an
`as` cast would swap them as needed. A transmute, by contrast, copies
the bits unchanged.

The question of whether their vtables (well, I guess it's just a length
in this case) are compatible is a separate one.
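
To make the pointer-cast variant concrete, here is a minimal sketch. It
assumes a hypothetical stand-in type `MyCStr` (not the kernel's `CStr`)
declared `#[repr(transparent)]` over `[u8]`; the method names mirror the
patch, but the bodies are illustrative only.

```rust
/// Hypothetical stand-in for the kernel's `CStr`; an assumption made for
/// this sketch, not the real type.
#[repr(transparent)]
pub struct MyCStr([u8]);

impl MyCStr {
    /// Creates a mutable `MyCStr` from bytes without checking them.
    ///
    /// # Safety
    ///
    /// `bytes` must end with a NUL byte and contain no interior NUL bytes.
    pub unsafe fn from_bytes_with_nul_unchecked_mut(bytes: &mut [u8]) -> &mut MyCStr {
        // The `as` cast converts one fat pointer type into the other; it
        // carries over the data address and the length metadata.
        let ptr = bytes as *mut [u8] as *mut MyCStr;
        // SAFETY: `MyCStr` is `#[repr(transparent)]` over `[u8]`, and the
        // caller guarantees that `bytes` is a valid NUL-terminated string.
        unsafe { &mut *ptr }
    }

    /// Returns the underlying bytes, including the trailing NUL.
    pub fn as_bytes_with_nul(&self) -> &[u8] {
        &self.0
    }

    /// Converts ASCII letters to upper case in place; NUL is unaffected.
    pub fn make_ascii_uppercase(&mut self) {
        self.0.make_ascii_uppercase();
    }
}

fn main() {
    let mut buf = *b"hello\0";
    // SAFETY: `buf` ends with a single NUL byte.
    let s = unsafe { MyCStr::from_bytes_with_nul_unchecked_mut(&mut buf) };
    s.make_ascii_uppercase();
    assert_eq!(s.as_bytes_with_nul(), &b"HELLO\0"[..]);
}
```

The cast and the dereference are kept on separate lines so that each
step can carry its own justification.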

Alice