Re: [PATCH v3] rust: str: add {make,to}_{upper,lower}case() to CString

From: Boqun Feng
Date: Wed Feb 14 2024 - 20:18:35 EST


On Wed, Feb 14, 2024 at 08:59:06PM +0100, Alice Ryhl wrote:
> On 2/14/24 20:27, Boqun Feng wrote:
> > On Wed, Feb 14, 2024 at 06:24:10PM +0100, Danilo Krummrich wrote:
> > > Add functions to convert a CString to upper- / lowercase, either
> > > in-place or by creating a copy of the original CString.
> > >
> > > Naming followes the one from the Rust stdlib, where functions starting
> > > with 'to' create a copy and functions starting with 'make' perform an
> > > in-place conversion.
> > >
> > > This is required by the Nova project (GSP only Rust successor of
> > > Nouveau) to convert stringified enum values (representing different GPU
> > > chipsets) to strings in order to generate the corresponding firmware
> > > paths. See also [1].
> > >
> > > [1] https://rust-for-linux.zulipchat.com/#narrow/stream/288089-General/topic/String.20manipulation.20in.20kernel.20Rust
> > >
> > > Signed-off-by: Danilo Krummrich <dakr@xxxxxxxxxx>
> > > ---
> > > Changes in V3:
> > > - add an `impl DerefMut for CString`, such that these functions can be defined
> > > for `CStr` as `&mut self` and still be called on a `CString`
> > > Changes in V2:
> > > - expand commit message mentioning the use case
> > > - expand function doc comments to match the ones from Rust's stdlib
> > > - rename to_* to make_* and add the actual to_* implementations
> > > ---
> > > rust/kernel/str.rs | 81 +++++++++++++++++++++++++++++++++++++++++++++-
> > > 1 file changed, 80 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/rust/kernel/str.rs b/rust/kernel/str.rs
> > > index 7d848b83add4..02d6e510b852 100644
> > > --- a/rust/kernel/str.rs
> > > +++ b/rust/kernel/str.rs
> > > @@ -5,7 +5,7 @@
> > > use alloc::alloc::AllocError;
> > > use alloc::vec::Vec;
> > > use core::fmt::{self, Write};
> > > -use core::ops::{self, Deref, Index};
> > > +use core::ops::{self, Deref, DerefMut, Index};
> > > use crate::{
> > > bindings,
> > > @@ -143,6 +143,19 @@ pub const fn from_bytes_with_nul(bytes: &[u8]) -> Result<&Self, CStrConvertError
> > > unsafe { core::mem::transmute(bytes) }
> > > }
> > > + /// Creates a mutable [`CStr`] from a `[u8]` without performing any
> > > + /// additional checks.
> > > + ///
> > > + /// # Safety
> > > + ///
> > > + /// `bytes` *must* end with a `NUL` byte, and should only have a single
> > > + /// `NUL` byte (or the string will be truncated).
> > > + #[inline]
> > > + pub const unsafe fn from_bytes_with_nul_unchecked_mut(bytes: &mut [u8]) -> &mut CStr {
> > > + // SAFETY: Properties of `bytes` guaranteed by the safety precondition.
> > > + unsafe { &mut *(bytes as *mut [u8] as *mut CStr) }
> >
> > First `.cast::<[u8]>().cast::<CStr>()` is preferred than `as`. Besides,
> > I think the dereference (or reborrow) is only safe if `CStr` is
> > `#[repr(transparent)]. I.e.
> >
> > #[repr(transparent)]
> > pub struct CStr([u8]);
> >
> > with that you can implement the function as (you can still use `cast()`
> > implementation, but I sometimes find `transmute` is more simple).
> >
> > pub const unsafe fn from_bytes_with_nul_unchecked_mut(bytes: &mut [u8]) -> &mut CStr {
> > // SAFETY: `CStr` is transparent to `[u8]`, so the transmute is
> > // safe to do, and per the function safety requirement, `bytes`
> > // is a valid `CStr`.
> > unsafe { core::mem::transmute(bytes) }
> > }
> >
> > but this is just my thought, better wait for others' feedback as well.
>
> Transmuting references is generally frowned upon. It's better to use a
> pointer cast.
>

Ok, but honestly, I don't think the pointer casting is better ;-) What
wants to be done here is simply converting a `&mut [u8]` to `&mut CStr`,
adding two levels of pointer casting is kinda noise. (Also
`from_bytes_with_nul` uses `transmute` as well).

> As for .cast() vs the `as` operator, I'm not sure you can use .cast() in
> this case since the pointers are unsized. So you might have to use `as`
> instead.
>

You're right, that's a bit unfortunate..

Regards,
Boqun

> Alice