Re: [PATCH v3] rust: str: add {make,to}_{upper,lower}case() to CString

From: Boqun Feng
Date: Wed Feb 14 2024 - 14:29:47 EST


On Wed, Feb 14, 2024 at 06:24:10PM +0100, Danilo Krummrich wrote:
> Add functions to convert a CString to upper- / lowercase, either
> in-place or by creating a copy of the original CString.
>
> Naming followes the one from the Rust stdlib, where functions starting
> with 'to' create a copy and functions starting with 'make' perform an
> in-place conversion.
>
> This is required by the Nova project (GSP only Rust successor of
> Nouveau) to convert stringified enum values (representing different GPU
> chipsets) to strings in order to generate the corresponding firmware
> paths. See also [1].
>
> [1] https://rust-for-linux.zulipchat.com/#narrow/stream/288089-General/topic/String.20manipulation.20in.20kernel.20Rust
>
> Signed-off-by: Danilo Krummrich <dakr@xxxxxxxxxx>
> ---
> Changes in V3:
> - add an `impl DerefMut for CString`, such that these functions can be defined
> for `CStr` as `&mut self` and still be called on a `CString`
> Changes in V2:
> - expand commit message mentioning the use case
> - expand function doc comments to match the ones from Rust's stdlib
> - rename to_* to make_* and add the actual to_* implementations
> ---
> rust/kernel/str.rs | 81 +++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 80 insertions(+), 1 deletion(-)
>
> diff --git a/rust/kernel/str.rs b/rust/kernel/str.rs
> index 7d848b83add4..02d6e510b852 100644
> --- a/rust/kernel/str.rs
> +++ b/rust/kernel/str.rs
> @@ -5,7 +5,7 @@
> use alloc::alloc::AllocError;
> use alloc::vec::Vec;
> use core::fmt::{self, Write};
> -use core::ops::{self, Deref, Index};
> +use core::ops::{self, Deref, DerefMut, Index};
>
> use crate::{
> bindings,
> @@ -143,6 +143,19 @@ pub const fn from_bytes_with_nul(bytes: &[u8]) -> Result<&Self, CStrConvertError
> unsafe { core::mem::transmute(bytes) }
> }
>
> + /// Creates a mutable [`CStr`] from a `[u8]` without performing any
> + /// additional checks.
> + ///
> + /// # Safety
> + ///
> + /// `bytes` *must* end with a `NUL` byte, and should only have a single
> + /// `NUL` byte (or the string will be truncated).
> + #[inline]
> + pub const unsafe fn from_bytes_with_nul_unchecked_mut(bytes: &mut [u8]) -> &mut CStr {
> + // SAFETY: Properties of `bytes` guaranteed by the safety precondition.
> + unsafe { &mut *(bytes as *mut [u8] as *mut CStr) }

First `.cast::<[u8]>().cast::<CStr>()` is preferred than `as`. Besides,
I think the dereference (or reborrow) is only safe if `CStr` is
`#[repr(transparent)]. I.e.

#[repr(transparent)]
pub struct CStr([u8]);

with that you can implement the function as (you can still use `cast()`
implementation, but I sometimes find `transmute` is more simple).

pub const unsafe fn from_bytes_with_nul_unchecked_mut(bytes: &mut [u8]) -> &mut CStr {
// SAFETY: `CStr` is transparent to `[u8]`, so the transmute is
// safe to do, and per the function safety requirement, `bytes`
// is a valid `CStr`.
unsafe { core::mem::transmute(bytes) }
}

but this is just my thought, better wait for others' feedback as well.

Regards,
Boqun