Re: [PATCH v3] rust: str: add {make,to}_{upper,lower}case() to CString

From: Alice Ryhl
Date: Wed Feb 14 2024 - 14:59:40 EST


On 2/14/24 20:27, Boqun Feng wrote:
On Wed, Feb 14, 2024 at 06:24:10PM +0100, Danilo Krummrich wrote:
Add functions to convert a CString to upper- / lowercase, either
in-place or by creating a copy of the original CString.

Naming followes the one from the Rust stdlib, where functions starting
with 'to' create a copy and functions starting with 'make' perform an
in-place conversion.

This is required by the Nova project (GSP only Rust successor of
Nouveau) to convert stringified enum values (representing different GPU
chipsets) to strings in order to generate the corresponding firmware
paths. See also [1].

[1] https://rust-for-linux.zulipchat.com/#narrow/stream/288089-General/topic/String.20manipulation.20in.20kernel.20Rust

Signed-off-by: Danilo Krummrich <dakr@xxxxxxxxxx>
---
Changes in V3:
- add an `impl DerefMut for CString`, such that these functions can be defined
for `CStr` as `&mut self` and still be called on a `CString`
Changes in V2:
- expand commit message mentioning the use case
- expand function doc comments to match the ones from Rust's stdlib
- rename to_* to make_* and add the actual to_* implementations
---
rust/kernel/str.rs | 81 +++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 80 insertions(+), 1 deletion(-)

diff --git a/rust/kernel/str.rs b/rust/kernel/str.rs
index 7d848b83add4..02d6e510b852 100644
--- a/rust/kernel/str.rs
+++ b/rust/kernel/str.rs
@@ -5,7 +5,7 @@
use alloc::alloc::AllocError;
use alloc::vec::Vec;
use core::fmt::{self, Write};
-use core::ops::{self, Deref, Index};
+use core::ops::{self, Deref, DerefMut, Index};
use crate::{
bindings,
@@ -143,6 +143,19 @@ pub const fn from_bytes_with_nul(bytes: &[u8]) -> Result<&Self, CStrConvertError
unsafe { core::mem::transmute(bytes) }
}
+ /// Creates a mutable [`CStr`] from a `[u8]` without performing any
+ /// additional checks.
+ ///
+ /// # Safety
+ ///
+ /// `bytes` *must* end with a `NUL` byte, and should only have a single
+ /// `NUL` byte (or the string will be truncated).
+ #[inline]
+ pub const unsafe fn from_bytes_with_nul_unchecked_mut(bytes: &mut [u8]) -> &mut CStr {
+ // SAFETY: Properties of `bytes` guaranteed by the safety precondition.
+ unsafe { &mut *(bytes as *mut [u8] as *mut CStr) }

First `.cast::<[u8]>().cast::<CStr>()` is preferred than `as`. Besides,
I think the dereference (or reborrow) is only safe if `CStr` is
`#[repr(transparent)]. I.e.

#[repr(transparent)]
pub struct CStr([u8]);

with that you can implement the function as (you can still use `cast()`
implementation, but I sometimes find `transmute` is more simple).

pub const unsafe fn from_bytes_with_nul_unchecked_mut(bytes: &mut [u8]) -> &mut CStr {
// SAFETY: `CStr` is transparent to `[u8]`, so the transmute is
// safe to do, and per the function safety requirement, `bytes`
// is a valid `CStr`.
unsafe { core::mem::transmute(bytes) }
}

but this is just my thought, better wait for others' feedback as well.

Transmuting references is generally frowned upon. It's better to use a pointer cast.

As for .cast() vs the `as` operator, I'm not sure you can use .cast() in this case since the pointers are unsized. So you might have to use `as` instead.

Alice