Re: [PATCH v2 2/2] rust: arc: remove `ArcBorrow` in favour of `WithRef`

From: Benno Lossin
Date: Tue Sep 26 2023 - 13:16:19 EST


On 26.09.23 18:35, Boqun Feng wrote:
> On Tue, Sep 26, 2023 at 05:41:17PM +0200, Alice Ryhl wrote:
>> On Tue, Sep 26, 2023 at 5:24 PM Boqun Feng <boqun.feng@xxxxxxxxx> wrote:
>>>
>>> On Tue, Sep 26, 2023 at 04:26:59PM +0800, Gary Guo wrote:
>>>> On Mon, 25 Sep 2023 22:26:56 +0000
>>>> Benno Lossin <benno.lossin@xxxxxxxxx> wrote:
>>>>
>>> [...]
>>>>>
>>>>> The pointer was originally derived by a call to `into_raw`:
>>>>> ```
>>>>> pub fn into_raw(self) -> *const T {
>>>>> let ptr = self.ptr.as_ptr();
>>>>> core::mem::forget(self);
>>>>> // SAFETY: The pointer is valid.
>>>>> unsafe { core::ptr::addr_of!((*ptr).data) }
>>>>> }
>>>>> ```
>>>>> So in this function the origin (also the origin of the provenance)
>>>>> of the pointer is `ptr` which is of type `NonNull<WithRef<T>>`.
>>>>> Raw pointers do not lose this provenance information when you cast
>>>>> it and when using `addr_of`/`addr_of_mut`. So provenance is something
>>>>> that is not really represented in the type system for raw pointers.
>>>>>
>>>>> When doing a round trip through a reference though, the provenance is
>>>>> newly assigned and thus would only be valid for a `T`:
>>>>> ```
>>>>> let raw = arc.into_raw();
>>>>> let reference = unsafe { &*raw };
>>>>> let raw: *const T = reference;
>>>>> let arc = unsafe { Arc::from_raw(raw) };
>>>>> ```
>>>>> Miri would complain about the above code.
>>>>>
>>>>
>>>> One thing we can do is to opt from strict provenance, so:
>>>>
>>>
>>> A few questions about strict provenance:
>>>
>>>> ```
>>>> let raw = arc.into_raw();
>>>> let _ = raw as usize; // expose the provenance of raw
>>>
>>> Should this be a expose_addr()?
>>
>> Pointer to integer cast is equivalent to expose_addr.
>>
>>>> let reference = unsafe { &*raw };
>>>> let raw = reference as *const T as usize as *const T;
>>>
>>> and this is a from_exposed_addr{_mut}(), right?
>>
>> Integer to pointer cast is equivalent to from_exposed_addr.
>>
>
> Got it, thanks!
>
>>>> let arc = unsafe { Arc::from_raw(raw) };
>>>> ```
>>>>
>>>
>>> One step back, If we were to use strict provenance API (i.e.
>>> expose_addr()/from_exposed_addr()), we could use it to "fix" the
>>> original problem? By:
>>>
>>> * expose_addr() in as_with_ref()
>>> * from_exposed_addr() in `impl From<&WithRef<T>> for Arc`
>>>
>>> right?
>>>
>>> More steps back, is the original issue only a real issue under strict
>>> provenance rules? Don't make me wrong, I like the ideas behind strict
>>> provenance, I just want to check, if we don't enable strict provenance
>>> (as a matter of fact, we don't do it today),
>>
>> Outside of miri, strict provenance is not really something you enable.
>> It's a set of rules that are stricter than the real rules, that are
>> designed such that when you follow them, your code will be correct
>> under any conceivable memory model we might end up with. They will
>> never be the rules that the compiler actually uses.
>>
>> I think by "opt out from strict provenance", Gary just meant "use
>> int2ptr and ptr2int casts to reset the provenance".
>>
>>> will the original issue found by Alice be a UB?
>>
>> Yes, it's UB under any ruleset that exists out there. There's no flag
>> to turn it off.
>>
>>> Or is there a way to disable Miri's check on
>>> strict provenance? IIUC, the cause of the original issue is that "you
>>> cannot reborrow a pointer derived from a `&` to get a `&mut`, even when
>>> there is no other alias to the same object". Maybe I'm still missing
>>> something, but without strict provenance, is this a problem? Or is there
>>> a provenance model of Rust without strict provenance?
>>
>> It's a problem under all of the memory models. The rule being violated
>> is exactly the same rule as the one behind this paragraph:
>>
>>> Transmuting an & to &mut is Undefined Behavior. While certain usages may appear safe, note that the Rust optimizer is free to assume that a shared reference won't change through its lifetime and thus such transmutation will run afoul of those assumptions. So:
>>>
>>> Transmuting an & to &mut is always Undefined Behavior.
>>> No you can't do it.
>>> No you're not special.
>> From: https://doc.rust-lang.org/nomicon/transmutes.html
>
> But here the difference it that we only derive a `*mut` from a `&`,
> rather than transmute to a `&mut`, right? We only use `&` to get a
> pointer value (a usize), so I don't think that rule applies here? Or in
> other words, does the following implemenation look good to you?
>
> impl<T: ?Sized> Arc<T> {
> pub fn as_with_ref(&self) -> &WithRef<T> {
> // expose
> let _ = self.ptr.as_ptr() as usize;
> unsafe { self.ptr.as_ref() }
> }
> }
>
> impl<T: ?Sized> From<&WithRef<T>> for Arc<T> {
> fn from(b: &WithRef<T>) -> Self {
> // from exposed
> let ptr = unsafe { NonNull::new_unchecked(b as *const _ as usize as *mut _) };
> // SAFETY: The existence of `b` guarantees that the refcount is non-zero. `ManuallyDrop`
> // guarantees that `drop` isn't called, so it's ok that the temporary `Arc` doesn't own the
> // increment.
> ManuallyDrop::new(unsafe { Arc::from_inner(ptr) })
> .deref()
> .clone()
> }
> }
>
>
> An equivalent code snippet is as below (in case anyone wants to try it
> in miri):
> ```rust
> let raw = Box::into_raw(arc);
>
> // as_with_ref()
> let _ = raw as usize;
> let reference = unsafe { &*raw };
>
> // from()
> let raw: *mut T = reference as *const _ as usize as *mut _ ;
>
> // drop()
> let arc = unsafe { Box::from_raw(raw) };
> ```

I don't understand why we are trying to use ptr2int to fix this.
Simply wrapping the `T` field inside `WithRef` with `UnsafeCell`
should be enough.

--
Cheers,
Benno