Re: [PATCH v2 2/2] rust: arc: remove `ArcBorrow` in favour of `WithRef`

From: Benno Lossin
Date: Mon Sep 25 2023 - 18:27:16 EST


On 26.09.23 00:02, Boqun Feng wrote:
> On Mon, Sep 25, 2023 at 11:58:46PM +0200, Alice Ryhl wrote:
>> On 9/25/23 23:55, Boqun Feng wrote:
>>> On Mon, Sep 25, 2023 at 09:03:52PM +0000, Benno Lossin wrote:
>>>> On 25.09.23 20:51, Boqun Feng wrote:
>>>>> On Mon, Sep 25, 2023 at 05:00:45PM +0000, Benno Lossin wrote:
>>>>>> On 25.09.23 18:16, Boqun Feng wrote:
>>>>>>> On Mon, Sep 25, 2023 at 03:07:44PM +0000, Benno Lossin wrote:
>>>>>>>> ```rust
>>>>>>>> struct MutatingDrop {
>>>>>>>> value: i32,
>>>>>>>> }
>>>>>>>>
>>>>>>>> impl Drop for MutatingDrop {
>>>>>>>> fn drop(&mut self) {
>>>>>>>> self.value = 0;
>>>>>>>> }
>>>>>>>> }
>>>>>>>>
>>>>>>>> let arc = Arc::new(MutatingDrop { value: 42 });
>>>>>>>> let wr = arc.as_with_ref(); // this creates a shared `&` reference to the MutatingDrop
>>>>>>>> let arc2: Arc<MutatingDrop> = wr.into(); // increments the reference count to 2
>>>>>>>
>>>>>>> More precisely, here we did a
>>>>>>>
>>>>>>> &WithRef<_> -> NonNull<WithRef<_>>
>>>>>>>
>>>>>>> conversion, and later on, we may use the `NonNull<WithRef<_>>` in
>>>>>>> `drop` to get a `Box<WithRef<_>>`.
>>>>>>
>>>>>> Indeed.
>>>>>>
>>>>>
>>>>> Can we workaround this issue by (ab)using the `UnsafeCell` inside
>>>>> `WithRef<T>`?
>>>>>
>>>>> impl<T: ?Sized> From<&WithRef<T>> for Arc<T> {
>>>>> fn from(b: &WithRef<T>) -> Self {
>>>>> // SAFETY: The existence of the references proves that
>>>>> // `b.refcount.get()` is a valid pointer to `WithRef<T>`.
>>>>> let ptr = unsafe { NonNull::new_unchecked(b.refcount.get().cast::<WithRef<T>>()) };
>>>>>
>>>>> // SAFETY: see the SAFETY above `let ptr = ..` line.
>>>>> ManuallyDrop::new(unsafe { Arc::from_inner(ptr) })
>>>>> .deref()
>>>>> .clone()
>>>>> }
>>>>> }
>>>>>
>>>>> This way, the raw pointer in the new Arc no longer derives from the
>>>>> reference of `WithRef<T>`.
>>>>
>>>> No, the code above only obtains a pointer that has provenance valid
>>>> for a `bindings::refcount_t` (or type with the same layout, such as
>>>> `Opaque<bindings::refcount_t>`). But not the whole `WithRef<T>`, so accessing
>>>> it by reading/writing will still be UB.
>>>>
>>>
>>> Hmm... but we do the similar thing in `Arc::from_raw()`, right?
>>>
>>> pub unsafe fn from_raw(ptr: *const T) -> Self {
>>> ..
>>> }
>>>
>>> , what we have is a pointer to T, and we construct a pointer to
>>> `ArcInner<T>/WithRef<T>`, in that function. Because the `sub` on pointer
>>> gets away from provenance? If so, we can also do a sub(0) in the above
>>> code.
>>
>> Not sure what you mean. Operations on raw pointers leave provenance
>> unchanged.
>
> Let's look at the function from_raw(), the input is a pointer to T,
> right? So you only have the provenance to T, but in that function, the
> pointer is casted to a pointer to WithRef<T>/ArcInner<T>, that means you
> have the provenance to the whole WithRef<T>/ArcInner<T>, right? My
> question is: why isn't that a UB?

The pointer was originally derived by a call to `into_raw`:
```
pub fn into_raw(self) -> *const T {
    let ptr = self.ptr.as_ptr();
    core::mem::forget(self);
    // SAFETY: The pointer is valid.
    unsafe { core::ptr::addr_of!((*ptr).data) }
}
```
So in this function the origin of the pointer (and therefore also the
origin of its provenance) is `self.ptr`, which is of type
`NonNull<WithRef<T>>`. Raw pointers do not lose this provenance when you
cast them or when you project to a field with `addr_of!`/`addr_of_mut!`.
Provenance is simply not represented in the type system for raw pointers.

When doing a round trip through a reference, though, the provenance is
newly assigned and is thus only valid for a `T`:
```
let raw = arc.into_raw();
let reference = unsafe { &*raw };
let raw: *const T = reference;
let arc = unsafe { Arc::from_raw(raw) };
```
Miri would complain about the above code.
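
By contrast, a round trip that stays on raw pointers keeps the provenance
of the whole allocation. Here is a minimal user-space sketch of that, not
the kernel code: `Inner` is a hypothetical stand-in for `WithRef<T>`, `Box`
is used instead of `Arc` to keep it short, and it assumes a toolchain that
provides `core::mem::offset_of!`.

```rust
use core::mem::offset_of;
use core::ptr::addr_of;

// Hypothetical stand-in for `WithRef<T>`/`ArcInner<T>`; not the kernel type.
#[repr(C)]
struct Inner {
    refcount: usize,
    data: i32,
}

/// Same idea as `into_raw`: the field pointer is derived from a raw pointer
/// to the whole `Inner`, so it keeps provenance for the whole allocation.
fn into_raw(b: Box<Inner>) -> *const i32 {
    let ptr: *mut Inner = Box::into_raw(b);
    // SAFETY: `ptr` came from `Box::into_raw`, so it points to a live `Inner`.
    unsafe { addr_of!((*ptr).data) }
}

/// Same idea as `from_raw`: step back from the field to the container.
///
/// # Safety
///
/// `data` must come from `into_raw` above and must not have been turned
/// into a reference and back in between.
unsafe fn from_raw(data: *const i32) -> Box<Inner> {
    // SAFETY: By the function's requirements, `data` points at the `data`
    // field of a live `Inner` and has provenance for that whole `Inner`.
    unsafe {
        let inner = data.cast::<u8>().sub(offset_of!(Inner, data)).cast::<Inner>();
        Box::from_raw(inner as *mut Inner)
    }
}

/// Raw-pointer-only round trip; no reference to `data` is ever created.
fn round_trip() -> (usize, i32) {
    let raw = into_raw(Box::new(Inner { refcount: 1, data: 42 }));
    // SAFETY: `raw` comes straight from `into_raw`.
    let inner = unsafe { from_raw(raw) };
    (inner.refcount, inner.data)
}
```

Inserting `let r = unsafe { &*raw }; let raw: *const i32 = r;` between the
two calls in `round_trip` would give you the reference round trip from
above, where the pointer is only valid for the `i32`.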

--
Cheers,
Benno