Re: [RFC] memcpy_nocache() and memcpy_writethrough()

From: Dan Williams
Date: Tue Jan 03 2017 - 21:14:51 EST


On Tue, Jan 3, 2017 at 5:59 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Tue, Jan 03, 2017 at 05:38:54PM -0800, Dan Williams wrote:
>> > 1) memcpy_to_pmem() seems to rely upon the __copy_from_user_nocache()
>> > having only used movnt; it does not attempt clwb at all.
>>
>> Yes, and there was a fix a while back to make sure it always used
>> movnt so clwb after the fact is not required:
>>
>> a82eee742452 x86/uaccess/64: Handle the caching of 4-byte nocache
>> copies properly in __copy_user_nocache()
>>
>> > 2) __copy_from_user_nocache() for short copies does not use movnt at all.
>> > In that case neither sfence nor clwb is issued.
>>
>> For the 32bit case, yes, but the pmem driver should warn about this
>> when it checks platform persistent memory capabilities (i.e. x86 32bit
>> not supported). Ugh, we may have lost that warning for this specific
>> case recently, I'll go double check and fix it up.
>>
>> > 3) it uses movnt only for part of copying in case of misaligned copy;
>> > No clwb is issued, but sfence *is* - at the very end in 64bit case,
>> > between movnt and copying the tail - in 32bit one. Incidentally,
>> > while 64bit case takes care to align the destination for movnt part,
>> > 32bit one does not.
>> >
>> > How much of the above is broken and what do the callers rely upon?
>>
>> 32bit issues are known, but 64bit path is ok since that fix above.
>
> Bollocks. That fix above does *NOT* eliminate all cached stores. Just look
> at the damn function - it still does cached stores until the target is
> aligned and it does the same for tail when end of destination is not aligned.
> Right there in arch/x86/lib/copy_user_64.S.

No, it does not eliminate all cached stores, but the cases where we use
it have naturally aligned targets.
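
To make the head/tail point concrete, here is a rough userspace model
of how a copy splits into cached and movnt portions. It is only a
sketch: it assumes an 8-byte movnt granularity, ignores the 4-byte
case that the fix above handles, and the names (nt_split,
nt_split_copy, nt_copy_is_fully_uncached) are made up for
illustration. It is not the code in arch/x86/lib/copy_user_64.S.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NT_ALIGN 8u	/* assumed movnt granularity for this sketch */

/* Hypothetical breakdown of one copy into cached and movnt parts. */
struct nt_split {
	size_t head;	/* cached stores until dst is aligned */
	size_t body;	/* bytes that can go through movnt */
	size_t tail;	/* cached stores after the last aligned word */
};

static struct nt_split nt_split_copy(uintptr_t dst, size_t len)
{
	struct nt_split s = { 0, 0, 0 };
	size_t misalign = dst & (NT_ALIGN - 1);
	size_t rest = len;

	if (misalign) {
		/* bytes needed to reach the next aligned address */
		s.head = NT_ALIGN - misalign;
		if (s.head > rest)
			s.head = rest;
	}
	rest -= s.head;
	s.tail = rest & (NT_ALIGN - 1);
	s.body = rest - s.tail;

	/* the three parts must always account for every byte */
	assert(s.head + s.body + s.tail == len);
	return s;
}

/* Nonzero only when no byte of the copy would use a cached store. */
static int nt_copy_is_fully_uncached(uintptr_t dst, size_t len)
{
	struct nt_split s = nt_split_copy(dst, len);

	return s.head == 0 && s.tail == 0;
}
```

Under this model a naturally aligned target with a length that is a
multiple of 8 has head == 0 and tail == 0, which is the property the
current callers depend on.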

Yes, it is terrible to then wrap it in a memcpy_to_pmem() wrapper
which does not document these alignment constraints.
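
One possible shape for a wrapper that documents and enforces the
constraint, as a userspace sketch only. pmem_copy_checked() and
PMEM_COPY_ALIGN are hypothetical names, the 8-byte requirement is an
assumption, and plain memcpy() stands in for the movnt-based copy so
the example is self-contained.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PMEM_COPY_ALIGN 8u	/* assumed alignment requirement */

/*
 * pmem_copy_checked() - hypothetical wrapper, not an existing kernel API.
 *
 * @dst and @len must both be multiples of PMEM_COPY_ALIGN so that the
 * underlying copy never falls back to cached stores for a misaligned
 * head or tail.
 *
 * Returns 0 on success, or -EINVAL (copying nothing) if the constraint
 * is violated.
 */
static int pmem_copy_checked(void *dst, const void *src, size_t len)
{
	/* reject a misaligned destination or a length with a partial word */
	if (((uintptr_t)dst | len) & (PMEM_COPY_ALIGN - 1))
		return -EINVAL;

	memcpy(dst, src, len);	/* stand-in for the movnt-based copy */
	return 0;
}
```

Failing loudly on a misaligned request at least turns the implicit
assumption into something a caller cannot miss.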