Re: fuse uring / wake_up on the same core

From: Bernd Schubert
Date: Wed Apr 26 2023 - 18:43:30 EST


On 3/27/23 12:28, Peter Zijlstra wrote:
> On Fri, Mar 24, 2023 at 07:50:12PM +0000, Bernd Schubert wrote:
>
>> With the fuse-uring patches that part is basically solved - the waitq
>> that that thread is about is not used anymore. But as per above,
>> remaining is the waitq of the incoming workq (not mentioned in the
>> thread above). As I wrote, I have tried
>> __wake_up_sync((x), TASK_NORMAL), but it does not make a difference for
>> me - similar to Miklos' testing before. I have also tried struct
>> completion / swait - does not make a difference either.
>> I can see task_struct has wake_cpu, but there doesn't seem to be a good
>> interface to set it.
>>
>> Any ideas?
>
> Does the stuff from:
>
> https://lkml.kernel.org/r/20230308073201.3102738-1-avagin@xxxxxxxxxx

Thanks Peter, I have already replied in that thread - using
__wake_up_on_current_cpu() helps to avoid cpu migrations. One update
since my last mail in that thread (a few hours ago): more logging
reveals that I still see a few cpu switches, but far fewer than
before.
My issue now is that these patches are not enough. Contrary to
previous testing, forcefully disabling cpu migration with
migrate_disable() before wait_event_* in fuse's request_wait_answer()
and re-enabling it afterwards does not help either - my file-creating
process (bonnie++) still migrates to another cpu at some later point.
The only workaround I currently have is to set the userspace ring
thread that processes vfs/fuse events to SCHED_IDLE. In combination
with WF_CURRENT_CPU, performance then goes from ~2200 to ~9000 file
creates/s for a single thread in the latest branch (should be scalable).
That is very close to binding the bonnie++ process to a single core
(~9400 creates/s).
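
For reference, a minimal userspace sketch of the two settings compared
above: putting the calling (ring) thread into SCHED_IDLE, and pinning a
thread to one core. The function names ring_thread_set_sched_idle() and
pin_to_cpu() are made up for illustration and are not taken from the
fuse-uring code; only the sched_setscheduler() and sched_setaffinity()
calls are the standard Linux interfaces.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <string.h>

/*
 * Hypothetical helper: move the calling thread to SCHED_IDLE.
 * Returns 0 on success, -1 with errno set on failure.
 */
int ring_thread_set_sched_idle(void)
{
	struct sched_param param;

	memset(&param, 0, sizeof(param));
	param.sched_priority = 0;	/* must be 0 for SCHED_IDLE */

	/* On Linux, pid 0 refers to the calling thread. */
	return sched_setscheduler(0, SCHED_IDLE, &param);
}

/*
 * Hypothetical helper: restrict the calling thread to a single cpu,
 * comparable to binding bonnie++ to one core.
 * Returns 0 on success, -1 with errno set on failure.
 */
int pin_to_cpu(int cpu)
{
	cpu_set_t set;

	assert(cpu >= 0 && cpu < CPU_SETSIZE);

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);

	return sched_setaffinity(0, sizeof(set), &set);
}
```

The ring thread would call ring_thread_set_sched_idle() once at startup,
after it has been bound to its core. Switching back from SCHED_IDLE to
another policy may require privileges, depending on kernel version and
rlimits.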

Is there something available to mark ring threads as IO-processing
threads, so that the scheduler knows there is no need to migrate the
submitting thread away from them?

* application sends request -> forwards to ring and wakes ring -> wait
* ring wakes up (core bound) -> processes request -> sends completion ->
  wakes up application -> waits for next request
* application wakes up with request result

==> I don't understand why the application is moved to another cpu
at all, once the wake issue is eliminated.

I also see SCHED_IDLE only as a workaround, as it would likely have
side effects if anything else is running on the system and consuming
cpu while another process is doing IO.
Is there a way to trace where and why a process is migrated away?
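
What can be done from userspace alone is to detect that a migration
happened between two points in the program, by sampling sched_getcpu().
A rough sketch follows; the struct and function names are invented for
illustration. It only shows that the thread moved, not where in the
scheduler the decision was made or why.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical per-thread bookkeeping for observed cpu changes. */
struct migration_counter {
	int last_cpu;			/* last cpu seen, -1 if unknown */
	unsigned long migrations;	/* number of observed cpu changes */
};

/* Record one observation; negative values (failed lookups) are ignored. */
void migration_counter_observe(struct migration_counter *mc, int cpu)
{
	assert(mc != NULL);

	if (cpu < 0)
		return;

	/* Only count a change when a previous cpu is known. */
	if (mc->last_cpu >= 0 && cpu != mc->last_cpu)
		mc->migrations++;

	mc->last_cpu = cpu;
}

/* Start counting from the cpu the calling thread currently runs on. */
void migration_counter_init(struct migration_counter *mc)
{
	assert(mc != NULL);

	mc->last_cpu = sched_getcpu();
	mc->migrations = 0;
}

/* Sample the current cpu, e.g. after each completed request. */
void migration_counter_sample(struct migration_counter *mc)
{
	migration_counter_observe(mc, sched_getcpu());
}
```

Calling migration_counter_sample() after every file create gives a
lower bound on the number of migrations, since moves that happen and
revert between two samples are not seen.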


Thanks,
Bernd