Re: [PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

From: Oleg Nesterov
Date: Fri Jun 02 2023 - 14:00:22 EST

Next message: Alexander Duyck: "Re: [PATCH net-next v3 09/12] iavf: switch to Page Pool"
Previous message: Peter Xu: "Re: [PATCH v16 2/5] fs/proc/task_mmu: Implement IOCTL to get and optionally clear info about PTEs"
In reply to: Jason Wang: "Re: [PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression"
Next in thread: Linus Torvalds: "Re: [PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 06/02, Jason Wang wrote:
>
> On Thu, Jun 1, 2023 at 3:43 PM Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> >
> > and the final rewrite:
> >
> > if (work->node) {
> >         work_next = work->node->next;
> >         if (true)
> >                 clear_bit(&work->flags);
> > }
> >
> > so again, I do not see the load-store control dependency.
>
> This kind of optimization is suspicious. Especially considering it's
> the control expression of the loop but not a condition.

It is not about optimization.

> Looking at the assembly (x86):
>
> 0xffffffff81d46c5b <+75>: callq 0xffffffff81689ac0 <llist_reverse_order>
> 0xffffffff81d46c60 <+80>: mov %rax,%r15
> 0xffffffff81d46c63 <+83>: test %rax,%rax
> 0xffffffff81d46c66 <+86>: je 0xffffffff81d46c3a <vhost_worker+42>
> 0xffffffff81d46c68 <+88>: mov %r15,%rdi
> 0xffffffff81d46c6b <+91>: mov (%r15),%r15
> 0xffffffff81d46c6e <+94>: lock andb $0xfd,0x10(%rdi)
> 0xffffffff81d46c73 <+99>: movl $0x0,0x18(%rbx)
> 0xffffffff81d46c7a <+106>: mov 0x8(%rdi),%rax
> 0xffffffff81d46c7e <+110>: callq 0xffffffff821b39a0 <__x86_indirect_thunk_array>
> 0xffffffff81d46c83 <+115>: callq 0xffffffff821b4d10 <__SCT__cond_resched>
> ...
>
> I can see:
>
> 1) The code read node->next (+91) before clear_bit (+94)

The code does, but what about the CPU?

> 2) And then it uses a lock prefix to guarantee the execution order

As I said from the very beginning, this code is fine on x86 because
atomic ops are fully serialised on x86.
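
To make the concern concrete, here is a minimal userspace sketch using C11
atomics. It is not the vhost code: struct work, WORK_QUEUED (bit 1, matching
the andb $0xfd mask above), work_init(), take_next_relaxed(),
take_next_ordered() and requeue() are all hypothetical names made up for
illustration. The relaxed variant models a flag clear with no ordering, so on
a weakly ordered CPU the load of ->next may effectively happen after the flag
is cleared and the work has been requeued. The ordered variant uses a release
RMW as one possible way to keep the load before the store.

```c
#include <stdatomic.h>
#include <stddef.h>

#define WORK_QUEUED 0x2u   /* hypothetical flag: bit 1, i.e. ~0xfd */

struct node {
	_Atomic(struct node *) next;
};

struct work {
	struct node node;
	atomic_uint flags;
};

/* Illustration helper: set up a work item with a given successor. */
static inline void work_init(struct work *w, struct node *next, unsigned flags)
{
	atomic_init(&w->node.next, next);
	atomic_init(&w->flags, flags);
}

/*
 * Relaxed variant: nothing orders the load of ->next against the
 * flag clear, so the CPU (not only the compiler) may reorder them.
 */
static inline struct node *take_next_relaxed(struct work *w)
{
	struct node *next = atomic_load_explicit(&w->node.next,
						 memory_order_relaxed);
	atomic_fetch_and_explicit(&w->flags, ~WORK_QUEUED,
				  memory_order_relaxed);
	return next;
}

/*
 * Ordered variant: the release RMW keeps the earlier load of ->next
 * from being reordered after the flag clear.
 */
static inline struct node *take_next_ordered(struct work *w)
{
	struct node *next = atomic_load_explicit(&w->node.next,
						 memory_order_relaxed);
	atomic_fetch_and_explicit(&w->flags, ~WORK_QUEUED,
				  memory_order_release);
	return next;
}

/*
 * Queueing side: only the caller that sets WORK_QUEUED may rewrite
 * ->next. The acquire half pairs with the release above.
 * Returns 1 if this call queued the work, 0 if it was already queued.
 */
static inline int requeue(struct work *w, struct node *new_next)
{
	unsigned old = atomic_fetch_or_explicit(&w->flags, WORK_QUEUED,
						memory_order_acq_rel);
	if (old & WORK_QUEUED)
		return 0;
	atomic_store_explicit(&w->node.next, new_next, memory_order_relaxed);
	return 1;
}
```

Run single-threaded, both variants return the same pointer. The difference
only matters when requeue() runs concurrently on another CPU, which is
exactly the case the disassembly of one x86 build cannot tell us about.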

OK, we can't convince each other. I'll try to write another email when
I have time.

If this code is correct, then my understanding of memory barriers is even
worse than I think. I wouldn't be surprised, but I'd like to understand
what I have missed.

Oleg.
