Re: [PATCH v8 0/6] cgroup/cpuset: Add new cpuset partition type & empty effective cpus

From: Waiman Long
Date: Wed Nov 10 2021 - 13:30:40 EST



On 11/10/21 12:29, Marcelo Tosatti wrote:
On Wed, Nov 10, 2021 at 05:15:41PM +0100, Jan Kiszka wrote:
On 10.11.21 17:10, Marcelo Tosatti wrote:
On Wed, Nov 10, 2021 at 03:21:54PM +0000, Moessbauer, Felix wrote:

-----Original Message-----
From: Michal Koutný <mkoutny@xxxxxxxx>
Sent: Wednesday, November 10, 2021 2:57 PM
Subject: Re: [PATCH v8 0/6] cgroup/cpuset: Add new cpuset partition type &
empty effective cpus

Hello.

On Wed, Nov 10, 2021 at 12:13:57PM +0100, Felix Moessbauer
<felix.moessbauer@xxxxxxxxxxx> wrote:
However, I was not able to see any latency improvements when using
cpuset.cpus.partition=isolated.
Interesting. What was the baseline against which you compared it (isolcpus, no
cpusets,...)?
For this test, I just compared the two settings cpuset.cpus.partition=isolated and cpuset.cpus.partition=root.
I did not see a significant difference between them (though I know RT tuning depends on many factors).

The test was performed with jitterdebugger on CPUs 1-3 and the following
cmdline:
rcu_nocbs=1-4 nohz_full=1-4 irqaffinity=0,5-6,11 intel_pstate=disable
On the other cpus, stress-ng was executed to generate load.
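
As a reading aid for the command line above, the following is a minimal sketch that expands CPU lists such as "1-4" and "0,5-6,11" and checks that two lists do not overlap. The names parse_cpu_list, cpu_lists_overlap and MAX_CPUS are hypothetical, and the parser only assumes plain numbers and ranges; it is not the kernel's own list parser and ignores the kernel's extended list syntax.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CPUS 64 /* hypothetical upper bound for this sketch */

/*
 * Parse a CPU list such as "0,5-6,11" into mask[]. Returns true on success.
 * Only plain numbers and "a-b" ranges are handled.
 */
static bool parse_cpu_list(const char *list, bool mask[MAX_CPUS])
{
    const char *p = list;

    assert(list != NULL);
    memset(mask, 0, MAX_CPUS * sizeof(bool));

    while (*p) {
        char *end;
        long lo = strtol(p, &end, 10);
        long hi;

        if (end == p)
            return false;               /* no digits where a number is expected */
        hi = lo;
        if (*end == '-') {              /* range "lo-hi" */
            p = end + 1;
            hi = strtol(p, &end, 10);
            if (end == p)
                return false;
        }
        if (lo < 0 || hi < lo || hi >= MAX_CPUS)
            return false;
        for (long c = lo; c <= hi; c++)
            mask[c] = true;

        if (*end == ',')
            end++;                      /* skip the separator */
        else if (*end != '\0')
            return false;               /* unexpected character */
        p = end;
    }
    return true;
}

/* Return true if any CPU is set in both masks. */
static bool cpu_lists_overlap(const bool a[MAX_CPUS], const bool b[MAX_CPUS])
{
    for (int c = 0; c < MAX_CPUS; c++)
        if (a[c] && b[c])
            return true;
    return false;
}
```

With the values from the command line, the nohz_full/rcu_nocbs list "1-4" and the irqaffinity list "0,5-6,11" share no CPU, so default IRQ affinity stays off the CPUs used for the measurement.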
[...]
This requires cgroup.type=threaded on both cgroups and changes to the
application (threads have to be born in non-rt group and moved to rt-group).
But even with isolcpus the application would need to set the affinity of its threads to
the selected CPUs (cf. cgroup migration). Am I missing anything?
Yes, that's true. But there are two differences (given that you use isolcpus):
1. the application only has to set the affinity for rt threads.
Threads that do not explicitly set the affinity are automatically excluded from the isolated cores.
Even common rt test applications like jitterdebugger do not pin their non-rt threads.
2. Threads can be started on non-rt CPUs and then bound to a specific rt CPU.
This binding can be specified before thread creation, via the attributes passed to pthread_create.
That way, you can make sure that a thread never has a "forbidden" CPU in its affinity mask at any point in time.
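
Point 2 can be sketched roughly as follows, assuming glibc's pthread_attr_setaffinity_np() extension. The helper names spawn_pinned, report_affinity and first_allowed_cpu are hypothetical and exist only for illustration; this is not code from jitterdebugger or from the patch series.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Thread body: record the affinity mask the new thread actually has. */
static void *report_affinity(void *arg)
{
    cpu_set_t *seen = arg;

    CPU_ZERO(seen);
    pthread_getaffinity_np(pthread_self(), sizeof(*seen), seen);
    return NULL;
}

/*
 * Create a thread whose affinity mask contains only `cpu`. The mask is
 * handed over in the attribute object, so the start routine never runs
 * with a wider mask. Returns 0 on success or an errno-style error code.
 */
static int spawn_pinned(int cpu, cpu_set_t *seen)
{
    pthread_attr_t attr;
    pthread_t thread;
    cpu_set_t mask;
    int err;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);

    err = pthread_attr_init(&attr);
    if (err)
        return err;

    err = pthread_attr_setaffinity_np(&attr, sizeof(mask), &mask);
    if (!err)
        err = pthread_create(&thread, &attr, report_affinity, seen);
    pthread_attr_destroy(&attr);
    if (err)
        return err;

    return pthread_join(thread, NULL);
}

/* Lowest-numbered CPU the calling thread may run on, or -1 on error. */
static int first_allowed_cpu(void)
{
    cpu_set_t allowed;

    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &allowed))
            return cpu;
    return -1;
}
```

Calling spawn_pinned(first_allowed_cpu(), &seen) should leave exactly one CPU set in seen. In an isolcpus setup the CPU number would instead be one of the isolated CPUs.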

With cgroup2, you cannot guarantee the second aspect, as creating a thread and moving it to a cgroup are not one atomic operation.
Also - please correct me if I'm wrong - you first have to create a thread before you can move it into a group.
At creation time, you cannot set the final affinity mask (you create the thread in the non-rt group, and there the target CPU is not in cpuset.cpus).
Once you move the thread to the rt cgroup, it has a default mask and can therefore be executed on other rt cores.
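
The two-step sequence described above can be sketched as follows. This is only an illustration under stated assumptions: the cgroup directory is a hypothetical path to a threaded cgroup v2 directory, and the helper names move_thread_to_cgroup, pin_thread and move_then_pin are made up for this sketch. It relies on the cgroup v2 convention of writing a TID to cgroup.threads and on sched_setaffinity().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Step 1: move thread `tid` into the threaded cgroup at `cgroup_dir`
 * (hypothetical path) by writing the TID to its cgroup.threads file.
 * Returns 0 on success or -errno.
 */
static int move_thread_to_cgroup(const char *cgroup_dir, pid_t tid)
{
    char path[4096], buf[32];
    ssize_t written;
    int len, fd, ret = 0;

    len = snprintf(path, sizeof(path), "%s/cgroup.threads", cgroup_dir);
    if (len < 0 || (size_t)len >= sizeof(path))
        return -ENAMETOOLONG;

    fd = open(path, O_WRONLY);
    if (fd < 0)
        return -errno;

    len = snprintf(buf, sizeof(buf), "%d\n", (int)tid);
    written = write(fd, buf, (size_t)len);
    if (written < 0)
        ret = -errno;
    else if (written != len)
        ret = -EIO;

    close(fd);
    return ret;
}

/*
 * Step 2: narrow the thread's affinity to a single CPU.
 * A tid of 0 means the calling thread. Returns 0 on success or -errno.
 */
static int pin_thread(pid_t tid, int cpu)
{
    cpu_set_t mask;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    return sched_setaffinity(tid, sizeof(mask), &mask) ? -errno : 0;
}

/* Both steps in order; they are separate system calls, not one atomic step. */
static int move_then_pin(const char *cgroup_dir, pid_t tid, int cpu)
{
    int ret = move_thread_to_cgroup(cgroup_dir, tid);

    if (ret)
        return ret;
    /* Window: the thread is in the rt cgroup but not yet pinned to `cpu`. */
    return pin_thread(tid, cpu);
}
```

The comment inside move_then_pin marks the interval the text refers to: after the write to cgroup.threads and before sched_setaffinity() takes effect, the thread may be scheduled on any CPU of the rt group.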
man clone3:

CLONE_NEWCGROUP (since Linux 4.6)
Create the process in a new cgroup namespace. If this flag is not set, then (as with fork(2)) the
process is created in the same cgroup namespaces as the calling process.

For further information on cgroup namespaces, see cgroup_namespaces(7).

Only a privileged process (CAP_SYS_ADMIN) can employ CLONE_NEWCGROUP.

Is there pthread_attr_setcgroup_np()?

Jan
Don't know... Waiman?

I don't think there is such a libpthread call yet.

-Longman