Re: boot cgroup questions

From: Max Krasnyanskiy
Date: Thu Apr 10 2008 - 14:03:29 EST


The context here was that we were talking about a way to group irqs and assign them to the cpusets. I was proposing to just treat IRQs as tasks, and you were proposing to add some additional grouping. Replies inline below.

Paul Jackson wrote:
> Max K wrote:
>> cleaner imo than dealing with complex irq grouping schemes.
>
> What's this "complex irq grouping scheme" that you're referring to?
>
> If it's what I posted last week, with named sets of irqs, and each
> cpuset naming which set it belonged to, that seems to me to actually
> fit the usage pattern rather well.

I was just saying that cpusets already provide a nice grouping. After thinking about this some more, I still do not see a need to group IRQs before assigning them to cpusets. That's the complexity I was talking about.

> The jobs running in particular cpusets need only know the 'name' of
> the set of irqs it makes sense to send to its CPUs (the realtime
> irqs, a particular piece of hardware's irqs, the ordinary system
> irqs, the absolute minimum set of irqs, ...) and the system admin
> gets to specify, one time, which irq numbers are in which named
> set, or to change, later on, which set a particular irq is in, all
> without having to have detailed knowledge of the jobs that want
> particular irq sets directed to their CPUs.
>
> We tend to label whatever makes sense to us as "simple", and whatever
> doesn't seem necessary in our experience, or doesn't make sense, as
> "complex".
>
> Such labels are losing their meaning these days, other than to help
> others figure out what we favor, or disfavor.

I agree in general. In this particular case, additional grouping introduces even more hierarchy. It seems to me that
"irqN -> cpu1, cpu2, cpu3"
is a very simple, straightforward relationship. Whereas
"irqN -> groupX"
"groupX -> cpu1"
"groupX -> cpu2"
"groupX -> cpu3"
is not that straightforward.
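
To spell out the difference, here is a minimal Python sketch of the two lookups. It is an illustration only: the dictionaries and function names are made up for this example and do not correspond to any kernel or cpuset interface.

```python
# Hypothetical illustration only; none of these names exist in the kernel.

# Direct scheme: each irq maps straight to the CPUs allowed to handle it.
irq_to_cpus = {"irqN": {1, 2, 3}}

# Grouped scheme: each irq names a group, and the group names the CPUs.
irq_to_group = {"irqN": "groupX"}
group_to_cpus = {"groupX": {1, 2, 3}}


def cpus_direct(irq):
    """One lookup: irq -> cpus."""
    return irq_to_cpus[irq]


def cpus_grouped(irq):
    """Two lookups: irq -> group -> cpus."""
    return group_to_cpus[irq_to_group[irq]]


print(sorted(cpus_direct("irqN")))
print(sorted(cpus_grouped("irqN")))
```

Both calls resolve to the same CPUs; the grouped scheme just adds one more table to keep consistent.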

Anyway, I think it all boils down to compatibility with existing user-space apps. I still like the simple approach of treating irqs like tasks when it comes to assigning them to cpusets, which, as we discussed earlier, may in some cases require an extra level in the cpuset hierarchy. The question is: is that really such a big problem? If we make the in-kernel boot set optional, then by default all irqs will be in the root cpuset, which means people can still use /proc/irq/N/smp_affinity and manage irqs just like they do now. There are no compatibility issues in that case.
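
For reference, this is roughly what managing an irq through /proc/irq/N/smp_affinity looks like today. It is a minimal Python sketch, not a tool anyone ships: the helper names and the `proc_root` parameter are made up for illustration, it assumes a machine with fewer than 32 CPUs (so the mask fits in a single hex word), and writing the file needs root.

```python
def cpus_to_affinity_mask(cpus):
    """Return the hex bitmask string for a set of CPU numbers (bit n = CPU n)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def set_irq_affinity(irq, cpus, proc_root="/proc/irq"):
    """Write the mask to <proc_root>/<irq>/smp_affinity (needs root).

    proc_root is a hypothetical parameter so the function can be pointed
    at a scratch directory instead of the real /proc/irq.
    """
    path = f"{proc_root}/{irq}/smp_affinity"
    with open(path, "w") as f:
        f.write(cpus_to_affinity_mask(cpus))


# cpu1, cpu2, cpu3 -> bits 1, 2, 3 -> 0b1110
print(cpus_to_affinity_mask([1, 2, 3]))  # prints "e"
```

With all irqs left in the root cpuset, this per-irq mask would remain the only interface that existing apps need to know about.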

So do you think app compatibility is an issue in that case?
Also, isn't it likely that the apps will gradually adapt to handling multi-level cpusets anyway? I mean, you guys were talking about how wonderful and flexible cpusets are, but we cannot seem to use that flexibility because the apps are designed for a flat layout.

Max