Re: [PATCH] sched/cputime: Make IRQ time accounting configurable at boot time

From: Bart Van Assche
Date: Fri Jun 16 2023 - 10:41:31 EST


On 6/16/23 00:45, Peter Zijlstra wrote:
> On Thu, Jun 15, 2023 at 01:37:26PM -0700, Bart Van Assche wrote:
>> Some producers of Android devices want IRQ time accounting enabled while
>> others want IRQ time accounting disabled. Hence, make IRQ time accounting
>> configurable at boot time.
>
> Why would they want this disabled? IRQ time accounting avoids a number
> of issues under high irq/softirq pressure.
>
> Disabling this makes no sense.

This is why disabling IRQ time accounting makes a ton of sense to me:
* If disabling IRQ time accounting were not useful, there would be no
  kernel configuration option that controls whether it is enabled or
  disabled; it would simply be enabled all the time.
* If enabling IRQ time accounting were essential, all Linux
  distributors would enable it. In the x86 kernels I checked (Debian and
  openSUSE), IRQ time accounting is disabled.
* For x86 there is already a kernel parameter for disabling IRQ time
  accounting (tsc=noirqtime).
* Modern hardware, e.g. UFSHCI 4.0 controllers, supports sending the
  completion interrupt to the CPU core that submitted the I/O. With such
  hardware, IRQ overload (100% of CPU time spent in IRQ handlers and 0%
  outside IRQ handlers) is impossible because the submitter is slowed
  down by the completion interrupts.
* The performance overhead of CONFIG_IRQ_TIME_ACCOUNTING is
  unacceptable. A quick test in an x86 VM shows that enabling
  CONFIG_IRQ_TIME_ACCOUNTING reduces IOPS by roughly 10% (220K -> 200K).
  On an Android setup I measured an IOPS reduction of 40% (100K -> 60K)
  due to CONFIG_IRQ_TIME_ACCOUNTING.
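
For reference, the percentages in the last item can be checked with a
minimal sketch like the one below. The helper name relative_reduction is
made up for illustration; the IOPS figures are the ones quoted above, in
thousands. This only redoes the arithmetic and is not the benchmark
itself.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional drop from `before` to `after` (0.4 means 40%)."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return (before - after) / before


# IOPS without and with CONFIG_IRQ_TIME_ACCOUNTING, in thousands,
# as quoted in the message above.
measurements = {
    "x86 VM": (220, 200),
    "Android": (100, 60),
}

for name, (before, after) in measurements.items():
    print(f"{name}: {relative_reduction(before, after):.1%} fewer IOPS")
```

With these inputs the x86 VM drop works out to 20/220, i.e. a little
over 9%, and the Android drop to exactly 40%.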

Thanks,

Bart.