Re: [RFC PATCH] irqchip/gic-v3: Make gic_handle_irq() notrace

From: Julien Thierry
Date: Tue Apr 02 2019 - 10:00:11 EST


Hi,

On 29/03/2019 15:02, Zenghui Yu wrote:
> Hi Marc,
>
> On 2019/3/29 21:58, Marc Zyngier wrote:
>> Hi Zenghui,
>>
>> On 29/03/2019 13:23, Zenghui Yu wrote:
>>> Enable pseudo NMI together with function_graph tracer, will lead
>>> the system to a hang. This is easy to reproduce,
>>>
>>> ÂÂ 1) Set "irqchip.gicv3_pseudo_nmi=1" on the kernel command line
>>> ÂÂ 2) echo function_graph > /sys/kernel/debug/tracing/current_tracer
>>>
>>> This patch (RFC) set gic_handle_irq() as notrace and it seems works
>>> fine now. But I have no idea about what the issue is exactly, and
>>> you can regard this patch as a report then :)
>>>
>>> Can someone give a look at it and provide some explanations ?
>>>

Thanks for testing this and reporting the issue.

>>> Thanks!
>>>
>>> Cc: Julien Thierry <julien.thierry@xxxxxxx>
>>> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
>>> Signed-off-by: Zenghui Yu <yuzenghui@xxxxxxxxxx>
>>> ---
>>> Â drivers/irqchip/irq-gic-v3.c | 2 +-
>>> Â 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
>>> index 15e55d3..8d0c25f 100644
>>> --- a/drivers/irqchip/irq-gic-v3.c
>>> +++ b/drivers/irqchip/irq-gic-v3.c
>>> @@ -487,7 +487,7 @@ static inline void gic_handle_nmi(u32 irqnr,
>>> struct pt_regs *regs)
>>> ÂÂÂÂÂÂÂÂÂ gic_deactivate_unhandled(irqnr);
>>> Â }
>>> Â -static asmlinkage void __exception_irq_entry gic_handle_irq(struct
>>> pt_regs *regs)
>>> +static asmlinkage notrace void __exception_irq_entry
>>> gic_handle_irq(struct pt_regs *regs)
>>> Â {
>>> ÂÂÂÂÂ u32 irqnr;
>>> Â
>>
>>
>> That's interesting. Do you have any out of tree patch that actually
>> makes use of the pseudo-NMI feature? Without those patches, the
>> behaviour should stay unchanged.
>
> I am at commit 1a9df9e29c2afecf6e3089442d429b377279ca3c. No more
> patches, and this is the most confusing. Just out of curiosity, I
> wanted to run Julien's "Use NMI for perf interrupt" patch (posted
> on the mailing list), so I have to enable NMI first.
>
> That said, with
> Â 1) Select Kernel Feature -> Support for NMI-like interrupts
> Â 2) Set "irqchip.gicv3_pseudo_nmi=1" on the kernel command line
> Â 3) No pseudo-NMIs have been generated at all
> and this issue was hit.
>

I finally found out what happens.

When using interrupt priority masking, at the begining of
gic_handle_irq(), we are in this awkward state where we still have the I
bit set and PMR unmasked (this is because we cannot ack the interrupt
that was signaled if it has lower priority than the current priority mask).

To try and keep things simple, we decided that local_irq_*() would only
deal with PMR (when using priority masking). With one additional case
being that, if PSR.I is set when saving flags, we'd represent it in the
form of a value of PMR (i.e. GIC_PRIO_IRQOFF), so that irqs_disabled()
and such would still accurately state that no interrupt could happen in
those sections. The assumption was that in the few sections were we are
having the PSR.I set, we wouldn't care about having interrupts disabled
with PSR.I or PMR. And now that assumption appears to be wrong:

trace_graph_entry(), called at the begining of gic_handle_irq() when
enabling the tracer, does use local_irq_save/restore(). The save
operation returns flags GIC_PRIO_IRQOFF (because PSR.I is set when we
enter gic_handle_irq() ). The restore operation will then proceed to
mask PMR, once we get back to gic_handle_irq() we can't acknowledge the
interrupt we just received...

The function tracer does not appear to save/restore flags on function
entry (I saw save/restore operations in the stack tracer but for some
reason couldn't get them to trigger the issue).

To confirm this, I checked with the following diff (which is not a fix,
it is better to mark gic_handle_irq() as notrace if I don't find
something more suitable).

Cheers,

--
Julien Thierry

-->

diff --git a/kernel/trace/trace_functions_graph.c
b/kernel/trace/trace_functions_graph.c
index 69ebf3c..80b941e 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -133,6 +133,7 @@ int trace_graph_entry(struct ftrace_graph_ent *trace)
int ret;
int cpu;
int pc;
+ bool restore = false;

if (trace_recursion_test(TRACE_GRAPH_NOTRACE_BIT))
return 0;
@@ -172,7 +173,13 @@ int trace_graph_entry(struct ftrace_graph_ent *trace)
if (tracing_thresh)
return 1;

- local_irq_save(flags);
+ if (!irqs_disabled()) {
+ restore = true;
+ local_irq_save(flags);
+ } else {
+ local_save_flags(flags);
+ }
+
cpu = raw_smp_processor_id();
data = per_cpu_ptr(tr->trace_buffer.data, cpu);
disabled = atomic_inc_return(&data->disabled);
@@ -184,7 +191,8 @@ int trace_graph_entry(struct ftrace_graph_ent *trace)
}

atomic_dec(&data->disabled);
- local_irq_restore(flags);
+ if (restore)
+ local_irq_restore(flags);

return ret;
}