Re: [PATCH] [2/2] Don't complain about disabled irqs when the system has paniced

From: Andi Kleen
Date: Wed Sep 03 2008 - 05:45:27 EST


> Hmm... that does pull in significantly more code yes...
>
> It should be fixed. I guess a quickfix is to add a new call in
> kernel/smp.c for use by panic code.

Or just a new vector. I'll take a look later.
>
> > an IPI?
>
> Oh. I thought the actual function call structure itself should be the
> nice cleanup part that everyone would like...

I fail to see how much more complex code is a cleanup to be honest.

> > And are there really benchmarks that show the
> > queueing does actually improve something significantly?
>
> For the call_function_mask case, I did see a benefit of queueing when
> testing vmap scalability on a bigish system, yes. I had since done

Does that improve anything real-world?

> For call_function_single, queueing could really help with high interrupt
> loads like the block completion migration that Jens has been looking at.

But does it or not?

> In that case, queueing will act like interrupt mitigation when the target
> CPU becomes saturated so it will help keep performance from degrading,
> and it also allows multiple source CPUs to target a single destination
> without losing too much scalability.
>
> Also there was some straight-line benefit of queueing due to the source
> CPU not having to wait for an ack from the destination before continuing.
> Basically that would previously put a hard upper limit of remote function
> call to the interrupt latency.
>
> Benchmarks for "real" things unfortunately not easy to come by yet,
> simply because it was so crap before that it was unusable for anything
> remotely performance oriented. After the rewrite, I hope to see some

At least the "industry standard benchmark" Intel spends a lot of time
one used IPIs and iirc didn't run into too big issues. It was far
less a problem than the endless sl[au]b regressions at least.


> It was certianly tried. It's not a simple problem to do this with any
> reasonable scalability, however.

Well just not having a big lock is a big step forward. But that
doesn't need that much code, it's ~10-20 lines of new code
as you can see in the scalable tlb flush.

The queueing seems to add all the complexity and so far I haven't
seen much evidence that the queueing actually helps enough
to pay for its costs in complexity and maintenance overhead
(and bugginess as the panic case and the earlier crash issue shows)

> > > It is reasonable I think, but I don't like testing symbolic constants
> > > with inequalities like in your patch 2/2. Can you just make it
> > > system_state != SYSTEM_PANIC ?
> >
> > Well I like it.
>
> How is it better than system_state != SYSTEM_PANIC? It breaks if
> SYSTEM_PANIC gets something in front of it, and even when it is

No it will not break in that case.

> correct it makes you double check the definition of SYSTEM_PANIC
> to see what it does (is there any state above SYSTEM_PANIC?)

The system states are already ordered and it would make
a lot of sense to keep it that way.

Anyways the code will go anyways I guess because we have
established that it's not enough to fix panic.

-Andi

--
ak@xxxxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/