Re: [PATCH 1/6] x86-mce: Modify CMCI poll interval to adjust for small check_interval values.

From: Havard Skinnemoen
Date: Fri Jul 11 2014 - 17:05:48 EST


On Fri, Jul 11, 2014 at 1:36 PM, Borislav Petkov <bp@xxxxxxxxx> wrote:
> On Fri, Jul 11, 2014 at 11:56:11AM -0700, Havard Skinnemoen wrote:
>> > Basically the scheme becomes the following:
>> >
>> > * We switch to polling if we detect a second CMCI under an interval X
>> > * We poll Y times, each polling with a duration Z.
>> > * If during those Y*Z msec of polling, we've encountered errors, we
>> > enlarge the polling interval to additional Y*Z msec.
>> >
>> >
>> > check_interval will be capped on the low end to something bigger than
>> > the polling duration Y*Z and only the storm detection code will be
>> > allowed to go to lower intervals and switch to polling.
>> >
>> > At least something like that. In general, I'd like to make it more
>> > robust for every system without the need for user interaction, i.e.
>> > adjusting check_interval and where it just works.
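
For reference, the X/Y/Z scheme quoted above can be sketched as a small
userspace simulation. This is only an illustration under stated
assumptions, not kernel code: the thread leaves X, Y and Z open, so the
constants below are made-up placeholders, and every name (CmciState,
on_cmci, on_poll, clamp_check_interval, MARGIN_MS) is hypothetical. It
also uses milliseconds throughout for simplicity.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder values only; the thread does not fix any of these.
STORM_WINDOW_MS = 1000   # X: two CMCIs closer together than this mean "storm"
POLL_COUNT = 5           # Y: number of polls in one polling period
POLL_INTERVAL_MS = 200   # Z: time between two polls
MARGIN_MS = 1            # hypothetical margin keeping check_interval above Y*Z


@dataclass
class CmciState:
    polling: bool = False
    last_cmci_ms: Optional[int] = None
    polls_left: int = 0
    saw_error: bool = False


def on_cmci(state: CmciState, now_ms: int) -> None:
    """Interrupt path: a second CMCI within X switches to polling."""
    if state.polling:
        return
    if (state.last_cmci_ms is not None
            and now_ms - state.last_cmci_ms < STORM_WINDOW_MS):
        state.polling = True
        state.polls_left = POLL_COUNT
        state.saw_error = False
    state.last_cmci_ms = now_ms


def on_poll(state: CmciState, found_error: bool) -> None:
    """Poll path: meant to be called every Z ms while polling."""
    if not state.polling:
        return
    state.saw_error = state.saw_error or found_error
    state.polls_left -= 1
    if state.polls_left > 0:
        return
    if state.saw_error:
        # Errors were seen during the last Y*Z ms: poll for another Y*Z ms.
        state.polls_left = POLL_COUNT
        state.saw_error = False
    else:
        # A quiet period: go back to interrupt mode.
        state.polling = False
        state.last_cmci_ms = None


def clamp_check_interval(requested_ms: int) -> int:
    """Keep a user-requested check_interval above the polling duration Y*Z."""
    return max(requested_ms, POLL_COUNT * POLL_INTERVAL_MS + MARGIN_MS)
```

With these placeholders, two CMCIs 500 ms apart start a 5 x 200 ms
polling period, one error seen anywhere in that period buys another
full period, and a requested check_interval of 100 ms is raised to
1001 ms.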
>>
>> But at the same time, this scheme introduces even more variables that
>> need careful tuning, e.g. storm polling interval and storm duration,
>> while not really doing anything to make check_interval superfluous. Do
>
> Oh, we can't make check_interval superfluous - it is API to userspace
> for a long time now.

Oh, I guess I misunderstood. I thought you were actually talking about
removing that knob.

>> you really think we can tune these variables correctly for every
>> system out there?
>
> Right, I was trying to figure out a scheme first where polling intervals
> and thresholds would actually make sense and not be arbitrary.
>
> We probably won't be able to have the exact values for each system but a
> smart approximation could do the job nicely enough.

Sounds good, but we need to limit the complexity (which is why we
can't get exact values).

>> Or if we want to be generous: How about we just hardcode
>> check_interval to 5 seconds. Would that be fine with everyone?
>
> We could but again, it is an API to userspace exported through sysfs.
>
> Besides, on a healthy system, you see errors so seldom that 5sec is
> pure waste of energy.

True, but it sometimes makes sense to turn it down to a seemingly
insane value, e.g. during hardware testing and qualification. Which is
why I want to make sure values in that range work.

But please disregard my suggestion to hardcode check_interval -- it's
a bad idea and we're not going to remove that knob anyway.

>> > I don't know whether any of the above makes sense - I hope that the
>> > gist of it at least shows what I think we should be doing: instead
>> > of letting users configure the check_interval and influence the CMCI
>> > polling interval, we should rely purely on machine characteristics to
>> > set minimum values under which we poll and above which, we do the normal
>> > duration enlarging dance.
>>
>> I think the scheme may work, although I'm worried about the burstiness
>> mentioned above.
>>
>> But I don't really buy that pulling a handful of numbers out of thin
>> air and saying it should work for everyone is going to work.
>
> No no, absolutely not. This is exactly what I think should be fixed as
> the current numbers are likely pulled out of thin air. Simply because
> figuring the optimal ones is a very hard task, as we come to realize.
>
>> Either we need solid data to back up those numbers, or we need to make
>> them configurable so people can experiment and find what works best
>> for them.
>
> ..., or, we could measure them on each system and approximate them to
> the ones close to optimal for that particular system, over the course of
> its runtime.

I like the idea, but I'm worried about the complexity. Maybe what you
said elsewhere makes sense -- I'll have to look at it more closely.
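
One low-complexity way to "measure and approximate" at runtime would be
a running average of the time between errors. The sketch below uses an
exponentially weighted moving average; that choice is an assumption
made here for illustration and is not something proposed in the thread.
The class name, the alpha value and the bounds are all hypothetical.

```python
class GapEstimator:
    """Exponentially weighted moving average of the time between errors."""

    def __init__(self, alpha=0.25):
        self.alpha = alpha          # weight given to the newest gap
        self.avg_gap_ms = None      # no estimate until two errors are seen
        self.last_error_ms = None

    def record_error(self, now_ms):
        """Fold the gap since the previous error into the average."""
        if self.last_error_ms is not None:
            gap = now_ms - self.last_error_ms
            if self.avg_gap_ms is None:
                self.avg_gap_ms = gap
            else:
                self.avg_gap_ms += self.alpha * (gap - self.avg_gap_ms)
        self.last_error_ms = now_ms

    def suggested_interval_ms(self, lo_ms, hi_ms):
        """Poll about as often as errors arrive, clamped to [lo_ms, hi_ms]."""
        if self.avg_gap_ms is None:
            return hi_ms
        return min(max(self.avg_gap_ms, lo_ms), hi_ms)
```

This keeps only two numbers of state per bank or CPU, which is about as
cheap as a runtime measurement can get, but it still needs lower and
upper bounds from somewhere.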

> Thanks for taking the time and humouring me with that crazy
> brainstorming!

You're welcome, and likewise :)

Havard