Re: 2.6.37.1 s2disk regression (TPM)

From: Jiri Slaby
Date: Tue Feb 22 2011 - 03:41:22 EST


On 02/22/2011 01:42 AM, Stefan Berger wrote:
> On 02/21/2011 05:10 PM, Jiri Slaby wrote:
>> On 02/21/2011 11:07 PM, Rajiv Andrade wrote:
>>> On 02/21/2011 06:44 PM, Jiri Slaby wrote:
>>>> On 02/21/2011 10:29 PM, Stefan Berger wrote:
>>>>> On 02/21/2011 03:39 PM, Jiri Slaby wrote:
>>>>>> On 02/21/2011 06:12 PM, Rajiv Andrade wrote:
>>>>>>> On 02/21/2011 01:34 PM, Jiri Slaby wrote:
>>>>>>>> There has to be another problem which caused my regression. And
>>>>>>>> since it
>>>>>>>> reports "Operation Timed out", the former default timeout values
>>>>>>>> worked
>>>>>>>> for me, the ones read from TPM do not.
>>>>>>> Yes, it's most likely due to inconsistent timeout values reported by the
>>>>>>> TPM, as
>>>>>>> I mentioned. My working timeouts are:
>>>>>>> 3020000 4510000 181000000
>>>>>> 1000000 2000 150000
>>>>>>
>>>>>> Actually the first one from HW is 1. This one is HZ after
>>>>>> correction
>>>>>> in get_timeouts. So perhaps it is in ms, yes.
>>>>> Following the specs, the timeouts are supposed to be in
>>>>> microseconds and
>>>>> ascending order for short, medium and long duration. Of course, if the
>>>>> device returns wrong timeouts, the command isn't going to succeed,
>>>>> failing the suspend in this case. Nevertheless, I think we need the
>>>>> patch I put in but at the same time we'll need a work-around for
>>>>> devices
>>>>> like this.
>>>> Yes, the patch is correct per se. But as it breaks a bunch of machines it
>>>> cannot go in now. The rule is no regressions.
>>>>
>>>> After you have the workaround it should go into the next rc1 after
>>>> that.
>>>> Do you plan to add a dmi-based quirk? Or, IOW do you want me to attach
>>>> dmidecode output? Or are you going to base it solely on TPM
>>>> manufacturer/version?
>>> It's more reliable to base the workaround on the values themselves,
>>> instead of the TPM's ID, since
>>> we don't know whether other models will behave similarly.
>> As I wrote, you may base it on dmi data.
>>
>>> It should be fine then to extend the existing workaround for short
>>> timeouts to the medium and long ones.
>> OK, but how will you guess the values?
> One way of doing it would be to at least make sure that the timeouts are
>
> short < medium < long
>
> and if that's not true, as in the case of your TPM, set the timeouts to
> 0 and have Rajiv's work-around kick in OR we assign the same high
> values to the timeouts explicitly that Rajiv's work-around is using right
> now. Of course there could be another type of bad TPM firmware out there
> where all values are in ascending order but given in ms and cause
> time-outs -- but I would wait for someone to point that out since I am
> not aware of such a device.

Note that it is in ascending order (1 2000 150000). As I wrote, the first
timeout (1) is replaced by one HZ in get_timeouts.
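
To make the point concrete, here is a minimal user-space sketch of the
proposed ordering check. It is not kernel code, and timeouts_ascending()
is a hypothetical helper name used only for illustration:

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not an existing kernel function): returns true
 * when the three values satisfy short < medium < long.
 */
static bool timeouts_ascending(unsigned long t_short,
			       unsigned long t_medium,
			       unsigned long t_long)
{
	return t_short < t_medium && t_medium < t_long;
}
```

With the raw values from my TPM, timeouts_ascending(1, 2000, 150000) is
true, so such a check would not trigger. Only the triple after the
correction of the first value (1000000 2000 150000) fails it.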

regards,
--
js