Re: ibmvtpm byteswapping inconsistency

From: Tyrel Datwyler
Date: Fri Jan 27 2017 - 19:53:36 EST


On 01/27/2017 01:03 AM, Michal Suchanek wrote:
> On 27 January 2017 at 02:50, Benjamin Herrenschmidt
> <benh@xxxxxxxxxxxxxxxxxxx> wrote:
>> On Thu, 2017-01-26 at 17:42 -0800, Tyrel Datwyler wrote:
>>> On 01/26/2017 12:22 PM, Michal SuchÃnek wrote:
>>>> Hello,
>>>>
>>>> building ibmvtpm I noticed gcc warning complaining that second word
>>>> of
>>>> struct ibmvtpm_crq in tpm_ibmvtpm_suspend is uninitialized.
>>>>
>>>> The structure is defined as
>>>>
>>>> struct ibmvtpm_crq {
>>>> u8 valid;
>>>> u8 msg;
>>>> __be16 len;
>>>> __be32 data;
>>>> __be64 reserved;
>>>> } __attribute__((packed, aligned(8)));
>>>>
>>>> initialized as
>>>>
>>>> struct ibmvtpm_crq crq;
>>>> u64 *buf = (u64 *) &crq;
>>>> ...
>>>> crq.valid = (u8)IBMVTPM_VALID_CMD;
>>>> crq.msg = (u8)VTPM_PREPARE_TO_SUSPEND;
>>>>
>>>> and submitted with
>>>>
>>>> rc = ibmvtpm_send_crq(ibmvtpm->vdev, cpu_to_be64(buf[0]),
>>>> cpu_to_be64(buf[1]));
>>>
>>> These should be be64_to_cpu() here. The underlying hcall made by
>>> ibmvtpm_send_crq() requires parameters to be in cpu endian unlike the
>>> RTAS interface which requires data in BE.
>>
>> Hrm... an hcall takes register arguments. Register arguments don't have
>> an endianness.
>>
>> The problem is that we are packing an in-memory structure into 2
>> registers and it's expected that this structure is laid out in the
>> registers as if it had been loaded by a BE CPU.
>>
>> So we have two things at play here:
>>
>> - The >8-bit fields should be laid out BE in the memory image
>> - That whole 128-bit structure should be loaded into 2 64-bit
>> registers MSB first.
>>
>> So the "double" swap is somewhat needed. The uglyness comes from the
>> passing-by-register of the h-call but it should work.
>>
>> That said, be64_to_cpup(buf) and be64_to_cpup(buf+1) might give you
>> better result (though recent gcc's might not make a difference).
>
> If this should work then the below case that swaps the fields separately is
> broken.
>
> Anyway, structures have no endianess so when they start with a byte they
> start with that byte no matter the host endian.
> crq.valid is the first byte always. And then each field is to be swapped
> separately.
>
> On the other hand, bitfields are part of an integer and the field should be
> swapped as part of the integer.
>
> That is,
> #define CRQ_VALID ((buf[0] >> 56) & 0xff)
> CRQ_VALID is part of an integer in buf and would be laid out differently
> on start or end depending on the host being BE or LE.
>
> And the question is what the PAPR actually defines because both ways are
> used in the code. You can describe an in-memory layout either way.

Byte | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
-----------------------------------------------------------------------
Word0 | Valid | Type | Length | Data
-----------------------------------------------------------------------
Word1 | Reserved
-----------------------------------------------------------------------

The following definition looks to match:

struct ibmvtpm_crq {
u8 valid;
u8 msg;
__be16 len;
__be32 data;
__be64 reserved;
} __attribute__((packed, aligned(8)));

>
>>>>
>>>> which means that the second word indeed contains purely garbage.
>>>>
>>>> This is repeated a few times in the driver so I added memset to
>>>> quiet
>>>> gcc and make behavior deterministic in case the unused fields get
>>>> some
>>>> meaning in the future.
>>>>
>>>> However, in tpm_ibmvtpm_send the structure is initialized as
>>>>
>>>> struct ibmvtpm_crq crq;
>>>> __be64 *word = (__be64 *)&crq;
>>>> ...
>>>> crq.valid = (u8)IBMVTPM_VALID_CMD;
>>>> crq.msg = (u8)VTPM_TPM_COMMAND;
>>>> crq.len = cpu_to_be16(count);
>>>> crq.data = cpu_to_be32(ibmvtpm->rtce_dma_handle);
>>>>
>>>> and submitted with
>>>>
>>>> rc = ibmvtpm_send_crq(ibmvtpm->vdev, be64_to_cpu(word[0]),
>>>> be64_to_cpu(word[1]));
>>>> meaning it is swapped twice.
>>>>
>>>>
>>>> Where is the interface defined? Are the command arguments passed as
>>>> BE
>>>> subfields (the second case was correct before adding the extra
>>>> whole
>>>> word swap) or BE words (the first case doing whole word swap is
>>>> correct)?
>>>
>>> The interface is defined in PAPR. The crq format is defined in BE
>>> terms.
>
> Which exact PAPR? Where can I get it?
> The PAPR document I found does not say anything about vtpm.

Unfortunately, vtpm doesn't appear to be covered in the publicly
available LoPAPR.

-Tyrel

>
> Thanks
>
> Michal
>