Re: [PATCH 1/2] mailbox: switch to hrtimer for tx_complete polling

From: Jassi Brar
Date: Wed Jul 29 2015 - 04:33:14 EST


On Mon, Jul 27, 2015 at 3:18 PM, Sudeep Holla <sudeep.holla@xxxxxxx> wrote:
> On 27/07/15 04:26, Jassi Brar wrote:
>>
>>>>
>>>>> we might end-up waiting
>>>>> for at least a jiffy even though the response for that message from the
>>>>> remote is received via interrupt and processed in relatively smaller
>>>>> time granularity.
>>>>>
>>>> That is wrong.
>>>
>>>
>>> No see below.
>>>
>>>> If the controller supports TX interrupt it should set txdone_irq,
>>>> which prevents polling, i.e. the controller driver calls mbox_chan_txdone.
>>>>
>>>> If the controller doesn't support TX interrupt but the client
>>>> receives some ack packet, then the client should set knows_txdone and
>>>> call mbox_client_txdone. Again you don't have to wait on polling.
>>>>
>>>
>>> Sorry if I was not clear in the commit message, but I thought I did
>>> mention TXDONE_BY_POLL. The case I am referring is definitely not
>>> TXDONE_BY_IRQ or TXDONE_BY_ACK.
>>>
>> That statement is still wrong. The TXDONE_BY_POLL modifier doesn't make it
>> right.
>>
>
> I am fine to modify/clarify that statement.
>
>> Anyways, I see you meant the 3rd case of neither IRQ nor ACK.
>>
>
> Yes the remote indicates by setting a flag in status register.
>
However, looking at arm_scpi.c, the protocol does support
TXDONE_BY_ACK: every command has a reply packet saying whether the
command succeeded or failed. When you receive a reply, the command has
obviously already been received by the remote. That is exactly the
mbox_client.knows_txdone, or TXDONE_BY_ACK, case.
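
To make the distinction concrete, here is a rough userspace sketch (not
kernel code) of the two completion paths. All names here (ack_chan,
ack_send, ack_client_txdone, ack_rx_reply, ack_poll_tick) are made up for
illustration; only the knows_txdone idea is taken from the framework. The
assumption is simply that a reply packet implies the remote has consumed
the command.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; this is NOT the kernel's struct mbox_chan. */
struct ack_chan {
	bool knows_txdone;  /* models the idea behind mbox_client.knows_txdone */
	bool tx_in_flight;  /* a command was sent and is not yet completed */
	int polls;          /* completions that had to wait for a poll tick */
	int completed;      /* total completed transmissions */
};

/* Hypothetical helper: hand one command to the (imaginary) controller. */
static void ack_send(struct ack_chan *ch)
{
	assert(ch != NULL);
	ch->tx_in_flight = true;
}

/* Stand-in for the client telling the framework that TX is done. */
static void ack_client_txdone(struct ack_chan *ch)
{
	if (ch->tx_in_flight) {
		ch->tx_in_flight = false;
		ch->completed++;
	}
}

/* Reply handler: a reply implies the remote already took the command. */
static void ack_rx_reply(struct ack_chan *ch)
{
	if (ch->knows_txdone)
		ack_client_txdone(ch);
}

/* Timer-driven poll: only does work if nothing else completed the TX. */
static void ack_poll_tick(struct ack_chan *ch)
{
	if (ch->tx_in_flight) {
		ch->tx_in_flight = false;
		ch->polls++;
		ch->completed++;
	}
}
```

With knows_txdone set, ack_rx_reply() completes the transmission and a
later ack_poll_tick() finds nothing to do; with it cleared, the
transmission stays in flight until the next poll tick.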

>> It seems your remote doesn't send some protocol level 'ack' packet
>> replying if the command was successfully executed or not. That means
>> Linux can't differentiate successful execution of the command from a
>> silent failure (remote still has to set the TX_done flag to make way
>> for next messages).
>
> Agreed and again I confirm the remote processor in question just sets
> the flag always and correctly and doesn't use a protocol ACK.
>
As I noted above, arm_scpi.c tells a different story.

>>> 2. Because of that, under stress testing with multiple clients active at
>>> a time, I am seeing the mailbox buffer overflow quite easily just
>>> because it's blocked on Tx polling (almost 10x slower) and doesn't
>>> process new requests even though the remote can handle them.
>>>
>> Yeah this situation may arise. The following fix is needed regardless, I
>> think.
>>
>
> IIUC it just triggers poll_txdone when chan->active_req is set, but that
> will happen anyway through the timer, no?
>
No. It polls whenever a new message is queued so that we are not
waiting on a completed but sleeping task.
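
A rough userspace sketch of that behaviour, for illustration only. The
names (sub_chan, sub_msg_submit, sub_poll_txdone, sub_queue) and the
poll_if_active switch are hypothetical, and the model assumes that a zero
return from the submit step is what makes the caller poll right away; that
exit path is not part of the hunk quoted below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; this is NOT the kernel's struct mbox_chan. */
struct sub_chan {
	bool active_req;   /* a message is currently in flight */
	bool remote_done;  /* the remote has set its "done" flag */
	int msg_count;     /* messages queued and not yet sent */
	int sent;          /* messages handed to the hardware so far */
};

/*
 * Returns 0 when the caller should poll, -1 otherwise.
 * poll_if_active == true models the proposed change (report 0 while a
 * request is active); false models the old combined check.
 */
static int sub_msg_submit(struct sub_chan *ch, bool poll_if_active)
{
	assert(ch != NULL);
	if (ch->active_req)
		return poll_if_active ? 0 : -1;
	if (!ch->msg_count)
		return -1;
	ch->msg_count--;
	ch->active_req = true;
	ch->remote_done = false;
	ch->sent++;
	return 0;
}

/* If the in-flight message has completed, retire it and send the next. */
static void sub_poll_txdone(struct sub_chan *ch, bool poll_if_active)
{
	if (ch->active_req && ch->remote_done) {
		ch->active_req = false;
		(void)sub_msg_submit(ch, poll_if_active);
	}
}

/* Queue one message; poll immediately if the submit step asks for it. */
static void sub_queue(struct sub_chan *ch, bool poll_if_active)
{
	ch->msg_count++;
	if (sub_msg_submit(ch, poll_if_active) == 0)
		sub_poll_txdone(ch, poll_if_active);
}
```

In this model, queueing a second message after the remote has finished the
first sends it at once when poll_if_active is true; with false it sits in
the queue until something else (the timer, in the real code) calls the
poll function.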

> Anyway it's not the main topic of discussion here and can be taken up
> separately.
>
>> ============================
>> diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
>> index 07fd507..a1dded9 100644
>> --- a/drivers/mailbox/mailbox.c
>> +++ b/drivers/mailbox/mailbox.c
>> @@ -64,7 +64,12 @@ static void msg_submit(struct mbox_chan *chan)
>>
>>         spin_lock_irqsave(&chan->lock, flags);
>>
>> -       if (!chan->msg_count || chan->active_req)
>> +       if (chan->active_req) {
>> +               err = 0;
>> +               goto exit;
>> +       }
>> +
>> +       if (!chan->msg_count)
>>                 goto exit;
>>
>>         count = chan->msg_count;
>> ============================
>>


>>> Hope this clarifies the reasons for switching to hrtimer.
>>>
>> I am not against using hrtimer, just need to make sure we don't simply
>> suppress the symptoms of wrong implementation.
>
> Agreed, and that's a valid concern. Based on the testing and
> benchmarking done so far, we don't think this patch is suppressing
> anything incorrectly.
>
> If you still have concerns with this solution, please explain them here
> so that we can discuss, come to a conclusion and get the issue fixed.
>
I just replied to the patch where you set
    cl->knows_txdone = false;
which is not the case.

We may use hrtimer for polling, but your platform doesn't have to rely on that.

-jassi