Re: [PATCH] Bluetooth: btusb: Add a new quirk to skip HCI_FLT_CLEAR_ALL on fake CSR controllers

From: Hans de Goede
Date: Fri Jan 07 2022 - 04:55:35 EST


Hi,

On 1/6/22 23:29, Ismael Ferreras Morezuelas wrote:
> Hi again, Marcel.
>
>>> /* Clear the reset quirk since this is not an actual
>>> * early Bluetooth 1.1 device from CSR.
>>> @@ -1942,16 +1943,16 @@ static int btusb_setup_csr(struct hci_dev *hdev)
>>> /*
>>> * Special workaround for these BT 4.0 chip clones, and potentially more:
>>> *
>>> - * - 0x0134: a Barrot 8041a02 (HCI rev: 0x1012 sub: 0x0810)
>>> + * - 0x0134: a Barrot 8041a02 (HCI rev: 0x0810 sub: 0x1012)
>>
>> Don’t get this change.
>>
>
> I swapped both numbers by mistake in my last commit when I moved it from
> a conditional into a comment This is explained in the changeset as:
>
> > Also, swapped the HCI subver and LMP subver numbers for the Barrot
> > in the comment, which I copied wrong the last time around.
>
> Thought I might as well fix it here, because it may never get corrected otherwise; it's misleading.
>
>
>>> * - 0x7558: IC markings FR3191AHAL 749H15143 (HCI rev/sub-version: 0x0709)
>>> *
>>> * These controllers are really messed-up.
>>> *
>>> * 1. Their bulk RX endpoint will never report any data unless
>>> - * the device was suspended at least once (yes, really).
>>> + * the device was suspended at least once (yes, really).
>>> * 2. They will not wakeup when autosuspended and receiving data
>>> - * on their bulk RX endpoint from e.g. a keyboard or mouse
>>> - * (IOW remote-wakeup support is broken for the bulk endpoint).
>>> + * on their bulk RX endpoint from e.g. a keyboard or mouse
>>> + * (IOW remote-wakeup support is broken for the bulk endpoint).
>>
>> Fix the style issues separately.
>
> Fair enough, I can obviate this part, no worries.
>
>
>>
>>> *
>>> * To fix 1. enable runtime-suspend, force-suspend the
>>> * HCI and then wake-it up by disabling runtime-suspend.
>>> @@ -1971,7 +1972,7 @@ static int btusb_setup_csr(struct hci_dev *hdev)
>>> if (ret >= 0)
>>> msleep(200);
>>> else
>>> - bt_dev_err(hdev, "CSR: Failed to suspend the device for our Barrot 8041a02 receive-issue workaround");
>>> + bt_dev_warn(hdev, "CSR: Couldn't suspend the device for our Barrot 8041a02 receive-issue workaround");
>>
>> Why change this?
>>
>
> Because depending on the clone this print may end up showing all the time;
> we *try* doing it, but some clones don't like it. I thought showing it
> as a warning would make sense. Tweaking the text a bit again helps
> when tracking down which version of the workaround the user is
> running, we can narrow it by just grepping the log itself.
>
> If it doesn't work it doesn't really affect anything, and we can't
> whitelist something half the clones use to get unstuck and the other
> half just ignore and stay peachy. We're trying an unified solution.
>
>
>>>
>>> pm_runtime_forbid(&data->udev->dev);
>>>
>>> diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
>>> index 63065bc01..4e5d5979d 100644
>>> --- a/include/net/bluetooth/hci.h
>>> +++ b/include/net/bluetooth/hci.h
>>> @@ -246,6 +246,12 @@ enum {
>>> * HCI after resume.
>>> */
>>> HCI_QUIRK_NO_SUSPEND_NOTIFIER,
>>> +
>>> + /* When this quirk is set, HCI_OP_SET_EVENT_FLT requests with
>>> + * HCI_FLT_CLEAR_ALL are ignored. A subset of the CSR controller
>>> + * clones struggle with this and instantly lock up.
>>> + */
>>> + HCI_QUIRK_BROKEN_FILTER_CLEAR_ALL,
>>> };
>>>
>>> /* HCI device flags */
>>> diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
>>> index 8d33aa648..7af649afc 100644
>>> --- a/net/bluetooth/hci_core.c
>>> +++ b/net/bluetooth/hci_core.c
>>> @@ -150,6 +150,7 @@ static void bredr_setup(struct hci_request *req)
>>> {
>>> __le16 param;
>>> __u8 flt_type;
>>> + struct hci_dev *hdev = req->hdev;
>>
>> This should always go first in a function.
>>
>>>
>>> /* Read Buffer Size (ACL mtu, max pkt, etc.) */
>>> hci_req_add(req, HCI_OP_READ_BUFFER_SIZE, 0, NULL);
>>> @@ -169,9 +170,13 @@ static void bredr_setup(struct hci_request *req)
>>> /* Read Current IAC LAP */
>>> hci_req_add(req, HCI_OP_READ_CURRENT_IAC_LAP, 0, NULL);
>>>
>>> - /* Clear Event Filters */
>>> - flt_type = HCI_FLT_CLEAR_ALL;
>>> - hci_req_add(req, HCI_OP_SET_EVENT_FLT, 1, &flt_type);
>>> + /* Clear Event Filters; some fake CSR controllers lock up after setting
>>> + * this type of filter, so avoid sending the request altogether.
>>> + */
>>> + if (!test_bit(HCI_QUIRK_BROKEN_FILTER_CLEAR_ALL, &hdev->quirks)) {
>>> + flt_type = HCI_FLT_CLEAR_ALL;
>>> + hci_req_add(req, HCI_OP_SET_EVENT_FLT, 1, &flt_type);
>>> + }
>>>
>>> /* Connection accept timeout ~20 secs */
>>> param = cpu_to_le16(0x7d00);
>>> diff --git a/net/bluetooth/hci_request.c b/net/bluetooth/hci_request.c
>>> index 92611bfc0..cfcf64c0c 100644
>>> --- a/net/bluetooth/hci_request.c
>>> +++ b/net/bluetooth/hci_request.c
>>> @@ -980,11 +980,15 @@ void hci_req_add_le_passive_scan(struct hci_request *req)
>>> static void hci_req_clear_event_filter(struct hci_request *req)
>>> {
>>> struct hci_cp_set_event_filter f;
>>> + struct hci_dev *hdev = req->hdev;
>>> +
>>> + if (!hci_dev_test_flag(hdev, HCI_BREDR_ENABLED))
>>> + return;
>>>
>>> - if (!hci_dev_test_flag(req->hdev, HCI_BREDR_ENABLED))
>>> + if (test_bit(HCI_QUIRK_BROKEN_FILTER_CLEAR_ALL, &hdev->quirks))
>>> return;
>>>
>>> - if (hci_dev_test_flag(req->hdev, HCI_EVENT_FILTER_CONFIGURED)) {
>>> + if (hci_dev_test_flag(hdev, HCI_EVENT_FILTER_CONFIGURED)) {
>>> memset(&f, 0, sizeof(f));
>>> f.flt_type = HCI_FLT_CLEAR_ALL;
>>> hci_req_add(req, HCI_OP_SET_EVENT_FLT, 1, &f);
>>
>> This is not enough. If you do not have clear event filter, we need to disable suspend/resume support. These device can for obvious reason not sleep accordingly.
>
> Would adding HCI_QUIRK_NO_SUSPEND_NOTIFIER be enough here? I'll take another look at the
> codebase, any comments about the best/simplest way to tackle this would help me a lot.
>
> As you can see I'm still getting my feet wet around the Bluetooth subsystem.
>
> In theory adding another quirk condition in hci_req_set_event_filter() or anything that leads
> to it (i.e. hci_suspend_dev() / BT_SUSPEND_CONFIGURE_WAKE) may also do the trick.
>
> The least code I touch the better. Right now I'm thinking of mirroring my hci_req_clear_event_filter()
> nop-out in hci_req_set_event_filter() *and* just setting the NO_SUSPEND_NOTIFIER quirk.

Note we already disable runtime suspend for these broken clones in
the btusb code, I think that in itself might be enough, together with
a comment in the header where the quirk is defined that devices using this
must disable runtime suspend ?

Regards,

Hans