Re: [PATCH v5] watchdog: add driver for StreamLabs USB watchdog device

From: Alexey Klimov
Date: Mon Jul 25 2022 - 21:31:25 EST


On Mon, Jul 25, 2022 at 3:02 PM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
>
> On 7/24/22 20:06, Alexey Klimov wrote:

[...]

> > + * one buffer is used for communication, however transmitted message is only
> > + * 32 bytes long
> > + */
> > +#define BUFFER_TRANSFER_LENGTH 32
> > +#define BUFFER_LENGTH 64
> > +#define USB_TIMEOUT 350
> > +
> Comment about the unit (ms) might be useful.

Yes. I'll add it.

> > +#define STREAMLABS_CMD_START 0xaacc
> > +#define STREAMLABS_CMD_STOP 0xbbff
> > +
> > +/* timeout values are taken from windows program */
> > +#define STREAMLABS_WDT_MIN_TIMEOUT 1
> > +#define STREAMLABS_WDT_MAX_TIMEOUT 46
> > +
> > +struct streamlabs_wdt {
> > + struct watchdog_device wdt_dev;
> > + struct usb_interface *intf;
> > + /* Serialises usb communication with a device */
> > + struct mutex lock;
> > + __le16 *buffer;
> > +};
> > +
> > +static bool nowayout = WATCHDOG_NOWAYOUT;
> > +module_param(nowayout, bool, 0);
> > +MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started (default="
> > + __MODULE_STRING(WATCHDOG_NOWAYOUT) ")");
> > +
> > +/* USB call wrappers to send and receive messages to/from the device. */
> > +static int usb_streamlabs_send_msg(struct usb_device *usbdev, __le16 *buf)
> > +{
> > + int retval;
> > + int size;
> > +
> > + retval = usb_interrupt_msg(usbdev, usb_sndintpipe(usbdev, 0x02),
> > + buf, BUFFER_TRANSFER_LENGTH,
> > + &size, USB_TIMEOUT);
> > +
> > + if (size != BUFFER_TRANSFER_LENGTH)
> > + return -EIO;
> > +
>
> If usb_interrupt_msg() returns an error, it will likely not set size,
> which may result in a random -EIO. I think this should be something like
>
> if (retval)
> return retval;
> if (size != BUFFER_TRANSFER_LENGTH)
> return -EIO;
>
> return 0;

Good point. I'll change it.


> > + return retval;
> > +}
> > +
> > +static int usb_streamlabs_get_msg(struct usb_device *usbdev, __le16 *buf)
> > +{
> > + int retval;
> > + int size;
> > +
> > + retval = usb_interrupt_msg(usbdev, usb_rcvintpipe(usbdev, 0x81),
> > + buf, BUFFER_LENGTH,
> > + &size, USB_TIMEOUT);
> > +
> > + if (size != BUFFER_LENGTH)
> > + return -EIO;
> > +
> Same here.
>
> > + return retval;
> > +}
> > +
> > +/*
> > + * This function is used to check if watchdog timeout in the received buffer
> > + * matches the timeout passed from watchdog subsystem.
> > + */
> > +static int usb_streamlabs_wdt_check_timeout(__le16 *buf, unsigned long timeout)
> > +{
> > + if (buf[3] != cpu_to_le16(timeout))
> > + return -EPROTO;
> > +
> > + return 0;
> > +}
> > +
> > +static int usb_streamlabs_wdt_check_response(u8 *buf)
> > +{
> > + /*
> > + * If watchdog device understood the command it will acknowledge
> > + * with values 1,2,3,4 at indexes 10, 11, 12, 13 in response message
> > + * when response treated as 8bit message.
> > + */
> > + if (buf[10] != 1 || buf[11] != 2 || buf[12] != 3 || buf[13] != 4)
> > + return -EPROTO;
> > +
> > + return 0;
> > +}
> > +
> > +/*
> > + * This function is used to check if watchdog command in the received buffer
> > + * matches the command passed to the device.
> > + */
> > +static int usb_streamlabs_wdt_check_command(__le16 *buf, u16 cmd)
> > +{
> > + if (buf[0] != cpu_to_le16(cmd))
> > + return -EPROTO;
> > +
> > + return 0;
> > +}
> > +
> > +static int usb_streamlabs_wdt_validate_response(__le16 *buf, u16 cmd,
> > + unsigned long timeout_msec)
> > +{
> > + int retval;
> > +
> > + retval = usb_streamlabs_wdt_check_response((u8 *)buf);
> > + if (retval)
> > + return retval;
> > +
> > + retval = usb_streamlabs_wdt_check_command(buf, cmd);
> > + if (retval)
> > + return retval;
> > +
> > + retval = usb_streamlabs_wdt_check_timeout(buf, timeout_msec);
> > + return retval;
>
> assignment to retval is unnecessary.

Ok.

> > +}
> > +
> > +static void usb_streamlabs_wdt_prepare_buf(__le16 *buf, u16 cmd,
> > + unsigned long timeout_msec)
> > +{
> > + /*
> > + * remaining elements expected to be zero everytime during
> > + * communication
> > + */
> > + buf[0] = cpu_to_le16(cmd);
> > + buf[1] = cpu_to_le16(0x8000);
> > + buf[3] = cpu_to_le16(timeout_msec);
>
> Not setting buf[2] and buf[4] contradicts the comment above. Maybe
> those offsets are not _expected_ to be set by the device, but that
> is not guaranteed. It may be safer to just use memset() at the
> beginning of the function to clear the buffer.

Sure. I guess it makes sense to zero the buffer before reading the
message from the device too?
Before usb_streamlabs_get_msg(usbdev, wdt->buffer).

> > + buf[5] = 0x0;
> > + buf[6] = 0x0;
> > +}
> > +
> > +static int __usb_streamlabs_wdt_cmd(struct streamlabs_wdt *wdt, u16 cmd)
> > +{
> > + struct usb_device *usbdev;
> > + unsigned long timeout_msec;
> > + /* how many times to re-try getting the state of the device */
> > + unsigned int retry_counter = 10;
> > + int retval;
> > +
> > + if (unlikely(!wdt->intf))
> > + return -ENODEV;
> > +
> > + usbdev = interface_to_usbdev(wdt->intf);
> > + timeout_msec = wdt->wdt_dev.timeout * MSEC_PER_SEC;
> > +
> > + usb_streamlabs_wdt_prepare_buf(wdt->buffer, cmd, timeout_msec);
> > +
> > + /* send command to watchdog */
> > + retval = usb_streamlabs_send_msg(usbdev, wdt->buffer);
> > + if (retval)
> > + return retval;
> > +
> > + /*
> > + * Transition from one state to another in this device
> > + * doesn't happen immediately, especially stopping the device
> > + * is not observed on the first reading of the response.
> > + * Plus to avoid potentially stale response message in the device
> > + * we keep reading the state of the device until we see:
> > + * -- that device recognised the sent command;
> > + * -- the received state (started or stopped) matches the state
> > + * that was requested;
> > + * -- the timeout passed matches the timeout value read from
> > + * the device.
> > + * Keep retrying 10 times and if watchdog device doesn't respond
> > + * correctly as expected we bail out and return an error.
> > + */
> > + do {
> > + retval = usb_streamlabs_get_msg(usbdev, wdt->buffer);
> > + if (retval)
> > + break;
> > +
> > + retval = usb_streamlabs_wdt_validate_response(wdt->buffer, cmd,
> > + timeout_msec);
> > + } while (retval && retry_counter--);
> > +
> > + return retry_counter ? retval : -EIO;
> > +}
> > +
> > +static int usb_streamlabs_wdt_cmd(struct streamlabs_wdt *streamlabs_wdt, u16 cmd)
> > +{
> > + int retval;
> > +
> > + mutex_lock(&streamlabs_wdt->lock);
> > + retval = __usb_streamlabs_wdt_cmd(streamlabs_wdt, cmd);
> > + mutex_unlock(&streamlabs_wdt->lock);
> > +
> > + return retval;
> > +}
> > +
> > +static int usb_streamlabs_wdt_start(struct watchdog_device *wdt_dev)
> > +{
> > + struct streamlabs_wdt *streamlabs_wdt = watchdog_get_drvdata(wdt_dev);
> > +
> > + return usb_streamlabs_wdt_cmd(streamlabs_wdt, STREAMLABS_CMD_START);
> > +}
> > +
> > +static int usb_streamlabs_wdt_stop(struct watchdog_device *wdt_dev)
> > +{
> > + struct streamlabs_wdt *streamlabs_wdt = watchdog_get_drvdata(wdt_dev);
> > +
> > + return usb_streamlabs_wdt_cmd(streamlabs_wdt, STREAMLABS_CMD_STOP);
> > +}
> > +
> > +static const struct watchdog_info streamlabs_wdt_ident = {
> > + .options = WDIOF_SETTIMEOUT | WDIOF_KEEPALIVEPING,
> > + .identity = DRIVER_NAME,
> > +};
> > +
> > +static const struct watchdog_ops usb_streamlabs_wdt_ops = {
> > + .owner = THIS_MODULE,
> > + .start = usb_streamlabs_wdt_start,
> > + .stop = usb_streamlabs_wdt_stop,
> > +};
> > +
> > +static int usb_streamlabs_wdt_probe(struct usb_interface *intf,
> > + const struct usb_device_id *id)
> > +{
> > + struct usb_device *usbdev = interface_to_usbdev(intf);
> > + struct streamlabs_wdt *streamlabs_wdt;
> > + int retval;
> > +
> > + /*
> > + * USB IDs of this device appear to be weird/unregistered. Hence, do
> > + * an additional check on product and manufacturer.
> > + * If there is similar device in the field with same values then
> > + * there is stop command in probe() below that checks if the device
> > + * behaves as a watchdog.
> > + */
> > + if (!usbdev->product || !usbdev->manufacturer ||
> > + strncmp(usbdev->product, "USBkit", 6) ||
> > + strncmp(usbdev->manufacturer, "STREAM LABS", 11))
> > + return -ENODEV;
> > +
> > + streamlabs_wdt = devm_kzalloc(&intf->dev, sizeof(struct streamlabs_wdt),
> > + GFP_KERNEL);
> > + if (!streamlabs_wdt)
> > + return -ENOMEM;
> > +
> > + streamlabs_wdt->buffer = devm_kzalloc(&intf->dev, BUFFER_LENGTH,
> > + GFP_KERNEL);
> > + if (!streamlabs_wdt->buffer)
> > + return -ENOMEM;
> > +
>
> Nit: buffer could be made part of struct streamlabs_wdt and be tagged with
> ____cacheline_aligned to avoid the double allocation.

It was discussed in the past.
https://lore.kernel.org/linux-watchdog/5714E7D3.4030809@xxxxxxxxxxxx/
https://lore.kernel.org/linux-watchdog/1460988518.25119.28.camel@xxxxxxxx/

The conclusion there was that with separate allocation it is
guaranteed to not share a cacheline with mutex lock.
Do we know for sure that it is safe with ____cacheline_aligned attribute?

Oliver, thoughts?

I see that a lot of drivers use cacheline alignment for buffers, so I
guess that should be okay nowadays and I can change it back to initial
version with cacheline alignment.

> > + mutex_init(&streamlabs_wdt->lock);
> > +
> > + streamlabs_wdt->wdt_dev.info = &streamlabs_wdt_ident;
> > + streamlabs_wdt->wdt_dev.ops = &usb_streamlabs_wdt_ops;
> > + streamlabs_wdt->wdt_dev.timeout = STREAMLABS_WDT_MAX_TIMEOUT;
> > + streamlabs_wdt->wdt_dev.max_timeout = STREAMLABS_WDT_MAX_TIMEOUT;
> > + streamlabs_wdt->wdt_dev.min_timeout = STREAMLABS_WDT_MIN_TIMEOUT;
> > + streamlabs_wdt->wdt_dev.parent = &intf->dev;
> > +
> > + streamlabs_wdt->intf = intf;
> > + usb_set_intfdata(intf, &streamlabs_wdt->wdt_dev);
> > + watchdog_set_drvdata(&streamlabs_wdt->wdt_dev, streamlabs_wdt);
> > + watchdog_set_nowayout(&streamlabs_wdt->wdt_dev, nowayout);
> > +
> > + retval = usb_streamlabs_wdt_stop(&streamlabs_wdt->wdt_dev);
> > + if (retval)
> > + return -ENODEV;
> > +
>
> A comment explaining why the watchdog is explicitly stopped when running
> might be useful.

What do you mean by saying "when running"?
Everytime during my testing the initial state is "stopped" on
boot/power on/after reset, so not sure what you mean by saying "when
running".
There is a comment above that explains the stop command but I will
add/change comments that explain things better.
The point of executing "stop" command here is to check that device
being probed behaves like we expect it to. This is a bit paranoid
check since I am a not 100% sure that all devices with such USB ids
are watchdogs -- that's why additional checks for usbdev->product and
usbdev->manufacturer and this stop command that doesn't change the
initial state. In theory, I can remove this stop command at all.

Thank you for the review.

[...]

Best regards,
Alexey