Re: [virtio-dev] Re: [RFC PATCH 1/1] can: virtio: Initial virtio CAN driver.

From: Harald Mommer
Date: Fri Feb 03 2023 - 10:03:20 EST


Hello,

we had an internal discussion at OpenSynergy about an open source virtio-can device implementation.

The outcome is that an open source virtio-can device will now be developed.

It has not yet been decided whether the open source device implementation will be based on qemu or kvmtool (or something else?). Negative or positive feedback for or against either of those is likely to influence the decision about what will be used as the basis for the development. Using kvmtool may be easier for me (to be investigated in detail), but on the other hand we have some people on the team who have the knowledge to support us with qemu.


On 04.11.22 18:03, Arnd Bergmann wrote:
> On Fri, Nov 4, 2022, at 16:32, Jan Kiszka wrote:
>> On 03.11.22 14:55, Harald Mommer wrote:
>>> On 27.08.22 11:39, Marc Kleine-Budde wrote:
>>>> Is there an Open Source implementation of the host side of this
>>>> interface?
>>>
>>> there is neither an open source device nor is it currently planned. The
>>> device I'm developing is closed source.
>>
>> Likely not helpful long-term /wrt kernel QA - how should kernelci or
>> others even have a chance to test the driver? Keep in mind that you are
>> not proposing a specific driver for an Opensynergy hypervisor, rather
>> for the open and vendor-agnostic virtio spec.
>>
>> But QEMU already supports both CAN and virtio, thus should be relatively
>> easy to augment with this new device.
>
> Agreed, either hooking into the qemu support, or having a separate
> vhost-user backend that forwards data to the host stack would be
> helpful here, in particular to see how the flow control works.

What I would like is to consider a CAN frame as sent only when it has actually been sent on the bus (vs. handed to the lower layers, where it is merely scheduled for later transmission but not yet physically sent). This behavior is enabled by the feature flag VIRTIO_CAN_F_LATE_TX_ACK. But under really heavy load conditions this does not work reliably. It looks like our own transmitted frames are sometimes discarded under heavy overload.

The reception of our own transmitted frames is used to trigger the state transition from "TX pending" => "TX done" for a pending transmitted frame in the device. Losing those frames therefore leads to a situation where "TX pending" frames stay in this state forever and everything gets stuck quickly. So the feature flag VIRTIO_CAN_F_LATE_TX_ACK is currently not reliably usable in Linux. Either I need to find a good workaround or, better, a proper way to ensure that none of those acknowledgement frames is ever lost. I have not found a good solution for this yet.

But without VIRTIO_CAN_F_LATE_TX_ACK there is also no working flow control. This means I would like to see some day not only "how flow control works" but also "that flow control works regardless of how the CAN stack is tortured".

> IIRC when we discussed virtio-can on the stratos list, one of the
> issues that was pointed out was filtering of frames for specific
> CAN IDs in the host socketcan for assigning individual IDs to
> separate guests. It would be good to understand whether a generic
> host implementation has the same problems, and what can be
> done in socketcan to help with that.
>
> Arnd

Harald