Re: [PATCH v6 02/21] dt-bindings: Add binding for gunyah hypervisor

From: Elliot Berman
Date: Thu Nov 03 2022 - 15:45:51 EST

In reply to: Jassi Brar: "Re: [PATCH v6 02/21] dt-bindings: Add binding for gunyah hypervisor"

On 11/2/2022 8:21 PM, Jassi Brar wrote:

On Wed, Nov 2, 2022 at 6:23 PM Elliot Berman <quic_eberman@xxxxxxxxxxx> wrote:

On 11/2/2022 11:24 AM, Jassi Brar wrote:

On Wed, Nov 2, 2022 at 1:06 PM Elliot Berman <quic_eberman@xxxxxxxxxxx> wrote:

Hi Jassi,

On 11/1/2022 7:01 PM, Jassi Brar wrote:

On Tue, Nov 1, 2022 at 7:12 PM Elliot Berman <quic_eberman@xxxxxxxxxxx> wrote:

On 11/1/2022 2:58 PM, Jassi Brar wrote:

On Tue, Nov 1, 2022 at 3:35 PM Elliot Berman <quic_eberman@xxxxxxxxxxx> wrote:

On 11/1/2022 9:23 AM, Jassi Brar wrote:

On Mon, Oct 31, 2022 at 10:20 PM Elliot Berman <quic_eberman@xxxxxxxxxxx> wrote:

Hi Jassi,

On 10/27/2022 7:33 PM, Jassi Brar wrote:
> On Wed, Oct 26, 2022 at 1:59 PM Elliot Berman <quic_eberman@xxxxxxxxxxx> wrote:
> .....
>> +
>> + gunyah-resource-mgr@0 {
>> + compatible = "gunyah-resource-manager-1-0", "gunyah-resource-manager";
>> + interrupts = <GIC_SPI 3 IRQ_TYPE_EDGE_RISING>, /* TX full IRQ */
>> + <GIC_SPI 4 IRQ_TYPE_EDGE_RISING>; /* RX empty IRQ */
>> + reg = <0x00000000 0x00000000>, <0x00000000 0x00000001>;
>> + /* TX, RX cap ids */
>> + };
>>
> All these resources are used only by the mailbox controller driver.
> So, this should be the mailbox controller node, rather than the
> mailbox user.
> One option is to load gunyah-resource-manager as a module that relies
> on the gunyah-mailbox provider. That would also avoid the "Allow
> direct registration to a channel" hack patch.

A message queue to another guest VM wouldn't be known at boot time and
thus couldn't be described on the devicetree.

I think you need to implement of_xlate() ... or please tell me what
exactly you need to specify in the dt.

Dynamically created virtual machines can't be known on the dt, so there
is nothing to specify in the DT. There couldn't be a devicetree node for
the message queue client because that client only exists once the VM
is created by userspace.

The underlying "physical channel" is the synchronous SMC instruction,
which remains 1 irrespective of the number of mailbox instances
created.

I disagree that the physical channel is the SMC instruction. Regardless
though, there are num_online_cpus() "physical channels" with this
perspective.

So basically you are sharing one resource among users. Why doesn't the
RM request the "smc instruction" channel once and share it among
users?

I suppose in this scenario, a single mailbox channel would represent all
message queues? This would cause Linux to serialize *all* message queue
hypercalls. Sorry, I can only think of negative implications.

Error handling needs to move into clients: if a TX message queue becomes
full or an RX message queue becomes empty, then we'll need to return
error back to the client right away. The clients would need to register
for the RTS/RTR interrupts to know when to send/receive messages and
have retry error handling. If the mailbox controller retried for the
clients as currently proposed, then we could get into a scenario where a
message queue could never be ready to send/receive and thus stuck
forever trying to process that message. The effect here would be that
the mailbox controller becomes a wrapper to some SMC instructions that
aren't related at the SMC instruction level.
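The client-side retry described above can be pictured with a small userspace model. This is only a sketch under stated assumptions: struct model_msgq, QUEUE_DEPTH and both functions are hypothetical names invented for illustration, not Gunyah hypercalls or kernel mailbox API. A send into a full queue fails immediately, and the client retries only after something (here, a drain that stands in for the ready-to-send interrupt) makes room.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_DEPTH 2 /* arbitrary depth chosen for illustration */

/* Hypothetical model of a TX message queue; only tracks occupancy. */
struct model_msgq {
	size_t used;
};

/* Non-blocking send: report -EAGAIN right away when the queue is full. */
static int model_msgq_try_send(struct model_msgq *q)
{
	assert(q != NULL);
	if (q->used >= QUEUE_DEPTH)
		return -EAGAIN;
	q->used++;
	return 0;
}

/*
 * Stand-in for the receiver consuming one message. In the scheme
 * discussed above, this is the point where a ready-to-send interrupt
 * would tell the client it may retry.
 */
static bool model_msgq_drain_one(struct model_msgq *q)
{
	assert(q != NULL);
	if (q->used == 0)
		return false;
	q->used--;
	return true;
}
```

In this model the retry policy lives entirely in the caller: it sees -EAGAIN, waits for the drain notification, and calls model_msgq_try_send() again.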

A single channel would limit performance of SMP systems because only one
core could send/receive a message at a time. Message queues themselves
have no such limitation.

This is just an illusion. If Gunyah can handle multiple calls from a
VM in parallel, even with the "bind-client-to-channel" hack you can't
make sure different channels run on different cpu cores. If you are
ok with that, you could simply populate a mailbox controller with N
channels and allocate them in any order the clients ask.

I wanted to make sure I understood the ask here completely. On what
basis is N chosen? Who would be the mailbox clients?

A channel structure is cheap, so any number that is not likely to run
out. Say you have 10 possible users in a VM, set N=16. I know ideally
it should be precise and flexible but the gain in simplicity makes the
trade-off very acceptable.
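A minimal sketch of the fixed-pool idea, assuming N=16 as in the example above. It is a plain userspace model: struct chan_slot, struct chan_pool and the two functions are hypothetical names for illustration and are not the kernel's struct mbox_chan or mailbox framework calls. Slots are handed out first-free, in whatever order clients ask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_CHANNELS 16 /* "N": about 10 expected users, with headroom */

/* Hypothetical per-channel state; deliberately tiny, since slots are cheap. */
struct chan_slot {
	bool in_use;
	void *client; /* opaque owner of the slot */
};

struct chan_pool {
	struct chan_slot slots[POOL_CHANNELS];
};

/* Hand out the first free slot; return its index, or -1 if none is left. */
static int chan_pool_request(struct chan_pool *pool, void *client)
{
	assert(pool != NULL);
	for (int i = 0; i < POOL_CHANNELS; i++) {
		if (!pool->slots[i].in_use) {
			pool->slots[i].in_use = true;
			pool->slots[i].client = client;
			return i;
		}
	}
	return -1;
}

/* Return a slot to the pool; out-of-range indices are ignored. */
static void chan_pool_free(struct chan_pool *pool, int idx)
{
	assert(pool != NULL);
	if (idx < 0 || idx >= POOL_CHANNELS)
		return;
	pool->slots[idx].in_use = false;
	pool->slots[idx].client = NULL;
}
```

The cost of over-provisioning is one small struct per unused slot, which is the simplicity-for-precision trade-off being argued for; the model says nothing about how a slot learns its irqs or cap ids.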

I think I get the direction you are thinking now. N is chosen based off
of how many clients there might be. One mailbox controller will
represent all message queues and each channel will be one message queue.
There are some limitations that might make it more complex to implement
than having 1 message queue per controller like I have now.

My interpretation is that the mailbox controller knows the configuration
of its channels before being bound to a client. For dynamically created
message queues, the client would need to tell the controller about the
message queue configuration. I didn't find an example where a client
provides information about a channel to the controller.

1. need a mechanism to allow the client to provide the
gunyah_resources for the channel (i.e. the irqs and cap ids).

IIUC there is exactly one resource-manager in a VM. Right?
Looking at your code, TX and RX irq are used only by the mailbox
driver and are the same for all clients/users. So that should be a
property under the mailbox controller node. Not sure what cap ids are.

Ah -- "message queues" are a generic inter-VM communication mechanism offered by Gunyah. One use case for message queues is to communicate with the resource-manager, but other message queues can exist between other virtual machines. Those other message queues use different TX and RX irq and have different client protocols.

In mailbox terminology, we have one known channel at boot-up time (the resource manager). That known channel can inform Linux about other channels at runtime. The client (not the controller) decodes received data from the channel to discover the new channels.
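As a rough illustration of that flow, the sketch below starts with only the resource-manager queue known and lets the client add further queues from data it has decoded. Everything here is an assumption made for illustration: struct msgq_desc, its field names, MAX_MSGQ and the flat payload layout are hypothetical and do not reflect the actual resource-manager protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_MSGQ 8 /* arbitrary table size for the sketch */

/* Hypothetical description of one message queue (cap ids and irqs). */
struct msgq_desc {
	uint64_t tx_cap_id;
	uint64_t rx_cap_id;
	uint32_t tx_irq;
	uint32_t rx_irq;
};

struct msgq_table {
	struct msgq_desc q[MAX_MSGQ];
	size_t count;
};

/* At boot only the resource-manager queue is known; it takes entry 0. */
static void msgq_table_init(struct msgq_table *t, const struct msgq_desc *rm)
{
	assert(t != NULL && rm != NULL);
	memset(t, 0, sizeof(*t));
	t->q[0] = *rm;
	t->count = 1;
}

/*
 * Client-side decode: the payload received over a known queue is assumed
 * (for this sketch only) to be exactly one struct msgq_desc. Returns the
 * new entry's index, or -1 on a malformed payload or a full table.
 */
static int msgq_table_add_from_payload(struct msgq_table *t,
				       const void *payload, size_t len)
{
	struct msgq_desc d;

	assert(t != NULL);
	if (payload == NULL || len != sizeof(d) || t->count >= MAX_MSGQ)
		return -1;
	memcpy(&d, payload, sizeof(d));
	t->q[t->count] = d;
	return (int)t->count++;
}
```

The point of the model is only the direction of information: the table that a controller would consult is filled in by the client after boot, not read from a static description.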

One approach we found comes from pcc.c, which has its own request_channel function (pcc_mbox_request_channel). We could follow this approach as well...

2. Still need to have bind-client-to-channel patch since clients
aren't real devices and so shouldn't be on DT.

The clients may be virtual (serial, gpio etc) but the resource-manager
requires some mailbox hardware to communicate, so the resource-manager
should be the mailbox client (that further spawns virtual devices).

Yes, this is the design I'm aiming for. I also want to highlight that the resource-manager spawns Gunyah virtual devices such as message queue channels.

Thanks,
Elliot
