Re: [PATCH v5 15/20] RISC-V: KVM: Add timer functionality

From: Alexander Graf
Date: Fri Aug 23 2019 - 08:26:06 EST




On 23.08.19 14:11, Anup Patel wrote:
On Fri, Aug 23, 2019 at 5:19 PM Alexander Graf <graf@xxxxxxxxxx> wrote:



On 23.08.19 13:46, Anup Patel wrote:
On Fri, Aug 23, 2019 at 5:03 PM Graf (AWS), Alexander <graf@xxxxxxxxxx> wrote:



On 23.08.2019 at 13:05, Anup Patel <anup@xxxxxxxxxxxxxx> wrote:

On Fri, Aug 23, 2019 at 1:23 PM Alexander Graf <graf@xxxxxxxxxx> wrote:

On 22.08.19 10:46, Anup Patel wrote:
From: Atish Patra <atish.patra@xxxxxxx>

The RISC-V hypervisor specification doesn't have any virtual timer
feature.

Due to this, the guest VCPU timer will be programmed via SBI calls.
The host will use a separate hrtimer event for each guest VCPU to
provide timer functionality. We inject a virtual timer interrupt to
the guest VCPU whenever the guest VCPU hrtimer event expires.

The following features are not supported yet and will be added in
the future:
1. A time offset to adjust guest time from host time
2. A saved next event in guest vcpu for vm migration

Implementing these 2 bits right now should be trivial. Why wait?


[...]

... in fact, I feel like I'm missing something obvious here. How does
the guest trigger the timer event? What is the argument it uses for that
and how does that play with the tbfreq in the earlier patch?

We have an SBI call interface between the Hypervisor and the Guest. One
of the SBI calls allows the Guest to program a timer event. The next event
is specified in absolute cycles. The Guest can read time using the TIME CSR,
which returns the system timer value (@ tbfreq frequency).
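
To illustrate the description above: since the next event arrives as an absolute cycle count, the host has to turn it into a relative delay before it can arm a per-VCPU hrtimer. The following is only a sketch of that arithmetic, not the code from the patch. The function name and parameters are hypothetical, and it assumes tbfreq_hz is non-zero and below roughly 18 GHz so the intermediate product fits in 64 bits.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper (not from the patch): nanoseconds from "now" until
 * the guest's requested absolute timer event.
 *
 * now_cycles  - current TIME CSR value
 * next_cycles - absolute deadline passed by the guest via the SBI call
 * tbfreq_hz   - timebase frequency; assumed non-zero and < ~1.8e10
 */
uint64_t cycles_to_ns_delta(uint64_t now_cycles, uint64_t next_cycles,
                            uint64_t tbfreq_hz)
{
    uint64_t delta, secs, rem;

    /* A deadline in the past means the event should fire immediately. */
    if (next_cycles <= now_cycles)
        return 0;

    delta = next_cycles - now_cycles;

    /* Split into whole seconds and a remainder to avoid overflow. */
    secs = delta / tbfreq_hz;
    rem = delta % tbfreq_hz;

    /* Saturate instead of wrapping for very distant deadlines. */
    if (secs >= UINT64_MAX / NSEC_PER_SEC)
        return UINT64_MAX;

    return secs * NSEC_PER_SEC + (rem * NSEC_PER_SEC) / tbfreq_hz;
}
```

With a 10 MHz timebase, a deadline 10 cycles ahead yields 1000 ns, and a deadline that has already passed yields 0.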

Guest Linux will know the tbfreq from the DTB passed by QEMU/KVMTOOL,
and it has to be the same as the Host tbfreq.

The TBFREQ config register visible to user-space is a read-only CONFIG
register which tells user-space tools (QEMU/KVMTOOL) about Host tbfreq.

And it's read-only because you can not trap on TB reads?

There are no TB registers.

The tbfreq can only be known through a DT/ACPI kind of HW description
for both Host and Guest.

The KVM user-space tool needs to know TBFREQ so that it can set correct
value in generated DT for Guest Linux.

So what access methods do get influenced by TBFREQ? If it's only the SBI
timer, we can control the frequency, which means we can make TBFREQ
read/write.

There are two things influenced by TBFREQ:
1. TIME CSR which is a free running counter
2. SBI calls for programming next timer event

The Guest TIME CSR will be at same rate as Host TIME CSR so
we cannot show different TBFREQ to Guest Linux.

In future, we will be having a dedicated RISC-V timer extension which
will have all programming done via CSRs but until then we are stuck
with TIME CSR + SBI call combination.

Please make sure that in a future revision of the spec either

a) TIME CSR can be trapped or
b) TIME CSR can be virtualized (virtual TIME READ has offset and multiplier on phys TIME READ applied)

and the same goes for the timer extension - either make it all trappable or all properly adjustable. You need to be doubly cautious there that people don't design something that breaks live migration between hosts that have a different TBFREQ.
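
To make option b) concrete, here is a rough sketch of what "offset and multiplier applied on phys TIME READ" could look like in software terms. Nothing here comes from a spec or from the patch: the struct, field names, and fixed-point layout are assumptions for illustration only, and the 128-bit intermediate relies on the GCC/Clang __int128 extension.

```c
#include <stdint.h>

/*
 * Hypothetical per-guest scaling parameters (illustration only).
 * virt = ((phys * mult) >> shift) + offset, with wrap-around on 64 bits.
 */
struct vtime_params {
    uint64_t mult;   /* fixed-point multiplier */
    uint32_t shift;  /* number of fractional bits in mult, 0..63 */
    uint64_t offset; /* added modulo 2^64, so it can act as negative */
};

/* Scale a physical TIME value to the guest's timebase, without offset. */
uint64_t vtime_scale(uint64_t phys_time, const struct vtime_params *p)
{
    /* 128-bit intermediate so phys * mult cannot overflow. */
    unsigned __int128 scaled = (unsigned __int128)phys_time * p->mult;

    return (uint64_t)(scaled >> p->shift);
}

/* What a virtualized TIME read would return to the guest. */
uint64_t vtime_read(uint64_t phys_time, const struct vtime_params *p)
{
    return vtime_scale(phys_time, p) + p->offset;
}

/*
 * On resume after migration, pick the offset so guest time continues
 * from the value saved on the source host.
 */
void vtime_set_resume_offset(struct vtime_params *p,
                             uint64_t saved_guest_time,
                             uint64_t dest_phys_time_now)
{
    p->offset = saved_guest_time - vtime_scale(dest_phys_time_now, p);
}
```

For example, a guest that was started at 10 MHz and lands on a 20 MHz host would use mult = 1 << 31 with shift = 32 (a factor of 0.5), so 20 host cycles advance the guest's view of time by 10 cycles.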


Thanks,

Alex