Re: [RFC PATCH 2/3] KVM: arm64: Fix handling of merging tables into a block entry

From: wangyanan (Y)
Date: Mon Nov 30 2020 - 10:25:22 EST



On 2020/11/30 21:34, Will Deacon wrote:
On Mon, Nov 30, 2020 at 08:18:46PM +0800, Yanan Wang wrote:
In dirty logging case(logging_active == True), we need to collapse a block
entry into a table if necessary. After dirty logging is canceled, when merging
tables back into a block entry, we should not only free the non-huge page
tables but also unmap the non-huge mapping for the block. Without the unmap,
inconsistent TLB entries for the pages in the the block will be created.

We could also use unmap_stage2_range API to unmap the non-huge mapping,
but this could potentially free the upper level page-table page which
will be useful later.

Signed-off-by: Yanan Wang <wangyanan55@xxxxxxxxxx>
---
arch/arm64/kvm/hyp/pgtable.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 696b6aa83faf..fec8dc9f2baa 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -500,6 +500,9 @@ static int stage2_map_walk_table_pre(u64 addr, u64 end, u32 level,
return 0;
}
+static void stage2_flush_dcache(void *addr, u64 size);
+static bool stage2_pte_cacheable(kvm_pte_t pte);
+
static int stage2_map_walk_leaf(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
struct stage2_map_data *data)
{
@@ -507,9 +510,17 @@ static int stage2_map_walk_leaf(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
struct page *page = virt_to_page(ptep);
if (data->anchor) {
- if (kvm_pte_valid(pte))
+ if (kvm_pte_valid(pte)) {
+ kvm_set_invalid_pte(ptep);
+ kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu,
+ addr, level);
put_page(page);
This doesn't make sense to me: the page-table pages we're walking when the
anchor is set are not accessible to the hardware walker because we unhooked
the entire sub-table in stage2_map_walk_table_pre(), which has the necessary
TLB invalidation.

Are you seeing a problem in practice here?

Yes, I indeed find a problem in practice.

When the migration was cancelled, a TLB conflic abort  was found in guest.

This problem is fixed before rework of the page table code, you can have a look in the following two links:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3c3736cd32bf5197aed1410ae826d2d254a5b277

https://lists.cs.columbia.edu/pipermail/kvmarm/2019-March/035031.html

+ if (stage2_pte_cacheable(pte))
+ stage2_flush_dcache(kvm_pte_follow(pte),
+ kvm_granule_size(level));
I don't understand the need for the flush either, as we're just coalescing
existing entries into a larger block mapping.

In my opinion, after unmapping, it is necessary to ensure the cache coherency, because it is unknown whether the future mapping memory attribute is changed or not (cacheable -> non_cacheable) theoretically.

Will
.