Re: [PATCH RFC 1/2] s390x: mm: allow mixed page table types (2k and 4k)

From: Christian Borntraeger
Date: Fri Jun 02 2017 - 03:11:16 EST


On 06/01/2017 02:59 PM, David Hildenbrand wrote:
>
>> diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
>> index d4d409b..b22c2b6 100644
>> --- a/arch/s390/mm/pgtable.c
>> +++ b/arch/s390/mm/pgtable.c
>> @@ -196,7 +196,7 @@ static inline pgste_t ptep_xchg_start(struct mm_struct *mm,
>>  {
>>  	pgste_t pgste = __pgste(0);
>>  
>> -	if (mm_has_pgste(mm)) {
>> +	if (pgtable_has_pgste(mm, __pa(ptep))) {
>>  		pgste = pgste_get_lock(ptep);
>>  		pgste = pgste_pte_notify(mm, addr, ptep, pgste);
>>  	}
>> @@ -207,7 +207,7 @@ static inline pte_t ptep_xchg_commit(struct mm_struct *mm,
>>  				    unsigned long addr, pte_t *ptep,
>>  				    pgste_t pgste, pte_t old, pte_t new)
>>  {
>> -	if (mm_has_pgste(mm)) {
>> +	if (pgtable_has_pgste(mm, __pa(ptep))) {
>>  		if (pte_val(old) & _PAGE_INVALID)
>>  			pgste_set_key(ptep, pgste, new, mm);
>>  		if (pte_val(new) & _PAGE_INVALID)
>
> I think these two checks are wrong. We really have to test here the
> mapcount bit only (relying on mm_has_pgste(mm) is wrong in case the global
> vm.allocate_pgste is set).
>
> But before I continue working on this, I think it makes sense to clarify
> if something like that would be acceptable at all.

I think that is up to Martin to decide. Given that Fedora, SUSE and Ubuntu always
enable this sysctl when the qemu package is installed (other distros as well?), I think
we should seriously consider changing things. I see 2 options:

1. dropping 2k page tables completely
   pro: - simplifies pagetable code (e.g. look at page_table_alloc code)
        - we could get rid of a lock in the pgtable allocation path
          (mm->context.pgtable_lock)
        - I am not aware of any performance impact due to the 4k page tables
        - transparent for old QEMUs
        - KVM will work out of the box
   con: - higher page table memory usage for non-KVM processes

2. go with your approach
   pro: - lower page table memory usage for non-KVM processes
        - KVM will work out of the box
        - no additional overhead for non-KVM processes
   con: - higher overhead for KVM processes during paging (since we are going
          to use IPTE or friends anyway, the question is: does it matter?)
        - needs QEMU change
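
To put a rough number on the memory usage point in both options: an s390
page table has 256 entries of 8 bytes (2k) and maps 1 MB; with PGSTEs it is
doubled to 4k. The sketch below only does this arithmetic. pgtable_bytes()
is a made-up helper for illustration (not a kernel function), and it assumes
fully populated last-level tables and ignores the upper-level (segment/region)
tables.

```c
#include <stdio.h>

/* Architecture facts: 256 PTEs of 8 bytes each per page table. */
#define PTRS_PER_PGTABLE 256UL
#define PTE_BYTES        8UL

/*
 * Hypothetical helper (illustration only, not kernel code):
 * bytes of last-level page tables needed to map `mapped_mb` megabytes.
 * One page table maps 256 * 4k = 1 MB, so one table is needed per MB.
 * With PGSTEs every table carries a second 2k half for the PGSTEs.
 */
static unsigned long pgtable_bytes(unsigned long mapped_mb, int with_pgste)
{
	unsigned long table_bytes = PTRS_PER_PGTABLE * PTE_BYTES; /* 2k */

	if (with_pgste)
		table_bytes *= 2; /* 4k: PTEs plus PGSTEs */

	return mapped_mb * table_bytes;
}

/* Print the 2k vs 4k cost for a given mapping size in MB. */
static void print_pgtable_cost(unsigned long mapped_mb)
{
	printf("%lu MB mapped: %lu bytes (2k tables), %lu bytes (4k tables)\n",
	       mapped_mb, pgtable_bytes(mapped_mb, 0),
	       pgtable_bytes(mapped_mb, 1));
}
```

So under these assumptions a non-KVM process with 1 GB mapped would pay
4 MB instead of 2 MB for its last-level page tables with option 1.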

Christian