Re: [PATCH] mmu_notifier, kvm: Introduce dirty bit tracking in spte and mmu notifier to help KSM dirty bit tracking

From: Nai Xia
Date: Wed Jun 22 2011 - 07:33:58 EST


On Wednesday 22 June 2011 19:28:08 Avi Kivity wrote:
> On 06/22/2011 02:24 PM, Avi Kivity wrote:
> > On 06/22/2011 02:19 PM, Izik Eidus wrote:
> >> On 6/22/2011 2:10 PM, Avi Kivity wrote:
> >>> On 06/22/2011 02:05 PM, Izik Eidus wrote:
> >>>>>> + spte = rmap_next(kvm, rmapp, NULL);
> >>>>>> + while (spte) {
> >>>>>> + int _dirty;
> >>>>>> + u64 _spte = *spte;
> >>>>>> + BUG_ON(!(_spte & PT_PRESENT_MASK));
> >>>>>> + _dirty = _spte & PT_DIRTY_MASK;
> >>>>>> + if (_dirty) {
> >>>>>> + dirty = 1;
> >>>>>> + clear_bit(PT_DIRTY_SHIFT, (unsigned long *)spte);
> >>>>>> + }
> >>>>>
> >>>>> Racy. Also, needs a tlb flush eventually.
> >>>> +
> >>>>
> >>>> Hi, one of the issues is that the whole point of this patch is to
> >>>> not do a tlb flush.
> >>>> But I see your point: other users will not expect such
> >>>> behavior, so maybe we need a parameter
> >>>> flush_tlb=?, or another mmu notifier call?
> >>>>
> >>>
> >>> If you don't flush the tlb, a subsequent write will not see that
> >>> spte.d is clear and the write will happen. So you'll see the page
> >>> as clean even though it's dirty. That's not acceptable.
> >>>
> >>
> >> Yes, but this is exactly what we want from this use case:
> >> Right now ksm calculates the page hash to see if it was changed; the
> >> idea behind this patch is to use the dirty bit instead.
> >> However, the guest might not really like the fact that we will flush
> >> its tlb over and over again, especially in a periodic scan like the
> >> one ksm does.
> >
> > I see.
>
> Actually, this is dangerous. If we use the dirty bit for other things,
> we will get data corruption.

Yeah, yeah. I actually clarified this in a reply to Chris, who raised a
similar concern: we are currently the _only_ user. :)
We can add the flushing once someone else needs to rely on this bit.
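
As for the "Racy" part of the comment on the quoted hunk: the read of *spte
followed by a separate clear_bit() can be folded into one atomic
read-modify-write. Below is a minimal user-space sketch of that idea, not the
patch itself. It assumes C11 atomics as a stand-in for the kernel's bit
operations, a plain array as a stand-in for the rmap chain, and bit 6 as the
dirty bit (as in x86 page table entries). The names DIRTY_SHIFT,
spte_test_and_clear_dirty() and test_and_clear_dirty_all() are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DIRTY_SHIFT 6                          /* assumption: x86 PTE dirty bit */
#define DIRTY_MASK  (UINT64_C(1) << DIRTY_SHIFT)

/*
 * Hypothetical helper: clear the dirty bit and report whether it was set,
 * in a single atomic step, so a concurrent update of the same entry cannot
 * slip in between the test and the clear.
 */
static bool spte_test_and_clear_dirty(_Atomic uint64_t *spte)
{
    uint64_t old = atomic_fetch_and(spte, ~DIRTY_MASK);

    return (old & DIRTY_MASK) != 0;
}

/* Walk all entries mapping one page; the array stands in for the rmap chain. */
static bool test_and_clear_dirty_all(_Atomic uint64_t *sptes, int n)
{
    bool dirty = false;

    for (int i = 0; i < n; i++)
        dirty |= spte_test_and_clear_dirty(&sptes[i]);

    return dirty;
}
```

This only addresses the race on the spte itself. It does nothing about the
stale TLB entry Avi describes, where a later write does not set the bit again
because the TLB still holds the entry as dirty.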

>
> For example, we might want to map clean host pages as writeable-clean in
> the spte on a read fault so that we don't get a page fault when they
> eventually get written.
>
>