Re: [RFC PATCH V2 5/5] vhost: access vq metadata through kernel virtual address

From: Jan Kara
Date: Mon Mar 11 2019 - 10:45:36 EST


On Thu 07-03-19 16:27:17, Andrea Arcangeli wrote:
> > driver that GUP page for hours/days/weeks/months ... obviously the
> > race window is big enough here. It affects many fs (ext4, xfs, ...)
> > in different ways. I think ext4 is the most obvious because of the
> > kernel log trace it leaves behind.
> >
> > Bottom line is for set_page_dirty to be safe you need the following:
> > lock_page()
> > page_mkwrite()
> > set_pte_with_write()
> > unlock_page()
>
> I also wondered why ext4 writepage doesn't recreate the bh if they got
> dropped by the VM and page->private is 0. I mean, page->index and
> page->mapping are still there, that's enough info for writepage itself
> to take a slow path and calls page_mkwrite to find where to write the
> page on disk.

There are two problems:

1) What to do with the errors that page_mkwrite() can generate (ENOMEM, ENOSPC,
EIO)? On a page fault you just propagate them to userspace; in set_page_dirty()
you have no way to report them, so you just silently lose data.

2) page_mkwrite() needs various locks for protection and may have to do some IO.
set_page_dirty() is a rather uncertain context in which to acquire locks or do IO...
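
To make problem 1 concrete, here is a minimal userspace sketch in C. It is
not kernel code: struct model_page and every model_* function are
hypothetical stand-ins, and the error is injected by the caller. It only
shows that the fault path has a return value to carry the error, while a
dirtying path with no error return can only drop it.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a page; not the kernel's struct page. */
struct model_page {
	bool has_buffers;	/* stands in for the buffers behind page->private */
	bool dirty;
};

/*
 * Stand-in for a filesystem's page_mkwrite(): it can fail with
 * -ENOMEM, -ENOSPC or -EIO. The failure is injected by the caller.
 */
static int model_page_mkwrite(struct model_page *page, int injected_error)
{
	if (injected_error)
		return injected_error;
	page->has_buffers = true;
	return 0;
}

/* Fault path: the caller gets the error and can pass it to userspace. */
static int model_write_fault(struct model_page *page, int injected_error)
{
	int err = model_page_mkwrite(page, injected_error);

	if (err)
		return err;	/* page stays clean, error is visible */
	page->dirty = true;
	return 0;
}

/*
 * Dirtying path (assumption: modeled with no way to return an error).
 * If the slow path fails here, nobody is told about it.
 */
static void model_set_page_dirty(struct model_page *page, int injected_error)
{
	if (!page->has_buffers)
		(void)model_page_mkwrite(page, injected_error);	/* error dropped */
	page->dirty = true;
}
```

With -ENOSPC injected, model_write_fault() returns the error and leaves the
page clean, whereas model_set_page_dirty() leaves a dirty page with no
backing buffers and no error reported anywhere.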

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR