Re: Using pmem from a driver exposing a memory mapping (mmap) to userspace

From: Boaz Harrosh
Date: Wed Apr 29 2015 - 03:07:04 EST


On 04/28/2015 06:35 PM, Mathieu Desnoyers wrote:
> Hi!
>
> I'm currently adapting lttng-modules to use DAX and pmem.
> It will allow LTTng buffers to be recovered after a kernel
> crash. I've moved pretty much all struct page pointers to
> page frame numbers, as I remember being told that pmem does
> not have struct page.
>
> Now I'm looking into adapting my mmap and page fault handler
> implementation (based on struct page) to a page-frame number
> based implementation when the ring buffer is backed by
> persistent memory, which will probably not require any page
> fault handler at all when backed by pmem+dax memory.

There will be page faults, at least once for every combination
of application+page. Sure, there may be only one per a+p
until the application does a close on the file.

Your job can be simple if you use the pmem's inode. You know
how each block-device is a mini file system with a single file.
Use bdev->bd_inode to get to the one inode associated with
your pmem bdev. Well this inode is IS_DAX(), so if you supply
your own get_block() function to the DAX handlers you need
not duplicate any mmap code at all.

(You can also use the same DAX infrastructure for the read/write_iter
implementation)
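
To make the get_block() idea concrete, here is a minimal userspace
model of the translation step only. It is a sketch, not kernel code:
struct ring_extent, struct block_mapping, ring_get_block() and
MODEL_BLOCK_SIZE are hypothetical names made up for illustration, and
it assumes each buffer occupies one contiguous run of blocks on the
pmem device. A real get_block() would fill in a struct buffer_head
instead of the stand-in struct used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_BLOCK_SIZE 4096u  /* assumed block size for this model */

/* Hypothetical: one buffer occupies a contiguous extent of the device. */
struct ring_extent {
    uint64_t first_block;  /* first device block of the extent */
    uint64_t nr_blocks;    /* extent length, in blocks */
};

/* Stand-in for the fields a real get_block() would set on a buffer_head. */
struct block_mapping {
    int      mapped;     /* 1 if dev_block is valid */
    uint64_t dev_block;  /* block number on the pmem device */
    size_t   size;       /* bytes covered by this mapping */
};

/*
 * Translate a logical block of the device file (iblock) into a block on
 * the pmem device. Returns 0 on success, -1 if iblock is out of range.
 */
int ring_get_block(const struct ring_extent *ext, uint64_t iblock,
                   struct block_mapping *out)
{
    if (iblock >= ext->nr_blocks) {
        out->mapped = 0;
        return -1;
    }
    out->mapped = 1;
    out->dev_block = ext->first_block + iblock;
    out->size = MODEL_BLOCK_SIZE;
    return 0;
}
```

With an extent starting at device block 256 and 4 blocks long, logical
block 3 maps to device block 259, and logical block 4 is rejected.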

>
> My current work is in this branch: https://github.com/compudj/lttng-modules-dev/tree/persistent-memory-buffers
> (see last commits)
>
> LTTng-modules supports both mmap() and splice(), but I plan
> to only provide mmap() support for persistent memory, since
> splice() really requires struct page.
>

No, splice works just fine. In fact, a NULL .splice_XXX vector
will use default_file_splice_read/write, which does a
copy and uses your regular read/write_iter vectors. So
leave .splice NULL and it will be supported through your
read/write_iter interface.
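
The fallback described above can be modelled in plain userspace C.
This is only an illustration of the dispatch pattern, not the kernel
implementation: every model_* name is hypothetical, and a simple
read() callback stands in for the read_iter vector.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct model_file;

/* Hypothetical, simplified stand-in for struct file_operations. */
struct model_file_ops {
    size_t (*read)(struct model_file *f, char *dst, size_t len);
    size_t (*splice_read)(struct model_file *f, char *pipe_buf, size_t len);
};

struct model_file {
    const struct model_file_ops *ops;
    const char *data;  /* backing memory, standing in for pmem */
    size_t size;
    size_t pos;
};

/* The driver's regular read path: copy from the backing memory. */
size_t model_pmem_read(struct model_file *f, char *dst, size_t len)
{
    size_t left = f->size - f->pos;

    if (len > left)
        len = left;
    memcpy(dst, f->data + f->pos, len);
    f->pos += len;
    return len;
}

/* Default splice: no struct page needed, it copies via the read path. */
size_t model_default_splice_read(struct model_file *f, char *pipe_buf,
                                 size_t len)
{
    return f->ops->read(f, pipe_buf, len);
}

/* Dispatch: a NULL splice_read vector selects the copying default. */
size_t model_splice_to(struct model_file *f, char *pipe_buf, size_t len)
{
    if (f->ops->splice_read)
        return f->ops->splice_read(f, pipe_buf, len);
    return model_default_splice_read(f, pipe_buf, len);
}
```

A file whose ops table sets only .read and leaves .splice_read NULL
still serves model_splice_to() requests, at the cost of one copy.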

> Are there existing driver mmap implementations doing similar
> things, or do you have recommendations on how to implement
> this ?
>

DAX.c lib does all that you need. You only need your own
translation from your device files to a chunk of pmem.

That's how I'd do it. Good luck. CC me on the patches and
I'll review them.

Cheers
Boaz

> Thanks,
> Mathieu
