Re: [PATCH] Export shmem_file_setup and shmem_getpage for DRM-GEM

From: Keith Packard
Date: Mon Aug 04 2008 - 06:26:57 EST


On Mon, 2008-08-04 at 19:02 +1000, Nick Piggin wrote:

> This is how I'd suggested it work as well. I think a little bit
> more effort should be spent looking at making this work.

What I may be able to do is create a file, then hand it to my driver and
close the fd. That would avoid any ulimit or low-fd issues.
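As a sketch, that could look something like the following in the driver (kernel-context code, not runnable standalone; the object field name and the flags argument here are illustrative, not the actual GEM code):

```c
/* Create an unlinked shmem-backed file owned by the driver.  Once
 * user space closes its fd, only this struct file reference keeps
 * the object alive, so per-process fd limits no longer apply. */
struct file *filp;

filp = shmem_file_setup("drm mm object", size, 0);
if (IS_ERR(filp))
	return PTR_ERR(filp);

obj->filp = filp;	/* driver holds the reference */
/* ... later, fput(obj->filp) when the object is destroyed */
```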

> Mapping the file into an address space might be a way to make it
> work (using get_user_pages to get the struct page). splice might
> also work. read_mapping_page or similar could also be something to
> look at. But using shmem_getpage seems wrong because it circumvents
> the vfs API.

It seems fairly ugly to map the object to user space just to get page
pointers; the expense of constructing that mapping will be entirely
wasted most of the time.

Would it be imprudent to use pagecache_write_begin/pagecache_write_end
here? For shmem, those appear to point at functions that will do what I
need. Of course, it will cause extra page-outs, as each page will be
marked dirty even if the GPU never writes it.
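Roughly, grabbing one object page that way would look like this (kernel-context sketch, error handling trimmed; the obj->filp naming is illustrative):

```c
/* Pin one page of the object through the generic pagecache write
 * hooks instead of calling shmem_getpage() directly. */
struct page *page;
void *fsdata;
loff_t pos = (loff_t)page_index << PAGE_SHIFT;
int ret;

ret = pagecache_write_begin(obj->filp, obj->filp->f_mapping,
			    pos, PAGE_SIZE, 0, &page, &fsdata);
if (ret)
	return ret;

/* ... hand "page" to the GPU ... */

/* write_end dirties the page even if the GPU never touches it,
 * which is the extra page-out cost mentioned above. */
pagecache_write_end(obj->filp, obj->filp->f_mapping,
		    pos, PAGE_SIZE, PAGE_SIZE, page, fsdata);
```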

While shmem offers good semantics for graphics objects, it doesn't seem
like it is unique in any way, and it seems like it should be possible to
do this operation on any file system.

> If you genuinely have problems that can't be fit into existing
> APIs without significant modification, and that is specific just to
> your app, then we could always look at making special cases for you.
> But it would be nice if we generically solve problems you have with
> processes manipulating thousands of files.

There are some unique aspects to this operation which don't really have
parallels in other environments.

I'm doing memory management for a co-processor which uses the same pages
as the CPU. So, I need to allocate many pages that are just handed to
the GPU and never used by the CPU at all. Most rendering buffers are of
this form -- if you ever need to access them from the CPU, you've done
something terribly wrong.

Then there are textures which are constructed by the CPU (usually) and
handed over to the GPU for the entire lifetime of the application. These
are numerous enough that we need to be able to page them to disk; the
kernel driver will fault them back in when the GPU needs them again.

On the other hand, there are command and vertex buffers which are
constructed in user space and passed to the GPU for execution. These
operate just like any bulk-data transfer, and, in fact, I'm using the
pwrite API to transmit this data. For these buffers, the entire key is
to make sure you respect the various caches to keep them from getting
trashed.

--
keith.packard@xxxxxxxxx
