Memory management problems (__get_free_pages, remap_page_range and friends)

From: Karim Yaghmour (karym@info.polymtl.ca)
Date: Thu Mar 16 2000 - 04:30:35 EST


Hello,

While enhancing the capabilities of the trace driver used
by the Linux Trace Toolkit I came across two quite hairy
problems. The issues revolve around the daemon using mmap() on
the trace buffers so that the collected data doesn't have to
be read() by the trace daemon. Rather, the daemon "mmap()"s
the trace "device" and the passes on the returned addresses
onto to write() which then comits data to a file. The goals:
speed up the comit step and decrease trace pollution.

(Note: these observations regard kernel 2.2.13. I know this
isn't the latest, so if my observations don't apply to the
latest kernel (devel or stable), please do not hesitate to
tell me about it).

1) When using __get_free_pages to allocate large chunks
of memory, the kernel doesn't allow me to allocate anything
with an order above 9. Moreover, if I install/remove the
trace module a couple of times (each time it is removed,
the cleanup routine frees the allocated pages using free_pages)
allocation starts to fail. That is, __get_free_pages returns
NULL. Has this been observed before or am I doing something
wrong? Isn't there a way around this? In the case where a
disk partition had to be checked upon boot, then even the
first __get_free_pages won't work.

2) Since the allocated region has to be visible from user
space with as less trouble as possible, I would've liked to
map the allocated region directly into the daemon's address
space. Unfortunatly, the road to Rome was long ... I tried
remap_page_range, but it obviously failed to work. I tried
using the virtual address at which the buffers where allocated
in the kernel space (on the intel mapping is one-to-one and
I said to myself that if all else failed why not give this one
a try), this failed too. I came back to remap_page_range
and played around with the PG_reserved bit and it sadly worked
(sadly because I would have liked it to be cleaner). I also
read the part in the Rubini book where it says that it would
be better to use a nopage handler. I would agree in most
cases, but not in this one. The first reason is speed. I want
to make sure that the impact tracing has on the system
is minimal. The second reason is behavior. That is, tracing
shouldn't modify the system's behavior, which it will if it
has to start dealing with page faults etc. Is there a better
way to do things?

Any suggestions and comments are welcomed.

===================================================
                 Karim Yaghmour
               karym@info.polymtl.ca
            Operating System Consultant
 (Linux kernel, real-time and distributed systems)
===================================================

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Mar 23 2000 - 21:00:17 EST