Understanding Linux addr space, malloc, and heap

From: Vincent W. Freeh
Date: Fri Oct 21 2005 - 07:47:08 EST


I am trying to understand the Linux addr space. I figured someone might be able to shed some light on it. Or at least point me to some sources that will help.

I don't understand what malloc and the heap are doing in my process. According to /proc/<pid>/maps, the memory from the heap to the stack initially looks like this. I show only the four "maps" from the heap upward, in a slightly altered form consisting of start_addr, end_addr, size_in_pgs, permissions, and path_if_one:

0x08d42000 - 0x08d63000 (33 pgs) rw-p path `[heap]'
0xb7ef8000 - 0xb7ef9000 (1 pgs) rw-p
0xb7f09000 - 0xb7f0b000 (2 pgs) rw-p
0xbfaf5000 - 0xbfb0b000 (22 pgs) rw-p path `[stack]'
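
For anyone who wants to reproduce a listing in this altered form, here is a minimal sketch that reads /proc/self/maps and prints start, end, size in pages, permissions, and path. The names struct map_entry, parse_maps_line, and dump_self_maps are made up for illustration, and the sketch assumes the usual "start-end perms offset dev inode path" line layout; lines longer than the buffer are simply skipped.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* One parsed line of /proc/<pid>/maps (hypothetical helper type). */
struct map_entry {
    unsigned long start;   /* first address of the mapping */
    unsigned long end;     /* one past the last address */
    unsigned long pages;   /* (end - start) / page size */
    char perms[5];         /* e.g. "rw-p" */
    char path[256];        /* "" for anonymous mappings */
};

/* Parse one maps line; returns 0 on success, -1 if the line is malformed. */
int parse_maps_line(const char *line, unsigned long page_size,
                    struct map_entry *e)
{
    assert(line != NULL && e != NULL);
    e->path[0] = '\0';
    /* offset, device and inode are skipped; the path field is optional */
    int n = sscanf(line, "%lx-%lx %4s %*s %*s %*s %255[^\n]",
                   &e->start, &e->end, e->perms, e->path);
    if (n < 3 || page_size == 0 || e->end < e->start)
        return -1;
    e->pages = (e->end - e->start) / page_size;
    return 0;
}

/* Print every mapping of the calling process in the altered format. */
int dump_self_maps(FILE *out)
{
    FILE *in = fopen("/proc/self/maps", "r");
    if (in == NULL)
        return -1;
    unsigned long page_size = (unsigned long)sysconf(_SC_PAGESIZE);
    char line[512];
    struct map_entry e;
    while (fgets(line, sizeof line, in) != NULL) {
        if (parse_maps_line(line, page_size, &e) != 0)
            continue;
        fprintf(out, "0x%08lx - 0x%08lx (%lu pgs) %s",
                e.start, e.end, e.pages, e.perms);
        if (e.path[0] != '\0')
            fprintf(out, " path `%s'", e.path);
        fputc('\n', out);
    }
    fclose(in);
    return 0;
}
```

Calling dump_self_maps(stdout) before and after a malloc gives two listings that can be compared side by side.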

I cannot touch (rd, wr, or even mprotect) the map immediately above the heap; it must be a sandboxing page. Before any malloc, brk = 0x08d42000, the first page of the heap.

If I malloc <= 33 pages, the memory comes from the first map above (the heap) and brk changes as appropriate. However, some new maps appear between the first two maps above, and the second one gets bigger. Still, all data comes from the heap, and brk remains below the top of the heap, as shown below.

0x08d42000 - 0x08d63000 (33 pgs) ---p path `[heap]'
0xb7d00000 - 0xb7d01000 (1 pgs) rw-p
0xb7d01000 - 0xb7d21000 (32 pgs) ---p
0xb7d21000 - 0xb7e00000 (223 pgs) ---p
0xb7ef7000 - 0xb7ef9000 (2 pgs) rw-p
0xb7f09000 - 0xb7f0b000 (2 pgs) rw-p
0xbfaf5000 - 0xbfb0b000 (22 pgs) rw-p path `[stack]'

Now if I malloc > 33 pages, the data comes from the heap and the next map(s). That is, the 34th page is at 0xb7d01000 in the example above. What is going on?
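
One way to see where each allocation lands is to compare the pointer malloc returns against the program break reported by sbrk(0). The sketch below does that for a range of request sizes. struct alloc_probe, probe_malloc, and report_probes are hypothetical names, and "below brk" only means the address is lower than the current break; the sketch does not know where the heap starts.

```c
#define _DEFAULT_SOURCE   /* for sbrk() under strict -std= modes */
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Result of one probe (a hypothetical helper type, not a libc API). */
struct alloc_probe {
    uintptr_t ptr;         /* address malloc returned */
    uintptr_t brk_before;  /* program break before the call */
    uintptr_t brk_after;   /* program break after the call */
    int below_brk;         /* 1 if ptr is lower than brk_after */
};

/* Allocate `size` bytes, note where the block landed, then free it.
 * Returns 0 on success, -1 if malloc failed. */
int probe_malloc(size_t size, struct alloc_probe *r)
{
    assert(r != NULL);
    r->brk_before = (uintptr_t)sbrk(0);
    void *p = malloc(size);
    if (p == NULL)
        return -1;
    r->ptr = (uintptr_t)p;
    r->brk_after = (uintptr_t)sbrk(0);
    r->below_brk = r->ptr < r->brk_after;
    free(p);
    return 0;
}

/* Print one line per request size: 1, 2, 4, ... 256 pages. */
void report_probes(FILE *out)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    for (size_t pages = 1; pages <= 256; pages *= 2) {
        struct alloc_probe r;
        if (probe_malloc(pages * page, &r) != 0)
            continue;
        fprintf(out, "%3zu pgs: ptr=0x%" PRIxPTR " brk 0x%" PRIxPTR
                     " -> 0x%" PRIxPTR " (%s brk)\n",
                pages, r.ptr, r.brk_before, r.brk_after,
                r.below_brk ? "below" : "above");
    }
}
```

If the allocator serves some requests with mmap instead of brk (glibc documents an M_MMAP_THRESHOLD tunable in mallopt(3) for this), those requests would show up as "above brk" here. Whether that is what happens in the listings above is not something this sketch can settle.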

Another thing I don't understand: I can touch maps 3 & 4 above (0xb7d01000 & 0xb7d21000), both rd and wr. However, I cannot mprotect the 4th map. mprotect does not fail; it just doesn't change the permissions. I can mprotect the 32 pages in map 3. This is my initial problem: I can only mprotect 65 pages. The 66th page (from map 4) silently doesn't mprotect.
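
To tell a real failure from a call that succeeds without the expected effect, it may help to check both the return value of mprotect (it returns -1 and sets errno on error, for example EINVAL when the address is not page-aligned) and the permissions the kernel reports afterwards. The sketch below does both. perms_at and mprotect_checked are made-up helper names, and the check relies on /proc/self/maps being readable.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Copy the permission string ("rw-p" etc.) of the mapping that contains
 * `addr` into `perms`. Returns 0 if a mapping was found, -1 otherwise. */
int perms_at(const void *addr, char perms[5])
{
    assert(perms != NULL);
    FILE *in = fopen("/proc/self/maps", "r");
    if (in == NULL)
        return -1;
    unsigned long a = (unsigned long)(uintptr_t)addr, start, end;
    char line[512], p[5];
    int found = -1;
    while (found != 0 && fgets(line, sizeof line, in) != NULL) {
        if (sscanf(line, "%lx-%lx %4s", &start, &end, p) == 3 &&
            a >= start && a < end) {
            memcpy(perms, p, sizeof p);
            found = 0;
        }
    }
    fclose(in);
    return found;
}

/* mprotect a page-aligned range, then confirm that the maps file shows
 * the permission string `expect` (e.g. "---p" for PROT_NONE).
 * Returns 0 only if the call succeeded and the maps entry matches. */
int mprotect_checked(void *addr, size_t len, int prot, const char *expect)
{
    char perms[5];
    if (mprotect(addr, len, prot) != 0) {
        fprintf(stderr, "mprotect(%p, %zu): %s\n", addr, len, strerror(errno));
        return -1;
    }
    if (perms_at(addr, perms) != 0)
        return -1;
    if (strcmp(perms, expect) != 0) {
        fprintf(stderr, "%p: expected %s, maps shows %s\n", addr, expect, perms);
        return -1;
    }
    return 0;
}
```

For example, mprotect_checked(page_addr, page_size, PROT_NONE, "---p") on the 66th page would report either the errno from mprotect or the permissions the kernel still shows for that page.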

Other processes I looked at seem very different. Both tcsh and emacs (appear to) have the 1 pg sandbox just below the stack (good place) and much larger heaps.

First, please fix any erroneous statements/assumptions above. Next I have many questions. A few follow.

* How does the heap work? I learned/teach that the heap is a contiguous chunk of memory that holds dynamically-allocated memory. That doesn't appear to be the case.

* The man page says you can only mprotect mmap-able pages. But what are these? How can I tell?

* Why does mprotect silently fail?

* I thought brk indicated the top of the heap and that all dynamic memory would be between bss end and brk. That's not true. What is brk for then?

Thanks,
v.
--
Vincent (Vince) W. Freeh
Dept of Computer Science
North Carolina State University
http://www.csc.ncsu.edu/faculty/freeh
919-513-7196