Ahmon Dancy: NFS mmap problem?

Ahmon Dancy (dancy@franz.com)
Mon, 06 May 1996 15:34:55 -0700


I'm forwarding this to the linux kernel mailing list on behalf of a
user here:

(Note this message refers to linux kernel v1.3.94)

---- Forwarded message ----
Please go ahead and send the linux people the bug report I sent you, and
note that we made the change below and it did not help. Also note that
although the file I was dumping to was a local file, it was mounted via
NFS due to the way I cd'd to the directory. When I cd to the same
directory directly, the dump works.

Date: Thu, 02 May 1996 10:18:07 -0700
From: Ahmon Dancy <dancy>

I put this change in on fabi and I'm rebooting now.

>> In the file linux/mm/filemap.c, function filemap_sync(), there is a line
>> that says:
>>
>> dir = pgd_offset(current->mm, address);
>>
>> but it should be
>>
>> dir = pgd_offset(vma->vm_mm, address);
>>
>> (ie change "current->mm" into "vma->vm_mm"). Does that fix the problem
>> for you?
>>
>> Linus

Here's the original information from the user:

This transcript describes what I believe to be a bug in mmap(), in which pages
which were written to but not yet flushed to the file get lost when a write
to a new page causes a page fault.

The context is this: we are doing something similar to unexec() in
GNU Emacs, which we call dumplisp. The source file (the original executable)
and the destination file are both mapped into memory, and the creation and
mmap commands for the destination file are

    if ((dl_dst_fd = open(dst_file, O_RDWR|O_CREAT, 0777)) < 0) {
        dumplisp_return("can't create output file: %s", dst_file);
    }

and

    dl_dst_base = mmap(0, dst_file_size, PROT_READ|PROT_WRITE, MAP_SHARED,
                       dl_dst_fd, 0);

dl_dst_base happens to be 0x4036e000 in the transcript below.
In the example, we do three memcpy's. The first two write the
ehdr and the phdr, both of which lie on the same memory page. At various
times I print out the first part of the destination file (actually, its
memory representation), and it looks like a good ELF header until the first
write to a different page via the movsl instruction in memcpy. (Note that since
the addresses are aligned to a 4-byte boundary, ecx is zero when the movsb loop
is executed, so no data is written at that time.)
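
For readers following the transcript, the arithmetic behind that note can be
sketched in C. This is only an illustration: the helper names head_bytes,
word_count and page_index are made up here, the 4 KB page size is an
assumption about the i386 machine in question, and the functions merely mirror
the instructions visible in the memcpy disassembly below (they are not the
library's source).

```c
#include <stdint.h>

/* Bytes the leading byte-copy (movsb) loop must move so that dst becomes
   4-byte aligned; mirrors "negl %eax; andl $0x3,%eax" in the trace. */
static uint32_t head_bytes(uint32_t dst)
{
    return (0u - dst) & 3u;
}

/* Number of 32-bit words left for the movsl loop; mirrors
   "subl %eax,%edx" followed by "shrl $0x2,%eax".
   Only meaningful for len > 7, since shorter copies take another path. */
static uint32_t word_count(uint32_t dst, uint32_t len)
{
    return (len - head_bytes(dst)) >> 2;
}

/* Index of the page, counted from the start of the mapping, that an
   address falls in. Assumes 4 KB (1 << 12) pages. */
static uint32_t page_index(uint32_t base, uint32_t addr)
{
    return (addr - base) >> 12;
}
```

With the values from the transcript, head_bytes(0x403bf410) is 0 (so the movsb
loop copies nothing), word_count(0x403bf410, 0x370) is 0xdc (the value shown
in ecx just before the movsl), and page_index(0x4036e000, 0x403bf410) is 0x51,
whereas the first two memcpy's both land on page 0.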

My theory is that the page fault interrupt handler is not properly saving
the physical page(s) that have already been allocated and are currently
mapped in, and so either the mapping is lost or else a zero-mapping is
forced (which has the same effect of losing the original physical pages).
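
A reduced test case along these lines might help isolate the behaviour. The
following is a minimal sketch, not the dumplisp code: the function name
first_page_survives and its parameters are hypothetical, and the ftruncate()
call that extends the file before mapping is an assumption (the report does
not say how the destination file is sized).

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if data written to the first page of a MAP_SHARED mapping is
   still visible after a write touches a later page for the first time,
   0 if it was lost, and -1 if the file could not be set up. */
static int first_page_survives(const char *path, size_t size, size_t far_off)
{
    static const char magic[] = "\177ELF\001\001\001";
    char *base;
    int fd, ok;

    if (far_off >= size || size < sizeof magic)
        return -1;
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0666);
    if (fd < 0)
        return -1;
    /* Assumption: extend the file first so the whole mapping is backed. */
    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return -1;
    }
    base = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) {
        close(fd);
        return -1;
    }
    memcpy(base, magic, sizeof magic);  /* write to the first page */
    base[far_off] = 1;                  /* first touch of a later page */
    ok = (memcmp(base, magic, sizeof magic) == 0);
    munmap(base, size);
    close(fd);
    return ok;
}
```

Calling it with a path under the NFS mount, a size of 0x2ab660 and a far_off
of 0x51410 would match the sizes and offsets in the transcript; a result of 0
would correspond to the zeroed header seen below.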

I would appreciate your analysis and fix.

Duane Rettig Franz Inc. http://www.franz.com/ (www)
1995 University Ave, Ste 275 Berkeley, CA 94704 uunet!franz!duane (uucp)
Phone: (510) 548-3600; FAX: (510) 548-8253 duane@Franz.COM (internet)

% gdb cl
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.15.1 (i586-unknown-linux),
Copyright 1995 Free Software Foundation, Inc...(no debugging symbols found)...
(gdb) break elf_dumplisp
Breakpoint 1 at 0x800c4ea
(gdb) run
Starting program: /a/fabi/root/wow/linuxscm/4.3.linux/src/cl
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...Allegro CL 4.3 [Linux/X86; R1] (5/1/96 0:13)
Copyright (C) 1985-1996, Franz Inc., Berkeley, CA, USA. All Rights Reserved.
Loading home .clinit.cl
; Loading #p"/a/fabi/root/wow/linuxscm/4.3.linux/src/.clinit.cl"
; Loading src:;comp;debugstructs.cl
; (/a/fabi/root/wow/linuxscm/4.3.linux/src/comp/debugstructs.cl)
;; Optimization settings: safety 1, space 1, speed 1, debug 2.
;; For a complete description of all compiler switches given the current
;; optimization settings evaluate (explain-compiler-settings).
user(1): (dumplisp :name "dl1_mt.foo" :libfasl-warning nil)
gc: E=72% N=50168 O+=48232 pfu=708+704 pfg=22+257
gc: E=0% N=49448 O+=720 pfu=1+1 pfg=0+20

Breakpoint 1, 0x800c4ea in elf_dumplisp ()
(gdb) set debug_dumplisp=1
(gdb) break memcpy
Breakpoint 2 at 0x40049f78
(gdb) c
Continuing.
src mapped: 0x400d9000 to 0x4036d660
phdr: type=6 offset= 0x34 paddr=0x8000034 memsz= 0xa0
phdr: type=3 offset= 0xd4 paddr=0x80000d4 memsz= 0x13
phdr: type=1 offset= 0x0 paddr=0x8000000 memsz= 0x4357c
phdr: type=1 offset= 0x43580 paddr=0x8044580 memsz= 0x13b58
phdr: type=2 offset= 0x510f8 paddr=0x80520f8 memsz= 0x98
txtpi=0x2 ehdr_in_text=1 ehdr_offset= 0x34
area1: 0x806f000 to 0x81e9e10 (size: 0x17ae10, rounded: 0x17b000); pad 0x5e1f0
area1: 0x8248000 to 0x82ca000 (size: 0x82000, rounded: 0x82000); pad 0x0
area1: 0x82ca000 to 0x82d2e00 (size: 0x8e00, rounded: 0x9000); pad 0x6f200
area1: 0x8342000 to 0x8350000 (size: 0xe000, rounded: 0xe000); pad 0x86000
dst_file_size += 0x34 bytes (ehdr)
dst_file_size += 0x370 bytes (shdr)

Breakpoint 2, 0x40049f78 in memcpy ()
(gdb) c
Continuing.
dst_file_size += 0x4357c bytes (text segment)

Breakpoint 2, 0x40049f78 in memcpy ()
(gdb) c
Continuing.
dst_file_size += 0xdc10 bytes (data segment--filesz only)
dst_file_size += 0x214000 bytes (heap)
xtra_size is 0
add 0x1e6 to xtra_size
add 0x9a to xtra_size
add 0x5150 to xtra_size
hole: add 0x370 to xtra_size
add 0x435c to xtra_size

Breakpoint 2, 0x40049f78 in memcpy ()
(gdb) c
Continuing.
dst_file_size += 0x9a9c bytes (not-in-core stuff)

Breakpoint 2, 0x40049f78 in memcpy ()
(gdb) c
Continuing.
dst_file_size += 0x4004 bytes (struct dumplisp_info)

Breakpoint 2, 0x40049f78 in memcpy ()
(gdb) c
Continuing.
new dst_file_size is 2799200 (0x2ab660)
dst mapped: 0x4036e000 to 0x40619660
memcpy(0x4036e000, 0x400d9000, 0x34)
dst in file: 0x0 to 0x34
src in file: 0x0

Breakpoint 2, 0x40049f78 in memcpy ()
(gdb) c
Continuing.
memcpy(0x4036e034, 0x400d9034, 0xa0)
dst in file: 0x34 to 0xd4
src in file: 0x34

Breakpoint 2, 0x40049f78 in memcpy ()
(gdb) x/20x 0x4036e000
0x4036e000 <ypall_foreach+2708576>: 0x464c457f 0x00010101 0x00000000 0x00000000
0x4036e010 <ypall_foreach+2708592>: 0x00030002 0x00000001 0x0800bc80 0x00000034
0x4036e020 <ypall_foreach+2708608>: 0x00051410 0x00000000 0x00200034 0x00280005
0x4036e030 <ypall_foreach+2708624>: 0x00130016 0x00000000 0x00000000 0x00000000
0x4036e040 <ypall_foreach+2708640>: 0x00000000 0x00000000 0x00000000 0x00000000
(gdb) x/s 0x4036e000
0x4036e000 <ypall_foreach+2708576>: "\177ELF\001\001\001"
(gdb) c
Continuing.
memcpy(0x403bf410, 0x4012a410, 0x370)
dst in file: 0x51410 to 0x51780
src in file: 0x51410

Breakpoint 2, 0x40049f78 in memcpy ()
(gdb) x/20x 0x4036e000
0x4036e000 <ypall_foreach+2708576>: 0x464c457f 0x00010101 0x00000000 0x00000000
0x4036e010 <ypall_foreach+2708592>: 0x00030002 0x00000001 0x0800bc80 0x00000034
0x4036e020 <ypall_foreach+2708608>: 0x00051410 0x00000000 0x00200034 0x00280005
0x4036e030 <ypall_foreach+2708624>: 0x00130016 0x00000006 0x00000034 0x08000034
0x4036e040 <ypall_foreach+2708640>: 0x08000034 0x000000a0 0x000000a0 0x00000005
(gdb) display/i $pc
2: x/i $eip 0x40049f78 <memcpy>: pushl %ebp
(gdb) si
0x40049f79 in memcpy ()
2: x/i $eip 0x40049f79 <memcpy+1>: pushl %edi
(gdb)
0x40049f7a in memcpy ()
2: x/i $eip 0x40049f7a <memcpy+2>: pushl %esi
(gdb)
0x40049f7b in memcpy ()
2: x/i $eip 0x40049f7b <memcpy+3>: movl 0x10(%esp,1),%ebp
(gdb)
0x40049f7f in memcpy ()
2: x/i $eip 0x40049f7f <memcpy+7>: movl 0x18(%esp,1),%edx
(gdb)
0x40049f83 in memcpy ()
2: x/i $eip 0x40049f83 <memcpy+11>: movl %ebp,%edi
(gdb)
0x40049f85 in memcpy ()
2: x/i $eip 0x40049f85 <memcpy+13>: movl 0x14(%esp,1),%esi
(gdb)
0x40049f89 in memcpy ()
2: x/i $eip 0x40049f89 <memcpy+17>: cmpl $0x7,%edx
(gdb)
0x40049f8c in memcpy ()
2: x/i $eip 0x40049f8c <memcpy+20>: jbe 0x40049fa9 <memcpy+49>
(gdb)
0x40049f8e in memcpy ()
2: x/i $eip 0x40049f8e <memcpy+22>: movl %ebp,%eax
(gdb)
0x40049f90 in memcpy ()
2: x/i $eip 0x40049f90 <memcpy+24>: negl %eax
(gdb)
0x40049f92 in memcpy ()
2: x/i $eip 0x40049f92 <memcpy+26>: andl $0x3,%eax
(gdb)
0x40049f95 in memcpy ()
2: x/i $eip 0x40049f95 <memcpy+29>: subl %eax,%edx
(gdb)
0x40049f97 in memcpy ()
2: x/i $eip 0x40049f97 <memcpy+31>: movl %eax,%ecx
(gdb)
0x40049f99 in memcpy ()
2: x/i $eip 0x40049f99 <memcpy+33>: cld
(gdb)
0x40049f9a in memcpy ()
2: x/i $eip 0x40049f9a <memcpy+34>: repz movsb %ds:(%esi),%es:(%edi)
(gdb) p/x $ecx
$1 = 0x0
(gdb) si
0x40049f9c in memcpy ()
2: x/i $eip 0x40049f9c <memcpy+36>: movl %edx,%eax
(gdb)
0x40049f9e in memcpy ()
2: x/i $eip 0x40049f9e <memcpy+38>: shrl $0x2,%eax
(gdb)
0x40049fa1 in memcpy ()
2: x/i $eip 0x40049fa1 <memcpy+41>: movl %eax,%ecx
(gdb)
0x40049fa3 in memcpy ()
2: x/i $eip 0x40049fa3 <memcpy+43>: cld
(gdb)
0x40049fa4 in memcpy ()
2: x/i $eip 0x40049fa4 <memcpy+44>: repz movsl %ds:(%esi),%es:(%edi)
(gdb) x/20x 0x4036e000
0x4036e000 <ypall_foreach+2708576>: 0x464c457f 0x00010101 0x00000000 0x00000000
0x4036e010 <ypall_foreach+2708592>: 0x00030002 0x00000001 0x0800bc80 0x00000034
0x4036e020 <ypall_foreach+2708608>: 0x00051410 0x00000000 0x00200034 0x00280005
0x4036e030 <ypall_foreach+2708624>: 0x00130016 0x00000006 0x00000034 0x08000034
0x4036e040 <ypall_foreach+2708640>: 0x08000034 0x000000a0 0x000000a0 0x00000005
(gdb) info registers
eax 0xdc 220
ecx 0xdc 220
edx 0x370 880
ebx 0x403bf410 1077670928
esp 0xbfff80c4 0xbfff80c4
ebp 0x403bf410 0x403bf410
esi 0x4012a410 1074963472
edi 0x403bf410 1077670928
eip 0x40049fa4 0x40049fa4
ps 0x312 786
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x2b 43
gs 0x2b 43
(gdb) si
0x40049fa4 in memcpy ()
2: x/i $eip 0x40049fa4 <memcpy+44>: repz movsl %ds:(%esi),%es:(%edi)
(gdb) x/20x 0x4036e000
0x4036e000 <ypall_foreach+2708576>: 0x00000000 0x00000000 0x00000000 0x00000000
0x4036e010 <ypall_foreach+2708592>: 0x00000000 0x00000000 0x00000000 0x00000000
0x4036e020 <ypall_foreach+2708608>: 0x00000000 0x00000000 0x00000000 0x00000000
0x4036e030 <ypall_foreach+2708624>: 0x00000000 0x00000000 0x00000000 0x00000000
0x4036e040 <ypall_foreach+2708640>: 0x00000000 0x00000000 0x00000000 0x00000000
(gdb) info registers
eax 0xdc 220
ecx 0xdb 219
edx 0x370 880
ebx 0x403bf410 1077670928
esp 0xbfff80c4 0xbfff80c4
ebp 0x403bf410 0x403bf410
esi 0x4012a414 1074963476
edi 0x403bf414 1077670932
eip 0x40049fa4 0x40049fa4
ps 0x10312 66322
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x2b 43
gs 0x2b 43
(gdb)

------- End of Forwarded Message