Re: [PATCH 1/2] sysfs: fix race between readdir and lseek

From: Li Zefan
Date: Tue Mar 26 2013 - 03:31:11 EST


On 2013/3/22 17:31, Ming Lei wrote:
> On Fri, Mar 22, 2013 at 1:48 PM, Li Zefan <lizefan@xxxxxxxxxx> wrote:
>> On 2013/3/21 12:48, Ming Lei wrote:
>>
>> Yes, it can...As I said, it's irrelevant, because it's vfs that changes
>> file->f_pos.
>>
>> SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)
>> {
>> struct fd f = fdget(fd);
>> ssize_t ret = -EBADF;
>>
>> if (f.file) {
>> loff_t pos = file_pos_read(f.file); <--- read f_pos
>> ret = vfs_read(f.file, buf, count, &pos); <--- return -EISDIR
>> file_pos_write(f.file, pos); <--- write f_pos
>
> Considered that f_pos of sysfs directory is always less than INT_MAX,
> we need't worry about atomic writing it in file_pos_write().
>
> The only probable problem on sysfs is below scenario in read()/write():
>
> - pos is read as less than 2 in file_pos_read(f.file)
> - ret = vfs_read(f.file, buf, count, &pos)
> ---> readdir() in another path
> - file_pos_write(pos)
> ---> readdir() found f_pos becomes 0 or 1, and may cause
> use-after-free problem
>
> Considered that vfs_read()/vfs_write on sysfs dir is almost doing nothing, the
> above problem may only exist in theory.

The read() vs readdir() race in sysfs directory doesn't exist in theory only.

Mar 25 11:16:57 lxc34 kernel: [ 3581.923110] ------------[ cut here ]------------
Mar 25 11:16:57 lxc34 kernel: [ 3581.923124] WARNING: at fs/sysfs/sysfs.h:195 sysfs_readdir+0x277/0x290()
Mar 25 11:16:57 lxc34 kernel: [ 3581.923131] Hardware name: Tecal RH2285
Mar 25 11:16:57 lxc34 kernel: [ 3581.923136] Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_i
scsi bridge ipv6 stp llc cpufreq_conservative cpufreq_userspace cpufreq_powersave binfmt_misc fuse loop dm_mod c
oretemp acpi_cpufreq mperf crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf
128mul bnx2 iTCO_wdt iTCO_vendor_support sg i2c_i801 ehci_pci mptctl tpm_tis tpm tpm_bios serio_raw microcode lp
c_ich i2c_core hid_generic mfd_core button usbhid hid uhci_hcd ehci_hcd usbcore usb_common sd_mod crc_t10dif edd
ext3 mbcache jbd fan processor ide_pci_generic ide_core ata_generic ata_piix libata mptsas mptscsih mptbase scs
i_transport_sas scsi_mod thermal thermal_sys hwmon
Mar 25 11:16:57 lxc34 kernel: [ 3581.923238] Pid: 13289, comm: a.out Not tainted 3.9.0-rc1-0.7-default+ #38
Mar 25 11:16:57 lxc34 kernel: [ 3581.923245] Call Trace:
Mar 25 11:16:57 lxc34 kernel: [ 3581.923251] [<ffffffff8120b137>] ? sysfs_readdir+0x277/0x290
Mar 25 11:16:57 lxc34 kernel: [ 3581.923258] [<ffffffff8120b137>] ? sysfs_readdir+0x277/0x290
Mar 25 11:16:57 lxc34 kernel: [ 3581.923273] [<ffffffff81042d3f>] warn_slowpath_common+0x7f/0xc0
Mar 25 11:16:57 lxc34 kernel: [ 3581.923281] [<ffffffff81042d9a>] warn_slowpath_null+0x1a/0x20
Mar 25 11:16:57 lxc34 kernel: [ 3581.923288] [<ffffffff8120b137>] sysfs_readdir+0x277/0x290
Mar 25 11:16:57 lxc34 kernel: [ 3581.923296] [<ffffffff811a11a0>] ? sys_ioctl+0x90/0x90
Mar 25 11:16:57 lxc34 kernel: [ 3581.923303] [<ffffffff811a11a0>] ? sys_ioctl+0x90/0x90
Mar 25 11:16:57 lxc34 kernel: [ 3581.923310] [<ffffffff811a1589>] vfs_readdir+0xa9/0xc0
Mar 25 11:16:57 lxc34 kernel: [ 3581.923317] [<ffffffff811a163b>] sys_getdents64+0x9b/0x110
Mar 25 11:16:57 lxc34 kernel: [ 3581.923327] [<ffffffff814bb5d9>] system_call_fastpath+0x16/0x1b
Mar 25 11:16:57 lxc34 kernel: [ 3581.923333] ---[ end trace a995ae360b2301bd ]---
Mar 25 11:16:57 lxc34 kernel: [ 3581.923339] ida_remove called for id=19055 which is not allocated.
...

And finally my kernel crashed.

I'm not asking you to fix it, but just let you know about this bug. Al proposed a fix but I don't know
if he'll work on it.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/