Linux kernel bug

From: Jari Ruusu (jari.ruusu@pp.inet.fi)
Date: Sun Apr 09 2000 - 14:17:02 EST


Hi,

Linux kernels have a bug involving synchronous writes that cross the end
of a device (the ENOSPC condition). Things go wrong if all of the
following conditions are met:

1) An application does synchronous writes directly to a removable device,
      e.g. /dev/fd0
2) An ENOSPC condition is reached in the middle of a write. No problems
      occur if the write ends at the end of the medium, or if the write
      begins at the end of the medium.
3) The removable medium is changed
4) The application reads data at the end of the device (e.g. the last kilobyte)
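
To make condition 2 concrete, here is a minimal sketch that classifies a
write by where it falls relative to the end of the device. It is only an
illustration of the three cases named above, not kernel code; the names
write_case, classify_write and expected_written are hypothetical.

```c
/* Hypothetical names, for illustration only; nothing here is kernel code. */
enum write_case {
    WRITE_FITS,          /* ends before or exactly at end of medium: no problem */
    WRITE_CROSSES_END,   /* starts before the end and runs past it: the bad case */
    WRITE_STARTS_AT_END  /* starts at (or past) end of medium: no problem */
};

/* Classify a write of len bytes at byte offset off on a device of dev_size bytes. */
enum write_case classify_write(unsigned long off, unsigned long len,
                               unsigned long dev_size)
{
    if (off >= dev_size)
        return WRITE_STARTS_AT_END;
    if (len > dev_size - off)
        return WRITE_CROSSES_END;
    return WRITE_FITS;
}

/* Number of bytes a write is expected to transfer before the device ends. */
unsigned long expected_written(unsigned long off, unsigned long len,
                               unsigned long dev_size)
{
    if (off >= dev_size)
        return 0;
    return len > dev_size - off ? dev_size - off : len;
}
```

For a 1440 KB floppy, a 2048-byte write at offset 1439 KB falls into
WRITE_CROSSES_END, and only 1024 bytes are expected to be written.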

This problem has nothing to do with media change detection and, to my
knowledge, is present in all Linux kernels. I have been fixing this
since kernel 1.2.13. I have also used kernels 2.0.36 and 2.2.14, and
both have the same bug. It is still present in the latest kernel,
2.3.99-pre3, which was available at ftp.funet.fi today.

In the latest version (2.3.99-pre3) the problem is in the file
/usr/src/linux/fs/block_dev.c. Just look at lines 62, 132, 149 and 155
and you will see the problem.

I WON'T TRY TO SEND THIS MESSAGE OR PATCHES DIRECTLY TO LINUS TORVALDS
ANY MORE. HE DOESN'T READ HIS EMAIL!

Below is the source code for a small test program (testbug.c) that
demonstrates the problem. You will need two formatted 1.44 MB floppies
to run the program. The program does the following:

1) Prompt to insert floppy #1
2) Synchronously write 0x11 to the last kilobyte of floppy #1
3) Prompt to insert floppy #2
4) Synchronously write 0xEE to the last kilobyte of floppy #2
5) Prompt to insert floppy #1
6) Read the last kilobyte of floppy #1
7) Compare the results; buggy kernels return 0xEE
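
The comparison in step 7 can be sketched as follows. This is a minimal
illustration only: the names readback, all_bytes_equal and
classify_readback are hypothetical and do not appear in testbug.c, which
simply compares the data against 0x11 with memcmp(). Checking for 0xEE
separately makes it visible that the wrong data is what was written to
floppy #2.

```c
#include <stddef.h>

/* Hypothetical names for illustration; testbug.c itself just uses memcmp(). */
enum readback {
    READ_OK,    /* every byte is 0x11: data really came from floppy #1 */
    READ_STALE, /* every byte is 0xEE: data left over from the floppy #2 write */
    READ_OTHER  /* anything else */
};

/* Return 1 if all len bytes of p equal val, 0 otherwise. */
static int all_bytes_equal(const unsigned char *p, size_t len, unsigned char val)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (p[i] != val)
            return 0;
    return 1;
}

/* Classify the last kilobyte read back from floppy #1. */
enum readback classify_readback(const unsigned char *p, size_t len)
{
    if (all_bytes_equal(p, len, 0x11))
        return READ_OK;
    if (all_bytes_equal(p, len, 0xEE))
        return READ_STALE;
    return READ_OTHER;
}
```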

-----------------------------------------------------
Running ./testbug on a buggy kernel will output this:
-----------------------------------------------------

Insert a formatted floppy #1 to /dev/fd0
All data on the floppy will be lost!
Press ENTER to continue, CTRL-d to abort:
Writing to floppy #1 ... done

Insert a formatted floppy #2 to /dev/fd0
All data on the floppy will be lost!
Press ENTER to continue, CTRL-d to abort:
Writing to floppy #2 ... done

Insert floppy #1 to /dev/fd0
Press ENTER to continue, CTRL-d to abort:
Reading from floppy #1 ... done

Your kernel has O_SYNC/ENOSPC bug!

-------------------------------------------------------
Running ./testbug on a patched kernel will output this:
-------------------------------------------------------

Insert a formatted floppy #1 to /dev/fd0
All data on the floppy will be lost!
Press ENTER to continue, CTRL-d to abort:
Writing to floppy #1 ... done

Insert a formatted floppy #2 to /dev/fd0
All data on the floppy will be lost!
Press ENTER to continue, CTRL-d to abort:
Writing to floppy #2 ... done

Insert floppy #1 to /dev/fd0
Press ENTER to continue, CTRL-d to abort:
Reading from floppy #1 ... done

Your kernel works fine.

-------------------- CUT HERE --------------------

/* testbug.c */

/*
 * This program tests whether your kernel has the O_SYNC/ENOSPC bug.
 * You will need two formatted 1440 KB floppies. Some data on the floppies
 * will be overwritten, so run mkfs on them afterwards.
 */

#include <stdio.h>
#include <stdlib.h>     /* exit() */
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define FDDEV "/dev/fd0" /* floppy device */
#define FDSIZE 1440 /* floppy size in KB */

char buf[2048];

/* mode != 0: write 'code' bytes across the end of the floppy; mode == 0: read last KB */
void fdwork(int mode, int code)
{
    int fd;
 
    fflush(stdout);
    if((fd = open(FDDEV, mode ? O_WRONLY | O_SYNC : O_RDONLY, 0700)) == -1) {
        printf("open() failed. Aborted.\n");
        exit(1);
    }
    /* Seek to 1 KB less than floppy size */
    if(lseek(fd, (FDSIZE - 1) * 1024, SEEK_SET) == (off_t)-1) {
        printf("lseek() failed. Aborted.\n");
        exit(1);
    }
    memset(buf, code, 2048);
    if(mode) {
        /* Try to write 2 KB. Only 1 KB gets written (ENOSPC) */
        if(write(fd, buf, 2048) != 1024) {
            printf("write() failed. Aborted.\n");
            exit(1);
        }
    } else {
        /* Read last KB. Data should come from floppy */
        /* Buggy kernels return data from screwed buffers */
        if(read(fd, buf, 1024) != 1024) {
            printf("read() failed. Aborted.\n");
            exit(1);
        }
    }
    if(close(fd) == -1) {
        printf("close() failed. Aborted.\n");
        exit(1);
    }
}
     
void waitkey(void)
{
    char tmpbuf[80];
    
    printf("Press ENTER to continue, CTRL-d to abort: ");
    fflush(stdout);
    if(read(0, tmpbuf, sizeof(tmpbuf)) < 1) {
        printf("aborted\n");
        exit(2);
    }
}
    
int main(void)
{
    char comp[1024];
    
    printf("\nInsert a formatted floppy #1 to %s\n", FDDEV);
    printf("All data on the floppy will be lost!\n");
    waitkey();
    printf("Writing to floppy #1 ... ");
    fdwork(1, 0x11);
    
    printf("done\n\nInsert a formatted floppy #2 to %s\n", FDDEV);
    printf("All data on the floppy will be lost!\n");
    waitkey();
    printf("Writing to floppy #2 ... ");
    fdwork(1, 0xEE);
    
    printf("done\n\nInsert floppy #1 to %s\n", FDDEV);
    waitkey();
    printf("Reading from floppy #1 ... ");
    fdwork(0, 0x55);
    printf("done\n\n");
    
    memset(comp, 0x11, 1024);
    if(memcmp(buf, comp, 1024)) {
        printf("Your kernel has O_SYNC/ENOSPC bug!\n\n");
        exit(3);
    }
    printf("Your kernel works fine.\n\n");
    exit(0);
}
