2.1.131 IDE CD-ROM "buffer botch" errors and other problems

Jamie Lokier (lkd@tantalophile.demon.co.uk)
Wed, 16 Dec 1998 17:40:57 +0000


Summary
=======

1. "buffer botch" errors, causing I/O error messages.

2. Incorrect data may be stored in the buffer cache.
(Possibly even sensitive data the application should not see, if
the block is simply an uninitialised page from the kernel.)

3. Seen only with 2.1.131, where it is quite repeatable.

4. Seen only after memory has filled up, with many dirty buffers.

Details
=======

I'm seeing a lot of "buffer botch" errors when reading large files from
a CD, with kernel 2.1.131. Notably, when installing RPMs from a Red Hat
CD.

But that's just the first problem.

Immediately after each such message, there's an I/O error for the same
block. Despite the error, some faulty data is put in the cache, because
subsequent attempts to read the same file return incorrect data
(according to RPM) at the same point, _without_ the CD-ROM drive
spinning up.

Here's a typical log:

Dec 16 02:11:52 tantalophile kernel: hdc: cdrom_read_from_buffer: buffer botch (5530)
Dec 16 02:11:52 tantalophile kernel: end_request: I/O error, dev 16:00 (hdc), sector 5530
Dec 16 02:16:33 tantalophile kernel: hdc: cdrom_read_from_buffer: buffer botch (4738)
Dec 16 02:16:33 tantalophile kernel: end_request: I/O error, dev 16:00 (hdc), sector 4738
Dec 16 02:20:58 tantalophile kernel: hdc: cdrom_read_from_buffer: buffer botch (626498)
Dec 16 02:20:58 tantalophile kernel: end_request: I/O error, dev 16:00 (hdc), sector 626498
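
For anyone who wants to check their own logs for the same pattern, here is
a minimal sketch in Python. It only confirms that each "buffer botch" line
is followed directly by an I/O error for the same drive and sector. The
regular expressions are based solely on the six lines shown above, and the
names (pair_botches, BOTCH_RE, IOERR_RE, SAMPLE) are made up for this
illustration; other kernels may format these messages differently.

```python
import re

# Patterns derived only from the log excerpt above (an assumption:
# other kernel versions may word these messages differently).
BOTCH_RE = re.compile(r"(\w+): cdrom_read_from_buffer: buffer botch \((\d+)\)")
IOERR_RE = re.compile(
    r"end_request: I/O error, dev ([0-9a-fA-F]+):([0-9a-fA-F]+) \((\w+)\), sector (\d+)"
)

def pair_botches(lines):
    """Return (drive, sector, followed_by_matching_io_error) per botch line."""
    lines = list(lines)
    results = []
    for i, line in enumerate(lines):
        botch = BOTCH_RE.search(line)
        if not botch:
            continue
        drive, sector = botch.group(1), int(botch.group(2))
        matched = False
        if i + 1 < len(lines):
            err = IOERR_RE.search(lines[i + 1])
            # The next line must be an I/O error for the same drive and sector.
            matched = bool(err) and err.group(3) == drive and int(err.group(4)) == sector
        results.append((drive, sector, matched))
    return results

SAMPLE = """
Dec 16 02:11:52 tantalophile kernel: hdc: cdrom_read_from_buffer: buffer botch (5530)
Dec 16 02:11:52 tantalophile kernel: end_request: I/O error, dev 16:00 (hdc), sector 5530
Dec 16 02:16:33 tantalophile kernel: hdc: cdrom_read_from_buffer: buffer botch (4738)
Dec 16 02:16:33 tantalophile kernel: end_request: I/O error, dev 16:00 (hdc), sector 4738
Dec 16 02:20:58 tantalophile kernel: hdc: cdrom_read_from_buffer: buffer botch (626498)
Dec 16 02:20:58 tantalophile kernel: end_request: I/O error, dev 16:00 (hdc), sector 626498
"""

for drive, sector, matched in pair_botches(SAMPLE.splitlines()):
    print(drive, sector, "paired with I/O error" if matched else "no matching I/O error")
```

To scan a real log, pass the lines of the log file to pair_botches()
instead of SAMPLE.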

Unmounting and remounting often permits the file to be read again
(presumably because it flushes the cached data).
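
One rough way to test whether the cache is holding bad data is to checksum
the same file before and after a remount and compare the results. The
following is a minimal sketch in Python, not something I have run against
the failing setup. checksum_file is a hypothetical helper name, and the
choice of SHA-256 and a 64 KiB chunk size is arbitrary.

```python
import hashlib

def checksum_file(path, chunk_size=64 * 1024):
    """Read a file in chunks and return (hex_digest, error_offset).

    error_offset is None if the whole file was read; otherwise it is the
    byte offset at which a read first raised an I/O error, and the digest
    covers only the data read before that point.
    """
    digest = hashlib.sha256()
    offset = 0
    # buffering=0 gives unbuffered reads, so `offset` tracks what has
    # actually been returned by the kernel so far.
    with open(path, "rb", buffering=0) as f:
        while True:
            try:
                chunk = f.read(chunk_size)
            except OSError:
                return digest.hexdigest(), offset
            if not chunk:
                return digest.hexdigest(), None
            digest.update(chunk)
            offset += len(chunk)
```

Calling checksum_file() on an RPM on the mounted CD, unmounting and
remounting, then calling it again should give the same digest both times
if the data is good. A different digest, or an error offset on only one of
the two runs, would point at stale data in the cache rather than at the
disc itself.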

The hardware and CD are fine, because the bug doesn't occur with kernel
2.1.119. It's quite repeatable with 2.1.131.

These errors don't happen immediately after booting. It seems that main
memory must fill up with page/buffer cache etc. first. Then the
problems occur while installing large RPMs. At the same time, there are
a lot of dirty buffers in memory, because `sync' takes a while to flush
them after one of the "buffer botch" errors.

So I'm guessing that to recreate this bug it's necessary to have a lot
of dirty buffers and full memory, so that memory allocation is a bit
tight.

The bug may be present in 2.1.119 but not showing up due to the
different VM swapping strategies. It may also be hidden in more recent
kernels (e.g. 2.1.131ac10) due to newer, better VM strategies, but I
haven't tried those kernels.

-- Jamie
