Re: Got it! SG select() bug evidence for 2.2.2

Simon Ekstrand (simius@algonet.se)
Fri, 19 Mar 1999 19:47:41 +0100 (CET)


On Fri, 19 Mar 1999, Monty wrote:

>
> I'll say this first because I know most won't read much of this...:
>
> If you've seen the following output from cdparanoia, MAIL ME!!:
>
> "SCSI transport: writeable when reading packet"

Yes, I've seen the messages since somewhere around 2.2.1

> The above message indicates you've probably tripped a kernel bug that
> shows up only very rarely, and I need to know more about that bug!

I have no problems reproducing it, I get it every time i run
cdparanoia.

The following is output from running cdparanoia -s -v -B "1-"

Checking /dev/cdrom for cdrom...
Testing /dev/cdrom for cooked ioctl() interface
/dev/scd1 is not a cooked ioctl CDROM.
Testing /dev/cdrom for SCSI interface
generic device: /dev/sgb
ioctl device: /dev/scd1

SCSI transport: writeable when reading packet

Drive is neither a CDROM nor a WORM device

<a bunch of stuff snipped>

Checking /dev/sga for cdrom...
Testing /dev/sga for cooked ioctl() interface
/dev/sga is not a cooked ioctl CDROM.
Testing /dev/sga for SCSI interface
No cdrom device found to match generic device /dev/sga
generic device: /dev/sga
ioctl device: not found

SCSI transport: writeable when reading packet

Drive is neither a CDROM nor a WORM device

Checking /dev/sg1 for cdrom...
Could not stat /dev/sg1: No such file or directory

Checking /dev/sgb for cdrom...
Testing /dev/sgb for cooked ioctl() interface
/dev/sgb is not a cooked ioctl CDROM.
Testing /dev/sgb for SCSI interface

And there it simply hangs, needing a kill -9 to terminate the
process. The hang is caused by cdparanoia trying to open /dev/sgb

> I gave the users in question a patch to paranoia to ignore the fd
> indicating writable and wait for readable status from select(); SG
> commands suddenly work with no errors. I have not yet given code that
> logs for certain whether or not the fd first select()s writable and
> then upon a subsequent select() indicates readable; all I know is that
> on these machines, giving select() a readable fd_set only works, but
> when passing the same fd in the both a read and write fd_set to
> select(), writeable status is returned (which, as I'm beating into the
> ground, seems impossible from reading the kernel source).

That patch would be really nice :).

> The two machines for which I have documented evidence of the bug are
> running 2.2.2 and 2.2.3. The description of the problem matches
> behavior that was first reported to me by users in late 2.1 (before
> the release of a paranoia version with this check). Note that I do
> not have documented evidence that this is the same bug as appeared to
> happen earlier in 2.1; it could be a coincidence. However, the late
> 2.1 SG driver did see the select() infrastructure replaced with
> poll()...

I'm currently running kernel 2.2.3-ac1 and
cdparanoia III alpha prerelease 9.4 (December 16, 1998).
This is probably a rather old version of cdparanoia, but its worked
fine for me up until a this problem occured. So i haven't felt
any need to upgrade.

> I'm bringing this up despite the newer drivers because from looking at
> the 2.2 SG, I don't see how it could be making this mistake--- and
> thus I'm worried it could be select(). I have a patch out to the user
> that will test exactly this theory and I'll report back as soon as I
> know. I'm also sending him Doug's and Joerg's newest SGs to try.

If you need any other help debugging i'm willing to help.

My SCSI card is an Adaptec AHA-2940 UW.

output from /proc/scsi/scsi

Attached devices:
Host: scsi0 Channel: 00 Id: 02 Lun: 00
Vendor: MATSHITA Model: CD-R CW-7502 Rev: 3.02
Type: CD-ROM ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 04 Lun: 00
Vendor: NEC Model: CD-ROM DRIVE:465 Rev: 1.03
Type: CD-ROM ANSI SCSI revision: 02

Output from /proc/scsi/aic7xxx/0

Adaptec AIC7xxx driver version: 5.1.12/3.2.4
Compile Options:
TCQ Enabled By Default : Enabled
AIC7XXX_PROC_STATS : Disabled
AIC7XXX_RESET_DELAY : 5

Adapter Configuration:
SCSI Adapter: Adaptec AHA-294X Ultra SCSI host adapter
Ultra Wide Controller
PCI MMAPed I/O Base: 0xe8000000
Adapter SEEPROM Config: SEEPROM found and used.
Adaptec SCSI BIOS: Enabled
IRQ: 11
SCBs: Active 0, Max Active 2,
Allocated 15, HW 16, Page 255
Interrupts: 71730
BIOS Control Word: 0x18b6
Adapter Control Word: 0x005e
Extended Translation: Enabled
Disconnect Enable Flags: 0xffff
Ultra Enable Flags: 0x0010
Tag Queue Enable Flags: 0x0000
Ordered Queue Tag Flags: 0x0000
Default Tag Queue Depth: 8
Tagged Queue By Device array for aic7xxx host instance 0:
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
Actual queue depth per device for aic7xxx host instance 0:
{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}

Statistics:

(scsi0:0:2:0)
Device using Narrow/Sync transfers at 10.0 MByte/sec, offset 8
Transinfo settings: current(25/8/0), goal(12/15/0), user(12/15/0)
Total transfers 0 (0 reads and 0 writes)

(scsi0:0:4:0)
Device using Narrow/Sync transfers at 20.0 MByte/sec, offset 15
Transinfo settings: current(12/15/0), goal(12/15/0), user(12/15/0)
Total transfers 1 (1 reads and 0 writes)

Regards,

-Simon Ekstrand (simius@algonet.se)

-----

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/