Re: mmotm 2009-08-06-00-30 uploaded

From: Dave Young
Date: Thu Aug 13 2009 - 22:55:59 EST


On Thu, Aug 13, 2009 at 11:02 PM, Emmanuel Benisty<benisty.e@xxxxxxxxx> wrote:
> On Wed, Aug 12, 2009 at 6:48 AM, Dave Young<hidave.darkstar@xxxxxxxxx> wrote:
>> On Tue, Aug 11, 2009 at 10:08:24PM +0800, Dave Young wrote:
>>> On Sat, Aug 08, 2009 at 06:13:53PM +0800, Dave Young wrote:
>>> > On Sat, Aug 8, 2009 at 12:49 AM, Andrew Morton<akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>>> > > On Fri, 7 Aug 2009 21:47:00 +0800 Dave Young <hidave.darkstar@xxxxxxxxx> wrote:
>>> > >
>>> > >> Hi, andrew
>>> > >>
>>> > >> Booting with this release, init (maybe getty?) reports something like:
>>> > >>
>>> > >> INIT: open /dev/console failed with input/output error
>>> > >>
>>> > >> 2.6.31-rc5 is fine.
>>> > >>
>>> > >> Any hints to find the root problem?
>>> > >
>>> > > Not really, sorry.  Might be tty changes in linux-next?
>>> >
>>> > I bisected linux-next and found the following patch as a result:
>>> >
>>> > commit 65b8c7d9be5862ff8ac839607b444b6f6b11d2fb
>>> > Author: Alan Cox <alan@xxxxxxxxxxxxxxx>
>>> > Date:   Thu Aug 6 09:58:02 2009 +1000
>>> >
>>> >     cyclades: use the full port_close function
>>> >
>>> > But I did not select cyclades in my .config, nor do I have the hardware. Weird.
>>>
>>> The above result is wrong; it was a mistake.
>>>
>>> After a whole day's testing and debugging with the linux-2.6 git tree and the tty patch series, I found the patch that causes this issue:
>>> --
>>> From: Alan Cox <alan@xxxxxxxxxxxxxxx>
>>> Subject: tty: make the kref destructor occur asynchronously
>>> --
>>>
>>> If the tty release runs from a work queue, tty_reopen may fail with -EIO. I read the sysvinit source code: it retries 5 times, and if the open still fails it prints a warning, after which there is no output before login.
>>>
>>> My distribution is Slackware 12.2.
>>>
>>> I tested with the following debug patch.
>>>
>>> --- linux-2.6.orig/drivers/char/tty_io.c	2009-08-11 21:29:03.000000000 +0800
>>> +++ linux-2.6/drivers/char/tty_io.c	2009-08-11 21:40:01.000000000 +0800
>>> @@ -1246,8 +1246,10 @@ static int tty_reopen(struct tty_struct
>>>  {
>>>  	struct tty_driver *driver = tty->driver;
>>>
>>> -	if (test_bit(TTY_CLOSING, &tty->flags))
>>> +	if (test_bit(TTY_CLOSING, &tty->flags)) {
>>> +		printk(KERN_INFO "tty_io.c: closing\n");
>>>  		return -EIO;
>>> +	}
>>>
>>>  	if (driver->type == TTY_DRIVER_TYPE_PTY &&
>>>  	    driver->subtype == PTY_TYPE_MASTER) {
>>> @@ -1255,8 +1257,10 @@ static int tty_reopen(struct tty_struct
>>>  		 * special case for PTY masters: only one open permitted,
>>>  		 * and the slave side open count is incremented as well.
>>>  		 */
>>> -		if (tty->count)
>>> +		if (tty->count) {
>>> +			printk(KERN_INFO "tty_io.c: open count %d\n", tty->count);
>>>  			return -EIO;
>>> +		}
>>>
>>>  		tty->link->count++;
>>>  	}
>>> @@ -1705,6 +1709,7 @@ static int __tty_open(struct inode *inod
>>>  	int index;
>>>  	dev_t device = inode->i_rdev;
>>>  	unsigned saved_flags = filp->f_flags;
>>> +	static int t;
>>>
>>>  	nonseekable_open(inode, filp);
>>>
>>> @@ -1778,8 +1783,15 @@ got_driver:
>>>
>>>  	mutex_unlock(&tty_mutex);
>>>  	tty_driver_kref_put(driver);
>>> -	if (IS_ERR(tty))
>>> +	if (IS_ERR(tty)) {
>>> +		int r = PTR_ERR(tty);
>>> +		if (t == 5) {
>>> +			printk(KERN_INFO "%s: %d, %d, retval: %d\n", __FILE__, __LINE__, r, retval);
>>> +			t =0;
>>> +		} else
>>> +			t++;
>>>  		return PTR_ERR(tty);
>>> +	}
>>>
>>>  	filp->private_data = tty;
>>>  	file_move(filp, &tty->tty_files);
>>>
>>>
>>
>> Here is a fix for that issue; please help review it.
>> --
>>
>> Because the tty release routines now run in a workqueue,
>> an error like the following is reported while booting:
>>
>> INIT open /dev/console input/output error
>>
>> The problem is caused by opening a tty before its close has finished.
>>
>> Fix it by flushing hangup_work in that case and then calling tty_init_dev.
>>
>>
>> Signed-off-by: Dave Young <hidave.darkstar@xxxxxxxxx>
>> --
>> drivers/char/tty_io.c |   11 ++++++++---
>> 1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> --- linux-2.6.orig/drivers/char/tty_io.c	2009-08-12 07:12:31.000000000 +0800
>> +++ linux-2.6/drivers/char/tty_io.c	2009-08-12 07:31:30.000000000 +0800
>> @@ -1770,9 +1770,14 @@ got_driver:
>>  	}
>>
>>  	if (tty) {
>> -		retval = tty_reopen(tty);
>> -		if (retval)
>> -			tty = ERR_PTR(retval);
>> +		if (test_bit(TTY_CLOSING, &tty->flags)) {
>> +			flush_work(&tty->hangup_work);
>> +			tty = tty_init_dev(driver, index, 0);
>> +		} else {
>> +			retval = tty_reopen(tty);
>> +			if (retval)
>> +				tty = ERR_PTR(retval);
>> +		}
>>  	} else
>>  		tty = tty_init_dev(driver, index, 0);
>>
>
> Thanks Dave, I had the very same issue and your patch fixed it.
>

Fine, so I'm not the only person who has this problem. Thank you for testing.

--
Regards
dave
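
As a rough illustration of the userspace side described in this thread (init
retrying the console open a few times before it warns), here is a minimal
sketch. It is not the sysvinit code: the helper name open_with_retry, its
parameters and the delay between attempts are assumptions made up for
illustration. Only the idea of retrying about five times comes from the
thread.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from sysvinit: try to open `path`
 * up to `attempts` times, sleeping `delay_sec` seconds between tries.
 * Returns the file descriptor on success, or -1 with errno set to the
 * error of the last failed open().
 */
static int open_with_retry(const char *path, int flags,
			   int attempts, unsigned int delay_sec)
{
	int err = EINVAL;	/* reported if attempts <= 0 */
	int fd;
	int i;

	for (i = 0; i < attempts; i++) {
		fd = open(path, flags);
		if (fd >= 0)
			return fd;
		/* A reopen that races with an unfinished close would see EIO here. */
		err = errno;
		if (i + 1 < attempts)
			sleep(delay_sec);
	}

	fprintf(stderr, "open %s failed: %s\n", path, strerror(err));
	errno = err;	/* fprintf() may have clobbered errno */
	return -1;
}
```

A caller would use something like open_with_retry("/dev/console", O_RDWR, 5, 1).
Retrying only helps if the pending release finishes within the retry window,
which is why the fix above waits for it in the kernel instead.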