Solved -> [problem in 2.2.* vs (working in 2.0.36)]

Christian Robert (Christian.Robert@polymtl.ca)
Wed, 28 Apr 1999 18:29:44 -0400


Upon arriving at work today, I found this message on the console,
which explains what happened:

VFS: No free dquots, contact mvw@planets.elm.net

That's why the process was frozen. I ran out of dquot structures.

I added

echo 4096 > /proc/sys/fs/dquot-max

to rc.local, rebooted, and now everything seems to work OK.
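
For anyone who wants a slightly more defensive rc.local entry, here is a
minimal sketch. It only raises the limit when the current value is lower,
and skips quietly when the file is not there. The function name
raise_dquot_max and its optional second argument (an alternate path, handy
for testing) are made up for illustration; only the /proc path and the
value 4096 come from the fix above.

```shell
#!/bin/sh
# Sketch for rc.local: raise the dquot limit only when it is lower than wanted.
# raise_dquot_max and its arguments are hypothetical names, not a standard tool.
raise_dquot_max() {
    want=$1
    file=${2:-/proc/sys/fs/dquot-max}

    # File missing or not writable: nothing we can tune, so skip quietly.
    if [ ! -w "$file" ]; then
        echo "raise_dquot_max: $file not writable, skipping" >&2
        return 0
    fi

    cur=$(cat "$file")
    # Never lower a limit that is already higher than requested.
    if [ "$cur" -lt "$want" ]; then
        echo "$want" > "$file"
    fi
}

raise_dquot_max 4096
```

Whether 4096 is enough depends on the workload; it is simply the value
that worked here.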

Sorry for the bad report. Next time I will run more
tests before reporting a problem.

Christian Robert,
Poly.

Christian Robert wrote:
>
> hi,
>
> I will keep it short. I have a program that works every time on
> 2.0.36. The one special setting I think matters is that in my
> rc.S I invoke "update -f 30" rather than the default (I think 5 sec).
>
> My program creates 8 thousand directories and 8 thousand files
> in a few seconds (maybe 120).
>
> On the 2.0.36 kernel, it creates about 1300, then goes to sleep while
> "/sbin/update" flushes the write-thru cache to disk, then resumes
> working as it is supposed to, and completes the job a step at a time.
>
> On 2.2.* (*=any), when it first goes to sleep (ps shows STAT 'D'), update
> does its work, but after that the CPU goes to 0% and my program never
> resumes.
>
> In that state (ps shows STAT 'D') it cannot be killed, neither with
> kill -TERM nor with kill -9. Only a reboot gets rid of the process.
>
> I removed the "-f 30" from /sbin/update, and now the program works OK,
> but obviously there may be a bug somewhere.
>
> The important point seems to be: when "update" is invoked by its
> own timeout, everything is fine. When "update" is invoked
> because the write cache is starved, it forgets to resume
> the processes that were stopped by the starvation and are waiting on it.
>
> Christian Robert,
> Polytechnique of Montreal.
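
The original program is not shown, so the following is only a rough sketch
of a comparable workload for anyone who wants to try reproducing this. The
helper names make_tree and list_d_state, and the layout (one small file
inside each directory), are assumptions. The quoted report only says that
about 8 thousand directories and 8 thousand files get created.

```shell
#!/bin/sh
# make_tree is a hypothetical helper: create COUNT directories under BASE,
# each holding one small file (this layout is an assumption).
make_tree() {
    base=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        mkdir -p "$base/dir$i" || return 1
        echo "data $i" > "$base/dir$i/file" || return 1
        i=$((i + 1))
    done
}

# list_d_state is also hypothetical: print processes whose STAT starts
# with 'D' (uninterruptible sleep), as mentioned in the report.
list_d_state() {
    ps -eo pid,stat,comm | awk 'NR > 1 && $2 ~ /^D/'
}

# Example use (the target path is a placeholder):
#   make_tree /path/on/affected/fs 8000
#   list_d_state
```

Run make_tree on the filesystem where the hang was seen, and call
list_d_state from another terminal to see whether the creating process is
stuck in state 'D'.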
