Re: Corruption in 2.1.106

Richard B. Johnson (root@chaos.analogic.com)
Thu, 18 Jun 1998 14:11:20 -0400 (EDT)


On Thu, 18 Jun 1998, menion™ wrote:

> >
> > Are other people seeing 2.1.106 corrupting disks and memory on x86 ?
> >
> > Alan
>
> This is an example after 2 days of running: I frequently do this, and
> this is a good one. After 6 - 8 days I typically have 10 - 20 inodes
> with zero dtime. My root FS is the worst affected, but then it gets
> the most reading/writing attention. There is another error that is
> quite frequent, but I can't recall it. (I just got this after reading
> your message and trying it out on my root filesystem.)
>
>
> ..[snip]..
> Parallelizing fsck version 1.10 (24-Apr-97)
> e2fsck 1.10, 24-Apr-97 for EXT2 FS 0.5b, 95/08/09
> Pass 1: Checking inodes, blocks, and sizes
> Deleted inode 14510 has zero dtime. Fix<y>? yes
>
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> Fix summary information<y>? yes
>
> Block bitmap differences: -60618 -60619 -60620 -60621 -60622 -60623
> -60624 -60625 -60626 -60627 -60628 -60629. FIXED
> Free blocks count wrong for group 7 (1183, counted=1195). FIXED
> Free blocks count wrong (214664, counted=214676). FIXED
> Inode bitmap differences: -14510. FIXED
> Free inodes count wrong for group #7 (1805, counted=1806). FIXED
> Free inodes count wrong (221839, counted=221840). FIXED
>
> /dev/hda1: ***** FILE SYSTEM WAS MODIFIED *****
> /dev/hda1: 45400/267240 files (3.7% non-contiguous), 850748/1065424 blocks
> ..[snip]..
>
> thx..js
>

And here is the result of running my name-server for 3 days on an
otherwise unused machine.

[snip]
(none):/mnt# /sbin/fsck /
Parallelizing fsck version 1.04 (16-May-96)
e2fsck 1.04, 16-May-96 for EXT2 FS 0.5b, 95/08/09
/dev/sdc1: clean, 113021/514000 files, 1531941/2048001 blocks
^^^^^__________ Note, unmounted properly.

(none):/mnt# /sbin/fsck -f /
Parallelizing fsck version 1.04 (16-May-96)
e2fsck 1.04, 16-May-96 for EXT2 FS 0.5b, 95/08/09
Pass 1: Checking inodes, blocks, and sizes
Deleted inode 10384 has zero dtime.
Set dtime<y>? yes

Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Fix summary information<y>? yes

Block bitmap differences: -45759 -45760 -45761 -45762 -45763 -45764. FIXED
Free blocks count wrong for group 5 (3072, counted=3078). FIXED
Free blocks count wrong (516060, counted=516066). FIXED
Inode bitmap differences: -10384. FIXED
Free inodes count wrong for group #5 (1638, counted=1639). FIXED
Free inodes count wrong (400979, counted=400980). FIXED

/dev/sdc1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sdc1: 113020/514000 files (6.2% non-contiguous), 1531935/2048001 blocks
(none):/mnt# fdisk /dev/sdc
bash: fdisk: command not found
(none):/mnt# /sbin/fdisk /dev/sdc

Command (m for help): p

Disk /dev/sdc: 255 heads, 63 sectors, 553 cylinders
Units = cylinders of 16065 * 512 bytes

Device Boot Begin Start End Blocks Id System
/dev/sdc1 * 1 1 255 2048256 83 Linux native
/dev/sdc2 256 256 288 265072+ 82 Linux swap
/dev/sdc3 289 289 553 2128612+ 83 Linux native

Command (m for help): q
(none):/mnt# exit
[snip]
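
As a sanity check on the log above: the block bitmap line lists six
entries, and the free-block count was corrected from 516060 to 516066,
so the two agree. A minimal sketch of that check follows. The helper
names count_freed and counts_agree are made up for illustration, and it
assumes the one-entry-per-block format seen in this e2fsck output, not
ranges.

```c
#include <ctype.h>

/*
 * Count the "-NNNN" entries in an e2fsck "bitmap differences" line.
 * Assumption: every freed block or inode is listed on its own, as in
 * the log above. Range notation is not handled.
 */
static int count_freed(const char *line)
{
    int n = 0;
    const char *p;

    for (p = line; *p != '\0'; p++) {
        /* An entry is a '-' that starts a token and precedes a digit. */
        if (*p == '-' && isdigit((unsigned char)p[1]) &&
            (p == line || p[-1] == ' '))
            n++;
    }
    return n;
}

/* Return 1 when the free-count correction equals the number of entries. */
static int counts_agree(const char *line, long before, long counted)
{
    return (counted - before) == (long)count_freed(line);
}
```

For the line from /dev/sdc1, counts_agree(line, 516060, 516066) holds
for the whole filesystem, and counts_agree(line, 3072, 3078) holds for
group 5.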

Because I did not trust init and scripts with the shutdown sequence,
I wrote this to reboot these machines:

-------------
/*

Free software from rjohnson@analogic.com
Written by Richard B. Johnson. No Copyright is claimed. It's free.

This software is not guaranteed to do anything useful but:

This version of reboot doesn't require init to be running.
This will run even if you booted the kernel with a shell instead of
init.

You can reboot locally, over the network, or over a wire line.

For an ordinary user to use this, `chmod 4755` is required and the
program file has to be put in a root-owned directory.

Bugs(sorta):

(1) Only "MOUNTS" file systems will be dismounted. Adjust if you
have more than 0x100 mounted file-systems.

(2) This requires the glibc headers to compile.

(3) The /proc file-system must exist in the kernel although it
doesn't have to be mounted.

*/

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <signal.h>
#include <string.h>
#include <fcntl.h>
#include <sys/mount.h>
#include <sys/reboot.h>
#include <sys/stat.h>
#include <dirent.h>
#define INIT 1                           /* pid of init */
#define MOUNTS 0x100                     /* Most mounts we will handle */
#define NAME_LEN (0x100 * sizeof(char))  /* Buffer per /proc/mounts line */

int main(int /*@unused@*/ unused, char *argv[])
{
    int i, j;
    size_t len;
    FILE *file;
    char *mounts[MOUNTS];
    char mountp[NAME_LEN];
    DIR *dir;
    struct dirent *d;
    pid_t pid;
    pid_t me;

    puts("Rebooting.....");
    (void)fflush(stdout);
    (void)signal(SIGHUP, SIG_IGN);
    (void)signal(SIGWINCH, SIG_IGN);
    (void)signal(SIGTERM, SIG_IGN);
    (void)signal(SIGTSTP, SIG_IGN); /* In case we are pid 1 */
    if((fork()) != 0) exit(0);
    /* Rename the process without writing past the original argv[0] */
    len = strlen(argv[0]);
    memset(argv[0], 0x00, len);
    strncpy(argv[0], "Reboot", len);
    (void)setuid(0);
    (void)setgid(0);
    (void)chdir("/");
    if((file = fopen("/proc/mounts", "r")) == NULL)
    {
        /* /proc is not mounted, so try to mount it ourselves */
        (void) mkdir("/proc", ACCESSPERMS);
        (void) mount("proc", "/proc", "proc", 0, 0);
        if((file = fopen("/proc/mounts", "r")) == NULL)
        {
            fprintf(stderr, "Can't read /proc filesystem.\n");
            return 1;
        }
    }
    j = fileno(file); /* Highest descriptor we know about */
    for(i = 0; i < MOUNTS; i++)
    {
        if((mounts[i] = (char *) malloc(NAME_LEN)) == NULL)
            break;
        *mounts[i] = (char) 0x00;
        if((fgets(mounts[i], NAME_LEN, file)) == NULL)
        {
            free(mounts[i]);
            break;
        }
    }
    /*
     * Note: 'i' always points beyond the last buffer when we exit the
     * above loop even if 'i' is still 0 (no file-systems mounted).
     */
    (void)signal(SIGHUP, SIG_IGN);
    (void)signal(SIGWINCH, SIG_IGN);
    (void)signal(SIGTERM, SIG_IGN);
    (void)signal(SIGTSTP, SIG_IGN); /* In case we are pid 1 */
    while (j >= 0) /* Close every descriptor, stdio included */
        (void)close(j--);
    (void)setsid();
    (void)sync();
    me = getpid();
    if(me != INIT)
        (void)kill(INIT, SIGTSTP); /* Tell init to pause if it exists */
    if((dir = opendir("/proc")) != NULL)
    {
        /* First pass: ask everything except init and ourselves to exit */
        while((d = readdir(dir)) != NULL)
        {
            pid = (pid_t) atoi(d->d_name); /* Non-numeric names give 0 */
            if((pid > 1) && (pid != me))
                (void)kill(pid, SIGTERM);
        }
        rewinddir(dir);
        (void)sync(); /* Update for killed processes */
        (void) sleep(2); /* Let processes die naturally */
        /* Second pass: kill whatever is still running */
        while((d = readdir(dir)) != NULL)
        {
            pid = (pid_t) atoi(d->d_name);
            if((pid > 1) && (pid != me))
                (void)kill(pid, SIGKILL);
        }
        (void)closedir(dir);
    }
    (void)unlink("/etc/mtab~"); /* One of these could exist */
    (void)sync(); /* Try to help umount */
    while(i-- > 0) /* This logic is correct ! */
    {
        /* Second field of each /proc/mounts line is the mount point */
        if((sscanf(mounts[i], "%*s %s", mountp)) == 1)
        {
            (void)umount(mountp);
            (void)sleep(1);
        }
        free(mounts[i]);
    }
    (void)sleep(5); /* Let Disks get flushed */
    return (reboot(RB_AUTOBOOT));
}
---------------------------
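
The unmount loop above depends on two things: the mount point is the
second field of each /proc/mounts line, and the lines are walked last
to first so that nested mounts go before their parents. A minimal
sketch of just that step, pulled out so it can be tried without
rebooting anything, might look like this. The names MP_LEN, mount_point
and reverse_mount_points are made up for illustration and are not part
of the program above.

```c
#include <stdio.h>

#define MP_LEN 256  /* hypothetical buffer size, same as NAME_LEN above */

/*
 * Copy the second whitespace-separated field of a /proc/mounts line
 * into mountp. Returns 1 on success, 0 if the line has no second field.
 */
static int mount_point(const char *line, char mountp[MP_LEN])
{
    /* %*s skips the device field; %255s bounds the copy to MP_LEN - 1. */
    return sscanf(line, "%*s %255s", mountp) == 1;
}

/*
 * Fill out[] with the mount points of lines[0..n-1] in reverse order,
 * which is the order they would be unmounted in. Returns how many
 * mount points were found.
 */
static int reverse_mount_points(const char *lines[], int n,
                                char out[][MP_LEN])
{
    int count = 0;

    while (n-- > 0) {
        if (mount_point(lines[n], out[count]))
            count++;
    }
    return count;
}
```

Given lines for "/", "/proc" and a made-up "/home" in that order, the
result is "/home", "/proc", "/", so the root file-system is always
attempted last.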

Cheers,
Dick Johnson
***** FILE SYSTEM MODIFIED *****
Penguin : Linux version 2.1.105 on an i586 machine (66.15 BogoMips).
Warning : It's hard to remain at the trailing edge of technology.
