Re: linux-2.4.0 breaks grub install into partition

From: Andrea Arcangeli (andrea@suse.de)
Date: Sun Jul 09 2000 - 12:14:03 EST


[ disclaimer: I have no idea what GRUB is ]

On Sun, 9 Jul 2000, OKUJI Yoshinori wrote:

> I don't think what GRUB does is a wrong thing basically. Some types
>of software always need (or want) to access raw devices, for example,
>FDISK programs, filesystem resizers, and fast database servers. So,

fdisk works on the partition table, and the kernel never writes to the
partition table, so as long as you run only one fdisk/lilo at a time
you're obviously safe. (That's a simple userspace/admin issue.)

>AFAIK, all the realistic operating systems export raw devices to
>user-level programs and support one or more system calls to keep
>anything in the kernel consistent.

We certainly need callbacks in the filesystem in order to do live snapshots
of a logical volume, and reiserfs should provide them soon IIRC. (For a
journaling fs we could avoid such callbacks, either by using tricks that
allow a not cleanly unmounted journaling fs to be mounted read-only, or by
doing writable snapshots so that the journal can be replayed on mount, but
we'll certainly need them for ext2, for example.) For ext2 the thing looks
pretty trivial: we'll probably only need to grab the superblock lock while
doing the snapshot. So theoretically we don't even need the callback in the
ext2 case, but we need a callback to allow other filesystems to potentially
do other things, of course.

> For now, the grub shell calls sync() (twice before any operation)
>and ioctl(fd, BLKFLSBUF, 0) (after and before operations) under

Both things are useless if your goal is to read consistent data from a
live fs, so you can remove them. If your goal is to write to a live fs
via the raw device (note I'm not talking about a rawio device here) then
you can't achieve it unless you are also able to tell the fs what you
changed (some metadata can be cached in dentries, and in 2.4.x the
data is never in the buffer cache, so the fs won't notice your changes).
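
For reference, the sequence described in the quoted text (sync() twice,
then ioctl(fd, BLKFLSBUF, 0)) might look roughly like the sketch below.
It is not taken from the GRUB sources: the helper name
flush_block_device and its return convention are made up for
illustration, and it assumes a Linux system where <linux/fs.h> defines
BLKFLSBUF. As said above, it does not make reads from a live fs
consistent.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>   /* BLKFLSBUF */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper (name and return convention are assumptions):
 * flush dirty data and drop the buffer cache of the block device at
 * `path`. Returns 0 on success, -1 on failure with errno set.
 */
int flush_block_device(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;              /* errno set by open() */

    /* Ask the kernel to write dirty data back, twice as described. */
    sync();
    sync();

    /* Flush the device's buffer cache; typically needs root. */
    int rc = ioctl(fd, BLKFLSBUF, 0);
    int saved_errno = errno;    /* close() may clobber errno */

    close(fd);
    errno = saved_errno;
    return rc < 0 ? -1 : 0;
}
```

A caller would pass a device node such as /dev/hda and check for -1.
Nothing here stops the fs from changing again right after the call
returns.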

>Linux. I thought that was enough, since sync should make filesystems
>and buffer caches consistent, and BLKFLSBUF should flush buffer caches

In 2.2.x, when you read from or write to the raw device you're assured
of seeing the same data that the fs is seeing, because of the costly
page-cache/buffer-cache synchronization. However, on a live filesystem
the fs can change from under you, for example while you take a page
fault during the read(/dev/hda) syscall, and if you're not holding
the superblock lock (in the ext2 case) you may read inconsistent data
anyway.

In 2.4.x the buffer cache is completely unsynchronized with the page
cache, so running BLKFLSBUF would make somewhat more sense there, to make
sure that the next time you read data from the raw blockdevice you see
the data that the kernel _was_ (not _is_) seeing at the time of the
BLKFLSBUF. But between the ioctl(BLKFLSBUF) and the read(/dev/hda) you
may be rescheduled and somebody may write to the page cache again...
(For metadata, ext2 2.2.x and 2.4.x are the same here; I was only talking
about data.)
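
To make the window concrete, here is a small userspace sketch that
reads the same region twice and compares the two copies. This is a
swapped-in illustration, not something GRUB or the kernel does, and the
name read_region_twice is hypothetical. It can only detect a change
that became visible through the file or device between the two reads.
Matching reads do not prove the data is consistent, which is exactly
the problem described above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly `len` bytes at `off`; 0 on success, -1 on error or EOF. */
static int pread_full(int fd, void *buf, size_t len, off_t off)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = pread(fd, p, len, off);
        if (n < 0 && errno == EINTR)
            continue;           /* interrupted, retry */
        if (n <= 0)
            return -1;          /* error or unexpected end of file */
        p += n;
        off += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical helper: read `len` bytes at `off` from `path` twice.
 * The first copy is left in `buf`. Returns 1 if both reads matched,
 * 0 if the data changed in between, -1 on error.
 */
int read_region_twice(const char *path, off_t off, void *buf, size_t len)
{
    assert(path != NULL && buf != NULL && len > 0);

    int result = -1;
    void *scratch = malloc(len);
    int fd = open(path, O_RDONLY);

    /* A writer can get in between these two reads, or after both. */
    if (scratch != NULL && fd >= 0 &&
        pread_full(fd, buf, len, off) == 0 &&
        pread_full(fd, scratch, len, off) == 0)
        result = (memcmp(buf, scratch, len) == 0) ? 1 : 0;

    if (fd >= 0)
        close(fd);
    free(scratch);
    return result;
}
```

A return value of 0 tells you the region moved under you; a return
value of 1 tells you nothing about what happens after the second read.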

Andrea



