Re: [PATCH] init/do_mounts.c: Add root="fstag:<tag>" syntax for root device

From: Stefan Hajnoczi
Date: Thu Jun 10 2021 - 04:16:50 EST


On Wed, Jun 09, 2021 at 11:45:43AM -0400, Vivek Goyal wrote:
> On Wed, Jun 09, 2021 at 10:51:56AM +0100, Stefan Hajnoczi wrote:
> > On Tue, Jun 08, 2021 at 11:35:24AM -0400, Vivek Goyal wrote:
> > > We want to be able to mount virtiofs as rootfs and pass appropriate
> > > kernel command line. Right now there does not seem to be a good way
> > > to do that. If I specify "root=myfs rootfstype=virtiofs", system
> > > panics.
> > >
> > > virtio-fs: tag </dev/root> not found
> > > ..
> > > ..
> > > [ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) ]
> > >
> > > Basic problem here is that kernel assumes that device identifier
> > > passed in "root=" is a block device. But there are a few exceptions
> > > to this rule to take care of the needs of mtd, ubi, NFS and CIFS.
> > >
> > > For example, mtd and ubi prefix "mtd:" or "ubi:" respectively.
> > >
> > > "root=mtd:<identifier>" or "root=ubi:<identifier>"
> > >
> > > NFS passes "root=/dev/nfs" and CIFS passes "root=/dev/cifs", and the
> > > actual root device details come from filesystem specific kernel
> > > command line options.
> > >
> > > virtiofs does not seem to fit in any of the above categories. In fact
> > > we have 9pfs which can be used to boot from but it also does not
> > > have a proper syntax to specify rootfs and does not fit into any of
> > > the existing syntax. They both expect a device "tag" to be passed
> > > in as the device to be mounted, and the filesystem knows how to parse
> > > and use the "tag".
> > >
> > > So this patch proposes that we add a new prefix "fstag:" which specifies
> > > that the identifier which follows is a filesystem specific tag and not
> > > a block device. Just pass this tag to filesystem and filesystem will
> > > figure out how to mount it.
> > >
> > > For example, "root=fstag:<tag>".
> > >
> > > In case of virtiofs, I can specify "root=fstag:myfs rootfstype=virtiofs"
> > > and it works.
> > >
> > > I think this should work for 9p as well. "root=fstag:myfs rootfstype=9p".
> > > Though I have yet to test it.
> > >
> > > This kind of syntax should be able to address wide variety of use cases
> > > where root device is not a block device and is simply some kind of
> > > tag/label understood by filesystem.
> >
> > "fstag" is kind of virtio-9p/fs specific. The intended effect is really
> > to specify the file system source (like in mount(2)) without it being
> > interpreted as a block device.
>
> [ CC christoph ]
>
> I think mount(2) has slightly different requirements. It more or less
> passes the source to the filesystem. But during early boot, we do much
> more with the source: we parse it, determine the device major and
> minor, create the block device and then call into the filesystem.
>
> >
> > In a previous discussion David Gilbert suggested detecting file systems
> > that do not need a block device:
> > https://patchwork.kernel.org/project/linux-fsdevel/patch/20190906100324.8492-1-stefanha@xxxxxxxxxx/
> >
> > I never got around to doing it, but can do_mounts.c just look at struct
> > file_system_type::fs_flags FS_REQUIRES_DEV to detect non-block device
> > file systems?
>
> I guess we can use FS_REQUIRES_DEV. We probably will need to add a helper
> to determine if filesystem passed in "rootfstype=" has FS_REQUIRES_DEV
> set or not.
>
> For now, I have written a patch which does not rely on FS_REQUIRES_DEV.
> Instead I have created an array of filesystems which do not want
> root=<source> to be treated as block device and expect that "source"
> will be directly passed to the filesystem to be mounted.
>
> Reason I am not parsing FS_REQUIRES_DEV yet is that I am afraid that
> this can change behavior and introduce a regression. There may be some
> filesystem which does not have FS_REQUIRES_DEV set but still somehow
> goes through the block device path (or some path which I can't see yet).
>
> So for now I am playing safe and explicitly creating a list of
> filesystems which will opt-in into this behavior. But if folks think
> that my fears of regression are misplaced and I should parse
> FS_REQUIRES_DEV and that way any filesystem which does not have
> FS_REQUIRES_DEV set automatically gets opted in, I can do that.
>
> >
> > That way it would know to just mount with root= as the source instead of
> > treating it as a block device. No root= prefix would be required and it
> > would handle NFS, virtiofs, virtio-9p, etc without introducing the
> > concept of a "tag".
> >
> > root=myfs rootfstype=virtiofs rootflags=...
> >
> > I wrote this up quickly after not thinking about the topic for 2 years,
> > so the idea may not work at all :).
>
> Now with this patch "root=myfs, rootfstype=virtiofs, rootflags=..." syntax
> works for virtiofs.
>
> Please have a look.

Looks good from a user perspective.
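
For comparison, the FS_REQUIRES_DEV idea discussed above could be modeled
roughly as follows. This is a user-space sketch, not kernel code: the
struct is a cut-down stand-in for struct file_system_type, the registry
contents are illustrative, and model_get_fs_type() and
root_source_is_opaque() are hypothetical names that do not appear in the
patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Same name as the kernel flag; the value is part of this model. */
#define FS_REQUIRES_DEV 1

/* Cut-down stand-in for the kernel's struct file_system_type. */
struct file_system_type {
	const char *name;
	int fs_flags;
};

/* Illustrative registry only (assumed flags, not read from the kernel). */
static const struct file_system_type model_fs_types[] = {
	{ "ext4",     FS_REQUIRES_DEV },
	{ "virtiofs", 0 },
};

/* Hypothetical helper: look up a filesystem type by name, NULL if unknown. */
static const struct file_system_type *model_get_fs_type(const char *name)
{
	size_t i;

	if (!name)
		return NULL;

	for (i = 0; i < sizeof(model_fs_types) / sizeof(model_fs_types[0]); i++) {
		if (strcmp(name, model_fs_types[i].name) == 0)
			return &model_fs_types[i];
	}
	return NULL;
}

/* Hypothetical helper: should root= be handed to the filesystem as-is? */
static bool root_source_is_opaque(const char *rootfstype)
{
	const struct file_system_type *fs = model_get_fs_type(rootfstype);

	/* Unknown or missing type: keep the existing block-device path. */
	return fs && !(fs->fs_flags & FS_REQUIRES_DEV);
}
```

In this model an unknown rootfstype= falls back to the block-device
path, which is one way to limit the regression risk mentioned above.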

> Subject: [PATCH] init/do_mounts.c: Add a path to boot from non blockdev filesystems
>
> We want to be able to mount virtiofs as rootfs and pass appropriate
> kernel command line. Right now there does not seem to be a good way
> to do that. If I specify "root=myfs rootfstype=virtiofs", system
> panics.
>
> virtio-fs: tag </dev/root> not found
> ..
> ..
> [ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) ]
>
> Basic problem here is that kernel assumes that device identifier
> passed in "root=" is a block device. But there are a few exceptions
> to this rule to take care of the needs of mtd, ubi, NFS and CIFS.
>
> For example, mtd and ubi prefix "mtd:" or "ubi:" respectively.
>
> "root=mtd:<identifier>" or "root=ubi:<identifier>"
>
> NFS passes "root=/dev/nfs" and CIFS passes "root=/dev/cifs", and the
> actual root device details come from filesystem specific kernel
> command line options.
>
> virtiofs does not seem to fit in any of the above categories. In fact
> we have 9pfs which can be used to boot from but it also does not
> have a proper syntax to specify rootfs and does not fit into any of
> the existing syntax. They both expect a device "tag" to be passed
> in as the device to be mounted, and the filesystem knows how to parse
> and use the "tag".
>
> So this patch proposes that we internally create a list of filesystems
> which don't expect a block device and whatever "source" has been
> passed in "root=<source>" option, should be passed to filesystem and
> filesystem should be able to figure out how to use "source" to
> mount filesystem.
>
> As of now I have only added "virtiofs" in the list of such filesystems.
> To enable it on 9p, it should be a simple change. Just add "9p" to
> the nobdev_filesystems[] array.

virtio-9p should be simple. I'm not sure how much additional setup the
other 9p transports require. TCP and RDMA seem doable if there are
kernel parameters to configure things before the root file system is
mounted.
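
To make the opt-in list concrete, here is a minimal user-space sketch of
the lookup the patch description implies. Only the array name
nobdev_filesystems[] comes from the description; is_nobdev_fs() is a
hypothetical helper, and the actual patch may structure this differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Opt-in list named in the patch description; only virtiofs is listed so far. */
static const char *const nobdev_filesystems[] = {
	"virtiofs",
};

/* Hypothetical helper: true if fstype opted out of block-device handling. */
static bool is_nobdev_fs(const char *fstype)
{
	size_t i;

	if (!fstype)
		return false;

	for (i = 0; i < sizeof(nobdev_filesystems) / sizeof(nobdev_filesystems[0]); i++) {
		if (strcmp(fstype, nobdev_filesystems[i]) == 0)
			return true;
	}
	return false;
}
```

With this shape, enabling 9p would be a one-line addition of "9p" to the
array, as the description says.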
