Re: [PATCH v11 21/48] ext4: Add richacl feature flag

From: Austin S Hemmelgarn
Date: Mon Oct 19 2015 - 14:46:04 EST


On 2015-10-19 13:33, Andreas Gruenbacher wrote:
On Mon, Oct 19, 2015 at 6:19 PM, Austin S Hemmelgarn
<ahferroin7@xxxxxxxxx> wrote:
On 2015-10-19 11:34, Andreas Gruenbacher wrote:
On Mon, Oct 19, 2015 at 3:16 PM, Austin S Hemmelgarn
<ahferroin7@xxxxxxxxx> wrote:
On 2015-10-16 13:41, Andreas Gruenbacher wrote:
On Fri, Oct 16, 2015 at 7:31 PM, Austin S Hemmelgarn
<ahferroin7@xxxxxxxxx> wrote:
I would like to re-iterate, on both XFS and ext4, I _really_ think this
should be a ro_compat flag, and not an incompat one. If a person has the
ability to mount the FS (even if it's a read-only mount), then they by
definition have read access to the file or partition that the filesystem
is contained in, which means that any ACL's stored on the filesystem are
functionally irrelevant,
It is unfortunately not safe to make such a file system accessible to
other users, so the feature is not strictly read-only compatible.
OK, seeing as I wasn't particularly clear as to why I object to this in my
other e-mail, let's try this again.
Can you please explain exactly why it isn't safe to make such a filesystem
accessible to other users?
See here: http://www.spinics.net/lists/linux-ext4/msg49541.html
OK, so to clarify, this isn't 'safe' because:
1. The richacls that exist on the filesystem won't be enforced.
2. Newly created files will have no ACL's set.

It is worth noting that these are also issues with any kind of access
control mechanism. Using your logic, all LSM's need to set separate
incompat feature flags in filesystems they are being used on, as should
POSIX ACLs, and for that matter so should Samba in many circumstances, and
any NFS system not using idmapping or synchronized/centralized user
databases. Now, if the SELinux (or SMACK, or TOMOYO) people had taken this
approach, then I might be inclined to not complain (at least not to you, I'd
be complaining to them about this rather poor design choice), but that is
not the case, because (I assume) they realized that all this provides is a
false sense of security.

LSMs reside above the filesystem level. Let's take SELinux as an
example. It has its own consistency check mechanism (relabeling). Fsck
could check the syntax of SELinux labels, but it couldn't do anything
sensible about corrupted labels, and syntactically correct labels also
don't mean much. A relabeling run to verify or restore the appropriate
policy would still be necessary to verify that labels are semantically
correct, and for that, the filesystem needs to be mounted in the right
place in the filesystem hierarchy.

TOMOYO and AppArmor are not based on inode labels at all.
Apologies for being unintentionally over-inclusive WRT LSM's, and also for forgetting that TOMOYO doesn't use inode labels.
LSMs usually also just provide an extra layer of security; when turned
off, the basic security mechanisms still in effect will make sure that
the system works just like before. (There are configurations like MLS
where that is not the case, but those are uncommon.)
Um, actually no, even without MLS (or MCS), there is no guarantee whatsoever that the system will work 'just' like before. I've actually seen systems break on any of the transitions between enforcing/permissive/off for SELinux (and in one case on all of them), even when relabeling was done properly.
ACLs are quite different from that. They can be checked statically by
fsck. They are a basic security concept, and when turned off, there is
no guarantee that the system will still be safe.
LSM's hook into the VFS code right alongside the regular permissions checks, and are intended to supplement the regular UNIX DAC based permissions. Richacls (and POSIX ACLs) hook into the same checks and are intended to supplement the same permissions. Fsck only checks the syntax of regular UNIX DAC based permissions (it doesn't verify that the listed UID/GID are actually valid in userspace, nor does it check for semantically nonsensical permissions modes like 007 or 000), and it really can't properly check anything more than that on ACL's either.

Based on all of this, richacls and LSM's which mediate filesystem accesses are on exactly the same level, with three exceptions: LSM's usually have way more functionality beyond just using ACL's to control file access, LSM's are usually MAC based while ACL's are DAC based, and ACL's are (usually) not dependent on where in the filesystem hierarchy they are found.

On top of that, there is no guarantee that a system will still be safe when you turn SELinux (or other LSM's) off either (in fact, for some configurations, it can be mathematically proven that the system will not be safe if you turn off SELinux).
Issue 1, as I have said before, is functionally irrelevant for anyone who
actually knows what they are doing; all you need for ext* is one of the
myriad of programs for un-deleting files on such a filesystem (such as
ext4magic or extundelete, and good luck convincing them to not allow being
used when this flag is set), for BTRFS you just need the regular filesystem
administration utilities ('btrfs restore' works wonders, and that one will
_never_ honor any kind of permissions, because it's for disaster recovery),
and while I don't know of any way to do this with XFS, that is only because
I don't use XFS myself and have not had the need to provide tech support for
anyone who does. If somebody absolutely _needs_ a guarantee that the acls
will be enforced, they need to be using whole disk encryption, not just
acls, and even that can't provide such a guarantee.

As for issue 2, that can be solved by making it a read-only compatible flag,
which is what I was suggesting be done in the first place. The only
situation I can think of that this would cause an issue for is if the
filesystem was not cleanly unmounted, and the log-replay doesn't set the
ACL's, but mounting an uncleanly unmounted filesystem that has richacls on a
kernel without support should fall into one of the following 2 cases more
than 99% of the time:
1. The system crashed hard, and the regular kernel is un-bootable for some
reason. In this case you're at the point of disaster recovery, should not be
exposing _anything_ to a multi-user environment, and probably care a lot
more about being able to get the system running again than about not
accidentally creating a file with a missing ACL.
2. The filesystem was maliciously stolen in some way (either the hardware
was acquired, or more likely, someone got an image of a still mounted
filesystem), in which case all of my statements above regarding issue 1
apply.
Please spare me all that nonsense. Compared to mount options,
filesystem feature flags in this case simplify things (you don't have
to specify whether a filesystem contains POSIX ACLs or richacls), and
they prevent administrator errors: when a filesystem mounts, it is
safe to use; when it doesn't, it is not. That's all there is to it.
You're ignoring what I'm actually saying. I've said absolutely nothing about needing to use mount options at all, and I'm not arguing against using filesystem feature flags, I'm arguing for using them sensibly in a way that does not present a false sense of security.

Making this a read-only compatible (ro_compat) flag will solve the second issue that you outlined for more than 99% of all use cases (and the remaining cases are provably irrelevant), still indicate clearly to the FS code which type of ACL is in use on the FS, and remove the entirely false sense of security WRT the first issue you outlined.
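
To make the difference between the flag classes concrete, here is a minimal sketch of how an ext*-style kernel decides what to do with feature bits it does not recognize. This is an illustration only, not the actual kernel code: the function name, the Outcome enum, and the FEATURE_RICHACL bit value are all made up for the example.

```python
from enum import Enum

# Hypothetical bit value, chosen for illustration; not the real on-disk bit.
FEATURE_RICHACL = 0x1


class Outcome(Enum):
    MOUNTED_RW = "mounted read-write"
    MOUNTED_RO = "mounted read-only"
    REFUSED = "mount refused"


def mount_outcome(fs_ro_compat, fs_incompat, known_ro_compat, known_incompat, want_rw):
    """Model the mount decision for a filesystem carrying the given feature bits.

    fs_*    : feature words stored in the superblock
    known_* : feature bits the running kernel understands
    want_rw : True if the caller asked for a read-write mount
    """
    # Any unknown incompat bit blocks the mount entirely.
    if fs_incompat & ~known_incompat:
        return Outcome.REFUSED
    # An unknown ro_compat bit blocks only read-write mounts.
    if fs_ro_compat & ~known_ro_compat:
        return Outcome.REFUSED if want_rw else Outcome.MOUNTED_RO
    # Unknown compat bits never block anything, so they are not modeled here.
    return Outcome.MOUNTED_RW if want_rw else Outcome.MOUNTED_RO
```

With richacl as a ro_compat flag, a kernel without support can still mount read-only, so it can never create a file with a missing ACL; with richacl as an incompat flag, the same kernel cannot mount at all.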

Making it an incompatible flag will likely cause headaches for some legitimate users, while at most delaying competent hackers by a few seconds to a few minutes and script kiddies by a few hours. It is really no better than security by obscurity (and from a purely logistical standpoint, that's _all_ it is), in that it actively tries to hide the fact that anyone with read access to the storage the filesystem is on can bypass the ACL's.

To reiterate, if someone can call mount() on a filesystem, and mount() does not return -EPERM, then even if mount() returns a different error, they still have the ability to completely bypass all permissions and ACL's in that filesystem, because they have the ability to read the entire filesystem directly.
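
As a small illustration of reading the filesystem directly, the sketch below pulls the feature words straight out of the raw superblock bytes without ever calling mount(). It assumes the standard ext2/3/4 layout (primary superblock 1024 bytes into the device, little-endian fields); the function and constant names are made up for the example.

```python
import struct

SUPERBLOCK_OFFSET = 1024  # primary ext2/3/4 superblock starts 1024 bytes in
MAGIC_OFFSET = 56         # s_magic, 16-bit little-endian
FEATURES_OFFSET = 92      # s_feature_compat, s_feature_incompat, s_feature_ro_compat
EXT_MAGIC = 0xEF53


def read_feature_words(image):
    """Return the three feature words from raw device or image bytes."""
    (magic,) = struct.unpack_from("<H", image, SUPERBLOCK_OFFSET + MAGIC_OFFSET)
    if magic != EXT_MAGIC:
        raise ValueError("not an ext2/3/4 superblock")
    # The three 32-bit feature words sit back to back in the superblock.
    compat, incompat, ro_compat = struct.unpack_from(
        "<III", image, SUPERBLOCK_OFFSET + FEATURES_OFFSET
    )
    return {"compat": compat, "incompat": incompat, "ro_compat": ro_compat}
```

Anyone who can open the device or image for reading can feed its first 2048 bytes to this function, whatever flags are set; the same applies to every other block on the device, which is all the undelete tools mentioned earlier rely on.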

The _only_ way to properly protect against people bypassing the ACL's is to use full disk encryption and lock down root access on the system, and even that can't completely prevent it from happening.
