Re: [RFC v2 1/4] fs: Add generic file system event notifications

From: Beata Michalska
Date: Mon Apr 27 2015 - 11:08:54 EST


On 04/27/2015 04:24 PM, Greg KH wrote:
> On Mon, Apr 27, 2015 at 01:51:41PM +0200, Beata Michalska wrote:
>> Introduce configurable generic interface for file
>> system-wide event notifications, to provide file
>> systems with a common way of reporting any potential
>> issues as they emerge.
>>
>> The notifications are to be issued through generic
>> netlink interface by newly introduced multicast group.
>>
>> Threshold notifications have been included, allowing
>> triggering an event whenever the amount of free space drops
>> below a certain level - or levels to be more precise as two
>> of them are being supported: the lower and the upper range.
>> The notifications work both ways: once the threshold level
>> has been reached, an event shall be generated whenever
>> the number of available blocks goes up again re-activating
>> the threshold.
>>
>> The interface has been exposed through a vfs. Once mounted,
>> it serves as an entry point for the set-up where one can
>> register for particular file system events.
>>
>> Signed-off-by: Beata Michalska <b.michalska@xxxxxxxxxxx>
>> ---
>> Documentation/filesystems/events.txt | 231 ++++++++++
>> fs/Makefile | 1 +
>> fs/events/Makefile | 6 +
>> fs/events/fs_event.c | 770 ++++++++++++++++++++++++++++++++++
>> fs/events/fs_event.h | 25 ++
>> fs/events/fs_event_netlink.c | 99 +++++
>> fs/namespace.c | 1 +
>> include/linux/fs.h | 6 +-
>> include/linux/fs_event.h | 58 +++
>> include/uapi/linux/fs_event.h | 54 +++
>> include/uapi/linux/genetlink.h | 1 +
>> net/netlink/genetlink.c | 7 +-
>> 12 files changed, 1257 insertions(+), 2 deletions(-)
>> create mode 100644 Documentation/filesystems/events.txt
>> create mode 100644 fs/events/Makefile
>> create mode 100644 fs/events/fs_event.c
>> create mode 100644 fs/events/fs_event.h
>> create mode 100644 fs/events/fs_event_netlink.c
>> create mode 100644 include/linux/fs_event.h
>> create mode 100644 include/uapi/linux/fs_event.h
>
> Any reason why you just don't do uevents for the block devices today,
> and not create a new type of netlink message and userspace tool required
> to read these?

The idea here is to support filesystems with no backing device as well.
Parsing the message with libnl is really simple and requires only a few lines
of code (a sample application was presented in the initial version of this RFC).

>
>> --- a/fs/Makefile
>> +++ b/fs/Makefile
>> @@ -126,3 +126,4 @@ obj-y += exofs/ # Multiple modules
>> obj-$(CONFIG_CEPH_FS) += ceph/
>> obj-$(CONFIG_PSTORE) += pstore/
>> obj-$(CONFIG_EFIVAR_FS) += efivarfs/
>> +obj-y += events/
>
> Always?
>
>> diff --git a/fs/events/Makefile b/fs/events/Makefile
>> new file mode 100644
>> index 0000000..58d1454
>> --- /dev/null
>> +++ b/fs/events/Makefile
>> @@ -0,0 +1,6 @@
>> +#
>> +# Makefile for the Linux Generic File System Event Interface
>> +#
>> +
>> +obj-y := fs_event.o
>
> Always? Even if the option is not selected? Why is everyone forced to
> always use this code? Can't you disable it for the "tiny" systems that
> don't need it?
>

I was considering making it optional and I guess it's worth getting back
to this idea.
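
For illustration, one common kernel pattern for an optional feature is to
hide it behind a config symbol and provide empty inline stubs when the option
is off. A minimal sketch, assuming a hypothetical CONFIG_FS_EVENTS symbol and
a hypothetical fs_event_notify() hook (neither name is taken from the patch):

```c
/*
 * Hypothetical names: CONFIG_FS_EVENTS and fs_event_notify() are
 * placeholders for illustration, not identifiers from the patch.
 */
struct super_block;	/* opaque here; defined by the VFS in the kernel */

#ifdef CONFIG_FS_EVENTS
/* Real implementation, built only when the option is selected. */
int fs_event_notify(struct super_block *sb, unsigned int event_id);
#else
/* Option disabled: callers still compile and the call is a no-op. */
static inline int fs_event_notify(struct super_block *sb,
				  unsigned int event_id)
{
	(void)sb;
	(void)event_id;
	return 0;
}
#endif
```

With something along these lines, the fs/Makefile entry could presumably use
obj-$(CONFIG_FS_EVENTS) instead of obj-y, in the same way as the neighbouring
ceph/, pstore/ and efivarfs/ entries.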

>> +struct fs_trace_entry {
>> + atomic_t count;
>
> Why not just use a 'struct kref' for your count, which will save a bunch
> of open-coding of reference counting, and forcing us to audit your code
> to verify you got all the corner cases correct? :)
>
>> + atomic_t active;
>> + struct super_block *sb;

I'm not sure using kref would change much here: unfortunately, it would not
really make those corner cases any easier to verify.
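
For reference, the pattern under discussion embeds the counter in the object
and frees the object from a release callback on the last put. Below is a
userspace model of that idea, a sketch only: the model_* names are made up
for this example and use C11 atomics, they are not the kernel's kref API, and
struct model_trace_entry is a hypothetical stand-in for fs_trace_entry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct kref (illustration only). */
struct model_kref {
	atomic_int refcount;
};

static void model_kref_init(struct model_kref *k)
{
	atomic_init(&k->refcount, 1);	/* the creator holds the first reference */
}

static void model_kref_get(struct model_kref *k)
{
	atomic_fetch_add(&k->refcount, 1);
}

/* Drop one reference; run release() on the last put. Returns 1 if released. */
static int model_kref_put(struct model_kref *k,
			  void (*release)(struct model_kref *))
{
	int old = atomic_fetch_sub(&k->refcount, 1);

	assert(old > 0);	/* a put without a matching get is a bug */
	if (old == 1) {
		release(k);
		return 1;
	}
	return 0;
}

/* Hypothetical entry: the embedded counter replaces "atomic_t count". */
struct model_trace_entry {
	struct model_kref ref;
	int active;
};

static int entries_released;	/* bookkeeping for the example only */

static void model_entry_release(struct model_kref *k)
{
	/* Recover the containing object, as container_of() does in the kernel. */
	struct model_trace_entry *en = (struct model_trace_entry *)
		((char *)k - offsetof(struct model_trace_entry, ref));

	entries_released++;
	free(en);
}

static struct model_trace_entry *model_entry_alloc(void)
{
	struct model_trace_entry *en = calloc(1, sizeof(*en));

	if (en)
		model_kref_init(&en->ref);
	return en;
}
```

The get/put pairing still has to be audited at every call site, which is the
part that stays equally hard with or without the helper.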

>
> Are you properly reference counting this pointer? I didn't see where
> that was happening, so I must have missed it.
>
> thanks,
>

You haven't missed it. And unless I have missed something myself, the sb is
used only as long as the super block is alive. Most of the code operates on
the sb only when explicitly asked to, through a call from the filesystem.
There is also a callback notifying that the mount is being dropped (which
precedes the call to kill_super) and which invalidates the object that
depends on it.
Still, the use of the sb should be stated explicitly by bumping up the
s_count counter, though that would require taking the sb_lock. AFAIK, one can
get a reference to a super block, but only for a particular device. Maybe it
would be worth making that more generic (?).


> greg k-h
>


BR
Beata