Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure

From: Kyle Moffett
Date: Sat Jun 18 2011 - 21:54:51 EST


On Fri, Jun 17, 2011 at 10:27, James Bottomley
<James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, 2011-06-17 at 01:04 +0200, Kay Sievers wrote:
>> >> We need many names, and we need all of them from the very beginning,
>> >> and they should not change during device lifetime unless the device
>> >> state changes.
>> >
>> > So that's actually an argument for leaving the links, surely? We can
>> > have many inbound links, but the kernel can only print one name in
>> > messages, which would be the preferred name that was currently set.
>>
>> I really question any concept of _the_ name. My take on it: It will
>> never work in reality.
>
> OK, so lets take the common example: a desktop with three disks and an
> enclosure with three slots and labels "fred", "jim", and "betty".
>
> The desired outcome is that whenever the user manipulates those devices
> he uses a name related to the label, so whenever dmesg flags a problem,
> it says sd betty: device offline or something. Whenever he mounts, he
> mounts by /dev/disk/by-preferred/betty (or whatever the current udev
> vernacular is). Whenever smartmon says there's an over temp problem. it
> says that fred has it; cat /proc/partitions shows how fred, jim and
> betty are partitioned and so on.

Hm...

So there's already all this work going into an event-tracing framework,
and most of the interesting device errors are getting converted to use
functions such as "dev_err()" and the like.

Perhaps the kernel needs a "log" event? You could add a basic unique-id
allocator (64-bit integer) and give each device or other interesting object a
unique "tag". A generic printk without a "tag" field would automatically
get tag 0.
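
As a rough userspace sketch of the allocator idea (every name here is
hypothetical, none of it is existing kernel API, and a real version would
hang the tag off the device structure rather than a toy struct):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not an existing kernel API. */
#define LOG_TAG_NONE 0ULL              /* tag carried by a plain, untagged printk */

static _Atomic uint64_t next_log_tag = 1;   /* 0 is reserved for "no tag" */

/* Hand out a fresh tag. A 64-bit counter will not wrap in practice. */
static uint64_t log_tag_alloc(void)
{
    uint64_t tag = atomic_fetch_add(&next_log_tag, 1);

    assert(tag != LOG_TAG_NONE);
    return tag;
}

/* Stand-in for a device or any other "interesting object". */
struct tagged_object {
    const char *name;
    uint64_t log_tag;
};

static void tagged_object_init(struct tagged_object *obj, const char *name)
{
    obj->name = name;
    obj->log_tag = log_tag_alloc();
}
```

Tags are never reused, so a tag seen in an old log still refers to exactly
one object even after that object has gone away.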

There would be another few special events generated to make it possible
to uniquely map tags to device-model objects (or filesystems or whatever)
long after the fact, including enough information to determine the parent
device or other key attributes.

Then all of the dev_dbg(), dev_err() and similar calls would automatically
generate the necessary trace events tagged by device, with the log-level
and "string" as the payload.
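
Something along these lines, again only as a userspace sketch: the
sketch_dev_*() wrappers, struct log_event and the single-slot "buffer" are
all made up for illustration, and the real dev_err() family is not
implemented this way today. The levels 3 and 7 match the numeric values of
KERN_ERR and KERN_DEBUG.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins; not the kernel's struct device or trace API. */
struct sketch_device {
    const char *name;
    uint64_t log_tag;
};

struct log_event {
    uint64_t tag;      /* 0 means "not associated with any object" */
    int level;         /* printk-style log level */
    char msg[128];     /* formatted message, truncated if too long */
};

static struct log_event last_event;   /* stand-in for a trace ring buffer */

static void sketch_dev_log(const struct sketch_device *dev, int level,
                           const char *fmt, ...)
{
    va_list ap;

    assert(fmt != NULL);
    last_event.tag = dev ? dev->log_tag : 0;
    last_event.level = level;
    va_start(ap, fmt);
    vsnprintf(last_event.msg, sizeof(last_event.msg), fmt, ap);
    va_end(ap);
}

/* Drivers keep calling the familiar-looking helpers, unchanged. */
#define sketch_dev_err(dev, ...) sketch_dev_log((dev), 3, __VA_ARGS__)
#define sketch_dev_dbg(dev, ...) sketch_dev_log((dev), 7, __VA_ARGS__)
```

The point is that the driver source does not change; only the helper behind
the macro starts recording which object the message belongs to.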

Suddenly you can monitor a device (and optionally all of its parents or
children) for "interesting kernel events", even if that particular driver
is still doing all of its logging with "primitive" dev_err() printks.
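
To make the "device plus all of its children" part concrete, here is a
small sketch of what a monitor could do with the tag-mapping records. The
record layout and function names are assumptions of mine, and it presumes
the parent links form a tree (no cycles).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record a monitor builds from the "tag mapping" events. */
struct tag_record {
    uint64_t tag;
    uint64_t parent_tag;   /* 0 if the object has no parent */
    const char *path;      /* illustrative device-model path */
};

static const struct tag_record *find_record(const struct tag_record *tbl,
                                            size_t n, uint64_t tag)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].tag == tag)
            return &tbl[i];
    return NULL;
}

/* True if `tag` is `watched` itself or one of its descendants. */
static bool tag_in_subtree(const struct tag_record *tbl, size_t n,
                           uint64_t tag, uint64_t watched)
{
    assert(tbl != NULL);
    while (tag != 0) {
        if (tag == watched)
            return true;
        const struct tag_record *r = find_record(tbl, n, tag);
        tag = r ? r->parent_tag : 0;   /* unknown tag: stop walking */
    }
    return false;
}
```

A monitor watching a host adapter's tag would then accept events from every
disk underneath it, while untagged (tag 0) messages never match.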

Since it's tagged by device you can just install a modified "klogd" that
cooperates with udev to log events with information about exactly
which device-model node it applies to. You can even have that
program generate dbus messages, so your desktop environment
can complain that the kernel has reported filesystem errors on that
thumbdrive you just plugged in, but that the media itself seems to
be fine (no I/O errors).

A future extension might be to allow trace-events to have a "fallback"
handler of sorts, analogous to the way that audit messages are
currently handled. If a process is monitoring events and has a filter
which matches the event then it will be handled by that process;
otherwise it will call the "fallback" handler and resort to a printk().
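
A sketch of that dispatch rule, under the assumption that a filter is just
"match on tag" (real filters would be richer) and with fprintf(stderr, ...)
standing in for printk(). All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct log_event {
    uint64_t tag;
    int level;
    const char *msg;
};

/* Hypothetical listener: a filter (here just a tag) plus a handler. */
struct listener {
    uint64_t want_tag;
    void (*handle)(const struct log_event *ev);
};

/* Example handler that only counts what it was given. */
static int seen_count;
static void count_event(const struct log_event *ev)
{
    (void)ev;
    seen_count++;
}

/* Returns true if some listener took the event, false if it fell back. */
static bool deliver_event(const struct log_event *ev,
                          const struct listener *ls, size_t n)
{
    bool handled = false;

    assert(ev != NULL);
    for (size_t i = 0; i < n; i++) {
        if (ls[i].want_tag == ev->tag) {
            ls[i].handle(ev);
            handled = true;
        }
    }
    if (!handled)   /* nobody is listening: fall back to the plain log */
        fprintf(stderr, "<%d>[tag %llu] %s\n", ev->level,
                (unsigned long long)ev->tag, ev->msg);
    return handled;
}
```

If the monitoring process goes away, its listener disappears and the same
event simply ends up in the ordinary log again.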

That would allow a more advanced driver to generate specific
status and error messages for consumption by monitoring software,
but still fall back to dmesg when the system is in single-user mode,
the monitoring software dies, etc.

Thoughts?

Cheers,
Kyle Moffett