Re: [PATCH v3 21/22] netoops: Add user-programmable boot_id

From: Matt Mackall
Date: Tue Dec 14 2010 - 17:48:00 EST


On Tue, 2010-12-14 at 14:33 -0800, Mike Waychison wrote:
> On Tue, Dec 14, 2010 at 2:06 PM, Matt Mackall <mpm@xxxxxxxxxxx> wrote:
> > On Tue, 2010-12-14 at 13:59 -0800, Mike Waychison wrote:
> >> On Tue, Dec 14, 2010 at 1:42 PM, Matt Mackall <mpm@xxxxxxxxxxx> wrote:
> >> > On Tue, 2010-12-14 at 13:30 -0800, Mike Waychison wrote:
> >> >> Add support for letting userland define a 32-bit boot id. This is useful
> >> >> for users to be able to correlate netoops reports to specific boot
> >> >> instances offline.
> >> >
> >> > This sounds a lot like the pre-existing /proc/sys/kernel/random/boot_id
> >> > that's used by kerneloops.org.
> >>
> >> Could be. I'm looking at it now... There is no documentation for this
> >> boot_id field?
> >
> > Probably not. It's just a random number generated at boot.
> >
> >> Reusing this guy would work, except that it doesn't appear to allow
> >> arbitrary values to be set. We need to inject our boot sequence
> >> number (which is figured out in userland) in the packet somehow as we
> >> need to correlate it to our other monitoring systems.
> >
> > What happens if you oops before userspace is available?
> >
>
> One of two general cases applies:
> - The crash is a one-off and the machine comes back. The boot
> number sequence will see a hole in it, which is a clue that something
> bad happened.
> - The machine is in a crash loop. This has the same failure mode
> for us as if the machine never made it onto the network due to
> whatever reason: bad cables, bad firmware, bad RAM, ...
>
> In both cases, we can detect that something is wrong and handle it.
> Note that our firmware is responsible for incrementing the boot
> sequence at bootup, which is why the above works. In general though,
> our machines do make it up to userland -- staying alive once booted is
> the hard part ;)

Interesting. Is this Google-specific firmware magic? I'd probably accept
a hook in random.c to fold a number into the UUID, which would unify
things.
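
For illustration only, here is a minimal user-space sketch of one way a
32-bit boot sequence number could be folded into a 16-byte UUID. It is not
the random.c hook itself, and nothing in this thread specifies how the fold
should work. The helper names (fold_boot_seq, extract_boot_seq, format_uuid)
are hypothetical, and the choice to overwrite the last four bytes, so the
number stays recoverable for offline correlation, is an assumption.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, not an existing kernel function.
 * Assumption: the sequence number replaces the last 4 bytes of the UUID,
 * stored big-endian, so it can be read back out of the printed form.
 * Bytes 6 and 8 (UUID version and variant bits) are left untouched.
 */
static void fold_boot_seq(uint8_t uuid[16], uint32_t seq)
{
	uuid[12] = (uint8_t)(seq >> 24);
	uuid[13] = (uint8_t)(seq >> 16);
	uuid[14] = (uint8_t)(seq >> 8);
	uuid[15] = (uint8_t)seq;
}

/* Recover the sequence number written by fold_boot_seq(). */
static uint32_t extract_boot_seq(const uint8_t uuid[16])
{
	return ((uint32_t)uuid[12] << 24) |
	       ((uint32_t)uuid[13] << 16) |
	       ((uint32_t)uuid[14] << 8) |
	       (uint32_t)uuid[15];
}

/* Format as 8-4-4-4-12 hex digits; buf must hold 37 bytes. */
static void format_uuid(const uint8_t uuid[16], char buf[37])
{
	snprintf(buf, 37,
		 "%02x%02x%02x%02x-%02x%02x-%02x%02x-"
		 "%02x%02x-%02x%02x%02x%02x%02x%02x",
		 uuid[0], uuid[1], uuid[2], uuid[3],
		 uuid[4], uuid[5], uuid[6], uuid[7],
		 uuid[8], uuid[9], uuid[10], uuid[11],
		 uuid[12], uuid[13], uuid[14], uuid[15]);
}
```

Overwriting costs 32 bits of randomness in the ID. XOR-ing the number in
instead would keep the randomness, but the number could not then be read
back without knowing the original UUID.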

--
Mathematics is the supreme nostalgia of our time.
