Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching

From: Masami Hiramatsu
Date: Fri May 09 2014 - 00:07:55 EST


(2014/05/09 10:46), David Lang wrote:
> On Wed, 7 May 2014, Ingo Molnar wrote:
>
>> * Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
>>
>>> On Tue, May 06, 2014 at 09:32:28AM +0200, Ingo Molnar wrote:
>>>>
>>>> * Jiri Kosina <jkosina@xxxxxxx> wrote:
>>>>
>>>>> On Mon, 5 May 2014, David Lang wrote:
>>>>>
>>>>>> how would you know that all instances of the datastructure in memory
>>>>>> have been touched? just because all tasks have run and are outside the
>>>>>> function in question doesn't tell you data structures have been
>>>>>> converted. You have no way of knowing when (or if) the next call to
>>>>>> the modified function will take place on any potential in-memory
>>>>>> structure.
>>>>>
>>>>> The problem you are trying to avoid here is functions expecting to read
>>>>> "v2" format of the data from memory, while there are still tasks that are
>>>>> unpredictably writing "v1" format of the data to the memory.
>>>>>
>>>>> There are several ways to attack this problem:
>>>>>
>>>>> - stop the whole system, convert all the existing data structures to new
>>>>> format (which might potentially be non-trivial, mostly because you
>>>>> have to *know* where all the data structures have been allocated), apply
>>>>> patch, resume operation [ksplice, probably kpatch in future]
>>>>> - restrict the data format to be backwards compatible [to be done
>>>>> manually during patch creation, currently what kGraft needs to do in
>>>>> such case]
>>>>> - have a proxy code which can read both "v1" and "v2" formats, and writes
>>>>> back in the same format it has seen the data structure on input
>>>>> - once all the *code* has been converted, it still has to understand "v1"
>>>>> and "v2", but it can now start writing out "v2" format only [possible
>>>>> with kGraft, not implemented in automated fashion]
>>>>>
>>>>> Ideas are of course more than welcome.
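
To make the proxy idea in the list above concrete, here is a minimal
userspace sketch in Python, not kernel code. The one-byte version tag, the
field layouts, and the names read_record/write_record/update_length are all
assumptions made up for illustration; neither kGraft nor kpatch is known to
provide anything like them.

```python
import struct

# Hypothetical layouts: a 1-byte version tag followed by a length field.
V1 = struct.Struct("<BH")  # version 1: 16-bit length
V2 = struct.Struct("<BI")  # version 2: 32-bit length


def read_record(buf):
    """Return (version, length) for a record in either format."""
    version = buf[0]
    if version == 1:
        _, length = V1.unpack_from(buf)
    elif version == 2:
        _, length = V2.unpack_from(buf)
    else:
        raise ValueError("unknown record version %d" % version)
    return version, length


def write_record(version, length, all_code_converted=False):
    """Write back in the format that was read, or always as v2 once
    every reader is known to understand v2."""
    if all_code_converted or version == 2:
        return V2.pack(2, length)
    return V1.pack(1, length)


def update_length(buf, delta, all_code_converted=False):
    """Proxy-style update: read either format, modify, write back."""
    version, length = read_record(buf)
    return write_record(version, length + delta, all_code_converted)
```

While old code may still run, update_length() hands a v1 record back as v1.
Once all code is converted, passing all_code_converted=True upgrades the
record to v2 on the next write.
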
>>>>
>>>> So what I'm curious about, what is the actual 'in the field' distro
>>>> experience, about the type of live-patches that get pushed with
>>>> urgency?
>>>>
>>>> My guess would be that the overwhelming majority of live-patches don't
>>>> change data structures - and hence the right initial model would be to
>>>> ensure (via tooling, and via review) that 'v1' and 'v2' data is
>>>> exactly the same.
>>>
>>> Yes, in general we want to avoid data changes. In practice, we expect
>>> most patches to be small, localized security fixes, so it shouldn't be
>>> an issue in most cases.
>>>
>>> Currently the kpatch tooling detects any compile-time changes to
>>> static data and refuses to build the patch module in that case.
>>>
>>> But there's no way to programmatically detect changes to dynamic
>>> data. Which is why the user always has to be very careful when
>>> selecting a patch.
>>
>> And since this is about the system kernel it's dead easy to mess up a
>> new kernel function and make the system unbootable - so it's not like
>> 'be careful' isn't something implied already.
>
> It's possible to have two versions of code that each work independently, but
> that you can't switch between easily on the fly.
>
> If the new code assumes a lock is held that the old code didn't take, then when
> you switch, you are eventually going to hit a case where the new code tries to
> release a lock it doesn't hold.
>
> detecting all possible cases programmatically seems close to impossible.

Agreed. Spinlocks and other locks with small critical sections can usually
be made safe, because the function that takes the lock also releases it.
But mutexes etc. are often locked and unlocked in different functions.
In that case, we'll need to check whether any task is running in any of the
functions inside the critical region.
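
As a rough illustration of that failure mode, here is a userspace Python
model, not kernel code. The names begin_v1/end_v1/begin_v2/end_v2 and
balanced() are hypothetical, and threading.Lock only stands in for a
kernel mutex.

```python
import threading


def begin_v1(lock):
    pass  # old code: enters the region without taking the lock


def end_v1(lock):
    pass  # old code: leaves the region without releasing anything


def begin_v2(lock):
    lock.acquire()  # new code: lock is taken in one function...


def end_v2(lock):
    lock.release()  # ...and released in a different one


def balanced(begin, end):
    """Run one pass through the region and report whether locking
    stayed balanced for this begin/end combination."""
    lock = threading.Lock()
    try:
        begin(lock)
        end(lock)
    except RuntimeError:
        return False  # released a lock that was never taken
    return not lock.locked()  # False if the lock was left held
```

balanced(begin_v1, end_v2) models a task that entered the region before the
patch and leaves it afterwards: the release fails because the lock was
never taken. A function that both takes and drops the lock itself has no
such mixed combination when switching happens per function.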

> but this means that there are two categories of patches
>
> 1. patches that are safe to put in a kernel that you are going to boot from
>
> 2. patches that are able to be applied on the fly
>
> and the tool isn't going to be able to tell you which category the patch is in.
> It can identify some of the items that make it unlikely or impossible for the
> patch to belong to #2, but don't rely on the tool catching all of them

Yeah, I think we'd better start with a heuristic decision. In most
cases, the patch could still be applied on the fly.

Thank you,

--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@xxxxxxxxxxx

