Re: [PATCH 1/1] kasan: fix livelock in qlist_move_cache

From: Dmitry Vyukov
Date: Tue Nov 28 2017 - 12:58:07 EST


On Tue, Nov 28, 2017 at 6:56 PM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> On Tue, Nov 28, 2017 at 12:30 PM, Zhouyi Zhou <zhouzhouyi@xxxxxxxxx> wrote:
>> Hi,
>> qlist_move_cache occupying 100% CPU in perf top really did happen in
>> my environment yesterday; otherwise I would not have noticed the
>> kasan code.
>> Currently I have difficulty reproducing it because the frontend
>> developer modified some user-mode code.
>> What I can reproduce again and again now is:
>> kgdb_breakpoint () at kernel/debug/debug_core.c:1073
>> 1073 wmb(); /* Sync point after breakpoint */
>> (gdb) p quarantine_batch_size
>> $1 = 3601946
>> And by instrumenting the code, the maximum
>> global_quarantine[quarantine_tail].bytes reached is 6618208.
>
> On second thought, size does not matter too much because there can be
> large objects. Quarantine always quantizes by objects: we can't put part
> of an object into one batch and another part of the object into another
> batch. But it's not a problem, because overhead per object is O(1).
> We can push a single 4MB object and overflow the target size by 4MB and
> that will be fine.
> Either way, 6MB is not terribly much either. It should take milliseconds
> to process.
>
>
>
>
>> I do think quarantine_put is a better place to drain the quarantine,
>> because cache_free is fine in that context. I am willing to do it if
>> you think it is convenient :-)


Andrey, do you know of any problems with draining quarantine in push?
Do you have any objections?

But it's still not completely clear to me what problem we are solving.
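
For reference, the batching argument quoted above (whole objects only, so
a batch can overshoot its target by at most one object) can be sketched as
a small user-space C model. This is only an illustration under stated
assumptions: struct batch and fill_batch() are hypothetical names made up
for this sketch, not the kernel's quarantine code.

```c
#include <stddef.h>

/* Hypothetical stand-in for one quarantine batch; not the kernel's struct. */
struct batch {
	size_t bytes;   /* total size of the objects placed in this batch */
	size_t objects; /* number of objects placed in this batch */
};

/*
 * Put whole objects from sizes[0..n) into *b until the batch reaches
 * `target` bytes. An object is never split across batches, so the batch
 * may exceed `target` by up to the size of the last object added.
 * Returns the number of objects consumed from `sizes`.
 */
static size_t fill_batch(struct batch *b, const size_t *sizes, size_t n,
			 size_t target)
{
	size_t i = 0;

	b->bytes = 0;
	b->objects = 0;
	while (i < n && b->bytes < target) {
		b->bytes += sizes[i]; /* constant work per object */
		b->objects++;
		i++;
	}
	return i;
}
```

With object sizes {64, 128, 4MB, 32} and a 1MB target, this model puts the
first three objects into the batch, which ends up at 4194496 bytes: the
single 4MB object pushes it past the target, and the cost is still one
step per object.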