Re: [PATCH 2/2] memcg fix stale swap cache account leak v6

From: KAMEZAWA Hiroyuki
Date: Fri May 08 2009 - 12:26:42 EST


Thank you for review.

Ingo Molnar wrote:
> x
> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
>
>> +struct swapio_check {
>> + spinlock_t lock;
>> + void *swap_bio_list;
>> + struct delayed_work work;
>> +} stale_swap_check;
>
> Small nit. It's nice that you lined up the first two fields, but it
> would be nice to line up the third one too:
>
> struct swapio_check {
> spinlock_t lock;
> void *swap_bio_list;
> struct delayed_work work;
> } stale_swap_check;
>
ok.

>> + while (nr--) {
>> + cond_resched();
>> + spin_lock_irq(&sc->lock);
>> + bio = sc->swap_bio_list;
>
>> @@ -66,6 +190,7 @@ static void end_swap_bio_write(struct bi
>> (unsigned long long)bio->bi_sector);
>> ClearPageReclaim(page);
>> }
>> + mem_cgroup_swapio_check_again(bio, page);
>
> Hm, this patch adds quite a bit of scanning overhead to
> end_swap_bio_write(), to work around artifacts of a global LRU not
> working well with a partitioned system's per-partition LRU needs.
>
I'm not sure what is "scanning" overhead. But ok, this is not very light.

> Isnt the right solution to have a better LRU that is aware of this,
> instead of polling around in the hope of cleaning up stale entries?
>
I tried to modify LRU in the last month but I found it's difficult.

Hmm, maybe this patch's method is overkill. I have another option
(used in v1-v2) for fixing writeback. I'll try following again.

== add following codes to vmscan.c ==

shrink_list()
add_to_swap().
memcg_confirm_swapcache_valid()
-> We have race with zap_pte() here.
After add_to_swap(), check account information of memcg.
If memcg doesn't have account on this page, this page may
be unused and not worth to do I/O. check usage again and
try to free it.
==

The difficult part is how to fix race in swapin-readahead and we have
several option to fix writeback, I think.

I'll retry.

Thanks,
-Kame


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/