Re: [RFC][PATCH] prevent incorrect oom under split_lru

From: MinChan Kim
Date: Wed Jun 25 2008 - 09:05:51 EST


On Wed, Jun 25, 2008 at 9:11 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Wed, 2008-06-25 at 15:56 +0900, MinChan Kim wrote:
>> On Wed, Jun 25, 2008 at 3:08 PM, KOSAKI Motohiro
>> <kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
>> > Hi Kim-san,
>> >
>> >> >> So, if priority==0, We should try to reclaim all page for prevent OOM.
>> >> >
>> >> > You are absolutely right. Good catch.
>> >>
>> >> I have a concern about application latency.
>> >> If lru list have many pages, it take a very long time to scan pages.
>> >> More system have many ram, More many time to scan pages.
>> >
>> > No problem.
>> >
>> > priority==0 indicate emergency.
>> > it doesn't happen on typical workload.
>> >
>>
>> I see :)
>>
>> But if such emergency happen in embedded system, application can't be
>> executed for some time.
>> I am not sure how long time it take.
>> But In some application, schedule period is very important than memory
>> reclaim latency.
>>
>> Now, In your patch, when such emergency happen, it continue to reclaim
>> page until it will scan entire page of lru list.
>> It
>
> IMHO embedded real-time apps should mlockall() and not do anything that
> can result in memory allocations in their fast (deterministic) paths.
Hi Peter,

I agree with you. But if the application's virtual address space is big,
mlockall() becomes a hard problem, since locking all of it can itself
create heavy memory pressure.
Of course, that is an RT application design problem.
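
For reference, the mlockall() approach Peter describes could look roughly
like the sketch below. The helper name lock_all_memory() is hypothetical and
not from any patch in this thread; only mlockall() and its MCL_* flags are
the real interface. It assumes the process may lock that much memory
(RLIMIT_MEMLOCK or CAP_IPC_LOCK), which is exactly the constraint that hurts
when the address space is big.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not from this thread): pin all current and future
 * mappings of the calling process into RAM, so that its time-critical path
 * does not take page faults on memory that was reclaimed or swapped out.
 * Returns 0 on success, -1 on failure with errno set by mlockall().
 */
static int lock_all_memory(void)
{
	if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
		/* Commonly ENOMEM or EPERM when the memlock limit is too low. */
		fprintf(stderr, "mlockall failed: %s\n", strerror(errno));
		return -1;
	}
	return 0;
}
```

An RT application would call this once at startup, before entering its
deterministic loop, and fall back or exit if it returns -1.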

> The much more important case is desktop usage - that is where we run non
> real-time code, but do expect 'low' latency due to user-interaction.
>
> From hitting swap on my 512M laptop (rather frequent occurrence) I know
> we can do better here,..
>

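Absolutely. That is another example. So I suggest the following patch.
It is based on the idea from Takenori Nagano's work on making memory
reclaim more efficient.

At priority 0 it scans the whole LRU list, but a direct reclaimer (not
kswapd) stops as soon as it has reclaimed swap_cluster_max pages.

I expect it to reduce application latency without causing a regression.
What do you think?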

Signed-off-by: MinChan Kim <minchan.kim@xxxxxxxxx>
---
mm/vmscan.c | 10 ++++++++--
1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 9a5e423..07477cc 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1460,9 +1460,12 @@ static unsigned long shrink_zone(int priority, struct zone *zone,
* kernel will slowly sift through each list.
*/
scan = zone_page_state(zone, NR_LRU_BASE + l);
- scan >>= priority;
- scan = (scan * percent[file]) / 100;
+ if (priority) {
+ scan >>= priority;
+ scan = (scan * percent[file]) / 100;
+ }
zone->lru[l].nr_scan += scan + 1;
+
nr[l] = zone->lru[l].nr_scan;
if (nr[l] >= sc->swap_cluster_max)
zone->lru[l].nr_scan = 0;
@@ -1489,6 +1492,9 @@ static unsigned long shrink_zone(int priority, struct zone *zone,

nr_reclaimed += shrink_list(l, nr_to_scan,
zone, sc, priority);
+ if (priority == 0 && !current_is_kswapd() &&
+ nr_reclaimed >= sc->swap_cluster_max)
+ break;
}
}
}
--
1.5.4.3
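
To make the effect of the first hunk concrete, here is a small userspace
model of the scan count. It is only a sketch, not kernel code: scan_target()
and its parameters are made-up names, it ignores the accumulation in
zone->lru[l].nr_scan across calls, and it assumes the percentage is applied
as scan * percent / 100, as in the line the patch removes.

```c
#include <assert.h>

/*
 * Illustrative model only: how many pages of one LRU list a single pass
 * would ask to scan under the patched logic.
 *   lru_pages - pages currently on the list (stands in for zone_page_state())
 *   priority  - reclaim priority; 0 is the emergency level
 *   percent   - share of scanning given to this list type, 0..100
 */
static unsigned long scan_target(unsigned long lru_pages, int priority,
				 unsigned long percent)
{
	unsigned long scan = lru_pages;

	/* Keep the shift well defined and the percentage sane. */
	assert(priority >= 0 && priority < 32 && percent <= 100);

	if (priority) {
		scan >>= priority;		/* smaller slice at higher priority values */
		scan = (scan * percent) / 100;	/* split between anon and file lists */
	}
	/* priority == 0: no scaling at all, the whole list is a candidate. */

	return scan + 1;	/* mirrors "nr_scan += scan + 1" in the patch */
}
```

With a list of 1048576 pages (4 GiB of 4 KiB pages) and percent = 50,
scan_target(1048576, 12, 50) gives 129, while scan_target(1048576, 0, 50)
gives 1048577. That gap is why the second hunk lets a direct reclaimer break
out early once it has reclaimed swap_cluster_max pages.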




--
Kind regards,
MinChan Kim