Re: [PATCH 1/2] mm, vmstat: Release zone lock more frequently when reading /proc/pagetypeinfo

From: Waiman Long
Date: Wed Oct 23 2019 - 14:14:35 EST


On 10/23/19 2:01 PM, Michal Hocko wrote:
> On Wed 23-10-19 13:34:22, Waiman Long wrote:
>> With a threshold of 100000, it is still possible that the zone lock
>> will be held for a very long time in the worst case scenario where all
>> the counts are just below the threshold. With up to 6 migration types
>> and 11 orders, that means up to 6.6 million list iterations.
>>
>> Track the total number of list iterations done since the acquisition
>> of the zone lock and release it whenever 100000 iterations or more have
>> been completed. This will cap the lock hold time to no more than 200,000
>> list iterations.
>>
>> Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
>> ---
>> mm/vmstat.c | 18 ++++++++++++++----
>> 1 file changed, 14 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/vmstat.c b/mm/vmstat.c
>> index 57ba091e5460..c5b82fdf54af 100644
>> --- a/mm/vmstat.c
>> +++ b/mm/vmstat.c
>> @@ -1373,6 +1373,7 @@ static void pagetypeinfo_showfree_print(struct seq_file *m,
>> pg_data_t *pgdat, struct zone *zone)
>> {
>> int order, mtype;
>> + unsigned long iteration_count = 0;
>>
>> for (mtype = 0; mtype < MIGRATE_TYPES; mtype++) {
>> seq_printf(m, "Node %4d, zone %8s, type %12s ",
>> @@ -1397,15 +1398,24 @@ static void pagetypeinfo_showfree_print(struct seq_file *m,
>> * of pages in this order should be more than
>> * sufficient
>> */
>> - if (++freecount >= 100000) {
>> + if (++freecount > 100000) {
>> overflow = true;
>> - spin_unlock_irq(&zone->lock);
>> - cond_resched();
>> - spin_lock_irq(&zone->lock);
>> + freecount--;
>> break;
>> }
>> }
>> seq_printf(m, "%s%6lu ", overflow ? ">" : "", freecount);
>> + /*
>> + * Take a break and release the zone lock when
>> + * 100000 or more entries have been iterated.
>> + */
>> + iteration_count += freecount;
>> + if (iteration_count >= 100000) {
>> + iteration_count = 0;
>> + spin_unlock_irq(&zone->lock);
>> + cond_resched();
>> + spin_lock_irq(&zone->lock);
>> + }
> Aren't you overengineering this a bit? If you are still worried, we
> can simply cond_resched for each order:
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index c156ce24a322..ddb89f4e0486 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1399,13 +1399,13 @@ static void pagetypeinfo_showfree_print(struct seq_file *m,
> */
> if (++freecount >= 100000) {
> overflow = true;
> - spin_unlock_irq(&zone->lock);
> - cond_resched();
> - spin_lock_irq(&zone->lock);
> break;
> }
> }
> seq_printf(m, "%s%6lu ", overflow ? ">" : "", freecount);
> + spin_unlock_irq(&zone->lock);
> + cond_resched();
> + spin_lock_irq(&zone->lock);
> }
> seq_putc(m, '\n');
> }
>
> I do not have a strong opinion here but I can fold this into my patch 2.

If the free list is empty or is very short, there is probably no need to
release and reacquire the lock. How about adding a check for a lower
bound like:

if (freecount > 1000) {
	spin_unlock_irq(&zone->lock);
	cond_resched();
	spin_lock_irq(&zone->lock);
}

Cheers,
Longman