Re: [PATCH] xfs: idle aild if the AIL is pushed up to the target LSN

From: Brian Foster
Date: Mon Apr 25 2016 - 10:24:52 EST


On Mon, Apr 25, 2016 at 09:42:43AM +0200, Lucas Stach wrote:
> The current logic only idles aild if the AIL is empty, rescheduling
> the worker with a timeout of 50ms otherwise. If the target LSN isn't
> moved forward, the worker will not make any progress as it only
> pushes the AIL up to the target LSN, leading to the empty AIL
> condition to only be met after the log worker pushed the complete
> AIL. As the log worker runs on the scale of minutes, this leaves
> aild in the "no progress, 50ms timeout" state for extended periods of
> time, causing many unecessary wakeups.
>

Where do you get the "scale of minutes" from? I believe the default log
worker interval is 30s, though it can be increased by the admin.

The other thing to note is that the AIL target is also driven by demand
on log space. E.g., see xlog_grant_push_ail(), which executes when a
transaction attempts to reserve log space that isn't currently available
in the log.

That said, I'm not sure whether there's a notable benefit of idling for
50ms over just scheduling out when we've hit the target lsn. It seems
like that anybody who pushes the target forward again is going to wake
up the thread anyways. On the other hand, if the fs is idle the thread
will eventually schedule out indefinitely. Have you observed problematic
behavior due to the current heuristic?

> Fix this by idling aild as soon as the AIL is pushed up to the target
> LSN. All code paths that move the target LSN forward already ensure
> that aild is woken up again when needed.
>
> Signed-off-by: Lucas Stach <dev@xxxxxxxxxx>
> ---
> I'm pretty sure that the above deduction is correct, but as this my
> first real encounter with the XFS code base, someone with a bit more
> knowledge than me should give this some thought.
> ---
> fs/xfs/xfs_trans_ail.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
> index d6c9c3e..7eca852 100644
> --- a/fs/xfs/xfs_trans_ail.c
> +++ b/fs/xfs/xfs_trans_ail.c
> @@ -495,6 +495,7 @@ xfsaild(
> {
> struct xfs_ail *ailp = data;
> long tout = 0; /* milliseconds */
> + struct xfs_log_item *lip;
>
> current->flags |= PF_MEMALLOC;
> set_freezable();
> @@ -508,7 +509,8 @@ xfsaild(
> spin_lock(&ailp->xa_lock);
>
> /*
> - * Idle if the AIL is empty and we are not racing with a target
> + * Idle if the AIL is empty or pushed up to the requested
> + * target LSN and we are not racing with a target
> * update. We check the AIL after we set the task to a sleep
> * state to guarantee that we either catch an xa_target update
> * or that a wake_up resets the state to TASK_RUNNING.
> @@ -517,7 +519,8 @@ xfsaild(
> * The barrier matches the xa_target update in xfs_ail_push().
> */
> smp_rmb();
> - if (!xfs_ail_min(ailp) &&
> + lip = xfs_ail_min(ailp);
> + if ((!lip || XFS_LSN_CMP(lip->li_lsn, ailp->xa_target) >= 0) &&

I think you'd want to use > 0, as == 0 suggests you have an item that
matches the target.

FWIW, another option might be to increase the timeout to something much
larger than the current 50ms when we've hit the target. See towards the
end of xfsaild_push() where tout is determined based on the state of the
push (and we already have logic to check against xa_target).

Brian

> ailp->xa_target == ailp->xa_target_prev) {
> spin_unlock(&ailp->xa_lock);
> freezable_schedule();
> --
> 2.5.5
>
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs