Re: [PATCH UPDATED] workqueue: add documentation

From: Tejun Heo
Date: Mon Sep 13 2010 - 04:08:43 EST


Hello,

On 09/13/2010 02:51 AM, Dave Chinner wrote:
> We talked about this for XFS w.r.t. the xfslogd IO completion
> work items to be promoted ahead of data IO completion items and
> that has worked fine. This appears to give us only two
> levels of priority, or from a user point of view, two levels of
> dependency between workqueue item execution.

It's not priority per se. It's basically a bypass switch for the
workqueue's work-deferring mechanism.

> Thinking about the XFS situation more, we actually have three levels
> of dependency: xfslogd -> xfsdatad -> xfsconvertd. That is, we defer
> long running, blocking items from xfsdatad to xfsconvertd so we
> don't block the xfsdatad from continuing to process data IO
> completion items. How do we guarantee that the xfsconvertd work
> items won't prevent/excessively delay processing of xfsdatad items?

What do you mean by "long running"? Do you mean it would consume a
lot of CPU cycles, or that it would block on locks and IOs a lot? It's
the latter, right? If so, there isn't much to worry about.

>> +@max_active determines the maximum number of execution contexts per
>> +CPU which can be assigned to the work items of a wq. For example,
>> +with @max_active of 16, at most 16 work items of the wq can be
>> +executing at the same time per CPU.
>
> I think the reason you were seeing XFS blow this out of the water is
> that every IO completion for a write beyond EOF (i.e. every single
> one for an extending streaming write) will require inode locking to
> update file size. If the inode is locked, then the item will
> delay(1), and the cmwq controller will run the next item in a new
> worker. That will then block in delay(1) 'cause it can't get the
> inode lock, and so on....
>
> As such, I can't see that increasing the max_active count for XFS is
> a good thing - all it will do is cause larger blockages to occur....

From the description above, it looks like XFS developed its own way of
regulating work processing involving multiple workqueues and yielding
queue positions with delay. For now, it would probably be best to
just keep things running as they are, but in the long run it might be
beneficial to replace those explicit mechanisms.
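
To make the cascade in the quoted description concrete, here is a rough
user-space toy model (plain C, not kernel code). It assumes, as described
above, that every queued completion item needs the same inode lock, that
each item which fails to get it sits in delay(1), and that cmwq starts
another worker whenever the running ones are all asleep, up to
@max_active. The function name, its parameters, and the pacing of one new
worker per tick are all made up for illustration.

```c
/*
 * Toy model, NOT kernel code: count how many workers end up asleep in
 * delay(1) at the same time while one contended inode lock stays held.
 *
 * pending         - completion items queued, all needing the same lock
 * max_active      - per-CPU limit on in-flight work items for the wq
 * lock_held_ticks - how long the lock stays unavailable (hypothetical)
 */
int peak_sleeping_workers(int pending, int max_active, int lock_held_ticks)
{
	int sleeping = 0;	/* workers currently stuck in delay(1) */
	int peak = 0;
	int tick;

	for (tick = 0; tick < lock_held_ticks && pending > 0; tick++) {
		/*
		 * Every existing worker is asleep, so the modelled
		 * controller hands the next item to a new worker, unless
		 * the wq already has max_active items in flight.
		 */
		if (sleeping < max_active) {
			sleeping++;	/* trylock fails, this worker sleeps too */
			pending--;
		}
		if (sleeping > peak)
			peak = sleeping;
	}

	return peak;
}
```

With 100 pending items and the lock held for 50 ticks, this model peaks
at 16 sleeping workers for @max_active of 16 and at 50 for @max_active of
256, which is the concern raised above: in this pattern a larger limit
mostly buys a larger pile of blocked workers.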

>> +6. Guidelines
>> +
>> +* Do not forget to use WQ_RESCUER if a wq may process work items which
>> + are used during memory reclaim. Each wq with WQ_RESCUER set has one
>> + rescuer thread reserved for it. If there is dependency among
>> + multiple work items used during memory reclaim, they should be
>> + queued to separate wq each with WQ_RESCUER.
>> +
>> +* Unless strict ordering is required, there is no need to use ST wq.
>> +
>> +* Unless there is a specific need, using 0 for @nr_active is
> max_active?

Oops, thanks. Updated.

--
tejun