[PATCH 2/2 v4] softlockup: check all tasks in hung_task

From: Mandeep Singh Baines
Date: Wed Feb 04 2009 - 23:36:22 EST


Ingo Molnar (mingo@xxxxxxx) wrote:
>
> * Mandeep Singh Baines <msb@xxxxxxxxxx> wrote:
>
> > +static void check_hung_rcu_refresh(struct task_struct *g, struct task_struct *t)
>
> please rename this to rcu_lock_break().
>

Fixed.

> > do_each_thread(g, t) {
> > - if (!--max_count)
> > - goto unlock;
> > + if (!--max_count) {
> > + max_count = sysctl_hung_task_check_count;
> > + check_hung_rcu_refresh(g, t);
> > + /* Exit if t or g was unhashed during refresh. */
> > + if (t->state == TASK_DEAD || g->state == TASK_DEAD)
> > + goto unlock;
>
> Thinking about it some more, i think a slightly different approach (that has
> the same end effect):
>
> - Add a "static const int check_count_batching = 1024;" variable that adds
> some natural batching - and initialize max_count to that value. There's
> little point to make that batching configurable.
>

Fixed.

The batch_count controls the preemptibility of hung_task. While it might
not make sense to expose the value to user-space, we may want to use a
different value for the PREEMPT config (not sure what the specific values
should be):

#if defined(CONFIG_PREEMPT) && !defined(CONFIG_PREEMPT_RCU)
static const int check_count_batching = 256;
#else
static const int check_count_batching = 2048;
#endif

> - Leave sysctl_hung_task_check_count present but change its default to
> something really large like MAX_PID.

Fixed.

Alternatively, the user could renice khungtaskd in order to control the share
of CPU used.

---
Changed the default value of hung_task_check_count to PID_MAX_LIMIT.
hung_task_batching added to put an upper bound on the length of the RCU
critical section. Every hung_task_batching checks, the RCU lock is broken.
Keeping the critical section small minimizes the time preemption is disabled
and keeps RCU grace periods short.

To avoid following a stale pointer, get_task_struct() is called on g and t
before the lock is dropped. To verify that g and t have not been unhashed
while outside the critical section, their task states are checked once the
lock has been re-acquired.
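
For readers who want to see the counting behaviour in isolation, here is a
minimal user-space sketch of the batched scan. It is not kernel code:
struct fake_task, struct scan_stats, lock_break() and scan_tasks() are
hypothetical names invented for this illustration, the refcount field only
models get_task_struct()/put_task_struct(), and the place where the RCU lock
would be dropped is marked with a comment. Unlike the patch, the sketch
samples the "dead" flag while the reference is still held; that is an
assumption of the sketch, not a description of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; not the kernel's struct task_struct. */
struct fake_task {
	int refcount;	/* models get_task_struct()/put_task_struct() */
	int dead;	/* models t->state == TASK_DEAD */
};

struct scan_stats {
	size_t checked;		/* tasks that reached the hung-task check */
	size_t lock_breaks;	/* times the critical section was broken */
};

/*
 * Models rcu_lock_break(): pin the task, leave and re-enter the critical
 * section, then unpin. Returns nonzero if the scan may continue.
 */
static int lock_break(struct fake_task *t, struct scan_stats *st)
{
	int alive;

	t->refcount++;
	/* rcu_read_unlock(); cond_resched(); rcu_read_lock(); would go here */
	st->lock_breaks++;
	alive = !t->dead;	/* sampled while the reference is held */
	t->refcount--;
	return alive;
}

/* Walk the task array with the same counter logic as the patch's loop. */
static struct scan_stats scan_tasks(struct fake_task *tasks, size_t n,
				    size_t max_count, size_t batching)
{
	struct scan_stats st = { 0, 0 };
	size_t batch_count = batching;
	size_t i;

	assert(max_count > 0 && batching > 0);

	for (i = 0; i < n; i++) {
		if (!--max_count)
			break;		/* overall limit reached */
		if (!--batch_count) {
			batch_count = batching;
			if (!lock_break(&tasks[i], &st))
				break;	/* task went away during the break */
		}
		st.checked++;
	}
	return st;
}
```

With 10 tasks, a limit of 100 and a batch size of 4, the sketch breaks the
lock at the 4th and 8th task and still checks all 10. If the task at which a
break happens is marked dead, the scan stops there, as in the patch.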

The design was proposed by Frédéric Weisbecker.

Frédéric Weisbecker (fweisbec@xxxxxxxxx) wrote:
>
> Instead of having this arbitrary limit of tasks, why not just
> lurk the need_resched() and then schedule if it needs too.
>
> I know that sounds a bit racy, because you will have to release the
> tasklist_lock and
> a lot of things can happen in the task list until you become resched.
> But you can do a get_task_struct() on g and t before your thread is
> going to sleep and then put them
> when it is awaken.
> Perhaps some tasks will disappear or be appended in the list before g
> and t, but that doesn't really matter:
> if they disappear, they didn't lockup, and if they were appended, they
> are not enough cold to be analyzed :-)
>
> This way you can drop the arbitrary limit of task number given by the user....
>
> Frederic.
>

Signed-off-by: Mandeep Singh Baines <msb@xxxxxxxxxx>
---
kernel/hung_task.c | 39 +++++++++++++++++++++++++++++++++++++--
1 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index a841db3..34b678c 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -17,9 +17,18 @@
#include <linux/sysctl.h>

/*
- * Have a reasonable limit on the number of tasks checked:
+ * The number of tasks checked:
*/
-unsigned long __read_mostly sysctl_hung_task_check_count = 1024;
+unsigned long __read_mostly sysctl_hung_task_check_count = PID_MAX_LIMIT;
+
+/*
+ * Limit number of tasks checked in a batch.
+ *
+ * This value controls the preemptibility of khungtaskd since preemption
+ * is disabled during the critical section. It also controls the size of
+ * the RCU grace period. So it needs to be upper-bounded.
+ */
+static const int hung_task_batching = 1024;

/*
* Zero means infinite timeout - no checking done:
@@ -109,6 +118,24 @@ static void check_hung_task(struct task_struct *t, unsigned long now,
panic("hung_task: blocked tasks");
}

+ /*
+ * To avoid extending the RCU grace period for an unbounded amount of time,
+ * periodically exit the critical section and enter a new one.
+ *
+ * For preemptible RCU it is sufficient to call rcu_read_unlock in order
+ * to exit the grace period. For classic RCU, a reschedule is required.
+ */
+static void rcu_lock_break(struct task_struct *g, struct task_struct *t)
+{
+ get_task_struct(g);
+ get_task_struct(t);
+ rcu_read_unlock();
+ cond_resched();
+ rcu_read_lock();
+ put_task_struct(t);
+ put_task_struct(g);
+}
+
/*
* Check whether a TASK_UNINTERRUPTIBLE does not get woken up for
* a really long time (120 seconds). If that happens, print out
@@ -116,6 +143,7 @@ static void check_hung_task(struct task_struct *t, unsigned long now,
*/
static void check_hung_uninterruptible_tasks(unsigned long timeout)
{
+ int batch_count = hung_task_batching;
int max_count = sysctl_hung_task_check_count;
unsigned long now = get_timestamp();
struct task_struct *g, *t;
@@ -131,6 +159,13 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
do_each_thread(g, t) {
if (!--max_count)
goto unlock;
+ if (!--batch_count) {
+ batch_count = hung_task_batching;
+ rcu_lock_break(g, t);
+ /* Exit if t or g was unhashed during refresh. */
+ if (t->state == TASK_DEAD || g->state == TASK_DEAD)
+ goto unlock;
+ }
/* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */
if (t->state == TASK_UNINTERRUPTIBLE)
check_hung_task(t, now, timeout);
--
1.5.4.5
