Re: Fw: 2.5.61 oops running SDET

From: Martin J. Bligh (mbligh@aracnet.com)
Date: Sun Feb 16 2003 - 11:39:40 EST


>> OK, I did the following, which is what I think you wanted, plus Zwane's
>> observation that task_state acquires the task_struct lock (we're the only
>> caller, so I just removed it), but I still get the same panic and this time
>> the box hung. No doubt I've just done something stupid in the patch ...
>> (task_state takes the tasklist_lock ... is that safe to do inside task_lock?)
>
> It looks like you now try and take the tasklist_lock inside a task_lock()
> region, but task_lock says no-can-do:
>
> /* Protects ->fs, ->files, ->mm, and synchronises with wait4(). Nests
> * inside tasklist_lock */
> static inline void task_lock(struct task_struct *p)
> {
> ...

Right, that's what I expected, and that may well explain the hang, but I
don't see how that explains the oops still happening. What exactly is
tasklist_lock protecting here anyway?

        read_lock(&tasklist_lock);
        buffer += sprintf(buffer,
                "State:\t%s\n"
                "Tgid:\t%d\n"
                "Pid:\t%d\n"
                "PPid:\t%d\n"
                "TracerPid:\t%d\n"
                "Uid:\t%d\t%d\t%d\t%d\n"
                "Gid:\t%d\t%d\t%d\t%d\n",
                get_task_state(p), p->tgid,
                p->pid, p->pid ? p->real_parent->pid : 0,
                p->pid && p->ptrace ? p->parent->pid : 0,
                p->uid, p->euid, p->suid, p->fsuid,
                p->gid, p->egid, p->sgid, p->fsgid);
        read_unlock(&tasklist_lock);

Is it these two accesses:

p->real_parent->pid ?
p->parent->pid ?

Don't see what I can do for this apart from to invert the ordering and take
tasklist_lock around the whole function, and nest task_lock inside that, or
I suppose I could take the task_lock for each of the parents? I seem to
recall Linus reminding people recently that it was only the lock
acquisition order that was important, not release ... does something like
the following look OK?

diff -urpN -X /home/fletch/.diff.exclude virgin/fs/proc/array.c sdet2/fs/proc/array.c
--- virgin/fs/proc/array.c Sat Feb 15 16:11:45 2003
+++ sdet2/fs/proc/array.c Sun Feb 16 08:37:45 2003
@@ -147,11 +147,11 @@ static inline const char * get_task_stat
         return *p;
 }
 
+/* Call me with the tasklist_lock and task_lock for p held already */
 static inline char * task_state(struct task_struct *p, char *buffer)
 {
         int g;
 
- read_lock(&tasklist_lock);
         buffer += sprintf(buffer,
                 "State:\t%s\n"
                 "Tgid:\t%d\n"
@@ -165,13 +165,10 @@ static inline char * task_state(struct t
                 p->pid && p->ptrace ? p->parent->pid : 0,
                 p->uid, p->euid, p->suid, p->fsuid,
                 p->gid, p->egid, p->sgid, p->fsgid);
- read_unlock(&tasklist_lock);
- task_lock(p);
         buffer += sprintf(buffer,
                 "FDSize:\t%d\n"
                 "Groups:\t",
                 p->files ? p->files->max_fds : 0);
- task_unlock(p);
 
         for (g = 0; g < p->ngroups; g++)
                 buffer += sprintf(buffer, "%d ", p->groups[g]);
@@ -243,20 +240,22 @@ extern char *task_mem(struct mm_struct *
 int proc_pid_status(struct task_struct *task, char * buffer)
 {
         char * orig = buffer;
- struct mm_struct *mm = get_task_mm(task);
 
+ read_lock(&tasklist_lock);
+ task_lock(task);
         buffer = task_name(task, buffer);
         buffer = task_state(task, buffer);
+ read_unlock(&tasklist_lock);
  
- if (mm) {
- buffer = task_mem(mm, buffer);
- mmput(mm);
- }
- buffer = task_sig(task, buffer);
+ if (task->mm)
+ buffer = task_mem(task->mm, buffer);
+ if (task->sighand)
+ buffer = task_sig(task, buffer);
         buffer = task_cap(task, buffer);
 #if defined(CONFIG_ARCH_S390)
         buffer = task_show_regs(task, buffer);
 #endif
+ task_unlock(task);
         return buffer - orig;
 }
 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Feb 23 2003 - 22:00:15 EST