Re: IO scheduler based IO Controller V2

From: Gui Jianfeng
Date: Thu May 07 2009 - 01:48:48 EST


Vivek Goyal wrote:
> Hi Gui,
>
> Thanks for the report. I use cgroup_path() for debugging. I guess that
> cgroup_path() was passed a NULL cgrp pointer, which is why it crashed.
>
> If yes, then it is strange though. I call cgroup_path() only after
> grabbing a reference to css object. (I am assuming that if I have a valid
> reference to css object then css->cgroup can't be null).

I think so too...

>
> Anyway, can you please try out the following patch and see if it fixes
> your crash?
>
> ---
> block/elevator-fq.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> Index: linux11/block/elevator-fq.c
> ===================================================================
> --- linux11.orig/block/elevator-fq.c 2009-05-05 15:38:06.000000000 -0400
> +++ linux11/block/elevator-fq.c 2009-05-06 11:55:47.000000000 -0400
> @@ -125,6 +125,9 @@ static void io_group_path(struct io_grou
> unsigned short id = iog->iocg_id;
> struct cgroup_subsys_state *css;
>
> + /* For error case */
> + buf[0] = '\0';
> +
> rcu_read_lock();
>
> if (!id)
> @@ -137,15 +140,12 @@ static void io_group_path(struct io_grou
> if (!css_tryget(css))
> goto out;
>
> - cgroup_path(css->cgroup, buf, buflen);
> + if (css->cgroup)

According to CR2, css->cgroup was 0x00000100 when the kernel crashed, so
it is non-NULL but not a valid pointer. I guess this patch won't fix the
issue.

> + cgroup_path(css->cgroup, buf, buflen);
>
> css_put(css);
> -
> - rcu_read_unlock();
> - return;
> out:
> rcu_read_unlock();
> - buf[0] = '\0';
> return;
> }
> #endif
>
> BTW, I tried the following equivalent script and I can't see the crash on
> my system. Are you able to hit it regularly?

Yes, I can reproduce it about 50% of the time.
I've attached the rwio source code.

>
> Instead of killing the tasks I also tried moving the tasks into root cgroup
> and then deleting test1 and test2 groups, that also did not produce any crash.
> (Hit a different bug though after 5-6 attempts :-)
>
> As I mentioned in the patchset, we currently have issues with group
> refcounting and the cgroup/group going away. Hopefully they will all be
> fixed in the next version. But still, it is nice to hear back...
>
>

--
Regards
Gui Jianfeng
