RE: a small doubt

Bret Indrelee (bindrelee@sbs-cp.com)
Fri, 8 Oct 1999 11:35:01 -0500


Keith Owens [mailto:kaos@ocs.com.au] wrote:
> On Fri, 8 Oct 1999 09:18:36 -0500 ,
> Bret Indrelee <bindrelee@sbs-cp.com> wrote:
> >Oliver Xymoron [mailto:oxymoron@waste.org] wrote:
> >> Why don't we simply use an OOPS() macro that causes an oops in an
> >> arch-specific way? And throw this in a generic place:
> >
> >What is wrong with just using the existing panic() from kernel.h?
> >
> >If there is information that panic() doesn't save which you would get with
> >an OOPS message, then that is a problem with panic.
>
> panic() is just a general message formatter, like printk() but with a
> few more side effects ;). Causing an oops gets the registers and, more
> to the point, gets them at the point of failure. Registers after you
> have called another routine are much less useful.

OK, I've seen enough variants of this response that I have to ask a
question.

How many of you are doing assembly language programming?

Assuming that the situation is one that cannot be cleanly recovered from, I
would usually do something like:

---

#define LINE_STR STRIZE(__LINE__)
#define STRIZE(literal) STRIZE2(literal)
#define STRIZE2(literal) #literal

panic(__FILE__ " at line " LINE_STR " in " __FUNCTION__ ": This should have never happened.\n");

---

Then you put a kernel breakpoint in at panic and dump out the traceback there.
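For anyone who wants to try the stringizing trick outside the kernel, here is a minimal user-space sketch. FORMAT_LOCATION is a hypothetical helper of my own, not anything from kernel.h, and it uses snprintf() rather than panic(). It passes the function name as a "%s" argument because C99 defines __func__ as a character array rather than a string literal, so it cannot be pasted next to the other literals the way __FILE__ and LINE_STR can.

```c
#include <stdio.h>

/* Two levels are needed: STRIZE expands its argument (e.g. __LINE__ -> 38)
 * before STRIZE2 turns the result into a string literal. */
#define STRIZE2(literal) #literal
#define STRIZE(literal)  STRIZE2(literal)
#define LINE_STR         STRIZE(__LINE__)

/* Hypothetical helper (an assumption, not a kernel API): writes
 * "<file> at line <N> in <function>: <msg>" into buf.
 * __FILE__ and LINE_STR are string literals and concatenate at compile
 * time; __func__ is an array, so it goes through "%s" instead. */
#define FORMAT_LOCATION(buf, size, msg)                          \
    snprintf((buf), (size), "%s at line " LINE_STR " in %s: %s", \
             __FILE__, __func__, (msg))
```

Calling FORMAT_LOCATION(buf, sizeof buf, "This should have never happened.") from inside a function fills buf with the file, line, and function of the call site. Stringizing in a single step, as in STRIZE2(__LINE__), would produce the text "__LINE__" instead of the number, which is why the extra macro layer is there.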

Since most of my coding is done in C, looking at the machine registers isn't usually all that valuable. The call stack would be valuable; it is too bad that panic() doesn't dump it automatically.

A core dump would certainly be nice. It would allow me to go in with the equivalent of ADB and look at the various symbols I need. The call stack is often essential; it looks like panic() needs an upgrade to provide it.

As for recovery, looks like I'll have to check into what oops does. I expect the kernel to be able to recover from an application page fault, but not one in the kernel. If the situation is so bad I can't recover from it, how does the kernel know how to?

Doesn't seem safe to me.

The original post was about dput() in the fs/dcache.c file:

out:
	if (count >= 0) {
		dentry->d_count = count;
		return;
	}

	printk(KERN_CRIT "Negative d_count (%d) for %s/%s\n",
	       count,
	       dentry->d_parent->d_name.name,
	       dentry->d_name.name);
	*(int *)0 = 0;
}

I believe this is in the kernel's file system directory cache. Since there aren't any comments about what d_count is actually counting, I can only guess that it is a reference count. How does the kernel know how to recover from a bad reference count in a dentry when the cache handling code can't do it?

Seems like this is an ideal place to mark the entry bad and then (potentially) call panic(). Since panic() calls sync, some action to deal with the bad entry should be required before the call.
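To make the "mark the entry bad" idea concrete, here is a rough user-space sketch. Everything in it is hypothetical: struct cache_entry, its bad flag, and entry_put() are stand-ins I made up for illustration, not the kernel's struct dentry or dput(), and real code would also need locking. The point is only that the corrupt count gets reported and flagged, and the caller decides whether to panic, instead of the code writing through a null pointer.

```c
#include <stdio.h>

/* Hypothetical stand-in for a cache entry; NOT the kernel's struct dentry. */
struct cache_entry {
    const char *name;
    int count;  /* assumed to be a reference count */
    int bad;    /* set once the count is found to be corrupt */
};

/* Drop one reference.
 * Returns 0 on success, or -1 if the entry is (or has just become) bad.
 * A negative count is reported and the entry is flagged; the stored count
 * is left untouched so later inspection sees the last sane value. */
int entry_put(struct cache_entry *e)
{
    int count;

    if (e->bad)
        return -1;

    count = e->count - 1;
    if (count >= 0) {
        e->count = count;
        return 0;
    }

    fprintf(stderr, "Negative count (%d) for %s\n", count, e->name);
    e->bad = 1;  /* mark the entry bad before anyone syncs or panics */
    return -1;
}
```

A caller that gets -1 back can then choose between refusing to use the entry and calling something like panic(), with the bad flag already set by the time any sync happens.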

Regardless, intentionally causing a page fault just seems like the wrong thing to do.

-Bret

-------------------------------------------------------------
SBS Technologies, Connectivity Products
... solutions for real-time connectivity

Bret Indrelee, Engineer
SBS Technologies, Inc., Connectivity Products
1284 Corporate Center Drive, St. Paul MN 55121
Direct: (651) 905-4731
Main: (651) 905-4700
Fax: (651) 905-4701
E-mail: bindrelee@sbs-cp.com
http://www.sbs.com
-------------------------------------------------------------
