Re: [PATCH 08/28] kdb: core for kgdb back end (2 of 2)

From: Jason Wessel
Date: Thu Feb 18 2010 - 13:36:56 EST


Scott Lurndal wrote:
>
> IIRC the original KDB would stop all the cpus when entered,
> thus locking to avoid concurrent access to data
> was not necessary when displaying kernel data structures.

Just because the system is "effectively frozen" does not mean you can
safely walk a structure or call something that takes a lock.

One of the key points Eric and others have made is that they do not want
some of this helper code in the kernel core, nor do they want alternate
lock semantics while the kernel debugger is active. You can achieve the
same sort of interrogation with a gdb helper macro. At some point kdb
could be extended to have the same sort of functionality if someone
finds they just cannot live without it.


> However,
> KDB users and developers were assumed to be aware that when KDB was
> entered the system context was in an indeterminate state, particularly
> with respect to linked lists and other non-tabular data structures.
>
> KDB code that displayed data structures which were kept in a non-table
> data structure (linked list, tree, etc.) was required both to
> validate each pointer it tries to follow and to ensure that it
> detects loops (either by terminating the list traversal after a certain
> number of elements or by allowing the KDB user to terminate the traversal
> with e.g. 'q').
>
>
>> It looks to me like the original kdb took the approach of calling the
>> setjmp() longjmp() and if there was any kind of fault, it long jumped
>> back to the original context. Obviously that doesn't solve any kind of
>> problem with a list loop.
>>
>
> Yes. The list loop was expected to be handled either by the display
> code terminating after some number of traversal steps or by the KDB user
> terminating the command via the keyboard (e.g. 'q' at a more-type prompt).
>
>
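
As an illustration of the defensive traversal described in the quoted
text above (validate each pointer, stop after a fixed number of steps),
here is a minimal user-space sketch in C. It is not kdb code. The names
struct node, struct arena, ptr_ok() and walk_bounded() are hypothetical,
and the "validation" is only a range check against a known array of
nodes, which is an assumption made to keep the example self-contained.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node type for illustration; not a kernel structure. */
struct node {
	int value;
	struct node *next;
};

/* Hypothetical description of the memory where valid nodes may live. */
struct arena {
	const struct node *base;
	size_t count;
};

/* Accept only pointers that land on a node boundary inside the arena. */
static int ptr_ok(const struct arena *a, const struct node *p)
{
	uintptr_t lo = (uintptr_t)a->base;
	uintptr_t hi = (uintptr_t)(a->base + a->count);
	uintptr_t v = (uintptr_t)p;

	if (v < lo || v >= hi)
		return 0;
	return (v - lo) % sizeof(struct node) == 0;
}

/*
 * Walk the list starting at head, visiting at most max_steps nodes.
 * Returns the number of nodes visited, -1 if a pointer failed
 * validation, or -2 if the step cap was reached (possible loop).
 */
static int walk_bounded(const struct arena *a, const struct node *head,
			int max_steps)
{
	const struct node *p = head;
	int n = 0;

	while (p != NULL) {
		if (!ptr_ok(a, p))
			return -1;
		if (n >= max_steps)
			return -2;
		n++;
		p = p->next;
	}
	return n;
}
```

The step cap does not prove there is a loop, it only guarantees that the
display command terminates. A pager prompt where the user can type 'q'
serves the same purpose interactively.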

The new kdb has a pager as well as abort operations, but it does not
make use of setjmp() longjmp() to handle faults while executing other
helper print code.
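
For readers unfamiliar with the setjmp()/longjmp() recovery pattern
mentioned above, a minimal user-space sketch in C follows. It is not
taken from either kdb implementation. All names (recover_point,
simulated_fault(), show_value(), guarded_show()) are hypothetical, and
the fault is simulated with an explicit call, whereas a real debugger
would reach the longjmp() from its fault handler.

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical recovery point shared by the guard and the "handler". */
static jmp_buf recover_point;

/* Stand-in for a fault handler: jump back to the saved context. */
static void simulated_fault(void)
{
	longjmp(recover_point, 1);
}

/* Hypothetical display helper that "faults" on a NULL pointer. */
static int show_value(const int *p)
{
	if (p == NULL)
		simulated_fault();	/* does not return */
	return *p;
}

/*
 * Run the helper under a recovery point. Returns 0 and stores the
 * value in *out on success, or -1 if the helper faulted.
 */
static int guarded_show(const int *p, int *out)
{
	if (setjmp(recover_point) != 0)
		return -1;	/* arrived here via longjmp() */
	*out = show_value(p);
	return 0;
}
```

This only recovers from a fault. A helper that follows a looping list
never faults, so it never returns to the recovery point on its own.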

> If the new KDB framework allows other cpus to continue to run while kdb
> data structure display commands are running, then much more care must
> be taken in the display command code to avoid inconsistent data causing
> loops or #PF.
>
>

In the new kdb, the system is fully stopped by the kernel debug core.
There is the concept of the master CPU (the one running the debug shell)
and the slave CPUs which are all the other cores. All the slaves spin
in a control loop, and you may switch cpus with the kdb cpu command
without exiting the debug context. The master can "trade places" with
an online slave cpu.
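
The master/slave relationship can be modelled with a small,
single-threaded sketch in C. This is a toy state model and not the
kernel debug core implementation. The names struct dbg_state,
role_of() and switch_master(), and the MAX_CPUS limit, are hypothetical
and exist only for this illustration.

```c
#include <stdbool.h>

#define MAX_CPUS 8	/* hypothetical limit for the sketch */

struct dbg_state {
	bool online[MAX_CPUS];
	int master;	/* CPU currently running the debug shell */
};

/* What a given CPU would be doing while the debugger is active. */
enum cpu_role { ROLE_OFFLINE, ROLE_MASTER, ROLE_SLAVE_SPIN };

static enum cpu_role role_of(const struct dbg_state *s, int cpu)
{
	if (cpu < 0 || cpu >= MAX_CPUS || !s->online[cpu])
		return ROLE_OFFLINE;
	return cpu == s->master ? ROLE_MASTER : ROLE_SLAVE_SPIN;
}

/*
 * Hand the debug shell to another CPU without leaving the debug
 * context. Only an online CPU may become master; returns 0 on
 * success and -1 otherwise. The old master becomes a spinning slave.
 */
static int switch_master(struct dbg_state *s, int target)
{
	if (role_of(s, target) == ROLE_OFFLINE)
		return -1;
	s->master = target;
	return 0;
}
```

In this model exactly one online CPU is the master at any time, and a
request to switch to an offline CPU leaves the current master in place.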

Jason.