Re: DTrace-like analysis possible with future Linux kernels?

From: Robert Milkowski
Date: Mon Aug 23 2004 - 19:25:20 EST


On Sun, 22 Aug 2004, Alan Cox wrote:
On Sad, 2004-08-21 at 07:03, Tomasz Kłoczko wrote:
In Solaris, DTrace is enabled in the _normal production_ kernel, and you can
attach any probe or set of probes without restarting the system or any running
application that was compiled without debug info.

Solaris only runs on large computers. You don't want kprobes randomly on
your phone, pda, wireless router. Solaris deals with an extremely narrow
market segment of "big computers for people with lots of money".

Overstatement.
Solaris runs on the x86 platform, and runs quite well.
And guess what - DTrace runs on x86 like a charm.

[1] Remember: if you want to profile some part of the code you must _first_
(re)compile it with profiling enabled. If you want to debug some code

OProfile doesn't require this.

I must admit I don't know OProfile.
But can you profile an already running application without interrupting it (not to mention stopping it)? Sure, DTrace introduces some overhead, but if that's not acceptable, just stop DTrace and the application runs at its original speed again.
What about getting structure contents, function arguments and return values, etc., all on the fly? And then there is the D language, which saves a lot of time with its aggregations, thread-local variables, and speculations. Sure, you can use Perl, AWK, and so on - but it takes time - a lot of time.
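
To make the aggregation and thread-local variable features concrete, here is a minimal sketch of a D script (an illustration, not part of the original exchange). It assumes the Solaris syscall provider and a target process selected with dtrace -p; the file name rw.d is hypothetical.

```d
/* Count system calls made by the target process, keyed by syscall name. */
syscall:::entry
/pid == $target/
{
        @calls[probefunc] = count();
}

/* Remember when each read(2) started, in a thread-local variable. */
syscall::read:entry
/pid == $target/
{
        self->ts = timestamp;
}

/* On return, add the elapsed nanoseconds to a power-of-two histogram. */
syscall::read:return
/self->ts/
{
        @lat["read latency (ns)"] = quantize(timestamp - self->ts);
        self->ts = 0;   /* release the thread-local variable */
}
```

Run it as "dtrace -s rw.d -p <pid>" against a process that is already running. Pressing Ctrl-C ends tracing and prints both aggregations, and the process carries on.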


Some enterprise systems have total downtime limited to a few hours per year,
and a restart cycle can take an hour or more (checking and initializing hardware
components). If you try to tell the admin of this kind of system "restart your

Enterprise users generally get kernels from vendors who have done the
analysis of needs for them (and hopefully got it right). They generally
don't ftp 2.6.8.1 type make config and try it out.


I think you missed the point.
DTrace is not only about the kernel, and it's definitely not a tool ONLY for kernel developers. It's a great tool for userland developers and for sysadmins. And it's really easy (well...) to correlate data from the kernel and the application. All in production, and it just works.

I think Tomasz was writing not only about kernel/system uptimes but also about application uptimes. And DTrace can help in both cases.

Coming back to the kernel - if you have, for example, some kind of memory leak in the kernel, then the support guys can provide you with a DTrace script, see what's going on and get the problem solved without unnecessary system restarts. Then
they provide you with a new kernel (patch). So even if an enterprise user does not want to recompile his own kernel, if there's a kernel problem DTrace can save him some downtime.


Actually I generally
- Glance across the load meters
- Ask ps where everything is waiting
- Potentially turn on oprofile
- Potentially use netfilter to see who is causing all my traffic
- Maybe strace a few apps to see what is up
- Educate the user concerned (if needed)

I've already got the symbols (and they are in the debuginfo package from
the rpm build too). I could insmod kgdb but that level of
debugging is generally inappropriate.

DTrace value is ease of use value.


Sure it's one of its values.
I would add the safety of running DTrace in production. In fact it should be the number one feature. DTrace does a great job of preventing you from crashing the system or applications.
I would add that there is an easy (at least a lot easier than without DTrace)
way to understand what's going on in the system, and which application is causing it and why. For example, Bryan's example with xcalls. And all of that on production systems.

Sure, you can make your own module on Linux, load it and trace whatever you want. But:

1. it's not easy
2. requires quite a lot of knowledge about the kernel
3. could easily crash your kernel
4. takes time
5. only kernel level - what about applications and correlation between
apps and kernel?
...
...


My point is that DTrace is a really awesome tool.
Sure, you can do many things without DTrace, but it will take much more time. And there are a lot of things you can do with DTrace which are really hard to do in production without it.

It just saves you time and gives you answers to questions you would not even have asked before DTrace 'coz of the unacceptable time it would take to answer them. It's really 'fun' to see what's going on in the system and/or applications with DTrace.

My post sounds almost like marketing crap - but it is really what I find in DTrace. I must admit that I did not really appreciate this tool until I started using it.


--
Robert Milkowski
milek@xxxxxxxxxxxxxxxxxx