RE: Mainline kernel OLTP performance update

From: Wilcox, Matthew R
Date: Wed May 06 2009 - 15:25:56 EST


I suppose another way to check how close you are to replicating our setup, in terms of time spent in the interrupt handler, is to look at how many interrupts you're getting per second. With only one controller, you're probably getting more interrupt coalescing than we're seeing.
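
A quick way to put a number on that is to diff /proc/interrupts over an
interval. Here's a rough sketch; it assumes your HBA's lines show up as
"qla2xxx" in /proc/interrupts, so substitute whatever driver name
appears on your box:

irqs() {
    # sum every per-CPU count on the controller's interrupt lines
    grep qla2xxx /proc/interrupts |
      awk '{ for (i = 2; i <= NF; i++) if ($i ~ /^[0-9]+$/) s += $i }
           END { print s + 0 }'
}
a=$(irqs); sleep 10; b=$(irqs)
echo "$(( (b - a) / 10 )) interrupts/sec"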

I suppose you could run Orion multiple times, or, if you have an array that can do RAID-0 for you, put multiple spindles behind each LUN (which is what we do -- 30 LUNs, each with 15 spindles).
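
If you do end up running several Orion instances in parallel, one per
LUN, something along these lines might work. This is only a sketch: the
-run, -testname and -num_disks flags are the ones Orion's documentation
describes (check them against your binary's help output), and lun1..lun4
are placeholder test names, each needing a matching lunN.lun file that
lists its devices:

# launch one Orion instance per LUN list, then wait for all of them
for n in 1 2 3 4; do
    ./orion -run advanced -testname lun$n -num_disks 15 &
done
wait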

> -----Original Message-----
> From: Anirban Chakraborty [mailto:anirban.chakraborty@xxxxxxxxxx]
> Sent: Wednesday, May 06, 2009 11:24 AM
> To: Wilcox, Matthew R; Styner, Douglas W; linux-kernel@xxxxxxxxxxxxxxx
> Cc: Tripathi, Sharad C; arjan@xxxxxxxxxxxxxxx; Kleen, Andi; Siddha, Suresh
> B; Ma, Chinang; Wang, Peter Xihong; Nueckel, Hubert; Recalde, Luis F;
> Nelson, Doug; Cheng, Wu-sun; Prickett, Terry O; Shunmuganathan,
> Rajalakshmi; Garg, Anil K; Chilukuri, Harita; chris.mason@xxxxxxxxxx
> Subject: Re: Mainline kernel OLTP performance update
>
>
> On 5/6/09 11:12 AM, "Wilcox, Matthew R" <matthew.r.wilcox@xxxxxxxxx>
> wrote:
>
> > That's a more accurate simulation of our workload, but Anirban's
> > setup doesn't have nearly as many spindles as ours, so he won't do as
> > many IOPS and may not see the problem.
> >
> I was getting IOPS on the order of 46000, which was not too far from
> what Doug was getting. Orion settings indeed have a cache-cold setting
> (specifying the cache size as 0). The IO was done with a 1k block size
> in sequential mode to the raw devices.
> I can have that many LUNs, but the issue is that Orion does not support
> that many devices, and I do not have the source code for it. Let me see
> if I can find some other tool.
>
> -Anirban
>
> > All I'm trying to do is get something that will show the problem on
> > his setup, and I think sequential IO is going to be the right answer
> > here. I could easily be wrong.
> >
> > Neither FIO nor dd is going to have the cache behaviour of the
> > database (maybe Orion does?). As far as I can tell, we come to the
> > kernel cache-cold for every IO simply because the database uses as
> > many cache entries as it can. We could write a little program to just
> > thrash through cachelines, or just run gcc at the same time as this --
> > apparently gcc will happily chew through all the cache it can too.
> >
> >> -----Original Message-----
> >> From: Styner, Douglas W
> >> Sent: Wednesday, May 06, 2009 11:05 AM
> >> To: Wilcox, Matthew R; Anirban Chakraborty;
> >> linux-kernel@xxxxxxxxxxxxxxx
> >> Cc: Tripathi, Sharad C; arjan@xxxxxxxxxxxxxxx; Kleen, Andi; Siddha,
> >> Suresh B; Ma, Chinang; Wang, Peter Xihong; Nueckel, Hubert; Recalde,
> >> Luis F; Nelson, Doug; Cheng, Wu-sun; Prickett, Terry O;
> >> Shunmuganathan, Rajalakshmi; Garg, Anil K; Chilukuri, Harita;
> >> chris.mason@xxxxxxxxxx
> >> Subject: RE: Mainline kernel OLTP performance update
> >>
> >> Wilcox, Matthew R writes:
> >>> I'm not sure that Orion is going to give useful results in your
> >>> hardware setup. I suspect you don't have enough spindles to get the
> >>> IO rates that are required to see the problem. How about doing lots
> >>> of contiguous I/O instead? Something as simple as:
> >>>
> >>> for i in sda sdb sdc; do    # ...repeat the device list ad nauseam
> >>>     dd if=/dev/$i of=/dev/null bs=4k iflag=direct &
> >>> done
> >>>
> >>
> >> A better workload emulator would be to use FIO to generate a ~60/40
> >> read/write mix with ~90-95% random I/O at a 2k block size. There is
> >> some sequential writing in our workload, but only to a log file, and
> >> there is not much activity there.
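
For what it's worth, a fio job file along the lines Doug describes might
look like the sketch below. Two caveats: percentage_random only exists
in newer fio releases, and /dev/sdb and /dev/sdc are placeholders for
whatever raw devices you actually test against.

cat > oltp-mix.fio <<'EOF'
[global]
ioengine=libaio
direct=1
bs=2k
runtime=300
time_based

# ~60/40 read/write mix, ~90% of it random
[data]
rw=randrw
rwmixread=60
percentage_random=90
filename=/dev/sdb

# light sequential log writer on the side
[log]
rw=write
filename=/dev/sdc
EOF
fio oltp-mix.fio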
