Re: 2.5.66-mm2

From: Martin J. Bligh (mbligh@aracnet.com)
Date: Wed Apr 02 2003 - 10:34:31 EST


>> Ho hum. All very strange. Kernbench seems to be really behaving itself
>> quite well now, but SDET sucks worse than ever. The usual 16x NUMA-Q
>> machine ....
>>
>> Kernbench: (make -j N vmlinux, where N = 2 x num_cpus)
>>                         Elapsed   System     User      CPU
>>             2.5.66-mm2    44.04    81.12   569.40  1476.75
>>        2.5.66-mm2-ext3    44.43    84.10   568.82  1469.00
>
> Is this ext2 versus ext3? If so, that's a pretty good result, isn't it? I
> forget what kernbench looked like for stock ext3.

Yes, it's splendid. Used to look more like this:

Kernbench: (make -j N vmlinux, where N = 2 x num_cpus)
                        Elapsed   System     User      CPU
         2.5.61-mjb0.1    46.04   115.46   563.07  1472.25
    2.5.61-mjb0.1-ext3    48.45   143.79   564.14  1459.00

That was before I had noatime though (I think) ...

Kernbench: (make -j N vmlinux, where N = 2 x num_cpus)
                        Elapsed   System     User      CPU
           2.5.65-mjb1    43.73    81.69   563.54  1475.00
      2.5.65-mjb1-ext3    44.13    79.77   564.56  1460.25

So I think that, after noatime, SDET was really the big remaining problem.
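
For anyone reproducing these runs: noatime just stops reads from dirtying
inode metadata with access-time updates, which is presumably where a chunk
of the extra ext3 system time was going. A minimal sketch of setting it
programmatically (/mnt/test is a hypothetical mount point here;
"mount -o remount,noatime /mnt/test" or an fstab entry does the same thing):

/* Minimal sketch: remount a filesystem noatime so reads stop
 * dirtying inode metadata. /mnt/test is a hypothetical path,
 * and this needs root. */
#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
        /* MS_REMOUNT alters flags on an existing mount;
         * MS_NOATIME suppresses access-time updates. */
        if (mount("none", "/mnt/test", NULL,
                  MS_REMOUNT | MS_NOATIME, NULL) != 0) {
                perror("mount");
                return 1;
        }
        return 0;
}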

>> SDET 32 (see disclaimer)
>>                        Throughput   Std. Dev
>>            2.5.66-mm2      100.0%       0.7%
>>       2.5.66-mm2-ext3        4.7%       1.5%
>
> Yes, this is presumably a lot more metadata-intensive, so we're just
> hammering the journal semaphore to death. We're working on it.

Ah, that makes sense, thanks.
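
For the archives, the failure mode is easy to model: every transaction
start/stop funnels through one semaphore, so extra CPUs just add contention
rather than throughput. A rough userspace sketch of that shape, with
hypothetical names and a pthread mutex standing in for the kernel semaphore
(the real path is jbd's journal_start()/journal_stop()):

/* Toy model of the SDET collapse: every "transaction" goes through
 * one global lock, so adding CPUs adds contention, not throughput.
 * Hypothetical names; build with -lpthread. */
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 16              /* one per CPU on the 16x NUMA-Q */
#define NTRANS   100000

static pthread_mutex_t journal_sem = PTHREAD_MUTEX_INITIALIZER;
static long journal_state;       /* stand-in for shared journal data */

static void *worker(void *arg)
{
        int i;

        (void)arg;
        for (i = 0; i < NTRANS; i++) {
                pthread_mutex_lock(&journal_sem);   /* journal_start() */
                journal_state++;                    /* touch shared state */
                pthread_mutex_unlock(&journal_sem); /* journal_stop() */
        }
        return NULL;
}

int main(void)
{
        pthread_t tid[NTHREADS];
        int i;

        for (i = 0; i < NTHREADS; i++)
                pthread_create(&tid[i], NULL, worker, NULL);
        for (i = 0; i < NTHREADS; i++)
                pthread_join(tid[i], NULL);
        printf("%ld transactions, all serialized on one lock\n",
               journal_state);
        return 0;
}

Throughput barely moves between 1 thread and 16, which is roughly the
shape of the 4.7% SDET number above.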
 
>> ftp://ftp.kernel.org/pub/linux/kernel/people/mbligh/benchmarks/2.5.66-mm2-ext3/
>
> Offtopic, a raw sdet64 profile says:
>
> 5392317 total
> 4478683 default_idle
>  307163 __down
>  169770 .text.lock.sched
>  106769 schedule
>   88092 __wake_up
>   57280 .text.lock.transaction
>
> I'm slightly surprised that the high context switch rate is showing up so
> much contention in sched.c. I'm assuming that it's on the sleep/wakeup path
> and not in the context switch path. It would be interesting to inline the
> spinlock code and reprofile.

OK, done. diffprofile with and without the spinlocks inlined below:

    350487 158.8% schedule
     67691 0.6% total
     49184 0.5% default_idle
     43095 1680.1% journal_start
     42048 136.1% do_get_write_access
     34988 638.7% journal_stop
     16813 1527.1% inode_change_ok
      6752 3.6% __wake_up
      5960 0.9% __down
      3942 3718.9% sem_exit
      3323 725.5% inode_setattr
      3191 1470.5% proc_pid_readlink
      1646 2743.3% sys_ioctl
       418 16.8% atomic_dec_and_lock
       408 4.4% journal_add_journal_head
       303 841.7% proc_root_lookup
       258 4.6% __find_get_block
       256 33.0% follow_mount
       245 1.7% journal_dirty_metadata
       223 4.7% __find_get_block_slow
       213 2366.7% chrdev_open
       209 1492.9% proc_root_readdir
       202 2.6% find_get_page
       171 8.8% journal_unlock_journal_head
       146 6.2% kmem_cache_free
       126 2.7% do_anonymous_page
       120 1.3% copy_page_range
       111 1387.5% sys_sysctl
       107 93.9% journal_get_create_access
       106 0.8% cpu_idle
       106 963.6% __posix_lock_file
       101 280.6% put_filp
       100 1666.7% de_put
...
      -102 -13.1% fget
      -118 -100.0% .text.lock.char_dev
      -120 -2.2% block_write_full_page
      -123 -20.0% block_prepare_write
      -127 -93.4% remove_from_page_cache
      -142 -100.0% .text.lock.sysctl
      -167 -1.5% __blk_queue_bounce
      -183 -27.8% do_generic_mapping_read
      -203 -11.7% free_hot_cold_page
      -205 -100.0% .text.lock.dcache
      -211 -13.0% buffered_rmqueue
      -237 -98.8% .text.lock.namei
      -426 -100.0% .text.lock.dec_and_lock
      -458 -100.0% .text.lock.root
      -516 -8.8% __copy_to_user_ll
      -781 -100.0% .text.lock.journal
     -1696 -100.0% .text.lock.ioctl
     -3123 -100.0% .text.lock.base
     -4048 -100.0% .text.lock.sem
    -19538 -100.0% .text.lock.attr
   -117523 -99.9% .text.lock.transaction
   -347530 -100.0% .text.lock.sched
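
For reference, the table above is just the per-symbol difference between
the two readprofile runs, with the percentage relative to the first run.
A throwaway sketch of the same computation, assuming a simplified
two-column "ticks symbol" input rather than diffprofile's real parsing:

/* diffprofile-style computation: given two profiles as "ticks symbol"
 * lines, print the per-symbol change and the percent delta relative
 * to the first run. Simplified sketch, not the real diffprofile. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXSYMS 4096

static char names[MAXSYMS][64];
static long old_ticks[MAXSYMS], new_ticks[MAXSYMS];
static int nsyms;

static int lookup(const char *name)
{
        int i;

        for (i = 0; i < nsyms; i++)
                if (strcmp(names[i], name) == 0)
                        return i;
        if (nsyms == MAXSYMS) {
                fprintf(stderr, "too many symbols\n");
                exit(1);
        }
        strncpy(names[nsyms], name, 63);
        return nsyms++;
}

static void slurp(const char *file, long *counts)
{
        char name[64];
        long ticks;
        FILE *f = fopen(file, "r");

        if (!f) {
                perror(file);
                exit(1);
        }
        while (fscanf(f, "%ld %63s", &ticks, name) == 2)
                counts[lookup(name)] += ticks;
        fclose(f);
}

int main(int argc, char **argv)
{
        int i;

        if (argc != 3) {
                fprintf(stderr, "usage: %s old.prof new.prof\n", argv[0]);
                return 1;
        }
        slurp(argv[1], old_ticks);
        slurp(argv[2], new_ticks);
        for (i = 0; i < nsyms; i++) {
                long delta = new_ticks[i] - old_ticks[i];

                if (delta == 0)
                        continue;
                if (old_ticks[i])
                        printf("%10ld %7.1f%% %s\n", delta,
                               100.0 * delta / old_ticks[i], names[i]);
                else
                        printf("%10ld     new  %s\n", delta, names[i]);
        }
        return 0;
}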

Thanks,

M.
