Re: zram: per-cpu compression streams

From: Minchan Kim
Date: Wed Apr 27 2016 - 03:28:34 EST


Hello Sergey,

On Tue, Apr 26, 2016 at 08:23:05PM +0900, Sergey Senozhatsky wrote:
> Hello Minchan,
>
> On (04/19/16 17:00), Minchan Kim wrote:
> [..]
> > I'm convinced now with your data. Super thanks!
> > However, as you know, we need data how bad it is in heavy memory pressure.
> > Maybe, you can test it with fio and background memory hogger,
>
> it's really hard to produce stable test results when the system
> is under mem pressure.
>
> first, I modified zram to export the re-compression number
> (put cpu stream and re-try handler allocation)
>
> mm_stat for numjobs{1..10}. the number of re-compressions is in "< NUM>" format
>
> 3221225472 3221225472 3221225472 0 3221229568 0 0 < 6421>
> 3221225472 3221225472 3221225472 0 3221233664 0 0 < 6998>
> 3221225472 2912157607 2952802304 0 2952814592 0 84 < 7271>
> 3221225472 2893479936 2899120128 0 2899136512 0 156 < 8260>
> 3221217280 2886040814 2899099648 0 2899128320 0 78 < 8297>
> 3221225472 2880045056 2885693440 0 2885718016 0 54 < 7794>
> 3221213184 2877431364 2883756032 0 2883801088 0 144 < 7336>
> 3221225472 2873229312 2876096512 0 2876133376 0 28 < 8699>
> 3221213184 2870728008 2871693312 0 2871730176 0 30 < 8189>
> 2899095552 2899095552 2899095552 0 2899136512 78643 0 < 7485>

It would be great if we could see the following ratio for each test:

1-compression : 2(re)-compression
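
For illustration only, here is a rough sketch of how that ratio could be
derived from the lines quoted above. It assumes 4 KiB pages and the
non-mainline "< NUM>" suffix from the modified zram in this thread. The
mm_stat output has no counter for the total number of compressions, so the
sketch takes that count as an explicit argument; feeding it
orig_data_size / PAGE_SIZE (the first mm_stat field) is only a placeholder
and undercounts whenever fio overwrites pages. The function names are made up.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages

# One mm_stat line followed by the extra "< NUM>" field. That field comes
# from the modified zram discussed in this thread, not from mainline.
LINE_RE = re.compile(r"^\s*((?:\d+\s+)+)<\s*(\d+)\s*>\s*$")


def parse_line(line):
    """Return (mm_stat fields as ints, re-compression count)."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError("unexpected line format: %r" % line)
    fields = [int(x) for x in m.group(1).split()]
    return fields, int(m.group(2))


def first_pass_per_recompression(total_writes, recompressions):
    """Number of 1-compressions for every 2(re)-compression.

    total_writes is the number of page writes that were compressed at all;
    each one either succeeded on the first pass or needed a second pass.
    Returns None when there were no re-compressions.
    """
    if recompressions == 0:
        return None
    return (total_writes - recompressions) / recompressions


line = "3221225472 3221225472 3221225472 0 3221229568 0 0 < 6421>"
fields, recomp = parse_line(line)
# Placeholder for the real write counter: pages currently stored.
stored_pages = fields[0] // PAGE_SIZE
ratio = first_pass_per_recompression(stored_pages, recomp)
print("1-compression : 2(re)-compression = %.1f : 1" % ratio)
```

A real first-pass counter exported next to the re-compression counter would
replace the placeholder and make the ratio exact.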

>
> as we can see, the number of re-compressions can vary from 6421 to 8699.
>
>
> the test:
>
> -- 4 GB x86_64 box
> -- zram 3GB, lzo
> -- mem-hogger pre-faults 3GB of pages before the fio test
> -- fio test has been modified to have 11% compression ratio (to increase the
> chances of re-compressions)

Could you run the mem hogger concurrently with fio, rather than pre-faulting
before the fio test, in the next submission?
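
To make the request concrete, a minimal sketch of what I mean by
"concurrent" is below. It is not the mem-hogger used in the test above (I
have not seen that one); the names hog_memory and run_with_hogger are
hypothetical, and 4 KiB pages are assumed. The hogger keeps allocating and
touching pages while the workload runs, instead of finishing before it starts.

```python
import subprocess
import threading

PAGE_SIZE = 4096  # assumption: 4 KiB pages


def hog_memory(total_bytes, chunk_bytes, stop, ramp_delay=0.0):
    """Allocate and touch up to total_bytes, then hold it until stop is set.

    Returns the number of bytes actually allocated.
    """
    held = []       # keep references so the memory is not freed
    allocated = 0
    while allocated < total_bytes and not stop.is_set():
        size = min(chunk_bytes, total_bytes - allocated)
        buf = bytearray(size)
        for off in range(0, size, PAGE_SIZE):
            buf[off] = 1          # write to each page so it is really backed
        held.append(buf)
        allocated += size
        if ramp_delay:
            stop.wait(ramp_delay)  # spread the pressure over the run
    stop.wait()                    # hold the pages until the workload ends
    return allocated


def run_with_hogger(cmd, total_bytes, chunk_bytes=1 << 20, ramp_delay=0.0):
    """Run cmd while a hogger thread builds memory pressure in parallel."""
    stop = threading.Event()
    result = []
    worker = threading.Thread(
        target=lambda: result.append(
            hog_memory(total_bytes, chunk_bytes, stop, ramp_delay)))
    worker.start()
    try:
        rc = subprocess.run(cmd).returncode
    finally:
        stop.set()
        worker.join()
    return rc, (result[0] if result else 0)
```

Something like run_with_hogger(["fio", "job.fio"], 3 << 30, ramp_delay=0.01)
would then overlap the pressure with the fio run; job.fio is a placeholder
for the job file used in the test.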

> -- buffer_compress_percentage=11
> -- scramble_buffers=0
>
>
> considering buffer_compress_percentage=11, the box was under somewhat
> heavy pressure.
>
> now, the results

Yep. Even the re-compression case is faster than the old one, but I want to
see a heavier memory pressure case and the ratio I mentioned above.

If the results are still good, please send a public patch with the numbers.
Thanks for looking into this, Sergey!
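
As a side note, "faster than old" can be put into numbers from the "perf
reported execution time" table quoted below. The sketch compares the
"4 streams" and "per cpu" columns; it assumes the ten rows are in numjobs
order (1 through 10), which the table itself does not state.

```python
# Elapsed seconds copied from the "perf reported execution time" table.
# Assumption: rows are in numjobs order, 1 through 10.
FOUR_STREAMS = [
    41.359653597, 37.778174380, 38.367149768, 40.402963748, 44.145428970,
    49.344988495, 53.865398777, 57.028770416, 62.931350164, 67.088285633,
]
PER_CPU = [
    40.961640812, 38.368529861, 37.687545579, 36.205357101, 41.810848146,
    44.270045250, 48.824173559, 51.332017545, 55.977463074, 57.690998344,
]


def percent_reduction(old, new):
    """Reduction in elapsed time, in percent; negative means slower."""
    return (old - new) / old * 100.0


def summarize(old_col, new_col):
    """Return a list of (numjobs, percent reduction) pairs."""
    return [
        (jobs, percent_reduction(old, new))
        for jobs, (old, new) in enumerate(zip(old_col, new_col), start=1)
    ]


for jobs, pct in summarize(FOUR_STREAMS, PER_CPU):
    print("numjobs=%2d: elapsed time reduced by %6.2f%%" % (jobs, pct))
```

A negative value means the per-cpu run took longer than the 4-stream run for
that row.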

>
>
> fio stats
>
>          4 streams     8 streams     per cpu
> ===========================================================
> #jobs1
> READ: 2411.4MB/s 2430.4MB/s 2440.4MB/s
> READ: 2094.8MB/s 2002.7MB/s 2034.5MB/s
> WRITE: 141571KB/s 140334KB/s 143542KB/s
> WRITE: 712025KB/s 706111KB/s 745256KB/s
> READ: 531014KB/s 525250KB/s 537547KB/s
> WRITE: 530960KB/s 525197KB/s 537492KB/s
> READ: 473577KB/s 470320KB/s 476880KB/s
> WRITE: 473645KB/s 470387KB/s 476948KB/s
> #jobs2
> READ: 7897.2MB/s 8031.4MB/s 7968.9MB/s
> READ: 6864.9MB/s 6803.2MB/s 6903.4MB/s
> WRITE: 321386KB/s 314227KB/s 313101KB/s
> WRITE: 1275.3MB/s 1245.6MB/s 1383.5MB/s
> READ: 1035.5MB/s 1021.9MB/s 1098.4MB/s
> WRITE: 1035.6MB/s 1021.1MB/s 1098.6MB/s
> READ: 972014KB/s 952321KB/s 987.66MB/s
> WRITE: 969792KB/s 950144KB/s 985.40MB/s
> #jobs3
> READ: 13260MB/s 13260MB/s 13222MB/s
> READ: 11636MB/s 11636MB/s 11755MB/s
> WRITE: 511500KB/s 507730KB/s 504959KB/s
> WRITE: 1646.1MB/s 1673.9MB/s 1755.5MB/s
> READ: 1389.5MB/s 1387.2MB/s 1479.6MB/s
> WRITE: 1387.6MB/s 1385.3MB/s 1477.4MB/s
> READ: 1286.8MB/s 1289.1MB/s 1377.3MB/s
> WRITE: 1284.8MB/s 1287.1MB/s 1374.9MB/s
> #jobs4
> READ: 19851MB/s 20244MB/s 20344MB/s
> READ: 17732MB/s 17835MB/s 18097MB/s
> WRITE: 667776KB/s 655599KB/s 693464KB/s
> WRITE: 2041.2MB/s 2072.6MB/s 2474.1MB/s
> READ: 1770.1MB/s 1781.7MB/s 2035.5MB/s
> WRITE: 1765.8MB/s 1777.3MB/s 2030.5MB/s
> READ: 1641.6MB/s 1672.4MB/s 1892.5MB/s
> WRITE: 1643.2MB/s 1674.2MB/s 1894.4MB/s
> #jobs5
> READ: 19468MB/s 18484MB/s 18439MB/s
> READ: 17594MB/s 17757MB/s 17716MB/s
> WRITE: 843266KB/s 859627KB/s 867928KB/s
> WRITE: 1927.1MB/s 2041.8MB/s 2168.9MB/s
> READ: 1718.6MB/s 1771.7MB/s 1963.5MB/s
> WRITE: 1712.7MB/s 1765.6MB/s 1956.8MB/s
> READ: 1705.3MB/s 1663.6MB/s 1767.3MB/s
> WRITE: 1704.3MB/s 1662.6MB/s 1766.2MB/s
> #jobs6
> READ: 21583MB/s 21685MB/s 21483MB/s
> READ: 19160MB/s 18432MB/s 18618MB/s
> WRITE: 986276KB/s 1004.2MB/s 981.11MB/s
> WRITE: 2013.6MB/s 1922.5MB/s 2429.1MB/s
> READ: 1797.1MB/s 1678.9MB/s 2038.8MB/s
> WRITE: 1794.8MB/s 1675.9MB/s 2035.2MB/s
> READ: 1678.2MB/s 1632.5MB/s 1917.4MB/s
> WRITE: 1673.9MB/s 1627.6MB/s 1911.6MB/s
> #jobs7
> READ: 20697MB/s 21677MB/s 21062MB/s
> READ: 18781MB/s 18667MB/s 19338MB/s
> WRITE: 1074.6MB/s 1099.8MB/s 1105.3MB/s
> WRITE: 2100.7MB/s 2010.3MB/s 2598.7MB/s
> READ: 1783.2MB/s 1710.2MB/s 2027.8MB/s
> WRITE: 1784.3MB/s 1712.1MB/s 2029.6MB/s
> READ: 1690.8MB/s 1620.6MB/s 1893.6MB/s
> WRITE: 1681.4MB/s 1611.7MB/s 1883.7MB/s
> #jobs8
> READ: 19883MB/s 20827MB/s 20395MB/s
> READ: 18562MB/s 18178MB/s 17822MB/s
> WRITE: 1240.5MB/s 1307.3MB/s 1331.7MB/s
> WRITE: 2132.1MB/s 2143.6MB/s 2564.9MB/s
> READ: 1841.1MB/s 1831.1MB/s 2111.4MB/s
> WRITE: 1843.1MB/s 1833.1MB/s 2113.4MB/s
> READ: 1795.4MB/s 1778.6MB/s 2029.3MB/s
> WRITE: 1791.4MB/s 1774.5MB/s 2024.5MB/s
> #jobs9
> READ: 18834MB/s 19470MB/s 19402MB/s
> READ: 17988MB/s 18118MB/s 18531MB/s
> WRITE: 1339.4MB/s 1441.2MB/s 1512.6MB/s
> WRITE: 2102.4MB/s 2111.9MB/s 2478.8MB/s
> READ: 1754.5MB/s 1777.3MB/s 2050.2MB/s
> WRITE: 1753.9MB/s 1776.7MB/s 2049.5MB/s
> READ: 1686.4MB/s 1698.2MB/s 1931.6MB/s
> WRITE: 1684.1MB/s 1696.8MB/s 1929.1MB/s
> #jobs10
> READ: 19128MB/s 19517MB/s 19592MB/s
> READ: 18177MB/s 17544MB/s 18221MB/s
> WRITE: 1397.1MB/s 1567.4MB/s 1683.2MB/s
> WRITE: 2151.9MB/s 2205.1MB/s 2642.6MB/s
> READ: 1879.2MB/s 1907.3MB/s 2223.2MB/s
> WRITE: 1878.5MB/s 1906.2MB/s 2222.8MB/s
> READ: 1835.7MB/s 1837.9MB/s 2131.4MB/s
> WRITE: 1838.6MB/s 1840.8MB/s 2134.8MB/s
>
>
> perf stats
>
>                          4 streams                      8 streams                      per cpu
> ====================================================================================================================
> jobs1
> stalled-cycles-frontend 52,219,601,943 ( 55.87%) 53,406,899,652 ( 56.33%) 49,944,625,376 ( 56.27%)
> stalled-cycles-backend 23,194,739,214 ( 24.82%) 24,397,423,796 ( 25.73%) 22,782,579,660 ( 25.67%)
> instructions 86,078,512,819 ( 0.92) 86,235,354,709 ( 0.91) 80,378,845,354 ( 0.91)
> branches 15,732,850,506 ( 532.108) 15,743,473,327 ( 522.592) 14,725,420,241 ( 523.425)
> branch-misses 104,546,578 ( 0.66%) 107,847,818 ( 0.69%) 106,343,602 ( 0.72%)
> jobs2
> stalled-cycles-frontend 118,614,605,521 ( 59.74%) 113,520,838,279 ( 59.94%) 104,301,243,221 ( 59.06%)
> stalled-cycles-backend 59,490,170,824 ( 29.96%) 56,518,872,622 ( 29.84%) 50,161,702,782 ( 28.40%)
> instructions 169,663,993,572 ( 0.85) 160,959,388,344 ( 0.85) 153,541,182,646 ( 0.87)
> branches 31,859,926,551 ( 497.945) 30,132,524,256 ( 494.660) 28,579,927,064 ( 503.079)
> branch-misses 164,531,311 ( 0.52%) 163,509,596 ( 0.54%) 145,472,902 ( 0.51%)
> jobs3
> stalled-cycles-frontend 153,932,401,104 ( 60.86%) 158,470,334,291 ( 60.81%) 150,767,641,835 ( 59.21%)
> stalled-cycles-backend 77,023,824,597 ( 30.45%) 79,673,952,089 ( 30.57%) 72,693,245,174 ( 28.55%)
> instructions 197,452,119,661 ( 0.78) 204,116,060,906 ( 0.78) 207,832,729,315 ( 0.82)
> branches 36,579,918,543 ( 404.660) 37,980,582,651 ( 406.326) 39,091,715,974 ( 428.559)
> branch-misses 214,292,753 ( 0.59%) 215,861,282 ( 0.57%) 203,320,703 ( 0.52%)
> jobs4
> stalled-cycles-frontend 237,223,396,661 ( 64.22%) 227,572,336,186 ( 64.37%) 202,100,979,033 ( 61.41%)
> stalled-cycles-backend 129,935,296,918 ( 35.17%) 124,957,172,193 ( 35.34%) 103,626,575,103 ( 31.49%)
> instructions 270,083,196,348 ( 0.73) 257,652,752,109 ( 0.73) 259,773,237,031 ( 0.79)
> branches 52,120,828,566 ( 391.426) 49,121,254,042 ( 385.647) 49,896,944,076 ( 420.532)
> branch-misses 260,480,947 ( 0.50%) 254,957,745 ( 0.52%) 239,402,681 ( 0.48%)
> jobs5
> stalled-cycles-frontend 257,778,703,389 ( 64.89%) 265,688,762,182 ( 65.13%) 229,916,792,090 ( 61.41%)
> stalled-cycles-backend 142,090,098,727 ( 35.77%) 147,101,411,510 ( 36.06%) 117,081,586,471 ( 31.27%)
> instructions 291,859,438,730 ( 0.73) 298,380,653,546 ( 0.73) 302,840,047,693 ( 0.81)
> branches 55,111,567,225 ( 385.905) 56,316,470,332 ( 383.545) 57,500,842,324 ( 428.083)
> branch-misses 270,056,201 ( 0.49%) 269,400,845 ( 0.48%) 258,495,925 ( 0.45%)
> jobs6
> stalled-cycles-frontend 311,626,093,277 ( 65.61%) 314,291,595,576 ( 65.77%) 249,524,291,273 ( 61.39%)
> stalled-cycles-backend 174,358,063,361 ( 36.71%) 177,312,195,233 ( 37.10%) 126,508,172,269 ( 31.13%)
> instructions 345,271,436,105 ( 0.73) 346,679,577,246 ( 0.73) 333,258,054,473 ( 0.82)
> branches 65,298,537,641 ( 381.664) 65,995,652,812 ( 383.717) 62,730,160,550 ( 428.999)
> branch-misses 313,241,654 ( 0.48%) 307,876,772 ( 0.47%) 282,570,360 ( 0.45%)
> jobs7
> stalled-cycles-frontend 333,896,608,350 ( 64.68%) 349,165,441,969 ( 64.85%) 276,185,831,513 ( 59.95%)
> stalled-cycles-backend 186,083,638,772 ( 36.05%) 197,000,957,906 ( 36.59%) 138,835,486,733 ( 30.14%)
> instructions 388,707,023,219 ( 0.75) 404,347,465,692 ( 0.75) 394,078,203,426 ( 0.86)
> branches 71,999,476,930 ( 387.008) 76,197,698,685 ( 392.759) 73,195,649,665 ( 440.914)
> branch-misses 328,598,294 ( 0.46%) 323,895,230 ( 0.43%) 298,205,996 ( 0.41%)
> jobs8
> stalled-cycles-frontend 378,806,234,772 ( 66.73%) 369,453,970,323 ( 66.55%) 313,738,845,641 ( 62.55%)
> stalled-cycles-backend 211,732,966,238 ( 37.30%) 207,691,463,546 ( 37.41%) 161,120,924,768 ( 32.12%)
> instructions 406,674,721,912 ( 0.72) 401,922,649,599 ( 0.72) 405,830,823,213 ( 0.81)
> branches 75,637,492,422 ( 369.371) 74,287,789,757 ( 371.226) 75,967,291,039 ( 420.260)
> branch-misses 355,733,892 ( 0.47%) 328,972,387 ( 0.44%) 318,203,258 ( 0.42%)
> jobs9
> stalled-cycles-frontend 422,712,242,907 ( 66.39%) 417,293,429,710 ( 66.14%) 343,703,467,466 ( 61.35%)
> stalled-cycles-backend 239,356,726,574 ( 37.59%) 231,725,068,834 ( 36.73%) 172,101,321,805 ( 30.72%)
> instructions 465,964,470,967 ( 0.73) 468,561,486,803 ( 0.74) 474,119,504,255 ( 0.85)
> branches 86,724,291,348 ( 377.755) 86,534,438,758 ( 380.374) 88,431,722,886 ( 437.939)
> branch-misses 385,706,052 ( 0.44%) 360,946,347 ( 0.42%) 337,858,267 ( 0.38%)
> jobs10
> stalled-cycles-frontend 451,844,797,592 ( 67.24%) 435,099,070,573 ( 67.18%) 352,877,428,118 ( 62.18%)
> stalled-cycles-backend 255,533,666,521 ( 38.03%) 249,295,276,734 ( 38.49%) 179,754,582,074 ( 31.67%)
> instructions 472,331,884,636 ( 0.70) 458,948,698,965 ( 0.71) 464,131,768,633 ( 0.82)
> branches 88,848,212,769 ( 366.556) 85,330,239,413 ( 365.282) 86,837,838,069 ( 424.329)
> branch-misses 398,856,497 ( 0.45%) 359,532,394 ( 0.42%) 333,821,387 ( 0.38%)
>
>
>
> perf reported execution time
>
>                  4 streams      8 streams      per cpu
> ====================================================================
> seconds elapsed  41.359653597   43.131195776   40.961640812
> seconds elapsed  37.778174380   38.681792299   38.368529861
> seconds elapsed  38.367149768   39.368008799   37.687545579
> seconds elapsed  40.402963748   39.177529033   36.205357101
> seconds elapsed  44.145428970   43.251655348   41.810848146
> seconds elapsed  49.344988495   49.951048242   44.270045250
> seconds elapsed  53.865398777   54.271392367   48.824173559
> seconds elapsed  57.028770416   56.228105290   51.332017545
> seconds elapsed  62.931350164   61.251237873   55.977463074
> seconds elapsed  67.088285633   63.544376242   57.690998344
>
>
> -ss