Re: [External] Re: [PATCH v3] blk-throtl: Introduce sync and async queues for blk-throtl

From: hanjinke
Date: Fri Jan 06 2023 - 23:44:50 EST




On 2023/1/7 at 2:15 AM, Tejun Heo wrote:
Hello,

On Sat, Jan 07, 2023 at 02:07:38AM +0800, hanjinke wrote:
In our internal scenario, iocost has been deployed as the main io isolation
method and is gradually spreading.

Ah, glad to hear. If you don't mind sharing, how are you configuring iocost
currently? How do you derive the parameters?


For the cost.model setting, we first use the tools iocost provides to benchmark the model parameters of the different disk types we run online, and then save these parameters to a model table. During deployment, we look up and set the model parameters that match the disk type.
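
As a rough illustration of the lookup step, here is a minimal sketch. The table contents, the disk type name and the helper name are made up for the example and are not our real values; the output follows the io.cost.model line format from the cgroup v2 documentation.

```python
# Hypothetical model table: disk type -> linear cost model coefficients.
# The numbers are placeholders, not measured benchmark results.
MODEL_TABLE = {
    "example-nvme": {
        "rbps": 2_000_000_000, "rseqiops": 400_000, "rrandiops": 350_000,
        "wbps": 1_000_000_000, "wseqiops": 200_000, "wrandiops": 150_000,
    },
}

# Coefficient keys accepted by the linear model in io.cost.model.
KEYS = ("rbps", "rseqiops", "rrandiops", "wbps", "wseqiops", "wrandiops")


def cost_model_line(devno, disk_type, table=MODEL_TABLE):
    """Build the line to write to the root cgroup's io.cost.model.

    devno is the "MAJ:MIN" device number; disk_type selects a table row.
    """
    params = table[disk_type]
    fields = " ".join(f"{key}={params[key]}" for key in KEYS)
    return f"{devno} ctrl=user model=linear {fields}"


if __name__ == "__main__":
    print(cost_model_line("8:16", "example-nvme"))
```

The resulting string would then be written to io.cost.model in the root cgroup during deployment.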

The cost.qos setting needs somewhat more consideration, since we have to compromise between overall disk throughput and io latency.
The average disk utilization for a specific business and the RLA (if it is io sensitive) of key businesses are the main inputs. cost.qos is then dynamically fine-tuned based on health monitoring of the key businesses.

For the cost.weight setting, high-priority services get a larger weight so that they can handle a large burst of io requests in a short period of time. This works fine because, according to our observation, the work conservation of iocost works well.
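
A small sketch of how such a weight could be formatted, assuming it is applied through the cgroup v2 io.weight file (range 1 to 10000, default 100). The helper name and the example values are hypothetical.

```python
def io_weight_line(weight, devno=None):
    """Format a line for a cgroup's io.weight file.

    Without devno the cgroup's default weight is set; with a "MAJ:MIN"
    devno the weight is overridden for that device only.
    """
    if not 1 <= weight <= 10000:
        raise ValueError("io.weight must be in the range [1, 10000]")
    target = devno if devno is not None else "default"
    return f"{target} {weight}"


if __name__ == "__main__":
    # Hypothetical: give a high-priority service 8x the default weight.
    print(io_weight_line(800))
    print(io_weight_line(200, "8:16"))
```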

These practices can surely be improved, and I look forward to your suggestions.


blk-throttle has a lot of issues which may be difficult to address. Even the
way it's configured is pretty difficult to scale across different hardware /
application combinations and we've neglected its control performance and
behavior (like handling of shared IOs) for quite a while.

While iocost's work-conserving control does address a lot of the use cases
we see today, it's likely that we'll need hard limits more in the future
too. I've been thinking about implementing io.max on top of iocost. There
are some challenges around dynamic vrate adj semantics but it's kinda
attractive because iocost already has the concept of total device capacity.

Indeed in our multi-tenancy scenario, the hard limits are necessary.

Thanks.
Jinke