Re: [PATCH v2 00/51] block: support multipage bvec

From: Jens Axboe
Date: Mon Jun 26 2017 - 12:46:55 EST


On 06/26/2017 06:09 AM, Ming Lei wrote:
> Hi,
>
> This patchset brings multipage bvec into block layer:
>
> 1) what is multipage bvec?
>
> Multipage bvecs means that one 'struct bio_bvec' can hold
> multiple pages which are physically contiguous instead
> of one single page used in linux kernel for long time.
>
> 2) why is multipage bvec introduced?
>
> Kent proposed the idea[1] first.
>
> As system's RAM becomes much bigger than before, and
> at the same time huge page, transparent huge page and
> memory compaction are widely used, it is a bit easy now
> to see physically contiguous pages from fs in I/O.
> On the other hand, from block layer's view, it isn't
> necessary to store intermediate pages into bvec, and
> it is enough to just store the physicallly contiguous
> 'segment' in each io vector.
>
> Also huge pages are being brought to filesystem and swap
> [2][6], we can do IO on a hugepage each time[3], which
> requires that one bio can transfer at least one huge page
> one time. Turns out it isn't flexiable to change BIO_MAX_PAGES
> simply[3][5]. Multipage bvec can fit in this case very well.
>
> With multipage bvec:
>
> - segment handling in block layer can be improved much
> in future since it should be quite easy to convert
> multipage bvec into segment easily. For example, we might
> just store segment in each bvec directly in future.
>
> - bio size can be increased and it should improve some
> high-bandwidth IO case in theory[4].
>
> - Inside block layer, both bio splitting and sg map can
> become more efficient than before by just traversing the
> physically contiguous 'segment' instead of each page.
>
> - there is opportunity in future to improve memory footprint
> of bvecs.
>
> 3) how is multipage bvec implemented in this patchset?
>
> The 1st 18 patches comment on some special cases and deal with
> some special cases of direct access to bvec table.
>
> The 2nd part(19~29) implements multipage bvec in block layer:
>
> - put all tricks into bvec/bio/rq iterators, and as far as
> drivers and fs use these standard iterators, they are happy
> with multipage bvec
>
> - use multipage bvec to split bio and map sg
>
> - bio_for_each_segment_all() changes
> this helper pass pointer of each bvec directly to user, and
> it has to be changed. Two new helpers(bio_for_each_segment_all_sp()
> and bio_for_each_segment_all_mp()) are introduced.
>
> The 3rd part(30~49) convert current users of bio_for_each_segment_all()
> to bio_for_each_segment_all_sp()/bio_for_each_segment_all_mp().
>
> The last part(50~51) enables multipage bvec.
>
> These patches can be found in the following git tree:
>
> https://github.com/ming1/linux/commits/mp-bvec-1.4-v4.12-rc
>
> Thanks Christoph for looking at the early version and providing
> very good suggestions, such as: introduce bio_init_with_vec_table(),
> remove another unnecessary helpers for cleanup and so on.
>
> Any comments are welcome!

I'll take some time to review this over the next week or so. In any
case, it's a little late to stuff into 4.13 and get a decent amount
of exposure and testing on it. A 4.14 target for this would be the
way to go, imho.


--
Jens Axboe