Re: [PATCH v7 4/6] block: loop: prepare for supporing direct IO

From: Ming Lei
Date: Thu Jul 30 2015 - 04:01:47 EST

Next message: Borislav Petkov: "Re: [PATCH] x86_64/efi: Mapping Boot and Runtime EFI memory regions to different starting virtual address"
Previous message: Jiang Liu: "Re: [Patch v5 0/6] Consolidate ACPI PCI root common code into ACPI core"
In reply to: Dave Chinner: "Re: [PATCH v7 4/6] block: loop: prepare for supporing direct IO"
Next in thread: Dave Chinner: "Re: [PATCH v7 4/6] block: loop: prepare for supporing direct IO"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Wed, Jul 29, 2015 at 6:08 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Wed, Jul 29, 2015 at 07:21:47AM -0400, Ming Lei wrote:
>> On Wed, Jul 29, 2015 at 4:41 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>> > On Wed, Jul 29, 2015 at 03:33:52AM -0400, Ming Lei wrote:
>> >> On Mon, Jul 27, 2015 at 1:33 PM, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
>> >> > On Mon, Jul 27, 2015 at 05:53:33AM -0400, Ming Lei wrote:
>> >> >> Because size has to be 4k aligned too.
>> >> >
>> >> > Yes. But again I don't see any reason to limit us to a hardcoded 512
>> >> > byte block size here, especially considering the patches to finally
>> >>
>> >> From the loop block device's view, the request size can be any number
>> >> of 512-byte sectors, so the transfer size to the backing device can't
>> >> be guaranteed to be 4k aligned.
>> >
>> > In theory, yes. In practice, doesn't happen very often.
>> >
>> >> > allow enabling other block sizes from userspace.
>> >>
>> >> I have some questions about the patchset, and it looks like the
>> >> author hasn't replied yet.
>> >>
>> >> On Mon, Jul 27, 2015 at 6:06 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>> >> >> Because size has to be 4k aligned too.
>> >> >
>> >> > So check that, too. Any >= 4k block size filesystem should be doing
>> >> > mostly 4k aligned and sized IO...
>> >>
>> >> I guess you mean we only use direct IO for the 4k aligned and sized IO?
>> >> If so, that won't be efficient because the page cache has to be flushed
>> >> during the switch.
>> >
>> > It will be extremely rare for a 4k block size filesystem to do
>> > anything other than 4k aligned and sized IO. Think about it for a
>> > minute: what does the page cache do to unaligned IO patterns (i.e.
>> > buffered IO)? It does IO in page sizes, and so if the application
>> > is doing badly aligned or sized IO with buffered IO, then the
>> > underlying device will only ever see page sized and aligned IO.
>> >
>> > Hence sector aligned IO will only come from applications doing
>> > direct IO. If the application is doing direct IO and it's not
>> > properly aligned, then it already is going to get sucky performance
>> > because most filesystems serialise sub-block size direct IO because
>> > concurrent sub-block IOs to the same block usually lead to data
>> > corruption.
>>
>> The block size of a filesystem over loop can be 512, 1024 or 2048, and
>> if the sector size of the backing device is 4096, the filesystem can
>> see aligned direct IO when the IO size/offset from the application is
>> aligned with the fs block size, but loop still can't do direct IO
>> against the backing file for such requests.
>
> Sure, but again you're talking about a fairly rare configuration.
> The vast majority of filesystems use 4k block sizes, just like the
> vast majority of applications use buffered IO. Don't jump through
> hoops to optimise a case that probably doesn't need optimising. Make
> it work correctly first, then optimise performance later when
> someone has a need for it to be really fast.

OK, I will support 1024, 2048 and 4096 sector sizes in v8.
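
As an illustration of the alignment constraint discussed above, here is a
minimal sketch (not code from the patch). The helper name dio_aligned() is
hypothetical, and it assumes the backing device's logical block size is a
power of two. It only shows why a request aligned to a 512, 1024 or 2048
byte fs block size may still be unsuitable for direct IO against a backing
device with 4096-byte sectors.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration only: returns true when both the
 * byte offset and the byte length of a request are multiples of the
 * backing device's logical block size.
 *
 * Assumption: backing_bs is a non-zero power of two, so "bs - 1" can be
 * used as a bit mask for the alignment test.
 */
static bool dio_aligned(uint64_t pos, uint64_t len, uint32_t backing_bs)
{
	uint64_t mask = (uint64_t)backing_bs - 1;

	/* Any low bit set in either pos or len means misalignment. */
	return ((pos | len) & mask) == 0;
}
```

For example, with a 4096-byte backing sector size, one 512-byte fs block at
byte offset 1536 fails the check, while one 4096-byte block at byte offset
12288 passes it.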

>
>> Another case is that an application may access the loop block device
>> directly, such as with 'dd if=/dev/loopN', but that may not be common,
>> and maybe it needn't be considered.
>
> 'dd if=/dev/loopN bs=4k....'
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/