Re: [Patch v5 0/3] Introduce a driver to support host accelerated access to Microsoft Azure Blob for Azure VM

From: Greg Kroah-Hartman
Date: Fri Oct 01 2021 - 03:36:31 EST


On Thu, Sep 30, 2021 at 10:25:12PM +0000, Long Li wrote:
> > Greg,
> >
> > I apologize for the delay. I have attached the Java transport library (a tgz file)
> > in the email. The file is released for review under "The MIT License (MIT)".
> >
> > The transport library implements the functions needed for reading from a Block
> > Blob using this driver. The function for transporting I/O is
> > Java_com_azure_storage_fastpath_driver_FastpathDriver_read(), defined
> > in "./src/fastpath/jni/fpjar_endpoint.cpp".
> >
> > In particular, requestParams is in JSON format (REST) that is passed from a
> > Blob application using Blob API for reading from a Block Blob.
> >
> > For an example of how a Blob application uses the transport library, please
> > see Blob support for Hadoop ABFS:
> > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgith
> > ub.com%2Fapache%2Fhadoop%2Fpull%2F3309%2Fcommits%2Fbe7d12662e2
> > 3a13e6cf10cf1fa5e7eb109738e7d&data=04%7C01%7Clongli%40microsof
> > t.com%7C3acb68c5fd6144a1857908d97e247376%7C72f988bf86f141af91ab2d7
> > cd011db47%7C1%7C0%7C637679518802561720%7CUnknown%7CTWFpbGZsb
> > 3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0
> > %3D%7C1000&sdata=6z3ZXPtMC5OvF%2FgrtbcRdFlqzzR1xJNRxE2v2Qrx
> > FL8%3D&reserved=0

Odd url :(

> > In ABFS, the entry point for using Blob I/O is AbfsRestOperation
> > executeRead() in
> > hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsInputStream.java,
> > lines 553 to 564. This function eventually calls into
> > executeFastpathRead() in
> > hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java.
> >
> > ReadRequestParameters is the data that is passed to requestParams
> > (described above) in the transport library. In this Blob application use-case,
> > ReadRequestParameters has eTag and sessionInfo (sessionToken). They are
> > both defined in this commit, and are treated as strings passed in JSON format
> > to the I/O-issuing function
> > Java_com_azure_storage_fastpath_driver_FastpathDriver_read() in the
> > transport library using this driver.
> >
> > Thanks,
> > Long
>
> Hello Greg,
>
> I have shared the source code of the Blob client using this driver, and the reason why the Azure Blob driver is not implemented through POSIX with a file system and the block layer.

Please wrap your text lines...

Anyway, no, you showed a client for this interface, but you did not
explain why this could not be implemented using a filesystem and block
layer. Only that it is not what you did.

> Blob APIs are specified in this doc:
> https://docs.microsoft.com/en-us/rest/api/storageservices/blob-service-rest-api
>
> The semantics of reading data from Blob are specified in this doc:
> https://docs.microsoft.com/en-us/rest/api/storageservices/get-blob
>
> The source code I shared demonstrates how a Blob is read into Hadoop through ABFS. In general, a Blob client can use any optional request headers specified in the API that suit its specific application. The Azure Blob service is not designed to be POSIX compliant. I hope this answers your question on why this driver is not implemented at the file system or block layer.


Again, you are saying "it is this way because we created it this way",
which does not answer the question of "why were you required to do it
this way", right?

> Do you have more comments on this driver?

Again, please answer _why_ you are going around the block layer and
creating a new api that circumvents all of the interfaces and
protections that the normal file system layer provides. What is lacking
in the existing apis that has required you to create a new one that is
incompatible with everything that has ever existed so far?

thanks,

greg k-h