Re: [PATCH 03/17] fpga: dfl: fme: support 512bit data width PR

From: Scott Wood
Date: Mon Mar 25 2019 - 18:53:54 EST


On Mon, 2019-03-25 at 11:07 +0800, Wu Hao wrote:
> In early partial reconfiguration private feature, it only
> supports 32bit data width when writing data to hardware for
> PR. 512bit data width PR support is an important optimization
> for some specific solutions (e.g. XEON with FPGA integrated),
> it allows driver to use AVX512 instruction to improve the
> performance of partial reconfiguration. e.g. programming one
> 100MB bitstream image via this 512bit data width PR hardware
> only takes ~300ms, but 32bit revision requires ~3s per test
> result.
>
> Please note now this optimization is only done on revision 2
> of this PR private feature which is only used in integrated
> solution that AVX512 is always supported.
>
> Signed-off-by: Ananda Ravuri <ananda.ravuri@xxxxxxxxx>
> Signed-off-by: Xu Yilun <yilun.xu@xxxxxxxxx>
> Signed-off-by: Wu Hao <hao.wu@xxxxxxxxx>
> ---
> drivers/fpga/dfl-fme-main.c | 3 ++
> drivers/fpga/dfl-fme-mgr.c | 75 +++++++++++++++++++++++++++++++++++++--------
> drivers/fpga/dfl-fme-pr.c | 45 ++++++++++++++++-----------
> drivers/fpga/dfl-fme.h | 2 ++
> drivers/fpga/dfl.h | 5 +++
> 5 files changed, 99 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
> index 086ad24..076d74f 100644
> --- a/drivers/fpga/dfl-fme-main.c
> +++ b/drivers/fpga/dfl-fme-main.c
> @@ -21,6 +21,8 @@
> #include "dfl.h"
> #include "dfl-fme.h"
>
> +#define DRV_VERSION "0.8"

What is this going to be used for? Under what circumstances will the
driver version be bumped? What does it have to do with 512-bit writes?

> +#if defined(CONFIG_X86) && defined(CONFIG_AS_AVX512)
> +
> +#include <asm/fpu/api.h>
> +
> +static inline void copy512(void *src, void __iomem *dst)
> +{
> + kernel_fpu_begin();
> +
> + asm volatile("vmovdqu64 (%0), %%zmm0;"
> + "vmovntdq %%zmm0, (%1);"
> + :
> + : "r"(src), "r"(dst));
> +
> + kernel_fpu_end();
> +}

Shouldn't there be some sort of check that AVX512 is actually supported
on the running system?

Also, src should be const, and the asm statement should have a memory
clobber.

> +#else
> +static inline void copy512(void *src, void __iomem *dst)
> +{
> + WARN_ON_ONCE(1);
> +}
> +#endif

Likewise, this stub will be called if a revision 2 device is used on
non-x86 (or on x86 built with an old binutils), and it only warns
without writing anything. The driver should fall back to 32-bit in such
cases.
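
A rough, untested sketch of the fallback, written as plain C so it can be
compiled outside the kernel. The function and enum names are hypothetical,
and the two bool parameters are stand-ins: have_copy512 for "the real
copy512() was built in" (CONFIG_X86 && CONFIG_AS_AVX512 in the patch), and
cpu_avx512 for a runtime check, which in the kernel would presumably be
something like boot_cpu_has(X86_FEATURE_AVX512F). It also assumes that
revision 2 hardware still accepts 32-bit writes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; widths are in bytes, like priv->pr_datawidth. */
enum { PR_WIDTH_32BIT = 4, PR_WIDTH_512BIT = 64 };

static_assert(PR_WIDTH_512BIT * 8 == 512, "512-bit width is 64 bytes");

/*
 * Pick the PR data width once, at init time.
 *
 * revision:     PR private feature revision read from the hardware
 * have_copy512: assumption, models the build-time AVX512 assembler check
 * cpu_avx512:   assumption, models a runtime CPU feature check
 *
 * Anything short of "revision 2 and AVX512 usable" falls back to the
 * 32-bit path instead of reaching a stub that can only warn.
 */
static unsigned int pr_pick_datawidth(unsigned int revision,
				      bool have_copy512, bool cpu_avx512)
{
	if (revision == 2 && have_copy512 && cpu_avx512)
		return PR_WIDTH_512BIT;

	return PR_WIDTH_32BIT;
}
```

With the width decided up front like this, the #else stub for copy512()
would never be reached and could be dropped.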

> @@ -200,21 +228,32 @@ static int fme_mgr_write(struct fpga_manager *mgr,
> pr_credit = FIELD_GET(FME_PR_STS_PR_CREDIT,
> pr_status);
> }
>
> - if (count < 4) {
> + if (count < priv->pr_datawidth) {
> dev_err(dev, "Invalid PR bitstream size\n");
> return -EINVAL;

Shouldn't this have become a WARN_ON in patch 2 given that the kernel
already pads the buffer?

> }
>
> - pr_data = 0;
> - pr_data |= FIELD_PREP(FME_PR_DATA_PR_DATA_RAW,
> - *(((u32 *)buf) + i));
> - writeq(pr_data, fme_pr + FME_PR_DATA);
> - count -= 4;
> + switch (priv->pr_datawidth) {
> + case 4:
> + pr_data = 0;
> + pr_data |= FIELD_PREP(FME_PR_DATA_PR_DATA_RAW,
> + *((u32 *)buf));

I know it's not new, but why not just "pr_data = FIELD_PREP(...)"? The
cast should also preserve const, and you can drop one set of parentheses.

> + writeq(pr_data, fme_pr + FME_PR_DATA);
> + break;
> + case 64:
> + copy512((void *)buf, fme_pr + FME_PR_512_DATA);
> + break;

Once copy512() takes a const src, this cast is unnecessary.

> + default:
> + ret = -EFAULT;
> + goto done;

How is this EFAULT? pr_datawidth is set by kernel code, so any other
value is a driver bug and should be a WARN_ON.
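
To make the comments on this switch concrete, here is an untested
userspace model of it. Everything in it is a stand-in rather than the
driver's real code: struct fake_pr_regs replaces the MMIO registers,
copy512_model() replaces copy512() with a plain 64-byte memcpy, and
PR_DATA_RAW_MASK assumes the raw data field is the low 32 bits. The
choice of -EINVAL for the default case is also just an illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Assumed for illustration: raw PR data occupies the low 32 bits. */
#define PR_DATA_RAW_MASK 0xffffffffULL

/* Hypothetical stand-in for the device's PR data registers. */
struct fake_pr_regs {
	uint64_t data;		/* models FME_PR_DATA */
	uint8_t data512[64];	/* models FME_PR_512_DATA */
};

/* Models copy512(): src is const, so callers need no cast. */
static void copy512_model(const void *src, void *dst)
{
	memcpy(dst, src, 64);
}

/* Write one chunk of `width` bytes from buf; returns 0 or -EINVAL. */
static int pr_write_chunk(struct fake_pr_regs *regs, const void *buf,
			  unsigned int width)
{
	assert(regs != NULL && buf != NULL);

	switch (width) {
	case 4:
		/* Direct assignment; the cast keeps the const qualifier. */
		regs->data = *(const uint32_t *)buf & PR_DATA_RAW_MASK;
		break;
	case 64:
		copy512_model(buf, regs->data512);
		break;
	default:
		/* The driver sets width itself, so kernel code would WARN here. */
		return -EINVAL;
	}

	return 0;
}
```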

> @@ -159,13 +161,10 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
> fpga_bridges_put(&region->bridge_list);
>
> put_device(&region->dev);
> -unlock_exit:
> - mutex_unlock(&pdata->lock);
> free_exit:
> vfree(buf);
> - if (copy_to_user((void __user *)arg, &port_pr, minsz))
> - return -EFAULT;
> -

Why is the copy_to_user being removed?

-Scott