Re: [PATCH v12 0/6] powerpc/papr_scm: Add support for reporting nvdimm health

From: Vaibhav Jain
Date: Mon Jun 15 2020 - 08:48:54 EST


This was accidentally reposted. Please ignore.

The v13 version of the patch series is located at
https://lore.kernel.org/linux-nvdimm/20200615124407.32596-1-vaibhav@xxxxxxxxxxxxx


Vaibhav Jain <vaibhav@xxxxxxxxxxxxx> writes:

> Changes since v11 [1]:
> * Minor update to 'papr_pdsm.h' fixing a misleading comment about
> 'possible' padding being added by GCC which doesn't apply in case
> structs are marked as __packed.
> * Fix the order of initialization of 'struct nd_papr_pdsm_health' in
> papr_pdsm_health().
> * Added acks from Ira for various patches.
>
> [1] https://lore.kernel.org/linux-nvdimm/20200607131339.476036-1-vaibhav@xxxxxxxxxxxxx
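
For readers unsure about the padding remark in the first changelog item, here is a minimal sketch of the point. The two structs are hypothetical and are not the ones defined in 'papr_pdsm.h'; they only show that GCC drops inter-member padding once a struct carries the packed attribute (which is what the kernel's __packed expands to).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts for illustration; NOT the structs from papr_pdsm.h. */
struct demo_natural {
	uint8_t  flag;
	uint32_t value;		/* usually preceded by 3 padding bytes */
};

struct demo_packed {
	uint8_t  flag;
	uint32_t value;		/* no padding: starts at offset 1 */
} __attribute__((packed));

/* With the packed attribute the size is just the sum of the members. */
static_assert(sizeof(struct demo_packed) == 5, "packed struct has no padding");
static_assert(offsetof(struct demo_packed, value) == 1, "value follows flag");

/* Small helpers so the two sizes can be compared at run time. */
size_t demo_natural_size(void) { return sizeof(struct demo_natural); }
size_t demo_packed_size(void)  { return sizeof(struct demo_packed); }
```

So a comment warning about "possible" compiler padding would be misleading for a struct declared this way.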
> ---
>
> The PAPR standard[2][4] provides mechanisms to query the health and
> performance stats of an NVDIMM via various hcalls as described in
> Ref[3]. Until now these stats were neither available nor exposed to
> user-space tools like 'ndctl'. This is partly due to the PAPR platform
> not having support for ACPI and NFIT. Hence 'ndctl' is unable to query
> and report the dimm health status, and a user has no way to determine
> the current health status of an NVDIMM.
>
> To overcome this limitation, this patch-set updates the papr_scm kernel
> module to query and fetch NVDIMM health stats using hcalls described
> in Ref[3]. These health and performance stats are then exposed to
> userspace via sysfs and PAPR-NVDIMM-Specific-Methods(PDSM) issued by
> libndctl.
>
> These changes, coupled with the proposed ndctl changes located at
> Ref[5], should provide a way for the user to retrieve NVDIMM health
> status using ndctl.
>
> Below is a sample output using the proposed kernel + ndctl for a PAPR
> NVDIMM in an emulation environment:
>
>  # ndctl list -DH
>  [
>    {
>      "dev":"nmem0",
>      "health":{
>        "health_state":"fatal",
>        "shutdown_state":"dirty"
>      }
>    }
>  ]
>
> Dimm health report output on a pseries guest lpar with vPMEM or HMS
> based NVDIMMs that are in perfectly healthy condition:
>
>  # ndctl list -d nmem0 -H
>  [
>    {
>      "dev":"nmem0",
>      "health":{
>        "health_state":"ok",
>        "shutdown_state":"clean"
>      }
>    }
>  ]
>
> PAPR NVDIMM-Specific-Methods(PDSM)
> ==================================
>
> PDSM requests are issued by vendor specific code in libndctl to
> execute certain operations or fetch information from NVDIMMs. PDSM
> requests can be sent to the papr_scm module via libndctl (userspace)
> and libnvdimm (kernel) using the ND_CMD_CALL ioctl command, which is
> handled in the dimm control function papr_scm_ndctl(). The current
> patchset proposes a single PDSM to retrieve NVDIMM health, defined in
> the newly introduced uapi header named 'papr_pdsm.h'. Support for
> more PDSMs will be added in the future.
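
To make the request flow above a bit more concrete, here is a rough, self-contained sketch of how a dimm control function might service such a call: check that the envelope targets the expected family, switch on the method number, and copy the result into the payload. Every name in it (demo_pkg, demo_health, DEMO_FAMILY_PAPR, DEMO_PDSM_HEALTH, demo_service_call) and every numeric value and error code is an assumption made for illustration; none of it is taken from 'papr_pdsm.h', include/uapi/linux/ndctl.h or papr_scm.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Illustrative stand-ins only; not the real uapi definitions. */
#define DEMO_FAMILY_PAPR	5u	/* assumed family id */
#define DEMO_PDSM_HEALTH	1u	/* assumed method number */

/* Simplified envelope, loosely modelled on an ND_CMD_CALL package. */
struct demo_pkg {
	uint64_t family;	/* vendor family the request targets */
	uint64_t command;	/* family-specific method number */
	uint32_t size_in;	/* bytes of input payload */
	uint32_t size_out;	/* bytes of output the caller accepts */
	uint8_t  payload[64];
};

/* Hypothetical health payload returned to the caller. */
struct demo_health {
	uint8_t health_state;
	uint8_t dirty_shutdown;
};

/* Validate the envelope, then service the one supported method. */
int demo_service_call(struct demo_pkg *pkg, const struct demo_health *cur)
{
	if (pkg->family != DEMO_FAMILY_PAPR)
		return -EINVAL;		/* request meant for another family */

	switch (pkg->command) {
	case DEMO_PDSM_HEALTH:
		if (pkg->size_out < sizeof(*cur))
			return -ENOSPC;	/* caller's buffer is too small */
		memcpy(pkg->payload, cur, sizeof(*cur));
		return 0;
	default:
		return -EINVAL;		/* unknown method */
	}
}
```

New methods would then be added as further cases in the switch, which matches the cover letter's plan to introduce more PDSMs later.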
>
> Structure of the patch-set
> ==========================
>
> The patch-set starts with a doc patch documenting details of hcall
> H_SCM_HEALTH. The second patch exports the kernel symbol
> seq_buf_printf(), which is used in subsequent patches to generate
> sysfs attribute content.
>
> The third patch implements support for fetching NVDIMM health
> information from PHYP and partially exposing it to user-space via an
> NVDIMM sysfs flag.
>
> The fourth patch updates papr_scm_ndctl() to handle a possible error
> case and also improves debug logging.
>
> The fifth patch implements support for servicing PDSM commands in the
> papr_scm module.
>
> Finally, the last patch implements support for servicing the PDSM
> 'PAPR_PDSM_HEALTH', which returns the NVDIMM health information to
> libndctl.
>
> References:
> [2] "Power Architecture Platform Reference"
> https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference
> [3] commit 58b278f568f0
> ("powerpc: Provide initial documentation for PAPR hcalls")
> [4] "Linux on Power Architecture Platform Reference"
> https://members.openpowerfoundation.org/document/dl/469
> [5] https://github.com/vaibhav92/ndctl/tree/papr_scm_health_v12
>
> ---
>
> Vaibhav Jain (6):
> powerpc: Document details on H_SCM_HEALTH hcall
> seq_buf: Export seq_buf_printf
> powerpc/papr_scm: Fetch nvdimm health information from PHYP
> powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()
> ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
> powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH
>
> Documentation/ABI/testing/sysfs-bus-papr-pmem | 27 ++
> Documentation/powerpc/papr_hcalls.rst | 46 ++-
> arch/powerpc/include/uapi/asm/papr_pdsm.h | 125 ++++++
> arch/powerpc/platforms/pseries/papr_scm.c | 373 +++++++++++++++++-
> include/uapi/linux/ndctl.h | 1 +
> lib/seq_buf.c | 1 +
> 6 files changed, 562 insertions(+), 11 deletions(-)
> create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem
> create mode 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h
>
> --
> 2.26.2
>

--
Cheers
~ Vaibhav