Re: [RFC PATCH 3/9] Documentation/scrub-configure.rst: Add documentation for scrub driver

From: Jonathan Cameron
Date: Mon Sep 18 2023 - 08:30:03 EST


On Mon, 18 Sep 2023 14:15:33 +0200
David Hildenbrand <david@xxxxxxxxxx> wrote:

> On 18.09.23 12:25, Shiju Jose wrote:
> > Hi David,
> >
> > Thanks for looking into this.
> >
> >> -----Original Message-----
> >> From: David Hildenbrand <david@xxxxxxxxxx>
> >> Sent: 18 September 2023 08:24
> >> To: Shiju Jose <shiju.jose@xxxxxxxxxx>; linux-acpi@xxxxxxxxxxxxxxx; linux-
> >> mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> >> Cc: rafael@xxxxxxxxxx; lenb@xxxxxxxxxx; naoya.horiguchi@xxxxxxx;
> >> tony.luck@xxxxxxxxx; james.morse@xxxxxxx; dave.hansen@xxxxxxxxxxxxxxx;
> >> jiaqiyan@xxxxxxxxxx; jthoughton@xxxxxxxxxx; somasundaram.a@xxxxxxx;
> >> erdemaktas@xxxxxxxxxx; pgonda@xxxxxxxxxx; rientjes@xxxxxxxxxx;
> >> duenwen@xxxxxxxxxx; Vilas.Sridharan@xxxxxxx; mike.malvestuto@xxxxxxxxx;
> >> gthelen@xxxxxxxxxx; Linuxarm <linuxarm@xxxxxxxxxx>; Jonathan Cameron
> >> <jonathan.cameron@xxxxxxxxxx>; tanxiaofei <tanxiaofei@xxxxxxxxxx>;
> >> Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>
> >> Subject: Re: [RFC PATCH 3/9] Documentation/scrub-configure.rst: Add
> >> documentation for scrub driver
> >>
> >> On 15.09.23 19:28, shiju.jose@xxxxxxxxxx wrote:
> >>> From: Shiju Jose <shiju.jose@xxxxxxxxxx>
> >>>
> >>> Add documentation for scrub driver, supports configure scrub
> >>> parameters, in Documentation/scrub-configure.rst
> >>>
> >>> Signed-off-by: Shiju Jose <shiju.jose@xxxxxxxxxx>
> >>> ---
> >>> Documentation/scrub-configure.rst | 55
> >> +++++++++++++++++++++++++++++++
> >>> 1 file changed, 55 insertions(+)
> >>> create mode 100644 Documentation/scrub-configure.rst
> >>>
> >>> diff --git a/Documentation/scrub-configure.rst
> >>> b/Documentation/scrub-configure.rst
> >>> new file mode 100644
> >>> index 000000000000..9f8581b88788
> >>> --- /dev/null
> >>> +++ b/Documentation/scrub-configure.rst
> >>> @@ -0,0 +1,55 @@
> >>> +==========================
> >>> +Scrub subsystem driver
> >>> +==========================
> >>> +
> >>> +Copyright (c) 2023 HiSilicon Limited.
> >>> +
> >>> +:Author: Shiju Jose <shiju.jose@xxxxxxxxxx>
> >>> +:License: The GNU Free Documentation License, Version 1.2
> >>> + (dual licensed under the GPL v2) :Original Reviewers:
> >>> +
> >>> +- Written for: 6.7
> >>> +- Updated for:
> >>> +
> >>> +Introduction
> >>> +------------
> >>> +The scrub subsystem driver provides the interface for configure the
> >>
> >> "... interface for configuring memory scrubbers in the system."
> >>
> >> are we only configuring firmware/hw-based memory scrubbing? I assume so.
> > The scrub control could be used for the SW based memory scrubbing too.
>
> Okay, looks like there is not too much hw/firmware specific in there
> (besides these weird range changes).
> [...]
>
> >>> +-------
> >>> +
> >>> + The usage takes the form shown in this example::
> >>> +
> >>> + # echo 0x300000 > /sys/class/scrub/scrub0/region0/addr_base
> >>> + # echo 0x100000 > /sys/class/scrub/scrub0/region0/addr_size
> >>> + # cat /sys/class/scrub/scrub0/region0/speed_available
> >>> + # 1-60
> >>> + # echo 25 > /sys/class/scrub/scrub0/region0/speed
> >>> + # echo 1 > /sys/class/scrub/scrub0/region0/enable
> >>> +
> >>> + # cat /sys/class/scrub/scrub0/region0/speed
> >>> + # 0x19
> >>
> >> Is it reasonable to return the speed as hex? You set it as dec.
> > Presently return speed as hex to reduce the number of callback function needed
> > for reading the hex/dec data because the values for the address range
> > need to be in hex.
>
> If speed_available returns dec, speed better also return dec IMHO.
>
> >
> >>
> >>> + # cat /sys/class/scrub/scrub0/region0/addr_base
> >>> + # 0x100000
> >>
> >> But didn't we set it to 0x300000 ...
> > This is an emulated example for testing the RASF/RAS2 definition.
> > According to the RASF & RAS2 definition, the actual address range in the
> > platform could vary from the requested address range for the patrol scrubbing.
> > "The platform calculates the nearest patrol scrub boundary address
> > from where it can start". The platform returns the actual address range
> > in response to GET_PATROL_PARAMETERS command to the firmware.
> > Please see section 5.2.21.2.1 Hardware-based Memory Scrubbing ,
> > Table 5.87: Parameter Block Structure for PATROL_SCRUB in the
> > ACPI 6.5 specification.
> >
>
> So you configure [0x300000 - 0x400000] and you get [0x100000 - 0x300000]
>
> How does that make any sense? :)
>
> Shouldn't we rather return an error when setting a range that is
> impossible, instead of the hardware deciding to scrub something
> completely different (as can be seen in the example)?
>

A broader scrub is probably reasonable, but agreed that scrubbing narrower
is 'interesting' as not scrubbing the memory requeseted.
It's really annoying that neither ACPI table provides any proper
discoverability. Whilst we can fix that long term, we are stuck with
a clunky poke it and see interface in the meantime.

Jonathan