Re: [PATCH 0/3] Fix dt-validate issues on qemu dtbdumps due to dt-bindings

From: Andrew Jones
Date: Tue Aug 16 2022 - 10:06:43 EST


On Mon, Aug 15, 2022 at 07:18:02PM +0000, Conor.Dooley@xxxxxxxxxxxxx wrote:
> Any takers on trashing my regex? Otherwise I'll just submit
> a v2 with the regex and it can be shat on there instead :)
>
> On 09/08/2022 19:36, Conor Dooley wrote:
> > On 09/08/2022 15:14, Rob Herring wrote:
> >> On Mon, Aug 08, 2022 at 10:01:11PM +0000, Conor.Dooley@xxxxxxxxxxxxx wrote:
> >>> On 08/08/2022 22:34, Jessica Clarke wrote:
> >>>> On Fri, Aug 05, 2022 at 05:28:42PM +0100, Conor Dooley wrote:
> >>>>> From: Conor Dooley <conor.dooley@xxxxxxxxxxxxx>
> >>>>> The final patch adds some new ISA strings
> >>>>> which need scrutiny from someone with more knowledge about what ISA
> >>>>> extension strings should be reported in a dt than I have.
> >>>>
> >>>> Listing every possible ISA string supported by the Linux kernel really
> >>>> is not going to scale...
> >>
> >> How does the kernel scale? (No need to answer)
> >>
> >>> Yeah, totally correct there. Case for adding a regex I suppose, but I
> >>> am not sure how to go about handling the multi-letter extensions or
> >>> if parsing them is required from a binding compliance point of view.
> >>> Hoping for some input from Palmer really.
> >>
> >> Yeah, looks like a regex pattern is needed.
> >
> > I started pottering away at this but I have arrived at:
> > rv64imaf?d?c?h?(_z[imafdqcbvkh]([a-z])*)*$

Don't forget the ^ at the start.

Do we need to worry about optional major and minor version numbers?
Or check that Z names have at least one character following the category
character? Actually, the first letter after Z being a category is only a
convention. Maybe we don't want to enforce that. What about X extensions?
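
A rough sketch of the points above, assuming Python's re.search() is a fair
stand-in for how an unanchored schema "pattern" gets matched. The accepts()
helper and the sample strings are made up for illustration; they are not
taken from any real dtb:

```python
import re

# Pattern as proposed in the thread (note: no leading ^).
PROPOSED = r"rv64imaf?d?q?c?b?v?k?h?(_z[imafdqcbvkh]([a-z])*)*$"
# Same pattern with the leading anchor added.
ANCHORED = "^" + PROPOSED


def accepts(pattern, isa):
    """Return True if pattern matches anywhere in isa (unanchored search)."""
    return re.search(pattern, isa) is not None


samples = [
    "rv64imafdc",                 # plain single-letter string
    "junkrv64imac",               # leading garbage, only caught with ^
    "rv64imac_zi",                # Z name with nothing after the category
    "rv64imafdc_zicsr_zifencei",  # typical multi-letter extensions
    "rv64imac2p0",                # version number suffix
]

for isa in samples:
    print(f"{isa:28} proposed={accepts(PROPOSED, isa)} "
          f"anchored={accepts(ANCHORED, isa)}")
```

With the pattern as posted, the string with leading garbage is accepted and
the bare "_zi" is accepted with or without the anchor, while the versioned
string is rejected either way. Whether the last two behaviours are wanted is
exactly the open question.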

Thanks,
drew

> >
> > I suspect that before "h?" there should be more single letter
> > extensions added for completeness' sake. So then it'd bloat out to:
> > rv64imaf?d?q?c?b?v?k?h?(_z[imafdqcbvkh]([a-z])*)*$
> >
> > I checked a couple different "bad" isa strings against it and
> > nothing went up in flames but my regex skills are far from great
> > so I'm sure there's better ways to represent this.
> >
> > Anyways, this pattern is based on my understanding that:
> > - the single letter order is fixed & we don't care about things that
> > can't even do "ima"
> > - the multi letter extensions are all in a "_z<foo>" format where the
> > first letter of <foo> is a valid single letter extension
> > - we don't care about the e extension from an OS PoV (this could be a
> > very flawed take...)
> > - after the first two chars, the extension name could be an English
> > word (ifencei anyone?) so it's not worth restricting the charset
> > - that attempting to validate the contents of the multiletter extensions
> > with dt-validate beyond the formatting is a futile, massively verbose
> > or unwieldy exercise at best
> >
> > Some or all of those assumptions could be very very wrong so if {someone,
> > anyone} wants to correct me - feel ***more*** than free..
> >
> > Thanks,
> > Conor.
> >
> > patch would then look like:
> >
> > diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml
> > index d632ac76532e..1e54e7746190 100644
> > --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
> > +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
> > @@ -74,9 +74,7 @@ properties:
> > insensitive, letters in the riscv,isa string must be all
> > lowercase to simplify parsing.
> > $ref: "/schemas/types.yaml#/definitions/string"
> > - enum:
> > - - rv64imac
> > - - rv64imafdc
> > + pattern: rv64imaf?d?q?c?b?v?k?h?(_z[imafdqcbvkh]([a-z])*)*$
> >
> > # RISC-V requires 'timebase-frequency' in /cpus, so disallow it here
> > timebase-frequency: false
>