Re: [PATCH v2 00/12] PolarFire SoC reset controller & clock cleanups

From: Conor.Dooley
Date: Wed Aug 10 2022 - 15:43:51 EST


On 10/08/2022 20:32, Nathan Chancellor wrote:
> On Wed, Aug 10, 2022 at 07:20:24PM +0000, Conor.Dooley@xxxxxxxxxxxxx wrote:
>> On 10/08/2022 19:56, Nathan Chancellor wrote:
>>> Hi Conor,
>>>
>>> On Tue, Aug 09, 2022 at 11:05:32PM +0000, Conor.Dooley@xxxxxxxxxxxxx wrote:
>>>> +CC clang people :)
>>>>
>>>> Got an odd one here and would appreciate some pointers for where to
>>>> look. This code when built with gcc boots fine, for example with:
>>>> riscv64-unknown-linux-gnu-gcc (g5964b5cd727) 11.1.0
>>>> The same code but build with clang build it fails to boot but prior to
>>>> that applying this patchset it boots fine. Specifically it is the patch
>>>> "clk: microchip: mpfs: move id & offset out of clock structs"
>>>>
>>>> I applied this patchset on top of tonight's master (15205c2829ca) but
>>>> I've been seeing the same problem for a few weeks on -next too. I tried
>>>> the following 2 versions of clang/llvm:
>>>> ClangBuiltLinux clang version 15.0.0 (5b0788fef86ed7008a11f6ee19b9d86d42b6fcfa), LLD 15.0.0
>>>> ClangBuiltLinux clang version 15.0.0 (bab8af8ea062f6332b5c5d13ae688bb8900f244a), LLD 15.0.0
>>>
>>> Good to know that it reproduces with fairly recent versions of LLVM :)
>>>
>>>> It's probably something silly that I've overlooked but I am not au
>>>> fait with these sort of things unfortunately, but hey - at least I'll
>>>> learn something then.
>>>
>>> I took a quick glance at the patch you mentioned above and I don't
>>> immediately see anything as problematic...
>>
>> Yeah, I couldn't see any low hanging fruit either.
>>
>>> I was going to see if I could
>>> reproduce this locally in QEMU since I do see there is a machine
>>> 'microchip-icicle-kit' but I am not having much success getting the
>>> machine past SBI. Does this reproduce in QEMU or are you working with
>>> the real hardware? If QEMU, do you happen to have a working invocation
>>> handy?
>>
>> Yeah... So there was a QEMU incantation that worked at some point in
>> the past (ie when someone wrote the QEMU port) but most peripherals
>> are not implemented and current versions of our openSBI implementation
>> requires more than one of the unimplemented peripherals. I was trying to
>> get it working lately in the evenings based on some patches that were a
>> year old but no joy :/
>
> Heh, I guess that would explain why it wasn't working for me :)
>
>> I'm running on the real hardware, I'll give the older combo of qemu
>> "bios" etc a go again over the weekend & try to get it working. In the
>> meantime, any suggestions?
>
> Are you building with 'LLVM=1' or just 'CC=clang'? If 'LLVM=1', I would
> try breaking it apart into the individual options (LD=ld.lld,
> OBJCOPY=llvm-objcopy) and see if dropping one of those makes a
> difference. We have had subtle differences between the GNU and LLVM
> tools before and it is much easier to look into that difference if we
> know it happens in only one tool.

LLVM=1.

>
> Otherwise, I am not sure I have any immediate ideas other than looking
> at the disassembly and trying to see if something is going wrong. Is
> the object file being modified in any other way (I don't think there is
> something like objtool for RISC-V but I could be wrong)?

I'll give the options a go so, I'll LYK how I get on.
Thanks,
Conor.

>
> Cheers,
> Nathan
>
>>>> The boot log is fairly short so here ya go:
>>>>
>>>> [ 0.000000] Linux version 5.19.0-13253-g374b508ee318 (conor@spud) (ClangBuiltLinux clang version 15.0.0 (git@g
>>>> ithub.com:llvm/llvm-project.git 5b0788fef86ed7008a11f6ee19b9d86d42b6fcfa), LLD 15.0.0) #1 SMP Tue Aug 9 22:42:10 IST
>>>> 2022
>>>> [ 0.000000] OF: fdt: Ignoring memory range 0x80000000 - 0x80200000
>>>> [ 0.000000] Machine model: Microchip PolarFire-SoC Icicle Kit
>>>> [ 0.000000] earlycon: ns16550a0 at MMIO32 0x0000000020100000 (options '115200n8')
>>>> [ 0.000000] printk: bootconsole [ns16550a0] enabled
>>>> [ 0.000000] efi: UEFI not found.
>>>> [ 0.000000] Zone ranges:
>>>> [ 0.000000] DMA32 [mem 0x0000000080200000-0x00000000ffffffff]
>>>> [ 0.000000] Normal [mem 0x0000000100000000-0x000000103fffffff]
>>>> [ 0.000000] Movable zone start for each node
>>>> [ 0.000000] Early memory node ranges
>>>> [ 0.000000] node 0: [mem 0x0000000080200000-0x00000000adffffff]
>>>> [ 0.000000] node 0: [mem 0x0000001000000000-0x000000103fffffff]
>>>> [ 0.000000] Initmem setup node 0 [mem 0x0000000080200000-0x000000103fffffff]
>>>> [ 0.000000] On node 0, zone Normal: 16064512 pages in unavailable ranges
>>>> [ 0.000000] SBI specification v0.3 detected
>>>> [ 0.000000] SBI implementation ID=0x1 Version=0x9
>>>> [ 0.000000] SBI TIME extension detected
>>>> [ 0.000000] SBI IPI extension detected
>>>> [ 0.000000] SBI RFENCE extension detected
>>>> [ 0.000000] SBI HSM extension detected
>>>> [ 0.000000] CPU with hartid=0 is not available
>>>> [ 0.000000] CPU with hartid=0 is not available
>>>> [ 0.000000] riscv: base ISA extensions acdfim
>>>> [ 0.000000] riscv: ELF capabilities acdfim
>>>> [ 0.000000] percpu: Embedded 18 pages/cpu s34168 r8192 d31368 u73728
>>>> [ 0.000000] pcpu-alloc: s34168 r8192 d31368 u73728 alloc=18*4096
>>>> [ 0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3
>>>> [ 0.000000] CPU node for /cpus/cpu@0 exist but the possible cpu range is :0-3
>>>> [ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 224263
>>>> [ 0.000000] Kernel command line: root=/dev/nfs ip=dhcp debug nfsroot=192.168.2.5:/stuff/nfs_share earlycon
>>>> [ 0.000000] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
>>>> [ 0.000000] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes, linear)
>>>> [ 0.000000] mem auto-init: stack:all(zero), heap alloc:off, heap free:off
>>>> [ 0.000000] software IO TLB: area num 4.
>>>> [ 0.000000] software IO TLB: mapped [mem 0x00000000aa000000-0x00000000ae000000] (64MB)
>>>> [ 0.000000] Virtual kernel memory layout:
>>>> [ 0.000000] fixmap : 0xffffffc6fee00000 - 0xffffffc6ff000000 (2048 kB)
>>>> [ 0.000000] pci io : 0xffffffc6ff000000 - 0xffffffc700000000 ( 16 MB)
>>>> [ 0.000000] vmemmap : 0xffffffc700000000 - 0xffffffc800000000 (4096 MB)
>>>> [ 0.000000] vmalloc : 0xffffffc800000000 - 0xffffffd800000000 ( 64 GB)
>>>> [ 0.000000] lowmem : 0xffffffd800000000 - 0xffffffe7bfe00000 ( 62 GB)
>>>> [ 0.000000] kernel : 0xffffffff80000000 - 0xffffffffffffffff (2047 MB)
>>>> [ 0.000000] Memory: 803032K/1800192K available (7485K kernel code, 2858K rwdata, 4096K rodata, 2181K init, 394K bss, 997160K reserved, 0K cma-reserved)
>>>> [ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
>>>> [ 0.000000] rcu: Hierarchical RCU implementation.
>>>> [ 0.000000] rcu: RCU restricting CPUs from NR_CPUS=8 to nr_cpu_ids=4.
>>>> [ 0.000000] rcu: RCU debug extended QS entry/exit.
>>>> [ 0.000000] Tracing variant of Tasks RCU enabled.
>>>> [ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
>>>> [ 0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
>>>> [ 0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
>>>> [ 0.000000] CPU with hartid=0 is not available
>>>> [ 0.000000] riscv-intc: unable to find hart id for /cpus/cpu@0/interrupt-controller
>>>> [ 0.000000] riscv-intc: 64 local interrupts mapped
>>>> [ 0.000000] plic: interrupt-controller@c000000: mapped 186 interrupts with 4 handlers for 9 contexts.
>>>> [ 0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
>>>> [ 0.000000] riscv_timer_init_dt: Registering clocksource cpuid [0] hartid [4]
>>>> [ 0.000000] clocksource: riscv_clocksource: mask: 0xffffffffffffffff max_cycles: 0x1d854df40, max_idle_ns: 3526361616960 ns
>>>> [ 0.000003] sched_clock: 64 bits at 1000kHz, resolution 1000ns, wraps every 2199023255500ns
>>>> [ 0.009713] Console: colour dummy device 80x25
>>>> [ 0.014676] printk: console [tty0] enabled
>>>> [ 0.019217] printk: bootconsole [ns16550a0] disabled
>>>>
>>>> FWIW this is right about when the clock driver gets loaded
>>>> as you might imagine.
>>>>
>>>> My config is here, but it's been reproduced on a few different
>>>> defconfigs:
>>>> https://raw.githubusercontent.com/ConchuOD/polarfire-soc-buildroot-sdk/dev/conf/lowmem/defconfig
>>>>
>>>> On 04/07/2022 13:15, Conor Dooley wrote:
>>>>> Hey all,
>>>>> I know I have not sat on the RFC I sent about the aux. bus parts
>>>>> for too long, but figured I'd just send the whole thing anyway to all
>>>>> lists etc.
>>>>>
>>>>> Kinda two things happening in this series, but I sent it together to
>>>>> ensure the second part would apply correctly.
>>>>>
>>>>> The first is the reset controller that I promised after discovering the
>>>>> issue triggered by CONFIG_PM & the phy not coming up correctly. I have
>>>>> now removed all the messing with resets from clock enable/disable
>>>>> functions & now use the aux bus to set up a reset controller driver.
>>>>> Since I needed something to test it, I hooked up the reset for the
>>>>> Cadence MACB on PolarFire SoC. This has been split into a second series
>>>>> for v2:
>>>>> https://lore.kernel.org/all/20220704114511.1892332-1-conor.dooley@xxxxxxxxxxxxx/
>>>>>
>>>>> The second part adds rate control for the MSS PLL clock, followed by
>>>>> some simplifications to the driver & conversions of some custom structs
>>>>> to the corresponding structs in the framework.
>>>>>
>>>>> Thanks,
>>>>> Conor.
>>>>>
>>>>> Changes since v1:
>>>>> - split off the net patches
>>>>> - clk: actually pass the spinlock to the converted dividers & gates
>>>>> - reset: added a spinlock around RMW access to registers
>>>>> - reset: switched to BIT(i) macros
>>>>> - reset: used local copies of some variables as pointed out by Philipp
>>>>> - reset: dropped the success printout
>>>>>
>>>>> Conor Dooley (12):
>>>>> dt-bindings: clk: microchip: mpfs: add reset controller support
>>>>> clk: microchip: mpfs: add reset controller
>>>>> reset: add polarfire soc reset support
>>>>> MAINTAINERS: add polarfire soc reset controller
>>>>> riscv: dts: microchip: add mpfs specific macb reset support
>>>>> clk: microchip: mpfs: add module_authors entries
>>>>> clk: microchip: mpfs: add MSS pll's set & round rate
>>>>> clk: microchip: mpfs: move id & offset out of clock structs
>>>>> clk: microchip: mpfs: simplify control reg access
>>>>> clk: microchip: mpfs: delete 2 line mpfs_clk_register_foo()
>>>>> clk: microchip: mpfs: convert cfg_clk to clk_divider
>>>>> clk: microchip: mpfs: convert periph_clk to clk_gate
>>>>>
>>>>> .../bindings/clock/microchip,mpfs.yaml | 17 +-
>>>>> MAINTAINERS | 1 +
>>>>> arch/riscv/boot/dts/microchip/mpfs.dtsi | 7 +-
>>>>> drivers/clk/microchip/Kconfig | 1 +
>>>>> drivers/clk/microchip/clk-mpfs.c | 379 +++++++++---------
>>>>> drivers/reset/Kconfig | 7 +
>>>>> drivers/reset/Makefile | 2 +-
>>>>> drivers/reset/reset-mpfs.c | 157 ++++++++
>>>>> include/soc/microchip/mpfs.h | 8 +
>>>>> 9 files changed, 386 insertions(+), 193 deletions(-)
>>>>> create mode 100644 drivers/reset/reset-mpfs.c
>>>>>
>>>>>
>>>>> base-commit: b13baccc3850ca8b8cccbf8ed9912dbaa0fdf7f3