Re: [PATCH v3] ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512

From: Marek Szyprowski
Date: Fri Mar 08 2024 - 11:22:07 EST


On 08.03.2024 15:51, Catalin Marinas wrote:
> On Fri, Mar 08, 2024 at 03:01:28PM +0100, Marek Szyprowski wrote:
>> This patch landed in today's linux-next as commit 0499a78369ad ("ARM64:
>> Dynamically allocate cpumasks and increase supported CPUs to 512").
>> Unfortunately it triggers the following warning during boot on most of
>> my ARM64-based test boards. Here is an example from Odroid-N2 board:
>>
>>  ------------[ cut here ]------------
>>  WARNING: CPU: 4 PID: 63 at drivers/opp/core.c:2554
>> dev_pm_opp_set_config+0x390/0x710
>>  Modules linked in: dw_hdmi_i2s_audio meson_gxl smsc onboard_usb_hub(+)
>> rtc_pcf8563 panfrost snd_soc_meson_axg_sound_card drm_shmem_helper
>> crct10dif_ce dwmac_generic snd_soc_meson_card_utils gpu_sched
>> snd_soc_meson_g12a_toacodec snd_soc_meson_g12a_tohdmitx rc_odroid
>> snd_soc_meson_codec_glue pwm_meson ao_cec_g12a meson_gxbb_wdt meson_ir
>> snd_soc_meson_axg_frddr snd_soc_meson_axg_toddr snd_soc_meson_axg_tdmin
>> meson_vdec(C) v4l2_mem2mem videobuf2_dma_contig snd_soc_meson_axg_tdmout
>> videobuf2_memops axg_audio videobuf2_v4l2 sclk_div videodev
>> reset_meson_audio_arb snd_soc_meson_axg_fifo clk_phase dwmac_meson8b
>> stmmac_platform videobuf2_common mdio_mux_meson_g12a meson_drm
>> meson_dw_hdmi rtc_meson_vrtc stmmac meson_ddr_pmu_g12 mc dw_hdmi
>> drm_display_helper pcs_xpcs snd_soc_meson_t9015 meson_canvas gpio_fan
>> display_connector snd_soc_meson_axg_tdm_interface
>> snd_soc_simple_amplifier snd_soc_meson_axg_tdm_formatter nvmem_meson_efuse
>>  hub 1-1:1.0: USB hub found
>>  CPU: 4 PID: 63 Comm: kworker/u12:5 Tainted: G         C 6.8.0-rc3+ #14677
>>  Hardware name: Hardkernel ODROID-N2 (DT)
>>  Workqueue: events_unbound deferred_probe_work_func
>>  pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>  pc : dev_pm_opp_set_config+0x390/0x710
>>  lr : dev_pm_opp_set_config+0x5c/0x710
>>  ...
>>  Call trace:
>>   dev_pm_opp_set_config+0x390/0x710
>>   dt_cpufreq_probe+0x268/0x480
>>   platform_probe+0x68/0xd8
>>   really_probe+0x148/0x2b4
>>   __driver_probe_device+0x78/0x12c
>>   driver_probe_device+0xdc/0x164
>>   __device_attach_driver+0xb8/0x138
>>   bus_for_each_drv+0x84/0xe0
>>   __device_attach+0xa8/0x1b0
>>   device_initial_probe+0x14/0x20
>>   bus_probe_device+0xb0/0xb4
>>   deferred_probe_work_func+0x8c/0xc8
>>   process_one_work+0x1ec/0x53c
>>   worker_thread+0x298/0x408
>>   kthread+0x124/0x128
>>   ret_from_fork+0x10/0x20
>>  irq event stamp: 317178
>>  hardirqs last  enabled at (317177): [<ffff8000801788d4>]
>> ktime_get_coarse_real_ts64+0x10c/0x110
>>  hardirqs last disabled at (317178): [<ffff800081222030>] el1_dbg+0x24/0x8c
>>  softirqs last  enabled at (315802): [<ffff800080010a60>]
>> __do_softirq+0x4a0/0x4e8
>>  softirqs last disabled at (315793): [<ffff8000800169b0>]
>> ____do_softirq+0x10/0x1c
>>  ---[ end trace 0000000000000000 ]---
>>  cpu cpu2: error -EBUSY: failed to set regulators
>>  cpufreq-dt: probe of cpufreq-dt failed with error -16
>>
>> It looks like the cpufreq-dt and/or opp drivers need some adjustments
>> related to this change.
> That's strange. Is this with defconfig? I wonder whether NR_CPUS being
> larger caused the issue with this specific code. Otherwise
> CPUMASK_OFFSTACK may not work that well on arm64.

I've used defconfig with some debug options enabled and some drivers
compiled-in:

make ARCH=arm64 defconfig

./scripts/config -e BLK_DEV_RAM --set-val BLK_DEV_RAM_COUNT 4 --set-val \
BLK_DEV_RAM_SIZE 81920 --set-val CMA_SIZE_MBYTES 96 -e PROVE_LOCKING -e \
DEBUG_ATOMIC_SLEEP -e STAGING -e I2C_GPIO -e PM_DEBUG -e \
PM_ADVANCED_DEBUG -e USB_GADGET -e USB_ETH -e CONFIG_DEVFREQ_THERMAL -e \
CONFIG_BRCMFMAC_PCIE -e CONFIG_NFC -d ARCH_SUNXI -d ARCH_ALPINE -d \
DRM_NOUVEAU -d ARCH_BCM_IPROC -d ARCH_BERLIN -d ARCH_BRCMSTB -d \
ARCH_LAYERSCAPE -d ARCH_LG1K -d ARCH_HISI -d ARCH_MEDIATEK -d ARCH_MVEBU \
-d ARCH_SEATTLE -d ARCH_SYNQUACER -d ARCH_RENESAS -d ARCH_STRATIX10 -d \
ARCH_TEGRA -d ARCH_SPRD -d ARCH_THUNDER -d ARCH_THUNDER2 -d \
ARCH_UNIPHIER -d ARCH_XGENE -d ARCH_ZX -d ARCH_ZYNQMP -d HIBERNATION -d \
CLK_SUNXI -d CONFIG_EFI -d CONFIG_TEE -e FW_CFG_SYSFS
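
Since the question above is about NR_CPUS and CPUMASK_OFFSTACK, it may help
to confirm what the generated .config actually contains for those two symbols.
The snippet below is only a sketch: kconfig_value is a hypothetical helper, not
something shipped in the kernel tree, and it assumes the usual .config line
formats ("CONFIG_FOO=value" and "# CONFIG_FOO is not set").

```shell
# Hypothetical helper (not part of the kernel tree): print the value of a
# Kconfig symbol from a .config-style file, "n" if it is explicitly
# disabled, or "undef" if the symbol does not appear at all.
kconfig_value() {
    sym="$1"
    file="${2:-.config}"   # defaults to .config in the current directory
    if grep -q "^# CONFIG_${sym} is not set" "$file"; then
        echo "n"
    elif line=$(grep "^CONFIG_${sym}=" "$file"); then
        echo "${line#*=}"  # strip everything up to the first '='
    else
        echo "undef"
    fi
}

# Usage, from the top of a configured kernel tree:
#   kconfig_value NR_CPUS
#   kconfig_value CPUMASK_OFFSTACK
```

Running it before and after the patch on the same tree should show whether
the two symbols differ between the builds.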

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland