Re: [RFC PATCH v6 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management

From: Arnd Bergmann
Date: Sat Jan 07 2023 - 19:30:05 EST


On Sat, Jan 7, 2023, at 23:10, Lad, Prabhakar wrote:

>> > +
>> > + memset(&thead_cmo_ops, 0x0, sizeof(thead_cmo_ops));
>> > + if (IS_ENABLED(CONFIG_ERRATA_THEAD_CMO)) {
>> > + thead_cmo_ops.clean_range = &thead_cmo_clean_range;
>> > + thead_cmo_ops.inv_range = &thead_cmo_inval_range;
>> > + thead_cmo_ops.flush_range = &thead_cmo_flush_range;
>> > + riscv_noncoherent_register_cache_ops(&thead_cmo_ops);
>> > + }
>>
>> The implementation here looks reasonable, just wonder whether
>> the classification as an 'errata' makes sense. I would probably
>> consider this a 'driver' at this point, but that's just
>> a question of personal preference.
>>
> zicbom is a CPU feature that doesn't have any DT node and hence no
> driver, and the same applies to the T-HEAD SoC.

A driver does not have to be a 'struct platform_driver' that
matches a device node. My point was more about what to
name it, regardless of how the code is entered.

> Also, arch_setup_dma_ops() runs quite early, before driver probing,
> so we get WARN() messages during bootup. Hence I have implemented
> it as errata, since errata patching also happens quite early.

But there is no more patching here, just setting the
function pointers, right?
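
A minimal sketch of that point, as a userspace mock rather than kernel
code: registration only stores callbacks, and callers later dispatch
through them. The struct is reduced to the three range operations from
the patch; the global, the counting callback and cache_clean() are
hypothetical names for illustration, and the body of
riscv_noncoherent_register_cache_ops() is an assumption about its
behaviour, not the code from the series.

```c
#include <stddef.h>

/* Mock of the ops structure from the patch, reduced to the range ops. */
struct riscv_cache_ops {
	void (*clean_range)(unsigned long addr, unsigned long size);
	void (*inv_range)(unsigned long addr, unsigned long size);
	void (*flush_range)(unsigned long addr, unsigned long size);
};

/* Hypothetical global that holds the registered callbacks. */
static struct riscv_cache_ops noncoherent_cache_ops;

/* Assumed behaviour: copy the callbacks. No code patching is involved. */
void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops *ops)
{
	if (ops)
		noncoherent_cache_ops = *ops;
}

/* Hypothetical stand-in for a vendor callback; it only counts calls. */
static int clean_calls;

static void mock_clean_range(unsigned long addr, unsigned long size)
{
	(void)addr;
	(void)size;
	clean_calls++;
}

/*
 * Hypothetical caller: dispatch through the pointer if one was
 * registered, otherwise report that no operation is available.
 */
int cache_clean(unsigned long addr, unsigned long size)
{
	if (!noncoherent_cache_ops.clean_range)
		return -1;
	noncoherent_cache_ops.clean_range(addr, size);
	return 0;
}
```

Before any registration, cache_clean() returns -1. After registering
an ops structure whose clean_range points at mock_clean_range, it
returns 0 and the counter increments.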

>> > +struct riscv_cache_ops {
>> > + void (*clean_range)(unsigned long addr, unsigned long size);
>> > + void (*inv_range)(unsigned long addr, unsigned long size);
>> > + void (*flush_range)(unsigned long addr, unsigned long size);
>> > + void (*riscv_dma_noncoherent_cmo_ops)(void *vaddr, size_t size,
>> > + enum dma_data_direction dir,
>> > + enum dma_noncoherent_ops ops);
>> > +};
>>
>> I don't quite see how the fourth operation is used here.
>> Are there cache controllers that need something beyond
>> clean/inv/flush?
>>
> This is for platforms that don't follow the standard cache operations
> (as done in patch 5/6); there, the drivers decide on the operations
> depending on the ops and dir.

My feeling is that the set of operations that get called should
not depend on the cache controller but at best the CPU. I tried to
enumerate how zicbom and ax45 differ here, and how that compares
to other architectures:

              zicbom        ax45,mips,arc   arm           arm64
fromdevice    clean/flush   inval/inval     inval/inval   clean/inval
todevice      clean/-       clean/-         clean/-       clean/-
bidi          flush/flush   flush/inval     clean/inval   clean/inval

So everyone does the same operation for DMA_TO_DEVICE, but
they differ in the DMA_FROM_DEVICE handling, for reasons I
don't quite see:

Your ax45 code does the same as arc and mips. arm and
arm64 skip invalidating the cache before bidi mappings,
but arm has a FIXME comment about that. arm64 does a
'clean' instead of 'inval' when mapping a fromdevice
page, which seems valid but slower than necessary.
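
The ax45/mips/arc column can be sketched as a pair of lookups keyed
only on the DMA direction. This assumes each table cell reads as
"operation before the device accesses the buffer / operation before
the CPU accesses it again". The enum cache_op and the two function
names are hypothetical; enum dma_data_direction is a local stand-in
for the kernel enum of the same name.

```c
/* Local stand-in for the kernel's enum dma_data_direction. */
enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,
	DMA_FROM_DEVICE = 2,
};

/* Hypothetical names for the cache maintenance operations in the table. */
enum cache_op { OP_NONE, OP_CLEAN, OP_INVAL, OP_FLUSH };

/* First entry of each cell: what to do before handing the buffer over. */
enum cache_op op_for_device(enum dma_data_direction dir)
{
	switch (dir) {
	case DMA_TO_DEVICE:
		return OP_CLEAN;	/* write dirty lines back */
	case DMA_FROM_DEVICE:
		return OP_INVAL;	/* drop lines the device will overwrite */
	case DMA_BIDIRECTIONAL:
		return OP_FLUSH;	/* write back, then invalidate */
	}
	return OP_NONE;
}

/* Second entry of each cell: what to do before the CPU reads again. */
enum cache_op op_for_cpu(enum dma_data_direction dir)
{
	switch (dir) {
	case DMA_TO_DEVICE:
		return OP_NONE;		/* the device did not write anything */
	case DMA_FROM_DEVICE:
	case DMA_BIDIRECTIONAL:
		return OP_INVAL;	/* discard stale or prefetched lines */
	}
	return OP_NONE;
}
```

With a mapping like this, a cache controller backend would only need
to supply clean/inval/flush, and the choice of operation would stay in
common code.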

Could the zicbom operations be changed to do the same
things as the ax45/mips/arc ones, or are there specific
details in the zicbom spec that require this?

Arnd