Re: [PATCH v2 0/2] iommu/arm-smmu-v3: make sure the kdump kernel can work well when smmu is enabled

From: Leizhen (ThunderTown)
Date: Fri Apr 19 2019 - 17:36:56 EST




On 2019/4/17 9:39, Leizhen (ThunderTown) wrote:
>
>
> On 2019/4/16 17:14, Will Deacon wrote:
>> On Mon, Apr 08, 2019 at 10:31:47AM +0800, Leizhen (ThunderTown) wrote:
>>> On 2019/4/4 23:30, Will Deacon wrote:
>>>> On Mon, Mar 18, 2019 at 09:12:41PM +0800, Zhen Lei wrote:
>>>>> v1 --> v2:
>>>>> 1. Drop part2. Now, we only use the SMMUv3 hardware feature STE.config=0b000
>>>>> (Report abort to device, no event recorded) to suppress the event messages
>>>>> caused by the unexpected devices.
>>>>> 2. rewrite the patch description.
>>>>
>>>> This issue came up a while back:
>>>>
>>>> https://lore.kernel.org/linux-pci/20180302103032.GB19323@xxxxxxx/
>>>>
>>>> and I'd still prefer to solve it using the disable_bypass logic which we
>>>> already have. Something along the lines of the diff below?
>>>
>>> Yes, my patches also use disable_bypass=1(set ste.config=0b000). If
>>> SMMU_IDR0.ST_LEVEL=0(Linear Stream table supported), then all STE entries
>>> are allocated and initialized(set ste.config=0b000). But if SMMU_IDR0.ST_LEVEL=1
>>> (2-level Stream Table), we only allocated and initialized the first level tables,
>>> but leave level 2 tables dynamic allocated. That means, C_BAD_STREAMID(eventid=0x2)
>>> will be reported, if an unexpeted device access memory without reinitialized in
>>> kdump kernel.
>>
>> So is your problem just that the C_BAD_STREAMID events are noisy? If so,
>> perhaps we should be disabling fault reporting entirely in the kdump kernel.
>>
>> How about the update diff below? I'm keen to have this as simple as
>> possible, so we don't end up introducing rarely tested, complex code on
>> the crash path.
> In theory, it can solve the problem, let me test it.
Hi Will,
I have tested your patch on my board today. It works well.

>
> But then again, below patch will also disable the fault reporting come from the
> expected devices which are used in the kdump kernel. In fact, my patches have been
> merged into our interval version more than 2 months, no bug have been found yet.
>
> However, my patches do not support the case that the hardware does not support the
> "STE bypass" feature, I think your patch can also resolve it.
>
>>
>> Will
>>
>> --->8
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index d3880010c6cf..d8b73da6447d 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -2454,13 +2454,9 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
>> /* Clear CR0 and sync (disables SMMU and queue processing) */
>> reg = readl_relaxed(smmu->base + ARM_SMMU_CR0);
>> if (reg & CR0_SMMUEN) {
>> - if (is_kdump_kernel()) {
>> - arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
>> - arm_smmu_device_disable(smmu);
>> - return -EBUSY;
>> - }
>> -
>> dev_warn(smmu->dev, "SMMU currently enabled! Resetting...\n");
>> + WARN_ON(is_kdump_kernel() && !disable_bypass);
>> + arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
>> }
>>
>> ret = arm_smmu_device_disable(smmu);
>> @@ -2553,6 +2549,8 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
>> return ret;
>> }
>>
>> + if (is_kdump_kernel())
>> + enables &= ~(CR0_EVTQEN | CR0_PRIQEN);
>>
>> /* Enable the SMMU interface, or ensure bypass */
>> if (!bypass || disable_bypass) {
>>
>> .
>>
>

--
Thanks!
BestRegards