Re: qemu-system-s390x hang in tcg

From: Alex Bennée
Date: Wed Jun 29 2022 - 11:23:23 EST



Sven Schnelle <svens@xxxxxxxxxxxxx> writes:

> Sven Schnelle <svens@xxxxxxxxxxxxx> writes:
>
>> Alex Bennée <alex.bennee@xxxxxxxxxx> writes:
>>
>>> Sven Schnelle <svens@xxxxxxxxxxxxx> writes:
>>>
>>>> Hi,
>>>>
>>>> David Hildenbrand <david@xxxxxxxxxx> writes:
>>>>
>>>>> On 04.05.22 09:37, Janosch Frank wrote:
>>>>>> I had a short look yesterday and the boot usually hangs in the raid6
>>>>>> code. Disabling vector instructions didn't make a difference but a few
>>>>>> interruptions via GDB solve the problem for some reason.
>>>>>>
>>>>>> CCing David and Thomas for TCG
>>>>>>
>>>>>
>>>>> I somehow recall that KASAN was always disabled under TCG, I might be
>>>>> wrong (I thought we'd get a message early during boot that the HW
>>>>> doesn't support KASAN).
>>>>>
>>>>> I recall that raid code is a heavy user of vector instructions.
>>>>>
>>>>> How can I reproduce? Compile upstream (or -next?) with kasan support and
>>>>> run it under TCG?
>>>>
>>>> I spent some time looking into this. It's usually hanging in
>>>> s390vx8_gen_syndrome(). My first thought was that it is a problem with
>>>> the VX instructions, but turned out that it hangs even if i remove all
>>>> the code from s390vx8_gen_syndrome().
>>>>
>>>> Tracing the execution of TB's, i see that the generated code is always
>>>> jumping between a few TB's, but never exiting the TB's to check for
>>>> interrupts (i.e. return to cpu_tb_exec(). I only see calls to
>>>> helper_lookup_tb_ptr to lookup the tb pointer for the next TB.
>>>>
>>>> The raid6 code is waiting for some time to expire by reading jiffies,
>>>> but interrupts are never processed and therefore jiffies doesn't change.
>>>> So the raid6 code hangs forever.
>>>>
>>>> As a test, i made a quick change to test:
>>>>
>>>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>>>> index c997c2e8e0..35819fd5a7 100644
>>>> --- a/accel/tcg/cpu-exec.c
>>>> +++ b/accel/tcg/cpu-exec.c
>>>> @@ -319,7 +319,8 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
>>>> cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
>>>>
>>>> cflags = curr_cflags(cpu);
>>>> - if (check_for_breakpoints(cpu, pc, &cflags)) {
>>>> + if (check_for_breakpoints(cpu, pc, &cflags) ||
>>>> + unlikely(qatomic_read(&cpu->interrupt_request))) {
>>>> cpu_loop_exit(cpu);
>>>> }
>>>>
>>>> And that makes the problem go away. But i'm not familiar with the TCG
>>>> internals, so i can't say whether the generated code is incorrect or
>>>> something else is wrong. I have tcg log files of a failing + working run
>>>> if someone wants to take a look. They are rather large so i would have to
>>>> upload them somewhere.
>>>
>>> Whatever is setting cpu->interrupt_request should be calling
>>> cpu_exit(cpu) which sets the exit flag which is checked at the start of
>>> every TB execution (see gen_tb_start).
>>
>> Thanks, that was very helpful. I added debugging and it turned out
>> that the TB is left because of a pending irq. The code then calls
>> s390_cpu_exec_interrupt:
>>
>> bool s390_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
>> {
>> if (interrupt_request & CPU_INTERRUPT_HARD) {
>> S390CPU *cpu = S390_CPU(cs);
>> CPUS390XState *env = &cpu->env;
>>
>> if (env->ex_value) {
>> /* Execution of the target insn is indivisible from
>> the parent EXECUTE insn. */
>> return false;
>> }
>> if (s390_cpu_has_int(cpu)) {
>> s390_cpu_do_interrupt(cs);
>> return true;
>> }
>> if (env->psw.mask & PSW_MASK_WAIT) {
>> /* Woken up because of a floating interrupt but it has already
>> * been delivered. Go back to sleep. */
>> cpu_interrupt(CPU(cpu), CPU_INTERRUPT_HALT);
>> }
>> }
>> return false;
>> }
>>
>> Note the 'if (env->ex_value) { }' check. It looks like this function
>> just returns false in case tcg is executing an EX instruction. After
>> that the information that the TB should be exited because of an
>> interrupt is gone. So the TB's are never exited again, although the
>> interrupt wasn't handled. At least that's my assumption now, if i'm
>> wrong please tell me.
>
> Looking at the code i see CF_NOIRQ to prevent TB's from getting
> interrupted. But i only see that used in the core tcg code. Would
> that be a possibility, or is there something else/better?

Yes CF_NOIRQ is exactly the compiler flag you would use to prevent a
block from exiting early when you absolutely want to execute the next
block. We currently only use it from core code to deal with icount
related things but I can see it's use here. I would probably still wrap
it in a common function in cpu-exec-common.c I'm unsure of the exact
semantics for s390 so I will defer to Richard and others but something
like (untested):

/*
* Ensure the next N instructions are not interrupted by IRQ checks.
*/
void cpu_loop_exit_unint(CPUState *cpu, uintptr_t pc, int len)
{
if (pc) {
cpu_restore_state(cpu, pc, true);
}
cpu->cflags_next_tb = len | CF_LAST_IO | CF_NOIRQ | curr_cflags(cpu);
cpu_loop_exit(cpu);
}

And then in HELPER(ex) you can end the helper with:

void HELPER(ex)(CPUS390XState *env, uint32_t ilen, uint64_t r1, uint64_t addr)
{
...

/*
* We must execute the next instruction exclusively so exit the loop
* and trigger a NOIRQ TB which won't check for an interrupt until
* it finishes executing.
*/
cpu_loop_exit_unint(cpu, 0, 1);
}

Some notes:

* Take care to ensure the CPU state is synchronised

Which means the helper cannot use the flags
TCG_CALL_NO_(READ_GLOBALS|WRITE_GLOBALS|SIDE_EFFECTS). And you you
will to make sure you write the current PC in the tcg gen code in
op_ex()

* I think the env->ex_value can be removed after this

* We will actually exit the execution loop (via a sigjmp) but the IRQ
check in cpu_handle_interrupt() will be skipped due to the custom
flags. When the next block is looked up (or generated) it will be
entered but then immediately exit

* I think even a branch to self should work because the second
iteration will be interuptable

> Sorry for the dumb questions, i'm not often working on qemu ;-)

There are no dumb questions, just opportunities for better documentation ;-)

--
Alex Bennée