Re: Gang scheduling

From: Subhra Mazumdar
Date: Tue Feb 12 2019 - 22:00:04 EST


Hi Tim,

On 10/12/18 11:01 AM, Tim Chen wrote:
On 10/10/2018 05:09 PM, Subhra Mazumdar wrote:
Hi,

I was following the Coscheduling patch discussion on lkml and Peter mentioned he had a patch series. I found the following on github.

https://github.com/pdxChen/gang/commits/sched_1.23-loadbal

I would like to test this with KVMs. Are the commits from 38d5acb to f019876 sufficient? Also, is there any documentation on how to use it (any knobs I need to turn on for gang scheduling to happen?), or is it enabled by default for KVMs?

Thanks,
Subhra

I would suggest you try
https://github.com/pdxChen/gang/tree/sched_1.23-base
without the load balancing part of gang scheduling.
It is enabled by default for KVMs.
I applied the following 3 patches on 4.19 and tried to install a KVM (with
virt-install). But the kernel hangs with the following error:

kernel:watchdog: BUG: soft lockup - CPU#21 stuck for 23s! [kworker/21:1:573]

kvm,sched: Track VCPU threads
x86/kvm,sched: Add fast path for reschedule interrupt
sched: Optimize scheduler_ipi()

The track VCPU patch seems to be the culprit.

Thanks,
Subhra

Because the gang scheduling status of the QEMU thread changes constantly
depending on whether the vcpu is loaded or unloaded,
the load balancing part of the code doesn't work very well.

The current version of the code needs to be optimized further. Right now
the QEMU thread constantly does vcpu load and unload during VM enter and exit.
We gang schedule only after vcpu load, when the thread is registered to be
gang scheduled; on vcpu unload, the thread is removed from the set to be
gang scheduled. Each time this requires an expensive synchronization with
the sibling.
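
To make the churn concrete, here is a minimal user-space sketch of the
current scheme. All of the names (struct task, gang_register,
gang_unregister, sync_with_sibling) are invented for illustration and are
not the actual patch code:

/*
 * Toy model of the current scheme: the vcpu thread joins the
 * gang-scheduled set on vcpu load and leaves it on vcpu unload,
 * paying a sibling synchronization each time.
 */
#include <stdio.h>
#include <stdbool.h>

struct task { int pid; bool gang_scheduled; };

/* Stand-in for the expensive sibling rendezvous. */
static void sync_with_sibling(struct task *t)
{
	printf("pid %d: sync with sibling (expensive)\n", t->pid);
}

static void gang_register(struct task *t)
{
	t->gang_scheduled = true;
	sync_with_sibling(t);		/* paid on every vcpu load */
}

static void gang_unregister(struct task *t)
{
	t->gang_scheduled = false;
	sync_with_sibling(t);		/* paid on every vcpu unload */
}

/* Every VM enter/exit cycle churns the gang set. */
static void vm_enter_exit_cycle(struct task *vcpu_thread)
{
	gang_register(vcpu_thread);	/* vcpu load  -> join the set */
	/* ... guest runs, then exits for I/O ... */
	gang_unregister(vcpu_thread);	/* vcpu unload -> leave the set */
}

int main(void)
{
	struct task vcpu_thread = { .pid = 1234 };

	/* An I/O-heavy guest exits constantly, so the sync cost adds up. */
	for (int i = 0; i < 3; i++)
		vm_enter_exit_cycle(&vcpu_thread);
	return 0;
}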

However, for QEMU there is a one-to-one correspondence between the QEMU
thread and the vcpu. So we don't have to change the gang scheduling status
for such threads, which avoids the churn and the sync with the sibling. That
should help VMs with lots of I/O causing constant VM exits. We're still
working on this optimization, and the load balancing should be better
after this change.
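
And a correspondingly minimal sketch of the optimized flow, again with
invented names rather than the real patch code: since each QEMU vcpu thread
maps one-to-one to a vcpu, it can be marked gang scheduled once and left
that way, so VM enter/exit no longer touches the gang set at all:

#include <stdio.h>
#include <stdbool.h>

struct task { int pid; bool gang_scheduled; };

/* One-time sibling sync when the vcpu thread is first registered. */
static void gang_register_once(struct task *t)
{
	t->gang_scheduled = true;
	printf("pid %d: one-time sync with sibling\n", t->pid);
}

int main(void)
{
	struct task vcpu_thread = { .pid = 1234 };

	gang_register_once(&vcpu_thread);
	/* Guest enters and exits as often as it likes: no further sync. */
	for (int i = 0; i < 3; i++)
		printf("VM enter/exit %d: gang set untouched\n", i);
	return 0;
}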

Tim