Re: [PATCH] devcoredump: increase the device delete timeout to 10 mins

From: Abhinav Kumar
Date: Sat Feb 12 2022 - 03:34:05 EST


Hi Greg

On 2/12/2022 12:29 AM, Greg KH wrote:
> On Fri, Feb 11, 2022 at 11:52:41PM -0800, Abhinav Kumar wrote:
>> Hi Greg
>>
>> On 2/11/2022 11:04 PM, Greg KH wrote:
>>> On Fri, Feb 11, 2022 at 10:59:39AM -0800, Abhinav Kumar wrote:
>>>> Hi Greg
>>>>
>>>> Thanks for the response.
>>>>
>>>> On 2/11/2022 3:09 AM, Greg KH wrote:
>>>>> On Tue, Feb 08, 2022 at 11:44:32AM -0800, Abhinav Kumar wrote:
>>>>>> There are cases where depending on the size of the devcoredump and the speed
>>>>>> at which the usermode reads the dump, it can take longer than the current 5 mins
>>>>>> timeout.
>>>>>>
>>>>>> This can lead to incomplete dumps as the device is deleted once the timeout expires.
>>>>>>
>>>>>> One example is below where it took 6 mins for the devcoredump to be completely read.
>>>>>>
>>>>>> 04:22:24.668 23916 23994 I HWDeviceDRM::DumpDebugData: Opening /sys/class/devcoredump/devcd6/data
>>>>>> 04:28:35.377 23916 23994 W HWDeviceDRM::DumpDebugData: Freeing devcoredump node
>>>>>
>>>>> What makes this so slow? Reading from the kernel shouldn't be the
>>>>> limit, is it where the data is being sent to?
>>>>
>>>> We are still checking this. We are seeing better read times when we bump up
>>>> the thread priority of the thread which was reading this.
>>>
>>> Where is the thread sending the data to?
>>
>> The thread is writing the data to a file in local storage. From our
>> profiling, the read is the one taking the time not the write.
>
> The read is coming directly from memory, there should not be any
> slowdown at all here. How can that be the delay? Have a trace
> somewhere?
>
> thanks,
>
> greg k-h
Yes, as I mentioned in my previous comment, we are still checking why it's taking so long. We will update with our findings if we have any.

Alright, we will try to capture a trace to share and will update this thread if we find something.
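
For reference, one rough way to separate the read time from the write time in userspace is sketched below. This is only an illustration, not the code path used by HWDeviceDRM: the function name copy_with_timing and the 4096-byte chunk size are assumptions made up for this sketch, and the write time only covers handing data to the page cache unless an fsync is added.

```python
import time


def copy_with_timing(src_path, dst_path, chunk_size=4096):
    """Copy src_path to dst_path in chunks.

    Returns (bytes_copied, read_seconds, write_seconds), where the two
    timings are accumulated separately around read() and write().
    Hypothetical helper for illustration only.
    """
    read_s = 0.0
    write_s = 0.0
    total = 0
    # buffering=0 on the source so every read() is a real read syscall
    # rather than a hit in Python's own buffer.
    with open(src_path, "rb", buffering=0) as src, open(dst_path, "wb") as dst:
        while True:
            t0 = time.perf_counter()
            chunk = src.read(chunk_size)
            t1 = time.perf_counter()
            read_s += t1 - t0
            if not chunk:
                break  # end of file
            dst.write(chunk)
            write_s += time.perf_counter() - t1
            total += len(chunk)
        # Count the final flush of Python's write buffer as write time.
        # This does not wait for the data to reach storage.
        t2 = time.perf_counter()
        dst.flush()
        write_s += time.perf_counter() - t2
    return total, read_s, write_s
```

Calling it on the node from the log above, for example copy_with_timing("/sys/class/devcoredump/devcd6/data", "/tmp/devcd6.bin") (the destination path is arbitrary), would give a first split between the two sides before a proper kernel trace is captured.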