Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

From: Michel Dänzer
Date: Thu Aug 10 2023 - 03:34:00 EST


On 8/9/23 21:15, Marek Olšák wrote:
> On Wed, Aug 9, 2023 at 3:35 AM Michel Dänzer <michel.daenzer@xxxxxxxxxxx> wrote:
>> On 8/8/23 19:03, Marek Olšák wrote:
>>> It's the same situation as SIGSEGV. A process can catch the signal,
>>> but if it doesn't, it gets killed. GL and Vulkan APIs give you a way
>>> to catch the GPU error and prevent the process termination. If you
>>> don't use the API, you'll get undefined behavior, which means anything
>>> can happen, including process termination.
>>
>> Got a spec reference for that?
>>
>> I know the spec allows process termination in response to e.g. out of bounds buffer access by the application (which corresponds to SIGSEGV). There are other causes for GPU hangs though, e.g. driver bugs. The ARB_robustness spec says:
>>
>> If the reset notification behavior is NO_RESET_NOTIFICATION_ARB,
>> then the implementation will never deliver notification of reset
>> events, and GetGraphicsResetStatusARB will always return
>> NO_ERROR[fn1].
>> [fn1: In this case it is recommended that implementations should
>> not allow loss of context state no matter what events occur.
>> However, this is only a recommendation, and cannot be relied
>> upon by applications.]
>>
>> No mention of process termination; that rather sounds to me like the GL implementation should do its best to keep the application running.
>
> It basically says that we can do anything.

Not really? Where program termination is a possible outcome, the spec says so explicitly elsewhere, à la "including program termination".


> A frozen window or flipping between 2 random frames can't be described
> as "keeping the application running".

This assumes that an application which uses OpenGL cannot have any other purpose than using OpenGL.
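
For illustration only, here is a minimal sketch of the two cases discussed above, as seen from the application side. The enum values are the ones defined by ARB_robustness, redefined locally so the sketch builds without GL headers. The names reset_action, handle_reset_status, check_context and stub_no_notification are hypothetical and come from no spec or driver. In a real application the status callback would be glGetGraphicsResetStatusARB, obtained through the usual extension loading path on a context created with the LOSE_CONTEXT_ON_RESET_ARB strategy.

```c
typedef unsigned int GLenum;

/* Values from the ARB_robustness extension, redefined for a header-free sketch. */
#define GL_NO_ERROR                   0
#define GL_GUILTY_CONTEXT_RESET_ARB   0x8253
#define GL_INNOCENT_CONTEXT_RESET_ARB 0x8254
#define GL_UNKNOWN_CONTEXT_RESET_ARB  0x8255

/* Hypothetical application policy, not part of any spec. */
enum reset_action {
    RESET_ACTION_NONE,     /* no reset reported: keep rendering */
    RESET_ACTION_RECREATE  /* context lost: tear it down and create a new one */
};

/* Map a reset status to what the application does next. */
static enum reset_action handle_reset_status(GLenum status)
{
    switch (status) {
    case GL_NO_ERROR:
        return RESET_ACTION_NONE;
    case GL_GUILTY_CONTEXT_RESET_ARB:
    case GL_INNOCENT_CONTEXT_RESET_ARB:
    case GL_UNKNOWN_CONTEXT_RESET_ARB:
    default:
        /* Assumption: treat any unexpected value as a lost context too. */
        return RESET_ACTION_RECREATE;
    }
}

/* Stand-in for glGetGraphicsResetStatusARB so the sketch needs no GL context. */
typedef GLenum (*reset_status_fn)(void);

/* Called once per frame (or per batch of work) by the application. */
static enum reset_action check_context(reset_status_fn get_status)
{
    return handle_reset_status(get_status());
}

/* Models NO_RESET_NOTIFICATION_ARB: per the quoted spec text, the query
 * always returns NO_ERROR, so the application never learns of a reset. */
static GLenum stub_no_notification(void)
{
    return GL_NO_ERROR;
}
```

With the stub above, check_context() never asks for a context to be recreated, which is the case the quoted footnote is about: the application cannot react to the reset, so what happens next is entirely up to the implementation.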


--
Earthling Michel Dänzer | https://redhat.com
Libre software enthusiast | Mesa and Xwayland developer