RE: Re: [PATCH v2 3/3] efi: Capsule update with user helper interface

From: Kweh, Hock Leong
Date: Tue Feb 24 2015 - 07:49:25 EST


> -----Original Message-----
> From: Kweh, Hock Leong
> Sent: Tuesday, February 24, 2015 6:54 PM
>
> In callbackfn_efi_capsule, you call request_firmware_nowait. When that
> callback is invoked, I think that the /sys/class/firmware/efi-capsule-file
> directory doesn't exist at all.
> If the callback takes longer than it takes your script to make it through a full
> iteration, then it will try uploading the second capsule before the firmware
> class directory is there, so it will fail.
>
> But I just realized that your script has a loop below to handle that.
> It's this:
>
> oldtime=$(date +%S)
> oldtime=$(((time + 2) % 60))
> until [ -f /sys/class/firmware/efi-capsule-file/loading ]
> do
> newtime=$(date +%S)
> if [ $newtime -eq $oldtime ]
> then
> break
> fi
> done
>
> Aside from the fact that this loop itself is racy (it may loop forever if
> something goes wrong in the kernel, since $newtime -eq $oldtime may
> never happen), it should help, if you're lucky. But there's another bug.
>
>
> Here's the race:
>
> User:
> echo 1 > /sys/class/firmware/efi-capsule-file/loading
> cat capsule1 > /sys/class/firmware/efi-capsule-file/data
> echo 0 > /sys/class/firmware/efi-capsule-file/loading
> Kernel: Be a little slow here due to preemption or whatever.
>
> User:
> -f /sys/class/firmware/efi-capsule-file/loading returns true capsules_loaded
> == 0 Assume failure, incorrectly
>
> Kernel: catch up and increment capsules_loaded.
>
> If these patches get applied, then I think that the protocol needs to be
> documented in Documentation/ABI. It should say something like:
>
> To upload an EFI capsule, do this:
>
> Write 1 to /sys/class/firmware/efi-capsule-file/loading
> Write the capsule to /sys/class/firmware/efi-capsule-file/data
> Write 0 to /sys/class/firmware/efi-capsule-file/loading
>
> Make sure that /sys/class/firmware/efi-capsule-file disappears and comes
> back, perhaps by cd-ing there and waiting for all the files in the directory to
> go away.
>
> Then, and only then, read capsules_loaded to detect success.
>
>
> Once you've written that doc, please seriously consider whether this
> interface is justifiable. I think it sucks.
>
> --Andy

Hi All,

After some internal discussion and re-design prototyping & testing on
this efi capsule interface kernel module, I would like to start a discussion
here on the new idea and wish to get input for the implementation and
eventually upstream.

This new idea will expose 2 read-only file notes from the efi capsule kernel
module - "capsule_ticket" and "capsule_report". The "capsule_ticket" will
let user to obtain a NON-ZERO number and then perform a mutex lock. The
obtained number will be used later for "capsule_report" status checking.
Unlock mutex is done after "echo 0 > loading" at the end of callback function.

So the process steps basically look like this:
1.) cat capsule_ticket =======> acquire a number and lock mutex then
expose firmware_class user helper
interface as well as start timer for timeout
counting
2.) repeat step 1 if obtained a "0" number
3.) echo 1 > loading
4.) cat bin > data
5.) echo 0 > loading =======> stop the timeout counting then unlock
mutex at the end of callback routine
6.) cat capsule_report =======> grep the number acquired from beginning
for checking succeeded/failed

The ticket numbering is starting from 1 to 999 and then rolls over from
1 to 999 again. The "capsule_report" will show the whole list of the acquired
entries and will be limited to 128 entries which is capped within one page buffer.

The format of this "capsule_report" output will look like:
"Capsule $num uploads successful/failed/aborted"

There is a 2mins time out (internal) for each of the number acquired.
After that, the interface will be closed/disappeared.

I have attached a SAMPLE script and kernel module code in case you are not
able to understand my description above.

I believe this would really take care of the multi-capsule update concurrently
issue and also asynchronous status reporting issue. Besides, this will indirectly
take care the module unload issue with "rmmod" and not "rmmod -f".
What do you guys think?

----------------------------------------------------------------------------------------------

There is another idea during internal discussion. The 2nd idea would require
some changes to drivers/base/firmware_class.c user helper interface for the
mutex locking as well as status return. Mutex lock will be performed while
doing the 'echo 1 > loading' session and status return will be performed after
the 'echo 0 > loading'. The mutex unlock will be done at 'echo 0 > loading' too
or may be at the status reading session for avoid asynchronous reporting.

This will likely changed the original firmware class user helper interface design
behavior. But this approach could save the 2 file notes from the 1st approach.


Which of the approach do you guys prefer? Thanks.


Regards,
Wilson

#!/bin/sh

for arg in "$@"
do
if [ -f $arg ]
then
retry=0
until [ ! $num = '0' ]
do
sleep 1
num=$(cat /sys/devices/platform/efi_capsule_user_helper/capsule_ticket)
if [ $retry -le 130 ]
then
retry=$((retry + 1))
else
echo "Failed to acquire capsule ticket"
exit 1
fi
done

oldtime=$(date +%S)
oldtime=$(((time + 2) % 60))
until [ -f /sys/class/firmware/efi-capsule-file/loading ]
do
newtime=$(date +%S)
if [ $newtime -eq $oldtime ]
then
echo "Failed to expose user helper interface"
exit 1
fi
done

echo 1 > /sys/class/firmware/efi-capsule-file/loading
cat $arg > /sys/class/firmware/efi-capsule-file/data
echo 0 > /sys/class/firmware/efi-capsule-file/loading

until [ -n "$result" ]
do
result=$(cat /sys/devices/platform/efi_capsule_user_helper/capsule_report | grep $num)
done
echo $result " (" $arg ")"
else
echo "File $arg not found !!"
fi
done
exit 0

Attachment: 0002-efi-Capsule-update-with-user-helper-interface.patch
Description: 0002-efi-Capsule-update-with-user-helper-interface.patch