Re: [RFCv2 0/8] USI stylus support series

From: Tero Kristo
Date: Fri Dec 10 2021 - 03:50:17 EST


Hi Benjamin,

On 09/12/2021 15:53, Benjamin Tissoires wrote:
Hi Tero,

On Thu, Dec 9, 2021 at 9:56 AM Tero Kristo <tero.kristo@xxxxxxxxxxxxxxx> wrote:
Hi Benjamin,

On 08/12/2021 16:56, Benjamin Tissoires wrote:
Hi Tero,

On Tue, Nov 30, 2021 at 5:13 PM Tero Kristo <tero.kristo@xxxxxxxxxxxxxxx> wrote:
Hi Benjamin,

On 30/11/2021 16:44, Benjamin Tissoires wrote:
Hi Tero,

On Fri, Nov 26, 2021 at 2:02 PM Tero Kristo <tero.kristo@xxxxxxxxxxxxxxx> wrote:
Hi,

This series is an update based on comments from Benjamin. What is done
in this series is to ditch the separate hid-driver for USI, and add the
generic support to core layers. This part basically brings the support
for providing USI events, without programmability (patches 1-6).
That part seems to be almost good for now. I have a few things to check:
- patch2: "HID: hid-input: Add suffix also for HID_DG_PEN" I need to
ensure there are no touchscreens affected by this (there used to be a
mess with some vendors where they would not declare things properly)
- patch5: "HID: core: map USI pen style reports directly" this one
feels plain wrong. I would need to have a look at the report
descriptor but this is too specific in a very generic code
Relevant part of the report descriptor is here:

Field(8)
Physical(Digitizers.Stylus)
Logical(Digitizers.Preferred Line Style)
Application(Digitizers.Pen)
Usage(6)
Digitizers.Ink
Digitizers.Pencil
Digitizers.Highlighter
Digitizers.Chisel Marker
Digitizers.Brush
Digitizers.No Preference
Logical Minimum(1)
Logical Maximum(6)
Physical Minimum(0)
Physical Maximum(255)
Unit Exponent(-1)
Unit(SI Linear : Centimeter)
Report Size(8)
Report Count(1)
Report Offset(88)
Flags( Variable Absolute NoPreferredState )

To me, it looks almost like it is a bug in the report descriptor itself;
as you see there are 6 usage values but the report size / count is 1
byte. The fact that there are 6 usage values in the field confuses
hid-core. Basically the usage values are used as encoded content for the
field.
It took me a few days but I finally understand that this report
descriptor is actually correct.

The descriptor gives an array of 1 element of size 8, which is enough
to give an index within the available values being [Digitizers.Ink,
Digitizers.Pencil, Digitizers.Highlighter, Digitizers.Chisel Marker,
Digitizers.Brush, Digitizers.No Preference]

Given that logical min is 1, this index is 1-based.

So the job of the kernel is to provide the event
Digitizers.Highlighter whenever the value here is 3. The mapping 3 <->
Digitizers.Highlighter is specific to this report descriptor and
should not be forwarded to user space.
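
For reference, the array semantics described above can be sketched in
plain C. This is only an illustration, not code from the series: the
enum, the struct and both function names are made up for the example,
and the usage order is the one from the descriptor quoted above. The
reverse lookup is what a setter would need in order to turn a usage
back into the raw index.

```c
#include <stddef.h>

/* Hypothetical identifiers for the usages; not kernel or HUT constants. */
enum line_style {
	STYLE_NONE = 0,		/* raw value outside the declared array */
	STYLE_INK = 100,
	STYLE_PENCIL,
	STYLE_HIGHLIGHTER,
	STYLE_CHISEL_MARKER,
	STYLE_BRUSH,
	STYLE_NO_PREFERENCE,
};

/* One array field: its usages in descriptor order plus Logical Minimum. */
struct usage_array {
	const enum line_style *usages;
	size_t count;
	int logical_min;
};

/* Order taken from the report descriptor quoted in this thread. */
static const enum line_style example_usages[] = {
	STYLE_INK, STYLE_PENCIL, STYLE_HIGHLIGHTER,
	STYLE_CHISEL_MARKER, STYLE_BRUSH, STYLE_NO_PREFERENCE,
};

static const struct usage_array example_field = {
	example_usages,
	sizeof(example_usages) / sizeof(example_usages[0]),
	1,	/* Logical Minimum(1): the index is 1-based */
};

/* Raw array value from the report -> usage it selects. */
static enum line_style decode_line_style(const struct usage_array *f, int raw)
{
	int i = raw - f->logical_min;

	if (i < 0 || (size_t)i >= f->count)
		return STYLE_NONE;
	return f->usages[i];
}

/* Usage -> raw value for this descriptor, or -1 if it is not offered. */
static int encode_line_style(const struct usage_array *f, enum line_style s)
{
	for (size_t i = 0; i < f->count; i++)
		if (f->usages[i] == s)
			return (int)i + f->logical_min;
	return -1;
}
```

With this table, decode_line_style(&example_field, 3) selects the
Highlighter usage. A descriptor that lists the usages in another order
only needs a different table, the callers stay the same.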
Yes, all this is true. I also see you re-wrote this part a bit in the
series to add individual events for all the different line styles. I'll
give this a shot and see how it works out. A problem I see is that we
need to be able to program the pen line style also somehow, do we just
set a single pen style to "enabled" and all the rest get set to
"disabled" under the hood?

I think we need to have a parameter `PreferredLineStyle` which can
only take the values from the array above.

If your API provides that, the rest is implementation detail.
Assigning a value to it will by definition invalidate the old value.

Of course this means that the evdev approach is not suited for that,
which makes me think that is probably not the best option.

Ok I will experiment with this.

Alternatively I think this could be patched up in the BPF program, as I
am modifying the content of the raw hid report already; I could just as
well modify this one also. Or, maybe I could fix the report descriptor
itself to act as a sane variable, as I am parsing the report descriptor
already?
I couldn't understand the fix you did in the BPF program. Can you
explain it by also giving me an example of raw event from the device
and the outputs (fixed and not fixed) of the kernel?
The fix in the BPF code is this (under process_tag()):

/*
 * Force flags for line style. This makes it act
 * as a simple variable from HID core point of
 * view.
 */
bpf_hid_set_data(ctx, (*idx + 1) << 3, 8, 0x2);

After that, the pen line style gets forwarded as a simple integer value
to input-core / userspace also. raw events did not need modification
after all, I just modified the report descriptor.
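
To make the effect of that one-liner easier to follow, here is a
userspace stand-in operating on a plain byte buffer. It assumes that
the offset and size arguments of the BPF helper are in bits (which the
`<< 3` suggests) and that idx points at the prefix byte of the main
item being patched. The function and macro names are invented for this
sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* HID short-item prefix byte: bTag[7:4] | bType[3:2] | bSize[1:0] */
#define HID_ITEM_SIZE_MASK	0x03
/* Main item data with only bit 1 set: Data, Variable, Absolute */
#define HID_MAIN_DATA_VAR_ABS	0x02

/* Number of data bytes that follow a short-item prefix byte. */
static size_t hid_item_data_len(uint8_t prefix)
{
	static const size_t len[] = { 0, 1, 2, 4 };

	return len[prefix & HID_ITEM_SIZE_MASK];
}

/*
 * Overwrite the single data byte of the item whose prefix sits at idx,
 * i.e. the byte at offset idx + 1. Returns 0 on success, -1 if idx is
 * out of range or the item does not carry exactly one data byte.
 */
static int force_variable_flags(uint8_t *rdesc, size_t len, size_t idx)
{
	if (idx >= len || idx + 1 >= len)
		return -1;
	if (hid_item_data_len(rdesc[idx]) != 1)
		return -1;

	rdesc[idx + 1] = HID_MAIN_DATA_VAR_ABS;
	return 0;
}
```

For an Input item with one data byte (prefix 0x81) at offset 4, the
call rewrites the byte at offset 5 and leaves the rest of the
descriptor untouched.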
Right. So you are stripping away the actual meaning, which is report
descriptor dependent.
This is not good because a HW vendor might decide to not order the 6
possible values by their HID usage but put the `No Preference` first
for instance. There is also a strong possibility a HW vendor decides
to not rely on the PreferredLineStyleIsLocked and gives a choice of
only one possible value (though that would be mean as this is a per
stylus property).

Ok thanks for explanation, I will experiment with this also and see how it works.




Talking about that, I realized that you gave me the report descriptor
of the Acer panel in another version of this RFC. Could you give me:
- the bus used (USB or I2C)?
I have been using I2C in all my testing, the controllers I have access
to are behind I2C only.
- the vendor ID?
- the product ID?
- and the same for the other panel, with its report descriptor?

This way I can add them in the testsuite, and start playing with them.
Attached a tarball with both descriptors and their corresponding IDs
(copied the R+N+I data from hid-recorder.)
Thanks!

Additionally, a HID-BPF based sample is provided which can be used to
program / query pen parameters in comparison to the old driver level
implementation (patches 7-8, patch #8 is an incremental change on top of
patch #7 which just converts the fifo to socket so that the client can
also get results back from the server.)
After a few more thoughts, I wondered what your input is on this. We
should be able to do the very same with plain hidraw... However, you
added a `hid/raw_event` processing that will still be kept in the
kernel, so maybe bpf would be useful for that at least.
Yes, plain hidraw can be sort of used to program the values, however the
interface is kind of annoying to use for the USI pens. You need to be
touching the display with the pen before anything is accepted. Maybe
writing some support code to the libevdev would help.

The hidraw hook is also needed for processing the cached values: USI
pens report their parameters with a delay of a few hundred ms,
depending on the controller vendor. In some cases they don't report
anything back until the value is forcibly queried from the controller.
The write mechanism also acts differently: some controllers report
the programmed value back, others keep reporting the old value until the
pen leaves the screen and touches it again.
Hmm, not sure I follow this entirely. I guess I would need to have one
of such devices in my hands :(
Yes, it is kind of confusing, I was also trying to figure out the
details with a remote proxy (someone telling me how things behave) until
I decided to order a second chromebook that had the same controller. I
can try to provide logs of the different cases if you want though. The
quirks I know of at the moment:
I'll need more clarifications (and getting logs might help me
understand better, yes, please):

1) controller does not immediately report "correct" values when pen
touches screen (ELAN)
I assume this is in the input reports, not in the feature reports.
Yes, this is with input reports. Provided a sample in the attached tarball (usi-pen-initial-latency-*; there is dmesg + hid-recorder files.)
What happens in the hovering case (not touching)?
It looks like the controller only queries the pen for actual values when it touches the screen, while hovering, it reports the old/incorrect values forever.
Do we get fake values easily identifiable or are they just as normal
as the correct ones?
They are as normal as correct values, because the controller picks whatever it has (apparently in its internal memory), these can be zeroes (from boot), or values from previous pen that touched screen.

Anyway, considering the use case, this might not be an issue (I was
re-reading the HUT and this is only an indication for applications).

2) controller never reports "correct" values when pen touches screen
(must force a GET_REPORT) (GOODIX)
Again, Input reports?
Check attached tarball for usi-pen-goodix*. Added both hid-recorder and dmesg outputs, as they provide slightly different data; in dmesg you can see where I actually send the GET_REPORT for pen color and it updates in the input report also slightly after this.
What's in the hovering state reported?
Hovering state doesn't alter the report, but with Goodix controller, you can actually GET_REPORT and get sane data out of the pen while it is only hovering.
Is the GET_REPORT needed against the feature report or the input report?
Feature report. Once I GET_REPORT for the Preferred Line Color feature report, the proper value gets magically updated for the input reports also.

3) controller does not report "correct" values after SET_REPORT
(reporting old value) (ELAN)
Am I correct?:

- Pen is hovering/touching
- controller is reporting correct current values in the input reports
(following the 2 cases above)
- host sends a SET_REPORT on the feature
- controller is still sending the old values in the input reports
I did retry this with the latest code I have and I wasn't able to reproduce it anymore. Might have been a glitch earlier with the driver, but I am certain I did see this kind of behavior because I had a workaround in the driver for it. There is a latency in changing the value and before it reaches the input report though.
What happens if you issue a GET_REPORT on the Input?
On the Feature?
I'll check this if I can reproduce the issue.

4) controller responds with bogus data in GET_REPORT (does not know the
correct value yet) (ELAN + GOODIX)
I assume that's when the stylus is not in proximity, and when you
issue a GET_REPORT of the feature report, not the input, correct?
Yes, with feature report. This is during the initial latency period where the pen has not been probed by the controller yet.

If so, this is something I would have expected, given that those
properties are per stylus, not per controller.
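
To keep the four cases straight, they can be written down as a small
quirk table. This is just a summary sketch of what is reported in this
thread: the flag names, the struct and the lookup function are invented
here, vendors are matched by name because no IDs are given inline, and
quirk 3 is kept for ELAN even though it could not be reproduced with
the latest code.

```c
#include <stddef.h>
#include <string.h>

enum usi_quirk {
	/* 1) input reports carry stale values for a while after touch */
	USI_QUIRK_INPUT_DELAYED		= 1u << 0,
	/* 2) input reports stay wrong until a feature GET_REPORT is done */
	USI_QUIRK_NEEDS_GET_REPORT	= 1u << 1,
	/* 3) old value still reported after SET_REPORT (not reproduced) */
	USI_QUIRK_STALE_AFTER_SET	= 1u << 2,
	/* 4) feature GET_REPORT returns bogus data before pen is probed */
	USI_QUIRK_BOGUS_GET_REPORT	= 1u << 3,
};

struct usi_vendor_quirks {
	const char *vendor;
	unsigned int quirks;
};

static const struct usi_vendor_quirks usi_quirk_table[] = {
	{ "ELAN",   USI_QUIRK_INPUT_DELAYED | USI_QUIRK_STALE_AFTER_SET |
		    USI_QUIRK_BOGUS_GET_REPORT },
	{ "GOODIX", USI_QUIRK_NEEDS_GET_REPORT | USI_QUIRK_BOGUS_GET_REPORT },
};

/* Quirk mask for a controller vendor, 0 if the vendor is unknown. */
static unsigned int usi_quirks_for(const char *vendor)
{
	size_t n = sizeof(usi_quirk_table) / sizeof(usi_quirk_table[0]);

	for (size_t i = 0; i < n; i++)
		if (strcmp(usi_quirk_table[i].vendor, vendor) == 0)
			return usi_quirk_table[i].quirks;
	return 0;
}
```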

I believe other vendors have different behavior with their controllers
also, as the specs are not 100% clear on multiple things.
Well, depending on your answers above, we might have a common set of
cases we can use, or paper over it through bpf if there is a strong
need.

Also, a few more questions:
- have you tried those cases above with the same stylus, or is it HP
controller - HP stylus / Acer-Acer?
I have two interchangeable styluses which I can use with both acer/hp devices. It is also possible to swap between pen1 and pen2 and get the different parameters out of them.
- do these pens have physical notification of the style/width, or do
they just store the data in their memory?
No physical notification, just in memory.
- what are the chromebook models (if I need to eventually expense one)?

Acer one is chromebook spin 713 (CP713-2W series, model #N19Q5)

HP one is chromebook x360 (model #12b-ca0810no)

You need to be careful with what to buy though if you are looking for a specific USI controller, these are generally not documented anywhere, and there appear to be different versions of the same chromebook model even (e.g. spin 713 has multiple different variants.)

- to me, the Goodix report descriptor is bogus in the feature reports.
The Usage Page is stuck at "vendor defined" when it should have been
reset to "Digitizer" before the report ID 9. Is it just me and my
tools or am I missing something?
Yeah, it has plenty of vendor defined data in it, which should be digitizer. I had hardcoded part of these in my earlier driver.

The whole series is based on top of Benjamin's hid-bpf support work, and
I've pushed a branch at [1] with a series that works and brings in
the dependency. There are also a few separate patches in this series to
fix the problems I found from Benjamin's initial work for hid-bpf; I
wasn't able to get things working without those. The branch is also
based on top of 5.16-rc2 which required some extra changes to the
patches from Benjamin.
Yeah, I also rebased on top of 5.16 shortly after sharing that branch
and got roughly the same last fix (HID: bpf: compile fix for
bpf_hid_foreach_rdesc_item). I am *very* interested in your "HID: bpf:
execute BPF programs in proper context" because that is something that
was bothering me a lot :)
Right, I think I have plenty of lockdep / scheduler checks enabled in my
kernel. They generate plenty of spam with i2c-hid without that patch.
The same issue may not be visible with some other low level hid devices
though, I don't have testing capability for anything but the i2c-hid
right now. I2C is quite notorious for the locking aspects as it is slow
and is used to control some pretty low level stuff like power management
etc.
As a rule of thumb, hid_hw_raw_request() can not and should not be
called in IRQ.
I tested your patch with a USB device, and got plenty of complaints too.

I know bpf now has the ability to defer a function call with timers,
so maybe that's what we need here.
That sounds like something that would work yes, I did use workqueue
before when this was a separate driver instead of a BPF program.
"HID: bpf: add expected_attach_type to bpf prog during detach" is
something I'll need to bring in too

but "HID: bpf: fix file mapping" is actually wrong. I initially wanted
to attach BPF programs to hidraw, but shortly realized that this is
not working because the `hid/rdesc_fixup` kills the hidraw node and so
releases the BPF programs. The way I am now attaching it is to use the
fd associated with the modalias in the sysfs file (for instance: `sudo
./hid_surface_dial /sys/bus/hid/devices/0005:045E:091B.*/modalias`).
This way, the reference to the struct hid_device is kept even if we
disconnect the device and reprobe it.
Ok I can check this out if it works for me also. The samples lead me to
/dev/hidraw usage.
Thanks again for your work, and I'd be curious to have your thoughts
on hid-bpf and if you think it is better than hidraw/evdev write/new
ioctls for your use case.
The new driver was 777 lines diff, the BPF one is 496 lines so it
appears smaller. The driver did support two different vendors though
(ELAN+Goodix, with their specific quirks in place), the BPF only a
single one right now (ELAN).

The vendor specific quirks are a question, do we want to support that
somehow in a single BPF binary, or should we attach vendor specific BPF
programs?
Good question.
The plan I had was to basically pre-compile BPF programs for the
various devices, but having them separated into generic + vendor
specifics seems interesting too.

I don't have a good answer right now.
At least for USI purposes, ELAN+GOODIX controllers have pretty different
quirks and it seems like having separate BPF programs might be
better; trying to get the same BPF program to run for both sounds
painful (it was rather painful to get this to work for a single vendor.)
The more I look at the 2 report descriptors, the more I think that we
should be able to have a common code in hid-input for dealing with
input reports, and then have specifics as bpf programs.
As mentioned earlier, I think Goodix needs a report fixup for the
features, and we might want to change the reported values for Elan
immediately after we issue the config change.

It seems very much that we are in the same situation as with Windows 7
multitouch screens. The spec was not restrictive enough, the HW
makers were not very careful, and we added multiple quirks for them.
I would prefer to have a common minimal hid-input handling and defer
the quirks in a BPF program :)
Yes, sounds good to me. When are you going to merge the base hid-bpf driver? :)

Chromium-os devices are one of the main customers for USI pens right
now, and I am not sure how well they will take the BPF concept. :) I did
ask their feedback though, and I'll come back on this once I have something.
Cool thanks.

Personally, I don't have much preference either way at this moment, both
seem like feasible options. I might lean a bit towards evdev/ioctl as it
seems a cleaner implementation as of now. The write mechanism I
implemented for the USI-BPF is a bit hacky, as it just directly writes
to a shared memory buffer and the buffer gets parsed by the kernel part
when it processes hidraw event. Anyways, do you have any feedback on
that part? BPF is completely new to me again so would love to get some
feedback.
Yeah, this feels wrong to me too.
I guess what we want is to run a BPF call initiated from the
userspace. I am not sure if this is doable. I'll need to dig further
too (I am relatively new to BPF too as a matter of fact).
I could not find a way to initiate BPF call from userspace, that's the
reason I implemented it this way. That said, I don't see any case where
this would fail though; we only ever write the values from single source
(userspace) and read them from kernel. If we miss a write, we just get
the old value and report the change later on.
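
The single-writer / single-reader scheme described here can be modelled
in userspace with C11 atomics. This is only a model of the idea, not
the actual sample (which would use memory shared between the BPF
program and the client); the struct and function names are invented
for the sketch. The reader may skip intermediate writes and may
occasionally see the same value twice, but it always ends up with the
latest one.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Shared between one writer (client) and one reader (event path). */
struct pen_request {
	_Atomic uint32_t seq;	/* bumped by the writer after each update */
	_Atomic uint32_t value;	/* e.g. the requested line style index */
};

/* Writer side: publish a new value, then advance the sequence number. */
static void request_write(struct pen_request *r, uint32_t v)
{
	atomic_store_explicit(&r->value, v, memory_order_relaxed);
	atomic_fetch_add_explicit(&r->seq, 1, memory_order_release);
}

/*
 * Reader side: returns true and fills *out if something was published
 * since the sequence number in *last_seen; otherwise returns false and
 * the caller keeps using the old value.
 */
static bool request_poll(struct pen_request *r, uint32_t *last_seen,
			 uint32_t *out)
{
	uint32_t seq = atomic_load_explicit(&r->seq, memory_order_acquire);

	if (seq == *last_seen)
		return false;

	*out = atomic_load_explicit(&r->value, memory_order_relaxed);
	*last_seen = seq;
	return true;
}
```

Two writes followed by one poll yield only the second value, which
matches the "miss a write, pick it up later" behavior described above.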
Yeah, I understand it works, it's just that you can not initiate a
bpf_hid_raw_request() call from a raw_event callback. You are in an
IRQ, and we need to run things as fast as possible.
So either we defer, or we find another way of contacting the stylus
outside of the IRQ.
Hmm right, I wonder why lockdep and friends don't complain about this though, maybe BPF is switching context somehow. But yes, obviously running this in irq context would be wrong.

To initiate a BPF call from userspace we would need some sort of hid-bpf
callback to a BPF program, which gets triggered by an ioctl or evdev
write or something coming from userspace. Which brings us back to the
chicken-egg problem we have with USI right now. :)
I am thinking of adding a new syscall hid_bpf_run() that the userspace
program can trigger. Otherwise, it seems from a very rough overview we
could hijack the bpf_test_run() syscall, but that would not be very
nice.
Initiating an event from evdev is not very compatible with the BPF
approach because you'll need to also open the evdev node, which you
don't when you run a BPF program.

This would work for me.

-Tero


Cheers,
Benjamin

-Tero


Cheers,
Benjamin

One option is of course to push the write portion of the code to
userspace and just use hidraw, but we still need to filter out the bogus
events somehow, and do that in a vendor specific manner. I don't think
this can be done in userspace, as plenty of information that would be
needed to do this properly has been lost at the input-event level.

-Tero

Cheers,
Benjamin

-Tero

[1] https://github.com/t-kristo/linux/tree/usi-5.16-rfc-v2-bpf


Attachment: usi-logs.tar.gz
Description: application/gzip