Re: [PATCH v5 2/2] media: rc: remove ir-rx51 in favour of generic pwm-ir-tx

From: Ivaylo Dimitrov
Date: Fri Sep 29 2023 - 04:50:05 EST


Hi,

On 26.09.23 г. 23:18 ч., Sean Young wrote:
On Tue, Sep 26, 2023 at 03:43:18PM +0300, Ivaylo Dimitrov wrote:
On 26.09.23 г. 10:16 ч., Sean Young wrote:
On Mon, Sep 25, 2023 at 07:06:44PM +0300, Ivaylo Dimitrov wrote:
On 1.09.23 г. 17:18 ч., Sean Young wrote:
The ir-rx51 is a pwm-based TX driver specific to the N900. This can be
handled entirely by the generic pwm-ir-tx driver, and in fact the
pwm-ir-tx driver has been compatible with ir-rx51 from the start.


Unfortunately, pwm-ir-tx does not work on n900. My investigation shows that
for some reason usleep_range() sleeps for at least 300-400 us more than what
interval it is requested to sleep. I played with cyclictest from rt-tests
package and it gives similar results - increasing the priority helps, but I
was not able to make it sleep for less that 300 us in average. I tried
cpu_latency_qos_add_request() in pwm-ir-tx, but it made no difference.

I get similar results on motorola droid4 (OMAP4), albeit there average sleep
is in 200-300 us range, which makes me believe that either OMAPs have issues
with hrtimers or the config we use has some issue which leads to scheduler
latency. Or, something else...

The pwm-ir-tx driver does suffer from this problem, but I was under the
impression that the ir-rx51 has the same problem.


Could you elaborate on the "pwm-ir-tx driver does suffer from this problem"?
Where do you see that?

So on a raspberry pi (model 3b), if I use the pwm-ir-tx driver, I get random
delays of up to 100us. It's a bit random and certainly depends on the load.

I'm measuring using a logic analyzer.

There have been reports by others on different machines with random delays
and/or transmit failures (as in the receiver occassionally fails to decode
the IR). I usually suggest they use the gpio-ir-tx driver, which does work
as far as I know (the signal looks perfect with a logic analyzer).

So far I've taken the view that the driver works ok for most situations,
since IR is usually fine with upto 100us missing here or there.

The gpio-ir-tx driver works much better because it does the entire send
under spinlock - obviously that has its own problems, because an IR transmit
can be 10s or even 100s of milliseconds.

I've never known of a solution to the pwm-ir-tx driver. If using hrtimers
directly improves the situation even a bit, then that would be great.


The issue with hrtimers is that we cannot use them directly, as pwm_apply_state() may sleep, but hrtimer function is called in atomic context.

ir-rx51 does not suffer from the same problem (albeit it has its own one,
see bellow)

In either case help is appreciated to dig further trying to find the reason
for such a big delay.

pwm-ir-tx uses usleep_range() and ir-rx51 uses hrtimers. I thought that
usleep_range() uses hrtimers; however if you're not seeing the same delay
on ir-rx51 then maybe it's time to switch pwm-ir-tx to hrtimers.


usleep_range() is backed by hrtimers already, however the difference comes
from how hrtimer is used in ir-rx51: it uses timer callback function that
gets called in softirq context, while usleep_range() puts the task in
TASK_UNINTERRUPTIBLE state and then calls schedule_hrtimeout_range(). For
some reason it takes at least 200-400 us (on average) even on OMAP4 to
switch back to TASK_RUNNING state.

The issue with ir-rx51 and the way it uses hrtimers is that it calls
pwm_apply_state() from hrtimer function, which is not ok, per the comment
here
https://elixir.bootlin.com/linux/v6.6-rc3/source/drivers/pwm/core.c#L502

I can make pwm-ir-tx switch to hrtimers, that's not an issue, but I am
afraid that there is some general scheduler or timers (or something else)
issue that manifests itself with usleep_range() misbehaving.

If we can switch pwm-ir-tx to hrtimers, that would be great.


I made some POC here, but unfortunately it failed more or less. The idea of POC is: setup hrtimer, start it in pwm_ir_tx() and do wait_for_completion() in a loop while calling complete() for the timer function. While it improves things a bit, I wouldn't say it makes the driver working ok on n900 - my TV registers one of let's say 5-10 pulse packages.

We have couple of issues:

- scheduler seems to use 32kHz timer, which means that we can never have precise pulse width, with error up to ~30 us, no matter what we do, IIUC.

- wait_for_completion() suffers from the same latency issue that usleep_range() has - it exits after 300-400 us after complete() has been called in the timer function.

- turning pwm off needs ~300us, because of either omap_dm_timer_stop() calling clk_get_rate() or __omap_dm_timer_stop() waiting for fclk period * 3.5 (see https://elixir.bootlin.com/linux/v6.6-rc3/source/drivers/clocksource/timer-ti-dm.c#L269)

- in order to achieve some sane latency distribution, I have to set_user_nice(current, MIN_NICE); in pwm_ir_tx()


The ir-rx51 removal patches have already been queued to media_staging;
we may have to remove them from there if we can't solve this problem.


ir-rx51 has conceptual problem of calling function that might sleep from atomic context, however, we can fix omap_dm_timer_stop() to not call clk_get_rate() and that would make it working. So yeah, if we can't fix pwm-ir-tx then patches removal along with fixing dmtimer and fixing a couple of code issues ir-rx51 has, seems the only option to have working IRTX on n900. Maybe we can rename it to pwm-ir-tx-hrtimer as there is nothing n900 specific in it.

I don't have a n900 to test on, unfortunately.


I have and once I have an idea what's going on will port pwm-ir-tx to
hrtimers, if needed. Don't want to do it now as I am afraid the completion I
will have to use will have the same latency problems as usleep_range()

That would be fantastic. Please do keep us up to date with how you are
getting on. Like I said, it would be nice to this resolved before the next
merge window.


The only thing I didn't try yet is to start another thread and to set that thread to use FIFO scheduler. I will report once I have tried that.

Regards,
Ivo

Thanks,
Sean