Re: [PATCH] ASoC: da7219: Fix pole orientation detection on OMTP headsets when playing music

From: Mark Brown
Date: Thu Feb 02 2023 - 14:36:56 EST


On Thu, Feb 02, 2023 at 10:39:51AM -0800, Guenter Roeck wrote:
> On 2/2/23 09:04, Mark Brown wrote:

> > It's definitely unclear, there's a datasheet at [1] which does appear to
> > explicitly call for a 512ms delay though (see figure 20 on page 50). It
> > does look like it should only be applied in the case where an inserted
> > jack is detected (ie, when identifying an accessory or button press) and
> > not when removal is detected though.

> The datasheet doesn't really suggest that a delay shall be applied using
> msleep (ie in the code). The chip presumably debounces internally (see

Obviously it doesn't call for an explicit implementation in the host.

> jackdet_debounce and jackdet_rem_deb), and there is also jack_detect_rate
> to configure the detection rate. The table seems to suggest (to me) that
> there is an e_jack_insertion event, which would then be followed 64-512 ms
> later with an e_jack_detect_complete event.

Right, I think what I was looking at was that in combination of the fact
that there's a *much* longer window before the host clears the interrupt
shown on the first JACK_IN. It could be spurious and possibly just due
to the host type check thing in the diagram but it smells real bad, like
the hardware state machine has robustness issues or something. The
diagram currently doesn't quite correspond to the code since we have the
delay applied unconditionally, and there's that undocumented
register for the ground switch being managed.

> Whatever is done in software is on top of that, or at least that is my
> understanding, and not explained by anything in the datasheet.

> Given that the chip itself supports debouncing internally, it is not clear
> to me what the delay is actually supposed to accomplish. Soft debounce
> on top of chip debounce ? I don't see that explained anywhere, though of
> course I might be missing it.

That's what it looks like it's trying to accomplish but as you say it's
not exactly explicit. I *suspect* it's trying to debounce in more cases
than is needed.

> > > is not conceivable (or explained) why the ground switch is enabled only
> > > to be disabled immediately afterwards if a jack was removed.

> > It smells like there's a power benefit to leaving it disabled when
> > unplugged (which seems plausible), and possibly like the detection is
> > more stable with the ground switch enabled. The ground switch is not
> > documented AFAICT (it's in register 0xfb which isn't named and doesn't
> > appear to appear in the datsheet from a quick search). The code is
> > leaving the switch enabled so long as an accessory is plugged.

> I understand. What I don't understand is that it is always enabled
> in the interrupt handler, no matter if a jack was inserted or not,
> only to be disabled immediately if the jack was disabled or after
> insertion detection work is complete.

My guess was that it was making the detection more stable, it's
surprising that it'd help with simple presence detection though.

> > Are you saying this is actually crashing, or just that you're getting
> > warnings about threads being blocked for too long (that was what was
> > posted earlier in the thread)? The only things I can see that look like

> ChromeOS is configured to crash after stalled threads are detected (ie
> after 120 seconds), so this is actually causing crashes.

Ah, that's much more serious than I'd understood from the log you
posted.

> > they have the potential to actually lock up are the cancel_work_sync()
> > calls but they were unchanged and the backtrace you showed was showing
> > the thread in the msleep(). My guess would be that you've got systems
> > where there are very frequent jack detection events (potentiallly with
> > broken accessories, or possibly due to the ground switch putting things
> > into the wrong priority) and that the interrupt is firing again as soon
> > as the thread unmasks the primary interrupt which means it never
> > actually stops running.

> That is what I strongly suspect is happening. I don't know why exactly
> the interrupt is firing continuously, but the hang is always in msleep().
> One possibility might be that the event is actually a disconnect event,
> and that enabling and immediately disabling the ground switch causes
> another interrupt, which is then handled immediately, causing the hang.

Could be. I'd be willing to guess that it's not just one event but
rather a stream of events of some kind. Possibly if it's due to the
ground switch it's spuriously detecting a constant stream of button
presses for the affected systems, which don't produce any UI visible
result which would cause users to pull the accessory for whatever
reason? Whatever's going on I bet it's broken accessories triggering it.

> > My other guess would be that moving the delay that's been added to a
> > delayed work would avoid the warnings, though you might want to manually
> > keep the physical interrupt disabled while that's running which is fun.

> I am not sure if that would fix anything. The current code sleeps, then
> enables the ground switch and does the rest of the detection. I'd somewhat
> understand the code if it would enable the ground switch after an "insertion
> detected" interrupt, then wait for some amount of time and handle the rest
> of the detection after waiting (even though that should really be handled by
> the "detection complete" interrupt). But that isn't what it does.
> If we were to implement the above, I suspect the result would be that the
> interrupt still happens all the time, and the only difference would be that
> it would be "silenced" while the delayed work is waiting to be scheduled.
> That doesn't really fix the problem, it only works around it. But, sure,
> it would be much better than the current situation.

Yes, exactly - I was just looking at a refactoring in the code which
would mitigate the immediate problem while keeping the current partially
documented algorithm in place.

> My "wild shot" fix would be to enable the ground switch after an insertion
> event and to drop the software sleep entirely.

That's entirely plausible to me, either together or possibly just one of
those is actually needed. Do you want to send a patch?

> However, it is really impossible to know what the delay is for in the
> first place. Looking into the code further, the sleep time actually matches
> the configured jack detection rate. I have no idea why it would make sense
> to wait for a detection cycle after an event, then enable the ground switch
> and actually handle the event (which by then probably reports that jack
> detection is complete after an insertion). I really don't understand
> the logic behind that.

This all smells like there's either a race condition in a state machine
somewhere or the button detection needs a bit of help (though if it's
the latter then it'd be conditional on a microphone having been
detected).

Hopefully David will get back to us with some explanation and ideally
fix.

Attachment: signature.asc
Description: PGP signature