Closed Bug 784079 Opened 8 years ago Closed 8 years ago

Gonk sensor polling blocks the process from exiting, activation/deactivation are racey.


(Core :: Hardware Abstraction Layer (HAL), defect)

Gonk (Firefox OS)
Not set



blocking-kilimanjaro ?


(Reporter: marshall, Assigned: marshall)




(2 files)

This is a pretty sensitive race condition -- I've only been able to reproduce this on the ARM emulator. Attaching gdb also makes the error go away. 

After a bit of investigation, a symptom of this problem is that |sActivatedSensors| gets decremented to -1, and the check in GonkSensor [1] continues to poll.

The "real" fix for this would probably be:
- Make |sActivatedSensors| thread safe by protected it with a mutex
- Ensure there is only one code path that disables the orientation sensor. Since I haven't been able to reproduce this with the debugger attached, I've only been able to verify one code path: nsAppShell::Exit [2]. I've also attached the trace I got from that code path.

Attached patch hack fix - v1Splinter Review
This probably isn't the right way to fix this, but it unblocks my testing.
Someone more familiar with HAL might want to chime in on the best way to fix this
(and provide a better patch). Marking for feedback in case this is acceptable..
Attachment #653452 - Flags: feedback?(mwu)
Attachment #653452 - Flags: feedback?(mwu) → feedback?(slee)
Comment on attachment 653452 [details] [diff] [review]
hack fix - v1

Hi Marshall,

sActivatedSensors will be modified and accessed only by sSwitchThread. So that I think it does not need a mutex to protect it. I think the reason why sActivatedSensors is negative is that someone does not call hal::UnregisterSensorObserver/hal::RegisterSensorObserver from main thread. Can you build a debug version and try again? We have an assertion in hal::UnregisterSensorObserver that can check whether it is called from main thread. Thanks~
So it turns out that enabling debugging also makes this race condition go away :(.

Since then, even after clobbering and disabling debug, I haven't been able to reproduce the problem again. Resolving as WORKSFORME, will re-open if I encounter it again.
Closed: 8 years ago
Resolution: --- → WORKSFORME
Attachment #653452 - Flags: feedback?(slee)
I think one of possible reason for this issue is that 
    Gonk Sensor didnt' check the pairs of activate/deactivate sensors.

The formal way of this should be 
    1. Create records for each activating commands. 
       And each record should contain the identifier for caller.

    2. For each deactivating commands, sensor service should check whether it's identifier 
       was in activating records then allow to do deactivation.

    3. Protecting records and de/activating sensors with the mutex.

In this way, we can guarantee the pairs of activate/deactivate commands.
And avoid this issue.
Chris and I have now been able to reproduce this on Otoro, but the symptoms are slightly different.

On Otoro, we never see |sActivatedSensors| go below 0, but we have seen mismatched activation/deactivations, and we've also verified that that |SensorDevice.poll()| is blocking the Gonk sensor thread from cleanly exiting when we try to shutdown with |nsAppRunner::Shutdown()|.
blocking-kilimanjaro: --- → ?
Resolution: WORKSFORME → ---
Summary: Orientation sensor deactivation happens twice, stops the process from cleanly exiting → Gonk sensor polling blocks the process from exiting, activation/deactivation are racey.
Chris has opened another bug w/ patch for this, marking this as duplicate.
Closed: 8 years ago8 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 786573
You need to log in before you can comment on or make changes to this bug.