Last Comment Bug 788118 - Non-continuous sensor blocks thread from switching sensor
: Non-continuous sensor blocks thread from switching sensor
Status: VERIFIED FIXED
[LOE:S]
:
Product: Core
Classification: Components
Component: Hardware Abstraction Layer (HAL) (show other bugs)
: unspecified
: ARM Gonk (Firefox OS)
: -- normal (vote)
: mozilla18
Assigned To: Marco Chen [:mchen]
:
Mentors:
: 772669 (view as bug list)
Depends on:
Blocks: 772669
  Show dependency treegraph
 
Reported: 2012-09-04 03:09 PDT by Marco Chen [:mchen]
Modified: 2012-09-20 00:35 PDT (History)
9 users (show)
ryanvm: in‑testsuite-
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
+


Attachments
Log showing a non-continous sensor blocking the thread from switching sensors. (2.11 KB, text/plain)
2012-09-04 03:09 PDT, Marco Chen [:mchen]
no flags Details
WIPv1 on Main Thread with Log (Have Issues) (3.95 KB, patch)
2012-09-05 03:53 PDT, Marco Chen [:mchen]
no flags Details | Diff | Splinter Review
Log File from WIPv1 (17.75 KB, text/plain)
2012-09-05 03:56 PDT, Marco Chen [:mchen]
no flags Details
WIPv2 (8.74 KB, patch)
2012-09-07 01:14 PDT, Marco Chen [:mchen]
no flags Details | Diff | Splinter Review
WIPv3 (8.74 KB, patch)
2012-09-14 00:12 PDT, Marco Chen [:mchen]
no flags Details | Diff | Splinter Review
WIPv4 (8.29 KB, patch)
2012-09-16 20:58 PDT, Marco Chen [:mchen]
no flags Details | Diff | Splinter Review
WIPv4 (8.12 KB, patch)
2012-09-17 00:39 PDT, Marco Chen [:mchen]
mwu.code: review+
Details | Diff | Splinter Review
Check-in-candidate (8.25 KB, patch)
2012-09-17 19:57 PDT, Marco Chen [:mchen]
mwu.code: review+
Details | Diff | Splinter Review

Description Marco Chen [:mchen] 2012-09-04 03:09:21 PDT
Created attachment 658037 [details]
Log showing a non-continous sensor blocking the thread from switching sensors.

Once a non-continuous sensor is enabled, the thread will be blocked at poll().
If someone want to enable another sensor in this time, the SwitchSensor will be postponed until the environment making non-continuous sensor output events.

I suggest to create another thread for switching sensor not do on /polling thread/.
Comment 1 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-09-04 11:18:41 PDT
We need to call activate() on the main thread.  I had a patch to do this but I deleted it.
Comment 2 Marco Chen [:mchen] 2012-09-04 23:30:09 PDT
If we access the device node or sys attributes on main thread, 
then it will be blocked on these call for waiting finish. 
And we can't guarantee how much time consumed by driver layer for any I/O actions. This may effect the performance on main thread. That's why I suggest to implement switching sensor on another thread not main. Any suggestion?
Comment 3 Marco Chen [:mchen] 2012-09-05 03:53:50 PDT
Created attachment 658438 [details] [diff] [review]
WIPv1 on Main Thread with Log (Have Issues)

Consult with other colleges, they performed functions consumed less time on main thread directly. So do it on this issue.

But refer to the WIPv1 log, some issues here.
    1. It seems someone will iterate all sensors for enabling & disabling one time.
    2. The first time to activate Orientation sensor consumed 5 second. (strange.)
    3. For screen orientation, the acceleration sensor should be referred not orientation sensor. The later one is used on e-Compass and maybe a virtual sensor rely on acceleration and magnetic.
    4. After got into home screen and lunch the browser, the second time of enabling orientation seems to be failed. Because no events are output.

Continue to check.
Comment 4 Marco Chen [:mchen] 2012-09-05 03:56:38 PDT
Created attachment 658439 [details]
Log File from WIPv1

The log from Otoro with WIPv1 patch.
Comment 5 Marco Chen [:mchen] 2012-09-05 05:55:37 PDT
After do some investigation related to issues on Comment 3.

1. Anyone know why we need to iterate sensors once.

2. The 5 seconds is consumed by SensorDevice::activate(). This function will also notify Android battery service about sensor status via binder IPC. But it seems there is no Android battery service any more so it will consume 5 seconds on BatteryService::GetInstance(). Since it is the code outside the B2G how do avoid this issue if we do switch sensor on main thread.

3. Anyone know this?

4. Obsolete this issue. 

5. If we ignore the issue 2, this modification can resolve this Bug 788118.
Comment 6 Marco Chen [:mchen] 2012-09-07 01:14:43 PDT
Created attachment 659172 [details] [diff] [review]
WIPv2

1. Moving switching sensor from thread to main thread.
2. Moving the controll of sensor hw module from SensorDevice to GonkSensor.
   This is in order to void the 5 seconds from Binder IPC timeout for geting 
   Android Battery Service in SersorDevice.
3. Since polling and switching sensors are separated to main and new thread, keeping new thread to poll data.
Comment 7 Justin Lebar (not reading bugmail) 2012-09-07 11:26:32 PDT
Comment on attachment 659172 [details] [diff] [review]
WIPv2

This is probably more up mwu's alley.
Comment 8 Andrew Overholt [:overholt] 2012-09-10 10:43:25 PDT
Rotation not working is a blocker.
Comment 9 Marco Chen [:mchen] 2012-09-11 19:03:51 PDT
Hi, Michael Wu (:mwu). If you have time, please help me to review the patch.
Thanks very much for your effort.
Comment 10 Michael Wu [:mwu] 2012-09-12 15:49:07 PDT
Hi Marco - I'll attempt to review within the next 24 hours. I've been occupied the last 5 days or so with travel.
Comment 11 Michael Wu [:mwu] 2012-09-13 12:56:50 PDT
Comment on attachment 659172 [details] [diff] [review]
WIPv2

Review of attachment 659172 [details] [diff] [review]:
-----------------------------------------------------------------

General comments -

1. Add a space after //
2. Look at your patch in splinter review and delete any leading whitespace you might have added. Leading whitespace is highlighted in pink.

::: hal/gonk/GonkSensor.cpp
@@ +156,5 @@
>  };
>  
>  namespace hal_impl {
>  
> +#define DEFAULT_DEVICE_POLL_RATE 200000000 /*200ms*/

Why did you change this?

@@ +164,5 @@
>  static DebugOnly<int> sSensorRefCount[NUM_SENSOR_TYPE];
> +static base::Thread* sPollingThread;
> +static SensorHWInterface* sDevice;
> +
> +class SensorHWInterface

This class seems entirely unnecessary. It appears to serve 2 purposes -

1. Handling the case when we can't initialize a sensor module or sensor poll device. In this case, SetSensorState/EnableSensorNotifications/DisableSensor notifications need to check whether initialization is successful before doing anything else. This allows us to limit initialization checking to 3 functions instead of the 4 here, and also ensure we don't unnecessarily run other code if we can't initialize the sensor module first.
2. Wrapping the poll function so it handles EINTR correctly. There is only one caller and that logic can be inlined into the caller. Additionally, I have seen implementations of poll that already handle EINTR internally, so it seems unnecessary. See https://github.com/mozilla-b2g/android-device-crespo/blob/master/libsensors/sensors.cpp#L230 for one implementation

@@ +232,5 @@
>    const size_t numEventMax = 16;
>    sensors_event_t buffer[numEventMax];
> +  int n;
> +  const sensor_t* sensors;
> +  ssize_t size = sDevice->getSensorList(&sensors);

I appreciate this optimization, but if possible, please try to put improvements into separate patches. A series of small, focused patches will be reviewed faster than larger patches doing many things.

@@ +249,2 @@
>  
> +  return;

Don't return at the end of functions with a void return type.

@@ +260,4 @@
>  
> +  sDevice->activate((void*)aThreadId, aSensor.handle, aActivate);
> +  //only the timing of enabling sensor adjust the polling rate
> +  if(!aActivate)

Always put a space after if.

@@ +281,4 @@
>    for (ssize_t i = 0; i < size; i++) {
>      if (sensors[i].type == type) {
> +      //swiching sensor on main thread
> +      SwitchSensor(activate, sensors[i], pthread_self());

Dispatching the switch to another thread was done specifically to avoid a startup slowdown. See bug 730363. Maybe we can just use a thread separate from the poll thread that's dedicated to activating/deactivating sensors.

I think this is pretty much the heart of the patch. Some of the other changes may be useful, but a lot of time was spent understanding other parts of the patch which were trying to fix/improve something else. Please try to reduce the scope of a patch next time, or at least list all the things that the patch is trying to do.
Comment 12 Justin Lebar (not reading bugmail) 2012-09-13 12:58:25 PDT
> 2. Look at your patch in splinter review and delete any leading whitespace you might have added. 
> Leading whitespace is highlighted in pink.

git rebase --whitespace=fix can also help with this.
Comment 13 Marco Chen [:mchen] 2012-09-13 20:00:29 PDT
Hi, Michael. Thanks for your review first.

::: hal/gonk/GonkSensor.cpp
@@ +156,5 @@
>  };
>  
>  namespace hal_impl {
>  
> +#define DEFAULT_DEVICE_POLL_RATE 200000000 /*200ms*/

This value is referred from SensorDevice.h (Android)

@@ +164,5 @@
>  static DebugOnly<int> sSensorRefCount[NUM_SENSOR_TYPE];
> +static base::Thread* sPollingThread;
> +static SensorHWInterface* sDevice;
> +
> +class SensorHWInterface

This class is used to instead of SensorDevice.cpp from Android. So we can avoid the bug 730363 even we switch sensors on main thread.

@@ +281,4 @@
>    for (ssize_t i = 0; i < size; i++) {
>      if (sensors[i].type == type) {
> +      //swiching sensor on main thread
> +      SwitchSensor(activate, sensors[i], pthread_self());

Since we use SensorHWInterface to avoid bug730363 so we can consider to put SwitchSensor on main thread or another new thread. In my opinion, SwitchSensor didn't take too much effort (just opening the power & setup work for interrupt to report data) and didn't be called frequently so I still suggest it been done on main thread not wasting resource for this easy stuff. 

-----------------------------------------------------------------------

Thanks for Michael's reminding about patch should be focused on this issue only.
If we clear the concern on 
    a. Use Class SensorHWInterface to instead of SensorDevice.cpp from Andorid.
    b. Put SwitchSensor on main thread or new thread.
I will create a more precise patch for this issue only.

Thanks for Justin;s tip.
Comment 14 Michael Wu [:mwu] 2012-09-13 20:28:49 PDT
(In reply to Marco Chen [:mchen] from comment #13)
> Hi, Michael. Thanks for your review first.
> 
> ::: hal/gonk/GonkSensor.cpp
> @@ +156,5 @@
> >  };
> >  
> >  namespace hal_impl {
> >  
> > +#define DEFAULT_DEVICE_POLL_RATE 200000000 /*200ms*/
> 
> This value is referred from SensorDevice.h (Android)
> 

Hmm, ok. This is a change from the previous value, however.

> @@ +281,4 @@
> >    for (ssize_t i = 0; i < size; i++) {
> >      if (sensors[i].type == type) {
> > +      //swiching sensor on main thread
> > +      SwitchSensor(activate, sensors[i], pthread_self());
> 
> Since we use SensorHWInterface to avoid bug730363 so we can consider to put
> SwitchSensor on main thread or another new thread. In my opinion,
> SwitchSensor didn't take too much effort (just opening the power & setup
> work for interrupt to report data) and didn't be called frequently so I
> still suggest it been done on main thread not wasting resource for this easy
> stuff. 
> 

Hmm, ok, I didn't realize we were using Android's wrapper instead of the hal api directly. Using the hal api is much better - thanks.

However, SensorHWInterface is still not a very useful wrapper around the hal api - the hal api should be used directly. If activate isn't taking long, which it shouldn't, we should just keep this on the main thread.
Comment 15 Marco Chen [:mchen] 2012-09-14 00:12:45 PDT
Created attachment 661145 [details] [diff] [review]
WIPv3

1. Adjust the nits according to coding style.
2. Remove SensorHWInterface class then using hal module directly.
3. Call switchSensor on main thread.
4. Keep pollingThread on polling always.
5. sActivatedSensors is removed due to no necessary to monitor the timing of enable/disable polling. 

And for modification on SensorRunnable, I tried but found I can't avoid these because the pointers related to sensor hal module are put on hal_imple namespace. So I didn't remove it to keep more focus on this issue.
Comment 16 Michael Wu [:mwu] 2012-09-14 00:41:27 PDT
Comment on attachment 661145 [details] [diff] [review]
WIPv3

Review of attachment 661145 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for fixing the trailing whitespace!

Please put a space after // comments.

::: hal/gonk/GonkSensor.cpp
@@ +23,5 @@
>  #include "Hal.h"
>  #include "HalSensor.h"
>  #include "hardware/sensors.h"
>  #include "mozilla/Util.h"
> +#include <gui/Sensor.h>

What is this header being included for?

@@ +164,4 @@
>  static DebugOnly<int> sSensorRefCount[NUM_SENSOR_TYPE];
> +static base::Thread* sPollingThread;
> +static struct sensors_poll_device_t* sSensorDevice;
> +static struct sensors_module_t*sSensorModule;

Specifying struct should not be necessary here.

@@ +172,3 @@
>    const size_t numEventMax = 16;
>    sensors_event_t buffer[numEventMax];
> +  int n;

Move this declaration to the line where it's first assigned.

@@ +172,5 @@
>    const size_t numEventMax = 16;
>    sensors_event_t buffer[numEventMax];
> +  int n;
> +  const sensor_t* sensors;
> +  ssize_t size = sSensorModule->get_sensors_list(sSensorModule, &sensors);

This returns an int.

@@ +185,5 @@
>  
> +    for (int i = 0; i < n; ++i) {
> +      NS_DispatchToMainThread(new SensorRunnable(buffer[i], sensors, size));
> +    }
> +  } while(n >= 0);

Add a space after while

@@ +192,5 @@
>  static void
>  SwitchSensor(bool aActivate, sensor_t aSensor, pthread_t aThreadId)
>  {
>    int index = HardwareSensorToHalSensor(aSensor.type);
> +  status_t err(NO_ERROR);

activate returns int, not status_t.

@@ +197,5 @@
>  
>    MOZ_ASSERT(sSensorRefCount[index] || aActivate);
>  
> +  err = sSensorDevice->activate(sSensorDevice, aSensor.handle, aActivate);
> +  MOZ_ASSERT(!err);

I don't think we should expect external code to always succeed, so we should handle/ignore/warn instead of asserting. Same with the setDelay call.

@@ +218,5 @@
>  SetSensorState(SensorType aSensor, bool activate)
>  {
>    int type = HalSensorToHardwareSensor(aSensor);
>    const sensor_t* sensors = NULL;
> +  ssize_t size;

get_sensors_list returns an int.

Move this declaration to where it's assigned.

@@ +249,5 @@
> +    }
> +
> +    sensors_open(&sSensorModule->common, &sSensorDevice);
> +    if (!sSensorDevice) {
> +      LOGE("Can't get sensor poll device from module \n");

sSensorModule should be set to null in this case.

@@ +252,5 @@
> +    if (!sSensorDevice) {
> +      LOGE("Can't get sensor poll device from module \n");
> +      return;
> +    }
> +    if (sSensorDevice) {

This will always be true.

@@ +262,5 @@
> +      }
> +  } //if (!sSensorModule)
> +
> +  //If sensor hal module/device didn't be loaded succefully, skip to create thread.
> +  if (!sPollingThread && sSensorDevice) {

If sSensorModule is set to null if we can't open a sSensorDevice, then sSensorDevice will always be set and this can be simplified.
Comment 17 Marco Chen [:mchen] 2012-09-14 01:45:41 PDT
> ::: hal/gonk/GonkSensor.cpp
> @@ +23,5 @@
> >  #include "Hal.h"
> >  #include "HalSensor.h"
> >  #include "hardware/sensors.h"
> >  #include "mozilla/Util.h"
> > +#include <gui/Sensor.h>
> 
> What is this header being included for?
> 

It is for type of ssize_t, but after changing to int it is no necessary now.
Will remove it.

>@@ +172,3 @@
>>    const size_t numEventMax = 16;
>>    sensors_event_t buffer[numEventMax];
>> +  int n;
>
>Move this declaration to the line where it's first assigned.

If I declare it into do-while loop, then it will not be a scope for "while (n >= 0)" on line 189. So I need keep it there. Or any suggestion?

> @@ +197,5 @@
> >  
> >    MOZ_ASSERT(sSensorRefCount[index] || aActivate);
> >  
> > +  err = sSensorDevice->activate(sSensorDevice, aSensor.handle, aActivate);
> > +  MOZ_ASSERT(!err);
> 
> I don't think we should expect external code to always succeed, so we should
> handle/ignore/warn instead of asserting. Same with the setDelay call.
> 

Ok, since we can't do anything when this fail is happened. Just ignore it. (original code ignore it too.)

> @@ +249,5 @@
> > +    }
> > +
> > +    sensors_open(&sSensorModule->common, &sSensorDevice);
> > +    if (!sSensorDevice) {
> > +      LOGE("Can't get sensor poll device from module \n");
> 
> sSensorModule should be set to null in this case.
> 

In the case of sSensorModule got pointer but sSensorDevice not, if I set sSensorModule to null then these code section will be ran again when someone call EnableSensorNotifications().
So I keep sSensorModule with pointer to avoid running again and again. And later code will check sSensorDevice only.

> @@ +262,5 @@
> > +      }
> > +  } //if (!sSensorModule)
> > +
> > +  //If sensor hal module/device didn't be loaded succefully, skip to create thread.
> > +  if (!sPollingThread && sSensorDevice) {
> 
> If sSensorModule is set to null if we can't open a sSensorDevice, then
> sSensorDevice will always be set and this can be simplified.

I can't got your point, could you explain more?

Thanks.
Comment 18 Michael Wu [:mwu] 2012-09-14 10:20:26 PDT
(In reply to Marco Chen [:mchen] from comment #17)
> >@@ +172,3 @@
> >>    const size_t numEventMax = 16;
> >>    sensors_event_t buffer[numEventMax];
> >> +  int n;
> >
> >Move this declaration to the line where it's first assigned.
> 
> If I declare it into do-while loop, then it will not be a scope for "while
> (n >= 0)" on line 189. So I need keep it there. Or any suggestion?
> 

The n >= 0 check will never be false, however. It can just be converted to an infinite loop.

> > @@ +249,5 @@
> > > +    }
> > > +
> > > +    sensors_open(&sSensorModule->common, &sSensorDevice);
> > > +    if (!sSensorDevice) {
> > > +      LOGE("Can't get sensor poll device from module \n");
> > 
> > sSensorModule should be set to null in this case.
> > 
> 
> In the case of sSensorModule got pointer but sSensorDevice not, if I set
> sSensorModule to null then these code section will be ran again when someone
> call EnableSensorNotifications().
> So I keep sSensorModule with pointer to avoid running again and again. And
> later code will check sSensorDevice only.
> 

That's not really a case we need to optimize for, however.

> > @@ +262,5 @@
> > > +      }
> > > +  } //if (!sSensorModule)
> > > +
> > > +  //If sensor hal module/device didn't be loaded succefully, skip to create thread.
> > > +  if (!sPollingThread && sSensorDevice) {
> > 
> > If sSensorModule is set to null if we can't open a sSensorDevice, then
> > sSensorDevice will always be set and this can be simplified.
> 
> I can't got your point, could you explain more?
> 

This function will always return before reaching this point if we set sSensorModule to null at the point I mentioned before.
Comment 19 Michael Wu [:mwu] 2012-09-14 10:34:36 PDT
Comment on attachment 661145 [details] [diff] [review]
WIPv3

Review of attachment 661145 [details] [diff] [review]:
-----------------------------------------------------------------

Some additional comments. I think we'll be ready to land once these are all addressed.

::: hal/gonk/GonkSensor.cpp
@@ +168,3 @@
>  
>  static void
>  PollSensorsOnce()

This needs to be renamed since it doesn't just call poll once anymore. Maybe just PollSensors()

@@ +202,2 @@
>  
> +  //only the timing of enabling sensor adjust the polling rate

Comment unnecessary.

@@ +229,3 @@
>    for (ssize_t i = 0; i < size; i++) {
>      if (sensors[i].type == type) {
> +      //swiching sensor on main thread

Comment unnecessary.

@@ +240,3 @@
>  {
> +  //The life cycle of sSensorModule/sSensorDevice & sPollingThread is accompanied
> +  //with B2G, so they will be freed after B2G crashed or terminated

We never free these things. A comment like:

// sPollingThread never terminates because poll may never return

near the thread initialization would be more useful, I think.

@@ +259,5 @@
> +
> +        for (size_t i=0 ; i<size_t(count) ; i++)
> +          sSensorDevice->activate(sSensorDevice, sensors[i].handle, 0);
> +      }
> +  } //if (!sSensorModule)

These sort of comments are usually restricted to #endif lines. If you need this in normal indented code to find your place, then your code is probably too indented.
Comment 20 Marco Chen [:mchen] 2012-09-16 20:58:52 PDT
Created attachment 661675 [details] [diff] [review]
WIPv4

Follow the comment from :mwu.
Comment 21 Mozilla RelEng Bot 2012-09-17 00:00:24 PDT
Try run for cbb8e080817f is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=cbb8e080817f
Results (out of 273 total builds):
    exception: 2
    success: 242
    warnings: 27
    failure: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mchen@mozilla.com-cbb8e080817f
Comment 22 Marco Chen [:mchen] 2012-09-17 00:39:54 PDT
Created attachment 661708 [details] [diff] [review]
WIPv4

Follow comments from Michael Wu.
Comment 23 Michael Wu [:mwu] 2012-09-17 11:10:16 PDT
Comment on attachment 661708 [details] [diff] [review]
WIPv4

Review of attachment 661708 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, though I may have caught a logic error that I didn't see before - sorry about that.

::: hal/gonk/GonkSensor.cpp
@@ +197,2 @@
>  
> +  if (!aActivate) {

Wait, shouldn't this be something like if (!sSensorRefCount[index]) or if (aActivate) ? This code looks like it only sets the poll rate when turning the sensor off.

@@ +251,5 @@
> +    sPollingThread = new base::Thread("GonkSensors");
> +    MOZ_ASSERT(sPollingThread);
> +    // sPollingThread never terminates because poll may never return
> +    sPollingThread->Start();
> +    // keep this thread to poll sens data always.

I think the comment above this one explains roughly the same thing, so this can be removed.
Comment 24 Marco Chen [:mchen] 2012-09-17 19:57:30 PDT
Created attachment 662001 [details] [diff] [review]
Check-in-candidate

Follow the comments from mwu.
And if this passes the review, I will add checkin-needed flag.

Thanks for mwu's help really.
Comment 25 Michael Wu [:mwu] 2012-09-17 20:27:26 PDT
Comment on attachment 662001 [details] [diff] [review]
Check-in-candidate

Looks good, thanks!
Comment 26 Cervantes Yu [:cyu] [:cervantes] 2012-09-17 23:18:58 PDT
*** Bug 772669 has been marked as a duplicate of this bug. ***
Comment 27 Andrew Overholt [:overholt] 2012-09-18 12:48:37 PDT
Since this is ready for checkin, just noting that it's likely to be completed within a week :)
Comment 28 Ryan VanderMeulen [:RyanVM] 2012-09-18 18:35:57 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/80542168d0d9
Comment 29 Graeme McCutcheon [:graememcc] 2012-09-19 07:33:35 PDT
https://hg.mozilla.org/mozilla-central/rev/80542168d0d9
Comment 30 John Shih 2012-09-20 00:35:17 PDT
verfied on otoro-ics 2012-09-20
build info
gaia : 9ac2d4f8dd4eb1995bd1c2dfcafa990bceca0b7b
mozilla-central : 0b6677a9d5ed3cd8f0dd95f39cd3a0a204129d76

Note You need to log in before you can comment on or make changes to this bug.