Closed Bug 1061186 Opened 9 years ago Closed 9 years ago

Intermittent test_mozsettings.html | Geolocation didn't work after setting geolocation.enabled to true.

Categories

(Core :: DOM: Geolocation, defect)

x86
Windows 8
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla35
Tracking Status
firefox33 --- unaffected
firefox34 --- fixed
firefox35 --- fixed
firefox-esr31 --- unaffected
b2g-v2.1 --- fixed
b2g-v2.2 --- fixed

People

(Reporter: cbook, Assigned: garvan)

References

()

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

WINNT 6.2 fx-team debug test mochitest-4 on 2014-09-01 03:39:17 PDT for push 0b31ce3ca299

slave: t-w864-ix-040

https://tbpl.mozilla.org/php/getParsedLog.php?id=47164753&tree=Fx-Team



118 INFO TEST-UNEXPECTED-FAIL | /tests/dom/tests/mochitest/geolocation/test_mozsettings.html | Geolocation didn't work after setting geolocation.enabled to true.
Component: DOM → Geolocation
Flags: needinfo?(gkeeley)
The typical fix for these intermittent and rare geolocation test bugs is to up the timeout values in the tests. Currently, the timeouts in this test are 500ms when toggling a pref on/off. 
To repro the problem locally, I have to lower the timeout to 5ms, even 10 ms was not enough. I suspect when timeouts get *that* low, the event loop hasn't run enough cycles to guarantee a normal run of the test case (that is, a *normal* flow of events when toggling a setting, might be place an event on the queue that when processed, places another event on the queue, etc.). 
I'll add a patch for a longer timeout, although my evidence for this being the problem is not convincing.
Is there a way to modify these tests so they're not so heavily dependent on timing? Events we can wait for or something?

https://developer.mozilla.org/en-US/docs/Mozilla/QA/Avoiding_intermittent_oranges has some tips, FWIW.
It looks like the previous mochitest timeout bugs we dealt with through increasing the timeout, I'll join the band-aid parade and do the same here, by doubling the wait timeout.
Hooking in to an actual event would be much better (there would still be a timeout to detect failure if the event didn't occur), but I don't know of a mechanism that can be hooked in to.
Attachment #8492474 - Flags: review?(josh)
Flags: needinfo?(gkeeley)
Comment on attachment 8492474 [details] [diff] [review]
Increase mochitest wait time from 500ms to 1sec.

Review of attachment 8492474 [details] [diff] [review]:
-----------------------------------------------------------------

Sigh.
Attachment #8492474 - Flags: review?(josh) → review+
(In reply to Garvan Keeley [:garvank] from comment #17)
> Hooking in to an actual event would be much better (there would still be a
> timeout to detect failure if the event didn't occur), but I don't know of a
> mechanism that can be hooked in to.

Maybe Ehsan has an idea?
Flags: needinfo?(ehsan.akhgari)
What if we immediately get the geolocation settings value and execute the new geo requests inside the callback (after asserting that the value is toggled, obviously)?
(In reply to Garvan Keeley [:garvank] from comment #15)
> The typical fix for these intermittent and rare geolocation test bugs is to
> up the timeout values in the tests. Currently, the timeouts in this test are
> 500ms when toggling a pref on/off. 
> To repro the problem locally, I have to lower the timeout to 5ms, even 10 ms
> was not enough. I suspect when timeouts get *that* low, the event loop
> hasn't run enough cycles to guarantee a normal run of the test case (that
> is, a *normal* flow of events when toggling a setting, might be place an
> event on the queue that when processed, places another event on the queue,
> etc.). 

How many event loop hits does this need to work correctly?  The idiom that I have used in some tests in the past is like this: <http://mxr.mozilla.org/mozilla-central/source/content/base/test/test_bug666604.html?force=1#25> but that requires you to know exactly how many asynchronous tasks need to run.  Finding the number is usually possible through code inspection, but I don't know anything about this test in particular.
Flags: needinfo?(ehsan.akhgari)
All the code and tests predate me, and to grok that would require more cycles than I have ATM (not something I can intuit easily anyway). With that hitEventLoop idea, but would you have to know an exact number? The event loop won't sit idle, it will keep running, it would seem best to overestimate slightly. 

It will be interesting to see if changing the timeout from 500ms to 1000ms stops the mochitest failures. 1000ms is 100x longer than the timeout I need locally for the test to succeed (in the event of CPU saturation of the test host, I suppose all bets are off in terms of timing).
(In reply to Garvan Keeley [:garvank] from comment #22)
> All the code and tests predate me, and to grok that would require more
> cycles than I have ATM (not something I can intuit easily anyway). With that
> hitEventLoop idea, but would you have to know an exact number? The event
> loop won't sit idle, it will keep running, it would seem best to
> overestimate slightly. 

You do want to know a lower bound, yes, because otherwise your callback may run before the work you are waiting for is finished.  How long the event loop is running doesn't matter.  The idea is that if you have an asynchronous task which is broken up in let's say 3 parts, each part posting a task to the main thread after it runs, you can post your own tasks 3 or 4 times in a row, and that will guarantee that your callback runs after the last step of the asynchronous process you are waiting for.

> It will be interesting to see if changing the timeout from 500ms to 1000ms
> stops the mochitest failures. 1000ms is 100x longer than the timeout I need
> locally for the test to succeed (in the event of CPU saturation of the test
> host, I suppose all bets are off in terms of timing).

We can definitely play with the timer values, but all that gives you is more breathing room (and sometimes in practice, that's "good enough for now"...)
May I point again to my idea in comment 20? It seems like an easy way to gate further actions on ensuring the relevant callbacks have finished running.
Yeah, just curious how Ehsan's idea worked, I am under 'put out fires only' mode in gecko geo ATM, so I wasn't planning on jumping on the slight smoulder.

Josh, the current code is:
toggleSetting( value, callback to perform geo request), and the callback gets called before the setting has taken effect.

Could you clarify what you suggest to change there?
I'm saying that inside the callback you dispatch a runnable that creates a lock and queries the setting value, re-dispatching itself if the setting doesn't report being changed yet. If it has been changed, you continue with the original desired callback.
Argh, forgot to checkin-needed
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/48a74ac43195
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
You need to log in before you can comment on or make changes to this bug.