Closed
Bug 769524
Opened 12 years ago
Closed 11 years ago
Frequent robocop testSearchSuggestions | GeckoEventExpecter - blockForEvent timeout: SearchEngines:Data
Categories
(Firefox for Android Graveyard :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
Firefox 22
People
(Reporter: philor, Assigned: gbrown)
References
Details
(Keywords: intermittent-failure)
Attachments
(3 files, 1 obsolete file)
1.36 KB,
patch
|
gbrown
:
review+
emorley
:
checkin+
|
Details | Diff | Splinter Review |
1.81 KB,
patch
|
Details | Diff | Splinter Review | |
1.34 KB,
patch
|
gbrown
:
review+
|
Details | Diff | Splinter Review |
Brought to you by bug 756704 turning hangs and timeouts into test failures, near as I can tell. https://tbpl.mozilla.org/php/getParsedLog.php?id=13080199&tree=Mozilla-Inbound Android Tegra 250 mozilla-inbound opt test robocop on 2012-06-28 12:43:08 PDT for push 23d9d354236e slave: tegra-141 1 INFO TEST-START | testSearchSuggestions 2 INFO TEST-UNEXPECTED-FAIL | testSearchSuggestions | GeckoEventExpecter - blockForEvent timeout: SearchEngines:Data Exception caught during test! junit.framework.AssertionFailedError: 2 INFO TEST-UNEXPECTED-FAIL | testSearchSuggestions | GeckoEventExpecter - blockForEvent timeout: SearchEngines:Data at junit.framework.Assert.fail(Assert.java:47) at org.mozilla.fennec.FennecMochitestAssert._logMochitestResult(FennecMochitestAssert.java:107) at org.mozilla.fennec.FennecMochitestAssert.ok(FennecMochitestAssert.java:137) at org.mozilla.fennec.FennecNativeActions$GeckoEventExpecter.blockForEvent(FennecNativeActions.java:141) at org.mozilla.fennec.tests.testSearchSuggestions.testSearchSuggestions(testSearchSuggestions.java:46) at java.lang.reflect.Method.invokeNative(Native Method) at java.lang.reflect.Method.invoke(Method.java:521) at android.test.InstrumentationTestCase.runMethod(InstrumentationTestCase.java:204) at android.test.InstrumentationTestCase.runTest(InstrumentationTestCase.java:194) at android.test.ActivityInstrumentationTestCase2.runTest(ActivityInstrumentationTestCase2.java:186) at org.mozilla.fennec.tests.BaseTest.runTest(BaseTest.java:93) at junit.framework.TestCase.runBare(TestCase.java:127) at junit.framework.TestResult$1.protect(TestResult.java:106) at junit.framework.TestResult.runProtected(TestResult.java:124) at junit.framework.TestResult.run(TestResult.java:109) at junit.framework.TestCase.run(TestCase.java:118) at android.test.AndroidTestRunner.runTest(AndroidTestRunner.java:169) at android.test.AndroidTestRunner.runTest(AndroidTestRunner.java:154) at android.test.InstrumentationTestRunner.onStart(InstrumentationTestRunner.java:520) at android.app.Instrumentation$InstrumentationThread.run(Instrumentation.java:1447) 3 INFO TEST-UNEXPECTED-FAIL | testSearchSuggestions | Exception caught - junit.framework.AssertionFailedError: 2 INFO TEST-UNEXPECTED-FAIL | testSearchSuggestions | GeckoEventExpecter - blockForEvent timeout: SearchEngines:Data
Reporter | ||
Comment 1•12 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=13085248&tree=Mozilla-Inbound
Reporter | ||
Comment 2•12 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=13085248&tree=Mozilla-Inbound
Reporter | ||
Comment 3•12 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=13087267&tree=Mozilla-Inbound
Reporter | ||
Comment 4•12 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=13087714&tree=Mozilla-Inbound
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Updated•12 years ago
|
Assignee: nobody → gbrown
Assignee | ||
Comment 7•12 years ago
|
||
In my testing, I see the SearchEngines:Data event received very quickly -- we wait less than 20 ms usually. So the 90000 ms timeout seems reasonable. I think the test may be occasionally missing the SearchEngines:Data event by starting to listen too late. This patch moves creation of the event expecter back to before we start waiting for Gecko:Ready. Is that reasonable?
Attachment #637945 -
Flags: review?(bnicholson)
Assignee | ||
Comment 8•12 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=f55b31bbf3c3
Comment 9•12 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #7) > Created attachment 637945 [details] [diff] [review] > patch to listen for suggestions data event earlier > > In my testing, I see the SearchEngines:Data event received very quickly -- > we wait less than 20 ms usually. So the 90000 ms timeout seems reasonable. > > I think the test may be occasionally missing the SearchEngines:Data event by > starting to listen too late. This patch moves creation of the event expecter > back to before we start waiting for Gecko:Ready. Is that reasonable? I don't understand how that could happen; we only send the SearchEngines:Get message after the AwesomeBar has been opened, so SearchEngines:Data should only be received then as well. As long as we listen for SearchEngines:Data before we call clickOnAwesomeBar(), I think we should be safe. Are you seeing SearchEngines:Data received before the AwesomeBar is clicked? If so, I don't think that should be happening...
Assignee | ||
Comment 10•12 years ago
|
||
(In reply to Brian Nicholson (:bnicholson) from comment #9) > Are you seeing SearchEngines:Data received before the AwesomeBar is clicked? I don't have conclusive evidence of that, no. The tbpl logs suggest there are times when we wait 90 s for SearchEngines:Data and don't receive it, and I thought we might have missed it...but if that doesn't make sense, then....?
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 26•12 years ago
|
||
Comment on attachment 637945 [details] [diff] [review] patch to listen for suggestions data event earlier Cancelling review -- this isn't a sensible solution and probably won't help.
Attachment #637945 -
Flags: review?(bnicholson)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 28•12 years ago
|
||
Probable dup of 771189 -- just marking as a dependency for now.
Depends on: 771189
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 33•12 years ago
|
||
(In reply to TinderboxPushlog Robot from comment #32) > philor > https://tbpl.mozilla.org/php/getParsedLog.php?id=13296058&tree=Mozilla- > Inbound > Android Tegra 250 mozilla-inbound opt test robocop on 2012-07-06 11:23:13 > slave: tegra-087 Well that is disappointing! This happened after the fix for 771189 landed on inbound.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 35•12 years ago
|
||
(In reply to TinderboxPushlog Robot from comment #34) > philor > https://tbpl.mozilla.org/php/getParsedLog.php?id=13296701&tree=Mozilla- > Inbound > Android Tegra 250 mozilla-inbound opt test robocop on 2012-07-06 11:58:21 > slave: tegra-248 This one is different -- see bug 769145.
Comment 36•12 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #35) > (In reply to TinderboxPushlog Robot from comment #34) > > philor > > https://tbpl.mozilla.org/php/getParsedLog.php?id=13296701&tree=Mozilla- > > Inbound > > Android Tegra 250 mozilla-inbound opt test robocop on 2012-07-06 11:58:21 > > slave: tegra-248 > > This one is different -- see bug 769145. I filed bug 771677 separately to keep track of this.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 44•12 years ago
|
||
Not exactly a proper fix, but it should avoid these errors until we find the real cause. Try push: https://tbpl.mozilla.org/?tree=Try&rev=fbe54653ee9f
Attachment #639941 -
Flags: review?(gbrown)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 66•12 years ago
|
||
Comment on attachment 639941 [details] [diff] [review] Add sleep to testSearchSuggestions to workaround blocked event queue Review of attachment 639941 [details] [diff] [review]: ----------------------------------------------------------------- I think it's fine to land this to resolve the frequent oranges while we sort this out. Please [leave open] this bug.
Attachment #639941 -
Flags: review?(gbrown) → review+
Assignee | ||
Updated•12 years ago
|
Attachment #637945 -
Attachment is obsolete: true
Assignee | ||
Comment 67•12 years ago
|
||
Investigating today, I found no sign of the blocked event queue -- I'm pretty sure that problem was resolved by bug 771189. I can reproduce this problem quite consistently in local test runs; the test successfully clicks on the awesome bar, the UI remains responsive (no ANR) but the wait for SearchEngines:Data times out. Adding a sleep, as in bnicholson's patch, allows the test to succeed. Without the sleep, I see SearchEngines:Get sent: https://hg.mozilla.org/mozilla-central/file/37fc31499d8a/mobile/android/base/awesomebar/AllPagesTab.java#l82 but the handler: https://hg.mozilla.org/mozilla-central/file/37fc31499d8a/mobile/android/chrome/content/browser.js#l5666 is not called. It looks like SearchService's _syncInit https://hg.mozilla.org/mozilla-central/file/37fc31499d8a/toolkit/components/search/nsSearchService.js#l2526 is not called, although it is scheduled to be called just around the time that SearchEngines:Get is sent: https://hg.mozilla.org/mozilla-central/file/37fc31499d8a/toolkit/components/search/nsSearchService.js#l3230 I suspect we need to call SearchService:ensureInitialized before/while processing SearchEngines:Get to avoid this failure. Will continue investigating tomorrow...
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 72•12 years ago
|
||
http://hg.mozilla.org/integration/mozilla-inbound/rev/166b95692423
Whiteboard: [orange] → [orange] [leave open]
Comment 73•12 years ago
|
||
I'm still seeing a problem with the SearchEngines:Get event getting processed in the queue. With this patch applied, here's the output I'm seeing: I/Gecko ( 7059): BRN: posting event: Accessibility:Settings I/Gecko ( 7059): BRN: posting event: Preferences:Get I/Gecko ( 7059): BRN: posting event: Accessibility:Settings I/Gecko ( 7059): BRN: posting event: Tab:Selected I/Gecko ( 7059): BRN: posting event: Window:Resize I/Gecko ( 7059): BRN: posting event: Preferences:Get I/Gecko ( 7059): BRN: posting event: SearchEngines:Get I/Gecko ( 7059): BRN: popping event: Accessibility:Settings I/Gecko ( 7059): BRN: popping event: Preferences:Get I/Gecko ( 7059): BRN: popping event: Accessibility:Settings I/Gecko ( 7059): BRN: popping event: Tab:Selected I/Gecko ( 7059): BRN: popping event: Window:Resize I/Gecko ( 7059): BRN: popping event: Preferences:Get That is, SearchEngines:Get is received and posted in AndroidJNI.cpp, but isn't popped off the queue to be processed. As a result, the SearchEngines:Get observer in browser.js is never getting called.
Assignee | ||
Comment 74•12 years ago
|
||
I definitely see SearchEngines:Get popped: 07-10 09:45:55.414 4326 4335 I Gecko : BRN: posting event: Accessibility:Settings 07-10 09:45:55.414 4326 4335 I Gecko : BRN: posting event: Telemetry:Add 07-10 09:45:55.414 4326 4335 I Gecko : BRN: posting event: Preferences:Get 07-10 09:45:55.414 4326 4335 I Gecko : BRN: posting event: Preferences:Get 07-10 09:45:55.414 4326 4335 I Gecko : BRN: posting event: Accessibility:Settings 07-10 09:45:55.414 4326 4335 I Gecko : BRN: posting event: Tab:Selected 07-10 09:45:55.464 4326 4335 I Gecko : BRN: posting event: Window:Resize 07-10 09:45:55.480 4326 4335 I Gecko : BRN: posting event: Preferences:Get 07-10 09:45:55.772 4326 4326 I Gecko : BRN: posting event: SearchEngines:Get 07-10 09:45:55.796 4326 4335 I Gecko : BRN: popping event: Accessibility:Settings 07-10 09:45:55.854 4326 4335 I Gecko : BRN: popping event: Telemetry:Add 07-10 09:45:56.132 4326 4335 I Gecko : BRN: popping event: Preferences:Get 07-10 09:45:56.172 4326 4335 I Gecko : BRN: popping event: Preferences:Get 07-10 09:45:57.290 4326 4326 I Gecko : BRN: posting event: Passwords:Init 07-10 09:45:57.296 4326 4335 I Gecko : BRN: popping event: Accessibility:Settings 07-10 09:45:57.296 4326 4335 I Gecko : BRN: popping event: Tab:Selected 07-10 09:45:57.308 4326 4326 I Gecko : BRN: posting event: Passwords:Init 07-10 09:45:57.308 4326 4335 I Gecko : BRN: popping event: Window:Resize 07-10 09:45:57.332 4326 4335 I Gecko : BRN: popping event: Preferences:Get 07-10 09:45:57.336 4326 4335 I Gecko : BRN: popping event: SearchEngines:Get 07-10 09:45:57.416 4326 4335 I Gecko : BRN: popping event: Passwords:Init ...and ExecuteSoon(_syncInit) is usually called, but _syncInit is not executed. So I suppose something (something unrelated?) is causing that thread to hang, and we are just hanging at different points in the SearchEngines:Get processing.
Assignee | ||
Comment 75•12 years ago
|
||
It looks like this is a slight variation on bug 771189. The Gecko thread is waiting in CreateCompositor, and there is another thread waiting on waitForValidSurface.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Updated•12 years ago
|
Whiteboard: [orange] [leave open] → [orange]
Updated•12 years ago
|
Attachment #639941 -
Flags: checkin+
Comment hidden (Legacy TBPL/Treeherder Robot) |
Updated•12 years ago
|
Keywords: intermittent-failure
Updated•12 years ago
|
Whiteboard: [orange]
Could this be related to the fact that bug 760036 makes nsSearchService initialization async?
(In reply to Geoff Brown [:gbrown] from comment #74) > ...and ExecuteSoon(_syncInit) is usually called, but _syncInit is not > executed. > > So I suppose something (something unrelated?) is causing that thread to > hang, and we are just hanging at different points in the SearchEngines:Get > processing. Ah, I had missed that one. That looks scary. If necessary, I can certainly hack around the symptoms with a very unsatisfactory solution: ensuring that nsSearchService always takes the deprecated, synchronous, main thread initialization path under Android.
Marking as blocker, as this prevents bug 760036, itself a blocker, from landing.
Severity: normal → blocker
Assignee | ||
Comment 87•12 years ago
|
||
I backed out the workaround (sleep) on a try push just to verify this is still a problem: It failed on nearly every run, with the same blockForEvent timeout. https://tbpl.mozilla.org/?tree=Try&rev=e1359a5dedd0
Assignee | ||
Comment 88•12 years ago
|
||
I also added some logging and verified that the cause of the timeouts is bug 772672 still: test waits for Gecko:Ready test initializes expecter for SearchEngines:Data test clicks on awesomebar AllPagesTab is created and sends SearchEngines:Get surfaceDestroyed is called CreateCompositor is called waitForValidSurface is called test waits for SearchEngines:Data... and times out after 90 s of inactivity on all threads With the sleep, events proceed like this: test waits for Gecko:Ready test sleeps for 5 seconds during which: CreateCompositor is called waitForValidSurface is called waitForValidSurface completes CreateCompositor completes test initializes expecter for SearchEngines:Data test clicks on awesomebar AllPagesTab is created and sends SearchEngines:Get surfaceDestroyed is called SearchEngines:Get is received SearchEngines:Data is sent test receives SearchEngines:Data and proceeds normally -- all tests pass
Assignee | ||
Comment 89•12 years ago
|
||
Returning severity to normal, since this no longer blocks 760036.
Severity: blocker → normal
Thanks for the help, gbrown.
Assignee | ||
Comment 91•11 years ago
|
||
I checked to see if the recent work on bug 826300 might allow us to remove the work-around; it does not. Without the initial sleep, this test still intermittently fails with a timeout waiting for SearchEngines:Data.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 96•11 years ago
|
||
These last 4 failures appear to be caused by the patch for bug 777712 -- backed out for other test failures.
Comment 97•11 years ago
|
||
Now that bug 772672 should be fixed, I'm re-testing to see if removing the sleep workaround here works. Try push at https://tbpl.mozilla.org/?tree=Try&rev=add2900e0aaa
Assignee | ||
Comment 99•11 years ago
|
||
Comment on attachment 720049 [details] [diff] [review] Remove workaround Review of attachment 720049 [details] [diff] [review]: ----------------------------------------------------------------- Great! Thanks much!
Attachment #720049 -
Flags: review?(gbrown) → review+
Comment 100•11 years ago
|
||
Adding checkin-needed for the above patch since inbound is closed.
Keywords: checkin-needed
Comment 101•11 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/93b2766c6f25
Keywords: checkin-needed
Comment 102•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/93b2766c6f25
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 22
Updated•3 years ago
|
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•