Closed
Bug 769524
Opened 13 years ago
Closed 12 years ago
Frequent robocop testSearchSuggestions | GeckoEventExpecter - blockForEvent timeout: SearchEngines:Data
Categories
(Firefox for Android Graveyard :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
Firefox 22
People
(Reporter: philor, Assigned: gbrown)
References
Details
(Keywords: intermittent-failure)
Attachments
(3 files, 1 obsolete file)
|
1.36 KB,
patch
|
gbrown
:
review+
emorley
:
checkin+
|
Details | Diff | Splinter Review |
|
1.81 KB,
patch
|
Details | Diff | Splinter Review | |
|
1.34 KB,
patch
|
gbrown
:
review+
|
Details | Diff | Splinter Review |
Brought to you by bug 756704 turning hangs and timeouts into test failures, near as I can tell.
https://tbpl.mozilla.org/php/getParsedLog.php?id=13080199&tree=Mozilla-Inbound
Android Tegra 250 mozilla-inbound opt test robocop on 2012-06-28 12:43:08 PDT for push 23d9d354236e
slave: tegra-141
1 INFO TEST-START | testSearchSuggestions
2 INFO TEST-UNEXPECTED-FAIL | testSearchSuggestions | GeckoEventExpecter - blockForEvent timeout: SearchEngines:Data
Exception caught during test!
junit.framework.AssertionFailedError: 2 INFO TEST-UNEXPECTED-FAIL | testSearchSuggestions | GeckoEventExpecter - blockForEvent timeout: SearchEngines:Data
at junit.framework.Assert.fail(Assert.java:47)
at org.mozilla.fennec.FennecMochitestAssert._logMochitestResult(FennecMochitestAssert.java:107)
at org.mozilla.fennec.FennecMochitestAssert.ok(FennecMochitestAssert.java:137)
at org.mozilla.fennec.FennecNativeActions$GeckoEventExpecter.blockForEvent(FennecNativeActions.java:141)
at org.mozilla.fennec.tests.testSearchSuggestions.testSearchSuggestions(testSearchSuggestions.java:46)
at java.lang.reflect.Method.invokeNative(Native Method)
at java.lang.reflect.Method.invoke(Method.java:521)
at android.test.InstrumentationTestCase.runMethod(InstrumentationTestCase.java:204)
at android.test.InstrumentationTestCase.runTest(InstrumentationTestCase.java:194)
at android.test.ActivityInstrumentationTestCase2.runTest(ActivityInstrumentationTestCase2.java:186)
at org.mozilla.fennec.tests.BaseTest.runTest(BaseTest.java:93)
at junit.framework.TestCase.runBare(TestCase.java:127)
at junit.framework.TestResult$1.protect(TestResult.java:106)
at junit.framework.TestResult.runProtected(TestResult.java:124)
at junit.framework.TestResult.run(TestResult.java:109)
at junit.framework.TestCase.run(TestCase.java:118)
at android.test.AndroidTestRunner.runTest(AndroidTestRunner.java:169)
at android.test.AndroidTestRunner.runTest(AndroidTestRunner.java:154)
at android.test.InstrumentationTestRunner.onStart(InstrumentationTestRunner.java:520)
at android.app.Instrumentation$InstrumentationThread.run(Instrumentation.java:1447)
3 INFO TEST-UNEXPECTED-FAIL | testSearchSuggestions | Exception caught - junit.framework.AssertionFailedError: 2 INFO TEST-UNEXPECTED-FAIL | testSearchSuggestions | GeckoEventExpecter - blockForEvent timeout: SearchEngines:Data
| Reporter | ||
Comment 1•13 years ago
|
||
| Reporter | ||
Comment 2•13 years ago
|
||
| Reporter | ||
Comment 3•13 years ago
|
||
| Reporter | ||
Comment 4•13 years ago
|
||
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Assignee | ||
Updated•13 years ago
|
Assignee: nobody → gbrown
| Assignee | ||
Comment 7•13 years ago
|
||
In my testing, I see the SearchEngines:Data event received very quickly -- we wait less than 20 ms usually. So the 90000 ms timeout seems reasonable.
I think the test may be occasionally missing the SearchEngines:Data event by starting to listen too late. This patch moves creation of the event expecter back to before we start waiting for Gecko:Ready. Is that reasonable?
Attachment #637945 -
Flags: review?(bnicholson)
| Assignee | ||
Comment 8•13 years ago
|
||
Comment 9•13 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #7)
> Created attachment 637945 [details] [diff] [review]
> patch to listen for suggestions data event earlier
>
> In my testing, I see the SearchEngines:Data event received very quickly --
> we wait less than 20 ms usually. So the 90000 ms timeout seems reasonable.
>
> I think the test may be occasionally missing the SearchEngines:Data event by
> starting to listen too late. This patch moves creation of the event expecter
> back to before we start waiting for Gecko:Ready. Is that reasonable?
I don't understand how that could happen; we only send the SearchEngines:Get message after the AwesomeBar has been opened, so SearchEngines:Data should only be received then as well. As long as we listen for SearchEngines:Data before we call clickOnAwesomeBar(), I think we should be safe.
Are you seeing SearchEngines:Data received before the AwesomeBar is clicked? If so, I don't think that should be happening...
| Assignee | ||
Comment 10•13 years ago
|
||
(In reply to Brian Nicholson (:bnicholson) from comment #9)
> Are you seeing SearchEngines:Data received before the AwesomeBar is clicked?
I don't have conclusive evidence of that, no. The tbpl logs suggest there are times when we wait 90 s for SearchEngines:Data and don't receive it, and I thought we might have missed it...but if that doesn't make sense, then....?
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Assignee | ||
Comment 26•13 years ago
|
||
Comment on attachment 637945 [details] [diff] [review]
patch to listen for suggestions data event earlier
Cancelling review -- this isn't a sensible solution and probably won't help.
Attachment #637945 -
Flags: review?(bnicholson)
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Assignee | ||
Comment 28•13 years ago
|
||
Probable dup of 771189 -- just marking as a dependency for now.
Depends on: 771189
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Assignee | ||
Comment 33•13 years ago
|
||
(In reply to TinderboxPushlog Robot from comment #32)
> philor
> https://tbpl.mozilla.org/php/getParsedLog.php?id=13296058&tree=Mozilla-
> Inbound
> Android Tegra 250 mozilla-inbound opt test robocop on 2012-07-06 11:23:13
> slave: tegra-087
Well that is disappointing! This happened after the fix for 771189 landed on inbound.
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Assignee | ||
Comment 35•13 years ago
|
||
(In reply to TinderboxPushlog Robot from comment #34)
> philor
> https://tbpl.mozilla.org/php/getParsedLog.php?id=13296701&tree=Mozilla-
> Inbound
> Android Tegra 250 mozilla-inbound opt test robocop on 2012-07-06 11:58:21
> slave: tegra-248
This one is different -- see bug 769145.
Comment 36•13 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #35)
> (In reply to TinderboxPushlog Robot from comment #34)
> > philor
> > https://tbpl.mozilla.org/php/getParsedLog.php?id=13296701&tree=Mozilla-
> > Inbound
> > Android Tegra 250 mozilla-inbound opt test robocop on 2012-07-06 11:58:21
> > slave: tegra-248
>
> This one is different -- see bug 769145.
I filed bug 771677 separately to keep track of this.
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 44•13 years ago
|
||
Not exactly a proper fix, but it should avoid these errors until we find the real cause.
Try push: https://tbpl.mozilla.org/?tree=Try&rev=fbe54653ee9f
Attachment #639941 -
Flags: review?(gbrown)
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Assignee | ||
Comment 66•13 years ago
|
||
Comment on attachment 639941 [details] [diff] [review]
Add sleep to testSearchSuggestions to workaround blocked event queue
Review of attachment 639941 [details] [diff] [review]:
-----------------------------------------------------------------
I think it's fine to land this to resolve the frequent oranges while we sort this out. Please [leave open] this bug.
Attachment #639941 -
Flags: review?(gbrown) → review+
| Assignee | ||
Updated•13 years ago
|
Attachment #637945 -
Attachment is obsolete: true
| Assignee | ||
Comment 67•13 years ago
|
||
Investigating today, I found no sign of the blocked event queue -- I'm pretty sure that problem was resolved by bug 771189.
I can reproduce this problem quite consistently in local test runs; the test successfully clicks on the awesome bar, the UI remains responsive (no ANR) but the wait for SearchEngines:Data times out.
Adding a sleep, as in bnicholson's patch, allows the test to succeed.
Without the sleep, I see SearchEngines:Get sent:
https://hg.mozilla.org/mozilla-central/file/37fc31499d8a/mobile/android/base/awesomebar/AllPagesTab.java#l82
but the handler:
https://hg.mozilla.org/mozilla-central/file/37fc31499d8a/mobile/android/chrome/content/browser.js#l5666
is not called.
It looks like SearchService's _syncInit
https://hg.mozilla.org/mozilla-central/file/37fc31499d8a/toolkit/components/search/nsSearchService.js#l2526
is not called, although it is scheduled to be called just around the time that SearchEngines:Get is sent:
https://hg.mozilla.org/mozilla-central/file/37fc31499d8a/toolkit/components/search/nsSearchService.js#l3230
I suspect we need to call SearchService:ensureInitialized before/while processing SearchEngines:Get to avoid this failure. Will continue investigating tomorrow...
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 72•13 years ago
|
||
Whiteboard: [orange] → [orange] [leave open]
Comment 73•13 years ago
|
||
I'm still seeing a problem with the SearchEngines:Get event getting processed in the queue. With this patch applied, here's the output I'm seeing:
I/Gecko ( 7059): BRN: posting event: Accessibility:Settings
I/Gecko ( 7059): BRN: posting event: Preferences:Get
I/Gecko ( 7059): BRN: posting event: Accessibility:Settings
I/Gecko ( 7059): BRN: posting event: Tab:Selected
I/Gecko ( 7059): BRN: posting event: Window:Resize
I/Gecko ( 7059): BRN: posting event: Preferences:Get
I/Gecko ( 7059): BRN: posting event: SearchEngines:Get
I/Gecko ( 7059): BRN: popping event: Accessibility:Settings
I/Gecko ( 7059): BRN: popping event: Preferences:Get
I/Gecko ( 7059): BRN: popping event: Accessibility:Settings
I/Gecko ( 7059): BRN: popping event: Tab:Selected
I/Gecko ( 7059): BRN: popping event: Window:Resize
I/Gecko ( 7059): BRN: popping event: Preferences:Get
That is, SearchEngines:Get is received and posted in AndroidJNI.cpp, but isn't popped off the queue to be processed. As a result, the SearchEngines:Get observer in browser.js is never getting called.
| Assignee | ||
Comment 74•13 years ago
|
||
I definitely see SearchEngines:Get popped:
07-10 09:45:55.414 4326 4335 I Gecko : BRN: posting event: Accessibility:Settings
07-10 09:45:55.414 4326 4335 I Gecko : BRN: posting event: Telemetry:Add
07-10 09:45:55.414 4326 4335 I Gecko : BRN: posting event: Preferences:Get
07-10 09:45:55.414 4326 4335 I Gecko : BRN: posting event: Preferences:Get
07-10 09:45:55.414 4326 4335 I Gecko : BRN: posting event: Accessibility:Settings
07-10 09:45:55.414 4326 4335 I Gecko : BRN: posting event: Tab:Selected
07-10 09:45:55.464 4326 4335 I Gecko : BRN: posting event: Window:Resize
07-10 09:45:55.480 4326 4335 I Gecko : BRN: posting event: Preferences:Get
07-10 09:45:55.772 4326 4326 I Gecko : BRN: posting event: SearchEngines:Get
07-10 09:45:55.796 4326 4335 I Gecko : BRN: popping event: Accessibility:Settings
07-10 09:45:55.854 4326 4335 I Gecko : BRN: popping event: Telemetry:Add
07-10 09:45:56.132 4326 4335 I Gecko : BRN: popping event: Preferences:Get
07-10 09:45:56.172 4326 4335 I Gecko : BRN: popping event: Preferences:Get
07-10 09:45:57.290 4326 4326 I Gecko : BRN: posting event: Passwords:Init
07-10 09:45:57.296 4326 4335 I Gecko : BRN: popping event: Accessibility:Settings
07-10 09:45:57.296 4326 4335 I Gecko : BRN: popping event: Tab:Selected
07-10 09:45:57.308 4326 4326 I Gecko : BRN: posting event: Passwords:Init
07-10 09:45:57.308 4326 4335 I Gecko : BRN: popping event: Window:Resize
07-10 09:45:57.332 4326 4335 I Gecko : BRN: popping event: Preferences:Get
07-10 09:45:57.336 4326 4335 I Gecko : BRN: popping event: SearchEngines:Get
07-10 09:45:57.416 4326 4335 I Gecko : BRN: popping event: Passwords:Init
...and ExecuteSoon(_syncInit) is usually called, but _syncInit is not executed.
So I suppose something (something unrelated?) is causing that thread to hang, and we are just hanging at different points in the SearchEngines:Get processing.
| Assignee | ||
Comment 75•13 years ago
|
||
It looks like this is a slight variation on bug 771189. The Gecko thread is waiting in CreateCompositor, and there is another thread waiting on waitForValidSurface.
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 79•13 years ago
|
||
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
Updated•13 years ago
|
Whiteboard: [orange] [leave open] → [orange]
Updated•13 years ago
|
Attachment #639941 -
Flags: checkin+
| Comment hidden (Legacy TBPL/Treeherder Robot) |
Updated•12 years ago
|
Keywords: intermittent-failure
Updated•12 years ago
|
Whiteboard: [orange]
Could this be related to the fact that bug 760036 makes nsSearchService initialization async?
(In reply to Geoff Brown [:gbrown] from comment #74)
> ...and ExecuteSoon(_syncInit) is usually called, but _syncInit is not
> executed.
>
> So I suppose something (something unrelated?) is causing that thread to
> hang, and we are just hanging at different points in the SearchEngines:Get
> processing.
Ah, I had missed that one. That looks scary.
If necessary, I can certainly hack around the symptoms with a very unsatisfactory solution: ensuring that nsSearchService always takes the deprecated, synchronous, main thread initialization path under Android.
Marking as blocker, as this prevents bug 760036, itself a blocker, from landing.
Severity: normal → blocker
| Assignee | ||
Comment 87•12 years ago
|
||
I backed out the workaround (sleep) on a try push just to verify this is still a problem: It failed on nearly every run, with the same blockForEvent timeout.
https://tbpl.mozilla.org/?tree=Try&rev=e1359a5dedd0
| Assignee | ||
Comment 88•12 years ago
|
||
I also added some logging and verified that the cause of the timeouts is bug 772672 still:
test waits for Gecko:Ready
test initializes expecter for SearchEngines:Data
test clicks on awesomebar
AllPagesTab is created and sends SearchEngines:Get
surfaceDestroyed is called
CreateCompositor is called
waitForValidSurface is called
test waits for SearchEngines:Data... and times out after 90 s of inactivity on all threads
With the sleep, events proceed like this:
test waits for Gecko:Ready
test sleeps for 5 seconds during which:
CreateCompositor is called
waitForValidSurface is called
waitForValidSurface completes
CreateCompositor completes
test initializes expecter for SearchEngines:Data
test clicks on awesomebar
AllPagesTab is created and sends SearchEngines:Get
surfaceDestroyed is called
SearchEngines:Get is received
SearchEngines:Data is sent
test receives SearchEngines:Data and proceeds normally -- all tests pass
| Assignee | ||
Comment 89•12 years ago
|
||
Returning severity to normal, since this no longer blocks 760036.
Severity: blocker → normal
Thanks for the help, gbrown.
| Assignee | ||
Comment 91•12 years ago
|
||
I checked to see if the recent work on bug 826300 might allow us to remove the work-around; it does not. Without the initial sleep, this test still intermittently fails with a timeout waiting for SearchEngines:Data.
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Assignee | ||
Comment 96•12 years ago
|
||
These last 4 failures appear to be caused by the patch for bug 777712 -- backed out for other test failures.
Comment 97•12 years ago
|
||
Now that bug 772672 should be fixed, I'm re-testing to see if removing the sleep workaround here works. Try push at https://tbpl.mozilla.org/?tree=Try&rev=add2900e0aaa
| Assignee | ||
Comment 99•12 years ago
|
||
Comment on attachment 720049 [details] [diff] [review]
Remove workaround
Review of attachment 720049 [details] [diff] [review]:
-----------------------------------------------------------------
Great! Thanks much!
Attachment #720049 -
Flags: review?(gbrown) → review+
Comment 100•12 years ago
|
||
Adding checkin-needed for the above patch since inbound is closed.
Keywords: checkin-needed
Comment 101•12 years ago
|
||
Keywords: checkin-needed
Comment 102•12 years ago
|
||
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 22
Updated•4 years ago
|
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•