Open Bug 1144816 Opened 10 years ago Updated 2 years ago

Intermittent browser_referrer_open_link_in_tab.js | Test timed out

Categories

(Core :: DOM: Navigation, defect, P3)

defect

Tracking

()

Tracking Status
firefox37 --- unaffected
firefox38 --- fixed
firefox39 --- fixed
firefox42 --- affected
firefox-esr31 --- unaffected

People

(Reporter: RyanVM, Unassigned)

References

Details

(Keywords: intermittent-failure, leave-open)

Attachments

(4 files, 1 obsolete file)

Christoph, I believe you wrote these tests even though Alex was the bug assignee? 09:17:50 INFO - 673 INFO TEST-START | browser/base/content/test/referrer/browser_referrer_open_link_in_tab.js 09:17:58 INFO - 1426695478096 Browser.Experiments.Experiments TRACE Experiments #0::enabled=true, true 09:17:58 INFO - 1426695478101 Browser.Experiments.Experiments TRACE Experiments #0::Registering instance with Addon Manager. 09:17:58 INFO - 1426695478102 Browser.Experiments.Experiments TRACE Experiments #0::Registering previous experiment add-on provider. 09:17:58 INFO - 1426695478105 Browser.Experiments.Experiments TRACE PreviousExperimentProvider #0::startup() 09:17:58 INFO - 1426695478108 Browser.Experiments.Experiments TRACE Experiments #0::_loadFromCache 09:17:58 INFO - 1426695478132 Browser.Experiments.Experiments TRACE Experiments #0::_loadTask finished ok 09:17:58 INFO - 1426695478133 Browser.Experiments.Experiments TRACE Experiments #0::_run 09:17:58 INFO - 1426695478134 Browser.Experiments.Experiments TRACE Experiments #0::_main iteration 09:17:58 INFO - 1426695478137 Browser.Experiments.Experiments TRACE Experiments #0::_evaluateExperiments 09:17:58 INFO - 1426695478144 Browser.Experiments.Experiments TRACE Experiments #0::_main finished, scheduling next run 09:19:20 INFO - Xlib: extension "RANDR" missing on display ":0". 09:19:21 INFO - TEST-INFO | screentopng: exit 0 09:19:21 INFO - 674 INFO checking window state 09:19:21 INFO - 675 INFO Console message: [JavaScript Warning: "unsafe CPOW usage" {file: "resource:///modules/sessionstore/TabState.jsm" line: 96}] 09:19:21 INFO - 676 INFO Console message: [JavaScript Warning: "unsafe CPOW usage" {file: "resource:///modules/sessionstore/TabState.jsm" line: 96}] 09:19:21 INFO - 677 INFO Console message: [JavaScript Warning: "unsafe CPOW usage" {file: "resource:///modules/sessionstore/TabState.jsm" line: 96}] 09:19:21 INFO - 678 INFO Console message: [JavaScript Warning: "unsafe CPOW usage" {file: "resource:///modules/sessionstore/TabState.jsm" line: 96}] 09:19:21 INFO - 679 INFO browser_referrer_open_link_in_tab: policy=[undefined] rel=[undefined] http:// -> http:// 09:19:21 INFO - 680 INFO Console message: 1426695478096 Browser.Experiments.Experiments TRACE Experiments #0::enabled=true, true 09:19:21 INFO - 681 INFO Console message: 1426695478101 Browser.Experiments.Experiments TRACE Experiments #0::Registering instance with Addon Manager. 09:19:21 INFO - 682 INFO Console message: 1426695478102 Browser.Experiments.Experiments TRACE Experiments #0::Registering previous experiment add-on provider. 09:19:21 INFO - 683 INFO Console message: 1426695478105 Browser.Experiments.Experiments TRACE PreviousExperimentProvider #0::startup() 09:19:21 INFO - 684 INFO Console message: 1426695478108 Browser.Experiments.Experiments TRACE Experiments #0::_loadFromCache 09:19:21 INFO - 685 INFO Console message: 1426695478132 Browser.Experiments.Experiments TRACE Experiments #0::_loadTask finished ok 09:19:21 INFO - 686 INFO Console message: 1426695478133 Browser.Experiments.Experiments TRACE Experiments #0::_run 09:19:21 INFO - 687 INFO Console message: 1426695478134 Browser.Experiments.Experiments TRACE Experiments #0::_main iteration 09:19:21 INFO - 688 INFO Console message: 1426695478137 Browser.Experiments.Experiments TRACE Experiments #0::_evaluateExperiments 09:19:21 INFO - 689 INFO Console message: 1426695478144 Browser.Experiments.Experiments TRACE Experiments #0::_main finished, scheduling next run 09:19:21 INFO - 690 INFO Console message: 1426695509398 Toolkit.GMP WARN GMPInstallManager.parseResponseXML got node name: html, expected: updates 09:19:21 INFO - 691 INFO Console message: [JavaScript Error: "1426695509398 Toolkit.GMP ERROR GMPInstallManager.simpleCheckAndInstall Could not check for addons: {"target":{},"message":"got node name: html, expected: updates"}" {file: "resource://gre/modules/Log.jsm" line: 749}] 09:19:21 INFO - 692 INFO Console message: 1426695509499 Services.HealthReport.HealthReporter WARN Saved state file does not exist. 09:19:21 INFO - 693 INFO Console message: 1426695530627 Toolkit.GMP WARN GMPInstallManager.parseResponseXML got node name: html, expected: updates 09:19:21 INFO - 694 INFO Console message: [JavaScript Error: "1426695530627 Toolkit.GMP ERROR GMPInstallManager.simpleCheckAndInstall Could not check for addons: {"target":{},"message":"got node name: html, expected: updates"}" {file: "resource://gre/modules/Log.jsm" line: 749}] 09:19:21 INFO - 695 INFO TEST-UNEXPECTED-FAIL | browser/base/content/test/referrer/browser_referrer_open_link_in_tab.js | Test timed out - expected PASS 09:19:21 INFO - 696 INFO MEMORY STAT vsize after test: 930234368 09:19:21 INFO - 697 INFO MEMORY STAT residentFast after test: 180236288 09:19:21 INFO - 698 INFO MEMORY STAT heapAllocated after test: 67358968 09:19:21 INFO - 699 INFO TEST-OK | browser/base/content/test/referrer/browser_referrer_open_link_in_tab.js | took 90082ms
Flags: needinfo?(mozilla)
[11:54:53] <Gijs> RyanVM|sheriffduty: did you leave the patch in in the end? [11:55:03] <RyanVM|sheriffduty> Gijs: yeah, just a one-off so far [11:55:06] <Gijs> oh, hum [11:55:11] <RyanVM|sheriffduty> retriggers were ok too [11:55:17] <RyanVM|sheriffduty> 4/5 green anyway [11:55:31] <Gijs> might be OK to just add a requestlongerblahblahblah in the test [11:55:38] <Gijs> they are already there in some of the other tests [11:55:46] <Gijs> apparently closing windows on e10s can be slow [11:55:58] <RyanVM|sheriffduty> i could believe that [11:55:59] <Gijs> (looking at the timings for those tests, that does seem to be truthy)
(In reply to :Gijs Kruitbosch from comment #2) > [11:54:53] <Gijs> RyanVM|sheriffduty: did you leave the patch in in the end? > [11:55:03] <RyanVM|sheriffduty> Gijs: yeah, just a one-off so far > [11:55:06] <Gijs> oh, hum > [11:55:11] <RyanVM|sheriffduty> retriggers were ok too > [11:55:17] <RyanVM|sheriffduty> 4/5 green anyway > [11:55:31] <Gijs> might be OK to just add a requestlongerblahblahblah in the > test > [11:55:38] <Gijs> they are already there in some of the other tests > [11:55:46] <Gijs> apparently closing windows on e10s can be slow > [11:55:58] <RyanVM|sheriffduty> i could believe that > [11:55:59] <Gijs> (looking at the timings for those tests, that does seem to > be truthy) There are two tests that already have: > function test() { > requestLongerTimeout(5); // slowwww shutdown on e10s > ... > } Should we add requestLongerTimout to the remaining tests as well or just for the 'open_link_in_new_tab.js' case? Most likely that will fix the problem here!
Flags: needinfo?(mozilla)
(In reply to Christoph Kerschbaumer [:ckerschb] from comment #3) > (In reply to :Gijs Kruitbosch from comment #2) > > [11:54:53] <Gijs> RyanVM|sheriffduty: did you leave the patch in in the end? > > [11:55:03] <RyanVM|sheriffduty> Gijs: yeah, just a one-off so far > > [11:55:06] <Gijs> oh, hum > > [11:55:11] <RyanVM|sheriffduty> retriggers were ok too > > [11:55:17] <RyanVM|sheriffduty> 4/5 green anyway > > [11:55:31] <Gijs> might be OK to just add a requestlongerblahblahblah in the > > test > > [11:55:38] <Gijs> they are already there in some of the other tests > > [11:55:46] <Gijs> apparently closing windows on e10s can be slow > > [11:55:58] <RyanVM|sheriffduty> i could believe that > > [11:55:59] <Gijs> (looking at the timings for those tests, that does seem to > > be truthy) > > There are two tests that already have: > > function test() { > > requestLongerTimeout(5); // slowwww shutdown on e10s > > ... > > } > > Should we add requestLongerTimout to the remaining tests as well or just for > the 'open_link_in_new_tab.js' case? > > Most likely that will fix the problem here! Might as well add it everywhere, I guess? :-\
That would do it hopefully!
Attachment #8579593 - Flags: review?(gijskruitbosch+bugs)
Comment on attachment 8579593 [details] [diff] [review] bug_1144816_referrer_tests_timeout.patch Review of attachment 8579593 [details] [diff] [review]: ----------------------------------------------------------------- rs=me
Attachment #8579593 - Flags: review?(gijskruitbosch+bugs) → review+
(In reply to :Gijs Kruitbosch from comment #6) > Comment on attachment 8579593 [details] [diff] [review] > bug_1144816_referrer_tests_timeout.patch > > Review of attachment 8579593 [details] [diff] [review]: > ----------------------------------------------------------------- > > rs=me Do you work 24/7? You are really always around - thanks Gijs :-)
Comment on attachment 8579593 [details] [diff] [review] bug_1144816_referrer_tests_timeout.patch Any chance you wanna check that in?
Attachment #8579593 - Flags: checkin?(ryanvm)
Comment on attachment 8579593 [details] [diff] [review] bug_1144816_referrer_tests_timeout.patch Oh - wrong checkin; it's checking-needed, right?
Attachment #8579593 - Flags: checkin?(ryanvm)
Assignee: nobody → mozilla
Flags: in-testsuite+
Keywords: checkin-needed
Whiteboard: [fixed-in-fx-team]
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Whiteboard: [fixed-in-fx-team]
Target Milestone: --- → Firefox 39
Looks more like a race condition in the test than a timeout. It logs this: browser_referrer_open_link_in_tab: policy=[undefined] rel=[undefined] http:// -> http:// ...and then nothing else. If it was slow on exit, then it would log the other five cases too. Is there a screenshot, perhaps, to help narrow it down? Or some sort of overall flakiness dashboard that shows logs for several failures? I wasn't able to reproduce this locally (linux opt e10s). Tried 750 iterations inside the test. Tried 20 runs of the entire test. Did a few more runs concurrently with stress(1). Tried placing busy loops in various strategic places. None of these actually made it fail, so I'm as yet unsure what the race was, and hence how to fix it. :-(
Comment 14 apparently agrees :( If you're having trouble reproducing (not shocking given the horrible perf issues we have with our AWS VMs), you can get a loaner slave too: https://wiki.mozilla.org/ReleaseEngineering/How_To/Request_a_slave
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: Firefox 39 → ---
(In reply to Alex Verstak from comment #13) > Looks more like a race condition in the test than a timeout. It logs this: > > browser_referrer_open_link_in_tab: policy=[undefined] rel=[undefined] > http:// -> http:// > > ...and then nothing else. If it was slow on exit, then it would log the > other five cases too. > > Is there a screenshot, perhaps, to help narrow it down? http://mozilla-releng-blobs.s3.amazonaws.com/blobs/mozilla-inbound/sha512/eac4e5cfc90d304d1e69c416354da73f5b7c3ffb37777968960590baa404c5142c2bbbf02577ceaf14495294dd21160b3b8d2a71aa279e0a31ab946cfc87113a (click the link in the treeherder robot comment, then click the revlink, then click the failing test, then scroll the thing in the bottom right-hand corner down, and the 'artifacts uploaded' thing has a link to a screenshot)
(In reply to :Gijs Kruitbosch from comment #16) > then scroll the thing in the bottom right-hand > corner down, and the 'artifacts uploaded' thing has a link to a screenshot Gah, sleepiness - bottom *left* corner!
Aha, it focused the link, but didn't pop up the context menu. Two possibilities: 1. Context menu's own event listeners aren't yet hooked up when browser-delayed-startup-finished fires (which is when we do #2). 2. There's a focus race of some sort, i.e., when the content script does element.focus() + domWindowUtils.sendMouseEvent(), something somewhere handles them out of sequence. It's a pretty deep swim, for me at least. I'll dive in tonight once my work day is over. In the meantime, if we need this fixed right this moment, how about disabling referrer context menu tests in e10s? There're three of them - this one, new window, and new private window.
(In reply to Alex Verstak from comment #19) > Aha, it focused the link, but didn't pop up the context menu. Two > possibilities: > > 1. Context menu's own event listeners aren't yet hooked up when > browser-delayed-startup-finished fires (which is when we do #2). > > 2. There's a focus race of some sort, i.e., when the content script does > element.focus() + domWindowUtils.sendMouseEvent(), something somewhere > handles them out of sequence. > > It's a pretty deep swim, for me at least. I'll dive in tonight once my work > day is over. In the meantime, if we need this fixed right this moment, how > about disabling referrer context menu tests in e10s? There're three of them > - this one, new window, and new private window. Yeah, I have seen issues related to content focus on linux e10s causing popups to not pop up right, bug 1102900. :'-( Seems like disabling on linux + e10s would be sensible.
(I'm off to work for the day - please add checkin-needed if you like it.)
Attachment #8580139 - Flags: review?(gijskruitbosch+bugs)
Worth a shot before we disable the tests, IMO. Thoughts? (Still can't reproduce it locally, so not sure if this fixes it or not. But it does focus the content window - FWIW, logs show that something else indeed had focus before this change.)
Attachment #8580586 - Flags: review?(gijskruitbosch+bugs)
Comment on attachment 8580139 [details] [diff] [review] disable context menu tests on e10s Apologies for the delay (I'm on PST/PDT this week and crazy busy). I think if we do do this it should be linux-only as well, as the failures are linux-only so far. You can use boolean logic in these conditions so skip-if = os == "linux" && e10s should work.
Attachment #8580139 - Flags: review?(gijskruitbosch+bugs)
Comment on attachment 8580586 [details] [diff] [review] focus content window before clicking to open the popup Review of attachment 8580586 [details] [diff] [review]: ----------------------------------------------------------------- I mean, rs=me to see if it helps, although it does make me wonder if we should go back to using waitForFocus here somehow? I guess that's not available on the content side of things, but perhaps we should be using it on the chrome side and pass it the content window? (It should deal with that "complexity" for us, AIUI.)
Attachment #8580586 - Flags: review?(gijskruitbosch+bugs) → review+
Attachment #8580139 - Attachment is obsolete: true
(In reply to :Gijs Kruitbosch from comment #28) > I mean, rs=me to see if it helps, although it does make me wonder if we > should go back to using waitForFocus here somehow? I guess that's not > available on the content side of things, but perhaps we should be using it > on the chrome side and pass it the content window? (It should deal with that > "complexity" for us, AIUI.) Passing the content window on the chrome side of waitForFocus() didn't work last time I tried it: https://bugzilla.mozilla.org/show_bug.cgi?id=1113431#c104 (Fixing that is a broader issue - waitForFocus is widely used, so some other tests may depend on it doing nothing in that case. :-) This patch is doing the same thing as waitForFocus(content), except that it doesn't wait for load because it probably doesn't need to - the caller waits for BrowserTestUtils.browserLoaded. (Unless there's a separate load event on the content window that fires later?) Try run looks good. Check in? (I'm also in the Pacific time zone but have a day job, so anything that requires compile/test generally has to wait for the late evening.)
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 39
(In reply to Treeherder Robot from comment #34) > log: > https://treeherder.mozilla.org/logviewer.html#?repo=fx-team&job_id=2349818 > repository: fx-team > start_time: 2015-03-20T15:27:16 > who: wkocher[at]mozilla[dot]com > machine: t-yosemite-r5-0080 > buildname: Rev5 MacOSX Yosemite 10.10 fx-team opt test > mochitest-e10s-browser-chrome-1 > revision: 41d6e4de79be > > ValueError: invalid literal for int() with base 10: 'ps:' > ValueError: invalid literal for int() with base 10: 'ps:' > 764 INFO TEST-UNEXPECTED-FAIL | > browser/base/content/test/referrer/browser_referrer_open_link_in_tab.js | > Test timed out - expected PASS This is both !linux and after this patch landed. :-(
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
The OS X timeout looks like a different issue. The log shows 5 test cases (search for "tab:"): http://mozilla-releng-blobs.s3.amazonaws.com/blobs/fx-team/sha512/e1d0016c8dcac3bfde0ef208e0774a8084dd7673fd553b7f9a83cec972f23a3c6a095fb1d892ebc468b234f9c3217203c39686e66cc057f6c52dbb3235fb1842 15:31:52 policy=[undefined] rel=[undefined] http:// -> http://" 15:31:53 policy=[undefined] rel=[undefined] https:// -> http://" 15:31:53 policy=[origin] rel=[undefined] https:// -> http://" 15:31:53 policy=[origin] rel=[noreferrer] https:// -> http://" 15:31:54 policy=[origin-when-crossorigin] rel=[undefined] https:// -> https://" ...and then nothing happens for several minutes, and it eventually times out while showing the spinner: http://mozilla-releng-blobs.s3.amazonaws.com/blobs/fx-team/sha512/e1d0016c8dcac3bfde0ef208e0774a8084dd7673fd553b7f9a83cec972f23a3c6a095fb1d892ebc468b234f9c3217203c39686e66cc057f6c52dbb3235fb1842 This could be a busy VM fluke, or perhaps a deadlock in browser code. I.e., very likely *not* caused by the referrer test's failure to focus something, or to wait for something to happen. Comment 35, though, is the same old linux failure. So, that one needs more work too - perhaps wait for content window's load, as I began to say this morning in comment 31. There's also bug 1145199 where the browser thinks it's offline on windows. This is before it loads the first test case, so it probably affects all the referrer e10s tests, not just the one it's complaining about. Anyhow... Perhaps disabling the referrer tests on e10s is the right thing to do at this point? My goal is to ship meta-referrer, hopefully in 38 or 39; not to ship e10s... which isn't even planned to be shipped until 45, right? https://wiki.mozilla.org/Platform/Roadmap#e10s
(In reply to Alex Verstak from comment #38) > Comment 35, though, is the same old linux failure. So, that one needs more > work too - perhaps wait for content window's load, as I began to say this > morning in comment 31. That's on mozilla-inbound, and your patch landed on fx-team, so it wouldn't have included your patch yet.
Do we consider it fixed then? No more robotic messages about linux-opt or osx for some time. Or are these errors going to some other place (other than this bug)?
It seems to me raising the timeout might just fix this. The timeouts are all on mac now; 2 of 3 runs timed out after running all 6/6 tests, and the remaining run timed out after running 5/6 tests. All three screenshots show the e10s spinner. We weren't getting any timeouts on mac before the focus patch. So, either (a) the focus patch just made it a little bit slower and pushed it over the limit (which is plausible) or else (b) the focus patch lead to deadlock in the browser (not impossible, but seems like a stretch).
Attachment #8581330 - Flags: review?(gijskruitbosch+bugs)
I think that increasing the timeout might fix the problem. We should do it on all the tests though - we are facing similar issues in other referrer tests (see bug 1146069 for example).
@Christoph - yes, bug 1146069 looks like the same thing - also timing out on exit, also mac e10s. And yes, the pending patch here increases the timeout for all these tests: https://bugzilla.mozilla.org/attachment.cgi?id=8581330&action=diff
Comment on attachment 8581330 [details] [diff] [review] longer timeout for referrer tests, slow on mac e10s Review of attachment 8581330 [details] [diff] [review]: ----------------------------------------------------------------- rs=me
Attachment #8581330 - Flags: review?(gijskruitbosch+bugs) → review+
@Christoph, may I ask you to add checkin-needed? Bugzilla doesn't let me do that somehow - could be because the bug is assigned to you, which it doesn't let me change either.
Flags: needinfo?(mozilla)
(In reply to Alex Verstak from comment #54) > @Christoph, may I ask you to add checkin-needed? Bugzilla doesn't let me do > that somehow - could be because the bug is assigned to you, which it doesn't > let me change either. I would have landed it for you, but inbound is closed at the moment. I set the checkin-needed keyword and also assigned the bug to you, since you are doing all the work here. Thanks!
Assignee: mozilla → averstak
Flags: needinfo?(mozilla)
Keywords: checkin-needed
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Resolution: --- → FIXED
Whiteboard: [fixed-in-fx-team]
Sadly, this seems to have spiked again immediately after bug 1176281 landed. :-( poiru, any chance you could have another look at what's going on here?
Flags: needinfo?(birunthan)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to :Gijs Kruitbosch from comment #110) > Sadly, this seems to have spiked again immediately after bug 1176281 landed. > :-( > > poiru, any chance you could have another look at what's going on here? Will do.
Flags: needinfo?(birunthan)
Depends on: 1176281
Any updates on this top orange?
Flags: needinfo?(birunthan)
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #266) > Any updates on this top orange? I can't reproduce it locally so it's tricky to diagnose. We are trying to reproduce this on a loaner. Here is a patch to disable this test for now, if you prefer.
Assignee: averstak → birunthan
Status: REOPENED → ASSIGNED
Flags: needinfo?(birunthan)
Attachment #8636644 - Flags: review?(ryanvm)
Attachment #8636644 - Flags: review?(ryanvm) → review+
Keywords: leave-open
Assignee: birunthan → nobody
Status: ASSIGNED → NEW
Bulk assigning P3 to all open intermittent bugs without a priority set in Firefox components per bug 1298978.
Priority: -- → P3