Closed
Bug 1318389
Opened 7 years ago
Closed 7 years ago
Intermittent browser/modules/test/browser_CaptivePortalWatcher.js | This test exceeded the timeout threshold. It should be rewritten or split up. If that's not possible, use requestLongerTimeout(N), but only as a last resort. -
Categories
(Firefox :: General, defect)
Firefox
General
Tracking
()
RESOLVED
FIXED
Firefox 54
People
(Reporter: intermittent-bug-filer, Assigned: nhnt11)
Details
(Keywords: intermittent-failure, Whiteboard: [stockwell fixed])
Attachments
(3 files)
Filed by: philringnalda [at] gmail.com https://treeherder.mozilla.org/logviewer.html#?job_id=6783126&repo=autoland https://archive.mozilla.org/pub/firefox/tinderbox-builds/autoland-linux-debug/1479388048/autoland_ubuntu32_vm-debug_test-mochitest-e10s-browser-chrome-4-bm04-tests1-linux32-build95.txt.gz
Comment hidden (Intermittent Failures Robot) |
Comment 2•7 years ago
|
||
Looks like this test was recently added and has pretty much immediately gone intermittent. Can we work out why and/or fix this?
Flags: needinfo?(nhnt11)
Comment 3•7 years ago
|
||
(based on timing I suspect bug 1315969)
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 10•7 years ago
|
||
this bug spiked yesterday, :gijs, can you find someone to look at this? We have had a <30 failures/week trend, until this week.
Flags: needinfo?(nhnt11) → needinfo?(gijskruitbosch+bugs)
Comment 11•7 years ago
|
||
(In reply to Joel Maher ( :jmaher) from comment #10) > this bug spiked yesterday, :gijs, can you find someone to look at this? We > have had a <30 failures/week trend, until this week. Nihanth? I don't have cycles myself, unfortunately. Joel, do you have a more precise regression range? That would probably also help...
Flags: needinfo?(nhnt11)
Flags: needinfo?(jmaher)
Flags: needinfo?(gijskruitbosch+bugs)
Comment 12•7 years ago
|
||
sure thing, working on some retriggers for the weekend and monday to narrow this down, starting on m-c right now: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&filter-searchStr=Linux%20debug%20Mochitests%20executed%20by%20TaskCluster%20with%20e10s%20test-linux32%2Fdebug-mochitest-browser-chrome-e10s-2%20tc-M-e10s(bc2)&tochange=00d16f03506b7f9f754b01a0a458c05445ac6dba&fromchange=43e17d5a149245dc87323290cddbb19d4b1343d4&selectedJob=76625838 will dive down to a project branch in an hour or two when all the results are in.
Comment 13•7 years ago
|
||
in doing retriggers, I was not able to figure out a root cause, nor a point where we increased in frequency :(
Flags: needinfo?(jmaher)
Comment hidden (Intermittent Failures Robot) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 17•7 years ago
|
||
Let's split up this file which contains many tests that do plenty of time-consuming UI operations.
Flags: needinfo?(nhnt11)
Comment 18•7 years ago
|
||
mozreview-review |
Comment on attachment 8838186 [details] Bug 1318389 - Move captive portal test utility functions to head.js. https://reviewboard.mozilla.org/r/113154/#review114642 I mean, I guess, but have we actually determined this is going to fix the timeouts? IOW, are the timeouts actually because we're not managing to run everything within the timeout limit, or are we just getting stuck in the middle of a test sometimes? ::: browser/base/content/test/captivePortal/browser_CaptivePortalWatcher.js:34 (Diff revision 1) > let win1 = yield openWindowAndWaitForFocus(); > let win2 = yield openWindowAndWaitForFocus(); This doesn't make a lot of sense, seems like you could just open two windows without focusing them? ::: browser/base/content/test/captivePortal/head.js:37 (Diff revision 1) > +function* cleanUp() { > + RecentWindow.getMostRecentBrowserWindow = RecentWindow._getMostRecentBrowserWindowCopy; > + delete RecentWindow._getMostRecentBrowserWindowCopy; > + window.CaptivePortalWatcher.init(); > +} the setup function should just register a cleanup function that does this. That also avoids storing stuff on a global/public object instead of just locally.
Attachment #8838186 -
Flags: review?(gijskruitbosch+bugs) → review+
Comment 19•7 years ago
|
||
mozreview-review |
Comment on attachment 8838187 [details] Bug 1318389 - Move single run captive portal test cases to separate test file. https://reviewboard.mozilla.org/r/113156/#review114644 ::: browser/base/content/test/captivePortal/browser.ini:7 (Diff revision 1) > +[browser_CaptivePortalWatcher_1.js] > +skip-if = os == "win" # Bug 1313894 Please rename this to something identifying what kind of tests are in this file. I don't really know what 'single run testcases' means, so a comment in the new test wouldn't go amiss. Otherwise I assume you're just moving code around so this looks fine. If there are actual changes please clarify on the bug.
Attachment #8838187 -
Flags: review?(gijskruitbosch+bugs) → review+
Assignee | ||
Comment 20•7 years ago
|
||
mozreview-review-reply |
Comment on attachment 8838186 [details] Bug 1318389 - Move captive portal test utility functions to head.js. https://reviewboard.mozilla.org/r/113154/#review114642 Yeah, this failure is specifically a "This test exceeded the timeout threshold." issue. The cleanUp function finishes and then mochitest complains that the test file took too long.
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 23•7 years ago
|
||
mozreview-review-reply |
Comment on attachment 8838186 [details] Bug 1318389 - Move captive portal test utility functions to head.js. https://reviewboard.mozilla.org/r/113154/#review114642 > This doesn't make a lot of sense, seems like you could just open two windows without focusing them? A window is focused when it is opened, we simply ensure we wait for the focus to complete to avoid intermittent timing issues since we need to do a manual focus operation after opening the two windows. > the setup function should just register a cleanup function that does this. That also avoids storing stuff on a global/public object instead of just locally. Good point, thanks!
Assignee | ||
Comment 24•7 years ago
|
||
mozreview-review-reply |
Comment on attachment 8838187 [details] Bug 1318389 - Move single run captive portal test cases to separate test file. https://reviewboard.mozilla.org/r/113156/#review114644 > Please rename this to something identifying what kind of tests are in this file. I don't really know what 'single run testcases' means, so a comment in the new test wouldn't go amiss. Otherwise I assume you're just moving code around so this looks fine. If there are actual changes please clarify on the bug. Hmm, that was only named singleRunTestCases to distinguish from the other testcases that were run once for captive-portal-login-success and captive-portal-login-abort. Instead of naming the file something specific, I've renamed the array to just "testcases" which is much simpler. The array of tests in the first file is still called testCasesForBothSuccessAndAbort, and there's a comment. I think this makes sense.
Comment 25•7 years ago
|
||
(In reply to Nihanth Subramanya [:nhnt11] from comment #24) > Comment on attachment 8838187 [details] > Bug 1318389 - Move single run captive portal test cases to separate test > file. > > https://reviewboard.mozilla.org/r/113156/#review114644 > > > Please rename this to something identifying what kind of tests are in this file. I don't really know what 'single run testcases' means, so a comment in the new test wouldn't go amiss. Otherwise I assume you're just moving code around so this looks fine. If there are actual changes please clarify on the bug. > > Hmm, that was only named singleRunTestCases to distinguish from the other > testcases that were run once for captive-portal-login-success and > captive-portal-login-abort. Instead of naming the file something specific, > I've renamed the array to just "testcases" which is much simpler. The array > of tests in the first file is still called testCasesForBothSuccessAndAbort, > and there's a comment. I think this makes sense. Maybe for inside the file, but ideally we should avoid having "test_foo1.js" and "test_foo2.js" files if at all possible. Is there really nothing distinctive about the tests you're putting in the second file that you can use to name this in a better way?
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 32•7 years ago
|
||
this is failing 50%+ of the time on linux* debug e10s. I see r+ in here, but no landing, can we land today/tomorrow? I would prefer to have this disabled given the high failure rate if we cannot land these patches
Flags: needinfo?(nhnt11)
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 36•7 years ago
|
||
(In reply to :Gijs (away until Feb 27) from comment #25) > Maybe for inside the file, but ideally we should avoid having "test_foo1.js" > and "test_foo2.js" files if at all possible. Is there really nothing > distinctive about the tests you're putting in the second file that you can > use to name this in a better way? I thought about it and honestly can't think of a true distinction that isn't coincidental. In the future when adding tests, I intend for them to be "appended". The reason for splitting them is purely to prevent the test run from exceeding the timeout. There's precedent for the file naming in the popupNotifications tests. I'll admit this argument isn't very strong :( But in the interest of silencing this failure I'm going to land this as is.
Flags: needinfo?(nhnt11)
Comment 37•7 years ago
|
||
Pushed by nhnt11@gmail.com: https://hg.mozilla.org/integration/autoland/rev/3fcbccbc04b8 Move captive portal test utility functions to head.js. r=Gijs https://hg.mozilla.org/integration/autoland/rev/e8836935d34b Move single run captive portal test cases to separate test file. r=Gijs
Updated•7 years ago
|
Whiteboard: [stockwell fixed]
Comment 38•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/3fcbccbc04b8 https://hg.mozilla.org/mozilla-central/rev/e8836935d34b
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox54:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 54
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 41•7 years ago
|
||
(In reply to Nihanth Subramanya [:nhnt11] from comment #36) > But in the interest of silencing this failure I'm going to land this as is. This was the right the decision. > (In reply to :Gijs (away until Feb 27) from comment #25) > > Maybe for inside the file, but ideally we should avoid having "test_foo1.js" > > and "test_foo2.js" files if at all possible. Is there really nothing > > distinctive about the tests you're putting in the second file that you can > > use to name this in a better way? > > I thought about it and honestly can't think of a true distinction that isn't > coincidental. In the future when adding tests, I intend for them to be > "appended". The reason for splitting them is purely to prevent the test run > from exceeding the timeout. There's precedent for the file naming in the > popupNotifications tests. However, other people making similar decisions as the only reason to make decisions (in spite of counterarguments) is a bad idea. The tests as-is have descriptive names for each of their tasks, so it seems like we can use those to help with naming the test files. Please can you file a followup for this?
Flags: needinfo?(nhnt11)
Updated•7 years ago
|
Comment 42•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-aurora/rev/3163a9a12db6 https://hg.mozilla.org/releases/mozilla-aurora/rev/4b3e3400ae19
Flags: in-testsuite+
Comment 43•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/dec755e6f29a https://hg.mozilla.org/releases/mozilla-beta/rev/d6097266ddb5
Comment 44•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-esr52/rev/dec755e6f29a https://hg.mozilla.org/releases/mozilla-esr52/rev/d6097266ddb5
status-firefox-esr52:
--- → fixed
Comment 45•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-release/rev/dec755e6f29a https://hg.mozilla.org/releases/mozilla-release/rev/d6097266ddb5
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 48•7 years ago
|
||
The change - comment 37 - seemed to mostly fix this failure, but now it has come back, with quite a few failures the last couple of days.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 52•7 years ago
|
||
I used this as a testcase for my mochitest retriggering work (bug 1343327), I tried to see if I could reproduce the failure by running the test in isolation. I could, easily: https://treeherder.mozilla.org/#/jobs?repo=try&revision=6f2ad538f1c9939d57591e6f06a7a6e7750bd118&selectedJob=83496495 (don't be fooled by the fact that the job passed, that's a seperate bug) Since the test was retriggered with a high debugging level, there's a fair amount of interesting debug information in the log: https://treeherder.mozilla.org/logviewer.html#?job_id=83496495&repo=try&lineNumber=2904 In particular I'm seeing a pile of assertions like this: [task 2017-03-13T22:18:43.263116Z] GECKO(1279) | [Child 1334] WARNING: NS_ENSURE_TRUE(standardURL) failed: file /home/worker/workspace/build/src/caps/nsPrincipal.cpp, line 229 [task 2017-03-13T22:18:43.266667Z] GECKO(1279) | [Child 1334] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /home/worker/workspace/build/src/extensions/cookie/nsPermissionManager.cpp, line 107 [task 2017-03-13T22:18:43.269168Z] GECKO(1279) | [Child 1334] WARNING: 'NS_FAILED(aResult)', file /home/worker/workspace/build/src/extensions/cookie/nsPermissionManager.cpp, line 614 [task 2017-03-13T22:18:43.274259Z] GECKO(1279) | [Child 1334] WARNING: NS_ENSURE_TRUE(standardURL) failed: file /home/worker/workspace/build/src/caps/nsPrincipal.cpp, line 229 [task 2017-03-13T22:18:43.276297Z] GECKO(1279) | [Child 1334] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /home/worker/workspace/build/src/extensions/cookie/nsPermissionManager.cpp, line 107 [task 2017-03-13T22:18:43.278148Z] GECKO(1279) | [Child 1334] WARNING: 'NS_FAILED(aResult)', file /home/worker/workspace/build/src/extensions/cookie/nsPermissionManager.cpp, line 614 [task 2017-03-13T22:18:43.310829Z] GECKO(1279) | [Child 1334] WARNING: NS_ENSURE_TRUE(standardURL) failed: file /home/worker/workspace/build/src/caps/nsPrincipal.cpp, line 229 [task 2017-03-13T22:18:43.313371Z] GECKO(1279) | [Child 1334] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /home/worker/workspace/build/src/extensions/cookie/nsPermissionManager.cpp, line 107 [task 2017-03-13T22:18:43.315077Z] GECKO(1279) | [Child 1334] WARNING: 'NS_FAILED(aResult)', file /home/worker/workspace/build/src/extensions/cookie/nsPermissionManager.cpp, line 614
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 54•7 years ago
|
||
(In reply to William Lachance (:wlach) (use needinfo!) from comment #52) > I used this as a testcase for my mochitest retriggering work (bug 1343327), > I tried to see if I could reproduce the failure by running the test in > isolation. I could, easily: > > https://treeherder.mozilla.org/#/ > jobs?repo=try&revision=6f2ad538f1c9939d57591e6f06a7a6e7750bd118&selectedJob=8 > 3496495 > > (don't be fooled by the fact that the job passed, that's a seperate bug) > > Since the test was retriggered with a high debugging level, there's a fair > amount of interesting debug information in the log: > > https://treeherder.mozilla.org/logviewer. > html#?job_id=83496495&repo=try&lineNumber=2904 > > In particular I'm seeing a pile of assertions like this: > > [task 2017-03-13T22:18:43.263116Z] GECKO(1279) | [Child 1334] WARNING: > NS_ENSURE_TRUE(standardURL) failed: file > /home/worker/workspace/build/src/caps/nsPrincipal.cpp, line 229 > [task 2017-03-13T22:18:43.266667Z] GECKO(1279) | [Child 1334] WARNING: > NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file > /home/worker/workspace/build/src/extensions/cookie/nsPermissionManager.cpp, > line 107 > [task 2017-03-13T22:18:43.269168Z] GECKO(1279) | [Child 1334] WARNING: > 'NS_FAILED(aResult)', file > /home/worker/workspace/build/src/extensions/cookie/nsPermissionManager.cpp, > line 614 > [task 2017-03-13T22:18:43.274259Z] GECKO(1279) | [Child 1334] WARNING: > NS_ENSURE_TRUE(standardURL) failed: file > /home/worker/workspace/build/src/caps/nsPrincipal.cpp, line 229 > [task 2017-03-13T22:18:43.276297Z] GECKO(1279) | [Child 1334] WARNING: > NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file > /home/worker/workspace/build/src/extensions/cookie/nsPermissionManager.cpp, > line 107 > [task 2017-03-13T22:18:43.278148Z] GECKO(1279) | [Child 1334] WARNING: > 'NS_FAILED(aResult)', file > /home/worker/workspace/build/src/extensions/cookie/nsPermissionManager.cpp, > line 614 > [task 2017-03-13T22:18:43.310829Z] GECKO(1279) | [Child 1334] WARNING: > NS_ENSURE_TRUE(standardURL) failed: file > /home/worker/workspace/build/src/caps/nsPrincipal.cpp, line 229 > [task 2017-03-13T22:18:43.313371Z] GECKO(1279) | [Child 1334] WARNING: > NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file > /home/worker/workspace/build/src/extensions/cookie/nsPermissionManager.cpp, > line 107 > [task 2017-03-13T22:18:43.315077Z] GECKO(1279) | [Child 1334] WARNING: > 'NS_FAILED(aResult)', file > /home/worker/workspace/build/src/extensions/cookie/nsPermissionManager.cpp, > line 614 I'm quite unsure what these assertions are about. I think that this test is still simply taking too long to run. I'm going to look at this today.
Comment 55•7 years ago
|
||
on a passing run of linux64-debug-e10s, we have: [task 2017-03-13T06:30:07.270511Z] 06:30:07 INFO - TEST-OK | browser/base/content/test/captivePortal/browser_CaptivePortalWatcher.js | took 87731ms and we fail at 90 seconds, so this test needs to be split up or reqeustLongerTimeout(). Thanks for looking into this :nhnt11
Whiteboard: [stockwell fixed] → [stockwell needswork]
Comment 56•7 years ago
|
||
I wonder what accounts for the longer runtime though? Looking at the log, it appears that the `test_detectedWithNoBrowserWindow_LongRecheck` subtest is taking about 15 seconds for the second case, which seems quite excessive to me.
Comment 57•7 years ago
|
||
most tests run in a 'range', if this is <3 seconds from the timeout, that seems acceptable in the 'range'. I suspect finding one root cause and fixing it will temporarily put this off for a month or two as something else will slow things down.
Comment 58•7 years ago
|
||
(In reply to Joel Maher ( :jmaher) from comment #57) > most tests run in a 'range', if this is <3 seconds from the timeout, that > seems acceptable in the 'range'. I suspect finding one root cause and > fixing it will temporarily put this off for a month or two as something else > will slow things down. Ok apparently Kyle has a dashboard for tracking test times. :) Using that, it looks like there hasn't been a radical shift in the timing of this test, just a gentle climb around March 7th or 8th: https://people-mozilla.org/~klahnakoski/testfailures/test.html#search=browser_CaptivePortalWatcher.js&sampleMax=2017-03-14&sampleMin=2017-03-01 So probably just increasing the timeout would be ok.
Comment hidden (Intermittent Failures Robot) |
Comment 60•7 years ago
|
||
ues requestLongerTimeout(2), this seems to work: https://treeherder.mozilla.org/#/jobs?repo=try&revision=dc039b31345a4732a4436bdc2c0eb7e860fd2f97&filter-searchStr=bc1
Attachment #8847767 -
Flags: review?(nhnt11)
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 63•7 years ago
|
||
Comment on attachment 8847767 [details] [diff] [review] use requestLongerTimeout Review of attachment 8847767 [details] [diff] [review]: ----------------------------------------------------------------- Thanks for the patch! I took some time so that I could think about whether we want to un-split the test if we're just going to requestLongerTimeout() anyway, but having thought about it, I think we should leave it split; it's better for the long run, and I will at some point think about how we can organize these tests better. ::: browser/base/content/test/captivePortal/browser_CaptivePortalWatcher.js @@ +1,3 @@ > "use strict"; > > +// Bug 1318389 - this just takes a long time in debug Nit: Can we say something a bit more descriptive like: // Bug 1318389 - This test does a lot of window and tab manipulation, // causing it to take a long time on debug.
Attachment #8847767 -
Flags: review?(nhnt11) → review+
Assignee | ||
Updated•7 years ago
|
Flags: needinfo?(nhnt11)
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 66•7 years ago
|
||
Pushed by jmaher@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/b4e1fb56018b Intermittent browser/modules/test/browser_CaptivePortalWatcher.js, requestLongerTimeout. This test does a lot of window and tab manipulation causing it to take a long time on debug. r=nhnt11
Comment 67•7 years ago
|
||
thanks for the review, adjusted the comment and ensured the commit message reflected this as well.
Comment 68•7 years ago
|
||
Pushed by jmaher@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/d9c723843699 fix eslint issue. a=me
Comment hidden (Intermittent Failures Robot) |
Comment 70•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/b4e1fb56018b https://hg.mozilla.org/mozilla-central/rev/d9c723843699
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
status-firefox55:
--- → fixed
Resolution: --- → FIXED
Updated•7 years ago
|
Whiteboard: [stockwell needswork] → [stockwell fixed]
Comment hidden (Intermittent Failures Robot) |
Comment 72•7 years ago
|
||
I don't think the requestLongerTimeout has helped, based on comment #71. :-(
Flags: needinfo?(nhnt11)
Comment 73•7 years ago
|
||
it has, we had so many early on and the last 4 days there have been no instances.
Flags: needinfo?(nhnt11)
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
You need to log in
before you can comment on or make changes to this bug.
Description
•