Closed Bug 1335801 Opened 3 years ago Closed 3 years ago

Intermittent browser/base/content/test/referrer/browser_referrer_middle_click.js | Test timed out after | A promise chain failed to handle a rejection: - TypeError: content.document.getElementById(...) is null

Categories

(Firefox :: General, defect, P1)

defect

Tracking

()

RESOLVED FIXED
Firefox 54
Iteration:
55.1 - Mar 20
Tracking Status
firefox52 --- wontfix
firefox-esr52 --- fixed
firefox53 --- fixed
firefox54 --- fixed
firefox55 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: mrbkap)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure, Whiteboard: [e10s-multi:+][stockwell fixed])

Attachments

(2 files, 3 obsolete files)

See Also: → 1298196
This test was skipped for bug 1315042, briefly enabled (resulting in these timeouts), then skipped again.
...and re-enabled in bug 1337354.

:gabor -- Thanks for your efforts! Unfortunately, this test is failing again.
Flags: needinfo?(gkrizsanits)
(In reply to Geoff Brown [:gbrown] from comment #2)
> ...and re-enabled in bug 1337354.
> 
> :gabor -- Thanks for your efforts! Unfortunately, this test is failing again.

Sigh... I'm becoming less and less of a fan of this test... Do we have some numbers about how frequent it is now? I was really bad before the patch in bug 1337354...
Flags: needinfo?(gkrizsanits) → needinfo?(gbrown)
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1335801 says 78 failures so far. Maybe 50 per day? It looks pretty bad.
Flags: needinfo?(gbrown)
Attached patch really wait for tab load... (obsolete) — Splinter Review
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #7)
> 2 approaches to try and fix it:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=96cf8919aae66cc77eb357147f611c9441c8beba&selectedJob=7
> 6374474
> 
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=d1abbc4406cad556df724048b9b04608c6d9f191&selectedJob=7
> 6359004

Seems like the original version worked, but the openNewForegroundTab version we ended up landing is not perfect. I think because the promise array in this helper gets resolved for the about:blank load event, instead of waiting for the actual URL to be loaded. This seem to be fixable easily by a flag. Shouldn't we fix the helper method as well though?
Attachment #8836652 - Flags: review?(mrbkap)
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #11)
> (In reply to Gabor Krizsanits [:krizsa :gabor] from comment #7)
> for the actual URL to be loaded. This seem to be fixable easily by a flag.
> Shouldn't we fix the helper method as well though?

Gah, I hate about:blank. I looked into fixing the helper and it's more complicated than I thought it would be. That being said, I have a patch that could help. Let me know if you like that approach better. I'm not a huge fan of using STATE_STOP, since it its meaning is pretty opaque.
Attached patch Patch to BrowserTestUtils (obsolete) — Splinter Review
The idea here is to make the function that opens the tab return the URL that we want to load so we can wait on the right URL.

Gabor, what do you think?
Attachment #8836951 - Flags: review?(gkrizsanits)
Comment on attachment 8836652 [details] [diff] [review]
really wait for tab load...

Clearing this for now.
Attachment #8836652 - Flags: review?(mrbkap)
Comment on attachment 8836951 [details] [diff] [review]
Patch to BrowserTestUtils

Review of attachment 8836951 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, I think this is the best we can do here if my theory is right. But since I never tested that it is the reason for the failure (although I'm 99% sure given the fixes I tested above) could you run a few (dozens of) rounds on try before checking this in (or after)? I want to be really sure that we fix this test finally :)
Attachment #8836951 - Flags: review?(gkrizsanits) → review+
Summary: Intermittent browser/base/content/test/referrer/browser_referrer_middle_click.js | Test timed out - → Intermittent browser/base/content/test/referrer/browser_referrer_middle_click.js | Test timed out after | A promise chain failed to handle a rejection: - TypeError: content.document.getElementById(...) is null
Duplicate of this bug: 1339452
Assignee: nobody → mrbkap
Attachment #8836652 - Attachment is obsolete: true
Attachment #8836951 - Attachment is obsolete: true
Comment on attachment 8837377 [details]
Bug 1335801 - Make these tests wait correctly.

https://reviewboard.mozilla.org/r/112522/#review114082
Attachment #8837377 - Flags: review?(gkrizsanits) → review+
Comment on attachment 8837378 [details]
Bug 1335801 - Wait on the correct URL to load.

https://reviewboard.mozilla.org/r/112524/#review114084
Attachment #8837378 - Flags: review?(gkrizsanits) → review+
Iteration: --- → 54.2 - Feb 20
Whiteboard: [e10s-multi:?]
Whiteboard: [e10s-multi:?] → [e10s-multi:+]
is there a reason these patches haven't landed? this is failing 50% of the time
The patches are orange on try. I'm still working on this.
(In reply to Blake Kaplan (:mrbkap) from comment #26)
> The patches are orange on try. I'm still working on this.

Shouldn't we just land one of the patches above and do the cleaning up bits as a followup?
I can't get the current version of the test to fail [1] or a patch with a potential fix [2] to do anything other than pass. Let's see if my bug fix fixes this bug (especially as it seems related to the problem that Gabor was working around).

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=18a23922ae29012459da93d21648e78510363d4a
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=1f816a34bbb5b46eb15c50a69a10859cc5bffac0
Attachment #8837378 - Attachment is obsolete: true
Comment on attachment 8837377 [details]
Bug 1335801 - Make these tests wait correctly.

https://reviewboard.mozilla.org/r/112522/#review116178
Attachment #8837377 - Flags: review?(felipc) → review+
Pushed by mrbkap@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9d35d61ac2ac
Actually wait for the right href to load. r=Felipe
(In reply to Blake Kaplan (:mrbkap) from comment #32)
> I can't get the current version of the test to fail [1] or a patch with a
> potential fix [2] to do anything other than pass.

Failures are very usually on e10s; both try pushes ran only non-e10s tests. I'll try to add e10s runs to those pushes.
https://hg.mozilla.org/mozilla-central/rev/9d35d61ac2ac
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 54
I stil lsee a lot of failure here, hard to tell is this is older builds or not, but the rate the failures are coming in and most are trunk/try, I suspect this isn't fixed.

I will follow up tomorrow
This definitely not fixed by my patch. I'm still looking :(
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
thanks :mrbkap, let me know if you need more information or help, I am motivated to see this resolved in the short term as this has been a high frequency failure for the last couple weeks.
Duh - someTabLoaded was racing with onload for the window.
mozreview is so confused -- maybe we should have opened a new bug? Anyway, Gabor, mind stamping the new patch informally here?
Flags: needinfo?(gkrizsanits)
Comment on attachment 8837377 [details]
Bug 1335801 - Make these tests wait correctly.

https://reviewboard.mozilla.org/r/112522/#review116690

We should kill this someTabLoaded function some time... (with fire)
Attachment #8837377 - Flags: review+
Flags: needinfo?(gkrizsanits)
Pushed by mrbkap@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/20e91e72e991
Wait for the right events. r=Felipe,krizsa
Bug 1342500 tracks killing someTabLoaded.
Backed out for failing browser_referrer_open_link_in_container_tab.js, at least on Windows:

https://hg.mozilla.org/integration/autoland/rev/07a4c8ff3830383a6361622a0a1c817c5d2be35b

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=20e91e72e991503d24dfbe7e373ebceef60f00e6&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=80058728&repo=autoland

12:06:01     INFO - TEST-START | browser/base/content/test/referrer/browser_referrer_open_link_in_container_tab.js
12:06:04     INFO - TEST-INFO | started process screenshot
12:06:04     INFO - TEST-INFO | screenshot: exit 0
12:06:04     INFO - Buffered messages logged at 12:06:01
12:06:04     INFO - Console message: [JavaScript Error: "TypeError: win is null" {file: "resource://app/modules/URLBarZoom.jsm" line: 24}]
12:06:04     INFO - Console message: [JavaScript Error: "TypeError: win is null" {file: "resource://app/modules/URLBarZoom.jsm" line: 24}]
12:06:04     INFO - Buffered messages logged at 12:06:02
12:06:04     INFO - browser_referrer_open_link_in_container_tab: policy=[undefined] rel=[undefined] http:// -> http://
12:06:04     INFO - TEST-PASS | browser/base/content/test/referrer/browser_referrer_open_link_in_container_tab.js | We have a menupopup. - 
12:06:04     INFO - TEST-PASS | browser/base/content/test/referrer/browser_referrer_open_link_in_container_tab.js | We have a first container entry. - 
12:06:04     INFO - TEST-PASS | browser/base/content/test/referrer/browser_referrer_open_link_in_container_tab.js | We have a first container entry. - 
12:06:04     INFO - TEST-PASS | browser/base/content/test/referrer/browser_referrer_open_link_in_container_tab.js | We have a usercontextid value. - 
12:06:04     INFO - Buffered messages finished
12:06:04     INFO - TEST-UNEXPECTED-FAIL | browser/base/content/test/referrer/browser_referrer_open_link_in_container_tab.js | A promise chain failed to handle a rejection:  - TypeError: content.document.getElementById(...) is null
Flags: needinfo?(mrbkap)
following up here- at a 33% failure rate, there is motivation to ensure we get this fixed; do speak up if this is more difficult and we can disable it temporarily
Flags: needinfo?(mrbkap)
Whiteboard: [e10s-multi:+] → [e10s-multi:+][stockwell needswork]
(In reply to Joel Maher ( :jmaher) from comment #63)
> following up here- at a 33% failure rate, there is motivation to ensure we
> get this fixed; do speak up if this is more difficult and we can disable it
> temporarily

Disabling the test will just likely push this failure to the next test in the directory. It happens for the first one in the suite (or at least that's what happened in the previous attempts of disabling it).
can we get an update to this bug?  Given the fact that all tests can fail depending on who is first to run, should we disable the entire directory of tests on all platforms?  This is so frequent that I want to get this resolved tomorrow one way or another.
Flags: needinfo?(mrbkap)
Attachment #8837377 - Flags: review+ → review?(gkrizsanits)
Comment on attachment 8837377 [details]
Bug 1335801 - Make these tests wait correctly.

https://reviewboard.mozilla.org/r/112522/#review118624
Attachment #8837377 - Flags: review?(gkrizsanits) → review+
thanks for the patch here, let me know if you need me to push this or validate via try.
Iteration: 54.2 - Feb 20 → 54.3 - Mar 6
Priority: -- → P1
I am not clear if there are other patches needed, or if we can just land this current patch to fix/reduce the failures here.
Pushed by mrbkap@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ba253783b432
Make these tests wait correctly. r=Felipe,krizsa
Whiteboard: [e10s-multi:+][stockwell needswork] → [e10s-multi:+][stockwell fixed]
as we cannot just disable one test, and this is on all platforms opt/debug, we need to disable everything to stop the bleeding :(

As we have had 4+ weeks of 300+ failures/week, this is a bit out of control.

I am open to other ideas to get this fixed today/tomorrow.
Attachment #8844443 - Flags: review?(gbrown)
Comment on attachment 8844443 [details] [diff] [review]
disable referrer tests until we can fix them

Review of attachment 8844443 [details] [diff] [review]:
-----------------------------------------------------------------

Most of the tests in this manifest are already skipped on linux for bug 1145199 and bug 1144816...feels like they all need some attention.

I appreciate the efforts here - thanks :mrbkap, :gabor! - but unless a fix is fast approaching, let's get these turned off.
Attachment #8844443 - Flags: review?(gbrown) → review+
I would like one more chance to fix this today. I have a fix for browser_referrer_simple_click.js which should be the last perma-orange.
Flags: needinfo?(mrbkap)
great, I will stay tuned!  thanks :mrbkap :)
Attachment #8837377 - Flags: review+ → review?(gkrizsanits)
Pushed by mrbkap@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/11f369a25fba
Make these tests wait correctly. r=Felipe,krizsa
thanks for the quick turnaround!
Comment on attachment 8837377 [details]
Bug 1335801 - Make these tests wait correctly.

https://reviewboard.mozilla.org/r/112522/#review119674
Attachment #8837377 - Flags: review?(gkrizsanits) → review+
https://hg.mozilla.org/mozilla-central/rev/11f369a25fba
Status: REOPENED → RESOLVED
Closed: 3 years ago3 years ago
Resolution: --- → FIXED
Iteration: 54.3 - Mar 6 → 55.1 - Mar 20
Comment on attachment 8837377 [details]
Bug 1335801 - Make these tests wait correctly.

Approval Request Comment
[Feature/Bug causing the regression]: n/a
[User impact if declined]: n/a
[Is this code covered by automated tests?]: Yes, this is automated tests.
[Has the fix been verified in Nightly?]: Yes.
[Needs manual test from QE? If yes, steps to reproduce]: n/a
[List of other uplifts needed for the feature/fix]: n/a
[Is the change risky?]: n/a
[Why is the change risky/not risky?]: Test only changes, run hundreds or thousands of times a day on automation.
[String changes made/needed]: n/a
Attachment #8837377 - Flags: approval-mozilla-aurora?
Comment on attachment 8837377 [details]
Bug 1335801 - Make these tests wait correctly.

Fix tests. We can take them in aurora. Aurora54+.
Attachment #8837377 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.