Bug 1272942
Opened 9 years ago
Closed 8 years ago
Intermittent browser_aboutCertError.js | Uncaught exception - TypeError: learnMoreLink is null, exceptionButton is null, advancedButton is null, Argument 1 of Window.getComputedStyle is not an object
Product/Component: Firefox :: General
Type: defect
Priority: P3
Status: RESOLVED FIXED
Iteration: 52.1 - Oct 3
Reporter: philor
Assignee: johannh
Keywords: intermittent-failure
Whiteboard: [fxprivacy]
Updated•9 years ago (Reporter)
Summary: Intermittent browser_aboutCertError.js | Uncaught exception - TypeError: learnMoreLink is null → Intermittent browser_aboutCertError.js | Uncaught exception - TypeError: learnMoreLink is null, exceptionButton is null, advancedButton is null
Comment 10•9 years ago
I've been trying to narrow this one down. From my latest attempts on try:
https://treeherder.mozilla.org/#/jobs?repo=try&author=rwood@mozilla.com&fromchange=9de8f271e18b35c8d1de9b635a44e2c82947ba9c&tochange=28a8dac4ce1de786646df2802e963f115fe0fd44
2ea3d51ba1bb from Friday July 29th ==> intermittent not seen, at least in those 50 retriggers
e5859dfe0bcb from Saturday July 30th ==> reproduced the failure
Going to do 50 more retriggers on 2ea3d51ba1bb to see if it is consistent or not.
Comment 13•9 years ago
> Going to do 50 more retriggers on 2ea3d51ba1bb to see if it is consistent or
> not.
That looks consistent, so *maybe* this intermittent first occurred after one of these two merges/pushes:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&fromchange=c3565c8b1cdb575db1c80c7791984a6490598b84&tochange=e5859dfe0bcbd40f4e33f4a633f73ea3473a7849
However, I cannot be certain, and after reading the latest comments in the duplicate Bug 1291489 I'm not sure it is worth trying to pinpoint a specific change (in case it is a race condition).
Comment 16•9 years ago
I vote for disabling the test at this point.
Comment 18•9 years ago
Bulk assigning P3 to all open intermittent bugs without a priority set in Firefox components per bug 1298978.
Priority: -- → P3
Comment 19•9 years ago
Brian, I see you wrote this test originally. It's plagued with a variety of issues at the moment [1], can you please take a look or suggest someone who can as it's on the path to being disabled otherwise.
FWIW, the timing of when this test got really flaky seems to correspond decently well to when bug 712612 landed.
[1] https://bugzilla.mozilla.org/buglist.cgi?keywords=intermittent-failure%2C%20&keywords_type=allwords&list_id=13197180&short_desc=browser_aboutCertError.js&resolution=---&query_format=advanced&short_desc_type=allwordssubstr
status-firefox49: --- → affected
status-firefox50: --- → affected
status-firefox51: --- → affected
Flags: needinfo?(bgrinstead)
Comment 20•9 years ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #19)
> Brian, I see you wrote this test originally. It's plagued with a variety of
> issues at the moment [1], can you please take a look or suggest someone who
> can as it's on the path to being disabled otherwise.
>
> FWIW, the timing of when this test got really flaky seems to correspond
> decently well to when bug 712612 landed.
>
> [1]
> https://bugzilla.mozilla.org/buglist.cgi?keywords=intermittent-failure%2C%20&keywords_type=allwords&list_id=13197180&short_desc=browser_aboutCertError.js&resolution=---&query_format=advanced&short_desc_type=allwordssubstr
I haven't worked on this in quite some time and don't know why it started failing. But I created two try pushes to help track it down:
1) try push with extra logging to see state of DOM before failure: https://treeherder.mozilla.org/#/jobs?repo=try&revision=ea1247a09f3f.
2) try push that switches from DOMContentLoaded to the custom event "AboutNetErrorLoad". Since this seems to fail on different buttons at different times it makes me think that waitForCertErrorLoad is resolving too soon sometimes. From what I remember, load events are hard to detect on about pages and that's why it's trying to wait for DOMContentLoaded. https://treeherder.mozilla.org/#/jobs?repo=try&revision=f0d1ccdc5042
Let's see if anything comes out of those pushes. If not, I don't think I'm the right person to decide about disabling all or part of it - we should ask Panos or Johann about that.
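The idea behind the second push can be sketched roughly as follows. This is a minimal, self-contained illustration, not the actual test code: `waitForEvent` is a hypothetical helper, a plain `EventTarget` stands in for the error page, and only the two event names come from the description above.

```javascript
// Hypothetical helper (not the real waitForCertErrorLoad): resolves once
// `target` dispatches `eventName`, and records the event type in `log`.
function waitForEvent(target, eventName, log) {
  return new Promise(resolve => {
    target.addEventListener(
      eventName,
      event => {
        log.push(event.type);
        resolve(event);
      },
      { once: true }
    );
  });
}

// Stand-in for the about:certerror page (assumption for this sketch).
const page = new EventTarget();
const seen = [];
const pageReady = waitForEvent(page, "AboutNetErrorLoad", seen);

// DOMContentLoaded fires first and is ignored by this waiter...
page.dispatchEvent(new Event("DOMContentLoaded"));
const seenAfterDomContentLoaded = seen.length;

// ...and only the page's own custom event lets the waiter resolve.
page.dispatchEvent(new Event("AboutNetErrorLoad"));
```

If the buttons are only guaranteed to exist once the page announces itself, waiting on the custom event avoids looking them up too early.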
Flags: needinfo?(bgrinstead)
Updated•9 years ago
Whiteboard: [fxprivacy][triage]
Updated•9 years ago
Whiteboard: [fxprivacy][triage] → [fxprivacy]
Comment 24•9 years ago (Assignee)
Looking into this I just wanted to remark that the failures are happening because Firefox is simply intermittently crashing when loading CertError pages. It'd be interesting to know if this is happening in production or just in our tests.
###!!! [Child][DispatchAsyncMessage] Error: (msgtype=0x2000A,name=???) Route error: message sent to unknown actor ID
Assertion failure: aCode == MsgDropped (Processing error in CompositorBridgeChild), at /builds/slave/m-cen-m64-00000000000000000000/build/src/gfx/layers/ipc/CompositorBridgeChild.cpp:1087
Also, why is it crashing only on about:certerror?
So afaict it doesn't really make sense to inspect DOM state or anything, the DOM isn't loaded at all.
Comment 27•9 years ago
(In reply to Johann Hofmann [:johannh] from comment #24)
> Looking into this I just wanted to remark that the failures are happening
> because Firefox is simply intermittently crashing when loading CertError
> pages. It'd be interesting to know if this is happening in production or
> just in our tests.
>
> ###!!! [Child][DispatchAsyncMessage] Error: (msgtype=0x2000A,name=???) Route
> error: message sent to unknown actor ID
> Assertion failure: aCode == MsgDropped (Processing error in
> CompositorBridgeChild), at
> /builds/slave/m-cen-m64-00000000000000000000/build/src/gfx/layers/ipc/
> CompositorBridgeChild.cpp:1087
>
> Also, why is it crashing only on about:certerror?
>
> So afaict it doesn't really make sense to inspect DOM state or anything, the
> DOM isn't loaded at all.
Where did you find this error message? I can't find this particular msgtype=0x2000A in the test logs I sampled. Instead I find some other very different crashes in the logs but none of them seem to be directly related to the test failure.
Flags: needinfo?(kchen)
Comment 28•9 years ago (Assignee)
(In reply to Kan-Ru Chen [:kanru] (UTC+8) from comment #27)
> Where did you find this error message? I can't find this particular
> msgtype=0x2000A in the test logs I sampled. Instead I find some other very
> different crashes in the logs but none of them seem to be directly related
> to the test failure.
Mmh I didn't notice that the msgtype is 0x2000A on my machine (I ran an artifact build on OSX) instead of 0x2000C which appears in all logs I looked at, e.g. here:
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-aurora&job_id=3450307#L2937
The test failures are happening because we're trying to use DOM elements that don't exist on the crashed page, as far as I see it. The screenshots also look like it didn't load successfully.
Updated•9 years ago
Assignee: nobody → jhofmann
Status: NEW → ASSIGNED
Updated•9 years ago
Iteration: --- → 51.3 - Sep 12
Comment 31•9 years ago (Assignee)
Kan-Ru, sorry, I'm not really certain what your resolution was here. Are you sure that these crashes (https://treeherder.mozilla.org/logviewer.html#?repo=fx-team&job_id=11533424#L1824) are not related to this problem? They happen in every log I looked at and I can reproduce them locally. Could you explain what's happening there if it's not causing the failures?
Thanks!
Flags: needinfo?(kchen)
Comment 32•9 years ago
msgtype 0x2000A is mozilla::layers::PAPZ::Msg_NotifyAPZStateChange, so I think this is another instance of the PAPZ shutdown error. Kats?
Flags: needinfo?(kchen) → needinfo?(bugmail)
Comment 33•9 years ago
I reproduced in rr and it looks like mCanSend should be getting set to false in RemoteContentController::Destroy(). When that function calls SendDestroy, it sends a destroy message to APZChild, which promptly deletes itself during the processing of that message. So the parent side shouldn't be sending any more messages to the child after it has called SendDestroy(). I'll write a patch and test it, but I'll put it on a new bug because I'm not sure if there are other issues that are contributing to this intermittent failure.
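The guard described above can be modelled roughly like this. It is a language-neutral sketch written in JavaScript, not the C++ patch: `ParentSideController`, `canSend` (playing the role of mCanSend) and the in-memory `sent` list are all assumptions made for illustration.

```javascript
// Hypothetical model of the parent-side actor; not the real RemoteContentController.
class ParentSideController {
  constructor() {
    this.canSend = true; // plays the role of mCanSend
    this.sent = []; // messages that actually went over the (fake) channel
  }

  // Returns true if the message was sent, false if it was dropped.
  send(message) {
    if (!this.canSend) {
      return false; // the child may already have deleted itself
    }
    this.sent.push(message);
    return true;
  }

  destroy() {
    if (!this.canSend) {
      return; // already destroyed
    }
    this.send("Destroy");
    this.canSend = false; // nothing may follow the destroy message
  }
}

const controller = new ParentSideController();
controller.send("NotifyAPZStateChange");
controller.destroy();
// A late notification is dropped instead of being routed to a dead actor.
const sentAfterDestroy = controller.send("NotifyAPZStateChange");
```

Dropping late messages on the sending side is what prevents the "message sent to unknown actor ID" route error in this model.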
Flags: needinfo?(bugmail)
Updated•9 years ago
Iteration: 51.3 - Sep 19 → 52.1 - Oct 3
Comment 34•9 years ago
Bug 1304457 (now merged to central) should fix the 0x2000A crashes. However, this intermittent failure is still showing up (without the process crash), so whoever owns this test should continue investigating.
Comment 35•9 years ago (Assignee)
Thanks for fixing these! Too bad that it didn't seem to fix it, I'll try to investigate further...
Updated•9 years ago
Whiteboard: [fxprivacy] → [fxprivacy][triage]
Comment 37•9 years ago (Assignee)
So the only recent occurrences of this outside of Beta [0] (where the patch hasn't landed yet) are due to bug 569229, which seems to conveniently have gotten a patch ready after 6 years.
[0] https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1272942&tree=all&startday=2016-09-25&endday=2016-10-01
I'd say let's wait for bug 569229 to be fixed and keep an eye on the results; we might then be able to resolve this. One thing that seems certain is that fixing bug 1304457 drastically reduced the number of intermittents here.
Depends on: 569229
Updated•9 years ago (Assignee)
Summary: Intermittent browser_aboutCertError.js | Uncaught exception - TypeError: learnMoreLink is null, exceptionButton is null, advancedButton is null → Intermittent browser_aboutCertError.js | Uncaught exception - TypeError: learnMoreLink is null, exceptionButton is null, advancedButton is null, Argument 1 of Window.getComputedStyle is not an object
Comment 39•9 years ago
Bug 1304457 is marked as not affecting Beta, so we're thinking that the failures there are all bug 569229?
Comment 40•9 years ago (Assignee)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #39)
> Bug 1304457 is marked as not affecting Beta, so we're thinking that the
> failures there are all bug 569229?
Nope, that's bug 1304457. It's exactly the same error, so they must've identified a wrong bug as regressor.
Comment 41•9 years ago
(In reply to Johann Hofmann [:johannh] - partially unresponsive until 11/14 from comment #40)
> Nope, that's bug 1304457. It's exactly the same error, so they must've
> identified a wrong bug as regressor.
That's not necessarily the case; we need to map back from the msgtype to the protocol/message using the steps at [1]. On beta the set of protocols is different from central, so it might map back to something else. And the PAPZ RemoteContentController code in particular already has the fix applied on beta [2], so I'd be quite surprised if it turned up as the culprit.
[1] https://wiki.mozilla.org/Electrolysis/Debugging#Working_backwards_from_a_C.2B.2B_Message_to_its_IPDL_message
[2] http://hg.mozilla.org/releases/mozilla-beta/file/e8610794c397/gfx/layers/ipc/RemoteContentController.cpp#l287
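As a rough illustration of the first step of that mapping, the sketch below splits a msgtype into its two halves. It assumes the usual IPDL layout, in which the upper 16 bits identify the protocol and the lower 16 bits the message within it; `splitMsgType` is a hypothetical helper, and turning the numbers into names still requires the protocol list of the specific branch, as described in [1].

```javascript
// Hypothetical helper: splits an IPC msgtype into protocol and message parts,
// assuming the protocol ID sits in the upper 16 bits.
function splitMsgType(msgtype) {
  return {
    protocolId: msgtype >>> 16,
    messageOffset: msgtype & 0xffff,
  };
}

// The two values mentioned in this bug: one from a local build, one from the logs.
const fromLocalBuild = splitMsgType(0x2000a);
const fromLogs = splitMsgType(0x2000c);

console.log(fromLocalBuild, fromLogs);
```

Both values share a protocol ID but differ in the message part, and the same number can name a different protocol or message on a branch whose protocol list differs.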
Updated•9 years ago
Whiteboard: [fxprivacy][triage] → [fxprivacy]
Comment 52•8 years ago (Assignee)
Closing this as successful due to robot inactivity \o/
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED