Open
Bug 751575
Opened 12 years ago
Updated 11 months ago
Permanent failure during test_pb_notification_ipc.js | command timed out: 1200 seconds without output, attempting to kill
Categories
(Core :: DOM: Navigation, defect)
Tracking
()
REOPENED
People
(Reporter: emorley, Unassigned)
References
Details
(Keywords: leave-open, Whiteboard: [purple][leave open])
Attachments
(1 file, 1 obsolete file)
1.43 KB, patch | jst: review+
Filing this so I have something to point the tree closure message at.

Earlier there appear to have been some infra issues, so a number of Windows builds failed, requiring retriggers (which are still running, ETA 1 hour+ until tests complete) before we know the correct regression range for this. Tbh, I can't see that it's anything other than CPG-related, sadly.

The range we have so far is:
hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=f5a3a7b9c6b0&tochange=400c2b30015d

If it is indeed CPG, bholley/luke/jst have asked that CPG not be backed out if we can help it:

bholley_mobile: Really though, we want to avoid a backout if at all possible
bholley_mobile: Cpg touches lots of stuff. Every two days it spends outside the tree something else lands that breaks with it
bholley_mobile: This has happened a bunch of times already
bholley_mobile: So i (and jst) think we should bias towards disabling tests and fixing them in a followup
bholley_mobile: I promise to be super responsive on that front
bholley_mobile: Im just taking today away from the laptop to unwind a bit
bholley_mobile: There are also lots of things like sec bugs blocked on it
bholley_mobile: And ionmonkey
edmorley: I'll file a bug to point the tree closure at and try disabling the test that is causing the suite to abort, hopefully that will be it

--

Similar to bug 717448, except perma-purple and mostly happens during test_pb_notification_ipc.js. Both Win opt and WinXP opt are perma-purple with this.

Rev3 WINNT 5.1 mozilla-inbound opt test xpcshell on 2012-05-03 02:07:06 PDT for push 400c2b30015d
slave: talos-r3-xp-074
https://tbpl.mozilla.org/php/getParsedLog.php?id=11419727&tree=Mozilla-Inbound

{
TEST-INFO | C:\talos-slave\test\build\xpcshell\tests\docshell\test\unit\test_privacy_transition.js | running test ...
TEST-PASS | C:\talos-slave\test\build\xpcshell\tests\docshell\test\unit\test_privacy_transition.js | test passed (time: 219.000ms)
TEST-INFO | C:\talos-slave\test\build\xpcshell\tests\docshell\test\unit_ipc\test_pb_notification_ipc.js | running test ...

command timed out: 1200 seconds without output, attempting to kill
SIGKILL failed to kill process
using fake rc=-1
program finished with exit code -1

remoteFailed: [Failure instance: Traceback from remote host -- Traceback (most recent call last):
Failure: exceptions.RuntimeError: SIGKILL failed to kill process
]
<snip>
<p class="error">exceptions.RuntimeError: SIGKILL failed to kill process</p>
Traceback (most recent call last):
Failure: exceptions.RuntimeError
}
Reporter | ||
Comment 1•12 years ago
|
||
Rev3 WINNT 6.1 mozilla-inbound opt test xpcshell on 2012-05-03 02:07:25 PDT for push 400c2b30015d
slave: talos-r3-w7-040
https://tbpl.mozilla.org/php/getParsedLog.php?id=11419750&tree=Mozilla-Inbound
test_writer_starvation.js

Rev3 WINNT 5.1 mozilla-inbound opt test xpcshell on 2012-05-03 02:11:18 PDT for push 12d1d626759c
slave: talos-r3-xp-021
https://tbpl.mozilla.org/php/getParsedLog.php?id=11419829&tree=Mozilla-Inbound
test_writer_starvation.js

Rev3 WINNT 6.1 mozilla-inbound opt test xpcshell on 2012-05-03 02:13:57 PDT for push 12d1d626759c
slave: talos-r3-w7-079
https://tbpl.mozilla.org/php/getParsedLog.php?id=11419893&tree=Mozilla-Inbound
test_pb_notification_ipc.js

Rev3 WINNT 5.1 mozilla-inbound opt test xpcshell on 2012-05-03 02:39:41 PDT for push a6a335cd2c94
slave: talos-r3-xp-039
https://tbpl.mozilla.org/php/getParsedLog.php?id=11420444&tree=Mozilla-Inbound
test_pb_notification_ipc.js

Rev3 WINNT 6.1 mozilla-inbound opt test xpcshell on 2012-05-03 02:38:05 PDT for push a6a335cd2c94
slave: talos-r3-w7-061
https://tbpl.mozilla.org/php/getParsedLog.php?id=11420464&tree=Mozilla-Inbound
test_pb_notification_ipc.js

Rev3 WINNT 5.1 mozilla-inbound pgo test xpcshell on 2012-05-03 07:24:32 PDT for push de5745bce8bc
slave: talos-r3-xp-013
https://tbpl.mozilla.org/php/getParsedLog.php?id=11429138&tree=Mozilla-Inbound
*** test_0000_bootstrap_svc.js

Rev3 WINNT 5.1 mozilla-inbound opt test xpcshell on 2012-05-03 02:43:13 PDT for push de5745bce8bc
slave: talos-r3-xp-037
https://tbpl.mozilla.org/php/getParsedLog.php?id=11420501&tree=Mozilla-Inbound
test_pb_notification_ipc.js

Rev3 WINNT 6.1 mozilla-inbound opt test xpcshell on 2012-05-03 02:52:47 PDT for push de5745bce8bc
slave: talos-r3-w7-021
https://tbpl.mozilla.org/php/getParsedLog.php?id=11420814&tree=Mozilla-Inbound
test_pb_notification_ipc.js
Depends on: 723541
Summary: Permanent failure during test_pb_notification_ipc.js or test_writer_starvation.js | command timed out: 1200 seconds without output, attempting to kill → Permanent failure during test_pb_notification_ipc.js or test_writer_starvation.js or test_0000_bootstrap_svc.js | command timed out: 1200 seconds without output, attempting to kill
Comment 2•12 years ago
|
||
I'm not sure why this is related to bug 723541?
Reporter | ||
Comment 3•12 years ago
|
||
Rev3 WINNT 5.1 mozilla-inbound opt test xpcshell on 2012-05-03 04:00:26 PDT for push 074c8fb332a8
slave: talos-r3-xp-074
https://tbpl.mozilla.org/php/getParsedLog.php?id=11422366&tree=Mozilla-Inbound
test_pb_notification_ipc.js

Rev3 WINNT 6.1 mozilla-inbound opt test xpcshell on 2012-05-03 04:00:12 PDT for push 074c8fb332a8
slave: talos-r3-w7-050
https://tbpl.mozilla.org/php/getParsedLog.php?id=11422383&tree=Mozilla-Inbound
test_pb_notification_ipc.js

Rev3 WINNT 5.1 mozilla-inbound opt test xpcshell on 2012-05-03 04:55:07 PDT for push cda84ca70452
slave: talos-r3-xp-050
https://tbpl.mozilla.org/php/getParsedLog.php?id=11424139&tree=Mozilla-Inbound
test_pb_notification_ipc.js

Rev3 WINNT 6.1 mozilla-inbound opt test xpcshell on 2012-05-03 04:55:05 PDT for push cda84ca70452
slave: talos-r3-w7-045
https://tbpl.mozilla.org/php/getParsedLog.php?id=11424164&tree=Mozilla-Inbound
test_pb_notification_ipc.js

Rev3 WINNT 5.1 mozilla-inbound opt test xpcshell on 2012-05-03 04:42:44 PDT for push d7271f499b8b
slave: talos-r3-xp-061
https://tbpl.mozilla.org/php/getParsedLog.php?id=11423785&tree=Mozilla-Inbound
test_pb_notification_ipc.js

Rev3 WINNT 6.1 mozilla-inbound opt test xpcshell on 2012-05-03 04:42:36 PDT for push d7271f499b8b
slave: talos-r3-w7-076
https://tbpl.mozilla.org/php/getParsedLog.php?id=11423821&tree=Mozilla-Inbound
test_pb_notification_ipc.js

Rev3 WINNT 5.1 mozilla-inbound opt test xpcshell on 2012-05-03 05:27:42 PDT for push d9525fcfdd8b
slave: talos-r3-xp-007
https://tbpl.mozilla.org/php/getParsedLog.php?id=11425245&tree=Mozilla-Inbound
test_pb_notification_ipc.js

Rev3 WINNT 6.1 mozilla-inbound opt test xpcshell on 2012-05-03 05:27:41 PDT for push d9525fcfdd8b
slave: talos-r3-w7-041
https://tbpl.mozilla.org/php/getParsedLog.php?id=11425287&tree=Mozilla-Inbound
test_pb_notification_ipc.js

Rev3 WINNT 5.1 mozilla-inbound opt test xpcshell on 2012-05-03 05:37:27 PDT for push 47ececd6cb72
slave: talos-r3-xp-004
https://tbpl.mozilla.org/php/getParsedLog.php?id=11425515&tree=Mozilla-Inbound
test_pb_notification_ipc.js

Rev3 WINNT 6.1 mozilla-inbound opt test xpcshell on 2012-05-03 05:37:30 PDT for push 47ececd6cb72
slave: talos-r3-w7-052
https://tbpl.mozilla.org/php/getParsedLog.php?id=11425565&tree=Mozilla-Inbound
test_pb_notification_ipc.js

Rev3 WINNT 5.1 mozilla-inbound opt test xpcshell on 2012-05-03 06:16:17 PDT for push 94b06a04f17b
slave: talos-r3-xp-021
https://tbpl.mozilla.org/php/getParsedLog.php?id=11426619&tree=Mozilla-Inbound
test_pb_notification_ipc.js

Rev3 WINNT 6.1 mozilla-inbound opt test xpcshell on 2012-05-03 06:16:12 PDT for push 94b06a04f17b
slave: talos-r3-w7-070
https://tbpl.mozilla.org/php/getParsedLog.php?id=11426634&tree=Mozilla-Inbound
test_pb_notification_ipc.js

Rev3 WINNT 5.1 mozilla-inbound opt test xpcshell on 2012-05-03 06:41:44 PDT for push 02b78dbc753f
slave: talos-r3-xp-012
https://tbpl.mozilla.org/php/getParsedLog.php?id=11427396&tree=Mozilla-Inbound
test_pb_notification_ipc.js

Rev3 WINNT 6.1 mozilla-inbound opt test xpcshell on 2012-05-03 06:40:16 PDT for push 02b78dbc753f
slave: talos-r3-w7-072
https://tbpl.mozilla.org/php/getParsedLog.php?id=11427453&tree=Mozilla-Inbound
test_pb_notification_ipc.js

Rev3 WINNT 5.1 mozilla-inbound opt test xpcshell on 2012-05-03 07:17:15 PDT for push 94913b445e72
slave: talos-r3-xp-017
https://tbpl.mozilla.org/php/getParsedLog.php?id=11428455&tree=Mozilla-Inbound
test_pb_notification_ipc.js

Rev3 WINNT 6.1 mozilla-inbound opt test xpcshell on 2012-05-03 07:17:11 PDT for push 94913b445e72
slave: talos-r3-w7-052
https://tbpl.mozilla.org/php/getParsedLog.php?id=11428464&tree=Mozilla-Inbound
test_pb_notification_ipc.js
Reporter | ||
Comment 4•12 years ago
|
||
(In reply to Brian R. Bondy [:bbondy] from comment #2)
> I'm not sure why this is related to bug 723541?

Because of:
> Rev3 WINNT 5.1 mozilla-inbound pgo test xpcshell on 2012-05-03 07:24:32 PDT
> for push de5745bce8bc
> slave: talos-r3-xp-013
> https://tbpl.mozilla.org/php/getParsedLog.php?id=11429138&tree=Mozilla-
> Inbound
> *** test_0000_bootstrap_svc.js

Out of the ~24 failures, all bar two have been in test_pb_notification_ipc.js; I presume the others may just have been the other known intermittent purples occurring by coincidence.
Comment 5•12 years ago
|
||
I think it is just a coincidence, but I'm adding myself to the CC list.
Reporter | ||
Comment 6•12 years ago
|
||
OK, the retriggers have finally come through; range confirmed as CPG:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=ac00c792933e&tochange=bed8c4e3dfdf
Reporter | ||
Comment 7•12 years ago
|
||
Attachment #620727 -
Flags: review?(luke)
Reporter | ||
Comment 8•12 years ago
|
||
Going to call the test_writer_starvation.js and test_0000_bootstrap_svc.js instances coincidence/already filed intermittent purple for now.
Summary: Permanent failure during test_pb_notification_ipc.js or test_writer_starvation.js or test_0000_bootstrap_svc.js | command timed out: 1200 seconds without output, attempting to kill → Permanent failure during test_pb_notification_ipc.js | command timed out: 1200 seconds without output, attempting to kill
Comment 9•12 years ago
|
||
Comment on attachment 620727 [details] [diff] [review] Temporarily disable test_pb_notification_ipc.js Stealing review request, r=jst.
Attachment #620727 -
Flags: review?(luke) → review+
Reporter | ||
Comment 10•12 years ago
|
||
Many thanks :-) Disable test: https://hg.mozilla.org/integration/mozilla-inbound/rev/90d25e0f6c68
Whiteboard: [purple] → [purple] [leave open]
Reporter | ||
Comment 11•12 years ago
|
||
Seems to have worked :-D (I was kinda worried yet more failures were hiding behind the aborted run) https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=90d25e0f6c68
Comment 12•12 years ago
|
||
A locally built (non-PGO) opt xpcshell does not reproduce; I'll try the binary that try is using.
Comment 13•12 years ago
|
||
Ah, it does repro 100% with a PGO xpcshell.
Comment 14•12 years ago
|
||
Woohoo, I think I found the culprit: conservative GC. Changing a do_timeout(0, Components.utils.forceGC) to Components.utils.schedulePreciseGC(function(){}) fixes the problem.
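For readers following along, here is a rough sketch of the before/after shape of that one-line change. It is not the actual patch: the `log` and `pending` arrays, `runPending`, and the stubbed `do_timeout` and `Components.utils` objects are stand-ins assumed here so the snippet runs outside xpcshell. In a real xpcshell test the harness and platform provide `do_timeout`, `Components.utils.forceGC` and `Components.utils.schedulePreciseGC`.

```javascript
// Stand-ins (assumptions for illustration only) so this runs outside xpcshell.
const log = [];      // records which GC entry point was asked for
const pending = [];  // callbacks the fake event loop still has to run

// Fake of the xpcshell harness helper: queue the callback instead of using a timer.
function do_timeout(delayMs, callback) {
  pending.push(callback);
}

// Fake of the two Components.utils calls named in the comment above.
const Components = {
  utils: {
    forceGC: function () {
      log.push("forceGC");
    },
    schedulePreciseGC: function (callback) {
      log.push("schedulePreciseGC");
      pending.push(callback);
    },
  },
};

// Drain the fake event loop.
function runPending() {
  while (pending.length > 0) {
    pending.shift()();
  }
}

// Before: ask for a GC from a zero-delay timeout.
function requestGCBefore() {
  do_timeout(0, Components.utils.forceGC);
}

// After: ask for a precise GC, with an empty completion callback.
function requestGCAfter() {
  Components.utils.schedulePreciseGC(function () {});
}

requestGCBefore();
requestGCAfter();
runPending();
console.log(log.join(", "));
```

The only point of the sketch is the call shape: the old form runs a GC while the timer callback is on the stack, the new form requests one and supplies a callback.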
Comment 15•12 years ago
|
||
Updated•12 years ago
|
Attachment #620883 -
Flags: review?(bzbarsky) → review+
Comment 16•12 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/8d220661ef24
Target Milestone: --- → mozilla15
Updated•12 years ago
|
Whiteboard: [purple] [leave open] → [purple]
Comment 17•12 years ago
|
||
Comment on attachment 620883 [details] [diff] [review] re-enable and fix test r=me, fwiw
Comment 18•12 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=11454869&tree=Mozilla-Inbound (first run after attachment 620883 [details] [diff] [review] landed)
https://tbpl.mozilla.org/php/getParsedLog.php?id=11454376&tree=Mozilla-Inbound (second run)
Whiteboard: [purple] → [purple][leave open]
Comment 19•12 years ago
|
||
Oh, fun, it was permafailing on *Windows* PGO before; now it's permafailing on Linux PGO.
Comment 20•12 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=11455629&tree=Mozilla-Inbound
Reporter | ||
Comment 21•12 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=11457897&tree=Mozilla-Inbound
Reporter | ||
Comment 22•12 years ago
|
||
(In reply to Phil Ringnalda (:philor) from comment #19)
> Oh, fun, it was permafailing on *Windows* PGO before; now it's permafailing
> on Linux PGO.

Backed out:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9c2c2046677f
Target Milestone: mozilla15 → ---
Reporter | ||
Comment 23•12 years ago
|
||
> Disable test:
> https://hg.mozilla.org/integration/mozilla-inbound/rev/90d25e0f6c68

Merged:
https://hg.mozilla.org/mozilla-central/rev/90d25e0f6c68
Reporter | ||
Comment 24•12 years ago
|
||
> Comment on attachment 620883 [details] [diff] [review]
> re-enable and fix test

Merged the re-enable, since inbound was backing up:
https://hg.mozilla.org/mozilla-central/rev/8d220661ef24

The latest backout will be merged to m-c shortly.
Comment 25•12 years ago
|
||
<meme>run all the GCs</meme>

On the bright side, this is definitely an issue in how the test is forcing the temporary DocShell to get GC'd, definitely not a CPG bug.

bz/jst: is there a canonical way to ensure the DocShell gets GC'd? (jdm was expressing doubt that schedulePreciseGC worked on xpcshell, but I didn't understand.)
Comment 26•12 years ago
|
||
I'm not sure what the question is, exactly....
Comment 27•12 years ago
|
||
How can that test reliably kill that DocShell (the one created at the beginning of destroy_transient_docshell, which you can see in the patch context)?
Comment 28•12 years ago
|
||
I have no idea... I'd think that a precise GC should do it.
Comment 29•12 years ago
|
||
Backed out: https://hg.mozilla.org/mozilla-central/rev/6a9a1e259aeb
Comment 30•12 years ago
|
||
I think ehsan meant https://hg.mozilla.org/mozilla-central/rev/9c2c2046677f

I'm stumped here as to why this docshell wouldn't be cleaned up now... :(
Comment 31•12 years ago
|
||
My doubt has to do with the implementation of schedulePreciseGC, which will not actually perform the GC until no JS is running at all. In xpcshell, that never occurs. In tests where nothing happens until the callback occurs, this means the test never completes; in this case, it probably just means that a regular gc ends up being triggered on some platforms and the docshell is destroyed.
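To make that concrete, here is a toy model of the behaviour described above. It is only a sketch under stated assumptions: `ToyRuntime`, `activeFrames`, `pendingGCs`, `maybeCollect` and `runScript` are hypothetical names invented for illustration and are not the real schedulePreciseGC implementation. The one rule it encodes is the one from this comment: a scheduled precise GC runs only once no JS is on the stack.

```javascript
// Toy model (assumption): a scheduled precise GC runs only when no JS frame is active.
class ToyRuntime {
  constructor() {
    this.activeFrames = 0; // how many JS frames are currently on the stack
    this.pendingGCs = [];  // callbacks waiting for a precise GC
    this.gcCount = 0;      // how many precise GCs actually ran
  }

  // Queue a GC request; it only runs if the stack is already empty.
  schedulePreciseGC(callback) {
    this.pendingGCs.push(callback);
    this.maybeCollect();
  }

  // Run queued GCs, but only when no JS is running at all.
  maybeCollect() {
    if (this.activeFrames > 0) {
      return;
    }
    while (this.pendingGCs.length > 0) {
      this.gcCount++;
      this.pendingGCs.shift()();
    }
  }

  // Run a piece of script, tracking stack depth around it.
  runScript(fn) {
    this.activeFrames++;
    try {
      fn();
    } finally {
      this.activeFrames--;
      this.maybeCollect();
    }
  }
}

// Case 1: the script returns fully, so the stack empties and the GC runs.
const unwinding = new ToyRuntime();
let unwindingCallbackRan = false;
unwinding.runScript(() => {
  unwinding.schedulePreciseGC(() => { unwindingCallbackRan = true; });
});

// Case 2: an outer script frame never unwinds (modelled as activeFrames = 1),
// which is the xpcshell situation described above, so the GC never runs.
const neverIdle = new ToyRuntime();
let neverIdleCallbackRan = false;
neverIdle.activeFrames = 1;
neverIdle.runScript(() => {
  neverIdle.schedulePreciseGC(() => { neverIdleCallbackRan = true; });
});

console.log("unwinding:", unwindingCallbackRan, "never idle:", neverIdleCallbackRan);
```

In the second case the request just sits in the queue, which matches the two outcomes described here: a test that waits on the callback hangs, and a test that does not wait only passes if some other GC happens to collect the docshell.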
Comment 32•12 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=11477422&tree=Firefox
https://tbpl.mozilla.org/php/getParsedLog.php?id=11478422&tree=Firefox
https://tbpl.mozilla.org/php/getParsedLog.php?id=11479514&tree=Firefox
Comment 33•12 years ago
|
||
Luke, is there a way for us to truly force a full GC to happen exactly when we need one, whether there's JS on the stack or not? IOW, if the JS engine APIs exist to do that we can expose that (if it's not already exposed), and call that here to make this deterministic. I'm happy to help write that patch if those APIs exist.
Comment 34•12 years ago
|
||
The problem with a synchronous GC is that it would have to do conservative stack scanning which is the source of non-determinacy here. schedulePreciseGC would be great if it worked.
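A small illustration of why conservative stack scanning can keep the docshell alive on one build and not another. This is a toy model, not the JS engine's actual scanner: addresses are plain integers, and `conservativeLive`, `preciseLive`, `heap` and `DOCSHELL_ADDR` are hypothetical names assumed for the sketch.

```javascript
// Toy model (assumption): heap objects are identified by integer "addresses".

// Conservative scan: any stack word that happens to equal a heap address
// is treated as a root, whether or not it is really a live reference.
function conservativeLive(heapAddresses, stackWords) {
  const live = new Set();
  for (const word of stackWords) {
    if (heapAddresses.has(word)) {
      live.add(word);
    }
  }
  return live;
}

// Precise scan: only references the runtime knows to be real roots count.
function preciseLive(heapAddresses, knownRoots) {
  const live = new Set();
  for (const root of knownRoots) {
    if (heapAddresses.has(root)) {
      live.add(root);
    }
  }
  return live;
}

const DOCSHELL_ADDR = 0x1000;
const heap = new Set([DOCSHELL_ADDR, 0x2000]);

// Same program state, two hypothetical builds: one leaves a stale copy of the
// docshell address in a dead stack slot, the other overwrote that slot.
const stackWithStaleWord = [7, DOCSHELL_ADDR, 42];
const stackWithoutStaleWord = [7, 0, 42];

const keptByStaleWord = conservativeLive(heap, stackWithStaleWord).has(DOCSHELL_ADDR);
const keptWithoutStaleWord = conservativeLive(heap, stackWithoutStaleWord).has(DOCSHELL_ADDR);
const keptByPreciseScan = preciseLive(heap, []).has(DOCSHELL_ADDR);

console.log(keptByStaleWord, keptWithoutStaleWord, keptByPreciseScan);
```

Whether a stale word is left on the stack depends on how the compiler laid out the frame, which would be consistent with the failure moving between PGO and non-PGO builds and between platforms.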
Reporter | ||
Updated•12 years ago
|
Severity: blocker → normal
Updated•12 years ago
|
Assignee: luke → nobody
Comment 35•12 years ago
|
||
Disabled the non-ipc version in https://hg.mozilla.org/integration/mozilla-inbound/rev/fb17ffb3bf77 because another unrelated change caused it to start timing out, only on 10.5 debug. Bye-bye test, see you when schedulePreciseGC from xpcshell works.
Reporter | ||
Comment 36•12 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/fb17ffb3bf77
Reporter | ||
Updated•11 years ago
|
Attachment #620727 -
Attachment is obsolete: true
Reporter | ||
Updated•9 years ago
|
Status: ASSIGNED → NEW
Reporter | ||
Updated•9 years ago
|
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → INCOMPLETE
Comment 37•2 years ago
|
||
Re-opening as the test got re-disabled, and hence it is still disabled today and should be investigated at some stage.
Updated•2 years ago
|
Severity: normal → S3
Comment 38•1 year ago
|
||
The leave-open keyword is there and there is no activity for 6 months.
:kmag, maybe it's time to close this bug?
For more information, please visit auto_nag documentation.
Flags: needinfo?(kmaglione+bmo)