Closed
Bug 1453007
Opened 7 years ago
Closed 5 years ago
Frequent T-P-e10s(g2) failures on Linux and OS X: Task killed because maxRunTime was exceeded / Automation Error: mozprocess timed out after 3600 seconds running / damp | webconsole/openwithcache.js: Test timed out
Categories
(Testing :: Talos, defect)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: aryx, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: intermittent-failure, regression, Whiteboard: [stockwell disabled])
Attachments
(1 file)
1.78 KB,
patch
|
rwood
:
review+
|
Details | Diff | Splinter Review |
See https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&group_state=expanded&filter-searchStr=t-p-e10s(g2)&fromchange=83de58ddda2057f1cb949537f6b111e3b115ea3d&tochange=a8061a09cd7064a8783ca9e67979d77fb52e001e&selectedJob=172879454
It's almost a permafail and not obvious if this is one or more issues.
There are more webconsole/openwithcache.js failures available in this central-as-beta simulation: https://treeherder.mozilla.org/#/jobs?repo=try&revision=41356ce53ff617857225cedc97b0254ebde4dd9f&group_state=expanded&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable&selectedJob=172833082
Changelog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=83de58ddda20&tochange=a8061a09cd70
Flags: needinfo?(rwood)
Reporter | ||
Updated•7 years ago
|
Keywords: intermittent-failure
Comment 1•7 years ago
|
||
Hmmm
08:10:02 INFO - PID 1312 | TEST-UNEXPECTED-FAIL | damp | webconsole/openwithcache.js: Test timed out
:ochameau, as the owner of DAMP, can you please investigate? Thanks!
Flags: needinfo?(rwood) → needinfo?(poirot.alex)
Reporter | ||
Comment 2•7 years ago
|
||
Comment 3•7 years ago
|
||
Note that thanks to bug 1444826, we get much better error logged into PerfHerder.
So that we can file specific bugs like this one!
Now let's see what going on with this console test...
Comment 4•7 years ago
|
||
(In reply to Sebastian Hengst [:aryx][:archaeopteryx] (needinfo on intermittent or backout) from comment #0)
> There are more webconsole/openwithcache.js failures available in this
> central-as-beta simulation:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=41356ce53ff617857225cedc97b0254ebde4dd9f&group_state=e
> xpanded&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-
> resultStatus=exception&filter-resultStatus=retry&filter-
> resultStatus=usercancel&filter-resultStatus=runnable&selectedJob=172833082
Is it MacOS only?
It seems to be passing on Linux, it fails later, during tps (which is run after DAMP), due to bug 1443130.
(In reply to Sebastian Hengst [:aryx][:archaeopteryx] (needinfo on intermittent or backout) from comment #2)
> The failures start with
> https://hg.mozilla.org/mozilla-central/
> pushloghtml?changeset=db03e2cdae9ab05482f98e3c567896234c66e226
Could it be that bug 1448077 breaks something only on MacOS?
Flags: needinfo?(poirot.alex)
Comment 5•7 years ago
|
||
Thanks :ochameau.
Yes definitely looks like two different issues here,
1) this issue, 'TEST-UNEXPECTED-FAIL | damp | webconsole/openwithcache.js: Test timed out', which yes seems to only happen on OSX and on the talos profiling nightly job. So an issue with running g2 with --geckoProfiling on, on OSX.
Yes possibly bug 1448077 since that is also in the changeset. :julian, can you take a peak please and see if it's related to this new failure? Thanks!]
And,
2) g2 exceeding the total task run time, which is Bug 1443130, and happens in the nightly talos profiling job
Which we will resolve in that bug by extending the max task runtime, and I"ll make a patch in that bug.
Flags: needinfo?(jdescottes)
Comment 6•7 years ago
|
||
For now I can't reproduce the issue locally, and I also don't understand why Bug 1448077 would make it fail.
I just added a few g2 with geckoprofile on the last try push I did before landing:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=fed11a87970988ef70ba86da3ec24238f923c91f&selectedJob=172929836
They are all green, so maybe it's something else in the changelog?
Here is an extract from a failure log (from in https://treeherder.mozilla.org/#/jobs?repo=try&revision=41356ce53ff617857225cedc97b0254ebde4dd9f&selectedJob=172862156)
04:12:28 INFO - PID 579 | Executing test 'webconsole/openwithcache.js'
04:12:28 INFO - PID 579 |
[.... Removed some lines here to make it easier to read]
04:12:28 INFO - PID 579 |
04:12:29 INFO - PID 579 | Assertion failure
04:12:29 INFO - PID 579 | assert@resource:///modules/AsyncTabSwitcher.jsm:447:36
04:12:29 INFO - PID 579 | finish@resource:///modules/AsyncTabSwitcher.jsm:300:5
04:12:29 INFO - PID 579 | postActions@resource:///modules/AsyncTabSwitcher.jsm:609:7
04:12:29 INFO - PID 579 | onUnloadTimeout@resource:///modules/AsyncTabSwitcher.jsm:623:5
04:12:29 INFO - PID 579 | queueUnload/this.unloadTimer<@resource:///modules/AsyncTabSwitcher.jsm:925:44
04:12:29 INFO - PID 579 | Unable to read VR Path Registry from /Users/cltbld/Library/Application Support/OpenVR/.openvr/openvrpaths.vrpath
04:12:29 INFO - PID 579 | 2018-04-10 04:12:29.286 plugin-container[637:11985] *** CFMessagePort: bootstrap_register(): failed 1100 (0x44c) 'Permission denied', port = 0x9237, name = 'com.apple.tsm.portname'
04:12:29 INFO - PID 579 | See /usr/include/servers/bootstrap_defs.h for the error codes.
04:17:28 INFO - PID 579 | TEST-UNEXPECTED-FAIL | damp | webconsole/openwithcache.js: Test timed out
It mentions AsyncTabSwitcher, which seems to have been updated in Bug 1451460, which is also in the pushlog.
And this pattern seems to repeat itself for all the failures that mention openwithcache.
Mike, do you think this failure could be linked to one of your patches?
Flags: needinfo?(jdescottes) → needinfo?(mconley)
Comment 7•7 years ago
|
||
Try after Bug 1451460 : https://treeherder.mozilla.org/#/jobs?repo=try&revision=f50dcba03c34cccc89a477a5510982a836949ceb
Try before Bug 1451460: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d153c14f7b7d2b682bef956acfafd3041e0e2bb2
So it looks like the timeouts are linked to Bug 1451460. But interestingly they all failed after a very long time, and I didn't see any of the "damp | webconsole/openwithcache.js: Test timed out" that can be found on the other runs mentioned in this bug.
Comment hidden (Intermittent Failures Robot) |
Comment 10•7 years ago
|
||
Attachment #8967442 -
Flags: review?(rwood)
Comment 11•7 years ago
|
||
Comment on attachment 8967442 [details] [diff] [review]
skip tps on osx, for profiling skip tps on linux/osx
Review of attachment 8967442 [details] [diff] [review]:
-----------------------------------------------------------------
R+ (with the landing of the patch in Bug 1453720 first of course)
Attachment #8967442 -
Flags: review?(rwood) → review+
Comment 12•7 years ago
|
||
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5e672478d592
skip tps on osx and linux/osx for profiling. r=rwood
Updated•7 years ago
|
Keywords: leave-open
Whiteboard: [stockwell disabled]
Comment 13•7 years ago
|
||
bugherder |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 18•6 years ago
|
||
Sorry it's taken me so long to get to this. :(
Yeah, we're failing an assertion in the AsyncTabSwitcher, and that would definitely cause the test to time out. We should try to figure out why the assertion is failing.
Blocks: async-tab-switcher
Flags: needinfo?(mconley)
Comment 19•6 years ago
|
||
ni?ing myself to see if I can get some logging in here to see what's going on.
Flags: needinfo?(mconley)
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 25•5 years ago
|
||
No reports for 8 months.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME
Updated•5 years ago
|
Keywords: leave-open
Comment 26•5 years ago
|
||
Bugbug thinks this bug is a regression, but please revert this change in case of error.
Keywords: regression
Comment 27•5 years ago
|
||
Apologies - declaring needinfo bankruptcy on needinfo's greater than 2 years old.
Flags: needinfo?(mconley)
You need to log in
before you can comment on or make changes to this bug.
Description
•