Closed
Bug 1374170
Opened 8 years ago
Closed 8 years ago
Intermittent underreported Taskcluster OS X and Windows Aborting task - max run time exceeded!
Categories
(Testing :: Talos, defect)
Tracking
(firefox59 fixed)
RESOLVED
FIXED
mozilla59
Tracking | Status | |
---|---|---|
firefox59 | --- | fixed |
People
(Reporter: philor, Assigned: rwood)
References
Details
(Whiteboard: [stockwell unknown])
Attachments
(1 file)
Because of bug 1333957, you will never know how often this happens.
https://public-artifacts.taskcluster.net/BQcj93oYT3iEC4gUMoP0WQ/0/public/logs/live_backing.log
https://public-artifacts.taskcluster.net/JL5vDEXlRzGmnnmUAM7viw/0/public/logs/live_backing.log
https://public-artifacts.taskcluster.net/Y_9HIB_WS-uL7htHfpU7cA/0/public/logs/live_backing.log
Comment 4 • 8 years ago (Reporter)
Hello Windows.
Summary: Intermittent underreported Taskcluster OS X Aborting task - max run time exceeded! → Intermittent underreported Taskcluster OS X and Windows Aborting task - max run time exceeded!
Comment 17 • 8 years ago
This has 30 failures in the last 7 days.
It seems to be specific to osx talos(h2) jobs.
:rwood: Could you please take a look?
Component: General → Talos
Flags: needinfo?(rwood)
Product: Taskcluster → Testing
Whiteboard: [stockwell needswork]
Comment 18 • 8 years ago (Assignee)
(In reply to Henrietta Maior [:henrietta_maior] from comment #17)
> This has 30 failures in the last 7 days.
>
> It seems to be specific to osx talos(h2) jobs.
>
> :rwood: Could you please take a look?
Thanks Henrietta. I'm not too familiar with OrangeFactor. In the links above I see all kinds of different tests timing out (not just h2 on osx), but maybe those links aren't recent. Can you please point me towards logs/links for the osx h2 jobs? Thanks! :)
Flags: needinfo?(rwood) → needinfo?(hmaior)
Comment 20 • 8 years ago
Robert, here is a recent example of an h2 log:
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=139743150
This failure happens only on OSX, on both h2 and tp6.
Flags: needinfo?(hmaior)
Comment 22 • 8 years ago
In the last 7 days there have been 41 failures.
:rwood, do you have any updates?
Flags: needinfo?(rwood)
Comment 23 • 8 years ago (Assignee)
Looking into this now, trying to repro on try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1817058ccfc98b490c0889a521b00bd5e19aeef1
Flags: needinfo?(rwood)
Comment hidden (Intermittent Failures Robot) |
Comment 25 • 8 years ago (Assignee)
(In reply to OrangeFactor Robot from comment #24)
...
> For more details, see:
> https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1374170&startday=2017-11-06&endday=2017-11-12&tree=all
Most of the recent occurrences of this intermittent are in mochitest. From a look at some of the logs, the tests are running fine; they just sometimes take longer to run and exceed the taskcluster hard limit of 60 minutes, at which point taskcluster kills the job. The job is not failing on its own.
This makes sense, as talos h2 and tp6, where this intermittent has also been seen, are using the default task max-run-time of 3600 seconds [1], whereas some other talos tests have this set to 7200 seconds.
I believe the solution here is to add "max-run-time: 7200" for the heavy talos jobs (h*, tp6*); that should fix it for talos.
The same may work for mochitest, since on all platforms except linux mochitest is also using a max-run-time of 3600 [2]. This should be increased as well.
I'll make a patch to increase these max-run-times; hopefully that will take care of a bunch of intermittents of this type.
[1] https://searchfox.org/mozilla-central/rev/a662f122c37704456457a526af90db4e3c0fd10e/taskcluster/ci/test/talos.yml#2
[2] https://searchfox.org/mozilla-central/rev/a662f122c37704456457a526af90db4e3c0fd10e/taskcluster/ci/test/mochitest.yml#108
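The reasoning above can be sketched in a few lines of Python. This is only an illustration, not the actual taskcluster code: the `DEFAULT_MAX_RUN_TIME` constant, the `OVERRIDES` table, the suite names, and both helper functions are hypothetical, and they only mirror the values discussed here (a 3600-second default and a proposed 7200-second limit for the heavy talos jobs).

```python
# Hypothetical model of per-suite max-run-time resolution (all values in seconds).
DEFAULT_MAX_RUN_TIME = 3600  # the default discussed above: 60 minutes

# Proposed overrides for the heavy talos suites; the suite names are illustrative.
OVERRIDES = {
    "talos-h2": 7200,
    "talos-tp6": 7200,
}


def max_run_time(suite: str) -> int:
    """Return the run-time limit in seconds, falling back to the default."""
    return OVERRIDES.get(suite, DEFAULT_MAX_RUN_TIME)


def would_be_aborted(suite: str, elapsed_seconds: int) -> bool:
    """True if a job still running after elapsed_seconds has exceeded its limit."""
    return elapsed_seconds > max_run_time(suite)


if __name__ == "__main__":
    # A 65-minute run survives with the override but is killed under the default.
    elapsed = 65 * 60
    print(would_be_aborted("talos-h2", elapsed))     # False
    print(would_be_aborted("talos-other", elapsed))  # True
```

Under these assumptions, a job that is healthy but slow gets killed at the default limit, which matches what the logs show: the tests are still running when the task is aborted.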
Comment 26 • 8 years ago
The mochitest-e10s-2 failures on osx debug seem to be a hang; not sure if 7200 seconds will help. Let's solve the talos issues, and know that a week or so from now the osx/windows stylo-disabled will go away :)
Comment 27 • 8 years ago (Assignee)
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #26)
> the mochitest-e10s-2 failures on osx debug seem to be a hang, not sure if
> 7200 seconds will help- lets solve the talos issues, and know that a week or
> so from now the osx/windows stylo-disabled will go away :)
Ok, will make the patch for talos only, thanks
Comment hidden (mozreview-request) |
Comment 29 • 8 years ago (mozreview-review)
Comment on attachment 8928651 [details]
Bug 1374170 - Increase talos h2 and tp6 task max-run-time to prevent intermittent run time exceeded failure;
https://reviewboard.mozilla.org/r/199886/#review205010
Attachment #8928651 - Flags: review?(jmaher) → review+
Comment 30 • 8 years ago
Pushed by rwood@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/4ffa8b8095c5
Increase talos h2 and tp6 task max-run-time to prevent intermittent run time exceeded failure; r=jmaher
Comment 31 • 8 years ago (bugherder)
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox59: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
Updated • 7 years ago
Assignee: nobody → rwood