Closed Bug 1374170 Opened 8 years ago Closed 8 years ago

Intermittent underreported Taskcluster OS X and Windows Aborting task - max run time exceeded!

Categories

(Testing :: Talos, defect)

defect
Not set
normal

Tracking

(firefox59 fixed)

RESOLVED FIXED
mozilla59
Tracking Status
firefox59 --- fixed

People

(Reporter: philor, Assigned: rwood)

Details

(Whiteboard: [stockwell unknown])

Attachments

(1 file)

Hello Windows.
Summary: Intermittent underreported Taskcluster OS X Aborting task - max run time exceeded! → Intermittent underreported Taskcluster OS X and Windows Aborting task - max run time exceeded!
This has 30 failures in the last 7 days. It seems to be specific to osx talos(h2) jobs. :rwood: Could you please take a look?
Component: General → Talos
Flags: needinfo?(rwood)
Product: Taskcluster → Testing
Whiteboard: [stockwell needswork]
(In reply to Henrietta Maior [:henrietta_maior] from comment #17)
> This has 30 failures in the last 7 days.
>
> It seems to be specific to osx talos(h2) jobs.
>
> :rwood: Could you please take a look?

Thanks Henrietta. I'm not too familiar with OrangeFactor. In the links above I see all kinds of different tests timing out (not just h2 on osx), but maybe those links aren't recent. Can you please point me towards logs/links for the osx h2 jobs? Thanks! :)
Flags: needinfo?(rwood) → needinfo?(hmaior)
Robert, here is a recent example of an h2 log: https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=139743150 This failure happens only on OSX, on both h2 and tp6.
Flags: needinfo?(hmaior)
In the last 7 days there have been 41 failures. :rwood, do you have any updates?
Flags: needinfo?(rwood)
(In reply to OrangeFactor Robot from comment #24)
...
> For more details, see:
> https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1374170&startday=2017-11-06&endday=2017-11-12&tree=all

Most of the recent occurrences of this intermittent are in mochitest. From a look at some of the logs, the tests are running fine; they just take longer to run sometimes and exceed the taskcluster hard limit of 60 minutes, at which point taskcluster kills the job. The job is not failing on its own.

This makes sense, as talos h2 and tp6, where this intermittent has also been seen, are using the default task max-run-time of 3600 seconds [1], whereas some other talos tests have this set to 7200 seconds. I believe the solution here is to add "max-run-time: 7200" for the talos heavy jobs (h*, tp6*); that should fix it for talos.

The same may work for mochitest, as on all platforms except linux, mochitest is also using a max-run-time of 3600 seconds [2]. This should be increased as well.

I'll make a patch to increase these max-run-times; hopefully that will take care of a bunch of intermittents of this type.

[1] https://searchfox.org/mozilla-central/rev/a662f122c37704456457a526af90db4e3c0fd10e/taskcluster/ci/test/talos.yml#2
[2] https://searchfox.org/mozilla-central/rev/a662f122c37704456457a526af90db4e3c0fd10e/taskcluster/ci/test/mochitest.yml#108
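To make the arithmetic in the comment above concrete, here is a rough Python sketch of how a per-task limit override would change which runs get aborted. It is only an illustration, not the actual taskcluster configuration or the patch: the task names, the glob patterns, and both helper functions are hypothetical. The only values taken from this bug are the 3600 second default and the proposed 7200 second limit for the heavy talos jobs.

```python
from fnmatch import fnmatchcase

# Values discussed in this bug, in seconds.
DEFAULT_MAX_RUN_TIME = 3600  # 60 minutes, the default limit
HEAVY_MAX_RUN_TIME = 7200    # 120 minutes, proposed for heavy talos jobs

# Hypothetical glob patterns standing in for "talos heavy jobs (h*, tp6*)".
HEAVY_PATTERNS = ("talos-h*", "talos-tp6*")


def max_run_time_for(task_name: str) -> int:
    """Return the run-time limit in seconds that would apply to a task."""
    if any(fnmatchcase(task_name, pattern) for pattern in HEAVY_PATTERNS):
        return HEAVY_MAX_RUN_TIME
    return DEFAULT_MAX_RUN_TIME


def would_be_aborted(task_name: str, runtime_seconds: float) -> bool:
    """True if a run of this length would exceed the task's limit."""
    return runtime_seconds > max_run_time_for(task_name)


if __name__ == "__main__":
    # A 65 minute run: over the default limit, under the raised one.
    runtime = 65 * 60
    for name in ("talos-h2", "talos-tp6", "talos-chrome"):
        print(name, max_run_time_for(name), would_be_aborted(name, runtime))
```

With these assumptions, a 65 minute run of a job matching one of the heavy patterns stays under its limit, while the same run length on a job left at the default would still be killed.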
The mochitest-e10s-2 failures on osx debug seem to be a hang, so I'm not sure 7200 seconds will help. Let's solve the talos issues, knowing that a week or so from now the osx/windows stylo-disabled jobs will go away :)
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #26)
> the mochitest-e10s-2 failures on osx debug seem to be a hang, not sure if
> 7200 seconds will help- lets solve the talos issues, and know that a week or
> so from now the osx/windows stylo-disabled will go away :)

Ok, will make the patch for talos only, thanks
Comment on attachment 8928651 [details] Bug 1374170 - Increase talos h2 and tp6 task max-run-time to prevent intermittent run time exceeded failure; https://reviewboard.mozilla.org/r/199886/#review205010
Attachment #8928651 - Flags: review?(jmaher) → review+
Pushed by rwood@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/4ffa8b8095c5 Increase talos h2 and tp6 task max-run-time to prevent intermittent run time exceeded failure; r=jmaher
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
Blocks: 1420078
Assignee: nobody → rwood