Closed Bug 622308 · Opened 14 years ago · Closed 14 years ago

Frequent Talos hangs

Tracking

(Not tracked)

Status:

RESOLVED WORKSFORME

People

(Reporter: philor, Unassigned)

Phil Ringnalda (:philor)

Reporter

Description

14 years ago

Starting with http://hg.mozilla.org/mozilla-central/rev/39db16b78175 (a Windows-only change, yet the runs that failed were on Linux) we've been having Talos hangs, mostly on Linux, on both mozilla-central and TraceMonkey. The tally so far: 2 Linux and 1 Linux64 on that push; 1 Linux and 2 Linux64 on the next push; 1 Linux64 on the push after that, which was a comment-only change to trigger another build; and 1 Linux and 1 Windows on the only TraceMonkey push of the day.

joduinn mentioned that some slaves had come back from staging today, which got me looking at recent builds for the slaves involved:

talos-r3-fed64-023 - https://build.mozilla.org/buildapi/recent/talos-r3-fed64-023
talos-r3-fed-012 - https://build.mozilla.org/buildapi/recent/talos-r3-fed-012
talos-r3-fed64-027 - https://build.mozilla.org/buildapi/recent/talos-r3-fed64-027
talos-r3-fed64-053 - https://build.mozilla.org/buildapi/recent/talos-r3-fed64-053
talos-r3-fed-038 - https://build.mozilla.org/buildapi/recent/talos-r3-fed-038
talos-r3-fed-024 - https://build.mozilla.org/buildapi/recent/talos-r3-fed-024
talos-r3-fed64-039 - https://build.mozilla.org/buildapi/recent/talos-r3-fed64-039
talos-r3-fed-044 - https://build.mozilla.org/buildapi/recent/talos-r3-fed-044
talos-r3-w7-007 - https://build.mozilla.org/buildapi/recent/talos-r3-w7-007

Every one of them shows a big gap in its build history before today: some had been idle only since the 17th or 25th, but several since October or November. So I suspect these are the slaves that just came back, and that they weren't as healthy as they seemed while they were hanging out in staging.
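For anyone who wants to repeat the gap check without eyeballing each history page, here is a minimal Python sketch. It assumes you have already collected a slave's job start times as `datetime` objects by whatever means; the function names, the 5-day threshold, and the sample timestamps are all made up for illustration and are not taken from buildapi.

```python
from datetime import datetime, timedelta


def longest_gap(start_times):
    """Return (gap, before, after) for the largest idle stretch between
    consecutive jobs, or None if there are fewer than two jobs."""
    ordered = sorted(start_times)
    if len(ordered) < 2:
        return None
    pairs = zip(ordered, ordered[1:])
    before, after = max(pairs, key=lambda pair: pair[1] - pair[0])
    return after - before, before, after


def looks_freshly_returned(start_times, threshold=timedelta(days=5)):
    """Guess whether a slave recently came back from a long absence.
    The threshold is an arbitrary assumption, not a known policy."""
    gap = longest_gap(start_times)
    return gap is not None and gap[0] >= threshold


# Hypothetical history: two jobs in early November, then nothing until Dec 31.
sample = [
    datetime(2010, 11, 2, 9, 0),
    datetime(2010, 11, 3, 9, 0),
    datetime(2010, 12, 31, 9, 0),
]
gap, before, after = longest_gap(sample)
print(f"idle for {gap.days} days, from {before:%Y-%m-%d} to {after:%Y-%m-%d}")
```

A slave whose largest gap ends at the failing job matches the pattern described above; one with a steady history, like the one in comment 2, would not.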

Phil Ringnalda (:philor)

Reporter

Comment 1

14 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1293846099.1293848598.16328.gz

talos-r3-fed64-011 - https://build.mozilla.org/buildapi/recent/talos-r3-fed64-011

Phil Ringnalda (:philor)

Reporter

Comment 2

14 years ago

Uh oh, https://build.mozilla.org/buildapi/recent/talos-r3-fed64-040 doesn't fit with the slave pattern, though http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1293842455.1293849865.20936.gz fits with the hang pattern.

Phil Ringnalda (:philor)

Reporter

Comment 3

14 years ago

Fits again: http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1293845806.1293853277.2994.gz - talos-r3-fed-033 - https://build.mozilla.org/buildapi/recent/talos-r3-fed-033

Phil Ringnalda (:philor)

Reporter

Comment 4

14 years ago

None of the affected slaves has yet done exactly the same test on the same branch (the closest is https://build.mozilla.org/buildapi/recent/talos-r3-fed-024 doing svg on whatever shadow-central is). But at least 8 of them have gone on to successfully do another Talos run on the same branch. So although I don't have a theory for what sort of leftover state could cause such a thing, it's possible that they only fail on their first Talos run after coming back to production.
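The "only fails on the first run back" hypothesis can be stated as a small check. This is a rough Python sketch under the assumption that a slave's runs since returning to production are available as a chronological list of `(branch, passed)` pairs; that data shape, the function name, and the sample runs are hypothetical.

```python
def failed_only_on_first_run_back(runs, branch):
    """True if, on the given branch, the first run failed and every later
    run passed. `runs` is a chronological list of (branch, passed) pairs
    covering only the period since the slave returned to production."""
    outcomes = [passed for run_branch, passed in runs if run_branch == branch]
    # Need at least one later run to say the failure did not repeat.
    return len(outcomes) >= 2 and not outcomes[0] and all(outcomes[1:])


# Hypothetical slave: hung on its first mozilla-central run, fine afterwards.
runs = [
    ("mozilla-central", False),
    ("tracemonkey", True),
    ("mozilla-central", True),
]
print(failed_only_on_first_run_back(runs, "mozilla-central"))
```

Note that this only compares runs on the same branch, not the same test, which is the same limitation mentioned above.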

Phil Ringnalda (:philor)

Reporter

Comment 5

14 years ago

Six pushes without seeing any more, so perhaps we're out of the woods (or perhaps Monday morning we'll have 30 pushes going at once before anyone notices that the first 10 are failing, who knows?).

Severity: blocker → normal

Phil Ringnalda (:philor)

Reporter

Comment 6

14 years ago

I can't imagine anyone being able to do anything useful at this point with "a bunch of slaves that just came back to production failed once on their first Talos run 5 days ago, and then were fine."

Status: NEW → RESOLVED

Closed: 14 years ago

Resolution: --- → WORKSFORME

Nobody; OK to take it and work on it

Assignee

Updated

11 years ago

Product: mozilla.org → Release Engineering

Categories

(Release Engineering :: General, defect)
