Don't TinderboxPrint the buildid and slavename-approximation nobody wants to see

VERIFIED FIXED

Status

Testing
Talos
6 years ago

People

(Reporter: philor, Assigned: philor)

Tracking

Trunk

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Assignee)

Description

6 years ago
Created attachment 584225 [details] [diff] [review]
utterly untested fix v.1

So much boring history. Here, kid, sit over there and I'll tell you why things are the stupid way they are.

Several million years ago, when using a dinosaur as a mascot was just like a picture of your pet cat, builds came from a single physical computer per branch+OS, so a Linux trunk build from 201112241246 meant the only possible build that the box named george could have produced at that instant, based on the state of the CVS repo on 2011-12-24 at 12:46.

In a world where 30 different Linux slaves can start builds at the same instant on 30 different revs, and the only relationship between a datetime and what's in the build is that something pushed later cannot be in it, assuming that the clocks on the slave, hg.m.o, and the pusher's computer are all correct, nobody wants to see a TinderboxPrinted buildid.

More recently, only a few short years ago, bug 448047 comment 25 expressed doubt about whether run_tests.py should do the TinderboxPrinting of the slavename, #c26 absolutely correctly said that buildbot should be the one to do it, and then the bug went ahead and did the wrong thing instead. Then, for the wrong reason, bug 560236 half-corrected that and made buildbot print the slavename first thing, without getting rid of the repetitive print from run_tests.py. (In the dark ages of tinderbox-based tbpl, it wouldn't show anything if the run didn't TinderboxPrint the revision, since it wouldn't have any way to say what push the run went with, so the condition was a missing revision, not just "fails before TinderboxPrinting something"; but moving slavename up to first thing was still the right move, because you always want to know which slave it was that failed, no matter how early it fails.)

tbpl hates all the noise TinderboxPrinted by Talos runs so much that it does special log parsing for Talos, to throw away the things nobody wants to see. For things that it thinks are moderately successful Talos runs (things which have a graphserver URL in them), it just ignores the buildid and the not-quite slavename (which I think once upon a time was the actual slavename, but now seems to sometimes be tegra-036.n for tegra-036). For things which went so wrong that there's no graphserver URL, though, it falls back to the parser which doesn't ignore the noise, and displays a summary like:

* s: tegra-036
* s: tegra-036.n
* id: 20111224091218
* FAIL: Busted: ts
* FAIL: timeout exceeded
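
The two parsing paths described above can be sketched roughly as follows. This is not tbpl's actual code (which is not shown in this bug); the function name `summarize`, the `graph_marker` substring used to spot a graphserver URL, and the rule of keeping only the first `s:` line are all assumptions made for illustration.

```python
def summarize(printed, graph_marker="graphs."):
    """Pick which TinderboxPrinted lines end up in the run summary.

    Hypothetical sketch: `graph_marker` is an assumed way of spotting a
    graphserver URL, not a pattern taken from tbpl.
    """
    has_graph_url = any(graph_marker in line for line in printed)
    if not has_graph_url:
        # Fallback parser: nothing is filtered, so the noise is shown too.
        return list(printed)

    kept = []
    seen_slave = False
    for line in printed:
        if line.startswith("id: "):
            # Buildid: dropped for runs that look like Talos successes.
            continue
        if line.startswith("s: "):
            if seen_slave:
                # Assumed: later "s:" lines are the not-quite slavename.
                continue
            seen_slave = True
        kept.append(line)
    return kept


# The failed run from the summary above has no graphserver URL,
# so every printed line survives.
failed_run = [
    "s: tegra-036",
    "s: tegra-036.n",
    "id: 20111224091218",
    "FAIL: Busted: ts",
    "FAIL: timeout exceeded",
]
print(summarize(failed_run))
```

With the run_tests.py prints gone, the fallback path has nothing noisy left to show, which is why the fix belongs in Talos rather than in yet more special-case parsing.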
Attachment #584225 - Flags: review?(jmaher)
Comment on attachment 584225 [details] [diff] [review]
utterly untested fix v.1

Review of attachment 584225 [details] [diff] [review]:
-----------------------------------------------------------------

I am fine with the changes, but I don't know the impact this will have on the buildbot code for desktop or remote.
Attachment #584225 - Flags: review?(jmaher) → review+
Attachment #584225 - Flags: review?(bear)

Updated

6 years ago
Attachment #584225 - Flags: review?(bear) → review+
(Assignee)

Comment 2

6 years ago
http://hg.mozilla.org/build/talos/rev/4f1829d10b4e
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
(Assignee)

Comment 3

6 years ago
And in production today :)
Status: RESOLVED → VERIFIED