Closed Bug 896015 Opened 11 years ago Closed 11 years ago

Fix return codes and tbpl coloring for talos mozharness

Categories

(Release Engineering :: Applications: MozharnessCore, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: armenzg, Assigned: jyeo)

Attachments

(2 files, 1 obsolete file)

If we look at this log:
https://tbpl.mozilla.org/php/getParsedLog.php?id=25494235&tree=Try&full=1
we see that the return code is 1 rather than 2, yet tbpl marked the job as "red".

mozharness unittest jobs determine their status from return codes only, and we want talos mozharness to do the same.

log_eval_func=rc_eval_func({
    0: SUCCESS,
    1: WARNINGS,
    2: FAILURE,
    3: EXCEPTION,
    4: RETRY,
}),

The patch attached has already been reviewed positively.
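
As a rough illustration of the return-code mapping above, here is a minimal sketch of an evaluator that turns a process return code into a status. The helper name make_rc_evaluator, the string status names, and the fallback to FAILURE for unknown codes are assumptions made for this example; they are not taken from buildbotcustom's rc_eval_func.

```python
# Stand-in status names for illustration only; the real constants
# come from buildbot and are not reproduced here.
SUCCESS, WARNINGS, FAILURE, EXCEPTION, RETRY = (
    "success", "warnings", "failure", "exception", "retry")


def make_rc_evaluator(rc_map, default=FAILURE):
    """Return a function that maps a process return code to a status.

    Hypothetical helper; falling back to `default` for unknown
    return codes is an assumption of this sketch.
    """
    def evaluate(return_code):
        return rc_map.get(return_code, default)
    return evaluate


# Same return-code table as in the bug description.
evaluate = make_rc_evaluator({
    0: SUCCESS,
    1: WARNINGS,
    2: FAILURE,
    3: EXCEPTION,
    4: RETRY,
})
```

With a table like this, a job exiting with 1 would be reported as warnings (orange) rather than failure (red), which is the behaviour requested here.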

(In reply to Aki Sasaki [:aki] from comment #22)
> Comment on attachment 778586 [details] [diff] [review]
> only consider return codes for talos mozharness
> 
> Before we land this and reconfig, we should verify the talos script actually
> exits with the appropriate self.return_code.
> 
> To do that we may have to add more self.buildbot_status() calls in
> mozharness.mozilla.testing.talos.  However, iirc, talos never goes orange,
> only green/red/retry.

(In reply to Ed Morley [:edmorley UTC+1] from comment #23)
> (In reply to Aki Sasaki [:aki] from comment #22)
> > However, iirc, talos never goes orange,
> > only green/red/retry.
> 
> Talos also goes orange for crashes and test-unexpected-fails as of bug
> 829728, via:
> https://hg.mozilla.org/build/buildbotcustom/file/f24d9219c221/steps/talos.py#l115
Attachment #778618 - Flags: review+
Attached patch: Call buildbot_status (obsolete)
I haven't tested this yet. I will do a try push next week. Let me know if this is what you are referring to.
Attachment #778639 - Flags: review?(aki)
Comment on attachment 778639 [details] [diff] [review]
Call buildbot_status

Hm, I missed the fact that we already set self.return_code.

Do the talos return codes match what we're expecting here (i.e., 0 success, 1 warning, 2 failure, 4 retry)? If so, we may not need this patch at all. If not, we may need to map the talos return codes to this set, or patch talos to return the codes we're expecting.
Attachment #778639 - Flags: review?(aki) → review+
talos returns 0 for success, 1 for orange, and 2 for red. With these codes and mozharness talos as it currently runs on the try server, I am unable to get orange:
https://tbpl.mozilla.org/php/getParsedLog.php?id=25494235&tree=Try&full=1
(In reply to Aki Sasaki [:aki] from comment #2)
> Comment on attachment 778639 [details] [diff] [review]
> Call buildbot_status
> 
> Hm, I missed the fact that we already set self.return_code.
> 
> Do the talos return codes match what we're expecting here (i.e., 0 success,
> 1 warning, 2 failure, 4 retry)? If so, we may not need this patch at all.
> If not, we may need to map the talos return codes to this set, or patch
> talos to return the codes we're expecting.

(In reply to Joel Maher (:jmaher) from comment #3)
> talos returns 1 for orange, 2 for red, 0 for success.  With these and
> mozharness talos as it is on try server, I am unable to get orange:
> https://tbpl.mozilla.org/php/getParsedLog.php?id=25494235&tree=Try&full=1

aki, does this mean that we're good to go with just the first patch?

I will land it for now since it only affects talos on try.
Flags: needinfo?(aki)
Comment on attachment 778618 [details] [diff] [review]
only consider return codes for talos mozharness

https://hg.mozilla.org/build/buildbotcustom/rev/a23709fffc46
Attachment #778618 - Flags: checked-in+
Fixing a minor bug. I left out the underscore when calling determine_status.

I'm not sure how we are doing the tbpl log parsing. I applied the patch, forced return codes 0, 1 and 2 in talos and I got # TBPL SUCCESS #, # TBPL WARNING # and # TBPL FAILURE # respectively, printed in the logs. Is this the desired output?

I checked the logs for the other tests (mochitest, reftest, etc.) and they color the jobs orange when # TBPL WARNING # is present. I believe tbpl would color it orange when it sees # TBPL WARNING # in the logs, right?
Attachment #779220 - Flags: review+
Attachment #778639 - Attachment is obsolete: true
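
For reference, the experiment described in comment 6 (forcing return codes 0, 1 and 2 and reading the TBPL line from the log) can be sketched roughly as follows. The table TBPL_STATUS_BY_RC and the function tbpl_status_line are hypothetical names for this example, and treating unknown codes as FAILURE is an assumption; this is not the mozharness implementation.

```python
# Assumed mapping from talos exit codes to the TBPL status words
# reported in the comment above (0, 1 and 2 only).
TBPL_STATUS_BY_RC = {0: "SUCCESS", 1: "WARNING", 2: "FAILURE"}


def tbpl_status_line(return_code):
    """Build the '# TBPL <STATUS> #' log line for a return code.

    Treating unknown codes as FAILURE is an assumption of this sketch.
    """
    status = TBPL_STATUS_BY_RC.get(return_code, "FAILURE")
    return "# TBPL %s #" % status


# Print the line that each forced return code would produce.
for rc in (0, 1, 2):
    print(tbpl_status_line(rc))
```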
Sheriffs, anyone know about comment 6?
TBPL just displays the result given to it by buildbot, so this is a buildbot/releng question :-)
(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) (EDT/UTC-4) from comment #4)
> (In reply to Aki Sasaki [:aki] from comment #2)
> > Comment on attachment 778639 [details] [diff] [review]
> > Call buildbot_status
> > 
> > Hm, I missed the fact that we already set self.return_code.
> > 
> > Do the talos return codes match what we're expecting here (i.e., 0 success,
> > 1 warning, 2 failure, 4 retry)? If so, we may not need this patch at all.
> > If not, we may need to map the talos return codes to this set, or patch
> > talos to return the codes we're expecting.
> 
> (In reply to Joel Maher (:jmaher) from comment #3)
> > talos returns 1 for orange, 2 for red, 0 for success.  With these and
> > mozharness talos as it is on try server, I am unable to get orange:
> > https://tbpl.mozilla.org/php/getParsedLog.php?id=25494235&tree=Try&full=1
> 
> aki, does this mean that we're good to go with just the first patch?
> 
> > I will land it for now since it only affects talos on try.

Yes, we're good to go with just the first patch.
If we want to declare any Exception or Retry statuses, we'll need to add those, but for green/orange/red we're good.
Flags: needinfo?(aki)
(In reply to Jason Yeo [:jyeo] from comment #6)
> Created attachment 779220 [details] [diff] [review]
> Call buildbot_status. Fix minor bug.
> 
> Fixing a minor bug. I left out the underscore when calling determine_status.
> 
> I'm not sure how we are doing the tbpl log parsing. I applied the patch,
> forced return codes 0, 1 and 2 in talos and I got # TBPL SUCCESS #, # TBPL
> WARNING # and # TBPL FAILURE # respectively, printed in the logs. Is this
> the desired output?
> 
> I checked the logs for the other tests (mochitest, reftest, etc) and they
> color the jobs as orange when # TBPL WARNING # is present. I believe tbpl
> would color it orange when it sees # TBPL WARNING # in the logs, right?

The TBPL strings are only needed until patch 1 is put into production; afterwards we're going to go by exit status only.
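
To make the distinction concrete, here is a small sketch contrasting the two ways of deciding a job's status: scanning the log for a TBPL marker versus using the exit status alone. Both functions are hypothetical; how buildbot actually parses logs is not shown in this bug, and taking the last marker when several appear, as well as the FAILURE fallback, are assumptions of the example.

```python
import re

# Matches markers such as "# TBPL WARNING #" anywhere in a log.
TBPL_MARKER = re.compile(r"# TBPL (\w+) #")


def status_from_log(log_text):
    """Log-based evaluation: return the last TBPL status word, or None."""
    matches = TBPL_MARKER.findall(log_text)
    return matches[-1] if matches else None


def status_from_exit_code(return_code):
    """Exit-status evaluation: ignore the log and use the return code."""
    table = {0: "SUCCESS", 1: "WARNING", 2: "FAILURE"}
    return table.get(return_code, "FAILURE")
```

Once only the exit status is consulted, the markers in the log no longer influence the result, which is why they become unnecessary after the first patch is deployed.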
This is now in production.
jmaher, can you please re-trigger your job and see if it got fixed?
all fixed, thanks guys!
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Component: General Automation → Mozharness