Bug 938543
Opened 11 years ago
Closed 10 years ago
when blobber fails to upload talos results in a green run on tbpl
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: jmaher, Unassigned)
References
Details
I have been seeing this while testing .etl file uploads, but there was also a crash on the try server for an existing test; it tried to upload the crash report and failed. So we have talos crashing and blobber failing, but we still return 0 and the job is green?

Here is a link to the try server job where it failed to upload a crash report:
https://tbpl.mozilla.org/php/getParsedLog.php?id=30553443&tree=Try#error0

Here is a link to a try server job where it failed to upload a .etl file:
https://tbpl.mozilla.org/php/getParsedLog.php?id=30540421&tree=Try&full=1

I believe the logic we should use is:

    talosreturncode = runtalos()
    blobberreturncode = runblobber()
    if failed(talosreturncode)
        return talosreturncode
    return blobberreturncode
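For illustration, the return-code precedence proposed in the description could be sketched in Python roughly as below. This is only a sketch under assumptions: `combined_return_code` and `run_job` are hypothetical names, the two runners are passed in as plain callables rather than being the real talos or blobber entry points, and "failed" is assumed to mean a non-zero return code.

```python
def combined_return_code(talos_rc, blobber_rc):
    """Pick the code to report: a talos failure wins, else blobber's code.

    Assumption: any non-zero code counts as a failure.
    """
    if talos_rc != 0:
        return talos_rc
    return blobber_rc


def run_job(run_talos, run_blobber):
    """Run both steps and report a single return code.

    run_talos and run_blobber are hypothetical callables that each
    return an integer exit code; blobber always runs, even if talos failed.
    """
    talos_rc = run_talos()
    blobber_rc = run_blobber()
    return combined_return_code(talos_rc, blobber_rc)
```

With this ordering a blobber failure can only change the result of a job whose talos run succeeded; it never hides a talos failure.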
Comment 1•11 years ago
In general, we usually get told not to set WARNINGS (orange) or FAILURE (red) for infra issues. Should we use EXCEPTION (which ends up purple) instead?
Reporter
Comment 2•11 years ago
We could ignore the blobber error code and just proxy the talos return code then. My vote: if talos fails, report it as such (red or orange), and if blobber fails, report it as purple.
Comment 3•11 years ago
I think that's a good plan long-term, but there are still some things to be worked out before that. Until we're ready to rely 100% on blobber, I'd prefer if failures didn't affect the status of the build.
Reporter
Comment 4•11 years ago
So to confirm: we will still report the original error code of the test job, and ignore any and all failures from blobber?
Comment 5•11 years ago
I think the exact final behaviour is TBD, but there are a few options if blobber fails:
- turn the job red (like graph server post failures)
- turn the job orange
- turn the job purple (infra failure)
- silently ignore the failure and report the original error code of the test job

I don't think we'd automatically retry on blobber failure. I'm leaning towards orange or purple...or some kind of "infra warning" colour that we don't have yet.
Reporter
Comment 6•11 years ago
Thinking larger now: all our blobber uploads happen when we have failures, so technically we should be turning the job orange (screenshots, .etl files) or red (crash reports). Right now blobber is masking real test failures (on non-production branches).
Comment 7•11 years ago
(In reply to Joel Maher (:jmaher) from comment #6)
> thinking larger now- all our blobber uploads are when we have failures- so
> technically we should be turning the job orange (screenshots, .etl files) or
> red (crash reports). Right now blobber is masking real test failures (on
> non production branches)

I think *currently* all our blobber uploads are when we have failures. In the future we could potentially upload logs, localconfig.json, buildprops.json, or other files, even on success.
Comment 8•11 years ago
I haven't looked at where and how blobber is deciding its status, but ignorantly assuming it's a buildstep with a worst_status, it seems like it just needs to put EXCEPTION after WARNINGS and FAILURE instead of before in a custom worst_status to implement comment 0's "purple if we were green before blobber failed, red or orange if we were that before blobber failed." And it's true what they say, ignorance is bliss.
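A rough sketch of what such a reordered worst_status might look like, in standalone Python. The constants below are local stand-ins whose values are assumed to mirror Buildbot's result codes, and the default ordering is assumed to match Buildbot's stock precedence; `BLOBBER_ORDER` and `status_after_blobber` are hypothetical names introduced here, not anything taken from the blobber or mozharness code.

```python
# Local stand-ins for result codes (values assumed to mirror Buildbot's).
SUCCESS, WARNINGS, FAILURE, SKIPPED, EXCEPTION, RETRY = range(6)

# Assumed stock precedence: EXCEPTION outranks FAILURE and WARNINGS.
DEFAULT_ORDER = (RETRY, EXCEPTION, FAILURE, WARNINGS, SKIPPED, SUCCESS)

# Hypothetical custom precedence: EXCEPTION moved after FAILURE and WARNINGS.
BLOBBER_ORDER = (RETRY, FAILURE, WARNINGS, EXCEPTION, SKIPPED, SUCCESS)


def worst_status(a, b, order=DEFAULT_ORDER):
    """Return whichever of a and b appears first in the precedence order."""
    for status in order:
        if status in (a, b):
            return status
    raise ValueError("unknown status: %r, %r" % (a, b))


def status_after_blobber(job_status, blobber_ok):
    """Fold a blobber upload result into an existing job status.

    A failed upload is treated as EXCEPTION (purple), but with
    BLOBBER_ORDER it only shows if the job was not already
    orange (WARNINGS) or red (FAILURE).
    """
    blobber_status = SUCCESS if blobber_ok else EXCEPTION
    return worst_status(job_status, blobber_status, order=BLOBBER_ORDER)
```

Under these assumptions a green job whose upload fails ends up purple, while a job that was already orange or red keeps its colour.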
Comment 9•11 years ago
To sum up:

Current situation: blobber upload success or failure doesn't impact job status one way or another.

Desired future state: failure on blobber upload should turn a green job purple, otherwise leave job status alone.

Is that accurate?
Reporter
Comment 10•11 years ago
Just saw another case where blobber turned a job green that should have been red.
Reporter
Comment 11•11 years ago
catlee, were the blobber and mozharness scripts changed to preserve the original return code? Funny how we both commented at the same time.
Comment 12•11 years ago
Have a link to the job from comment #10? At this point blobber shouldn't be affecting the job status one way or another.
Reporter
Comment 13•11 years ago
http://dev-master01.build.scl1.mozilla.com:8036/builders/Android%204.0%20Panda%20mozilla-central%20opt%20test%20crashtest/builds/6/steps/run_script/logs/stdio
Comment 14•11 years ago
What's wrong with that? unzip fails, exits with 9, and mozharness then exits with 9.
Reporter
Comment 15•11 years ago
OK, I was mistaken. I don't have access to the webserver, so I was relying on an IRC conversation that could easily have been out of context. Given this information, I would wager that comment 9 is accurate.
Reporter
Comment 16•11 years ago
catlee, please see bug 946922; we still have blobber messing up the return codes.
Updated•10 years ago
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME
Assignee
Updated•6 years ago
Component: General Automation → General