Closed
Bug 876159
Opened 11 years ago
Closed 11 years ago
Test runs which fail and then exceed the maximum log size are marked as passing
Categories
(Release Engineering :: General, defect)
Release Engineering
General
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: philor, Unassigned)
References
Details
Attachments
(5 files, 1 obsolete file)
1.16 KB,
patch
|
philor
:
review+
mozilla
:
checked-in+
|
Details | Diff | Splinter Review |
2.21 KB,
patch
|
philor
:
review+
mozilla
:
checked-in+
|
Details | Diff | Splinter Review |
2.35 KB,
patch
|
philor
:
review+
mozilla
:
checked-in+
|
Details | Diff | Splinter Review |
5.66 KB,
patch
|
philor
:
review+
mozilla
:
checked-in+
|
Details | Diff | Splinter Review |
11.08 KB,
patch
|
Callek
:
review+
mozilla
:
checked-in+
|
Details | Diff | Splinter Review |
https://tbpl.mozilla.org/php/getParsedLog.php?id=23400763&tree=Mozilla-Inbound is a debug Ubuntu crashtest run, which exceeded the maximum log size, but is marked as green. Every tree is closed.
Reporter | ||
Comment 1•11 years ago
|
||
Those crashtests may well be passing, the summary comes from https://tbpl.mozilla.org/php/getParsedLog.php?id=23402698&tree=Try where Ms2ger spotted it, because his try push did break mochitest-1, but the debug ones which exceeded the maximum log size were marked as green.
Reporter | ||
Comment 2•11 years ago
|
||
And we can easily get the linux crashtests down below the log limit, should just be a matter of 'hg backout -r 6115fce2399d'. I'm only holding off on that because we might need the permafailgreen to test fixing detection of it. Doesn't solve all our undetected problems, though, because https://tbpl.mozilla.org/php/getParsedLog.php?id=23401583&tree=Mozilla-Inbound is a Win7 debug mochitest-2 that OOMs and overflows the log (intermittently).
Comment 3•11 years ago
|
||
r=me to comment out that line with a reference to this bug
Comment 4•11 years ago
|
||
(xpcom/base/nsCycleCollector.cpp, line 1890)
Comment 5•11 years ago
|
||
Or you can just delete that warning, rather than commenting it out. I was planning on removing it next week anyways.
Reporter | ||
Comment 6•11 years ago
|
||
My possibly-flawed understand of how we used to catch these is that http://mxr.mozilla.org/build/source/buildbotcustom/steps/unittest.py#230 insisted on seeing the "Failed: 0" line, and if it didn't see it, then the run failed. http://mxr.mozilla.org/build/source/mozharness/mozharness/mozilla/testing/unittest.py#120 looks to be the mozharness equivalent, where it thinks that in that case fail_count will still be the -1 it was inited as, but apparently that doesn't work: both truncated logs which have an error in the non-truncated part and should have set fail_count to > 0, and truncated logs which don't have an error in the non-truncated part and should have left it at -1 wind up with SUCCESS.
Comment 7•11 years ago
|
||
Well, there are other issues here. We have one case where even in the truncated log, there are test failures yet it is green so it seems that the log being truncated bypasses running through the parser at all and sets the build green. I think bypassing the parser altogether is fine, but if the log is so much bigger than expected that it has to be truncated, that condition alone should be sufficient to turn the build red.
Comment 8•11 years ago
|
||
Not sure if this will work, but it's worth a shot.
Attachment #754171 -
Flags: review?(philringnalda)
Reporter | ||
Comment 9•11 years ago
|
||
Comment on attachment 754171 [details] [diff] [review] detect truncation -> burn I'm pretty sure it won't work, because the message is a buildbot HEADER rather than an ANYTHINGYOUCANACTUALLYSEE, but Callek thought it was possible that it was visible to log parsing despite that, so if you're willing to do a reconfig I'm more than willing to trigger a crashtest that will overflow to see.
Attachment #754171 -
Flags: review?(philringnalda) → review+
Comment 10•11 years ago
|
||
The other approach we can take is to use rc_eval_func instead of regex_eval_func, and change the scripts to set a return status as opposed to spitting out a string. That approach would take a lot more changes, so I'm going to reconfig to see if this works first.
Comment 11•11 years ago
|
||
Comment on attachment 754171 [details] [diff] [review] detect truncation -> burn http://hg.mozilla.org/build/buildbotcustom/rev/f05c900f0d29
Attachment #754171 -
Flags: checked-in+
Comment 12•11 years ago
|
||
Also, we've been talking about uploading logs, configs, etc. from test jobs for a while now. That would allow for uploading the un-truncated info log, the raw log and other log level logs, the localconfig.json, screenshots, or anything else we put in the upload dir for uploading. There would still need to be something telling buildbot what the status is, though.
Reporter | ||
Comment 13•11 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #3) > r=me to comment out that line with a reference to this bug Not enough, https://tbpl.mozilla.org/php/getParsedLog.php?id=23409164&full=1&branch=try is the same OOM without the CC warning, still overflowing. Kill the nsDeque OOM warning? Kill imptests? I'll kill imptests if you tell me to, just tell me to, I don't mind, I'll kill them, say the word.
Comment 14•11 years ago
|
||
Untested, but should work in theory. We'd need to change the regex_log_evaluator() to rc_eval_func() in buildbotcustom.
Comment 15•11 years ago
|
||
Attachment #754179 -
Attachment is obsolete: true
Comment 16•11 years ago
|
||
Updated•11 years ago
|
Attachment #754181 -
Flags: review?
Comment 17•11 years ago
|
||
Comment on attachment 754184 [details] [diff] [review] [custom] rc_eval_func instead of regex_log_evaluator 5.. manually canceled?
Updated•11 years ago
|
Attachment #754184 -
Flags: review?
Comment 18•11 years ago
|
||
(In reply to Justin Wood (:Callek) from comment #17) > Comment on attachment 754184 [details] [diff] [review] > [custom] rc_eval_func instead of regex_log_evaluator > > 5.. manually canceled? How would we detect that in mozharness?
Comment 19•11 years ago
|
||
Comment on attachment 754181 [details] [diff] [review] [mozharness] set self.return_code, lower touch version You need to assign a reviewer.
Reporter | ||
Updated•11 years ago
|
Attachment #754181 -
Flags: review? → review+
Reporter | ||
Comment 20•11 years ago
|
||
Comment on attachment 754184 [details] [diff] [review] [custom] rc_eval_func instead of regex_log_evaluator "Dunno": RETRY always scares me because there are so many things I dunno, but there's no better time than now to risk infinite retries :)
Attachment #754184 -
Flags: review? → review+
Comment 21•11 years ago
|
||
Comment on attachment 754181 [details] [diff] [review] [mozharness] set self.return_code, lower touch version https://hg.mozilla.org/build/mozharness/rev/d90aad39ac1f Landed on default. I'll merge to production once the reconfig's done.
Attachment #754181 -
Flags: checked-in+
Comment 22•11 years ago
|
||
Comment on attachment 754184 [details] [diff] [review] [custom] rc_eval_func instead of regex_log_evaluator http://hg.mozilla.org/build/buildbotcustom/rev/117b78891fc5 Merged to production-0.8, starting reconfig.
Attachment #754184 -
Flags: checked-in+
Reporter | ||
Comment 23•11 years ago
|
||
The Win debug OOM looks like it's maybe 8% based on my try run, so it doesn't block reopening like the crashtest does, but I'll be coming after it with a machete before long.
Comment 24•11 years ago
|
||
Reconfig done, mozharness patch transplanted to production.
Comment 25•11 years ago
|
||
with unit tests that pass.
Attachment #754189 -
Flags: review?(philringnalda)
Reporter | ||
Comment 26•11 years ago
|
||
Comment on attachment 754189 [details] [diff] [review] [mozharness] set failure if we surpass max log size >+def cleanup(): >+ gc.collect() >+ # I'm using MercurialVCS here because that gives me access to >+ # >+ c = CleanupObj() This is not going to be a surprise, but I've got no idea what that comment is saying.
Attachment #754189 -
Flags: review?(philringnalda) → review+
Comment 27•11 years ago
|
||
Comment on attachment 754189 [details] [diff] [review] [mozharness] set failure if we surpass max log size Landed to default, transplanted to production. And yeah, that comment was left over from a different file I copied from (cleaned up now). http://hg.mozilla.org/build/mozharness/rev/7416a68e544b
Attachment #754189 -
Flags: checked-in+
Comment 28•11 years ago
|
||
(In reply to Phil Ringnalda (:philor) from comment #13) > Not enough, > https://tbpl.mozilla.org/php/getParsedLog.php?id=23409164&full=1&branch=try > is the same OOM without the CC warning, still overflowing. Kill the nsDeque > OOM warning? Kill imptests? I'll kill imptests if you tell me to, just tell > me to, I don't mind, I'll kill them, say the word. Yeah, I think killing the nsDeque OOM warning is good as a short term measure to prevent log badness. I should be able to add it back later this week, when I make it so the CC doesn't blast it out a million times when things fall apart. It is just papering over extreme badness in that test (bug 875585), but you are going to need to get somebody else's approval for disabling it. ;) Ms2ger should have a better idea who might have an opinion.
Reporter | ||
Comment 29•11 years ago
|
||
Never been so happy to see a red job; trees reopened, *thank you Aki*.
Severity: blocker → normal
Reporter | ||
Comment 30•11 years ago
|
||
Filed bug 876545 on dealing with the way that debug mochitest-1 is so close to 50MB that a single timeout and screenshot puts it over.
Comment 31•11 years ago
|
||
Attachment #756301 -
Flags: review?(bugspam.Callek)
Comment 32•11 years ago
|
||
Comment on attachment 756301 [details] [diff] [review] move worst_level() to LogMixin, use it in buildbot_status() Review of attachment 756301 [details] [diff] [review]: ----------------------------------------------------------------- This looks ok to me ::: mozharness/mozilla/buildbot.py @@ +83,5 @@ > if not level: > level = TBPL_STATUS_DICT[tbpl_status] > + self.worst_buildbot_status = self.worst_level(tbpl_status, self.worst_buildbot_status, TBPL_STATUS_DICT.keys()) > + if self.worst_buildbot_status != tbpl_status: > + self.info("Current worst status %s is worse; keeping it." % self.worst_buildbot_status) not a fan of how this english reads, but its a very minor bikeshed nit with no new shed color in mind, so no worry.
Attachment #756301 -
Flags: review?(bugspam.Callek) → review+
Comment 33•11 years ago
|
||
Comment on attachment 756301 [details] [diff] [review] move worst_level() to LogMixin, use it in buildbot_status() http://hg.mozilla.org/build/mozharness/rev/43e1d7a78e96 , transplanted to production.
Attachment #756301 -
Flags: checked-in+
Reporter | ||
Updated•11 years ago
|
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•11 years ago
|
Product: mozilla.org → Release Engineering
Assignee | ||
Updated•6 years ago
|
Component: General Automation → General
You need to log in
before you can comment on or make changes to this bug.
Description
•