Closed
Bug 1261199
Opened 9 years ago
Closed 8 years ago
Use StructuredOutputParser for parsing reftest test results
Categories
(Release Engineering :: Applications: MozharnessCore, defect)
Release Engineering
Applications: MozharnessCore
Tracking
(firefox51 fixed)
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
firefox51 | --- | fixed |
People
(Reporter: ahal, Assigned: pyang)
References
Details
Attachments
(1 file, 4 obsolete files)
3.84 KB,
patch
|
pyang
:
review+
|
Details | Diff | Splinter Review |
Reftest emits structured logs, but mozharness currently doesn't consume them. It is still parsing the old formatted logs using a set of regexes. We should make sure to use the StructuredOutputParser class instead of DesktopUnittestParser:
https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/mozharness/mozilla/testing/testbase.py#502
We'll need to be careful that we don't miss failures that are hidden in unstructured output. This might mean running the same set of regexes when we come across unstructured output.
Assignee | ||
Comment 1•9 years ago
|
||
Adding reftest for using structured log
Attachment #8755752 -
Flags: feedback?(ahalberstadt)
Assignee | ||
Comment 2•9 years ago
|
||
Attachment #8755752 -
Attachment is obsolete: true
Attachment #8755752 -
Flags: feedback?(ahalberstadt)
Assignee | ||
Comment 3•9 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9c8da83bce42e6ef730865dc8e08921136a25322&selectedJob=21354910
try result looks pretty ok.
Assignee: nobody → pyang
Assignee | ||
Comment 4•9 years ago
|
||
Andrew, we don't see issues for enabling strcuturedlog for reftest. any comments?
Flags: needinfo?(ahalberstadt)
Reporter | ||
Comment 5•9 years ago
|
||
Awesome! So at the very minimum, I'd like to see:
1) Force a test-failure, make sure job turns orange. You can change an == to a != in the reftest manifest to do this.
2) Force a leak, make sure job turns orange. You can remove an asserts directive in the manifest to do this.
3) Force a crash, make sure job turns red. You can add a proc.kill(sig=singal.SIGABRT) somewhere in the harness to do this (after sleeping a bit to give the harnes a chance to start running tests).
For all of the above we also need to make sure the appropriate failure messages appear in the log. I'd recommend doing a push both with and without your patch here, and compare the logs of the two. We really need better tests for all of this, but unfortunately we do not.
Flags: needinfo?(ahalberstadt)
Assignee | ||
Comment 6•9 years ago
|
||
(In reply to Andrew Halberstadt [:ahal] from comment #5)
> For all of the above we also need to make sure the appropriate failure
> messages appear in the log. I'd recommend doing a push both with and without
> your patch here, and compare the logs of the two. We really need better
> tests for all of this, but unfortunately we do not.
does it mean to push 3 x 2 (with & without) pushes? Or we need combinations of those 3 conditions.
Reporter | ||
Comment 7•9 years ago
|
||
No, you don't need to do combinations.. you could even do all 3 in the same push if you wanted (i.e in different chunks)
Assignee | ||
Comment 8•9 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f2ba4edb2a272b1aa0804fa344a2171648678fc3&selectedJob=21921047
It works for forcing orange.
Working on force crash case.
Assignee | ||
Comment 9•9 years ago
|
||
Hi ahal, here are force error changeset and result
Force crash changeset
https://hg.mozilla.org/try/rev/ff2dc3f806a6ffd1b699e86ac1c6366b07c39a57
treeherder result
https://treeherder.mozilla.org/#/jobs?repo=try&revision=ff2dc3f806a6ffd1b699e86ac1c6366b07c39a57
for the red suites, main process was killed so no 'SUITE-END' log.
Flags: needinfo?(ahalberstadt)
Reporter | ||
Comment 10•9 years ago
|
||
Haha nice! I like your time bomb. I see what you were saying earlier.. it would be good to get a little module we could use to test certain conditions like that somewhere so we can re-use it in other places. So it looks like there are a couple things to investigate here:
1) It looks like anytime something goes wrong, no suite_end message is being emitted. We should always be emitting this even if there are failures or crashes
2) When there is a crash, we don't generate a stack trace. Usually there should be a long spew of addresses and a PROCESS-CRASH log line. I'm not sure why this is not happening :/
3) The output format is slightly different from before. I think this is because we aren't using the customized reftest formatter here:
https://dxr.mozilla.org/mozilla-central/source/layout/tools/reftest/output.py#13
We'll need to make sure that is the formatter being used to log to stdout. Let me know if you have anymore questions about these problems, or need a hand looking into them.
Flags: needinfo?(ahalberstadt)
Assignee | ||
Comment 11•9 years ago
|
||
(In reply to Andrew Halberstadt [:ahal] from comment #10)
> Haha nice! I like your time bomb. I see what you were saying earlier.. it
> would be good to get a little module we could use to test certain conditions
> like that somewhere so we can re-use it in other places. So it looks like
> there are a couple things to investigate here:
>
> 1) It looks like anytime something goes wrong, no suite_end message is being
> emitted. We should always be emitting this even if there are failures or
> crashes
supposedly suite_end should be emitted by runner process. When it was forced killed, does it make sense to send suite_end by manage process? or we need to add signal handler for those situations.
>
> 2) When there is a crash, we don't generate a stack trace. Usually there
> should be a long spew of addresses and a PROCESS-CRASH log line. I'm not
> sure why this is not happening :/
>
ya agree. Should add more log/stack trace while process lost.
> 3) The output format is slightly different from before. I think this is
> because we aren't using the customized reftest formatter here:
> https://dxr.mozilla.org/mozilla-central/source/layout/tools/reftest/output.
> py#13
>
> We'll need to make sure that is the formatter being used to log to stdout.
Not really familiar with its format. Will check if we need to add another patch for that.
> Let me know if you have anymore questions about these problems, or need a
> hand looking into them.
Thanks! Those are really helpful :D
Assignee | ||
Comment 12•8 years ago
|
||
leave only config patch here.
Attachment #8755827 -
Attachment is obsolete: true
Assignee | ||
Comment 13•8 years ago
|
||
Adding structured log enabling part
Attachment #8772705 -
Attachment is obsolete: true
Assignee | ||
Comment 14•8 years ago
|
||
try result by latest patch
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f15406fc51f9fd5f1a536c15bfb15be8eb3b48cd
Assignee | ||
Comment 15•8 years ago
|
||
Comment on attachment 8779692 [details] [diff] [review]
enable-reftest
I re-triggered failed tests and looks green.
Try wasn't stable in these couple of days, maybe we can run them again.
Attachment #8779692 -
Flags: review?(ahalberstadt)
Reporter | ||
Comment 16•8 years ago
|
||
Comment on attachment 8779692 [details] [diff] [review]
enable-reftest
Review of attachment 8779692 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks, this looks good! I still want to do a bit of testing before we land this though, I'll do a new try run. For future reference, you can use '-u reftest,crashtest,jsreftest' for your try push, as we only care about those suites with this patch.
Attachment #8779692 -
Flags: review?(ahalberstadt) → review+
Reporter | ||
Comment 17•8 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=31f11b48774b
I'll do some local tests tomorrow too.
Reporter | ||
Comment 18•8 years ago
|
||
Nice that try run includes some known intermittents and a crash, both seem to get logged correctly! Paul could you upload a new patch with a proper commit message?
Flags: needinfo?(pyang)
Reporter | ||
Comment 19•8 years ago
|
||
I did find one minor "problem" that we need some sheriff input on.
When tests fail to start, or there is a crash before the tests finish, DesktopUnittestOutputParser prints "No tests run or summary not found":
https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/mozharness/mozilla/testing/unittest.py#191
However when this happens with the StructuredOutputParser, it prints either "No checks run" and/or "No suite_end message was emitted by this harness":
https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/mozharness/mozilla/structuredlog.py#15
The StructuredOutputParser behaviour is strictly better than the old message, as it can differentiate between "tests didn't start" and "tests didn't finish". There are two problems with changing this message though:
1) Treeherder regexes need to be updated to flag these two new messages
2) Existing intermittent bugs filed with the old message will no longer be starrable
Problem 1) should be fixed either way, it's just a matter of do we need to block on it. Problem 2) will cause some pain for the sheriffs. A potential compromise would be to temporarily emit all three messages from the StructuredOutputParser, which will give sheriffs some time to migrate over to the new message from the old.
Philor, Wes, what would you prefer we do here?
Flags: needinfo?(wkocher)
Flags: needinfo?(philringnalda)
Comment 20•8 years ago
|
||
If you look at the existing bugs on the current message, you'll see that nobody is working on any of them, and that there's no understanding whatsoever of what the message means or why it is output or when it should or should not be starred or which bug it should be starred as, so it won't hurt anything if these classes of failure have no starrable message for a while.
Flags: needinfo?(philringnalda)
Reporter | ||
Comment 21•8 years ago
|
||
Thanks good to know.
Also I'm an idiot, treeherder already flags the two new message because they are logged with ERROR. I didn't pay close enough attention to the failure summary. This means problem 1) is already solved. And given that starring the old message is already terrible, I guess there's nothing extra to be done here.
Flags: needinfo?(wkocher)
Assignee | ||
Comment 22•8 years ago
|
||
modifying only commit message. carry on review.
Thanks!
Attachment #8779692 -
Attachment is obsolete: true
Flags: needinfo?(pyang)
Attachment #8780967 -
Flags: review+
Comment 23•8 years ago
|
||
Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/95aef368faed
Use StructuredOutputParser for parsing reftest test results; r=ahal
Reporter | ||
Comment 24•8 years ago
|
||
Thought I should mention that I tested this by writing a patch that runs both the DesktopUnittestOutputParser and the StructuredOutputParser on the same log. It then compares the TBPL status of both, and passes the job if they're the same, and fails the job if they're different:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f3abf6cb08b972ba70af6f0941f0d6e2d2bfda7d
All green means that both output parsers appear to be flagging the same sets of errors. It's not 100% fool proof, but much better than manually comparing logs.
Comment 25•8 years ago
|
||
It should have found a difference on 330 seconds without output hangs, since we have been failing to actually parse anything for them other than the utterly useless "crash" that we cause after them, since somewhere around February, a thing which this now fixed.
Reporter | ||
Comment 27•8 years ago
|
||
Yeah, I wasn't able to pipe logs that were printed directly to stdout (i.e not through the logger) to the StructuredOutputParser in my test patch. I presume the 330s timeout message was one of those.
But when the StructuredOutputParser encounters one of these in production, it passes it on to the base class. Which means unstructured output is now going through the base OutputParser instead of DesktopUnittestOutputParser. This is probably what inadvertently fixed it.
Comment 28•8 years ago
|
||
bugherder |
You need to log in
before you can comment on or make changes to this bug.
Description
•