Closed
Bug 1140394
Opened 10 years ago
Closed 10 years ago
b2g desktop reftest suite reporting WARNING when all tests were successfully executed but shutdown spew is interleaved in pass/fail/todo log line
Categories
(Testing :: General, defect)
Tracking
(firefox40 fixed)
RESOLVED
FIXED
mozilla40
Tracking | Status | |
---|---|---|
firefox40 | --- | fixed |
People
(Reporter: gerard-majax, Assigned: gerard-majax)
References
(Depends on 1 open bug)
Details
(Whiteboard: [systemsfe])
Attachments
(3 files)
I ran into this intermittent issue when working on mulet reftest. One try that exposes the issue is: https://treeherder.mozilla.org/#/jobs?repo=try&revision=13e6c3ec48da&exclusion_profile=false
Assignee | ||
Comment 1•10 years ago
|
||
Assignee | ||
Comment 2•10 years ago
|
||
Assignee | ||
Comment 3•10 years ago
|
||
Assignee | ||
Comment 4•10 years ago
|
||
Here, R4 failure: https://treeherder.allizom.org/#/jobs?repo=try&revision=0e665dd844f3
Comment 5•10 years ago
|
||
In effect, that's a test of how quickly your product shuts down, and congratulations, it's intermittently getting faster.
The decision about whether the job step passed or failed comes from looking in the log for the pass/fail/todo line in a particular format (literally, matching a particular regex), and that format is not
13:55:10 INFO - REFTEST INFO | UnexpectREFTEST INFO | b2g_desktop.py | Running tests: end.
13:55:10 INFO - ed: 0 (0 unexpected fail, 0 unexpected pass, 0 unexpected asserts, 0 unexpected fixed asserts, 0 failed load, 0 exception)
like it is in the failure cases. As was the case with bug 967647, some one of the multitude of harness layers isn't buffering output like it ought to, so writing the summary and writing shutdown spew get intermingled when the shutdown spew comes faster than it usually does. What that layer is, dunno, though I'd bet it's named mozsomething.
Component: Mozharness → General
Product: Release Engineering → Testing
QA Contact: jlund
Summary: Reftest suite reporting WARNING when all tests were successfully executed → b2g desktop reftest suite reporting WARNING when all tests were successfully executed but shutdown spew is interleaved in pass/fail/todo log line
Assignee | ||
Comment 6•10 years ago
|
||
Thanks for the explanation!
Assignee | ||
Comment 7•10 years ago
|
||
Phil, there are data about this on https://treeherder.allizom.org/#/jobs?repo=try&revision=7b52379388d1. I don't know if it's luck or not, but R2 and R4 are not impacted, and R5 and R6 have only one. R1 and R3 are more impacted.
Flags: needinfo?(philringnalda)
Updated•10 years ago
|
Flags: needinfo?(philringnalda)
Comment 8•10 years ago
|
||
This would be fixed by bug 798300 (but we need to fix another issue before we can re-land that).
Assignee | ||
Comment 9•10 years ago
|
||
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #8)
> This would be fixed by bug 798300 (but we need to fix another issue before
> we can re-land that).
Thanks ted. Any way I can help speeding up those ?
Flags: needinfo?(ted)
Comment 10•10 years ago
|
||
bug 798300 comment 18 describes the fix needed. It's straightforward but unfortunately maybe not simple.
Flags: needinfo?(ted)
Comment 11•10 years ago
|
||
Bug 798300 is about changing the default behavior of mozprocess. If this is
Comment 12•10 years ago
|
||
Sorry, entered that too early.
If this is acutely a problem for b2g reftests, it looks like we could pass arguments from the b2g destkop runner to mozprocess to split the stdout/stderr streams. These tests aren't using the shutdown leaks detector so would not require the same invasive changes discussed there.
Comment 13•10 years ago
|
||
Looking at b2g_desktop.py and comparing with comment 5, the interleaving looks like it's between python sources (https://hg.mozilla.org/mozilla-central/annotate/dd32e3ff3717/layout/tools/reftest/b2g_desktop.py#l107 and https://hg.mozilla.org/mozilla-central/annotate/dd32e3ff3717/layout/tools/reftest/b2g_desktop.py#l164). If these are in different threads that could be the source of the problem, and sharing a lock in this file would be enough to fix it.
Assignee | ||
Comment 14•10 years ago
|
||
(In reply to Chris Manchester [:chmanchester] from comment #13)
> Looking at b2g_desktop.py and comparing with comment 5, the interleaving
> looks like it's between python sources
> (https://hg.mozilla.org/mozilla-central/annotate/dd32e3ff3717/layout/tools/
> reftest/b2g_desktop.py#l107 and
> https://hg.mozilla.org/mozilla-central/annotate/dd32e3ff3717/layout/tools/
> reftest/b2g_desktop.py#l164). If these are in different threads that could
> be the source of the problem, and sharing a lock in this file would be
> enough to fix it.
Thanks, I'm going to give this a try.
Assignee | ||
Comment 15•10 years ago
|
||
(In reply to Chris Manchester [:chmanchester] from comment #13)
> Looking at b2g_desktop.py and comparing with comment 5, the interleaving
> looks like it's between python sources
> (https://hg.mozilla.org/mozilla-central/annotate/dd32e3ff3717/layout/tools/
> reftest/b2g_desktop.py#l107 and
> https://hg.mozilla.org/mozilla-central/annotate/dd32e3ff3717/layout/tools/
> reftest/b2g_desktop.py#l164). If these are in different threads that could
> be the source of the problem, and sharing a lock in this file would be
> enough to fix it.
Andrew, it's in code you hacked, do you want to fix it? :)
Flags: needinfo?(ahalberstadt)
Assignee | ||
Comment 16•10 years ago
|
||
Chris, I have cross checked and the interleaving seems not to be like you said, but I see:
> 20:37:56 INFO - REFTEST INFO | Successful: 1801 (1798 pass, 3 load only)
> 20:37:57 INFO - REFTEST INFO | UnexpectREFTEST INFO | b2g_desktop.py | Running tests: end.
> 20:37:57 INFO - ed: 0 (0 unexpected fail, 0 unexpected pass, 0 unexpected asserts, 0 unexpected fixed asserts, 0 failed load, 0 exception)
> 20:37:57 INFO - REFTEST INFO | Known problems: 261 (23 known fail, 0 known asserts, 22 random, 216 skipped, 0 slow)
The part "REFTEST INFO | b2g_desktop.py | Running tests: end." seems to be coming from https://hg.mozilla.org/mozilla-central/annotate/dd32e3ff3717/layout/tools/reftest/b2g_desktop.py#l107 as you suggested.
But the part "REFTEST INFO | Unexpected: 0 (0 unexpected fail, 0 unexpected pass, 0 unexpected asserts, 0 unexpected fixed asserts, 0 failed load, 0 exception)" (with the interleaving fixed) does seems to come from https://hg.mozilla.org/mozilla-central/annotate/dd32e3ff3717/layout/tools/reftest/reftest.js#l1367
Assignee | ||
Comment 17•10 years ago
|
||
Chris, am I missing something in comment 16 or the issue you pointed at in comment 13 is unrelated ?
Flags: needinfo?(cmanchester)
Comment 18•10 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #17)
> Chris, am I missing something in comment 16 or the issue you pointed at in
> comment 13 is unrelated ?
I don't have first hand knowledge of this implementation, but the way this is usually done is the python harness reads output from the firefox process (managed with mozprocess) and forwards it to its own stdout or another log file. The line in reftest.js is delivering the summary to the firefox process' stdout, which would then be passed to the on_output callback I linked in comment 13. Python's print isn't thread safe, so if mozprocess is calling this callback on another thread, and I think it is, this could be our interleaving scenario. I think it's worth testing, but maybe Andrew has another take on this.
Flags: needinfo?(cmanchester)
Assignee | ||
Comment 19•10 years ago
|
||
Right, it makes sense, I missed this :). Let's try something around threading.Lock().
Assignee | ||
Comment 20•10 years ago
|
||
Not sure if it was what you had in mind, but this do not works: https://hg.mozilla.org/try/rev/efa4c9f3da95
Flags: needinfo?(cmanchester)
Assignee | ||
Comment 21•10 years ago
|
||
So I lost my sunday on trying several ways to fix this, nothing ever worked.
Comment 22•10 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #20)
> Not sure if it was what you had in mind, but this do not works:
> https://hg.mozilla.org/try/rev/efa4c9f3da95
That's pretty much what I had in mind. Can you provide a link to the try results?
Flags: needinfo?(cmanchester) → needinfo?(lissyx+mozillians)
Assignee | ||
Comment 23•10 years ago
|
||
Here: https://treeherder.allizom.org/#/jobs?repo=try&revision=737d6de2d8ee
I'm giving a try to using sys.stdout.write() only and in combination with sys.stdout.flush() in:
https://treeherder.allizom.org/#/jobs?repo=try&revision=dde031e66745
https://treeherder.allizom.org/#/jobs?repo=try&revision=0b9f05b77f54
Flags: needinfo?(lissyx+mozillians)
Assignee | ||
Comment 24•10 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #23)
> Here: https://treeherder.allizom.org/#/jobs?repo=try&revision=737d6de2d8ee
>
> I'm giving a try to using sys.stdout.write() only and in combination with
> sys.stdout.flush() in:
> https://treeherder.allizom.org/#/jobs?repo=try&revision=dde031e66745
And according to:
> https://treeherder.allizom.org/#/jobs?repo=try&revision=0b9f05b77f54
sys.stdout.write() + sys.stdout.flush() helps a lot!
Assignee | ||
Comment 25•10 years ago
|
||
Attachment #8591396 -
Flags: review?(ahalberstadt)
Comment 26•10 years ago
|
||
Comment on attachment 8591396 [details] [diff] [review]
Protect standard output from interleaving
Review of attachment 8591396 [details] [diff] [review]:
-----------------------------------------------------------------
Lgtm.
Attachment #8591396 -
Flags: review?(ahalberstadt) → review+
Updated•10 years ago
|
Flags: needinfo?(ahalberstadt)
Assignee | ||
Updated•10 years ago
|
Assignee: nobody → lissyx+mozillians
Assignee | ||
Updated•10 years ago
|
Whiteboard: [systemsfe]
Assignee | ||
Updated•10 years ago
|
Keywords: checkin-needed
Comment 27•10 years ago
|
||
Keywords: checkin-needed
Comment 28•10 years ago
|
||
Status: NEW → RESOLVED
Closed: 10 years ago
status-firefox40:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
You need to log in
before you can comment on or make changes to this bug.
Description
•