Closed Bug 967647 Opened 6 years ago Closed 6 years ago
B2G mochitests sometimes contain mangled log lines
Some mochitest runs for B2G contain logs in which output from the browser is interleaved with output from the harness, which can break our log parsing and cause trouble with TBPL. Case in point: https://tbpl.mozilla.org/php/getParsedLog.php?id=34070825&tree=Mozilla-Inbound&full=1#error0 This problem seems related to the way that mozprocess and/or mozrunner are handling stdout; the problem was introduced when we moved our automation on top of them.
This output is being generated here: http://mxr.mozilla.org/mozilla-central/source/testing/mozbase/mozrunner/mozrunner/remote.py#193
Quite possibly, we should be killing the process (which would force a flush of stdout) before we log the TEST-UNEXPECTED-FAIL message...that might be enough to solve this particular instance of the problem.
We do kill the process before we get to that point: http://mxr.mozilla.org/mozilla-central/source/testing/mozbase/mozprocess/mozprocess/processhandler.py#732 Though I think in this case, we are just killing adb while the shell process it created on the device continues to run. Another thing I notice from this log is that the kill and get stack attempt is totally not working.
I think just logging the timeout message after the process is killed might work.. though hard to say for sure. No harm in trying.
Attachment #8370269 - Flags: review?(jgriffin)
Comment on attachment 8370269 [details] [diff] [review] Patch 1.0 - kill b2g process before logging timeout message Review of attachment 8370269 [details] [diff] [review]: ----------------------------------------------------------------- Thanks, hopefully this helps.
Attachment #8370269 - Flags: review?(jgriffin) → review+
Also landed to upstream mozbase: https://github.com/mozilla/mozbase/commit/e26e8e1e2f45a6e63c66a41006652aaae5d9d2ac
Assignee: nobody → ahalberstadt
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
I'm reopening this; the patch above fixes the problem (probably) in the case where a test times out, but we also see cases where this causes an otherwise-passing run to fail, e.g.: https://tbpl.mozilla.org/php/getParsedLog.php?id=34206387&tree=B2g-Inbound We should get to the bottom of why it's occurring and make a real fix to this.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
So the job is still green, but mozharness thinks there is a failure because the summary is broken up by the runtests end message and thus dumps the logcat.
Are reftest logs an entirely separate proposition, so we need a separate bug for the rather frequent way they time out on startup and mangle together that fact and the last Chatty Cathy Marionette line about setting prefs and what should be fixed about it, a la https://tbpl.mozilla.org/php/getParsedLog.php?id=34353600&tree=Mozilla-Inbound?
https://tbpl.mozilla.org/php/getParsedLog.php?id=34393090&tree=Mozilla-Inbound for a mochitest mangling in sort of semi-mid-run.
Likely the same problem, but with different chunks of code. Reftests don't use mozprocess yet iirc.
Opt mochitest-6 is up to hitting the comment 9 thing around 80% of the time, so I've hidden it on all trunk trees.
There's only been one mochitest-6 failure on inbound since they were hidden, do you mind if we unhide them again? A big mozbase sync patch just landed which might affect this bug and I plan to make this one of my top priorities if it is still a problem. Please direct me towards any new logs you notice that have this.
Yeah, that's pretty sweet, from when it got better I'd guess that at any time just adding or deleting, or disabling or enabling a mochitest will cause some random b2g mochitest hunk to become permaorange.
https://tbpl.mozilla.org/php/getParsedLog.php?id=35326379&tree=B2g-Inbound and see also bug 977204, RyanVM actually managed to forget about the existence of this bug, the lucky devil.
Ignorance is bliss!
Alright, obviously the mozbase sync didn't help much. I think flushing stdout before the mochitest harness prints the "Running test end" message might fix this particular instance of the problem.
I *think* this should fix all the interleaving issues. At the very least I found a test case that fails when using a buffered stream and passes when using an unbuffered one. So assuming the problem was that sys.stdout was buffering output, this should fix it. I'll push it to try once it is open again.
I forgot to qref some minor changes to the test and how ProcessHandler handles the default output case.
This fixes a s/extend/update error in B2GRunner that the previous try run caught. Everything else is the same: https://tbpl.mozilla.org/?tree=Try&rev=eb0bb0264bb6
Attachment #8384563 - Flags: review?(jgriffin) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/085029423c29 I'll leave-open for now, just cause I'm not sure if this will improve the situation or not. Please let me know if you see this interleaving again.
Ok, no reports of mangled logs in the last week. Tentatively closing. Feel free to re-open if problem persists.
Status: REOPENED → RESOLVED
Closed: 6 years ago → 6 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.