Closed
Bug 904839
Opened 11 years ago
Closed 11 years ago
B2G emulator tests still timing out after 1200s, even with the mozprocess-timeout code
Categories
(Release Engineering :: General, defect)
Release Engineering
General
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: jgriffin, Assigned: jgriffin)
References
Details
Attachments
(2 files, 2 obsolete files)
827 bytes,
patch
|
ahal
:
review+
|
Details | Diff | Splinter Review |
3.13 KB,
patch
|
k0scist
:
review+
|
Details | Diff | Splinter Review |
B2G emulator unittests are still being killed by buildbot after 1200s with no output, despite the fact that we are running them with mozprocess and an output_timeout of 1000, which should cause them to be killed by mozharness instead, and thus allow a dump of the logcat to occur. sample log: https://tbpl.mozilla.org/php/getParsedLog.php?id=26160870&full=1&branch=mozilla-central
Assignee | ||
Comment 1•11 years ago
|
||
This adds some logging to the mozprocess section of run_command; hopefully this will help us figure out why the problem described in this bug's summary is occurring.
Attachment #793563 -
Flags: review?(ahalberstadt)
Assignee | ||
Updated•11 years ago
|
Assignee: nobody → jgriffin
Comment 2•11 years ago
|
||
Comment on attachment 793563 [details] [diff] [review] Add some mozprocess logging, Review of attachment 793563 [details] [diff] [review]: ----------------------------------------------------------------- A print statement in the on_timeout method might also be helpful.
Attachment #793563 -
Flags: review?(ahalberstadt) → review+
Assignee | ||
Comment 3•11 years ago
|
||
I added your suggestion and landed: https://hg.mozilla.org/build/mozharness/rev/a4e6f54010d3 I'll trigger a run on cedar to make sure nothing bad happens, then push to production.
Comment 4•11 years ago
|
||
Merged to production branch of mozharness. Live as of now.
Assignee | ||
Comment 5•11 years ago
|
||
In all cases this logging shows that we are using mozprocess to run the command, but in some cases we're not catching the timeout. I can't tell why.
Comment 6•11 years ago
|
||
In bug 917252 I noticed that the messages in the on_timeout handler can get interleaved with the process output somehow. Did you ctrl-f for "mozprocess timed out"? Because it could just be hidden among some random line of output above.
Assignee | ||
Comment 7•11 years ago
|
||
That doesn't appear to be the case here; see a recent log: https://tbpl.mozilla.org/php/getParsedLog.php?id=28100440&full=1&branch=mozilla-inbound
Assignee | ||
Comment 9•11 years ago
|
||
WIP. This seems to resolve this issue on try, but I need to do a full try run to make sure it doesn't break anything else. Without this patch: https://tbpl.mozilla.org/?tree=Try&rev=2471fcda8961 With this patch: https://tbpl.mozilla.org/?tree=Try&rev=fc8f06309cfa
Assignee | ||
Comment 10•11 years ago
|
||
full try run: https://tbpl.mozilla.org/?tree=Try&rev=6dcdc9b326cb
Assignee | ||
Comment 11•11 years ago
|
||
This patch seems to work well for B2G mochitests/reftests, but fails for linux tests and for B2G marionette. :(
Assignee | ||
Comment 12•11 years ago
|
||
v2..handle the case where os.read returns only one or more newlines. try: https://tbpl.mozilla.org/?tree=Try&rev=6f94f56145c8
Assignee | ||
Updated•11 years ago
|
Attachment #815961 -
Attachment is obsolete: true
Assignee | ||
Comment 13•11 years ago
|
||
(In reply to Jonathan Griffin (:jgriffin) from comment #12) > Created attachment 816074 [details] [diff] [review] > Use os.read instead of readline in readWithTimeout, > > v2..handle the case where os.read returns only one or more newlines. try: > https://tbpl.mozilla.org/?tree=Try&rev=6f94f56145c8 This is much better; the only problem left is that we're somehow messing up Windows builds. Will investigate a little more.
Assignee | ||
Comment 14•11 years ago
|
||
v3: locally fixes the Windows build problems
Attachment #816074 -
Attachment is obsolete: true
Assignee | ||
Comment 15•11 years ago
|
||
pushed to try: https://tbpl.mozilla.org/?tree=Try&rev=247cd269d6fe
Assignee | ||
Comment 16•11 years ago
|
||
Comment on attachment 816650 [details] [diff] [review] mozprocess.diff Review of attachment 816650 [details] [diff] [review]: ----------------------------------------------------------------- This is looking good on try, so I'm putting up for review.
Attachment #816650 -
Flags: review?(jhammel)
Comment 17•11 years ago
|
||
Comment on attachment 816650 [details] [diff] [review] mozprocess.diff Review of attachment 816650 [details] [diff] [review]: ----------------------------------------------------------------- Nice. Looks good. ::: testing/mozbase/mozprocess/mozprocess/processhandler.py @@ +695,5 @@ > lineReadTimeout = outputTimeout > > + (lines, self.didTimeout) = self.readWithTimeout(logsource, lineReadTimeout) > + while lines != "" and not self.didTimeout: > + for line in lines.split('\n'): why not splitlines? Is this from universal newlines? (do we enforce that?) @@ +793,5 @@ > + self.read_buffer += output > + if '\n' not in self.read_buffer: > + time.sleep(0.01) > + continue > + tmp = self.read_buffer.split('\n') again, '\n' v splitlines
Attachment #816650 -
Flags: review?(jhammel) → review+
Assignee | ||
Comment 18•11 years ago
|
||
split('\n') seemed more precise. Here's a try run using splitlines() instead: https://tbpl.mozilla.org/?tree=Try&rev=8b2d8b3c7af7
Assignee | ||
Comment 19•11 years ago
|
||
(In reply to Jonathan Griffin (:jgriffin) from comment #18) > split('\n') seemed more precise. Here's a try run using splitlines() > instead: > > https://tbpl.mozilla.org/?tree=Try&rev=8b2d8b3c7af7 It turns out this doesn't quite work, at least for the second case. In that case, we want to differentiate between 'test\nfoo\n' and 'test\nfoo', which is a little clumsier with splitlines(). So, I'll keep split('\n') in that case.
Assignee | ||
Comment 20•11 years ago
|
||
(In reply to Jonathan Griffin (:jgriffin) from comment #19) > (In reply to Jonathan Griffin (:jgriffin) from comment #18) > > split('\n') seemed more precise. Here's a try run using splitlines() > > instead: > > > > https://tbpl.mozilla.org/?tree=Try&rev=8b2d8b3c7af7 > > It turns out this doesn't quite work, at least for the second case. In that > case, we want to differentiate between 'test\nfoo\n' and 'test\nfoo', which > is a little clumsier with splitlines(). So, I'll keep split('\n') in that > case. And here's a green try run which shows this works ok: https://hg.mozilla.org/try/rev/1c31fd8992c5
Assignee | ||
Comment 21•11 years ago
|
||
https://github.com/mozilla/mozbase/commit/576b5285c981b6ba938cb1f8716aa96cd7e3b046 We'll have to mirror to m-c before we can tell if this fixes this problem, so I'll leave this bug open until then.
Assignee | ||
Comment 22•11 years ago
|
||
(In reply to Jonathan Griffin (:jgriffin) from comment #21) > https://github.com/mozilla/mozbase/commit/ > 576b5285c981b6ba938cb1f8716aa96cd7e3b046 > > We'll have to mirror to m-c before we can tell if this fixes this problem, > so I'll leave this bug open until then. Totally green! After the versionbump patch is r+'d, I'll land that in github and do another try run with the result, since it will include a few more packages than the above.
Assignee | ||
Comment 23•11 years ago
|
||
(In reply to Jonathan Griffin (:jgriffin) from comment #22) > (In reply to Jonathan Griffin (:jgriffin) from comment #21) > > https://github.com/mozilla/mozbase/commit/ > > 576b5285c981b6ba938cb1f8716aa96cd7e3b046 > > > > We'll have to mirror to m-c before we can tell if this fixes this problem, > > so I'll leave this bug open until then. > > Totally green! After the versionbump patch is r+'d, I'll land that in > github and do another try run with the result, since it will include a few > more packages than the above. wrong bug, sorry !
Assignee | ||
Comment 24•11 years ago
|
||
This appears to be fixed by some recent changes to mozrunner and/or mozprocess. I'll re-open if I see this again. Getting stack traces appropriately is being handled under separate bugs, like bug 919353.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Comment 25•11 years ago
|
||
this is causing talos to fail if I run with mozprocess 0.13. I get a never ending stream of __metrics (a line printed at the end of the logfile for the warmup run). I really need another fix that is in 0.13 and not in 0.11.
Comment 26•11 years ago
|
||
I added a \n to the last __metrics line printed out here: http://hg.mozilla.org/build/talos/file/tip/talos/getInfo.html#l66, and the problem is solved. We really need to fix this in the mozprocess code as this is not handling generic process output.
Assignee | ||
Comment 27•11 years ago
|
||
(In reply to Joel Maher (:jmaher) from comment #26) > I added a \n to the last __metrics line printed out here: > http://hg.mozilla.org/build/talos/file/tip/talos/getInfo.html#l66, and the > problem is solved. We really need to fix this in the mozprocess code as > this is not handling generic process output. Agreed. Is this problem you're seeing on Windows, or everywhere?
Comment 28•11 years ago
|
||
oh, bad bug reporting. this was seen on all linux and osx platforms, which was easily seen here: https://github.com/mozilla/mozbase/blob/master/mozprocess/mozprocess/processhandler.py#L779 windows worked great.
Assignee | ||
Comment 29•11 years ago
|
||
Thanks, I'll try to write a test to reproduce this.
Updated•6 years ago
|
Component: General Automation → General
You need to log in
before you can comment on or make changes to this bug.
Description
•