Closed
Bug 814496
Opened 12 years ago
Closed 12 years ago
SUT: long-running exec'd process corrupts response to future commands
Categories
(Testing :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
mozilla20
People
(Reporter: gbrown, Assigned: gbrown)
References
Details
Attachments
(1 file, 1 obsolete file)
18.12 KB,
patch
|
gbrown
:
review+
wlach
:
feedback+
|
Details | Diff | Splinter Review |
If devicemanagerSUT.launchProcess is used to exec a process via StartPrg2, and that process runs for more than 300 s, the output from that process may be returned at a later time and confused with the response data for another command.
(There may be similar issues with other types of exec's and other ways of launching via dm -- this is just a description of the particular scenario that is giving me trouble today!)
![]() |
Assignee | |
Comment 1•12 years ago
|
||
Observed while running robocop locally on a tegra, while investigating bug 805969/813858...
On one run, testBrowserProvider executed for longer than normal. devicemanagerSUT.launchProcess had been called which eventually called _sendCmds, which was waiting for the sut prompt response, for up to 300 s:
https://hg.mozilla.org/mozilla-central/file/3b71d63eafd5/testing/mozbase/mozdevice/mozdevice/devicemanagerSUT.py#l207
On the sutAgent side, StartPrg2 exec'd "am instrument ...", created a RedirOutputThread, and waited for the thread for up to 300 s (30x10):
https://hg.mozilla.org/mozilla-central/file/3b71d63eafd5/build/mobile/sutagent/android/DoCommand.java#l3729
StartPrg2 timed out after 300 s, and returned. dmSUT._sendCmds received the $> prompt without most of the test output and without the normal returnCode= marker. launchProcess returned and dmSUT began monitoring for process completion using the "ps" command periodically. After some time, a "ps" command received the test output concatenated with the normal ps output.
I suppose the RedirOutputThread for the exec continues to run after the time-out and still has a valid stream to the socket...as it receives output, it keeps sending it to the socket, even though the listener has now moved on to issue other commands.
![]() |
Assignee | |
Comment 2•12 years ago
|
||
With logging and an artificially long-running test, I was able to confirm: If an exec'd process runs for longer than the 30x10=300 second timeout, the RedirOutputThread continues to run and redirect output, which is just mixed in with the responses to whatever other command happens to be running at that time.
![]() |
Assignee | |
Comment 3•12 years ago
|
||
I have added a stopRedirect() function to RedirOutputThread. stopRedirect signals the thread loop to exit and waits for confirmation that redirection has ended before returning. RedirOutputThread.run breaks out of its redirection loop when signalled, notifies stopRedirect() that redirection has ended, and then continues to wait for process completion (so that it can clean up effectively).
I tested locally with an artificially long-running robocop test and verified that there was no longer any response corruption.
Attachment #684786 -
Flags: review?(wlachance)
Comment 4•12 years ago
|
||
Comment on attachment 684786 [details] [diff] [review]
stop RedirOutputThread process output redirection on timeout
So I could be missing something, but as it is I don't see any reason why you couldn't override the RedirOutputThread class to do this automatically after the thread is joined. I think getting the right behaviour automatically here would be both safer and less verbose.
Otherwise, good catch!
Attachment #684786 -
Flags: review?(wlachance) → review-
![]() |
Assignee | |
Comment 5•12 years ago
|
||
(In reply to William Lachance (:wlach) from comment #4)
> Comment on attachment 684786 [details] [diff] [review]
> stop RedirOutputThread process output redirection on timeout
>
> So I could be missing something, but as it is I don't see any reason why you
> couldn't override the RedirOutputThread class to do this automatically after
> the thread is joined. I think getting the right behaviour automatically here
> would be both safer and less verbose.
Good idea, but...
- There are a couple of places where we want to join but keep redirection going.
- Thread.join is final -- it cannot be overridden.
![]() |
Assignee | |
Comment 6•12 years ago
|
||
...so then the patch starts to look like this. I think I prefer the original patch. Thoughts?
Attachment #685404 -
Flags: feedback?(wlachance)
Comment 7•12 years ago
|
||
Comment on attachment 685404 [details] [diff] [review]
an alternative
Given that we can't override join, I think either this one or the other are about equally good. I think I slightly prefer this one as it does provide a one-stop-shop method to use but the difference is slight. I guess I'll give r+ to either and let you make the final call.
Attachment #685404 -
Flags: feedback?(wlachance) → feedback+
Comment 8•12 years ago
|
||
Comment on attachment 684786 [details] [diff] [review]
stop RedirOutputThread process output redirection on timeout
Changing r- to r+ based on discussion.
Attachment #684786 -
Flags: review- → review+
![]() |
Assignee | |
Updated•12 years ago
|
Attachment #684786 -
Attachment is obsolete: true
![]() |
Assignee | |
Comment 9•12 years ago
|
||
Comment on attachment 685404 [details] [diff] [review]
an alternative
Let's go with the new version. r=wlach
Attachment #685404 -
Flags: review+
![]() |
Assignee | |
Comment 10•12 years ago
|
||
Comment 11•12 years ago
|
||
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20
You need to log in
before you can comment on or make changes to this bug.
Description
•