Closed
Bug 699173
Opened 13 years ago
Closed 12 years ago
Talos testing gives permission denied error on windows when a timeout occurs
Categories
(Testing :: Talos, defect)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 572127
People
(Reporter: BYK, Unassigned)
Attachments
(2 files)
At the end of a tp5 run with "responsiveness: True", the Firefox process becomes a zombie: it has no UI and holds around 120MB of RAM. The Python script waits several minutes for the process to terminate and then throws the following:
Traceback (most recent call last):
  File "bcontroller.py", line 235, in <module>
    sys.exit(main())
  File "bcontroller.py", line 232, in main
    bcontroller.run()
  File "bcontroller.py", line 175, in run
    results_file = open(self.browser_log, "a")
IOError: [Errno 13] Permission denied: 'browser_output.txt'
Also, after I kill the zombie process manually, testing starts over from cycle 1 without any Firefox process actually being launched, and terminates with the following:
Failed tp5:
Stopped Wed, 02 Nov 2011 20:46:09
FAIL: Busted: tp5
FAIL: unrecognized output format
Completed test tp5:
Stopped Wed, 02 Nov 2011 20:46:09
RETURN: cycle time: 01:04:30<br>
qm-pxp01:
Stopped Wed, 02 Nov 2011 20:46:09
A stacktrace and browser_output.txt are attached.
Comment 1•13 years ago (Reporter)
Comment 2•13 years ago (Reporter)
I think the normal code path holds an open handle with exclusive write access to the "browser_output.txt" file, and when the timeout occurs the handle stays open. As a result, when the timeout branch tries to open the file to log the info, it gets an error.
Maybe a big try-finally block would solve the problem, but it might come with a performance hit.
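The try-finally idea can be sketched as follows. This is a minimal illustration, not the actual bcontroller.py code: `run_with_log`, its `work` callback, and `failing_work` are hypothetical names. The point is only that the handle is released on every exit path, so a later append to the same file does not find it still open.

```python
import os
import tempfile


def run_with_log(log_path, work):
    """Hypothetical helper: run work(handle) and always release the handle."""
    handle = open(log_path, "w")
    try:
        work(handle)
    finally:
        # Executed on normal return and when work() raises.
        handle.close()


def failing_work(handle):
    """Stand-in for a test run that is interrupted part way through."""
    handle.write("partial output\n")
    raise RuntimeError("simulated timeout")


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "browser_output.txt")
    try:
        run_with_log(path, failing_work)
    except RuntimeError:
        # The handle is already closed here, so appending works.
        with open(path, "a") as results_file:
            results_file.write("timeout logged\n")
```

In CPython a try-finally block should add very little overhead when no exception is raised, so the performance concern is probably small, though that has not been measured here.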
Comment 3•13 years ago (Reporter)
The process and the test terminate successfully with "responsiveness: False".
Also, this may not be a Windows-specific issue, since the handle should remain open on every OS when a timeout occurs.
Comment 4•13 years ago
I believe this is a Windows-only issue, but it is still a flaw in how the harness handles a timeout.
It would be good to know whether the stack trace helps figure out why we ended up timing out.
BYK, this is a really awesome catch. We've been trying to hunt down a set of failures that look a lot like this but they've been hard for us to reproduce in our automation. Thanks so much for hunting this down!
Confirming bug.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 6•13 years ago (Reporter)
The stack trace here seems to be useless, because I was unable to generate a proper one. However, I noticed that the "hanging problem" does not occur with the nightlies.
Comment 7•13 years ago (Reporter)
Closing for now since it seems to work perfectly OK on nightlies.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WORKSFORME
Comment 8•13 years ago
I think we should fix this bug by adding a try/except around the file access in the timeout handler. Even though this doesn't reproduce on the nightly builds, we can easily reproduce it on aurora, and fixing it will help us tell a timeout apart from a file access error.
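A rough sketch of what that could look like is below. It is an assumption about the shape of the fix, not the harness code: `log_timeout` is a hypothetical name, and the real timeout handler in bcontroller.py may be structured differently. The file access error is caught and reported as secondary, so the timeout stays visible as the primary failure.

```python
import os
import tempfile


def log_timeout(log_path, message):
    """Hypothetical timeout handler.

    Returns True if the message reached the log file, False if the log
    could not be opened or written (for example because it is locked).
    """
    try:
        with open(log_path, "a") as results_file:
            results_file.write(message + "\n")
        return True
    except (IOError, OSError) as err:
        # Report the timeout anyway; the file error is extra context.
        print("timeout: %s (could not write to %s: %s)" % (message, log_path, err))
        return False


if __name__ == "__main__":
    log_dir = tempfile.mkdtemp()
    log_timeout(os.path.join(log_dir, "browser_output.txt"), "timeout exceeded")
    # Passing a directory forces an open() failure on any platform.
    log_timeout(log_dir, "timeout exceeded")
```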
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Comment 9•13 years ago (Reporter)
@John Maher,
Indeed, I missed the second part of the problem. Sorry =)
Comment 10•13 years ago (Reporter)
After some investigation, I discovered that the issue originates from a line in bcontroller.py which does:
os.system(somecommand + ' > ' + log_file)
in a separate thread. When the browser is frozen and the timeout occurs, the main thread tries to append to the same "log_file", which Windows keeps locked because the command has not finished executing yet.
After some more investigation, I found out that there isn't a good way to "kill" threads in Python 2.x. This issue might be resolved by using mozprocess, by working around the Windows file locking, or by killing off the spawned thread before appending to the log_file.
I have experimented with the multiprocessing module, but it was no good. It also does not make much sense, since the newly created process would launch yet another process (the browser process in this case), which sounds a bit inefficient.
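One way to avoid both the thread and the shell redirection is to launch the command directly and kill it on timeout before touching the log again. The sketch below uses the standard-library subprocess module rather than mozprocess, and it relies on `Popen.wait(timeout=...)`, which exists only in Python 3.3 and later, so it would not run as-is on the Python 2.x harness discussed here. `run_with_timeout` is a hypothetical name.

```python
import os
import subprocess
import sys
import tempfile


def run_with_timeout(command, log_path, timeout_seconds):
    """Run command with its output redirected to log_path.

    Returns the exit code, or None if the process was killed on timeout.
    """
    timed_out = False
    with open(log_path, "w") as log_file:
        proc = subprocess.Popen(command, stdout=log_file, stderr=subprocess.STDOUT)
        try:
            proc.wait(timeout=timeout_seconds)
        except subprocess.TimeoutExpired:
            timed_out = True
            proc.kill()
            proc.wait()  # reap the child so its handles are released
    # The child has exited and our own handle is closed at this point.
    if timed_out:
        with open(log_path, "a") as log_file:
            log_file.write("timeout exceeded\n")
        return None
    return proc.returncode


if __name__ == "__main__":
    log = os.path.join(tempfile.mkdtemp(), "browser_output.txt")
    # A child that sleeps far longer than the timeout stands in for a frozen browser.
    frozen = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(frozen, log, 1))
```

Note that `proc.kill()` only terminates the direct child. If that child has spawned further processes that inherited the log handle, the file may stay locked on Windows until they exit as well, so this is a starting point rather than a complete fix.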
Updated•13 years ago (Reporter)
Status: REOPENED → NEW
Comment 11•12 years ago
So this looks to me like a dupe of bug 572127 rather than a dependency of it. Closing as such. Please reopen if I have made a mistake.
Status: NEW → RESOLVED
Closed: 13 years ago → 12 years ago
Resolution: --- → DUPLICATE