Closed
Bug 712102
Opened 13 years ago
Closed 13 years ago
Mozrunner hangs when registering addons for mozilla 1.9.2
Categories
(Testing :: Mozbase, defect)
Testing
Mozbase
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: mozilla, Unassigned)
References
Details
Attachments
(3 files)
1.81 KB, patch, review+ (k0scist)
1.79 KB, patch
5.09 KB, patch, review+ (ahal)
From bug 700415 comment 51, on w7:

14:09:25 INFO - File "C:\mozilla-build\python25\lib\tempfile.py", line 33, in <module>
14:09:25 INFO -   from random import Random as _Random
14:09:25 INFO - File "C:\mozilla-build\python25\lib\random.py", line 838, in <module>
14:09:25 INFO -   _inst = Random()
14:09:25 INFO - File "C:\mozilla-build\python25\lib\random.py", line 94, in __init__
14:09:25 INFO -   self.seed(x)
14:09:25 INFO - File "C:\mozilla-build\python25\lib\random.py", line 108, in seed
14:09:25 INFO -   a = long(_hexlify(_urandom(16)), 16)
14:09:25 INFO - WindowsError: [Error 22] Invalid Signature

(Random error when creating the mozprofile.) This doesn't always happen.

From bug 700415 comment 53:

w7 and xp: I've stopped hitting the WindowsError and just end up hanging and timing out after 1200 seconds with no output during the peptest run. The only way to rescue the slave after this hang is to reboot.
Reporter
Comment 1•13 years ago
I loaned ahal a Windows 7 box that I've been using for testing (talos-r3-w7-003).
Comment 2•13 years ago
So I managed to reproduce this in my Windows environment at home. Oddly, mcote was also able to reproduce it in his Linux environment (which used to work for him). After doing a pdb trace, this is the line that hangs in mozrunner: https://github.com/mozilla/mozbase/blob/master/mozrunner/mozrunner/runner.py#L164

That code is only there to maintain compatibility with Firefox 3.6. Given that you used to be hitting the python random lib and then magically stopped seeing it (and that mcote and I are both now able to reproduce without having been able to in the past), my guess is that a change in Firefox is responsible.
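The pdb trace mentioned above is one way to locate a hang. A related generic technique (not from this bug, and not mozrunner-specific) is to dump the stack of every live thread in the stuck process, which shows exactly which call each thread is blocked in:

```python
import sys
import threading
import time
import traceback

def dump_all_stacks():
    """Return a formatted stack trace for every live thread.

    Handy for finding where a hung process is actually blocked,
    without attaching a debugger.
    """
    out = []
    for tid, frame in sys._current_frames().items():
        out.append("Thread %s:\n%s" % (tid, "".join(traceback.format_stack(frame))))
    return "\n".join(out)

# Example: a thread deliberately blocked on an Event shows up in the dump.
evt = threading.Event()
t = threading.Thread(target=evt.wait, daemon=True)
t.start()
time.sleep(0.2)  # give the thread time to reach the blocking call
report = dump_all_stacks()
evt.set()
```

The dump includes the main thread as well, so in a real hang you can tell whether the process is stuck in your code or in a library call.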
Comment 3•13 years ago
So with some help from mcote we've determined:
a) It still works in Aurora (i.e. the regression happened in Firefox).
b) The problem goes away if we comment out the above line.

That code is only there to maintain compatibility with Firefox 3.6, so one possible solution is to drop support for 3.6.
There are only three consumers of mozrunner that run against Gecko 1.9 (Firefox 3.6). One is the Thunderbird tree (which I list because I'm not certain about it); the others, which I am more certain about, are the Jetpack cfx tool and the Mozmill testing system. All three of these tools run against the 1.5.x branch of the mozrunner API and will need to upgrade to modern versions of mozrunner at some point. So, I'm OK with breaking 3.6 compatibility. Rev the version number when you land, and make it very clear in your commit message what the breakage will be on 1.9 and that this is a deliberate move away from it.
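The decision above amounts to gating (or deleting) the legacy addon-registration step on the Gecko version. As a rough illustration only, with a hypothetical helper name and version check, not mozrunner's actual code:

```python
def needs_legacy_addon_registration(gecko_version):
    """Return True only for Gecko 1.9.x (Firefox 3.6 and earlier),
    where addons had to be pre-registered before startup.

    Hypothetical helper for illustration; the real mozrunner code
    simply removed the registration step outright.
    """
    major, minor = (int(p) for p in gecko_version.split(".")[:2])
    return (major, minor) <= (1, 9)
```

Dropping the step entirely (as the patch in this bug does) is simpler than gating, at the cost of breaking 1.5.x-era consumers until they upgrade.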
Comment 6•13 years ago
> w7 and xp: I've stopped hitting the WindowsError and just end up hanging and timing out after
> 1200 seconds with no output during the peptest run.
That's a separate issue that I haven't been able to reproduce.
Reporter
Comment 7•13 years ago
(In reply to Ted Mielczarek [:ted, :luser] from comment #5)
> Where is the stack trace in comment 0 coming from?

Not entirely sure. Catlee was worried about the new signing-on-demand, but I think I was able to reproduce with an Aurora build before those were signed on demand. Then it went away :\ I guess phantom errors that are gone are better than phantom errors that stick around.

(In reply to Andrew Halberstadt [:ahal] from comment #6)
> > w7 and xp: I've stopped hitting the WindowsError and just end up hanging and timing out after
> > 1200 seconds with no output during the peptest run.
>
> That's a separate issue that I haven't been able to reproduce.

The timeout is buildbot-specific; if there's no output for 20min (configurable, but this is probably symptomatic of some larger issue), it will kill the job.
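The buildbot no-output timeout described here can be sketched as a watchdog that kills a child process once it goes silent for too long. This is an illustration of the mechanism only, not buildbot's actual implementation:

```python
import subprocess
import sys
import threading
import time

def run_with_output_timeout(cmd, timeout):
    """Run cmd, killing it if it produces no output for `timeout` seconds.

    Returns True if the watchdog fired, False if the process ran to
    completion on its own. A simplified sketch of buildbot's
    no-output watchdog behavior.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    last_output = [time.monotonic()]
    timed_out = [False]

    def watchdog():
        while proc.poll() is None:
            if time.monotonic() - last_output[0] > timeout:
                timed_out[0] = True
                proc.kill()  # silent too long: kill the job
                return
            time.sleep(0.1)

    threading.Thread(target=watchdog, daemon=True).start()
    for _line in proc.stdout:          # every line of output resets the clock
        last_output[0] = time.monotonic()
    proc.wait()
    return timed_out[0]
```

In buildbot the equivalent timeout defaults are per-step configuration; here the key point is simply that the clock resets on output, not on process liveness.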
Updated•13 years ago
Component: Peptest → Mozbase
OS: Windows 7 → All
QA Contact: peptest → mozbase
Hardware: x86 → All
Summary: Peptest doesn't install or hangs on Windows → Mozrunner hangs when registering addons for mozilla 1.9.2
Comment 8•13 years ago
I version bumped this to 5.1 and can upload to pypi when it lands.
Attachment #583262 -
Flags: review?(jhammel)
Comment 9•13 years ago
Comment on attachment 583262 [details] [diff] [review]
Patch 1.0 - Remove un-needed addon registration

This is kinda blind, but remember to bump the mozmill master version requirement to reflect this.
Attachment #583262 -
Flags: review?(jhammel) → review+
Reporter
Comment 10•13 years ago
Not sure if this is fixed for testing, but I'm still hanging on Windows using http://people.mozilla.com/~ahalberstadt/firefox-11.0a1.en-US.mac.tests.zip .
Comment 11•13 years ago
So the above patch landed and should be in the tests.zip: https://github.com/mozilla/mozbase/commit/d09cb9d00db58f1f39d63185b278e99bf247ccdc. This patch fixes the hang I saw on Monday.

When mcote and I reproduced this we saw the line "DEBUG Starting Peptest" before the hang actually occurred. I'm not seeing this in the test slave logs, so maybe this is a separate hang? Ay ca-rumba.
Comment 12•13 years ago
(In reply to Andrew Halberstadt [:ahal] from comment #11)
> So the above patch landed and should be in the tests.zip
> https://github.com/mozilla/mozbase/commit/d09cb9d00db58f1f39d63185b278e99bf247ccdc.
> This patch fixes the hang I saw on Monday.
>
> When mcote and I reproduced this we saw the line "DEBUG Starting Peptest"
> before the hang actually occurred. I'm not seeing this in the test slave
> logs. So maybe this is a separate hang? Ay ca-rumba.

It's probably worth noting this change in the mozrunner README.md.
Reporter
Comment 13•13 years ago
C:\Documents and Settings\cltbld>c:\\talos-slave\\test\\build\\venv\\Scripts\\python c:\\talos-slave\\test\\build\\tests\\peptest\\peptest\\runpeptests.py --binary c:\\talos-slave\\test\\build\\application\\firefox\\firefox.exe --test-path c:\\talos-slave\\test\\build\\tests\\peptest\\tests/firefox/firefox_all.ini

=ENTERING MAIN=
=ENTERING PEPTEST INIT=
creating runner
=EXITING PEPTEST INIT=
=STARTING PEPTEST=
=ENTERING MOZRUNNER START=
=ENTERING MOZRUNNER KILL=
running process
ProcessManager UNABLE to use job objects to manage child processes
ProcessManager NOT managing child processes
PEP ERROR | AttributeError: 'Process' object has no attribute '_procmgrthread'
Exception exceptions.AttributeError: "'Process' object has no attribute '_internal_poll'" in <bound method Process.__del__ of <mozprocess.processhandler.Process object at 0x011A4B70>> ignored
=ENTERING MOZRUNNER KILL=
Exception exceptions.AttributeError: "'PepProcess' object has no attribute 'proc'" in <bound method FirefoxRunner.cleanup of <mozrunner.runner.FirefoxRunner object at 0x011A4A90>> ignored
Reporter
Comment 14•13 years ago
So I changed the mozharness_python to use -u, and that didn't help the buffering issue. When I tried running the above command (comment 13) on the commandline, that's the output I got. I'm not sure why I wasn't able to get that from mozharness; it's entirely possible that's a bug, though I wasn't able to figure it out. Hoping that output is useful? Ahal has the login info for talos-r3-w7-003 via email, which should allow him to do some testing via vnc or ssh. If it's still not solved by EOW, maybe we need someone else to get access to buildvpn to try.
Comment 15•13 years ago
So this patch to mozprocess fixes the hang for me. Two things to note:

1) There's still an exception (AttributeError) on shutdown when calling self._job.Close()
2) I'm not sure if this patch introduces memory leaks with the python ctypes stuff

I still don't really know what is going on (I just made this patch based on the stack traces I got). Clint did a whole bunch of debugging; I feel like he can probably shed more light on what's happening than I can.
Comment 16•13 years ago
So, there were a host of issues here.

1. We weren't really compatible with python 2.5. We were calling some stuff that didn't exist in 2.5, and that was causing us some problems.

2. There is an oddity with None != NULL in python ctypes on python 2.5 (I assume), and therefore we could not create the job object. The code is written to fall back gracefully when it cannot create job objects, but because we threw an exception early due to the 2.5 issue, we did not completely fall back as the code originally intended. That means we were in an in-between state, half expecting job objects to work and half expecting them not to, causing issues.

The changes we've made address these things.
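The half-initialized state described here is the classic hazard of setting a "feature enabled" flag before the operation that can fail has fully succeeded. A minimal sketch of the corrected shape, with hypothetical names (the real mozprocess code differs):

```python
class ProcessManager:
    """Sketch of fail-safe job-object setup.

    No job-object state is committed until creation has fully
    succeeded, so any exception (e.g. the ctypes None-vs-NULL quirk
    on Python 2.5 described above) leaves the manager entirely in
    fallback mode rather than a half-initialized one.
    Hypothetical class for illustration, not mozprocess's actual API.
    """

    def __init__(self, create_job=None):
        self._job = None
        self._use_job_objects = False  # fallback is the default
        if create_job is not None:
            try:
                job = create_job()  # may raise on platforms/runtimes that lack support
                if job:             # treat a NULL/None handle as failure too
                    self._job = job
                    self._use_job_objects = True
            except Exception:
                # Creation failed partway: stay fully in fallback mode,
                # leaving no stale job-object state behind.
                self._job = None
                self._use_job_objects = False
```

The key design choice is that the success path commits both pieces of state together at the end, so the except branch never has to undo a partial commit.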
Attachment #583949 -
Flags: review?(ahalberstadt)
Comment 17•13 years ago
And I don't get any errors running peptest on Windows with Python 2.6.5 and this patch.
Comment 18•13 years ago
Comment on attachment 583949 [details] [diff] [review]
this works on windows on the buildbot slave!

Review of attachment 583949 [details] [diff] [review]:
-----------------------------------------------------------------

Awesome! Thanks so much for helping debug this. So to recap next steps: we need to land this in master, mirror master to m-c, test and clean up the mozharness code, get everything reviewed, and deploy to try.
Attachment #583949 -
Flags: review?(ahalberstadt) → review+
Comment 19•13 years ago
(In reply to Andrew Halberstadt [:ahal] from comment #18)
> Awesome! Thanks so much for helping debug this.
> So to recap next steps we need to land this in master, mirror master to m-c,
> test and clean up the mozharness code, get everything reviewed, deploy to try.

Please go ahead and land on master, then give me a patch to mirror to m-c and I can review and land that this weekend. Thanks for all the hard work on this, and have a wonderful Christmas and an excellent new year.

-- C
Comment 20•13 years ago
master: https://github.com/mozilla/mozbase/commit/df008c27b6a83c8ccd04302f3fa0471b12254c89

I'll file a new bug for getting this into m-c since there are a whole bunch of patches that need to get in.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED