Closed Bug 994658 Opened 11 years ago Closed 11 years ago

Several JSbridge disconnect failures

Categories

(Testing Graveyard :: Mozmill, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: andrei, Unassigned)

References

(Depends on 1 open bug)

Details

Attachments

(12 files, 1 obsolete file)

Attached file jsbridge.txt
The category will need to be updated. We've seen several such disconnects today across Windows and OSX nodes. All failed during restart tests. Attached is a sample log.
All this most likely happens because of bug 974971. It looks like Firefox doesn't restart after test1 but just closes down. With my fix on bug 975068 we would know more. Can you try to reproduce it?
Depends on: 974971, 975068
I tried to reproduce this on mm-win-81-32-4 with the fix from bug 975068 applied and with "Gecko profiler 1.12.23:" from bug 974971. I ran about 30 testruns and it didn't fail once.
Let's try reproducing these disconnects. They are affecting our ability to properly run tests in CI in a big way.
Assignee: nobody → andrei.eftimie
Status: NEW → ASSIGNED
I've run this all day with a patched mozmill (with bug 975068 attachment 8380150) on mm-win-81-32-4 and it hasn't disconnected even once. I've set up some ENV variables (notably proxy, WORKSPACE). I've run both full functional testruns and restart tests directly via mozmill. I'll keep running these testruns; hopefully it will disconnect at some point.
I would suggest you run it without my patch first. It might be that it makes a difference. Once you have a reproduction pattern it will be useful to have this additional information available.
I actually ran it on a clean 2.0.6.1 env yesterday (again without success). But I agree, having it fail _either way_ is better than not having it fail at all. I'll do more runs on a clean env. I'll take another machine and run them in parallel.
Attached file jsbridge_07-15.txt
This is raw data compiled from failure mails. I went through all failure emails from the 7th until the 15th of April (local time). There are also reported failures from the 5th and 6th (weekend days), which were handled on the 7th. For each failure there's the date and time (local, based on the received email timestamp), the machine, the last test (where available) and additional info on some. Total: 117 jsbridge disconnects. I didn't include multiple ondemand update failures which didn't run and failed with 'updateStagingPath'. There is a possibility that they are related.
No hard conclusions. I just gathered the data. Some notes:
- vista has almost all failures on mm-win-vista-32-3, and most of them didn't run any tests at all (it looks a bit different from the other ones)
- osx failed mostly in testAddons_installTheme/test1.js. Also some in testPasswordSavedAndDeleted.js and testAddons_uninstallTheme/test1.js
- for win8/8.1: it seems once a machine is affected, multiple failures occur on that machine. This isn't a hard rule, but it looks to be statistically significant. Not sure what it means...
If anyone has the time to look over the data, please do. Maybe you notice something.
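To check the "once a machine is affected, it fails repeatedly" impression, the failures can be tallied per machine. The following is only a minimal sketch: the record layout (machine, last test) and the sample rows are made up for illustration and are not taken from jsbridge_07-15.txt, whose actual format may differ.

```python
from collections import Counter

# Hypothetical record layout: (machine, last_test). The real attachment
# (jsbridge_07-15.txt) may be organized differently.
def failures_per_machine(records):
    """Return (machine, count) pairs, most affected machine first."""
    return Counter(machine for machine, _ in records).most_common()

# Illustrative rows only; these are NOT the real failure data.
sample = [
    ("mm-win-vista-32-3", None),
    ("mm-win-vista-32-3", None),
    ("mm-win-81-32-4", "restartTests/testAddons_installTheme/test1.js"),
]

for machine, count in failures_per_machine(sample):
    print(machine, count)
```

A strongly skewed tally would support the clustering observation; an even spread would argue against it.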
One testrun failed with a JSBridge disconnect on Windows 8.1 64-bit with the latest Aurora version installed, in restartTests\testAddons_installFromFTP\test1.js. In the console, beside the traceback and jsbridge error, there were also 5 occurrences of this error:
> 05:00:43 *************************
> 05:00:43 A coding exception was thrown and uncaught in a Task.
> 05:00:43
> 05:00:43 Full message: TypeError: worker is null
> 05:00:43 Full stack: post/</<@resource://gre/modules/osfile/osfile_async_front.jsm:355:9
> 05:00:43 TaskImpl_run@resource://gre/modules/Task.jsm:282:1
> 05:00:43 TaskImpl@resource://gre/modules/Task.jsm:247:3
> 05:00:43 createAsyncFunction/asyncFunction@resource://gre/modules/Task.jsm:224:7
> 05:00:43 Task_spawn@resource://gre/modules/Task.jsm:139:5
> 05:00:43 post/<@resource://gre/modules/osfile/osfile_async_front.jsm:339:28
> 05:00:43 Handler.prototype.process@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:707:11
> 05:00:43 this.PromiseWalker.walkerLoop@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:586:7
> 05:00:43 Spinner.prototype.observe@resource://gre/modules/AsyncShutdown.jsm:446:7
> 05:00:43
> 05:00:43 *************************
Can we check other testruns for this particular promise exception? Does it always occur in those disconnects? Or even without it?
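One way to answer this question across many console logs is to count how often the two markers appear together. This is a rough sketch, not part of Mozmill: it assumes each log is available as plain text and that disconnect runs contain the "Disconnect Error: Application unexpectedly closed" line seen in other logs on this bug, which may not hold for every run.

```python
# Marker strings as they appear in logs quoted on this bug.
WORKER_ERROR = "TypeError: worker is null"
DISCONNECT = "Disconnect Error: Application unexpectedly closed"

def classify_log(text):
    """Report which of the two markers a console log contains."""
    return {
        "worker_error": WORKER_ERROR in text,
        "disconnect": DISCONNECT in text,
    }

def correlation(logs):
    """Count logs per (worker_error, disconnect) combination."""
    counts = {}
    for text in logs:
        flags = classify_log(text)
        key = (flags["worker_error"], flags["disconnect"])
        counts[key] = counts.get(key, 0) + 1
    return counts
```

If nearly all disconnects fall under `(False, True)`, the promise exception is unlikely to be the cause.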
We established last week that Andrei will no longer work on this at the moment. Daniel, please check the comment above, thanks!
Assignee: andrei.eftimie → nobody
I looked into the latest jsbridge disconnects and the promise exception was present only in this case, so I think it might not be related. From 19-22 April we've seen about 30 jsbridge disconnects. Most of them (over 20) were on Windows.
I was running some testruns and reproduced this JSBridge disconnect locally once (4th testrun, OSX, in restartTests/testAddons_installTheme/test1.js). I ran this testmodule in a loop 100 times with the WIP patch from bug 975068 applied to get more info. I did not have another jsbridge disconnect yet. But this might be interesting: https://pastebin.mozilla.org/4944191 Not sure if I read this correctly, but from a failed addon install (promise related?) the stack leads us to AsyncShutdown. This _might_ be related, as in our JSBridge disconnects Firefox should restart, yet it only shuts down without being reopened. Could a faulty shutdown mess with our restart flags?
With no disconnect in 200 runs, I've hacked a functional testrun to run the first few tests (until installTheme) and ran this in a loop. In the first 10 runs I got the following failure:
http://mozmill-crowd.blargon7.com/#/functional/report/db3ef7ec039e4c6b255196b62805f7d0
It was followed by a disconnect (though it is reported a bit differently):
> ERROR | Test Failure | {
>   "exception": {
>     "message": "Notification popup state has been open",
>     "lineNumber": 27,
>     "name": "TimeoutError",
>     "fileName": "resource://mozmill/modules/errors.js"
>   }
> }
> TEST-UNEXPECTED-FAIL | restartTests/testAddons_installMultipleExtensions/test1.js | testInstallMultipleExtensions
> TEST-START | restartTests/testAddons_installMultipleExtensions/test1.js | teardownModule
> TEST-END | restartTests/testAddons_installMultipleExtensions/test1.js | finished in 7046ms
> RESULTS | Passed: 7
> RESULTS | Failed: 1
> RESULTS | Skipped: 0
> Report document created at 'http://mozmill-crowd.blargon7.com/db/db3ef7ec039e4c6b255196b62805f7d0'
> *** Removing profile: /var/folders/9l/sn_p3bw914s360j602z20jsc0000gq/T/tmpXpMZA7.workspace/profile
> *** Removing test repository '/var/folders/9l/sn_p3bw914s360j602z20jsc0000gq/T/tmpXpMZA7.workspace/mozmill-tests'
> Traceback (most recent call last):
>   File "/Users/andrei.eftimie/work/mozilla/mozmill/lib/python2.7/site-packages/mozmill_automation-2.1_dev-py2.7.egg/mozmill_automation/testrun.py", line 351, in run
>     self.run_tests()
>   File "/Users/andrei.eftimie/work/mozilla/mozmill/lib/python2.7/site-packages/mozmill_automation-2.1_dev-py2.7.egg/mozmill_automation/testrun.py", line 575, in run_tests
>     TestRun.run_tests(self)
>   File "/Users/andrei.eftimie/work/mozilla/mozmill/lib/python2.7/site-packages/mozmill_automation-2.1_dev-py2.7.egg/mozmill_automation/testrun.py", line 302, in run_tests
>     self._mozmill.run(tests, self.options.restart)
>   File "/Users/andrei.eftimie/work/mozilla/mozmill/src/mozmill/mozmill/__init__.py", line 444, in run
>     self.stop_runner()
>   File "/Users/andrei.eftimie/work/mozilla/mozmill/src/mozmill/mozmill/__init__.py", line 576, in stop_runner
>     raise Exception('client process shutdown unsuccessful')
> Exception: client process shutdown unsuccessful
Attached file crash_2014-04-25.txt
I've run a Debug build and lo and behold I got a crash (in the same place as the usual JSbridge dc). I'm missing the dmp file, and it's not mentioned in about:crashes. I did have the OSX "This application crashed" window. I'll try to see if that saved a dump somewhere.
Well here's the crash as logged by OSX.
> Crashed Thread: 31 DOM Worker
>
> Exception Type: EXC_BAD_ACCESS (SIGSEGV)
> Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000000
Another crash. Same location. Still no dump, only the OSX crash log. But this one reports problems in malloc:
> Thread 0:: Dispatch queue: com.apple.main-thread
> 0 libsystem_malloc.dylib 0x00007fff8f3c5287 szone_malloc_should_clear + 4
> 1 libsystem_malloc.dylib 0x00007fff8f3c7868 malloc_zone_malloc + 71
> 2 libsystem_malloc.dylib 0x00007fff8f3c827c malloc + 42
> 3 libmozalloc.dylib 0x00000001000823ae moz_xmalloc + 14
(In reply to Andrei Eftimie from comment #13)
> Not sure if I read this correctly, but from a failed addon install (promise
> related?) the stack leads us to AsyncShutdown. This _might_ be related as in
> our JSBridge disconnects, Firefox should restart, yet it only shuts down
> without being reopened.
>
> Could a faulty shutdown mess with our restart flags?

Absolutely. An exception during shutdown can certainly cause some code not to be executed. So a requested restart may fail and Firefox quits. Interesting here would be the exit code of the Firefox process.
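As a rough illustration of what the exit code can tell us: with Python's `subprocess` module, a normal exit gives a non-negative return code, while on POSIX a negative value `-N` means the child was terminated by signal `N` (for example SIGSEGV on a crash). The helper name `describe_exit` and the stand-in child process below are hypothetical; this is not how Mozmill currently reports shutdowns.

```python
import signal
import subprocess
import sys

def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode is None:
        return "still running"
    if returncode < 0:
        # POSIX only: -N means the process was killed by signal N.
        return "killed by signal " + signal.Signals(-returncode).name
    return "exited with code %d" % returncode

# Stand-in for the Firefox binary: a child Python process that exits with 3.
proc = subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"])
print(describe_exit(proc.returncode))
```

Logging such a description next to the disconnect message would distinguish a clean quit from a crash.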
(In reply to Andrei Eftimie from comment #15)
> Created attachment 8412452
> crash_2014-04-25.txt
>
> I've run a Debug build and lo and behold I got a crash (in the same place as
> the usual JSbridge dc).
>
> I'm missing the dmp file, and its not mentioned in about:crashes
> I did have the OSX "This application crashed" window. I'll try to see if
> that saved a dump somewhere.

Yeah, crashes on OS X and the Apple reporter. I didn't work on OS X for a long time, so I forgot how to get proper crash reports out of a debug build. Steven, can you please help us here?
Flags: needinfo?(smichaud)
Attached file log_disconnect.txt
Reproduced directly with mozmill. Debug build and --debug option in mozmill.
I can reproduce it directly via mozmill by running:
> mozmill -m firefox/tests/functional/restartTests/testAddons_installTheme/manifest.ini -b /Applications/FirefoxNightlyDebug.app/ --debug --profile=../profile/p1
I've also run only the first test:
> mozmill -t firefox/tests/functional/restartTests/testAddons_installTheme/test1.js -b /Applications/FirefoxNightlyDebug.app/ --debug --profile=../profile/p1
and I did get a Disconnect Error, but different from the rest. Attached is the log for this. In this case I still had the Firefox window open; before it disconnected I noticed the messages logged in the console slowed down considerably. The last messages:
> --DOMWINDOW == 18 (0x11c1d5e70) [pid = 8249] [serial = 18] [outer = 0x11e7bee30] [url = about:blank]
> --DOMWINDOW == 17 (0x1142261b0) [pid = 8249] [serial = 8] [outer = 0x11418ed30] [url = about:blank]
> --DOMWINDOW == 16 (0x11c1d3570) [pid = 8249] [serial = 15] [outer = 0x0] [url = about:blank]
> --DOMWINDOW == 15 (0x1137cb5f0) [pid = 8249] [serial = 11] [outer = 0x0] [url = about:blank]
> --DOMWINDOW == 14 (0x1140d8ea0) [pid = 8249] [serial = 12] [outer = 0x0] [url = about:home]
> TEST-UNEXPECTED-FAIL | Disconnect Error: Application unexpectedly closed
were spread apart by 5-20 sec each. It looked like it slowed down until it stopped...
Regarding the crash with the debug build, let's do the following:
1. Try to repro on 10.8, which still supports gdb
2. Let's only run this single test with --debugger=gdb specified for mozmill
3. When it crashes do a 'bt' in gdb to get the stack
4. Fix the stack by running it through http://mxr.mozilla.org/mozilla-central/source/tools/rb/fix_macosx_stack.py
(In reply to comment #19) Apple crash reports can most likely be found in ~/Library/DiagnosticReports/. But you might also want to look in /Library/DiagnosticReports/.
Flags: needinfo?(smichaud)
(In reply to Henrik Skupin (:whimboo) from comment #22)
> 1. Try to repro on 10.8 which still supports gdb

It seems GDB is not tied to OS releases, but to Xcode (Command Line Tools). I've tried this on a relatively freshly installed 10.8 machine and GDB was not readily available. I've found some avenues for installing GDB; I'll give them a go. (I will try this on my main 10.9 machine.)
Attached file gdb_stack.txt
Well, this is underwhelming. Finally managed to get a crash under gdb. The stack looks useless:
> (gdb) bt
> #0 0x00000001024ef38f in ?? ()
> #1 0x0000000000000000 in ?? ()
Attached is the whole log. I'll try again to see if I get the same result...
Same "trace" on a second crash:
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x170b of process 1987]
> 0x00000001024ef38f in ?? ()
> (gdb) bt
> #0 0x00000001024ef38f in ?? ()
> #1 0x0000000000000000 in ?? ()
Not sure if I'm doing something wrong.
Check which thread has crashed. Can you retry with a self-made build? There is still no information for the above trace. Also I wonder what this thread 0x170b actually is, which is created right after an assertion:
Assertion failure: IsCanceled() (Subclass Cancel() didn't set IsCanceled()!), at /builds/slave/m-cen-osx64-d-0000000000000000/build/dom/workers/WorkerRunnable.cpp:278
[New Thread 0x170b of process 1723]
[..]
Source: http://mxr.mozilla.org/mozilla-central/source/dom/workers/WorkerRunnable.cpp#278
It looks like a worker issue. Olli, do you have an idea here?
Flags: needinfo?(bugs)
Ben knows about workers.
Flags: needinfo?(bent.mozilla)
Flags: needinfo?(bugs)
Sorry for spamming with logs, but hopefully we'll get to the bottom of this issue. I haven't had a crash with a locally built (debug, with symbols) FF. But I did have 2 different outcomes. GDB will not run a second test, but it will honor the restart. Attached is the log where we RECONNECT to Firefox.
And here is the log where we DON'T RECONNECT. I'm diffing them ATM to see if anything stands out. It could very well be that this case is the problematic behaviour. With all debug options activated maybe we'll find something.
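When diffing two debug logs like these, process IDs, pointer values and UUIDs differ on every run and drown out the real differences. The sketch below masks them before comparing; the regular expressions are guesses based on the log lines quoted on this bug, and `diff_logs` is a hypothetical helper name.

```python
import difflib
import re

UUID_RE = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"

def normalize(line):
    """Mask values that change between runs (addresses, pids, UUIDs)."""
    line = re.sub(r"0x[0-9a-fA-F]+", "0xADDR", line)
    line = re.sub(r"pid = \d+", "pid = PID", line)
    line = re.sub(r"^\[\d+\]", "[PID]", line)
    return re.sub(UUID_RE, "UUID", line)

def diff_logs(passing, failing):
    """Return only the lines that differ after normalization."""
    a = [normalize(l) for l in passing.splitlines()]
    b = [normalize(l) for l in failing.splitlines()]
    # ndiff marks lines with "- " / "+ "; "? " hint lines are dropped.
    return [l for l in difflib.ndiff(a, b) if l.startswith(("+ ", "- "))]
```

`difflib.ndiff` is used rather than `unified_diff` because log lines that themselves begin with `--` (such as `--DOMWINDOW`) would otherwise be easy to confuse with diff headers.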
That looks suspicious:
[59393] WARNING: '!fd.IsInitialized()', file /Users/andrei.eftimie/work/mozilla/gecko-dev/netwerk/base/src/nsSocketTransport2.cpp, line 2597
[59393] WARNING: 'NS_FAILED(rv)', file /Users/andrei.eftimie/work/mozilla/gecko-dev/netwerk/protocol/http/nsHttpConnection.cpp, line 1638
[59393] WARNING: 'NS_FAILED(rv)', file /Users/andrei.eftimie/work/mozilla/gecko-dev/netwerk/protocol/http/nsHttpConnection.cpp, line 370
Can you check what's in those lines? We may have to set a break point there and do a detailed check.
This is the line: http://dxr.mozilla.org/mozilla-central/source/netwerk/base/src/nsSocketTransport2.cpp#2597 But this is for the passing log. Since that was run under GDB, I am expecting some problems after the first test since gdb doesn't run subsequent tests after a restart.
When this fails we have the following items in the log right before the quit-application observer, which we do not have when this passes:
> [59785] WARNING: NS_ENSURE_TRUE(mTextInputHandler) failed: file /Users/andrei.eftimie/work/mozilla/gecko-dev/widget/cocoa/nsChildView.mm, line 5305
> --DOMWINDOW == 21 (0x1196d9c00) [pid = 59785] [serial = 13] [outer = 0x0] [url = about:blank]
> --DOMWINDOW == 20 (0x12698f800) [pid = 59785] [serial = 14] [outer = 0x0] [url = about:home]
> --DOMWINDOW == 19 (0x1239a6400) [pid = 59785] [serial = 18] [outer = 0x0] [url = about:blank]
> --DOMWINDOW == 18 (0x1239a6800) [pid = 59785] [serial = 15] [outer = 0x0] [url = about:newtab]
> * Sending message: '{"eventType":"mozmill.pass","result":{"function":"assert.waitFor()"}}'
> * Set: '9ff4fac5-cee6-11e3-bc67-c42c03164de7'
> * Sending message: '{"result":true,"data":"bridge.registry[\"{3e299c49-bc89-214f-95ac-ec86935492dc}\"]","uuid":"9ff4fac5-cee6-11e3-bc67-c42c03164de7"}'
> --DOCSHELL 0x1235cf000 == 7 [pid = 59785] [id = 7]
> * Observer topic: 'quit-application'
Passes:
> [59497] WARNING: NS_ENSURE_TRUE(mTextInputHandler) failed: file /Users/andrei.eftimie/work/mozilla/gecko-dev/widget/cocoa/nsChildView.mm, line 5305
> * Sending message: '{"eventType":"mozmill.pass","result":{"function":"assert.waitFor()"}}'
> * Set: '941c4d7a-cee5-11e3-8769-c42c03164de7'
> * Sending message: '{"result":true,"data":"bridge.registry[\"{f7ba1372-6019-404c-a7f4-c5cb6f473464}\"]","uuid":"941c4d7a-cee5-11e3-8769-c42c03164de7"}'
> --DOCSHELL 0x118341000 == 7 [pid = 59497] [id = 7]
> * Observer topic: 'quit-application'
Not sure how to debug this further. I'll make a minimized testcase to simplify the logs.
That may be better, yes. I could have a look into this then. Thanks!
Attached file testcase WIP (obsolete) —
This can still be slimmed down a bit. I'll have a
Make sure to have proper paths to its dependencies. Original location:
> firefox/tests/functional/restartTests/testAddons_installTheme/test1.js
On OSX 10.9 with a Nightly Debug build, running:
> mozmill -t firefox/tests/functional/restartTests/testAddons_installTheme/test1.js -b /Applications/FirefoxNightlyDebug.app/ --profile=../profile/p1/
yields a crash (crash only with a debug build, failure with an opt build) roughly 30-50% of the time.
I tried to reproduce this on Windows and couldn't in about 30 runs. If I run with a profile that has the geckoprofiler addon installed, it fails 2-3 times in 25 runs with:
> console.error: geckoprofiler:
>   Profiler module not found: Component returned failure code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIXPCComponents_Utils.import], undefined
> TEST-START | test1.js | setupModule
> TEST-START | test1.js | testInstallTheme
> TEST-PASS | test1.js | testInstallTheme
> TEST-START | test1.js | teardownModule
> TEST-END | test1.js | finished in 1451ms
> TEST-START | test2.js | setupModule
> TEST-START | test2.js | testThemeIsInstalled
> TEST-PASS | test2.js | testThemeIsInstalled
> TEST-START | test2.js | teardownModule
> TEST-END | test2.js | finished in 1375ms
> Parent process 2832 exited with children alive:
> PIDS: 1964
> Attempting to kill them...
> Parent process 2832 exited with children alive:
> PIDS: 1964
> Attempting to kill them...
> Error Code 6 trying to query IO Completion Port, exiting
> Exception in thread Thread-1:
> Traceback (most recent call last):
>   File "C:\Users\svuser\Desktop\2.0.6-windows\mozmill-env\python\Lib\threading.py", line 551, in __bootstrap_inner
>     self.run()
>   File "C:\Users\svuser\Desktop\2.0.6-windows\mozmill-env\python\Lib\threading.py", line 504, in run
>     self.__target(*self.__args, **self.__kwargs)
>   File "c:\Users\svuser\Desktop\2.0.6-windows\mozmill-env\python\lib\site-packages\mozprocess\processhandler.py", line 321, in _procmgr
>     self._poll_iocompletion_port()
>   File "c:\Users\svuser\Desktop\2.0.6-windows\mozmill-env\python\lib\site-packages\mozprocess\processhandler.py", line 371, in _poll_iocompletion_port
>     raise WinError(errcode)
> WindowsError: [Error 6] The handle is invalid.
>
> RESULTS | Passed: 2
> RESULTS | Failed: 0
> RESULTS | Skipped: 0
or
> console.error: geckoprofiler:
>   Profiler module not found: Component returned failure code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIXPCComponents_Utils.import], undefined
> TEST-START | test1.js | setupModule
> TEST-START | test1.js | testInstallTheme
> TEST-PASS | test1.js | testInstallTheme
> TEST-START | test1.js | teardownModule
> TEST-END | test1.js | finished in 1474ms
> TEST-START | test2.js | setupModule
> TEST-START | test2.js | testThemeIsInstalled
> TEST-PASS | test2.js | testThemeIsInstalled
> TEST-START | test2.js | teardownModule
> TEST-END | test2.js | finished in 928ms
> RESULTS | Passed: 2
> RESULTS | Failed: 0
> RESULTS | Skipped: 0
> 1398774567923 addons.xpi ERROR Failed to remove directory c:\Users\svuser\Desktop\newProfile\extensions\staged\mozmill@mozilla.com
> 1398774567924 addons.xpi ERROR Failure moving c:\Users\svuser\Desktop\newProfile\extensions\staged\mozmill@mozilla.com to c:\Users\svuser\Desktop\newProfile\extensions
> 1398774567927 addons.xpi ERROR Failed to install staged add-on mozmill@mozilla.com in app-profile
> console.error: geckoprofiler:
>   Profiler module not found: Component returned failure code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIXPCComponents_Utils.import], undefined
> Timeout: bridge.set("e8d584a1-cf99-11e3-8cd7-08002796b112", Components.utils.import("resource://mozmill/driver/mozmill.js"))
>
> TEST-UNEXPECTED-FAIL | Disconnect Error: Application unexpectedly closed
> RESULTS | Passed: 0
> RESULTS | Failed: 0
> RESULTS | Skipped: 0
> Traceback (most recent call last):
>   File "c:\Users\svuser\Desktop\2.0.6-windows\mozmill-env\python\lib\site-packages\mozmill\__init__.py", line 831, in run
>     mozmill.run(tests, self.options.restart)
>   File "c:\Users\svuser\Desktop\2.0.6-windows\mozmill-env\python\lib\site-packages\mozmill\__init__.py", line 429, in run
>     frame = self.run_test_file(frame or self.start_runner(),
>   File "c:\Users\svuser\Desktop\2.0.6-windows\mozmill-env\python\lib\site-packages\mozmill\__init__.py", line 347, in start_runner
>     self.minidump_save_path = os.path.join(appinfo['paths']['appdata'],
> KeyError: 'paths'
Running a new profile with the testcase showed no more failures. Comparing it to the affected profile, I found the following pref was still active there:
> user_pref("security.dialog_enable_delay", 250);
We set this pref in almost every restart + addon test. The default value is 1000. And I get no more jsbridge dc with the default value! This was introduced in bug 923723. I also tested a 0 delay and got good results with that as well.
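The profile comparison that surfaced this pref can be automated. The sketch below parses the `user_pref(...)` lines of two prefs.js contents and lists the keys whose values differ; the helper names are hypothetical and the parser only handles the simple one-line form shown above.

```python
import re

# Matches simple one-line entries such as:
#   user_pref("security.dialog_enable_delay", 250);
PREF_RE = re.compile(r'^user_pref\("([^"]+)",\s*(.+)\);\s*$')

def parse_prefs(text):
    """Map pref name to its raw value string for each user_pref line."""
    prefs = {}
    for line in text.splitlines():
        match = PREF_RE.match(line.strip())
        if match:
            prefs[match.group(1)] = match.group(2)
    return prefs

def pref_differences(good_prefs_js, bad_prefs_js):
    """Return {name: (good_value, bad_value)} for prefs that differ."""
    good, bad = parse_prefs(good_prefs_js), parse_prefs(bad_prefs_js)
    return {
        name: (good.get(name), bad.get(name))
        for name in sorted(set(good) | set(bad))
        if good.get(name) != bad.get(name)
    }
```

A value of `None` on the "good" side means the pref is simply not set there, so the application default applies.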
Attached file testcase WIP 2
Just a small update. This has the pref again (I had the pref saved in the profile before). Basically comment 36. Using a Nightly Debug version it should crash:
> for i in {1..10}; do mozmill -m firefox/tests/functional/restartTests/testAddons_installTheme/manifest.ini -b /Applications/FirefoxNightlyDebug.app/ --profile=../profile/p13; done
test2 from the same folder is empty for me.
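For intermittent failures like this one, it helps to record a failure rate rather than eyeball a shell loop. The following is a small sketch of the same loop in Python. It assumes the command exits with a non-zero status when the run fails, which has not been verified for mozmill on a disconnect; the child Python process used here is only a stand-in for the real command.

```python
import subprocess
import sys

def failure_rate(cmd, runs):
    """Run cmd `runs` times and return the fraction of non-zero exits."""
    failures = 0
    for _ in range(runs):
        # Assumption: a failed run is signalled by a non-zero exit status.
        if subprocess.run(cmd).returncode != 0:
            failures += 1
    return failures / runs

# Stand-in command that always fails; replace with the real test command.
always_fails = [sys.executable, "-c", "raise SystemExit(1)"]
print(failure_rate(always_fails, 3))
```

Comparing the rate with and without the 250 ms delay pref would quantify the effect observed above.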
Attachment #8414455 - Attachment is obsolete: true
Comment 27 looks to have been moved to bug 1003766
Flags: needinfo?(bent.mozilla)
From local testing this should alleviate the failures. I propose to land this, and monitor the results. I'm not sure if this will reduce the JSBridge disconnects completely but it may greatly help.
Attachment #8416437 - Flags: review?(andreea.matei)
Comment on attachment 8416437
fix1_increase_delay.patch

Review of attachment 8416437:
-----------------------------------------------------------------

We should really centralize those constants so we only have to define them once. Also this is an infrastructure bug. You might want to fix this in a newly filed bug, where we can continue investigating this problem. Keep in mind that update tests are also failing and these changes are not related.
Comment on attachment 8416437
fix1_increase_delay.patch

Review of attachment 8416437:
-----------------------------------------------------------------

Please file a new bug and attach there. I would like to see this landed today, so we can get results over the weekend.
Attachment #8416437 - Flags: review?(andreea.matei)
Blocks: 1005035
No longer blocks: 1005035
Depends on: 1005035
And inbound pushlog: http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=0de56f72315e&tochange=b8051da2a530 Well, there is a high chance that either this or the mc one is wrong (I've had low reproduction rates with older builds). Currently the last good inbound build has 40+ good runs; I usually see this once in a 20-run loop. I'll rerun the last good build a few more times.
100 more runs on the last good inbound build. All good.
I would still favor getting this fixed in Mozmill proper. Any chance to see it failing 100% of the time with even smaller delays than 250ms in the addon installation dialog?
I'll run more tests.
Assignee: nobody → andrei.eftimie
The "worker is null" stack trace looks like bug 995162.
Ah, good to know. Thanks David. That at least reduces possible candidates. So bug 1005487 may still be a hot candidate for us here.
Reducing priority since this is not affecting daily runs since we landed bug 1005035.
Priority: P1 → P2
Whenever we get this fixed, we should also revert the patch from bug 1005035.
Assignee: andrei.eftimie → nobody
Component: Infrastructure → Mozmill
Priority: P2 → --
Product: Mozilla QA → Testing
QA Contact: hskupin
Hardware: x86 → All
Whiteboard: [mozmill-2.1?]
Finally we have a reproducible testcase here, where it fails all the time on mm-osx-109-3: http://mm-ci-master.qa.scl3.mozilla.com:8080/job/ondemand_functional/6634/console I was watching the testrun via VNC and when we run this test, the browser seems to stall for a moment. Then the browser window closes, but the application is still visible in the dock. After the jsbridge timeout the process is killed. So this specific issue doesn't seem to be a Mozmill bug. We don't know yet why we quit. Andrei, would you be able to check that tomorrow? Maybe it is reproducible even when you run it yourself on the box? If yes, it would be good to modify mozmill to print out the exit code of Firefox. That might help us get more details about this mysterious shutdown.
Flags: needinfo?(andrei.eftimie)
Oh wait. This geolocation test actually restarts the browser. So what we might face here is a really long shutdown time of Firefox which exceeds even 60s! This should really be investigated.
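To confirm a slow shutdown, the time between requesting the quit and the process actually exiting can be measured. This is only a sketch of the idea with a hypothetical helper, `wait_for_exit`, built on the standard `Popen.wait(timeout=...)` call; the 60 second figure is the one mentioned above, and the child process is a stand-in for Firefox.

```python
import subprocess
import sys
import time

def wait_for_exit(proc, timeout):
    """Wait up to `timeout` seconds; return (exit_code or None, elapsed)."""
    start = time.monotonic()
    try:
        code = proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Still running after the timeout: report None as the exit code.
        return None, time.monotonic() - start
    return code, time.monotonic() - start

# Stand-in for the browser: a child process that exits immediately.
child = subprocess.Popen([sys.executable, "-c", "pass"])
code, elapsed = wait_for_exit(child, 60)
print("exit code:", code, "after %.1fs" % elapsed)
```

A `None` exit code here would mean the process outlived the timeout, which matches the "long shutdown" theory rather than a crash.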
Andrei is on PTO so I'll do some investigation. I can reproduce the jsbridge disconnect by running only that test:
> mozmill -t firefox/tests/functional/testGeolocation/testNotNowShareLocation.js -b /Applications/Firefox.app/
The above test was run on the mm-osx-109-3 CI machine.
OK, so let's spin this out to a new mozmill-test bug for investigation. If it's a core bug we will need to file one more.
Depends on: 1040679
Flags: needinfo?(andrei.eftimie)
Andrei, if we still see this please use the latest Mozmill code on master for investigation. Thanks.
There is nothing actionable on this bug. Lately we haven't seen any large number of disconnects anymore, so we seem to be running very stably in the meantime. If we see something specific, we should file a new bug with exact details of the problem. This bug has become more of a meta bug.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → INCOMPLETE
Whiteboard: [mozmill-2.1?]
Product: Testing → Testing Graveyard