Closed
Bug 1293969
Opened 9 years ago
Closed 8 years ago
Intermittent test_direct_update.py TestDirectUpdate.test_update, test_fallback_update.py TestFallbackUpdate.test_update | IOError: Process has been closed (Exit code: 0) (Reason: Timed out waiting for port 2828!) - no relaunch of Firefox
Categories
(Testing :: Firefox UI Tests, defect)
Tracking
(firefox-esr52 unaffected, firefox54 unaffected, firefox55 unaffected, firefox56 fixed)
RESOLVED
FIXED
mozilla56
| Tracking | Status | |
|---|---|---|
| firefox-esr52 | --- | unaffected |
| firefox54 | --- | unaffected |
| firefox55 | --- | unaffected |
| firefox56 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: robert.strong.bugs)
References
Details
(Keywords: intermittent-failure, regression, Whiteboard: [stockwell fixed:product])
| Comment hidden (obsolete) |
Comment 2•9 years ago
|
||
We had this failure again today:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-aurora&filter-searchStr=Firefox%20UI&filter-tier=1&filter-tier=2&filter-tier=3&selectedJob=3273324
Checking the log I can see it happening here:
03:18:33 INFO - File "c:\jenkins\workspace\mozilla-aurora_update\build\venv\lib\site-packages\marionette_driver\marionette.py", line 1077, in restart
03:18:33 INFO - self.quit_in_app()
03:18:33 INFO - File "c:\jenkins\workspace\mozilla-aurora_update\build\venv\lib\site-packages\marionette_driver\decorators.py", line 42, in _
03:18:33 INFO - return func(*args, **kwargs)
03:18:33 INFO - File "c:\jenkins\workspace\mozilla-aurora_update\build\venv\lib\site-packages\marionette_driver\marionette.py", line 1052, in quit_in_app
03:18:33 INFO - self.raise_for_port(self.wait_for_port())
03:18:33 INFO - File "c:\jenkins\workspace\mozilla-aurora_update\build\venv\lib\site-packages\marionette_driver\decorators.py", line 48, in _
03:18:33 INFO - m.force_shutdown()
03:18:33 INFO - File "c:\jenkins\workspace\mozilla-aurora_update\build\venv\lib\site-packages\marionette_driver\decorators.py", line 42, in _
03:18:33 INFO - return func(*args, **kwargs)
03:18:33 INFO - File "c:\jenkins\workspace\mozilla-aurora_update\build\venv\lib\site-packages\marionette_driver\marionette.py", line 652, in raise_for_port
03:18:33 INFO - raise socket.timeout("Timed out waiting for port {}!".format(self.port))
03:18:33 INFO - TEST-INFO took 186566ms
Interesting is the following line:
> IOError: Process has been closed (Exit code: 0)
It indicates that the process has been closed and did not restart as requested. Not sure why we do not see the appropriate failure message. But it reminds me all on bug 974971. We should keep watching this failure.
| Comment hidden (Intermittent Failures Robot) |
Comment 4•9 years ago
|
||
So far this seems to be a Windows x64 issue, which is spread across Windows 7 and 8.1 for us.
Here an example of a couple of test failures from Saturday last week on mozilla-aurora:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-aurora&revision=070cf059bb70aa991d50b4d266987adf26685915&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=runnable&filter-searchStr=update%20windows&filter-tier=3&filter-tier=2&filter-tier=1&selectedJob=4077549
So this is clearly not a crash but a normal shutdown when we usually should restart Firefox with the update applied. As I brought up the suspicion that bug 974971 might be related here, I feel more and more that this is the reason here.
Matt, do you have ever heard about that?
Flags: needinfo?(mhowell)
OS: Unspecified → Windows
Hardware: Unspecified → x86_64
Updated•8 years ago
|
Flags: needinfo?(mhowell)
Comment 5•8 years ago
|
||
All update tests on Windows 7 and 8.1 for Nightly are busted right now. Given the exit code 0 I would assume that Firefox crashes with a content crash when the update is going to be applied. Not sure if that is related to bug 1370576 (looks like Windows 10 only) because we force staging to be enabled across all platforms.
Changelog of the last days:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&filter-searchStr=firefox%20update%20windows&filter-tier=1&filter-tier=2&filter-tier=3&bugfiler&fromchange=c03f906e6579fce0b61524c4f5c8445dfa76ffe0&filter-resultStatus=success&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=running&filter-resultStatus=pending&filter-resultStatus=runnable&selectedJob=110318131
Pushlog for the source build which is affected:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=d50abca6521b&tochange=53477d584130
Could this be related to bug 1375549?
Flags: needinfo?(robert.strong.bugs)
Comment 6•8 years ago
|
||
The log indeed shows a crash:
https://treeherder.mozilla.org/logviewer.html#?job_id=111079389&repo=mozilla-central&lineNumber=6662
05:47:47 INFO - PROCESS-CRASH | test_fallback_update.py TestFallbackUpdate.test_update | application crashed [@ RtlAllocateMemoryZone + 0xa32c72]
[..]
05:47:47 INFO - Crash reason: EXCEPTION_BREAKPOINT
05:47:47 INFO - Crash address: 0x6217671e
05:47:47 INFO - Process uptime: 84 seconds
05:47:47 INFO - Thread 32 (crashed)
05:47:47 INFO - 0 xul.dll!RtlAllocateMemoryZone + 0xa32c72
05:47:47 INFO - eip = 0x6217671e esp = 0x1e19fe48 ebp = 0x1e19fe4c ebx = 0x0cbed9d0
05:47:47 INFO - esi = 0x0000003f edi = 0x75cff9a3 eax = 0x716c17e8 ecx = 0x642141a4
05:47:47 INFO - edx = 0x77996c74 efl = 0x00000246
05:47:47 INFO - Found by: given as instruction pointer in context
05:47:47 INFO - 1 nss3.dll!RtlGetProcessHeaps + 0x5a
05:47:47 INFO - eip = 0x712aafe3 esp = 0x1e19fe54 ebp = 0x1e19fe68
05:47:47 INFO - Found by: previous frame's frame pointer
05:47:47 INFO - 2 nss3.dll!RtlLookupElementGenericTableAvl + 0x10
05:47:47 INFO - eip = 0x712a0afe esp = 0x1e19fe70 ebp = 0x1e19fe74
05:47:47 INFO - Found by: call frame info
05:47:47 INFO - 3 ucrtbase.dll!__RtlConvertUlongToLargeInteger + 0x74a9
05:47:47 INFO - eip = 0x7158d5ef esp = 0x1e19fe7c ebp = 0x1e19feb0
05:47:47 INFO - Found by: call frame info
05:47:47 INFO - 4 kernel32.dll! ?? ?? ::FNODOBFM::`string' + 0xa
05:47:47 INFO - eip = 0x75cfef8c esp = 0x1e19feb8 ebp = 0x1e19febc
05:47:47 INFO - Found by: previous frame's frame pointer
05:47:47 INFO - 5 mozglue.dll!RtlAdjustPrivilege + 0xb
05:47:47 INFO - eip = 0x716abca5 esp = 0x1e19fec4 ebp = 0x1e19fed0
05:47:47 INFO - Found by: previous frame's frame pointer
05:47:47 INFO - 6 ntdll.dll!LdrpSetAlternateResourceModuleHandle + 0x148
05:47:47 INFO - eip = 0x779b367a esp = 0x1e19fed8 ebp = 0x1e19ff10
05:47:47 INFO - Found by: call frame info
05:47:47 INFO - 7 ntdll.dll!LdrpSetAlternateResourceModuleHandle + 0x11b
05:47:47 INFO - eip = 0x779b364d esp = 0x1e19ff18 ebp = 0x1e19ff28
05:47:47 INFO - Found by: call frame info
Comment 7•8 years ago
|
||
The crash might have happened because there is a hang during shutdown. As the test logs show we click the update button, which should restart Firefox. But nothing seems to happen within 3 minutes.
05:44:25 INFO - 1498826665361 Marionette TRACE 2 -> [0,1405,"clickElement",{"id":"295b2ebc-b3e7-44df-bdb6-dbfeba2da4a5"}]
05:44:25 INFO - 1498826665429 Marionette TRACE 2 <- [1,1405,null,{}]
05:44:25 INFO - 1498826665593 Marionette DEBUG Closed connection 2
05:44:25 INFO - 1498826665672 Marionette WARN New connections are currently not accepted
05:44:25 INFO - Unable to read VR Path Registry from C:\Users\mozauto\AppData\Local\openvr\openvrpaths.vrpath
05:44:25 INFO - [Child 1440] WARNING: pipe error: 109: file c:/builds/moz2_slave/m-cen-w32-ntly-000000000000000/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
05:44:25 INFO - [Child 1440] WARNING: pipe error: 109: file c:/builds/moz2_slave/m-cen-w32-ntly-000000000000000/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
05:44:25 INFO - Unable to read VR Path Registry from C:\Users\mozauto\AppData\Local\openvr\openvrpaths.vrpath
05:44:25 INFO - [Child 1924] WARNING: pipe error: 109: file c:/builds/moz2_slave/m-cen-w32-ntly-000000000000000/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
05:44:25 INFO - [Child 1924] WARNING: pipe error: 109: file c:/builds/moz2_slave/m-cen-w32-ntly-000000000000000/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
05:44:26 INFO - *** UTM:SVC TimerManager:registerTimer - id: telemetry_modules_ping
05:47:30 INFO - mozcrash Downloading symbols from: https://queue.taskcluster.net/v1/task/EI9bVMHbSRKv-BgaTX6_hw/artifacts/public%2Fbuild%2Ffirefox-56.0a1.en-US.win32.crashreporter-symbols.zip
| Comment hidden (Intermittent Failures Robot) |
| Assignee | ||
Comment 9•8 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #5)
> All update tests on Windows 7 and 8.1 for Nightly are busted right now.
> Given the exit code 0 I would assume that Firefox crashes with a content
> crash when the update is going to be applied. Not sure if that is related to
> bug 1370576 (looks like Windows 10 only) because we force staging to be
> enabled across all platforms.
>
> Changelog of the last days:
>
> https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&filter-
> searchStr=firefox%20update%20windows&filter-tier=1&filter-tier=2&filter-
> tier=3&bugfiler&fromchange=c03f906e6579fce0b61524c4f5c8445dfa76ffe0&filter-
> resultStatus=success&filter-resultStatus=busted&filter-
> resultStatus=exception&filter-resultStatus=retry&filter-
> resultStatus=usercancel&filter-resultStatus=running&filter-
> resultStatus=pending&filter-resultStatus=runnable&selectedJob=110318131
>
> Pushlog for the source build which is affected:
>
> https://hg.mozilla.org/mozilla-central/
> pushloghtml?fromchange=d50abca6521b&tochange=53477d584130
>
> Could this be related to bug 1375549?
The crash looks like it is well outside of app update code and there are no reports of the updater crashing including from the people experiencing the firefox.exe crashes or firefox.exe hangs.
Flags: needinfo?(robert.strong.bugs)
| Assignee | ||
Comment 10•8 years ago
|
||
If this isn't a crash and instead firefox.exe isn't launched after an update you might be seeing bug 1376612 which landed on m-c a couple of days ago.
| Comment hidden (Intermittent Failures Robot) |
Comment 12•8 years ago
|
||
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #10)
> If this isn't a crash and instead firefox.exe isn't launched after an update
> you might be seeing bug 1376612 which landed on m-c a couple of days ago.
When I tried to reproduce this problem on one of our machines I noticed that I'm unable to install/open Firefox Nightly at all starting with the Nighly build on June 29th. So this test failure seems to perfectly match this situation. I filed bug 1378059 to get this investigated.
Comment 13•8 years ago
|
||
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #10)
> If this isn't a crash and instead firefox.exe isn't launched after an update
> you might be seeing bug 1376612 which landed on m-c a couple of days ago.
With being bug 1378059 a red herring, that's indeed the problem here. Does it mean it's actually a regression from bug 1375549 but which got fixed via bug 1376612? So far update tests with the update number 1 and 2 are working fine.
Flags: needinfo?(robert.strong.bugs)
Updated•8 years ago
|
Summary: Intermittent test_direct_update.py TestDirectUpdate.test_update | IOError: Process has been closed (Exit code: 0) (Reason: Timed out waiting for port 2828!) → Intermittent test_direct_update.py TestDirectUpdate.test_update, test_fallback_update.py TestFallbackUpdate.test_update | IOError: Process has been closed (Exit code: 0) (Reason: Timed out waiting for port 2828!) - no relaunch of Firefox
| Assignee | ||
Comment 15•8 years ago
|
||
Yes, Bug 1376612 was a regression from bug 1375549 which was found by a few people including myself after bug 1375549 made it into nightly. Bug 1375549 tries to apply updates even when the pid doesn't exit which is bug 1375242 and a few others (see dependencies) to try to at least get clients updated. It is best effort due to a really nasty shutdown bug in Firefox.
Flags: needinfo?(robert.strong.bugs)
| Assignee | ||
Comment 16•8 years ago
|
||
Forgot to mention that it is also intermittent.
Comment 17•8 years ago
|
||
Thanks Robert. Lets update the dependency tree then.
I will keep this bug open for now until all remaining busted tests are gone. Interestingly there are still some update number 3 tests failing today, which should not have happened. So maybe there is still a remaining issue left over. I will keep an eye on the update test results.
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=fef489e8c2a193dde885adc48deb74cc883a5881&filter-searchStr=firefox%20ui%20update&filter-tier=1&filter-tier=2&filter-tier=3
| Assignee | ||
Comment 18•8 years ago
|
||
BTW: the only change left from bug 1375549 is applying the update after waiting 30 seconds for the pid to exit vs. the way it was before which was waiting and if the pid didn't exit after 70 seconds (used to be 10 seconds) not applying the update. You might very well be seeing issues with the firefox.exe process not exiting properly.
Comment 19•8 years ago
|
||
Hm, so in Marionette we are waiting 185 seconds (65s for shutdown + 120s for startup) for a successful restart of Firefox when triggered from inside the application:
https://dxr.mozilla.org/mozilla-central/rev/acbd9a64c0fa4e1980c7732735d4f4c2761ca4c9/testing/marionette/client/marionette_driver/marionette.py#1145
Shouldn't this be long enough to prevent failures like that to appear?
| Assignee | ||
Comment 20•8 years ago
|
||
It is long enough. As a matter of fact the updater waited only 10 seconds until very recently and now it waits 30 seconds due to a couple of recent reports of the firefox.exe process taking more than 10 seconds which is unrelated to this.
The problem that has been reported is the firefox.exe process not exiting at all or crashing. See the dependent bugs in bug 1375242.
| Assignee | ||
Comment 21•8 years ago
|
||
BTW: there are also reports of this happening without applying an update.
Comment 22•8 years ago
|
||
Ok, so given how often it will be reproducible the next days and depending on my available time I could have a look at our machines. Maybe I can catch such a behavior while being connected.
Depends on: 1375242
| Assignee | ||
Comment 23•8 years ago
|
||
Just so you know, I'm not working on bug 1375242 and I don't know who is if anyone though you might be able to find out on irc.
Comment 24•8 years ago
|
||
If I'm not mistaken, dmajor said he's working on a patch in https://bugzilla.mozilla.org/show_bug.cgi?id=1374043#c29.
Comment 25•8 years ago
|
||
So are bug 1375242 and bug 1374043 the same underlying issue?
Comment 26•8 years ago
|
||
At this point, we have to address the issue of firefox.exe not shutting down before we can assert otherwise.
Comment 27•8 years ago
|
||
Quick update for today. Only 2 tests were busted today due to this reported issue. Both are for updates with the update number 4. So maybe already tomorrow we will be green again.
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=11755fd63168581e194258d04bb6a7337779ec78&filter-searchStr=firefox%20ui%20update&filter-tier=1&filter-tier=2&filter-tier=3&selectedJob=112044695
Comment 28•8 years ago
|
||
There is no test failure in the last two days. Everything is green. So closing as fixed by bug 1376612.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Updated•8 years ago
|
status-firefox56:
--- → fixed
Target Milestone: --- → mozilla56
| Comment hidden (Intermittent Failures Robot) |
Updated•8 years ago
|
Assignee: nobody → robert.strong.bugs
status-firefox54:
--- → unaffected
status-firefox55:
--- → unaffected
status-firefox-esr52:
--- → unaffected
Updated•8 years ago
|
Whiteboard: [stockwell fixed:product]
Comment 30•8 years ago
|
||
I didn't managed to reproduce this issue using STR from bug https://bugzilla.mozilla.org/show_bug.cgi?id=1374043#c0 or https://bugzilla.mozilla.org/show_bug.cgi?id=1375242#c0. I've tried with an affected Nightly build on Windows 10 x64, and Windows 7 x64, but Firefox updated as expected to the latest version and the firefox.exe process was killed in task manager.
Per comment 28 I see that this bug was already verified by the reporter in https://bugzilla.mozilla.org/show_bug.cgi?id=1376612#c15.
| Comment hidden (Intermittent Failures Robot) |
You need to log in
before you can comment on or make changes to this bug.
Description
•