Closed
Bug 803489
Opened 12 years ago
Closed 11 years ago
Software update tests on Windows 8 fail sometimes due to updater prompt on startup (jsbridge cannot connect)
Categories
(Mozilla QA Graveyard :: Mozmill Tests, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: whimboo, Assigned: mario.garbi)
References
Details
(Whiteboard: s=130121 u=failure c=update p=1)
Attachments
(7 files)
Seen last night while running update tests against the 17.0b2 beta candidate builds of Firefox. As the screenshot shows, there still seem to be running Firefox processes around. This modal dialog causes an application disconnect for Mozmill on first startup. I'm not sure, but it could be that we are not correctly shutting down Firefox between the tests.

Affected testrun: http://10.250.73.243:8080/job/ondemand_update/2937/console

*** AUS:SVC Downloader:onProgress - progress: 5402/10386816
*** AUS:SVC Downloader:onProgress - progress: 2055304/10386816
*** AUS:SVC Downloader:onProgress - progress: 9106584/10386816
*** AUS:SVC Downloader:onProgress - progress: 10386816/10386816
*** AUS:SVC Downloader:onStopRequest - original URI spec: http://download.mozilla.org/?product=firefox-17.0b2-partial-16.0b6&os=win&lang=en-US&force=1, final URI spec: http://download.cdn.mozilla.net/pub/mozilla.org/firefox/releases/17.0b2/update/win32/en-US/firefox-16.0b6-17.0b2.partial.mar, status: 0
*** AUS:SVC Downloader:onStopRequest - setting state to: pending-service
*** AUS:UI gDownloadingPage:onStopRequest - patch verification succeeded
*** AUS:SVC gCanStageUpdates - able to stage updates because we'll use the service
*** AUS:SVC UpdateService:applyUpdateInBackground called with the following update: Firefox 17.0 Beta 2
*** AUS:SVC readStatusFile - status: failed: 7, path: C:\Users\mozauto\AppData\Local\Mozilla\Firefox\firefox\updates\0\update.status
*** UTM:SVC TimerManager:notify - notified @mozilla.org/browser/search-service;1
*** UTM:SVC TimerManager:notify - notified @mozilla.org/addons/integration;1
*** UTM:SVC TimerManager:notify - notified @mozilla.org/extensions/blocklist;1
Timeout: bridge.execFunction("5dc455a1-19d6-11e2-a001-7c6d6299cd7e", bridge.registry["{a4969087-3644-4262-9107-1dd5a911bc66}"]["runTestFile"], ["c:\\users\\mozauto\\appdata\\local\\temp\\tmpa8zyad.mozmill-tests\\tests\\update\\testDirectUpdate\\test2.js"])
TEST-UNEXPECTED-FAIL | Disconnect Error: Application unexpectedly closed
INFO Passed: 2
INFO Failed: 1
INFO Skipped: 0
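When triaging console logs like the one above, it can help to pull out the `readStatusFile` lines automatically, since they show the state the updater reported before the disconnect. The following is only a rough sketch: the function name `find_update_status` and the regular expression are assumptions made for illustration, based solely on the log line format quoted in this bug, and are not part of Mozmill or the Firefox updater.

```python
import re

# Assumed format, taken from the log in this bug:
#   "*** AUS:SVC readStatusFile - status: failed: 7, path: ..."
# The numeric code after the state is treated as optional.
STATUS_RE = re.compile(r"readStatusFile - status: ([a-z-]+)(?:: (\d+))?")


def find_update_status(log_text):
    """Return a list of (state, code) tuples for every readStatusFile line.

    code is an int when the log line carries one, otherwise None.
    """
    results = []
    for match in STATUS_RE.finditer(log_text):
        state, code = match.group(1), match.group(2)
        results.append((state, int(code) if code is not None else None))
    return results


# Two lines copied from the console output in the description.
SAMPLE_LINES = [
    "*** AUS:SVC Downloader:onStopRequest - setting state to: pending-service",
    r"*** AUS:SVC readStatusFile - status: failed: 7, path: "
    r"C:\Users\mozauto\AppData\Local\Mozilla\Firefox\firefox\updates\0\update.status",
]

if __name__ == "__main__":
    print(find_update_status("\n".join(SAMPLE_LINES)))
```

The sketch deliberately does not interpret the numeric code; it only surfaces it so failing runs can be compared.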
Comment 2•12 years ago
Appeared on 11/23 on Windows NT 6.2.8400 (x86) with Release 17.0: http://mozmill-ondemand.blargon7.com/#/update/report/674977957b923f4905160d1b9ac05dcb
Comment 3•12 years ago
Reproduced on 11/26 on Windows 8 x64: http://10.250.73.243:8080/job/ondemand_update/4844/
Comment 4•12 years ago
Happened today on Windows 8 x64:
http://10.250.73.243:8080/job/ondemand_update/4897/console
http://10.250.73.243:8080/job/ondemand_update/4900/console

On Windows 8 x86:
http://10.250.73.243:8080/job/ondemand_update/4905/console
http://10.250.73.243:8080/job/ondemand_update/4913/console
Reporter | ||
Comment 5•12 years ago
I wish someone could have a look into that. Not sure when I will actually be able to work on it.
Assignee: hskupin → nobody
Priority: -- → P1
Reporter | ||
Updated•12 years ago
Status: ASSIGNED → NEW
This issue is enough of a nuisance that I'm going to stop running ondemand-update automation on release builds until this is resolved. The tests fail more often than they pass, creating an unnecessary backlog. We'll manually spotcheck Win8 updates in the meantime.
Reporter | ||
Comment 7•12 years ago
My suspicion is that this behavior exists because of all the trouble VMware Fusion is causing us. Those two Win8 VMs are running on qa-set, which has the same memory issues as all the other machines we have run Fusion on so far. So once we have moved to the new ESX cluster this should be fixed.
Reporter | ||
Comment 8•11 years ago
Looks like even in the new CI the problem persists. It's something we have to figure out soon.
Whiteboard: s=130121 u=failure c=update p=1
Updated•11 years ago
Assignee: nobody → mario.garbi
Status: NEW → ASSIGNED
Reporter | ||
Comment 9•11 years ago
Mario, please try to reproduce this issue first. That would be important. If you can't reproduce it locally, please use our Win8 machines of the old CI system. Those are free and can be utilized. Keep in mind that something Mozmill related could be involved here. I'm happy to help you whenever you are blocked.
Assignee | ||
Comment 10•11 years ago
(In reply to Henrik Skupin (:whimboo) from comment #9)
> Mario, please try to reproduce this issue first. That would be important. If
> you can't locally please use our Win8 machines of the old CI system. Those
> are free and can be utilized. Keep in mind that something Mozmill related
> could be involved here. I'm happy to help you whenever you are blocked.

I'm on it as we speak. I will come back with info as soon as I get some results.
Assignee | ||
Comment 11•11 years ago
I have tried to reproduce it locally and haven't managed to yet, except in the cases where I manually open a second instance of Firefox. If testrun_update.py is run properly in the correct environment the tests pass 10/10: http://mozmill-crowd.blargon7.com/#/update/reports

I will try on the Win8 machines of the old CI system and check where they act differently.
Assignee | ||
Comment 12•11 years ago
So far I wasn't able to reproduce it on the old Win8 machine either. I will continue investigating, but I suspect it's related to another bug that leaves a FF instance open.

Reports: http://mozmill-crowd.blargon7.com/#/update/reports
Reporter | ||
Comment 13•11 years ago
I wouldn't be able to explain that but it might be bug 813170. Since we have disabled this test we no longer have those problems on Win8, right? Would you mind checking that? What was the frequency of failures in the last few days?
Assignee | ||
Comment 14•11 years ago
I didn't manage to reproduce it under normal conditions (without manually opening a FF instance), either locally (last 3 days) or on the old Win8 machines (yesterday only). I will look over bug 813170 and continue the investigation.
Reporter | ||
Comment 15•11 years ago
Please check the results in the dashboard and come back with the failure rates from the last 7 or 10 days.
Assignee | ||
Comment 16•11 years ago
Mozmill CI update reports for 07.01.2013 - 24.01.2013:

2013-01-16
20.0a2 fr - Windows NT 5.1.2600 (x86)
http://mozmill-ci.blargon7.com/#/update/report/f25fe2f500e5e4086802832f52121ada

2013-01-08
20.0a1 fr - Windows NT 6.2.8400 (x86)
http://mozmill-ci.blargon7.com/#/update/report/23d8fbdd0190d4b0496d6b129fcd6e8e

2013-01-08
20.0a1 en-US - Windows NT 6.2.8400 (x86_64)
http://mozmill-ci.blargon7.com/#/update/report/23d8fbdd0190d4b0496d6b129fc15b31

Only 3 failures with Disconnect Error for update testruns in the period 07-24.01.2013. The last one was on 16.01.2013.
Reporter | ||
Comment 17•11 years ago
(In reply to mario garbi from comment #16)
> 2013-01-16
> 20.0a2 fr - Windows NT 5.1.2600 (x86)
> http://mozmill-ci.blargon7.com/#/update/report/f25fe2f500e5e4086802832f52121ada

That's not Windows 8 and not this bug.

> 2013-01-08
> 20.0a1 fr - Windows NT 6.2.8400 (x86)
> http://mozmill-ci.blargon7.com/#/update/report/23d8fbdd0190d4b0496d6b129fcd6e8e
>
> 2013-01-08
> 20.0a1 en-US - Windows NT 6.2.8400 (x86_64)
> http://mozmill-ci.blargon7.com/#/update/report/23d8fbdd0190d4b0496d6b129fc15b31

Was January 8th really the last time we saw it? I thought I had noticed it even with the new CI.
Assignee | ||
Comment 18•11 years ago
Yes, I double checked mozmill-ci reports and I cannot find a Disconnect error failure since 08.01.2013. I posted the win NT 5.1 report to cover all Disconnect errors.
Reporter | ||
Comment 19•11 years ago
This bug is for Win8 only. Other disconnect failures are based on different issues. Given that we cannot reproduce it right now, and that I suspect the behavior may have changed with the new CI, I will lower the priority to P3. Let's revisit on Monday, when we can decide whether to close it as WFM.
Priority: P1 → P3
Hardware: x86 → All
Assignee | ||
Comment 20•11 years ago
I still haven't been able to reproduce it. Has it shown up in recent runs on CI?
Reporter | ||
Comment 21•11 years ago
(In reply to mario garbi from comment #20)
> I still haven't been able to reproduce it, has it showed up in recent runs
> on CI?

Not sure. You might want to check that yourself.
Assignee | ||
Comment 22•11 years ago
As far as I've seen, it hasn't shown up again recently.
Reporter | ||
Comment 23•11 years ago
Let's wait one more week; then we can close it as WFM if no more failures occur.
Reporter | ||
Comment 24•11 years ago
This happened again today with an ondemand test: http://mm-ci-master.qa.scl3.mozilla.com:8080/job/ondemand_update/1302/console
Comment 25•11 years ago
Happened again today with an ondemand test: http://mm-ci-master.qa.scl3.mozilla.com:8080/job/ondemand_update/2122/console

Windows 8 and FF 17.0.1: http://mozmill-ondemand.blargon7.com/#/update/report/f36358d058daf73ddbf78150163cb18f
Reporter | ||
Comment 26•11 years ago
This has happened a lot lately during the ondemand tests for recent Firefox releases. Mario, do we have an update here? Given that it's blocking the QA team from quickly running the update tests for release we have to raise the severity.
Priority: P3 → P2
Assignee | ||
Comment 27•11 years ago
I am still working on trying to reproduce this. I will come back with updates as soon as possible.
Comment 28•11 years ago
Reporter | ||
Comment 29•11 years ago
So I was able to catch that issue just now. I took a screenshot, given that zh-TW is not that readable for me. From what I have seen, the dialog pops up when Firefox downloads or applies the update. Tony, are you able to help out?
Flags: needinfo?(tchung)
Comment 30•11 years ago
Can you copy and paste the text in the dialog to Google Translate?
Reporter | ||
Comment 31•11 years ago
No, it was not copyable. I wouldn't have asked otherwise.
Comment 32•11 years ago
(In reply to Henrik Skupin (:whimboo) from comment #31)
> No, it was not copyable. I wouldn't have asked otherwise.

Hi Henrik, the dialog in your screenshot in comment 29 translates to exactly the same text as the one in the comment 0 screenshot.
Flags: needinfo?(tchung)
Assignee | ||
Comment 33•11 years ago
I am working on creating an ondemand file to trigger locally with the release builds that failed recently. So far I have also noticed some strange behavior with win8 and updating l10n versions of FF. This bug has top priority for me and I'll try to figure it out as soon as possible.
Assignee | ||
Comment 34•11 years ago
It reproduced again yesterday (01.04) at 2:24 AM on Firefox 19.0.2 en-US: http://mozmill-ondemand.blargon7.com/#/update/report/25ad365ca7bcf4905e9b700b4f970746

I am still investigating this, and I'm trying to understand what ondemand testruns do differently from what I do when I simulate an ondemand run locally. I was hoping to observe the ondemand run, but due to the late hour (2:24 AM) I was unable to.
Assignee | ||
Comment 35•11 years ago
*It has reproduced in an ondemand testrun.
Assignee | ||
Comment 36•11 years ago
And again on Firefox 19.0.2 zh-TW http://mozmill-ondemand.blargon7.com/#/update/report/25ad365ca7bcf4905e9b700b4f96c068
Assignee | ||
Comment 37•11 years ago
It has also reproduced on regular testrun_update scripts for Firefox 22.0a1 fr on Win 6.2.9200 64bit: http://mozmill-ci.blargon7.com/#/update/report/25ad365ca7bcf4905e9b700b4fceaa7e
Reporter | ||
Comment 38•11 years ago
Given all the massive failures lately I'm raising this issue to a P1. Mario, if you are not able to reproduce it yourself, please take advantage of other members of your team. We cannot wait any longer to get this problem fixed. It has been sitting in the queue for almost 3 months(!) now. Thanks.
Priority: P2 → P1
Comment 39•11 years ago
Andreea and I have configured a local Jenkins and started running the ondemand testruns with the latest configuration file taken from this bug. If this issue reproduces locally we can investigate it here. If it doesn't, it might be an issue with the remote machine configuration. Mario, you might want to verify the reports and see if the same Windows 8 machine was used when the failures appeared.
Assignee | ||
Comment 40•11 years ago
I noticed this while working on a Win8 machine. When it pops up we are unable to interact with the applications running in the background until we close the dialog. I'm not sure, but I think this could impact our tests.
Reporter | ||
Comment 41•11 years ago
Thanks Mario. I have already disabled update checks which were still enabled on all Win8 machines. So this dialog should not appear anymore. But in any case Mozmill should be able to kill the Firefox process. So it shouldn't be directly related.
Comment 42•11 years ago
The first run of ondemand updates (using the file added to this bug) passed. We will be looking into the mozmill-ci reports to see whether all the failures were on a single machine, and we will also run the ondemand tests again through the local Jenkins.
Comment 43•11 years ago
The error did not always appear on the same machine. The ondemand testruns have been running locally for the last couple of days, but the error does not appear. We have also created a new ondemand configuration file using the mozmill-ondemand reports for yesterday's testrun, but no error so far. We thought this might also have something to do with the internet connection speed, so we checked it, and our machine has a slower connection.

We think we could start an ondemand testrun tomorrow morning (our time) on http://mm-ci-master.qa.scl3.mozilla.com/ for Windows 8 machines only, reporting to mozmill-crowd so it won't interfere with real reports. Please tell us what you think.
Comment 44•11 years ago
We have run ondemands on all Windows 8 remote machines twice. These are the results:

1) First time:
- Disconnect error (application closed) at error patching (screenshot attached), on the mm-win-8-32-1 machine
- Disconnect error (application closed) at download (screenshot attached), on the mm-win-8-32-3 machine

2) Second time:
- We still got the disconnect error (application closed) at error patching on the first machine
- An error saying the update failed due to still-running copies of Firefox. We looked, but no FF process was running in the task manager except for the error dialog.

Jenkins console for this issue: http://mm-ci-master.qa.scl3.mozilla.com:8080/job/ondemand_update/6401/console

We are now trying to see if we can reproduce it locally with the same configuration file. We also think we could mark one of the nodes as offline and run the update tests on the machine where the error reproduced most often. This might help us create a minimized test case.
Reporter | ||
Comment 45•11 years ago
(In reply to Daniela Petrovici from comment #44)
> - we got disconnect error application closed at error patching (screenshot
> attached) - happened on the mm-win-8-32-1 machine

When you attach screenshots, please attach them separately and not via a zip archive. It costs extra time for everyone who wants to look at them. Thanks.

> We are trying to see if we can reproduce locally now with the same
> configuration file.

Why do you want to try to reproduce on machines where you weren't able to see the issue in the past couple of days? Now that it has been reproduced on win8_32-3, why don't you use that one immediately?
Comment 46•11 years ago
We have started investigating on the remote machine and were able to reproduce the error by running an upgrade from 20.0b6 RO to 21.0b1. We are trying to create a minimized test case that reproduces this issue consistently. However, since we get the error message about running copies of Firefox even though no Firefox process is running in the background, we think that it might be a Firefox issue.
Reporter | ||
Comment 47•11 years ago
(In reply to Daniela Petrovici from comment #46)
> We have started investigation on the remote machine and we were able to
> reproduce the error by running an upgrade from 20.0b6 RO to 21.0b1. We are

It would be good to know how you ran the tests. Also, I'm not sure what you mean by a minimized testcase. I don't think that this is the right thing to do at this time.
Comment 48•11 years ago
We ran it with a normal testrun on the remote machine, giving it the ro build 20.0b6, which upgraded to 21.0b1. It reproduced in 2 out of 3 runs. When running with the config file containing more than 15 locales, once we get the jsbridge error about other copies still running, all following runs pop up that window and won't run until we close it. We looked in the task manager as well and haven't found any Firefox process still running, so we're not sure how to proceed. We could try running a script before each testrun begins that kills any existing Firefox processes, to see whether that helps or whether the issue still reproduces.
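The pre-testrun cleanup proposed above could look roughly like the sketch below. It is an illustration only and not part of the Mozmill automation scripts: the function names `build_kill_command` and `kill_leftover_processes` are hypothetical, and the sketch assumes the Windows `taskkill` tool is on the PATH. Since the task manager showed no leftover Firefox process in these runs, such a script may well have no effect on this particular failure.

```python
import subprocess
import sys


def build_kill_command(image_name="firefox.exe"):
    """Return the taskkill argument list for the given image name.

    /F forces termination, /T also ends child processes,
    /IM selects processes by image name.
    """
    return ["taskkill", "/F", "/T", "/IM", image_name]


def kill_leftover_processes(image_name="firefox.exe", runner=subprocess.call):
    """Try to kill leftover processes before a testrun.

    Returns True only if taskkill reported success (exit code 0).
    taskkill exits with a non-zero code when no matching process exists,
    so False does not necessarily indicate a problem.
    """
    if not sys.platform.startswith("win"):
        # taskkill only exists on Windows; do nothing elsewhere.
        return False
    return runner(build_kill_command(image_name)) == 0


if __name__ == "__main__":
    killed = kill_leftover_processes()
    print("leftover firefox.exe killed:", killed)
```

The `runner` parameter exists only so the command can be exercised without actually terminating anything; a wrapper job would call `kill_leftover_processes()` once before starting each testrun.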
Reporter | ||
Comment 49•11 years ago
Please see the recently added dependency for my current work on that. It's bug 858686.
Reporter | ||
Comment 50•11 years ago
Ok, so the patch on bug 860677 has landed now. With it we should no longer see this problem. Mario, please schedule an ondemand testrun in the next few days to prove that, or ask Anthony or Juan whether one of them will run such a job today anyway. Thanks.
Assignee | ||
Comment 51•11 years ago
I've emailed Juan and Anthony and if they won't have an ondemand update testrun triggered today we will schedule one for Monday morning because today's testruns have already started.
Assignee | ||
Comment 52•11 years ago
Reproduced again today with the ondemand testrun on Firefox 19.0.2 es-AR with Windows NT 6.2.9200 (x86_64): CI reports: http://mozmill-crowd.blargon7.com/#/update/report/8ec48e7ab0431a61b624e36d3182b977 Jenkins: http://mm-ci-master.qa.scl3.mozilla.com:8080/job/ondemand_update/7317/console
Reporter | ||
Comment 53•11 years ago
(In reply to mario garbi from comment #52)
> Reproduced again today with the ondemand testrun on Firefox 19.0.2 es-AR
> with Windows NT 6.2.9200 (x86_64):
>
> Jenkins:
> http://mm-ci-master.qa.scl3.mozilla.com:8080/job/ondemand_update/7317/console

So this is at least not the jsbridge error we were facing all the time in the last weeks! I call this a good sign. It means that my latest patches were successful.

Mario, when testing please make sure you all watch the tests running. Once a failure like this happens it would be good to know where we are hanging. What I have seen last week are real hangs while applying the update. So it might be the case here.
Assignee | ||
Comment 54•11 years ago
I managed to reproduce this error for each ondemand update job (3/3). It seems that the last tests from the Jenkins ondemand job are failing on both win-8-64 and win-8-32.

win-8-32-3
View the build in Jenkins: http://mm-ci-master.qa.scl3.mozilla.com:8080/job/ondemand_update/7398/
View the results in the Mozmill Dashboard:
http://mozmill-crowd.blargon7.com/#/update/report/8ec48e7ab0431a61b624e36d319426ed
http://mozmill-crowd.blargon7.com/#/update/report/8ec48e7ab0431a61b624e36d319460a8

win-8-64-2
View the build in Jenkins: http://mm-ci-master.qa.scl3.mozilla.com:8080/job/ondemand_update/7370/
View the results in the Mozmill Dashboard:
http://mozmill-crowd.blargon7.com/#/update/report/8ec48e7ab0431a61b624e36d3191a2da
http://mozmill-crowd.blargon7.com/#/update/report/8ec48e7ab0431a61b624e36d3191e909
Assignee | ||
Comment 55•11 years ago
By looking at the send times of the ondemand mail reports, we can see that the time from one build version to the next increases:

FF 17 @ 02:15
FF 18 @ 02:19 (4 min)
FF 18.0.1 @ 02:22 (3 min)
FF 19 @ 02:26 (4 min)
FF 19.0.2 @ 02:35 (9 min) -- we had a failure report here

It's safe to assume that we had a ~5 minute hang in the tests here.
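The ~5 minute estimate can be checked with a few lines of arithmetic. This is just a sketch using the send times quoted above; the helper name `gaps_in_minutes` is made up for illustration, and taking the median of the earlier gaps as the "normal" duration is an assumption about how to read the numbers.

```python
from datetime import datetime

# Send times of the ondemand mail reports, copied from the comment above.
REPORT_TIMES = [
    ("FF 17", "02:15"),
    ("FF 18", "02:19"),
    ("FF 18.0.1", "02:22"),
    ("FF 19", "02:26"),
    ("FF 19.0.2", "02:35"),
]


def gaps_in_minutes(report_times):
    """Return (build, minutes since the previous report) for each report after the first."""
    parsed = [(name, datetime.strptime(t, "%H:%M")) for name, t in report_times]
    return [
        (name, int((when - prev).total_seconds() // 60))
        for (_, prev), (name, when) in zip(parsed, parsed[1:])
    ]


gaps = gaps_in_minutes(REPORT_TIMES)

# Assumption: the median of the gaps before the failing run is the "normal" duration.
normal_gaps = sorted(minutes for _, minutes in gaps[:-1])
typical = normal_gaps[len(normal_gaps) // 2]

# Extra time spent in the failing run compared to a normal one.
extra = gaps[-1][1] - typical

print(gaps)
print("extra minutes in the failing run:", extra)
```

With a typical gap of 4 minutes and a 9 minute gap for 19.0.2, the difference is 5 minutes, which matches the estimate in the comment.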
Comment 56•11 years ago
The hang is evident in the console log as there's a timeout:

Timeout: bridge.execFunction("e002f730-a688-11e2-a13a-005056bb7a86", bridge.registry["{ab5e149e-93fa-45e1-a0fb-49df859c0a70}"]["runTestFile"], ["c:\\users\\mozauto\\appdata\\local\\temp\\tmpilq7u9.mozmill-tests\\tests\\update\\testDirectUpdate\\test2.js"])

Is this only reproducible for Firefox 19.0.2? As mentioned in comment 53, have you been able to watch the tests running? It would be useful to know what is happening on these boxes during those 5 minutes.
Assignee | ||
Comment 57•11 years ago
We have caught the failure and managed to collect a couple of screenshots that I will attach and the firefox process dump that should tell us why Firefox was not responding.
Assignee | ||
Comment 58•11 years ago
Firefox 19.0 zh-TW on Win 6.2.9200 64bit

View the build in Jenkins: http://mm-ci-master.qa.scl3.mozilla.com:8080/job/ondemand_update/7421/
View the results in the Mozmill Dashboard:
http://mozmill-crowd.blargon7.com/#/update/report/8ec48e7ab0431a61b624e36d319645e5
http://mozmill-crowd.blargon7.com/#/update/report/8ec48e7ab0431a61b624e36d31965b8d
Assignee | ||
Comment 59•11 years ago
Screen Shot of the Task Manager
Assignee | ||
Comment 60•11 years ago
Firefox Dump download link - 170Mb https://dl.dropboxusercontent.com/u/37788888/Dump/firefox.DMP
Reporter | ||
Comment 61•11 years ago
Thanks for the analysis. That is exactly what I expected, as mentioned before. I saw exactly the same thing last week.

Mario, please file a new bug for the mozmill-tests failures about the application disconnect. Also file a bug for the application updater of Firefox and add all the information from the last three comments. Make both dependent.

With the information from the latest test we can say that the original issue this bug was filed for is finally fixed. The jsbridge error doesn't occur anymore. That means we can close this bug.

Thanks to everyone involved here.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Component: Mozmill Automation → Mozmill Tests
Resolution: --- → FIXED
Summary: Software update tests on Windows 8 fail sometimes due to still running copies of Firefox → Software update tests on Windows 8 fail sometimes due to updater prompt on startup (jsbridge cannot connect)
Updated•5 years ago
Product: Mozilla QA → Mozilla QA Graveyard