Closed Bug 1108171 Opened 10 years ago Closed 10 years ago

Investigate update install on slow connections

Categories

(Toolkit :: Application Update, defect)

Platform: x86_64, Windows 7
Severity: normal (priority not set)

Tracking

Status: RESOLVED WORKSFORME

People

(Reporter: kairo, Unassigned)
Details

In our update test runs on Beta (running through our "check for updates" UI), we had a number of Windows test runs where the browser exited during the download of the MAR, e.g. in http://mm-ci-production.qa.scl3.mozilla.com:8080/job/ondemand_update/109201/console

The downloads there were quite slow at the time, taking over 3 minutes for the ~47M complete MAR. The strange thing is that Firefox just exited (or was killed); no crash reporter came up or anything. We should find some way to reproduce this with slow connections, because if this is a general bug, it's a bad one.
Hmmm... no reports from users? Is this when running mozmill?
Chances are the exit isn't due to app update. What specifically are you asking to investigate? The exit? The slowness?
This is/was on running mozmill tests, yes. We just happened to have slow connections there which we usually do not have. We should make sure we are not running into this on all slow connections.
Agreed, though this could very well be a bug in mozmill, especially since we have users with slow connections and there have been no reports of this happening outside of mozmill. It would be good to investigate whether that is the case first, especially since there have been past occurrences where a bug was thought to be in app update and it was actually in the mozmill test.
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #4)
> is the case or not first especially since there have been past occurrences
> where a bug was thought to be in app update and it was actually in the
> mozmill test.

Those instances were very, very few compared to the enormous number of bugs we detected with our tests, mostly for releng related issues. So I don't concur here.

(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #0)
> In our update test runs on Beta (running through our "check for updates"
> UI), we had a number of Windows test runs exit the browser during
> downloading of the MAR, e.g. in
> http://mm-ci-production.qa.scl3.mozilla.com:8080/job/ondemand_update/109201/
> console

In the future it would be good to attach the console log to the bug, or at least give an excerpt here; those logs won't be available after three days. So this is what we actually see:

> 19:03:13 TEST-START | testUpdate.js | setupTest
> 19:03:13 TEST-START | testUpdate.js | testCheckAndDownloadUpdate
> 19:06:14 Parent process 2324 exited with children alive:
> 19:06:14 PIDS: 3348
> 19:06:14 Attempting to kill them...
[..]
> 19:09:45 Exception in thread Thread-6:
> 19:09:45 Traceback (most recent call last):
> 19:09:45   File "c:\jenkins\workspace\ondemand_update\mozmill-env-windows\python\Lib\threading.py", line 808, in __bootstrap_inner
> 19:09:45     self.run()
> 19:09:45   File "c:\jenkins\workspace\ondemand_update\mozmill-env-windows\python\Lib\threading.py", line 761, in run
> 19:09:45     self.__target(*self.__args, **self.__kwargs)
> 19:09:45   File "c:\jenkins\workspace\ondemand_update\mozmill-env-windows\python\lib\site-packages\mozprocess\processhandler.py", line 322, in _procmgr
> 19:09:45     self._poll_iocompletion_port()
> 19:09:45   File "c:\jenkins\workspace\ondemand_update\mozmill-env-windows\python\lib\site-packages\mozprocess\processhandler.py", line 357, in _poll_iocompletion_port
> 19:09:45     self.kill()
> 19:09:45   File "c:\jenkins\workspace\ondemand_update\mozmill-env-windows\python\lib\site-packages\mozprocess\processhandler.py", line 142, in kill
> 19:09:45     self.returncode = self.wait()
> 19:09:45   File "c:\jenkins\workspace\ondemand_update\mozmill-env-windows\python\lib\site-packages\mozprocess\processhandler.py", line 163, in wait
> 19:09:45     self.returncode = self._wait()
> 19:09:45   File "c:\jenkins\workspace\ondemand_update\mozmill-env-windows\python\lib\site-packages\mozprocess\processhandler.py", line 463, in _wait
> 19:09:45     raise OSError(err)
> 19:09:45 OSError: IO Completion Port failed to signal process shutdown

So as shown in the above trace, the main process somehow quit. This is not something initiated by Mozmill itself, and it also doesn't look like a bug in mozprocess.

> The downloads there were quite slow at that time, taking at least over 3
> minutes for the ~47M complete MAR.

This was filed separately as bug 1108313 for all platforms. Only on Windows have we seen this specific sporadic quit of Firefox.

> We should find some way to investigate if we can reproduce this experience
> with slow connections in some way, as if this is a general bug, it's a bad
> one.
Are there some system logs which could give us an indication? I already checked with the event viewer but that didn't reveal any helpful information.
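Not part of the bug as filed, but one way to narrow down whether the quit comes from mozmill or from the browser itself would be to launch Firefox under a minimal, independent watcher and record exactly when and how it exits. This is a hypothetical, stdlib-only Python sketch (the name `watch_process` is mine, not anything from mozmill or mozprocess):

```python
import subprocess
import sys
import time

def watch_process(args, timeout=600.0, poll_interval=0.5):
    """Launch a process and poll it, recording when and how it exits.

    Returns (returncode, elapsed_seconds). returncode is None if the
    process was still alive when the timeout elapsed and we killed it;
    otherwise it is whatever the process exited with, so an unexpected
    early quit (as in this bug) shows up as a short `elapsed` value.
    """
    start = time.monotonic()
    proc = subprocess.Popen(args)
    while True:
        rc = proc.poll()
        elapsed = time.monotonic() - start
        if rc is not None:
            return rc, elapsed   # process exited on its own
        if elapsed >= timeout:
            proc.kill()
            proc.wait()
            return None, elapsed  # still running; we killed it
        time.sleep(poll_interval)
```

Running the browser this way during a throttled download, independent of the mozmill harness, would at least tell us the exit code Windows saw for the silent quit.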
"releng related issues" are not toolkit -> app update related issues. I'm not saying there isn't a bug. I do hope you can find something to figure out what code is causing it and that it is doubtful that it is in toolkit -> app update code as has been shown in the past.
Again, I filed this bug because we should do some testing with such a very slow download speed through the manual update UI on Windows and *make sure* that it's not an issue there. Unfortunately, I personally do not know how to simulate those slow speeds.
I've downloaded several updates using a slow connection and the app update ui without experiencing any problems at all.
That's a relief. How slow of a connection?
And especially, how did you test? Is there a tool for Windows that can slow down the network connection?
I did essentially what you did in that I updated multiple times over a connection that took over 3 minutes (at times over 5 minutes) to update using the complete mar. The connection itself was slow and I didn't use any tools to slow down the connection.
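For runs where a naturally slow connection isn't available, the download rate could be capped in software instead. A minimal sketch of the rate-limiting idea (my own illustration, not a tool used in this bug): serve the MAR in chunks and sleep just long enough that the average throughput stays at the cap. The `sleep` and `clock` parameters are only there so the timing can be faked in tests.

```python
import time

def throttled_chunks(data, rate_bytes_per_sec, chunk_size=4096,
                     sleep=time.sleep, clock=time.monotonic):
    """Yield chunks of `data`, pausing so the average rate stays
    at or below `rate_bytes_per_sec`."""
    start = clock()
    sent = 0
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        sent += len(chunk)
        # Earliest time by which `sent` bytes are allowed to have
        # been emitted at the capped rate.
        due = start + sent / rate_bytes_per_sec
        delay = due - clock()
        if delay > 0:
            sleep(delay)
        yield chunk
```

Wiring this into a local HTTP server that serves the complete MAR would give a reproducible ~3-minute download for a ~47M file (at roughly 256 KB/s) without depending on network conditions.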
That sounds to me like confirmation that we don't have some really fatal flaw in the update code there.
If we cannot repro this behavior, we might wanna close this bug as WFM and reopen once we have more information, in case it happens again. Robert, please let me know when you see such a behavior again.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME