Bug 666022
Opened 13 years ago
Closed 13 years ago
Some Firefox 4.0.1 -> 5.0 partial updates download as complete updates
Categories
(mozilla.org Graveyard :: Server Operations, task)
Tracking
(Not tracked)
RESOLVED
INCOMPLETE
People
(Reporter: u279076, Unassigned)
Attachments
(1 file)
45.38 KB,
text/plain
For some locales, the update dialog displays an error to the effect of "partial update cannot be verified, downloading complete update". As a result, those locales receive a complete update instead of a partial update.

See the attached brasstacks log for details (select one of the locales reported as 'complete'). An example AUS URL for a "complete" update is:
http://mozilla.mirrors.tds.net/pub/mozilla.org/firefox/releases/5.0/update/mac/sr/firefox-5.0.complete.mar

Please let me know if you need help interpreting the brasstacks logs.
Comment 1•13 years ago
Digging a bit:

http://mozmill-archive.brasstacks.mozilla.com/#/update/report/ab6cdcd346f08a21493f2982e6d4ef56 is win32 nl failing over to the complete.

https://aus3.mozilla.org/update/3/Firefox/4.0.1/20110413222027/WINNT_x86-msvc/nl/release/Windows_NT%205.1/default/default/update.xml?force=1 offers both a partial and a complete.

The mirror used for the complete was http://mozilla.c3sl.ufpr.br/releases//firefox/releases/5.0/update/win32/nl/firefox-5.0.complete.mar, which has the correct hash. We still need to see what we hit for the partial, though.
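For anyone comparing these AUS URLs across locales, here is a minimal Python sketch that splits a version-3 update URL into named fields. The field names and their order are an assumption inferred from the URL quoted above (product, version, build ID, build target, locale, channel, OS version, distribution, distribution version); `parse_aus3_url` is a hypothetical helper, not part of any Mozilla tool.

```python
from urllib.parse import urlsplit, unquote

# Field order assumed from the /update/3/ URL quoted in this bug.
AUS3_FIELDS = (
    "product", "version", "build_id", "build_target", "locale",
    "channel", "os_version", "distribution", "distribution_version",
)


def parse_aus3_url(url):
    """Split an /update/3/... URL into named fields (illustrative helper)."""
    parts = urlsplit(url).path.strip("/").split("/")
    # Expected shape: update / 3 / <nine fields> / update.xml
    if (len(parts) != 12 or parts[0] != "update" or parts[1] != "3"
            or parts[-1] != "update.xml"):
        raise ValueError("not a version-3 AUS update URL: %r" % url)
    # Percent-decoding turns "Windows_NT%205.1" into "Windows_NT 5.1".
    return dict(zip(AUS3_FIELDS, (unquote(p) for p in parts[2:-1])))
```

Calling it on the nl URL above makes it easy to confirm that a failing report and the AUS response refer to the same build target, locale and channel.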
Comment 2•13 years ago
This is kinda strange. The finalURI of the partial patch still points to our download server and not to a mirror.
Comment 3•13 years ago
Rob, can you help us in identifying what the status error 2147549183 means?
Comment 4•13 years ago
That is NS_ERROR_UNEXPECTED which is a generic error and is from nsIncrementalDownload which returns that for several cases. http://mxr.mozilla.org/mozilla-central/source/netwerk/base/src/nsIncrementalDownload.cpp Perhaps one of the networking folks can help figure out why? cc'ing Josh
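As a quick cross-check of the mapping above, this small Python sketch converts the decimal status from the update log to hex and looks it up. The table is deliberately tiny and lists only a few well-known nsresult values; `describe_nsresult` is a hypothetical helper name.

```python
# Only a handful of well-known nsresult values; this is not a full table.
KNOWN_NSRESULTS = {
    0x00000000: "NS_OK",
    0x80004005: "NS_ERROR_FAILURE",
    0x8000FFFF: "NS_ERROR_UNEXPECTED",
}


def describe_nsresult(status):
    """Render a decimal nsresult as hex plus a name, if the name is known."""
    code = int(status) & 0xFFFFFFFF
    # The high bit of an nsresult marks a failure code.
    failed = bool(code & 0x80000000)
    name = KNOWN_NSRESULTS.get(code, "unknown")
    return "0x%08X %s (%s)" % (code, name, "failure" if failed else "success")
```

For the status in this bug, `describe_nsresult(2147549183)` gives `0x8000FFFF NS_ERROR_UNEXPECTED (failure)`.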
Comment 5•13 years ago
Would also be handy if you can post in this bug whether you can reproduce manually.
Comment 6•13 years ago
Can we tell if it's a problem talking to download.m.o, or the actual mirror ?
Comment 7•13 years ago
(In reply to comment #5)
> Would also be handy if you can post in this bug whether you can reproduce
> manually.

It only failed a couple of times across all of the tests we ran, so no, I haven't tried that yet. But is there a way to keep NSPR from recreating the log file on each start of Firefox? If that's possible, I could re-run our automation and log all HTTP request/response headers.
Comment 8•13 years ago
(In reply to comment #6)
> Can we tell if it's a problem talking to download.m.o, or the actual mirror ?

I highly suspect it is download.m.o, since the URL is http://download.mozilla.org/?product=firefox-5.0-partial-4.0.1&os=osx&lang=sr&force=1

(In reply to comment #7)
> (In reply to comment #5)
> > Would also be handy if you can post in this bug whether you can reproduce
> > manually.
>
> It only failed a couple of times for all of the tests we ran. So no, I
> haven't tried that yet. But is there a way to let NSPR not recreate the log
> file for each start of Firefox? If that's possible I could re-run our
> automation and log all HTTP request/response headers.

I'm going to leave that to someone on the networking team.
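To make the download.m.o URL easier to read when comparing failing locales, here is a minimal Python sketch that pulls out its query parameters. The `<app>-<to>-partial-<from>` reading of the `product` value is an assumption based only on the URLs quoted in this bug, and `parse_bouncer_url` is a hypothetical helper.

```python
from urllib.parse import urlsplit, parse_qs


def parse_bouncer_url(url):
    """Extract product, os and lang from a download.mozilla.org URL."""
    query = parse_qs(urlsplit(url).query)
    product = query["product"][0]
    info = {"product": product, "os": query["os"][0], "lang": query["lang"][0]}
    # Naming assumed from this bug: "<app>-<to version>-partial-<from version>".
    head, sep, from_version = product.partition("-partial-")
    info["kind"] = "partial" if sep else "unknown"
    if sep:
        info["from_version"] = from_version
        info["to_version"] = head.split("-", 1)[1]
    return info
```

Anything that does not contain `-partial-` is reported as `unknown` rather than guessed at.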
Comment 10•13 years ago
> is there a way to let NSPR not recreate the log
> file for each start of Firefox?

Not by default. If you apply this patch and set "NSPR_LOG_MODULES=nsHttp:5,notrunc" in your environment, you should append to one big log file:

https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=534764&attachment=485332

More generally, I don't understand how partial updates use necko requests well enough to have an idea of what's broken here, and I've never personally waded into nsIncrementalDownload.cpp, but I'm happy to try to be of more use if someone can clue me in on what's going wrong (the load of an incremental/partial update is failing, but only for certain locales, and rarely enough that we can't capture it in a debugger? Sounds like fun so far :)
Comment 11•13 years ago
(In reply to comment #10)
> Not by default. If you apply this patch, and set
> "NSPR_LOG_MODULES=nsHttp:5,notrunc" in your environment, you should append
> to one big log file:
>
> https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=534764&attachment=485332

I can't patch the builds I'm testing with Mozmill, so it would be nice to get this checked in at some point. I will re-run those update tests now and simply check whether the failures could be related to a massive number of requests, like the one we had yesterday.
Comment 12•13 years ago
There were 25 failures out of 248 locales in the update checks that led to this bug (see the URL).

I ran 500 requests against download.m.o (using curl) and there were only 302 responses: no timeouts or other errors that I could see. That hit both the Phoenix and San Jose datacenters, 250 each, and according to the X-Backend-Server header it hit pp-app-dist01..09 and pm-app-dist01..08. So it's not an issue now, but perhaps it was before. It will be interesting to see if the problem happens again. If it does, I'll bet it's different locales.

What do you mean by 'massive amount of requests', whimboo? There's a pretty high background level of update checks all day long, so it takes a lot of press/public awareness to raise the request rate significantly by manual checks.

It might be possible for the machines serving download.m.o to get very busy due to other work. Perhaps mrz can suggest someone who can comment on that.
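The check described above was done with curl; a rough Python equivalent might look like the sketch below. It issues requests without following redirects and tallies status codes and the X-Backend-Server header. The function names are hypothetical, and the `fetch` parameter exists only so the tally logic can be exercised without touching the network.

```python
import http.client
from collections import Counter
from urllib.parse import urlsplit


def fetch_once(url, timeout=10):
    """Issue one plain-HTTP GET without following redirects."""
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        return resp.status, resp.getheader("X-Backend-Server")
    finally:
        conn.close()


def tally(url, count, fetch=fetch_once):
    """Count status codes (or exception names) and backend servers."""
    statuses, backends = Counter(), Counter()
    for _ in range(count):
        try:
            status, backend = fetch(url)
        except (OSError, http.client.HTTPException) as exc:
            # Timeouts and connection errors are tallied by exception type.
            statuses[type(exc).__name__] += 1
            continue
        statuses[status] += 1
        backends[backend] += 1
    return statuses, backends
```

A run such as `tally(url, 500)` against the partial's download URL would show at a glance whether anything other than 302 comes back, and which backends answered.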
Comment 13•13 years ago
(In reply to comment #12)
> What do you mean by 'massive amount of requests' whimboo ? There's a pretty
> high background level of update checks all day long, so it takes a lot of

We don't run those tests outside the usual release testing work. It's probably something we should do, to check whether things like this also happen when we are not pushing a new release to the public. I remember that we have seen this issue in the past but never reported it as a bug. A day after the release everything was fine, which is what you noticed when running your own tests.

My current test run is still active, but so far I can't see this issue:
http://mozmill-archive.brasstacks.mozilla.com/#/update/detail?branch=5.0&channel=release&from=2011-06-22&to=2011-06-22&target=5.0
Comment 14•13 years ago
Everything works now. I really have the impression it's related to our release days.
Comment 15•13 years ago
Sounds like we've done all the debugging we can here :(.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WORKSFORME
Comment 16•13 years ago
Whimboo really wants this fixed before the next release, re-opening.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Comment 17•13 years ago
IT, we would appreciate some input from your side. We were seeing odd behaviour from some update attempts that QA did yesterday. Specifically:

* Firefox checks for an update, receives a response from AUS like this: https://aus3.mozilla.org/update/3/Firefox/4.0.1/20110413222027/WINNT_x86-msvc/sr/release/Windows_NT%205.1/default/default/update.xml?force=1
* Firefox attempts to download the partial (http://download.mozilla.org/?product=firefox-5.0-partial-4.0.1&os=win&lang=son&force=1), but doesn't appear to get redirected properly (the "final URI spec" below should be a URL to the requested MAR on a mirror, not still download.m.o):

*** AUS:SVC Downloader:downloadUpdate - downloading from http://download.mozilla.org/?product=firefox-5.0-partial-4.0.1&os=osx&lang=sr&force=1 to /private/var/folders/rR/rRPQYt0bGFSKcIpcKmYxRU+++TM/-Tmp-/tmpvUx87Q.binary/Firefox.app/Contents/MacOS/updates/0/update.mar
*** AUS:SVC Downloader:onStartRequest - original URI spec: http://download.mozilla.org/?product=firefox-5.0-partial-4.0.1&os=osx&lang=sr&force=1, final URI spec: http://download.mozilla.org/?product=firefox-5.0-partial-4.0.1&os=osx&lang=sr&force=1
*** AUS:SVC Downloader:onStopRequest - original URI spec: http://download.mozilla.org/?product=firefox-5.0-partial-4.0.1&os=osx&lang=sr&force=1, final URI spec: http://download.mozilla.org/?product=firefox-5.0-partial-4.0.1&os=osx&lang=sr&force=1, status: 2147549183

Unfortunately, we don't have full HTTP headers, as Firefox doesn't log them during updates.

Do we know of any download.m.o machines that were acting up yesterday? Could they act up in such a way that they would not redirect a request, under heavy load or other conditions experienced yesterday?
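The symptom in the log (original and final URI spec identical) can be picked out of a larger log mechanically. The following Python sketch assumes the line layout is exactly as in the AUS:SVC excerpt in this comment; the regular expression and the `find_unredirected` helper are illustrative, not part of the updater.

```python
import re

# Pattern assumed from the "AUS:SVC Downloader" lines quoted in this bug.
LINE_RE = re.compile(
    r"Downloader:(?P<event>\w+) - original URI spec: (?P<original>\S+), "
    r"final URI spec: (?P<final>\S+?)(?:, status: (?P<status>\d+))?$"
)


def find_unredirected(lines):
    """Return (event, url, status) for lines whose final URI equals the original."""
    hits = []
    for line in lines:
        match = LINE_RE.search(line.strip())
        if match and match.group("original") == match.group("final"):
            status = match.group("status")
            hits.append((match.group("event"), match.group("final"),
                         int(status) if status else None))
    return hits
```

Lines where the final URI spec points at a mirror are skipped, so whatever remains is a download that was never redirected.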
Assignee: nobody → server-ops-releng
Component: Release Engineering → Server Operations: RelEng
QA Contact: release → zandr
Comment 18•13 years ago
If there are no mirrors available serving the file in question, you will get a page that suggests downloading from releases.mozilla.org, with a manual link on it. The only time you'll get that is if bouncer is in good working condition and doesn't think there are any mirrors available that are serving the file in question. If bouncer is over capacity, you'd get redirected to status.mozilla.com (for the hardhat page).
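The three outcomes described here could be told apart in a test harness roughly as follows. This is a sketch under stated assumptions: that a normal response is an HTTP redirect to a mirror, that the over-capacity case is a redirect whose Location host is status.mozilla.com, and that the "no mirrors" page is served directly with a 200. None of those status codes are confirmed in this bug, and `classify_bouncer_response` is a hypothetical name.

```python
from urllib.parse import urlsplit


def classify_bouncer_response(status, location=None):
    """Map a download.m.o response onto the outcomes described above."""
    if status in (301, 302, 303, 307) and location:
        host = urlsplit(location).hostname or ""
        if host == "status.mozilla.com":
            # Redirect to the hardhat page: bouncer is over capacity.
            return "over-capacity"
        return "mirror-redirect"
    if status == 200:
        # Assumption: the "no mirrors" page is served directly, not via redirect.
        return "no-mirror-page"
    return "other"
```

Under these assumptions, an updater that receives the "no mirrors" page would see no redirect at all, which matches a final URI spec that still points at download.m.o.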
Comment 19•13 years ago
(In reply to comment #18)
> If there are no mirrors available serving the file in question, you will get
> a page that suggests downloading from releases.mozilla.org with a manual
> link on it. The only time you'll get that is if bouncer is in good working
> condition and doesn't think there's any mirrors available that are serving
> the file in question. If bouncer is over capacity you'd get redirected to
> status.mozilla.com (for the hardhat page)

Sounds to me like we're hitting the "no mirrors available" state, given that it only happens right after a major release, and we never get redirected. Is there any way we can force Bouncer to always serve files to our own machines? If not, this sounds like CANTFIX to me.
Comment 20•13 years ago
I suspect it would be difficult to implement, but if bouncer knows that there are no mirrors it can redirect the user to, then AUS could use this information and not offer an update in that case.
Comment 21•13 years ago
btw: that wouldn't help mozmill but it would help the users.
Comment 22•13 years ago
re: comment 11: Henrik, that patch is for a bug that got fixed another way, so it's not planned to land. But if it would be useful for you for other purposes to have an append mode for NSPR logs we can open a new bug for it (use component NSPR and CC me).
Comment 23•13 years ago
Thanks Jason. I have filed bug 666376.
Comment 24•13 years ago
Henrik rebooted the machine that had issues (qa-horus). When he reran the tests, the problem didn't occur again. If we spot specific problems with mirrors on the day we release, then let's file them.
Status: REOPENED → RESOLVED
Closed: 13 years ago → 13 years ago
Resolution: --- → WORKSFORME
Comment 25•13 years ago
(In reply to comment #24)
> Henrik rebooted the machine that had issues (qa-horus). When he reran the
> tests the problem didn't occur again. If we spot specific problems with
> mirrors on the day we release then lets file them.

No, we are not talking about the same issue here. The restart fixed a different issue that I noticed but that we never filed as a bug. The failures in this bug were also seen on another machine.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Updated•13 years ago
Assignee: server-ops-releng → server-ops
Component: Server Operations: RelEng → Server Operations
QA Contact: zandr → mrz
Comment 26•13 years ago
Is this still an issue?
Comment 27•13 years ago
Resolving as incomplete, since I'm not clear on whether there is an action IT can take right now on it.
Status: REOPENED → RESOLVED
Closed: 13 years ago → 13 years ago
Resolution: --- → INCOMPLETE
Comment 28•13 years ago
Now that bug 666376 is fixed in Firefox 9, we could revisit this bug once we release Firefox 9, if the same issue happens again. Anthony, if you see this issue again once we reach that release, please immediately run an update test via our testrun_update.py script after setting the following environment variables:

export NSPR_LOG_MODULES=nsHttp:5,append
export NSPR_LOG_FILE=log.txt

I'm leaving this bug closed for now.
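If the test run is launched from Python rather than from a shell, the same two variables can be prepared as in this minimal sketch. `nspr_logging_env` is a hypothetical helper; the module list is written without a space after the comma, since a space would split the value in a shell `export`. The arguments that testrun_update.py expects are not shown in this bug, so none are invented here.

```python
import os


def nspr_logging_env(log_file="log.txt", base_env=None):
    """Return a copy of the environment with NSPR HTTP logging enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # "append" keeps one growing log across Firefox restarts; per the comment
    # above, this relies on bug 666376 (Firefox 9 or later).
    env["NSPR_LOG_MODULES"] = "nsHttp:5,append"
    env["NSPR_LOG_FILE"] = log_file
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting the update test run.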
Updated•9 years ago
Product: mozilla.org → mozilla.org Graveyard