Closed Bug 816470 Opened 12 years ago Closed 11 years ago

Software updates are failing because downloaded data is larger as patch size

Categories

(Mozilla QA Graveyard :: Mozmill Tests, defect, P2)

Tracking

(firefox27 fixed)

RESOLVED FIXED
Tracking Status
firefox27 --- fixed

People

(Reporter: whimboo, Assigned: whimboo)

References

Details

(Whiteboard: [mozmill-test-failure][mozmill-update])

Recently we have seen new failures for update tests like:

http://mozmill-ci.blargon7.com/#/update/report/352218d7e3125c857fc1d3716713d32d

The failure we get is: "An update has been found."

Reason for is that the downloaded patch size is larger as the size given by the update snippet. That causes an abort of the update process. I will file a bug for Firefox immediately.
Depends on: 816472
I'm taking this given that more work is necessary in underlying code to further investigate the problem.
Assignee: nobody → hskupin
Status: NEW → ASSIGNED
Happened again today with Firefox 23.0a1 de on Windows NT 6.1.7601:
http://mozmill-ci.blargon7.com/#/update/report/740ab23e062758ae8a240b263d064cbd
(In reply to mario garbi from comment #4)
> Happened again today with Firefox 23.0a1 de on Windows NT 6.1.7601:
> http://mozmill-ci.blargon7.com/#/update/report/
> 740ab23e062758ae8a240b263d064cbd

Mario, the Jenkins link would be much more helpful here.
(In reply to mario garbi from comment #6)
> View the build in Jenkins:
> http://mm-ci-master.qa.scl3.mozilla.com:8080/job/mozilla-central_update/2158/

There is nothing in that log which indicates that this is this bug. All patches have been downloaded correctly.
There was indeed an issue today again but it is:
http://10.22.73.128:8080/job/mozilla-central_update/2201/console
Happened twice just now, with Nightly on windows 8 and xp. The error shown in mozmill-ci is "Update successfully downloaded. - 'errors' should not equal 'errors'", but in console log I see the AUS line with "higher than patch size".

http://mm-ci-master.qa.scl3.mozilla.com:8080/job/mozilla-central_update/2906/
http://mm-ci-master.qa.scl3.mozilla.com:8080/job/mozilla-central_update/2909/
(In reply to Andreea Matei [:AndreeaMatei] from comment #10)
> Happened twice just now, with Nightly on windows 8 and xp. The error shown
> in mozmill-ci is "Update successfully downloaded. - 'errors' should not
> equal 'errors'", but in console log I see the AUS line with "higher than
> patch size".

That's not fully true. The first listed job is not triggered by that failure. It only shows that the integrity is broken of the update:

*** AUS:SVC getStatusTextFromCode - transfer error: Die Integrität des Updates konnte nicht sichergestellt werden, code: verification_failed
You're right, I must have opened them wrong. Do you know if we already have a bug for that failure?
It might be related to the other issue. so i wouldn't file a bug on it yet.
Closing as WFM given that we haven't seen this failure since April.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → WORKSFORME
Happened again today on Windows NT 5.1.2600 Service Pack 3 (x86) with Aurora (24.0a2, en-US, 20130804004006):

Dashboard:
http://mozmill-daily.blargon7.com/#/update/report/bccc2ea9ac1e478fa074c699d84ca39a
Jenkins Console:
http://mm-ci-master.qa.scl3.mozilla.com:8080/job/mozilla-aurora_update/3705/console

*** AUS:SVC Downloader:onProgress - progress: 27900000/28263332
*** AUS:SVC Downloader:onProgress - progress: 28200000/28263332
*** AUS:SVC Downloader:onProgress - progress: 28500000/28263332
*** AUS:SVC Downloader:onProgress - progress: 28500000 is higher than patch size: 28263332
*** AUS:SVC Downloader: cancel
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Given these messages are server-side and not from Mozmill itself I think there might be a Firefox or RelEng bug here somewhere. Although this is extremely intermittent, need-infoing nthomas to see if he might have any ideas as to why this might happen.
Flags: needinfo?(nthomas)
I don't really understand how Mozmill works, so indulge a wild guess based on http://mm-ci-master.qa.scl3.mozilla.com:8080/job/mozilla-aurora_update/3705/console.

- it starts with a (currently) one day old nightly, 2013-08-04
- the update query is https://aus3.mozilla.org/update/3/Firefox/24.0a2/20130804004006/WINNT_x86-msvc/en-US/aurora/Windows_NT%205.1.3.0%20(x86)/default/default/update.xml?force=1
- that returns (for me, right now):
<?xml version="1.0"?>
<updates>
    <update type="minor" displayVersion="24.0a2" appVersion="24.0a2" platformVersion="24.0a2" buildID="20130805004006">
        <patch type="complete" URL="http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2013/08/2013-08-05-00-40-06-mozilla-aurora/firefox-24.0a2.en-US.win32.complete.mar" hashFunction="sha512" hashValue="feca2cf253a73c44ea4ce606d23f993f89225f6e7b0980948f8ef7687b57f1a9c68197dd6757c65f401e8313c87c4402edeb614e37ebcb9f3280e5c3202ad3d8" size="28263332"/>
        <patch type="partial" URL="http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2013/08/2013-08-05-00-40-06-mozilla-aurora/firefox-24.0a2.en-US.win32.partial.20130804004006-20130805004006.mar" hashFunction="sha512" hashValue="fd286ef354023f36abe56d6e679cd25f8a600941e60f1456187889ecf760e1d3955665e49ed201942b0d997de13595d15131d0e0dfef7addc7086784fcca93e4" size="3478289"/>
    </update>
</updates>
- the partial update is successfully downloaded, size & hash ok
- update.status set to 'failed: 6', app starts downloading the complete
- the download is interrupted after 300000 bytes:
*** AUS:SVC Downloader:onProgress - progress: 82389/28263332
*** AUS:UI gUpdates:onLoad - setting current page to startpage errorpatching
*** AUS:SVC Downloader:onProgress - progress: 300000/28263332
...
*** AUS:SVC Downloader: cancel
- then the d/l is started up again
*** AUS:SVC Downloader:onProgress - progress: 382405/28263332
*** AUS:SVC Downloader:onProgress - progress: 600000/28263332
...
- when it gets to the end it fails too big
*** AUS:SVC Downloader:onProgress - progress: 28500000/28263332
*** AUS:SVC Downloader:onProgress - progress: 28500000 is higher than patch size: 28263332

I think you need to confirm that the download is being interrupted 'gently', so that the update state and the file size are consistent. If part of a chunk is being written to the disk but some of that is re-requested you'd get this problem. Perhaps logging can be added for the Range requests the downloader is doing, and mozmill can check the file size on disk.
Flags: needinfo?(nthomas)
This is failing all over the place with our current ondemand update testruns for 24b1.
What turned out to be problem here ? Was it fixed by Rail regenerating the snippets ? If so, what needed to be corrected ?
I generated the snippets using build#1 instead of build#2. As a result the size and hash were set wrong.
In the future please move all discussions about that problem over to bug 816472. At least as long as we do not know if it is a Mozmill test issue or not.
This failed again yesterday on Windows 8 x86 with Nightly en-US
http://mozmill-daily.blargon7.com/#/update/report/26d5854914a68aa13b333f28152e365c

Jenkins Log 
*** AUS:SVC Downloader:onProgress - progress: 33000000/34294845
*** AUS:SVC Downloader:onProgress - progress: 33300000/34294845
*** AUS:SVC Downloader:onProgress - progress: 33600000/34294845
*** AUS:SVC Downloader:onProgress - progress: 33900000/34294845
*** AUS:SVC Downloader:onProgress - progress: 34200000/34294845
*** AUS:SVC Downloader:onProgress - progress: 34500000/34294845
*** AUS:SVC Downloader:onProgress - progress: 34500000 is higher than patch size: 34294845
*** AUS:SVC Downloader: cancel
*** AUS:SVC Downloader:onStopRequest - original URI spec: http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2013/09/2013-09-10-03-02-54-mozilla-central/firefox-26.0a1.en-US.win32.complete.mar, final URI spec: http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2013/09/2013-09-10-03-02-54-mozilla-central/firefox-26.0a1.en-US.win32.complete.mar, status: 2147549183
*** AUS:SVC Downloader:onStopRequest - status: 2147549183, current fail: 0, max fail: 10, retryTimeout: 2000
*** AUS:SVC Downloader:onStopRequest - non-verification failure
*** AUS:SVC getStatusTextFromCode - transfer error: Failed (unknown reason), default code: 2152398849
*** AUS:SVC Downloader:onStopRequest - setting state to: download-failed
*** AUS:UI gErrorsPage:onPageShow - update.statusText: Failed (unknown reason)
*** AUS:SVC Downloader:onStopRequest - all update patch downloads failed
TEST-UNEXPECTED-FAIL | c:\users\mozauto\appdata\local\temp\tmpnmhsse.mozmill-tests\tests\update\testFallbackUpdate\test3.js | test3.js::testFallbackUpdate_ErrorPatching
It didn't reproduced locally with this build, I will investigate more this failure.
(In reply to Cosmin Malutan from comment #23)
> It didn't reproduced locally with this build, I will investigate more this
> failure.

As mentioned on IRC and during the update test failure training session, such failures have to be investigated immediately. So I'm not surprised it doesn't reproduce. At least please check if another build has been triggered and appeared at that time.
The underlying issue should be fixed for Nightly now. Not sure yet if we will also get this backported to Aurora. I will close this bug appropriately.
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → FIXED
Product: Mozilla QA → Mozilla QA Graveyard
You need to log in before you can comment on or make changes to this bug.