Closed
Bug 1361255
Opened 7 years ago
Closed 7 years ago
Error stashed as MAR in 54.0b4 for "te" locale in Mac updates makes final verification fail
Categories
(Release Engineering :: Release Automation: Other, enhancement)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: mtabara, Assigned: mtabara)
References
Details
The final update beta verification task [1] failed in 54.0b4 due to a size mismatch in the "te" locale for Mac updates. On closer inspection, it seems we somehow packaged an Internal Error message into a PMAR somewhere along the way; see the file at [2]. The size was quite suspicious, as there is no way we can get a partial MAR file of just 202 bytes. Debugging upstream to see where the problem might have occurred (funsize, signing, beetmoving, etc.).

[1]: https://tools.taskcluster.net/task-inspector/#FlVunno9RIWwbB0kwvzZsQ/
[2]: http://archive.mozilla.org/pub/firefox/releases/54.0b4/update/mac/te/firefox-54.0b3-54.0b4.partial.mar
Assignee
Comment 1•7 years ago
@nthomas> bucketlister-delivery.prod.mozaws.net agrees, so bad beetmove or generation ?
12:07:09 <@nthomas> oopsie, we got an error and stashed it as the mar file - https://pastebin.mozilla.org/9020492
12:09:51 <@aki> is te new?
12:10:31 <@aki> looks like no
12:12:12 <@nthomas> generation looks ok in http://mozilla-release-logs.s3.amazonaws.com/mozilla-beta/firefox-54.0b4/build1/%5Bfunsize%5D_Update_generating_task_macosx64_chunk_9_for_54.0b3-macosx64-97sIkDioQi6F9D49ajbctw-0
12:12:23 <@nthomas> right at the very end
12:15:15 <@nthomas> artifact on https://tools.taskcluster.net/task-inspector/#97sIkDioQi6F9D49ajbctw/0 is 6414732 bytes
12:16:05 <@nthomas> beetmover log is http://mozilla-release-logs.s3.amazonaws.com/mozilla-beta/firefox-54.0b4/build1/%5Bbeetmover%5D_firefox_mozilla-beta_macosx64_locales_partials_candidates_9_10-macosx64-dTQiLyrRRjqih2WWFjpThw-0
12:17:22 <@nthomas> https://irccloud.mozilla.com/pastebin/DoosrNq0/
12:17:48 <@nthomas> I’m guessing there is an api call which got an error 282 bytes long there, which the code missed
12:20:31 <@nthomas> s/api call/download/, since we need a size and to calculate various hashes
12:23:34 <@nthomas> which beetmover code am I looking at again ? not https://github.com/mozilla-releng/beetmoverscript IIRC
12:23:45 <~mtabara> yep, not that
12:23:51 <~mtabara> that's nightly/fennec only
12:24:11 <~mtabara> https://hg.mozilla.org/mozilla-central/file/tip/testing/mozharness/scripts/release/beet_mover.py
12:24:20 <@nthomas> ah, thanks
12:25:11 <@nthomas> https://irccloud.mozilla.com/pastebin/w76UgM7c/
12:25:46 <@nthomas> ah, I forgot about signing, duh
12:25:51 <~mtabara> the error stashed as mar is pretty scary
12:26:33 <gchang|afk> mtabara: Hi, Is the error going to impact 54.0b4?
12:28:37 <~mtabara> it's only impacting "te" locale for Mac users but I'd hold on from publishing this to beta until we've understood what's happened. otherwise, we'd be serving failed partials to users, if my understanding is right. however, we still have a good chunk of hours until QE signs this off so hopefully we've solved it by then
12:34:34 * ~mtabara files 1361255 to track that
12:36:12 <@nthomas> the size of public/env/firefox-54.0b3-54.0b4.te.mac.partial.mar on https://tools.taskcluster.net/task-inspector/#uqIErLdASHCTdLKOe81CeQ/0 looks right (6414996 bytes)
12:36:53 <~mtabara> since balrog has the right size, it means funsize generation + submission must have worked so beetmover must be the culprit
12:37:03 * ~mtabara reads scrollback again on ntho.mas's findings
12:38:31 <@nthomas> that’s pretty much where I got to
12:39:00 <@nthomas> the last pastebin is from the beetmover log

01:22:58 INFO - Downloading https://queue.taskcluster.net/v1/task/uqIErLdASHCTdLKOe81CeQ/artifacts/public/env/firefox-54.0b3-54.0b4.te.mac.partial.mar to /mozharness/build/firefox-54.0b3-54.0b4.te.mac.partial.mar
01:22:58 INFO - retry: Calling _download_file with args: (), kwargs: {'url': 'https://queue.taskcluster.net/v1/task/uqIErLdASHCTdLKOe81CeQ/artifacts/public/env/firefox-54.0b3-54.0b4.te.mac.partial.mar', 'file_name': '/mozharness/build/firefox-54.0b3-54.0b4.te.mac.partial.mar'}, attempt #1
01:23:08 INFO - Downloaded 282 bytes.

tl;dr - the file in TC is good, but beetmover most likely fails to download it and stashes the error as the MAR.

12:43:55 <@nthomas> https://dxr.mozilla.org/mozilla-beta/source/testing/mozharness/mozharness/base/script.py#699 I guess, and eventually _download_file
12:44:32 <@nthomas> if we got an error without an error for http status code, and no content-length, we could get here
12:44:47 <@nthomas> kinda hard to tell what happened tbh
12:46:53 <@nthomas> the ‘content’ of the mar does look like an Amazon message, with a RequestId and all
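To illustrate the failure mode discussed above (an error body accepted as if it were the artifact), here is a minimal sketch of the kind of sanity check a download step could apply before keeping a file. It is not the mozharness `_download_file` code; `check_download`, `DownloadError` and the plain-dict `headers` argument are hypothetical names for this example. It assumes a MAR archive begins with the four bytes `MAR1`.

```python
class DownloadError(Exception):
    """Raised when a downloaded artifact does not look like a real MAR."""


MAR_MAGIC = b"MAR1"  # assumed: first four bytes of a MAR archive


def check_download(status_code, headers, body):
    """Return the body size if the response looks sane, else raise DownloadError.

    Hypothetical helper: `headers` is assumed to be a plain dict of
    response headers, `body` the raw bytes that were read.
    """
    if status_code != 200:
        raise DownloadError("unexpected HTTP status %d" % status_code)

    declared = headers.get("Content-Length")
    if declared is not None and int(declared) != len(body):
        raise DownloadError(
            "size mismatch: header says %s, got %d bytes" % (declared, len(body))
        )

    # Catches an error document served with a success status and no
    # usable Content-Length, the case guessed at in the chat above.
    if not body.startswith(MAR_MAGIC):
        raise DownloadError("payload does not start with the MAR magic bytes")

    return len(body)
```

Raising instead of returning lets a retry wrapper treat a bad payload like any other failed attempt, rather than passing a short error document on to upload.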
Assignee
Comment 2•7 years ago
So if we ship this, Firefox updates would get:
https://aus5.mozilla.org/update/3/Firefox/53.0b2/20170427091925/Darwin_x86_64-gcc3-u-i386-x86_64/te/beta-cdntest/default/default/default/update.xml?force=1

Possible solutions:
* ignore - most likely it'd default over to the CMAR
* we could add a rule to block mac te until 53.0b5
* remove files from S3 (both candidates/releases) and rerun beetmover jobs + invalidate CDN caches via a bug filed to CloudOps

Discussing with nthomas what's to be done here.
Assignee
Comment 3•7 years ago
(In reply to Mihai Tabara [:mtabara]⌚️GMT+8 from comment #2)
> So if we ship this, Firefox updates would get:
> https://aus5.mozilla.org/update/3/Firefox/53.0b2/20170427091925/
> Darwin_x86_64-gcc3-u-i386-x86_64/te/beta-cdntest/default/default/default/
> update.xml?force=1
>
> Possible solutions:
> * ignore - most likely it'd default over to the CMAR
> * we could add a rule to block mac te until 53.0b5
> * remove files from S3 (both candidates/releases) and rerun beetmover jobs +
> invalidate CDN caches via a bug filed to CloudOps
>
> Discussing with nthomas what's to be done here.

In the end we updated https://aus4-admin.mozilla.org/releases#firefox-54.0b4 and deleted the partial information for the Mac "te" locale. Users in that pool will be offered the complete MAR instead. Update verify will likely fail, but hopefully final update beta verification will work.

mihaitabara@mozspace:[]~/Downloads$ diff Firefox-54.0b4-build1.json.backup Firefox-54.0b4-build1.json
2577,2581d2576
< "filesize": 6414996,
< "from": "Firefox-54.0b3-build1",
< "hashValue": "7f0cd71a8f038e9e8fe0ba36281bdf34f916a5a086c18b838da4157c7e8c7965271bfadb474e2866281e8e75845ab7eb6d899ebd7d44e0cb7eb2b9eecd943725"
< },
< {
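The blob edit above was done by hand through the admin UI. As a rough sketch of the same change made programmatically, the helper below drops one partial entry from a release blob loaded as a Python dict. The nesting `platforms -> <platform> -> locales -> <locale> -> partials` is an assumption about the blob layout, inferred only from the three keys visible in the diff; `drop_partial` is a hypothetical name, not a Balrog API.

```python
import copy


def drop_partial(blob, platform, locale, from_release):
    """Return a copy of `blob` without the partial coming from `from_release`.

    Assumed layout (not confirmed by the bug):
    blob["platforms"][platform]["locales"][locale]["partials"] is a list
    of dicts with "filesize", "from" and "hashValue" keys.
    """
    edited = copy.deepcopy(blob)  # leave the caller's blob untouched
    locale_data = edited["platforms"][platform]["locales"][locale]
    locale_data["partials"] = [
        entry
        for entry in locale_data.get("partials", [])
        if entry.get("from") != from_release
    ]
    return edited
```

Working on a deep copy keeps the original around, much like the `.backup` file used for the diff above.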
Assignee
Comment 4•7 years ago
The rerun of final verification fails because it expects a partial MAR for 54.0b3 -> 54.0b4. That's fine; we can ignore it.
https://public-artifacts.taskcluster.net/FlVunno9RIWwbB0kwvzZsQ/6/public/logs/live_backing.log
Assignee
Comment 5•7 years ago
Note to self: after talking to rail about this today, it turns out :nthomas was right. The Firefox updater is smart enough to find a good reason to reject the corrupt partial MAR anyway (a failing signature, the size, the hash, or something else). So had we taken no action, we would have reached the same end result, only on the user side rather than through the Balrog tweak we eventually made.
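As a rough illustration of the size and hash checks mentioned above, the sketch below compares downloaded bytes against the `filesize` and `hashValue` fields of an update entry and falls back to the complete MAR on any mismatch. This is not the Firefox updater's actual logic; `pick_update` and `matches_metadata` are hypothetical names, and treating `hashValue` as a SHA-512 hex digest is an assumption (the 128-character value in the blob diff has the right length for one).

```python
import hashlib


def matches_metadata(data, entry):
    """True if `data` has the size and SHA-512 digest recorded in `entry`."""
    if len(data) != entry["filesize"]:
        return False
    # Assumption: hashValue is a hex-encoded SHA-512 digest.
    return hashlib.sha512(data).hexdigest() == entry["hashValue"].lower()


def pick_update(partial_data, partial_entry):
    """Return "partial" if the download checks out, otherwise "complete"."""
    if matches_metadata(partial_data, partial_entry):
        return "partial"
    return "complete"
```

A few hundred bytes of error text fails the size comparison against a recorded `filesize` of 6414996 before the hash is even computed.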
Comment 6•7 years ago
A possible fix is in bug 1361878.
Assignee
Comment 7•7 years ago
The corrupt PMAR was not nearly as bad as initially thought: the Firefox updater will most likely find a reason to reject the partial and fall back to the complete MAR anyway (size, signature, SHA, etc.). Closing this for now, as the automation fix to prevent this in the future is tracked in bug 1361878.
Assignee: nobody → mtabara
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
See Also: → 1361878