Closed Bug 1159902 Opened 10 years ago Closed 10 years ago

Please deploy onyx 1.4.1 to production

Categories

(Content Services Graveyard :: Tiles: Ops, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: kthiessen, Assigned: mostlygeek)

Github SHA: 1679268b6027f7bda05e0208d53e798157c0fd86
Changelog: https://github.com/mozilla/onyx/blob/1.4.1/CHANGELOG.md
1.4.1
* Heartbeat API endpoint
* external api test script for validating fetch api
deploying onyx canary
fetch api test passes on canary
end to end test completed successfully, growing stack
new stack filled to 11 hosts, scaling old stack to 0 hosts
old stack scaled down
api test against prod elb is passing
end to end test against prod elb is running
From IRC:
[16:17:11] <Mardak> oyiptong: something doesn’t look right with the v3 production
[16:17:21] <Mardak> https://tiles.services.mozilla.com/v3/links/fetch/en-US/nightly https://tiles.cdn.mozilla.net/desktop/US/en-US.eb1709191ca98387cb242e3e68d8010c75cd163e.ag.json
[16:17:46] <Mardak> there’s a lot of repeat tiles
[16:18:27] <Mardak> oyiptong: they’re actually tiles from other locales
[16:18:47] <Mardak> that US/en-US json has "title": "Comunidade Mozilla"
[16:18:48] <oyiptong> hmm
[16:18:49] <oyiptong> ok
[16:19:08] <oyiptong> taking a look
[16:19:33] <Mardak> there’s 7 tiles with directoryId”: 786
[16:19:50] <Mardak> looks like the same tile just repeated
[16:20:20] <oyiptong> relud is rolling back onyx
[16:20:40] <oyiptong> the servers should still serve the previous tile set
[16:20:50] <Mardak> fyi, v2 also has the repeated tiles https://tiles.cdn.mozilla.net/desktop/US/en-US.42febc943bdff7a5c4f14d24bee781f57e958abd.json
[16:24:02] <oyiptong> the rollback should make onyx serve the old tile index
[16:24:11] <oyiptong> so, that will give us time to fix the problem
[16:24:27] <oyiptong> the problem is in splice though
[16:24:57] <Mardak> ok https://tiles.services.mozilla.com/v2/links/fetch/en-US is redirecting to https://tiles.cdn.mozilla.net/desktop/US/en-US.acd7a476101251ecbc418b5f9932c7428e988a46.json with correct tiles
[16:27:13] <oyiptong> working on fixing splice now
[16:27:32] <Mardak> oyiptong: do you need the distribution json to test?
[16:28:08] <oyiptong> i have one
[16:28:20] <oyiptong> https://tiles-resources-prod-tiless3-qbv71djahz3b.s3.amazonaws.com/distributions/desktop/5e0772cf68072c5e6bb3026b94eaba8e4348f5d2.2015-04-29T21-45-56.988422.json
[16:28:21] <Mardak> ok the live tiles should have been https://github.com/mozilla/tiles-data/blob/0ee9ef536c13c19c56924a8d9233ab3511723a6e/deployed/2015-04-28.json
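A check for the symptom described in the IRC log (the same directoryId appearing several times in one locale's tile list) could look roughly like the sketch below. It assumes the tiles for a locale are available as a list of dicts carrying a `directoryId` key; the function name and the sample data are hypothetical and are not taken from splice or onyx.

```python
from collections import Counter


def repeated_directory_ids(tiles):
    """Return {directoryId: count} for every id that appears more than once."""
    counts = Counter(t["directoryId"] for t in tiles if "directoryId" in t)
    return {tile_id: n for tile_id, n in counts.items() if n > 1}


# Hypothetical sample: seven copies of one tile plus one unique tile.
sample = [{"directoryId": 786, "title": "Tile A"}] * 7
sample.append({"directoryId": 499, "title": "Tile B"})

print(repeated_directory_ids(sample))  # {786: 7}
```

An empty result would mean no repeated ids were found in that list; it says nothing about whether the tiles belong to the right locale.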
Rolling back (to what version?)
Depends on: 1159975
it was rolled back to version 1.3.7
Trying again -- deployed onyx 1.4.1 to prod, re-published tileset. Crawler results:
[0] 13:51 (ttys003) kthiessen@kthiessen-16107:~/Library/github/mozilla/splice/scripts 618$ python tile_index_crawl.py https://tiles-resources-prod-tiless3-qbv71djahz3b.s3.amazon\
aws.com
NOTICE: crawling: https://tiles-resources-prod-tiless3-qbv71djahz3b.s3.amazonaws.com/('desktop', 'android')_tile_index_v3.json
NOTICE: calculating tiles urls
ERROR: https://tiles-resources-prod-tiless3-qbv71djahz3b.s3.amazonaws.com/android_tile_index_v3.json 403
NOTICE: tiles urls extracted: 172
NOTICE: calculating image urls
NOTICE: image urls extracted: 95
NOTICE: validating image urls
This is expected.
Assignee: dthornton → bwong
- Scaled up 1 Canary host, verified that it works by :relud
- scaling up new onyx 1.4.1 and scaling down the old stack
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
python external_api_test.py https://tiles.services.mozilla.com https://tiles-resources-prod-tiless3-qbv71djahz3b.s3.amazonaws.com -v
SUCCESS: https://tiles-resources-prod-tiless3-qbv71djahz3b.s3.amazonaws.com/desktop_tile_index_v3.json 200
SUCCESS: https://tiles.services.mozilla.com/v2/links/fetch/de 303
SUCCESS: https://tiles.services.mozilla.com/v2/links/fetch/en-GB 303
SUCCESS: https://tiles.services.mozilla.com/v2/links/fetch/en-US 303
SUCCESS: https://tiles.services.mozilla.com/v2/links/fetch/es-AR 303
SUCCESS: https://tiles.services.mozilla.com/v2/links/fetch/es-CL 303
SUCCESS: https://tiles.services.mozilla.com/v2/links/fetch/es-ES 303
SUCCESS: https://tiles.services.mozilla.com/v2/links/fetch/es-MX 303
SUCCESS: https://tiles.services.mozilla.com/v2/links/fetch/fr 303
SUCCESS: https://tiles.services.mozilla.com/v2/links/fetch/ja 303
SUCCESS: https://tiles.services.mozilla.com/v2/links/fetch/ja-JP-mac 303
SUCCESS: https://tiles.services.mozilla.com/v2/links/fetch/pl 303
SUCCESS: https://tiles.services.mozilla.com/v2/links/fetch/pt-BR 303
SUCCESS: https://tiles.services.mozilla.com/v2/links/fetch/pt-PT 303
SUCCESS: https://tiles.services.mozilla.com/v2/links/fetch/ru 303
SUCCESS: https://tiles.services.mozilla.com/v3/links/fetch/de/desktop 303
SUCCESS: https://tiles.services.mozilla.com/v3/links/fetch/en-GB/desktop 303
SUCCESS: https://tiles.services.mozilla.com/v3/links/fetch/en-US/desktop 303
SUCCESS: https://tiles.services.mozilla.com/v3/links/fetch/es-AR/desktop 303
SUCCESS: https://tiles.services.mozilla.com/v3/links/fetch/es-CL/desktop 303
SUCCESS: https://tiles.services.mozilla.com/v3/links/fetch/es-ES/desktop 303
SUCCESS: https://tiles.services.mozilla.com/v3/links/fetch/es-MX/desktop 303
SUCCESS: https://tiles.services.mozilla.com/v3/links/fetch/fr/desktop 303
SUCCESS: https://tiles.services.mozilla.com/v3/links/fetch/ja/desktop 303
SUCCESS: https://tiles.services.mozilla.com/v3/links/fetch/ja-JP-mac/desktop 303
SUCCESS: https://tiles.services.mozilla.com/v3/links/fetch/pl/desktop 303
SUCCESS: https://tiles.services.mozilla.com/v3/links/fetch/pt-BR/desktop 303
SUCCESS: https://tiles.services.mozilla.com/v3/links/fetch/pt-PT/desktop 303
SUCCESS: https://tiles.services.mozilla.com/v3/links/fetch/ru/desktop 303
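For readers without access to external_api_test.py, the kind of check reported above can be approximated as follows. This is only a sketch and not the actual script: the URL shapes and the SUCCESS/ERROR line format are copied from the output in this bug, while `fetch_urls`, `check`, and the `get_status` callable are hypothetical names. The status lookup is injected so the logic can be exercised without touching production.

```python
def fetch_urls(base, locales):
    """Build v2 and v3 fetch URLs in the shapes seen in the test output."""
    v2 = [f"{base}/v2/links/fetch/{loc}" for loc in locales]
    v3 = [f"{base}/v3/links/fetch/{loc}/desktop" for loc in locales]
    return v2 + v3


def check(urls, get_status, expected=303):
    """Return one report line per URL.

    get_status is any callable mapping a URL to an HTTP status code
    (an assumption of this sketch; a real run would issue a request
    without following redirects).
    """
    lines = []
    for url in urls:
        status = get_status(url)
        label = "SUCCESS" if status == expected else "ERROR"
        lines.append(f"{label}: {url} {status}")
    return lines


# Stubbed status lookup for illustration only; no network access.
for line in check(fetch_urls("https://tiles.services.mozilla.com", ["de", "fr"]),
                  lambda url: 303):
    print(line)
```

The fetch endpoints are expected to answer 303 because they redirect to the CDN-hosted tile file, as seen in the IRC log; the tile index itself is expected to return 200.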
end to end test is failing
onyx_redshift_test
Submitting test with id: 2015043014
HTTP/1.1 200 OK
Date: Thu, 30 Apr 2015 21:35:32 GMT
HTTP/1.1 200 OK
Date: Thu, 30 Apr 2015 21:35:33 GMT
Watching redshift for test with id: 2015043014
- and then it never comes through
i have seen no new data in redshift since 14:30 PDT
What was the resolution here? I had to go to the airport, but you folks finished up, I think. If we got it verified, please mark this VERIFIED; if not, please post a summary of what happened? Thanks very much.
Flags: needinfo?(dthornton)
QA Contact: kthiessen
I believe the issue here was related to an nginx configuration sending a list of IPs instead of a single IP, and we deployed a version of infernyx to handle it. Data was then reloaded into infernyx, and it worked properly.
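To illustrate the class of fix described here, the sketch below accepts either a single IP or a comma-separated list and returns one address. It is not the infernyx change itself: the bug does not say which field carried the list or which entry infernyx keeps. Taking the first parseable entry is an assumption, modeled on the X-Forwarded-For convention where the leftmost address is the original client. `first_valid_ip` is a hypothetical helper name.

```python
import ipaddress


def first_valid_ip(field):
    """Return the first parseable IP in a comma-separated field, or None."""
    for part in field.split(","):
        candidate = part.strip()
        try:
            # ip_address() raises ValueError for anything that is not a
            # valid IPv4 or IPv6 address, including the empty string.
            return str(ipaddress.ip_address(candidate))
        except ValueError:
            continue
    return None


print(first_valid_ip("203.0.113.7"))            # 203.0.113.7
print(first_valid_ip("203.0.113.7, 10.0.0.1"))  # 203.0.113.7
print(first_valid_ip("unknown"))                # None
```

A parser that only expects a single address would reject the second form, which is consistent with data failing to arrive downstream until the handling was updated and the data reloaded.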
Status: RESOLVED → VERIFIED
Flags: needinfo?(dthornton)