800042 - Move to new Bouncer code and infra

Reporter

Description

•

12 years ago

Steps are as follows:

1. Deploy database schema changes to existing production database.  Changes are as follows:

-- add fallback region options (bug 613620)
ALTER TABLE `geoip_regions` ADD COLUMN `fallback_id` integer;
ALTER TABLE `geoip_regions` ADD CONSTRAINT `fallback_id_refs_id_e6bfe66d` FOREIGN KEY (`fallback_id`) REFERENCES `geoip_regions` (`id`);
CREATE INDEX `geoip_regions_e28329c2` ON `geoip_regions` (`fallback_id`);
ALTER TABLE geoip_regions ADD COLUMN prevent_global_fallback int(1) NULL;

-- Add SSL only support (bug 796088)
ALTER TABLE mirror_products ADD COLUMN `ssl_only` tinyint(1) NOT NULL DEFAULT 0;


2.  WebQA to test (non-destructively) on new cluster, since it is using the ACTUAL PROD DATABASE.  You may add test products and test only SSL-only mirrors as this should be safe.  Do not edit any existing products or mirrors.   Be sure to delete test data (ONLY) when you are done.

3. When WebQA signs off, switch from old cluster to new cluster, via Zeus.

4.  After a period of testing and watching logs via deinspanjer (at least several hours), we will sign off on the new cluster.  (Daniel, how long do we expect this to take?  Is 3pm PT too early?)

5.  Releng to add new SSL-only products (stubinstaller) to Bouncer.

Stephen Donner [:stephend] Not actively reading bugmail

Comment 1

•

12 years ago

The plan looks good to me; this is the same testing we've done on staging, and helps guarantee a sane level of coverage for both positive and negative tests.

Daniel Einspanjer [:dre] [:deinspanjer]

Comment 2

•

12 years ago

Access logs from the Zeus load balancers that sit in front of Bouncer (download.mozilla.org) are rolled over hourly, and they are transferred to the log file server (metrics-logger1) via an rsync job.  This transfer typically has about a 4 hour lag.  Sometimes it can be as little as 2 hours, very rarely is it more than 5.

This means that requests that happen during the 9am hour (Pacific) can be viewed by Metrics between 11am and 2pm, with the most typical time being 12pm.

We would like to have at least 2 hours of log data to be able to trend and compare with the pre-cutover hours as well as with the same hours from the previous day.

So, take your cut-over time, add 2 to 3 hours for trending, and 2 to 3 hours for log collection, and you will have the earliest possible time we could give you results on the impact to incoming requests.
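As a rough illustration of that arithmetic, here is a minimal sketch. The function name, the default ranges, and the example cut-over date are assumptions made for illustration only; they are not part of any Metrics tooling.

```python
from datetime import datetime, timedelta


def earliest_results_window(cutover, trend_hours=(2, 3), lag_hours=(2, 3)):
    """Return (earliest, latest) times at which results could be reported.

    cutover     -- datetime of the switch to the new cluster
    trend_hours -- (min, max) hours of log data wanted for trending
    lag_hours   -- (min, max) hours allowed for log collection
    """
    earliest = cutover + timedelta(hours=trend_hours[0] + lag_hours[0])
    latest = cutover + timedelta(hours=trend_hours[1] + lag_hours[1])
    return earliest, latest


# Hypothetical cut-over at 9am Pacific; the date itself is a placeholder.
cutover = datetime(2012, 1, 1, 9, 0)
earliest, latest = earliest_results_window(cutover)
print("Earliest results:", earliest.strftime("%H:%M"))
print("Latest results:  ", latest.strftime("%H:%M"))
```

With a 9am cut-over this gives a window of roughly 1pm to 3pm for the first results, before accounting for any unusually slow log transfer.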

Finally, please keep in mind that these access logs show only redirect events.  They don't show where the redirect went or whether the download was completed.  If there is a systemic problem with the new Bouncer code, it is most likely to be in the second half that Metrics doesn't see.  Either the requests get redirected to the wrong place, or the destination doesn't deliver the client a working installer.  You need to coordinate with the CDNs or Ops to try to get verification of that part of the pipeline.

Laura Thomson :laura

Reporter

Comment 3

•

12 years ago

Jake says, re CDN:
We will have partial data immediately; estimate of bandwidth within 5-20 minutes, guaranteed numbers in ~5 hours, complete analytics in ~5 days.

We will watch the CDN numbers immediately.
We will have enough information, on both fronts, to sign off positively, 5 hours after ship.

Jake Maul [:jakem]

Assignee

Comment 4

•

12 years ago

Schema changes (comment 0 step 1) are complete.  They were tested and confirmed not to affect current prod; on visual inspection they are just new columns / indexes, which the current code will ignore.

Jake Maul [:jakem]

Assignee

Comment 5

•

12 years ago

Cut-over details:

change the zeus pool in use for download.mozilla.org http/https vservers to new prod pools

change dns for bounceradmin.mozilla.com to be a CNAME to download.mozilla.org

get a cert for bounceradmin.mozilla.com, add to zeus as a TLS SNI cert on that vhost
    ultimately should become a real, purchased cert
    right now, can be the cert from the sentry node... signed by Mozilla Root CA
    can be done anytime

remove "new bouncer" temp VIP of 63.245.217.79 from DNS and Zeus

Stephen Donner [:stephend] Not actively reading bugmail

Comment 6

•

12 years ago

Web QA has looked at this on both staging (yesterday/today), as well as the Bouncer Admin pieces on production, and tested the changes in aliasing and fallback functionality; we're ready to ship!

Let's do this thing.

Jake Maul [:jakem]

Assignee

Comment 7

•

12 years ago

This is all completed.

Status: NEW → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Stephen Donner [:stephend] Not actively reading bugmail

Comment 8

•

12 years ago

Verified FIXED; I let the smoke (dust, really) completely settle before verifying this.

Status: RESOLVED → VERIFIED

Nobody; OK to take it and work on it

Updated

•

11 years ago

Component: Server Operations: Web Operations → WebOps: Other

Product: mozilla.org → Infrastructure & Operations

BMO Automation

Updated

•

5 years ago

Product: Infrastructure & Operations → Infrastructure & Operations Graveyard

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

Tracking

(Not tracked)

People

(Reporter: laura, Assigned: nmaul)

Security

(public)
