Closed Bug 1011629 Opened 11 years ago Closed 10 years ago

stand up scl3 proxxy server in production

Categories

(Infrastructure & Operations :: RelOps: General, task, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: taras.mozilla, Assigned: catlee)

References

Details

(Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/985] )

Amy suggested sticking new drives into one of the HP 64GB KVM boxes. Production setup will be 1 varnish + S3 'hot spare', where sccache will fall back to S3 automatically. I think we should leave 2 of the hard drives in there for OS + scratch space and add an SSD as cache storage. I'm going to order a 750GB Samsung Pro SSD for that. I expect we'll be deploying varnish as docker images hosted in S3. Laura, please take care of getting the box set up for gozer. Let me know shipping info so I can arrange an SSD drive delivery (or have dcops buy one). We'll keep updating the production-varnish plan in https://etherpad.mozilla.org/ikFCaXUCbK
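The 'hot spare' behaviour described above (try the local varnish cache first, fall back to S3 automatically if it's unreachable) can be sketched roughly as follows. This is an illustrative sketch only, not sccache's actual client code; the hostnames and the injected fetch callback are assumptions.

```python
# Hypothetical sketch of the cache-with-hot-spare lookup described above.
# Hostnames and the fetch() callback are assumptions for illustration;
# sccache's real client logic differs.

PROXY_URL = "http://sccache1.srv.releng.scl3.mozilla.com"  # local varnish cache
S3_URL = "https://s3-us-west-2.amazonaws.com"              # S3 "hot spare"

def fetch_object(path, fetch):
    """Try the local cache first; fall back to S3 if the proxy is down."""
    try:
        return fetch(PROXY_URL + path)
    except IOError:
        # Proxy unreachable: go straight to S3.
        return fetch(S3_URL + path)

# Usage with a stub fetcher that simulates a dead proxy:
def stub(url):
    if url.startswith(PROXY_URL):
        raise IOError("proxy down")
    return "object-bytes from " + url

print(fetch_object("/bucket/key", stub))
```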
Assignee: server-ops-storage → relops
Component: Server Operations: Storage → RelOps
Product: mozilla.org → Infrastructure & Operations
QA Contact: cshields → arich
Gozer actually already has the hosts since they were being used for labs. We'll need him to allocate them back to us so we can kickstart them. SSDs will need to be shipped to scl3.
Assignee: relops → gozer
Ordered the SSD. USPS tracking: 9402111899561497612009
Amy, Van has the SSD; can you open a bug for him like bug 1017126 to add it to one of the ex-KVM machines?
Flags: needinfo?(arich)
Priority: -- → P1
gozer? the machine is still yours.
Flags: needinfo?(arich)
(In reply to Amy Rich [:arich] [:arr] from comment #4)
> gozer? the machine is still yours.

Ah, okay, missed that in the bug. So we are going with one of the Labs KVM hosts, correct? You just need me to take it out of the Labs KVM cluster so you can take it over?
Whiteboard: [kanban:https://kanbanize.com/ctrl_board/4/354]
This isn't a Q2 project for us, so we're not "taking the host over," no. I think Taras' ask is to get the SSD installed at the moment. We discussed using one of the kvm hosts that we designated as a spare that you took for labs. I'm not sure which one you want to evacuate and give up. But when you choose one of them, taras wants a bug filed with dcops to put in the SSD drive. My understanding was that you'd then be getting varnish set up on it in the releng network (it should go on vlan 248 in scl3 and be called <something>.srv.releng.scl3.mozilla.com).
Depends on: 1027118
Took vm1-11.phy.labs.scl3.mozilla.com out of the Labs KVM cluster, you can reclaim it. https://inventory.mozilla.org/en-US/systems/show/5265/
Depends on: 1028830
gozer: do you still have this? What's the ETA?
Flags: needinfo?(gozer)
(In reply to Laura Thomson :laura from comment #8)
> gozer: do you still have this? What's the ETA?

I can see the host is now called sccache1.srv.scl3.mozilla.com, but I was expecting to be handed off an already kickstarted host. I'll get on that today and see how far I can get.
Status: NEW → ASSIGNED
Flags: needinfo?(gozer)
Varnish up and running, setup copied from hp4.relabs.releng.scl3.mozilla.com. Worth noting, the persistent storage backend is now deprecated, see <https://www.varnish-cache.org/docs/trunk/phk/persistent.html>

$> lwp-request -m GET -edsS http://sccache1.srv.releng.scl3.mozilla.com/mozilla-releng-s3-cache-us-west-2-try/c/8/5/c853fd145b33923d065947d7ba9b8b3227fe2fe2
GET http://sccache1.srv.releng.scl3.mozilla.com/mozilla-releng-s3-cache-us-west-2-try/c/8/5/c853fd145b33923d065947d7ba9b8b3227fe2fe2 --> 200 OK
Connection: close
Date: Wed, 25 Jun 2014 18:32:34 GMT
Via: 1.1 varnish-v4
Accept-Ranges: bytes
Age: 31
ETag: "0298707e740416201039bc27bd7d1998"
Server: AmazonS3
Content-Length: 55720
Content-Type: application/octet-stream
Last-Modified: Thu, 12 Jun 2014 01:10:45 GMT
Client-Date: Wed, 25 Jun 2014 18:33:03 GMT
Client-Peer: 10.26.48.128:80
Client-Response-Num: 1
X-Amz-Expiration: expiry-date="Sat, 28 Jun 2014 00:00:00 GMT", rule-id="delete objects older than 15 days"
X-Amz-Id-2: A0/BZ8oBkFQv9SqLg6C+zt2g+YGBWJetQENNv7dKnxdgKqHus+RtITiKbyQMUWMM
X-Amz-Request-Id: 73A39FE73F4C2E93
X-Varnish: 32782 12
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Blocks: 1008015
The setup that was on hp4.relabs.releng.scl3.mozilla.com was forwarding to s3-us-west-2.amazonaws.com. Since then, we've switched to using the us-west-1 zone, which is closer and has lower latency. This sadly means that depending on what is being built on try, we can be using three different buckets, in two different zones. Is it possible to have the following:

http://sccache1.srv.releng.scl3.mozilla.com/mozilla-releng-s3-cache-us-west-2-try/ send to https://s3-us-west-2.amazonaws.com/mozilla-releng-s3-cache-us-west-2-try/
http://sccache1.srv.releng.scl3.mozilla.com/mozilla-releng-ceph-cache-scl3-try/ send to https://s3-us-west-2.amazonaws.com/mozilla-releng-ceph-cache-scl3-try
http://sccache1.srv.releng.scl3.mozilla.com/mozilla-releng-s3-cache-us-west-1-try/ send to https://s3-us-west-1.amazonaws.com/mozilla-releng-s3-cache-us-west-1-try/

(mozilla-releng-s3-cache-us-west-2-try and mozilla-releng-ceph-cache-scl3-try on s3-us-west-2, and mozilla-releng-s3-cache-us-west-1-try on s3-us-west-1)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Done with this snippet of VCL:

backend s3west1 {
    .host = "127.0.0.1";
    .port = "9443"; # stunnel to s3
}

backend s3west2 {
    .host = "127.0.0.1";
    .port = "9444"; # stunnel to s3
}

sub vcl_recv {
    # Happens before we check if we have this in cache already.
    #
    # Typically you clean up the request here, removing cookies you don't need,
    # rewriting the request, etc.
    if (req.url ~ "^/mozilla-releng-s3-cache-us-west-2-try/") {
        set req.backend_hint = s3west2;
        set req.http.host = "s3-us-west-2.amazonaws.com";
    } else if (req.url ~ "^/mozilla-releng-ceph-cache-scl3-try/") {
        set req.backend_hint = s3west2;
        set req.http.host = "s3-us-west-2.amazonaws.com";
    } else {
        set req.backend_hint = s3west1;
        set req.http.host = "s3-us-west-1.amazonaws.com";
    }
}
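For reference, the two loopback ports the VCL backends point at are stunnel TLS wrappers toward S3. A minimal stunnel.conf for that setup might look like the fragment below; the ports come from the VCL snippet above, but everything else (section names, the use of client mode toward the regional S3 endpoints) is an assumption, not the configuration actually deployed on the host.

```
; hypothetical stunnel.conf fragment matching the VCL backends above
; (illustrative only, not the deployed configuration)
client = yes

[s3-us-west-1]
accept  = 127.0.0.1:9443
connect = s3-us-west-1.amazonaws.com:443

[s3-us-west-2]
accept  = 127.0.0.1:9444
connect = s3-us-west-2.amazonaws.com:443
```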
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Following the guidelines at https://docs.google.com/a/mozilla.com/document/d/182fAau8GXKAgMzaC_O-ezQxsZ4SUl2Z9ITZrFfG_ock, a few more steps need to happen before this service is production-ready. I'll open some sub-bugs for the most critical bits and use this as a tracker.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 1032744
Depends on: 1032746
Depends on: 1032747
For taras to assign to whomever he has in mind for this project.
Assignee: gozer → taras.mozilla
To make sure these two bugs are linked together, this is similar to work going on in AWS in bug 1017759.
Depends on: 1056306
Proxxy provisioned at proxxy1.srv.releng.scl3.mozilla.com
George, are you also taking ownership of the three bugs that block this one (those must be complete before we put this machine in production)?
(In reply to Amy Rich [:arich] [:arr] from comment #17)
> George, are you also taking ownership of the three bugs that block this one
> (those must be complete before we put this machine in production)?

Most likely, but I'll have to confirm this with catlee. As far as I understand, this is not about sccache anymore. This is about a proxy server like the one in bug 1017759, but hosted in scl3 instead of AWS. But obviously docs are still needed, so I'll be preparing them.
Summary: standup varnish sccache host in production → stand up scl3 proxxy server in production
Great, so what are we going to use for sccache?
Assignee: taras.mozilla → catlee
Depends on: 1058882
(In reply to Mike Hommey [:glandium] (out from Sep 6 to Sep 22) from comment #19)
> Great, so what are we going to use for sccache?

sccache should continue to push to S3, and we can use proxxy as a local http cache for fetching the files.
(In reply to Chris AtLee [:catlee] from comment #20)
> (In reply to Mike Hommey [:glandium] (out from Sep 6 to Sep 22) from comment #19)
> > Great, so what are we going to use for sccache?
>
> sccache should continue to push to S3, and we can use proxxy as a local http
> cache for fetching the files.

The setup we had with varnish was taking PUTs too.
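The division of labour catlee describes (writes go straight to S3, cacheable reads go through the local proxy) can be sketched as below. The routing helper is illustrative only, not proxxy's or sccache's actual code; the proxxy hostname is the one provisioned earlier in this bug, and the S3 endpoint is an assumption.

```python
# Hypothetical helper: route reads through the local proxxy cache and
# send writes directly to S3, per the comments above. Illustrative only.

PROXXY = "http://proxxy1.srv.releng.scl3.mozilla.com"  # local HTTP cache
S3 = "https://s3-us-west-2.amazonaws.com"              # assumed S3 endpoint

def url_for(method, path):
    """GETs are cacheable, so fetch them via proxxy; PUTs bypass it."""
    if method.upper() == "GET":
        return PROXXY + path
    return S3 + path

print(url_for("GET", "/bucket/key"))  # read, served through the local cache
print(url_for("PUT", "/bucket/key"))  # write, pushed straight to S3
```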
Assignee: catlee → george.miroshnykov
Whiteboard: [kanban:https://kanbanize.com/ctrl_board/4/354]
Whiteboard: [kanban:engops:https://kanbanize.com/ctrl_board/6/581]
Whiteboard: [kanban:engops:https://kanbanize.com/ctrl_board/6/581] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/977] [kanban:engops:https://kanbanize.com/ctrl_board/6/581]
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/977] [kanban:engops:https://kanbanize.com/ctrl_board/6/581] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/984] [kanban:engops:https://kanbanize.com/ctrl_board/6/581]
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/984] [kanban:engops:https://kanbanize.com/ctrl_board/6/581] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/985] [kanban:engops:https://kanbanize.com/ctrl_board/6/581]
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/985] [kanban:engops:https://kanbanize.com/ctrl_board/6/581] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/985]
Assignee: gmiroshnykov → catlee
Depends on: 1155605
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → FIXED
proxxy has been up and running in scl3 for a while, is puppetized, etc.
This was a tracker bug for the remaining work (of which documentation and nagios still haven't been done).
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Nagios work done, we're still missing the architecture and runbook documentation.
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED