Closed
Bug 450645
Opened 16 years ago
Closed 7 years ago
hg.mozilla.org should disable http fetch, make https mandatory
Categories
(Developer Services :: Mercurial: hg.mozilla.org, defect)
Developer Services
Mercurial: hg.mozilla.org
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: BenB, Unassigned)
References
Details
(Whiteboard: [relsec])
Attachments
(1 obsolete file)
When devs (and even more so distributors and build machines) fetch via http, they make themselves vulnerable to interception (MITM etc.) and allow attackers to tamper with the downloaded source code. The stakes are very high for Mozilla (even 30,000 users are a lot to root at once). Even more so when a developer commits an inserted backdoor / security hole: I think the review process would not catch that, as review happens before checkin and relies on the committer to do the right thing; very rarely are the committed diffs read carefully and scrutinized. There is a small but real chance that a backdoor / deliberate security hole is inserted into official Mozilla code this way. This actually did happen in the past: usually, tarballs on insecure FTP servers were modified, but intercepting VCS traffic is just another vector for the same attack. So the likelihood may be small, but the risk is real.

Given that hardware https accelerators exist, and https is often fine with just the server CPU, turning on https is possible even at large scale. Please turn off http, to ensure that all devs and distributors use https.
Reporter
Comment 1•16 years ago
Compare bug 450648 about moving http webapps away from hg.m.o. http could either be completely disabled or just redirect to hgweb.mozilla.org or similar.
Updated•16 years ago
OS: Linux → All
Hardware: PC → All
Comment 2•16 years ago
I disagree. I think it's fine to leave http on for folks that are browsing casually or just looking at things. We already have https available for folks that want a secure channel. I don't think we should shut down a service because there is potential for abuse.
Comment 3•16 years ago
The other flavor of this is if the MITM injection is performed with the intention of pwning the developer's machine directly. Once an attacker is into your box, it's a small task for them to steal account logins and such. [E.g., install a keysniffer, grab SSH passphrase and keys, then commit to Hg/CVS at your leisure.]
Comment 4•16 years ago
(In reply to comment #3)
> The other flavor of this is if the MITM injection is performed with the
> intention of pwning the developer's machine directly. Once an attacker is into
> your box, it's a small task for them to steal account logins and the such. [Eg,
> install keysniffer, grab SSH passphrase and keys, then commit to Hg/CVS at your
> leisure.]

We had this risk with CVS, too, right? With anonymous pserver checkouts? Not saying this is necessarily a reason to keep it, but I don't think we're in a worse position.
Comment 5•16 years ago
(In reply to comment #4)
> We had this risk with CVS, too, right? With anonymous pserver checkouts?

No, all the build machines pulled the source over ssh from cvs.mozilla.org, so no, we didn't have this problem with CVS.

> Not saying this is necessarily a reason to keep it, but I don't think we're in
> a worse position.

We are in a worse position currently with Hg. ;)
Comment 6•16 years ago
(In reply to comment #5)
> (In reply to comment #4)
> > We had this risk with CVS, too, right? With anonymous pserver checkouts?
>
> No, all the build machines pulled the source over ssh from cvs.mozilla.org, so
> no, we didn't have this problem with CVS.
>
> > Not saying this is necessarily a reason to keep it, but I don't think we're in
> > a worse position.
>
> We are in a worse position currently with Hg. ;)

We're not talking about build machines exclusively, afaik.
Reporter
Comment 7•16 years ago
aravind, casual browsing via http webapps should not happen on hg.mozilla.org anyway; see the other bug filed and mentioned.
Comment 8•16 years ago
(In reply to comment #2)
> I don't think we should shut down a service because there is potential for
> abuse.

Visiting http://hg.mozilla.org in a browser is fine; the issue here is about "hg clone http://hg.mozilla.org".

(In reply to comment #4)
> We had this risk with CVS, too, right? With anonymous pserver checkouts?

I don't think this matters. (If anything, it might merit another bug to secure CVS and other repos used! :-) DNS attacks are all the rage these days, which makes the odds of being MITM'd much higher.
Updated•16 years ago
Assignee: server-ops → aravind
Comment 9•16 years ago
We should not disable http fetch. We should use https fetch for our build systems.
Comment 10•16 years ago
Attachment #333921 -
Flags: review?(nthomas)
Comment 11•16 years ago
Comment on attachment 333921 [details] [diff] [review]
[backed out] build machines should pull sources via https, not http

ps, feel free to land this post-review.
Comment 12•16 years ago
(In reply to comment #9)
> We should not disable http fetch.

Can you expand on this assertion?
Updated•16 years ago
Attachment #333921 -
Flags: review?(nthomas) → review?(ccooper)
Updated•16 years ago
Attachment #333921 -
Flags: review?(ccooper) → review+
Comment 13•16 years ago
Comment on attachment 333921 [details] [diff] [review]
[backed out] build machines should pull sources via https, not http

changeset: 237:a3d8c3284ac7
Attachment #333921 -
Attachment description: build machines should pull sources via https, not http → [checked in] build machines should pull sources via https, not http
Comment 14•16 years ago
This patch ended up breaking some functionality because the host machine didn't have Python SSL modules on it. I think I've fixed it; if not, I'll have to back it out.
Comment 15•16 years ago
Comment on attachment 333921 [details] [diff] [review]
[backed out] build machines should pull sources via https, not http

I've backed this out for now, until I have time to fix the failures.
Attachment #333921 -
Attachment description: [checked in] build machines should pull sources via https, not http → [backed out] build machines should pull sources via https, not http
Comment 16•16 years ago
This is a WONTFIX for now. I don't really think blocking http is the right thing to do. For everyone that wants to, https is available. In any case, we don't allow any kind of checkins over http or https, so this should hopefully not be that big a risk.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → WONTFIX
Comment 17•16 years ago
I'd kind of like a better explanation of why this isn't something we'll do (see also comment 12). Doing insecure pulls is essentially just as bad as doing insecure nightly updates. It seems kind of silly to give people a useless choice. Also reopening to at least have the build machines pull securely.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Reporter
Comment 18•16 years ago
Now, hg via https entirely broke, see bug 457100.
Reporter
Comment 19•16 years ago
Nevermind my last comment, was local horkage, sorry.
No longer depends on: 457100
Comment 20•16 years ago
bhearsum: Are all the build machines pulling builds over https?

dolske: I am not sure that the possibility of someone snooping (or feeding you junk) should stop us from serving traffic on http. As I mentioned earlier, we do have https enabled for folks that want to go that route.

Once bhearsum confirms that build machines have been switched over to https, I don't see a reason to keep this open.
Comment 21•16 years ago
(In reply to comment #20)
> dolske: I am not sure that the possibility of someone snooping (or feeding you
> junk) should stop us from serving traffic on http. As I mentioned earlier, we
> do have https enabled for folks that want to go that route.

The point is that if there's both a secure and insecure way of doing something, the insecure way should be disabled unless there's a compelling reason to keep it. So far, no one has presented any argument for why we should retain the ability to pull without SSL.
Comment 22•16 years ago
(In reply to comment #21)
> The point is that if there's both a secure and insecure way of doing something,
> the insecure way should be disabled unless there's a compelling reason to keep
> it. So far, no one has presented any argument for why we should retain the
> ability to pull without SSL.

I am okay with forcing users to use https. Do we want to redirect folks visiting us on http to https? I am not sure if that (redirect) is vulnerable to all the pitfalls of http. Shaver: any objections to this?
Comment 23•16 years ago
If you want security, and not just security theater, you're going to need a slew of dependencies. Assuming the rumors are true that Python 2.6 will actually verify certs, and not just accept anything (including domain mismatches) like 2.5 did, you'll need a version of Mercurial that only runs on 2.6 and won't allow pulling or cloning by any previous version, plus bugs to update the build machines and MozillaBuild to 2.6 plus that Mercurial.

As someone (I think Gavin) was pointing out the other day in #hg, unless you use one of the third-party Python wrappers for OpenSSL instead of Python 2.5's "support", the only difference between http and https is that the person who wants to pwn you needs a cert, any cert, including a self-signed one for a different domain.

Even if you just want security theater, we still probably ought to have some documentation about how to get the fake-SSL support for 2.5 on various platforms. It took me a couple of tries to finally find a blog post saying that you need to install "py25-socket-ssl" from MacPorts to get even the illusion of SSL support.
Comment 24•16 years ago
I honestly don't think this matters either way. I am fine disabling http and forcing folks to use https. However, I am not planning on upgrading Python versions on the OS for this. We run on RHEL 5 (Python 2.4). It's a pita to run custom-compiled versions of Python, hg, SSL support libraries, etc. So if merely disabling http isn't good enough, then I'd be tempted to WONTFIX this bug.
Reporter
Comment 25•16 years ago
Aravind, Phil is speaking about the client. The server-side Python would not need to be upgraded.
Comment 26•16 years ago
(In reply to comment #20)
> bhearsum: Are all the build machines pulling builds over https?

Not yet. Busy with other things. When you're done with this bug feel free to toss it into mozilla.org:RelEng and I'll pick it up when I can.
Reporter
Comment 27•16 years ago
> When you're done with this bug feel free to
> toss it into mozilla.org:RelEng and I'll pick it up when I can.

The change in build machines has to happen *first*, so we can't be done until the build machines are switched. Thus, I filed blocking bug 460020.
No longer depends on: 460020
Comment 28•16 years ago
No, you misunderstand me: I am speaking about both the client and the server. You filed this bug to make it impossible for anyone pulling from hg.mozilla.org to be MITMed, whether or not they want that protection. If https plus current Python/hg offers absolutely no protection against MITM, the only way that forcing https will force MITM protection is if the server is changed to not accept connections from anyone using them. Whether or not that will be possible without upgrading Python and hg on the server nobody knows, since nobody knows how it will be done.

If you want to force everyone to not be at risk of MITM attacks, your steps have to be:

1. Find a combination of Python (possibly plus extensions) and hg for the client side which will actually verify certs.
2. Find a way to make either hg or the server itself reject connections from anything which is not using that (which might or might not require server upgrades).
3. Get that combo into MozillaBuild.
4. Get that combo on the build machines.
5. Start making the server reject both http and https unless it's using your actually-secure combo.

By way of contrast, if instead of forcing everyone to be secure you want to make it possible for people to be secure, you need just step 1 plus a wiki page.
Reporter
Comment 29•16 years ago
If you want to reject Python <2.6, which does not verify certs, probably all you'd need to do would be to reject those User-Agent strings. Probably that's just an Apache mod_rewrite rule or something like that.

I think at least your steps 1, 3 and 4 are a good idea, though. Step 4 is already filed as bug 460020 (as noted above). I just filed bug 460052 for step 3, fixing MozillaBuild.
Reporter
Comment 30•16 years ago
In other words, no server-side Python upgrades are needed; mod_rewrite or similar would be enough even if you want to block. Apache handles HTTPS on the server, so the server should not be affected by the Python problem. Note that a whitelist, which you suggest, is not an option, as I'll want to download the bz2 tarballs via HTTPS using wget or the browser.
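A minimal sketch of the kind of mod_rewrite rule being discussed here. It is illustrative only: the User-Agent pattern is a placeholder, since a stock Mercurial client identifies itself as "mercurial/proto-1.0" without advertising its Python version, so distinguishing cert-verifying clients this way would need more thought:

```apache
# Hypothetical rule: return 403 Forbidden to clients whose User-Agent
# matches a pattern considered insecure. The pattern is a placeholder.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^mercurial/proto-1\.0$
RewriteRule ^ - [F]
```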
Comment 31•16 years ago
https://www.g4v.org/hg/ serves an empty hg repository with a mismatched cert, in case that's useful for testing (cert only matches the non-"www." version). |hg clone https://www.g4v.org/hg/| succeeds using mercurial 1.0.
Comment 32•16 years ago
http://www.heikkitoivonen.net/blog/2008/10/14/ssl-in-python-26/ - see the "Clients" section, where Heikki talks about how Python 2.6's ssl module leaves hostname checking to the client application. So you need a dependency on a new version of Mercurial which either only runs on 2.6+ and does hostname checking itself, or bundles a third-party alternative, and in either case doesn't provide any fallback to insecure "Secure"SL, which is likely to be a hard sell.
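For contrast with the 2.5/2.6-era behavior described above, modern Python (3.4+) performs both checks by default; a minimal sketch:

```python
# Sketch: ssl.create_default_context() (Python 3.4+) enables both of the
# checks that 2.5/2.6-era clients were missing: certificate-chain
# verification and hostname matching.
import ssl

def secure_context():
    ctx = ssl.create_default_context()
    assert ctx.verify_mode == ssl.CERT_REQUIRED  # rejects untrusted chains
    assert ctx.check_hostname                    # rejects domain mismatches
    return ctx
```

Wrapping a socket from such a context (with server_hostname set) refuses exactly the mismatched-cert scenario described in this thread.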
Comment 33•16 years ago
See also http://www.selenic.com/mercurial/bts/issue1174, and possibly http://www.selenic.com/mercurial/bts/issue643. In any case, you'd need all the clients to install an extra dependency or Python 2.6 and a newer version of hg. Seems exceedingly unlikely for now.
Comment 34•16 years ago
Ugh, what a disappointing situation. :(

So, it sounds like there are two possible routes to take:

1) Effectively wontfix this bug (at least for now), since Hg/Python are broken. Revisit the issue in the future when the client software can correctly detect MITM attacks.

2) Go ahead and switch the server to require SSL-only now, and wait for the client software to catch up.

While there's no immediate security benefit to #2, perhaps it's cheaper to switch now rather than later (certainly that would be true ~6 months ago, before most developers and build systems started using Hg).

At the very least, we should probably update documentation to show using https://hg.mozilla.org.

[Hmm: If a tree is already cloned from http://hg.mo, can a user just edit .hg/hgrc and change the "default" path to https://hg.mo?]
Reporter | ||
Comment 35•16 years ago
I'd try to get the client fixed (see bug 460052 comment 3), and then fix this one, i.e. a bug dependency.
Comment 36•16 years ago
That doesn't work, if only because it's unrealistic to require people (for example those who spend their time in scratchbox, the Fennec platform, or those who want to stick to a Debian release) to use a very recent client version.
Comment 37•16 years ago
(In reply to comment #34)
> [Hmm: If a tree is already cloned from http://hg.mo,
> can a user just edit .hg/hgrc and change the "default" path to https://hg.mo?]

Yep.
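For an existing clone, that edit is a one-line change in the repository's hgrc (the repo path below is just an example):

```ini
# .hg/hgrc inside an existing clone: point the default pull path at https
[paths]
default = https://hg.mozilla.org/mozilla-central
```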
Reporter | ||
Comment 38•16 years ago
I'm trying, but am not getting too far. E.g. for comm-central, the URLs are in the scripts. (Admittedly, this is technically a different issue: "make the scripts use https".)
Comment 39•16 years ago
(In reply to comment #36)
> That doesn't work, if only because it's unrealistic to require people (for
> example those who spend their time in scratchbox, the Fennec platform, or those
> who want to stick to a Debian release) to use a very recent client version.

I don't know what scratchbox is, but the Fennec people generally develop on normal desktop machines that will have whatever toolset we have for Firefox builds. People who want to stick to a stock anything, Debian or otherwise, are generally out of luck. We've always required specific versions of various tools at various times and people just have to go get them (service pack this, XCode that, specific make version, etc.).

What you're suggesting is to extend the length of time for which the code repo is potentially vulnerable to injection and the risk of backdooring a couple hundred million people so as not to inconvenience a handful of developers who can't get the required version of an open-source tool. A dozen? Twenty? A hundred? That's a bad tradeoff -- the developers can suck it up.
Comment 40•16 years ago
So, from comment 28 and the linked URLs, doing this correctly on the server side means upgrading Python to 2.6, using non-standard SSL libraries, and re-writing some of the hg server code to use these libraries. Supporting non-standard libraries is a non-trivial task for us (and I'm not even sure it'd all just work), and would require extensive testing. All that work, and we'd still be vulnerable until all clients can be upgraded to some future version of Mercurial that addresses these issues. This is a WONTFIX for now; please re-open at some future date (probably when we are into the next RHEL major release).
Status: REOPENED → RESOLVED
Closed: 16 years ago → 16 years ago
Resolution: --- → WONTFIX
Comment 41•16 years ago
Well, FTR, I don't think it's a given that this requires any server-side modifications (other than the config change to disable HTTP). Phil's premise in comment 28 is that the server should *enforce* using a client that does proper cert checking. While that's an interesting idea, I'm not sure that's required or as complicated as made out to be. In any case, this is a bit academic (seeing as there isn't a suitable Mercurial client yet), so deferring any action until that happens seems fair.
Comment 42•16 years ago
I'm sorry, what did I miss in comment 28 that makes it anything other than security theater to require https to prevent MITM attacks, when we know that there's no current client for which that will prevent MITM attacks, and to then allow continued use of those clients for which it will not prevent MITM attacks even after there is one for which it will?
Reporter
Comment 43•16 years ago
aravind, there are no software upgrades at all needed on the server side. I said that, with reasons, already in comment 30. REOPENing.

Phil, I don't think this bug needs to block on the client. *Even if* it does, it's a classic blocker bug, bug 460052. We don't close bugs just because they have blockers; we mark the blockers in the Bugzilla field.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Comment 44•16 years ago
Works for me. I will disable http when those two blockers have been resolved.
Assignee: aravind → nobody
Component: Server Operations → Server Operations: Projects
Comment 45•16 years ago
Can I wontfix this now, now that build has decided to wontfix their bug?
Comment 46•15 years ago
Yes! :)
Status: REOPENED → RESOLVED
Closed: 16 years ago → 15 years ago
Resolution: --- → WONTFIX
Reporter
Comment 47•10 years ago
Now that our hg install is fixed and the certificate can be checked, we need to revisit this. This is important for the integrity of our source code base and the security of our developers' machines.

If a developer fetches source code over http, and a MITM introduces a source code change, the developer will compile and run that malicious code without noticing. In turn, their hg commit can be modified, bypassing all reviews, because we usually don't comb through hg patches after submission. It will be tough to detect that before we ship the code to our users.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Updated•9 years ago
Component: Server Operations: Projects → Mercurial: hg.mozilla.org
Product: mozilla.org → Developer Services
Updated•8 years ago
QA Contact: mzeier → hwine
Whiteboard: [relsec]
Comment 50•8 years ago
CVE-2016-3630 makes this more important.

We should consider rolling out the SHA-2 x509 cert to hg.mozilla.org before we force people to https. The reason is that a lot of people pin cert fingerprints in Mercurial (because Python TLS support is broken on many Python installs). If we force https and then change the cert, a lot of people will pin the cert after the forced https, then get caught up in the fingerprint change. This is avoidable churn. Doing the fingerprint change and then forcing https is a better end-user experience.

IMO we should announce a date for the certificate change, then force TLS connections at the same time or a week or two later.

FWIW, Mercurial 3.8 (to be released May 1) supports pinning multiple certificate fingerprints. So if we get that deployed in automation, we can announce the new fingerprint so downstream systems (like Firefox automation) can install it and they will transition to the new cert transparently.
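The fingerprint pinning described above looks roughly like this in a client's hgrc. The values are placeholders, and the exact section name varies by version: older clients used a [hostfingerprints] section with SHA-1 values, while the [hostsecurity] form shown here is from newer Mercurial and accepts multiple fingerprints, which is what makes a transparent cert rollover possible:

```ini
# Sketch only; the fingerprint values are placeholders,
# not hg.mozilla.org's real certificate fingerprints.
[hostsecurity]
hg.mozilla.org:fingerprints = sha256:aa:bb:cc:..., sha256:dd:ee:ff:...
```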
Comment 51•8 years ago
(In reply to Gregory Szorc [:gps] from comment #50)
> We should consider rolling out the SHA-2 x509 cert to hg.mozilla.org before
> we force people to https.

Is there a bug filed for this? :-)
Depends on: 1277714
Flags: needinfo?(gps)
Comment 52•8 years ago
There is a bug somewhere to move away from SHA-1 certs on hg.mozilla.org. We'll want to deploy Mercurial 3.8 to automation first so we can configure the SHA-2 fingerprint before it is deployed so automation doesn't blow up when we switch certs.
Flags: needinfo?(gps)
Comment 53•8 years ago
(In reply to Gregory Szorc [:gps] from comment #52)
> There is a bug somewhere to move away from SHA-1 certs on hg.mozilla.org.

Do you have the number? Having all bugs added to a dependency tree makes it much easier to see state at a glance.

> We'll want to deploy Mercurial 3.8 to automation first so we can configure
> the SHA-2 fingerprint before it is deployed so automation doesn't blow up
> when we switch certs.

Indeed (this is why I added bug 1277714 as a dependency as part of comment 51).
Comment 54•8 years ago
Bug 1147548 tracks the certificate upgrade.
Comment 55•8 years ago
Now that we have a SHA-256 cert deployed to hg.mozilla.org and we've done the hard work of upgrading automation to use Mercurial 3.9, this bug is unblocked! I think we should proceed with making hg.mozilla.org TLS-only by end of 2016.

I think the next step here is assessing who is still using port 80 and getting them transitioned to secure connections.

atoll: could you please assist me with obtaining the load balancer logs (I don't think I have access)? I'd like to analyze who is connecting to port 80 to help identify high-volume consumers by repo/URL and source IP. I'm especially interested in high-volume consumers coming from IP addresses that Mozilla uses.
Flags: needinfo?(rsoderberg)
Comment 56•8 years ago
atoll: also, you may want to look at relative traffic levels for :80 versus :443 on the load balancer. If we'll be shifting a lot of traffic to TLS, the load balancer CPU may not take kindly to that. An historical comparison from several months back might be useful: I believe we've shifted 1+ TB/day off hg.mozilla.org to Amazon S3 and a CDN as part of the "clone bundles" work that started in bug 1041173.
Comment 57•8 years ago
We keep 14 days of logs for these VIPs, as no special retention arrangement has been made otherwise. These numbers are short by approximately 100 requests per day:

09/Sep/2016 288078
10/Sep/2016 309327
11/Sep/2016 283886
12/Sep/2016 312045
13/Sep/2016 364613
14/Sep/2016 331701
15/Sep/2016 384095
16/Sep/2016 361511
17/Sep/2016 355823
18/Sep/2016 324166
19/Sep/2016 520418
20/Sep/2016 336958
21/Sep/2016 331948
22/Sep/2016 317656
23/Sep/2016 311451
24/Sep/2016 300618
25/Sep/2016 340177
26/Sep/2016 260148

This works out to an additional 5 requests/sec over the day, which is acceptable from a load balancer perspective.

The user-agent breakdown of the above time range, which I can email you privately if you need the full un-truncated output, has the following top hitters:

10891 mercurial/proto-1.0 (Mercurial 3.9)
15091 Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
16334 YisouSpider
22319 mercurial/proto-1.0 (Mercurial 3.9.1)
24103 Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.0; trendictionbot0.5.0; trendiction search; http://www.trendiction.de/bot; please let us know of any problems; web at trendiction.com) Gecko/20071127 Firefox/3.0.0.11
33158 Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)
34371 Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
37894 Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
51274 ltx71 - (http://ltx71.com/)
55847 Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; Win64; x64; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729
70248 Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
77742 Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, help@moz.com)
104115 Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)
192388 mercurial/proto-1.0 (Mercurial 3.9+60-cfa543f6c331)
424680 mercurial/proto-1.0
767538 -
775358 Python-urllib/2.7
2976553 Twisted PageGetter

I checked a random urllib request:

"GET /releases/mozilla-release/raw-file/59f461d36b4a133f5045a628f08628f9c48919d2/toolkit/components/telemetry/Histograms.json HTTP/1.1"

And a couple of random "-" requests:

"GET /mozilla-central/raw-file/tip/testing/docker/desktop-test/dot-files/config/pip/pip.conf HTTP/1.1"
"GET /releases/mozilla-b2g44_v2_5/log/fddffdeab17035827778431387dcff9256e136a9 HTTP/1.0"

And a couple of random mercurial/proto requests:

"GET /build/mozharness?cmd=batch HTTP/1.1"
"GET /mozilla-central?cmd=capabilities HTTP/1.1"
Flags: needinfo?(rsoderberg)
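The user-agent tallies above can be reproduced with a few lines over the raw access logs. A sketch, assuming combined log format; the sample lines are invented for illustration, not taken from the real hg.mozilla.org logs:

```python
# Tally User-Agent strings from combined-format access log lines.
# In that format the User-Agent is the sixth double-quote-delimited field:
#   ip - - [date] "GET /path HTTP/1.1" 200 123 "referer" "user-agent"
from collections import Counter

def ua_counts(lines):
    counts = Counter()
    for line in lines:
        parts = line.split('"')
        if len(parts) >= 6:
            counts[parts[5]] += 1
    return counts

# Invented sample lines, for illustration only:
sample = [
    '1.2.3.4 - - [09/Sep/2016:00:00:00 +0000] "GET /mozilla-central?cmd=capabilities HTTP/1.1" 200 123 "-" "mercurial/proto-1.0"',
    '1.2.3.5 - - [09/Sep/2016:00:00:01 +0000] "GET /json-pushes HTTP/1.0" 200 456 "-" "Twisted PageGetter"',
    '1.2.3.4 - - [09/Sep/2016:00:00:02 +0000] "GET /build/tools?cmd=capabilities HTTP/1.1" 200 789 "-" "mercurial/proto-1.0"',
]
print(ua_counts(sample).most_common(1))  # → [('mercurial/proto-1.0', 2)]
```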
Comment 58•8 years ago
Oh, and Twisted PageGetter:

"GET /releases/l10n/mozilla-aurora/ro/json-pushes?startID=941&endID=1141 HTTP/1.0"

which is originating requests from nat-fw1.scl3, so look inward on that particular user-agent as a first step.
Comment 59•8 years ago
(In reply to Gregory Szorc [:gps] from comment #55) We have not upgraded to 3.9 everywhere, so this is still blocked. 2008 is blocking on hg/aws performance issues which have not been solved yet.
Comment 60•8 years ago
The EBS bug is a generic bug that will likely turn into a tracker. The bug we want to track is the one where our Windows automation is moving off the root EBS volume.
Comment 61•8 years ago
Axel: is that your l10n automation not using TLS? If so, please switch things to use https://hg.mozilla.org/.
Flags: needinfo?(l10n)
Comment 62•8 years ago
The twisted one is surely mine, filed bug 1305973 to track that.
Flags: needinfo?(l10n)
Updated•7 years ago
QA Contact: hwine → klibby
Comment 63•7 years ago
The automation for l10n.m.o is out of the way now.
Comment 64•7 years ago
(In reply to Amy Rich [:arr] [:arich] from comment #59)
> (In reply to Gregory Szorc [:gps] from comment #55)
>
> We have not upgraded to 3.9 everywhere, so this is still blocked. 2008 is
> blocking on hg/aws performance issues which have not been solved yet.

:arr, :gps, is this still an issue? All of the dependent bugs are marked as resolved.
Attachment #333921 -
Attachment is obsolete: true
Updated•7 years ago
Flags: needinfo?(gps)
Comment 65•7 years ago
I /think/ we've got the world upgraded to 3.9+. I can pull the logs on the hgweb machines to verify; needinfo me for that.

atoll: could you please do a repeat of comment #57 and see who our top remaining port 80 consumers are? The past 24-48 hours is preferred, as comment #63 indicates removal of port 80 traffic on 2017-01-16.
Flags: needinfo?(gps) → needinfo?(rsoderberg)
Comment 66•7 years ago
User agent analysis of the past 14 days. 11954 curl 12688 Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.0; trendictionbot0.5.0; trendiction search; http://www.trendiction.de/bot; please let us know of any problems; web at trendiction.com) Gecko/20071127 Firefox/3.0.0.11 14702 ltx71 - (http://ltx71.com/) 14720 Twitterbot/1.0 14839 Mozilla/5.0 (compatible; SemrushBot/1.2~bl; +http://www.semrush.com/bot.html) 16258 Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) 18291 Mozilla/5.0 (compatible; Linux x86_64; Mail.RU_Bot/2.0; +http://go.mail.ru/help/robots) 20398 BoogleBot 2.0 26244 mercurial/proto-1.0 (Mercurial 4.0.1) 27270 Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.71 Safari/537.36 30819 Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp) 43521 Mozilla/5.0 (compatible; SemrushBot/1.1~bl; +http://www.semrush.com/bot.html) 52918 mercurial/proto-1.0 (Mercurial 3.9.2) 63535 Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07) 69021 Python-urllib/2.7 119252 Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) 149165 Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, help@moz.com) 227889 mercurial/proto-1.0 237981 mercurial/proto-1.0 (Mercurial 4.0.1+304-60a40b3827ce) 239363 - 254360 Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots) 3465184 Twisted PageGetter (All below numbers are for one day or less, rather than 7 days) Twisted is still there, asking for json-pushes data, but I don't think it's l10n, I think it's SeaMonkey builds: 11.223.245.63.in-addr.arpa domain name pointer sea-master1.community.scl3.mozilla.com. User-Agent '-' means no user-agent header was sent. Lots of security hole scans, but also (of course) yet another way people are direct-fetching effective_tld_names.dat (seriously, how many different ways *are* there to access this file?!) 
64 GET /mozilla-central/raw-file/1ad9af3a2ab8/netwerk/dns/effective_tld_names.dat HTTP/1.0 189 GET /mozilla-central/raw-file/1ad9af3a2ab8/netwerk/dns/effective_tld_names.dat HTTP/1.1 But also what appears to be some sort of thing that cares about pushlogs: 30 GET /chatzilla/json-pushes?full=true&startdate=2017-01-08%2009%3A06%3A16&enddate=Now HTTP/1.0 30 GET /chatzilla/log/c6ddca598a5398ca9a58b14d91bce82b8878bc7b HTTP/1.0 30 GET /chatzilla/pushloghtml?startdate=2+days+ago&enddate=now HTTP/1.0 30 GET /comm-central/pushloghtml?startdate=2+days+ago&enddate=now HTTP/1.0 30 GET /dom-inspector/json-pushes?full=true&startdate=2017-01-08%2009%3A08%3A36&enddate=Now HTTP/1.0 30 GET /dom-inspector/log/ca57c796b6228d020925c3ce299ff0056739a18b HTTP/1.0 30 GET /dom-inspector/pushloghtml?startdate=2+days+ago&enddate=now HTTP/1.0 30 GET /mozilla-central/pushloghtml?startdate=2+days+ago&enddate=now HTTP/1.0 I checked some random source IPs here and they're all offsite resources we don't control, so not much we can do there. The mercurial/proto-1.0 one was interesting, because it says Callek repeatedly over and over, but it turns out that it's just SeaMonkey build hosts depending critically on /users/Callek_gmail.com/tools to build SeaMonkey. So if I exclude the entire community VLAN, "mercurial/proto-1.0" is seeing traffic from an EC2 instance (that we may or may not control), and from the SCL3 DC NAT, as the two top most hits, with a huge long tail with scatterings of 63.245 after that. 
Whatever's running out of SCL3 is probably related to these three repos: 204 GET /qa/testcase-data/?cmd=listkeys HTTP/1.1 111 GET /SeaMonkey/seamonkey-project-org?cmd=listkeys HTTP/1.1 68 GET /webtools/telemetry-experiment-server?cmd=listkeys HTTP/1.1 And excluding community VLAN and SCL3 NAT, the top repos queried are: 934 GET /build/mozharness?cmd=listkeys HTTP/1.1 245 GET /mozilla-unified?cmd=known HTTP/1.1 151 GET /mozilla-central/?cmd=known HTTP/1.1 102 GET /users/hwine_mozilla.com/repo-sync-tools?cmd=capabilities HTTP/1.1 Oh, of course, I did my grep wrong, so repeating now with *both* mercurial user-agents (no-version and 4.0.1 version): 4978 GET /mozilla-central?cmd=capabilities HTTP/1.1 934 GET /build/mozharness?cmd=listkeys HTTP/1.1 340 GET /build/buildbot-configs/?cmd=capabilities HTTP/1.1 336 GET /build/tools?cmd=capabilities HTTP/1.1 336 GET /build/buildbotcustom?cmd=capabilities HTTP/1.1 336 GET /build/braindump/?cmd=capabilities HTTP/1.1 164 GET /integration/mozilla-inbound?cmd=capabilities HTTP/1.1 102 GET /users/hwine_mozilla.com/repo-sync-tools?cmd=capabilities HTTP/1.1 And it trails off from there. Onward to Python-urllib: 2957 GET /build/tools/raw-file/default/buildfarm/maintenance/production-branches.json HTTP/1.1 42 GET /releases/mozilla-release/raw-file/59f461d36b4a133f5045a628f08628f9c48919d2/toolkit/components/telemetry/Histograms.json HTTP/1.1 14 GET /projects/kraken/archive/tip.zip HTTP/1.0 All those buildfarm requests are from the SCL3 firewall, so I can't tell you what the backend is, but that's a really arcane URL to care about, so I bet someone knows. Histograms.json is som random EC2 instance and half of the histograms it asks for are 404, so who even knows what's up there. The kraken requests are some offsite that I don't recognize. Every curl request for a time window, because there's so few we can just direct-inspect them. I elided all but one of the '1 request' URLs, of which there were a handful. 
I kept this one for irony:

1 GET /releases/mozilla-beta/filelog/c03e51cec3b5f6b8821687c8db8be309727d5470/devtools/client/netmonitor/test/html_curl-utils.html HTTP/1.1

And the rest are:

2 GET / HTTP/1.1
8 GET /mozilla-central/atom-log/aec6bf932306/security/nss/lib/ckfw/builtins/certdata.txt HTTP/1.1
8 GET /releases/l10n/mozilla-aurora/id/atom-log HTTP/1.1
17 GET /gaia-l10n/en-US/atom-log HTTP/1.1
17 GET /mozilla-central/atom-log/default/netwerk/dns/effective_tld_names.dat HTTP/1.1
70 GET /hgcustom/version-control-tools/atom-log HTTP/1.1
93 GET /hgcustom/version-control-tools/rss-log HTTP/1.1
95 GET /releases/mozilla-release/atom-log HTTP/1.1
98 GET /releases/mozilla-release/rss-log HTTP/1.1
110 GET /releases/mozilla-beta/atom-log HTTP/1.1
142 GET /releases/mozilla-beta/rss-log HTTP/1.1
Flags: needinfo?(rsoderberg)
Comment 67•7 years ago
Callek: could you please triage the SeaMonkey and Callek references in comment #66 and file bugs blocking this one to transition things to https://hg.mozilla.org?
Flags: needinfo?(bugspam.Callek)
Comment 68•7 years ago
Callek: I suspect you may also know what's up with that /build/tools/raw-file/default/buildfarm/maintenance/production-branches.json request. There's a reference to that path in a few repos under hg.mo/build. But the URLs are https://. The only http:// reference to that URL I can find was changed way back in bug 960571. Perhaps there are some machines running with a really old config? Or more likely, I'm not grepping all the code I need to be (I wish we had a monorepo).
Comment 69•7 years ago
Note that while the l10n automation is using https now, it's still using mercurial 3.7.3 in a few places. Upgrading that will depend on bug 1323771, I think.
Comment 70•7 years ago
(needinfoing Kyle for ActiveData-ETL, Bob for autophone, Mark for telemetry-*)

I see a number of non-https references in the build-central set of repos, found using DXR:

https://dxr.mozilla.org/build-central/source/buildbotcustom/process/factory.py#1054
https://dxr.mozilla.org/build-central/source/braindump/buildbot-related/create-staging-master.pl#164
https://dxr.mozilla.org/build-central/source/braindump/update-related/create-channel-switch-mar.py#24
https://dxr.mozilla.org/build-central/source/mozharness/configs/developer_config.py#46
https://dxr.mozilla.org/build-central/source/mozharness/mozharness/mozilla/taskcluster_helper.py#60
https://dxr.mozilla.org/build-central/source/puppet/modules/cruncher/templates/reportor_credentials.ini.erb#9
https://dxr.mozilla.org/build-central/source/puppet/modules/slaveapi/templates/slaveapi.ini.erb#37
https://dxr.mozilla.org/build-central/source/slave_health/scripts/slave_health_cron.sh#20
https://dxr.mozilla.org/build-central/source/tools/buildfarm/maintenance/end_to_end_reconfig.sh#482
https://dxr.mozilla.org/build-central/source/tupperware/buildapi-app/Dockerfile#9

Similarly for mozilla-central:

https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/configs/developer_config.py#45
https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/mozharness/mozilla/taskcluster_helper.py#64
https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/scripts/firefox_ui_tests/update_release.py#49
https://dxr.mozilla.org/mozilla-central/source/security/nss/tests/run_niscc.sh#262

And using GitHub code search:

https://github.com/klahnakoski/ActiveData-ETL/blob/master/resources/settings/mercurial_settings.json
https://github.com/klahnakoski/ActiveData-ETL/blob/35705ffe1ade8fbdf3188b12d873a0b4f1d833c8/mohg/hg_mozilla_org.py#L263
https://github.com/mozilla/autophone/blob/master/builds.py#L37
https://github.com/mozilla/telemetry-server/blob/5bfd3131426d89fa99cea55ab61a3ce7f32bf9c5/telemetry/revision_cache.py#L96
https://github.com/mozilla/telemetry-server/blob/5bfd3131426d89fa99cea55ab61a3ce7f32bf9c5/bin/get_histogram_tools.sh
https://github.com/mozilla/telemetry-tools/blob/19393e0eabc87aa0deb5a4c819ea301fabc8e439/telemetry/revision_cache.py#L96
https://github.com/mozilla/telemetry-tools/blob/19393e0eabc87aa0deb5a4c819ea301fabc8e439/scripts/get_histogram_tools.sh
https://github.com/mozilla-releng/funsize/blob/master/funsize/data/generate_update_platforms.sh
https://github.com/mozilla-l10n/mozilla-l10n-query/blob/86099e3e7d78c0fc71a7b4830a57a3ef20a66df5/app/scripts/update_sources.py
https://github.com/mozilla/elmo/blob/master/apps/life/fixtures/hg_mozilla_org.json

I wonder if it would also be useful to post to a handful of newsgroups even now, to encourage people to update local scripts and check their projects, to at least reduce the number of cases that have to be followed up manually in this bug?
Flags: needinfo?(mreid)
Flags: needinfo?(klahnakoski)
Flags: needinfo?(bob)
Comment 71•7 years ago
(In reply to Gregory Szorc [:gps] from comment #67)
> Callek: could you please triage the SeaMonkey and Callek references in
> comment #66 and file bugs blocking this one to transition things to
> https://hg.mozilla.org?

Pushing this n-i to :ewong. I can help track down specifics (just not with as much time available as I'd like), but there are very likely a handful of these that are SeaMonkey related... I would say all /users/Callek../tools uses are likely SeaMonkey infra. The chatzilla/domi uses are also probably SeaMonkey, at least insofar as they are internal. Some of the mozilla-* repos with pushlog and the l10n repos are also likely SeaMonkey, though I'm less confident we make up a significant portion of the polls there.

As for:

111 GET /SeaMonkey/seamonkey-project-org?cmd=listkeys HTTP/1.1

There is one primary consumer of that repo, and that's our internal webops-controlled server (which the SeaMonkey team doesn't have direct access to), which pulls and rebuilds the website. But iirc there was also a person on the SeaMonkey team doing near-identical work to the webops server to give us a staging site.

(In reply to Gregory Szorc [:gps] from comment #68)
> Callek: I suspect you may also know what's up with that
> /build/tools/raw-file/default/buildfarm/maintenance/production-branches.json
> request. There's a reference to that path in a few repos under hg.mo/build.
> But the URLs are https://. The only http:// reference to that URL I can find
> was changed way back in bug 960571. Perhaps there are some machines running
> with a really old config? Or more likely, I'm not grepping all the code I
> need to be (I wish we had a monorepo).

I would not be shocked at all if there was a lingering http:// reference to this file somewhere in SeaMonkey automation.
Flags: needinfo?(bugspam.Callek) → needinfo?(ewong)
Comment 72•7 years ago
I just cleaned up the telemetry-server code to use only https in https://github.com/mozilla/telemetry-server/pull/159
Flags: needinfo?(mreid)
Comment 74•7 years ago
Thanks for all the follow-up work, Ed, Justin, Mark, and Bob! I think we should pick a date for 301'ing the http endpoint and announce it. I'll send out an email when we pick a date. arr: would you care to pick a date? I'll throw out February 7. Keep in mind I disappear for a while after February 10. (Although I don't think my presence is critical since this change is more about downstream consumers and not hg.mozilla.org itself.)
Flags: needinfo?(arich)
Comment 75•7 years ago
Question: Will we redirect http to https so that old links continue to work?
Comment 76•7 years ago
Yes, we will HTTP 301 the requests.
Comment 77•7 years ago
Will we prohibit HG clients, or will they follow the redirect?
Comment 79•7 years ago
(In reply to Kyle Lahnakoski [:ekyle] from comment #78)
> The ActiveData ETL tries to use https first.

Can it be updated to *not* ever try http://?
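Rather than trying https first and falling back, a client-side fix is to normalize URLs before any request is issued. A minimal sketch of the idea (the function name is hypothetical, not taken from the ActiveData code):

```python
from urllib.parse import urlsplit, urlunsplit


def force_https(url: str) -> str:
    """Rewrite any http:// reference to hg.mozilla.org to https://.

    Other hosts and already-https URLs pass through unchanged.
    """
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.hostname == "hg.mozilla.org":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)
```

A client that funnels every URL through a helper like this never emits plaintext requests, so it doesn't depend on the server's redirect at all.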
Comment 80•7 years ago
(In reply to Richard Soderberg [:atoll] from comment #77)
> Will we prohibit HG clients, or will they follow the redirect?

hg clients will follow the redirect. The answer to whether they'll accept the x509 certificate is "it depends", with most answers being "probably."

If we're concerned about impact to automation, I suppose we could first add a redirect for some user agents, such as anything with "mozilla" or "bot" in it. Then we can chase the long tail of non-human, non-indexer clients.
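The staged per-user-agent rollout suggested here amounts to a simple predicate in front of the redirect rule. A sketch under the stated assumption (the substrings come from the comment; the function itself is illustrative, not actual load-balancer code):

```python
def should_redirect(user_agent: str) -> bool:
    """Phase 1: redirect only user agents we expect to handle it cleanly.

    Browsers identify as "Mozilla/..." and indexers usually contain "bot";
    everything else (e.g. mercurial/proto-1.0, Python-urllib) is left on
    http until chased down individually.
    """
    ua = user_agent.lower()
    return "mozilla" in ua or "bot" in ua
```
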
Comment 81•7 years ago
I'm more inclined to suggest blocking it; there's an inherent risk in permitting clients to continue operating when http:// works transparently. If we redirect HG clients for now, can we also set a date at which time we'll block them instead?
Comment 82•7 years ago
:dividehex, could you please catch up with gps to coordinate any needed changes here? I want to make sure that we have a quick and easy roll back plan in case we find that this breaks things substantially.

:hwine: NI you to verify that all the releng infra will work after this cutover, to help pick a date, and for general IT coordination (if needed).
Flags: needinfo?(jwatkins)
Flags: needinfo?(hwine)
Flags: needinfo?(arich)
Comment 83•7 years ago
(In reply to Amy Rich [:arr] [:arich] from comment #82)
> :hwine: NI you to verify that all the releng infra will work after this
> cutover, to help pick a date, and for general IT coordination (if needed).

First look says we're okay -- I'll double check out of band (no news is good news).
https://dxr.mozilla.org/build-central/search?q=http%3A%2F%2Fhg.mozilla.org&redirect=false

Happy to work on a date -- we probably should go through CAB anyway to get visibility.
Flags: needinfo?(hwine)
Comment 84•7 years ago
(In reply to Justin Wood (:Callek) from comment #71)
> (In reply to Gregory Szorc [:gps] from comment #67)
> > Callek: could you please triage the SeaMonkey and Callek references in
> > comment #66 and file bugs blocking this one to transition things to
> > https://hg.mozilla.org?
>
> Pushing this n-i to :ewong
> I would not be shocked at all if there was a lingering http:// reference in
> SeaMonkey automation to this file somewhere.

Freaky... as I think I had filed bug 1305911. I'll get that fixed.
Flags: needinfo?(ewong)
Comment 85•7 years ago
RelEng has a few items to check, detailed in bug 1332964.
Comment 86•7 years ago
RelEng has completed its sanity check, and we see no blockers to moving forward with disabling HTTP. That said, the usual caveats about releases, merges, and chemspills apply, so let's coordinate on that date. :)
Comment 87•7 years ago
(In reply to Amy Rich [:arr] [:arich] from comment #82)
> :dividehex, could you please catch up with gps to coordinate any needed
> changes here? I want to make sure that we have a quick and easy roll back
> plan in case we find that this breaks things substantially.

I emailed :gps. We'll make sure there is a rollback procedure.
Flags: needinfo?(jwatkins)
Comment 88•7 years ago
The cutover should not cause any service disruption; however, it will be useful to have plenty of eyes around "just in case". That makes the cutover a good candidate for our Wed morning (PT) window. Based on comment 74, the 2 upcoming dates are Feb 1 & Feb 8.

I'll file a CAB ticket for Feb 1 (as the CAB for that will be Jan 25). If someone hollers, we can push to Feb 8.

:dividehex has confirmed the rollback plan is done (comment 87).

:ewong - looks like you're ready to land on bug 1305911 -- any concerns with a Feb 1 date?
Comment 90•7 years ago
:atoll reminded me that "best practice" for the introduction of a 301 redirect is to first do a dance with a 302 redirect, tuning cache expiry times before returning them to 1 hour. This prevents "locking" a client into the HTTPS arrangement if we need to revert it. That makes sense in most situations. For this case, though:

a) we know HTTPS works just fine with all modern clients, so there is no need to "roll back" a modern browser client
b) if we rolled back (extremely low risk), it would be due to deficiencies in older implementations of the hg client
c) afaik no older hg client ever attempted to update the URL on receipt of a 301, so there is no lock-in risk

:gps - can you confirm or invalidate my logic please? I'm no expert on hg clients.
Flags: needinfo?(gps)
Comment 91•7 years ago
Mercurial doesn't retain or update settings if it encounters an HTTP 301. Non-Mercurial automated clients generally behave this way as well. Generally speaking, only browsers and HTTP caches/proxies will retain an HTTP 301. Those could interfere with Mercurial clients, however, so I think starting with an HTTP 302 before jumping to HTTP 301 is advised, just in case. The difference between 301 and 302 for most clients in this case is semantic.

Also, we have an HSTS policy on hg.mozilla.org. So once a modern web browser follows an HTTP redirect and hits https://hg.mozilla.org, it shouldn't load http://hg.mozilla.org/ - even if the user types that in the address bar.
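The HSTS policy mentioned here is just a response header the browser remembers. As a rough sketch of the mechanism, here is how a client could parse the `Strict-Transport-Security` value into directives (the header value shown is illustrative, not necessarily the one hg.mozilla.org serves):

```python
def parse_hsts(header: str) -> dict:
    """Parse a Strict-Transport-Security header value into its directives.

    Value-less directives (e.g. includeSubDomains) map to True; directive
    names are case-insensitive per the spec, so we lowercase them.
    """
    directives = {}
    for part in header.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.lower()] = value or True
    return directives


policy = parse_hsts("max-age=31536000; includeSubDomains")
# While max-age seconds remain, a conforming browser rewrites any
# http://hg.mozilla.org/ navigation to https:// before sending it.
```
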
Flags: needinfo?(gps)
Comment 92•7 years ago
TIL - thanks! I'll update the plan to be:

- switch to 302 initially, shortening the cache retention policy to 10min
- 2 days later, switch to 301 (assuming no issues) and return the cache retention policy to current values

There will be a 2nd CAB ticket for the 2nd action, but that's an even less visible non-event.
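The two-step plan above boils down to serving a different status code and cache lifetime per rollout phase. A minimal sketch of that logic (phase names and header values are illustrative, not the actual Zeus configuration):

```python
def redirect_headers(phase: str) -> tuple:
    """Return (status code, Cache-Control value) for a rollout phase."""
    if phase == "trial":
        # Step 1: 302 is revertible; short cache limits lock-in to ~10 min.
        return 302, "max-age=600"
    if phase == "permanent":
        # Step 2: 301 is cacheable long-term; restore the normal 1h expiry.
        return 301, "max-age=3600"
    raise ValueError("unknown phase: %s" % phase)
```

The point of the short `max-age` in the first phase is that any intermediary cache that stored the redirect will forget it quickly if the change has to be rolled back.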
Comment 93•7 years ago
For those that want to follow the CAB process:

- 1st change: add 302 redirect HTTP->HTTPS: CHG0011207
- 2nd change: convert 302 redirect to 301 redirect (one week later): CHG0011215
Comment 94•7 years ago
Manual test case to be done by releng staff:

- log into any buildbot master, e.g. buildbot-master87
- activate the buildbot virtual env, e.g. source /builds/buildbot/*/bin/activate
- cd /tmp && hg clone http://hg.mozilla.org/build/mozharness

Test passes if:

- the clone is successful
- 2 warning messages are presented regarding "warning: connecting to <host> using legacy security technology (TLS 1.0)" - one for the bundle from S3, one from hg.mozilla.org
Comment 95•7 years ago
(In reply to Hal Wine [:hwine] (use NI) from comment #88)
> The cutover should not cause any service disruption, however it will be
> useful to have plenty of eyes around "just in case". That makes the cutover
> a good candidate for our Wed morning (PT) window. Based on comment 74, the 2
> upcoming dates are Feb 1 & Feb 8.
>
> I'll file a CAB ticket for Feb 1 (as the CAB for that will be Jan 25). If
> someone hollers, we can push to Feb 8.
>
> :dividehex has confirmed rollback plan is done (comment 87)
>
> :ewong - looks like you're ready to land on bug 1305911 -- any concerns with
> a Feb 1 date?

No concerns, thanks for the heads up Hal!

Edmund
Flags: needinfo?(ewong)
Comment 96•7 years ago
(:gcox noticed the 301/302 thing, actually, I just wrote a longer reply about it :)
Comment 97•7 years ago
Using a 302 will require a new trafficscript rule in zeus; all of the existing httpd redirects use http.changeSite() or the http redirect action, both of which use a 301. http.redirect() doesn't preserve any path or query string; it just expects a URL as an argument, so we'll need to cobble something together. :atoll, any thoughts?
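Whatever the trafficscript rule ends up looking like, its core job is rebuilding the Location header from the request's host, path, and query string so nothing is dropped. The logic, sketched in Python purely for illustration (this is not Zeus TrafficScript):

```python
def build_location(host: str, path: str, query: str) -> str:
    """Rebuild an https:// Location header, preserving path and query.

    `query` is the raw query string without the leading '?'; pass "" when
    the request had none.
    """
    location = "https://" + host + path
    if query:
        location += "?" + query
    return location
```

For example, a Mercurial wire-protocol request like `GET /mozilla-central?cmd=known` must redirect to `https://hg.mozilla.org/mozilla-central?cmd=known`, or hg clients following the redirect would lose the command.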
Flags: needinfo?(rsoderberg)
Comment 98•7 years ago
Decision reached and approved by :gps via email: we will go with a 301 redirect HTTP->HTTPS in the load balancer. Thanks all!
Flags: needinfo?(rsoderberg)
Comment 99•7 years ago
Http -> https redirect has been enabled on zlb.

└─▪ curl -I http://hg.mozilla.org
HTTP/1.1 301 Moved Permanently
Content-Type: text/html
Date: Wed, 01 Feb 2017 16:00:31 GMT
Location: https://hg.mozilla.org/
Connection: Keep-Alive
Content-Length: 0
Updated•7 years ago
Status: REOPENED → RESOLVED
Closed: 15 years ago → 7 years ago
Resolution: --- → FIXED