Closed Bug 1350285 Opened 3 years ago Closed 3 years ago

Enable httppostargs on hg.mozilla.org

Categories

(Developer Services :: Mercurial: hg.mozilla.org, defect)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jlorenzo, Assigned: gps)

Attachments

(1 file)

Sorry for the blocker status; this is preventing Firefox 52.0.2 and 52.0.2esr from shipping.

This issue has occurred about a dozen times today. It always fails against the nb-NO repo. Other repos don't seem to be affected. For instance:
* mozilla-release https://tools.taskcluster.net/task-inspector/#do4vVje2QjiV7Hj-tYIGEQ/1
* https://tools.taskcluster.net/task-inspector/#TkRWwp0yR8CZCKSaLPD8Ig/0

Here is a copy of the logs:
> 03:27:48     INFO - Running command: ['hg', '--config', 'ui.merge=internal:merge', '--config', 'extensions.robustcheckout=/builds/slave/rel-m-rel_fx_lx_l10n_rpk-00000/scripts/external_tools/robustcheckout.py', 'robustcheckout', u'https://hg.mozilla.org/releases/l10n/mozilla-release/nb-NO', u'nb-NO', '--sharebase', '/builds/hg-shared', '--branch', u'c84b8fb6b939']
> 03:27:48     INFO - Copy/paste: hg --config ui.merge=internal:merge --config extensions.robustcheckout=/builds/slave/rel-m-rel_fx_lx_l10n_rpk-00000/scripts/external_tools/robustcheckout.py robustcheckout https://hg.mozilla.org/releases/l10n/mozilla-release/nb-NO nb-NO --sharebase /builds/hg-shared --branch c84b8fb6b939
> 03:27:49     INFO -  ensuring https://hg.mozilla.org/releases/l10n/mozilla-release/nb-NO@c84b8fb6b939 is available at nb-NO
> 03:27:49     INFO -  warning: connecting to hg.mozilla.org using legacy security technology (TLS 1.0); see https://mercurial-scm.org/wiki/SecureConnections for more info
> 03:27:49     INFO -  (sharing from existing pooled repository 961f733cbd13415feeef46c0e4a35e4dede6fca8)
> 03:27:49     INFO -  searching for changes
> 03:27:49     INFO -  no changes found
> 03:27:49     INFO -  Traceback (most recent call last):
> 03:27:49     INFO -    File "/tools/python27-mercurial/lib/python2.7/site-packages/mercurial/dispatch.py", line 204, in _runcatch
> 03:27:49     INFO -      return _dispatch(req)
> 03:27:49     INFO -    File "/tools/python27-mercurial/lib/python2.7/site-packages/mercurial/dispatch.py", line 880, in _dispatch
> 03:27:49     INFO -      cmdpats, cmdoptions)
> 03:27:49     INFO -    File "/tools/python27-mercurial/lib/python2.7/site-packages/mercurial/dispatch.py", line 637, in runcommand
> 03:27:49     INFO -      ret = _runcommand(ui, options, cmd, d)
> 03:27:49     INFO -    File "/tools/python27-mercurial/lib/python2.7/site-packages/mercurial/dispatch.py", line 1010, in _runcommand
> 03:27:49     INFO -      return checkargs()
> 03:27:49     INFO -    File "/tools/python27-mercurial/lib/python2.7/site-packages/mercurial/dispatch.py", line 971, in checkargs
> 03:27:49     INFO -      return cmdfunc()
> 03:27:49     INFO -    File "/tools/python27-mercurial/lib/python2.7/site-packages/mercurial/dispatch.py", line 877, in <lambda>
> 03:27:49     INFO -      d = lambda: util.checksignature(func)(ui, *args, **cmdoptions)
> 03:27:49     INFO -    File "/tools/python27-mercurial/lib/python2.7/site-packages/mercurial/util.py", line 1036, in check
> 03:27:49     INFO -      return func(*args, **kwargs)
> 03:27:49     INFO -    File "/builds/slave/rel-m-rel_fx_lx_l10n_rpk-00000/scripts/external_tools/robustcheckout.py", line 163, in robustcheckout
> 03:27:49     INFO -      sharebase, networkattempts)
> 03:27:49     INFO -    File "/builds/slave/rel-m-rel_fx_lx_l10n_rpk-00000/scripts/external_tools/robustcheckout.py", line 286, in _docheckout
> 03:27:49     INFO -      shareopts={'pool': sharebase, 'mode': 'identity'})
> 03:27:49     INFO -    File "/tools/python27-mercurial/lib/python2.7/site-packages/mercurial/hg.py", line 496, in clone
> 03:27:49     INFO -      stream=stream)
> 03:27:49     INFO -    File "/tools/python27-mercurial/lib/python2.7/site-packages/mercurial/hg.py", line 385, in clonewithshare
> 03:27:49     INFO -      exchange.pull(destrepo, srcpeer, heads=revs)
> 03:27:49     INFO -    File "/tools/python27-mercurial/lib/python2.7/site-packages/mercurial/extensions.py", line 210, in closure
> 03:27:49     INFO -      return func(*(args + a), **kw)
> 03:27:49     INFO -    File "/usr/local/lib/hgext/bundleclone.py", line 590, in pull
> 03:27:49     INFO -      res = orig(repo, remote, *args, **kwargs)
> 03:27:49     INFO -    File "/tools/python27-mercurial/lib/python2.7/site-packages/mercurial/exchange.py", line 1186, in pull
> 03:27:49     INFO -      _pullbundle2(pullop)
> 03:27:49     INFO -    File "/tools/python27-mercurial/lib/python2.7/site-packages/mercurial/exchange.py", line 1325, in _pullbundle2
> 03:27:49     INFO -      bundle = pullop.remote.getbundle('pull', **kwargs)
> 03:27:49     INFO -    File "/tools/python27-mercurial/lib/python2.7/site-packages/mercurial/wireproto.py", line 396, in getbundle
> 03:27:49     INFO -      f = self._callcompressable("getbundle", **opts)
> 03:27:49     INFO -    File "/tools/python27-mercurial/lib/python2.7/site-packages/mercurial/httppeer.py", line 273, in _callcompressable
> 03:27:49     INFO -      stream = self._callstream(cmd, **args)
> 03:27:49     INFO -    File "/tools/python27-mercurial/lib/python2.7/site-packages/mercurial/httppeer.py", line 153, in _callstream
> 03:27:49     INFO -      resp = self.urlopener.open(req)
> 03:27:49     INFO -    File "/tools/python27/lib/python2.7/urllib2.py", line 406, in open
> 03:27:49     INFO -      response = meth(req, response)
> 03:27:49     INFO -    File "/tools/python27/lib/python2.7/urllib2.py", line 519, in http_response
> 03:27:49     INFO -      'http', request, response, code, msg, hdrs)
> 03:27:49     INFO -    File "/tools/python27/lib/python2.7/urllib2.py", line 444, in error
> 03:27:49     INFO -      return self._call_chain(*args)
> 03:27:49     INFO -    File "/tools/python27/lib/python2.7/urllib2.py", line 378, in _call_chain
> 03:27:49     INFO -      result = func(*args)
> 03:27:49     INFO -    File "/tools/python27/lib/python2.7/urllib2.py", line 527, in http_error_default
> 03:27:49     INFO -      raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
> 03:27:49     INFO -  HTTPError: HTTP Error 413: Request Entity Too Large
> 03:27:49     INFO -  abort: HTTP Error 413: Request Entity Too Large
> 03:27:49    ERROR - Return code: 255
> 03:27:49     INFO - rmtree: nb-NO
> 03:27:49     INFO - retry: Calling rmtree with args: (u'nb-NO',), kwargs: {}, attempt #1
> 03:27:49     INFO - retry: attempt #5 caught exception: repo checkout failed!
> 03:27:49    FATAL - Automation Error: Can't checkout https://hg.mozilla.org/releases/l10n/mozilla-release/nb-NO!


NI'ing :fubar per https://wiki.mozilla.org/DeveloperServices#Oncall_for_VCS
Flags: needinfo?(klibby)
I've doubled max_client_buffer on the zeus virtual server for hgmo, as we did for bug 898638; please see if that helps?  

As an aside, what does robustcheckout do? hg clone on that repo works fine, and we really shouldn't have to be tweaking that setting. :gps, do you know?
Flags: needinfo?(klibby)
Flags: needinfo?(jlorenzo)
Flags: needinfo?(gps)
See Also: → 898638
Traffic Manager GUI --> Services --> Virtual Servers --> 
<Select concerned Virtual Server> --> Edit Protocol Settings --> 
Memory Limits --> Set 'max_client_buffer' to appropriate value in bytes -->
Update
See Also: → 1350136
Looks like the default value for max_client_buffer is 65536 and we'd bumped it up to 131072 in bug 898638. We really should see if there's a better fix for this, as it does increase memory usage on the zlb nodes.
It works! Both 52.0.2[1] and 52.0.2esr[2] managed to check out all locales. As :fubar asked on IRC, let's not close this bug yet, as a better fix is coming.

Thanks for the help, Kendall!

[1] https://tools.taskcluster.net/task-group-inspector/#/rLO5fOiOS8-9t94gsxW1rA
[2] https://tools.taskcluster.net/task-group-inspector/#/UN4gvLPIT4GDoaixLoYeIQ
Flags: needinfo?(jlorenzo)
Severity: blocker → normal
When Mercurial runs `hg pull` or `hg push`, it goes through a process called "discovery" to determine the minimal amount of data to exchange. It does this by exchanging lists of heads and nodes with the remote peer. Over the HTTP protocol, these lists are sent in HTTP request headers, e.g. X-HgArg-1, X-HgArg-2, etc. This is described at https://www.mercurial-scm.org/repo/hg/file/default/mercurial/help/internals/wireprotocol.txt.
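
To illustrate how argument lists end up in request headers, here is a minimal Python sketch of the splitting idea. It is not Mercurial's code: the function name, the 1024-byte per-header budget, and the fake node IDs are assumptions made for illustration, and the numbered header names follow the `X-HgArg-<N>` pattern as I understand it from the wire protocol documentation linked above.

```python
from urllib.parse import urlencode


def split_args_into_headers(args, header_size=1024):
    """Spread URL-encoded command arguments over numbered request headers.

    header_size is an assumed per-header budget, not a value from this bug.
    """
    encoded = urlencode(sorted(args.items()))
    # Room left for the value once the header name and ": \r\n" are counted.
    # The "000" placeholder reserves space for up to three index digits.
    chunk_len = header_size - len("X-HgArg-000" + ": \r\n")
    headers = {}
    for index, start in enumerate(range(0, len(encoded), chunk_len), start=1):
        headers["X-HgArg-%d" % index] = encoded[start:start + chunk_len]
    return headers


# 100 fake 40-character hex node IDs, standing in for a real heads list.
fake_heads = " ".join("%040x" % n for n in range(100))
headers = split_args_into_headers({"heads": fake_heads, "common": "0" * 40})
print(len(headers), "headers,", sum(len(v) for v in headers.values()), "value bytes")
```

The point of the sketch is that the total header volume grows linearly with the number of nodes exchanged, no matter how the values are chunked.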

For repos with a large number of heads, the size of the lists of nodes could be several kilobytes. I think it is 41 or 42 bytes per head.
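
As a rough back-of-the-envelope check, the sketch below estimates how many heads fit under a given byte limit. It assumes 41 bytes per head (a 40-character hex node ID plus one separator), which is the lower of the two figures guessed above, and it ignores every other header in the request, so the results are upper bounds. The limits 65536 and 131072 come from this bug; 262144 is inferred from "doubled" and is an assumption.

```python
def discovery_arg_bytes(num_heads, bytes_per_head=41):
    """Approximate size of a heads list sent as command arguments."""
    return num_heads * bytes_per_head


def max_heads_for_limit(limit_bytes, bytes_per_head=41):
    """Largest head count whose list still fits within limit_bytes."""
    return limit_bytes // bytes_per_head


for limit in (65536, 131072, 262144):
    print("limit", limit, "bytes -> at most", max_heads_for_limit(limit), "heads")
```

Under these assumptions a repository needs only a few thousand heads before a single discovery request can exceed the buffer.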

For security and performance reasons (search "slowloris"), HTTP servers often limit HTTP request sizes. This includes individual HTTP header length and the total size of HTTP headers.

The HTTP servers running on the actual Mercurial servers have increased the limits to allow pretty much any request through. However, since the load balancer is also speaking HTTP, it also has limits.

This "max_client_buffer" setting on the load balancer appears to effectively be a limit on HTTP request header size. The docs say this is for "read and write buffers" used when "streaming data between client and server." So the implication here is the load balancer is buffering the HTTP request up until the end of the headers, at which point it sends the request to an origin server. Why it is doing this, I'm not sure. It might be a security mitigation. It might be because it wants to examine the HTTP request in detail before it dispatches it (e.g. in case it wants to filter for banned requests or something). Things like this happen when network devices operate at OSI Layer 7 (e.g. HTTP) instead of something lower (e.g. dumb TLS termination and TCP proxy).
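
The behavior hypothesized above can be modeled in a few lines of Python. This is only a toy model of a proxy that buffers the request up to the end of the headers and refuses anything larger than its buffer; it is not how the load balancer is actually implemented, and the function name and sample requests are made up for illustration.

```python
def proxy_status(raw_request, max_client_buffer):
    """Return the status a header-buffering proxy might answer with."""
    end_of_headers = raw_request.find(b"\r\n\r\n")
    if end_of_headers == -1:
        # Headers never terminated: everything received so far is header data.
        header_bytes = len(raw_request)
    else:
        header_bytes = end_of_headers + len(b"\r\n\r\n")
    if header_bytes > max_client_buffer:
        return 413  # Request Entity Too Large
    return 200  # stand-in for "forwarded to an origin server"


big_headers = (
    b"GET /repo?cmd=getbundle HTTP/1.1\r\n"
    b"Host: example.invalid\r\n"
    b"X-HgArg-1: " + b"a" * 200000 + b"\r\n\r\n"
)
big_body = (
    b"POST /repo?cmd=getbundle HTTP/1.1\r\n"
    b"Host: example.invalid\r\n\r\n" + b"a" * 200000
)
print(proxy_status(big_headers, 131072), proxy_status(big_body, 131072))
```

In this model only the header portion counts against the buffer, which is why moving the same bytes into the request body would avoid the error.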

I see 2 options here:

1) Change the load balancer to not speak HTTP
2) Change Mercurial

Modern Mercurial clients and servers support exchanging arguments via HTTP POST bodies instead of headers. We have this enabled for the try repo so it doesn't run into these limits. We could enable it globally on hg.mozilla.org so we never run into this problem again (at least with modern Mercurial clients).
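
To make the difference between the two transports concrete, here is a hedged Python sketch that builds the headers and body for each mode without sending anything. It is a simplification, not Mercurial's implementation: the function names, the 1024-byte header budget, and the fake heads are assumptions, and the `X-HgArgs-Post` length header reflects my reading of the wire protocol documentation.

```python
from urllib.parse import urlencode


def build_request(args, use_post, header_size=1024):
    """Return (headers, body) carrying command arguments in one of two ways."""
    encoded = urlencode(sorted(args.items()))
    if use_post:
        # Arguments travel in the body; one small header announces their length.
        return {"X-HgArgs-Post": str(len(encoded))}, encoded.encode("ascii")
    # Arguments travel in numbered headers, chunked to an assumed budget.
    chunk_len = header_size - len("X-HgArg-000: \r\n")
    headers = {
        "X-HgArg-%d" % i: encoded[start:start + chunk_len]
        for i, start in enumerate(range(0, len(encoded), chunk_len), start=1)
    }
    return headers, b""


def header_bytes(headers):
    """Bytes the headers occupy on the wire, counting ': ' and CRLF."""
    return sum(len(k) + 2 + len(v) + 2 for k, v in headers.items())


# 4000 fake heads, enough to exceed a 131072-byte buffer in header mode.
many_heads = {"heads": " ".join("%040x" % n for n in range(4000))}
get_headers, get_body = build_request(many_heads, use_post=False)
post_headers, post_body = build_request(many_heads, use_post=True)
print("header mode:", header_bytes(get_headers), "header bytes")
print("post mode:  ", header_bytes(post_headers), "header bytes,", len(post_body), "body bytes")
```

With arguments in the body, the header size stays constant regardless of how many heads the repository has.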
Flags: needinfo?(gps)
(In reply to Gregory Szorc [:gps] from comment #6)
> 
> This "max_client_buffer" setting on the load balancer appears to effectively
> be a limit on HTTP request header size. The docs say this is for "read and
> write buffers" used when "streaming data between client and server." So the
> implication here is the load balancer is buffering the HTTP request up until
> the end of the headers, at which point it sends the request to an origin
> server. Why it is doing this, I'm not sure. It might be a security
> mitigation. It might be because it wants to examine the HTTP request in
> detail before it dispatches it (e.g. in case it wants to filter for banned
> requests or something). Things like this happen when network devices operate
> at OSI Layer 7 (e.g. HTTP) instead of something lower (e.g. dumb TLS
> termination and TCP proxy).

Yes, the load balancer wants to see HTTP traffic in order to be able to potentially modify the traffic (request and response headers, typically). 

> I see 2 options here:
> 
> 1) Change the load balancer to not speak HTTP

That would take away this problem but introduce others, notably removing a lot of flexibility at the edge. I'm not thrilled with that idea.
Let's just enable the use of HTTP POST for command arguments on hg.mozilla.org then.

This is a one-liner in the hgrc, so the patch is easy to author. It may take me a few hours to get the test environment in order so I can run tests.

I also checked with the author of the feature and they didn't express any concerns. As long as we're not relying on auth for POST, the feature should "just work." (This is also why we can't easily enable the feature on reviewboard-hg.mozilla.org, which does require non-GET/HEAD requests be authed.)
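
For readers unfamiliar with the setting, the sketch below shows what such a one-line change might look like and how to check for it. The option name `experimental.httppostargs` is my understanding of the Mercurial server-side setting, not something quoted from the patch, so treat it as an assumption. Python's `configparser` only approximates hgrc syntax, and the helper name is hypothetical.

```python
import configparser

# Hypothetical server-side hgrc fragment; the section and option names are
# an assumption about the Mercurial setting, not copied from the patch.
HGRC_FRAGMENT = """\
[experimental]
httppostargs = true
"""


def httppostargs_enabled(hgrc_text):
    """Report whether an hgrc-style text turns on POST-based arguments."""
    parser = configparser.ConfigParser()
    parser.read_string(hgrc_text)
    return parser.getboolean("experimental", "httppostargs", fallback=False)


print(httppostargs_enabled(HGRC_FRAGMENT))
```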
Assignee: nobody → gps
Status: NEW → ASSIGNED
Summary: Cannot robust checkout https://hg.mozilla.org/releases/l10n/mozilla-release/nb-NO : HTTP Error 413: Request Entity Too Large → Enable httppostargs on hg.mozilla.org
Comment on attachment 8852717 [details]
ansible/hg-web: enable accepting arguments via HTTP POST (bug 1350285);

https://reviewboard.mozilla.org/r/124878/#review127502

lgtm
Attachment #8852717 - Flags: review?(glob) → review+
Pushed by gszorc@mozilla.com:
https://hg.mozilla.org/hgcustom/version-control-tools/rev/3c1617cd2563
ansible/hg-web: enable accepting arguments via HTTP POST ; r=glob
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Reverted max_client_buffer to its previous value of 131072 bytes.