Closed
Bug 1361006
Opened 8 years ago
Closed 4 years ago
intermittent failures from discourse.mozilla-community.org ([429] too many requests, [503] service unavailable)
Categories
(Infrastructure & Operations :: Community IT: Discourse, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: glob, Assigned: yousef)
Details
I'm seeing intermittent failures from discourse.mozilla-community.org ([429] Too Many Requests and [503] Service Unavailable). A 503 looks like this:
~$ curl -v https://discourse.mozilla-community.org/
* Trying 52.54.14.146...
* TCP_NODELAY set
* Connected to discourse.mozilla-community.org (52.54.14.146) port 443 (#0)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate: mozilla-community.org
* Server certificate: Amazon
* Server certificate: Amazon Root CA 1
* Server certificate: Starfield Services Root Certificate Authority - G2
> GET / HTTP/1.1
> Host: discourse.mozilla-community.org
> User-Agent: curl/7.51.0
> Accept: */*
>
< HTTP/1.1 503 Service Unavailable
< Cache-Control: no-cache
< Content-Type: text/html
< Content-Length: 108
< Connection: keep-alive
<
<html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>
* Curl_http_done: called premature == 0
* Connection #0 to host discourse.mozilla-community.org left intact
The 429 error is more problematic, as the server returns an empty response body instead of an error message:
~$ curl -v https://discourse.mozilla-community.org/
* Trying 54.172.10.109...
* TCP_NODELAY set
* Connected to discourse.mozilla-community.org (54.172.10.109) port 443 (#0)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate: mozilla-community.org
* Server certificate: Amazon
* Server certificate: Amazon Root CA 1
* Server certificate: Starfield Services Root Certificate Authority - G2
> GET / HTTP/1.1
> Host: discourse.mozilla-community.org
> User-Agent: curl/7.51.0
> Accept: */*
>
< HTTP/1.1 429
< Date: Mon, 01 May 2017 06:25:43 GMT
< Server: nginx
< Content-Length: 0
< Connection: keep-alive
<
* Curl_http_done: called premature == 0
* Connection #0 to host discourse.mozilla-community.org left intact
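For clients hitting these responses, a retry loop that treats 429 and 503 as transient is one possible workaround. The following is a minimal Python sketch, not anything deployed for this bug: the function names (`http_get`, `backoff_delay`, `get_with_retry`) and the delay values are illustrative assumptions. The 429 captured above carries no `Retry-After` header, so the sketch falls back to exponential backoff when that header is absent.

```python
import time
import urllib.error
import urllib.request

# Status codes from this report that are worth retrying.
RETRYABLE = {429, 503}


def http_get(url):
    """Return (status, headers, body) without raising on HTTP error statuses."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            headers = {k.lower(): v for k, v in resp.headers.items()}
            return resp.status, headers, resp.read()
    except urllib.error.HTTPError as err:
        headers = {k.lower(): v for k, v in err.headers.items()}
        return err.code, headers, err.read()


def backoff_delay(attempt, headers, base_delay=1.0, max_delay=60.0):
    """Prefer a numeric Retry-After header; otherwise back off exponentially."""
    retry_after = headers.get("retry-after", "")
    if retry_after.isdigit():
        return min(float(retry_after), max_delay)
    return min(base_delay * (2 ** attempt), max_delay)


def get_with_retry(url, fetch=http_get, max_attempts=5, sleep=time.sleep):
    """Fetch url, retrying on 429/503; returns the last (status, body) seen."""
    for attempt in range(max_attempts):
        status, headers, body = fetch(url)
        if status not in RETRYABLE:
            return status, body
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt, headers))
    return status, body
```

Calling `get_with_retry("https://discourse.mozilla-community.org/")` would return the first non-429/503 response, or the last failing one once `max_attempts` is exhausted. The `fetch` and `sleep` parameters exist only so the loop can be exercised without network access.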
Updated•8 years ago
Assignee: nobody → yousef
Comment 2•8 years ago
This seems to have stabilized now, with two potential causes:
Huge spam of token requests:
> /var/www/discourse/vendor/bundle/ruby/2.3.0/gems/oauth-0.5.1/lib/oauth/consumer.rb:217:in `token_request'
> /var/www/discourse/vendor/bundle/ruby/2.3.0/gems/oauth-0.5.1/lib/oauth/consumer.rb:136:in `get_request_token'
> /var/www/discourse/vendor/bundle/ruby/2.3.0/gems/omniauth-oauth-1.1.0/lib/omniauth/strategies/oauth.rb:28:in `request_phase'
> /var/www/discourse/vendor/bundle/ruby/2.3.0/gems/omniauth-twitter-1.3.0/lib/omniauth/strategies/twitter.rb:61:in `request_phase'
CDN Encoding error:
> Attempted to concat a non UTF-8 string in SafeBuffer - https://1� xa7�� - internal encoding UTF-8, external encoding UTF-8 attempted to append encoding ASCII-8BIT
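The encoding error above comes from Ruby, where a string tagged ASCII-8BIT (binary) was appended to a UTF-8 buffer. As a rough analogy only, the Python sketch below shows how a configured value could be checked for valid UTF-8 before it is used in a template. The helper name `check_utf8` and the sample bytes are invented for illustration and are not taken from the Discourse logs.

```python
def check_utf8(raw):
    """Return (is_valid, text); invalid bytes appear as U+FFFD in text."""
    try:
        return True, raw.decode("utf-8")
    except UnicodeDecodeError:
        # Decode again in a lossy mode so the offending value can be logged.
        return False, raw.decode("utf-8", errors="replace")


# Hypothetical values: a clean URL and one with a stray non-UTF-8 byte.
good = b"https://cdn.example/"
bad = b"https://cdn\xa7"

print(check_utf8(good))
print(check_utf8(bad))
```

A check like this would only flag the bad value; finding where the non-UTF-8 bytes entered the CDN setting would still need the server logs.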
Flags: needinfo?(leo)
Comment 3•8 years ago
I expect the 429s were caused by the flood of token requests, but I'm not sure about the 503s; they could have been caused by the CDN issue. They also tend to pop up when Mesos brings containers up and down.
Yousef, can you check that we're properly forwarding client IP addresses? The flood of requests seemed to come [1] from a vulnerability scanner, Acunetix. Maybe we need to put better DoS protection in place?
As for the CDN encoding error, it seemed to affect both the server side [2] and the client side [3]. Can you enable the CDN on staging to see if it's still a problem?
[1] https://discourse.mozilla-community.org/logs/show/c929e60cb4da64d9867429f7440ad5d7
[2] https://discourse.mozilla-community.org/logs/show/744d58baa584463ea1b367493521947b
[3] https://discourse.mozilla-community.org/logs/show/4c48b01e5e9b376e7b9ff3b49cdc6c1e
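The question about forwarding IP addresses matters for rate limiting: if the application only sees the load balancer's address, a per-IP limit could end up counting many visitors as one client. The Python sketch below illustrates the idea under stated assumptions; it is not the configuration used on this deployment. The names `client_ip` and `SlidingWindowLimiter`, the trusted proxy set, and the limits are all hypothetical.

```python
import time
from collections import defaultdict, deque


def client_ip(remote_addr, forwarded_for, trusted_proxies):
    """Pick the client address, trusting X-Forwarded-For only from known proxies."""
    if remote_addr not in trusted_proxies or not forwarded_for:
        return remote_addr
    hops = [h.strip() for h in forwarded_for.split(",") if h.strip()]
    # Walk from the right: skip our own proxies, stop at the first untrusted hop.
    for hop in reversed(hops):
        if hop not in trusted_proxies:
            return hop
    return remote_addr


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key within `window_seconds`."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.hits = defaultdict(deque)

    def allow(self, key):
        now = self.clock()
        recent = self.hits[key]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False  # the caller would answer with 429 here
        recent.append(now)
        return True
```

In use, a request handler would compute `key = client_ip(...)` from the socket address and the `X-Forwarded-For` header, then reject the request when `limiter.allow(key)` returns `False`. Trusting the header only when the direct peer is a known proxy keeps a scanner from spoofing its way around the limit.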
Flags: needinfo?(leo) → needinfo?(yousef)
Status: NEW → RESOLVED
Closed: 4 years ago
Flags: needinfo?(yousef)
Resolution: --- → FIXED