Closed Bug 1252504 Opened 9 years ago Closed 9 years ago

https://brasstacks.mozilla.com (i.e. orangefactor) intermittently terminating connections prematurely

Categories

(Tree Management Graveyard :: OrangeFactor, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: keeler, Assigned: rwatson)

Details

(Whiteboard: [kanban:https://webops.kanbanize.com/ctrl_board/2/2635] )

From bug 1252068 comment 0:

(Armen Zambrano [:armenzg] - Engineering productivity)
> I tried to load this page:
> https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1244936&startday=2016-02-22&endday=2016-02-28&tree=all
>
> yet Firefox does not give me a way to do so. I can't acknowledge that I
> accept the risk.
>
> Here's a screenshot of the message:
> http://people.mozilla.org/~armenzg/sattap/2b6b747f.png

And bug 1252068 comment 5:

(Armen Zambrano [:armenzg] - Engineering productivity)
> It is happening again this morning:
> https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1204281&startday=2016-02-29&endday=2016-02-29&tree=all

Bug 1252068 morphed to be about how the UI is unclear. However, it appears that there is something wrong with brasstacks.mozilla.com itself.
This is an SSL negotiation failure. Maybe we're rate-limiting. What's your external IPv4 address right now, :armenzg?
Flags: needinfo?(armenzg)
Assignee: infra → server-ops-webops
Component: Infrastructure: Other → WebOps: Other
QA Contact: cshields → smani
Whiteboard: [kanban:https://webops.kanbanize.com/ctrl_board/2/2635]
24.212.167.41 ? That is what https://www.whatismyip.com/ says
Flags: needinfo?(armenzg)
I forgot to mention: when I can't load it in Firefox, I can still load it in Chrome. Would that rule out the IP address theory?
I also see this with this website: https://data.sparkfun.com/robs_office
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → DUPLICATE
FWIW, https://brasstacks.mozilla.com/ looks like it might have some buggy behaviour, assuming I'm reading RFC 5246 correctly.

Some background info:
- https://brasstacks.mozilla.com/ -> TLS 1.1 and TLS 1.2 only, according to SSL Labs.
- https://data.sparkfun.com/robs_office -> TLS 1.2 only, according to SSL Labs.
- We try to connect with a Firefox build with "security.tls.version.max" set to 1 (i.e. max of TLS 1.0).
- https://tools.ietf.org/html/rfc5246#appendix-E.1:
> If server supports (or is
> willing to use) only versions greater than client_version, it MUST
> send a "protocol_version" alert message and close the connection.

https://brasstacks.mozilla.com:
A packet capture shows that the protocol_version alert is never sent. The server seems to just close the connection in response. In Firefox, this results in the "[...] was interrupted while the page was loading." error page.

https://data.sparkfun.com/robs_office:
A packet capture shows that the protocol_version alert *is* sent. In Firefox, this results in the "SSL_ERROR_PROTOCOL_VERSION_ALERT" error page.

I guess in both cases there is pretty much nothing the end user can do, but SSL_ERROR_PROTOCOL_VERSION_ALERT is at least clearer about what went wrong.
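The two server behaviours described above (alert sent vs. connection simply closed) can also be told apart without a packet capture. The following is only a rough sketch using Python's standard `ssl` module, not what was used in this bug. The function names `classify_handshake_error` and `probe` are made up for illustration, and the exact exception and `reason` string depend on the local OpenSSL build, which may also refuse to offer TLS 1.0 at all.

```python
import socket
import ssl


def classify_handshake_error(exc: Exception) -> str:
    """Roughly map a failed-handshake exception to the server behaviour it suggests."""
    # SSLEOFError is a subclass of SSLError, so it has to be checked first.
    # It (or a TCP reset) is what a client typically sees when the server
    # drops the connection without sending any TLS alert.
    if isinstance(exc, (ssl.SSLEOFError, ConnectionResetError)):
        return "closed-without-alert"
    if isinstance(exc, ssl.SSLError):
        # `reason` is an OpenSSL mnemonic such as TLSV1_ALERT_PROTOCOL_VERSION;
        # it may be missing or None, hence the defensive lookup.
        reason = getattr(exc, "reason", None) or ""
        if "ALERT_PROTOCOL_VERSION" in reason:
            return "protocol-version-alert"
        return "other-tls-error"
    return "non-tls-error"


def probe(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Attempt a TLS 1.0-only handshake and describe how the server reacts."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Certificate checks are disabled because only the version negotiation
    # is of interest here; do not do this for real traffic.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    # Mimic security.tls.version.max = 1 (TLS 1.0 at most).
    ctx.minimum_version = ssl.TLSVersion.TLSv1
    ctx.maximum_version = ssl.TLSVersion.TLSv1
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host):
                return "handshake-succeeded"
    except (ssl.SSLError, OSError) as exc:
        return classify_handshake_error(exc)
```

Calling `probe("brasstacks.mozilla.com")` would be the intended use. A "closed-without-alert" result would match the first packet capture above, and "protocol-version-alert" the second.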
Unusually, brasstacks does NOT do its SSL negotiation in the load balancer, but instead decodes it itself on the backend server.

While we could theoretically take steps to repair it to behave normally, we won't - to quote bug 1236021#c5:

> I don't believe it should be IT managed; it's effectively an a-team project
> server running an EOL webapp that we hope to replace with equivalent Treeherder
> functionality (and decom the box) in the next few months.
I didn't see this since it wasn't filed in the OrangeFactor component (it's the only service running on that box at the moment), moving there now :-)

(In reply to Richard Soderberg [:atoll] from comment #7)
> Unusually, brasstacks does NOT do its SSL negotiation in the load balancer,
> but instead decodes it itself on the backend server.
>
> While we could theoretically take steps to repair it to behave normally, we
> won't - to quote bug 1236021#c5:
>
> > I don't believe it should be IT managed; it's effectively an a-team project
> > server running an EOL webapp that we hope to replace with equivalent Treeherder
> > functionality (and decom the box) in the next few months.

I'd be happy to have TLS termination be performed on the load balancer - I don't know why it's currently set up as-is (I've only recently had the pleasure of trying to maintain this ancient box).

By the "not IT managed" comment in that bug, I meant more "we're happy to manage the app, and don't need people paged", but that may not have been clear.
Assignee: server-ops-webops → nobody
Component: WebOps: Other → OrangeFactor
Product: Infrastructure & Operations → Tree Management
QA Contact: smani
Version: other → ---
(In reply to :Cykesiopka from comment #6)
> FWIW, https://brasstacks.mozilla.com/ looks like it might have some buggy
> behaviour, assuming I'm reading RFC 5246 correctly.
>
> Some background info:
> - https://brasstacks.mozilla.com/ -> TLS 1.1 and TLS 1.2 only, according to
> SSL Labs.
> - https://data.sparkfun.com/robs_office -> TLS 1.2 only, according to SSL
> Labs.
> ...

If it helps, the current config used is the one recommended by:
https://mozilla.github.io/server-side-tls/ssl-config-generator/
(I used the modern profile)
(In reply to Ed Morley [:emorley] from comment #9)
> If it helps, the current config used is the one recommended by:
> https://mozilla.github.io/server-side-tls/ssl-config-generator/
> (I used the modern profile)

Which is perfectly fine and something I'm glad is being used.

In any case, the intersection of people who would ever connect to https://brasstacks.mozilla.com/ and those that for whatever reason don't have TLS 1.1 or TLS 1.2 enabled is probably pretty small, so IMO there's no real rush to do anything here. Just something to fix at some point in the future.
> IMO there's no real rush to do anything here. Just something to fix at some point in the future.

Reading the above, I'm struggling to tell which is the cause for the issues in this bug:
1) We're using the modern profile from the generator, which causes this behaviour, but is expected (in which case it's wontfix, since I don't mind about not supporting older devices).
2) The modern profile from the generator is actually broken, and needs a bug fix upstream.
3) Our (older) version of nginx is somehow not playing nicely with our chosen config.
4) <something else>

(stumbled upon this, is fairly apt: https://insouciant.org/images/2012/10/sslihavenoideawhatimdoing.png)
Ah! Sorry. In that case, we would be happy to do the SSL termination if you want to switch over to us.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
(In reply to Richard Soderberg [:atoll] from comment #12)
> Ah! Sorry. In that case, we would be happy to do the SSL termination if you
> want to switch over to us.

If people would prefer us to, I'm more than happy to try. I've never seen the behaviour in comment 0 myself. Armen, was this an old browser version?

The current nginx config has:

    server {
        listen 80;
        server_name localhost;
        include /etc/nginx/default_locations/*.conf;
    }

    server {
        listen 443;
        server_name brasstacks.mozilla.com;
        include /etc/nginx/default_locations/*.conf;
        # <SNIP SSL config>
    }

If we did switch I guess we'd need to add brasstacks.mozilla.com to the server_name attribute for the port 80 config.

The only other thing I thought might need tweaking is /etc/nginx/default_locations/orangefactor.conf, specifically:

    location /orangefactor/api/ {
        ...
        fastcgi_param REMOTE_ADDR $remote_addr;
        fastcgi_param REMOTE_PORT $remote_port;
        fastcgi_param SERVER_ADDR $server_addr;
        fastcgi_param SERVER_PORT $server_port;
        fastcgi_param SERVER_NAME $server_name;
        fastcgi_param SERVER_PROTOCOL $server_protocol;
        ...
    }

However, in bug 1235097 comment 23 the protocol reported to the WSGI app was broken already (i.e. HTTP even for connections over HTTPS), which meant I had to hardcode the site root in bug 1235097 comment 25. As such, I think the locations config shouldn't be affected.
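For context on the "HTTP even for connections over HTTPS" point: when TLS is terminated in front of a WSGI app, a common alternative to hardcoding the site root is to restore the scheme from a header set by the terminating proxy. This is a hypothetical sketch only. It assumes the load balancer sets `X-Forwarded-Proto`, which nothing in this bug confirms, and `forwarded_proto_middleware` and `demo_app` are invented names rather than OrangeFactor code.

```python
def forwarded_proto_middleware(app):
    """Wrap a WSGI app so wsgi.url_scheme reflects the client-facing scheme.

    Assumes a trusted proxy in front of the app sets X-Forwarded-Proto;
    the header must not be trusted if clients can reach the app directly.
    """
    def wrapper(environ, start_response):
        # A chain of proxies may produce a comma-separated list; the first
        # entry is the scheme the original client used.
        header = environ.get("HTTP_X_FORWARDED_PROTO", "")
        proto = header.split(",")[0].strip().lower()
        if proto in ("http", "https"):
            environ["wsgi.url_scheme"] = proto
        return app(environ, start_response)
    return wrapper


def demo_app(environ, start_response):
    """Tiny app that echoes the scheme it believes the request used."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ["wsgi.url_scheme"].encode("ascii")]


application = forwarded_proto_middleware(demo_app)
```

With this in place, URL generation inside the app would see "https" for requests that arrived over TLS at the proxy, so the site root would not need to be hardcoded.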
Talked briefly with Ed and we're going to leave this as-is, with the server terminating its own SSL, and see if an Nginx upgrade fixes the issue at some point.
Assignee: nobody → rwatson
I think this is fixed?
Thanks Ed.
Status: REOPENED → RESOLVED
Closed: 9 years ago → 9 years ago
Resolution: --- → FIXED
Product: Tree Management → Tree Management Graveyard