853885 - HTTP Caching issues on sumo production

Reporter

Description

11 years ago

Users started reporting caching issues on prod yesterday. Basically, they reply to a question in the support forum and get redirected to a cached version of the page without their reply. I created a test question and can confirm the issue:
https://support.mozilla.org/en-US/questions/954201

I am seeing `X-Cache-Info: cached` in the response headers after replying. If I refresh the page, I see `X-Cache-Info: caching`.

This is puzzling. We've never cached our HTTPS traffic on SUMO. Did something change recently?

James Socol [:jsocol, :james]

Updated

11 years ago

Summary: Caching issues on sumo production → HTTP Caching issues on sumo production

Jake Maul [:jakem]

Assignee

Comment 1

11 years ago

Zeus sets that header, and it obeys whatever Cache-Control headers are sent by the servers (up to a cap... if you set a 1-hour max-age, it will only cache for 10 minutes).

In addition, if no cache headers are sent at all, it will cache for its default timeout, which is 10 minutes. For this reason alone it's a good idea to send something all the time.
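
To make the rule above concrete, here is a minimal sketch of the TTL the load balancer would end up using according to this comment. It is only a model of the behavior described here, not Zeus's actual implementation: the function name `effective_ttl`, the two 600-second constants, and the handling of `no-cache`/`no-store`/`private` are assumptions for illustration.

```python
import re

DEFAULT_TTL = 600  # assumed: the 10-minute default timeout described above
MAX_TTL = 600      # assumed: the 10-minute cap described above


def effective_ttl(cache_control=None):
    """Return how many seconds a response would be cached (model only)."""
    if cache_control is None:
        # No cache headers at all: fall back to the default timeout.
        return DEFAULT_TTL
    value = cache_control.lower()
    # Directive names without their arguments, e.g. {"public", "max-age"}.
    names = {part.split("=", 1)[0].strip() for part in value.split(",")}
    # Assumption: these directives keep the response out of a shared cache.
    if names & {"no-store", "no-cache", "private"}:
        return 0
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", value)
    if match:
        # A max-age is honored, but never beyond the cap.
        return min(int(match.group(1)), MAX_TTL)
    # Assumption: a header without max-age behaves like no header at all.
    return DEFAULT_TTL
```

Under these assumptions, `effective_ttl("max-age=3600")` is capped at 600 seconds, while `effective_ttl(None)` falls back to the same 600-second default.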


Double-check the cache headers direct from the servers... something like this, if you can:

curl -v -H 'Host: support.mozilla.org' http://support1.webapp.phx1.mozilla.com:81/page-to-check

When I do this with the page you linked, I get no cache headers at all. That explains why Zeus will cache for a short time.
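
For anyone who would rather script this check than run curl by hand, a rough Python equivalent might look like the sketch below. The helper names `pick_cache_headers` and `fetch_backend_headers` and the list of header names treated as cache-related are assumptions; the host, port, and placeholder path are the ones from the curl command above.

```python
import http.client

# Assumed set of response headers that influence caching.
CACHE_HEADERS = {"cache-control", "expires", "pragma", "etag", "last-modified", "vary"}


def pick_cache_headers(headers):
    """Keep only the caching-related headers from (name, value) pairs."""
    return {name: value for name, value in headers if name.lower() in CACHE_HEADERS}


def fetch_backend_headers(backend, port, path, host_header):
    """GET `path` straight from one backend, bypassing the load balancer."""
    conn = http.client.HTTPConnection(backend, port, timeout=10)
    try:
        # An explicit Host header replaces the one http.client would send.
        conn.request("GET", path, headers={"Host": host_header})
        response = conn.getresponse()
        return pick_cache_headers(response.getheaders())
    finally:
        conn.close()


# Example call (needs network access to the backend, so it is left commented):
# fetch_backend_headers("support1.webapp.phx1.mozilla.com", 81,
#                       "/page-to-check", "support.mozilla.org")
```

An empty dict from `fetch_backend_headers` would correspond to the "no cache headers at all" result described above.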


It's worth noting that this Zeus config isn't new... it's been like this for a long time. Perhaps something in the app (or in Apache's config) has changed recently?

Assignee: server-ops-webops → nmaul

Ricky Rosario [:rrosario, :r1cky]

Reporter

Comment 2

11 years ago

Nothing related to this has changed in the app recently. Yesterday we started having apache restart issues because mod_wsgi was upgraded by puppet. So, I am guessing there may have been related changes to the Apache config.

Jake Maul [:jakem]

Assignee

Comment 3

11 years ago

I don't see anything in the WebOps-managed apache config... nothing substantial since at least Feb 7, when the PyOpenSSL work was done.

The work yesterday resulted in exactly 1 line being changed, and it was just re-enabling the mod_wsgi module.

Ricky Rosario [:rrosario, :r1cky]

Reporter

Comment 4

11 years ago

There seems to have been some environmental change. If I look at the WSGI request info on stage I see:
...
META:{'DOCUMENT_ROOT': '/data/www/support.allizom.org/kitsune/webroot',
 'GATEWAY_INTERFACE': 'CGI/1.1',
 'HTTPS': 'on',
 'HTTP_ACCEPT': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', ...

In prod that is missing:
...
META:{'DOCUMENT_ROOT': '/data/www/support.mozilla.org/kitsune/webroot',
 'GATEWAY_INTERFACE': 'CGI/1.1',
 'HTTP_ACCEPT': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',...

I am guessing that changed in the past few days. Why is it different on stage vs prod (missing HTTPS=on)?

We were depending on that for setting the cache control headers. I fixed our code to use django's `request.is_secure()` which I *thought* was working for us. AFAICT, it is checking `os.environ.get("HTTPS") == "on"` in our case:
https://github.com/django/django/blob/master/django/http/request.py#L117

Can you confirm that the HTTPS environment variable is being set in prod for the wsgi processes that handle https?
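
One way to answer that question from the app side is a throwaway WSGI endpoint that reports every place the flag might live. This is only a sketch: `https_debug_info` and `application` are hypothetical names and not part of kitsune, and it assumes the three lookups below are the ones worth comparing between stage and prod.

```python
import os


def https_debug_info(environ):
    """Collect the values that could tell the app a request is secure."""
    return {
        "environ HTTPS": environ.get("HTTPS"),
        "environ wsgi.url_scheme": environ.get("wsgi.url_scheme"),
        "os.environ HTTPS": os.environ.get("HTTPS"),
    }


def application(environ, start_response):
    """Tiny WSGI app that reports those values as plain text."""
    info = https_debug_info(environ)
    body = "\n".join(f"{key}: {value!r}" for key, value in info.items())
    payload = body.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(payload))),
        # Keep the diagnostic output itself out of any cache.
        ("Cache-Control", "no-store"),
    ])
    return [payload]
```

Hitting such an endpoint over https on both environments would show directly whether `HTTPS` is missing from the request environ, the process environment, or both.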

Chris Turra [:cturra]

Comment 5

11 years ago

I just confirmed that the Apache environment variable is set. From the Apache config:

  SetEnv HTTPS on


I just reviewed an `svn log` for this Apache config, and it has /not/ been updated recently.

Ricky Rosario [:rrosario, :r1cky]

Reporter

Comment 6

11 years ago

OK, I've fixed the issue in Bug 853904. Instead of checking `request.META['HTTPS'] != 'off'`, we are now checking `os.environ.get("HTTPS") == "on"` to determine whether a request is over HTTPS.

The only explanation I have is that something changed recently to make our `request.META['HTTPS'] != 'off'` check fail. It still works on stage though.

I think we can close this as WFM?

Ricky Rosario [:rrosario, :r1cky]

Reporter

Comment 7

11 years ago

It really looks like the mod_wsgi upgrade did it:
http://code.google.com/p/modwsgi/wiki/ChangesInVersion0304#Features_Changed

"Note that you can still set HTTPS in Apache configuration using the SetEnv or SetEnvIf directive, or via a rewrite rule. In that case, that will override what wsgi.url_scheme is set to and once wsgi.url_scheme is set appropriately, the HTTPS variable will be removed from the set of variables passed through to the WSGI environment. "


Just curious why we aren't on the same version on stage and dev?
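
Going by the changelog note quoted above, a check that reads `wsgi.url_scheme` first should keep working whether or not the `HTTPS` variable is passed through. A minimal sketch, assuming a plain WSGI environ dict; `is_https` is a hypothetical helper, not code from kitsune or Django, and the accepted fallback values are an assumption:

```python
def is_https(environ):
    """Best-effort check for a secure request from a WSGI environ."""
    # wsgi.url_scheme is the key defined by the WSGI spec (PEP 3333).
    scheme = environ.get("wsgi.url_scheme")
    if scheme is not None:
        return scheme == "https"
    # Fallback for setups that still pass an HTTPS variable through.
    # Assumption: "on" and "1" are the values that mean secure.
    return str(environ.get("HTTPS", "")).lower() in ("on", "1")
```

With this ordering, an environ that has lost its `HTTPS` key but carries `wsgi.url_scheme` set to `https` is still treated as secure.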

Jake Maul [:jakem]

Assignee

Comment 8

11 years ago

dev/stage/prod are all the same version of mod_wsgi:

[support1.webapp.phx1.mozilla.com] out: mod_wsgi-3.4-1.el6.rfx.x86_64
[support2.webapp.phx1.mozilla.com] out: mod_wsgi-3.4-1.el6.rfx.x86_64
[support3.webapp.phx1.mozilla.com] out: mod_wsgi-3.4-1.el6.rfx.x86_64
[support4.webapp.phx1.mozilla.com] out: mod_wsgi-3.4-1.el6.rfx.x86_64
[support5.webapp.phx1.mozilla.com] out: mod_wsgi-3.4-1.el6.rfx.x86_64
[support1.stage.webapp.phx1.mozilla.com] out: mod_wsgi-3.4-1.el6.rfx.x86_64
[support1.dev.webapp.phx1.mozilla.com] out: mod_wsgi-3.4-1.el6.rfx.x86_64

This seems almost certain to just be a settings drift/mismatch between dev/stage/prod, where some of them had "SetEnv HTTPS on" and others didn't.
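
A drift like this can be spotted by scanning each environment's Apache config for the directive. The sketch below assumes the config text has already been collected per host into a dict; `has_directive` and `hosts_missing` are hypothetical helpers, and the matching is deliberately simple (whole-line match, commented-out lines ignored).

```python
import re


def has_directive(config_text, *words):
    """True if an uncommented line consists of exactly these words."""
    pattern = r"^[ \t]*" + r"[ \t]+".join(re.escape(w) for w in words) + r"[ \t]*$"
    return re.search(pattern, config_text, re.IGNORECASE | re.MULTILINE) is not None


def hosts_missing(configs, *words):
    """Return the sorted hosts whose config lacks the directive."""
    return sorted(
        host for host, text in configs.items()
        if not has_directive(text, *words)
    )


# Example with made-up config snippets keyed by made-up host labels:
# hosts_missing({"a": "SetEnv HTTPS on\n", "b": "# SetEnv HTTPS on\n"},
#               "SetEnv", "HTTPS", "on")
```

Any host returned by `hosts_missing(configs, "SetEnv", "HTTPS", "on")` would be a candidate for the mismatch described above.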


Good catch on this difference, and good find on the modwsgi Changelog info!


Marking this as R/F instead of WFM, since we did wind up making a change on our side (and you did one in the code too).

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Ricky Rosario [:rrosario, :r1cky]

Reporter

Comment 9

11 years ago

Awesome! Thanks jakem and cturra!

Status: RESOLVED → VERIFIED

Nobody; OK to take it and work on it

Updated

11 years ago

Component: Server Operations: Web Operations → WebOps: Other

Product: mozilla.org → Infrastructure & Operations

BMO Automation

Updated

5 years ago

Product: Infrastructure & Operations → Infrastructure & Operations Graveyard

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

Tracking

(Not tracked)

People

(Reporter: rrosario, Assigned: nmaul)

Security

(public)
