Closed Bug 873669 Opened 11 years ago Closed 10 years ago

Front-end cache URLs for homepage/category/app detail API endpoints

(Marketplace Graveyard :: API, enhancement, P5)

RESOLVED DUPLICATE of bug 983815

(Reporter: cvan, Unassigned)

For the Marketplace API only, can we try using Zeus to cache URLs matching the following pattern:

https://marketplace-dev.allizom.org/api/v1/apps/(app|category|search)/.*

The idea is to serve the same cached content to users hitting the exact same API URL with the exact same querystring parameters (which is very common for anonymous users since we're not passing `?_user=` to the API). Can we start at a TTL of like 10 min?
Assignee: server-ops-amo → oremj
Doesn't this just mean we send cache headers for +10 minutes?
Wouldn't setting a Cache-Control header cache it for that user only? What I'm proposing is that we use Zeus to bypass hitting the API for those URL endpoints if another user has already requested that exact same URL in the past 10 minutes. I hope that's possible with Zeus, yeah?
(In reply to Christopher Van Wiemeersch [:cvan] from comment #2)
> Wouldn't setting a Cache-Control header cache it for that user only? What
> I'm proposing is that we use Zeus to bypass hitting the API for those URL
> endpoints if another user has already requested that exact same URL in the
> past 10 minutes. I hope that's possible with Zeus, yeah?

It should work fine.  You can think of the cache control headers as talking to Zeus if you want.  Since all traffic passes through it, Zeus sees the headers just as the client does.  If it sees a header saying to cache for 10 minutes, it will cache the response for 10 minutes, just as the client will.
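(For anyone following along, the proxy-side behavior described above can be sketched in plain Python. This is a hypothetical illustration, not Zeus's actual implementation; the class and function names are made up. The idea is a shared cache keyed on the full URL plus querystring, honoring a max-age TTL, so every anonymous user hitting the same URL within the window is served the same cached body without touching the origin.)

```python
import time

class SharedResponseCache:
    """Hypothetical sketch of a proxy-side shared cache: roughly what a
    caching proxy does when a response carries Cache-Control: public,
    max-age=N."""

    def __init__(self):
        # full URL (path + querystring) -> (expiry timestamp, body)
        self._store = {}

    def get(self, url):
        entry = self._store.get(url)
        if entry is None:
            return None
        expires_at, body = entry
        if time.monotonic() >= expires_at:  # TTL elapsed: treat as a miss
            del self._store[url]
            return None
        return body

    def put(self, url, body, max_age):
        # max_age mirrors the max-age= directive, in seconds (600 = 10 min)
        self._store[url] = (time.monotonic() + max_age, body)

def fetch(cache, url, origin, max_age=600):
    """Serve from the shared cache if fresh; otherwise hit the origin
    (here, the expensive Python/Django/MySQL/ES path) and cache the
    response so subsequent users skip the origin entirely."""
    body = cache.get(url)
    if body is None:
        body = origin(url)
        cache.put(url, body, max_age)
    return body
```

The key point of the sketch is that the cache is shared across users: the second request for the same URL within the TTL never reaches the origin.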
Moving this to the API since it's something we do on our end.

Just to check: Zeus has no problem with gzipped content, yes?
Assignee: oremj → nobody
Component: Server Operations: AMO Operations → API
Product: mozilla.org → Marketplace
QA Contact: oremj
Target Milestone: --- → 2013-05-23
Version: other → 1.1
It'll cache it the same as any other content.  If we start bringing back front end caching I'd like to see a plan for expiring that cache.  Caching things without expiry plans is how we get into situations like our current redis problem.

I know you said 10 minutes above, and if this was an emergency I'd be all for it, but with a plan in place there'd be no reason to not go above 10 minutes and expand the URLs we can do it on too.
Zeus shouldn't have any trouble with gzipped content. And yes, Zeus (and NetScaler in the future) will honor Expires, Cache-Control, etc.
The API can return different content based on the user's geolocated region; we set an API-Filter header in the response. Is there a way to key the cache off the URL+API-Filter pair?
We pass region as a query string parameter, though if we change it to use the API-Filter header, Zeus should conceivably honor the Vary header as well.
Yeah, if you pass it in the query, you won't need to do anything. If you are exclusively using the API-Filter header, you need to set Vary: API-Filter.
It does already vary on API-Filter:

Vary: API-Filter, X-Requested-With, Accept-Language, Cookie, X-Mobile, User-Agent
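(A rough sketch of how an HTTP cache handles this, in plain Python. The function name here is hypothetical; real caches like Zeus implement this internally. When a response carries a Vary header, the cache key becomes the URL plus the request's value for each header named in Vary, so two requests with different API-Filter values get separate cache entries.)

```python
def vary_cache_key(url, vary_header, request_headers):
    """Build a cache key from the URL plus the request's value for each
    header listed in the response's Vary header. Hypothetical sketch of
    standard HTTP cache behavior, not any particular proxy's code."""
    # normalize request header names for case-insensitive lookup
    headers = {k.lower(): v for k, v in request_headers.items()}
    parts = [url]
    for name in vary_header.split(","):
        name = name.strip().lower()
        # a header absent from the request still contributes to the key
        parts.append("%s=%s" % (name, headers.get(name, "")))
    return "|".join(parts)
```

With the Vary line above, requests that differ in any of API-Filter, Accept-Language, Cookie, etc. would be cached separately, which is what makes per-region responses safe to cache.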
Yeah I'd like to see a reason to use Zeus caching, at least exploring other perf and caching issues before getting into this.
(In reply to Andy McKay [:andym] from comment #11)
> Yeah I'd like to see a reason to use Zeus caching, at least exploring other
> perf and caching issues before getting into this.

The best case is that this avoids hitting Python, Django, MySQL, elasticsearch, speeding up almost every single API request for anonymous users - and for subsequent pageloads for authenticated users. What's the worst case if we're caching by URL+header?

If this gets us closer to closing bug 869715, can we get away with testing this on -dev? Or -altdev?
(In reply to Christopher Van Wiemeersch [:cvan] from comment #12)
> The best case is that this avoids hitting Python, Django, MySQL,
> elasticsearch, speeding up almost every single API request for anonymous
> users - and for subsequent pageloads for authenticated users. What's the
> worst case if we're caching by URL+header?

Worst case would be what we used to have on AMO where we make a change and it doesn't show up for X minutes.  We definitely don't want to go down that road again.
Severity: normal → enhancement
Priority: -- → P5
Target Milestone: 2013-05-23 → ---
Depends on: 880772
Can we revisit this for all GET requests? Even when not logged in, the search API request for the homepage takes 2.56s: http://cl.ly/image/451v3V0D3c3Y

This is crippling our start times, especially when cold-starting the Marketplace on Firefox OS devices.
We're investigating the performance hit caused by collections in bug 926640; the search/featured endpoint is one of the main areas we're interested in. Bug 927420 might help here; I have a patch but I need to test the perf difference.

IMHO it's better to keep trying to improve normal loading times for this endpoint and others instead of adding yet another caching layer.
Done using the CDN in bug 983815.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE