Closed
Bug 1091939
Opened 11 years ago
Closed 10 years ago
Send redirects to pre-generated snippet bundles
Categories
(Snippets :: Service, defect)
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: osmose, Unassigned)
pmac proposed an idea that I think sounds like a really good solution to the performance problem on snippets.
When a user requests a set of snippets, instead of returning the entire set, we have the service return a redirect to a URL with the snippet code. This would let us serve the same snippet code at the same URL to users with differing snippet URLs. These common snippet bundles can then be effectively cached by the CDN.
The reason why this might help is that we've found that the majority of traffic that snippets sees is different snippet URL configurations, rather than many users with the same snippet URL. By redirecting all these to a single URL, we enable the service to cache a single bundle of snippet code across multiple snippet URLs, reducing the amount of time spent transferring data, which is the current bottleneck with regard to service performance.
Risks here include:
- Pre-generating snippet code limits some of the things we can do in snippet templates based on the client. We need to ensure none of our snippet code attempts to rely on the snippet URL or things that change often like the current time or date.
- about:home may not support receiving a redirect response when fetching snippets.
- The snippet bundles being sent to users with different snippet URLs may not be as shareable as we suspect, meaning we'd still be sending many different bundles and not save as much time as we hope.
I'd like to get a few opinions before moving forward with implementation. Any comments or risks that aren't mentioned?
Comment 1•11 years ago
Will each bundle exist at a URL that is a point-in-time and updates to that bundle will exist at a different URL? For example, imagine we create a bundle of snippets and then we have to change/fix the bundle, does the update to the bundle exist at the same URL? If it is the same URL, then do we have to rely on the cache rules for the CDN to expire the old bundle? Can we use a build ID to differentiate builds of bundles?
Reporter
Comment 2•11 years ago
I think the bundle filename should be composed of the snippet IDs included, plus any other client-specific data it relies on (for example, we have some CSS that only goes to certain older about:home page versions, that should be included in the filename). We can hash all these together to get a reasonable filename out of it.
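As a rough sketch of that naming scheme (not the service's actual code): the function name, the `client_data` keys, the SHA-1 choice, and the `.html` extension are all assumptions made for illustration.

```python
import hashlib


def bundle_filename(snippet_ids, client_data):
    """Return a deterministic filename for a bundle.

    snippet_ids: iterable of integer snippet IDs in the bundle.
    client_data: dict of client-specific values the bundle depends on
                 (hypothetical keys, e.g. the about:home page version).
    """
    # Sort so the same set of snippets always yields the same name,
    # regardless of the order the IDs were selected in.
    id_part = ",".join(str(i) for i in sorted(snippet_ids))
    data_part = ",".join(f"{k}={client_data[k]}" for k in sorted(client_data))
    digest = hashlib.sha1(f"{id_part}|{data_part}".encode("utf-8")).hexdigest()
    return f"{digest}.html"
```

Two clients with different snippet URLs but the same snippet IDs and the same client-specific data would then land on the same file.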
I don't think altering the URL based on updates to the bundle is a good idea, as that means we'd have to track changes to the individual snippets in each bundle in order to determine when a bundle needs to be re-generated. Instead, we can have the CDN expire the cached files every 15 minutes or so, and have the service re-generate the file every 15 minutes as well.
It leaves a maximum 30-minute period between a change being saved and it starting to go out, but in return we can shorten the about:home update period so that people are checking for new snippets more often than every 24 hours, which ultimately reduces the lag.
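To spell out where the 30 minutes comes from, here is a small worked example. The constant and function names are made up for illustration; the two 15-minute values are the ones proposed above.

```python
CDN_TTL_MINUTES = 15         # how long the CDN keeps a cached bundle
REGEN_INTERVAL_MINUTES = 15  # how often the service rebuilds the file


def worst_case_delay(cdn_ttl, regen_interval):
    """Upper bound, in minutes, before a saved change reaches clients.

    Worst case: the change lands just after a rebuild (wait one full
    regen interval), and the CDN re-fetched the stale file just before
    the next rebuild (wait one full TTL on top of that).
    """
    return regen_interval + cdn_ttl


print(worst_case_delay(CDN_TTL_MINUTES, REGEN_INTERVAL_MINUTES))  # 30
```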
Comment 3•11 years ago
This is an interesting approach to try. I suspect the bundles may not be as shareable as we think, so here's another similar approach to fix this:
What if we served _all_ snippets to all clients, along with a json with the snippet ids to display based on client configuration?
This way the app server will mostly have to do a couple of db queries and generate a simple json, which will be cached by Zeus. Every 15 mins it will generate a full list of rendered snippets and feed it to Zeus.
Risks:
- This involves two requests from the client, one to download all snippets and one to download the 'snippets-to-show' json
- All snippets may be a large download, although with snippets cdn and external files (Bug 1082208) we will reduce that risk
Variation:
- We serve the 'snippets to show' json first
- We cache on Zeus every snippet on a different URL
- Client hits X different URLs to download the snippets from the 'snippets-to-show' list.
- This involves more requests and it's probably more complicated.
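A minimal sketch of the client-side selection step in the first variant (everything in one download, plus an ID list). The real client would be about:home JavaScript; Python is used here only for illustration, and the function name and the `{"ids": [...]}` JSON shape are assumptions.

```python
import json


def snippets_to_display(all_snippets, ids_json):
    """Pick the rendered snippets named in the 'snippets-to-show' JSON.

    all_snippets: dict mapping snippet ID -> rendered snippet markup,
                  i.e. the big cacheable download.
    ids_json: JSON text such as '{"ids": [3, 7]}' (hypothetical shape).
    """
    wanted = json.loads(ids_json)["ids"]
    # Skip IDs missing from the bundle, e.g. a snippet published after
    # the cached full list was generated.
    return [all_snippets[i] for i in wanted if i in all_snippets]
```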
Comment 4•11 years ago
I like that idea. Maybe we back down a little from _all_ snippets, and just have a few bundles for maybe just the major categories (e.g. channel and language), then serve the IDs for the final set.
Reporter
Comment 5•11 years ago
(In reply to Giorgos Logiotatidis [:giorgos] from comment #3)
> This is an interesting approach to try. I suspect the bundles may not be as
> shareable as we think, so here's another similar approach to fix this:
That's a good point, we should look into taking the top 1000 or so unique URLs and seeing if the delivered bundles are the same or not, to get an idea of whether this will actually result in savings.
Without checking, I remember that most of the variation between clients was in things like the build ID or version, which we don't vary on very often. Hence why I guessed that bundles would actually be the same across most clients.
> What if we served _all_ snippets to all clients, along with a json with the
> snippet ids to display based on client configuration?
>
> This way the app server will mostly have to do a couple of db queries and
> generate a simple json, which will be cached by Zeus. Every 15 mins it will
> generate a full list of rendered snippets and feed it to Zeus.
>
>
> Risks:
> - This involves two requests from the client, one to download all snippets
> and one to download the 'snippets-to-show' json
> - All snippets may be a large download, although with snippets cdn and
> external files (Bug 1082208) we will reduce that risk
>
>
> Variation:
> - We serve the 'snippets to show' json first
> - We cache on Zeus every snippet on a different URL
> - Client hits X different URLs to download the snippets from the
> 'snippets-to-show' list.
> - This involves more requests and it's probably more complicated.
Yeah, the complexity of implementation and possible giant filesize are scary here, especially when we start considering users on dial-up. While we want to open up the ability to send more data with snippets, it's still important to try and minimize the data sent with the initial snippets payload so that users on slow connections can see the text of a snippet ASAP even if the CDN is still loading external resources.
(In reply to Paul McLanahan [:pmac] from comment #4)
> I like that idea. Maybe we back down a little from _all_ snippets, and just
> have a few bundles for maybe just the major categories (e.g. channel and
> language), then serve the IDs for the final set.
I think this would be a very good variation of giorgos' plan, and I'm down for trying it out IF we find that snippet bundles are not shared between clients that much. If they are, I'd rather go with my original proposal, mainly because it's much simpler to implement than adding more client-side snippet-choosing logic.
Comment 6•11 years ago
Crazy idea: What about if we served a snippet bundle (_all_ or some subset) compressed and we decompressed them once inside the product? How much bandwidth would be saved if snippet delivery happened as a compressed blob instead of non-compressed character string?
Reporter
Comment 7•11 years ago
(In reply to Chris More [:cmore] from comment #6)
> Crazy idea: What about if we served a snippet bundle (_all_ or some subset)
> compressed and we decompressed them once inside the product? How much
> bandwidth would be saved if snippet delivery happened as a compressed blob
> instead of non-compressed character string?
It wouldn't save anything over just gzipping the response on its way out through Apache.
That said, I don't see any `Content-Encoding` headers coming from the service, so we should probably enable that regardless. It wouldn't be a huge fix, though.
Comment 8•11 years ago
We could save some per-request CPU by pre-compressing the bundles, which also allows us to get a better compression ratio because it doesn't have to be quite as fast. Apache can do this with some creative rewrite rules a la http://stackoverflow.com/q/9076752. I'm sure this would require a large change to have snippets service pre-generate and save bundles, but I think that's what we're talking about anyway, right?
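For illustration, pre-compressing at bundle generation time could look roughly like this. The function name is hypothetical, and this only covers producing the `.gz` bytes, not the Apache rewrite rules that would serve them.

```python
import gzip


def precompress(bundle: bytes) -> bytes:
    """Gzip a bundle once, at generation time, at the slowest/best level."""
    # Level 9 trades CPU time for ratio; that cost is paid once per
    # bundle instead of once per request.
    return gzip.compress(bundle, compresslevel=9)


bundle = b"<div class='snippet'>Hello</div>\n" * 200
compressed = precompress(bundle)
print(len(bundle), len(compressed))
```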
Reporter
Comment 9•11 years ago
(In reply to Paul McLanahan [:pmac] from comment #8)
> We could save some per-request CPU by pre-compressing the bundles, which
> also allows us to get a better compression ratio because it doesn't have to
> be quite as fast. Apache can do this with some creative rewrite rules a la
> http://stackoverflow.com/q/9076752. I'm sure this would require a large
> change to have snippets service pre-generate and save bundles, but I think
> that's what we're talking about anyway, right?
As long as the browser ends up handling the decompression, that seems reasonable, but it's orthogonal to this particular bug.
Reporter
Comment 10•11 years ago
So I still have about 30 minutes of Apache logs from prod snippets that I ran some analysis on a few weeks ago. I wrote a script to run through the top 1000 snippet URLs from those 30 minutes and fetch their current contents from the service. It hashed the contents, stored the hashes in a set, and output the length of the set, giving us a rough idea of how many unique snippet bundles there are.
I got about 81 unique bundles, and that includes JSON snippet bundles, so the actual count of bundles is lower. Based on this, I'm convinced that bundles don't vary much and that we can get some significant savings from having Zeus or the CDN cache them rather than caching snippet URLs directly.
(I can send the logs and code over if you want to check my work, I can't post the logfiles to the bug for privacy reasons.)
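The counting step is roughly the following. This is a reconstruction from the description above, not the actual script: the fetching is left out, and the function name, the SHA-256 choice, and the placeholder URLs are assumptions.

```python
import hashlib


def count_unique_bundles(responses):
    """Count distinct response bodies among the given URL -> body pairs.

    responses: dict mapping snippet URL -> response body (bytes).
    """
    seen = set()
    for body in responses.values():
        # Identical bodies hash to the same digest, so the set only
        # grows when a genuinely different bundle shows up.
        seen.add(hashlib.sha256(body).hexdigest())
    return len(seen)


# Placeholder URLs and bodies, for illustration only.
sample = {"/url-a/": b"bundle one", "/url-b/": b"bundle one", "/url-c/": b"bundle two"}
print(count_unique_bundles(sample))  # 2
```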
Comment 11•10 years ago
Commit pushed to master at https://github.com/mozilla/snippets-service
https://github.com/mozilla/snippets-service/commit/8c0192f264c4231f3a4d79942c530a5d6cda373c
Fix bug 1091939: Add option to serve pre-generated snippet bundles.
Instead of generating and returning snippet code whenever requested,
return a redirect to a static file with the snippet code. The file is
generated once when requested, and cached for some period of time.
Bundles are grouped by a key generated from the snippets included in the
bundle, as well as the client’s locale and about:home version, both of
which affect the snippet code that is sent. This allows reuse of the
snippet bundle across client configurations that would otherwise receive
the exact same snippet code.
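
The behaviour described in this commit message can be sketched as follows. This is an in-memory illustration, not the code from the commit: the class name, the key format, the `/bundles/` path, and the 900-second default are all assumptions.

```python
import hashlib
import time


class BundleCache:
    """Generate each bundle at most once per expiry window (in-memory sketch)."""

    def __init__(self, generate, max_age_seconds=900, clock=time.time):
        self._generate = generate  # callable(snippet_ids) -> bundle text
        self._max_age = max_age_seconds
        self._clock = clock
        self._files = {}  # key -> (created_at, bundle text)

    @staticmethod
    def key(snippet_ids, locale, startpage_version):
        # Everything that affects the generated code goes into the key.
        ids = ",".join(str(i) for i in sorted(snippet_ids))
        raw = "_".join([ids, locale, str(startpage_version)])
        return hashlib.sha1(raw.encode("utf-8")).hexdigest()

    def redirect_url(self, snippet_ids, locale, startpage_version):
        """Return the URL to redirect to, generating the file if needed."""
        key = self.key(snippet_ids, locale, startpage_version)
        entry = self._files.get(key)
        now = self._clock()
        if entry is None or now - entry[0] >= self._max_age:
            self._files[key] = (now, self._generate(snippet_ids))
        return f"/bundles/{key}.html"
```

Clients whose configurations resolve to the same snippets, locale, and about:home version get the same redirect target, so the generator runs once for all of them until the entry expires.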
Updated•10 years ago
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Comment 12•10 years ago
Commit pushed to master at https://github.com/mozilla/snippets-service
https://github.com/mozilla/snippets-service/commit/b6868f7dd3969e109ff3aaf80d37b0abbf17b6af
Bug 1091939: Add Access-Control-Allow-Origin header to media files.
Reporter
Comment 13•10 years ago
Verified fixed on stage and deployed to master. Now to see if it helps the stats at all!
Status: RESOLVED → VERIFIED