Closed Bug 1269453 Opened 8 years ago Closed 8 years ago

Improve cachability for the first signature dates

Categories

(Socorro :: Webapp, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: peterbe, Assigned: peterbe)

Details

Attachments

(1 file)

The Top Crasher page takes all the signatures that it's going to mention (50 by default) and queries SignatureFirstDate().get(signatures=all_50_signatures). 

If any one of them changes, the whole query has to be done all over again because the cache key is based on a hash of those 50 signatures.
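
A minimal sketch of why the combined key is so fragile. The key scheme here (an MD5 over the sorted, joined signatures) and the names `combined_cache_key`, `monday` and `tuesday` are assumptions for illustration, not necessarily how the webapp builds its key:

```python
import hashlib


def combined_cache_key(signatures):
    # Hypothetical key scheme: one hash over the whole (sorted) list.
    joined = "\n".join(sorted(signatures))
    return "first-dates:" + hashlib.md5(joined.encode("utf-8")).hexdigest()


# Two Top Crasher lists that differ in just one of 50 signatures.
monday = ["sig%d" % i for i in range(50)]
tuesday = monday[:49] + ["brand_new_signature"]

key_monday = combined_cache_key(monday)
key_tuesday = combined_cache_key(tuesday)

# 49 of the 50 signatures are shared, yet the keys do not match,
# so nothing cached for the first list can be reused for the second.
shared = len(set(monday) & set(tuesday))
keys_differ = key_monday != key_tuesday
```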

At the time of writing, the cache hit ratio of SignatureFirstDate is 15%, meaning that 85% of the time we're not getting any caching benefit.

Add to that the fact that the signature first date value is extremely unlikely to change. We should be able to cache these individual values for up to 24 hours. Or more!
A much better pattern would be to instead do what we do with Graphics Devices. The algorithm would basically look like this:

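# `cache` and `SignatureFirstDate` are the webapp's existing cache and model.
def get_signature_first_dates(signatures):
    """Return {signature: first_date}, querying only uncached signatures."""
    result = {}
    missing = []
    for signature in signatures:
        cached = cache.get(signature)
        # Compare against None so a falsy cached value still counts as a hit.
        if cached is not None:
            result[signature] = cached
        else:
            missing.append(signature)
    # Skip the query entirely when every signature was found in the cache.
    if missing:
        for hit in SignatureFirstDate().get(signatures=missing):
            result[hit['signature']] = hit['first_date']
            # Cache each value individually for 24 hours.
            cache.set(hit['signature'], hit['first_date'], 60 * 60 * 24)
    return result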

That way, only those not known in cache would be queried.
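
To make the effect concrete, here is a self-contained sketch of the same pattern that runs without the webapp. `DictCache` and `FakeFirstDateBackend` are hypothetical stand-ins for the real cache and for `SignatureFirstDate`, and the `first-date:` key prefix is an assumption added to keep the keys namespaced:

```python
ONE_DAY = 60 * 60 * 24


class DictCache:
    """Tiny in-memory stand-in for a real cache (timeouts are ignored)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, timeout=None):
        self._data[key] = value


class FakeFirstDateBackend:
    """Hypothetical stand-in for SignatureFirstDate that records its queries."""

    def __init__(self, dates):
        self._dates = dates
        self.queries = []

    def get(self, signatures):
        self.queries.append(list(signatures))
        return [
            {"signature": s, "first_date": self._dates[s]}
            for s in signatures
            if s in self._dates
        ]


def get_signature_first_dates(signatures, cache, backend):
    result = {}
    missing = []
    for signature in signatures:
        cached = cache.get("first-date:" + signature)
        if cached is not None:
            result[signature] = cached
        else:
            missing.append(signature)
    if missing:
        # Only the signatures not found in the cache reach the backend.
        for hit in backend.get(signatures=missing):
            result[hit["signature"]] = hit["first_date"]
            cache.set("first-date:" + hit["signature"], hit["first_date"], ONE_DAY)
    return result


cache = DictCache()
backend = FakeFirstDateBackend(
    {"a": "2016-01-01", "b": "2016-02-01", "c": "2016-03-01"}
)
first = get_signature_first_dates(["a", "b"], cache, backend)
second = get_signature_first_dates(["b", "c"], cache, backend)
```

After the two calls, `backend.queries` shows that the second lookup only asked for `"c"`, because `"b"` was already cached by the first one.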
Heck, I bet we could write a similar pattern for the Bugs API. But there the wins would be less obvious because the individual values can't be as easily cached for a long time.
If you open each of the four versions of Firefox from the home page, each of them does a SignatureFirstDate().get(signatures=...), but that list is always different, so they can't benefit from each other's caching. That means loading 4 different versions effectively fetches the first dates of 4x50 signatures. I compared how many signatures the versions have in common. Unfortunately I messed up the version numbers, but the "group numbers" are basically version 48, 47, 46 or 45:


GROUP 1
	2
	in common: 15
	not      : 70
	3
	in common: 16
	not      : 68
	4
	in common: 7
	not      : 86
GROUP 2
	1
	in common: 15
	not      : 70
	3
	in common: 31
	not      : 38
	4
	in common: 3
	not      : 94
GROUP 3
	1
	in common: 16
	not      : 68
	2
	in common: 31
	not      : 38
	4
	in common: 2
	not      : 96
GROUP 4
	1
	in common: 7
	not      : 86
	2
	in common: 3
	not      : 94
	3
	in common: 2
	not      : 96


Between the four versions, 50 signatures are queried each. Only 74% of them are unique. Meaning, each version's list is quite different, so the same signatures are unlikely to show up across versions.
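
For reference, a comparison like this can be sketched with plain set operations. This is not the script that produced the numbers above; the function names and the tiny sample data are made up, and "not" is read here as the number of signatures that appear in only one of the two groups (which matches 50 + 50 - 2 x 15 = 70 for groups 1 and 2):

```python
from itertools import combinations


def compare_groups(groups):
    """For every pair of groups, count shared and non-shared signatures."""
    report = {}
    for (name_a, sigs_a), (name_b, sigs_b) in combinations(sorted(groups.items()), 2):
        a, b = set(sigs_a), set(sigs_b)
        report[(name_a, name_b)] = {
            "in_common": len(a & b),  # present in both groups
            "not": len(a ^ b),        # present in exactly one of the two
        }
    return report


def unique_ratio(groups):
    """Share of all queried signatures that are distinct across groups."""
    total = sum(len(sigs) for sigs in groups.values())
    unique = len(set().union(*groups.values()))
    return unique / total


# Made-up sample data, three signatures per group instead of 50.
groups = {
    1: ["a", "b", "c"],
    2: ["b", "c", "d"],
    3: ["x", "y", "z"],
}
report = compare_groups(groups)
ratio = unique_ratio(groups)
```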
Assignee: nobody → peterbe
Commit pushed to master at https://github.com/mozilla/socorro

https://github.com/mozilla/socorro/commit/925101eeb5a2db680d6aa1eafdb2f9d5e7abc0e7
fixes bug 1269453 - Improve cachability for the first signature dates (#3319)

r=adngdb
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED