799690 - [socorro-crashstats] audit cache expires for models

Reporter

Description

12 years ago

At the moment ALL models use the default expiry time (which is 1h according to https://github.com/mozilla/socorro-crashstats/blob/master/crashstats/crashstats/models.py#L46)

I'm sure we could squeeze some smart optimization out of being more specific, but I'm more worried that we're doing it wrong, since for some models the expiry should maybe be less than 1 hour.

Can we do a check that none of the models are OVER-caching?

Robert Helmer [:rhelmer]

Comment 1

12 years ago

None of the data comes in more often than once per hour (it is almost entirely driven by cron, and nothing runs more than once per hour).

We could probably go longer, but I'd be concerned about ADU/daily matviews/etc. looking like they're late, since I know people check early.

The one exception I can think of is status. If that one is being cached, its expiry should be much shorter.

Peter Bengtsson [:peterbe]

Reporter

Comment 2

12 years ago

Hence the word "audit" in the bug title. 

We could probably optimize, but 24 fetches per day isn't too bad anyway. In a future version we could probably have a two-step cache expiry where we do something like:

 import logging

 # `cache`, `queue` and `update_caches` are assumed to be defined elsewhere
 def fetch(self, url):
     cache_key_fast = 'fast-%s' % url
     value = cache.get(cache_key_fast)
     if value is None:
         # fast cache expired; fall back to the long-lived copy
         cache_key_slow = 'slow-%s' % url
         value = cache.get(cache_key_slow)
         queue.enqueue(update_caches, url)  # fire-and-forget
         if value is None:
             logging.warning("cache MISS on %s", url)
             value = self._really_fetch(url)
             cache.set(cache_key_fast, value, 60 * 60)  # 1 hour
             cache.set(cache_key_slow, value, 60 * 60 * 24)  # 24 hours
     return value
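
To make the two-step idea easier to try out, here is a minimal self-contained sketch. Everything in it is a stand-in rather than crashstats code: `DictCache` is a hypothetical in-memory cache, `pending` is a plain list playing the role of the job queue, and `really_fetch` is a placeholder for the middleware request. It also differs slightly from the snippet above in that it only schedules a refresh when stale data was served, and fetches synchronously on a cold miss.

```python
import time


class DictCache:
    """Tiny in-memory stand-in for a real cache backend (hypothetical)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() >= expires_at:
            # Entry has expired; drop it and report a miss.
            del self._data[key]
            return None
        return value

    def set(self, key, value, timeout):
        self._data[key] = (value, time.time() + timeout)


cache = DictCache()
pending = []  # stand-in for a job queue; holds URLs awaiting a refresh


def really_fetch(url):
    # Placeholder for the real (slow) middleware request.
    return {'url': url, 'fetched_at': time.time()}


def update_caches(url):
    """Refetch the data and repopulate both cache tiers."""
    value = really_fetch(url)
    cache.set('fast-%s' % url, value, 60 * 60)       # 1 hour
    cache.set('slow-%s' % url, value, 60 * 60 * 24)  # 24 hours
    return value


def fetch(url):
    value = cache.get('fast-%s' % url)
    if value is not None:
        return value
    value = cache.get('slow-%s' % url)
    if value is not None:
        pending.append(url)  # serve stale data now, refresh later
        return value
    return update_caches(url)  # cold miss: fetch synchronously


def run_pending():
    """Drain the stand-in queue, as a background worker would."""
    while pending:
        update_caches(pending.pop())
```

The point of the slow tier is that a user request rarely has to wait on the middleware: once a URL has been fetched at least once, an expired fast entry is answered from the 24h copy while the refresh happens out of band.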

Robert Helmer [:rhelmer]

Comment 3

11 years ago

Let's do this before release.

Blocks: 749359

Peter Bengtsson [:peterbe]

Reporter

Comment 4

11 years ago

I'm assigning this to you Brandon because you're the master of the PHP side. 

Basically, what we want to do is make sure each of our middleware fetches is cached for an appropriate length of time. At the moment it's 1h on all of them. I suspect that in the PHP code it's 10min for some, 24h for others, etc. 

Feel free to remove the assignee if you're the wrong man for the job. We can figure it out together somehow. 

Adrian, Lars, Selena,
I'm CC'ing you because I think everyone can help with this. 
Basically, we should fetch data from the middleware only as frequently as the data changes, or as frequently as it matters. For example, if a piece of data is only updated every 24h by a cron job, the middleware call that exposes it can be cached for 24h, because there's no point in expiring it sooner. 

The middleware we're talking about tuning is from line 261 and down 
https://github.com/mozilla/socorro-crashstats/blob/master/crashstats/crashstats/models.py#L261
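
As a rough illustration of what a per-model expiry could look like, here is a small sketch. The class names (`MiddlewareModel`, `DailyAggregate`, `Status`, `Search`), the `cache_seconds` attribute and the concrete values are all hypothetical and not taken from models.py; `RecordingCache` is a stand-in that ignores expiry and only records the timeout it was given. The `audit` helper simply lists what each model would use, which is roughly the check this bug is asking for.

```python
HOUR = 60 * 60


class RecordingCache:
    """Stand-in cache that remembers the timeout passed for each key."""

    def __init__(self):
        self.values = {}
        self.timeouts = {}

    def get(self, key):
        return self.values.get(key)

    def set(self, key, value, timeout):
        self.values[key] = value
        self.timeouts[key] = timeout


class MiddlewareModel:
    # Default expiry, matching the 1h default discussed in this bug.
    cache_seconds = HOUR

    def fetch(self, url, cache, really_fetch):
        if self.cache_seconds <= 0:
            # Caching disabled for this model: always hit the middleware.
            return really_fetch(url)
        value = cache.get(url)
        if value is None:
            value = really_fetch(url)
            cache.set(url, value, self.cache_seconds)
        return value


class DailyAggregate(MiddlewareModel):
    # Assumed to be backed by data a cron job refreshes once per day.
    cache_seconds = 24 * HOUR


class Status(MiddlewareModel):
    # Assumed to change often, so only cached briefly.
    cache_seconds = 60


class Search(MiddlewareModel):
    # Assumed to be real-time, so not cached at all.
    cache_seconds = 0


def audit(models):
    """Return a mapping of model name to its expiry in seconds."""
    return {model.__name__: model.cache_seconds for model in models}
```

Calling `audit([DailyAggregate, Status, Search])` would give one place to review every expiry side by side, instead of hunting for overrides across the file.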

Assignee: nobody → bsavage

[DEACTIVATED] Adrian Gaudebert

Comment 5

11 years ago

Search should not be cached, or only for a very short time. IIRC reports/list is also based on `reports`, which is real-time, so it should likewise be cached not at all or only briefly.

Laura Thomson :laura

Updated

11 years ago

No longer blocks: 749359

Depends on: 749359

Brandon Savage [:brandon]

Assignee

Comment 6

11 years ago

Is this still needed? Has it been an issue in production or are we satisfied with the overall caching strategy we currently have?

Flags: needinfo?(rhelmer)

Flags: needinfo?(peterbe)

Robert Helmer [:rhelmer]

Comment 7

11 years ago

(In reply to Brandon Savage [:brandon] from comment #6)
> Is this still needed? Has it been an issue in production or are we satisfied
> with the overall caching strategy we currently have?

No issues, but I think it'd be valuable to audit.

Are we collecting data here (cache hits via statsd, etc.)?

Flags: needinfo?(rhelmer)

Brandon Savage [:brandon]

Assignee

Comment 8

11 years ago

I don't believe we're collecting data but that would be valuable. I will see if I can model the statsd collection off the service that :lonnen just committed last week.
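
For reference, counting cache hits and misses per model could look roughly like the sketch below. It is only an illustration: `CountingStats` is a local stand-in with an `incr` method of the kind statsd clients commonly offer, the metric names (`cache.<model>.hit` and `cache.<model>.miss`) are made up, and the cache is a plain dict with no expiry.

```python
from collections import Counter


class CountingStats:
    """In-memory stand-in for a stats client; only supports incr()."""

    def __init__(self):
        self.counts = Counter()

    def incr(self, name, count=1):
        self.counts[name] += count


def cached_fetch(model_name, url, cache, really_fetch, stats):
    """Fetch through a dict cache, recording a hit or a miss per model."""
    value = cache.get(url)
    if value is None:
        stats.incr('cache.%s.miss' % model_name)
        value = really_fetch(url)
        cache[url] = value
    else:
        stats.incr('cache.%s.hit' % model_name)
    return value


def hit_ratio(stats, model_name):
    """Return hits / (hits + misses), or None if nothing was recorded."""
    hits = stats.counts['cache.%s.hit' % model_name]
    misses = stats.counts['cache.%s.miss' % model_name]
    total = hits + misses
    return hits / total if total else None
```

With numbers like these per model, it becomes possible to tell which models are fetched often enough for their expiry time to matter.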

Peter Bengtsson [:peterbe]

Reporter

Comment 9

11 years ago

I honestly think that the right thing to do is to evaluate each model and think about how often its data changes, or doesn't change, over on the middleware. However, that's a big task. 

I'd actually be happy if you WONTFIX this because it's just too broad a task. We also don't really have a performance problem. 

Once we have statsd we might be better able to tell which models are actually used a lot and are thus worth scrutinizing more closely.

Flags: needinfo?(peterbe)

Brandon Savage [:brandon]

Assignee

Comment 10

11 years ago

Resolved WONTFIX per reporter.

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → WONTFIX

Categories

(Socorro :: Webapp, task)

Tracking

(Not tracked)

People

(Reporter: peterbe, Assigned: brandon)

Security

(public)
