Investigate replacing the current caching of bugs

RESOLVED FIXED in 2.0

Status

Socorro
General
RESOLVED FIXED
7 years ago
6 years ago

People

(Reporter: lonnen, Assigned: lonnen)

Tracking

Trunk
x86
Mac OS X
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 6 obsolete attachments)

(Assignee)

Description

7 years ago
User-Agent:       Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0) Gecko/20100101 Firefox/4.0
Build Identifier: 

Socorro currently use chron jobs to check and update from bugzilla which were written prior to the current Bugzilla REST api. We should look into whether or not the REST api is sufficient now. 

In IRC #mozwebtools, Reed suggested we would need some sort of caching, but it could have a lower ttl. Glob pointed out that autotools uses the elasticsearch cluster. 

Reproducible: Always
this problem stems from the unfortunate combination  of a lack of a good Bugzilla api and the way that we embed signatures within descriptions in Bugzilla.
  
Socorro is mandated to show the state of a bug when it displays a crash/bug association in the Socorro UI.  Also mandated to minimize interaction with Bugzilla, Socorro is left with having to maintain it's own copy of a bug's state.  Synchronization of this parallel data becomes a problem as seen in Bug 504704. 

The solution to Bug 504704 is expensive, because it requires Socorro to query Bugzilla for every bug known by Socorro every hour.  That's potentially thousands of bugs and the mandate of minimal Bugzilla interaction goes out the window.  I liken this to having a needle in a haystack.  Someone takes the needle away when we're not looking and we need to detect that.

I suggest that we reexamine this caching of parallel bugzilla data.  We should look at every place in the UI where we display Bugzilla information and ask this question, "can this display of a Bugzilla ID be replaced with some sort of dynamic bugzilla query?"  If the answer is yes, then we could eliminate Socorro's parallel data entirely.  This would resolve Bug 504704 by eliminating code rather than writing more.

This problem is also closely linked with the problem in how we save signatures within Bugzilla.  Currently, we save them by embedding them in the description of the bug.  If these were instead changed to a separate queriable field, then the aforementioned dynamic queries would be simple.  See Bug 634276 for details.
Blocks: 504704
Depends on: 634276
(Assignee)

Comment 2

7 years ago
grep'd the apache logs from one of the five servers to see how often https://crash-stats.mozilla.com/topcrasher/byversion/* and https://crash-stats.mozilla.com/report/list* pages.

We're looking at daily median traffic of 0 hits, and an average of <1 hit. When I ditch days with no hits, report/list* had about 25 average hits and /topcrasher/byversion/* averaged about 13 hits. Maximums are high: where report/list* had 3400 hits and /topcrasher/byversion/* had 240 hits. The report/list* max is a clear outlier, twice as high as the second highest day behind it.
(Assignee)

Comment 3

7 years ago
Reed -- I can provide more details if you need them, but do you think this load is can be handled by the rest api? The topcrasher/byversion calls could be handled in a single API query per page load. The list* pages could be handled in a single query per page load, or via several smaller queries when the user mouses over a list of bugs (fewer overall bugs pulled, more queries, don't have good numbers on how often the mouseover happens.
(Assignee)

Comment 4

7 years ago
Spoke with JustDave on IRC earlier and it looks like Bugzilla will handle calls via the rest api, though we should try and cache the calls for 10 minutes or so. I suggest that we memoize calls to the bugzilla API with a NoSQL, though I'm not sure what machine it would live on.

Comment 5

7 years ago
The cluster already has memcache and HBase - will one of those do you? ;)
(Assignee)

Comment 6

7 years ago
Memcached would be well suited to this
(Assignee)

Comment 7

7 years ago
Created attachment 533640 [details] [diff] [review]
Patch proposal

This patch adds a model for making and caching calls to the REST api. The model is hooked into the Bugzilla library so that now we only retrieve bug IDs and the associated sigs from our database and get the rest of the info from the REST API. If info about the bug isn't returned, nothing is passed along to the view about the bug. API calls are cached and can be configured with their own ttl independent of the global cache (15 minutes is recommended in the example config). Visually there should be no change from the prior version.

The patch was made against the root project folder.
Attachment #533640 - Flags: review?(rhelmer)
Attachment #533640 - Flags: review?(laura)
Attachment #533640 - Flags: review?(dveditz)
(Assignee)

Comment 8

7 years ago
Created attachment 534551 [details] [diff] [review]
Wraps Bugzilla api calls in a local cache wrapper

That last patch was a little sloppy. This one has the model I forgot to add last time, and has removed some unnecessary changes in the middleware.
Attachment #533640 - Attachment is obsolete: true
Attachment #533640 - Flags: review?(rhelmer)
Attachment #533640 - Flags: review?(laura)
Attachment #533640 - Flags: review?(dveditz)
Attachment #534551 - Flags: review?(rhelmer)
Attachment #534551 - Flags: review?(laura)
Attachment #534551 - Flags: review?(dveditz)
Comment on attachment 534551 [details] [diff] [review]
Wraps Bugzilla api calls in a local cache wrapper

Review of attachment 534551 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to comment #7)
> Created attachment 533640 [details] [diff] [review] [review]
> Patch proposal
> 
> This patch adds a model for making and caching calls to the REST api. The
> model is hooked into the Bugzilla library so that now we only retrieve bug
> IDs and the associated sigs from our database and get the rest of the info
> from the REST API. If info about the bug isn't returned, nothing is passed
> along to the view about the bug. API calls are cached and can be configured
> with their own ttl independent of the global cache (15 minutes is
> recommended in the example config). Visually there should be no change from
> the prior version.

This looks good overall; I think we should take it as-is, and investigating either switching the bugzilla cron over to bzapi, or (as you mention in IRC) better yet doing an AJAX call to bzpapi and caching the result to build that signature->bug_id map, instead of depending on the existing cron job to do that. We can do that part for the next release cycle though.

::: webapp-php/application/models/bugzilla.php
@@ +57,5 @@
> +                           "Accept: application/json",
> +                           "Content-Type: application/json"
> +                        ));
> +            curl_setopt($request, CURLOPT_RETURNTRANSFER, 1);
> +            $response = curl_exec($request);

Would be nice to log curl_error() if $response is FALSE.

::: webapp-php/application/config/bzapi.php-dist
@@ +13,5 @@
> +
> +// How long to cache REST calls, in seconds, if different from global cache.
> +// Setting the time out to 0 disables it.
> +$config['timeout'] = 900; // default 15 minutes 
> +*/

No need to comment these out; we want it to work out-of-the-box.
Attachment #534551 - Flags: review?(rhelmer) → review+
(Assignee)

Comment 10

7 years ago
Created attachment 535273 [details] [diff] [review]
Retrieve bug info from Bugzilla. Now with AJAX.

The last patch caused the load time for TCBS and queries to take an extra 5-30 seconds, depending on how responsive the Bugzilla REST API was. This patch retrieves the same information asynchronously when a page is loaded and updates the page information with the results when they arrive.

If the request fails, bug numbers and links to bugs are still provided.
Assignee: nobody → chris.lonnen
Attachment #534551 - Attachment is obsolete: true
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Attachment #534551 - Flags: review?(laura)
Attachment #534551 - Flags: review?(dveditz)
Attachment #535273 - Flags: review?(rhelmer)
Attachment #535273 - Flags: review?(laura)
Comment on attachment 535273 [details] [diff] [review]
Retrieve bug info from Bugzilla. Now with AJAX.

Review of attachment 535273 [details] [diff] [review]:
-----------------------------------------------------------------

I like this approach a lot, just a typo and missing impl for /buginfo (as discussed in irc)

::: webapp-php/js/socorro/bugzilla.js
@@ +9,5 @@
> +    });
> +    
> +    if (query_string) { 
> +      $.ajax({
> +        url: "/buginfo/bug?id=" + query_string + "&include_fields=summary,status,id,resolution", 

Where is /buginfo/ implemented?

@@ +21,5 @@
> +            bug = bugTable[v.innerHTML];
> +            if (bug) {
> +              $(this).attr("title", bug.status + " " + bug.resolution + " " + bug.summary);
> +              if(bug.resolution.length > 0 && 
> +                 !(bug.resolution in ['UNCONFIRMED','NEW','ASIGNED','REOPENED'])) {

s/ASIGNED/ASSIGNED/
Attachment #535273 - Flags: review?(rhelmer)
(Assignee)

Comment 12

7 years ago
Created attachment 535429 [details] [diff] [review]
Now with AJAX and all the files I forgot to svn add

Need to streamline how I develop on Khan. Here's the same patch as before, with two files that didn't get added to the last one.
Attachment #535273 - Attachment is obsolete: true
Attachment #535273 - Flags: review?(laura)
Attachment #535429 - Flags: review?(rhelmer)
Attachment #535429 - Flags: review?(laura)
(Assignee)

Comment 13

7 years ago
Created attachment 535523 [details] [diff] [review]
AJAX for everyone

This patch is against the webapp directory, not the Socorro root dir.
Attachment #535429 - Attachment is obsolete: true
Attachment #535429 - Flags: review?(rhelmer)
Attachment #535429 - Flags: review?(laura)
Attachment #535523 - Flags: review?(rhelmer)
Comment on attachment 535523 [details] [diff] [review]
AJAX for everyone

Cool looks good and WFM on my dev envs :)

Note that larger queries break with the (default) filesystem cache, but memcache seems to be fine. I don't think we should worry about this, the Kohana filesystem cache does not look like it's implemented in a way that would scale terribly well anyway.
Attachment #535523 - Flags: review?(rhelmer) → review+
(Assignee)

Comment 15

7 years ago
Comment on attachment 535523 [details] [diff] [review]
AJAX for everyone

Review of attachment 535523 [details] [diff] [review]:
-----------------------------------------------------------------
Attachment #535523 - Flags: review?(dveditz)
Attachment #535523 - Flags: review?(bsavage)
Comment on attachment 535523 [details] [diff] [review]
AJAX for everyone

Review of attachment 535523 [details] [diff] [review]:
-----------------------------------------------------------------

This patch is close, but not perfect. The first note is optional; I think the second one is mandatory if this is going to work properly. If you have globally defined a function that you're using, please let me know and I'll be happy to give you r+.

::: application/models/bugzilla.php
@@ +28,5 @@
> +        $query_string = $method . "?";
> +        foreach ($arguments as $key => $value) {
> +           $query_string .= $key . "=" . $value . "&";
> +        }
> +        

Take a look at http://php.net/manual/en/function.http-build-query.php and see if that helps with your overall goal. Also, it will properly encode the query string.

@@ +29,5 @@
> +        foreach ($arguments as $key => $value) {
> +           $query_string .= $key . "=" . $value . "&";
> +        }
> +        
> +        return query_api($query_string);

The problem here is that this will never work. It appears as though you're trying to call Bugzilla_Model::query_api(), but you have to use $this->query_api($query_string) in order for that to work properly.
Attachment #535523 - Flags: review?(bsavage) → review-
Please note that I approve the rest of the patch and if time requires that you commit this tonight, I do not expect you to solicit additional review for that minor change.
(Assignee)

Comment 18

7 years ago
Created attachment 535544 [details] [diff] [review]
Removed dead and unuseful code

Very similar to my last patch, with changes in light of Brandon's review. 

Given that there is a php built in for creating query strings I've removed the utility method from the bugzilla_model. I also removed some stale commented out code that was outdated from an earlier patch. The removed code was dead or unused so the earlier reviews by Brandon and Rob should be still valid.
Attachment #535523 - Attachment is obsolete: true
Attachment #535523 - Flags: review?(dveditz)
Attachment #535544 - Flags: review?(laura)
Attachment #535544 - Flags: review?(dveditz)
Comment on attachment 535544 [details] [diff] [review]
Removed dead and unuseful code

You don't need this:
+?>
at the end of a file.

and you don't need and shouldn't use these references:
+        $bz_api_url =& Kohana::config('bzapi.url');
+        $bz_specific_cache_timeout =& Kohana::config('bzapi.timeout');
Ping me for a quick brief on PHP's allocation and gc if you want an explanation.

Don't forget the MPL in new files.

It's r+ with those changes.
(Assignee)

Comment 20

7 years ago
Created attachment 536369 [details] [diff] [review]
Moar MPL

Added the MPL to the missing files, made the corrections Laura suggested, and fixed caching so that it works! 

That last point needs some explaining. The error log showed memcached complaining about tags a whole bunch, and I found that I needed to provide an extra NULL to set the cache. Now it is much faster after the first successful retrieval.
Attachment #535544 - Attachment is obsolete: true
Attachment #535544 - Flags: review?(laura)
Attachment #535544 - Flags: review?(dveditz)
Attachment #536369 - Flags: review?(dveditz)
(Assignee)

Comment 21

7 years ago
Does this need a security review or can this be checked in?
Target Milestone: --- → 2.0
Comment on attachment 536369 [details] [diff] [review]
Moar MPL

I don't know socorro code and can't call this a "review", but I approve the caching approach in this patch as a solution to bug 504704. I note earlier references here and in that bug to "10 minutes" and that the implementation seems to define 15 minutes (mentioned in comment 7). I'm fine with either.
Attachment #536369 - Flags: review?(dveditz) → feedback+
(Assignee)

Comment 23

7 years ago
Patch checked in as r3194 (http://code.google.com/p/socorro/source/detail?r=3194)

Defaults to 15 min of caching, but that number is configurable.
Status: ASSIGNED → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED
Component: Socorro → General
Product: Webtools → Socorro
You need to log in before you can comment on or make changes to this bug.