Closed Bug 834719 Opened 12 years ago Closed 10 years ago

TBPL should use a proper intermittent-failure bugscache that is updated regularly based on last-changed queries

Categories

(Tree Management Graveyard :: TBPL, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: emorley, Unassigned)

References

Details

(Keywords: sheriffing-P2)

Currently, for each intermittent-failure bug lookup by TBPL's annotated summary generator, we:
* Check a "bugscache" table (an unfortunate choice of name, since it's not a true bugscache) for the search string, and return the previous search result (a bugslist) if found.
* If the search term isn't found in the table, query bzapi and, iff >0 bugs are found, store that term and its results bugslist in the table.
* Every time we run the data import cron on the server, purge search terms added to the bugscache table >3 hrs ago.

This leads to several problems:
1) The list of bugs (and their metadata) for a search term can be up to 3 hours out of date. Summary changes and newly filed bugs (for terms that originally had >0 results) don't appear immediately, leading to sheriffs having to star manually and potentially filing dupes.
2) Every 3 hours we search for the same (common) terms again, after having thrown out their rows from the table, even if nothing has changed. Combine this with multiple worker threads racing, plus having tbpl-prod and tbpl-dev both parsing logs, and we search for the same common terms x6 every 3 hours. To put this in perspective, there are up to 5000 bzapi calls from TBPL for intermittent-failure bug lookups per day, when there are only ~1000 open orange bugs, and only a fraction of those change per day.
3) For cases where the search term isn't found in the table, the bzapi call takes 5+ seconds for a simple substring match. With a local bugscache the lookup would not only be much faster, but we would also be much more inclined to add clever fallback mechanisms (e.g. regexes, searching for the top frames from the crash signature field, etc.) that can make multiple queries to the local DB without worrying about either load on bzapi or the time taken.

As such, we should:
1) Have a proper bugscache table that stores something like {bug_number, last_changed, summary, status, resolution, crash_signature, ...} rather than just {search_term, search_results}.
2) Populate that table from the data import cron (runs every 5 minutes), using last-modified searches (as described on http://globau.wordpress.com/2012/10/09/bugzilla-mozilla-org-integration-best-practices/), and stop purging rows every 3 hours.
3) Switch the annotated summary generator to querying summary / crash signature / ... in the cache directly, rather than via multiple bzapi summary searches.

Since TBPL is going to be rewritten this year, we should just implement the above in TBPLv2 (and leave TBPLv1 as it is). Filing this so we can keep track of what needs to change in TBPLv2.
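To make the proposal above more concrete, here is a rough sketch of what such a bugscache could look like, written in Python against SQLite purely for illustration. It is not TBPL or Treeherder code. The function names (`refresh`, `search`, `fake_fetch`), the sample bug numbers and the timestamp format are all assumptions; `fetch_changed_since` stands in for whatever last-changed query against Bugzilla the import cron would actually make.

```python
import sqlite3

# Columns follow the field list suggested in the bug description.
SCHEMA = """
CREATE TABLE IF NOT EXISTS bugscache (
    bug_number      INTEGER PRIMARY KEY,
    last_changed    TEXT NOT NULL,  -- ISO 8601 UTC, so string order == time order
    summary         TEXT NOT NULL,
    status          TEXT NOT NULL,
    resolution      TEXT NOT NULL,
    crash_signature TEXT NOT NULL
)
"""


def refresh(conn, fetch_changed_since):
    """Upsert bugs changed since the newest cached timestamp (run from cron).

    fetch_changed_since is a hypothetical callable: it takes a timestamp
    string (or None on the first run) and returns an iterable of dicts with
    the same keys as the table columns.
    """
    conn.execute(SCHEMA)
    since = conn.execute("SELECT MAX(last_changed) FROM bugscache").fetchone()[0]
    bugs = list(fetch_changed_since(since))
    # INSERT OR REPLACE keeps one row per bug, so re-fetching a bug is harmless.
    conn.executemany(
        "INSERT OR REPLACE INTO bugscache VALUES (:bug_number, :last_changed,"
        " :summary, :status, :resolution, :crash_signature)",
        bugs,
    )
    conn.commit()
    return len(bugs)


def search(conn, term):
    """Substring match on summary or crash signature, entirely local."""
    # Escape LIKE wildcards so test names containing '_' or '%' match literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    pattern = f"%{escaped}%"
    rows = conn.execute(
        "SELECT bug_number FROM bugscache"
        " WHERE summary LIKE ? ESCAPE '\\' OR crash_signature LIKE ? ESCAPE '\\'"
        " ORDER BY bug_number",
        (pattern, pattern),
    )
    return [row[0] for row in rows]


# Made-up data standing in for Bugzilla; the bug numbers are not real bugs.
FAKE_BUGZILLA = [
    {"bug_number": 700001, "last_changed": "2013-01-25T10:00:00Z",
     "summary": "Intermittent test_foo.html | Test timed out",
     "status": "NEW", "resolution": "", "crash_signature": ""},
    {"bug_number": 700002, "last_changed": "2013-01-25T11:00:00Z",
     "summary": "Intermittent crash in bar tests",
     "status": "NEW", "resolution": "", "crash_signature": "[@ nsFoo::Bar]"},
]


def fake_fetch(since):
    # >= rather than > so a change in the same second as the newest cached
    # row is not missed; the upsert in refresh() absorbs the overlap.
    return [b for b in FAKE_BUGZILLA if since is None or b["last_changed"] >= since]
```

With this shape, each cron run only transfers bugs that actually changed, and the summary generator can run as many `search` calls (or fancier fallbacks) as it likes without touching bzapi.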
Product: Webtools → Tree Management
This is fixed in Treeherder. Wontfix here since TBPL is nearing EOL.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
Product: Tree Management → Tree Management Graveyard