Simplify bugzilla query in cron job

RESOLVED FIXED

Status

Socorro
Backend
6 months ago

People

(Reporter: adrian, Assigned: adrian)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Assignee)

Description

6 months ago
In our bugzilla cron job, we query the following URL: 

    'https://bugzilla.mozilla.org/buglist.cgi?query_format=advanced&short_'
    'desc_type=allwordssubstr&short_desc=&long_desc_type=allwordssubstr&lo'
    'ng_desc=&bug_file_loc_type=allwordssubstr&bug_file_loc=&status_whiteb'
    'oard_type=allwordssubstr&status_whiteboard=&keywords_type=allwords&ke'
    'ywords=&deadlinefrom=&deadlineto=&emailassigned_to1=1&emailtype1=subs'
    'tring&email1=&emailassigned_to2=1&emailreporter2=1&emailqa_contact2=1'
    '&emailcc2=1&emailtype2=substring&email2=&bugidtype=include&bug_id=&vo'
    'tes=&chfieldfrom=%s&chfieldto=Now&chfield=[Bug+creation]&chfield=reso'
    'lution&chfield=bug_status&chfield=short_desc&chfield=cf_crash_signatu'
    're&chfieldvalue=&cmdtype=doit&order=Importance&field0-0-0=noop&type0-'
    '0-0=noop&value0-0-0=&columnlist=bug_id,bug_status,resolution,short_de'
    'sc,cf_crash_signature&ctype=csv'

This URL is unreadable and contains a lot of unneeded fields. After running some tests, I was able to reduce it to the following:

    https://bugzilla.mozilla.org/buglist.cgi?query_format=advanced&chfieldfrom=%s&chfieldto=Now&chfield=[Bug+creation]&chfield=resolution&chfield=bug_status&chfield=short_desc&chfield=cf_crash_signature&columnlist=bug_id,bug_status,resolution,short_desc,cf_crash_signature&ctype=csv
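
For illustration, a query like this could be assembled from a parameter list instead of a hand-written string. This is only a minimal sketch, not the code in the cron job: `build_buglist_url` and its argument names are hypothetical, and it assumes `buglist.cgi` treats percent-encoded values (for example `%5BBug+creation%5D`) the same as the literal ones above.

```python
from urllib.parse import urlencode

BUGLIST_URL = "https://bugzilla.mozilla.org/buglist.cgi"


def build_buglist_url(changed_since, watched_fields, columns):
    """Return a buglist.cgi CSV query URL for bugs changed since a date.

    Hypothetical helper: the names are illustrative, not Socorro's.
    """
    params = [
        ("query_format", "advanced"),
        ("chfieldfrom", changed_since),
        ("chfieldto", "Now"),
    ]
    # chfield is repeated once per field whose changes we care about.
    params += [("chfield", field) for field in watched_fields]
    params += [
        ("columnlist", ",".join(columns)),
        ("ctype", "csv"),
    ]
    # urlencode percent-encodes brackets and commas and turns spaces into "+".
    return f"{BUGLIST_URL}?{urlencode(params)}"
```

Since each watched field is one list entry, dropping a field from the query becomes a one-line change.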

And after bug 1368498 lands, we won't even need to listen to changes in bug status, resolution or description, so we can simplify it even further (and reduce the number of operations we run): 

    https://bugzilla.mozilla.org/buglist.cgi?query_format=advanced&chfieldfrom=%s&chfieldto=Now&chfield=[Bug+creation]&chfield=cf_crash_signature&columnlist=bug_id,cf_crash_signature&ctype=csv

Comment 1

CSV? Isn't it possible to use the REST JSON endpoint instead? I remember dkl saying that any advanced search can be converted to a REST JSON query simply by changing the path, or something like that.
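
A rough sketch of what that conversion might look like, assuming the REST search endpoint at `/rest/bug` accepts the same change-history parameters as `buglist.cgi`, that `include_fields` takes the place of `columnlist`, and that the bug number is called `id` there. The function names are made up for the example, and the response shape (`{"bugs": [...]}`) should be checked against the Bugzilla REST documentation before relying on it.

```python
import json
from urllib.parse import urlencode

REST_BUG_URL = "https://bugzilla.mozilla.org/rest/bug"


def build_rest_url(changed_since):
    """Return a REST search URL for bugs created, or whose crash
    signature changed, since `changed_since` (hypothetical helper)."""
    params = [
        ("chfieldfrom", changed_since),
        ("chfieldto", "Now"),
        ("chfield", "[Bug creation]"),
        ("chfield", "cf_crash_signature"),
        # Assumption: include_fields limits the returned fields, much as
        # columnlist does for the CSV output.
        ("include_fields", "id,cf_crash_signature"),
    ]
    return f"{REST_BUG_URL}?{urlencode(params)}"


def signatures_by_bug(body):
    """Map bug id -> raw cf_crash_signature string from a JSON response body."""
    payload = json.loads(body)
    return {
        bug["id"]: bug.get("cf_crash_signature", "")
        for bug in payload["bugs"]
    }
```

Compared with the CSV output, there is no column order to keep in sync with the parsing code: each bug arrives as a dictionary keyed by field name.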

Comment 2

The in_review.py script I use gets back JSON:

    https://github.com/willkg/socorro-zero/blob/master/bin/in_review.py

The code is terrible for reasons. But maybe that's helpful?

Also, there's a bugzilla python library that also gets back json:

    https://github.com/gdestuynder/simple_bugzilla

I wouldn't switch to that library without someone doing some work on it for better viability, though (no tests, no install_requires in setup, no indication of what Python versions it supports, and so on). If that's interesting, I'm game for spending some time on it.
(Assignee)

Comment 3

6 months ago
Created attachment 8877111
Link to GitHub pull request: https://github.com/mozilla-services/socorro/pull/3809

Comment 4

6 months ago
Commit pushed to master at https://github.com/mozilla-services/socorro

https://github.com/mozilla-services/socorro/commit/62286e3fb3dda24b2bfd02e4d005125ab7635422
Fixes bug 1370484 - Refactored bugzilla query to use REST API and JSON. (#3809)

* Fixes bug 1370484 - Refactored bugzilla query to use REST API and JSON.
This query used to be a very long, hand-written URL, using the old bugzilla query system that returned CSV data. With this, it now uses the REST API instead, manipulating JSON data with the requests library. The query is much simpler to read and has only the few arguments it actually needs. Also, the query is not configurable anymore, because that option was never used and it would have made the code less maintainable.

* Added a clear exception when bugzilla fails.

* Use requests.Response.raise_for_status for HTTP errors.
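
The error handling described in this commit message might look roughly like the following. This is a sketch under assumptions, not the code from the pull request: `BugzillaQueryError` and `fetch_bugs` are hypothetical names, `session` is expected to be a `requests.Session` (or anything with a compatible `get`), and the check on the `error` key assumes Bugzilla reports failed queries in the JSON body that way.

```python
class BugzillaQueryError(Exception):
    """Raised when Bugzilla answers but the payload is not usable."""


def fetch_bugs(session, url, params, timeout=30):
    """Return the list of bugs from a Bugzilla REST search.

    `session` is a requests.Session or any object with a compatible get().
    """
    response = session.get(url, params=params, timeout=timeout)
    # Let requests turn 4xx/5xx statuses into requests.HTTPError.
    response.raise_for_status()

    payload = response.json()
    # Assumption: a failed query comes back as {"error": true, "message": ...}.
    if payload.get("error"):
        raise BugzillaQueryError(payload.get("message", "unknown Bugzilla error"))
    if "bugs" not in payload:
        raise BugzillaQueryError("response has no 'bugs' key")
    return payload["bugs"]
```

With the real library this would be called as `fetch_bugs(requests.Session(), url, params)`. Passing the session in also makes the function easy to exercise with a stub instead of a live Bugzilla.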

Updated

6 months ago
Status: NEW → RESOLVED
Last Resolved: 6 months ago
Resolution: --- → FIXED