Closed Bug 1294503 Opened 9 years ago Closed 9 years ago

getting additional data rows / columns for a deeper bugzilla analysis

Status:

RESOLVED INCOMPLETE

People

(Reporter: hulmer, Unassigned)

References

Details

User Story

I am hoping to get a CSV of bugzilla data to perform some basic analysis, organized as one bug per row, each column representing an appropriate field. 

A complete spec of the CSV would be this:

Filtered Date Range: January 1st, 2010 onward
Filtered Products: Core, Desktop, Android, IOS, Toolkit

Original columns of CSV as per Bug #1274024 (starred columns mean there wasn't enough time to pull these before London):
- INT: bug ID
- DATESTRING: Date bug was filed
- STRING: the bug resolution (fixed, wontfix, anything other than those two)
- STRING: product
- STRING: keywords
- STRING: status flag
- STRING: platform
- STRING: product 
- STRING: component of product
- TRUE / FALSE: user story is present
- TRUE / FALSE: did the bug start in the General component of a product?
- STRING: release version when added by a code sheriff
- STRING: priority if added by staff
- TRUE / FALSE: whether there is an unresolved needinfo
- TRUE / FALSE: has an attachment
- INT: # of comments
- **TRUE / FALSE: if any comment is marked as abuse / spam / non-pertinent
- STRING: severity

Some additional columns I am hoping we can pull:
- STRING: regression range (check comments, or the "has regression range" field; the latter is not in heavy use yet)
- STRING: release version when added by a code sheriff
- TRUE / FALSE: is this bug a patch? (older bugs that don't use mozreview will have patches as attachments)
- TRUE / FALSE: did the bug get originally filed in the right product (not just component)?
- TRUE / FALSE: are any comments flagged as abuse / spam / non-pertinent?
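
To make the spec above concrete, here is a minimal sketch of how the easier columns could be turned into one CSV row per bug. It is not the script used for the earlier data cut. The input is assumed to be a list of bug dicts shaped roughly like Bugzilla REST bug objects; the `attachment_count` and `comment_count` keys, the output column names, and the `;` keyword separator are all hypothetical choices made for illustration. The product names are copied verbatim from the spec and may not match the real Bugzilla product names.

```python
import csv
import io
from datetime import date

# Filters copied from the spec; real Bugzilla product names may differ.
PRODUCTS = {"Core", "Desktop", "Android", "IOS", "Toolkit"}
START = date(2010, 1, 1)

# Hypothetical output column names for the easy-to-query fields.
COLUMNS = ["bug_id", "filed", "resolution", "product", "component", "keywords",
           "platform", "priority", "severity", "has_attachment", "n_comments"]


def bucket_resolution(resolution):
    """Collapse a resolution to fixed / wontfix / other, as the spec describes."""
    value = (resolution or "").lower()
    return value if value in ("fixed", "wontfix") else "other"


def to_row(bug):
    """Map one bug dict to a CSV row, or None if it falls outside the filters."""
    filed = date.fromisoformat(bug["creation_time"][:10])
    if bug["product"] not in PRODUCTS or filed < START:
        return None
    return {
        "bug_id": bug["id"],
        "filed": filed.isoformat(),
        "resolution": bucket_resolution(bug.get("resolution")),
        "product": bug["product"],
        "component": bug["component"],
        "keywords": ";".join(bug.get("keywords", [])),
        "platform": bug.get("platform", ""),
        "priority": bug.get("priority", ""),
        "severity": bug.get("severity", ""),
        # attachment_count / comment_count are assumed, precomputed keys.
        "has_attachment": "TRUE" if bug.get("attachment_count", 0) > 0 else "FALSE",
        "n_comments": bug.get("comment_count", 0),
    }


def write_csv(bugs):
    """Return the CSV text for all bugs that pass the filters."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=COLUMNS)
    writer.writeheader()
    for bug in bugs:
        row = to_row(bug)
        if row is not None:
            writer.writerow(row)
    return out.getvalue()
```

The harder columns (user story present, sheriff-added release version, flagged comments) are left out here because they depend on history or comment data that this sketch does not model.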

Hamilton

Reporter

Description

•

9 years ago

+++ This bug was initially created as a clone of Bug #1274024 +++

I am hoping to get a CSV of bugzilla data to perform some basic analysis, organized as one bug per row, each column representing an appropriate field. I requested a subset of all this data before London, and am hoping to expand the analysis I started back then to suggest some answers to the question "What makes a successful bug?"

My recollection is the previous cut of the data got most of the easy-to-query columns just fine. My hope is we can get some of the harder-to-pull ones as well, so I can see whether or not they contribute to successful bug resolutions.

A complete spec of the CSV would be this:

Filtered Date Range: January 1st, 2010 onward
Filtered Products: Core, Desktop, Android, IOS, Toolkit

Original columns of CSV as per Bug #1274024 (starred columns mean there wasn't enough time to pull these before London):
- INT: bug ID
- DATESTRING: Date bug was filed
- STRING: the bug resolution (fixed, wontfix, anything other than those two)
- STRING: product
- STRING: keywords
- STRING: status flag
- STRING: platform
- STRING: product
- STRING: component of product
- TRUE / FALSE: user story is present
- TRUE / FALSE: did the bug start in the General component of a product?
- STRING: release version when added by a code sheriff
- STRING: priority if added by staff
- TRUE / FALSE: whether there is an unresolved needinfo
- TRUE / FALSE: has an attachment
- INT: # of comments
- **TRUE / FALSE: if any comment is marked as abuse / spam / non-pertinent
- STRING: severity

Some additional columns I am hoping we can pull:
- STRING: regression range (check comments, or the has reg regression range field, the latter is not in heavy use yet)
- STRING: release version when added by a code sheriff
- TRUE / FALSE: is this bug a patch? (older bugs that don't use mozreview will have patches as attachments)
- TRUE / FALSE: did the bug get originally filed in the right product (not just component)?
- TRUE / FALSE: are any comments flagged as abuse / spam / non-pertinent?
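
Two of the harder columns ("did the bug start in the General component" and "was it originally filed in the right product") could plausibly be derived from a bug's change history. The sketch below assumes a history list ordered oldest first, where each entry has a `changes` list of `field_name` / `removed` / `added` records, similar to what the Bugzilla REST history endpoint returns. The function names are hypothetical, and "right product" is approximated as "the product the bug ended up in", which is an assumption rather than something the request defines.

```python
def original_value(current, history, field):
    """Return the value `field` had when the bug was filed.

    The first recorded change to `field` holds the original value in
    "removed"; if the field never changed, the current value is the original.
    """
    for change_set in history:
        for change in change_set["changes"]:
            if change["field_name"] == field:
                return change["removed"]
    return current


def started_in_general(bug, history):
    """TRUE / FALSE: was the bug first filed in a General component?"""
    return original_value(bug["component"], history, "component") == "General"


def filed_in_final_product(bug, history):
    """TRUE / FALSE: does the original product match the current one?

    Assumption: the current product is treated as the "right" one.
    """
    return original_value(bug["product"], history, "product") == bug["product"]
```

For example, a bug now in Core :: Layout whose first component change removed "General" would count as having started in General, and a bug whose first product change removed "Firefox" would count as not originally filed in its final product.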

Hamilton

Reporter

Comment 1

•

9 years ago

I should also mention: the first cut of the data set had a variable called has_unresolved_needinfo, but I think the logic for generating it may need to be re-checked. In the previous version of the data set, only one row had a 1 for this value.
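
One way to re-check that variable is sketched below. It is not the logic used in the first cut. It assumes each bug carries a `flags` list of dicts with `name` and `status` keys, as in Bugzilla REST bug objects, and that a pending needinfo shows up as a flag named `needinfo` with status `?`. The `needinfo_rate` helper is hypothetical and only gives a quick sanity check on how many rows come out positive.

```python
def has_unresolved_needinfo(flags):
    """True if any flag is a needinfo request that is still pending ("?")."""
    return any(
        flag.get("name") == "needinfo" and flag.get("status") == "?"
        for flag in flags
    )


def needinfo_rate(bugs):
    """Fraction of bugs with a pending needinfo; 0.0 for an empty input."""
    if not bugs:
        return 0.0
    positives = sum(has_unresolved_needinfo(bug.get("flags", [])) for bug in bugs)
    return positives / len(bugs)
```

If the rate over the full export is still close to zero, that points at the input data (for example, flags not being included in the export) rather than at this predicate.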

Dylan Hardison [:dylan] (he/him)

Comment 2

•

9 years ago

Setting this to P2; work can be done on it next week. Sorry for the delay, we're catching up on some operational deficiencies.

Priority: -- → P2

Hamilton

Reporter

Comment 3

•

9 years ago

No problem / thanks Dylan!

Hamilton

Reporter

Updated

•

9 years ago

User Story: (updated)

Priority: P2 → --

Hamilton

Reporter

Updated

•

9 years ago

Priority: -- → P2

Hamilton

Reporter

Comment 4

•

9 years ago

:dylan any idea when we might get moving on this?

Dylan Hardison [:dylan] (he/him)

Comment 5

•

9 years ago

The unresolved parts are actually going to take more time and time is a little thin right now. I wonder if another avenue to pursue from this is to make use of the research database dumps directly?

Hamilton

Reporter

Comment 6

•

9 years ago

What do you have in mind in this regard? Handing over access to the research database dumps to me?

Emma Humphries ☕️🎸🧞‍♀️✨ (she/they) [:emceeaich] (Pacific Time) use needinfo

Comment 7

•

9 years ago

Asking :mcote to get Hamilton access to the research dumps

Emma Humphries ☕️🎸🧞‍♀️✨ (she/they) [:emceeaich] (Pacific Time) use needinfo

Comment 8

•

9 years ago

Ah, Hamilton should have access per bug 1258849. I'm going to close this ticket and make sure that Hamilton can access the research dump.

Status: NEW → RESOLVED

Closed: 9 years ago

Resolution: --- → INCOMPLETE

Categories

(bugzilla.mozilla.org :: Administration, task, P2)

Security

(public)
