Open Bug 1086201 Opened 5 years ago Updated 5 years ago

Question 2: Impact of response times on future contributions

Categories

(Community Building :: Systems and Data, task)

Tracking

(Not tracked)

People

(Reporter: adam, Unassigned)

Details

(Whiteboard: [ContributorAnalysis])

This ticket is part of a joint MoCo/MoFo contributor analysis project. Find out more here: https://wiki.mozilla.org/Contribute/analysis 

We are using one ticket per question to track this work. Our goal is to answer a number of questions before the co-incident workweek. Some questions will not be practical to answer in this timeframe; when this happens, we will keep the tickets open for ongoing analysis after the workweek.

If you would like to work on this question, please comment on the ticket.

---

QUESTION 2: Impact of response times on future contributions

Can we grab data from Bugzilla (and other sources) to assess whether “response times” (e.g. the time it takes to get a contribution reviewed or a request signed off on) have an impact on the likelihood that someone contributes again in the future?
The biggest data source for this question is likely to be Bugzilla.
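
One way the question could be framed once the data is available is sketched below. This is only an illustration under stated assumptions: the records are made-up values (not real Bugzilla data), "response time" is taken to be the gap between a review being requested and granted, the 7-day threshold is arbitrary, and the function name is hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical records: (contributor, review requested, review granted).
# These are made-up illustrative values, not real Bugzilla data.
REVIEWS = [
    ("alice", datetime(2014, 1, 1), datetime(2014, 1, 2)),
    ("alice", datetime(2014, 2, 1), datetime(2014, 2, 3)),
    ("bob", datetime(2014, 1, 5), datetime(2014, 1, 25)),
    ("carol", datetime(2014, 1, 7), datetime(2014, 1, 8)),
    ("dave", datetime(2014, 1, 9), datetime(2014, 2, 20)),
    ("dave", datetime(2014, 3, 1), datetime(2014, 3, 2)),
]


def return_rate_by_first_response(reviews, threshold=timedelta(days=7)):
    """Group contributors by how quickly their first request was answered,
    then report the share of each group that contributed again."""
    by_person = defaultdict(list)
    for person, requested, granted in reviews:
        by_person[person].append((requested, granted))

    counts = {"fast": [0, 0], "slow": [0, 0]}  # [returned, total]
    for events in by_person.values():
        events.sort()  # earliest request first
        first_requested, first_granted = events[0]
        waited = first_granted - first_requested
        bucket = "fast" if waited <= threshold else "slow"
        counts[bucket][1] += 1
        if len(events) > 1:  # any later request counts as "contributed again"
            counts[bucket][0] += 1

    # None marks a bucket with no contributors, to avoid dividing by zero.
    return {b: (r / t if t else None) for b, (r, t) in counts.items()}


rates = return_rate_by_first_response(REVIEWS)
print(rates)
```

A real analysis would also need to control for things this sketch ignores, such as the component involved and whether the first contribution was accepted.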

The vast majority of Bugzilla data is public, so anyone can analyze it and work on this question without special access or signing an NDA.

The data that is not public is restricted to employee access and is not relevant to this question, so it can be safely ignored.

There has been work in the past to look at this question but there were issues exposing Bugzilla data for general analysis. Any process to export from the raw data carries (small but real) risks of exposing private data so would need to go through a rigorous process (and therefore has time implications).

We should:
1) See if there is an existing public dataset for Bugzilla we can analyse.
2) If not, I have a scraper that can turn the public Bugzilla data into a structured database format we can work with (http://adamlofting.com/1112/getting-bicho-running-as-a-process-on-heroku-with-a-scheduler/ ) << this is slow to run though, so we may need to focus on particular components
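
As a rough illustration of what "focusing on particular components" could look like, the sketch below builds a search URL for the public Bugzilla REST API, restricted to one product and component. Note that this swaps in the REST API for the scraper mentioned above. The function name and the choice of fields are assumptions for illustration; the product and component are simply the ones this ticket is filed under.

```python
from urllib.parse import urlencode

# Public REST search endpoint on bugzilla.mozilla.org.
BASE_URL = "https://bugzilla.mozilla.org/rest/bug"


def component_search_url(product, component, created_after, limit=100):
    """Build a search URL limited to one product/component.

    Hypothetical helper: it only builds the URL and does not send a request.
    """
    params = {
        "product": product,
        "component": component,
        # Matches bugs created at this time or later.
        "creation_time": created_after,
        # Only return the fields needed, to keep responses small.
        "include_fields": "id,creator,creation_time",
        "limit": limit,
    }
    return BASE_URL + "?" + urlencode(params)


url = component_search_url("Community Building", "Systems and Data", "2014-01-01")
print(url)
```

Narrowing by component and creation date keeps each request small, which matters if the full dataset is too slow to collect in one pass.
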

There is an existing Bugzilla public dataset available on an ElasticSearch cluster which we can query using the MoDevMetrics repo.

Clone the repo to your machine, then look at the example HTML files to see how to query the cluster and get results.

https://github.com/klahnakoski/MoDevMetrics/tree/dev

Kyle's documentation on that page explains how to get started; see the examples section.