Closed Bug 796193 Opened 12 years ago Closed 11 years ago

Provide a 'daily grep' of the crash reporting logs for B2G crash reports where a custom token is specified

Categories

(Socorro :: Data request, task, P4)

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: lsblakk, Assigned: rhelmer)

References

Details

(Whiteboard: [triaged 20120914][waiting][metrics])

+++ This bug was initially created as a clone of Bug #791053 +++

We need to be able to track the B2G dogfooding program's phones when they crash. Ideally there will be a token in the crash report submission, something like &dogfooding_prerelease_id=<unique id> so that when we grep the logs for that token we know how often a particular phone out of the 400 devices deployed is crashing and with what signature.

I have a similar bug for update logs, and there we are able to grep the Apache logs for update pings to parse out the lines for our dogfooding phones. It would be great to have this already parsed somewhere for viewing, but short of that, straight-up access to logs that we can run a script against to search for that token will do for starters.
For reference, bug 789466 dealt with adding this ID to the crash report.
This won't be in any URL where it can be grepped, it needs to be extracted from the DB, CSV or HBase.

Also, IMHO, even having that ID in there is a huge privacy violation and we can only get away with it temporarily as long as there's only people using devices who have signed a contract of some kind with Mozilla. If *anyone* who hasn't signed a contract with us is sending reports with that ID, I'm pretty sure we have a problem, as it's being done implicitly without any notification or opt-out.

That said, because this is so privacy-relevant, only very few machines and people with very high trust even have access to this. We can surely pull this out on demand, but I wonder hugely why there's any "need to be able" there at all.
This is another good example of why we are pushing hard to get Firefox Health Report injected into B2G quickly. One of the things that FHR for the desktop is designed to collect is the number of crashes for a given profile over time. That data point will allow us to gain significant new insight into the Socorro data by letting us look at the proportion of crashes per profile instead of just the raw count of crash submissions.
Daniel: Indeed. And without much of the privacy implications I'm talking about above. I'm looking forward to seeing histograms of e.g. how crashy the worst 5% of profiles are (as well as doing deeper looks into what characteristics they might share and how we can help them) and how many profiles go completely without crashes for a long time.
Target Milestone: Unreviewed → Backlogged - BZ
Crash reporting, if not instrumented to be made available in FHR, would historically be available in Socorro. Back burner for metrics until data source is straightened out.

:lsblakk Have you chatted with the Socorro folks?
Lukas, we can get this data for you pretty easily. Mind if I move this bug over to Socorro?
I don't mind at all - thank you!
Group: metrics-private
Component: Data/Backend Reports → Data request
Product: Mozilla Metrics → Socorro
Target Milestone: Backlogged - BZ → ---
So, looking at B2G code this is populating the email field, which makes it easy-peasy.

We can run this vs PG or HBase, and from talking to Lukas, sounds like we need, daily:
by email/token
crash signatures
crash OOIDs

Since the data has user emails, it's sensitive. 

Can we do this on crash-analysis for now, and drop it into Lukas's account?
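For illustration only, the daily report described above (signatures and OOIDs grouped by email/token) might look roughly like the sketch below. It assumes the crash rows have already been pulled out of PG or HBase into dictionaries; the key names `email`, `signature`, `ooid` and `date_processed`, the function name `daily_report`, and the sample rows are all hypothetical and do not come from the Socorro schema.

```python
from collections import defaultdict
from datetime import date, datetime


def daily_report(rows, day):
    """Group one day's crashes by token: token -> [(signature, ooid), ...]."""
    report = defaultdict(list)
    for row in rows:
        token = row.get("email")
        # Skip reports without a token and reports processed on other days.
        if not token or row["date_processed"].date() != day:
            continue
        report[token].append((row["signature"], row["ooid"]))
    return dict(report)


# Made-up sample rows, for illustration only (hypothetical key names).
SAMPLE_ROWS = [
    {"email": "token-a", "signature": "sig_one", "ooid": "ooid-1",
     "date_processed": datetime(2012, 10, 1, 8, 30)},
    {"email": "token-a", "signature": "sig_two", "ooid": "ooid-2",
     "date_processed": datetime(2012, 10, 1, 9, 15)},
    {"email": "token-b", "signature": "sig_one", "ooid": "ooid-3",
     "date_processed": datetime(2012, 10, 1, 23, 59)},
    {"email": "token-a", "signature": "sig_one", "ooid": "ooid-4",
     "date_processed": datetime(2012, 10, 2, 0, 5)},
    {"email": "", "signature": "sig_one", "ooid": "ooid-5",
     "date_processed": datetime(2012, 10, 1, 12, 0)},
]

if __name__ == "__main__":
    for token, crashes in sorted(daily_report(SAMPLE_ROWS, date(2012, 10, 1)).items()):
        print(token, crashes)
```

Since the output still carries the per-device token, it would need the same restricted handling as the email field itself.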
Assignee: nobody → rhelmer
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #2)
> This won't be in any URL where it can be grepped, it needs to be extracted
> from the DB, CSV or HBase.
> 
> Also, IMHO, even having that ID in there is a huge privacy violation and we
> can only get away with it temporarily as long as there's only people using
> devices who have signed a contract of some kind with Mozilla. If *anyone*
> who hasn't signed a contract with us is sending reports with that ID, I'm
> pretty sure we have a problem, as it's being done implicitly without any
> notification or opt-out.
> 
> That said, because this is so privacy-relevant, only very few machines and
> people with very high trust even have access to this. We can surely pull
> this out on demand, but I wonder hugely why there's any "need to be able"
> there at all.

Hey Kairo, just so you know: the people who participate in B2G Test Drivers are aware and have checked a box stating they are agreeing to this information being made available.  The reason we are using a unique ID and not their email address is so that it can only be mapped back to them via our dogfooding app that has very limited access (SUMO and Release Management) and it will be destroyed when the pre-release testing period is over.  We confirmed the acceptability of asking for, and tracking this information with our legal and privacy experts prior to announcing the program.
OK, good. Maybe would have been good to inform the stability group about this, as I had to find out through bugs I happened to follow. I'm still nervous about exposing data tied to a specific such ID publicly or even to an audience that has not been explicitly filtered to be highly trustworthy.
I also wonder how that report data actually helps us right now (and I'm entrenched in bug analysis the whole day).
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #10)
> OK, good. Maybe would have been good to inform the stability group about
> this, as I had to find out through bugs I happened to follow. I'm still
> nervous about exposing data tied to a specific such ID publicly or even to
> an audience that has not been explicitly filtered to be highly trustworthy.
> I also wonder how that report data actually helps us right now (and I'm
> entrenched in bug analysis the whole day).

This ID gives us a two-way communication channel with our (small) test population during this pre-release test period. It allows us to see which of the B2G Test Driver devices crash, on what buildID, and then to be able to follow up (if needed) with the individual person/people who are experiencing these crashes to look for STR.  This information was disseminated in many meetings and also in the sign up form for participation.
That's all good and understandable. It still doesn't explain why we need the reports requested here, as those who can access email addresses on Socorro are already able to see the ID for any crash report right on Socorro (that is, when it's sent - as it seems we only do this for the dogfooding devices, we don't yet get crash reports with this so we can't check or show it right now).
What I mean is I'm not sure why we need more than that and do additional work to do this when we already can get to that data given that it's sent in the email field and we already have tooling for that field.
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #12)
> What I mean is I'm not sure why we need more than that and do additional
> work to do this when we already can get to that data given that it's sent in
> the email field and we already have tooling for that field.


Well I suppose that would be because I was not aware of where this information was being collected (all I had to go on was bug 789466 showing the id was added to the report submission) and so I filed this bug (cloned from the one asking for the same about updates) to track our B2G Test Driver participants' crashes.

Now I know where the info is being stored, I have filed a bug for access, and I will be counting on the stability team to make sure we get this particular information parsed out for analysis during our B2G Test Driver sync-ups throughout the dogfooding period.
(In reply to Lukas Blakk [:lsblakk] from comment #13)
> (In reply to Robert Kaiser (:kairo@mozilla.com) from comment #12)
> > What I mean is I'm not sure why we need more than that and do additional
> > work to do this when we already can get to that data given that it's sent in
> > the email field and we already have tooling for that field.
> 
> 
> Well I suppose that would be because I was not aware of where this
> information was being collected (all I had to go on was bug 789466 showing
> the id was added to the report submission) and so I filed this bug (cloned
> from the one asking for the same about updates) to track our B2G Test Driver
> participants' crashes.
> 
> Now I know where the info is being stored, I have filed a bug for access,
> and I will be counting on the stability team to make sure we get this
> particular information parsed out for analysis during our B2G Test Driver
> sync-ups throughout the dogfooding period.

Did this all work out ok, or is this report still needed? :)
I have the access, but have not had time to go deeper on pulling out per-asset-tag crash reports so still don't have visibility into how many of our testers are hitting crashes and how frequently.  If anyone has cycles to write a python script for this that I can tie into the dogfooding dashboard that would be great.  Being able to order by crash frequency (once we have symbols) and show the asset tags that are hitting a crash (so we can follow up) would be ideal. I'll try to get to it later this week if no one else can.
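As a rough starting point for such a script, the sketch below orders signatures by crash frequency and lists the asset tags hitting each one. It is only an assumption about how this could be done: the input is a list of dictionaries with hypothetical keys `signature` and `asset_tag` (the token carried in the email field), and the name `rank_signatures` is invented for this example.

```python
from collections import Counter, defaultdict


def rank_signatures(rows):
    """Return [(signature, crash_count, sorted_asset_tags), ...], most frequent first."""
    counts = Counter()
    tags = defaultdict(set)
    for row in rows:
        counts[row["signature"]] += 1
        tags[row["signature"]].add(row["asset_tag"])
    # Sort by descending count, then by signature so ties come out in a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(sig, n, sorted(tags[sig])) for sig, n in ranked]


# Made-up sample rows, for illustration only (hypothetical key names).
SAMPLE_CRASHES = [
    {"signature": "sig_two", "asset_tag": "tag-1"},
    {"signature": "sig_one", "asset_tag": "tag-1"},
    {"signature": "sig_one", "asset_tag": "tag-2"},
    {"signature": "sig_one", "asset_tag": "tag-1"},
]

if __name__ == "__main__":
    for signature, count, asset_tags in rank_signatures(SAMPLE_CRASHES):
        print(count, signature, ", ".join(asset_tags))
```

The returned list of tuples could be fed to whatever rendering the dogfooding dashboard uses; how the rows are fetched is left out on purpose.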
Well, we don't even know how many crash reports we currently get at all, as the infrastructure isn't working flawlessly yet.

I probably should set up a report of some kind there, I think the user I'm using on the DB is not able to read the email field, though.
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #16)
> Well, we don't even know how many crash reports we currently get at all, as
> the infrastructure isn't working flawlessly yet.
> 
> I probably should set up a report of some kind there, I think the user I'm
> using on the DB is not able to read the email field, though.

I got that bit flipped in my LDAP to be able to access it when logged into crash-stats - is that the same thing for accessing it on the DB?
(In reply to Lukas Blakk [:lsblakk] from comment #17)
> (In reply to Robert Kaiser (:kairo@mozilla.com) from comment #16)
> > Well, we don't even know how many crash reports we currently get at all, as
> > the infrastructure isn't working flawlessly yet.
> > 
> > I probably should set up a report of some kind there, I think the user I'm
> > using on the DB is not able to read the email field, though.
> 
> I got that bit flipped in my LDAP to be able to access it when logged into
> crash-stats - is that the same thing for accessing it on the DB?

To get direct access to the database you'd need to have an account on a machine that is able to connect directly to the secondary DB master - the crash-analysis server is such a place. This would be a good place to generate and also publish reports.
Lukas, did you get what you need here? Maybe Selena can help if not, sorry for letting this languish so long!
We're done with that custom dogfooding program now and people are reporting crashes through the usual channels so this can be resolved - no worries, I know you all were swamped :)
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → INVALID