Closed
Bug 915317
Opened 11 years ago
Closed 11 years ago
Display total volume of (Nightly) GC crashes over time
Categories
(Socorro :: Webapp, task)
Socorro
Webapp
Tracking
(Not tracked)
RESOLVED
FIXED
76
People
(Reporter: kairo, Assigned: espressive)
References
Details
(Whiteboard: [QA+])
Attachments
(2 files, 1 obsolete file)
+++ This bug was initially created as a clone of Bug #803209 +++
The actually important outcome of that bug would have been some display of the total volume of GC crashes on a certain version (Nightly, most importantly) over (build) time to be able to detect regressions and spikes happening on certain build dates.
Comment 1•11 years ago
|
||
Ideally by buildid instead of by date, as in bug 915246
Reporter | ||
Comment 2•11 years ago
|
||
I do not want Scorro to go into doing anything in smaller units than full days, not even for build time.
Comment 3•11 years ago
|
||
I don't understand what that comment means. Whenever we have reports focused on nightly/aurora regression ranges, we should switch them to be by buildid instead of by date. Our current system of measuring crashes on nightly and aurora by date instead of build introduces delay and uncertainty in measuring regression/fix ranges. So I want to insist that all new reports of this sort group and order by buildid first.
Reporter | ||
Comment 4•11 years ago
|
||
I was talking about build date, with a day granularity, in comment #0, and that's what I am rooting for being implemented. I just don't want to have time invested into figuring out the multiple-builds-per-day issue that comes when you go with the buildid itself.
Assignee | ||
Updated•11 years ago
|
Assignee: nobody → schalk.neethling.bugs
Comment 5•11 years ago
|
||
Schalk: what's the status here?
Comment 6•11 years ago
|
||
This should just need UI work at this point. The MW/cron should already exist.
Assignee | ||
Comment 7•11 years ago
|
||
(In reply to Brandon Savage [:brandon] from comment #6)
> This should just need UI work at this point. The MW/cron should already
> exist.
Is there any UI for this at the moment or is it all new?
Assignee | ||
Comment 8•11 years ago
|
||
Couple of questions:
1) Where does this fit into the current Socorro workflow/pages?
2) If this is a new page, can someone give me a rough idea of what we need to display? Is it a graph displaying total volume of (Nightly) GC crashes over time on a page much similar to crash trends?
3) So from the title it sounds like this will only be for Nightly but, is this for all nightlies or only featured.
4) I assume this will also be for all products including Thunderbird and Firefox OS?
Thanks!
Flags: needinfo?(kairo)
Flags: needinfo?(bsavage)
Comment 9•11 years ago
|
||
This is a new report.
This should be a graph of GC crashes/ADU for each nightly, as in bug 915246 except rather than for a signature it's for all crashes with that specific annotation.
For now, this is for Firefox nightly only. None of the other products has enough volume of nightly users to produce meaningful results. Maybe as a future step it would be good to have trends of GC crashes/ADU for other builds (beta/release), but since you can't graph them the same way it's less interesting. The primary purpose of this report is to monitor whether changes in nightly are causing memory corruption/GC issues that we can't catch with normal by-signature reports because they would be in many different signatures.
Flags: needinfo?(kairo)
Assignee | ||
Comment 10•11 years ago
|
||
UI for this is in https://github.com/mozilla/socorro/pull/1743
Updated•11 years ago
|
Target Milestone: --- → 71
Comment 11•11 years ago
|
||
New PR for this - https://github.com/mozilla/socorro/pull/1755
Updated•11 years ago
|
Target Milestone: 71 → 70
Comment 12•11 years ago
|
||
Commit pushed to master at https://github.com/mozilla/socorro
https://github.com/mozilla/socorro/commit/0a04b23ab6b7ad2f877f3b31a8ce2de2b9014eda
Fix Bug 915317 UI, display total volume of gc crashes over time
Updated•11 years ago
|
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
How do we get to this page?
Comment 14•11 years ago
|
||
(In reply to Bill McCloskey (:billm) from comment #13)
> How do we get to this page?
This is unlinked from the menu and will be on stage in the next 20 minutes or so:
https://crash-stats.allizom.org/gccrashes/products/Firefox
Whiteboard: [QA+]
Comment 15•11 years ago
|
||
(In reply to Robert Helmer [:rhelmer] from comment #14)
> (In reply to Bill McCloskey (:billm) from comment #13)
> > How do we get to this page?
>
> This is unlinked from the menu and will be on stage in the next 20 minutes
> or so:
>
> https://crash-stats.allizom.org/gccrashes/products/Firefox
It's up now, but looks like there's a CSRF problem :/ I'll take a look.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 16•11 years ago
|
||
Commits pushed to master at https://github.com/mozilla/socorro
https://github.com/mozilla/socorro/commit/0a0502cc735c0a9c6119175b1131876eb4e39ad0
bug 915317 - add csrf protection to gccrash page
https://github.com/mozilla/socorro/commit/d3e826f09ebe386d18373eda9e9538cf459e3e98
Merge pull request #1756 from rhelmer/bug915317-gccrashes-fix-csrf
bug 915317 - add csrf protection to gccrash page
Comment 17•11 years ago
|
||
I just pushed a fix for the CSRF problem, I am seeing any data but this may depend on reports that have not run yet etc.
needinfo'ing Schalk for info on the status of this feature.
Flags: needinfo?(schalk.neethling.bugs)
(In reply to Robert Helmer [:rhelmer] from comment #17)
> I just pushed a fix for the CSRF problem, I am seeing any data but this may
> depend on reports that have not run yet etc.
>
> needinfo'ing Schalk for info on the status of this feature.
s/I am seeing any data/I am *not* seeing any data/?
Comment 19•11 years ago
|
||
(In reply to Stephen Donner [:stephend] from comment #18)
> (In reply to Robert Helmer [:rhelmer] from comment #17)
> > I just pushed a fix for the CSRF problem, I am seeing any data but this may
> > depend on reports that have not run yet etc.
> >
> > needinfo'ing Schalk for info on the status of this feature.
>
> s/I am seeing any data/I am *not* seeing any data/?
Yes, thank you :)
Assignee | ||
Comment 20•11 years ago
|
||
:rhelmer and :stephend So, I guess there is a little confusion here, which is good ;) So, the page in questions for this bug 'Display total volume of (Nightly) GC crashes over time' is indeed as Rob indicated but, that is not the bug myself and Rob has been working on.
The one that should be targeted for 70 is https://bugzilla.mozilla.org/show_bug.cgi?id=915373 as the PR suggests. To get to this page, the URL is: https://crash-stats.allizom.org/topcrasher_ranks_bybug/
Flags: needinfo?(schalk.neethling.bugs)
Flags: needinfo?(bsavage)
Assignee | ||
Comment 21•11 years ago
|
||
As far as this bug goes, I am not sure if there is middleware for it yet.
Flags: needinfo?(rhelmer)
Reporter | ||
Comment 22•11 years ago
|
||
(In reply to Schalk Neethling [:espressive] from comment #20)
> The one that should be targeted for 70 is
> https://bugzilla.mozilla.org/show_bug.cgi?id=915373 as the PR suggests. To
> get to this page, the URL is:
> https://crash-stats.allizom.org/topcrasher_ranks_bybug/
That one has nothing to do with the bug here, it's for bug 915373, which is something completely different.
Comment 23•11 years ago
|
||
Although part of this has landed, I mistakenly moved it to milestone 70. It will not be ready until 71, sorry for the noise!
Flags: needinfo?(rhelmer)
Target Milestone: 70 → 71
Assignee | ||
Updated•11 years ago
|
Target Milestone: 71 → 72
Assignee | ||
Updated•11 years ago
|
Target Milestone: 72 → 73
Assignee | ||
Comment 24•11 years ago
|
||
Any new updates on this feature? I am not sure what, if anything, is left to be done from a front end perspective. Thanks!
Target Milestone: 73 → 71
A lot of major changes are being made to the GC right now and we really need this feature. Can anything be done to prioritize this? I went to this link:
https://crash-stats.allizom.org/gccrashes/products/Firefox
but there's no data there.
Also, as Benjamin said, it's really important that this data be broken down by buildid.
Assignee | ||
Comment 26•11 years ago
|
||
Do we have middleware for this? I believe that is all that is missing for this bug. Rob, you know anything about that?
Flags: needinfo?(rhelmer)
Comment 27•11 years ago
|
||
(In reply to Schalk Neethling [:espressive] from comment #26)
> Do we have middleware for this? I believe that is all that is missing for
> this bug. Rob, you know anything about that?
Filed bug 967593 to track this.
Flags: needinfo?(rhelmer)
Comment 28•11 years ago
|
||
Attachment #8370340 -
Flags: review?(peterbe)
Comment 29•11 years ago
|
||
Commits pushed to master at https://github.com/mozilla/socorro
https://github.com/mozilla/socorro/commit/893853ce8a323103efd3c0eb4d4dc55e9cec4fc6
bug 915317 - gccrashes model
https://github.com/mozilla/socorro/commit/97608fbfd1a8d20e3676b169e96e39794047d4ca
Merge pull request #1853 from rhelmer/bug915317-gccrashes-model
bug 915317 - gccrashes model
Updated•11 years ago
|
Target Milestone: 71 → 73
Comment 30•11 years ago
|
||
OK I've created a stored procedure to populate a new "gccrashes" matview, and there's a corresponding middleware service and Django model+view. It's live on stage, you can see the output here:
https://crash-stats.allizom.org/gccrashes/json_data/products/Firefox/versions/29.0a1/start_date/2014-01-27/end_date/2014-02-04
The format is:
{
'hits':
[
[{{ build_id }}, {{ is_gc_count }}]
]
'total': 1
}
You should be able to plot build_id and is_gc_count - I think you can probably drop this directly into flot and get the desired result, although I haven't tried it.
Flags: needinfo?(schalk.neethling.bugs)
Assignee | ||
Comment 31•11 years ago
|
||
Awesome, I am going to work on this and will let you know.
Flags: needinfo?(schalk.neethling.bugs)
Assignee | ||
Updated•11 years ago
|
Target Milestone: 73 → 74
Updated•11 years ago
|
Attachment #8370340 -
Flags: review?(peterbe)
Assignee | ||
Updated•11 years ago
|
Status: REOPENED → ASSIGNED
Comment 32•11 years ago
|
||
hmm so the SP and matview are running find in production now, and I've backfilled a few days to see what it looks like:
https://crash-stats.mozilla.com/gccrashes/json_data/products/FennecAndroid/versions/27.0/start_date/2014-01-27/end_date/2014-02-10
However, there's a problem with the query, I think - note that's returning duplicate build IDs, because it's not exposing the report_date. I *think* what's probably desired here is to GROUP BY build and sum(is_gc_count) - that would give the total for a particular build_id for the date range specified in the URL, but just a single data point would be graphed for each build ID (ignoring report_date in the graph).
Make sense?
Flags: needinfo?(schalk.neethling.bugs)
Assignee | ||
Comment 33•11 years ago
|
||
Hey Rob,
I also saw that for some ranges the last item in the array sent back to the JS is:
[null, 0]
Not sure if it is related?
Flags: needinfo?(schalk.neethling.bugs)
Comment 34•11 years ago
|
||
(In reply to Schalk Neethling [:espressive] from comment #33)
> Hey Rob,
>
> I also saw that for some ranges the last item in the array sent back to the
> JS is:
>
> [null, 0]
>
> Not sure if it is related?
We actually get some null build IDs in reports_clean, for some reason :/ I can filter these out if you like.
Assignee | ||
Comment 35•11 years ago
|
||
(In reply to Robert Helmer [:rhelmer] from comment #34)
> (In reply to Schalk Neethling [:espressive] from comment #33)
> > Hey Rob,
> >
> > I also saw that for some ranges the last item in the array sent back to the
> > JS is:
> >
> > [null, 0]
> >
> > Not sure if it is related?
>
> We actually get some null build IDs in reports_clean, for some reason :/ I
> can filter these out if you like.
That would be great! One check I do not have to do on the client. Thanks!
Assignee | ||
Updated•11 years ago
|
Target Milestone: 74 → 75
Assignee | ||
Comment 36•11 years ago
|
||
Is this what you would need for GCCrashes?
Attachment #8375760 -
Flags: review?(kairo)
Assignee | ||
Comment 37•11 years ago
|
||
Here is the data I get/use to plot the graph in the attachment in comment 36:
http://sneethling.pastebin.mozilla.org/4280209
Reporter | ||
Comment 38•11 years ago
|
||
Comment on attachment 8375760 [details]
Screen Shot 2014-02-13 at 22.31.01.png
r- because we need bug 972563 before this makes sense (and you promised a new request when that's done).
That said, it this is total crashes, the decimals are probably superfluous as we don't have half or smaller fractals of crashes. ;-)
Attachment #8375760 -
Flags: review?(kairo) → review-
Comment 39•11 years ago
|
||
All dependent bugs have landed, but we need migration to run on stage. Once that is done we should have:
* gccrashes per ADU
* only crashes from nightly channel considered
Comment 40•11 years ago
|
||
(In reply to Robert Helmer [:rhelmer] from comment #39)
> All dependent bugs have landed, but we need migration to run on stage. Once
> that is done we should have:
> * gccrashes per ADU
> * only crashes from nightly channel considered
This needs to be manually run (I don't have access, needinfo'ing selena - please upgrade to latest alembic version :) )
Flags: needinfo?(sdeckelmann)
Comment 41•11 years ago
|
||
(In reply to Robert Helmer [:rhelmer] from comment #40)
> (In reply to Robert Helmer [:rhelmer] from comment #39)
> > All dependent bugs have landed, but we need migration to run on stage. Once
> > that is done we should have:
> > * gccrashes per ADU
> > * only crashes from nightly channel considered
>
> This needs to be manually run (I don't have access, needinfo'ing selena -
> please upgrade to latest alembic version :) )
Oh! Actually I checked and I *do* have access.
Comment 42•11 years ago
|
||
OK there's a bit of cleanup work I need to do wrt DB migrations in bug 972612, but I went ahead and tested the migration waiting for review on stage and it looks like it did the right thing (just a matter of using alembic correctly so the column type gets altered).
Schalk, please give it a try now, the following should work on stage:
(In reply to Robert Helmer [:rhelmer] from comment #39)
> * gccrashes per ADU
> * only crashes from nightly channel considered
Flags: needinfo?(sdeckelmann) → needinfo?(schalk.neethling.bugs)
Assignee | ||
Comment 43•11 years ago
|
||
(In reply to Robert Helmer [:rhelmer] from comment #42)
> OK there's a bit of cleanup work I need to do wrt DB migrations in bug
> 972612, but I went ahead and tested the migration waiting for review on
> stage and it looks like it did the right thing (just a matter of using
> alembic correctly so the column type gets altered).
>
> Schalk, please give it a try now, the following should work on stage:
>
> (In reply to Robert Helmer [:rhelmer] from comment #39)
> > * gccrashes per ADU
> > * only crashes from nightly channel considered
Hmmm so now, I am getting the following data with a call such as:
http://127.0.0.1:8000/gccrashes/json_data/products/Firefox/versions/30.0a1?start_date=2013-12-13&end_date=2014-02-10
http://sneethling.pastebin.mozilla.org/4286141
Flags: needinfo?(schalk.neethling.bugs) → needinfo?(rhelmer)
Comment 44•11 years ago
|
||
(In reply to Schalk Neethling [:espressive] from comment #43)
> (In reply to Robert Helmer [:rhelmer] from comment #42)
> > OK there's a bit of cleanup work I need to do wrt DB migrations in bug
> > 972612, but I went ahead and tested the migration waiting for review on
> > stage and it looks like it did the right thing (just a matter of using
> > alembic correctly so the column type gets altered).
> >
> > Schalk, please give it a try now, the following should work on stage:
> >
> > (In reply to Robert Helmer [:rhelmer] from comment #39)
> > > * gccrashes per ADU
> > > * only crashes from nightly channel considered
>
> Hmmm so now, I am getting the following data with a call such as:
>
> http://127.0.0.1:8000/gccrashes/json_data/products/Firefox/versions/30.
> 0a1?start_date=2013-12-13&end_date=2014-02-10
>
> http://sneethling.pastebin.mozilla.org/4286141
Some numbers are very small now, since bug 972612 does is_gc_count / ADI:
breakpad=# select 5 / 110896::real as gc_per_adu;
gc_per_adu
----------------------
4.50872889914875e-05
You can use these numbers directly in JS, in JSON as well:
> var n = 4.50872889914875e-05; n
0.0000450872889914875
> JSON.parse('{"n": 4.50872889914875e-05}');
{ n: 0.0000450872889914875 }
Let me know if I am missing your point, or if this does not make sense!
Flags: needinfo?(rhelmer)
Reporter | ||
Comment 45•11 years ago
|
||
I actually do not know where those numbers come from, but as I said on IRC, we probably can take numbers per million ADI.
That said, how do we get ADI per build ID when we take a table as a source that isn't by build ID, see bug 972612 comment #6?
Comment 46•11 years ago
|
||
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #45)
> I actually do not know where those numbers come from, but as I said on IRC,
> we probably can take numbers per million ADI.
OK this is doable.
> That said, how do we get ADI per build ID when we take a table as a source
> that isn't by build ID, see bug 972612 comment #6?
Working on that over there (short answer is that we are currently using ADI for all builds on the product/verision/channel which we don't want).
Reporter | ||
Comment 47•11 years ago
|
||
(In reply to Robert Helmer [:rhelmer] from comment #46)
> (In reply to Robert Kaiser (:kairo@mozilla.com) from comment #45)
> > I actually do not know where those numbers come from, but as I said on IRC,
> > we probably can take numbers per million ADI.
>
> OK this is doable.
Let's review the unit multiplicator (1M, 1k, or whatever) and the amount of decimals in display once we have corrected the data in bug 972612, though, as the order of the numbers might change.
Assignee | ||
Comment 48•11 years ago
|
||
:rhelmer So there is going to be one more PR from you on this right?
:KaiRo Did we decide on dropping the decimal places (if any) or, are we going to use a precision of 2 decimal places?
Thanks!
Flags: needinfo?(rhelmer)
Flags: needinfo?(kairo)
Comment 49•11 years ago
|
||
(In reply to Schalk Neethling [:espressive] from comment #48)
> :rhelmer So there is going to be one more PR from you on this right?
>
> :KaiRo Did we decide on dropping the decimal places (if any) or, are we
> going to use a precision of 2 decimal places?
>
> Thanks!
Yep making requested changes in blocking bug 972612 (crashes per million ADI, use ADI for specific build ID, a few other related details). Will let you know when it lands and ready on stage!
Flags: needinfo?(rhelmer)
Assignee | ||
Comment 50•11 years ago
|
||
This is as the graph is plotted now with the data after the latest changes by :rhelmer. http://sneethling.pastebin.mozilla.org/4316328
Attachment #8375760 -
Attachment is obsolete: true
Attachment #8377027 -
Flags: review?(kairo)
Assignee | ||
Updated•11 years ago
|
Target Milestone: 75 → 76
Reporter | ||
Comment 51•11 years ago
|
||
Comment on attachment 8377027 [details]
Screen Shot 2014-02-17 at 12.13.46.png
This looks pretty good, I think we just should lose the decimals on the axis, though, they don't give us any additional value.
Attachment #8377027 -
Flags: review?(kairo) → review+
Flags: needinfo?(kairo)
Comment 52•11 years ago
|
||
Commits pushed to master at https://github.com/mozilla/socorro
https://github.com/mozilla/socorro/commit/47f0b7c1e05e675165e1534523cb2bc012eb7130
Fix Bug 915317, display total volume of gccrashes
https://github.com/mozilla/socorro/commit/164049c1d436fab863c19b5a57b4633750732618
Merge pull request #1900 from ossreleasefeed/bug915317-display-total-volume-of-gccrashes
Fix Bug 915317, display total volume of gccrashes
Updated•11 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 53•11 years ago
|
||
Steps to QA
-------------
1) Load up the main page
-- If the selected product is Fireox, there should be an entry for GC Crashes in the Reports drop down.
-- If any other product is selected, the entry should no exist.
2) Select GC Crashes
-- A new page should load for the latest nightly release for Fx
JS Validation to test
----------------------
Using the form on the left:
1) If any of the two date fields is set to a future date, clicking Generate should show an error messages stating that dates cannot be in the future.
2) If the from date is greater than the to date, clicking Generate should show an error indicating that the from date should be before the to date
3) If there is no product selected, clicking Generate should show an error indicating that a product should be selected
4) If there is no version selected, clicking Generate should show an error that a version needs to be selected.
5) If a product other than Firefox is selected, a error messages should indicate that the current report only supports Firefox.
If all is well, a graph should be drawn on the right.
Updated•11 years ago
|
Flags: needinfo?(stephen.donner)
Comment 55•11 years ago
|
||
Commits pushed to master at https://github.com/mozilla/socorro
https://github.com/mozilla/socorro/commit/0460aeddd5196155bf2b2789a332c357f58cb59e
Fix Bug 915317, display total volume of gccrashes
https://github.com/mozilla/socorro/commit/f23990270616ae03bedecdecb57df9255251bed8
Merge pull request #1914 from ossreleasefeed/bug915317-display-total-volume-of-gccrashes
Fix Bug 976674, use minified files for d3
Comment 56•11 years ago
|
||
Is this ready for QA? It looks like there are a few dependencies that are still open.
Flags: needinfo?(mbrandt) → needinfo?(schalk.neethling.bugs)
Assignee | ||
Comment 57•11 years ago
|
||
All dependencies will be fixed by https://github.com/mozilla/socorro/pull/1918 which should be merged latest tomorrow I would think.
Flags: needinfo?(schalk.neethling.bugs)
Updated•11 years ago
|
Flags: needinfo?(stephen.donner)
You need to log in
before you can comment on or make changes to this bug.
Description
•