Closed Bug 628341 Opened 13 years ago Closed 12 years ago

Please set up and enable correlation reports for Camino

Categories

(Socorro :: Backend, task, P5)

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 650904

People

(Reporter: alqahira, Assigned: rhelmer)

I'd been meaning to file this before the holidays, but bug 628328 reminded me again.

We'd like the correlation reports to be set up/enabled for Camino, at least the Modules report. (We don't have XUL add-ons, so if there's a safe and easy way to not have Socorro generate or require the Add-Ons correlation report for Camino, it can be ignored.)

Bug 628328 sounds like maybe the crash-stats UI has already automatically "enabled" the reports (i.e., it's looking for the files) but that whatever generates the text files isn't set up to run the analysis against Camino crashes.
Assignee: nobody → rhelmer
This is generated by one of two scripts that have not made it into Socorro SVN yet (bug 608190):

cron_libraries.sh

I see at the top of this script:

"""
for I in Firefox Thunderbird SeaMonkey; do
"""

For right now, let's add Camino to this list. I've filed bug 629029 to clean this up.
I've added Camino to the list in the cron script.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Thanks, guys!  These should start working tomorrow, right?
(In reply to comment #3)
> Thanks, guys!  These should start working tomorrow, right?

Yep, looks like it runs at 5 AM Pacific:
05 00 * * *

So let's check tomorrow and either reopen or verify this one.
(In reply to comment #4)
> (In reply to comment #3)
> > Thanks, guys!  These should start working tomorrow, right?
> 
> Yep, looks like it runs at 5 AM Pacific:
> 05 00 * * *
> 
> So let's check tomorrow and either reopen or verify this one.

Oops actually this would be 5 minutes after midnight, sorry.
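
For reference, the first two fields of a crontab entry are minute and hour, in that order, which is why `05 00 * * *` fires at 00:05 rather than 05:00. A minimal sketch that decodes those two fields (the helper name `cron_time` is made up for illustration, and it only handles plain numeric fields, not ranges, lists, or steps):

```shell
# Hypothetical helper: print the HH:MM at which a simple crontab entry fires.
cron_time() {
  # crontab field order: minute hour day-of-month month day-of-week
  # awk sees the minute as $1 and the hour as $2, so print hour first.
  printf '%s\n' "$1" | awk '{ printf "%02d:%02d\n", $2, $1 }'
}

cron_time "05 00 * * *"   # prints 00:05
```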
Robert, it looks like there are a couple of problems here:

1) Files were generated for 2.0.2, 2.0.3 (these two I think are the versions that stage uses, but are otherwise old and useless) and 2.0.6, but these files were empty other than a line reading "Mac OS X".

2) No files were generated for 2.1a1, which is our current milestone.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to comment #6)
> Robert, it looks like there are a couple of problems here:
> 
> 1) Files were generated for 2.0.2, 2.0.3 (these two I think are the versions
> that stage uses, but are otherwise old and useless) and 2.0.6, but these files
> were empty other than a line reading "Mac OS X".
> 
> 2) No files were generated for 2.1a1, which is our current milestone.

So I think the problem is that the script does a hard-coded check for the top three versions for each product, sorted by number of crashes in the last week, and uses those. There's a "manual version override" in place for Firefox only right now.

The current script is being worked on for inclusion in public Socorro SVN in bug 629029 (attachment 507180 [details] [diff] [review]). Its behavior is not what I expected before looking at it closely.

The way it's selecting the versions is:

"""
WEEK=`date -d 'last monday' '+%Y%m%d'`
DATE=`date '+%Y%m%d'`
for I in Firefox Thunderbird SeaMonkey Camino
do
  VERSIONS=`psql -t -U $databaseUserName -h $databaseHost $databaseName -c "select version, count(*) as counts from reports_${WEEK}  where completed_datetime < NOW() and completed_datetime > (NOW() - interval '24 hours') and product = '${I}' group by version order by counts desc limit 3" | awk '{print $1}'`
"""

I am reading this as "get the top three versions from the reports table by count from the last week", as opposed to what I was expecting (getting the "featured versions" from the same place the "branch data sources" admin panel does).

Here is what I see in production:

"""
breakpad=> select version, count(*) as counts from reports_20110124 where completed_datetime < NOW() and completed_datetime > (NOW() - interval '24 hours') and product = 'Camino' group by version order by counts desc limit 3;
 version | counts 
---------+--------
 2.0.6   |     24
 2.0.5   |      2
 2.0.2   |      1
(3 rows)
"""

It looks like there is a "manual override" only for Firefox in this file, which forces reports for 4.0b7pre and 3.6.9:

"""
MANUAL_VERSION_OVERRIDE="4.0b7pre 3.6.9"
for I in Firefox
do
  VERSIONS=`psql -t -U $databaseUserName -h $databaseHost $databaseName -c "select version, count(*) as counts from reports_${WEEK}  where completed_datetime < NOW() and completed_datetime > (NOW() - interval '24 hours') and product = '${I}' and version like '%b%' group by version order by counts desc limit 2" | awk '{print $1}'`
  for J in $VERSIONS $MANUAL_VERSION_OVERRIDE
  do
"""

Perhaps this should be extended to support any project (will be easier when this is Python not bash of course, but doable in bash).

Am I missing anything here?
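
One way the override could be extended to any product while staying in shell is sketched below. This is only an illustration, not the actual cron_libraries.sh: the function names are hypothetical, the Firefox entry is the override quoted above, and the Camino entry is just an example based on the milestone mentioned in comment 6.

```shell
# Hypothetical per-product override table (not part of the real script).
manual_versions_for() {
  case "$1" in
    Firefox) echo "4.0b7pre 3.6.9" ;;   # override quoted from the current script
    Camino)  echo "2.1a1" ;;            # example entry only
    *)       echo "" ;;                 # no override for other products
  esac
}

# Merge the versions returned by the query (passed as arguments after the
# product name) with that product's overrides, dropping duplicates while
# keeping the original order. Prints one space-separated line.
versions_to_report() {
  product=$1
  shift
  for v in "$@" $(manual_versions_for "$product"); do
    echo "$v"
  done | awk '!seen[$0]++' | paste -sd ' ' -
}

versions_to_report Camino 2.0.6 2.0.5 2.0.2   # prints 2.0.6 2.0.5 2.0.2 2.1a1
```

The outer loop would then iterate over `$(versions_to_report "$I" $VERSIONS)` for every product instead of special-casing Firefox.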
I guess those selected versions make sense based on what the 1-week crash count would have been at 12:05 AM this morning (2.1a1 was running just slightly behind 2.0.2 and 2.0.3), but as you note, it's not exactly a desirable set.

I'm still not sure why the files that were generated are empty, though; is it because the analysis is then only looking at the last 24 hours of crashes (which is going to be too short a timeframe for Camino, given our crash volume)?  (In our case, using a rolling set of 7 or 14 days would be better.)
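
A rough sketch of what a configurable window could look like for the version-selection query quoted in comment 7. The function name, its parameters, and the defaults are assumptions for illustration, not existing Socorro settings. It also assumes the query can run against a parent `reports` table, since a rolling 7- or 14-day window can span more than one weekly `reports_YYYYMMDD` table.

```shell
# Hypothetical query builder: product, window, limit, and table are parameters.
# No SQL escaping is done, so it is only suitable for a fixed list of products.
build_versions_query() {
  product=$1
  window=${2:-24 hours}   # default matches the interval in the current script
  limit=${3:-3}           # default matches the current "limit 3"
  table=${4:-reports}     # assumed parent table; the script uses reports_${WEEK}
  printf "select version, count(*) as counts from %s where completed_datetime < NOW() and completed_datetime > (NOW() - interval '%s') and product = '%s' group by version order by counts desc limit %s\n" \
    "$table" "$window" "$product" "$limit"
}

# Example: top three Camino versions over a rolling week.
build_versions_query Camino "7 days"
```

The output would be passed to `psql -t ... -c` exactly as the existing script does with its inline query.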
(In reply to comment #7)
> So I think the problem is that the script does a hard-coded check for the top
> three versions for each product by sorted by number of crashes in the last
> week, and uses those. There's a "manual version override" in place for Firefox
> only right now.

The more I think about it, the more it seems to me that this is just like any other "major report" and should be gated on the versions that are currently "active" (for lack of a better word; have not yet reached their "End Date" per that column) in the Branch Data Sources admin panel.

I understand that may end up being too many "reports" to handle[1], but at the same time it should also eliminate the need to have (and constantly modify before releases) a manual override in place for release candidate/beta-channel Firefox versions (that may otherwise not have enough users to rank in the top 3 crashiest Firefox versions) before release.  Plus, it would also mean other products could benefit from the same reports-during-RC/beta-channel-phase without requiring more manual overrides.

(That still wouldn't solve the problem in the second half of comment 8, though, but it would rationalize which empty reports were being produced :P )

[1] I count currently:
 5 Camino
 4 Fennec
18 Firefox (including 3.6.4pre, running until 2015-06-30)
 6 SeaMonkey
 8 Thunderbird
for a total of 41 active versions.  (The Firefox number may be abnormally high right now given the small window between betas, so with the default 90-day window, there are lots of betas and beta-pres still running long after they've been superseded.  That may be desirable for those other reports, or it may not; that's just how it is right now.)
(In reply to comment #9)
> (In reply to comment #7)
> > So I think the problem is that the script does a hard-coded check for the top
> > three versions for each product by sorted by number of crashes in the last
> > week, and uses those. There's a "manual version override" in place for Firefox
> > only right now.
> 
> The more I think about it, the more it seems to me that is just like any other
> "major report" and should be gated on the versions that are currently "active"
> (for lack of a better word; have not yet reached their "End Date" per that
> column) in the Branch Data Sources admin panel.


Yes, I agree. This should be pretty easy to do, I think. I'd like to rewrite this script in Python so we can take advantage of existing code (rather than doing this in raw bash and SQL).

> I understand that may end up being too many "reports" to handle[1], but at the
> same time it should also eliminate the need to have (and constantly modify
> before releases) a manual override in place for release candidate/beta-channel
> Firefox versions (that may otherwise not have enough users to rank in the top 3
> crashiest Firefox versions) before release.  Plus, it would also mean other
> products could benefit from the same reports-during-RC/beta-channel-phase
> without requiring more manual overrides.


TBH this report seems pretty lightweight; I think we could enforce some kind of limit if it's necessary.

 
> (That still wouldn't solve the problem in the second half of comment 8, though,
> but it would rationalize which empty reports were being produced :P )


I have not looked into the empty reports issue yet, but I imagine that lack of data for the window in question is probably the cause. This could probably be made configurable.

 
> [1] I count currently:
>  5 Camino
>  4 Fennec
> 18 Firefox (including 3.6.4pre, running until 2015-06-30)
>  6 SeaMonkey
>  8 Thunderbird
> for a total of 41 active versions.  (The Firefox number may be abnormally high
> right now given the small window between betas, so with the default 90-day
> window, there are lots of betas and beta-pres still running long after they've
> been superseded.  That may be desirable for those other reports, or it may not;
> that's just how it is right now.)

18 does seem a little excessive. I suppose we could just have a limit of "5 most recent" or something, and make it a Socorro config option.

I think we should open a separate bug to rewrite/refactor the existing correlation reports to be able to do what you're describing. The current behavior is not ideal for Firefox anyway; I'd much rather give Socorro admins control over this kind of thing via the branch data sources page.
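
To make the idea concrete, here is a minimal sketch of "active versions, capped at the N most recent". It assumes a hypothetical input format of one `product version start_date end_date` row per line with ISO dates (for example, exported from whatever backs the branch data sources panel). "Most recent" is taken to mean latest start date, which is an assumption, and the function name is made up.

```shell
# Hypothetical filter: reads "product version start_date end_date" rows on
# stdin and prints up to $limit versions of $product whose end date has not
# passed, newest start date first. ISO dates (YYYY-MM-DD) compare correctly
# as plain strings, so no date arithmetic is needed.
active_versions() {
  product=$1
  today=$2          # e.g. "$(date '+%Y-%m-%d')"
  limit=${3:-5}     # the "5 most recent" cap suggested above
  awk -v p="$product" -v t="$today" '$1 == p && $4 >= t { print $3, $2 }' |
    sort -r | head -n "$limit" | awk '{ print $2 }'
}
```

Usage would be along the lines of `active_versions Camino "$(date '+%Y-%m-%d')" 5 < versions.txt`, where `versions.txt` is a placeholder for the exported rows.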
Right now, basically nobody cares about end dates when inserting versions, I guess. If there were some known use for them, the admins could care about them (and even more so if there were some good UI to, e.g., adjust the end date of the previous version when you add a new one).
(In reply to comment #10)
> > (That still wouldn't solve the problem in the second half of comment 8, though,
> > but it would rationalize which empty reports were being produced :P )
> 
> I have not looked into the empty reports issue yet but I imagine that lack of
> data for the window in question is probably the case, this could probably be
> made configurable.

Bug 536649 (which just came over to Socorro) could _possibly_ solve this problem (though configurable would be good).

> I think we should open a separate bug, to rewrite/refactor the existing
> correlation reports to be able to do what you're describing. The current
> behavior is not ideal for Firefox anyway, I'd really much rather give Socorro
> admins control over this kind of thing via the branch data sources page anyway.

Agreed. I filed bug 636233 with some of the highlights of the discussion here.
Depends on: 636233, 536649
Priority: -- → P5
Depends on: 650904
Component: Socorro → General
Product: Webtools → Socorro
Component: General → Backend
QA Contact: socorro → backend
Sorry for leaving this bug outstanding so long - planning to just fix the way correlation reports are generated instead of piling anything further onto them.
Status: REOPENED → RESOLVED
Closed: 13 years ago → 12 years ago
Resolution: --- → DUPLICATE