Closed Bug 678741 Opened 13 years ago Closed 13 years ago

/report/list needs to query the reports table the same way newtcbs is built

Categories

(Socorro :: General, task)

task
Not set
blocker

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: rhelmer, Assigned: rhelmer)

Details

Attachments

(2 files, 4 obsolete files)

I was just noticing that the topcrasher (both service and report) return some signatures that can't be loaded from the reports table:

https://crash-stats.allizom.org/topcrasher/byversion/Firefox/5.0/7

For example this one:
https://crash-stats.allizom.org/report/list?range_value=7&range_unit=days&date=2011-08-12&signature=libfontconfig.so.1.4.4%400x1a942&version=Firefox%3A5.0

The problem is that the latter comes from the reports table, which now looks up the build_id and restricts results to only those that match the build_id found in product_version_builds.

Here is a query showing the above (this is based on what is used in the PHP app):

SELECT * FROM reports WHERE signature = 'libfontconfig.so.1.4.4@0x1a942' AND build = '20110615151330' AND reports.date_processed BETWEEN CAST('2011-08-12' AS TIMESTAMP WITHOUT TIME ZONE) - CAST('7 days' AS INTERVAL) AND CAST('2011-08-12' AS TIMESTAMP WITHOUT TIME ZONE);

Note that we get a lot of reports for build ids that don't match what we've recorded in the releases_raw table, which is where I assume these are coming from:

breakpad=> select count(*), build from reports where product = 'Firefox' and version = '5.0' and release_channel = 'release' and date_processed > '2011-08-05' group by build order by count(*);
 count  |     build      
--------+----------------
      1 | 20110617093944
      2 | 
      2 | 20110617101910
      2 | 20110626221638
      2 | 20110704120007
      3 | 20110627063657
      6 | 20110617094044
     12 | 20110628233112
     24 | 20110628230643
     63 | 20110622130550
     67 | 20110622131247
    158 | 20110622063753
    223 | 20110622063727
    695 | 20110622232440
   3395 | 20110622232052
 795845 | 20110615151330
(16 rows)
Severity: normal → blocker
Target Milestone: --- → 2.2
We shouldn't be restricting report lookup by buildid.  Since we add a lot of non-standard builds to the stats, buildids won't match in a lot of cases.

Also, certain product/version combinations have multiple buildids legitimately.

Instead we should be filtering by product, version and release channel.  Note that when filtering by release channel, do not filter on 'release': filter on " lower(release_channel) NOT IN ( 'nightly', 'aurora', 'beta' )".  We assume that anything not part of a defined channel is a release.
(In reply to Josh Berkus from comment #1)
> We shouldn't be restricting report lookup by buildid.  Since we add a lot of
> non-standard builds to the stats, buildids won't match in a lot of cases.
> 
> Also, certain product/version combinations have multiple buildids
> legitimately.
> 
> Instead we should be filtering by product, version and release channel. 
> Note that when filtering by release channel, do not filter on 'release':
> filter on " lower(release_channel) NOT IN ( 'nightly', 'aurora', 'beta' )". 
> We assume that anything not part of a defined channel is a release.

OK I can change the filter the UI is doing - I added this because of kairo's comment on bug 678205 comment 4.
Assignee: josh → rhelmer
Summary: tcbs top crashers need to be restricted by build_id → /report/list needs to not restrict by buildid
(In reply to Josh Berkus from comment #1)
> We shouldn't be restricting report lookup by buildid.  Since we add a lot of
> non-standard builds to the stats, buildids won't match in a lot of cases.
> 
> Also, certain product/version combinations have multiple buildids
> legitimately.


That's why the UI query was doing "builds IN (SELECT ...)", instead of matching on a single build id.


> Instead we should be filtering by product, version and release channel. 
> Note that when filtering by release channel, do not filter on 'release':
> filter on " lower(release_channel) NOT IN ( 'nightly', 'aurora', 'beta' )". 
> We assume that anything not part of a defined channel is a release.

The way the query is built right now is that the UI does:

SELECT pi.version_string, which_table, major_version FROM product_info pi" .
                           " JOIN product_versions pv ON (pv.product_version_id = pi.product_version_id)" .
                           " WHERE pi.product_name = " . $this->db->escape($product) .
                           " AND pi.version_string = " . $this->db->escape($version) ."";


If "which_table" is new, and pi.major_version matches the version passed in the URL (e.g. "6.0" or "6.0(beta) or "6.0b3"), then we restrict channel to 'release'. If not and the version string contains 'b', we restrict it to 'beta'. Otherwise it's not restricted right now (mostly because I know that all other channels are going to have 'old' in which_table).

Assuming the above all sounds reasonable, I think all we need to do is back out the buildid change from bug 678205. I've done this on my dev instance and the results look reasonable.
Status: NEW → ASSIGNED
Attachment #552896 - Flags: review?(bsavage)
Attachment #552896 - Flags: review?(bsavage) → review+
Landed on trunk:
Committed revision 3419.
Landed on branch:
Committed revision 3420.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
OK this actually doesn't make sense for betas. What we are doing here is taking a signature from TCBS, and then then returning all results in the reports table that match that signature for that beta.

How can we identify the correct beta in the reports table if we don't use the build id?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Summary: /report/list needs to not restrict by buildid → /report/list needs to query the reports table the same way newtcbs is built
Josh and I have been going over this, I think this is as close as we can get it to the way TCBS does it now for 2.2

However it looks like TCBS might be undercounting betas, since we are seeing a lot more entries in the reports table for a specific time range than expected (a small amount due to tcbs lag and timezone offsets would be expected).

Anyway we think the code this patch produces is correct.
Attachment #552896 - Attachment is obsolete: true
Attachment #552912 - Flags: review?(chris.lonnen)
(In reply to Robert Helmer [:rhelmer] from comment #7)
> Anyway we think the code this patch produces is correct.

Meant to say "... the SQL this patch produces is correct"
Example of discrepancy where we think there shouldn't be:

breakpad=> select sum(report_count) from tcbs join signatures on (signatures.signature_id = tcbs.signature_id) join product_versions on (product_versions.product_version_id = tcbs.product_version_id) where signature = 'hang | mozilla::plugins::PPluginInstanceParent::CallNPP_HandleEvent(mozilla::plugins::NPRemoteEvent const&, short*)' and product_name = 'Firefox' and version_string = '6.0b4' and report_date BETWEEN CAST('2011-08-12' AS TIMESTAMP WITHOUT TIME ZONE) - CAST('7 days' AS INTERVAL) AND CAST('2011-08-12' AS TIMESTAMP WITHOUT TIME ZONE); sum  ------ 1540
(1 row)

breakpad=> SELECT COUNT(uuid) as total            FROM   reports JOIN  product_versions ON reports.version = product_versions.release_version AND reports.product = product_versions.product_name JOIN product_version_builds ON product_versions.product_version_id = product_version_builds.product_version_id AND build_numeric(build) = product_version_builds.build_id WHERE  reports.signature = 'hang | mozilla::plugins::PPluginInstanceParent::CallNPP_HandleEvent(mozilla::plugins::NPRemoteEvent const&, short*)' AND reports.product = 'Firefox' AND product_versions.version_string = '6.0b4' AND reports.version = product_versions.release_version AND reports.release_channel ILIKE 'beta' AND product_versions.build_type = 'beta' AND ((reports.product = 'Firefox' AND reports.version = '6.0')) AND reports.date_processed BETWEEN CAST('2011-08-12' AS TIMESTAMP WITHOUT TIME ZONE) - CAST('7 days' AS INTERVAL) AND CAST('2011-08-12' AS TIMESTAMP WITHOUT TIME ZONE);
 total 
-------
  5740
(1 row)
(In reply to Robert Helmer [:rhelmer] from comment #9)

Output got a little mangled, here are the numbers in case that wasn't clear:

tcbs:

> sum  
> ------ 
> 1540

reports:

>  total 
> -------
>   5740
I see ILIKE in the sql. Is this for the case insensitivity? It seems like we should be able to use an `=` instead.
Comment on attachment 552912 [details] [diff] [review]
filter on buildid, more like how newtcbs does it

You define $reports_version in the if block above, and when $result is empty there is an error.

error: [5xx Error] File: application/models/common.php; Line: 339; Message: Undefined variable: reports_version
Attachment #552912 - Flags: review?(chris.lonnen) → review-
Thanks for catching that -  it was using major_version for this before, but we only want to swap override version w/ major_version in the beta case so now it sets a default for release_version.

I tested this on my dev instance against stagedb for these three cases:
1) old tcbs (8.0a1)
2) new tcbs, release (5.0)
3) new tcbs, beta (6.0b4)

I spot-checked the results and they look good now (appropriate buildid, reasonable number of results etc). This is of course taking into account the newtcbs vs. reports discrepancy discussed earlier in this bug.
Attachment #552912 - Attachment is obsolete: true
Attachment #552917 - Flags: review?(chris.lonnen)
Sorry, gave you an older version of the patch - this one should set release_channel = 'release' for newtcbs, if channel != 'beta'
Attachment #552917 - Attachment is obsolete: true
Attachment #552917 - Flags: review?(chris.lonnen)
Attachment #552921 - Flags: review?(chris.lonnen)
Attachment #552921 - Flags: review?(chris.lonnen) → review+
Comment on attachment 552917 [details] [diff] [review]
set a default for major_version, and simplify a little

Trunk:
Committed revision 3425.

Branch:
Committed revision 3426.
jberkus found that overcounting of reports was due to multiple build_id entries (there tend to be three for each product+version, one per platform) so we were inflating the counts by 300%

I've tested:
1) old tcbs (8.0a1)
2) new tcbs release (5.0)
3) new tcbs final beta (6.0beta)
4) new tcbs beta (6.0b4)

They all seem ok, counts seem reasonable. The TCBS and reports counts will likely not match up 100% because newtcbs is UTC and reports is Pacific however.

One thing that probably warrants further investigation is that the 6.0beta4 case seemed to have multiple build IDs, we are restricting this based on the new product_version_builds table, I will take a look at that. I am assuming the code and query are now fine, other cases seem ok.
Attachment #552921 - Attachment is obsolete: true
Attachment #552936 - Flags: review?(chris.lonnen)
The multiple builds for 6.0(beta) are expected.  Some of them are garbage, but without an API for releases, we have no way to distinguish between the garbage builds and the real final build.
Comment on attachment 552936 [details] [diff] [review]
eliminate overcounting due to multiple build id entries

Review of attachment 552936 [details] [diff] [review]:
-----------------------------------------------------------------

pages load without error, the sql changes seem ok
Attachment #552936 - Flags: review?(chris.lonnen) → review+
Comment on attachment 552936 [details] [diff] [review]
eliminate overcounting due to multiple build id entries

Thanks! 

Trunk:
Committed revision 3431.
2.2 branch:
Committed revision 3432.
Status: REOPENED → RESOLVED
Closed: 13 years ago13 years ago
Resolution: --- → FIXED
(In reply to Josh Berkus from comment #17)
> The multiple builds for 6.0(beta) are expected.  Some of them are garbage,
> but without an API for releases, we have no way to distinguish between the
> garbage builds and the real final build.

Strange. I would have expected "6.0(beta)" to be treated just like a beta, with a tight build ID criteria (contrary to "6.0", i.e. the "release channel" or actually collection of non-prerelease channels).
QA verified. Followed the steps to reproduce in comment 0.

Clicked through to the reports from the query results and ensured the reports load.
https://crash-stats.allizom.org/topcrasher/byversion/Firefox/5.0/7

For example this one:
https://crash-stats.allizom.org/report/list?range_value=7&range_unit=days&date=2011-08-12&signature=libfontconfig.so.1.4.4%400x1a942&version=Firefox%3A5.0
Status: RESOLVED → VERIFIED
Component: Socorro → General
Product: Webtools → Socorro
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: