Closed
Bug 1301763
Opened 9 years ago
Closed 8 years ago
Why are Thunderbird 49.0b1 crashes being reported in Socorro as 49.0b0?
Categories
(Socorro :: Backend, task)
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: wsmwk, Assigned: adrian)
Details
Attachments
(1 file)
137.62 KB,
image/png
I don't know if this is a socorro issue or a build issue.
Noticed a few days ago that 49.0b1 crashes are showing in socorro as 49.0b0. And 49.0b1 is not available in the socorro UI.
49.0b1 https://crash-stats.mozilla.com/daily?p=Thunderbird near zero
49.0b0 is showing at normal beta crash rates (can't query the version# directly)
https://crash-stats.mozilla.com/search/?release_channel=beta&product=Thunderbird&_sort=-date&_facets=version&_columns=date&_columns=signature&_columns=version&_columns=build_id#facet-version
For example my report bp-12e8af44-3473-4a03-81e1-e928d2160909 is showing as 49.0b0
Reporter | ||
Updated•9 years ago
Severity: critical → normal
Reporter | ||
Comment 1•8 years ago
I'm not sure where to look, what questions to ask, or who to ask. So I'll probably be pinging people for ideas to make progress.
Are we likely to hit this same issue for beta 50?
Flags: needinfo?(rail)
Comment 2•8 years ago
https://github.com/mozilla/socorro/blob/991171dcf54fbe40b247ede4a72e4b77b7c64a29/socorro/processor/mozilla_transform_rules.py#L938-L942 looks responsible for this. I'd ask Socorro folks.
Flags: needinfo?(rail)
Reporter | ||
Comment 3•8 years ago
Thanks rail! FWIW I confirm that thunderbird-49.0b1 is marked "shipped" in ship-it.
Reporter | ||
Comment 4•8 years ago
(In reply to Rail Aliiev [:rail] from comment #2)
> https://github.com/mozilla/socorro/blob/
> 991171dcf54fbe40b247ede4a72e4b77b7c64a29/socorro/processor/
> mozilla_transform_rules.py#L938-L942 looks responsible for this. I'd ask
> Socorro folks.
adrian, help?
Flags: needinfo?(adrian)
Assignee | ||
Comment 5•8 years ago
Things get marked with version b0 when we do not have any data about that beta version. I suppose this could be a problem with the ftpscraper? I do not know how Thunderbird version data is pulled into Socorro these days...
Flags: needinfo?(adrian)
Reporter | ||
Comment 6•8 years ago
The last time this happened, peterbe sorted things out in bug 1257651
Component: Build Config → General
Flags: needinfo?(peterbe)
Product: Thunderbird → Socorro
Comment 7•8 years ago
So, it happens when there's "not enough product version information" based on the parameters product, version, release_channel and build_id.
The ftpscraper is dumb. It just pulls down what's in the archive.mozilla.org. Sadly there's a lot of black magic that collects what data is in archive.mozilla.org and populates the product_versions table.
Can you take a look at http://archive.mozilla.org/pub/thunderbird/ and try to figure out what's different in the .json files for 49 that wasn't a problem before? E.g. do 46, 47 and 48 have a valid build_id or release_channel where 49 doesn't?
Here is what our ftpscraper + our black magic postgres functions have managed to collect about the recent product versions:
breakpad=> select major_version, release_version, version_string, beta_number, build_date, build_type, has_builds, is_rapid_beta from product_versions where product_name ='Thunderbird' order by version_sort desc limit 50;
major_version | release_version | version_string | beta_number | build_date | build_type | has_builds | is_rapid_beta
---------------+-----------------+----------------+-------------+------------+------------+------------+---------------
52.0 | 52.0a1 | 52.0a1 | | 2016-09-20 | nightly | t | f
51.0 | 51.0a2 | 51.0a2 | | 2016-09-20 | aurora | t | f
51.0 | 51.0a1 | 51.0a1 | | 2016-08-02 | nightly | t | f
50.0 | 50.0a2 | 50.0a2 | | 2016-08-02 | aurora | t | f
50.0 | 50.0a1 | 50.0a1 | | 2016-06-07 | nightly | t | f
49.0 | 49.0 | 49.0b1 | 1 | 2016-08-05 | beta | f | f
49.0 | 49.0a2 | 49.0a2 | | 2016-06-07 | aurora | t | f
49.0 | 49.0a1 | 49.0a1 | | 2016-04-26 | nightly | t | f
48.0 | 48.0 | 48.0b1 | 1 | 2016-07-12 | beta | f | f
48.0 | 48.0a2 | 48.0a2 | | 2016-04-26 | aurora | t | f
48.0 | 48.0a1 | 48.0a1 | | 2016-03-08 | nightly | t | f
47.0 | 47.0 | 47.0b2 | 2 | 2016-06-17 | beta | f | f
47.0 | 47.0 | 47.0b1 | 1 | 2016-06-04 | beta | f | f
47.0 | 47.0a2 | 47.0a2 | | 2016-03-08 | aurora | t | f
47.0 | 47.0a1 | 47.0a1 | | 2016-01-26 | nightly | t | f
46.0 | 46.0a2 | 46.0a2 | | 2016-01-26 | aurora | t | f
46.0 | 46.0a1 | 46.0a1 | | 2015-12-15 | nightly | t | f
45.4 | 45.4.0 | 45.4.0 | | 2016-09-28 | release | f | f
45.4 | 45.4.0 | 45.4.0b99 | 99 | 2016-09-28 | beta | f | f
45.3 | 45.3.0 | 45.3.0 | | 2016-08-25 | release | f | f
45.3 | 45.3.0 | 45.3.0b99 | 99 | 2016-08-25 | beta | f | f
45.2 | 45.2 | 45.2 | | 2016-06-28 | release | f | f
45.2 | 45.2.0 | 45.2.0 | | 2016-06-30 | release | f | f
45.2 | 45.2.0 | 45.2.0b99 | 99 | 2016-06-30 | beta | f | f
45.2 | 45.2 | 45.2b1 | 1 | 2016-05-19 | beta | f | f
45.1 | 45.1.1 | 45.1.1 | | 2016-05-26 | release | f | f
45.1 | 45.1.0 | 45.1.0 | | 2016-05-05 | release | f | f
45.1 | 45.1.0 | 45.1.0b99 | 99 | 2016-05-05 | beta | f | f
45.1 | 45.1 | 45.1b1 | 1 | 2016-04-28 | beta | f | f
45.0 | 45.0 | 45.0 | | 2016-04-07 | release | f | f
45.0 | 45.0 | 45.0b99 | 99 | 2016-04-07 | beta | f | f
45.0 | 45.0 | 45.0b4 | 4 | 2016-04-04 | beta | f | f
45.0 | 45.0 | 45.0b3 | 3 | 2016-03-22 | beta | f | f
45.0 | 45.0 | 45.0b2 | 2 | 2016-02-18 | beta | f | f
45.0 | 45.0 | 45.0b1 | 1 | 2016-02-02 | beta | f | f
45.0 | 45.0a2 | 45.0a2 | | 2015-12-15 | aurora | t | f
45.0 | 45.0a1 | 45.0a1 | | 2015-10-30 | nightly | t | f
44.0 | 44.0 | 44.0b1 | 1 | 2016-01-12 | beta | f | f
44.0 | 44.0a2 | 44.0a2 | | 2015-10-30 | aurora | t | f
44.0 | 44.0a1 | 44.0a1 | | 2015-09-22 | nightly | t | f
43.0 | 43.0 | 43.0b1 | 1 | 2015-12-07 | beta | f | f
43.0 | 43.0a2 | 43.0a2 | | 2015-09-23 | aurora | t | f
43.0 | 43.0a1 | 43.0a1 | | 2015-08-11 | nightly | t | f
42.0 | 42.0 | 42.0b2 | 2 | 2015-10-12 | beta | f | f
42.0 | 42.0 | 42.0b1 | 1 | 2015-09-23 | beta | f | f
42.0 | 42.0a2 | 42.0a2 | | 2015-08-11 | aurora | t | f
42.0 | 42.0a1 | 42.0a1 | | 2015-06-30 | nightly | t | f
41.0 | 41.0 | 41.0b2 | 2 | 2015-09-16 | beta | f | f
41.0 | 41.0 | 41.0b1 | 1 | 2015-08-27 | beta | f | f
41.0 | 41.0a2 | 41.0a2 | | 2015-06-30 | aurora | t | f
(50 rows)
Anything there you think stands out?
Flags: needinfo?(peterbe)
Comment 8•8 years ago
I'm not entirely sure what I'm doing, but here's a comparison of the query the transform rule runs, for 47, 48 and 49.
breakpad=> select major_version, release_version, version_string, beta_number, build_date, build_type, has_builds, b.build_id, b.platform from product_versions pv left join product_version_builds b on (b.product_version_id = pv.product_version_id) where product_name ='Thunderbird' and version_string = '49.0b1' order by version_sort desc limit 50;
major_version | release_version | version_string | beta_number | build_date | build_type | has_builds | build_id | platform
---------------+-----------------+----------------+-------------+------------+------------+------------+----------------+----------
49.0 | 49.0 | 49.0b1 | 1 | 2016-08-05 | beta | f | 20160805071503 | linux
49.0 | 49.0 | 49.0b1 | 1 | 2016-08-05 | beta | f | 20160805071503 | mac
(2 rows)
breakpad=> select major_version, release_version, version_string, beta_number, build_date, build_type, has_builds, b.build_id, b.platform from product_versions pv left join product_version_builds b on (b.product_version_id = pv.product_version_id) where product_name ='Thunderbird' and version_string = '48.0b1' order by version_sort desc limit 50;
major_version | release_version | version_string | beta_number | build_date | build_type | has_builds | build_id | platform
---------------+-----------------+----------------+-------------+------------+------------+------------+----------------+----------
48.0 | 48.0 | 48.0b1 | 1 | 2016-07-12 | beta | f | 20160712184236 | linux
48.0 | 48.0 | 48.0b1 | 1 | 2016-07-12 | beta | f | 20160712184236 | mac
48.0 | 48.0 | 48.0b1 | 1 | 2016-07-12 | beta | f | 20160712184236 | win
(3 rows)
breakpad=> select major_version, release_version, version_string, beta_number, build_date, build_type, has_builds, b.build_id, b.platform from product_versions pv left join product_version_builds b on (b.product_version_id = pv.product_version_id) where product_name ='Thunderbird' and version_string = '47.0b1' order by version_sort desc limit 50;
major_version | release_version | version_string | beta_number | build_date | build_type | has_builds | build_id | platform
---------------+-----------------+----------------+-------------+------------+------------+------------+----------------+----------
47.0 | 47.0 | 47.0b1 | 1 | 2016-06-04 | beta | f | 20160604054735 | linux
47.0 | 47.0 | 47.0b1 | 1 | 2016-06-04 | beta | f | 20160604054735 | mac
47.0 | 47.0 | 47.0b1 | 1 | 2016-06-04 | beta | f | 20160604054735 | win
(3 rows)
Seems fine (except the lack of a win build). Perhaps reprocessing will set the version on the crash differently. Have you tried that?
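For what it's worth, the platform gap can be made explicit with a small sketch. The rows are copied from the three query results above; the `EXPECTED` set is an assumption (that every beta should have linux, mac and win builds), not something Socorro defines.

```python
from collections import defaultdict

# (version_string, platform) pairs copied from the three query results above.
rows = [
    ("49.0b1", "linux"), ("49.0b1", "mac"),
    ("48.0b1", "linux"), ("48.0b1", "mac"), ("48.0b1", "win"),
    ("47.0b1", "linux"), ("47.0b1", "mac"), ("47.0b1", "win"),
]

# Assumption: these are the platforms every beta is expected to have.
EXPECTED = {"linux", "mac", "win"}

# Group the platforms seen for each version.
platforms = defaultdict(set)
for version, platform in rows:
    platforms[version].add(platform)

# Keep only the versions that are missing at least one expected platform.
missing = {v: sorted(EXPECTED - p) for v, p in platforms.items() if EXPECTED - p}
print(missing)
```

Only 49.0b1 shows up as incomplete, and the platform it lacks is win.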
Reporter | ||
Comment 9•8 years ago
https://archive.mozilla.org/pub/thunderbird/candidates/49.0b1-candidates/build5/win32/en-US/thunderbird-49.0b1.json looks OK to me.
Who needs to run the reprocess step?
Flags: needinfo?(adrian)
Reporter | ||
Comment 10•8 years ago
Now, neither 49.0b1 nor 49.0b0 are offered as a choice in https://crash-stats.mozilla.com/crashes-per-day/?p=Thunderbird
New: 50.0b1 was built on Friday. All crashes are being shown as 50.0b0 [1], like https://crash-stats.mozilla.com/report/index/a320137a-9e8a-42f4-a911-507262161010
[1] https://crash-stats.mozilla.com/search/?version=50.0b0&version=50.0b1&product=Thunderbird&date=%3E%3D2016-10-03T01%3A56%3A00.000Z&date=%3C2016-10-10T01%3A56%3A00.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#crash-reports
Comment 11•8 years ago
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #10)
> Created attachment 8799284 [details]
> crash-stats verisons bonkers.png
>
> Now, neither 49.0b1 nor 49.0b0 are offered as a choice in
> https://crash-stats.mozilla.com/crashes-per-day/?p=Thunderbird
That might very well be unrelated and a matter of how those drop-downs aren't doing you any favors.
The version drop-down choices are based on Firefox and it seems it doesn't reload after you have selected Thunderbird. Another battle for another day, but please file a bug.
> New: 50.0b1 was built on Friday. All crashes are being shown as 50.0b0
> [1] like
> https://crash-stats.mozilla.com/report/index/a320137a-9e8a-42f4-a911-
> 507262161010
>
According to the raw crash that crash is version "50.0"
https://crash-stats.mozilla.com/rawdumps/a320137a-9e8a-42f4-a911-507262161010.json
But the "pretty version" is turned into "50.0b0"
https://crash-stats.mozilla.com/api/UnredactedCrash/?crash_id=a320137a-9e8a-42f4-a911-507262161010
Is that not correct?
> [1]
> https://crash-stats.mozilla.com/search/?version=50.0b0&version=50.
> 0b1&product=Thunderbird&date=%3E%3D2016-10-03T01%3A56%3A00.000Z&date=%3C2016-
> 10-10T01%3A56%3A00.000Z&_sort=-
> date&_facets=signature&_columns=date&_columns=signature&_columns=product&_col
> umns=version&_columns=build_id&_columns=platform#crash-reports
According to
https://crash-stats.mozilla.com/search/?product=Thunderbird&date=%3E%3D2016-09-10T14%3A29%3A00.000Z&date=%3C2016-10-10T14%3A29%3A00.000Z&_sort=-date&_facets=version&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-version
(all Thunderbird crashes the last 1 month from now)
It seems that there are no crashes that have come in under 50.* at all. That doesn't make any sense because I'm pretty sure https://crash-stats.mozilla.com/report/index/a320137a-9e8a-42f4-a911-507262161010 should be included. It happened 4 hours since the upper date bound on my search.
Adrian, can you explain that?
Assignee | ||
Comment 12•8 years ago
When a version gets a -b0 version number, it is because we could not find data in our database for that crash's (product, version, release channel, build id) tuple.
For bp-a320137a-9e8a-42f4-a911-507262161010, that means that when it was processed, our database had no data for (Thunderbird, 50.0, beta, 20161007134619).
@Peter, that would simply be because we only show the top 50 results in facets. I checked for version 50.0b and it has only 151 results, which is less than the 50th version has (388). If you want to prove that, you can use the API, like this:
https://crash-stats.mozilla.com/api/SuperSearch/?product=Thunderbird&date=%3E%3D2016-09-10T14%3A29%3A00.000Z&date=%3C2016-10-10T14%3A29%3A00.000Z&_facets_size=200&_results_number=0&_facets=version
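A minimal sketch of how that URL can be assembled in Python, using only the parameters visible in the link above. It builds the query string and does not send the request; the variable names are illustrative.

```python
from urllib.parse import urlencode

BASE_URL = "https://crash-stats.mozilla.com/api/SuperSearch/"

# Same parameters as the link above. "date" appears twice, so a list of
# pairs is used instead of a dict.
params = [
    ("product", "Thunderbird"),
    ("date", ">=2016-09-10T14:29:00.000Z"),
    ("date", "<2016-10-10T14:29:00.000Z"),
    ("_facets_size", 200),    # raise the facet limit above the default top 50
    ("_results_number", 0),   # only the facet counts are wanted, not the hits
    ("_facets", "version"),
]

url = BASE_URL + "?" + urlencode(params)
print(url)
```

Raising `_facets_size` is what lets versions with few crashes, such as the 50.0 betas, appear in the version facet.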
Wayne, for which versions should we run a reprocess? I don't think we have data for 50.0 yet, but Peter can maybe confirm that?
Flags: needinfo?(adrian)
Reporter | ||
Comment 13•8 years ago
(In reply to Peter Bengtsson [:peterbe] from comment #11)
> ...
> > New: 50.0b1 was built on Friday. All crashes are being shown as 50.0b0
> > [1] like
> > https://crash-stats.mozilla.com/report/index/a320137a-9e8a-42f4-a911-
> > 507262161010
> >
>
> According to the raw crash that crash is version "50.0"
> https://crash-stats.mozilla.com/rawdumps/a320137a-9e8a-42f4-a911-
> 507262161010.json
>
> But the "pretty version" is turned into "50.0b0"
> https://crash-stats.mozilla.com/api/UnredactedCrash/?crash_id=a320137a-9e8a-
> 42f4-a911-507262161010
>
> Is that not correct?
We did not build a 50.0b0 so, I don't see how it could be 50.0b0.
Build specs at https://public.etherpad-mozilla.org/p/thunderbird-release-50.0b1
Reporter | ||
Comment 14•8 years ago
(In reply to Adrian Gaudebert [:adrian] from comment #12)
> When a version gets a -b0 version number, it is because we could not find
> data in our database for that crash's (product, version, release channel,
> build id) tuple.
>
> For bp-a320137a-9e8a-42f4-a911-507262161010, that means that when it was
> processed, our database had no data for (Thunderbird, 50.0, beta,
> 20161007134619).
>
> @Peter, that would simply be because we only show the top 50 results in
> facets. I checked for version 50.0b and it has only 151 results, which is
> less than the 50th version has (388). If you want to prove that, you can use
> the API, like this:
I don't understand what the number of crashes has to do with what version is offered in the UI. But 21 hours after your comment we are at 412 crashes for 50.0b0, and in https://crash-stats.mozilla.com/search/ 50.0b0 is not offered in the static version field nor for a "new line" set to "version has terms".
> https://crash-stats.mozilla.com/api/SuperSearch/
> ?product=Thunderbird&date=%3E%3D2016-09-10T14%3A29%3A00.000Z&date=%3C2016-10-
> 10T14%3A29%3A00.000Z&_facets_size=200&_results_number=0&_facets=version
>
> Wayne, for which versions should we run a reprocess? I don't think we have
> data for 50.0 yet, but Peter can maybe confirm that?
version 49.0 beta and 50.0 beta.
Assignee | ||
Comment 15•8 years ago
Since we moved to rapid betas a while ago, products in their beta version stopped being aware of their actual version number. They only know about their "major version". So for a Thunderbird 50.0b2, the version number it knows is 50.0. That is what gets sent to crash-stats as part of the crash report. Then, in crash-stats, we have a rule to rewrite these version numbers to what you would expect, so in that earlier case, turn 50.0 into 50.0b2. To do that, we use our database, in which we store a bunch of associations like I described earlier:
(product, version, release channel, build id) -> actual version number
For example:
("Thunderbird", "50.0", "beta", "20161007134619") -> "50.0b2"
When we cannot find a match in our database for that tuple (product, version, release channel, build id), we assume that we do not have data about that release yet, and thus give the crash report a "fake" version number ending with "b0". We use "0" because we know that it is not a valid version number for an actual beta, so that's a good way of seeing that something went wrong. It also makes it quite easy for us to spot all of those crashes to send them for reprocessing once we have the missing data.
So, it is expected that "50.0b0" is not in any drop-down. That is not a valid version number. It is there to show a crash report has a bogus version number.
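The rewrite rule described above can be sketched roughly as follows. This is not the actual Socorro code: `KNOWN_BUILDS` and `pretty_version` are hypothetical names, the dictionary stands in for the database lookup, and treating non-beta channels as pass-through is an assumption.

```python
# Hypothetical in-memory stand-in for the database lookup. The single entry
# is the example association given above.
KNOWN_BUILDS = {
    ("Thunderbird", "50.0", "beta", "20161007134619"): "50.0b2",
}


def pretty_version(product, version, release_channel, build_id):
    """Return the version to display for a crash report."""
    if release_channel != "beta":
        # Assumption: only beta reports need their version rewritten.
        return version
    key = (product, version, release_channel, build_id)
    # No match means there is no data for this build yet, so the report is
    # flagged with the placeholder beta number 0.
    return KNOWN_BUILDS.get(key, version + "b0")


print(pretty_version("Thunderbird", "50.0", "beta", "20161007134619"))
print(pretty_version("Thunderbird", "49.0", "beta", "20160901155122"))
```

The first call finds its tuple and returns the real beta version; the second has no matching entry and falls back to the "b0" placeholder, which is the symptom reported in this bug.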
Now, the cause of this problem is that we do not have data about Thunderbird 49.0 and 50.0 beta versions in our database. The table that contains data about builds is called `product_version_builds`. As far as I know, it is populated by our FTP Scraper cron job.
Peter, do you know why that data is missing from our postgres database? FYI, the query we use to find that data is here: https://github.com/mozilla/socorro/blob/master/socorro/processor/mozilla_transform_rules.py#L881-L891
Component: General → Backend
Comment 16•8 years ago
I've been looking at the ftpscraper recently so I ran it in dry-run mode. Here's an excerpt of the output for Thunderbird releases:
INSERT BUILD
('thunderbird', '48.0', 'win', u'20160712184236', 'beta', '1', u'mozilla-beta', 'build3')
{'ignore_duplicates': True}
INSERT BUILD
('thunderbird', '49.0', 'win', u'20160901155122', 'beta', '1', u'comm-beta', 'build5')
{'ignore_duplicates': True}
INSERT BUILD
('thunderbird', '50.0', 'win', u'20161007134619', 'beta', '1', u'comm-beta', 'build2')
{'ignore_duplicates': True}
I've just shown Windows for brevity, there's also 2 entries for linux, and one for mac. Anyway, the change from mozilla-beta to comm-beta seems like a decent lead to follow.
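To show where the repository name changes, here is a small sketch that groups the three tuples above by repository. The names given to the tuple positions (product, version, platform, and so on) are guesses from the values, not taken from the ftpscraper source.

```python
# Build tuples copied from the dry-run output above (Python 2 u'' prefixes dropped).
builds = [
    ("thunderbird", "48.0", "win", "20160712184236", "beta", "1", "mozilla-beta", "build3"),
    ("thunderbird", "49.0", "win", "20160901155122", "beta", "1", "comm-beta", "build5"),
    ("thunderbird", "50.0", "win", "20161007134619", "beta", "1", "comm-beta", "build2"),
]

# Collect, per repository, the versions that were built from it.
# Field names here are assumed from the values in each position.
versions_by_repo = {}
for product, version, platform, build_id, channel, beta, repo, candidate in builds:
    versions_by_repo.setdefault(repo, []).append(version)

print(versions_by_repo)
```

In this excerpt, 48.0 is the only build from mozilla-beta, while 49.0 and 50.0 both come from comm-beta.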
Comment 17•8 years ago
Not an answer but here's what we have in the database:
breakpad=> SELECT
breakpad-> pv.version_string,
breakpad-> pv.build_type
breakpad-> FROM product_versions pv
breakpad-> WHERE pv.product_name = 'Thunderbird'
breakpad-> AND
breakpad-> (
breakpad(> pv.release_version like '48%' OR
breakpad(> pv.release_version like '49%' OR
breakpad(> pv.release_version like '50%' OR
breakpad(> pv.release_version like '51%')
breakpad-> group by pv.version_string, pv.build_type
breakpad-> order by version_string
breakpad-> ;
version_string | build_type
----------------+------------
48.0a1 | nightly
48.0a2 | aurora
48.0b1 | beta
49.0a1 | nightly
49.0a2 | aurora
49.0b1 | beta
50.0a1 | nightly
50.0a2 | aurora
51.0a1 | nightly
51.0a2 | aurora
(10 rows)
In other words, for 48* and 49* there were 3 build types (nightly, aurora, beta).
For 50* and 51* there's only 2 build types (nightly, aurora)
Is Nick's comment about the fact that the "mozilla-beta" now seems to be "comm-beta" a clue?
Comment 18•8 years ago
Maybe, but the switch to comm-beta happens at 49, instead of 50.0. Could be another change had an impact too.
Reporter | ||
Comment 19•8 years ago
Are you saying this should be mozilla-beta, not comm-beta?
('thunderbird', '49.0', 'win', u'20160901155122', 'beta', '1', u'comm-beta', 'build5')
{'ignore_duplicates': True}
Comment 20•8 years ago
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #19)
> Are you saying this should be mozilla-beta, not comm-beta?
>
> ('thunderbird', '49.0', 'win', u'20160901155122', 'beta', '1', u'comm-beta',
> 'build5')
> {'ignore_duplicates': True}
I'm not entirely sure what the details are but I think what Nick is suggesting is that starting with v 49, the name of the repository changed. That is a clue that the stuff that ftpscraper picks up from archive.mozilla.org might be different in other ways.
Reporter | ||
Comment 21•8 years ago
Is it possible to update the database to get 50.0b1 in there and reprocess those beta crashes, while we sort out the longer term fix?
Assignee | ||
Comment 22•8 years ago
The `comm-beta` lead was a good one. That is indeed not a known repository in our database. To fix that:
> INSERT INTO release_repositories (repository) VALUES ('comm-beta');
Now we need to update the tables being queried:
> SELECT update_product_versions();
And finally we can verify that now we have builds for version 50.0:
> SELECT DISTINCT pvb.build_id FROM product_versions pv
> LEFT JOIN product_version_builds pvb ON pv.product_version_id = pvb.product_version_id
> WHERE pv.product_name = 'Thunderbird' AND pv.release_version = '50.0'
> AND pv.build_type ILIKE 'beta';
> build_id
> ----------------
> 20161007134619
> 20161017040505
> 20161003102417
> (3 rows)
I have applied all of these on stage, then went to look for a 50.0b0 crash report and reprocessed it. And tada! https://crash-stats.allizom.org/report/index/49b67ee9-f320-4fed-b57a-197cf2161026 It now has a version number of 50.0b2.
So I think that solves it. I'm going to run the same commands on prod and then reprocess all -b0 crash reports we have.
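Picking out the -b0 reports is simple because a beta number of 0 never belongs to a real build. A minimal sketch, where the regular expression, the `needs_reprocessing` helper and the sample version list are all illustrative and not part of Socorro:

```python
import re

# Matches versions such as "49.0b0" but not "49.0b1" or "45.0b10".
PLACEHOLDER = re.compile(r"^\d+(\.\d+)*b0$")


def needs_reprocessing(version):
    """True when the version carries the placeholder beta number 0."""
    return bool(PLACEHOLDER.match(version))


# Illustrative sample of version strings.
versions = ["49.0b0", "49.0b1", "50.0b0", "50.0b2", "45.4.0", "45.0b10"]
to_reprocess = [v for v in versions if needs_reprocessing(v)]
print(to_reprocess)
```

Anchoring the pattern at the end of the string keeps real betas such as b10 from being picked up by mistake.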
Assignee: nobody → adrian
Assignee | ||
Comment 23•8 years ago
I have applied the database changes to prod, and have started reprocessing all crashes with version 49.0b0 or 50.0b0 in stage and prod. That's ongoing and should be done quickly.
I believe that resolves this bug!
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED