Bug 1499714: crashes for b99 builds can get the wrong version_string
Status: RESOLVED FIXED (opened 7 years ago, closed 7 years ago)
Categories: Socorro :: Processor, task, P1
Tracking: Not tracked
People: Reporter: willkg, Assigned: willkg
Attachments: 1 file
Crash reports for 63.0b99 (version: 63.0, build id: 20181015152800) are coming in and getting assigned "63.0" as the version string.
Example:
bp-7e0f45e9-6246-448a-a3c9-6f3c70181017
This bug covers figuring out why, fixing it, and reprocessing affected crashes.
Comment 1 (Assignee) • 7 years ago
Socorro has a processor rule called BetaVersionRule which uses the webapp's /api/VersionString API endpoint to do a lookup in the product version data Socorro has for a (product, version, buildid) combination and then uses the resulting version_string in the processed crash. This allows crash reports for beta builds to get the correct version string (63.0b11 vs. 63.0).
In the case of the 63.0b99 builds, however, the query that /api/VersionString is using ends up with two different version_strings:
 version_string |    build_id    | platform | product_version_id |   repository
----------------+----------------+----------+--------------------+-----------------
 63.0b99        | 20181015152800 | linux    |                  7 | mozilla-beta
 63.0b99        | 20181015152800 | mac      |                  7 | mozilla-beta
 63.0b99        | 20181015152800 | win      |                  7 | mozilla-beta
 63.0           | 20181015152800 | linux    |                  5 | mozilla-release
 63.0           | 20181015152800 | mac      |                  5 | mozilla-release
 63.0           | 20181015152800 | win      |                  5 | mozilla-release
(6 rows)
The ftpscraper data looks ok. I verified this is true of 62.0b99 as well. There's a comment in the code that goes like this:
# The query can return multiple results, but they're the same value. So
# we just return the first one.
I think that comment is wrong for the 0b99 case. I haven't looked at whether it's possibly wrong for other cases as well. Maybe.
I think the right fix here is to change /api/VersionString to also require the release channel. That would disambiguate the results and fix this issue. I think that fix also doesn't break other expectations/assumptions.
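A minimal sketch of why adding the channel disambiguates the lookup. It reuses the six rows from the query output above. The `Row` type, the `version_strings` helper, and the repository-to-channel mapping are all hypothetical and only for illustration; they are not Socorro's actual code or schema.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    version_string: str
    build_id: str
    platform: str
    repository: str


# The six rows from the query output in this comment.
ROWS = [
    Row("63.0b99", "20181015152800", "linux", "mozilla-beta"),
    Row("63.0b99", "20181015152800", "mac", "mozilla-beta"),
    Row("63.0b99", "20181015152800", "win", "mozilla-beta"),
    Row("63.0", "20181015152800", "linux", "mozilla-release"),
    Row("63.0", "20181015152800", "mac", "mozilla-release"),
    Row("63.0", "20181015152800", "win", "mozilla-release"),
]

# Assumption: a simple repository -> channel mapping, made up for this sketch.
REPOSITORY_TO_CHANNEL = {
    "mozilla-beta": "beta",
    "mozilla-release": "release",
}


def version_strings(rows, build_id: str, channel: Optional[str] = None):
    """Return the distinct version strings for a build id.

    If channel is given, only rows whose repository maps to that channel
    are considered.
    """
    matches = set()
    for row in rows:
        if row.build_id != build_id:
            continue
        if channel is not None and REPOSITORY_TO_CHANNEL.get(row.repository) != channel:
            continue
        matches.add(row.version_string)
    return sorted(matches)


# Without the channel there are two candidates, so "take the first row"
# depends on whatever order the rows happen to come back in.
ambiguous = version_strings(ROWS, "20181015152800")

# With the channel there is exactly one candidate per channel.
by_beta = version_strings(ROWS, "20181015152800", channel="beta")
by_release = version_strings(ROWS, "20181015152800", channel="release")
```

With the channel in the filter, each lookup has a single distinct answer, so the order of the returned rows no longer matters.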
Grabbing this to do now.
Assignee: nobody → willkg
Status: NEW → ASSIGNED
Priority: -- → P1
Comment 2 (Assignee) • 7 years ago
Relatedly, I suspect this has been a bug for a long time. We probably didn't notice it because the "right choice" got cached often enough in the previous iteration. The rewrite of the BetaVersionRule bits caches for longer, so the problem became noticeable.
If I have some time, I'll try to prove that theory.
Comment 3 (Assignee) • 7 years ago
Comment 4 • 7 years ago
Commits pushed to master at https://github.com/mozilla-services/socorro
https://github.com/mozilla-services/socorro/commit/60953244f9541da20c1d1bf8985917fda332c058
fix bug 1499714: further restrict VersionString by channel
This fixes the problem with b99 where there are two different version
strings and the VersionString API code picked the "first" one, but it's
unsorted, so it's really a random one and random isn't awesome here.
https://github.com/mozilla-services/socorro/commit/f05379e026cc4d763d93f1bfcec51db36b6fb81c
Merge pull request #4648 from willkg/1499714-b99
fix bug 1499714: further restrict VersionString by channel
Updated • 7 years ago
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Comment 5 (Assignee) • 7 years ago
I landed the fix, deployed it to stage, and tested it there... and it didn't work. Crashes were still getting 63.0 as the version string. Reading through the logs, I noticed that the /api/VersionString/ request for this build was ages ago. I suspect that during the deploy the processors came up before the webapp, so they talked to the old webapp and got the wrong version string. Because the value is cached in the processor, reprocessing didn't help.
I did another stage deploy and everything is fine now.
As an aside, this would have fixed itself since the cache in the processor has a TTL. I think it takes 6 hours or something like that.
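A rough sketch of the caching behavior described here: a value fetched from the old webapp keeps being served until its TTL runs out, which is why reprocessing alone didn't help. `TTLCache` is a hypothetical stand-in written for this example, not the processor's actual cache, and the 6 hour TTL is only the guess from this comment.

```python
import time


class TTLCache:
    """Tiny time-to-live cache; hypothetical, for illustration only."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key, loader):
        """Return the cached value, calling loader(key) on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value


# Fake clock so the example does not have to wait for real time to pass.
now = [0.0]
SIX_HOURS = 6 * 60 * 60  # assumption: TTL of roughly 6 hours
cache = TTLCache(ttl_seconds=SIX_HOURS, clock=lambda: now[0])

# Stand-in for the webapp lookup; starts out returning the wrong answer.
answers = {"20181015152800": "63.0"}


def lookup(build_id):
    return answers[build_id]


first = cache.get("20181015152800", lookup)   # fetched from the "old" webapp

answers["20181015152800"] = "63.0b99"         # webapp is fixed
stale = cache.get("20181015152800", lookup)   # still the cached wrong value

now[0] += SIX_HOURS                           # TTL elapses
fresh = cache.get("20181015152800", lookup)   # refetched from the fixed webapp
```

Restarting the process that holds the cache has the same effect as waiting for the TTL, since the cached entries are dropped.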
We might need to recycle the processor nodes in prod after the prod deploy depending on what comes up when.
Comment 6 (Assignee) • 7 years ago
The changes are on prod. We did have to recycle the processor nodes. I reprocessed all the crashes with that build id.
Pretty sure we're good here now.