cpu_arch doesn't seem to work with exists/does-not-exist filters
Categories
(Socorro :: Webapp, defect, P2)
Tracking
(Not tracked)
People
(Reporter: willkg, Assigned: willkg)
Details
Attachments
(1 file)
Last week I landed a fix for bug #973894. Since then, I've had difficulties getting the exists/does-not-exist filter working with some fields.
For example, there are definitely crash reports that don't have a cpu_arch, but doing a search for "cpu_arch does not exist" brings up nothing.
This bug covers figuring out what's going on. Does it affect other fields? How do we fix it?
Comment 1 (assignee) • 3 years ago
What's going on is that Socorro is indexing crashes with a cpu_arch of "". There's no way to search for whether a value is the empty string.
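The symptom can be sketched with a toy model of an exists filter. This is not Socorro or Elasticsearch code; the `field_exists` helper and the sample documents are hypothetical, and the only assumption is that the filter treats any indexed non-null value, including the empty string, as "exists".

```python
# Toy model of an exists / does-not-exist filter over indexed documents.
# Hypothetical helper and data; it only illustrates the symptom.


def field_exists(doc, field):
    """Return True when the field is present with a non-null value."""
    return doc.get(field) is not None


docs = [
    {"uuid": "a", "cpu_arch": "amd64"},
    {"uuid": "b", "cpu_arch": ""},  # indexed with an empty string
    {"uuid": "c"},  # field is truly missing
]

# Only the document with no cpu_arch key at all matches "does not exist";
# the empty-string document counts as having the field.
does_not_exist = [d["uuid"] for d in docs if not field_exists(d, "cpu_arch")]
```

Under that assumption, crash "b" never shows up in a "cpu_arch does not exist" search even though it carries no usable value.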
Has cpu_arch always defaulted to the empty string? Did that change when I converted the code to use glom?
What does the Details tab do when the cpu_arch is missing?
What other code depends on having a cpu_arch field?
Comment 2 (assignee) • 3 years ago
Turns out I created cpu_arch in bug #1305956.
We get an empty string value in a couple of cases:
- there was no minidump
- there was a minidump, but there was a parse error or minidump-stackwalk had problems with it
The value gets used in a few places. We'd have to make some changes to the code to allow for the "field is missing" situation.
The code currently looks at minidump-stackwalk output to determine the cpu_arch value, but I think there are other ways we can determine what the cpu_arch is. For example, Fenix Java crashes have Android crash annotations that we could probably infer the architecture from.
I think instead of defaulting to the empty string, we should default to "unknown" or some similar value. That makes it possible to search for the non-value, get a facet that includes the non-value in it, and gives us an easy way to search for situations where we can improve how we determine the cpu_arch value.
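A minimal sketch of that idea, assuming "unknown" as the sentinel; the `normalize_cpu_arch` helper and the sample values are hypothetical, not Socorro code. With a real value in place of the empty string, the non-value shows up as its own facet bucket and can be searched for directly.

```python
from collections import Counter

UNKNOWN = "unknown"  # assumed sentinel; the actual value is still under discussion


def normalize_cpu_arch(value):
    """Replace a missing or empty cpu_arch with the searchable sentinel."""
    return value if value else UNKNOWN


# Hypothetical cpu_arch values as they might come out of processing.
raw_values = ["amd64", "", None, "arm64", ""]

# Facet counts now include the non-value as a regular bucket.
facet = Counter(normalize_cpu_arch(v) for v in raw_values)
```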
Marco, Gabriele: Thoughts on this?
Comment 3 • 3 years ago
Yes, it's possible to recover it from annotations, and the same goes for the platform field, which is also empty when the minidump is missing. See this Java crash, for example: the OS entry is Unknown. All Android crashes should have the Android_CPU_ABI annotation set, so we can fall back on that. For starters, if it's present then the platform (aka os_name) is Android (and os_pretty_version too). We should be able to extract the cpu_arch from the contents of that annotation given the following mapping:
Android_CPU_ABI | cpu_arch
---|---
armeabi-v7a | arm
arm64-v8a | arm64
x86 | x86
x86_64 | amd64
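The mapping above can be sketched as a lookup table. This is an illustration rather than the processor's actual rule: the function name, the flat annotations dict, and the "unknown" fallback are assumptions; only the annotation name and the four mapping rows come from this bug.

```python
# Mapping taken from the table above.
ANDROID_CPU_ABI_TO_CPU_ARCH = {
    "armeabi-v7a": "arm",
    "arm64-v8a": "arm64",
    "x86": "x86",
    "x86_64": "amd64",
}


def infer_from_android_annotations(annotations, default="unknown"):
    """Infer os_name and cpu_arch from the Android_CPU_ABI annotation.

    Returns an empty dict when the annotation is absent, so the caller can
    keep whatever it derived from other sources. The name and return shape
    are hypothetical.
    """
    abi = annotations.get("Android_CPU_ABI")
    if not abi:
        return {}
    return {
        # The annotation being present implies an Android crash.
        "os_name": "Android",
        # Unrecognized ABI strings fall back to the assumed default.
        "cpu_arch": ANDROID_CPU_ABI_TO_CPU_ARCH.get(abi, default),
    }


result = infer_from_android_annotations({"Android_CPU_ABI": "arm64-v8a"})
```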
Comment 4 (assignee) • 3 years ago
Ok. I like that.
In cases where the Socorro processor can't determine a cpu_arch value, can we use "unknown"? Is there a better "couldn't be determined" value to use that's used somewhere else or more convenient for known downstream data consumers?
Comment 8 (assignee) • 3 years ago
Here's a Jupyter notebook with analysis on the cpu_arch field data:
https://github.com/willkg/socorro-jupyter/blob/main/notebooks/bug_1710854_cpu_arch.ipynb
Comment 9 (assignee) • 3 years ago
I pushed this to prod in bug #1711055. Marking as FIXED.