Closed Bug 1710854 Opened 3 years ago Closed 3 years ago

cpu_arch doesn't seem to work with exists/does-not-exist filters

Categories

(Socorro :: Webapp, defect, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: willkg, Assigned: willkg)

Attachments

(1 file)

Last week I landed a fix for bug #973894. Since then, I've had difficulties getting the exists/does-not-exist filter working with some fields.

For example, there are definitely crash reports that don't have a cpu_arch, but doing a search for "cpu_arch does not exist" brings up nothing.

This bug covers figuring out what's going on. Does it affect other fields? How do we fix it?

Assignee: nobody → willkg
Status: NEW → ASSIGNED

What's going on is that Socorro is indexing crashes with a cpu_arch of "". There's no way to search for whether a value is the empty string.
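To illustrate why the "does not exist" filter comes up empty, here is a simplified model in Python. It is not Socorro's or Elasticsearch's actual code; `field_exists`, the sample documents, and the uuids are hypothetical. It assumes an exists check that treats any non-null value, including the empty string, as present.

```python
def field_exists(doc, field):
    """Simplified 'exists' check: the field is present and not None.

    Assumption: an empty string counts as an existing value.
    """
    return doc.get(field) is not None


# Hypothetical indexed documents.
crashes = [
    {"uuid": "a", "cpu_arch": "amd64"},
    {"uuid": "b", "cpu_arch": ""},  # no usable minidump: indexed as ""
    {"uuid": "c"},                  # field genuinely absent
]

# "cpu_arch does not exist" only matches documents without the field at all.
missing = [c["uuid"] for c in crashes if not field_exists(c, "cpu_arch")]
print(missing)
```

Crash "b" has no useful cpu_arch, but because the field holds "", it never shows up in the does-not-exist results.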

Has cpu_arch always defaulted to the empty string? Did that change when I converted the code to use glom?

What does the Details tab do when the cpu_arch is missing?

What other code depends on having a cpu_arch field?

Turns out I created cpu_arch in bug #1305956.

We get an empty string value in a couple of cases:

  1. there was no minidump
  2. there was a minidump, but there was a parse error or minidump-stackwalk had problems with it

The value gets used in a few places. We'd have to make some changes to the code to allow for the "field is missing" situation.

The code currently looks at minidump-stackwalk output to determine the cpu_arch value, but I think there are other ways we can determine what the cpu_arch is. For example, Fenix Java crashes have Android crash annotations that we could probably infer the architecture from.

I think instead of defaulting to the empty string, we should default to "unknown" or some similar value. That makes it possible to search for the non-value, gives us a facet that includes the non-value, and gives us an easy way to find situations where we can improve how we determine the cpu_arch value.
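A minimal sketch of that idea, assuming a hypothetical `normalize_cpu_arch` helper and made-up sample values (neither is Socorro code): once the empty value is replaced with a sentinel, it becomes an ordinary bucket that can be searched and faceted on.

```python
from collections import Counter

UNKNOWN = "unknown"  # sentinel proposed above; the exact spelling is an assumption


def normalize_cpu_arch(value):
    """Return the value unchanged, or the sentinel when it is empty or None."""
    return value if value else UNKNOWN


# Hypothetical raw cpu_arch values as they might come out of processing.
raw_values = ["amd64", "", "arm64", None, "amd64"]

# Facet counts now include an explicit bucket for the non-value.
facet = Counter(normalize_cpu_arch(v) for v in raw_values)
print(facet)
```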

Marco, Gabriele: Thoughts on this?

Flags: needinfo?(mcastelluccio)
Flags: needinfo?(gsvelto)

Yes, it's possible to recover it from annotations, and the same goes for the platform field, which is also empty when the minidump is missing. See this Java crash for example: the OS entry is Unknown. All Android crashes should have the Android_CPU_ABI annotation set, so we can fall back on that. For starters, if it's present then the platform (aka os_name) is Android (and os_pretty_version too). We should be able to extract the cpu_arch from the contents of that annotation given the following mapping:

Android_CPU_ABI    cpu_arch
armeabi-v7a        arm
arm64-v8a          arm64
x86                x86
x86_64             amd64
Flags: needinfo?(gsvelto)
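The mapping above could be applied roughly like this. This is a sketch, not the actual processor rule: the `infer_android_fields` function, the plain-dict annotations, and the "unknown" fallback for unlisted ABI values are assumptions for illustration.

```python
# Mapping taken from the table above.
ANDROID_CPU_ABI_TO_CPU_ARCH = {
    "armeabi-v7a": "arm",
    "arm64-v8a": "arm64",
    "x86": "x86",
    "x86_64": "amd64",
}


def infer_android_fields(annotations):
    """Infer os_name and cpu_arch from crash annotations (hypothetical helper).

    Returns an empty dict when Android_CPU_ABI is absent, so the caller can
    fall through to whatever other detection it has.
    """
    abi = annotations.get("Android_CPU_ABI")
    if not abi:
        return {}
    return {
        # The annotation is only expected on Android crashes.
        "os_name": "Android",
        # Assumption: ABI values outside the table map to "unknown".
        "cpu_arch": ANDROID_CPU_ABI_TO_CPU_ARCH.get(abi, "unknown"),
    }


print(infer_android_fields({"Android_CPU_ABI": "arm64-v8a"}))
```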

Ok. I like that.

In cases where the Socorro processor can't determine a cpu_arch value, can we use "unknown"? Is there a better "couldn't be determined" value to use that's used somewhere else or more convenient for known downstream data consumers?

Flags: needinfo?(gsvelto)

Unknown is fine IMHO

Flags: needinfo?(gsvelto)
Flags: needinfo?(mcastelluccio)

Here's a Jupyter notebook with analysis on the cpu_arch field data:

https://github.com/willkg/socorro-jupyter/blob/main/notebooks/bug_1710854_cpu_arch.ipynb

I pushed this to prod in bug #1711055. Marking as FIXED.

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED