Closed Bug 1748248 Opened 3 years ago Closed 3 years ago

Searches on '"mac crash info" "exists"' now return all macOS crashes

Categories

(Socorro :: Processor, defect, P2)

Unspecified
macOS

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: smichaud, Assigned: willkg)

Details

Attachments

(2 files)

It's not really up to me to decide, but I'd say this is an S1 bug.

Unless someone else beats me to it, I'll start working on it as soon as I have time -- later this week?

Severity: -- → S1
Priority: -- → P1

So it was presumably triggered by a recent change in Breakpad or Socorro.

If I remember correctly, Mozilla recently switched to using a Rust-based minidump-stackwalk. This might be the source of the problem.

If I'm right about this, someone please tell me when this change happened, and where the source code of the new minidump-stackwalk lives.

Flags: needinfo?(gsvelto)
Flags: needinfo?(a.beingessner)

Grabbing this. It's a change that started a couple of weeks ago when we switched to the new rust-minidump stackwalker, which now always emits a mac_crash_info section.

The real problem here is that I didn't really know how we were going to use the mac_crash_info data, didn't know what questions we were trying to answer using search, or how to index it when I was working on bug #1709658. Because of that, it's never been very usable and there are a few bugs stemming from that. (bug #1714190, bug #1713355)

I don't think we should change the rust-minidump stackwalker. Instead, we should go back and figure out what questions we are trying to answer, what the mac_crash_info structure looks like, and how we want to index it so that we can answer those questions.

Component: Crash Reporting → Processor
Flags: needinfo?(gsvelto)
Flags: needinfo?(a.beingessner)
Product: Toolkit → Socorro

The issue is that the entry is now populated even if it's empty (i.e. it will show num_records to be 0 followed by an empty array). That's most likely a result of how we deserialize those structures in the stack walker. You can use this query in the meantime to find only the reports that have a non-zero number of entries:

https://crash-stats.mozilla.org/search/?mac_crash_info=%40.*\"num_records\"%3A\%20[1-9][0-9]*.*&platform=Mac%20OS%20X&_facets=signature&_sort=-date&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform
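
For anyone who wants to check what that regular expression matches before pasting it into a search, here is a minimal Python sketch. The pattern is the URL-decoded regex from the query above, minus what I take to be the leading `@` regex-operator prefix and the surrounding `.*`. The sample JSON blobs and the `records` key name are assumptions made up for illustration, not taken from a real crash report, and the match relies on the serialized value having exactly one space after the colon.

```python
import json
import re

# Decoded from the search URL: "num_records" followed by ": " and a number
# that does not start with 0. Using search() makes the leading/trailing ".*"
# from the original query unnecessary.
NONZERO_RECORDS = re.compile(r'"num_records": [1-9][0-9]*')


def has_records_by_regex(serialized: str) -> bool:
    """Return True if the serialized mac_crash_info reports at least one record."""
    return NONZERO_RECORDS.search(serialized) is not None


# Hypothetical sample values; the "records" key is an assumption.
empty = json.dumps({"num_records": 0, "records": []})
populated = json.dumps({"num_records": 2, "records": [{}, {}]})

print(has_records_by_regex(empty))
print(has_records_by_regex(populated))
```

The empty structure is rejected because its count starts with 0, which is what lets the query skip the reports that only carry the placeholder section.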

(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #3)

> The real problem here is that I didn't really know how we were going to use the mac_crash_info data, didn't know what questions we were trying to answer using search, or how to index it when I was working on bug #1709658. Because of that, it's never been very usable and there are a few bugs stemming from that. (bug #1714190, bug #1713355)

I think we should leave the structure of the mac_crash_info data (as contained in crash reports and searched by Socorro) more or less as it is -- in other words in as raw a format as possible. That gives us maximum flexibility in deciding how to search on it.

I don't know how many others have been searching on the mac_crash_info data. I have, and I'm still not confident I know all the questions I want to ask of it. Note that Apple doesn't document any of the mac_crash_info data, so we can only learn by experience what it contains.

> I don't think we should change the rust-minidump stackwalker. Instead, we should go back and figure out what questions we are trying to answer, what the mac_crash_info structure looks like, and how we want to index it so that we can answer those questions.

I disagree. I think we should go back to not reporting empty mac_crash_info structures.

> I disagree. I think we should go back to not reporting empty mac_crash_info structures.

Empty mac_crash_info structures contain no information. So I don't think we should report them, or include them in searches.

This bug isn't such a hardship if I'm the only one (or one of the few) searching mac_crash_info structures.

Severity: S1 → --
Priority: P1 → --

rust-minidump always emits fields in the schema. This is easier to deal with in a lot of ways. I don't want to change it. I do see how it affects your "show me all the crash reports that have mac_crash_info" query.

I can add a new flag field that's easier to search with. Maybe has_mac_crash_info? Does that work for you?

> I can add a new flag field that's easier to search with. Maybe has_mac_crash_info? Does that work for you?

How about you just remap '"mac_crash_info" "exists"' searches to that new field? Being able to do both '"mac_crash_info" "exists"' and "has_mac_crash_info" searches would be confusing.

I can't do that.

With the new flag, you'd do the "show me all the crash reports that have mac_crash_info" query as "has_mac_crash_info is true". Does that work for you?

Steven, isn't my replacement query in comment 4 sufficient? It finds all populated reports (that is, the ones that don't have zero entries).

> With the new flag, you'd do the "show me all the crash reports that have mac_crash_info" query as "has_mac_crash_info is true". Does that work for you?

This seems the least bad option.

> Steven, isn't my replacement query in comment 4 sufficient? It finds all populated reports (that is, the ones that don't have zero entries).

It's a pain to enter all that text, and to make sure the syntax is correct.

I can put up with quite a lot of inconvenience. But if large numbers (or even moderate numbers) of people ever begin searching on mac_crash_info, these decisions will come back to haunt you.

I forgot I have some code that extracts the value and normalizes it. I'll adjust that to look at num_records and then we can return the field to what you were used to.
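
A rough sketch of what "look at num_records" could mean in a normalization step. This is not the code from the Socorro processor: the function name `normalize_mac_crash_info`, the shape of the stackwalker output, and the `records` key are hypothetical. The only detail taken from this bug is that an empty section reports num_records as 0.

```python
import json
from typing import Any, Dict, Optional


def normalize_mac_crash_info(stackwalker_output: Dict[str, Any]) -> Optional[str]:
    """Return serialized mac_crash_info only when it holds at least one record.

    Returning None stands in for "do not set the field", so that an
    "exists" search would skip crash reports with an empty section.
    """
    info = stackwalker_output.get("mac_crash_info")
    if not isinstance(info, dict):
        return None

    num_records = info.get("num_records", 0)
    if not isinstance(num_records, int) or num_records <= 0:
        # Section was emitted by the stackwalker but carries no data.
        return None

    return json.dumps(info, sort_keys=True)


# Hypothetical inputs for illustration only.
print(normalize_mac_crash_info({"mac_crash_info": {"num_records": 0, "records": []}}))
print(normalize_mac_crash_info({"mac_crash_info": {"num_records": 1, "records": [{"message": "x"}]}}))
```

Dropping the field at processing time, rather than in the stackwalker, keeps the stackwalker output schema stable while restoring the old meaning of an "exists" search.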

Grabbing this to do now.

Assignee: nobody → willkg
Status: NEW → ASSIGNED
Priority: -- → P2

Thanks! Future generations of mac_crash_info fans will be grateful ... for some value of "future" :-)

willkg merged PR #5965: "bug 1748248: fix mac_crash_info field to require records" in 0eda569.

I'll test reprocessing affected crash reports once that deploys to stage.

I'll deploy these changes to prod and reprocess the affected crash reports tomorrow.

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED

Thanks again, Will! Everything seems to be fine.

Next up in importance would be bug 1713355, I think, if you can find the time.

> Thanks! Future generations of mac_crash_info fans will be grateful ... for some value of "future" :-)

As best I can tell, the Mozilla crash report repository is also the only public repository for "mac_crash_info" data. Neither Safari's nor Chrome's crash reports are publicly available. And, as I mentioned above, "mac_crash_info" is completely undocumented. This should make people interested in searching it.
