Closed Bug 1743704 Opened 3 years ago Closed 3 years ago

remove jit-crash-categorize

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: willkg, Assigned: willkg)

Details

Attachments

(1 file)

pr 6001: bug 1743704: remove jit crash classifier 3 years ago Will Kahn-Greene [:willkg] ET needinfo? me 53 bytes, text/x-github-pull-request		Details \| Review

Will Kahn-Greene [:willkg] ET needinfo? me

Assignee

Description

•

3 years ago

We're replacing stackwalker from minidump_stackwalk (with an underscore) with minidump-stackwalker (with a hyphen) from rust-minidump.

I had forgotten that jit-crash-categorize (with hyphens, not that there's any ambiguity here, but for completeness) also comes from minidump_stackwalk (with an underscore).

It's in the processor pipeline, but I don't know much about it.

What does it do? Does Socorro do anything useful with the output? Is the output surfaced anywhere? Is it used?

If it's still interesting, can we re-implement it in rust-minidump?

Nicolas B. Pierron [:nbp]

Comment 1

•

3 years ago

I looked at the code trying to understand what it was doing.

It seems that this code attempts to determine whether a crash is caused by some corrupted code or not.
This is blind-folded approach which walk over the x86 assembly, attempting to determine whether this is something which is never produced by our JIT, returning CRASH_CORRUPT_CODE, or a bad operand or bad branch target CODE_BAD_BRANCH_TARGET.

Technically this is an approximation which has many flaws. x86 is not a fixed size assembly, thus our page start is not guaranteed to start at an instruction boundary. Then walking these random bytes might yield to decoded gibberish, which could also be identified as corrupted code. So we have a risk of having wrongly reported corrupted code. Our JIT code is patched on invalidation, which can yield to bad results as well.

Then this code bit-rot since it was created. We do generate adc/sbb instructions, which were reported as corrupted code, now exists because we can manipulate 64 bits values in WASM.

The intent behind the current code is good, however, the likelyhood of having correct code being reported as incorrect is high.
I would be happy to help revive something similar in the future, and potentially improve upon it. There is currently some investigation made by the JS team to see how we can refine our crash-stats reports to make more of them actionable.

Will Kahn-Greene [:willkg] ET needinfo? me

Assignee

Comment 2

•

3 years ago

This context is incredibly helpful!

It also tells me that the current jit-crash-classifier isn't good enough and that I should remove it from the Socorro pipeline. Then we can focus our efforts on something new that we'll use in the future.

Does that sound right?

(not currently active) Ted Mielczarek

Comment 3

•

3 years ago

For historical context, see bug 1124241.

(not currently active) Ted Mielczarek

Comment 4

•

3 years ago

Oh, bug 1268029 shows where it was actually wired up in prod at one point! Also some more reasoning in bug 764223.

Nicolas B. Pierron [:nbp]

Comment 5

•

3 years ago

(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #2)

Does that sound right?

That sounds right to me.

Will Kahn-Greene [:willkg] ET needinfo? me

Assignee

Comment 6

•

3 years ago

Rewrote the summary per comment #5.

I'm going to grab this now because reducing CI/test times would help a lot right now.

Assignee: nobody → willkg

Status: NEW → ASSIGNED

Summary: what is jit-crash-categorize and do we use it? → remove jit-crash-categorize

Will Kahn-Greene [:willkg] ET needinfo? me

Assignee

Comment 7

•

3 years ago

Attached file pr 6001: bug 1743704: remove jit crash classifier — Details

Will Kahn-Greene [:willkg] ET needinfo? me

Assignee

Comment 8

•

3 years ago

willkg merged PR #6001: "bug 1743704: remove jit crash classifier" in ade75bd.

Will Kahn-Greene [:willkg] ET needinfo? me

Assignee

Comment 9

•

3 years ago

I deployed this just now in bug #1756845. Marking as FIXED.

Status: ASSIGNED → RESOLVED

Closed: 3 years ago

Resolution: --- → FIXED

You need to log in before you can comment on or make changes to this bug.

Bugzilla

remove jit-crash-categorize

Categories

(Socorro :: Processor, task, P2)

Tracking

(Not tracked)

People

(Reporter: willkg, Assigned: willkg)

References

Details

Crash Data

Security

(public)

User Story

Attachments

(1 file)

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Attachment

General

Description

File Name

Content Type