Closed Bug 1660539 Opened 3 months ago Closed 1 month ago

Crash in [@ mozilla::dom::CloneOrUndefined]

Categories

(Core :: DOM: Content Processes, defect, P5)

Unspecified
Windows 10
defect

Tracking

()

RESOLVED FIXED
83 Branch
Tracking Status
firefox-esr68 --- unaffected
firefox-esr78 --- unaffected
firefox79 --- unaffected
firefox80 --- unaffected
firefox81 --- wontfix
firefox82 --- fixed
firefox83 --- fixed

People

(Reporter: kashav, Assigned: nika)

References

(Regression)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

Crash report: https://crash-stats.mozilla.org/report/index/52499a8c-8256-4d9f-8832-5109e0200821

Top 10 frames of crashing thread:

0 xul.dll mozilla::dom::CloneOrUndefined dom/ipc/jsactor/JSActor.cpp:191
1 xul.dll mozilla::dom::JSActor::QueryHandler::RejectedCallback dom/ipc/jsactor/JSActor.cpp:426
2 xul.dll mozilla::dom::`anonymous namespace'::PromiseNativeHandlerShim::RejectedCallback dom/promise/Promise.cpp:389
3 xul.dll mozilla::dom::NativeHandlerCallback dom/promise/Promise.cpp:339
4 xul.dll js::InternalCallOrConstruct js/src/vm/Interpreter.cpp:599
5 xul.dll js::Call js/src/vm/Interpreter.cpp:681
6 xul.dll PromiseReactionJob js/src/builtin/Promise.cpp:1902
7 xul.dll js::InternalCallOrConstruct js/src/vm/Interpreter.cpp:599
8 xul.dll JS::Call js/src/jsapi.cpp:2831
9 xul.dll mozilla::dom::LifecycleConnectedCallback::Call dom/bindings/WebComponentsBinding.cpp:249

We've seen a few these daily since 20200817214602 (push log: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=1891b1e3fa34901dd33a61575d34dd52e15679f5&tochange=508a0cc2f6d446e4b016ebb6d2c740c80f830dd9).

nika, sounds like this was bug 1658827.

Flags: needinfo?(nika)

The crash reason for these crashes is: MOZ_RELEASE_ASSERT(!rv.Failed()) (OOM while serializing 'undefined')

These crashes all seem to have the JSOutOfMemory: Reported annotation on them, so my assumption that if it fails at this point it's an OOM seems to be correct.

I'm guessing these are all effectively OOM | Small, just being reported differently 'cause they weren't caught at the specific OOM site.

Flags: needinfo?(nika)

These OOMs probably happened before but with a different signature. Nika added MOZ_RELEASE_ASSERT(!rv.Failed()) (OOM while serializing 'undefined') in 81.

P5 because crash volume is low.

Severity: -- → S3
Priority: -- → P5

Hey Chris, this looks pretty serious (beta numbers) and it's about to merge to release. I guess the OOM would still happen and show up someplace else without this assert? Can you and Nika confirm you want to ship with this?

Flags: needinfo?(cpeterson)

(In reply to Jim Mathies [:jimm] from comment #5)

Hey Chris, this looks pretty serious (beta numbers) and it's about to merge to release. I guess the OOM would still happen and show up someplace else without this assert? Can you and Nika confirm you want to ship with this?

Nika said (in comment 2): "these are all effectively OOM | Small, just being reported differently 'cause they weren't caught at the specific OOM site."

Thus, IIUC, all CloneOrUndefined crashes would have been OOMs with a slightly different call site before. So I don't think we need to worry about this new crash signature.

I see this crash signature is currently #17 on Fx81 Beta's list of top content process crashes (and browser process crash #47):

https://crash-stats.mozilla.org/topcrashers/?product=Firefox&version=81.0b&days=28&process_type=content

Looks like there's about 50-90 crash reports from each beta build:

https://crash-stats.mozilla.org/search/?signature=~mozilla%3A%3Adom%3A%3ACloneOrUndefined&product=Firefox&date=%3E%3D2020-06-16T20%3A25%3A00.000Z&date=%3C2020-09-16T20%3A25%3A00.000Z&_facets=signature&_facets=build_id&_facets=platform&_facets=version&_sort=-date&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-version

Flags: needinfo?(cpeterson)

Can we maybe revise the assessment for this bug priority and severity? It was set as P5 before before the merge but since then, the number of crashes spiked a lot. We went from 40 crashes a day across channels to 200 crashes a day.

Flags: needinfo?(cpeterson)

(In reply to Pascal Chevrel:pascalc from comment #7)

Can we maybe revise the assessment for this bug priority and severity? It was set as P5 before before the merge but since then, the number of crashes spiked a lot. We went from 40 crashes a day across channels to 200 crashes a day.

@ Nika: this CloneOrUndefined crash is climbing on Fx82 Beta. AFAIU, we expect these CloneOrUndefined crashes would have happened before, just with different random crash signatures. Do you have any suggestions on how we might to determine whether these CloneOrUndefined crashes are a bigger problem the previous OOMs with random signatures?

@ Pascal: Is Fx82 Beta's overall crash rate higher than Fx81 Beta's was?

Flags: needinfo?(cpeterson) → needinfo?(nika)

The overall 82 beta cycle is less crashy than 81 was but 81 beta had several top crashers that were fixed late in the cycle.

we seem to have crashes like bp-552226d4-1497-4848-b8de-133cc0201005 or bp-cd757754-2ce3-4bac-b44c-ddf270201005 with plenty of available virtual and physical memory

Assignee: nobody → nika
Status: NEW → ASSIGNED

I've put up some patches which avoid the issue entirely by passing a Maybe over IPC, and interpreting a Nothing as undefined.

This adds a decent amount of complexity to the code, but given that these crashes are apparently occurring in the wild, it may be the best we've got for now.

It would be nice to learn why creating a structured clone buffer for a simple data value like undefined is failing so frequently in the wild, but this is probably the easiest option for now.

Flags: needinfo?(nika)
Pushed by nlayzell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7229926a45c2
Make StructuredCloneData args optional in JSActor RawMessage, r=kmag
Status: ASSIGNED → RESOLVED
Closed: 1 month ago
Resolution: --- → FIXED
Target Milestone: --- → 83 Branch

Nika, should we uplift your StructuredCloneData fix to 82 Beta? These CloneOrUndefined crashes are happening in 81 Release and 82 Beta, but it might be too late to uplift (what you describe as a "decent amount of complexity") during the last week of 82 Beta.

Flags: needinfo?(nika)

Since this is not new in 82 I wouldn't want to uplift late in RC week. Maybe for a dot release after we see how this goes on 83 for a little while?

It is still a bit early, but there are no crashes in the last 3 builds which had this fix, which seems like a good sign.

(In reply to Chris Peterson [:cpeterson] from comment #15)

Nika, should we uplift your StructuredCloneData fix to 82 Beta? These CloneOrUndefined crashes are happening in 81 Release and 82 Beta, but it might be too late to uplift (what you describe as a "decent amount of complexity") during the last week of 82 Beta.

It might be nice to uplift it in a dot release as :jcristau mentioned, if it seems stable enough on Nightly. I think the patch should apply pretty cleanly. It really depends on what the real-world crash volume looks like. If it's high enough to warrant the uplift we should go ahead with it. I think it's pretty safe.

Flags: needinfo?(nika)

Comment on attachment 9180756 [details]
Bug 1660539 - Make StructuredCloneData args optional in JSActor RawMessage,

Beta/Release Uplift Approval Request

  • User impact if declined: Crashes
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Patch has been on Nightly and Beta for a while now. The change is fairly mechanical, and has no known issues.
    If it applies cleanly to Release, it should be safe to ship.
  • String changes made/needed: None
Attachment #9180756 - Flags: approval-mozilla-release?

Comment on attachment 9180756 [details]
Bug 1660539 - Make StructuredCloneData args optional in JSActor RawMessage,

crash fix, verified in 83. approved for 82.0.1

Attachment #9180756 - Flags: approval-mozilla-release? → approval-mozilla-release+
You need to log in before you can comment on or make changes to this bug.