Crash in [@ OOM | large | NS_ABORT_OOM | nsINode::IsEqualNode] in nsFocusManager

RESOLVED FIXED in Firefox 67

Status

()

defect
P2
critical
RESOLVED FIXED
4 months ago
2 months ago

People

(Reporter: philipp, Assigned: bzbarsky)

Tracking

({crash, regression})

63 Branch
mozilla67
Points:
---

Firefox Tracking Flags

(firefox-esr60 unaffected, firefox65 wontfix, firefox66 wontfix, firefox67 fixed)

Details

(Whiteboard: [MemShrink], crash signature)

Attachments

(1 attachment)

Reporter

Description

4 months ago

This bug is for crash report bp-34468577-84a8-4df1-8cc7-638eb0190224.

Top 10 frames of crashing thread:

0 xul.dll NS_ABORT_OOM xpcom/base/nsDebugImpl.cpp:603
1 xul.dll nsINode::IsEqualNode dom/base/nsINode.cpp:881
2 xul.dll void nsFocusManager::Focus dom/base/nsFocusManager.cpp:1823
3 xul.dll nsFocusManager::WindowRaised dom/base/nsFocusManager.cpp:690
4 xul.dll nsWebBrowser::FocusActivate toolkit/components/browser/nsWebBrowser.cpp:1364
5 xul.dll mozilla::dom::TabChild::RecvActivate dom/ipc/TabChild.cpp:1314
6 xul.dll mozilla::dom::ContentBridgeChild::RecvActivate dom/ipc/ContentChild.cpp:3283
7 xul.dll mozilla::dom::PContentChild::OnMessageReceived ipc/ipdl/PContentChild.cpp:8675
8 xul.dll void mozilla::ipc::MessageChannel::DispatchMessage ipc/glue/MessageChannel.cpp:2087
9 xul.dll mozilla::ipc::MessageChannel::MessageTask::Run ipc/glue/MessageChannel.cpp:1967

oom content crashes with this signature and a stack touching nsFocusManager are regressing since firefox 63.
could it be potentially potentially related to the changes in bug 1480645 which would fit into the timeframe of the regression?

Hi Kyle, can you please take a look and suggest if bug 1480645 has something to do here? Thanks!

Flags: needinfo?(kyle)
Priority: -- → P2

While the timelines do match up, Bug 1480645 just streamlined the calls we use to focus, so AFAICT there's not really a code change that would've triggered this. We didn't change anything about how node processing works.

In some of the crash reports, it looks like we're making extremely large allocations (sometimes hundreds of mb), which is triggering the crash on 32-bit systems (though I did see a few x64 stacks). I'm not sure why/how we'd get a few hundred mb of character data being appended during that check.

ni?'ing bz who was also around some of the CharacterData stuff earlier, just for context if nothing else.

Flags: needinfo?(kyle) → needinfo?(bzbarsky)

I'm not sure why/how we'd get a few hundred mb of character data being appended during that check.

The simple answer would be a textnode with a few hundred MB of text in it.

The crash in comment 0 is allocating 1.84MB, which is quite believable.

We could change the IsEqualNode code to do in-place compares, but chances are some other 2MB allocation would happen.... :(

Flags: needinfo?(bzbarsky)

I also kinda wonder whether IsEqualNode should fast-path pointer-identical nodes...

Assignee: nobody → bzbarsky

On a silly "take a largish page (searchfox for one of our files), deep-clone the body, isEqualNode the original and the clone" testcase the new thing is about 4x faster than what we do now.

Whiteboard: [MemShrink]

Maybe file a bug in Core::String about this missing comparison function that you had to implement yourself? I don't know if it is worth fixing, but it is good practice to at least complain "up stream" about missing features.

Filed bug 1532356 and will reference it in the code comments.

Comment 9

4 months ago
Pushed by bzbarsky@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/cc4b2dbdfd83
Fix isEqualNode to not do a bunch of string-copying.  r=mccr8

Some of the other failures in that push look related. For instance, there are assertions like this:

Child 2462, Main Thread] ###!!! ASSERTION: nsTDependentString must wrap only null-terminated strings. You are probably looking for nsTDependentSubstring.: 'this->mData[substring_type::mLength] == 0', file /builds/worker/workspace/build/src/obj-firefox/dist/include/nsTString.h, line 456

Hrm. I guess nsTextFragment sometimes null-terminates its stuff (e.g. the aForce2b codepath in SetTo) and sometimes doesn't (e.g. the aLength == 1 codepath). What a mess. :(

Yes, those are all the same assertion failure... I have a try run going with that fixed to see whether there's anything else in there.

Flags: needinfo?(bzbarsky)

Comment 15

4 months ago
Pushed by bzbarsky@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a1fc95a1f08d
Fix isEqualNode to not do a bunch of string-copying.  r=mccr8

Comment 16

4 months ago
bugherder
Status: NEW → RESOLVED
Closed: 4 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla67
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.