Crash in OOM | large | NS_ABORT_OOM | PLDHashTable::Add | mozilla::DOMEventTargetHelper::BindToOwnerInternal

VERIFIED FIXED in Firefox 61

Status

()

defect
P2
critical
VERIFIED FIXED
a year ago
a month ago

People

(Reporter: marcia, Assigned: smaug)

Tracking

({crash})

Trunk
mozilla61
Unspecified
Windows 10
Points:
---

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox-esr60 unaffected, firefox60 unaffected, firefox61 verified)

Details

(crash signature)

Attachments

(2 attachments)

This bug was filed from the Socorro interface and is
report bp-55ae8cae-477b-4b66-afaa-308ff0180424.
=============================================================

Seen while looking at nightly crashes. This small volume crash affects Windows and Mac and appears to have started using 20180420100056. Crash reason appears to be MOZ_CRASH(OOM).

Mac crashes: https://bit.ly/2HBmLcq
Windows crashes: https://bit.ly/2I54pjh

Some code was touched in Bug 1451913. ni on :bkelly

Top 10 frames of crashing thread:

0 xul.dll NS_ABORT_OOM xpcom/base/nsDebugImpl.cpp:614
1 xul.dll PLDHashTable::Add xpcom/ds/PLDHashTable.cpp:613
2 xul.dll mozilla::DOMEventTargetHelper::BindToOwnerInternal dom/events/DOMEventTargetHelper.cpp:362
3 xul.dll mozilla::DOMEventTargetHelper::DOMEventTargetHelper dom/events/DOMEventTargetHelper.h:52
4 xul.dll mozilla::dom::MediaQueryList::MediaQueryList layout/style/MediaQueryList.cpp:26
5 xul.dll nsIDocument::MatchMedia dom/base/nsDocument.cpp:6623
6 xul.dll nsGlobalWindowInner::MatchMedia dom/base/nsGlobalWindowInner.cpp:3538
7 xul.dll static bool mozilla::dom::WindowBinding::matchMedia dom/bindings/WindowBinding.cpp:3025
8 xul.dll mozilla::dom::binding_detail::GenericMethod<mozilla::dom::binding_detail::MaybeGlobalThisPolicy, mozilla::dom::binding_detail::ThrowExceptions> dom/bindings/BindingUtils.cpp:3242
9  @0x1c68cd8433f 

=============================================================
Flags: needinfo?(bkelly)
Assignee: nobody → bkelly
Status: NEW → ASSIGNED
Flags: needinfo?(bkelly)
Priority: -- → P2
Andrea, I believe the problem here is that some sites are creating very large number of temporary MediaQueryList objects.  Perhaps in a tight loop polling mediaMatch().  These build up until GC/CC can clear them and bloat the nsIGlobalObject::mEventTargetObjects hash table.  In the crash case to a capacity >2GB.

As a fix this patch waits to bind MediaQueryList to the global until the first event listener is added.  My reasoning is that a polling loop is probably not using the event listener mechanism.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=6baac9e1ddb28baba7608ee432aaadc3c15b8dbb
Attachment #8971985 - Flags: review?(amarchesini)
Comment on attachment 8971985 [details] [diff] [review]
Make MediaQueryList lazily bind to the global when its first event listener is added. r=baku

Review of attachment 8971985 [details] [diff] [review]:
-----------------------------------------------------------------

Smaug, I pref you to review this.
Attachment #8971985 - Flags: review?(amarchesini) → review?(bugs)
(Assignee)

Comment 3

a year ago
Comment on attachment 8971985 [details] [diff] [review]
Make MediaQueryList lazily bind to the global when its first event listener is added. r=baku

I don't like this. MQL's wrapper may be created using whatever random global first, and then after adding listener (and a GC) wrapper is created using some other global.

Could we make DETH use linked list in the owner global?
Attachment #8971985 - Flags: review?(bugs) → review-
(Assignee)

Comment 4

a year ago
btw, if the objects are very common, perhaps they should have  nsJSContext::LikelyShortLivingObjectCreated(); call in ctor.
(In reply to Olli Pettay [:smaug] from comment #3)
> Comment on attachment 8971985 [details] [diff] [review]
> I don't like this. MQL's wrapper may be created using whatever random global
> first, and then after adding listener (and a GC) wrapper is created using
> some other global.

How is that possible if we use the same mDocument to get the window in both cases?  AFAICT the mDocument does not change.  Also, I don't understand how the global even changes here.  Are you hypothesizing about something like document.open()?

> Could we make DETH use linked list in the owner global?

Wouldn't that pessimize the more common case by making every DETH in the system have to implement the linked list node overhead (2 pointers and a bool)?  Its unclear to me that this would result in less memory in most cases.
(Assignee)

Comment 6

a year ago
oh, I didn't realize MediaQueryList has silly GetParentObject implementation, overriding DETH.


Having an entry in a hashtable isn't memory free either. But linked list doesn't reserve massive memory buckets at once.
Of course, there have been only 3 instances of this crash.  Maybe we should not worry about it for right now.

In any case, I can't really change the global hash table to a list during soft freeze.
(Assignee)

Comment 8

a year ago
why not?
Changing a data structure in nsIGlobalObject.h touched by every event target days before the next branch merge seems unnecessarily risky for a very low frequency OOM.

I'll look at converting to a linked list next week.
(Assignee)

Comment 10

a year ago
Posted patch wipSplinter Review
So something like this.
Some DETHs are already LinkedListElements, so need to do some casting.

remote: View your change here:
remote:   https://hg.mozilla.org/try/rev/bdca8795b0c0f6734a1605b90e125bdeb1999a3c
remote: 
remote: Follow the progress of your build on Treeherder:
remote:   https://treeherder.mozilla.org/#/jobs?repo=try&revision=bdca8795b0c0f6734a1605b90e125bdeb1999a3c
remote: recorded changegroup in replication log in 0.103s
Olli, assigning to you for now since you're working on a patch.  I can look again in a week or so if you want me to.
Assignee: bkelly → bugs
(Assignee)

Comment 12

a year ago
Comment on attachment 8972989 [details] [diff] [review]
wip

One nice thing with linked list, that LinkedListElement's dtor ensures the element is removed from list. I didn't take an advantage of that, but it is a nice extra safety check.

(target->GetOwnerGlobal() != this) should be a rather fast way to check whether deth is in that global.

Had to switch to normal iteration in some loops, since ranged-for became a tad ugly because of multiple inheritance.
Attachment #8972989 - Flags: review?(bkelly)
Comment on attachment 8972989 [details] [diff] [review]
wip

Review of attachment 8972989 [details] [diff] [review]:
-----------------------------------------------------------------

Overall looks reasonable.  I do think we need to do something better for the GetOwnerGlobal() checks, though, since its too easy for a sub-class to override that an break things.  Also, maybe consider the private inheritance option to avoid the casting, etc.

::: dom/base/nsIGlobalObject.cpp
@@ +139,5 @@
>  nsIGlobalObject::RemoveEventTargetObject(DOMEventTargetHelper* aObject)
>  {
>    MOZ_DIAGNOSTIC_ASSERT(aObject);
> +  MOZ_ASSERT(aObject->isInList());
> +  MOZ_ASSERT(aObject->GetOwnerGlobal() == this);

Isn't it possible for sub-classes to override GetOwnerGlobal() and return something different from what BindToOwner() sets?  Could we mark DOMEventTargetHelper::GetOwnerGlobal() final?

Alternatively we could add a DOMEventTargetHelper::IsBoundTo(nsIGlobalObject*) method that cannot be overriden.

@@ +140,5 @@
>  {
>    MOZ_DIAGNOSTIC_ASSERT(aObject);
> +  MOZ_ASSERT(aObject->isInList());
> +  MOZ_ASSERT(aObject->GetOwnerGlobal() == this);
> +  aObject->remove();

One thing I don't love about the LinkedList implementation is that anyone can call remove() on the node directly without going through this method.  It would be nice if we could prevent that in some way.

@@ +160,5 @@
>    bool done = false;
>    for (auto target : targetList) {
>      // Check to see if a previous iteration's callback triggered the removal
>      // of this target as a side-effect.  If it did, then just ignore it.
> +    if (target->GetOwnerGlobal() != this) {

See my previous comment about subclasses overriding GetOwnerGlobal().  It seems we could also check isInList() here, although we still need something like the global check since it could be in a different list.

::: dom/events/DOMEventTargetHelper.h
@@ +35,5 @@
>  { 0xa28385c6, 0x9451, 0x4d7e, \
>    { 0xa3, 0xdd, 0xf4, 0xb6, 0x87, 0x2f, 0xa4, 0x76 } }
>  
> +class DOMEventTargetHelper : public dom::EventTarget,
> +                             public LinkedListElement<DOMEventTargetHelper>

To avoid some of the casting issues here, what about doing something like this:

class DOMEventTargetHelper : public dom::EventTarget,
                             private LinkedListElement<DOMEventTargetHelper>
{
  friend class nsIGlobalObject;

private:
  LinkedListElement<DOMEventTargetHelper> *
  AsDETHListNode()
  {
    return static_cast<LinkedListElement<DOMEventTargetHelper>*>(this);
  }
}

This would also prevent code other than friend classes like nsIGlobalObject from calling things like remove() directly.
Attachment #8972989 - Flags: review?(bkelly) → review-
Comment on attachment 8972989 [details] [diff] [review]
wip

Review of attachment 8972989 [details] [diff] [review]:
-----------------------------------------------------------------

I missed that DOMEventTargetHelper::GetParentObject() was marked final in this patch.

It would still be nice to try the private inheritence approach to avoid the casting issues here, though.
Attachment #8972989 - Flags: review- → review+
(Assignee)

Comment 15

a year ago
Compiler doesn't seem to like the private approach. Still
error: member 'getNext' found in multiple base classes of different types when dealing with mql

Comment 16

a year ago
Pushed by opettay@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/91ba35aee750
store DOMEventTargetHelper objects in global object as a linked list, r=bkelly

Comment 17

a year ago
Pushed by opettay@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d577b7168a13
remove useless virtual, r=bustage, CLOSED TREE
https://hg.mozilla.org/mozilla-central/rev/91ba35aee750
https://hg.mozilla.org/mozilla-central/rev/d577b7168a13
Status: ASSIGNED → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
No reports on crash-stats since this fix landed.
Component: DOM → DOM: Core & HTML
Product: Core → Core
You need to log in before you can comment on or make changes to this bug.