[crash] root_points_to_gcArenaPool assertion with root name "nsGenericElement::mScriptObject"

VERIFIED FIXED

Status

()

Core
DOM: Core & HTML
P3
normal
VERIFIED FIXED
18 years ago
17 years ago

People

(Reporter: sean echevarria, Assigned: jst)

Tracking

({crash})

Trunk
x86
Windows 2000
crash
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [rtm++], URL)

Attachments

(9 attachments)

(Reporter)

Description

18 years ago
JS API usage error: the address passed to JS_AddNamedRoot currently holds an
invalid jsval.  This is usually caused by a missing call to JS_RemoveRoot.
The root's name is "nsGenericElement::mScriptObject".
Assertion failure: root_points_to_gcArenaPool, at h:\moz\mozilla\js\src\jsgc.c:8
45

Similar to bug 50705 except that this is a different root name and this occurs 
after simply navigating around http://www.beatnik.com.  No eMixes involved.  No 
popup windows involved.

To reproduce, install the Beatnik plugin in 4x and then copy npBeatnk.dll from 
the 4x plugins directory to the mozilla plugins directory and navigate around 
beatnik.com for about 2 minutes (even though the 4x plugin won't work in 
mozilla, its presence enables certain functionality in the site that is 
necessary to repro the assert).
(Reporter)

Comment 1

18 years ago
Johnny: nsGenericElement::SetScriptObject() at
http://lxr.mozilla.org/seamonkey/source/layout/base/src/nsGenericElement.cpp#170
8
overwrites slots->mScriptObject.  Should any named references associated with 
the old script object be cleared before overwriting it?
Keywords: beatnik, nsbeta3, rtm
(Reporter)

Comment 2

18 years ago
adding crash keywords
Keywords: nsbeta3 → 4xp, crash
Summary: root_points_to_gcArenaPool assertion with root name "nsGenericElement::mScriptObject" → [crash] root_points_to_gcArenaPool assertion with root name "nsGenericElement::mScriptObject"

Comment 3

18 years ago
Sean, can you take a stab at this?  Johnny is pretty overloaded.
(Reporter)

Comment 4

18 years ago
Created attachment 16253 [details] [diff] [review]
proposed fix - patch to free script object's named reference
(Reporter)

Comment 5

18 years ago
This bug also caused a crash for me at http://www.live365.com
updating keywords
Keywords: patch, review

Comment 6

18 years ago
Marking rtm need info.  Johnny, how risky do you think this patch is?  If you
think it is low risk, please get it reviewed and super-reviewed and I'll mark it
rtm+ to get it on the PDT radar.
Whiteboard: [rtm need info]
Sean: does that patch cure anything?  You shouldn't have to remove a GC root 
when setting mScriptObject.  You just need to keep the one that was added, if 
there is one.  Or am I misunderstanding, and the remove is needed to get rid of 
the root in all cases?

Is bug 54431 a dup of this one?

/be
(Reporter)

Comment 8

18 years ago
Brendan: On the day I submitted the patch I did a pull from the trunk and was
still able to repro the assert.  Could no longer repro after I applied the
patch.  I'll remove the patch tomorrow and attach a stack trace of the assertion.

As far as the reasoning behind the patch - it was a shot in the dark made only
on the assumption that there probably was a missing call to JS_RemoveRoot (per
the text of the assertion).  Saw that "nsGenericElement::mScriptObject" was set
using AddNamedReference so I looked for RemoveReference rather than
JS_RemoveRoot and looked for places where mDOMSlots->mScriptObject gets cleared
without calling RemoveReference.  Found that SetScriptObject often overwrites
mScriptObject and the attached patch fixed it for me.

Will post followup Fri.

Comment 9

18 years ago
Wow, so if this patch "fixes" the bug, then what must be happening is we're
getting a call to nsGenericElement::SetScriptObject() with aScriptObject ==
null, but then we've never unrooted it? I thought that SetScriptObject() was
only called from the finalizer...
http://lxr.mozilla.org/mozilla/search/search?string=SetScriptObject%28nsnull%29

discloses

http://lxr.mozilla.org/mozilla/source/layout/html/content/src/nsHTMLEmbedElement.cpp#351

Argh! Should be easy to fix, and I do believe this is a topcrash, but can't
prove it without details about whether a plugin was involved.

/be
(Reporter)

Comment 11

18 years ago
A plugin is definitely involved.  The initial description details how to repro 
using the Beatnik 4x Netscape plugin.

I also experienced the same assert at http://www.live365.com - they use Real 
Audio.

Comment 12

18 years ago
jst, that code has r1.27 and your fingerprints on it. I don't think that you can
safely call SetScriptObject(nsnull) except from the JS finalizer. What exactly
were you trying to do there?
jst's comments say he wants to give back the old script object, but then make
sure a new one is created on the next request, because GetPluginInstance's
failure means there is no frame for this element yet.  Or something like that.

Are we using the nullity of mScriptObject to decide not to remove the root?  If
so then we should do a RemoveReference when clearing mScriptObject outside of a
finalizer.  But: where, and how many nsIScriptObjectOwner impls need to do this?

/be
(Reporter)

Comment 14

18 years ago
Created attachment 17025 [details]
stack trace of gc_root_marker assert
(Reporter)

Comment 15

18 years ago
Created attachment 17028 [details]
another (different) stack trace of the gc_root_marker assert for mScriptObject
(Reporter)

Comment 16

18 years ago
I produced the asserts by first going to a couple of pages at Beatnik.com and 
then going to live365.com.  If you've been to live365.com you'll need to remove 
your cookies that it sets in order to get their splash page.  I previously 
wrote that they use Real Audio - while true, the plugin being used when the 
assert is generated is actually Flash.  The Flash splash will not appear on 
subsequent visits unless you remove their cookies.

(Reporter)

Comment 17

18 years ago
Per bug 50705, http://www.fortunecity.com should also be a good testcase for 
this one.
*** Bug 54431 has been marked as a duplicate of this bug. ***
(Reporter)

Comment 19

18 years ago
Created attachment 17031 [details]
yet another stack trace for the same mScriptObject assert - using same rough testcase
Let's get a good patch, fast -- the rtm keyword still applies, but I'm removing 
patch and review.

/be
Keywords: patch, review

Comment 21

18 years ago
  I'm seeing a crash with the same root name at http://www.fortunecity.com . Go 
there, click on the "Music Scene" link (in the "Explore" section, toward the 
middle of the page). Click the "back" toolbar button. Click "forward." Repeat as 
necessary; it is a GC problem after all. I've never had to repeat more than once. 
And I've tried a couple of Music Scene's sibling links with the same results.
(Assignee)

Comment 22

18 years ago
Created attachment 17084 [details] [diff] [review]
Proposed fix, unroot the jsobject after calling SetScriptObject(null)
(Reporter)

Comment 23

18 years ago
tested patch and am not able to repro the assertions anymore
(Assignee)

Comment 24

18 years ago
Thanks Sean! Oh, I forgot to say what the patch does. The patch makes us unroot
the script object slot (rooted in nsGenericElement::GetSciptObject()) if we end
up setting the script object to null (which is done if the plugin frame is not
yet available so that the next call to GetScriptObject() will create a new
script object that hopefully wraps the plugin object), if we don't do this we'll
end up rooting the same script object slot more than once, and that's causing
the problem, me thinks.

Brendan, Waterson, does this make sense?
Status: NEW → ASSIGNED
jst: not quite -- it's ok to call JS_AddRoot more than once on a given address
of a root without an intervening JS_RemoveRoot (these are AddNamedReference and
RemoveReference, in nsJSContext parlance).  The problem here is that by setting
mScriptObject to nsnull you tell the subsequent SetDocument code that it doesn't
need to RemoveReference.  So the nsGenericElement is deleted, and its memory
(including the rooted pointer to a script object) recycled.  Then the GC runs
and tries to trace the root.

I need to study your patch after I've caught up on sleep.  Hope this helps in
the mean time.

/be

Comment 26

18 years ago
http://www.fortunecity.com/ did *not* cause crash under Mac OS 9.0.4 running Moz
trunk build 2000101215.

What it *did* do was take an inordinate amount of time to load nothing.  Went to
"Music Scene" link like danm suggested.  Clicked back and forth.  Moz status bar
spun and spun, but Moz itself did not crash.  The statusbar then reported
"Document: Done" after about two minutes, and displayed a blank page.

Attempted to repeat.  No crash, so busted into MacsBug to get the stdlog that
follows.

Comment 27

18 years ago
Created attachment 17123 [details]
MacsBug stdlog per kurtw's description
(Assignee)

Comment 28

18 years ago
Brendan, ok, I get it, thanks. So, to my knowledge there's no way to know in
SetDocument() if the script object had been rooted or not (if mScriptObject is
null) so I think the best thing to do is to do what my patch does, then we know
that the script object will be unrooted even in the case where we do
SetScriptObject(null), the nsHTMLEmbedElement code is the only place where
that's done (at least I couldn't find any other place).

My patch isn't complete tho, the code needs to check if mInner.mDocument is non
zero before the script object is unrooted, if there's no document it won't be
rooted in GetScriptObject() so there's no need to unroot it. I'll attach a new
patch.
(Assignee)

Comment 29

18 years ago
Created attachment 17145 [details] [diff] [review]
Updated patch, only unroot script object if it was rooted.
Whoa -- jst, your next-to-last patch
(http://bugzilla.mozilla.org/showattachment.cgi?attach_id=17084, Proposed fix,
unroot the jsobject ...) is against 1.26.  But your last patch (Updated patch,
only unroot...) is against 1.31.  No wonder I was confused and sleepy!

The latest patch looks good to me.  Waterson?

/be
(Assignee)

Comment 31

18 years ago
Whoa, I attached the wrong patch the first time, ironically the patch I attached
was the patch that contained the code that caused this problem in the first
place (I should start removing old diff files from my system!). I'm sorry about
that.

Sean, did you try to attach the second-to-last patch? Did you revert the changes
that patch contained and the assertions went away? Could you please retest with
the last patch?
(Reporter)

Comment 32

18 years ago
Yikes - I applied jst's first patch as a reverse patch.  Which meant that, yes,
the assert stopped happening but also that plugins probably can't be scripted
(was trying to do too many things at once...).   Will fix it, apply the new
patch and test on Monday.
(Reporter)

Comment 33

18 years ago
Applied the patch and got a crash dereferencing mInner.mDOMSlots.

Changed this:
+    if (mInner.mDocument) {
to this:
+    if (mInner.mDocument && mInner.mDOMSlots) {

But I still got the mScriptObject root assert.  Will try again and get stack 
trace.
(Reporter)

Comment 34

18 years ago
Created attachment 17248 [details]
trace of assert using latest patch (with mDomSlots mod)
(Reporter)

Comment 35

18 years ago
The crash dereferencing mDomSlots is related to why the patch doesn't work.  
The SetScriptObject call should happen after the RemoveReference call.
(Reporter)

Comment 36

18 years ago
Created attachment 17254 [details] [diff] [review]
updated patch

Comment 37

18 years ago
sean's last patch looks right to me
(Assignee)

Comment 38

18 years ago
Looks good to me too, I should've realized that we need to unroot before we call
SetScriptObject(null).
r=brendan@mozilla.org, and I think waterson just gave an a=.  If this patch is 
not ok, remove the [rtm+] I am adding here.  I still believe this bug may be a 
topcrash.

/be
Whiteboard: [rtm need info] → [rtm+]

Comment 40

18 years ago
PDT says rtm++, okay to land on branch.
Whiteboard: [rtm+] → [rtm++]
(Assignee)

Comment 41

18 years ago
Fix checked in on both branch and trunk. Thanks to all of you for the help here!
Status: ASSIGNED → RESOLVED
Last Resolved: 18 years ago
Resolution: --- → FIXED

Comment 42

18 years ago
Verified with 2000-10-16-09.
Status: RESOLVED → VERIFIED

Updated

17 years ago
Keywords: beatnik, rtm → nsrtm
You need to log in before you can comment on or make changes to this bug.