Closed Bug 462853 Opened 11 years ago Closed 11 years ago

"Assertion failure: slot < (obj)->map->freeslot" during Topsite Test on http://www.clarin.com

Categories

(Core :: DOM: Core & HTML, defect, P2, critical)

x86
All
defect

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: cbook, Assigned: jst)

References

()

Details

(Keywords: crash)

Attachments

(2 files, 1 obsolete file)

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1b2pre) Gecko/20081031 Firefox/3.1b2pre

Crash during the Automated TopSite Test on www.clarin.com 

According to the Logfile there was a Assertion failure: slot < (obj)->map->freeslot, at /work/mozilla/builds/1.9.1/mozilla/js/src/jsinterp.cpp:5636

http://www.clarin.com: EXIT STATUS: CRASHED signal 5 SIGTRAP (2422.151124 seconds)
Loading http://www.clarin.com/ doesn't trigger a fatal assertion or crash for me, using a mozilla-central debug build from yesterday.  Was this triggered by http://www.clarin.com/ itself or one of the sub-pages?
Summary: Crash during Topsite Test on http://www.clarin.com [@ js_Interpret + 107780] → "Assertion failure: slot < (obj)->map->freeslot" during Topsite Test on http://www.clarin.com
Attached file stack, x86_64 Linux
STR: open all Alexa Global Top 100 in tabs.  Reload All Tabs a few times.
OS: Mac OS X → All
When this happens, if you are in a debugger please find cx->fp->script->filename and cx->fp->script->lineno. Bonus points for calling

js_FramePCToLineNumber(cx, cx->fp)

and reporting that too. May help give a reproducible testcase. Thanks,

/be
(In reply to comment #3)
> When this happens, if you are in a debugger please find
> cx->fp->script->filename and cx->fp->script->lineno. Bonus points for calling
> 
> js_FramePCToLineNumber(cx, cx->fp)
> 
> and reporting that too. May help give a reproducible testcase. Thanks,
> 
> /be

Note: During a Topsite Testrun we were seeing this Problem also on espn.go.com, so this happen not only on sites like clarin.com

Also we were able to get this crash into the debugger today. mrbkap has all the logs and information now (thanks Blake for helping how to get all this!). 

Let me know when i can help to provide more steps to reproduce/logs/stacks here. 

Requesting also blocking 1.9.1 for this Bug.
Flags: blocking1.9.1?
still seeing this?
Flags: blocking1.9.1? → blocking1.9.1+
Priority: -- → P2
(In reply to comment #5)
> still seeing this?

yeah i see this still in the topsite test reports, but so far i'm not able to reproduce :/ but blake should have all the informations when we tried to debug the crash from comment #4
Unable to reproduce, tracemonkey branch, today. Loaded top 100 global in tabs, repeatedly. Any movement towards a reduced case?
Blake can you comment here ?
I can't get this to happen on mozilla-central or tracemonkey. Does it still happen for you guys on trunk?
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WORKSFORME
Reopening. Jesse reproduced yesterday, and I debugged it. This is a real problem that needs fixing. jst will patch it.

For the record: the problem that I spotted was that when a docshell dies, it calls SetDocShell(nsnull) on its script global object. In this case, the script global object is the outer window for an iframe. The iframe, however, has since been inserted into a document and has a *new* docshell. Now, we're running code for the document's inner window and do a GC that causes the *old* docshell to be collected. This calls SetDocShell on our window, resulting in a call to JS_ClearScope on its inner window, leading to this bug.
Assignee: general → jst
Status: RESOLVED → REOPENED
Component: JavaScript Engine → DOM
QA Contact: general → general
Resolution: WORKSFORME → ---
This is a patch that should fix the problem explained in this bug. But since I've been unable to reproduce this, I can't verify that whether this works or not, but it makes a lot of sense to me. An alternative to this fix, which mrbkap and I discussed earlier, is to change the signature of the SetDocShell() call to take the caller docshell as an argument, and do the check in that function. That would be safer if we're afraid of this being a problem in other places as well, but I can't think of a case where it would be, so I'm preferring this simpler patch.
Attachment #370155 - Flags: superreview?(bzbarsky)
Attachment #370155 - Flags: review?(mrbkap)
Comment on attachment 370155 [details] [diff] [review]
Untested (as I'm unable to reproduce), but should fix this bug.

I might be able to hack up a testcase for this bug... I'll see what I can do.
Attachment #370155 - Flags: review?(mrbkap) → review+
Do we no longer call nsDocShell::Destroy immediately on removal from the DOM?  That is, how are we ending up with a new docshell for the window before the old one has has Destroy() called?
So, afaict, we do call nsDocShell::Destroy immediately on removal from the DOM, but I also see us calling it with the stack:

#4  0xb2d3ad5c in nsGenericElement::DestroyContent (this=0xae6753a0)
    at /home/mrbkap/work/main/mozilla/content/base/src/nsGenericElement.cpp:3498
#5  0xb2cf6924 in nsDocument::Destroy (this=0xae612400) at /home/mrbkap/work/main/mozilla/content/base/src/nsDocument.cpp:6907
#6  0xb2a1c81b in DocumentViewerImpl::Destroy (this=0xae6be480)
    at /home/mrbkap/work/main/mozilla/layout/base/nsDocumentViewer.cpp:1497
#7  0xb36bdfb7 in ~nsSHEntry (this=0xae6607d0) at /home/mrbkap/work/main/mozilla/docshell/shistory/src/nsSHEntry.cpp:163
#8  0xb36be251 in nsSHEntry::Release (this=0xae6607d0) at /home/mrbkap/work/main/mozilla/docshell/shistory/src/nsSHEntry.cpp:182
#9  0xb2a23d88 in ~nsCOMPtr (this=0xbfdce6f4) at ../../../dist/include/xpcom/nsCOMPtr.h:510
#10 0xb2a1c7e1 in DocumentViewerImpl::Destroy (this=0xae6be480)

so maybe there's another reference keeping the the SHEntry alive that dies during GC?
But once Destroy() has been called once, mScriptGlobal gets set to null, so we shouldn't be calling SetDocShell on the window anymore...  At that point we also tear down outer and inner windows; if the iframe is then inserted into a document again we create a new docshell, new outer window, etc.

So I guess my real issue is that the analysis from comment 10 and comment 11 doesn't make sense to me.
Not going to block on this after all. If we'd get steps to reproduce this easier and we'd be able to dig in on what's going on here, I'd be happy to take a safe fix, but w/o more to go on here, I don't think it's worth holding the release for this, given how hard it appears to be to trigger this.
Flags: blocking1.9.1+ → blocking1.9.1-
Flags: wanted-next+
It doesn't seem like it's that hard to trigger, if we're hitting it on a topsite run.  Crashing on one of the topsites seems like a blocker to me! Renominating.
Flags: blocking1.9.1- → blocking1.9.1?
Flags: blocking1.9.1? → blocking1.9.1+
No longer depends on: sisyphus-tracking
Attachment #370155 - Attachment is obsolete: true
Attachment #370155 - Flags: superreview?(bzbarsky)
I'd love for that to be true, but no luck triggering this crash yet. I've been trying, bnewman tried, etc.

Jesse, Martain, any help you can provide in making this easier to trigger would be greatly appreciated.
I can't reproduce this bug, so I'm not likely to be much help with reduction ;)
Tomcat says he doesn't see this any longer either (second hand information; Tomcat, please correct me if I'm wrong here), and no luck reproducing. I can't see how we could possibly block on this any more. Minusing.
Flags: blocking1.9.1+ → blocking1.9.1-
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → WORKSFORME
Wait, didn't mrbkap reopen this bug despite not having good steps to reproduce?
That's because I (mistakenly) thought I knew what's going on. We could keep this open, but IMO it shouldn't block.
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.