this site causes a hang in all versions (release, beta, aurora...)




7 years ago
7 years ago


(Reporter: pjpreilly, Assigned: Ms2ger)




Firefox Tracking Flags

(Not tracked)




(1 attachment)



7 years ago
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:15.0) Gecko/20120427 Firefox/15.0a1
Build ID: 20120427040203

Comment 1

7 years ago
spinning beach ball of death.

Comment 2

7 years ago
seems to work fine in chrome.

Comment 3

7 years ago
hangs in safe mode

Comment 4

7 years ago
Confirmed in
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20120428 Firefox/15.0a1 ID:20120428030503

Unresponsive script dialog appears as follows:
A script on this page may be busy, or it may have stopped responding. You can stop the script now, or you can continue to see if the script will complete.

Regression window:
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:11.0a1) Gecko/20111218 Firefox/11.0a1 ID:20111218031140
Unresponsive script dialog appears:
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:11.0a1) Gecko/20111218 Firefox/11.0a1 ID:20111218021643

Last good: c3525cd1ce44
First bad: f9f6f9ed788a

Triggered by:
f9f6f9ed788a	Ms2ger — Bug 707576 - Remove nsIDOMNSElement; r=smaug
Blocks: 707576
Component: Untriaged → DOM
Ever confirmed: true
Keywords: regression
Product: Firefox → Core
QA Contact: untriaged → general
Version: 15 Branch → Trunk


7 years ago
OS: Mac OS X → All
Ms2ger, this is the same issue as the hang from the Big Picture site that stopped being reproducible: it's using the same library (ShareThis).  Can you please dig into this ASAP?  Please let me know if I can help; I have some bandwidth available this week.
Assignee: nobody → Ms2ger
So I think we should go ahead and back out bug 707576 on m-c and aurora (but _not_ beta) and work on figuring out why that patch causes this problem....  If we can figure that out soon enough, and the fix is safe enough, maybe we can get that fix into beta?
akeybl was the originator of the suggestion in comment 6, btw.  So we should do that.  Ms2ger, want me to put the patches together?
As far as what's going on: my current best guess is that the script is sensitive to the precise order in which properties appear on elements and that bug 707576 changed that order...

Comment 9

7 years ago
Works fine in beta 13.0b2 5/4/12. By accident or intent?
Beta doesn't have the patch bug 707576 in it.
I have a patch locally (in a tree from just before the fix for 707576 landed) that moves everything from nsIDOMNSElement to nsIDOMElement except for the children attribute. No apparent problems with that patch. If I then additionally move the children property to nsIDOMElement then the problem shows up. If I however change nothing other than moving the children property from nsIDOMNSElement to nsIDOMElement, I also don't see any problems.

Digging deeper...
Ok, this was no fun to debug. Here's what's going on here...

What happens on trunk here is that this code ends up calling a function that does this:

    function SHARETHIS_merge() {
        var e = {};
        for (var d = 0, a = arguments.length; d < a; d++) {
            var b = arguments[d];
            if (SHARETHIS_typeof(b) != "object") {
            for (var c in b) {
                var g = b[c],
                    f = e[c];
                e[c] = (f && SHARETHIS_typeof(g) == "object" && SHARETHIS_typeof(f) == "object") ? SHARETHIS_merge(f, g) : SHARETHIS_unlink(g)
        return e

Due to the fix for bug 707576 there's a difference in property order on DOM nodes, as bz suspected, and that leads to this code hitting a bunch of memory heavy property getters in this recursive function, .innerHTML and .outerHTML among others, and that ends up allocating gigs of memory and we start paging etc and we may or may not get to show the slow script dialog in a reasonable amount of time.

Before the fix for bug 707576 this code simply ilooped hitting only fast property getters, eventually hitting the stack limit and being stopped in its track, so the site "worked", with some definition of "worked".

I don't think there's anything we can do here, attempting to arbitrarily re-order our properties doesn't seem like a winning game here...
Oh, and that busted code comes from, which splatters share buttons of all kinds on sites that use it.
Arbitrary reordering would in fact not be winning, but reordering in a particular spec-required order might be....  But:

Looks like XPConnect puts the "extra" interfaces that are not on the direct proto chain on the prototype first.  So we used to have nsIDOMNodeSelector, then nsIDOMNSElement, then nsIDOMEventTarget, then nsIDOMElementCSSInlineStyle, then nsIDOMHTMLElement, then nsIDOMElement, then nsIDOMNode, etc.  Modulo the fact that quickstubbed stuff always comes before non-quickstubbed stuff or something.  So previousElementSibling and nextElementSibling came before innerHTML and guaranteed that we'd blow the stack before we got to innerHTML.

In any case, per WebIDL the HTMLElement properties (including innerHTML) would be enumerated before the Element properties (like previousElementSibling), so things would still break.  In particular, with nextElementSibling/previousElementSibling moved to nsIDOMElement things break even with XPConnect, as far as I can see.

I looked at some other browsers.  Opera lists the properties alphabetically.  innerHTML comes after firstChild and firstElementChild but before things that might take us back up or to the left in the tree, so I would expect it to be hitting that innerHTML call.  WebKit has some weird sort order but still has innerHTML coming before the tree-traversal stuff.

I checked in Chrome, and it does throw a stack exhaustion exception here, in SHARETHIS_unlink.  But it happens pretty quickly.  I'll experiment a bit to see why.

Opera throws the exception as well, and the site is unresponsive for a good long while there.

I'm going to contact ShareThis about this.

In the meantime, we should, imo, back out bug 707576 on trunk again for now, until they fix this.  And maybe on Aurora, if it's not too late to rev all the IIDs there.  Johnny, are you willing to make sure that either you or Ms2ger get to that?
Oh, and for the record, the SHARETHIS_unlink function, which starts like this:

    function SHARETHIS_unlink(c) {
        var a;
        switch (SHARETHIS_typeof(c)) {
        case "object":
            a = {};
            for (var e in c) {
                a[e] = SHARETHIS_unlink(c[e])

is the recursive culprit here.
Ah, the Chrome thing is simple.  On a trivial testcase that involves a function like SHARETHIS_unlink, Chrome runs out of stack after about 7000 recursive calls.  We run out of stack after about 150000 in Firefox 11, or after about 70000 in Firefox 10 (JS data structures got smaller in 11).  Opera runs out after about 112000.  

So Chrome obviously runs for a lot less time before running out of stack.  In fact, in my testing, they run for about 1/5 as long as we do, while only doing about 1/20 as much work...
Created attachment 623546 [details]
Testcase for comment 16

I guess we could also set a lower cap on our recursion depth, but that would break other things, I suspect.
Ah, the recursion counts I was doing were off because they counted function-valued props.  Chrome, of course, puts those higher up the proto chain, so never even gets to them.

It looks like we get to about 23k deep recursion before throwing; Chrome gets to 7k deep.  we walk a lot more properties per level...

Comment 19

7 years ago
Looks like WWN changed their website since it works on TFF 12 when it didn't previously?
Yes, something changed there.  For what it's worth, Johnny and I contacted ShareThis and they're fixing this on their end too.
Worksforme, I guess....
Last Resolved: 7 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.