Closed Bug 633409 Opened 13 years ago Closed 13 years ago

modified iterators are reused after delete suppression

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
major

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- final+

People

(Reporter: sdesai, Assigned: dvander)

References

()

Details

(Keywords: regression, Whiteboard: [hardblocker][fixed-in-tracemonkey][has patch])

Attachments

(2 files, 4 obsolete files)

User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_5; en-US) AppleWebKit/534.13 (KHTML, like Gecko) Chrome/9.0.597.94 Safari/534.13
Build Identifier: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0b11) Gecko/20100101 Firefox/4.0b11

The YUI JS Library's Widget infrastructure maintains a reference to an object (srcNode). 

As of Firefox 4.0b10, this reference occasionally ends up being undefined, unexpectedly.

For example, for the 5 widgets on the test page, using exactly the same code base and code pattern, it ends up being undefined for 2 of the 5 widget instances. 

The same test page maintains references for all 5 widget instances correctly in 4.0b8 and 4.0b9 as well as Firefox 3.x and all other major browsers, so I'm pretty certain it's a browser bug.

I'm trying to whittle this down to a non-YUI use case, because I really can't see how it could be anything  in the YUI code (given the information above). My suspicion is that it maybe over-aggressive GC cleanup.

The test page has further details.

Reproducible: Always

Steps to Reproduce:
1. Load Test Page

http://yuiblog.com/sandbox/ff4_bugs/ff4b10widgetbug.html

2. Compare actual results to expected results as described on the test page, on FF 4.0b10, or FF 4.0b11
     (they differ)    

3. Compare actual results to expected results as described on the test page, on FF 4.0b8, or FF 3.6.x, or IE, or Chrome, or Safari. (they're the same)
Actual Results:  
Described on test page

Expected Results:  
Described on test page

I understand the above test case may not provide a good way for you to debug the problem. I'm trying to whittle it down to a non-YUI test case.

I wanted to get it filed because there's a obvious break/change in behavior from Firefox 4.0b8 to 4.0b10 and compared to all other browsers.

If you have a general idea of what broad areas changed from 4.0b8 (which is where the problem doesn't occur) to 4.0b10 (which is where the problem was first noticed), I could use that info to whittle this down to a basic test case.

Thanks.
Assignee: general → gal
Looking.
As reported works with WebKit-based browsers but fails in FF b10 and trunk.
Works with 3.6 so this is a regression.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Can we get some QA help with this? A regression window would be great.
blocking2.0: --- → ?
Keywords: qawanted
Doesn't look like wrapper business to me. No exceptions and no funky document.domain stuff. I am guessing enumeration order here. Bisecting would really help but I have to run.
Firstly, as an open source library developer myself, have to say, your triage speed is impressesive!

Secondly, as mentioned, let me know if you have any general inclinations about where it may be, and I can try whittling down the test case to something more basic based on what's going on in the Widget implementation, if that would help.

I know that the test case passes on 4.0b8 and fails on 4.0b10, if that helps narrow down the window.
Satyen, one thing that would help is to figure out more closely when this broke, could you do a binary search in nightly builds available at ftp://ftp.mozilla.org/pub/firefox/nightly/ ?

Thanks!
blocking2.0: ? → final+
Whiteboard: [hardblocker]
Are you guys aware of a JS debugger I can use with the FF4beta versions. If I could step through the JS code, I could probably give you more info as to where exactly it's failing (and if it is something GC related or not).

The inbuilt dev tools don't seem to have a debugger I could find, and Firebug isn't compatible (at least the released versions)
nm. found a firebug which seems to work
For what it's worth, I'm bisecting the TM regression range; should have this down to a changeset soon.  But if we can simplify or otherwise analyze the testcase that would probably help debugging...
So, this is the code pattern I was suspecting may be bringing out the issue (the delete operation mainly).

However, when I step through with the debugger it doesn't fail (which maybe indicates even more some kind of GC timing related issue).

---

// the "srcNode" reference I'm referring to, goes through this code path, as an "attr" value.

_filterAttrCfgs : function(clazz, allCfgs) {
    var cfgs = null, attr, attrs = clazz.ATTRS;
    if (attrs) {
        for (attr in attrs) {
            if (attrs.hasOwnProperty(attr) && allCfgs[attr]) {
                cfgs = cfgs || {};
                cfgs[attr] = allCfgs[attr];
                delete allCfgs[attr];
            }
   }
}
return cfgs;
}, 

Headed home, but will look into it tomorrow morning, if you guys haven't found the root issue by then.
hg bisect says:

The first bad revision is:
changeset:   58589:abd854c5d634
user:        David Anderson <danderson@mozilla.com>
date:        Tue Dec 14 12:52:55 2010 -0800
summary:     Fix iterator cache hits and deleted property suppression (bug 618614, r=gal).

which sure matches up to the code in comment 14.

Note that disabling mjit (so we run tracer only) gives slightly different but still wrong behavior; running either mjit only or interp only behaves just like default does.
(In reply to comment #14)
> However, when I step through with the debugger it doesn't fail (which maybe
> indicates even more some kind of GC timing related issue).

Usually in cases like these, the presence of the debugger perturbed execution in such a way as to preclude the bug from happening.  (But to be sure, it's not always a certainty.)  The same sort of thing often happens when calls to |eval| are inserted in code not executed properly by the browser in order to debug it.  Just an FYI for future note.
If I remember the logic correctly, it probably boils down to this:

var a = {
    foo: objA,
    srcNode: objB
};

var b = {
    bar: objC
};

var c = {};

var prop;

// merge a, b into c

for (prop in a) {
    if (a.hasOwnProperty(prop)) {
        c[prop] = a[prop];
    }
}

for (prop in b) {
    if (b.hasOwnProperty(prop)) {
        c[prop] = b[prop];
    }
}

-- later on in the stack --

var d = {};

// Filter a's properties back out of c, and into d, and remove them from c.

for (prop in a) {
    if (a.hasOwnProperty(prop) && c[prop]) {
        d[prop] = c[prop];
        delete c[prop];
    }
}

----
delete is a good hint. This might be deletion suppression.
I got this in the debugger. Trying to make a test case.
This is bhackett's terrible last-iterator-cache-bypass. I am tempted to just disable it for FF4 since its only a minimal speedup. I will poke at it a bit more before I do so.
Actually this still loads stuff.
Alright, I have a fix. I ended up finding this through inspection. When we perform a deletion suppression, we remove the property from the native iterator. That iterator can end up in the native iterator cache after the iteration and is potentially reused. When we reuse it we match the shape stored in the native iterator against the object. However, it turns out we never updated that shape after we removed a property from the native iterator's id list so we end up with an iterator that matches the previous shape and provides a wrong list of ids. Patch coming.
Attached patch patch (obsolete) — Splinter Review
This is definitely a hardblocker. Good call. Low-risk fix.
dvander caught a nasty bug, shape[0] is the direct object's shape, new patch in a sec
Attached patch patch (obsolete) — Splinter Review
Attachment #511654 - Attachment is obsolete: true
Attached patch patch (obsolete) — Splinter Review
Attachment #511658 - Attachment is obsolete: true
Attachment #511660 - Flags: review?(dvander)
Attachment #511660 - Flags: review?(dvander) → review+
http://hg.mozilla.org/tracemonkey/rev/94e3767a268d
Whiteboard: [hardblocker] → [hardblocker], fixed-in-tracemonkey
Satyen, you can test this with a tinderbox build from the tracemonkey tree.
Whiteboard: [hardblocker], fixed-in-tracemonkey → [hardblocker]
Whiteboard: [hardblocker] → [hardblocker], fixed-in-tracemonkey
Depends on: 633525
Whiteboard: [hardblocker], fixed-in-tracemonkey → [hardblocker], fixed-in-tracemonkey [has patch]
The patch fails this test:

var o1 = {p1: 1};
var o2 = {p1: 1, p2: 2};

for(var x in o1) {
    for(var y in o2) {
        delete o2.p2;
    }
}

We need a slightly different logic to disable picking up modified enumerators. This one gets the enumerator stack out of whack. Backing out.
Keywords: qawanted
Whiteboard: [hardblocker], fixed-in-tracemonkey [has patch] → [hardblocker]
This needs an owner. I am too busy with content stuff right now.
Assignee: gal → general
Summary: Object references coming back undefined → modified iterators are reused after delete suppression
(In reply to comment #30)
> Satyen, you can test this with a tinderbox build from the tracemonkey tree.

It sounds like it got backed out, but let me know if you still want me to test.

Also, if I can get some pointers on where/how to get a tinderbox build that would be great (looking around in the meantime).

BTW, I've also updated the test page, at the same URL to use non-min files.

Let me know if you need me to whittle it down, but I believe the fundamental pattern is the one I describe above, it's just not consistent. That is, it follows that same pattern for one Widget, and hits the bug, and doesn't for another. Not sure if that's just a function of when GC is run.
Assignee: general → dvander
Status: NEW → ASSIGNED
cdleary-bot mozilla-central merge info:
http://hg.mozilla.org/mozilla-central/rev/94e3767a268d
Note: not marking as fixed because fixed-in-tracemonkey is not present on the whiteboard.
Attached patch fix, maybe (obsolete) — Splinter Review
If JSITER_ENUMERATE, we must always remove it from cx->enumerate. I've inverted the flag so we just avoid using the iterator cache if JSITER_UNREUSABLE is set. Building a browser to test the original bug.
(In reply to comment #35)
> cdleary-bot mozilla-central merge info:
> http://hg.mozilla.org/mozilla-central/rev/94e3767a268d

For the record, the backout was also merged:
https://hg.mozilla.org/mozilla-central/rev/6f18793cdbe2
Comment on attachment 511660 [details] [diff] [review]
patch

Okay, the new patch fixes the original bug and doesn't regress the new test.
Attachment #511660 - Attachment is obsolete: true
Attached patch fix, w/ testsSplinter Review
Attachment #511810 - Attachment is obsolete: true
Attachment #511815 - Flags: review?(gal)
The comments still say REUSABLE, otherwise looks good. r=me
Comment on attachment 511815 [details] [diff] [review]
fix, w/ tests

Verify with the website too.
Attachment #511815 - Flags: review?(gal) → review+
http://hg.mozilla.org/tracemonkey/rev/bf89669b34cb
Whiteboard: [hardblocker] → [hardblocker][fixed-in-tracemonkey]
Whiteboard: [hardblocker][fixed-in-tracemonkey] → [hardblocker][fixed-in-tracemonkey][has patch]
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Just to make sure I'm understanding the status correctly: Does #c45 mean that today's nightly mozilla-central FF4b12pre build will have the fix?
Yes this is on mozilla-central now. The next nightly will have the fix.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: