Closed Bug 635252 Opened 14 years ago Closed 14 years ago

Autocomplete textbox gets corrupted after customizing toolbars

Categories

(Core :: JavaScript Engine, defect)

x86
All
defect
Not set
major

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- final+

People

(Reporter: jgbishop, Assigned: dmandelin)

References

Details

(Keywords: regression, Whiteboard: [hardblocker][has patch][fixed-in-tracemonkey])

Attachments

(3 files)

User-Agent:       Mozilla/5.0 (Windows NT 5.1; rv:2.0b12pre) Gecko/20110218 Firefox/4.0b12pre
Build Identifier: Mozilla/5.0 (Windows NT 5.1; rv:2.0b12pre) Gecko/20110218 Firefox/4.0b12pre

I have an extension that utilizes an autocomplete textbox XUL control (i.e a textbox element with type="autocomplete"). When I customize toolbars in Firefox 4, the textbox gets corrupted in a strange way. This only happens after a word has been typed into the textbox (if you never type anything, the control remains in its good condition). After customizing toolbars, even when nothing actually happens, the autocomplete mechanism no longer works. I see the following error appear in the error console, immediately after closing the toolbar customization palette:

Error: uncaught exception: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIObserverService.removeObserver]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: chrome://browser/content/search/search.xml ::  :: line 91"  data: no]

Pressing the [Home] key after this error occurs also yields another error, along with a warning. First the error:

Error: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIAutoCompleteController.handleKeyNavigation]
Source file: chrome://global/content/bindings/autocomplete.xml
Line: 403

This warning also appears on occasion:

Warning: reference to undefined property this.popup.popupOpen
Source file: chrome://global/content/bindings/autocomplete.xml

Using the DOM Inspector to inspect the textbox element, I note that the following JavaScript items get destroyed:

Before: mInputField = [object HTMLInputElement]
After: mInputField = null

Before: mSearchNames = form-history
After: mSearchNames = null

Before: popupOpen = false
After: popupOpen = undefined

It seems to me that the textbox element is being corrupted at some point, and the innter HTMLInputElement is being destroyed. This is a bad thing.

Reproducible: Always

Steps to Reproduce:
The following steps use the sample extension I will attach to this bug as a guide. The XUL and JavaScript are the simplest test case I've been able to boil it down to.

1. Install the sample extension attached to this bug into Firefox 4 and restart the browser. A toolbar should appear with a single element (an autocomplete textbox wrapped in a toolbaritem element).
2. Type a word into the textbox control and press [Enter]. This will add the word to the textbox control's autocomplete history.
3. Clear the textbox control and retype the word, validating that the autocomplete mechanism is working as expected.
4. Right-click on the toolbar and select "Customize." Do nothing, and click the "Done" button on the toolbar palette.
5. Retype the word you typed above into the textbox. Note that the autocomplete mechanism no longer works. Clicking the history arrow to show the history also no longer works.
Actual Results:  
Autocomplete mechanism no longer works, nor does the autocomplete history drop-down appear. Errors and warnings also appear in the error console. JavaScript properties of the textbox element are corrupted. A browser restart is required to set things back to normal.

Expected Results:  
The autocomplete mechanism should continue to work. JavaScript properties within the textbox XUL control should not be destroyed.

I can reproduce the problem in Windows XP (32-bit) and Windows 7 (64-bit), but I haven't tried any other platforms. It also only occurs in Firefox 4 (problem doesn't manifest itself in Firefox 3.x).
I have attached a sample extension that helps demonstrate the bug. This is the simplest test bed I was able to boil it down to.
Regression window:
Works:
http://hg.mozilla.org/mozilla-central/rev/ba5140cd34c8
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b11pre) Gecko/20110131 Firefox/4.0b11pre ID:20110131190441
Fails:
http://hg.mozilla.org/mozilla-central/rev/fd63bdaa9cfc
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b11pre) Gecko/20110131 Firefox/4.0b11pre ID:20110131195016
Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=ba5140cd34c8&tochange=fd63bdaa9cfc
Status: UNCONFIRMED → NEW
blocking2.0: --- → ?
Ever confirmed: true
Keywords: regression
Version: unspecified → Trunk
OS: Windows XP → All
Regression window(cached TM hourly):
Works:
http://hg.mozilla.org/tracemonkey/rev/5dbc4a422c1b
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b11pre) Gecko/20110129 Firefox/4.0b11pre ID:20110129033246
Fails:
http://hg.mozilla.org/tracemonkey/rev/8835fffb27af
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b11pre) Gecko/20110129 Firefox/4.0b11pre ID:20110129134356
Pushlog:
http://hg.mozilla.org/tracemonkey/pushloghtml?fromchange=5dbc4a422c1b&tochange=8835fffb27af
Assignee: nobody → general
Component: Toolbars → JavaScript Engine
Product: Firefox → Core
QA Contact: toolbars → general
In local build,
build from 8835fffb27af : fails
build from 378cfa4ae4f5 : fails
build from 824f69dad2bc : works

Regressed by:
378cfa4ae4f5	Igor Bukanov — bug 624364 - r=jorendorff
Whiteboard: [dmandelin to investigate]
Some of the error messages in comment 0 appear in both the unregressed and regressed versions. The clear differences I was able to spot are that these messages occur only in the regressed version:

1. When clicking "Done" on the customization dialog:

WARNING: NS_ENSURE_TRUE(popup != nsnull) failed: file c:/builds/tm-d/toolkit/components/autocomplete/src/../../../../../../sources/tracemonkey/toolkit/components/autocomplete/src/nsAutoCompleteController.cpp, line 1355
JavaScript strict warning: chrome://global/content/bindings/autocomplete.xml, line 0: reference to undefined property this.popup.popupOpen

2. When trying to use the text box again after that:

WARNING: NS_ENSURE_TRUE(popup != nsnull) failed: file c:/builds/tm-d/toolkit/components/autocomplete/src/../../../../../../sources/tracemonkey/toolkit/components/autocomplete/src/nsAutoCompleteController.cpp, line 1355

3. On shutdown?

JavaScript strict warning: chrome://global/content/bindings/autocomplete.xml, line 0: reference to undefined property this.popup.popupOpen
I think I have this about 70% figured out. The problem arises here in autocomplete.xml:

      <field name="popup"><![CDATA[
        var popup = null;
        ...
        popup;
      ]]></field>

Note that we are defining a field named 'popup', and the code that computes its value also declares a variable named 'popup'.

Before the regression, this is what happens when autocomplete runs getPopup:

1. Look for the property on a XULElement: it's not there.
2. Continue the search on its prototype, which calls the resolver.
3. The XBL resolver sees the binding, so it evaluates the code.
4. The code creates a permanent property named 'popup' on the XULElement, and also computes the property value for the resolver.
5. The XBL resolver calls DefineProperty to create a property named 'popup' on the XULElement with the value computed in step 4.
6. DefineProperty overwrites the property created in step 4 with a non-permanent property.

After the regression, everything is the same, except that in step 6 the 'popup' property stays permanent. Somehow this causes later calls to getPopup to return null. I'm not exactly sure how that happens but it probably relates to the line that sets the local variable to null.

This bug can be fixed by changing the name of the local variable to 'popup2', but I want to understand the problem a little better before saying that's what we should do.
This should be fixed by the patch for bug 632003 -- right, Igor?

/be
Depends on: 632003
(In reply to comment #7)
> This should be fixed by the patch for bug 632003 -- right, Igor?

Do you mean that bug 632003 intended to fix this? That patch landed and the bug is still here.
Attached patch PatchSplinter Review
Comment on attachment 513679 [details] [diff] [review]
Patch

This patch fixes the problem reported in this bug.
I'm still not sure what's actually causing getPopup to fail on later rounds. It seems that the property lookup succeeds and returns a XULElement, but within XPConnect, we fail to convert that object to the required iid. Maybe it's picking up the wrong object? Not sure.
So do we think:

 - that we should block on this,
 - that we should take this patch?
I'm not a Firefox developer, but it seems to me that if the patch is accepted, a comment in the code explaining why the variable has been named "popup2" would be useful (for future viewers of this code).

I also forgot to mention in my original submission on that ontextentered events do not fire when the control enters into the bad state. This is undoubtedly related to the inner element being destroyed, but I figured I should mention this tidbit anyway.
Whiteboard: [dmandelin to investigate] → [dmandelin to answer comment 12]
(In reply to comment #12)
> So do we think:
> 
>  - that we should block on this,
>  - that we should take this patch?

At the present state of knowledge, I would say we don't need to block on this but we should take a cleaned-up version of the patch, per comment 21.

I want to try to figure out exactly why this goes wrong, though--there could be some other bug in there.
OK, let's take this off the nomination list, but we'd approve the safe patch once reviewed!
blocking2.0: ? → -
Whiteboard: [dmandelin to answer comment 12]
> 4. The code creates a permanent property named 'popup' on the XULElement

That's ... unfortunate.

I guess the original XBL design didn't really envision people sticking |var| inside fields.  Field evaluation uses JS_EvaluateUCScriptForPrincipalsVersion with the JSObject for the element passed in as |obj|.  So the |var| is causing this property to be added to the element, presumably.

I wonder whether the issue is that we're nuking and then reinstantiating the binding?  When the binding is removed, we'd call nsXBLProtoImpl::UndefineFields which would attempt to JS_DeleteUCProperty2 all the field properties.  If JSPROP_PERMANENT keeps this from happening (as seems likely given the configurable() checks in js_DeleteProperty), then the "popup" property won't be removed.  Then when the binding is reinstantiated it will continue to point to the old popup, whereas it should point to the new one.

It'd be good if someone could double-check the above, but I'm willing to bet money that's what's going on.  And if so, then the JS eng behavior here is in fact ok.  Everything below is predicated on the assumption that the above is a correct analysis of what's going on.

The constraints we seem to have here are:

1)  |this| inside XBL fields really needs to mean the element itself.
2)  |var| inside a script being evaluated will always stick the property on
    the global object for that script.
3)  |this| at script toplevel is the global object.
4)  |var| creates non-configurable properties.
5)  non-configurable properties can no longer be removed.
6)  XBL fields need to be able to be removed; if we don't do that things break.

The conclusion is that as long as XBL fields are evaluated as scripts, they must not use |var x| where x is the name of any other field in the binding.

We could try to switch to evaluating them as function bodies with |this| bound to the element.  That would allow us to create a new global object that could end up with all the |var| crap.  References to unqualified names would no longer be looked up on the element itself, so this would be a compat risk.  We might be able to mitigate it somewhat by setting the element as the prototype of the new global, though.  Of course if someone depends on those properties that |var| used to define, we lose.

So the risk of taking the patch in this bug (or similar patch that sticks the |var| inside a function in the field, avoiding the whole mess) and shipping as-is is that various other XBL bindings might break when needing to be reinstantiated.  We can do a sweep for breakage in our own tree, I hope, but extension compat is a question mark.

The risk of making changes is at least the possible compat issues two paragraphs above if we start treating fields as function bodies.  Any other compat issues expected there?

I'm honestly not sure which is worse at this point.

Have we actually shipped a beta or three in the current state?  If so, I'd be tempted to fix instances in our tree for now, write up some developer docs saying to not have toplevel var inside fields (people shouldn't be writing long complicated field script _anyway_, since it has to be compiled anew for every single element the field is on!),  file a followup on how we want this to work in Fx5, and keep working on XBL2 in hopes of forgetting all this.
Blocks: 624364
So yeah, the other option for fixing this binding is to move the guts of that code into a <method> so that we don't have to compile all of it every single time we have an autocomplete widget.... now luckily we have very few autocomplete widgets around, so the perf effect is negligible.
Or as a stopgap we could wrap an anonymous function around this field, thus scoping the variables (I don't want the other two variables to leak either).
I think we should reevaluate blocking on this, at least as far as deciding which approach to take.
blocking2.0: - → ?
Jorge/Fligtar: we need to understand if this is an extension compatibility issue. It appeared in Beta 11, and we would very much like to not block on it, but if it turns out to be messing with extensions, then we should.

Can you help us understand that?
blocking2.0: ? → final+
Whiteboard: [blocking speculatively][hardblocker]
In particular, can we look through extensions for the pattern where a |var| is used inside a <field> and the name of the |var| is the same as the name of some <field> (same one or different one) in the binding?
(In reply to comment #21)
> In particular, can we look through extensions for the pattern where a |var| is
> used inside a <field> and the name of the |var| is the same as the name of some
> <field> (same one or different one) in the binding?

We use https://mxr.mozilla.org/addons/ to query add-ons on AMO, but I am unsure if there's anyway to perform such a query. Anybody more familiar with mxr can give us a hint?

Regarding the effect on extension compatibility, I think the impact should be fairly minor. Using <field> is rare, and having a |var| with the same name inside the code is a pretty bad and confusing practice, so I would be surprised if it affects more than a couple of add-ons, if any.

Using autocomplete search boxes, on the other hand, is very common, and this bug may be affecting many toolbar add-ons that include search boxes. So, fixing this specific problem is a blocker IMO.

Is it possible to use |let| instead of |var| to fix this problem without changing variable names?
mxr can't perform such a query, but I assumed that _someone_ has access to the underlying data mxr is using, and is able to run scripts against it...

I'll let one of the JS folks answer the let question.
(In reply to comment #23)
> mxr can't perform such a query, but I assumed that _someone_ has access to the
> underlying data mxr is using, and is able to run scripts against it...

I had been thinking about doing such a query. The way I would do it is to write a Python script that runs over all the xml files in the tree, extracting field elements and their contents, and then searching for "var foo" in the cdata, where "foo" is the name of the field. I think it would only take half a day or so.

> I'll let one of the JS folks answer the let question.

I think that would work. It would be easy to try it out on the STR here.
> where "foo" is the name of the field.

You need to check for all fields in the binding, not just the field being installed, right?
(In reply to comment #25)
> > where "foo" is the name of the field.
> 
> You need to check for all fields in the binding, not just the field being
> installed, right?

That seems right.
Attachment #514920 - Flags: review?(neil)
Attachment #514920 - Flags: review?(bzbarsky)
Is anyone available to do the XBL audit? I could do it but I do have other release bugs I'm already working on.
Whiteboard: [blocking speculatively][hardblocker] → [blocking speculatively][hardblocker][has patch]
I did the audit. It was pretty easy. I ran it over a debug build dir. I found a few places that use var statements, but collisions other than the one in autocomplete.xml. Do I need to search method and property names, too, though?

./dist/bin/chrome/browser/content/browser/places/menu.xml
    fields with vars: _overFolder
    var statements: draggingOverChild, parent, popup, timer
./dist/bin/chrome/browser/content/browser/search/search.xml
    fields with vars: searchbarController
    var statements: param
./dist/bin/chrome/browser/content/browser/urlbarBindings.xml
    fields with vars: _copyCutController
    var statements: urlbar, val
./dist/bin/chrome/toolkit/content/global/bindings/autocomplete.xml
    fields with vars: popup
    var statements: popup, popupId, popupset
./dist/bin/chrome/toolkit/content/global/bindings/findbar.xml
    fields with vars: _observer
    var statements: prefsvc
./dist/bin/chrome/toolkit/content/global/bindings/tabbox.xml
    fields with vars: tabbox
    var statements: parent
./dist/bin/chrome/toolkit/content/global/bindings/toolbar.xml
    fields with vars: _contextMenuListener
    var statements: contextMenuId, node
./dist/bin/chrome/toolkit/content/global/bindings/videocontrols.xml
    fields with vars: TouchUtils, Utils
    var statements: attrName, buffered, currentTime, duration, element, enabled, endTime, event, high, index, isMouseOver, keystroke, length, low, maxtime, newval, oldval, probe, r, self, shouldShow, show, value, video, volume
Should be "no collisions other than the known one in autocomplete.xml" in the previous comment.
You don't need to worry about methods and properties; those aren't defined directly on the element itself, unlike fields.  They go on the XBL prototype, and are unhooked by just taking that out of the proto chain.

So the worst vars in fields can do is shadow them, but that was the case before too.
Comment on attachment 514920 [details] [diff] [review]
Stopgap patch for autocomplete.xml

r=me
Attachment #514920 - Flags: review?(bzbarsky) → review+
Assignee: general → dmandelin
Attachment #514920 - Flags: review?(neil) → review+
Comment on attachment 514920 [details] [diff] [review]
Stopgap patch for autocomplete.xml

>       <field name="popup"><![CDATA[
>-        var popup = null;
>-        var popupId = this.getAttribute("autocompletepopup");
>-        if (popupId)
>-          popup = document.getElementById(popupId);
>-        if (!popup) {
>-          popup = document.createElement("panel");
>-          popup.setAttribute("type", "autocomplete");
>-          popup.setAttribute("noautofocus", "true");
>+        // Wrap in an anonymous function so that the var statements don't
>+        // create properties on 'this'.
>+        (function (that) {
>+            var popup = null;

How about just:

{
  let popup = null;
  ...

Also note that indentation in this file is two spaces.
(In reply to comment #33)
> Comment on attachment 514920 [details] [diff] [review]
> Stopgap patch for autocomplete.xml
> 
> >       <field name="popup"><![CDATA[
> >-        var popup = null;
> >-        var popupId = this.getAttribute("autocompletepopup");
> >-        if (popupId)
> >-          popup = document.getElementById(popupId);
> >-        if (!popup) {
> >-          popup = document.createElement("panel");
> >-          popup.setAttribute("type", "autocomplete");
> >-          popup.setAttribute("noautofocus", "true");
> >+        // Wrap in an anonymous function so that the var statements don't
> >+        // create properties on 'this'.
> >+        (function (that) {
> >+            var popup = null;
> 
> How about just:
> 
> {
>   let popup = null;
>   ...

I forgot to mention that I tried that, and it didn't fix the STR in this bug.

> Also note that indentation in this file is two spaces.

I'll be sure to fix that before landing. Thanks for pointing it out.
> > How about just:
> > 
> > {
> >   let popup = null;
> >   ...
> 
> I forgot to mention that I tried that, and it didn't fix the STR in this bug.

It should, shouldn't it? Is this another bug?
http://hg.mozilla.org/tracemonkey/rev/d5a3aff4f814
Status: NEW → ASSIGNED
Whiteboard: [blocking speculatively][hardblocker][has patch] → [hardblocker][has patch][fixed-in-tracemonkey]
(In reply to comment #35)
> > > How about just:
> > > 
> > > {
> > >   let popup = null;
> > >   ...
> > 
> > I forgot to mention that I tried that, and it didn't fix the STR in this bug.
> 
> It should, shouldn't it? Is this another bug?

Dave, did you add the extra bracing Dao showed via that left brace and indentation? If not, then our let devolves to var and the same problem arises.

/be
(In reply to comment #37)
> (In reply to comment #35)
> > > > How about just:
> > > > 
> > > > {
> > > >   let popup = null;
> > > >   ...
> > > 
> > > I forgot to mention that I tried that, and it didn't fix the STR in this bug.
> > 
> > It should, shouldn't it? Is this another bug?
> 
> Dave, did you add the extra bracing Dao showed via that left brace and
> indentation? If not, then our let devolves to var and the same problem arises.

I missed that brace. So that's why it didn't work when I tried it.
http://hg.mozilla.org/mozilla-central/rev/d5a3aff4f814
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Depends on: 637159
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: