Closed Bug 371576 Opened 17 years ago Closed 17 years ago

Script elements added with appendChild no longer execute synchronously (was: Firefox 2.0.0.2 update breaks Backbase enabled web sites)

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED

People

(Reporter: devon, Assigned: sicking)

References

()

Details

(5 keywords, Whiteboard: regression from bug 364692)

Attachments

(3 files, 5 obsolete files)

User-Agent:       Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1
Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1

The Firefox 2.0.0.2 update breaks runtime compatibility with Backbase enabled web sites.  Backbase enabled sites simply will not function.  (No error page displayed.  No script timeout.  Nothing.)

Revert back to Firefox 2.0.0.1 and Backbase related sites work again.

Reproducible: Always

Steps to Reproduce:
1. Using Firefox 2.0.0.2, visit http://www.backbase.com
   You will see their web site content and UI controls, etc.
2. Update to Firefox 2.0.0.2
3. Visit http://www.backbase.com
Actual Results:  
Visiting the backbase.com site simply displays a blank page.

Expected Results:  
Backbase site to initialize properly and displays the UI to the user.

For purpose of recreating this bug, visiting backbase.com is sufficient.  

PLEASE NOTE that there are a lot of sites out there using Backbase technology.  The Firefox 2.0.0.2 Mac update breaks access to all these sites!

ALSO NOTE: The real cause of the problem may in fact be in the Backbase framework itself; I don't know.  (I do not work for Backbase nor represent them in any way.)  Whatever the real cause, Firefox 2.0.0.2 does break Backbase enabled sites.  This incompatibility has been reported to Backbase by their developers on forums on the Backbase site.  Hopefully the Firefox and Backbase teams already have contacts to begin working with each other on the problem.
Affects windows, too, and also broke between 1.5.0.9 -> 1.5.0.10
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: blocking1.8.1.3?
Flags: blocking1.8.0.11?
Keywords: regression
OS: Mac OS X → All
Hardware: Macintosh → All
Additional data points:
- The problem occurs with a clean Firefox 2.0.0.2 (Mac) install.  Not related to the update from 2.0.0.1 to 2.0.0.2.
- If you leave the browser window open for many minutes (5+ for me), Firefox will sometimes report a script timeout after visiting a Backbase enabled site.  I've gotten times out after waiting 5+ mins.  So internally, scripting is still trying to run something.  However, even if you continue past script timeouts, a Backbase site does not appear.
In a debug build there's a few CSS errors and then a zillion lines of

WARNING: NS_ENSURE_TRUE(aPos < slotCount) failed, file c:/dev/ff2/mozilla/content/base/src/nsAttrAndChildArray.cpp, line 470
Backing out sicking's checkin for bug 364692 fixes this.
Assignee: nobody → jonas
Blocks: 364692
Whiteboard: regression from bug 364692
Note that bug 364692 fixes Zalewski security bug 371321 so a simple backout isn't a real option. We'd need some other fix.
Attached file unminimised testcase (obsolete) —
To reproduce:
- Extract zip file and open the "skeletons" directory.
- Open the "basic-startup.html" file.

In unaffected build, you see "Click here to see if your Backbase installation is working" text and no js errors.
In affected builds, you see the "source" and you get this js error:
Error: a1o_ is not defined
Source File: file:///C:/Documents%20and%20Settings/mw/Bureaublad/bb_test/bpc/bpc_02.js
Line: 1
Attached file minimised testcase (obsolete) —
You should get a green background with the testcase, but that doesn't happen with current builds.

The external data/uri script has this src:
var x=document.createElement('script'); x.innerHTML = 'function doe3(){document.body.innerHTML = "You should see this text"; document.body.style.backgroundColor = "lime";}'; document.getElementsByTagName('head')[0].appendChild(x);
Also affecting some extension developers, it appears.
Attached file automate martijn's test case (obsolete) —
So in the minimized testcase we basically have the following:

1) The outer script (the one with the data: URI) loads.
2) It executes, adding the inner script (which defines doe3) to the DOM.
3) The inner script is not executed, since we're in an update.
4) We remove the execute blocker in EndUpdate.
5) We post an event to execute the inner script.
6) We fire the onload event on the outer script.
7) The code in the onload event handler assumes the inner script has already
   executed, and breaks.
Perhaps we should hold off on firing onload on a script until all the scripts which are supposed to execute as a result of it executing have loaded?

shaver, how is this affecting extension developers?  Are they using onload handlers on scripts which create inline scripts?  Or something else?
In fact, for compat it seems to me that we should keep the order of firing ScriptEvaluated the same.  In other words, something like this, maybe:

  In nsScriptLoader::ProcessRequest don't fire ScriptEvaluated if EvaluateScript
    added non-loading requests to mPendingRequests.
  Have said non-loading requests hold a reference to the "parent" aRequest
    passed to ProcessRequest so it won't die.
  Keep track of how many pending children a request has.
  After those requests have fired their ScriptAvailable, they should attempt to
    call ScriptAvailable on the parent, which should happen if it has no
    remaining kids.

Jonas, does that make sense?
Assignee: jonas → general
Component: General → DOM
Product: Firefox → Core
QA Contact: general → ian
Version: unspecified → Trunk
Ugh.  Damn bugzilla.  Jonas, see comment 13?
Assignee: general → jonas
Flags: blocking1.9?
This testcase will work in all browsers except the latest 2.0.0.2 and 1.5.0.10 versions of Firefox. Using synchronous XMLHTTPRequest and responseText / text assignment of script Backbase provides a JIT dynamic script loading mechanism. This doesn't work anymore because scripts arent executed right-away after appending them to the <head> (which it did before).

It seems all script tags became asynchronous and no load event or what so ever is fired on the script tag.

For now i only found the Backbase software broken but it's very BAD to suddenly change the way of loading dynamic script's from synchronous to asynchronous. Many other toolkits which are using dynamic script loading might also being affected.
Attached file mochikit version of Sjoerd's testcase (obsolete) —
Sjoerd, thanks for the testcase. We're very sorry we've broken Backbase, that should not have happened.
is there a workaround that sites or extensions can use?
Yes.  Put whatever code depends on the newly appended script into a setTimeout(0).

That said, comment 15 brings up a good point.  That issue won't be fixed just by changing when onload for a <script> fires...
Backbase isn't the only product affected by this. This bug breaks Telerik's r.a.d.ajax component, as well as a number of other controls that rely on the aforementioned dynamic script loading. While I've been using the setTimeout approach in attempt to get my app working again, this really isn't a practical approach for complex scenarios.

I've been working with the vendor for the past 48 hours trying to work around this problem. The only solutions thus far introduce awful page flash due to the delayed execution. I really hope this gets addressed soon...

(In reply to comment #19)
> Backbase isn't the only product affected by this. This bug breaks Telerik's
> r.a.d.ajax component, as well as a number of other controls that rely on the
> aforementioned dynamic script loading. While I've been using the setTimeout
> approach in attempt to get my app working again, this really isn't a practical
> approach for complex scenarios.
> 
> I've been working with the vendor for the past 48 hours trying to work around
> this problem. The only solutions thus far introduce awful page flash due to the
> delayed execution. I really hope this gets addressed soon...
> 

I'm also affected by this bug (https://bugzilla.mozilla.org/show_bug.cgi?id=371720) and I've taken the work around of not using settimeout (i also need synchonous execution) but to remove the DOM javascript nodes and doing eval() of the code (javascriptNode.text property) and then insert the rest of nodes with a normal appendChild().

Probably it helps you.
Blocks: 371720
I also have code which is broken by this bug.  setTimeout will not fix my problems.  Instead:

I would have to break up a loop in the middle of a nested if statement, save my context for that loop and create a new script tag to pick up where I left off after the current script tag manages to execute.  Meanwhile, I'll also need to put a global lock on the page so only one instance of this loop (which already suspends its execution at another point, for an ajax call) can be live at any time -- because I need to control the state of the page when the original script tag executes.

This is certainly doable, but it will take time, and will require some testing to make sure I don't inadvertently break something else.  I'd be tempted to submit a fix against firefox for that same development effort.  However, since the original issue was a security issue, and since I do not understand all the implications surrounding that part of the system (I don't really even understand why this implicit defer solves a security problem), I'm reluctant to try that.

For what it's worth, we're trying to get this fixed ASAP....  Comment 15 is spot on as to what the problem is.  The issue is fixing that without reopening the security holes.
Attached patch Patch to fixSplinter Review
What we want is for scripts to execute syncronously, but only after the last EndUpdate is called.

However we don't want to regress bug 364692, see especially bug 364692 comment 9. So this is fixed in a number of ways:

1. Make sure to that the last EndUpdate is called after the scriptloader is reenabled
2. Don't syncronously reenable the parser when the blocking script is done executing
3. Only process the pending request queue in the scriptloader on the last EndUpdate if the queue was empty the first BeginUpdate was called.
Attachment #256888 - Flags: superreview?(bzbarsky)
Attachment #256888 - Flags: review?(bzbarsky)
Btw, I can safely remove the aWasPending argument since it was always true (I verified this by having an assertion in there). The reason is that we should never add a script to the mScriptElements queue unless it blocked us. And we'll never return NS_ERROR_HTMLPARSER_BLOCK unless we were also marking the request as was-pending.
Comment on attachment 256888 [details] [diff] [review]
Patch to fix

>Index: content/base/src/nsContentSink.h
>+  void ContinueInterruptedParserAsync();
>+  void ContinueInterruptedParser();

I'd call them ContinueInterruptedParsing(Async)?, probably.

>Index: content/base/src/nsScriptLoader.h

>   void RemoveExecuteBlocker()
>   {
>-    if (!--mBlockerCount) {
>-      ProcessPendingRequestsAsync();
>+    if (!--mBlockerCount && !mHadPendingScripts) {
>+      ProcessPendingRequests();
>     }
>+    // If there were pending scripts then the newly added scripts will execute
>+    // once whatever event triggers the pending scripts fire.

s/fire/fires.

Also, there's one little issue here.  What if said events fired while we were inside this update (e.g. the update included a mutation event that did a sync XMLHttpRequest or some such insanity)?

I think this would perhaps be better as:

  if (!--mBlockerCount) {
    if (mHadPendingRequests) {
      ProcessPendingRequestsAsync();
    } else {
      ProcessPendingRequests();
    }
  }

>Index: content/html/content/test/test_bug300691-2.html

So with this change, we'd doing a bunch of the tests twice (e.g. the test2 test).  We could just pull up the second test's description to the first test, I think.  Either way, though.

>Index: content/html/content/test/test_bug300691-3.xhtml

Same here.

r+sr=bzbarsky with this, especially the RemoveExecuteBlocker change.

Please make sure to test Zalewski's crash thing!

Oh, and for branch we might want to leave the aWasPending thing as-is.
Attachment #256888 - Flags: superreview?(bzbarsky)
Attachment #256888 - Flags: superreview+
Attachment #256888 - Flags: review?(bzbarsky)
Attachment #256888 - Flags: review+
Attached patch Add mochitestsSplinter Review
Adds the above testcases as well as some i wrote myself to mochitest.
Attachment #256339 - Attachment is obsolete: true
Attachment #256357 - Attachment is obsolete: true
Attachment #256374 - Attachment is obsolete: true
Attachment #256448 - Attachment is obsolete: true
Attachment #256453 - Attachment is obsolete: true
Attachment #256896 - Flags: superreview?(bzbarsky)
Attachment #256896 - Flags: review?(bzbarsky)
Fix is checked in on trunk. Working on a branch patch...
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Attached patch branch versionSplinter Review
Same fix for branch but without the aWasPending cleanup.
Comment on attachment 256896 [details] [diff] [review]
Add mochitests

>Index: base/test/test_bug371576-4.html

>+var _bShouldbeTheir = false;

s/their/there/

>+ok(_bShouldbeTheir,"_bShouldbeTheir is true, scipts has executed on time");

s/scipts/script/

With that, looks great.
Attachment #256896 - Flags: superreview?(bzbarsky)
Attachment #256896 - Flags: superreview+
Attachment #256896 - Flags: review?(bzbarsky)
Attachment #256896 - Flags: review+
Attachment #256960 - Flags: superreview?(bzbarsky)
Attachment #256960 - Flags: review?(bzbarsky)
Comment on attachment 256960 [details] [diff] [review]
branch version

Looks good. And I assume it passes those mochitests?
Attachment #256960 - Flags: superreview?(bzbarsky)
Attachment #256960 - Flags: superreview+
Attachment #256960 - Flags: review?(bzbarsky)
Attachment #256960 - Flags: review+
Comment on attachment 256960 [details] [diff] [review]
branch version

This does fail 3 of the tests in test_bug300691-2. But they fail because on branch scripts consider themselves executed once they have any child nodes. Even if those nodes are just empty textnodes.

So that's failing for a different reason than this bug (and is not at all critical enough that we need to fix it on branch)
Attachment #256960 - Flags: approval1.8.1.3?
Attachment #256960 - Flags: approval1.8.0.11?
Flags: blocking1.8.1.4?
Flags: blocking1.8.1.4+
Flags: blocking1.8.1.3+
Flags: blocking1.8.0.11?
Flags: blocking1.8.0.11+
Attachment #256960 - Flags: approval1.8.1.4? → approval1.8.1.3?
Attachment #256960 - Flags: approval1.8.0.11?
Comment on attachment 256960 [details] [diff] [review]
branch version

a=mconnor on behalf of drivers for checkin to branches, please get this in ASAP
Attachment #256960 - Flags: approval1.8.1.3?
Attachment #256960 - Flags: approval1.8.1.3+
Attachment #256960 - Flags: approval1.8.0.12?
Attachment #256960 - Flags: approval1.8.0.11?
Attachment #256960 - Flags: approval1.8.0.11+
Summary: Firefox 2.0.0.2 update breaks Backbase enabled web sites → Script elements added with appendChild no longer execute synchronously (was: Firefox 2.0.0.2 update breaks Backbase enabled web sites)
Keywords: fixed1.8.1.3
This caused bug 373181 on both branches (but not on trunk).
Blocks: 371751
juan writes:

"Today I was able to install Backbase 3.3 on a Linux box in the QA Lab, and I tested a couple of the sample applications using 2001, 2002, 2003 and observed goodness, badness, goodness respectively.

I'll test with 1509, 15010, 15011 next; and then I will test their 3.3.1 version, to make sure we don't break their own fix."

thanks again to juan for verifying this.
Are you doing the testing with the latest 2003 and 15011 builds? I.e. ones with bug 373181 fixed?
I stopped testing when this bug 373181 came about; so I was going to wait until we got builds with cherry picked fixes before I did further testing.
done some tests with Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.3pre) Gecko/2007030903 BonEcho/2.0.0.3pre and the demo "testcases" on 
http://www.backbase.com/#home/products/ajax_demos.xml
works fine, but i will wait with the verification till we have RC Builds.
verified fixed for 1.8.1.3 with 2.0.0.3 RC 1 using the tests at http://www.backbase.com/#home/products/ajax_demos.xml and juanb`s test installation.

Using: Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.3) Gecko/2007030919 Firefox/2.0.0.3 (Windows XP x64)

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/2007030916 Firefox/2.0.0.3 (Linux Fedora FC 6)

Mozilla/5.0 (Windows; U; Windows NT
6.0; en-US; rv:1.8.1.3) Gecko/2007030919 Firefox/2.0.0.3 (Vista)

Tests were passed and working fine in 2.0.0.3 RC 1 

Also verified in 1.8.0.11 using Backbase 3.3 on: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.11) Gecko/20070312 Firefox/1.5.0.11 (Also on Mac)

And verified that it worked with Backbase 3.3.1:
Backbase 3.3.1 going 1.5.0.9 (good), 1.5.0.10 (good), 1.5.0.11 (good) Windows
Backbase 3.3.1 going 2.0.0.1 (good), 2.0.0.2 (good), 2.0.0.3 (good) Windows



Because of the separate mini-branch for 2.0.0.3/1.5.0.11 marking fixed for the next releases as well to trigger re-verification.
Depends on: 377070
Verified fixed 1.8.1.4 after some intensive tests with Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.4pre) Gecko/2007042403 BonEcho/2.0.0.4pre

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.4pre) Gecko/2007042603 BonEcho/2.0.0.4pre

and for 1.5.0.12 
Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.0.12pre) Gecko/20070426 Firefox/1.5.0.12pre ID:2007042607

using the demo test sites on http://www.backbase.com/#home/products/ajax_demos.xml and tests passed.
Flags: in-testsuite+
Flags: blocking1.9?
Blocks: 702167
verified fixed
Status: RESOLVED → VERIFIED
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.