Closed Bug 1219229 Opened 9 years ago Closed 8 years ago

Background scripts are loaded in the wrong order when the extension debugger is open

Categories

(WebExtensions :: Untriaged, defect, P2)

43 Branch
defect

Tracking

(Not tracked)

RESOLVED FIXED
Iteration:
45.3 - Dec 14

People

(Reporter: ido.y, Assigned: rpl)

References

(Blocks 1 open bug)

Details

(Whiteboard: triaged)

Attachments

(4 files)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36

Steps to reproduce:

I added 2 JS files as background scripts:
1. "background1.js" - with function classA() {}
2. "background2.js" - with function classB() { console.log(classA); }
The manifest has the attribute:
"background": {
    "scripts": ["background1.js", "background2.js"]
  }
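
For reference, a minimal reconstruction of the two files described above (a sketch; the call to classB is an assumption, since only the function bodies are given in the report):

      // background1.js
      function classA() {}

      // background2.js (relies on classA from the shared background scope)
      function classB() { console.log(classA); }
      classB(); // assumed invocation; logs classA when background1.js has
                // already run, throws a ReferenceError otherwise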


Actual results:

classB doesn't know about classA


Expected results:

classA and classB should share the same scope

This happens in both background scripts and content scripts.
Version: 41 Branch → 43 Branch
Component: Add-ons Manager → WebExtensions
I can't reproduce this. Feel free to reopen if you have a minimal testcase XPI that you can attach.
Status: UNCONFIRMED → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME
Attached file commonScope.xpi
When I install this, I get:

printing classA from classB Object {  } background2.js:4:2
In debug mode, I'm getting an error that classA is undefined:
http://screencast.com/t/TvUExx3Zp
What version of Firefox are you using?
44.0a2 (2015-11-25)
Interesting... having the debugger open appears to change the order in which scripts are loaded.
Status: RESOLVED → REOPENED
Ever confirmed: true
Resolution: WORKSFORME → ---
Summary: WebExtension - no common scope between different scripts (background and content scripts) → Background scripts are loaded in the wrong order when the extension debugger is open
There is a common scope; the scripts are just being loaded in the wrong order, so `classA` isn't available when `classB` is initialized.
ok, thank you very much :)
Assignee: nobody → luca.greco
Iteration: --- → 45.3 - Dec 14
This seems to be fixed in nightlies.
Luca, if it turns out I'm wrong, and you still need to work on this, you might want to talk to Jim Blandy. I talked to him about it on Monday, and we came up with some possible leads, but I didn't get very far in following them after I set up a new branch in m-c and realized I couldn't reproduce anymore.
I don't see this issue fixed in the nightly version.
I'm on v45.0a1 (2015-12-03),

and when the debugger is open and I disable and re-enable the extension, I'm still getting the wrong order.
This patch contains a new webextension test case which reproduces the issue;
currently it fails as expected, and it passes if one of the workarounds/solutions is applied.

This test case is skipped on e10s because it uses the AddonManager to install and reinstall the generated webextension (if the AddonManager doesn't know about the addon, we will not be able to attach the addon debugger to it), and
we cannot use the AddonManager from a content process.
I've spent a good amount of quality time with this bug: I've been through all
six stages of "debugging the debugger", jumped deep down the debugger
rabbit hole, and now I think I have a good understanding of what is happening,
how to reproduce it, how we can fix it and, most importantly, why the fix works.

The first step of my journey was reproducing the bug, and the first attachment
is my attempt to build a minimal test case, which exercises the following
scenario:

- Given a webextension with 3 background scripts that each "console.log"
  their own name once loaded
- When we load the webextension without the debugger attached
- Then the collected log messages from the loaded background scripts are in
  the expected order ("background1.js", "background2.js", etc.)
- When we attach a debugger client to the above webextension
- And we reload the webextension (actually, we ask the AddonManager to
  install the webextension again)
- Then the collected log messages "should" be in the expected order (and
  they will be, once the bug is fixed); a sketch of the ordering check follows
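
In outline, the ordering check at the heart of the test (a sketch; collectBackgroundLogs is a placeholder for the test's real plumbing, and isDeeply is SimpleTest's deep-equality assertion):

      add_task(function* test_backgroundScriptOrder() {
        // Install the extension, attach the addon debugger, reinstall it,
        // then gather the console.log output of the background scripts
        // (collectBackgroundLogs is a placeholder helper).
        let logs = yield collectBackgroundLogs(extension);
        isDeeply(logs, ["background1.js", "background2.js", "background3.js"],
                 "background scripts should run in manifest order");
      });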

To restrict the range of components that take part in the issue,
I tried the following variations of the above scenario:
- Open a webextension HTML page in a tab (instead of injecting the scripts into a generated _blank page)
- Open the above webextension HTML page in a background page

These tests led me to exclude the "windowless browser", created with
Services.appShell.createWindowlessBrowser, from the possible causes and to
focus my attention on the "inject the script tags into a blank HTML page" side.

On the DevTools side, I followed the injected scripts through the Remote Debugger
actors, and this is what I saw:

- When we attach a debugger client to a webextension (BrowserAddonActor.onAttach)
- Then an AddonThreadActor (which inherits from ThreadActor) is created
- The AddonThreadActor creates the debugger API object using the makeDebugger
  helper from its parent actor (the BrowserAddonActor in our scenario)
  and subscribes a bunch of handlers responsible for reacting to
  "New Scripts", "New Globals", "Uncaught Exceptions" and the
  "Debugger Statement"
- When the debugger detects a new script, it calls the _addSource helper,
  which is responsible for defining the source actors (with and without a
  source map) and restoring the breakpoint actors (see the sketch below)
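
Paraphrased as pseudocode, the attach path above looks roughly like this (a simplification, not the actual actor sources; _createSourceActors and _restoreBreakpoints are placeholder names for the steps described):

      // AddonThreadActor attach path, heavily simplified:
      onAttach() {
        // makeDebugger comes from the parent actor (BrowserAddonActor here)
        this.dbg = this.parentActor.makeDebugger();
        this.dbg.onNewScript = script => this._addSource(script.source);
        // ...plus handlers for new globals, uncaught exceptions, and the
        // debugger statement
      }

      _addSource(source) {
        // Define the source actors (with and without a source map) and
        // restore any breakpoint actors; both steps can spin a nested
        // event loop via unsafeSynchronize.
        this.unsafeSynchronize(this._createSourceActors(source));
        this.unsafeSynchronize(this._restoreBreakpoints(source));
      }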
   
The thread actor is responsible for managing the JSInspector which, combined
with the preNest and postNest helpers defined in the BrowserAddonActor,
at least in my current understanding, is what makes it possible to "stop the world"
when the debugger is paused or, more importantly for this issue, during the
asynchronous work that the debugger needs to do to resolve the source maps of a
newly loaded source file and reload the breakpoints.

In particular, if we do not enter the "this.unsafeSynchronize(...)" calls in
the ScriptActor._addSource method, the loading order of the background page
is correct even when the addon debugger is attached.
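
Conceptually, unsafeSynchronize turns a promise into a blocking call by spinning a nested event loop. A simplified model follows (the real code goes through the thread actor and nsJSInspector::EnterNestedEventLoop; processNextEvent is a stand-in):

      // Simplified model: "block" on a promise by processing other pending
      // events until it settles. In the real implementation, preNest suspends
      // the debuggee window before the loop and postNest resumes it afterwards.
      function unsafeSynchronize(promise) {
        let settled = false;
        let result;
        promise.then(value => { result = value; settled = true; });
        while (!settled) {
          // stands in for NS_ProcessNextEvent; this is exactly where a queued
          // script-load event can sneak in and run
          processNextEvent();
        }
        return result;
      }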

To get a better picture of why the scripts are loaded in the wrong order,
I produced a trace of the event sequence collected during the following
two scenarios:

- webextension which loads a background page with the three scripts statically defined 
  in the page source
- webextension which loads the same three scripts by injecting them into the generated 
  _blank.html page

In the first scenario the debugger:

- receives a new script event
- defines the "non source-mapped" source actor
- suspends the background page's window (in the preNest method)
- resolves the Promise which defines the source-mapped actor
- resumes the background page's window in the postNest method
- creates a promise which creates any breakpoint actors
- if there are breakpoint actors to load, it repeats the
  suspend-resolve-resume cycle on the above promise
- once the unsafeSynchronize method, called from _addSource, has
  completed its job, a new script event is received and processed

No new source event is received until the page is resumed by a postNest,
and the loading order is as expected.

In the second scenario, the debugger receives a new script event while it
is resolving the event loop entered during the source-mapped actor
resolution, and it gets stuck there until the last new script event is
received; the handlers are then unlocked in reverse order.

If I've got it right, this should never happen: the next "new script"
event should be received only after the debugger has completed its job
for the previous one.
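
The "unlocked in reverse order" behavior can be modeled with plain recursion (a toy model, not the actual code): each nested event loop delivers the next script's event before the current handler returns, so the handlers unwind LIFO:

      // Toy model of the reentrancy: onNewScript("background1.js") blocks in
      // a nested event loop, which delivers onNewScript("background2.js"),
      // and so on; the handlers then complete in reverse order of arrival.
      const queued = ["background2.js", "background3.js"];
      function onNewScript(name) {
        console.log("new script event:", name);
        if (queued.length) {
          onNewScript(queued.shift()); // stands in for the nested event loop
        }
        console.log("finished processing:", name);
      }
      onNewScript("background1.js");
      // new script event:    background1.js, background2.js, background3.js
      // finished processing: background3.js, background2.js, background1.js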

This is the reason why the following workaround/solution (implemented in the
second patch; if I remember correctly, it is the same workaround already
identified by Kris and Jim) fixes the issue, by injecting the next script
only once the previous one has generated a load or error event:

      // In "toolkit/components/extension/ext-backgroundPage.js"
      ...
      function injectBackgroundScripts(extension, scripts, document) {
        return scripts.reduce((previousPromise, script) => {
          return previousPromise.then(() => {
            return new Promise((resolve, reject) => {
              let url = extension.baseURI.resolve(script);

              if (!extension.isExtensionURL(url)) {
                extension.manifestError("Background scripts must be files within the extension");
                resolve();
                return; // skip injecting an invalid URL
              }

              let scriptTag = document.createElement("script");
              scriptTag.setAttribute("src", url);
              scriptTag.async = false;
              // Wait for the load event to be sure we do not force
              // the debugger to process a new script while the
              // window should be frozen.
              let scriptLoadListener = () => {
                scriptTag.removeEventListener("load", scriptLoadListener, true);
                resolve();
              };
              scriptTag.addEventListener("load", scriptLoadListener, true);
              document.body.appendChild(scriptTag);
            });
          });
        }, Promise.resolve());
      }

      if (this.scripts) {
        injectBackgroundScripts(this.extension, this.scripts, window.document);
      }
      ...

I'm not completely sure about the reason:

- does the script injection thaw the suspended window?
- does the privileged internal code, injecting the script tag, force the
  suspended window to load it?

But I'm now pretty sure of the outcome:

- When background1.js's Promise is stuck in the
  nsJSInspector::EnterNestedEventLoop method (in the while() { ... })
- Then NS_ProcessNextEvent processes the pending script loading events
  (which are already queued)

A stack trace follows (obtained by introducing an nsAutoScriptBlocker inside
nsJSInspector::EnterNestedEventLoop to force it to produce the stack trace if a
script is loaded before the method exits):

    #01: nsScriptLoader::ProcessPendingRequests() (/zfs-tardis/mozlab/desktop/fx-team/dom/base/nsScriptLoader.cpp:1208)
    #02: nsScriptLoader::OnStreamComplete(nsIIncrementalStreamLoader*, nsISupports*, nsresult, nsresult, mozilla::Vector<char16_t, 0ul, mozilla::MallocAllocPolicy>&, mozilla::dom::SRICheckDataVerifier*) (/zfs-tardis/mozlab/desktop/fx-team/dom/base/nsScriptLoader.cpp:1491)
    #03: nsScriptLoadHandler::OnStreamComplete(nsIIncrementalStreamLoader*, nsISupports*, nsresult, unsigned int, unsigned char const*) (/zfs-tardis/mozlab/desktop/fx-team/dom/base/nsScriptLoader.cpp:1899)
    #04: nsIncrementalStreamLoader::OnStopRequest(nsIRequest*, nsISupports*, nsresult) (/zfs-tardis/mozlab/desktop/fx-team/netwerk/base/nsIncrementalStreamLoader.cpp:100)
    #05: nsJARChannel::OnStopRequest(nsIRequest*, nsISupports*, nsresult) (/zfs-tardis/mozlab/desktop/fx-team/modules/libjar/nsJARChannel.cpp:1302)
    #06: nsInputStreamPump::OnStateStop() (/zfs-tardis/mozlab/desktop/fx-team/netwerk/base/nsInputStreamPump.cpp:716)
    #07: nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*) (/zfs-tardis/mozlab/desktop/fx-team/netwerk/base/nsInputStreamPump.cpp:434)
    #08: nsInputStreamReadyEvent::Run() (/zfs-tardis/mozlab/desktop/fx-team/xpcom/io/nsStreamUtils.cpp:96)
    #09: nsThread::ProcessNextEvent(bool, bool*) (/zfs-tardis/mozlab/desktop/fx-team/xpcom/threads/nsThread.cpp:941)
    #10: NS_ProcessNextEvent(nsIThread*, bool) (/zfs-tardis/mozlab/desktop/fx-team/xpcom/glue/nsThreadUtils.cpp:297)
    #11: mozilla::jsinspector::nsJSInspector::EnterNestedEventLoop(JS::Handle<JS::Value>, unsigned int*) (/zfs-tardis/mozlab/desktop/fx-team/devtools/server/nsJSInspector.cpp:83)

Once I had a good understanding of what was happening and why, another
workaround/fix strategy came to mind and I chose to give it a try because,
besides fixing the issue, I think it tells us something more about it:

if we inject the script tags from inside the target window using "window.eval",
instead of creating the tags in privileged code and injecting them directly
into the document, then the loading order is preserved (and if we trace what
happens, we can see that the next script is processed only once the previous
one has completed):

      // In "toolkit/components/extension/ext-backgroundPage.js"
      ...
      if (this.scripts) {
        for (let script of this.scripts) {
          let url = this.extension.baseURI.resolve(script);
          if (!this.extension.isExtensionURL(url)) {
            this.extension.manifestError("Background scripts must be files within the extension");
            continue;
          }
          // NOTE: inject the background script by evaluating the injecting code
          // in the window context, to prevent the next injected script from
          // being processed by the AddonThreadActor before the previous one is
          // completely resolved. (See Bug 1219229 for the rationale.)
          window.eval(`(function() {
            var scriptTag = document.createElement("script");
            scriptTag.setAttribute("src", "${url}");
            scriptTag.async = false;
            document.body.appendChild(scriptTag);
          })();`);
        }
      }
      ...

I'm adding a needinfo for Kris and Jim to check that the picture described above isn't missing anything I'm not aware of, and to discuss whether we want to solve this issue using one of the two workarounds/solutions described above, or whether we prefer to look further into the issue and keep searching for a better one.
Flags: needinfo?(kmaglione+bmo)
Flags: needinfo?(jimb)
This patch contains the first possible workaround described in my previous comment:

- inject the next script tag into the background page once the previous one has loaded (or fired an error event)

I ensure the scripts are injected in sequence by reducing the script array with a Promise as the accumulator, each resolved once the previous script tag has fired a load or error event; the pattern in isolation is sketched below.
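
The sequencing pattern in isolation (a generic sketch; injectAndWaitForLoad is a hypothetical stand-in for the per-script promise built in the patch):

      // Thread a promise through reduce as the accumulator, so each async
      // step starts only after the previous one has settled.
      function runSequentially(items, asyncFn) {
        return items.reduce(
          (previousPromise, item) => previousPromise.then(() => asyncFn(item)),
          Promise.resolve()
        );
      }

      // e.g. runSequentially(scripts, url => injectAndWaitForLoad(url));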

The downside of this workaround is that we wait before injecting the next script even when the debugger is not attached and the workaround is not actually necessary.
This patch contains the second possible workaround that I described in the above comments:

- inject the script tags from the target window context using window.eval

The nice thing about this patch is that, when the debugger is not attached, it is mostly equivalent to our current implementation (inject all the scripts without waiting for the previous one to be fully loaded), but the scripts will be executed sequentially when the debugger is attached to the addon.
Here are a couple of try runs (each patch applied on top of the testcase patch):

- try run on 0002-Bug-1219229-ensure-background-scripts-are-injected-s.patch
  https://treeherder.mozilla.org/#/jobs?repo=try&revision=15cb817a4b53

- try run on 0002-Bug-1219229-eval-background-scripts-injection-in-the.patch
  https://treeherder.mozilla.org/#/jobs?repo=try&revision=55b0d6725d92
With absolutely no disrespect intended for Luca's efforts, let me save everyone the trouble of reading that long comment and highlight the exciting part:

(In reply to Luca Greco from comment #14)
> In particular, if we do not enter the "this.unsafeSynchronize(...)" calls in
> the ScriptActor._addSource method, the loading order of the background page
> is correct even when the addon debugger is attached.
> 
> To get a better picture of why the scripts are loaded in the wrong order,
> I produced a trace of the event sequence collected during the following
> two scenarios:
> 
> - webextension which loads a background page with the three scripts
>   statically defined in the page source
> - webextension which loads the same three scripts by injecting them into
>   the generated _blank.html page
> 
> In the first scenario the debugger:
> 
> - receives a new script event
> - defines the "non source-mapped" source actor
> - suspends the background page's window (in the preNest method)
> - resolves the Promise which defines the source-mapped actor
> - resumes the background page's window in the postNest method
> - creates a promise which creates any breakpoint actors
> - if there are breakpoint actors to load, it repeats the
>   suspend-resolve-resume cycle on the above promise
> - once the unsafeSynchronize method, called from _addSource, has
>   completed its job, a new script event is received and processed
> 
> No new source event is received until the page is resumed by a postNest,
> and the loading order is as expected.
> 
> In the second scenario, the debugger receives a new script event while it
> is resolving the event loop entered during the source-mapped actor
> resolution, and it gets stuck there until the last new script event is
> received; the handlers are then unlocked in reverse order.
> 
> If I've got it right, this should never happen: the next "new script"
> event should be received only after the debugger has completed its job
> for the previous one.
Flags: needinfo?(nfitzgerald)
Flags: needinfo?(kmaglione+bmo)
Flags: needinfo?(jlong)
Flags: needinfo?(jimb)
In summary, Debugger is calling the onNewScript handler to report the second script while we're in an unsafeSynchronize call spun up by the onNewScript handler for the first script. Apparently the unsafeSynchronize call permits whatever was dealing with the second script to make progress, and it finishes compiling the script.

[tears hair out] I hate this programming model SO MUCH.

Basically, unsafeSynchronize has got to be put out of its misery. Nick, James, I seem to recall discussing with one or both of you at Mozlando a different approach to handling source maps that would not entail anything like unsafeSynchronize. Is there a bug open for that work?
(on PTO, so I may not respond quickly) I don't think we have a plan to completely remove `unsafeSynchronize` because, no matter what, if we need to set a breakpoint we *have* to synchronously block until the sourcemap is parsed so we can use it to find the right location. It has to be sync because the breakpoint might be on top-level code which needs to be set before the script runs.

Avoiding this means that we always restore previous breakpoints on the old generated location, which might be wrong if the sourcemap has changed. The end result is that if you combine 100 files into 1 file, changing any of the 100 files will most likely make all breakpoints be set in the wrong place. However, this is what Chrome does, and honestly I've never heard somebody complain about it.

HOWEVER, right now we are shooting ourselves in the foot multiple times when we only need to be doing it once. We've made things a lot worse by calling it here: https://github.com/mozilla/gecko-dev/blob/master/devtools/server/actors/script.js#L1933. We NEVER need to call `unsafeSynchronize` unless we have an actual breakpoint to set. Otherwise we can completely avoid it. (There is a *second* call to it later in the function that is only called when breakpoints are set; this one was introduced later without any checking.)
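
In outline, the guard being described looks something like this (a paraphrase, not the actual patch from bug 1132501; bpActors and sourceMapPromise are placeholder names):

      // Only spin the nested event loop when there are breakpoints whose
      // locations must be resolved through the source map before the script
      // runs; otherwise let the source map promise settle asynchronously.
      if (bpActors.length > 0) {
        this.unsafeSynchronize(sourceMapPromise);
      }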

I discovered this in bug 1132501 after weeks of debugging, because it wreaks all sorts of havoc when I attach the debugger on toolbox open and all devtools tests are running with it on. My patch in that bug already makes it so that `unsafeSynchronize` is only ever called if we are setting breakpoints. That should fix this issue.

My patch in that bug is ready to land (has a green try), just need a review. However, if we feel we should spin the `unsafeSynchronize` fix out into a separate bug I can do that.
Flags: needinfo?(jlong)
This still means that when we *do* have breakpoints to set, we are back into the pit of doom. I hit some seriously weird bugs in random devtools tests when `unsafeSynchronize` was called on every newSource event. In the interest of only shipping code when it's ready, I'm not sure we should be doing any of this until we are sure that `unsafeSynchronize` doesn't bend space/time so badly. We might need to just accept that breakpoints w/sourcemaps might be re-set in the wrong location for now.
(In reply to Jim Blandy :jimb from comment #19)
> In summary, Debugger is calling the onNewScript handler to report the second
> script while we're in an unsafeSynchronize call spun up by the onNewScript
> handler for the first script. Apparently the unsafeSynchronize call permits
> whatever was dealing with the second script to make progress, and it
> finishes compiling the script.
> 
> [tears hair out] I hate this programming model SO MUCH.

Which programming model? Spinning up event loops to "pause" the debuggees while we spin the event loop for debugger-related things, or the unsafeSynchronize function itself which wraps promise-based APIs with the former?

> Basically, unsafeSynchronize has got to be put out of its misery.

I'm still not clear which part you're tearing your hair out over, nor what the alternative is.

> Nick, James, I seem to recall discussing with one or both of you at Mozlando
> a different approach to handling source maps, that would not entail anything
> like unsafeSynchronize. Is there a bug open for that work?

We've vaguely talked about pushing source maps into spidermonkey but that doesn't solve the underlying issue.

AFAICT, this is another case of not being able to properly pause debuggees: the debugger has work that it needs to do and it can't permit the debuggee to run until the work is complete, so the debuggee is "paused" until the work completes. However, we can't actually pause debuggees so everything is out the window, as usual.
Blocks: dbg-r2c
Flags: needinfo?(nfitzgerald)
(In reply to James Long (:jlongster) from comment #20)
> (on PTO, so I may not respond quickly) I don't think we have a plan to
> completely remove `unsafeSynchronize` because, no matter what, if we need
> to set a breakpoint we *have* to synchronously block until the sourcemap
> is parsed so we can use it to find the right location. It has to be sync
> because the breakpoint might be on top-level code which needs to be set
> before the script runs.

There are synchronization points which don't require spinning the event
loop, though. In the case of script nodes, it would be easy enough to
block compiling the script until we've fetched a source map, in cases
where we know it's needed. It would be nicer to compile the script
immediately and block executing it until the source map has been
fetched, but I have no idea how doable that is.

Either option would be a clear improvement on spinning the event loop,
though.

In the case of local scripts (as in this case, where all scripts are
included in the extension), it would almost certainly be better to just
fetch the source maps synchronously than to spin the event loop. But I'm
not sure that's a case there's any point in optimizing for, since
pre-fetching seems like the better solution in that case too.

> HOWEVER, right now we are shooting ourselves in the foot multiple times
> when we only need to be doing it once. We've made things a lot worse by
> calling it here:
> https://github.com/mozilla/gecko-dev/blob/master/devtools/server/actors/script.js#L1933
> We NEVER need to call `unsafeSynchronize` unless we have an actual
> breakpoint to set. Otherwise we can completely avoid it. (There is a
> *second* call to it later in the function that is only called when
> breakpoints are set; this one was introduced later without any checking.)

There are two other issues I see here, too:

1) I think we never need to call `unsafeSynchronize` unless we a) have
breakpoints set, and b) have source maps enabled. Jim suggested I try
disabling them when I was first investigating this, and was surprised
when it didn't help.

2) `unsafeSynchronize` tries to avoid spinning the event loop if the
promise has already been resolved. I *think* that if this actually
worked as expected, it would solve #1, since the relevant promise should
resolve immediately when source maps are disabled. But promise handlers
are always called asynchronously, so that short circuit never actually
happens.
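
A minimal illustration of why that short circuit never fires: promise handlers run in a later microtask even for an already-resolved promise, so a synchronous check right after subscribing always sees the flag unset.

      // The .then() handler of even a pre-resolved promise runs
      // asynchronously, in a later microtask.
      let settled = false;
      Promise.resolve().then(() => { settled = true; });
      console.log(settled); // false: the handler has not run yet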

> My patch in that bug is ready to land (has a green try), just need a review.
> However, if we feel we should spin the `unsafeSynchronize` fix out into a
> separate bug I can do that.

If that would get it landed faster, I'd very much appreciate it.

(In reply to James Long (:jlongster) from comment #21)
> This still means that when we *do* have breakpoints to set, we are back into
> the pit of doom. I hit some seriously weird bugs in random devtools tests
> when `unsafeSynchronize` was called on every newSource event. In the interest
> of only shipping code when it's ready, I'm not sure we should be doing any
> of this until we are sure that `unsafeSynchronize` doesn't bend space/time
> so badly. We might need to just accept that breakpoints w/sourcemaps might
> be re-set in the wrong location for now.

I think I mostly agree with that. The bugs introduced by this are
stranger and harder to make sense of than a breakpoint being set in the
wrong place due to stale source maps. The debugger doing something less
than ideal to the behavior of the debugger, due to a bit of black magic
developers know they're using, seems much better than the debugger doing
weird things to the behavior of your code, for reasons that are
completely opaque.

(In reply to Luca Greco from comment #14)
>               let scriptTag = document.createElement("script");
>               scriptTag.setAttribute("src", url);
>               scriptTag.async = false;
>               // Wait for the load event to be sure we do not force
>               // the debugger to process a new script while the
>               // window should be frozen.
>               let scriptLoadListener = () => {
>                 scriptTag.removeEventListener("load", scriptLoadListener, true);
>                 resolve();
>               };
>               scriptTag.addEventListener("load", scriptLoadListener, true);
>               document.body.appendChild(scriptTag);

If we have to resort to this, we need to move to the next script for the
error event as well as the load event.
(In reply to Kris Maglione [:kmag] from comment #23)
> (In reply to Luca Greco from comment #14)
> >               let scriptTag = document.createElement("script");
> >               scriptTag.setAttribute("src", url);
> >               scriptTag.async = false;
> >               // Wait for the load event to be sure we do not force
> >               // the debugger to process a new script while the
> >               // window should be frozen.
> >               let scriptLoadListener = () => {
> >                 scriptTag.removeEventListener("load", scriptLoadListener, true);
> >                 resolve();
> >               };
> >               scriptTag.addEventListener("load", scriptLoadListener, true);
> >               document.body.appendChild(scriptTag);
> 
> If we have to resort to this, we need to move to the next script for the
> error event as well as the load event.

My apologies: the snippet in the comment is an outdated proof of concept; the attached patch (attachment 8702299 [details] [diff] [review]) uses both the load and error events.
(In reply to James Long (:jlongster) from comment #20)
> (on PTO, so I may not respond quickly) I don't think we have a plan to
> completely remove `unsafeSynchronize` because, no matter what, if we need
> to set a breakpoint we *have* to synchronously block until the sourcemap
> is parsed so we can use it to find the right location. It has to be sync
> because the breakpoint might be on top-level code which needs to be set
> before the script runs.

The problem with unsafeSynchronize is not that it pauses the debuggee; as you say, that's strictly necessary, and I don't think spinning up a nested event loop to fetch source maps is really any different from spinning it up because we're paused at a breakpoint. I think this is Nick's point as well.

The problem with unsafeSynchronize as we use it in devtools is that it makes the server interfere with *itself*. A function that could spin the event loop, or could possibly call anything else that does, needs to be treated very specially; basically anything could happen when one calls such a function, so we have to think about the continuation of such a call the way we think about a 'then' handler or a callback.

In other words, the server itself has its own expectations about being able to run to completion, simply because it's extremely hard to follow what's going on without assumptions like that to lean on. unsafeSynchronize makes those expectations very difficult to manage.

And that's exactly what's happened here: our onNewScript handler was not expecting to be interrupted by another onNewScript handler invocation.
(In reply to Kris Maglione [:kmag] from comment #23)
> There are synchronization points which don't require spinning the event
> loop, though. In the case of script nodes, it would be easy enough to
> block compiling the script until we've fetched a source map, in cases
> where we know it's needed. It would be nicer to compile the script
> immediately and block executing it until the source map has been
> fetched, but I have no idea how doable that is.

We can't actually block compiling the script until we've fetched a source map, because the source map can be specified via a comment in the code, which we don't know about until we've compiled it.

But we are, in fact, compiling the script immediately and trying to block executing it. The problem is that the work we need to do isn't blocking other scripts.

It's not that the server has failed to keep the debuggee from running; it's that the server has failed to keep itself from running.
(In reply to Jim Blandy :jimb from comment #26)
> It's not that the server has failed to keep the debuggee from running; it's
> that the server has failed to keep itself from running.

This is incorrect, I think; the problem is indeed that the second script is running within the onNewScript handler of the first.
(In reply to James Long (:jlongster) from comment #20)
> HOWEVER, right now we are shooting ourselves in the foot multiple times when
> we only need to be doing it once. We've made things a lot worse by calling
> it here:
> https://github.com/mozilla/gecko-dev/blob/master/devtools/server/actors/script.js#L1933
> We NEVER need to call `unsafeSynchronize` unless we have an
> actual breakpoint to set.

In an onNewScript handler, it seems to me it shouldn't be necessary to spin up a nested event loop at all if the script has no source maps, either.
(In reply to Jim Blandy :jimb from comment #26)
> We can't actually block compiling the script until we've fetched a source
> map, because the source map can be specified via a comment in the code,
> which we don't know about until we've compiled it.

We can easily parse out those comments well enough to prefetch source maps
without compiling the script. It may not be ideally efficient (or, rather, it
may be trivially inefficient), but it should only ever need to happen when the
debugger a) is enabled, and b) has breakpoints to restore.

> But we are, in fact, compiling the script immediately and trying to block
> executing it. The problem is that the work we need to do isn't blocking
> other scripts.

Right. My point is that, in the case of script nodes, anyway, there's nothing
to mandate that scripts execute synchronously immediately after they're
compiled. They already have to be fetched from asynchronous channels, so
they're already designed to be non-blocking, and have synchronization points
that don't have these problems.

I have no idea how doable it is to asynchronously block script execution after
the script has been compiled (but I'd hope not too difficult). I do know that
it's quite easily doable to asynchronously block the completion of the channel
that's fetching the script, though.

> It's not that the server has failed to keep the debuggee from running; it's
> that the server has failed to keep itself from running.

Not just itself. There are so many thorns to avoid every time you start
spinning the event loop, and the more often you do it, the scarier it gets. I
understand that there are times when we can't avoid it, but when you start
doing it for background network requests... I don't even want to think of what
happens when we run into this on a page with 15 or 20 scripts. Especially if
you throw in some debugging code with an alert, or an XHR, or multiple
debuggees.
There are two degrees of "solved-ness" here:

1) If we arrange to only enter a nested event loop when we have breakpoints to map *and* the script that has just been compiled has source maps, then we'll fix the symptoms reported here. But the bug will still occur if we do have breakpoints and source maps.

2) If we ensure that event dispatch between the completion of a script's compilation (as reported by onNewScript) and the beginning of its execution can never run related later scripts, then we will have solved the problem in both cases.

Kris, do you have a sense of what 2) would entail, specifically?
(In reply to Kris Maglione [:kmag] from comment #23)
> 
> There are synchronization points which don't require spinning the event
> loop, though. In the case of script nodes, it would be easy enough to
> block compiling the script until we've fetched a source map, in cases
> where we know it's needed. It would be nicer to compile the script
> immediately and block executing it until the source map has been
> fetched, but I have no idea how doable that is.

We can't unconditionally block on loading a sourcemap. We only need to block on it if we have breakpoints to set, so we would need some way to indicate that. If we have no breakpoints to set, we still want the sourcemap to update line numbers in the console and other places, but that can be done asynchronously.

> 
> 2) `unsafeSynchronize` tries to avoid spinning the event loop if the
> promise has already been resolved. I *think* that if this actually
> worked as expected, it would solve #1, since the relevant promise should
> resolve immediately when source maps are disabled. But promise handlers
> are always called asynchronously, so that short circuit never actually
> happens.

You are correct, I noticed this recently too. We used to use sync promises but when `TabSources` (which creates those promises passed into `unsafeSynchronize`) switched to async promises it broke that optimization. So it's faulty, and really needs to be fixed.

> > My patch in that bug is ready to land (has a green try), just need a review.
> > However, if we feel we should spin the `unsafeSynchronize` fix out into a
> > separate bug I can do that.
> 
> If that would get it landed faster, I'd very much appreciate it.

I'm going to talk to Brian today and see if he's going to review it soon. I'm very confident it won't bounce (had several green try runs) so I'm hoping to land it this week.
(In reply to Jim Blandy :jimb from comment #30)
> 2) If we ensure that event dispatch between the completion of a script's
> compilation (as reported by onNewScript) and the beginning of its execution
> can never run related later scripts, then we will have solved the problem in
> both cases.
> 
> Kris, do you have a sense of what 2) would entail, specifically?

I can't think of any obvious way to do this. I just looked through the script loader, and confirmed that it's not trivial.

Non-async, non-inline scripts are added to a queue, and processed in the order they were created. They're removed from the queue before being processed, though, so once we get to the point of spinning the event loop, there's nothing stopping the next script from executing if we process its request completion event.

The only possibility I see is disabling the script loader while spinning the event loop, since that's the only flag that's checked before loading these scripts. But that may have other undesirable side-effects if something tries to create a new script in the mean time.

It might be feasible to add a new flag, and a function similar to suspendTimeouts/enterModalState/suppressEventHandling to nsIDOMWindowUtils, to delay the execution of pending scripts. That's well out of my bailiwick, though.

(In reply to James Long (:jlongster) from comment #31)
> (In reply to Kris Maglione [:kmag] from comment #23)
> > 
> > There are synchronization points which don't require spinning the event
> > loop, though. In the case of script nodes, it would be easy enough to
> > block compiling the script until we've fetched a source map, in cases
> > where we know it's needed. It would be nicer to compile the script
> > immediately and block executing it until the source map has been
> > fetched, but I have no idea how doable that is.
>
> We can't unconditionally block on loading a sourcemap. We only need to block
> on it if we have breakpoints to set, so we would need some way to indicate
> that. If we have no breakpoints to set, we still want the sourcemap to
> update line numbers in the console and other places, but that can be done
> asynchronously.

Agreed. That was my understanding of "in cases where we know it's needed".
Bug 1132501 has landed, so this issue is less severe. It should not happen unless there are breakpoints to set.
(In reply to James Long (:jlongster) from comment #33)
> Bug 1132501 has landed, so this issue is less severe. It should not happen
> unless there are breakpoints to set.

I confirm that the test case in Attachment 8702295 [details] [diff] now runs successfully
(and it fails again once we set a breakpoint).

E.g., here is a small change to Attachment 8702295 [details] [diff] which adds a breakpoint to confirm that the test fails as expected:

diff --git a/toolkit/components/extensions/test/mochitest/test_ext_background_debugger_attached.html b/toolkit/components/extensions/test/mochitest/test_ext_background_debugger_attached.html
index f3fac99..129fd09 100644
--- a/toolkit/components/extensions/test/mochitest/test_ext_background_debugger_attached.html
+++ b/toolkit/components/extensions/test/mochitest/test_ext_background_debugger_attached.html
@@ -117,7 +117,7 @@ function installAddon(extensionData) {
 }
 
 function attachAddonDebugger(client, form) {
-  return new Promise((resolve) => {
+  return new Promise((resolve) => {
     client.attachAddon(form.actor, (response) => {
       response = SpecialPowers.wrap(response);
 
@@ -127,6 +127,10 @@ function attachAddonDebugger(client, form) {
         if (threadClient.paused) {
           threadClient.resume();
         }
+        client.addOneTimeListener("paused", (event, packet) => {
+          threadClient.resume();
+        });
+
         resolve();
       })
     });
@@ -136,9 +140,9 @@ function attachAddonDebugger(client, form) {
 add_task(function* test_backgroundPageDebuggerAttach() {
    let extensionData = {
      files: {
-       "background1.js": "console.log('background1.js')",
-       "background2.js": "console.log('background2.js')",
-       "background3.js": "console.log('background3.js')",
+       "background1.js": "debugger; console.log('background1.js');",
+       "background2.js": "console.log('background2.js');",
+       "background3.js": "console.log('background3.js');",
      },
      manifest: {
Alright, let's at least get that test case landed, for now.

I think we should make it a browser chrome test rather than a plain mochitest, which would simplify a lot of things. I don't think I'm the right person to review it, though. James might be better.

I'm reluctant to land either of the workarounds for now. I'd really like to avoid the added latency of waiting for load events. And if we decide we need to fix bug 1234677, I think we're going to have to pre-generate the background pages rather than dynamically injecting script nodes (or otherwise just loading them synchronously with the subscript loader), which should fix this anyway.

Either way, we should probably split this into two bugs, at this point. One for the specific problems of WebExtension background scripts, and one for the problem of dynamically-injected scripts in general.
Priority: -- → P2
Whiteboard: triaged
Status: REOPENED → RESOLVED
Closed: 9 years ago → 8 years ago
Depends on: 1234677
Resolution: --- → FIXED
It is not solved, just partially solved. I need to use breakpoints to debug my extension; I just cannot imagine how I could use the debugger without breakpoints.
It was fixed by bug 1234677.
Product: Toolkit → WebExtensions