Closed Bug 1333990 Opened 3 years ago Closed 3 years ago

We re-parse content scripts every time we run them

Categories

(Core :: XPConnect, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: billm, Assigned: kmag)

References

(Depends on 1 open bug)

Details

(Keywords: dev-doc-needed)

Attachments

(18 files)

59 bytes, text/x-review-board-request
shu
: review+
billm
: review+
Details
59 bytes, text/x-review-board-request
hsivonen
: review+
billm
: review+
Details
59 bytes, text/x-review-board-request
aswan
: review+
Details
59 bytes, text/x-review-board-request
shu
: review+
Details
59 bytes, text/x-review-board-request
hsivonen
: review+
Details
59 bytes, text/x-review-board-request
hsivonen
: review+
Details
59 bytes, text/x-review-board-request
hsivonen
: review+
Details
59 bytes, text/x-review-board-request
aswan
: review+
billm
: review+
Details
59 bytes, text/x-review-board-request
aswan
: review+
billm
: review+
Details
59 bytes, text/x-review-board-request
aswan
: review+
Details
59 bytes, text/x-review-board-request
aswan
: review+
billm
: review+
Details
59 bytes, text/x-review-board-request
ehsan
: review+
Details
59 bytes, text/x-review-board-request
hsivonen
: review+
Details
59 bytes, text/x-review-board-request
bzbarsky
: review+
Details
59 bytes, text/x-review-board-request
bzbarsky
: review+
Details
59 bytes, text/x-review-board-request
bzbarsky
: review+
Details
59 bytes, text/x-review-board-request
bzbarsky
: review+
Details
59 bytes, text/x-review-board-request
bzbarsky
: review+
Details
I was profiling LastPass today and I noticed that we spend a good deal of time parsing the LastPass code. Pretty much any mechanism we have that runs content scripts (either in the add-on SDK or WebExtensions or GreaseMonkey or whatever) uses the subscript loader as far as I know. And the subscript loader re-reads and re-parses scripts every time they're run (assuming they're not in the startup cache, which won't be the case for add-on code). In the case I saw, we spent as much time parsing the script as we did running it.

We should fix this. Content scripts run a lot.
Wow, *every* time?
Currently, we load any script which doesn't run at document_start asynchronously. That means that we should (but apparently don't) be able to use off-main-thread compilation for at least those.

My understanding is that we currently only use the startup cache for code being loaded into chrome principals because the precompiled code needs to be compiled with the correct target principal, and we expect content scripts to run with a new principal every time. So unless I'm mistaken, or something has changed, I'm not sure what we can do to change this.
(In reply to Kris Maglione [:kmag] from comment #2)
> Currently, we load any script which doesn't run at document_start
> asynchronously. That means that we should (but apparently don't) be able to
> use off-main-thread compilation for at least those.

Yeah, parsing off the main thread would be a nice win. We don't do that now and I think it would be pretty easy to do.

> My understanding is that we currently only use the startup cache for code
> being loaded into chrome principals because the precompiled code needs to be
> compiled with the correct target principal, and we expect content scripts to
> run with a new principal every time. So unless I'm mistaken, or something
> has changed, I'm not sure what we can do to change this.

That doesn't seem like a significant problem to me. However, even the startup cache isn't that great since we still need to use XDR, which isn't known for its awesome performance.

It seems like it would be better to parse once and then clone the script each time we want to run it. Is that something we could do, Shu? I've forgotten all the invariants we have about baking in globals and scopes.
Flags: needinfo?(shu)
(In reply to Bill McCloskey (:billm) from comment #3)
> It seems like it would be better to parse once and then clone the script
> each time we want to run it. Is that something we could do, Shu? I've
> forgotten all the invariants we have about baking in globals and scopes.

This was my first reaction as well. I would hope that we could compile these scripts, clone (for the right compartment) and run them against arbitrary globals -- IIRC the only reason that we used to have problems with this in the past was if we used COMPILE_N_GO and then only because we'd bake assumptions about the shape of the global object into the resulting script.

I guess we'd have to deal with cache invalidation somehow too though that shouldn't be too difficult.
(In reply to Blake Kaplan (:mrbkap) from comment #4)
> This was my first reaction as well. I would hope that we could compile these
> scripts, clone (for the right compartment) and run them against arbitrary
> globals -- IIRC the only reason that we used to have problems with this in
> the past was if we used COMPILE_N_GO and then only because we'd bake
> assumptions about the shape of the global object into the resulting script.
> 
> I guess we'd have to deal with cache invalidation somehow too though that
> shouldn't be too difficult.

I think if we go this route, we should do something similar to what we do with preloaded stylesheets:

http://searchfox.org/mozilla-central/rev/7da3c9dcf467964f2fb82f3a4c63972ee79bf696/layout/base/nsIStyleSheetService.idl#50-55
http://searchfox.org/mozilla-central/rev/7da3c9dcf467964f2fb82f3a4c63972ee79bf696/dom/interfaces/base/nsIDOMWindowUtils.idl#1740-1747

That way we can control what's cached, and when, directly from the script loading code, and have things automatically GCed when extensions are unloaded.
(In reply to Blake Kaplan (:mrbkap) from comment #4)
> (In reply to Bill McCloskey (:billm) from comment #3)
> > It seems like it would be better to parse once and then clone the script
> > each time we want to run it. Is that something we could do, Shu? I've
> > forgotten all the invariants we have about baking in globals and scopes.
> 
> This was my first reaction as well. I would hope that we could compile these
> scripts, clone (for the right compartment) and run them against arbitrary
> globals -- IIRC the only reason that we used to have problems with this in
> the past was if we used COMPILE_N_GO and then only because we'd bake
> assumptions about the shape of the global object into the resulting script.
> 
> I guess we'd have to deal with cache invalidation somehow too though that
> shouldn't be too difficult.

That CNG nonsense has been gone for a while. Cloning a script into different globals should work.
Flags: needinfo?(shu)
(In reply to Bill McCloskey (:billm) from comment #3)
> (In reply to Kris Maglione [:kmag] from comment #2)
> > Currently, we load any script which doesn't run at document_start
> > asynchronously. That means that we should (but apparently don't) be able to
> > use off-main-thread compilation for at least those.
> 
> Yeah, parsing off the main thread would be a nice win. We don't do that now
> and I think it would be pretty easy to do.

What is currently meant by "loaded asynchronously" if not off-main-thread? I agree we should move async stuff to OMT if this is not already the case.
(In reply to Shu-yu Guo [:shu] from comment #7)
> What is currently meant by "loaded asynchronously" if not off-main-thread? I
> agree we should move async stuff to OMT if this is not already the case.

It uses the async version of the subscript loader API, which reads the script data asynchronously, but compiles and executes it synchronously once it's been read.
It would also probably help if, for document_start scripts, we could block the parser and load the scripts asynchronously, like the DOM script loader does, rather than loading and compiling them synchronously, if we don't already have them cached.
Priority: -- → P2
(In reply to Kris Maglione [:kmag] from comment #9)
> It would also probably help if, for document_start scripts, we could block
> the parser and load the scripts asynchronously, like the DOM script loader
> does, rather than loading and compiling them synchronously, if we don't
> already have them cached.

Perhaps I'm missing something simple. Why does blocking + async for scripts we need to execute immediately (assuming that's what document_start means) be preferable to sync?
why would*
We don't need to execute them immediately, we just need to execute them before any page scripts load. When we load them synchronously, we block the main thread on reading and parsing the scripts. That means not just blocking the document we're loading the script for, but also any other frames, any other tabs in the same process, and any other UI operations. For a page with a half dozen frames, loading something as complicated as LastPass, that can be a pretty big deal.
I'm going to work on a straw man implementation of this in the next few weeks. I may not have time to make it production ready before 57, though.
Assignee: nobody → kmaglione+bmo
Shu asked me to needinfo him on this.
Flags: needinfo?(shu)
This is a pretty rough proof-of-concept. I'd be interested in feedback from anyone who's interested.

A few issues to note at this point:

- The parser blocking patch currently blocks the parser after the document element is inserted for every document, until the document-element-inserted observer has fired. I'm not sure if this is entirely desirable, but I can't think of a better alternative.

- It would be nice for the blockParsing method to automatically unblock if the promise we pass it is GCed, for the sake of sanity. A timeout might also be appropriate.

- The script cache currently pre-loads all static content scripts in all processes as soon as the extension is loaded, and keeps them loaded indefinitely. At the minimum it needs automatic expiry, to respond to memory pressure events, and to only pre-load scripts if they're very likely to be used soon. It might also be a good idea to pre-load scripts with broad match patterns (<all_urls>, http://*, ...) in the parent process and XDR clone them to children.

- I haven't actually profiled this yet.

- I'm currently compiling the scripts with canLazilyParse = false, since it seemed to make sense to parse cached scripts as eagerly as possible. It would be good to profile some common extensions in both modes.
OK, I did some basic profiling, and it looks like with this code, we spend about 435ms loading LastPass into cnn.com, compared to 937ms with the async subscript loader version. We still spend about 100ms of that cloning the script, but it's way better than what we had before.
I did some talos runs with LastPass installed, before and after these changes. The results look pretty promising:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=ae2d2f1947821dbe140f475c0b7fd0580f11eb0f&newProject=try&newRevision=709554279a9de36e35b4c51041f28f29d6849ed1&framework=1&showOnlyImportant=1
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=07086d4607f426a16123e32aba0bacbea4bed4a8&newProject=try&newRevision=e03bfe92043f834a6eb041c252d8908f10ee501a&framework=1&showOnlyImportant=1

There's about an 8-11% improvement in tp5o, about a 15% improvement in non-e10s sessionrestore, and a 12% improvement in ts_paint. It looks like there's a slight regression in tp5o responsiveness, probably because of the added request listener, but I've been planning to move some of that content script matching code to C++ anyway.

That said, the numbers with LastPass installed compared to baseline are still pretty bad:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=2d61eb2ffeac39a98b87f89dd1f11a65bc16ddca&newProject=try&newRevision=709554279a9de36e35b4c51041f28f29d6849ed1&framework=1&showOnlyImportant=1
https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=2d61eb2ffeac39a98b87f89dd1f11a65bc16ddca&newProject=try&newRevision=e03bfe92043f834a6eb041c252d8908f10ee501a&framework=1&showOnlyImportant=1

An 1100-1200% regression in tp5o responsiveness, 163-220% regression in tp5o (I had to increase the talos timeout in a few places to even get the run to complete), 55-75% in sessionrestore, 155% in tps, 40-47% in ts_paint, 65% in tabpaint, ...
I'm *very* uneasy about changing things so that there could be more one parser-blocking script. The HTML spec is written so that there can be at most one at a time, and it seems bad to either a) accidentally deviate from the spec for scripts that the page includes via <script> or b) mix extension-injected scripts into the concept that from the spec perspective is about scripts included by the page itself.

Do I understand correctly that we need a facility that allows Web Extensions to cause extension-provided scripts to be evaluated in the context of the document's global before parsing proceeds?

Is there a strong reason not to build an explicit (obviously separate from the code for the spec concept) facility for that?

I'd prefer the code being explicit in terms of which parts implement the spec and which parts are additions to support non-spec Web Extension semantics.
(In reply to Kris Maglione [:kmag] from comment #12)
> We don't need to execute them immediately, we just need to execute them
> before any page scripts load.

Does "any page script" include event handler attributes?
(In reply to Henri Sivonen (:hsivonen) from comment #35)
> Do I understand correctly that we need a facility that allows Web Extensions
> to cause extension-provided scripts to be evaluated in the context of the
> document's global before parsing proceeds?

In the context of a Sandbox that inherits from the document's window global,
yes.

> Is there a strong reason not to build an explicit (obviously separate from
> the code for the spec concept) facility for that?

I tried to keep this as separate as possible from the code that implements the
spec concept. Most of that logic is implemented in nsScriptLoader. These
changes simply let us block the parser without interfering with that code.

(In reply to Henri Sivonen (:hsivonen) from comment #36)
> (In reply to Kris Maglione [:kmag] from comment #12)
> > We don't need to execute them immediately, we just need to execute them
> > before any page scripts load.
>
> Does "any page script" include event handler attributes?

Ideally, yes, but in practice we don't guarantee that even now.
(In reply to Kris Maglione [:kmag] from comment #37)
> I tried to keep this as separate as possible from the code that implements
> the
> spec concept. Most of that logic is implemented in nsScriptLoader. These
> changes simply let us block the parser without interfering with that code.

Good point.

In the test case, it seems that the test expects head and body to have been inserted by the time the parser is blocked. Why is that? What's the expected behavior if there's a script (inline or external) in head? Why shouldn't the parser blocking happen immediately after inserting the root element? (In that case, we could make eTreeOpAppendToDocument special and able to break out of the loop and then rely on the existing blockedness check before the flush loop.)
(In reply to Henri Sivonen (:hsivonen) from comment #38)
> In the test case, it seems that the test expects head and body to have been
> inserted by the time the parser is blocked. Why is that? What's the expected
> behavior if there's a script (inline or external) in head? Why shouldn't the
> parser blocking happen immediately after inserting the root element? (In
> that case, we could make eTreeOpAppendToDocument special and able to break
> out of the loop and then rely on the existing blockedness check before the
> flush loop.)

Apparently the parser sets up the initial skeleton DOM before unblocking
scripts and allowing the document-element-inserted event to be dispatched, so
we always have more than just the document element when the event is fired. I
initially made some changes so we'd exit the flush loop immediately after the
document element was inserted, but I was worried there might be some
performance impact to that.

As for the expected behavior if there's a script, we'd definitely want to
block before it's inserted. This currently works because we dispatch
document-element-inserted synchronously as soon as scripts are unblocked, and
then load content scripts synchronously from that. But it looks like we don't
actually have any tests for that, so it may not be the case after these
changes.

I'm happy to go back to breaking out of the flush loop from
eTreeOpAppendToDocument if you think that's a good idea.
(In reply to Kris Maglione [:kmag] from comment #39)
> I'm happy to go back to breaking out of the flush loop from
> eTreeOpAppendToDocument if you think that's a good idea.

That seems like the way forward. Please add bool stop to the flush loop and make nsHtml5TreeOperation::Perform take a pointer to it in addition to the current nsIContent** aScriptElement. (The setting *aScriptElement should also set the boolean and the flush loop should check the boolean before checking the script in order to check only one thing when there's no need to break the loop.)
Kris, should I wait to review this until Henri's comments have been addressed?
Flags: needinfo?(kmaglione+bmo)
The parser blocking and script loading portions should be completely separate. As long as we're agreed that blocking the parser for this is OK (which I think we are), the other parts should be fine to review.
Flags: needinfo?(kmaglione+bmo)
(In reply to Henri Sivonen (:hsivonen) from comment #40)
> Please add bool stop to the flush loop and make nsHtml5TreeOperation::Perform
> take a pointer to it in addition to the current nsIContent** aScriptElement.
> (The setting *aScriptElement should also set the boolean and the flush loop
> should check the boolean before checking the script in order to check only
> one thing when there's no need to break the loop.)

I wound up not setting the flag when setting *aScriptElement because we don't
actually check whether that's been set during the flush loop. The parser just
performs a flush whenever it encounters a script element, so the queue is
always empty when we execute one of those operations.
Comment on attachment 8846336 [details]
Bug 1333990: Part 2c - Interrupt the flush loop after inserting document element.

https://reviewboard.mozilla.org/r/119386/#review122374

::: layout/reftests/forms/input/number/focus-handling.html
(Diff revision 2)
>         invalidation so we don't need to listen for MozReftestInvalidate.           
>    -->                                                                              
>    <head>
>      <meta charset="utf-8">
>      <script>
> -

Is this change intentional?
Attachment #8846336 - Flags: review?(hsivonen) → review+
Comment on attachment 8846335 [details]
Bug 1333990: Part 2a - Allow multiple concurrent parser blockers.

https://reviewboard.mozilla.org/r/119384/#review122380

r+ if the nits are fixed.

::: parser/html/nsHtml5Parser.h:281
(Diff revision 2)
>       * when preparsing document.write.
>       */
>      bool                          mDocWriteSpeculativeLastWasCR;
>  
>      /**
>       * The parser is blocking on a script

Please make the comment say that this covers both the parser-blocking script from the page (at most one) and extension-inserted scripts (any number).

::: parser/html/nsHtml5Parser.h:283
(Diff revision 2)
>      bool                          mDocWriteSpeculativeLastWasCR;
>  
>      /**
>       * The parser is blocking on a script
>       */
> -    bool                          mBlocked;
> +    int                           mBlocked;

Please make this `uint32_t`, since it can't be negative.

::: parser/htmlparser/nsParser.h:388
(Diff revision 2)
>      nsresult            mInternalState;
>      nsresult            mStreamStatus;
>      int32_t             mCharsetSource;
>      
>      uint16_t            mFlags;
> +    int                 mBlocked;

Please make this `uint32_t`.

::: parser/htmlparser/nsParser.cpp:657
(Diff revision 2)
>    return result;
>  }
>  
>  /**
>   *  Stops parsing temporarily. That's it will prevent the
>   *  parser from building up content model.

Please add here, too, the comment about this covering both the parser-blocking script (at most one) and extension-inserted scripts (potentially multiple).

::: parser/htmlparser/nsParser.cpp:692
(Diff revision 2)
>   * Call this to query whether the parser is enabled or not.
>   */
>  NS_IMETHODIMP_(bool)
>  nsParser::IsParserEnabled()
>  {
> -  return (mFlags & NS_PARSER_FLAG_PARSER_ENABLED) != 0;
> +  return mBlocked == 0;

Please make this `return !mBlocked`.
Attachment #8846335 - Flags: review?(hsivonen) → review+
Comment on attachment 8845709 [details]
Bug 1333990: Part 2d - Add a utility to block HTML parsing until sandbox scripts are ready.

https://reviewboard.mozilla.org/r/118856/#review122390

::: dom/base/nsDocument.cpp:10563
(Diff revision 4)
> +private:
> +  void MaybeUnblockParser() {
> +    if (mDocument) {
> +      nsCOMPtr<nsIParser> parser = mDocument->CreatorParserOrNull();
> +      if (parser) {
> +        parser->UnblockParser();

Is this sufficient for resuming the XML parser or the about:blank parser? The comment on `nsParser::UnblockParser()` suggests otherwise.

::: dom/base/nsDocument.cpp:10578
(Diff revision 4)
> +NS_IMPL_ISUPPORTS0(UnblockParsingPromiseHandler)
> +
> +already_AddRefed<Promise>
> +nsIDocument::BlockParsing(JSContext* aCx, JS::HandleObject aPromise,
> +                          ErrorResult& aRv)
> +{

I'm uncomfortable with reviewing this method, because I don't know enough about the promise machinery to know whether this is right. Not sure who to suggest as a reviewer. Sorry.

::: dom/webidl/Document.webidl:384
(Diff revision 4)
>  
>    [ChromeOnly] readonly attribute nsILoadGroup? documentLoadGroup;
>  
> +  // Blocks the initial document parser until the given promise is settled.
> +  [ChromeOnly, Throws]
> +  Promise<any> blockParsing(object promise);

Why is the argument `object promise` instead of `Promise<any> promise`?
Comment on attachment 8846337 [details]
Bug 1333990: Part 2e - Test that document.blockParsing blocks the parser at its current state during document-element-inserted.

https://reviewboard.mozilla.org/r/119388/#review122402

::: dom/base/test/file_simple.html:5
(Diff revision 2)
> +<!DOCTYPE html>
> +<html lang="en">
> +<head>
> +  <meta charset="utf-8">
> +  <title></title>

Please test 5 cases:
HTML file with internal script in head.
HTML file with external script in head.
XHTML file with internal script in head.
XHTML file with external script in head.
about:blank
Attachment #8846337 - Flags: review?(hsivonen) → review-
Comment on attachment 8846336 [details]
Bug 1333990: Part 2c - Interrupt the flush loop after inserting document element.

https://reviewboard.mozilla.org/r/119386/#review122374

> Is this change intentional?

Nope, I had debugging code in that file. I didn't notice I removed the blank line with it.
Comment on attachment 8845709 [details]
Bug 1333990: Part 2d - Add a utility to block HTML parsing until sandbox scripts are ready.

https://reviewboard.mozilla.org/r/118856/#review122390

> Is this sufficient for resuming the XML parser or the about:blank parser? The comment on `nsParser::UnblockParser()` suggests otherwise.

It looks like it's not. I guess we need `parser->ContinueInterruptedParsingAsync()`, too.

> Why is the argument `object promise` instead of `Promise<any> promise`?

The JS spec uses duck typing for promises, so any object with a `.then()` method is required to be treated as if it were a promise. In this case, depending on the caller, we can expect a DOM promise, a Spidermonkey promise, or a Promise.jsm promise with pretty much equal probability.
Attachment #8845709 - Flags: review?(wmccloskey)
Comment on attachment 8845708 [details]
Bug 1333990: Part 1a - Add an async script pre-loading utility.

https://reviewboard.mozilla.org/r/118854/#review122686

::: dom/webidl/PrecompiledScript.webidl:8
(Diff revision 4)
> + * A collection of static utility methods that are only exposed to Chrome. This
> + * interface is not exposed in workers, while ThreadSafeChromeUtils is.

This comment doesn't seem to describe this interface.

::: js/xpconnect/loader/ChromeScriptLoader.cpp:321
(Diff revision 4)
> +{
> +    CopyUTF8toUTF16(mURL, aUrl);
> +}
> +
> +bool
> +PrecompiledScript::ReturnValue()

The return value stuff will be a little confusing for someone not familiar with this aspect of compiling. Maybe call it ScriptReturnsValue or something, and comment somewhere what it's for?

::: js/xpconnect/loader/PrecompiledScript.h:6
(Diff revision 4)
> +/* -*-  Mode: C++; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2; -*- */
> +/* This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +#ifndef mozilla_dom_PrecompiledScript__

Nowadays the convention is mozilla_dom_PrecompiledScript_h.
Attachment #8845708 - Flags: review?(wmccloskey) → review+
Comment on attachment 8845708 [details]
Bug 1333990: Part 1a - Add an async script pre-loading utility.

https://reviewboard.mozilla.org/r/118854/#review122708

::: dom/webidl/ChromeUtils.webidl:64
(Diff revision 4)
>     */
>    static boolean
>    isOriginAttributesEqual(optional OriginAttributesDictionary aA,
>                            optional OriginAttributesDictionary aB);
> +
> +  [NewObject, Throws]

One more thing: can you add some documentation about what this does and how it's supposed to be used?
Comment on attachment 8845708 [details]
Bug 1333990: Part 1a - Add an async script pre-loading utility.

https://reviewboard.mozilla.org/r/118854/#review122686

> This comment doesn't seem to describe this interface.

Heh. I thought I'd written a proper doc comment for this after I copied the boilerplate from ChromeUtils...

> The return value stuff will be a little confusing for someone not familiar with this aspect of compiling. Maybe call it ScriptReturnsValue or something, and comment somewhere what it's for?

Just for the binary name, or for the JS property name too?
(In reply to Kris Maglione [:kmag] from comment #64)
> Just for the binary name, or for the JS property name too?

Probably best to change all of them.
(In reply to Henri Sivonen (:hsivonen) from comment #59)
> Please test 5 cases:
> HTML file with internal script in head.
> HTML file with external script in head.
> XHTML file with internal script in head.
> XHTML file with external script in head.
> about:blank

I added the HTML and XHTML test cases. For about:blank, though, we don't currently emit document-element-created events, and there's no support for interrupting the flush routine, so I couldn't add tests for it.
Comment on attachment 8845709 [details]
Bug 1333990: Part 2d - Add a utility to block HTML parsing until sandbox scripts are ready.

https://reviewboard.mozilla.org/r/118856/#review122390

> The JS spec uses duck typing for promises, so any object with a `.then()` method is required to be treated as if it were a promise. In this case, depending on the caller, we can expect a DOM promise, a Spidermonkey promise, or a Promise.jsm promise with pretty much equal probability.

It turns out the codegen is actually smart enough to handle this, and generates pretty much the same code what I was writing, so I switched the argument to `Promise<any>`
Comment on attachment 8846337 [details]
Bug 1333990: Part 2e - Test that document.blockParsing blocks the parser at its current state during document-element-inserted.

https://reviewboard.mozilla.org/r/119388/#review122904
Attachment #8846337 - Flags: review?(hsivonen) → review+
Comment on attachment 8845709 [details]
Bug 1333990: Part 2d - Add a utility to block HTML parsing until sandbox scripts are ready.

https://reviewboard.mozilla.org/r/118856/#review122906
Attachment #8845709 - Flags: review?(hsivonen) → review+
Comment on attachment 8847852 [details]
Bug 1333990: Part 2c.1 - Interrupt the XML flush loop after inserting document element.

https://reviewboard.mozilla.org/r/120772/#review122908
Attachment #8847852 - Flags: review?(hsivonen) → review+
Comment on attachment 8845709 [details]
Bug 1333990: Part 2d - Add a utility to block HTML parsing until sandbox scripts are ready.

https://reviewboard.mozilla.org/r/118856/#review122712

::: dom/base/nsDocument.cpp:79
(Diff revision 6)
> +#include "mozilla/dom/Promise.h"
> +#include "mozilla/dom/PromiseNativeHandler.h"

These should be sorted above TreeWalker.
Attachment #8845709 - Flags: review?(wmccloskey) → review+
Comment on attachment 8846338 [details]
Bug 1333990: Part 3b - Preload matching content scripts when opening document channels.

https://reviewboard.mozilla.org/r/119390/#review123118
Attachment #8846338 - Flags: review?(wmccloskey) → review+
Comment on attachment 8846339 [details]
Bug 1333990: Part 3c - Evict cached content scripts after a timeout and on memory-pressure.

https://reviewboard.mozilla.org/r/119392/#review123130

::: toolkit/components/extensions/ExtensionContent.jsm:136
(Diff revision 4)
> +    return script;
> +  }
> +
> +  delete(url) {
> +    if (this.has(url)) {
> +      super.get(url).timer.cancel();

Not sure if you're ever likely to use clear(), but that should also kill timers.
Attachment #8846339 - Flags: review?(wmccloskey) → review+
Comment on attachment 8845708 [details]
Bug 1333990: Part 1a - Add an async script pre-loading utility.

https://reviewboard.mozilla.org/r/118854/#review123140
Attachment #8845708 - Flags: review?(shu) → review+
Comment on attachment 8846334 [details]
Bug 1333990: Part 1b - Add tests for script precompiler.

https://reviewboard.mozilla.org/r/119382/#review123142

::: js/xpconnect/tests/unit/test_compileScript.js:38
(Diff revision 3)
> +
> +  obj = script2.executeInGlobal(sandbox1);
> +  equal(obj, undefined, "No-return script has no return value");
> +
> +  equal(Cu.getObjectPrincipal(sandbox1.bar).origin, "http://example.com", "Object value origin is correct");
> +  equal(sandbox1.bar.foo, "\u00ae", "Object value has the correct charset");

For completeness, please also test that hasReturnValue: false scripts (script2) execute correctly in a different global than the original.
Attachment #8846334 - Flags: review?(shu) → review+
Flags: needinfo?(shu)
Comment on attachment 8846341 [details]
Bug 1333990: Part 3e - Add tests for content script cache eviction.

https://reviewboard.mozilla.org/r/119396/#review123144
Attachment #8846341 - Flags: review?(wmccloskey) → review+
Comment on attachment 8847432 [details]
Bug 1333990: Part 2b - Don't enable editor until layout has started.

https://reviewboard.mozilla.org/r/120396/#review123160

::: commit-message-55eb0:3
(Diff revision 2)
> +Bug 1333990: Part 2b - Don't enable editor until layout has started. r?ehsan
> +
> +We need to In order to support asynchronous loading of extension content

Nit: s/We need to //

::: docshell/test/navigation/test_bug386782.html:43
(Diff revision 2)
>      ];
>  
>      var gTestNum = -1;
>      var gTest = null;
>  
> -    window.onload = goNext();
> +    window.onload = goNext;

lol!
Attachment #8847432 - Flags: review?(ehsan) → review+
Comment on attachment 8845710 [details]
Bug 1333990: Part 3a - Use async loading and in-memory caching for WebExtension content scripts.

https://reviewboard.mozilla.org/r/118858/#review123184

::: toolkit/components/extensions/ExtensionContent.jsm:144
(Diff revision 6)
>    this.requiresCleanup = !this.remove_css && (this.css.length > 0 || options.cssCode);
>  }
>  
>  Script.prototype = {
> +  compileScripts() {
> +    return this.js.map(url => this.scriptCache.get(url));

These used to get resolved against the extension's baseURI before injection.  Was it just redundant previously?
Attachment #8845710 - Flags: review?(aswan) → review+
Comment on attachment 8846338 [details]
Bug 1333990: Part 3b - Preload matching content scripts when opening document channels.

https://reviewboard.mozilla.org/r/119390/#review123188

::: commit-message-571c5:11
(Diff revision 4)
> +most of its prediction work happens in the parent process, and there are no
> +simple ways for us to hook into it.
> +
> +This currently does not do any pre-loading in the parent process, mainly
> +because there isn't a good way to distinguish top-level document loads that
> +are happening directly in the parent verses those that are being proxied from

nit: versus
Attachment #8846338 - Flags: review?(aswan) → review+
Comment on attachment 8846339 [details]
Bug 1333990: Part 3c - Evict cached content scripts after a timeout and on memory-pressure.

https://reviewboard.mozilla.org/r/119392/#review123190
Attachment #8846339 - Flags: review?(aswan) → review+
Comment on attachment 8846340 [details]
Bug 1333990: Part 3d - Split observe() method to fix complexity warnings.

https://reviewboard.mozilla.org/r/119394/#review123194
Attachment #8846340 - Flags: review?(aswan) → review+
Comment on attachment 8846341 [details]
Bug 1333990: Part 3e - Add tests for content script cache eviction.

https://reviewboard.mozilla.org/r/119396/#review123196
Attachment #8846341 - Flags: review?(aswan) → review+
Comment on attachment 8845710 [details]
Bug 1333990: Part 3a - Use async loading and in-memory caching for WebExtension content scripts.

https://reviewboard.mozilla.org/r/118858/#review123184

> These used to get resolved against the extension's baseURI before injection.  Was it just redundant previously?

Yes. The manifest entries get resolved by the schema normalizer, and the ones passed to `executeScript` get resolved relative to their caller.
https://hg.mozilla.org/integration/mozilla-inbound/rev/48f9ab3c59676f9a290e1a08f22e2d0ee945eb98
Bug 1333990: Part 1a - Add an async script pre-loading utility. r=billm,shu

https://hg.mozilla.org/integration/mozilla-inbound/rev/db8466a676a1c98e0ed7de06e07efe6821a04d40
Bug 1333990: Part 1b - Add tests for script precompiler. r=shu

https://hg.mozilla.org/integration/mozilla-inbound/rev/a62f02519522dd12e5a6ffa2a482d5d6c43912e1
Bug 1333990: Part 2a - Allow multiple concurrent parser blockers. r=hsivonen

https://hg.mozilla.org/integration/mozilla-inbound/rev/44b17b53062ff9c14ca882bb3e81e5d4d3247972
Bug 1333990: Part 2b - Don't enable editor until layout has started. r=ehsan

https://hg.mozilla.org/integration/mozilla-inbound/rev/2297d47b77e1eacec984068e1de9ce5ab1a953da
Bug 1333990: Part 2c - Interrupt the flush loop after inserting document element. r=hsivonen

https://hg.mozilla.org/integration/mozilla-inbound/rev/3ce64ea6ccdc400400e0dca38573cfc2a93be05a
Bug 1333990: Part 2c.1 - Interrupt the XML flush loop after inserting document element. r=hsivonen

https://hg.mozilla.org/integration/mozilla-inbound/rev/fce9fb688fe01cd5bcf9997be7ecc6bbca38dbf7
Bug 1333990: Part 2d - Add a utility to block HTML parsing until sandbox scripts are ready. r=hsivonen,billm

https://hg.mozilla.org/integration/mozilla-inbound/rev/dc7501462ffff389ef10ac910c41140bad0ae575
Bug 1333990: Part 2e - Test that document.blockParsing blocks the parser at its current state during document-element-inserted. r=hsivonen

https://hg.mozilla.org/integration/mozilla-inbound/rev/34710ba3077609aab152f9d9db3dc3ce57aa1226
Bug 1333990: Part 3a - Use async loading and in-memory caching for WebExtension content scripts. r=aswan

https://hg.mozilla.org/integration/mozilla-inbound/rev/b101f320e0b4e126f42cd3adc9de769c1f11c2b6
Bug 1333990: Part 3b - Preload matching content scripts when opening document channels. r=aswan,billm

https://hg.mozilla.org/integration/mozilla-inbound/rev/5bb9384b1205c40cc625146a2bc8d7cc48801468
Bug 1333990: Part 3c - Evict cached content scripts after a timeout and on memory-pressure. r=aswan,billm

https://hg.mozilla.org/integration/mozilla-inbound/rev/db5b16f0b17df584e3bf1232a1d1cd0cd6f64957
Bug 1333990: Part 3d - Split observe() method to fix complexity warnings. r=aswan

https://hg.mozilla.org/integration/mozilla-inbound/rev/291b34abf2f47f7a168652aec4c6c54dce7492db
Bug 1333990: Part 3e - Add tests for content script cache eviction. r=aswan,billm
> + * Represents a pre-compiled JS script, which can be repeatedly exeuted in

"executed".

>    RootedValue exn(aCx, StringValue(JS_NewStringCopyZ(aCx, msg.get())));

This is going to mess up all sorts of stuff: non-ASCII urls, embedded nulls, etc.  Please use existing utilities for properly converting strings to JS values.

The API doesn't document what principal the resulting script will get, and it's _very_ non-obvious from reading the code what the answer is.  Can you please document this in the IDL?  The point, I _think_ is that it doesn't matter where you _compile_ the script; what matters is what's passed to the executeInGlobal function, right?

In PrecompiledScript::ExecuteInGlobal you should return without pressing on to the JS_WrapObject on failure; otherwise you're messing around with a JSContext with a pending exception on it.

It's probably worth documenting why AsyncScriptCompiler doesn't need to be cycle-collected.

Why does UnblockParsingPromiseHandler not need to be cycle-collected?  That looks pretty fishy to me.  I bet it can leak.

> nsIDocument::BlockParsing(OwningNonNull<Promise> aPromise,

Arg type should be "Promise&" here.  I should probably fix bindings so the OwningNonNull version just doesn't compile.... :(

How does this code handle cases when the parser on the document changes while our parser-block is in effect (e.g. due to page scripts calling document.close()/document.open())?
Flags: needinfo?(kmaglione+bmo)
(In reply to Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) from comment #106)
> >    RootedValue exn(aCx, StringValue(JS_NewStringCopyZ(aCx, msg.get())));
>
> This is going to mess up all sorts of stuff: non-ASCII urls, embedded nulls,
> etc.  Please use existing utilities for properly converting strings to JS
> values.

I copied the error handling pretty much verbatim from the subscript loader.
Which utilities would you prefer I use? I'll change them both.

> The API doesn't document what principal the resulting script will get, and
> it's _very_ non-obvious from reading the code what the answer is.  Can you
> please document this in the IDL?  The point, I _think_ is that it doesn't
> matter where you _compile_ the script; what matters is what's passed to the
> executeInGlobal function, right?

Yes, it executes with the principal of the global passed to executeInGlobal.

> Why does UnblockParsingPromiseHandler not need to be cycle-collected?  That
> looks pretty fishy to me.  I bet it can leak.

That's a good point. It may be able to leak if the promise we pass it never
settles, and is being kept alive by the document or the result promise.

> How does this code handle cases when the parser on the document changes
> while our parser-block is in effect (e.g. due to page scripts calling
> document.close()/document.open())?

I'm not sure that can happen in practice, but maybe it can. Page scripts
shouldn't be able to run until we unblock the parser, but I suppose it may be
possible for another same-origin document, or an extension content script, to
create a new parser.

I initially stored the original parser as a weak reference, so I can go back
to doing that if you think it's necessary.
Flags: needinfo?(kmaglione+bmo)
> I copied the error handling pretty much verbatim from the subscript loader.

<sigh>.  Why do we have all this broken code in the tree?  ;)

The only thing that will handle non-ASCII correctly here is converting to UTF-16 before we hand stuff to the JS engine.  At that point you might as well use ToJSValue to get a JS::Value out of it.  You should be able to assume your string is UTF-8 here, right?

> Yes, it executes with the principal of the global passed to executeInGlobal.

Good.  We should document.

> but I suppose it may be possible for another same-origin document, or an extension content script,
> to create a new parser.

Right.  I don't know whether it's _necessary_ to store the original parser, but that's because I don't understand what the failure modes are if we do not....
(In reply to Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) from comment #108)
> The only thing that will handle non-ASCII correctly here is converting to
> UTF-16 before we hand stuff to the JS engine.  At that point you might as
> well use ToJSValue to get a JS::Value out of it.  You should be able to
> assume your string is UTF-8 here, right?

Yes, the string is definitely in UTF-8. But it would be easy enough to
normalize it to ASCII after we call NS_NewURL.

> > but I suppose it may be possible for another same-origin document, or an
> > extension content script, to create a new parser.
>
> Right.  I don't know whether it's _necessary_ to store the original parser,
> but that's because I don't understand what the failure modes are if we do
> not....

Well, it would result in an extra UnblockParser call if the parser changed. In
non-release builds, that would trigger an assertion. In release builds, it
would just cause the parser to never unblock, which would not be a major
issue.

I'm still not sure it's actually possible for this to happen with the original
parser (which is the only way we use this API), since document.open() and
document.close() don't work on it, and I don't think it should ever disappear
while it's blocked unless the document is being destroyed. But perhaps it's
best to store it anyway, just to be safe.
> But it would be easy enough to normalize it to ASCII after we call NS_NewURL.

I don't think it's worth the annoyance involved in that.  Just NS_ConvertUTF8toUTF16 in the error case.

> since document.open() and document.close() don't work on it

Ah, ok.  In that case this is probably a nonissue.  I thought you could document.close() it.
Boris, can you review these follow-ups?
Flags: needinfo?(bzbarsky)
Comment on attachment 8848623 [details]
Bug 1333990: Follow-up: Add cycle collection and comments where unnecessary.

https://reviewboard.mozilla.org/r/121510/#review123598

r=me
Attachment #8848623 - Flags: review+
Comment on attachment 8848624 [details]
Bug 1333990: Follow-up: Use Promise& rather than OwningNonNull<Promise> for binding arguments.

https://reviewboard.mozilla.org/r/121512/#review123600
Attachment #8848624 - Flags: review+
Comment on attachment 8848625 [details]
Bug 1333990: Follow-up: Use safer conversion functions when creating error message JS strings.

https://reviewboard.mozilla.org/r/121514/#review123604

r=me with the nits below.

::: js/xpconnect/loader/ChromeScriptLoader.cpp:201
(Diff revision 2)
>  AsyncScriptCompiler::Reject(JSContext* aCx, const char* aMsg)
>  {
> -    nsAutoCString msg(aMsg);
> -    msg.Append(": ");
> -    msg.Append(mURL);
> +    nsAutoString msg;
> +    msg.AppendASCII(aMsg);
> +    msg.AppendLiteral(": ");
> +    msg.Append(NS_ConvertUTF8toUTF16(mURL));

AppendUTF8toUTF16(mURL, aMsg);

::: js/xpconnect/loader/ChromeScriptLoader.cpp:204
(Diff revision 2)
> -    msg.Append(": ");
> -    msg.Append(mURL);
> +    msg.AppendASCII(aMsg);
> +    msg.AppendLiteral(": ");
> +    msg.Append(NS_ConvertUTF8toUTF16(mURL));
>  
> -    RootedValue exn(aCx, StringValue(JS_NewStringCopyZ(aCx, msg.get())));
> +    RootedValue exn(aCx);
> +    if (xpc::StringToJsval(aCx, msg, &exn)) {

Might as well call NonVoidStringToJsval here and save that one branch.
Attachment #8848625 - Flags: review+
Comment on attachment 8848626 [details]
Bug 1333990: Follow-up: Clarify documentation for PrecompiledScript bindings.

https://reviewboard.mozilla.org/r/121516/#review123606
Attachment #8848626 - Flags: review+
Comment on attachment 8848638 [details]
Bug 1333990: Follow-up: Return before wrapping result value on error.

https://reviewboard.mozilla.org/r/121526/#review123608

Thank you for fixing these up!
Attachment #8848638 - Flags: review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/cd81731fee82fc79cf04333b38b72cc1f8eebcf0
Bug 1333990: Follow-up: Add cycle collection and comments where unnecessary. r=bz

https://hg.mozilla.org/integration/mozilla-inbound/rev/95d7e575d614ad6a2f77f1fbf2739113fc3ecd23
Bug 1333990: Follow-up: Use Promise& rather than OwningNonNull<Promise> for binding arguments. r=bz

https://hg.mozilla.org/integration/mozilla-inbound/rev/ab283807ed8ace975bcc183a1006130729956558
Bug 1333990: Follow-up: Use safer conversion functions when creating error message JS strings. r=bz

https://hg.mozilla.org/integration/mozilla-inbound/rev/dffb79fe69fe0983ef222b8da5418ae188b62eac
Bug 1333990: Follow-up: Clarify documentation for PrecompiledScript bindings. r=bz

https://hg.mozilla.org/integration/mozilla-inbound/rev/67162d27979c8e8dbbc81461b5915e6ac15683d9
Bug 1333990: Follow-up: Return before wrapping result value on error. r=bz
Depends on: 1348523
Depends on: 1348241
Flags: needinfo?(bzbarsky)
Depends on: 1349989
Whilst comment 34 has some useful metrics, now the patch has landed I'd be curious of the final result, to see if this improved things (I'm sure it has) and by how much.
Please, you could edit the documentation pages with this change?
(https://developer.mozilla.org/en-US/docs/Observer_Notifications#Documents)

You see also: https://github.com/greasemonkey/greasemonkey/issues/2515

Thank you in advance.
Flags: needinfo?(kmaglione+bmo)
Flags: needinfo?(kmaglione+bmo)
Keywords: dev-doc-needed
Depends on: 1379148
(In reply to janekptacijarabaci from comment #128)
> Please, you could edit the documentation pages with this change?
> (https://developer.mozilla.org/en-US/docs/Observer_Notifications#Documents)

That documentation is correct now (though it wasn't before this change)
You need to log in before you can comment on or make changes to this bug.