Page Worker API implementation

RESOLVED FIXED in 0.4

Status

Add-on SDK
General
P1
normal
RESOLVED FIXED
9 years ago
8 years ago

People

(Reporter: dbuc, Assigned: Felipe)

Tracking

unspecified

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(2 attachments, 10 obsolete attachments)

(Reporter)

Description

9 years ago
For requirements and specs please reference https://wiki.mozilla.org/Labs/Jetpack/Reboot/JEP/108
Note: I've started a discussion of this JEP at http://groups.google.com/group/mozilla-labs-jetpack/browse_thread/thread/6a55b9cbb530affa, as it's not clear what the uses cases for its proposed API are.
Severity: major → normal
Priority: P2 → P1
Target Milestone: -- → 0.2
(Reporter)

Updated

9 years ago
Summary: Background Pages API implementation → Page Worker API implementation
Target Milestone: 0.2 → --
Target Milestone: -- → Future
Assignee: nobody → felipc
Target Milestone: Future → 0.4
Version: 0.2 → Trunk
(Assignee)

Comment 2

8 years ago
Created attachment 439465 [details] [diff] [review]
WIP v1

This is the first working version of this JEP. It is quite simple at the moment but I don't think it has to be much more complex that. I haven't updated it to the new constructor style yet because that is ongoing discussion and I'm just focusing on get it working now.

As far as the API goes, it implements everything. There are some things yet to solve though:

- I don't know what is the correct way to run a script within another window's context. at the moment I'm doing something that works (injecting <script> tags), but it doesn't seem the best solution. Using apply just sets the _this_ variable, not _window_.  This might be easy to do, and I'll poke people around, or might be hard to do and depend on the page mods work. Either way, I'll discover in the next few days

- To create an invisible and global webpage, I'm using the hidden DOM window. This might be frowned upon, but it works surprisingly well, and I don't know if there's a better way to do that.
(Assignee)

Comment 3

8 years ago
Created attachment 439466 [details] [diff] [review]
WIP v1

Oops, wrong patch. Also, this is just the module.js file, not the whole package
Attachment #439465 - Attachment is obsolete: true

Comment 4

8 years ago
Cool. I'm not sure how to do this without the hidden DOM window either, but I've found it incredibly useful for things like this too.

Regarding running a script within another window's context, I think you might be able to use the "@mozilla.org/moz/jssubscript-loader;1" component, which exports a mozIJSSubScriptLoader interface. As long as you pull the component off the target window's Components object, the script you execute will be run in the context of that window. I think.
(Reporter)

Comment 5

8 years ago
So here is a mock implementation (use your imagination that the parent web page is chrome space!) that uses a closure and external salt validation to execute script in the page's context.

The implementation at this point assumes that Page Worker only allows for content code to be run and not intermixed with chrome privileged js.  It may be possible to modify the implementation to allow for that but I was under the assumption, based on the spec as it stands, that chrome code was not the focus of the API.

code: http://jsfiddle.net/2KNQH/41/

try it link: http://fiddle.jshell.net/2KNQH/41/show/light/  -  paste this line into Firebug and it will return the window obj of the content frame: PageWorker.run(function(){ return this; })
(Assignee)

Comment 6

8 years ago
Created attachment 440316 [details] [diff] [review]
WIP v2

Second version. Will start writing docs and tests now.

This version has a new approach for the run function. It is similar to dbuc's ideas (uneval), but instead of injecting a script tag (that the page could sniff), I learned how to properly use the sandbox for this. It also makes use of chrome -> content and content -> chrome wrappers.

API-wise comments:
  - should we call this Page Worker or Background Page?  (We're calling it the former, but to me it appears that the latter name reflects it better)

  - the empty() call removes every element (including body, head, html) and reinserts <html><body/></html>

  - I was experimenting with the timer part and I built some timer management functions here. I can remove them if not wanted, or we could even use it as a separate module if found to be useful. It basically goes like this:

  instead of having timers by addressed by numeric ids, and have the setTimeout or setInterval call return the id, we let the developer set a name for the timer beforehand, and manage the numeric ids internally.  This way, he can easily cancel a timer by its name, and if he wants to override one timer, he just uses the same id. pseudo-example:

   setTimer("periodic-update", function_a, {interval: 1000});

   // cancels first timer with the same name
   setTimer("periodic-update", function_b, {interval: 500});

   cancelTimer("periodic-update");


  - the <xul:browser> element that holds the page is set as type="content", meaning that you can't set a privileged page to run there


  - Just some food for thought: the current API is sync in that the function to be run on the page returns a value, but talking to dietrich we thought that this might be changed when we go multi-process. We can either change the API now to call back a function when the value is returned (making it future proof), or we can keep it like that if we think we will have a way to block the execution and wait for the return value, as dbuc suggested.
Attachment #439466 - Attachment is obsolete: true
(Assignee)

Comment 7

8 years ago
Created attachment 440650 [details] [diff] [review]
Complete version - v1

API, tests and docs included.


One thing to note: there was no way on the API to add an event listener on the background page to check when the page finishes loading (either on the first time or when the page URL is changed). What I did was to add an optional callback parameter to the constructor that will be called whenever a DOMContentLoaded event hit the <xul:browser> element
Attachment #440316 - Attachment is obsolete: true
(Assignee)

Comment 8

8 years ago
Created attachment 441077 [details] [diff] [review]
Complete version - v2

Minor update from the other patch, I wrote one extra test on this, and removed two debug variable that I had left behind.

Drew, can you take a look on this implementation?
Attachment #440650 - Attachment is obsolete: true
Attachment #441077 - Flags: review?(adw)

Comment 9

8 years ago
Felipe, I've started to look at this but I noticed that it's often different from the JEP, and the JEP itself has spots that don't conform to Jetpack API conventions.  Has Myk signed off on the JEP as it is currently written?
(Reporter)

Comment 10

8 years ago
No the JEP has not been approved for syntax and similar items by Myk as far as I know. If he as things he needs to add/remove/or conform, can you flag them?
Let's hold off on Drew's review until Myk does an API review of the implemented module.

(The JEP is the initial *proposed* API, which is more archival than useful at this point. As I understood it, the JEPs aren't supposed to be living documents.)
Attachment #441077 - Flags: feedback?(myk)

Comment 12

8 years ago
Comment on attachment 441077 [details] [diff] [review]
Complete version - v2

All JEPs need to be approved by Myk before they're implemented to ensure consistency with the other APIs, among other things.  That's a critical step that was missing in the prototype's process.  If the JEP was never approved then it's not "archived."  Once the JEP is approved and its implementation ready, Myk does another review as part of the review process to ensure that the code matches the JEP.

All that is not to say the implementation can't be worked on during JEP review, but it might change as a result.
Attachment #441077 - Flags: feedback?(myk)
(Assignee)

Comment 13

8 years ago
Ok, for easier reference I've pasted the documentation of how the JEP is currently implemented on a markedown pastebin service http://pastedown.lunant.net/fe743a

(The formatting is not as good as cfx docs but it's still useful)

Comment 14

8 years ago
Comment on attachment 441077 [details] [diff] [review]
Complete version - v2

I think you should update the JEP directly and iterate on it with Myk.  (You can always revert to an earlier version if you want, the benefits of using a wiki.)

Here's some feedback on the module code.  I hope it's helpful even if the API ends up changing a little.

>diff --git a/packages/jetpack-core/lib/page-worker.js b/packages/jetpack-core/lib/page-worker.js

>+function Page(options) {

>+  let listener = options && options.callback ?
>+                   function () options.callback() :
>+                   function () { };

Any time you're calling a user callback you should try-catch it and log the error with console.exception().  You don't want user errors interfering with your own code.  (The new jetpack-core/errors module has a catchAndLog() function that returns a function that try-catch-logs.  I asked Atul about adding another function that just try-catch-logs without wrapping it in a function, so feel free to add that if you want.)

Also, it would be safer to check typeof(options.callback) === "function" rather than just for options.callback's existence.  (context-menu.js has a validateOptions() function that I'd like to pull out and you could use here, but it might be overkill since there's only one option.)

Nit: The two lines with the functions should be left-aligned with the first |options| in the previous line.

>+                                        

Please remove the whitespace on this line.

>+  browser.addEventListener("DOMContentLoaded", listener, false);
>+
>+  browser.setAttribute("type", "content");
>+  browser.setAttribute("src", "about:blank");
>+
>+  window.document.documentElement.appendChild(browser);
>+
>+  this.unload = function JP_SDK_Page_unload() {
>+    this.clearAllTimers();
>+    browser.removeEventListener("DOMContentLoaded", listener, false);
>+    window.document.documentElement.removeChild(browser);
>+    delete this.document;
>+    delete this.window;
>+    browser = null;
>+  };

Does unload really need to be public?  Is there a public use case?  (If not, you can require("unload").when() instead of ensure() to avoid making it public.)

>+  this._timerIds = {};

For the high-level APIs we really shouldn't define "private" members like this, since consumers have public access to them.  See my comment on Page.prototype below.

>+  this.__defineGetter__("window", function() browser.contentWindow);
>+  this.__defineGetter__("document", function() browser.contentDocument);

Nit: Space between "function" and the parentheses, but that's only my personal preference.  You should at least be consistent in your own code.

>+Page.prototype = {

This is subtle, but in the high-level APIs Atul has said that we should avoid |this| in public methods.  The reason is that since anyone can use |apply| or |call| on a public function, we can't be sure what |this| really refers to.  It could be a dangerous object that defines its own versions of our methods.  So instead of defining public methods that refer to |this| in a prototype, we should define them directly on |this| in the constructor (like you do with unload and the two getters) and replace references to |this| with variables scoped to the constructor.

>+  empty: function JP_SDK_Page_empty() {
>+    let document = this.document;
>+    while (document.firstChild) {
>+      document.removeChild(document.firstChild);
>+    }

Nit: All the Jetpack code I've seen (Atul's) omits braces on loops and conditionals whose bodies are single lines.  Personally I don't care so much, but just letting you know in case you want to do that.

>+    document.appendChild(document.createElement("html"));
>+    document.firstChild.appendChild(document.createElement("head"));
>+    document.firstChild.appendChild(document.createElement("body"));
>+  },
>+  

Please remove the whitespace from that last line.

>+  clearTimer: function JP_SDK_Page_clearTimer(timerName) {
>+    let timer = require("timer");

Since you use timer a few times, it would be better to import it once and use that instance everywhere.  Every time you do a require(module), you cause that module to be parsed and loaded again.

>+  clearAllTimers: function JP_SDK_Page_clearAllTimers() {
>+    let timer = require("timer");
>+    for each (timerId in this._timerIds) {
>+      timer["clear" + timerId.type](timerId.id);
>+    }
>+  },

It would be good if clearAllTimers called clearTimer rather than duplicating its string concatenation and call to clear{Timeout,Interval}.  They should use the same code path.

>+  run: function JP_SDK_Page_run() {
>+    let options, action;
>+
>+    if (arguments.length == 2) {
>+      options = arguments[0];
>+      action  = arguments[1];
>+    } else {
>+      options = {};
>+      action = arguments[0];
>+    }
>+    if (!(typeof(action) == "function"))

if (typeof(action) != "function")

>+      throw new Error("Page.run needs a function argument");

Nit: My personal preference, but I think all high-level API error messages should be full sentences with punctuation, since we're trying to be really friendly to developers.  So a period at the end here.

>+    let params = options.arguments || [];
>+    let bind = options.bind || this.window;

These should be sanity checked like action is, with friendly error messages.  I'll file a bug to break out validateOptions since it should be helpful.

>+    let runFunction = function() {

Nit: function runFunction() {

>+      // hiddenWindow's alert is problematic, so we provide
>+      // an alert call without a parent window

Nit: Please punctuate comments that are full sentences.

>+      function parentlessAlert(msg) {
>+        let prompter = Components.classes["@mozilla.org/embedcomp/prompt-service;1"].
>+                       getService(Components.interfaces.nsIPromptService);

Cc and Ci.

>+        prompter.alert(null, "Jetpack SDK", msg);
>+      }
>+
>+      sandbox.importFunction(parentlessAlert, "alert");

importFunction, cool.

Hmm.  Should alert be disabled altogether (i.e., an empty function)?  What's the user supposed to think or do when he gets a random alert that's disconnected from any page?  Seems like a bad user experience.  What do you think?

>+      sandbox.extraContext = {alert: sandbox.alert};

Nit: Please include a space after the opening brace and a space before the closing brace.

>+      let result = Cu.evalInSandbox("with(window) with(extraContext) {" +
>+                                    "(function(){ return (" +
>+                                    uneval(action) +
>+                                    ").apply(bind,params);})()}", sandbox);
>+
>+      try {
>+        return new XPCSafeJSObjectWrapper(result);
>+      } catch (error) {
>+        if (error.name == "NS_ERROR_ILLEGAL_VALUE") {

catch (error if error.result == Cr.NS_ERROR_ILLEGAL_VALUE) {

>+          // result is a native type and we can return it
>+          // without concerns

Punctuate.

>+    let timed = options.interval || options.timeout;
>+
>+    if (timed) {

timed will be falsey if both options.interval and timeout are 0.  Is that intended?  0 durations are allowed for setTimeout, setInterval, and Gecko timers, and they at least cause the callback to be scheduled later in the event loop (and I thiknk might even be capped to a 10ms minimum by the platform).

>+      let timer = require("timer");
>+      let timerId = timer[options.timeout ? 'setTimeout' : 'setInterval'](

Nit: Please use double quotes here, since you use them elsewhere.

>+        runFunction, options.timeout || options.interval
>+      );

So if the user sets a timer, runFunction's return value is thrown away, but if he doesn't, it's not?

>+      let timerName = options.name || "timer_id_" + Date.now();
>+      let timerIds = this._timerIds;
>+      if (timerIds[timerName]) {
>+        timer["clear" + timerIds[timerName].type](timerIds[timerName].id);
>+      }

Call clearTimer here.  Multiple paths for the same thing are bad! :)

>+      timerIds[timerName] = {id: timerId, 
>+                             type: options.timeout? "Timeout" : "Interval"};

Nit: That first line has an extra space at the end that should be removed.  Also, please include a space after the opening brace, a space before the closing brace, and a space before the ?.  Ideally I'd prefer:

      timerIds[timerName] = {
        id: timerId,
        type: options.timeout ? "Timeout" : "Interval"
      };

But since Atul doesn't follow that style I can't hold you to it. :)

>+      return timerName;
>+
>+    } else {
>+      return runFunction();
>+    }

else not needed here, since the if branch returns.
Attachment #441077 - Flags: review?(adw) → feedback+
(Assignee)

Comment 15

8 years ago
Hi Drew, thanks for this first feedback! Answering your questions because they're useful independent of the final API
(skipping the nits, which will all be addressed)


(In reply to comment #14)
> (From update of attachment 441077 [details] [diff] [review])
> I think you should update the JEP directly and iterate on it with Myk.  (You
> can always revert to an earlier version if you want, the benefits of using a
> wiki.)
> 
I wanted to keep the wiki on the original JEP for now for easier comparison between the differences. After the API is more defined I'll update the wiki directly.

> 
> >diff --git a/packages/jetpack-core/lib/page-worker.js b/packages/jetpack-core/lib/page-worker.js
> 
> >+function Page(options) {
> 
> >+  let listener = options && options.callback ?
> >+                   function () options.callback() :
> >+                   function () { };
> 
> Any time you're calling a user callback you should try-catch it and log the
> error with console.exception().  You don't want user errors interfering with
> your own code.  (The new jetpack-core/errors module has a catchAndLog()
> function that returns a function that try-catch-logs.  I asked Atul about
> adding another function that just try-catch-logs without wrapping it in a
> function, so feel free to add that if you want.)
> 

Okay, I will use catchAndLog or something similar. I had seen it used before but didn't know what was the purpose. Got it now

> 
> Does unload really need to be public?  Is there a public use case?  (If not,
> you can require("unload").when() instead of ensure() to avoid making it
> public.)
> 

Hmm, I believe that a good developer might want to unload a page at some arbitrary point to save on memory (since it's a <xul:browser> which likely consumes a lot of resources). However, I don't know if this would be a common use case. I'd have no problems in make it private if wanted. Jetpack question: using when(), will my unload be called when the object is GC'ed, or only on shutdown?


> >+Page.prototype = {
> 
> This is subtle, but in the high-level APIs Atul has said that we should avoid
> |this| in public methods.  The reason is that since anyone can use |apply| or
> |call| on a public function, we can't be sure what |this| really refers to.  It
> could be a dangerous object that defines its own versions of our methods.  So
> instead of defining public methods that refer to |this| in a prototype, we
> should define them directly on |this| in the constructor (like you do with
> unload and the two getters) and replace references to |this| with variables
> scoped to the constructor.
> 

Okay! everything will go inside the constructor though.. (but that's not a problem)


> Since you use timer a few times, it would be better to import it once and use
> that instance everywhere.  Every time you do a require(module), you cause that
> module to be parsed and loaded again.
> 
Okay, good to know. I thought that require() was parsed like .jsms, (or more precisely, once per extension). It would be good to have this explained somewhere, though I don't know what would be a good place to put this info.


> It would be good if clearAllTimers called clearTimer rather than duplicating
> its string concatenation and call to clear{Timeout,Interval}.  They should use
> the same code path.

Sure! I did this twice, shame :p  Blame it on writing the original code in-place and later deciding to make it a function :)


> 
> >+        prompter.alert(null, "Jetpack SDK", msg);
> >+      }
> >+
> >+      sandbox.importFunction(parentlessAlert, "alert");
> 
> importFunction, cool.
> 
> Hmm.  Should alert be disabled altogether (i.e., an empty function)?  What's
> the user supposed to think or do when he gets a random alert that's
> disconnected from any page?  Seems like a bad user experience.  What do you
> think?
> 

Yeah, it's definitely not a good user experience, but the purpose of this alert is mostly for debugging purposes, since unfortunately[?] alert is the de-facto way to debug JS. That's why I'm forcing the alert title to "Jetpack SDK" instead of letting it be customizable.

I could alias it to console.log, but the developer should be using it anyway. Completely disabling alert would be hard (the developer could put a same-origin iframe on the page and then grab that iframe's window.alert), so better having one that works well than forcing workarounds.

> 
> >+    let timed = options.interval || options.timeout;
> >+
> >+    if (timed) {
> 
> timed will be falsey if both options.interval and timeout are 0.  Is that
> intended?  0 durations are allowed for setTimeout, setInterval, and Gecko
> timers, and they at least cause the callback to be scheduled later in the event
> loop (and I thiknk might even be capped to a 10ms minimum by the platform).

Good catch! This will be fixed together by better sanity checking using validateOptions

> 
> >+        runFunction, options.timeout || options.interval
> >+      );
> 
> So if the user sets a timer, runFunction's return value is thrown away, but if
> he doesn't, it's not?
> 

If a timer is used, the function won't actually run, so there is nothing to return. The only way that the developer can get the result of that scheduled function is to use a callback.

Comment 16

8 years ago
> I wanted to keep the wiki on the original JEP for now for easier comparison
> between the differences. After the API is more defined I'll update the wiki
> directly.

OK, no worries, I don't mean to try to force you to do it a certain way.  I should have explained that the process for JEP approval so far has been to write a first draft, Myk looks at it, he posts his thoughts to the user group, there's a conversation about it, the JEP is updated along the way, and ultimately you end up with an approved JEP on the wiki.  One of the reasons we chose wikimo for JEPs is that being a wiki, people can easily see and link to and revert to old revisions, so people don't have to feel timid about making updates, especially while review is ongoing.

> Hmm, I believe that a good developer might want to unload a page at some
> arbitrary point to save on memory (since it's a <xul:browser> which likely
> consumes a lot of resources).

That makes sense.  A question like that will be ironed out in the JEP review.

> using when(), will my unload be called when the object is GC'ed, or only on
> shutdown?

when() takes a callback that's called when your module is unloaded, which you should assume can happen at any time.  When exactly your module will be unloaded is up to the Cuddlefish loader.  In a real extension (as opposed to a unit test for example) modules are generally unloaded on a "quit-application-granted" observer [1].  In unit tests you can force your module to unload, and that's useful for testing that you've cleaned up everything.  (See the XHR test for example [2].) ([3] is the site that tells the `unload` module to notify all the modules that registered with it that they're being unloaded, including your module when you call when() or ensure().  BTW, I could be wrong about some of this -- I'm justing reading the code.)

ensure() is just a convenience for the more general when(), but it requires that unload() be defined on your object [4].

[1] http://mxr.mozilla.org/labs-central/source/jetpack-sdk/python-lib/cuddlefish/app-extension/components/harness.js#289

[2] http://mxr.mozilla.org/labs-central/source/jetpack-sdk/packages/jetpack-core/tests/test-xhr.js#32

[3] http://mxr.mozilla.org/labs-central/source/jetpack-sdk/packages/jetpack-core/lib/cuddlefish.js#78

[4] http://mxr.mozilla.org/labs-central/source/jetpack-sdk/packages/jetpack-core/lib/unload.js#36

> Okay, good to know. I thought that require() was parsed like .jsms, (or more
> precisely, once per extension). It would be good to have this explained
> somewhere, though I don't know what would be a good place to put this info.

Yeah...

> Yeah, it's definitely not a good user experience, but the purpose of this alert
> is mostly for debugging purposes, since unfortunately[?] alert is the de-facto
> way to debug JS. That's why I'm forcing the alert title to "Jetpack SDK"
> instead of letting it be customizable.
> 
> I could alias it to console.log, but the developer should be using it anyway.
> Completely disabling alert would be hard (the developer could put a same-origin
> iframe on the page and then grab that iframe's window.alert), so better having
> one that works well than forcing workarounds.

OK.

> > 
> > >+        runFunction, options.timeout || options.interval
> > >+      );
> > 
> > So if the user sets a timer, runFunction's return value is thrown away, but if
> > he doesn't, it's not?
> > 
> 
> If a timer is used, the function won't actually run, so there is nothing to
> return. The only way that the developer can get the result of that scheduled
> function is to use a callback.

I meant that when the timer fires and runs the user's callback, at that point its return value is discarded.  But if the callback is not on a timer, it's not discarded.  No big deal, but it's something that should be specified in the JEP.
(Assignee)

Comment 17

8 years ago
Created attachment 441949 [details] [diff] [review]
v3

Posting a new version to address adw's comments. Main changes include:
    - removed access to "private" _timerIds property by making it be accessed through the closure
    - using catchAndLog around user defined callbacks
    - removed possibility to override the |this| object inside public methods
    - using validateOptions and added tests to cover validation


outstanding questions about the API as currently implemented:

 - should page.unload() be a public method? and page.clearAllTimers() ?
 
 - page.run() return value: at the moment, page.run is able to return the value that the called function returns. This implies some sort of sync blocking between the running contexts. I talked to dietrich yesterday and said I was going to change it to return value through a callback, but I decided to wait to make this change because it would involve API changes (how to define this callback, for instance)

 - security overview: I'll ask mrbkap to make a security pass on the implementation, but the main design (which he and dbuc helped me plan) is the following:
  there should be no intrinsic security problems (obviously) (this is done by using the sandbox which automatically uses some wrappers), and it should be hard to mistakenly create a problem (done by things like breaking the closure of the callback function and running it in the page context).
 Still, AFAIU, to make it completely secure we would have to drop support to these and also not provide the page.document and page.window properties, but these changes would limit the API functionality.
Attachment #441077 - Attachment is obsolete: true

Comment 18

8 years ago
(In reply to comment #14)
> Also, it would be safer to check typeof(options.callback) === "function" rather
> than just for options.callback's existence.  (context-menu.js has a
> validateOptions() function that I'd like to pull out and you could use here,
> but it might be overkill since there's only one option.)

Yeah, we should totally move validateOptions() into a helper module during our refactoring sprint! :P

> >+  this._timerIds = {};
> 
> For the high-level APIs we really shouldn't define "private" members like this,
> since consumers have public access to them.  See my comment on Page.prototype
> below.

Note that this isn't yet a hard-and-fast rule, but is generally something that Brian Warner and I are keen on. We are open to feedback/criticism though!

> >+Page.prototype = {
> 
> This is subtle, but in the high-level APIs Atul has said that we should avoid
> |this| in public methods.  The reason is that since anyone can use |apply| or
> |call| on a public function, we can't be sure what |this| really refers to.  It
> could be a dangerous object that defines its own versions of our methods.  So
> instead of defining public methods that refer to |this| in a prototype, we
> should define them directly on |this| in the constructor (like you do with
> unload and the two getters) and replace references to |this| with variables
> scoped to the constructor.

Yeah, agreed--though I think that mrbkap might have been thinking about treating "this" specially in some funky way, it's just a lot simpler IMO if we avoid it altogether.

> Nit: All the Jetpack code I've seen (Atul's) omits braces on loops and
> conditionals whose bodies are single lines.  Personally I don't care so much,
> but just letting you know in case you want to do that.

Heh, I will admit that the only reason I do this is because my IDE (js2-mode for Emacs) auto-indents for me, meaning that it's watching my back to ensure that the correct # of braces are always there.  As evidence of this, consider the fact that I actually *always* include braces when I'm editing JS that's embedded in a .html file, because js2-mode doesn't work on those so I don't have it watching my back, and would rather stay safe than write more readable code (I hate braces).

So in other words, we can totally go either way on this and I'm open to either.

> Since you use timer a few times, it would be better to import it once and use
> that instance everywhere.  Every time you do a require(module), you cause that
> module to be parsed and loaded again.

Ack! No! I need to write up a guide on this in the SDK documentation...

It is subtly confusing because multiple instances of modules *can* exist, because sometimes there is a need for that.  But within the lifetime of your Jetpack-based extension, you can basically treat it like a .jsm--that is, the first time you require("foo"), foo.js is evaluated and its exports are returned, but every subsequent time just returns its pre-existing exports.

> Nit: My personal preference, but I think all high-level API error messages
> should be full sentences with punctuation, since we're trying to be really
> friendly to developers.  So a period at the end here.

I agree with this, we should actually write it down in our code style guidelines. Speaking of which, we should move those into the SDK docs... Also, because Python and (I think?) JS *don't* follow this rule, I think some of the existing code in the SDK doesn't either... but it totally makes sense, IMO.

> I'll file a bug to break out validateOptions since it should be helpful.

Awesome, thanks!

> Nit: That first line has an extra space at the end that should be removed. 
> Also, please include a space after the opening brace, a space before the
> closing brace, and a space before the ?.  Ideally I'd prefer:
> 
>       timerIds[timerName] = {
>         id: timerId,
>         type: options.timeout ? "Timeout" : "Interval"
>       };
> 
> But since Atul doesn't follow that style I can't hold you to it. :)

Lol! I like the style though, I should use it!

Comment 19

8 years ago
(In reply to comment #15)
> Hmm, I believe that a good developer might want to unload a page at some
> arbitrary point to save on memory (since it's a <xul:browser> which likely
> consumes a lot of resources).

Yeah, this is part of the reason I've made many of my .unload() methods public. It's also so that other objects which contain them can call .unload() on their contents when they're themselves unloaded.

I feel that this is very much analogous to a file's close() method: sure, you could remove the method and just tell developers to kill all references to it and let it close on GC--sometimes I rely on this behavior with my Python code--but this kind of explicitness can be helpful sometimes too.

> Jetpack question:
> using when(), will my unload be called when the object is GC'ed, or only on
> shutdown?

Great question--it's only called on shutdown, so this actually probably means that you want unload() to be a public method, since a Jetpack may only need a background page for a short amount of time.

Another alternative, I guess, is to have the main method of "unloading" a background page simply be to call its window.close() method. In any case, if the background page's window.close() method is called, I suspect it should have the same effect as calling the .unload() method.
(Assignee)

Comment 20

8 years ago
(In reply to comment #19)
> Another alternative, I guess, is to have the main method of "unloading" a
> background page simply be to call its window.close() method. In any case, if
> the background page's window.close() method is called, I suspect it should have
> the same effect as calling the .unload() method.

Now, here's something interesting..  I called page.window.close(), and what happens is that the hiddenDOMWindow ceases to exist.  This leads to Firefox crashing on exit..

Note that if ran from the webpage's context:

  page.run(function () window.close())

this doesn't happen.. It only happens through the browserElement.contentWindow.close method..  so perhaps we won't be able to expose browserElement.contentWindow and browserElement.contentDocument.  Or this is a platform bug? I'm not sure

Comment 21

8 years ago
Oh, that's funny--are you exposing the hidden DOM window to the client code, or an iframe within it?  In the Jetpack Prototype, we made an iframe in it, which also allowed us to have control over whether it was actually XUL or HTML (since on some platforms the hidden window is a XUL file and on others it's HTML).

Anyhow, I should actually read your patch before making any more comments... :P
(Assignee)

Comment 22

8 years ago
(In reply to comment #21)
> Oh, that's funny--are you exposing the hidden DOM window to the client code, or
> an iframe within it?  In the Jetpack Prototype, we made an iframe in it, which
> also allowed us to have control over whether it was actually XUL or HTML (since
> on some platforms the hidden window is a XUL file and on others it's HTML).
> 
> Anyhow, I should actually read your patch before making any more comments... :P

Not directly exposing, but the browser element exposes the contentWindow and contentDocument, which ends up related to the hidden DOM window somehow. But I don't know why the page's window.close() wouldn't cause that too..  Perhaps this is a case that was just never considered.

Now that you mentioned iframes I tried changing the <browser> to an <iframe>, but the same happens.
(Assignee)

Comment 23

8 years ago
Comment on attachment 441949 [details] [diff] [review]
v3

Now that most of the internal details are covered, I'm asking feedback from myk on how the API looks like (and please take a look on comment 17).

Meanwhile I'll be sorting out the last details (specially comment 20) and after those steps it can go to code review.
Attachment #441949 - Flags: feedback?(myk)
Attachment #441949 - Flags: feedback?(myk) → feedback-
Comment on attachment 441949 [details] [diff] [review]
v3

Sorry for the delay reviewing the API for this! I just went through it, and these are the issues I identified:

The "callback" option should be called "onReady" per callback naming guidelines (and because "ready" seems to be a common simplification of the DOMContentLoaded event), it should set a corresponding callback property Page.onReady, and it should behave the way other callback properties behave as described in the design guidelines <https://wiki.mozilla.org/Labs/Jetpack/Design_Guidelines>.


On the run method, the optional "args" parameter should be the second parameter, so the position of the "func" parameter doesn't change when an argument is provided for the "args" parameter. And it should be called something different, like "options", since the name "args" suggests it contains arguments to the function being run, whereas those arguments are actually contained within its "arguments" property.


I'm not completely sold on the value of the "reset" and "empty" methods, given that they wrap what would otherwise be one-liners, with "reset" being the equivalent of setting window.location, while "empty" is something like setting document.body.innerHTML to an empty string. And it's not clear how often they would be used.

I think we'd be better off observing usage before implementing these two methods. For the "reset" method, I'd like to see how common it is for addons to want to reset pages (as opposed to completely destroying the page) and what location they want to reset it to (about:blank, a data: URL, a bundled resource, a remote URL).

For the "empty" method, I'm interested in knowing how often addons want to empty the body specifically as opposed to other elements in the page. Given that jQuery's "empty" method can be applied to any DOM element, I suspect that usage may be more varied than can be satisfied by a single "empty" method that applies only to the body.


Speaking of which, Atul and I were talking recently about how to make it possible for addons to use JavaScript libraries ilke jQuery and MooTools to manipulate documents (page workers, tabs, panels, etc.). I think he's going to look into this further.


The mechanism by which pages are instantiated and destroyed follows the design guidelines, which specify the use of constructors, but adw and I have had recent conversations in which adw suggested (and I agreed) that it's a problem for simple constructors to have side-effects that persist outside the scopes of their instances.

For example, consider the following code:

  function foo() {
    let page = require("page-worker").Page();
  }
  foo();

A developer would expect the page instance to be GCed after the call to "foo" returns, whereas it actually sticks around (or at least the browser element in the hidden window does), and you have to call page.unload to remove it.

For constructors that have persistent side-effects, then (like the context-menu API <https://jetpack.mozillalabs.com/sdk/latest/docs/#module/jetpack-core/context-menu>, which adds XUL elements to the context menu), we decided that there should be an additional registration step that triggers the side-effect and makes the instance persistence obvious.

For context-menu, this takes the form of add/remove methods. We could do something like that here as well, i.e.:

function foo() {
  let pages = require("page-worker");
  let page = pages.add(new pages.Page());
}
foo();


And finally, two big picture issues:

First, since this API creates a frame containing potentially untrusted content, it should follow the content frame guidelines outlined in Brian's JEP 115 <https://wiki.mozilla.org/Labs/Jetpack/Reboot/JEP/115>. That complicates the API a bit, since it means specifying a frame with permissions, but that makes it possible to create pages in which script doesn't execute (which is safer when you don't actually need it to).

Second, I wonder about the implications of electrolysis on this API. Especially considering that the hidden window presumably lives in chrome, whereas the plan is that Jetpack-based addon code is going to live in its own process. We should get bsmedberg's input on this.


FWIW, regarding the name "page worker", we considered the name "background page", but we wanted to avoid confusing developers who are familiar with Chrome's API, since Chrome's background pages is very different from the functionality we are providing here (it's a singleton page that provides the main execution context for the addon, much like our "main" module, except that it's an entire HTML page rather than just a JS execution context).

I'm happy to entertain other names for this, though. Suggest away!


And regarding the creation of browser elements in the hidden window, that's the best way I know to go about it (modulo any electrolysis issues). And it looks like the hidden window is now XUL on all OSes (it used to be HTML on Windows and Linux), so you don't need to do anything special anymore to secure them via type="content".

However, they should probably be "iframe" elements rather than "browser" elements, since the API doesn't seem to need the additional capabilities that "browser" elements provide.
(Reporter)

Comment 25

8 years ago
One comment I have regarding the callback name of onReady is that it tends to suggest, as Myk mentioned, a function that is fired when the DOM ready state of the page is reached. In the Page Worker API, developers would likely be going to the page just as much after content is loaded into it to perform additional tasks as they would calling an entirely new page. 

Could we have an onReady that fires whenever the page is truly changed and perhaps some other callback more suggestive of "eval or exec instantly" for the other common case I described? I believe that having these event cases separated actually would provide further value and control to the developer.
(In reply to comment #25)
> Could we have an onReady that fires whenever the page is truly changed and
> perhaps some other callback more suggestive of "eval or exec instantly" for the
> other common case I described?

It's not clear what "truly changed" and "eval or exec instantly" mean here.  Currently, onReady is called when the DOMContentLoaded event is dispatched, which is the earliest moment at which the content of the page is likely to be useful for addon developers.

We can add callback properties for other events. We just need to be specific about which events they represent and the use cases for making them available.  Can you elaborate?

Comment 27

8 years ago
(In reply to comment #26)
> Currently, onReady is called when the DOMContentLoaded event is dispatched,
> which is the earliest moment at which the content of the page is likely to be
> useful for addon developers.

It's probably not very critical to this particular conversation, but I wanted to note that there's actually an earlier point at which the addon is likely to be useful to the content of the page (which is the reverse of what you said): when the 'window' object for the content has been created.

A big use case of Jetpack code is for addons to securely inject objects into content, which should be usable by content before its DOMContentLoaded event has fired, e.g. when an inline <script> tag refers to the injected object. This is covered by bug 526010, as there isn't currently an reliable way to do this in the platform.
(Reporter)

Comment 28

8 years ago
OK so to clarify, I agree that we should have an event called onReady.  This even would be set by the developer and would fired each time the page's DOMContentReady event occurs.  The other event that is crucial, is one that allows you to execute code sometime after DOMContentLoaded.  For instance, if I had a page which was already loaded and running and I wanted to drop in some code to be run instantly, we should enable that case and make it syntactically different from the onReady callback that was proposed to cover the DOMContentLoaded case - .run(), .execute(), something like that?

In the future, I could imagine having a more comprehensive palette of events from which to pick as the developer. Just as a strawman: Ability to fire an event when the page does an ajax request and then when the data is returned to also be able to act on those callbacks. Obviously this is far-future type ideas, and we should wait until we get feedback from developers on what additional needs they may have. 

Is that more clear?
(In reply to comment #28)
> The other event that is crucial, is one that
> allows you to execute code sometime after DOMContentLoaded.  For instance, if I
> had a page which was already loaded and running and I wanted to drop in some
> code to be run instantly, we should enable that case and make it syntactically
> different from the onReady callback that was proposed to cover the
> DOMContentLoaded case - .run(), .execute(), something like that?

In the spec and current implementation, there's a run method for executing code in the context of the page.  It sounds like that satisfies this use case.
(Assignee)

Comment 30

8 years ago
(In reply to comment #24)

> The "callback" option should be called "onReady" per callback naming guidelines
> (and because "ready" seems to be a common simplification of the
> DOMContentLoaded event), it should set a corresponding callback property
> Page.onReady, and it should behave the way other callback properties behave as
> described in the design guidelines
> <https://wiki.mozilla.org/Labs/Jetpack/Design_Guidelines>.

Right, will do! I liked this pattern. And I believe it covers the two main use cases for DOMContentLoaded. Whenever the DOMContentLoaded is raised the callback will be called, and if the developer just wants to use it on the first "page worker is ready" notification, he just deletes the callback.


> On the run method, the optional "args" parameter should be the second
> parameter, so the position of the "func" parameter doesn't change when an
> argument is provided for the "args" parameter. And it should be called
> something different, like "options", since the name "args" suggests it contains
> arguments to the function being run, whereas those arguments are actually
> contained within its "arguments" property.

Ok

> I'm not completely sold on the value of the "reset" and "empty" methods, given
> that they wrap what would otherwise be one-liners, with "reset" being the
> equivalent of setting window.location, while "empty" is something like setting
> document.body.innerHTML to an empty string. And it's not clear how often they
> would be used.
> 
> I think we'd be better off observing usage before implementing these two
> methods. For the "reset" method, I'd like to see how common it is for addons to
> want to reset pages (as opposed to completely destroying the page) and what
> location they want to reset it to (about:blank, a data: URL, a bundled
> resource, a remote URL).

Ok. I don't know what were the main use cases for these functions, and if Daniel thinks they're not highly important to be kept, then we could remove the empty and reset functions, and leave it to be implemented by the callers or used via jQuery.

> 
> 
> The mechanism by which pages are instantiated and destroyed follows the design
> guidelines, which specify the use of constructors, but adw and I have had
> recent conversations in which adw suggested (and I agreed) that it's a problem
> for simple constructors to have side-effects that persist outside the scopes of
> their instances.
> 
> For example, consider the following code:
> 
>   function foo() {
>     let page = require("page-worker").Page();
>   }
>   foo();
> 
> A developer would expect the page instance to be GCed after the call to "foo"
> returns, whereas it actually sticks around (or at least the browser element in
> the hidden window does), and you have to call page.unload to remove it.
> 
> For constructors that have persistent side-effects, then (like the context-menu
> API
> <https://jetpack.mozillalabs.com/sdk/latest/docs/#module/jetpack-core/context-menu>,
> which adds XUL elements to the context menu), we decided that there should be
> an additional registration step that triggers the side-effect and makes the
> instance persistence obvious.
> 
> For context-menu, this takes the form of add/remove methods. We could do
> something like that here as well, i.e.:
> 
> function foo() {
>   let pages = require("page-worker");
>   let page = pages.add(new pages.Page());
> }
> foo();

Hm I understand the concern, but (apart from making the API more complex) here's some food for thought about this change:

 - The "trigger side-effect on add" is possible (both on context-menu and here on page-worker) because they are a direct mapping to appendChild. It will likely be harder or more cumbersome for other APIs to maintain consistency with this.

 - what will impede the caller from just doing a new pages.Page() and use it without adding first (even if some things doesn't work)? What would be the defined behavior for that? throw an error, fail silently? The object would have to keep track if it was added or not to throw an "Please add me first" error (and probably with a fake private member so that the exported `add` method can switch it).

(also, the `pages.add` example above implies that it returns the object passed to it. context-menu doesn't do that atm, so it's a 3-step call)

- One possible alternative to avoid this would be to let the Page Worker work as is now, but have an extra property or function (I think it has to be a function by CommonJS standards) that returns all the existing page workers that haven't been unloaded.  If the developer read the docs (ok, we shouldn't count too much on that, but still...), it should be more obvious that if something can get back all page workers, they are not actually gone.

 
> And finally, two big picture issues:
> 
> First, since this API creates a frame containing potentially untrusted content,
> it should follow the content frame guidelines outlined in Brian's JEP 115
> <https://wiki.mozilla.org/Labs/Jetpack/Reboot/JEP/115>. That complicates the
> API a bit, since it means specifying a frame with permissions, but that makes
> it possible to create pages in which script doesn't execute (which is safer
> when you don't actually need it to).

Hmm ok. I'm not sure what this means for this API. Looking into JEP 115 it seems very similar to this one, except that the main use cases on JEP 115 are to be rendered on a Panel or statusbar (for example), while this one would be rendered invisibly.  Should we block this one on JEP 115 and let it be just a wrapper for it on a hidden window, as the Panel will be a wrapper for it too?

Anyhow, perhaps they don't need to be similar. Having the iframe type="content" shouldn't make this any more dangerous than any regular webpage, unless the developer does something really bad on page.run like leaking a privileged `eval` to the content (via the bind option), in which case we're out of luck anyway :)

Still, if JEP 115 will probably cover everything here, there's no need to duplicate the implementations. Is there a bug where it is being tracked?

> 
> Second, I wonder about the implications of electrolysis on this API. Especially
> considering that the hidden window presumably lives in chrome, whereas the plan
> is that Jetpack-based addon code is going to live in its own process. We should
> get bsmedberg's input on this.

Yeah, definitely.


> FWIW, regarding the name "page worker", we considered the name "background
> page", but we wanted to avoid confusing developers who are familiar with
> Chrome's API, since Chrome's background pages is very different from the
> functionality we are providing here (it's a singleton page that provides the
> main execution context for the addon, much like our "main" module, except that
> it's an entire HTML page rather than just a JS execution context).
> 
> I'm happy to entertain other names for this, though. Suggest away!

I've grown more comfortable with the name "page worker", so I'm fine with it now :)

> And regarding the creation of browser elements in the hidden window, that's the
> best way I know to go about it (modulo any electrolysis issues). And it looks
> like the hidden window is now XUL on all OSes (it used to be HTML on Windows
> and Linux), so you don't need to do anything special anymore to secure them via
> type="content".
> 
> However, they should probably be "iframe" elements rather than "browser"
> elements, since the API doesn't seem to need the additional capabilities that
> "browser" elements provide.

Glad that you mentioned this. I actually did some experiments this weekend and it appears that the hiddenWindow is still different on OSes (I never thought they would be!). I changed it to iframe and almost everything works correctly, except for a difference between windows/mac on one test case. I'll ask more about this to you and atul on IRC to get ideas from your previous experience with this.
(Reporter)

Comment 31

8 years ago
Just FYI, the only reason I proposed the idea of .empty() and .reset() was to provide shortcuts to these actions in the absence of a library that provided similar equivalents, as jQuery and MooTools do. If libraries like these are available then such methods aren't really important :)
(Assignee)

Comment 32

8 years ago
> Second, I wonder about the implications of electrolysis on this API. Especially
> considering that the hidden window presumably lives in chrome, whereas the plan
> is that Jetpack-based addon code is going to live in its own process. We should
> get bsmedberg's input on this.

Here's a quick summary for Benjamin about how this API works right now: code running on the jetpack context calls the page-worker constructor, which creates an <iframe type="content"> element and appends it to the hiddenDOMWindow.

Then, the API has a method called `run` which takes a function argument and is supposed to run the given function on the iframe's context (as if it was a content function) and return its value to the caller on the jetpack context 
This method also takes extra arguments to bind or pass parameters to the function, which can be objects from the jetpack context [*].

It is done by creating a Sandbox (using the iframe's window as principal), setting the window as a global variable on the sandbox, and then calling evalInSandbox with the given function uneval'ed.  The returned value from evalInSandbox is then returned to the caller [which means that evalInSandbox waits for the function to complete. will that still be the case with e10s?].


Benjamin, can you give some input about this approach, and how would it need to be changed when Jetpack becomes electrolysis-powered?  I've taken a look on the wiki page (https://wiki.mozilla.org/Electrolysis/Jetpack), but I'm not aware of all the details that electrolysis involve. It looks that CPOWs plus message passing might cover it, but there's at least an alert on the wiki about not having chrome->content CPOWs at the moment (which might affect [*]), and also other things that might not be described there and I don't know about.

Comment 33

8 years ago
This is basically the same as "the pagemods problem". There is ongoing discussion in the jetpack mailing list about how we want to solve that problem.

I really don't think we should *ever* use an API which unevals a function. It's just completely unexpected behavior for functions to be serialized/deserialized like that. Nickolay has two alternative behaviors that we need to decide on.
(Assignee)

Comment 34

8 years ago
(In reply to comment #27)
> A big use case of Jetpack code is for addons to securely inject objects into
> content, which should be usable by content before its DOMContentLoaded event
> has fired, e.g. when an inline <script> tag refers to the injected object. This
> is covered by bug 526010, as there isn't currently an reliable way to do this
> in the platform.

Atul, about this that you posted, take a look at bug 549539. I'm not sure but it looks like what you wanted was done in that bug!
(Assignee)

Comment 35

8 years ago
(In reply to comment #33)
> This is basically the same as "the pagemods problem". There is ongoing
> discussion in the jetpack mailing list about how we want to solve that problem.
> 
> I really don't think we should *ever* use an API which unevals a function. It's
> just completely unexpected behavior for functions to be serialized/deserialized
> like that. Nickolay has two alternative behaviors that we need to decide on.

The uneval is used here to break the original closure of the function and be able to run it through evalInSandbox. Personally I don't think it's very uncommon.. It's the same behavior as other things that handle scope and load script from textual data, such as JSSubScriptLoader and Web Workers.

I took a look at Nickolay current implementation on github and saw that the function is given a wrappedWindow as an argument (instead of injecting it into its globals), and it is called from chrome context, whereas here it's inside a sandbox.
(I don't know what would be the other alternative behavior yet, trying to find it on the posts and wiki)

Comment 36

8 years ago
neither JSSubScriptLoader or Web Workers take JS functions: they take a *uri* of a script to load. I'm fine with that. I'm just not fine with taking a JS function in one context and uneval/evaling it in a different context.

Comment 37

8 years ago
(In reply to comment #30)
> > The mechanism by which pages are instantiated and destroyed follows the design
> > guidelines, which specify the use of constructors, but adw and I have had
> > recent conversations in which adw suggested (and I agreed) that it's a problem
> > for simple constructors to have side-effects that persist outside the scopes of
> > their instances.

> - One possible alternative to avoid this would be to let the Page Worker work
> as is now, but have an extra property or function (I think it has to be a
> function by CommonJS standards) that returns all the existing page workers that
> haven't been unloaded.

Do clients really need more than one Page Worker?  (And do we want them to be able to create an unlimited number?)  IMO an extension should get a singleton, which would mean no constructor, no add(), but something like an init() instead.

Comment 38

8 years ago
Yes, I expect they may need to a basically unlimited number of them, e.g. in the extreme case they might need one per-tab.
(Assignee)

Comment 39

8 years ago
Created attachment 444769 [details] [diff] [review]
v4

New patch. This one works around the hiddenWindow differences on platforms and do what's necessary on each platform to host a secure <xul:iframe type="content">.

Also on this patch:
 - using api-utils
 - moved to the pages.add(new pages.Page()) style. It works like this:
    pages.Page() just parses the options and preemptively sets up the container frame if needed by the platform. It returns an almost blank object but it's already the correct Page Worker instance.
    pages.add() takes that object and creates the page frame, appendChilds it and attaches the public API to the object.
    This means that the caller can get a reference either on pages.Page() or pages.add(), since it will be the same object. And any function that potentially would fail or just don't make sense before initialization aren't defined until pages.add is called. The onReady event will only be called after pages.add, and it won't be called immediately.

- a new option, allow : { script: true } is required to let JavaScript run on the page. It sets the docShell.allowJavascript property for now but in the future will just pass the object down to the container-frame constructor.
(I'm assuming other properties like allow images, XHR will default to true, is that correct?)


Pending details: decide on the behavior of page.run and make sure the overall API is future-proof for e10s.

Gory details: since we can't ship a chrome:// file, and we need one, I had to find a suitable existing file to use. The one it's using right now is the most lightweight I could find: no stringbundles, no scripts
(Assignee)

Updated

8 years ago
Attachment #441949 - Attachment is obsolete: true
(Assignee)

Comment 40

8 years ago
Created attachment 445021 [details] [diff] [review]
v5

Updated the docs to reflect latest changes, and fixed an 3.6 only wrapper issue. Also improved the detection and test running when the hidden window service is not available (I'm testing this instead of testing for app versions.. is this the prefered approach?)
Attachment #444769 - Attachment is obsolete: true
(Assignee)

Comment 41

8 years ago
Created attachment 445419 [details] [diff] [review]
Page Worker

This is the same as the previous version of the Page Worker, minus the features related to the page.run method. We chose not to expose the page.run API until we have a better sense on how its behavior should be, and this still holds value since everything can still be done with this version of page worker, if only in a slightly different way.
Attachment #445021 - Attachment is obsolete: true
Attachment #445419 - Flags: review?(myk)
Comment on attachment 445419 [details] [diff] [review]
Page Worker

>  - moved to the pages.add(new pages.Page()) style. It works like this:
>     pages.Page() just parses the options and preemptively sets up the container
> frame if needed by the platform. It returns an almost blank object but it's
> already the correct Page Worker instance.
>     pages.add() takes that object and creates the page frame, appendChilds it
> and attaches the public API to the object.
>     This means that the caller can get a reference either on pages.Page() or
> pages.add(), since it will be the same object. And any function that
> potentially would fail or just don't make sense before initialization aren't
> defined until pages.add is called. The onReady event will only be called after
> pages.add, and it won't be called immediately.

This is great.  Among other things, it means that a caller can construct and add in one step:

  let pages = require("page-worker");
  let page = pages.add(pages.Page({ ... }));

That somewhat mitigates the pain of having the two-step process here.


> - a new option, allow : { script: true } is required to let JavaScript run on
> the page. It sets the docShell.allowJavascript property for now but in the
> future will just pass the object down to the container-frame constructor.
> (I'm assuming other properties like allow images, XHR will default to true, is
> that correct?)

That depends on the property.  If we were to implement an XHR option, it would probably default to true, since XHR is currently possible, and we wouldn't want to break existing consumers when adding the property.

But if we were to implement a cross-domain XHR option, it would probably default to false, since cross-domain XHR isn't currently possible.  Ditto for geolocation.


> Gory details: since we can't ship a chrome:// file, and we need one, I had to
> find a suitable existing file to use. The one it's using right now is the most
> lightweight I could find: no stringbundles, no scripts

Sounds good.  Just make sure to file a bug requesting the ability to load pages in the background without having to resort to this hack.  And make sure to specify in the bug description that the solution should work on Fennec as well (where I suspect there is no hidden window, in which case it's not as simple as making sure the hidden window is of a suitable type).


> Updated the docs to reflect latest changes, and fixed an 3.6 only wrapper
> issue. Also improved the detection and test running when the hidden window
> service is not available (I'm testing this instead of testing for app
> versions.. is this the prefered approach?)

This seems like a reasonable approach.  One potential issue with it is that Fennic doesn't have crash-submit-form.xhtml, which is also a dependency (Thunderbird does, though, and presumably XULRunner does too, since it's in toolkit).  But if Fennec also doesn't have a hidden window, then that's not a problem.


>diff --git a/packages/jetpack-core/docs/page-worker.md b/packages/jetpack-core/docs/page-worker.md

>+The `page-worker` module provides a way to create a permanent, invisible
>+background page, access its DOM and run JavaScript code in the page's context.

The last bit of this sentence is still true, but we don't currently provide a special API for doing so, so we should probably omit it and just say something like this for now:

The `page-worker` module provides a way to create a permanent, invisible
background page and access its DOM.


>+## Initialization ##
>+
>+<code>`require`(*"page-worker"*).**Page**(*options*)</code>
>+
>+Creates an unitialized Page Worker instance.

Nit: unitialized -> uninitialized


>+The parameter options is optional, and if given it should be an object
>+with any the following keys:

Nit: any the -> any of the


>+    <td><code>onReady</code></td>
>+    <td>
>+      A function callback that will be called at any time the DOM on the
>+      page is ready. This can be used to know when your Page Worker instance
>+      is ready to be used, and also whenever the webpage on it is changed.

When I read "and also whenever the webpage on it is changed," my first thought is that it is describing a DOM mutation callback, which is of course incorrect.  To reduce that potential for confusion, it would be better to say something like "and also whenever the page is reloaded or another page is loaded in its place."


>+<code>`require`(*"page-worker"*).**add**(*Page Worker*)</code>
>+
>+Initialize the given Page Worker instance. You'll only be able to use its
>+features after calling this function, which will define its properties
>+and function as described on the API reference below.

Nit: and function -> and functions (even though there's only one, as the conventional section title is Functions)

Nit: on the API reference -> in the API reference


>+## Examples ##
>+
>+### Example - Get all header titles from an wikipedia article ###

Nit: an wikipedia -> a Wikipedia


>+    function getTitles(page) {
>+      var titles = [];
>+      var elements = page.document.querySelectorAll("h2 > span");
>+      for (var i in elements) {
>+        var title = elements[i].firstChild.nodeValue;
>+        titles.push(title);
>+      }
>+      return titles;
>+    });

Returning the titles won't have any effect, since the caller is just the module.  It would be better to provide an example that either does something noticeable with the titles, f.e.:

  console.log("Wikipedia Internet article titles are: " + titles.join(", "));

or has a placeholder comment:

  /* do something interesting with the titles */


>diff --git a/packages/jetpack-core/lib/page-worker.js b/packages/jetpack-core/lib/page-worker.js

>+let hostFrame, hiddenWindow, isHostFrameReady = false;
...
>+if (hiddenWindow.location.protocol == "chrome:") {
>+  // We can use the hidden window itself to host our iframes
>+  hostFrame = hiddenWindow;
>+  hostFrame.contentDocument = hiddenWindow.document;
>+  setHostFrameReady();
>+}
>+
>+function setHostFrameReady() {
>+  isHostFrameReady = true;
>+  hostFrame.removeEventListener("DOMContentLoaded", setHostFrameReady, false);
>+}

On the general principle that we should tread as lightly as possible on the hidden window, it would be better for this code to not set its contentDocument property.  Instead, define a hostDocument global variable that references the hostFrame's document, i.e.:

  let hostFrame, hostDocument, hiddenWindow, isHostFrameReady = false;
  ...
  if (hiddenWindow.location.protocol == "chrome:") {
    // We can use the hidden window itself to host our iframes
    hostFrame = hiddenWindow;
    hostDocument = hostFrame.document;
    isHostFrameReady = true;
  }
  
  function setHostFrameReady() {
    isHostFrameReady = true;
    hostDocument = hostFrame.contentDocument;
    hostFrame.removeEventListener("DOMContentLoaded", setHostFrameReady, false);
  }


Also, since the |hiddenWindow.location.protocol == "chrome:"| code path only shares one line of code (isHostFrameReady = true) with the alternative code path, it would be better for it to set isHostFrameReady itself than to call setHostFrameReady, since setHostFrameReady tries to remove an event listener that the code path never added.

Also, it's not just that the hidden window uses the chrome: protocol <chrome://browser/content/hiddenWindow.xul> on Mac and the resource: protocol <resource://gre-resources/hiddenWindow.html> on Windows and Linux.

It is also that on Mac it loads a XUL document, in which it is possible to create XUL iframe elements with type="content" security restrictions, while on the other OSes it loads an HTML document, in which it is not possible to create such elements (because HTML doesn't support namespaces).

So the conditional should check the content type (i.e. hiddenWindow.document.contentType), which is application/vnd.mozilla.xul+xml on Mac and text/html on Windows and Linux (XHTML documents of the application/xhtml+xml content type also support namespaces, which is why the crash-submit-form.xhtml hack works, but none of the target OSes use an XHTML document for the hidden window).

Also, it would be good to have a comment here explaining that the host frame will be lazily loaded, as it isn't clear from context how the hostFrame gets defined when the hidden window isn't suitable for it.

Finally, it's worth considering whether it makes sense to have both code paths in the first place.  Are there any significant advantages to using the hidden window directly on Mac?  If not, this code would be simpler if it just used a host frame inside the hidden window on all three platforms.


>+  if (options.onReady) {
>+    this.onReady = options.onReady;
>+  }

onReady should be a Collection.


>+  if (!hostFrame) {
>+    hostFrame = hiddenWindow.document.createElement("iframe");
>+    hostFrame.setAttribute("type", "content");

It isn't necessary to set the "type" attribute of the host frame to "content" here, as setting the type of the page's iframe is sufficient to establish the chrome/content boundary.  FWIW, setting type="content" here it also doesn't have any effect, since the host frame is an HTML element, which doesn't honor the "type" attribute (only XUL iframe elements take that attribute into consideration).


>+    // ugly ugly hack. This is the most lightweight chrome:// file I could find on the tree
>+    hostFrame.setAttribute("src", "chrome://global/content/crash-submit-form.xhtml");
>+    hostFrame.addEventListener("DOMContentLoaded", setHostFrameReady, false);
>+    hiddenWindow.document.body.appendChild(hostFrame);

In the comment, make sure to reference the bug requesting the platform fix that makes it possible to remove this hack.


>+  /* Private helper function */
>+  function onReadyListener() {
>+    if (self.onReady) {
>+      errors.catchAndLog(self.onReady).call(self);
>+    }
>+  }

The onReadyListener function should differentiate between DOMContentLoaded events on the page itself and those on subframes, with only the former being passed to the consumer, by comparing the target of the event with the browser's contentDocument, i.e.:

  /* Private helper function */
  function onReadyListener(event) {
    if (event.target == browser.contentDocument && self.onReady) {
      errors.catchAndLog(self.onReady).call(self);
    }
  }


>+  require("unload").ensure(self);

For consistency with the way context-menu uses add/remove, both defined on the singleton, this should use add/remove on the singleton here too instead of add on the singleton and unload on the instance.
Attachment #445419 - Flags: review?(myk) → review-
Note that this simple addon works on Mac OS X but fails on Linux with "TypeError: browser is null" at page-worker.js line 173 |self.__defineGetter__("window", function () browser.contentWindow);|:

  let pages = require("page-worker");
  exports.main = function() {
    let page = pages.add(pages.Page());
    page.window.location = "data:text/html,<html><body></body></html>";
  }


I have confirmed, however, that I am able to load a Google Map into an HTML page bundled with an addon using the Google Maps API.
(Assignee)

Comment 44

8 years ago
(In reply to comment #43)
> Note that this simple addon works on Mac OS X but fails on Linux with
> "TypeError: browser is null" at page-worker.js line 173
> |self.__defineGetter__("window", function () browser.contentWindow);|:
> 
>   let pages = require("page-worker");
>   exports.main = function() {
>     let page = pages.add(pages.Page());
>     page.window.location = "data:text/html,<html><body></body></html>";
>   }
> 
> 
> I have confirmed, however, that I am able to load a Google Map into an HTML
> page bundled with an addon using the Google Maps API.


Alright, I debugged this on Linux today, and the problem is two-fold there, but easily solvable.
The main problem is that some of the Firefox versions shipped by the distros do not include the crash-reporter, so the crash-submit-form.xhtml doesn't exist there. I'm thinking of changing it to the aboutRobots.xhtml page, unless someone know of a better host page (aboutRobots is definitely bigger than crash-submit-form but small in comparison to everything else I was able to find using the Chrome List add-on).

The other problem is that when run in xulrunner mode (actually this may not be linux specific), the hidden window apparently exists but is about:blank, which won't allow to have a chrome frame inside it. We could theoretically change the hidden window address but this is too much of a stretch IMO (the app could be already using it for something else), so I think the best for now would be to check for Firefox (or Thunderbird) instead of checking for a hidden window.
(In reply to comment #44)
> Alright, I debugged this on Linux today, and the problem is two-fold there, but
> easily solvable.

I think there's actually a third fold, as I still see the problem when testing on my own installations (from a tarball) of Firefox 3.6 and the latest trunk nightly via the following commands:

  cfx run -a firefox --binary ~/Applications/firefox-3.7-nightly/firefox
  cfx run -a firefox --binary ~/Applications/firefox-3.6-release/firefox


Here's a full stacktrace:

error: An exception occurred.
Traceback (most recent call last):
  File "chrome://browser/content/browser.js", line 1163, in delayedStartup
    ss.init(window);
  File "", line 0, in init
  File "file:///home/myk/Applications/firefox-3.6-release/components/nsSessionStore.js", line 308, in sss_init
    this.onLoad(aWindow);
  File "file:///home/myk/Applications/firefox-3.6-release/components/nsSessionStore.js", line 618, in sss_onLoad
    this._observerService.notifyObservers(null, NOTIFY_WINDOWS_RESTORED, "");
  File "", line 0, in notifyObservers
  File "file:///tmp/tmpGhvSBD.mozrunner/extensions/xulapp@toolness.com/components/harness.js", line 335, in Harness_observe
    this.load();
  File "file:///tmp/tmpGhvSBD.mozrunner/extensions/xulapp@toolness.com/components/harness.js", line 282, in Harness_load
    program.main(options, {quit: quit, print: dump});
  File "resource://test-page-worker-test-page-worker-lib/main.js", line 4, in 
    page.window.location = "data:text/html,<html><body></body></html>";
  File "resource://test-page-worker-jetpack-core-lib/page-worker.js", line 173, in 
    self.__defineGetter__("window", function () browser.contentWindow);
TypeError: browser is null
FAIL
(Assignee)

Comment 46

8 years ago
Aha, about:mozilla (chrome://global/content/mozilla.xhtml) is even smaller, and I verified it's present on Thunderbird and on the FF version shipped on Ubuntu.
(Assignee)

Comment 47

8 years ago
> TypeError: browser is null

Hmm, the error manifests itself there, but the stacktrace isn't much help because browser being null means there was some problem on the module setup. Do you know if the crash-submit-form is present in this build? (maybe it's off-by-default on linux).  Could you try changing the page to chrome://global/content/mozilla.xhtml ?

If it still doesn't work, a stacktrace of the test suite might be more useful to understand what's going on.
(Assignee)

Comment 48

8 years ago
Wait, I got the error now
Felipe: perhaps it's because the addon tries to access browser.window before the hostFrame is loaded?
(Assignee)

Comment 50

8 years ago
Myk: yeah exactly. I fixed a similar condition when two pages are started almost simultaneously. It doesn't happen on OS X because there's no need to set up the host frame, and it didn't happen on the test suite because it's only on the first page worker created.

I think that the addon shouldn't try to access page.window or page.document before receiving an onReady event, though it's an unfortunate burden to use a callback because it's needed for a first setup only.

Do you think we should just enforce using the onReady event for this, or create a new one (onStart perhaps)?  Using onReady is bad because if the developer doesn't check that the onReady came from about:blank, he might easily create an infinite loop:

//infinite loop since href changes trigger onReady
let page = pages.page({onReady: function() {
  page.window.href = "http://www.example.com"
}});

Although of course it's an easy to catch error
(In reply to comment #50)
> Do you think we should just enforce using the onReady event for this, or create
> a new one (onStart perhaps)?

Let's instead allow the developer to specify the initial content for the page on construction via a "content" option that can contain either a string of HTML (in which case we load it via a data: URL) or a URL (in which case we load the URL).

See the widget patch (bug 543585) for an example of how this can be done (although note that it'll change in 0.5, when we'll require the use of a URL constructor to specify URLs).
(Assignee)

Comment 52

8 years ago
(In reply to comment #51)
> (In reply to comment #50)
> > Do you think we should just enforce using the onReady event for this, or create
> > a new one (onStart perhaps)?
> 
> Let's instead allow the developer to specify the initial content for the page
> on construction via a "content" option that can contain either a string of HTML
> (in which case we load it via a data: URL) or a URL (in which case we load the
> URL).
> 
> See the widget patch (bug 543585) for an example of how this can be done
> (although note that it'll change in 0.5, when we'll require the use of a URL
> constructor to specify URLs).

Awesome idea, that will work perfectly. I'll do this, and will also move the definitions of page.document and page.window to only when they're in fact usable, so it can't be used while browser is null
(Assignee)

Comment 53

8 years ago
Created attachment 445898 [details] [diff] [review]
Page Worker - v2

OK! New version, with all the comments and issues covered! Is this *the* Page Worker? 

All the changes from the previous patch:

> > Gory details: since we can't ship a chrome:// file, and we need one, I had to
> > find a suitable existing file to use. The one it's using right now is the most
> > lightweight I could find: no stringbundles, no scripts
> 
> Sounds good.  Just make sure to file a bug requesting the ability to load pages
> in the background without having to resort to this hack.  And make sure to
> specify in the bug description that the solution should work on Fennec as well
> (where I suspect there is no hidden window, in which case it's not as simple as
> making sure the hidden window is of a suitable type).

Did this. the bug is 565388. Also added a comment citing the bug number. FWIW, I think support for Fennec will be less troublesome than for FF since it already operates with hidden browsers.

> >diff --git a/packages/jetpack-core/docs/page-worker.md b/packages/jetpack-core/docs/page-worker.md
> 
Fixed all the documentation nits. Also improved example by printing out the titles from the Wikipedia page.


> >diff --git a/packages/jetpack-core/lib/page-worker.js b/packages/jetpack-core/lib/page-worker.js
> 
> >+let hostFrame, hiddenWindow, isHostFrameReady = false;
> ...
> >+if (hiddenWindow.location.protocol == "chrome:") {
> >+  // We can use the hidden window itself to host our iframes
> >+  hostFrame = hiddenWindow;
> >+  hostFrame.contentDocument = hiddenWindow.document;
> >+  setHostFrameReady();
> >+}
> >+
> >+function setHostFrameReady() {
> >+  isHostFrameReady = true;
> >+  hostFrame.removeEventListener("DOMContentLoaded", setHostFrameReady, false);
> >+}
> 
> On the general principle that we should tread as lightly as possible on the
> hidden window, it would be better for this code to not set its contentDocument
> property.  Instead, define a hostDocument global variable that references the
> hostFrame's document, i.e.:

done

> Also, since the |hiddenWindow.location.protocol == "chrome:"| code path only
> shares one line of code (isHostFrameReady = true) with the alternative code
> path, it would be better for it to set isHostFrameReady itself than to call
> setHostFrameReady, since setHostFrameReady tries to remove an event listener
> that the code path never added.

done

> Also, it's not just that the hidden window uses the chrome: protocol
> <chrome://browser/content/hiddenWindow.xul> on Mac and the resource: protocol
> <resource://gre-resources/hiddenWindow.html> on Windows and Linux.
> 
> It is also that on Mac it loads a XUL document, in which it is possible to
> create XUL iframe elements with type="content" security restrictions, while on
> the other OSes it loads an HTML document, in which it is not possible to create
> such elements (because HTML doesn't support namespaces).

Fixed this. I'm checking for chrome:// protocol and xul or xhtml content-type.
As you mentioned, it's not needed to at the moment check for xhtml here but I added it since it's future-proof and more correct
 
> Also, it would be good to have a comment here explaining that the host frame
> will be lazily loaded, as it isn't clear from context how the hostFrame gets
> defined when the hidden window isn't suitable for it.

ok

> Finally, it's worth considering whether it makes sense to have both code paths
> in the first place.  Are there any significant advantages to using the hidden
> window directly on Mac?  If not, this code would be simpler if it just used a
> host frame inside the hidden window on all three platforms.

There are no significant advantages except that page worker start-up on Mac becomes a tad faster. I was gonna remove the different path but very likely similar code will be needed to support Fennec or other scenarios, so I thought there was no real reason to remove it now since it's working fine.

> >+  if (options.onReady) {
> >+    this.onReady = options.onReady;
> >+  }
> 
> onReady should be a Collection.

Made it a collection, updated docs and options validation, and added tests to cover onReady being defined as an array of functions


> >+  if (!hostFrame) {
> >+    hostFrame = hiddenWindow.document.createElement("iframe");
> >+    hostFrame.setAttribute("type", "content");
> 
> It isn't necessary to set the "type" attribute of the host frame to "content"
> here, as setting the type of the page's iframe is sufficient to establish the
> chrome/content boundary.  FWIW, setting type="content" here it also doesn't
> have any effect, since the host frame is an HTML element, which doesn't honor
> the "type" attribute (only XUL iframe elements take that attribute into
> consideration).

That's right, I removed the "type" attribute now.


> >+  /* Private helper function */
> >+  function onReadyListener() {
> >+    if (self.onReady) {
> >+      errors.catchAndLog(self.onReady).call(self);
> >+    }
> >+  }
> 
> The onReadyListener function should differentiate between DOMContentLoaded
> events on the page itself and those on subframes, with only the former being
> passed to the consumer, by comparing the target of the event with the browser's
> contentDocument, i.e.:

done! good idea

> For consistency with the way context-menu uses add/remove, both defined on the
> singleton, this should use add/remove on the singleton here too instead of add
> on the singleton and unload on the instance.

I did this using the approach we discussed on IRC. A cache object present on the module scope is used to pass info between the functions without exposing them publicly. This also made it possible to not need to pass options.allow manually from constructor to pages.add.

Turns out it was still possible to use unload.ensure. I just defined the page worker unload function on the cache info object. Worked very well.


(In reply to comment #51)
> Let's instead allow the developer to specify the initial content for the page
> on construction via a "content" option that can contain either a string of HTML
> (in which case we load it via a data: URL) or a URL (in which case we load the
> URL).
> 
> See the widget patch (bug 543585) for an example of how this can be done
> (although note that it'll change in 0.5, when we'll require the use of a URL
> constructor to specify URLs).

I did this, really liked it!  It simplified the tests and examples, page worker setup for the user, and fixed the error you reported. All in all, great win. The wikipedia example basically became:

pages.add(pages.Page({
  content: "http://en.wikipedia.org/wiki/Internet",
  onReady: getTitles
}}));

and it doesn't need to check for the correct onReady anymore since this saves the step of starting as about:blank before defining the real URL
Attachment #445419 - Attachment is obsolete: true
Attachment #445898 - Flags: review?(myk)
(Assignee)

Comment 54

8 years ago
myk, just as an extra, I think it'd be nice to have this +1 change to the patch:

   function onReadyListener(event) {
     if (event.target == browser.contentDocument && self.onReady) {
       for each (callback in self.onReady.__iterator__())
-        errors.catchAndLog(callback).call(self);
+        errors.catchAndLog(callback).call(self, self);
     }
   }

So the onReady callback can receive the page worker as the first argument when called. I used this on the example in the docs, I think it is cleaner than using `this`.  (The caller would have both options)
Comment on attachment 445898 [details] [diff] [review]
Page Worker - v2

> OK! New version, with all the comments and issues covered! Is this *the* Page
> Worker? 

Zarro boogs? ;-)

In any case, this is getting close!  There are just a few more issues (and a couple nits).


> myk, just as an extra, I think it'd be nice to have this +1 change to the
> patch:
> 
>    function onReadyListener(event) {
>      if (event.target == browser.contentDocument && self.onReady) {
>        for each (callback in self.onReady.__iterator__())
> -        errors.catchAndLog(callback).call(self);
> +        errors.catchAndLog(callback).call(self, self);
>      }
>    }
> 
> So the onReady callback can receive the page worker as the first argument when
> called. I used this on the example in the docs, I think it is cleaner than
> using `this`.  (The caller would have both options)

I would prefer to provide only a single way for an event callback to access the page, for which the emerging convention is to use "this" to refer to the object on which the callback is defined (whether that object is a singleton or, as in this case, an instance), as implied by the JEP implementation guidelines <https://wiki.mozilla.org/Labs/Jetpack/So_You%27re_Implementing_a_JEP#this>.


>diff --git a/packages/jetpack-core/docs/page-worker.md b/packages/jetpack-core/docs/page-worker.md

>+The `page-worker` module provides a way to create a permanent, invisible
>+background page and access its DOM.

Nit: it would be useful to avoid the use of the phrase "background page" here and in the rest of the document, referring to these pages as simply "pages", as Chrome extensions currently use the phrase "background page" for a related but different API, and we should avoid conflating this API with that one.


>+The parameter options is optional, and if given it should be an object
>+with any of the following keys:

Nit: parameter options -> `options` parameter


>+      A function callback or an array of functions that will be called at any
>+      time the DOM on the page is ready. This can be used to know when your
>+      Page Worker instance is ready to be used, and also whenever the page is
>+      reloaded or another page is loaded in its place.

Nit: at any time -> when


>+    var page = PageWorker.Page({
>+      content: "http://en.wikipedia.org/wiki/Internet",
>+      onReady: printTitles
>+    });
>+    PageWorker.add(page);
>+
>+And define the function to print the titles:
>+
>+    function getTitles(page) {
>+      var elements = page.document.querySelectorAll("h2 > span");
>+      for (var i = 0; i < elements.length; i++) {
>+        console.log(elements[i].textContent + "\n");
>+      }
>+    });

There's a trailing ");" at the end of this function declaration that causes a syntax error when it is cut/pasted into a script.

Also, this function is named getTitles, but the Page constructor's onReady property references printTitles.

And the function relies on receiving a "page" argument that doesn't exist in the patch instead of "this".

Finally, it isn't necessary to append "\n" to console.log messages, as that is done by the console logging module automatically.


>diff --git a/packages/jetpack-core/lib/page-worker.js b/packages/jetpack-core/lib/page-worker.js

>+function findInCache(pageWorker) {
>+  for (let i in pageWorkerCache)
>+    if (pageWorkerCache[i].ref === pageWorker)
>+      return i;
>+  return -1;
>+}

Nit: for.. in on arrays is contraindicated in favor of either for each.. in (when you don't care about the index) or for(;;) (when you do).


>+  pageWorkerCache.push({
>+    ref: this,
>+    allow: options.allow || {},
>+    content: options.content || ""
>+  });

The page mustn't be cached before it gets added, because it needs to be garbage-collectable if it falls out of scope unless it has been added.

That's the key difference between a page that has been added and one that hasn't: added pages can have other references (and resources) than the ones consumers hold to them, causing them to stick around when the consumers' references fall out of scope, while non-added (or removed) pages no longer have any such hanging references.  Otherwise, the page is more likely to leak.

So instead of caching it here, cache it in the add function, passing the unload function and browser reference from startPageWorker back to the add function so they can be cached for access by the remove function and other code that needs access to them, i.e. something like:

  exports.add = function JP_SDK_Page_Worker_add(pageWorker) {
    ...

    var [unload, browser] = startPageWorker(pageWorker);
    pageWorkerCache.push({
      pageWorker: pageWorker,
      unload: unload,
      browser: browser
    });

Of course you still need to provide access to the values of allow and content.  Those are best implemented as getter/setters in the constructor, i.e.:

  this.__defineGetter__("allow", function() options.allow);
  this.__defineSetter__("allow", function(newVal) {
    options.allow = newVal;
    /* if the page has been added, update the value of browser.docShell.allowJavascript */
  });
  this.__defineGetter__("content", function() options.content);
  this.__defineSetter__("content", function(newVal) {
    options.content = newVal;
    /* if the page has been added, load the new content into it */
  });

Note that this makes them public properties of the Page instance, but that's perfectly reasonable from a security perspective, since they were provided by the consumer (so there is no reason the consumer shouldn't retain access to them), and the content frame spec provides that these properties be public.

It's also consistent with the API design guidelines <https://wiki.mozilla.org/Labs/Jetpack/Design_Guidelines>, which specify that constructor options should be reflected into instance properties.

Note also that I've made the implementations above settable.  That's the ultimate goal, but if it's too difficult to accomplish, I'm ok with pushing it to a second phase of development and having them only be gettable for the initial version.


>+  if ("document" in pageWorker) {
>+    throw new Error("You can't add the same Page Worker more than once. " +
>+                    "Instead, create a new instance if you need a new Page Worker.");
>+  }

Just noticed this: because pages load asynchronously, this test won't actually catch all cases in which a page worker was added twice (f.e. two synchronous adds, one after the other, would not be caught by this check).  This should look in the cache instead.  However, I also wouldn't throw here (or in "remove"), just return early, as there's no harm in doing so, because the result afterwards is what the user expected: the page has been added (or removed).
Attachment #445898 - Flags: review?(myk) → review-
(Assignee)

Comment 56

8 years ago
Created attachment 446124 [details] [diff] [review]
Page Worker - v3

I think this is done now. I don't know if you want me to set the r? flag again, so I'm doing it. I'm gonna post an interdiff as attachment as well which will hopefully it much easier to review the latest changes


(In reply to comment #55)

> 
> I would prefer to provide only a single way for an event callback to access the
> page, for which the emerging convention is to use "this" to refer to the object
> on which the callback is defined (whether that object is a singleton or, as in
> this case, an instance), as implied by the JEP implementation guidelines
> <https://wiki.mozilla.org/Labs/Jetpack/So_You%27re_Implementing_a_JEP#this>.

Ok! Keeping it as is, and I've updated the example to use `this` (and fixed the syntax error and "\n")
 
> 
> Nit: it would be useful to avoid the use of the phrase "background page" here
> and in the rest of the document, referring to these pages as simply "pages", as
> Chrome extensions currently use the phrase "background page" for a related but
> different API, and we should avoid conflating this API with that one.

 Agreed, removed reference to "background" throughtout the docs
>
> >+The parameter options is optional, and if given it should be an object
> >+with any of the following keys:
> 
> Nit: parameter options -> `options` parameter

done
> 
> 
> >+      A function callback or an array of functions that will be called at any
> >+      time the DOM on the page is ready. This can be used to know when your
> >+      Page Worker instance is ready to be used, and also whenever the page is
> >+      reloaded or another page is loaded in its place.
> 
> Nit: at any time -> when

done

> 
> >diff --git a/packages/jetpack-core/lib/page-worker.js b/packages/jetpack-core/lib/page-worker.js
> 
> >+function findInCache(pageWorker) {
> >+  for (let i in pageWorkerCache)
> >+    if (pageWorkerCache[i].ref === pageWorker)
> >+      return i;
> >+  return -1;
> >+}
> 
> Nit: for.. in on arrays is contraindicated in favor of either for each.. in
> (when you don't care about the index) or for(;;) (when you do).

Okay, since we care about the index here I'm using for (let i = 0; i < pageWorkerCache.length; i++)

 
> >+  pageWorkerCache.push({
> >+    ref: this,
> >+    allow: options.allow || {},
> >+    content: options.content || ""
> >+  });
> 
> The page mustn't be cached before it gets added, because it needs to be
> garbage-collectable if it falls out of scope unless it has been added.
> 
> That's the key difference between a page that has been added and one that
> hasn't: added pages can have other references (and resources) than the ones
> consumers hold to them, causing them to stick around when the consumers'
> references fall out of scope, while non-added (or removed) pages no longer have
> any such hanging references.  Otherwise, the page is more likely to leak.
> 
> So instead of caching it here, cache it in the add function, passing the unload
> function and browser reference from startPageWorker back to the add function so
> they can be cached for access by the remove function and other code that needs
> access to them, i.e. something like:
> 
>   exports.add = function JP_SDK_Page_Worker_add(pageWorker) {
>     ...
> 
>     var [unload, browser] = startPageWorker(pageWorker);
>     pageWorkerCache.push({
>       pageWorker: pageWorker,
>       unload: unload,
>       browser: browser
>     });

sure, done! the cache entry is now added on .add() and removed on .remove()

> Of course you still need to provide access to the values of allow and content. 
> Those are best implemented as getter/setters in the constructor, i.e.:
> 
>   this.__defineGetter__("allow", function() options.allow);
>   this.__defineSetter__("allow", function(newVal) {
>     options.allow = newVal;
>     /* if the page has been added, update the value of
> browser.docShell.allowJavascript */
>   });
>   this.__defineGetter__("content", function() options.content);
>   this.__defineSetter__("content", function(newVal) {
>     options.content = newVal;
>     /* if the page has been added, load the new content into it */
>   });
> 
> Note that this makes them public properties of the Page instance, but that's
> perfectly reasonable from a security perspective, since they were provided by
> the consumer (so there is no reason the consumer shouldn't retain access to
> them), and the content frame spec provides that these properties be public.
> 
> It's also consistent with the API design guidelines
> <https://wiki.mozilla.org/Labs/Jetpack/Design_Guidelines>, which specify that
> constructor options should be reflected into instance properties.
> 
> Note also that I've made the implementations above settable.  That's the
> ultimate goal, but if it's too difficult to accomplish, I'm ok with pushing it
> to a second phase of development and having them only be gettable for the
> initial version.

Makes sense. I fixed this and made then settable too. I had to do some getter/setter magic to make the two following cases work:

page.allow = { script: true }
page.allow.script = true;

i.e. setting allow.script triggers the setter, and also setting directly to the object does not destroy it and triggers the setter as well.
:)

Added docs and tests for public properties `allow` and `content`

> >+  if ("document" in pageWorker) {
> >+    throw new Error("You can't add the same Page Worker more than once. " +
> >+                    "Instead, create a new instance if you need a new Page Worker.");
> >+  }
> 
> Just noticed this: because pages load asynchronously, this test won't actually
> catch all cases in which a page worker was added twice (f.e. two synchronous
> adds, one after the other, would not be caught by this check).  This should
> look in the cache instead.  However, I also wouldn't throw here (or in
> "remove"), just return early, as there's no harm in doing so, because the
> result afterwards is what the user expected: the page has been added (or
> removed).

Now that the cache entry is only added on .add(), this was simplified by checking if the object is on the cache or not. It should always work now. Also, I made it just return the same object instead of throwing an error. (same for .remove())
Attachment #445898 - Attachment is obsolete: true
Attachment #446124 - Flags: review?(myk)
(Assignee)

Comment 57

8 years ago
Created attachment 446125 [details] [diff] [review]
Interdiff between v2 and v3

This is an interdiff between the last reviewed patch and the changes made to it
Comment on attachment 446124 [details] [diff] [review]
Page Worker - v3

>+    let entryPos = findInCache(self);
>+    pageWorkerCache.splice(entryPos, 1);

Nit: I just noticed this, although it was in an earlier version of the code... presumably this would never happen, but if the page worker were not in the cache for some reason, then this splice call would arbitrarily remove the last entry from the cache, since entryPos would be -1, which would cause splice to count backwards one item from the back of the array.

So this should check entryPos before trying to remove the item from the cache, i.e.:

    let entryPos = findInCache(self);
    if (entryPos != -1)
      pageWorkerCache.splice(entryPos, 1);

Otherwise this looks great.  I'll fix this nit when checking in the patch.
Attachment #446124 - Flags: review?(myk) → review+
Fixed by changeset https://hg.mozilla.org/labs/jetpack-sdk/rev/e8bb17714e97.  Thanks Felipe!
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED

Updated

8 years ago
Blocks: 569481
The Add-on SDK is no longer a Mozilla Labs experiment and has become a big enough project to warrant its own Bugzilla product, so the "Add-on SDK" product has been created for it, and I am moving its bugs to that product.

To filter bugmail related to this change, filter on the word "looptid".
Component: Jetpack SDK → General
Product: Mozilla Labs → Add-on SDK
QA Contact: jetpack-sdk → general
Version: Trunk → unspecified
You need to log in before you can comment on or make changes to this bug.