Closed Bug 1140172 Opened 5 years ago Closed 5 years ago

Use a single worker for background readability parsing

Categories

(Toolkit :: Reader Mode, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla39
Tracking Status
firefox38 --- fixed
firefox39 --- fixed

People

(Reporter: Margaret, Assigned: Margaret)

References

Details

(Whiteboard: [MemShrink])

Attachments

(1 file)

This issue came up in bug 1139250.

I also wonder if we should try re-using an existing worker, or kill off any workers that are still going in the background if the user navigates to a new page before we're done parsing the old one.
It would probably be good to do some kind of limiting.  DOM workers are not super lightweight, though it is better than it used to be.
Whiteboard: [MemShrink]
(In reply to Andrew McCreight [:mccr8] from comment #1)
> It would probably be good to do some kind of limiting.  DOM workers are not
> super lightweight, though it is better than it used to be.

Yeah, in practice we really only need one at a time, so if we're making a new one, we should kill any old one that's still hanging around, even if it's doing work.
(In reply to :Margaret Leibovic from comment #2)
> (In reply to Andrew McCreight [:mccr8] from comment #1)
> > It would probably be good to do some kind of limiting.  DOM workers are not
> > super lightweight, though it is better than it used to be.
> 
> Yeah, in practice we really only need one at a time, so if we're making a
> new one, we should kill any old one that's still hanging around, even if
> it's doing work.

Actually, that's a lie, since we can have multiple tabs loading at the same time... I'll think about this a bit more.
This replaces our infinite spawning worker logic with a single reader worker. This means that we'll only be parsing one article at a time, but I'm very much okay with that.

I didn't add logic to shut this down because I modeled it after SessionWorker, and I didn't see logic in there to shut that down. Asking Yoric for feedback on this use of PromiseWorker.jsm.
Attachment #8573643 - Flags: review?(bnicholson)
Attachment #8573643 - Flags: feedback?(dteller)
Priority: -- → P1
No feedback yet, just my experience: shutting down workers has so far proved a can of worms. It took us 1+ year to finally track down the bug in B2G OS.File shutdown.
Comment on attachment 8573643 [details] [diff] [review]
Use a single reader worker instead of spawning infinite workers

Review of attachment 8573643 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/reader/ReaderMode.jsm
@@ +217,4 @@
>  
> +    let serializedDoc = Cc["@mozilla.org/xmlextras/xmlserializer;1"].
> +                         createInstance(Ci.nsIDOMSerializer).
> +                         serializeToString(doc);

This can be somewhat slow, so I would add a `yield Promise.resolve()` to make sure that sending the doc is done on the next tick.

::: toolkit/components/reader/ReaderWorker.js
@@ +8,2 @@
>  
> +importScripts("resource://gre/modules/workers/require.js", "resource://gre/modules/reader/JSDOMParser.js", "resource://gre/modules/reader/Readability.js");

Nit: Wouldn't this be more readable with one per line?

@@ +9,5 @@
> +importScripts("resource://gre/modules/workers/require.js", "resource://gre/modules/reader/JSDOMParser.js", "resource://gre/modules/reader/Readability.js");
> +
> +let PromiseWorker = require("resource://gre/modules/workers/PromiseWorker.js");
> +
> +let worker = new PromiseWorker.AbstractWorker();

Here, too, you may wish to define `worker.log`, for logging purposes.

@@ +25,5 @@
> +
> +let Agent = {
> +  parseDocument: function (uri, serializedDoc) {
> +    let doc = new JSDOMParser().parse(serializedDoc);
> +    return new Readability(uri, doc).parse();

What's the return value of this?

::: toolkit/components/reader/ReaderWorker.jsm
@@ +9,5 @@
> +Cu.import("resource://gre/modules/PromiseWorker.jsm", this);
> +
> +this.EXPORTED_SYMBOLS = ["ReaderWorker"];
> +
> +this.ReaderWorker = new BasePromiseWorker("resource://gre/modules/reader/ReaderWorker.js");

You may wish to define `ReaderWorker.log`, for debugging purposes.
Attachment #8573643 - Flags: feedback?(dteller) → feedback+
Attachment #8573643 - Flags: review?(bnicholson) → review+
FYI, this does appear to fix the issue from bug 1139250 \o/

https://bugzilla.mozilla.org/show_bug.cgi?id=1139250#c43

(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #6)
> Comment on attachment 8573643 [details] [diff] [review]
> Use a single reader worker instead of spawning infinite workers
> 
> Review of attachment 8573643 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/components/reader/ReaderMode.jsm
> @@ +217,4 @@
> >  
> > +    let serializedDoc = Cc["@mozilla.org/xmlextras/xmlserializer;1"].
> > +                         createInstance(Ci.nsIDOMSerializer).
> > +                         serializeToString(doc);
> 
> This can be somewhat slow, so I would add a `yield Promise.resolve()` to
> make sure that sending the doc is done on the next tick.

Thanks for the suggestion, I can update it to do that (I was just following what the current code does).

> ::: toolkit/components/reader/ReaderWorker.js
> @@ +8,2 @@
> >  
> > +importScripts("resource://gre/modules/workers/require.js", "resource://gre/modules/reader/JSDOMParser.js", "resource://gre/modules/reader/Readability.js");
> 
> Nit: Wouldn't this be more readable with one per line?

Good idea.

> @@ +9,5 @@
> > +importScripts("resource://gre/modules/workers/require.js", "resource://gre/modules/reader/JSDOMParser.js", "resource://gre/modules/reader/Readability.js");
> > +
> > +let PromiseWorker = require("resource://gre/modules/workers/PromiseWorker.js");
> > +
> > +let worker = new PromiseWorker.AbstractWorker();
> 
> Here, too, you may wish to define `worker.log`, for logging purposes.

Thanks for the suggestion, will do.

> @@ +25,5 @@
> > +
> > +let Agent = {
> > +  parseDocument: function (uri, serializedDoc) {
> > +    let doc = new JSDOMParser().parse(serializedDoc);
> > +    return new Readability(uri, doc).parse();
> 
> What's the return value of this?

An "article" object:
https://github.com/mozilla/readability/blob/master/README.md#usage

I can add some documentation comments in here to explain that as well.

> ::: toolkit/components/reader/ReaderWorker.jsm
> @@ +9,5 @@
> > +Cu.import("resource://gre/modules/PromiseWorker.jsm", this);
> > +
> > +this.EXPORTED_SYMBOLS = ["ReaderWorker"];
> > +
> > +this.ReaderWorker = new BasePromiseWorker("resource://gre/modules/reader/ReaderWorker.js");
> 
> You may wish to define `ReaderWorker.log`, for debugging purposes.

Will do.
Summary: Shut down the reader worker when it's done → Use a single worker for background readability parsing
https://hg.mozilla.org/integration/fx-team/rev/61faee68488e

If things go smoothly here, I'll try re-enabling this feature on Nightly (again).
https://hg.mozilla.org/mozilla-central/rev/61faee68488e
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
Depends on: 1141618
Margaret, if you think there's something manual QA should look at here, please flip the qe-verify flag.
Flags: qe-verify-
Depends on: 1141757
You need to log in before you can comment on or make changes to this bug.