Reduce the cost of taking screenshots for the app switcher

RESOLVED FIXED

Status

defect
RESOLVED FIXED
7 years ago
7 years ago

People

(Reporter: gsvelto, Assigned: justin.lebar+bug)

Tracking

unspecified
ARM
Gonk (Firefox OS)
Dependency tree / graph
Bug Flags:
in-testsuite -

Firefox Tracking Flags

(blocking-basecamp:+, firefox18 fixed, firefox19 fixed)

Details

(Whiteboard: [fast:500ms])

Attachments

(5 attachments, 3 obsolete attachments)

Reporter

Description

7 years ago
Screenshots are periodically taken so that thumbnails of open applications can be shown in the application switcher. This has a significant performance impact: every screenshot taken costs 300-600ms, multiple screenshots can be taken in a short amount of time and one is always taken during application startup, more importantly this one seems useless as it is always taken before the screen has been repainted.

The cost of taking a screenshot is split among three major components:

- Color conversion: the source image has an alpha-channel (strangely enough), since screenshots are taken as PNG images this forces the encoder to convert the pixels by dividing every color channel by the alpha channel. Since ARM processors do not support integer divisions this operation is done by a helper function and can account for up to 60% of the time required for the screenshot

- Page history: the profiles show the _recvGetScreenshot() function calling multiple methods from the nsNavHistory class (AddVisitChain, MarkPageAsFollowedLink, GetRecentFlags, AddObserver). This can account for 20-30% of the screenshot time, unfortunately I wasn't able to figure out from where this methods are being called.

- Compression: once converted the image goes through PNG filters and then the deflate compression algorithm, this consumes relatively little time but imposes allocating memory for the compressor which is usually 256-512 KiB

To reduce the impact of taking screenshots multiple steps should be taken:

- Ensure that screenshots are not taken continuously but only at idle times

- Ensure that a screenshot is not taken before an application has finished starting up

- Make sure the source image does not contain an alpha channel or strip it if need be

- Set PNG encoding parameters so that compression is not used, this should prevent the deflate compressor from allocating memory
Reporter

Updated

7 years ago
Attachment #669644 - Attachment description: Profiled run of the picture calculator application with multiple screenshots → Profiled run of the calculator application with multiple screenshots
Reporter

Comment 4

7 years ago
I've attached multiple traces that highlight the issue, look for the _recvGetScreenshot() calls.
Another thing to consider here is using drawWindow(USE_WIDGET_LAYERS) for foreground (visible) windows, which is much cheaper than repainting.

But I think we should start here by ensuring that gaia is smart, and has all the information it needs to be smart.
Reporter

Updated

7 years ago
Blocks: slim-fast
gsvelto, note that the profiler UI has an "Upload" feature.  If you use that, you can get direct links to the profiles, loaded in the profiler UI.  It's awesome! :)
Reporter

Comment 7

7 years ago
(In reply to Chris Jones [:cjones] [:warhammer] from comment #6)
> gsvelto, note that the profiler UI has an "Upload" feature.  If you use
> that, you can get direct links to the profiles, loaded in the profiler UI. 
> It's awesome! :)

I'll give it a try, thanks for the tip :-)
A good way to measure the upper-bound win here is to measure with screenshotting disabled entirely.
Reporter

Comment 9

7 years ago
(In reply to Chris Jones [:cjones] [:warhammer] from comment #8)
> A good way to measure the upper-bound win here is to measure with
> screenshotting disabled entirely.

I'll give it a try, especially to see how much can be saved at startup. My gut feeling is that the speedup will be higher than what can be measured with the profiler, after all we can see only what's happening on the application taking the snapshot but the said image will be decompressed and stored afterwards by the b2g process.
A good way to ensure that screenshots are off the critical path may be to
 - move the "first idle" code [1] to live on TabChild
 - fire that event through the mozbrowser API, maybe a "firstidle" event or something
 - in gaia, take a screen capture off that notification instead of "firstpaint"

Gaia would also want to set a long-ish fallback timeout, 10 seconds or so, because there's no guarantee currently that the idle notification ever fires.

[1] http://mxr.mozilla.org/mozilla-central/source/dom/ipc/ContentChild.cpp#482
FWIW, I expect the cost of taking a screenshot may decrease significantly after I finish bug 800170.
Most definitely.

But I think we should also get this off the critical path, because early paints can force us to reflow / build display lists / etc. earlier than Gecko would otherwise want to, which can hurt (extra work).

However, data shall tell!
Reporter

Comment 13

7 years ago
(In reply to Justin Lebar [:jlebar] from comment #11)
> FWIW, I expect the cost of taking a screenshot may decrease significantly
> after I finish bug 800170.

I will try using JPEG instead of PNG for taking the screenshot, if the cost of JPEG compression is lower on the Otoro then it's a win-win situation (lower memory and CPU usage). Alternatively I'll try using the PNG encoder with the 'transparency=none' parameter to squash the alpha channel. That should shrink the resulting string by at least ~25% and remove the costly color-conversion operation.
Reporter

Updated

7 years ago
Assignee: nobody → gsvelto
The cost of Gecko reflowing/painting is usually much higher than png or jpeg compression on the resulting bitmap.  I'd suggest we start here by removing screenshotting entirely from app startup (in the gaia system app, $b2g/gaia/apps/system) to see what the best possible win is.

The work in bug 800170 will reduce memory overhead and should make things faster, but if it gets us from 600ms on the critical path to, say, 400ms on the critical path, then we still have 400ms we need to move off the critical path :).
Reporter

Comment 15

7 years ago
(In reply to Chris Jones [:cjones] [:warhammer] from comment #14)
> The cost of Gecko reflowing/painting is usually much higher than png or jpeg
> compression on the resulting bitmap.

Yes, but on the Otoro (and other ARM-based devices) the lack of an integer division implemented in hardware makes PNG encoding with an alpha channel so expensive that it currently dwarfs the cost of drawing the screen.

> I'd suggest we start here by removing
> screenshotting entirely from app startup (in the gaia system app,
> $b2g/gaia/apps/system) to see what the best possible win is.

I'll also try to do that.

> The work in bug 800170 will reduce memory overhead and should make things
> faster, but if it gets us from 600ms on the critical path to, say, 400ms on
> the critical path, then we still have 400ms we need to move off the critical
> path :).

Couldn't agree more :-)
(In reply to Gabriele Svelto [:gsvelto] from comment #13)
> I'll try using the PNG encoder
> with the 'transparency=none' parameter to squash the alpha channel. That
> should shrink the resulting string by at least ~25% and remove the costly
> color-conversion operation.

I would expect that if you encode with transparency a PNG24 which has no transparency (as this image has), that's not 25% more expensive than encoding it without transparency.  PNG is a compressed image format.

I'd really encourage you to wait until I finish with that bug before spending a lot of time timing things, if you have something else to do in the meantime.  I'm shrinking the image size by a factor of 10 or more, which I expect may have a more significant effect on speed than you guys seem to be supposing.
Reporter

Comment 17

7 years ago
(In reply to Justin Lebar [:jlebar] from comment #16)
> I would expect that if you encode with transparency a PNG24 which has no
> transparency (as this image has), that's not 25% more expensive than
> encoding it without transparency.  PNG is a compressed image format.

There was something in the alpha channel of the source image so I was guessing that information was actually non-trivial. I've dumped it and it turns out to be 0xff for every pixel so yes, it's going to be largely compressed.

> I'd really encourage you to wait until I finish with that bug before
> spending a lot of time timing things, if you have something else to do in
> the meantime. I'm shrinking the image size by a factor of 10 or more, which
> I expect may have a more significant effect on speed than you guys seem to
> be supposing.

OK, I did just a quick test on the Otoro following your lead. I changed the canvas.ToDataURL() invocation in BrowserElementChild.js::_recvGetScreenshot() to use "image/jpeg" instead of "image/png". The cost of a snapshot drops from 500-600ms to just 100ms so it's a huge win. I will focus on getting the screenshot out of the critical path then and leave the encoding of the resulting image to you.
Reporter

Updated

7 years ago
Depends on: 801676
Reporter

Comment 18

7 years ago
(In reply to Chris Jones [:cjones] [:warhammer] from comment #10)
> A good way to ensure that screenshots are off the critical path may be to
>  - move the "first idle" code [1] to live on TabChild
>  - fire that event through the mozbrowser API, maybe a "firstidle" event or
> something
>  - in gaia, take a screen capture off that notification instead of
> "firstpaint"

I'm trying to implement this but I need a couple of pointers because I can't seem to wrap my head around the code. Currently ContentChild posts a task here:

http://mxr.mozilla.org/mozilla-central/source/dom/ipc/ContentChild.cpp#490

... which calls the FirstIdle() function, I assume from the name of the method used for this (PostIdleTask) that this function will be called during the first idle period available. FirstIdle() in turn calls SendFirstIdle() on the ContentChild object itself. Whilst I'm not sure about it I think that this actually executes the RPC call specified here:

http://mxr.mozilla.org/mozilla-central/source/dom/ipc/PContent.ipdl#359

... which is then handled by a ContentParent object. And if that's the case then should I add something similar to TabChild/TabParent?

Also looking at Bug 787378 which adds the 'firstpaint' event I am under the impression that I should add similar code to generate a 'firstidle'. All in all I'm quite a bit confused... 

Since I can't seem to connect all the dots would you recommend a deep dive in the IPDL documentation on MDN before trying to tackle this issue?
> All in all I'm quite a bit confused...

It sounds like you may be underestimating yourself a bit!  :)  Comment 18 is a good summary of what's going on.  You've got the IPDL stuff spot-on.

> And if that's the case then should I add something similar to TabChild/TabParent?

Yes.  You might also remove the one on ContentChild/Parent?  I guess it depends on what it's used for.

The trick here is that you need to get the firstidle notification into the correct BrowserElementParent, so it can fire the event on the correct iframe.

If you get the idle event into the TabParent, then you can fire an observer "first-idle" (or whatever) and pass in the TabParent's mFrameElement along with the notification.  Then you can add an observer to BrowserElementParent which listens to that notification and fires a firstidle event on the iframe if the notification's frame element matches BEP's this._frameElement.

Does that make sense?
canvas.toBlob() just landed, which might help. See https://bugzilla.mozilla.org/show_bug.cgi?id=648610
Reporter

Comment 21

7 years ago
(In reply to Justin Lebar [:jlebar] from comment #19)
> It sounds like you may be underestimating yourself a bit!  :)  Comment 18 is
> a good summary of what's going on.  You've got the IPDL stuff spot-on.

Sorry for the delay, I've used yesterday to clean up the smaller bugs I had in my backlog and spent some time pondering your comment. I think I grasped most of the high-level relation between the components but getting the code together will require more time. After having read the IPDL documentation I have a slightly better understanding of how some of this stuff works.

> Yes.  You might also remove the one on ContentChild/Parent?  I guess it
> depends on what it's used for.

Looking at bug 787378 currently ContentParent waits for the firstidle event to start the prelaunch app (if I understand correctly this is the idle app that is sitting around to speed-up startup of a new app). There are no other consumers and from the bug comments it seems like this is a temporary solution.

Alas this event is now fired only once whilst in the case of this bug we want a periodic event that fires when the app is idle so that a listener can take a screenshot.

> The trick here is that you need to get the firstidle notification into the
> correct BrowserElementParent, so it can fire the event on the correct iframe.
> 
> If you get the idle event into the TabParent, then you can fire an observer
> "first-idle" (or whatever) and pass in the TabParent's mFrameElement along
> with the notification.

What I'm trying to figure out now is when/where to fire this event from the TabChild code. The code in ContentChild is currently being called once per created tab but this wouldn't cut it in this case.

> Then you can add an observer to BrowserElementParent
> which listens to that notification and fires a firstidle event on the iframe
> if the notification's frame element matches BEP's this._frameElement.

I think I lost you here, what you suggest is sending a message from TabParent into the iframe and adding an observer to the BEP, then in the BEP's observe() function check if the mFrameElement parameter matches the one in the BEP and then from there firing the actual message, i.e. (pseudo-code ahead):

Somewhere in TabContent.cpp:

SendFirstIdle(); // This means I'll have to add the message to the appropriate .idl file

In TabParent.cpp:

TabParent::RecvFirstIdle() {
  nsCOMPtr<nsIContent> content = do_QueryInterface(mFrameElement);
  nsIFrame *frame = content->GetPrimaryFrame();
  // send and event into the nsIFrame object, still have to figure out how to do it
}

In BrowserElementParent.js:

_init: function() {
  // ...
  var os = Cc["@mozilla.org/observer-service;1"].getService(Ci.nsIObserverService);
  os.addObserver(this, 'first-idle', /* ownsWeak = */ true);
  // ...
}

observe: function(subject, topic, data) {
  switch(topic) {
  case 'first-idle':
      // if we've got the right frame element do something similar to _fireEventFromMsg()
      break;
  // ...
  },
}

> Does that make sense?

Kinda :) I'm still trying to figure out the details, I'm sorry it's taking so long and thanks for your help.
> Alas this event is now fired only once whilst in the case of this bug we want a periodic event that 
> fires when the app is idle so that a listener can take a screenshot.

That's not how I understood this bug.  I understood us to want to replace "firstpaint" with "firstidle".  This firstidle event would fire only once per app.

However, it seems like the ContentParent firstidle is subtly different: That's the firstidle per process.  We usually have only one app per process, so this would usually be the same as the app's firstidle.  But we sometimes load multiple apps into the same process (e.g. the browser app).  So it sounds like you need both.

> What I'm trying to figure out now is when/where to fire this event from the TabChild code. The 
> code in ContentChild is currently being called once per created tab but this wouldn't cut it in 
> this case.

Just like the ContentChild code, I think you'd kick off an event when the TabChild is created.  We get one TabChild per BrowserElementChild.

> I think I lost you here

The only part in your pseudocode that's not right is

> TabParent::RecvFirstIdle() {
>   nsCOMPtr<nsIContent> content = do_QueryInterface(mFrameElement);
>   nsIFrame *frame = content->GetPrimaryFrame();
>   // send and event into the nsIFrame object, still have to figure out how to do it
> }

This should be a call into the observer service, as:

   #include "mozilla/Services.h"
   using namespace mozilla;


   nsCOMPtr<nsIObserverService> os = services::GetObserverService();
   if (os) {
     os->NotifyObservers(mFrameElement, "first-idle", nullptr);
   }

see nsIObserverService for details.
(In reply to Justin Lebar [:jlebar] from comment #22)
> > Alas this event is now fired only once whilst in the case of this bug we want a periodic event that 
> > fires when the app is idle so that a listener can take a screenshot.
> 
> That's not how I understood this bug.  I understood us to want to replace
> "firstpaint" with "firstidle".  This firstidle event would fire only once
> per app.

Just to confirm, firstidle is per-process, and that's in fact what we want for prelaunching.

> However, it seems like the ContentParent firstidle is subtly different:
> That's the firstidle per process.  We usually have only one app per process,
> so this would usually be the same as the app's firstidle.  But we sometimes
> load multiple apps into the same process (e.g. the browser app).  So it
> sounds like you need both.
> 
> > What I'm trying to figure out now is when/where to fire this event from the TabChild code. The 
> > code in ContentChild is currently being called once per created tab but this wouldn't cut it in 
> > this case.

You also don't want to fire an event on every idle. Once you fire event(s) based on idle, you'd probably want to do delay for a period of time before posting another idle task.

Idle can only be measured on a per-process basis, but having the process being idle implies that all of the apps are idle. You'd need to iterate over the apps in the process and fire an event for each one (I'm not sure how you actually do this). I saw jlebar mention observers. You could fire a "good-time-to-take-a-screenshot" event per-process but have an observer per-app.
> You also don't want to fire an event on every idle. Once you fire event(s) based on idle, 
> you'd probably want to do delay for a period of time before posting another idle task.

We're not going to fire an event on ever idle.  We're going to fire an event on each TabParent's first idle.  That's in addition to firing an event on the ContentParent's first idle.  So I do not think we need a timeout.

If each TabParent inserts an idle event on creation just like the ContentParent inserts an event on creation, I think we'll be fine.
Reporter

Comment 25

7 years ago
(In reply to Justin Lebar [:jlebar] from comment #24)
> We're not going to fire an event on ever idle.  We're going to fire an event
> on each TabParent's first idle.  That's in addition to firing an event on
> the ContentParent's first idle.  So I do not think we need a timeout.
> 
> If each TabParent inserts an idle event on creation just like the
> ContentParent inserts an event on creation, I think we'll be fine.

Yeah, I got things a bit mixed up, my bad. In this bug we want a per-tab first-idle to be used for taking the first screenshot and to push it out of the application startup critical path.

Besides that a "good-time-to-take-a-screenshot" event is probably a good idea too because from the traces I've taken with the profiler it seems to me that we're snapshotting far too often. I haven't yet figured out how to do that one, for the moment I'll concentrate on first-idle.
Reporter

Comment 26

7 years ago
This patch implements a per-TabChild first-idle event as we have discussed in this bug, this event is posted every time a TabChild is created. It has two main issues:

- It seems to me looking at the logs that the first-idle event is getting delivered *before* the first-paint event, the issue here is that a TabChild is preallocated as part of a new tab and I think this happens before an application is launched (i.e. as part of the pre-launched app, but I might be wrong here)

- In BrowserElementParent.js the check between the frameElement that comes with the first-idle event and the BEP's own frameElement shows up as always false which makes me think I got it completely wrong
Attachment #673902 - Flags: feedback?(justin.lebar+bug)
Reporter

Updated

7 years ago
Attachment #673902 - Flags: feedback?(jones.chris.g)
> - In BrowserElementParent.js the check between the frameElement that comes with the first-idle event 
> and the BEP's own frameElement shows up as always false which makes me think I got it completely 
> wrong

This may be a consequence of the first problem -- the BEP isn't created until after we have an iframe for the process, which happens after the process is preloaded (and apparently also after the TabChild is preloaded!).  So maybe we're firing first-idle before the right BEP is around to hear it.

Perhaps you need to modify ContentParent::SetManifestFromPreallocated so that all of the ContentParent's TabParents get notified when we set the manifest.  They only enqueue the first-idle event when that happens.

But I have another idea which might be better.  What we really want here is to decrease the priority of taking a screenshot.  If we had a multi-threaded system, we'd decrease the priority of the screenshot thread.  But we can't do that.  (*)  But what we can do is enqueue the event to take the screenshot with a low priority, such that other events take priority over it.  That's basically what we're trying to approximate here with this first-idle event.  But maybe we can do that directly.

The trick would be modifying BrowserElementChild.js::_recvGetScreenshot so that it does its work off a low-priority set-timeout.  Maybe we'd break it up into two separate timeouts (drawWindow and toBlob), if both operations are expensive.

I don't know how to do the equivalent of MessageLoop::current()->PostIdleTask from chrome JS.  Chris, do we have functionality to do that?  If not, it doesn't seem like it would be hard to add.

(*) We /could/ approximate that if we encoded the screenshot image off-main-thread, which is probably doable.  Do you know what fraction of the cost of taking a screenshot is due to the encoding?
Flags: needinfo?(jones.chris.g)
Attachment #673902 - Flags: feedback?(justin.lebar+bug)
You can't do the equivalent of PostIdleTask() from JS currently, but yes that would be relatively easy to do.  We could let script request to be notified on "nextidle" and then send an observer service notification or something.
Flags: needinfo?(jones.chris.g)
> We could let script request to be notified on "nextidle" and then send an observer 
> service notification or something.

Or we could write an XPCOM service (or whatever) that gives scripts direct access to PostIdleTask(), perhaps?
That would complicate memory management quite a bit.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #30)
> That would complicate memory management quite a bit.

We must not be thinking of the same thing.

I'm just saying that that we should change BrowserElementChild's screenshot code from

  function getScreenshot() {
    ctx.drawImage();
    canvas.toBlob();
  };

to

  function getScreenshot() {
    var messageLoop = getService("...");
    messageLoop.postIdleTask(function() {
      ctx.drawImage();
      canvas.toBlob();
    });
  }

I don't see how that affects memory management at all.  Am I missing something?
The |messageLoop| thing doesn't exist.  It has to manage the tasks that are being posted to it.

That's a considerably higher amount of work than going through observer service.  This is just an internal API so I don't think it's worth the effort of trying to be perfect.
Comment on attachment 673902 [details] [diff] [review]
Tentative patch to add a per-TabChild first-idle event

Generally looks OK, but your firstidle task needs to hold a strong ref to the TabChild.
Attachment #673902 - Flags: feedback?(jones.chris.g) → feedback+
> The |messageLoop| thing doesn't exist.  It has to manage the tasks that are being posted 
> to it.

I understand.  But it's like a 30 loc XPCOM service.  All it does is call MessageLoop::Current()->PostIdleTask().  Where does the complication come in?

We shouldn't use the observer service as a replacement for method calls, if that's what you're suggesting here.

Maybe it would just be easier for me to write this patch so you can explain why it won't work.
You have JS handing you a JS callback.  The Task* that you PostIdleTask() with has to manage that reference.  It will have to duplicate everything the observer service does.

We're apparently not on the same page, so feel free to ping on IRC to straighten this out with higher bandwidth ;).
Posted patch WIP v1 (obsolete) — Splinter Review
This almost works.  As far as I can tell, the refcounting is all correct; the Chromium code addrefs and releases the runnable in PostRunnableTask.

Unfortunately we hang on the second screenshot in the browser's GetScreenshot test.  The first screenshot goes fine.  So I imagine that we're simply getting starved here.  One solution would be to add a timeout and take the screenshot after that timeout expires.
Attachment #674071 - Flags: feedback?(jones.chris.g)
Comment on attachment 674071 [details] [diff] [review]
WIP v1

This doesn't hold a strong ref to the task, AFAICT.

What happens when the idle task doesn't fire in time due to busy event loop and the tab dies?  This task will hold the tab alive forever, right?
Attachment #674071 - Flags: feedback?(jones.chris.g)
> This doesn't hold a strong ref to the task, AFAICT.

That's not what I understand from reading the documentation.

>   // The MessageLoop takes ownership of the Task, and deletes it after it has
>   // been Run().

And the nsIRunnable is addref'ed by NewRunnableMethod.

> What happens when the idle task doesn't fire in time due to busy event loop and the tab 
> dies?  This task will hold the tab alive forever, right?

Sure, that's a problem.  Adding a timeout (which we apparently have to do anyway) would make this a non-issue.
(In reply to Justin Lebar [:jlebar] from comment #38)
> > This doesn't hold a strong ref to the task, AFAICT.
> 
> That's not what I understand from reading the documentation.
> 
> >   // The MessageLoop takes ownership of the Task, and deletes it after it has
> >   // been Run().
> 
> And the nsIRunnable is addref'ed by NewRunnableMethod.
> 

Are you sure about that?  I didn't think we'd implemented the addref/unref traits for gecko XPCOM objects.

> > What happens when the idle task doesn't fire in time due to busy event loop and the tab 
> > dies?  This task will hold the tab alive forever, right?
> 
> Sure, that's a problem.  Adding a timeout (which we apparently have to do
> anyway) would make this a non-issue.

Nope.  Then you have to expose another object to JS to cancel the pending task (i.e. return something from postIdleTask), and then inside there you have to clear the nsIRunnable.
> I didn't think we'd implemented the addref/unref traits for gecko XPCOM objects.

The generic traits call AddRef() / Release(), as far as I can tell.  See task.h:

template <class T>
struct RunnableMethodTraits {
  static void RetainCallee(T* obj) {
    obj->AddRef();
  }
  static void ReleaseCallee(T* obj) {
    obj->Release();
  }
};


> Nope.  Then you have to expose another object to JS to cancel the pending task (i.e. 
> return something from postIdleTask), and then inside there you have to clear the 
> nsIRunnable.

If we clear the runnable after a short timeout, why does it matter if we leak a window for that period of time?  Note that we already "leak" windows until GC, which doesn't happen all that often.  We also "leak" the window while we're getting the screenshot blob, which can take up to a second...

Anyway it should be easy to tie the runnable's lifetime to a window's lifetime, cancel the runnable, and so on...
Who's doing the clearing in your proposal?
(In reply to Chris Jones [:cjones] [:warhammer] from comment #41)
> Who's doing the clearing in your proposal?

This new code can listen to inner-window-destroyed notifications.

Again, I don't really think this is necessary if the timeout less than 5 seconds.
What I meant was, which code is going to cancel the task off a timeout?  Are you going to hard-code the timeout somewhere and force the clients to just deal with their runnable never firing?
> What I meant was, which code is going to cancel the task off a timeout?

This new code that I'm writing -- nsMessageLoop -- can deal with the timeout.

I wasn't thinking of cancelling the task -- I thought I'd run it after the timeout elapses, to ensure that we don't starve.  I can imagine wanting the other behavior in some circumstances, but in the case of screenshots at least, we want to eventually take the screenshot.
You would need to task->Cancel() in C++ land to free the object from the event loop's grasp.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #45)
> You would need to task->Cancel() in C++ land to free the object from the
> event loop's grasp.

Can't I just leave the task object in the event loop and let it get cleared the next time we're idle?  I'm not particularly worried about these small objects piling up.  (We can instead modify the task's runnable so that it doesn't hold onto any JS objects, and so that when run, it doesn't do anything.)
I'm not worried about the Task*, I'm worried about its nsIRunnable holding half the world alive for an unbounded amount of time.

task->Cancel() should null out the object the method is going to be called on.
Posted patch Patch, v1 (obsolete) — Splinter Review
Like so?
Attachment #673902 - Attachment is obsolete: true
Attachment #674071 - Attachment is obsolete: true
Attachment #674118 - Flags: review?(jones.chris.g)
Posted patch Patch v1.1Splinter Review
Fixing out-of-date comment.
Attachment #674118 - Attachment is obsolete: true
Attachment #674118 - Flags: review?(jones.chris.g)
Attachment #674120 - Flags: review?(jones.chris.g)
Actually, I need to add an isAlive() check to the beginning of BEC::_takeScreenshot, but I'll do that tomorrow.
Comment on attachment 674120 [details] [diff] [review]
Patch v1.1

>diff --git a/dom/browser-element/BrowserElementChild.js b/dom/browser-element/BrowserElementChild.js

>   _recvGetScreenshot: function(data) {

>+    let maxWidth = data.json.args.width;
>+    let maxHeight = data.json.args.height;

Would recommend some light sanitization here, unless it's done
elsewhere that you know of.

Looks good!
Attachment #674120 - Flags: review?(jones.chris.g) → review+
> Would recommend some light sanitization here, unless it's done
> elsewhere that you know of.

It's done on the parent side, yes.
I can't ask for approval-aurora on B2G bugs, so I guess blocking-?
blocking-basecamp: --- → ?
This is the biggest remaining generic platform win for startup performance, which is required to hit our ship targets.  So bb+ based on that.
blocking-basecamp: ? → +
One thing I haven't done here is measure that this actually gets us the perf win that we're hoping for.

I'm happy to measure myself, but I don't know how.  Gabriele, can you provide some pointers?
Assignee: gsvelto → justin.lebar+bug
Flags: needinfo?(gsvelto)
Reporter

Comment 56

7 years ago
I've re-run my tests and the patch does effectively push the screenshot to the end of the startup procedure, after all the elements have been properly painted, etc... Here's a trace of the settings up starting and showing this behavior:

http://people.mozilla.com/~bgirard/cleopatra/?report=192b595af3ea9eb8017267fdd5d95043dda4c61f
Flags: needinfo?(gsvelto) needinfo?(gsvelto) → needinfo+
Thanks!  Pushed to inbound without tryserver coverage.  We'll see if this sticks.

https://hg.mozilla.org/integration/mozilla-inbound/rev/15c00f893939
https://hg.mozilla.org/mozilla-central/rev/15c00f893939
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: in-testsuite-
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.