Last Comment Bug 731419 - Discard image data immediately when a tab is closed
: Discard image data immediately when a tab is closed
Status: RESOLVED FIXED
[MemShrink:P2]
:
Product: Core
Classification: Components
Component: ImageLib (show other bugs)
: Trunk
: All All
: -- normal (vote)
: mozilla13
Assigned To: Justin Lebar (not reading bugmail)
:
Mentors:
Depends on: 762460 763593
Blocks: image-suck
  Show dependency treegraph
 
Reported: 2012-02-28 14:59 PST by Justin Lebar (not reading bugmail)
Modified: 2012-06-16 06:46 PDT (History)
16 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Ugly instrumentation code (7.72 KB, patch)
2012-03-02 13:59 PST, Justin Lebar (not reading bugmail)
no flags Details | Diff | Review
Part 1: imagelib changes (8.85 KB, patch)
2012-03-06 18:28 PST, Justin Lebar (not reading bugmail)
joe: review+
Details | Diff | Review
Part 2: Content changes (3.06 KB, patch)
2012-03-06 18:28 PST, Justin Lebar (not reading bugmail)
bzbarsky: review+
Details | Diff | Review

Description Justin Lebar (not reading bugmail) 2012-02-28 14:59:42 PST
It seems that when you close a tab, we wait for the image discard timeout before discarding the image data.  Which is super-lame.

(I thought we filed a bug on this, but I can't find it.)
Comment 1 Justin Lebar (not reading bugmail) 2012-02-29 13:04:36 PST
Yeah, it looks like we merely unlock all images on document destruction.

I think the right solution is to have two unlock methods:

 (a) Unlock, and if the count goes to 0, discard after 20s (this is the current behavior)
 (b) Unlock, and if the count goes to 0, discard immediately.

I wonder if in case (b) we can get rid of the compressed image data, too.
Comment 2 Robert Claypool 2012-02-29 19:19:06 PST
What are all the causes of an image unlock? I'm guessing at least navigating away from a page and closing the tab. What happens when the picture goes off the edge of the window or state is changed due to a script. What performance tests are there? Is there a page (wiki, education,  or other) that covers this? Should there be?
Comment 3 Joe Drew (not getting mail) 2012-02-29 19:24:32 PST
Off the edge of a window currently does nothing; eventually, it will unlock the image.

Not sure what you mean by "state is changed due to a script", but all images on the foreground tab should be locked. I'm not sure if that invariant is maintained for JS-added images, come to think of it.

Perf tests: this is mostly covered by Tp, such as it is. We don't have any other image-specific perf tests.

There definitely should be a whole series of wiki pages. Most of this simply lives in the brains of a small handful of people: Justin Lebar, Bobby Holley, Kyle Huey, Jeff Muizelaar and me, to be precise.
Comment 4 Justin Lebar (not reading bugmail) 2012-02-29 19:41:50 PST
> What are all the causes of an image unlock? I'm guessing at least navigating away from a page and 
> closing the tab.

The basic rule is: Only images in the foreground tab in are locked.  So we unlock when you navigate or close the active tab, and we also unlock when you switch to a different tab.

We never discard the uncompressed data of a locked image, and we discard uncompressed data of unlocked images after a timeout.

As an added wrinkle, we share images between tabs.  I don't pretend to understand all the details, but you could imagine an image that's in both a main window and a popup window having a lock count of 2, and only becoming unlocked when both the popup and main window no longer have that tab in the foreground.
Comment 5 Robert Claypool 2012-02-29 19:53:08 PST
On long pages with lots of images, especially when free memory is low, you might want to unlock the images furthest from the visible area. I've run into situations where no tab in any Firefox window will draw and unlocking might make that less likely to happen.
Comment 6 Justin Lebar (not reading bugmail) 2012-02-29 19:57:35 PST
Yes, this is one of the long-standing memshrink bugs.  See bug 689623.
Comment 7 Justin Lebar (not reading bugmail) 2012-03-01 13:32:08 PST
I spent some time poking at this today, and it's a bit tricky.

AFAICT, a document doesn't know when its tab is closed.  It only knows when its destructor is called, which might be some time after its tab goes away.  Once the document is destroyed, its images go away too, because the document is what's keeping them alive.

But destroying a document can take time.  In my testing, I've had to run the CC three times to nab my compartment (and by extension, I presume, my document).

As a hack, we could I'm sure send a notification down from somewhere in the front-end to the document telling it and all its children to drop their decoded images.

But the right solution, I think, would be to run "enough" iterations of the gc/cc soon after a tab is closed.  Then we'd drop the decoded image data *and* all the relevant JS data, which is often much larger than the image data.

I dunno how many runs "enough" is, but the good thing is, we can yield to the event loop between each run.
Comment 8 Robert Claypool 2012-03-01 14:00:04 PST
What happens between the time the tab is closed and the document's destructor is called? Or to put it another way, what needs to be done that prevents the tab closing code from immediately calling the document's destructor.

Should another bug be filed about making the document destruction happen ASAP after the tab is closed?

And... does the browser currently actually use any of this info when doing undo close tab to make it unclose faster? Are there statistics on how frequently someone uncloses a tab, or can we just assume that it happens rarely enough and/or has/matches end-user's certain expectation of slowness that it's just not worth leaving any potentially undo close performance enhancements in?

A person could expect that it would take just as long for the page to open on a undo close tab as if it had never been loaded, or they could expect that it was just there a moment ago, so it shouldn't take that long to undo.

Of course the amount of time it takes from a person to close a tab, to deciding the tab needs to be reopened, to the time they perform the necessary operations to unclose the tab could make any performance enhancement time window be closed.
Comment 9 Olli Pettay [:smaug] (high review load, please consider other reviewers) 2012-03-01 14:42:33 PST
    (In reply to Justin Lebar [:jlebar] from comment #7)
    > AFAICT, a document doesn't know when its tab is closed.
    Sure it should know.
    Something like nsDocument::OnPageHide gets called.

    > 
    > But destroying a document can take time.  In my testing, I've had to run the
    > CC three times to nab my compartment (and by extension, I presume, my
    > document).
    compartment != document. 
    When nsGlobalWindow::SetDocShell(nsnull) is called, mDoc is set to null.
    Though, some JS may still keep document alive.


    > But the right solution, I think, would be to run "enough" iterations of the
    > gc/cc soon after a tab is closed.  Then we'd drop the decoded image data
    > *and* all the relevant JS data, which is often much larger than the image
    > data.
    > 
    > I dunno how many runs "enough" is, but the good thing is, we can yield to
    > the event loop between each run.
    Running CC is probably more ok now that most of CC runs are fast.
    But I'd be still very careful. In general we're trying to run CC less often.
Comment 10 Olli Pettay [:smaug] (high review load, please consider other reviewers) 2012-03-01 14:45:00 PST
(In reply to Robert Claypool from comment #8)
> What happens between the time the tab is closed and the document's
> destructor is called? Or to put it another way, what needs to be done that
> prevents the tab closing code from immediately calling the document's
> destructor.
> 
> Should another bug be filed about making the document destruction happen
> ASAP after the tab is closed?
Document is destroyed when there are no references to it.
(or references form a cycle which cycle collector detects)
JS in another same domain tab may easily refer to the document when tab is closed.
Comment 11 Robert Claypool 2012-03-01 16:44:13 PST
Should a JS reference have the same weight as the document actually being open? What would a JS reference do that wouldn't have an effect similar to unclose document? What JS representation corresponds to an actual JS reference? I'm not that familiar with JS, but what goes beyond a soft link, a URL in a variable for example that binds the JS to the actual document in a way that unclose tab, a bookmark, or the URL typed in an address bar does not? Are there any cases where two instances of a document become so different that they are no longer considered the same document?

I'm thinking of instances where a JS opens a window and writes some data to the window. Is/Should that considered a document all on it's own or part of the document that the JS is part of?

What is a document that the browser is mindful of it?
Comment 12 Justin Lebar (not reading bugmail) 2012-03-02 13:30:53 PST
I instrumented the GC/CC code to track windows and documents.

I add a weak ref to a window and its document to a list when we SetDocShell(null).  Then after every GC/CC, I check whether the window and/or document is still alive, how many GC/CCs have passed since it was enqueued, and how many seconds have passed.

It takes up to three CCs for a document or window to be collected.  This can take almost 80s in my tests.  The longest I saw a top-level window stay alive was 40s and 2 CCs, but I haven't tested for very long yet.
Comment 13 Justin Lebar (not reading bugmail) 2012-03-02 13:31:49 PST
Make that 4CCs, not three, and 300s, for a twitter widget.
Comment 14 Justin Lebar (not reading bugmail) 2012-03-02 13:52:27 PST
And for a top-level window, I saw 43s, 8 GCs, 3 CCs.

So now the question is: Is this an acceptable amount of time?  We can immediately nuke the images in SetDocShell, but lots of memory will be stuck around until the 3 CCs run...
Comment 15 Justin Lebar (not reading bugmail) 2012-03-02 13:59:49 PST
Created attachment 602480 [details] [diff] [review]
Ugly instrumentation code
Comment 16 Justin Lebar (not reading bugmail) 2012-03-04 08:03:44 PST
Since the CC frequency is still in flux, and since I now know how to tell when a window becomes invisible, I should focus this back on the original bug, of tossing the decoded images as soon as the window/document goes away.
Comment 17 Joe Drew (not getting mail) 2012-03-04 08:21:06 PST
Yes. The simpler the better.
Comment 18 Justin Lebar (not reading bugmail) 2012-03-06 18:28:01 PST
Created attachment 603554 [details] [diff] [review]
Part 1: imagelib changes

The change in RasterImage::Discard to always remove the image from the discard tracker (on both normal and force discards) is safe with my rewrite of discarding in bug 732820.  Not sure if it is otherwise.
Comment 19 Justin Lebar (not reading bugmail) 2012-03-06 18:28:15 PST
Created attachment 603555 [details] [diff] [review]
Part 2: Content changes
Comment 20 Boris Zbarsky [:bz] (Out June 25-July 6) 2012-03-06 19:22:58 PST
Comment on attachment 603555 [details] [diff] [review]
Part 2: Content changes

So this will discard image data in display:none subframes, not just in closed tabs.  Is that intended?
Comment 21 Justin Lebar (not reading bugmail) 2012-03-06 19:32:08 PST
(In reply to Boris Zbarsky (:bz) from comment #20)
> Comment on attachment 603555 [details] [diff] [review]
> Part 2: Content changes
> 
> So this will discard image data in display:none subframes, not just in
> closed tabs.  Is that intended?

Well, not *un*intentional.  :)  Those images should already be unlocked because the document is not visible -- this just discards immediately.

I guess the concern is that an iframe might oscillate between display:none and display:block, causing us to waste time re-decoding.  And I'd guess we have no idea whether this commonly happens?
Comment 22 Boris Zbarsky [:bz] (Out June 25-July 6) 2012-03-06 19:42:52 PST
Comment on attachment 603555 [details] [diff] [review]
Part 2: Content changes

Well, it's _a_ concern.  I doubt it's common.  Let's do this.
Comment 23 Joe Drew (not getting mail) 2012-03-07 21:33:38 PST
Comment on attachment 603554 [details] [diff] [review]
Part 1: imagelib changes

Review of attachment 603554 [details] [diff] [review]:
-----------------------------------------------------------------

I would love to see a unit test that used this, then drew an ostensibly discarded image to a canvas and ensured it drew properly.

::: image/public/imgIRequest.idl
@@ +195,5 @@
>    void unlockImage();
>  
>    /**
> +   * If this image is unlocked, discard the image's decoded data.  If the image
> +   * is not unlocked or is already discarded, do nothing.

s/not unlocked/locked/

::: image/src/RasterImage.cpp
@@ -2181,5 @@
>    if (observer)
>      observer->OnDiscard(nsnull);
>  
> -  if (force)
> -    DiscardTracker::Remove(&mDiscardTrackerNode);

Er, why do we have the force parameter then?

@@ +2689,5 @@
> +NS_IMETHODIMP
> +RasterImage::RequestDiscard()
> +{
> +  if (CanDiscard()) {
> +    printf("RequestDiscard: Discarding.\n");

remove debugging code

::: image/src/imgRequestProxy.cpp
@@ +374,5 @@
> +NS_IMETHODIMP
> +imgRequestProxy::RequestDiscard()
> +{
> +  if (mImage)
> +    return mImage->RequestDiscard();

{} please
Comment 24 Justin Lebar (not reading bugmail) 2012-03-07 21:44:37 PST
> Er, why do we have the force parameter then?

I guess I could do

  if (CanDiscard())
    ForceDiscard()

but that's not really what I mean...

But yes, it looks like the only difference between Discard() and ForceDiscard() is the assertion.
Comment 25 Mozilla RelEng Bot 2012-03-09 21:16:30 PST
Try run for 2a95ced444a6 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=2a95ced444a6
Results (out of 70 total builds):
    exception: 48
    success: 11
    warnings: 9
    failure: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jlebar@mozilla.com-2a95ced444a6
Comment 26 Justin Lebar (not reading bugmail) 2012-03-09 22:36:37 PST
Bug 732820 is a minefield I did not anticipate, but this bug looks fine on try, so I just landed it alone.

https://hg.mozilla.org/integration/mozilla-inbound/rev/46fe585786ae
https://hg.mozilla.org/integration/mozilla-inbound/rev/d6dc71da36ac
https://hg.mozilla.org/integration/mozilla-inbound/rev/f0933a7d1ab0
Comment 27 Mozilla RelEng Bot 2012-03-10 00:31:11 PST
Try run for 98531156a16d is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=98531156a16d
Results (out of 218 total builds):
    exception: 1
    success: 178
    warnings: 38
    failure: 1
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jlebar@mozilla.com-98531156a16d
Comment 28 Daniel Holbert [:dholbert] (largely AFK until June 28) 2012-03-11 19:43:18 PDT
https://hg.mozilla.org/mozilla-central/rev/46fe585786ae
Comment 29 Daniel Holbert [:dholbert] (largely AFK until June 28) 2012-03-11 19:44:10 PDT
(In reply to Daniel Holbert [:dholbert] from comment #28)
> https://hg.mozilla.org/mozilla-central/rev/46fe585786ae
Also:
  https://hg.mozilla.org/mozilla-central/rev/d6dc71da36ac
  https://hg.mozilla.org/mozilla-central/rev/f0933a7d1ab0

Note You need to log in before you can comment on or make changes to this bug.