Last Comment Bug 845147 - Janky scrolling on pages with many small images as a result of bug 689623
: Janky scrolling on pages with many small images as a result of bug 689623
Status: RESOLVED FIXED
[snappy][MemShrink]
:
Product: Core
Classification: Components
Component: ImageLib (show other bugs)
: Trunk
: x86_64 Linux
: -- normal with 6 votes (vote)
: mozilla24
Assigned To: Timothy Nikkel (:tnikkel)
:
Mentors:
: 848377 (view as bug list)
Depends on: 1113855 695763 878612
Blocks: 689623
  Show dependency treegraph
 
Reported: 2013-02-25 17:44 PST by Justin Lebar (not reading bugmail)
Modified: 2014-12-30 17:57 PST (History)
33 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Patch, v1: Don't decode images synchronously if we've decoded them once before. (1.46 KB, patch)
2013-02-27 13:49 PST, Justin Lebar (not reading bugmail)
no flags Details | Diff | Review
WIP patch from comment 21 (12.86 KB, patch)
2013-03-05 13:18 PST, Justin Lebar (not reading bugmail)
no flags Details | Diff | Review
Don't decode images synchronously if we've decoded them once before. (843 bytes, patch)
2013-03-24 21:06 PDT, Seth Fowler [:seth] [:s2h]
joe: review+
Details | Diff | Review
Don't decode images synchronously if we've decoded them once before. (812 bytes, patch)
2013-06-04 12:27 PDT, Timothy Nikkel (:tnikkel)
joe: review+
Details | Diff | Review
fix landing the wrong patch (935 bytes, patch)
2013-06-26 13:28 PDT, Timothy Nikkel (:tnikkel)
joe: review+
bajaj.bhavana: approval‑mozilla‑aurora+
Details | Diff | Review

Description Justin Lebar (not reading bugmail) 2013-02-25 17:44:02 PST
(njn reported this; I haven't tried it myself)

STR that I expect will work but haven't tried myself:

* http://mpdrolet.tumblr.com/archive
* Scroll down so there are a few pages of images
* Wait 30s
* Grab the scrollbar and drag it to the top of the page.  This is not janky in a nightly without bug 689623, but the assertion is that it is with bug 698623.

I suspect this is because we're sync-decoding all these small images.

Perhaps a heuristic we can apply is not to sync-decode images that have been discarded (or have had their decodes canceled).  The sync decoding is really needed only for our tests, right?

I think this should be a high-priority follow-up to bug 698623, because I suspect it will be whine-worthy (that is, we'll get a lot of people complaining about it).
Comment 1 Nicholas Nethercote [:njn] (on vacation until July 11) 2013-02-25 21:11:03 PST
I see this on Linux and Mac, but it's not (to my eye) terrible.
Comment 2 Timothy Nikkel (:tnikkel) 2013-02-25 23:07:46 PST
I thought we synchronously did a small amount of decoding for all images? And for small images that small amount decodes the whole image, whereas for larger images it just does a piece of the decoding work?
Comment 3 Timothy Nikkel (:tnikkel) 2013-02-25 23:09:20 PST
I guess nothing you said contradicts that.

Anyways I thought we did the sync decoding so that we do get too many images flashing into view on tab switch.
Comment 4 Justin Lebar (not reading bugmail) 2013-02-25 23:15:07 PST
> Anyways I thought we did the sync decoding so that we do get too many images flashing into view on 
> tab switch.

That may have been a justification at one point, but I'd bet money that Taras would trade some image flashing for faster tab switching.  The tests were a real impediment to experimentation here, last time I checked.
Comment 5 Justin Lebar (not reading bugmail) 2013-02-27 11:55:44 PST
I have no idea if this will work, but if it compiles, I'll try it out on my slow netbook.

https://tbpl.mozilla.org/?tree=Try&rev=af9e2f293ae6
Comment 6 Justin Lebar (not reading bugmail) 2013-02-27 13:49:23 PST
Created attachment 719161 [details] [diff] [review]
Patch, v1: Don't decode images synchronously if we've decoded them once before.

Could it be so simple?  Still waiting for test results.

The difference in jank on my crappy netbook with this patch is night-and-day.

There's a visible downside, which is that images take longer to come into view.  I presume this is because we're devoting more CPU to scrolling and less to decoding.  I think this is the right trade-off, though.
Comment 7 Justin Lebar (not reading bugmail) 2013-02-27 13:53:54 PST
Alas this appears to be causing reftest failures.

I'm kind of tired of using reftest as an excuse for not fixing jank.

How about we make this change but hide it behind a pref and set that pref to false for reftest?
Comment 8 Justin Lebar (not reading bugmail) 2013-02-27 13:57:37 PST
(In reply to Justin Lebar [:jlebar] from comment #7)
> How about we make this change but hide it behind a pref and set that pref to
> false for reftest?

I guess this patch doesn't necessarily do what you'd expect due to the fact that we share images between windows.  So an image might have been decoded before, but in a different window.

So perhaps the right thing to do is to hide /all/ SOMEWHAT_SYNC decodes behind a pref, instead of all but the first.
Comment 9 Joe Drew (not getting mail) 2013-02-27 15:39:08 PST
It'll be pretty temporary anyways - most of the time we'll probably be switching SOMEWHAT_SYNCHRONOUS to NORMAL after bug 716140. (Switching from StartDecoding() to RequestDecode() in users, that is - I don't want to interpret anything in the depths of RequestDecode().)

Bug 716140 will probably make this a little better regardless, because the non-synchronous part of the decode will be off the main thread, but given that you can basically instantaneously start the decoding on a different thread by calling RequestDecode() in a multithreaded decoding world, there's not much reason to do StartDecoding() unless you actually want to block the main thread.
Comment 10 Joe Drew (not getting mail) 2013-02-27 15:39:52 PST
IOW, if this actually helps, a=me to hide this behind a non-reftest pref, and we'll revisit in general as we tune in a multithreaded world.
Comment 11 Justin Lebar (not reading bugmail) 2013-02-27 15:56:43 PST
\o/  I'll give you a patch soon.
Comment 12 Timothy Nikkel (:tnikkel) 2013-02-27 15:58:32 PST
You seem to be failing only one reftest, and only on some platforms. Maybe it would be better to disable one test then disable this patch for all reftests.
Comment 13 Justin Lebar (not reading bugmail) 2013-02-27 16:15:00 PST
(In reply to Timothy Nikkel (:tn) from comment #12)
> You seem to be failing only one reftest, and only on some platforms. Maybe
> it would be better to disable one test then disable this patch for all
> reftests.

Sure, it sounds like waiting for full test results would be reasonable.  Another option (if reftest supports it) would be to set the pref only for the relevant tests.
Comment 14 Justin Lebar (not reading bugmail) 2013-02-27 22:51:06 PST
I'm happy to have a look at the tests we're failing to see if I can understand what's going on; I think that's a reasonable burden to meet before hacking prefs.
Comment 15 Joe Drew (not getting mail) 2013-02-28 07:26:03 PST
(In reply to Justin Lebar [:jlebar] from comment #13)
> Another option (if reftest supports it) would be to set the pref only for
> the relevant tests.

It does. http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/README.txt#160
Comment 16 Justin Lebar (not reading bugmail) 2013-02-28 11:06:28 PST
I wonder if we could just make reftest wait until all images have been decoded.

What's going on in layout/reftests/bugs/816948-1.html is we set an image to be re-decoded in a "reftest-wait" test, and then we immediately remove "reftest-wait" from the html element.  There's no reason that the image should be decoded at this point.
Comment 17 Timothy Nikkel (:tnikkel) 2013-02-28 11:14:25 PST
Yeah, you can use special powers to observe the decode complete notification. http://mxr.mozilla.org/mozilla-central/source/image/test/crashtests/ownerdiscard.html?force=1 does this.
Comment 18 Justin Lebar (not reading bugmail) 2013-02-28 14:24:37 PST
I tried hacking something into reftest so it waits for all images to be decoded.  We'll see if this works.

https://tbpl.mozilla.org/?tree=Try&rev=4d5234c503bc
Comment 19 Justin Lebar (not reading bugmail) 2013-02-28 22:43:11 PST
> We'll see if this works.

> 16:18:33 INFO - REFTEST TEST-UNEXPECTED-FAIL | | program error managing snapshots

Apparently not.  I got this error locally a few times, but then it seemed to stop happening.
Comment 20 Justin Lebar (not reading bugmail) 2013-03-01 09:18:09 PST
https://tbpl.mozilla.org/?tree=Try&rev=0eaa272260c1
Comment 21 Justin Lebar (not reading bugmail) 2013-03-01 14:14:09 PST
Maybe one of these will work...

https://tbpl.mozilla.org/?tree=Try&rev=2496a1254d53
Comment 22 Justin Lebar (not reading bugmail) 2013-03-01 22:18:29 PST
Okay, so the attempt at fixing things in comment 21 fell short.

Judging by the logs, reftest is successfully waiting for the decode requests to finish, and no decodes finish after reftest takes its snapshot.  So it appears that when reftest gets notified that the decodes are done, those decodes haven't (yet?) triggered a pending paint event which would otherwise cause reftest to wait.

AFAICT (and I'll push to try to verify this), we will always call FlushInvalidations() on the image before we notify reftest that decoding is done.  (When the decoder shuts down, it calls FlushInvalidations(), and all decoders should shut down before I fire the notification.)

So assuming I'm right and we are calling FlushInvalidations() before we notify reftest, my question is: Should FlushInvalidations() be sufficient to get reftest to wait at [1], or do we need to do something harder than this?  dholbert or dbaron, do you understand this code (perhaps including the changes I made [1]) well enough to say?

[1] https://hg.mozilla.org/try/rev/2496a1254d53#l5.65
Comment 23 Justin Lebar (not reading bugmail) 2013-03-01 22:44:17 PST
Try push with additional printfs: https://tbpl.mozilla.org/?tree=Try&rev=f5fcdee9aaef
Comment 24 Justin Lebar (not reading bugmail) 2013-03-02 09:32:29 PST
Yes, afaict we flush invalidations before sending the notification to reftest.
Comment 25 Justin Lebar (not reading bugmail) 2013-03-02 18:21:31 PST
And I don't see anything in FlushInvalidations which interacts with layout, so I wonder how we know to paint the image.  Do we just get around to it when we get around to it?
Comment 26 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2013-03-02 18:29:14 PST
(In reply to Justin Lebar [:jlebar] from comment #22)
> dholbert or dbaron, do you understand this code (perhaps including the
> changes I made [1]) well enough to say?

Well, the simple answer is just:  no.
Comment 27 Justin Lebar (not reading bugmail) 2013-03-03 13:53:02 PST
In my (brief) gdb'ing, one thing looked suspicious.  AFAICT, layout is listening to the LOAD_COMPLETE notification but not the FRAME_COMPLETE notification.  Again AFAICT LOAD_COMPLETE means we've finished downloading the image but may not have finished decoding it yet, while FRAME_COMPLETE signals that we're done decoding.  It doesn't make much sense to me that we'd redraw when we're done downloading the image but not when we're done decoding it, so I tried changing it.

https://hg.mozilla.org/try/rev/378fd3add332

I can't reproduce the test failures on my mac, so I guess we'll see on try if this has any effect.

https://tbpl.mozilla.org/?tree=Try&rev=378fd3add332
Comment 28 Justin Lebar (not reading bugmail) 2013-03-03 16:00:46 PST
It seems that ImageLoader::DoRedraw never does anything in my tests.  This is because FrameLayerBuilder::IterateRetainedDataForFrame is iterating over a null array.  As a result, we don't call SchedulePaint() on the image's frame when it's done decoding.

In fact, we never call ImageLoader.cpp::InvalidateImagesCallback when I run reftest 816948-1.html.
Comment 29 Justin Lebar (not reading bugmail) 2013-03-03 19:37:50 PST
https://tbpl.mozilla.org/?tree=Try&rev=a046fd080df7
Comment 30 Justin Lebar (not reading bugmail) 2013-03-03 22:05:42 PST
This one does the pref hack.  https://tbpl.mozilla.org/?tree=Try&rev=f5280f61976e
Comment 31 Justin Lebar (not reading bugmail) 2013-03-04 04:53:07 PST
Another with the pref hack.  https://tbpl.mozilla.org/?tree=Try&rev=c0b41f211cee
Comment 32 Justin Lebar (not reading bugmail) 2013-03-04 07:23:16 PST
Sorry to keep spamming the bug.  https://tbpl.mozilla.org/?tree=Try&rev=e5dcda540a55
Comment 33 Seth Fowler [:seth] [:s2h] 2013-03-04 14:34:09 PST
(In reply to Justin Lebar [:jlebar] from comment #27)
> In my (brief) gdb'ing, one thing looked suspicious.  AFAICT, layout is
> listening to the LOAD_COMPLETE notification but not the FRAME_COMPLETE
> notification.  Again AFAICT LOAD_COMPLETE means we've finished downloading
> the image but may not have finished decoding it yet, while FRAME_COMPLETE
> signals that we're done decoding.  It doesn't make much sense to me that
> we'd redraw when we're done downloading the image but not when we're done
> decoding it, so I tried changing it.

I can offer some insight here. Imagelib guarantees that at least the first frame is decoded by the time LOAD_COMPLETE was fired. Bug 841579 will remove that requirement in support of OMTD, though. It's probably quite near to landing.
Comment 34 Justin Lebar (not reading bugmail) 2013-03-04 14:38:50 PST
Thanks, Seth.

I guess since all of this is slated to change soon, it makes sense to do the simplest thing that could possibly work and switch to all async decodes except for the tests that need sync decodes.  If my tests ever run, we'll see how green I am this time.
Comment 35 Justin Lebar (not reading bugmail) 2013-03-05 08:55:31 PST
As it turns out, it seems I now need a "all images finished decoding" notification for B2G, to fix bug 846177.  So maybe we need to go back to that approach.

I could use some help from a layout/gfx person, because I'm not familiar with this stuff at all.  Aha, Seth just pinged me!
Comment 36 Seth Fowler [:seth] [:s2h] 2013-03-05 09:32:48 PST
Justin, if all you need is a notification that guarantees that an image has downloaded and at least the first frame has decoded, the image's load event will do that for you. Also, if these images were all present when the document first loads, the _window's_ load event (not the document's; my apologies) should only be fired after the first frame has decoded for all images.
Comment 37 Justin Lebar (not reading bugmail) 2013-03-05 11:45:29 PST
(In reply to Seth Fowler [:seth] from comment #36)
> Justin, if all you need is a notification that guarantees that an image has
> downloaded and at least the first frame has decoded, the image's load event
> will do that for you.

But that doesn't work if we're re-decoding the image, which happens in bug 846177.  So I really want a notification which says "all images are done decoding and, if you wait until the next mozAfterPaint, they will be displayed."
Comment 38 Justin Lebar (not reading bugmail) 2013-03-05 13:12:55 PST
khuey asked me to post a summary of where this bug is since he might be able to take it over for me, since I just got stuck with a sizable b2g blocker.

Essentially I think we want to take the approach from comment 21, which adds a function to DOMWindowUtils that lets chrome JS be notified when all image decodes have finished.  We then make the reftest harness wait for this notification.  I think I can make B2G do a similar thing in bug 846177.

So if we want to take this code essentially unchanged, the only remaining task is to fix up the tests that are failing.  (There's a test failure on 816948-1.html and additional assertions on 364066-1.html.  I haven't looked at the latter; they may just go away when you rebase the patch.)

I'll post a copy of the last patch I had in a sec.
Comment 39 Justin Lebar (not reading bugmail) 2013-03-05 13:18:25 PST
Created attachment 721428 [details] [diff] [review]
WIP patch from comment 21

See the try build in comment 21; perhaps this just needs two changes to our tests.
Comment 40 mayankleoboy1 2013-03-12 05:48:01 PDT
I filed bug 848377 , which appears to be a dupe.
Comment 41 Robert Lickenbrock [:rclick] 2013-03-12 15:01:42 PDT
*** Bug 848377 has been marked as a duplicate of this bug. ***
Comment 42 Seth Fowler [:seth] [:s2h] 2013-03-24 21:06:54 PDT
Created attachment 728823 [details] [diff] [review]
Don't decode images synchronously if we've decoded them once before.

This is a slightly altered version of Justin's patch. It uses |mHasBeenDecoded| to decide what kind of decode StartDecoding() should request instead of checking the decode count. This means that it behaves a little differently; in particular, StartDecoding() now passes SOMEWHAT_SYNCHRONOUS to RequestDecodeCore until at least one decode has _finished_, while the original starts passing ASYNCHRONOUS to RequestDecodeCore as soon as one decode _starts_. These two behaviors will almost always appear the same - after all, if a decoder is running, StartDecoding() is ignored anyway - but there are edge cases involving decodes being started and then immediately canceled without finishing where the two would differ.
Comment 43 Seth Fowler [:seth] [:s2h] 2013-03-24 21:08:38 PDT
Comment on attachment 719161 [details] [diff] [review]
Patch, v1: Don't decode images synchronously if we've decoded them once before.

Justin, hopefully you don't find it presumptuous if I mark your version obsolete. =) After thinking about it I think the new version is the way to go, though.
Comment 44 Seth Fowler [:seth] [:s2h] 2013-03-24 21:09:12 PDT
Whoa there. =) Resetting assignee.
Comment 45 Kyle Huey [:khuey] (khuey@mozilla.com) 2013-03-24 21:31:42 PDT
(In reply to Seth Fowler [:seth] from comment #44)
> Whoa there. =) Resetting assignee.

I think I liked this bug better that way.
Comment 46 Justin Lebar (not reading bugmail) 2013-03-25 06:10:08 PDT
> After thinking about it I think the new version is the way to go, though.

I don't have an opinion either way.

I think ideally we'd never decode somewhat_sync, since that's a recipe for jank.  Maybe we can look at making that change in a separate bug.
Comment 47 Seth Fowler [:seth] [:s2h] 2013-03-25 11:08:32 PDT
It might be that that'll end up happening in bug 854394. I'd support that change as well.
Comment 48 Seth Fowler [:seth] [:s2h] 2013-04-17 18:28:02 PDT
Justin, what are your thoughts on this patch? Should we land it, or look at a different approach, or what?
Comment 49 Justin Lebar (not reading bugmail) 2013-04-18 02:08:20 PDT
I'd love to land this, but I thought we were blocked on bug 695763.

If we can't land this before m-c branches to aurora, I think we should consider disabling bug 689623.
Comment 50 Seth Fowler [:seth] [:s2h] 2013-04-18 12:43:45 PDT
We are - sorry, that was poorly phrased. I hope to get us unblocked there ASAP now that that patch's dependencies have landed. I was trying to make sure that we actually wanted to land this patch; wasn't sure if we were still in the discussion stages. It sounds like we do, so I'm going to request review for the patch.
Comment 51 Joe Drew (not getting mail) 2013-04-18 14:43:59 PDT
Comment on attachment 728823 [details] [diff] [review]
Don't decode images synchronously if we've decoded them once before.

Review of attachment 728823 [details] [diff] [review]:
-----------------------------------------------------------------

you should also see whether it blows up Try :)

::: image/src/RasterImage.cpp
@@ +2776,5 @@
>  /* void startDecode() */
>  NS_IMETHODIMP
>  RasterImage::StartDecoding()
>  {
> +  return RequestDecodeCore(mHasBeenDecoded ? ASYNCHRONOUS : SOMEWHAT_SYNCHRONOUS);

This definitely needs a comment, likely referring to this bug, and saying that we're explicitly trading off flashing for responsiveness in the case that we're redecoding an image.
Comment 52 Seth Fowler [:seth] [:s2h] 2013-04-18 15:25:57 PDT
Thanks for the review, Joe. I'll add the comment. As for try, this isn't ready to go until the bug it depends on is ready, but I wanted to get a review from you in case you wanted bigger changes to this patch.
Comment 53 Justin Lebar (not reading bugmail) 2013-04-27 12:56:53 PDT
Seth, how's it going here?

I think we should probably disable bug 689623 until we resolve this bug.  If you agree and you can't resolve this bug within a few days, are you able to write and land that patch (on branches if necessary)?
Comment 54 Justin Lebar (not reading bugmail) 2013-05-01 12:06:02 PDT
FWIW I tried disabling somewhat-sync decodes on b2g18, and it didn't make any appreciable difference in jank when scrolling pinterest, even after a memory-pressure event knocked out our whole cache of decoded images.
Comment 55 Seth Fowler [:seth] [:s2h] 2013-05-02 14:11:24 PDT
Justin, let's triage the situation on Monday. I want to get bug 695763 in but I can't land it without resolving bug 859377, or else its dependencies will be backed out. Bug 859377 is very high priority for me right now.
Comment 56 Timothy Nikkel (:tnikkel) 2013-06-04 12:27:37 PDT
Created attachment 758101 [details] [diff] [review]
Don't decode images synchronously if we've decoded them once before.

We've added a new decode type since, so re-requesting review to be sure.
Comment 57 Joe Drew (not getting mail) 2013-06-04 14:53:40 PDT
Comment on attachment 758101 [details] [diff] [review]
Don't decode images synchronously if we've decoded them once before.

Review of attachment 758101 [details] [diff] [review]:
-----------------------------------------------------------------

::: image/src/RasterImage.cpp
@@ +2765,5 @@
>  NS_IMETHODIMP
>  RasterImage::StartDecoding()
>  {
> +  return RequestDecodeCore(mHasBeenDecoded ?
> +    SYNCHRONOUS_NOTIFY : SYNCHRONOUS_NOTIFY_AND_SOME_DECODE);

I shall repeat my earlier comment: "This definitely needs a comment, likely referring to this bug, and saying that we're explicitly trading off flashing for responsiveness in the case that we're redecoding an image."
Comment 58 Timothy Nikkel (:tnikkel) 2013-06-04 14:57:29 PDT
(In reply to Joe Drew (:JOEDREW! \o/) from comment #57)
> I shall repeat my earlier comment: "This definitely needs a comment, likely
> referring to this bug, and saying that we're explicitly trading off flashing
> for responsiveness in the case that we're redecoding an image."

Thank you for remembering. This probably would have fallen through the cracks after Justin had to move on from this bug.
Comment 59 Timothy Nikkel (:tnikkel) 2013-06-04 15:07:35 PDT
I added the comment:
  // Here we are explicitly trading off flashing for responsiveness in the case
  // that we're redecoding an image (see bug 845147).
Let me know if you'd like a change.

And pushed
https://hg.mozilla.org/integration/mozilla-inbound/rev/022bc808586b
Comment 60 Daniel Holbert [:dholbert] 2013-06-04 19:53:11 PDT
This seems like the most likely culprit to have made list-simple-1.html start failing around once every few pushes, with the reference case lot loading its two lime-green-square bullet-point PNG images.

There appears to have been a delay of a few pushes before this started happening, but this is by far the most related-looking push, so philor & I both tend to think this was the cause.
Sample logs:
 https://tbpl.mozilla.org/php/getParsedLog.php?id=23796196&tree=Mozilla-Inbound
 https://tbpl.mozilla.org/php/getParsedLog.php?id=23794047&tree=Mozilla-Inbound
Comment 61 Daniel Holbert [:dholbert] 2013-06-04 19:53:51 PDT
(In reply to Daniel Holbert [:dholbert] from comment #60)
> This seems like the most likely culprit to have made list-simple-1.html
> start failing around once every few pushes, with the reference case lot
> loading its two lime-green-square bullet-point PNG images.

s/reference case lot loading/reference case not rendering/
Comment 62 Daniel Holbert [:dholbert] 2013-06-04 20:03:09 PDT
[So basically, in the failures, it looks like list-simple-1-ref.html's bullet images aren't ready to paint (aren't finished decoding, perhaps?) when onload fires and the reftest-snapshot gets taken, or something.  I believe that image decode used to block onload.]

I backed this out, since I'm pretty sure it's the cause. Apologies if it's not.
 https://hg.mozilla.org/integration/mozilla-inbound/rev/b39437839d53
Comment 64 Phil Ringnalda (:philor) 2013-06-04 20:22:24 PDT
Retriggering on this push did produce quite a few instances on it.
Comment 65 Joe Drew (not getting mail) 2013-06-05 07:39:38 PDT
This is almost certainly the cause. If we *ever* reuse images in a reftest session the first occasion will block onload for the decode, but the second and subsequent won't.

One thought I have is to have a reftest-specific pref that disables this behaviour.
Comment 66 Timothy Nikkel (:tnikkel) 2013-06-05 08:39:05 PDT
Is what happens before/after the load event even relevant in this situation though? For reftests we use the sync decode flag when we are taking the snapshots for comparison so that should trigger a sync decode when we Draw() the image, that should decode it no matter what. The change to StartDecoding shouldn't have any effect on that as Draw() calls SyncDecode, not StartDecoding.
Comment 67 Joe Drew (not getting mail) 2013-06-05 08:42:51 PDT
Doesn't taking the snapshot just do a glReadPixels(), though?
Comment 68 Seth Fowler [:seth] [:s2h] 2013-06-05 11:23:58 PDT
Nah. CanvasRenderingContext2D::DrawWindow calls nsIPresShell::RenderDocument here:

https://mxr.mozilla.org/mozilla-central/source/content/canvas/src/CanvasRenderingContext2D.cpp#3225

This calls nsLayoutUtils::PaintFrame, which builds a display list and paints it:

https://mxr.mozilla.org/mozilla-central/source/layout/base/nsLayoutUtils.cpp#1847

Note that this failure is exactly what prevented me from landing bug 695763, though given the patches that Timothy has landed recently I would've expected it to be resolved.
Comment 69 Timothy Nikkel (:tnikkel) 2013-06-21 11:52:34 PDT
We build a display list and use that to determine if we need to invalidate anything in our retained layers. We'll paint anything invalidated, and then composite the retained layers to a resulting surface.

The problem is that if the image is not decoded when the reftest harness takes it's snapshot, even though it asks for sync decodes, no decoding happens because the contents of the retain layers are still marked as valid, so the painting of the display items for those images is optimized away, and hence we don't get any calls into imagelib with a sync decode flag set. The fix is to check if we are sync decoding and we are not decoding and invalidate, which is what the patches in bug 695763 were doing (but they missed some cases).
Comment 70 Timothy Nikkel (:tnikkel) 2013-06-22 09:34:48 PDT
Relanded with enough patches from bug 695763 that it shouldn't cause problems

https://hg.mozilla.org/integration/mozilla-inbound/rev/ef8d03814470
Comment 71 Phil Ringnalda (:philor) 2013-06-22 15:56:07 PDT
https://hg.mozilla.org/mozilla-central/rev/ef8d03814470
Comment 72 Nicholas Nethercote [:njn] (on vacation until July 11) 2013-06-23 18:28:35 PDT
tn, can you clarify what the status of 689623 is now?  AIUI it's enabled on trunk but disabled on Beta and maybe Aurora, or something like that?  Thanks!
Comment 73 Joe Drew (not getting mail) 2013-06-25 07:16:47 PDT
comment 72
Comment 74 Timothy Nikkel (:tnikkel) 2013-06-25 08:56:41 PDT
Sorry, somehow I overlooked the bugmail for your comment.

689623 is present in 22 and up. It is (currently) pref'ed off in 22 and 23. It is currently pref'ed on in 24 and up. Since we just had a merge day that means 689623 is present but pref'ed off in release and beta, and pref'ed on in aurora and nightly.
Comment 75 Timothy Nikkel (:tnikkel) 2013-06-25 14:57:16 PDT
So as it stands we should be able to keep bug 689623 pref'ed on from now on (knock on wood), resolving this bug was the only thing blocking us from doing that.
Comment 76 Timothy Nikkel (:tnikkel) 2013-06-25 15:01:37 PDT
There is of course room to improve on the current situation so that we only skip sync decoding when scrolling image into view. Bug 847221 is the first step on that path.
Comment 77 Nicholas Nethercote [:njn] (on vacation until July 11) 2013-06-25 16:41:22 PDT
FF24 FTW!  Thanks, tn.
Comment 78 Timothy Nikkel (:tnikkel) 2013-06-26 13:08:56 PDT
I just noticed that I accidentally landed a version of the patch that I had been using for testing that made us never do any sync decoding when we call RequestDecodeCore (notice the comment markers):

  return RequestDecodeCore(//mHasBeenDecoded ?
    SYNCHRONOUS_NOTIFY /*: SYNCHRONOUS_NOTIFY_AND_SOME_DECODE*/);

It is good news that this is green and survived several days on nightly with no problems reported so far.

Do we want to keep the current behaviour? Or keep the mHasBeenDecoded guard?
Comment 79 Joe Drew (not getting mail) 2013-06-26 13:15:26 PDT
I think we want to keep the mHasBeenDecoded guard. Otherwise, there's no difference between StartDecoding and RequestDecode. 

Maybe we don't *want* to have a difference between the two — I'd be willing to listen to that argument — but for now you should probably fix that.
Comment 80 Justin Lebar (not reading bugmail) 2013-06-26 13:21:12 PDT
> Maybe we don't *want* to have a difference between the two — I'd be willing to listen to that  
> argument — but for now you should probably fix that.

Can we file a bug to have this discussion?  I'm with tn in that I think it might be a very nice thing to have.
Comment 81 Timothy Nikkel (:tnikkel) 2013-06-26 13:28:08 PDT
Created attachment 767959 [details] [diff] [review]
fix landing the wrong patch
Comment 82 Timothy Nikkel (:tnikkel) 2013-06-26 13:32:15 PDT
(In reply to Justin Lebar [:jlebar] from comment #80)
> > Maybe we don't *want* to have a difference between the two — I'd be willing to listen to that  
> > argument — but for now you should probably fix that.
> 
> Can we file a bug to have this discussion?  I'm with tn in that I think it
> might be a very nice thing to have.

We have bug 854394 for such a discussion in the context of multithread image decoding. Does that cover it?
Comment 83 Timothy Nikkel (:tnikkel) 2013-06-26 13:34:15 PDT
Comment on attachment 767959 [details] [diff] [review]
fix landing the wrong patch

[Approval Request Comment]
Bug caused by (feature/regressing bug #): landed the wrong patch in this bug
User impact if declined: possibly more images flashing into view while we decode them
Testing completed (on m-c, etc.): this patch just brings us to what we always intended to land
Risk to taking this patch (and alternatives if risky): the patch that was accidentally landed is more risky than this patch
String or IDL/UUID changes made by this patch: none
Comment 84 Timothy Nikkel (:tnikkel) 2013-06-26 13:37:18 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/1c0af79ac29c
Comment 85 Ed Morley [:emorley] 2013-06-27 03:50:51 PDT
https://hg.mozilla.org/mozilla-central/rev/1c0af79ac29c
Comment 86 Timothy Nikkel (:tnikkel) 2013-06-28 14:54:45 PDT
https://hg.mozilla.org/releases/mozilla-aurora/rev/767c0dbc8cf2
Comment 87 Andreas Eibach 2014-10-22 18:27:15 PDT
Whee, my head is spinning now ... just finished reading through all 86 (!!) comments.
Anyways, excellent work tn (et al) on this hard-to-fix issue.

While reading, another fairly old issue came into my mind, that *might* (or might not be related) to the quadruple ('images', 'scrolling', 'jerky', 'image-heavy web page') ... this:

 - bug 629234 -

Though it might look only marginally related, I can't really say that bug 988594 is related any closer related with its Nexus4-typical behavior reported ;-) 

I'm wondering in particular if your fixes would at least minimize the (largely Nvidia-specific) issues reported under #629234 ?
Comment 88 Andreas Eibach 2014-10-22 18:28:59 PDT
oops, strike the excess 'related' pls

Note You need to log in before you can comment on or make changes to this bug.