Infinite loop in mozilla::image::RasterImage::RequestRefresh

RESOLVED FIXED in mozilla24

Status

()

Core
ImageLib
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: till, Assigned: Joe Drew (not getting mail))

Tracking

Trunk
mozilla24
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments)

(Reporter)

Description

5 years ago
Created attachment 754551 [details]
Instruments trace of infinite loop

I just had a hang in the current (2013-05-27) Nightly. An Instruments run indicates an iloop within mozilla::image::RasterImage::DrawFrameTo.

When this happened, I had just clicked on a tags folder in Google Reader, the previously active folder's content was already gone, but the newly active one's not yet shown.

Instruments trace is attached.

Comment 1

5 years ago
I don't see how we could be ilooping _within_ DrawFrameTo; the only loops seem to be quite bounded. My suspicion is that we just end up calling DrawFrameTo over and over instead.
(Reporter)

Comment 2

5 years ago
Quite possibly, yes. I should've attached a debugger to the app while it was still running ...
(Reporter)

Comment 3

5 years ago
So, this happened again, and this time I did attach a debugger to it. The iloop indeed occurs in RequestRefresh, not DrawFrameTo.

I'd bet good money this is a regression from bug 867758.
Blocks: 867758
Flags: needinfo?(joe)
Summary: Infinite loop in mozilla::image::RasterImage::DrawFrameTo → Infinite loop in mozilla::image::RasterImage::RequestRefresh
(Assignee)

Comment 4

5 years ago
Very likely, yes, or one of its dependent bugs. I thought I had fixed this in bug 875173, though.

Till, do you recall (were you able to tell?) precisely what condition caused this? Or do you have any possible steps to reproduce this?
Component: Graphics → ImageLib
Flags: needinfo?(joe)
(Reporter)

Comment 5

5 years ago
(In reply to Joe Drew (:JOEDREW! \o/) from comment #4)
> Very likely, yes, or one of its dependent bugs. I thought I had fixed this
> in bug 875173, though.
> 
> Till, do you recall (were you able to tell?) precisely what condition caused
> this? Or do you have any possible steps to reproduce this?

Sorry, no on both counts. I ran into this in an opt build and didn't have symbols, so couldn't tell which condition caused it.

The only, pretty unreliable, STR I can give are in comment 0: switch between folders in Google Reader ...

If that helps at all, I also had a hang that went away after a few seconds while switching in gReader. Of course, that might've been unrelated.
(Reporter)

Comment 6

5 years ago
So, I can reproduce this issue annoyingly easy, now. :(

At least, that gave me plenty of opportunities to investigate it some more: turns out the hang isn't really permanent, it can just take a long time for the loop to end. Sometimes it ends after a second or so, sometimes it takes maybe a minute.

After fetching the symbols, I can confirm the somewhat obvious: it's the loop at http://mxr.mozilla.org/mozilla-central/source/image/src/RasterImage.cpp#657 that takes this long to complete.

This being an opt build, however, I can't tell which condition is causing this, exactly: all locals are optimized away, and my asm foo isn't nearly good enough to figure out in which registers the relevant values are stored, sorry.

@joe, can I do anything else to help you debug this? Ping me on IRC if an interactive debug session would help you.
Flags: needinfo?(joe)
(Assignee)

Comment 7

5 years ago
I'd be very interested if you still have problems with the fix to bug 876332.
Flags: needinfo?(joe)
(Reporter)

Comment 8

5 years ago
Ok, will retest once that has landed.
(Reporter)

Comment 9

5 years ago
So, the fix to bug 876332 made this much harder to reproduce (as in, it doesn't happen every second time I navigate within gReader), but, unfortunately, it didn't fully fix it.

Here's a Gecko Profiler profile of the hang - not that it gives a lot of new insights or anything:
http://people.mozilla.com/~bgirard/cleopatra/#report=0221df66d0a16e82f70d670ef06002d9ef11903e
Flags: needinfo?(joe)
(Assignee)

Comment 10

5 years ago
Unfortunately bug 876332 seems to have fixed the other bug that seemed to have the same cause as this (bug 877253), so it's harder to reproduce.

Till, if you're up to building and debugging Firefox, I can walk you through finding out what the state is when we run into this case.
Flags: needinfo?(joe)
(Reporter)

Comment 11

5 years ago
(In reply to Joe Drew (:JOEDREW! \o/) from comment #10)
> Till, if you're up to building and debugging Firefox, I can walk you through
> finding out what the state is when we run into this case.

Yes yes, I'm certainly up for that. I just hoped that giving you the information I had would let me get out of doing propert debugging. ;)

I have a debug build and can reproduce the issue with that. I guess we should go through that together the next time we're both online, but here's what information I get out of it, for now:

In RasterImage::RequestRefresh, the loop `while (currentFrameEndTime <= aTime)` doesn't really iloop, it just sometimes takes a very long time to finish.

One case I had in the debugger looked like this:
- aTime: 211042319161672
- all frame end times are identical: 210825837815525
- the animation has 18 frames
- loop count is -1
- the image in question is "chrome://browser/skin/tabbrowser/connecting@2x.png"
- AdvanceFrame() returns `true`

It took tens of thousands of iterations for this loop to finish. I don't pretend to understand the code well enough to know whether this is intended or a problem, and the general slowness of the debug browser makes it kinda hard to know if this was an actual instance of the bug, but it might have been.

If this information helps you in finding the bug: great. If not, ping me on IRC (#developers would work) when you have time, then we can go through it together.
Flags: needinfo?(joe)
(Assignee)

Updated

5 years ago
Duplicate of this bug: 878709
(Assignee)

Comment 13

5 years ago
Till's been having a lot of trouble reproducing this in a debugger again. Something I want to know is what GetCurrentFrameEndTime() is returning, and how it's calculating that result (i.e., what the individual frame timeouts are), both within RequestRefresh() and AdvanceFrame().
Flags: needinfo?(joe)

Comment 14

5 years ago
Would it be a good idea to back out bug 876332, temporarily, to help debug this issue?
(Assignee)

Comment 15

5 years ago
No, because that bug itself caused long hangs, and significantly more frequently than this bug, so one would be overwhelmed with the other.

Comment 16

5 years ago
When I leave my browser idle on the following page, eventually one core get saturated.  I'm not sure how long you'd have to leave the browser on the page, but it's happened consistently on both my home and work PCs since yesterday when I first visited the page:

http://feryjunaedi.wordpress.com/2008/05/29/redistribution-between-eigrp-in-two-different-autonomous-systems/
(Assignee)

Comment 17

5 years ago
Does your browser stop responding?

Comment 18

5 years ago
(In reply to Joe Drew (:JOEDREW! \o/) from comment #17)
> Does your browser stop responding?

Mostly unresponsive.  Any interaction is excruciatingly slow.
(Assignee)

Comment 19

5 years ago
OK, and you've profiled and it's all under RequestRefresh?

Comment 20

5 years ago
That I didn't do.  Will the Gecko Profiler extension provide sufficiently useful information for you?
(Assignee)

Comment 21

5 years ago
Yep! That's exactly what I'd like!
(Assignee)

Comment 22

5 years ago
I'm pretty sure what's happening here is that we're drawing an image a long time after its last refresh, for whatever reason, and looping, blending frames, until we reach the right time. Of course, mostly gifs loop, so we could do a *single* iteration of the loop instead of tens of thousands.
Assignee: nobody → joe
That squares with the conditions under which I tend to experience this, when I've scrolled down the endless tumblr dashboard for a long way, and then something causes me to scroll back to the top where an animated gif is looping.

Comment 24

5 years ago
Glad you're figuring it out 'cause I was having trouble getting the profile to upload and was going to try again later today.  That said, I wasn't sure I was observing the right thing, because it was fingering a whole bunch of timer-related functions.
(Assignee)

Comment 25

5 years ago
Created attachment 759411 [details] [diff] [review]
For looping images, advance by the loop time when we're being asked to

When there is a long delay between calls to RequestRefresh(), for example because an animated image has been scrolled off the screen, the current animation frame time can be significantly behind the current time, requiring a huge number of composites to catch up. This patch makes us skip those intermediate composites, jumping to the closest multiple of the image's loop time.

I'm not sure if putting this in AdvanceFrame or RequestRefresh makes more sense, but since AdvanceFrame explicitly sets the currentAnimationFrameTime, I chose this location. It's not perfect, though, because it'll be checked every time.
Attachment #759411 - Flags: review?(seth)
(Assignee)

Comment 26

5 years ago
https://tbpl.mozilla.org/?tree=Try&rev=f38bee2ee06d

Comment 27

5 years ago
That does not fix my issue of comment 16, so I might be experiencing a different issue.
(Assignee)

Comment 28

5 years ago
In that case please try getting your profile to upload! :D

Comment 29

5 years ago
The patch fixes the bug for me.
Comment on attachment 759411 [details] [diff] [review]
For looping images, advance by the loop time when we're being asked to

Review of attachment 759411 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.
Attachment #759411 - Flags: review?(seth) → review+
(Assignee)

Comment 31

5 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/00c0970b6926
https://hg.mozilla.org/mozilla-central/rev/00c0970b6926
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla24
You need to log in before you can comment on or make changes to this bug.