Closed Bug 926048 Opened 6 years ago Closed 4 years ago

Animated .gifs cause Firefox to become unresponsive/freeze.

Categories

(Core :: ImageLib, defect)

27 Branch
x86_64
Windows 7
defect
Not set

Tracking

VERIFIED FIXED
mozilla47
Tracking Status
firefox47 --- verified

People

(Reporter: bullvar, Assigned: tnikkel)

Details

Attachments

(5 files, 3 obsolete files)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0 (Beta/Release)
Build ID: 20131011115413

Steps to reproduce:

Tried to view an animated .gif in browser.


Actual results:

Became unresponsive/froze, rarely recovering within 30-40 seconds if at all.


Expected results:

The .gif should have played smoothly after it'd finished downloading, without any performance hits/hangs and freezes.
Happens in Firefox, Aurora and Nightly. Never happened in older versions, unsure as to when the problem started to actually occur.
I'm not seeing this issue.
Can you show me one that you have an issue with? Maybe this is a performance issue on your machine?
Can you please provide a specific url/image that you're experiencing this problem with?  

In addition, because it's a nightly build, can you search for a regression range using the information on https://developer.mozilla.org/en-US/docs/Mozmill/How_to_do_regression_testing ?
Component: Untriaged → Graphics
Flags: needinfo?(bullvar)
Product: Firefox → Core
Almost all .gifs on 4chan.org around 2.0 MB. However, I've found a temporary workaround: pre-loading the images via their thumbnails, as it seems to hang only if it has to download the image from scratch at its full resolution.

These issues aren't present in other browsers like Chromium. I have also noticed that my CPU spikes when Firefox hangs.
Flags: needinfo?(bullvar)
Please give us a *SPECIFIC* example.
We can't help with just a domain full of user generated content, sadly.

Once we have a link to a single image, we can take a look
Well, I just had a hang on this, not a huge one, but it was a hang.

http://images.4chan.org/a/src/1381648483886.gif
I've managed to reproduce the bug using the following gif (the gif provided in the bug was removed from the web page):

http://www.capnbeeb.com/SA/Gifs/aliensletsroooockHD.gif

So far the bug is reproducible on all builds back to Firefox 18; release 17 is not affected.

Last good release:
Build identifier: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/17.0 Firefox/17.0

First bad release:
Build identifier: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:18.0) Gecko/20100101 Firefox/18.0

Please note that the bug is intermittent: it reproduces in about 50% of tries, and the hang varies from 2-3 sec to 20 sec.
Tomorrow I will provide a more accurate regression window.
Status: UNCONFIRMED → NEW
Ever confirmed: true
A more accurate regression window:

Last Good Release:
Build identifier: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:18.0) Gecko/18.0 Firefox/18.0
Build Id: 20120928030544
http://hg.mozilla.org/mozilla-central/rev/895f66c4eada

First Bad Release:
Build identifier: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:18.0) Gecko/18.0 Firefox/18.0
Build Id: 20120929030624
http://hg.mozilla.org/mozilla-central/rev/c09a0c022b2e

Steps to reproduce:
1. Set When Nightly Starts option to "Show my windows and tabs from last time"
2. Open http://www.capnbeeb.com/SA/Gifs/aliensletsroooockHD.gif (wait until the gif is properly loaded)
3. Close Firefox browser and reopen it.

Expected:
The gif runs smoothly without any hangs.

Actual:
The gif hangs even if it was previously loaded.

Please note that the bug is intermittent: it reproduces in about 50% of tries, and the hang varies from 2-3 sec to 20 sec.

If I can help you with further information please don't hesitate to ask.
regression range from comment 8:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=895f66c4eada&tochange=c09a0c022b2e

Does the browser become unresponsive or does the gif just not update very often? If the browser was unresponsive then things like scrolling, opening the context menu, interacting with the address bar would show the problem. Whereas if it is just the gif not updating on the screen I'm leaning towards an invalidation problem because the regression range includes dlbi (display list based invalidation, bug 539356).
My entire browser hangs, becomes unresponsive and 'whites out' (a Windows feature) whenever I try to perform any action on it.

To close it I'll have to kill the process, as a standard end task won't work.
(In reply to Timothy Nikkel (:tn) from comment #9)

> Does the browser become unresponsive or does the gif just not update very
> often? If the browser was unresponsive then things like scrolling, opening
> the context menu, interacting with the address bar would show the problem.

The browser does not become unresponsive in my case; the same frame of the gif is just displayed for a longer time, even if the gif was previously cached. I guess Dan has a different issue.
It's already in beta 26.

http://jsfiddle.net/KLtzg/3/

Quickly hovering in and out over the link freezes the browser.

Looks like gifs now continue to animate even when they are removed from the DOM, while in Firefox 25 (which doesn't seem to be affected by this issue), each time you mouseover the link, the gif restarts from the beginning.
I get this freeze frequently, many times a day. I'm currently on Firefox 26, and I'm fairly sure I had the same problem on Firefox 25, but it is a relatively recent issue.
I mainly experience this crash/freeze while I visit mlpchan.net and 4chan.org when the gifs expand on hover.

I've made a thread there where i've posted a few gifs that usually crash firefox very quickly:
https://mlpchan.net/site/res/13123.html

It ONLY happens (at least to me) when hovering gifs, and never if I inline them into the post or open them in a new tab, so make sure you enable hovering in the site settings (upper right corner).
Since I cannot edit, I wanted to add to my comment above:
On a fresh start of Firefox, the freeze/crash does not seem to reappear as quickly (hovering the same gifs that made it crash earlier does not reproduce a crash). But I seem to be able to reproduce it quickly by hover-expanding many more gifs, such as almost all of the gifs I've posted in the thread on mlpchan.
I'm the developer of mlpchan.net, an imageboard that by default shows full images when you mouseover an image thumbnail. (4chan has a similar feature that is off by default.) Ever since Firefox 26 was released, I've gotten many reports from users of Firefox freezing for good when an animated GIF is previewed, and I've experienced it myself many times a day. I mouse over a thumbnail of an animated GIF, the full image displays on the first frame (maybe a few frames go by and animate, can't remember), maybe I mouse away to make the full GIF go away, and then my browser entirely hangs forever with the image there (before it was removed if I had moused away; I can't remember if the crash has ever happened besides when I'm mousing away to remove the preview). I have to kill the process.

The issue is always Firefox 26 after viewing a GIF using the mouseover-hover-preview. I've also seen many complaints of this happening on 4chan from people that have the feature enabled (or are using the 4chanX userscript that happens to reimplement the feature itself). I've never seen or heard of another browser crashing from the hover-preview. Firefox before version 26 never had any problems with these image hover-previews. Each implementation I mentioned of these image hover previews (mlpchan, 4chan, 4chanX) was independently written, so I doubt they all suffer from a common error in themselves (one that somehow only manifests in Firefox 26 with GIFs, while none of them has any GIF-specific code). I assume the issue has something to do with how Firefox handles animated GIFs being added and removed from the DOM.

Every user of the site who uses Firefox that I've discussed this with has told me that this crash happens for them several times a day since version 26.
I am currently on Firefox 26 (Ubuntu Linux x86_64) and I am experiencing this issue on a regular basis as well. It appears to be random: sometimes loading gifs will make Firefox hang and use the whole CPU, sometimes it won't.
Seth, have we/you fixed any of these in 28 or 29?
Flags: needinfo?(seth)
If someone can get a profile of this (https://developer.mozilla.org/en-US/docs/Performance/Profiling_with_the_Built-in_Profiler) this will become a lot more actionable.
(In reply to Milan Sreckovic [:milan] from comment #17)
> Seth, have we/you fixed any of these in 28 or 29?

Not intentionally, though we certainly have fixed a lot of bugs that could be related. I'm going to try to reproduce this and see what I can find out.
Flags: needinfo?(seth)
I can't reproduce this on OS X or linux with any of the examples posted, on either FF 26 or FF Nightly. I've spent most of my time on the following two examples:

- https://mlpchan.net/site/res/13123.html
- http://jsfiddle.net/KLtzg/6/

I'd love to have a little bit more information to narrow this down. Does the browser have to be running for a while to make it happen? Does it still happen in a fresh profile with no addons installed?

The troubleshooting information in about:support would also be helpful. You should be able to click the "Copy all to clipboard" button and paste the information here.
(In reply to Seth Fowler [:seth] from comment #20)
> I've spent most of my time on the following two
> examples:

Forgot to mention: http://www.capnbeeb.com/SA/Gifs/aliensletsroooockHD.gif

I can't reproduce it using this one either, even after closing and reopening FF several times.
(In reply to Seth Fowler [:seth] from comment #21)
> (In reply to Seth Fowler [:seth] from comment #20)
> > I've spent most of my time on the following two
> > examples:
> 
> Forgot to mention: http://www.capnbeeb.com/SA/Gifs/aliensletsroooockHD.gif
> 
> I can't reproduce it using this one either, even after closing and reopening
> FF several times.

Using the mlpchan example I am often able to reproduce it even on a fresh start, but not always. Just now it crashed on the first gif I hovered (completely fresh start).
If Firefox has been open for some time, it usually only requires me to hover a single large gif to freeze, while a fresh start usually requires a number of gifs.
about:support for my Firefox 26: http://pastebin.com/5JtYamyY

However, I cannot reproduce it on the latest nightly, and nightly has yet to freeze on any gifs for me. Perhaps coincidentally: nightly restarts animations when I hover out and then back in, whereas FF 26 will pause the animation and try to continue it from where it was paused when I hover back in (sometimes showing the first frame for a split second).
about:support for nightly: http://pastebin.com/CegBgqy8

It has not made any difference for me which OS I use. Stable gives the freeze in OpenSUSE Linux 13.1 and Fedora Linux 20 as well. And conversely, nightly does not.
If you have time to do the profiling steps from comment 18 that might help out more here.   Not tracking this yet since it's not obvious that it's a widely hit issue, reproducing on 28.
Flags: needinfo?(bullvar)
Attached file WinDbg log
FWIW, I repro'd a freeze against 26 in safe mode only (i.e. HWA off; vanilla profile) with http://fiddle.jshell.net/KLtzg/6/show/
I failed to repro against 27 RC and higher.
This seems to contradict some previous comments.

windbg output:
003ef27c 5b6bd07b xul!mozilla::image::FrameBlender::DrawFrameTo(unsigned char * aSrcData = 0x0000000e "--- memory read error at address 0x0000000e ---", struct nsIntRect * aSrcRect = 0x003ef35c, unsigned int aSrcPaletteLength = 0x400, bool aSrcHasAlpha = false, unsigned char * aDstPixels = 0x0cc50000 "ueM???", struct nsIntRect * aDstRect = 0x003ef33c, mozilla::image::FrameBlender::FrameBlendMethod aBlendMethod = kBlendOver (0n1))+0xbf [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\image\src\frameblender.cpp @ 491]
003ef380 5b6bc42a xul!mozilla::image::FrameBlender::DoBlend(struct nsIntRect * aDirtyRect = 0x00000400, unsigned int aPrevFrameIndex = 0xca3f900, unsigned int aNextFrameIndex = 0x17)+0x50c [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\image\src\frameblender.cpp @ 383]
003ef3cc 5b6bc5ff xul!mozilla::image::FrameAnimator::AdvanceFrame(class mozilla::TimeStamp aTime = class mozilla::TimeStamp)+0xbc [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\image\src\frameanimator.cpp @ 137]
003ef45c 5b6c0916 xul!mozilla::image::FrameAnimator::RequestRefresh(class mozilla::TimeStamp * aTime = 0x003ef678)+0x4f [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\image\src\frameanimator.cpp @ 188]
003ef4a0 5b4ff3ef xul!mozilla::image::RasterImage::RequestRefresh(class mozilla::TimeStamp * aTime = 0x003ef678)+0x41 [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\image\src\rasterimage.cpp @ 535]
003ef664 5b4137d7 xul!nsRefreshDriver::Tick(int64 aNowEpoch = 0n1391065577239318, class mozilla::TimeStamp aNowTime = class mozilla::TimeStamp)+0x85f [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\layout\base\nsrefreshdriver.cpp @ 1179]
003ef6c0 5b4495b2 xul!mozilla::RefreshDriverTimer::Tick(void)+0x107 [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\layout\base\nsrefreshdriver.cpp @ 158]
003ef6d8 5b4899de xul!nsTimerImpl::Fire(void)+0xc2 [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\xpcom\threads\nstimerimpl.cpp @ 546]
003ef754 5b46979e xul!nsThread::ProcessNextEvent(bool mayWait = <Memory access error>, bool * result = <Memory access error>)+0x2ce [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\xpcom\threads\nsthread.cpp @ 622]
003ef768 5b969e8b xul!NS_ProcessNextEvent(class nsIThread * thread = <Memory access error>, bool mayWait = <Memory access error>)+0x2e [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\xpcom\glue\nsthreadutils.cpp @ 238]
003ef794 5ba00d6e xul!mozilla::ipc::MessagePump::Run(class base::MessagePump::Delegate * aDelegate = <Memory access error>)+0x46 [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\ipc\glue\messagepump.cpp @ 81]
003ef7cc 5ba00e5b xul!MessageLoop::RunHandler(void)+0x51 [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\ipc\chromium\src\base\message_loop.cc @ 214]
003ef7ec 5b353aa3 xul!MessageLoop::Run(void)+0x19 [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\ipc\chromium\src\base\message_loop.cc @ 188]
003ef7f8 5b6023ad xul!nsBaseAppShell::Run(void)+0x2c [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\widget\xpwidgets\nsbaseappshell.cpp @ 163]
003ef80c 5b3c5b2b xul!nsAppShell::Run(void)+0x14 [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\widget\windows\nsappshell.cpp @ 112]
003ef8e8 5b368efe xul!XREMain::XRE_mainRun(void)+0x483 [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\toolkit\xre\nsapprunner.cpp @ 3869]
003ef908 5b5f85da xul!XREMain::XRE_main(int argc = 0n4, char ** argv = 0x00935018, struct nsXREAppData * aAppData = 0x003efa50)+0xe1 [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\toolkit\xre\nsapprunner.cpp @ 3937]
003efa20 012516b9 xul!XRE_main(int argc = 0n4, char ** argv = 0x00935018, struct nsXREAppData * aAppData = 0x003efa50, unsigned int aFlags = 0)+0x30 [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\toolkit\xre\nsapprunner.cpp @ 4139]
003efbb4 0125197e firefox!do_main(int argc = 0n4, char ** argv = 0x00935018, class nsIFile * xreDirectory = 0x00c18100)+0x283 [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\browser\app\nsbrowserapp.cpp @ 275]
003efc48 01251a89 firefox!NS_internal_main(int argc = 0n4, char ** argv = 0x00935018)+0x11d [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\browser\app\nsbrowserapp.cpp @ 635]
003efc7c 01252357 firefox!wmain(int argc = 0n0, wchar_t ** argv = 0x009328d0)+0xf0 [e:\builds\moz2_slave\rel-m-rel-w32_bld-000000000000\build\toolkit\xre\nswindowswmain.cpp @ 112]
003efcc0 7504336a firefox!__tmainCRTStartup(void)+0x122 [f:\dd\vctools\crt_bld\self_x86\crt\src\crtexe.c @ 552]
003efccc 77159f72 kernel32!BaseThreadInitThunk+0xe
003efd0c 77159f45 ntdll!__RtlUserThreadStart+0x70
003efd24 00000000 ntdll!_RtlUserThreadStart+0x1b
I can repro this bug 100% in the official build of Fx 27 with <http://i.imgur.com/95Vh1AW.gif>. It happens even in safe mode with hardware acceleration disabled. I'm on Win7 x64 and my CPU is a quad-core 3.2GHz AMD.

Here's a profile of me opening that GIF from disc into a new tab two times: <http://people.mozilla.org/~bgirard/cleopatra/#report=24692b8163c1ce33ce4076517937e4b92f8e9e71>

When I load it from disc the freeze happens immediately, but when I first viewed the file online it froze half-way through. Make of that what you will.
(In reply to Tom Edwards from comment #25)
> I can repro this bug 100% in the official build of Fx 27 with
> <http://i.imgur.com/95Vh1AW.gif>. It happens even in safe mode with hardware
> acceleration disabled. I'm on Win7 x64 and my CPU is a quad-core 3.2GHz AMD.
> 
> Here's a profile of me opening that GIF from disc into a new tab two times:
> <http://people.mozilla.org/~bgirard/cleopatra/
> #report=24692b8163c1ce33ce4076517937e4b92f8e9e71>
> 
> When I load it from disc the freeze happens immediately, but when I first
> viewed the file online it froze half-way through. Make of that what you will.

That looks like a case of Firefox hanging for a while and then recovering? People here seem to be experiencing hangs that don't recover.

In any case, that profile looks like the Windows per-tab preview code is taking a snapshot of the tab contents and we are doing a sync decode of the whole gif, which does not seem like a good idea in cases like this.
Yes, the program always recovers after a while. This bug has been covering both permanent and temporary hangs since the start, and at this stage there's no reason to think they are separate issues.

The preview per tab call popped out at me too, but I thought I was misreading the results as I don't have per-tab previews enabled! But I just took a profile with aero effects (required for window previews) disabled and WindowsPreviewPerTab is still there.
I can reproduce this occasionally on firefox 26/27, on both arch x86_64 and ubuntu 13.10 x86_64. 

I haven't caught this bug in action with a profiler, but I can offer the following data:

- Every time I hit this freeze it was with an img element loaded dynamically with JavaScript, e.g. the "image hover" functionality on 4chan. GIFs in static img elements have yet to freeze for me.
- I reported a bug with GIF playback in dynamically loaded img elements a year and a half ago: https://bugzilla.mozilla.org/show_bug.cgi?id=756367 . While the fix was merged in FF 16 and I don't remember hitting this bug until at least FF 24, the section of code concerning both that bug and this one is probably the same, so I'm a bit suspicious.
- FWIW, this problem has been reported by many 4chan users as well: http://rbt.asia/g/thread/S40122062#p40122108
Perhaps if there was a long machine-readable list of animated gifs somewhere, one could create a page that loads each gif into the page in order, and then just let it run until it hangs. That might make an easy repeatable testcase.
Flags: needinfo?(bullvar)
(In reply to Timothy Nikkel (:tn) from comment #29)
> Perhaps if there was a long machine readable list of animated gifs somewhere
> one could create a page that each gif into the page in order, and then just
> let it run until it hangs. That might make an easy repeatable testcase.

Something like this?  https://github.com/bwinton/whimsy/blob/gh-pages/thumbnail-gifs.txt
This is interesting: the GIF I mentioned in #25 always unfreezes on the same frame, no matter whether it froze immediately (loaded from disc) or froze after playing for a while (downloaded from web following Ctrl+F5).

Perhaps we're looking at a buffer somewhere hitting its limit? The file is quite large at 9.16MB...
(In reply to Tom Edwards from comment #31)
> This is interesting: the GIF I mentioned in #25 always unfreezes on the same
> frame, no matter whether it froze immediately (loaded from disc) or froze
> after playing for a while (downloaded from web following Ctrl+F5).
> 
> Perhaps we're looking at a buffer somewhere hitting its limit? The file is
> quite large at 9.16MB...

I've noticed that the freeze occurs when the image finishes loading. That's why it looks like freezing always on the same frame. You can confirm that Ctrl-Reloading with the Console open.
(In reply to David :3 from comment #32)
> (In reply to Tom Edwards from comment #31)
> > This is interesting: the GIF I mentioned in #25 always unfreezes on the same
> > frame, no matter whether it froze immediately (loaded from disc) or froze
> > after playing for a while (downloaded from web following Ctrl+F5).
> > 
> > Perhaps we're looking at a buffer somewhere hitting its limit? The file is
> > quite large at 9.16MB...
> 
> I've noticed that the freeze occurs when the image finishes loading. That's
> why it looks like freezing always on the same frame. You can confirm that
> Ctrl-Reloading with the Console open.

I mean the Network Tab, not the Console.
Hmm. When I hit Ctrl+F5 the network tab shows a download of 1398KB, whereas the image's true size is 9382KB. And when I view the response tab it shows only a snippet from the start of the GIF, presumably the first 1398KB of it.

When I do the same thing in IE11 its dev tools report the full 9.16MB, and the download also takes about 200ms less to complete (probably because Firefox's timing data includes its freeze?).
This is a crash report from FF29 while using mouseover-hover-preview on a GIF (using latest 4chanX v3.20.11).
https://crash-stats.mozilla.com/report/index/50874793-1dea-4c95-a0b7-5c56d2140429

It can also crash using the native 4chan extension.
Is this bug ever going to be addressed? Any site that displays a gif over a few MB will always crash or freeze Firefox. Is displaying a gif really that hard? Can't wait to see the next version release where you move all the buttons around but don't fix the most basic functionality.
I have a similar issue. I hadn't seen it before, but in the last month it started appearing for me. Was something changed in the latest versions?
What does your about:support say of your graphics, and are there any particular examples that reproduce this problem?
Hm... This is strange. I just tried to reproduce it and record a video, but now it's working fine. I first saw it when I scrolled to the gifs part here: http://dou.ua/lenta/digests/funny-meta-digest-2015/
In case it was related to some preloading, I tried it on a page with a lot of gifs, http://devopsreactions.tumblr.com/ , but it still worked fine.
This is on 43, or one of the pre-release channels?
43.0.3 and I don't remember any recent firefox/drivers updates
http://pastebin.com/QaWzw1Rg
http://imgur.com/xFIITE2

Also, I usually use a lot of tabs, sometimes 50-100. But only 20-30 of them are actually loaded and I rarely see any problems, and certainly no hangs like the ones I've seen with those gifs.
I have managed to nail down some instances of this bug and created a patch. I'm not sure if it is in a valid format; could someone check it? I have written a big comment in the source describing how this bug works. Is that acceptable? It might be useful for someone preparing a better solution, because my patch is a little hackish.
Attached patch bug926048patch2.diff (obsolete)
Resolved some problems with encoding in previous patch.
Attachment #8720762 - Attachment is obsolete: true
Comment on attachment 8720766
bug926048patch2.diff

I would put the awesome documentation of the problem and the solution in this bug, rather than as comments in the code, and let's also get Timothy's review on the code change itself.
Attachment #8720766 - Flags: review?(tnikkel)
Attached patch bug926048patch3.diff (obsolete)
Corrected the patch as recommended by Milan Sreckovic.

And documentation:

--------------------------------------------------------------------------
Bug 926048 - Animated .gifs cause Firefox to become unresponsive/freeze.
INVESTIGATION REPORT

How is this bug triggered?

Occurrences of this bug are rather random, but generally you need to:
1. Open a page with a big animated GIF placed somewhere well below the
   visible area (1000 pixels or more below the bottom of the view).
2. Switch to other tabs and navigate graphically intensive pages for
   about 5 minutes.
3. Return to the page with the animated GIF and scroll down. When the
   GIF is a few hundred pixels below the bottom of the view, Firefox
   freezes for a few seconds, sometimes even for 3 minutes.

Why does Firefox freeze?

After attaching the debugger from Visual Studio 2008 Express Edition and
setting up symbols and source, I was able to break execution of Firefox.
Every time, I landed in the method FrameAnimator::DrawFrameTo.

CALL STACK:
FrameAnimator::DrawFrameTo
FrameAnimator::DoBlend
FrameAnimator::AdvanceFrame
FrameAnimator::RequestRefresh
RasterImage::RequestRefresh
...

DrawFrameTo is relatively computationally intensive, but it was possible
to step out of this method instantly. I could also step out of the other
methods below it, but got a freeze when attempting to step out of
FrameAnimator::RequestRefresh.

In FrameAnimator::RequestRefresh we have a loop that tries to advance the
animation to the current time. This loop seems poorly designed, but
FrameAnimator::AdvanceFrame has shortcut code (see the comment
"If we can get closer to the current time by a multiple of the image's
loop time, we should") that skips all full loops with a single division.
Unexpectedly, in this specific case the shortcut wasn't taken, because
GetSingleLoopTime() returned -1. After coercion to uint32_t this becomes
0xffffffff, so
    delay.ToMilliseconds() > 0xffffffff
is effectively always false. GetSingleLoopTime() returns -1 because
mDoneDecoding is false. If mDoneDecoding is false, why are we looping
like crazy? Shouldn't the animation stop on the first frame that is not
yet decoded? Analyzing nextFrameIndex in FrameAnimator::AdvanceFrame
showed that it goes through the full animation and, after hitting the
line
    if (mImage->GetNumFrames() == nextFrameIndex) {
starts the animation from the beginning. For all animation frames
    nextFrame->IsImageComplete()
is true, but mDoneDecoding is still false, so Firefox renders all frames
of the animation until currentFrameEndTime in
FrameAnimator::RequestRefresh makes up the lost time. This means it has
to render thousands of frames in one go.
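
The sentinel problem described above can be sketched in isolation. This
is only an illustration, not the ImageLib code: SingleLoopTimeMs,
CanTakeShortcut and WholeLoopsToSkip are hypothetical names, and only the
-1 sentinel and the comparison against a uint32_t come from this report.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for GetSingleLoopTime(): -1 means "not known yet",
// which is what the report observed while mDoneDecoding was false.
int32_t SingleLoopTimeMs(bool doneDecoding, int32_t loopMs) {
  return doneDecoding ? loopMs : -1;
}

// Mirrors the comparison quoted above: the signed sentinel is converted to
// uint32_t before it is compared with the elapsed time.
bool CanTakeShortcut(double delayMs, int32_t loopTimeMs) {
  uint32_t loopLength = static_cast<uint32_t>(loopTimeMs);  // -1 -> 0xffffffff
  return delayMs > loopLength;
}

// Number of whole loops that one division can skip when the shortcut applies.
uint64_t WholeLoopsToSkip(double delayMs, int32_t loopTimeMs) {
  assert(delayMs >= 0.0);
  if (loopTimeMs == 0 || !CanTakeShortcut(delayMs, loopTimeMs)) {
    return 0;  // shortcut unavailable: the caller falls back to stepping frames
  }
  return static_cast<uint64_t>(delayMs) / static_cast<uint32_t>(loopTimeMs);
}
```

With a five-minute backlog (300000 ms) and a 2000 ms loop, the shortcut
skips 150 loops at once; with the -1 sentinel it skips none, so every
frame has to be stepped through individually.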

Solution

We have two potential (maybe temporary) solutions:
1. After wrapping animation to first frame force mDoneDecoding to true so
   FrameAnimator::AdvanceFrame could take shortcut.
2. If mDoneDecoding == false don't wrap animation, keep it on last frame.

I prefer solution 1. Second solution might stop animation forever.

TODO: Investigate why after getting all frames mDoneDecoding still could be
      false.
--------------------------------------------------------------------------
Attachment #8720766 - Attachment is obsolete: true
Attachment #8720766 - Flags: review?(tnikkel)
Comment on attachment 8720917
bug926048patch3.diff

Thanks for tracking this down and that very thorough explanation!

I don't think FrameAnimator can set mDoneDecoding itself accurately because it doesn't have enough information to know if decoding is complete or not. mImage->GetNumFrames() is the current number of frames that have been decoded, it increases as frames of the image are decoded. So we have to let RasterImage tell us when it is finished decoding.

But I think based on your description we should be able to come up with a fix here. I want to think a little on what the right solution would be. This situation where the last refresh time is far in the past and we haven't finished decoding is a little unusual, but it makes sense that it happens given the steps to reproduce.
Attachment #8720917 - Flags: review?(tnikkel)
Flags: needinfo?(tnikkel)
(In reply to Daniel Sęk from comment #46)
> We have two potential (maybe temporary) solutions:
> 1. After wrapping animation to first frame force mDoneDecoding to true so
>    FrameAnimator::AdvanceFrame could take shortcut.
> 2. If mDoneDecoding == false don't wrap animation, keep it on last frame.
> 
> I prefer solution 1. Second solution might stop animation forever.

I think 2. is what we want to do. Shouldn't be any danger of stopping the animation. The refresh driver (what asks the image to update frames) will keep asking the image to update frames.

After looking at the code there were so many changes I wanted to make that I just went ahead and wrote all the patches. But it would have been impossible without you figuring out the problem!
Flags: needinfo?(tnikkel)
Component: Graphics → ImageLib
Assignee: nobody → tnikkel
Attachment #8720917 - Attachment is obsolete: true
Now to the actual fix.

The hang is caused by (wrongly) looping over the decoded frames (before we have all the frames decoded) repeatedly until we catch up to the current time. Because we aren't done decoding yet, we can't use the code that fast-forwards using an integer multiple of an entire loop.

But looping over these frames is just wrong. We should be sticking with the last decoded frame in this situation.

This patch introduces a regression, fixed in the next patch.
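
For illustration, here is a toy model of the catch-up loop comparing the two behaviours. It is a sketch under simplifying assumptions (every frame has the same timeout, decoding makes no progress during the loop) and the name CatchUp is hypothetical; it is not the code from the patch.

```cpp
#include <cassert>
#include <cstdint>

struct CatchUpResult {
  uint64_t framesComposited;  // how many frame advances had to be rendered
  uint32_t frameIndex;        // frame shown when the loop exits
};

// decodedFrames: frames decoded so far (decoding is NOT finished).
// behindMs: how far the animation time lags behind the current time.
// wrapWhileDecoding: true models the old behaviour, false the fixed one.
CatchUpResult CatchUp(uint32_t decodedFrames, uint32_t frameTimeoutMs,
                      uint64_t behindMs, bool wrapWhileDecoding) {
  assert(decodedFrames > 0 && frameTimeoutMs > 0);
  CatchUpResult r{0, 0};
  uint64_t animTimeMs = 0;
  while (animTimeMs + frameTimeoutMs <= behindMs) {
    uint32_t next = r.frameIndex + 1;
    if (next == decodedFrames) {
      if (!wrapWhileDecoding) {
        break;  // hold the last decoded frame until more frames arrive
      }
      next = 0;  // old behaviour: restart from the first frame
    }
    r.frameIndex = next;
    animTimeMs += frameTimeoutMs;
    ++r.framesComposited;
  }
  return r;
}
```

In this model, 100 decoded frames at 40 ms each and a five-minute backlog cost 7500 frame advances when wrapping, but only 99 when holding on the last decoded frame.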
Attachment #8723772 - Flags: review?(edwin)
Fix the regression from the last patch.

(Wrongly) looping through the decoded frames had the beneficial side effect of bringing our current animation time up to the current time. Which we want so we don't jump to a random spot in the animation when decoding is done.
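
A rough sketch of why the time needs updating while we hold on the last decoded frame. HeldAnimation and its members are hypothetical names for illustration only; currentFrameTimeMs merely stands in for the current animation time mentioned above.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of an animation that is holding on its last decoded frame.
struct HeldAnimation {
  uint64_t currentFrameTimeMs;  // stand-in for the current animation time

  // Refresh tick that arrives while no further decoded frame is available.
  void HoldAt(uint64_t nowMs, bool updateTime) {
    assert(nowMs >= currentFrameTimeMs);
    if (updateTime) {
      currentFrameTimeMs = nowMs;  // keep the animation time caught up
    }
  }

  // How far behind the animation believes it is on a later tick.
  uint64_t BacklogMs(uint64_t nowMs) const {
    assert(nowMs >= currentFrameTimeMs);
    return nowMs - currentFrameTimeMs;
  }
};

// Backlog seen on the first tick after decoding finishes, 16 ms after the
// last tick on which the animation was held.
uint64_t BacklogAfterHold(bool updateTime) {
  HeldAnimation anim{0};
  anim.HoldAt(300000, updateTime);
  return anim.BacklogMs(300016);
}
```

Without the update the model still sees the full 300016 ms backlog once decoding is done, which is what would make the animation jump to an arbitrary spot; with the update the backlog is a single 16 ms tick.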
Attachment #8723774 - Flags: review?(edwin)
I also filed bug 1251403 and bug 1251405 for cleanups/fixes that I noticed while fixing this bug.
Comment on attachment 8723774
Part 4. Update current animation time when we hit the end of decoded frames (but aren't done decoding).

Review of attachment 8723774:
-----------------------------------------------------------------

Am I right in thinking that if we're setting mCurrentAnimationFrameTime here, then once the next frame is available we'll still wait the duration of the current frame to display it? It would be nice to show the next frame immediately after it's available if the current frame's timeout has already gone by.

Still, probably a minor delay compared to blocking on I/O or decoding so r+.
Attachment #8723774 - Flags: review?(edwin) → review+
Hmm, I think you are right.

So far the only way I can see to avoid that problem involves adding a lot more complexity. And we've already got a lot of subtlety and complexity in this function.

I'll think on it a little more to see if I can come up with anything better, but the problem should be pretty minor. If network/decoding is happening faster than we need to display the frames (how did we reach the end of decoded frames in the first place then?) we will get one little hiccup and then the rest should go fine. If network/decoding is too slow to display the images at their correct times then no big deal waiting a little more on this frame, we're going to have to wait more on later frames anyway. If network/decoding is just barely keeping up then a little extra pause on this one frame should give it a head start on the rest of the frames and they will run with no extra waiting.
I haven't thought of anything better, and this is essentially what we were doing before (looping through the decoded frames also updated our current animation time) so it's not a regression. I'm going to land as is.
Yesterday I found some time to pull the changes and recompile the browser. After a rather large amount of testing, and even leaving Firefox (with about 50 tabs open) hibernated overnight, I can confirm that these changes eliminated the freezes described in comment 46.
Well done.
Thank you! It would have been impossible to fix this without your work!
That patch gives a very nice performance boost on horizontally floating image galleries.
Flags: qe-verify+
No longer seeing the browser freeze with .gifs on Firefox 47 beta 4, build ID: 20160509171155.
Tested on Windows 7 x64 and Windows 10 x64 with various .gifs, including the STR from comment 8.
Status: RESOLVED → VERIFIED
Flags: qe-verify+