Closed Bug 376446 Opened 15 years ago Closed 15 years ago

spurious white pixels in animated gifs

Categories

(Core Graveyard :: GFX, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla1.9alpha5

People

(Reporter: sayrer, Unassigned)

References

()

Details

(Keywords: regression, testcase)

Attachments

(3 files)

Attached image ytmnd in action
No description provided.
Flags: blocking1.9?
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9a4pre) Gecko/20070404 Minefield/3.0a4pre ID:2007040405 [cairo]
I'm seeing this too, but not in IE6 or Opera 9.1
Keywords: regression
OS: Mac OS X → All
Hardware: PC → All
WFM
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a4pre) Gecko/20070414 Minefield/3.0a4pre
(In reply to comment #2)
> WFM
> Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a4pre) Gecko/20070414
> Minefield/3.0a4pre
> 

If it works initially, do a shift-reload a few times until it fails.
I Shift+Reloaded 10+ times, and emptied my cache a few times, and still can't reproduce it. I can't reproduce it in 2007-04-04-05 or 2007-04-03-04 builds either. Maybe there's something about my setup that's causing me to not see it.
still see it here
(In reply to comment #5)
> still see it here
> 
But it appears you accidentally removed the URL which would make it impossible for anyone else to see it.
Attached image testcase
This seems to show the issue more reliably.  I am not sure of the source of this image, so it might be copyrighted.
(In reply to comment #7)
> Created an attachment (id=261599) [details]
> testcase
> 
> This seems to show the issue more reliably.  I am not sure of the source of
> this image, so it might be copyrighted.
> 
Hmm attaching it to this bug does not appear to show the issue.  This was originally posted to the Mozillazine forums using the following URL http://i10.tinypic.com/2hx3bpg.gif which does seem to show the issue.  I wonder if loading the image via https somehow avoids the issue?
Here's a more stable location:

http://people.mozilla.com/~sayrer/2007/04/15/2hx3bpg.gif
Works (no random whiteness)
- Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9a3pre) Gecko/2007032011 Minefield/3.0a3pre

Doesn't work (random whiteness)
- Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9a3pre) Gecko/2007032104 Minefield/3.0a3pre

Bonsai regression range:
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=PhoenixTinderbox&branch=HEAD&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2007-03-20+10&maxdate=2007-03-21+05&cvsroot=%2Fcvsroot
Keywords: testcase
OK, I can reproduce the problem with currentr trunk nightlies and http://people.mozilla.com/~sayrer/2007/04/15/2hx3bpg.gif.
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9a4pre) Gecko/20070419 Minefield/3.0a4pre ID:2007041915 [cairo]
Issue still present even after the follow up to bug 257197
Can someone find more (different) ways to reproduce this bug? Preferrably with some really simple image, like two frames with a long delay in between. But any number of different images that hopefully have something specific in common.

Is it possible there is something wrong with this particular image?
It's not just that particular image, I've seen it with a whole bunch of different images. Another example is http://dopefishjustin.googlepages.com/danbooru-p-115346267114392.gif
Every sample image has transparent pixels in some frames.  This problem might
be due to failure of composition of transparent pixels in a frame onto the
previous (opaque) ones.
(In reply to comment #16)
> Every sample image has transparent pixels in some frames.  This problem might
> be due to failure of composition of transparent pixels in a frame onto the
> previous (opaque) ones.
> 
Well, that could be, but if the issue were that there is just a simple bug in the decoding logic, that would not really explain why the exact same GIF file works when loaded from an https server yet fails when loaded from an http server.  The issue seems to be somehow timing related, and appears to be aggravated by the image being displayed multiple times on the screen simultaneously.  The original URL had the same image tiled over the entire content window.

The single image from the URL in comment #9 displays the image twice, once as the image itself in the content area and the second time as the favicon in the urlbar.  If you suppress the display of the favicon by setting browser.chrome.image_icons.max_size this becomes much more difficult to reproduce.
Note, it seems that bug 317748 solves this issue.
All the images mentioned overhere are working fine in my build with the patch for 317748. 
	
My guess would be that the 'BuildCompositeMask' function doesn't work well in Cairo...
Depends on: 317748
(In reply to comment #18)
> Note, it seems that bug 317748 solves this issue.
> All the images mentioned overhere are working fine in my build with the patch
> for 317748. 
> 
> My guess would be that the 'BuildCompositeMask' function doesn't work well in
> Cairo...
> 
Odd.  I just did a build including that last patch attached to bug 317748
( https://bugzilla.mozilla.org/attachment.cgi?id=263343 ) and it does not have any effect for me on this issue.  I still see exactly the same failures.
Weird, because his v9 patch seems to have helped for me in some situations. There's a few GIFs which used to display the problem which now look fine. However, some other ones don't (such as the one posted earlier in this thread).
http://people.mozilla.com/~sayrer/2007/04/15/2hx3bpg.gif
The ones posted  in this thread seem to be somehow timing related.  Like it starts decoding the image before the entire file is loaded for some reason or something like that.  Doing a hard reload while the image is still loading almost always causes an issue with these images.
Trying to solve 'spurious events' can be tricky.

However, the following piece of code also seems fishy:
 441                   // If we're done decoding the next frame, go ahead and display it now and
 442                   // reinit the timer with the next frame's delay time.
 443 pavlov     1.43   // currentDecodingFrameIndex is not set until the second frame has
 444                   // finished decoding (see EndFrameDecode)
 445                   if (mAnim->doneDecoding || 
 446                       (numFrames == 2 && nextFrameIndex < 2) ||
 447                       (numFrames > 2 && nextFrameIndex < mAnim->currentDecodingFrameIndex)) {

As far as I can decode this logic, the test (numFrames == 2 && nextFrameIndex < 2) says:
If there are only two frames (yet) in the mFrames array, and nextFrameIndex is 1 (0 based), it safe to jump to the next frame (the second one).
However, this is not safe as the second one may not be completely decoded yet...
(Remember AppendFrame is called at the beginning of decoding an frame,
it would have been easier to specify that a decoder may only AppendFrame a frame, when it is really complete (except for the first one, to allow for incremental image display)).

Note, the old code has this as the test:
384                   if (mDoneDecoding || (nextFrameIndex < mCurrentDecodingFrameIndex)) {

Why there is an extra test case for "numframes == 2" is unclear...


So, two ways to fix:
1. remove the extra test case for numframes==2
2. make the nsGIFDecoder and nsPNGDecoder more safe by only adding complete frames (after the first one).
(note, in option 2, the animation timer then only starts after having completely decoded the second frame)


Can you all (Bill, Ryan, others) do a compile with this patch and see if this fixes this spurious effect?
Attachment #263362 - Flags: review?(pavlov)
(In reply to comment #22)
> Created an attachment (id=263362) [details]
> Patch to remove the numframes == 2 test

That patch does appear to clear up the issue for me.  However it does drastically increase the amount of time between page load and when the image starts animating at anything near a reasonable speed.
Confirming Bill's comments. The GIF in comment #15 is a good example of the initial slowness of the animation.
I am not seeing any initial slowness. 
The only thing what could happen is that the 
" if (nextFrameIndex == mAnim->currentDecodingFrameIndex)"
branch is hit more often (the timer wants to jump to the currently decoded image), and then the timer is reset to wait another 100 milliseconds (so 0.1 second) before trying to display that frame... 
For those experiencing 'drastically increase', can you test this with an 'old' browser? This patch is only reverting a little bit of the APNG patch back to the 'old' behaviour for this waiting for completely decoded frames...
(In reply to comment #26)
> For those experiencing 'drastically increase', can you test this with an 'old'
> browser? This patch is only reverting a little bit of the APNG patch back to
> the 'old' behaviour for this waiting for completely decoded frames...
> 

I tested with Firefox 2.0.0.3, and you are correct.  The speed is similar.  That said, I might still play around to see if I can find a more optimum value for the delay than 100, but this does appear to be the correct fix.
Playing with those timeouts did not really make as significant amount of difference as I expected.  I tried values between 10 and 1000 and the change was not particularly dramatic.  I changed all 3 of the timeouts in that area of code.
I propose to move the timing aspects to another bug, and first fix the 'spurious white pixels'.

I have seen the delay in animation before, generally when the decoding as trouble keeping up (network delay, or heavy memory allocation for large images).
This seems to need some timer debugging...
The patch fixes the regression, which is what this bug is about.  Testing with Firefox 2 shows that the timing issue was pre-existing, so that certainly makes sense.
Flags: blocking1.9? → blocking1.9+
Target Milestone: --- → mozilla1.9alpha5
I added that check because (like the comment sais) currentDecodingFrameIndex is never 1. With the patch applied the second frame won't show while it's being decoded. Which is a nice fix considering the major problems it was causing.

I suspect the variable can be constructed with the value 1 instead of 0 since it is only allocated after the second frame starts decoding but I haven't investigated thoroughly. If desired at all, that ought to be a separate bug after Alfred's patch lands.
Attachment #263362 - Flags: review+
Comment on attachment 263362 [details] [diff] [review]
Patch to remove the numframes == 2 test

Pavlov, can you do the SR of this one?
Attachment #263362 - Flags: review?(pavlov) → superreview?(pavlov)
Is this going to make A5?  If not, let's move please.
FWIW, I've been including this patch in my local tree since Alfred posted it and it's been working fine for me.
Assignee: general → nobody
Component: GFX → Image: GFX
QA Contact: ian → image.gfx
Attachment #263362 - Flags: superreview?(pavlov) → superreview+
Who can do the checkin now all reviews have been done?
Whiteboard: [checkin needed]
I just got stuart to review this patch and I just landed it to make sure it
makes it in for a5 -- sorry it took so long to get to it.  Thanks for the patch!
Status: NEW → RESOLVED
Closed: 15 years ago
Component: Image: GFX → GFX
Resolution: --- → FIXED
Whiteboard: [checkin needed]
Flags: in-testsuite?
All of the images I was seeing this on are fixed now, except for http://i.somethingawful.com/forumsystem/emoticons/emot-kamina.gif - is the spurious white block that appears there a different bug?
The only white "block" I see on that image appears to be a little star or asterisk in the hair that certainly appears to be part of the regular image. At least, I don't see any difference between how IE7 and the trunk render it.
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9a5pre) Gecko/20070531 Minefield/3.0a5pre

I'm seeing the white block in that image, too, Justin. You should probably file a separate bug for that issue.
FWIW, my local tree also has the patch from bug 317748 applied to it. It's possible that it's having an effect as well.
OK, filed bug 382764.
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.