Last Comment Bug 700930 - BMPs <= 8BPP can have bad color tables when color table is processed in parts
: BMPs <= 8BPP can have bad color tables when color table is processed in parts
Status: RESOLVED FIXED
[qa!]
: verified-beta
Product: Core
Classification: Components
Component: ImageLib (show other bugs)
: unspecified
: All All
: -- normal (vote)
: mozilla9
Assigned To: Brian R. Bondy [:bbondy]
:
: Milan Sreckovic [:milan]
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-11-08 21:16 PST by Justin Dolske [:Dolske]
Modified: 2011-12-28 16:19 PST (History)
10 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
+
verified
verified


Attachments
Green dino (21.91 KB, image/png)
2011-11-08 21:16 PST, Justin Dolske [:Dolske]
no flags Details
about:cache entries (17.21 KB, text/plain)
2011-11-09 17:15 PST, Justin Dolske [:Dolske]
no flags Details
Fixed BMP code (1.34 KB, patch)
2011-11-19 11:40 PST, Brian R. Bondy [:bbondy]
joe: review+
Details | Diff | Splinter Review
Fixed BMP code, v2. Updated with review nit. (1.49 KB, patch)
2011-11-24 11:30 PST, Brian R. Bondy [:bbondy]
netzen: review+
christian: approval‑mozilla‑aurora+
christian: approval‑mozilla‑beta+
Details | Diff | Splinter Review

Description Justin Dolske [:Dolske] 2011-11-08 21:16:24 PST
Created attachment 573076 [details]
Green dino

I've noticed this happen a few times over the last couple weeks. The red dino favicon sometimes loads with a funny appearance, part of the icon has pixels set to bright green. Screenshot attached.

Works fine in Chrome. I suppose it's also possible that this could be a corrupted image on 1 of the load balancers, though. Can't say I've noticed anything unusual with other favicons recently.

Icon in the screenshot is from http://blog.mozilla.com/favicon.ico, but I think I've seen this for the dino head on Zimbra (which I have as an apptab).
Comment 1 Justin Dolske [:Dolske] 2011-11-08 21:18:10 PST
I loaded the favicon.ico URL in a tab, did a shift-reload, and it seems fixed for the moment? I'll try and save it next time to see if it's a server-side issue.
Comment 2 Brian R. Bondy [:bbondy] 2011-11-08 22:34:55 PST
Let me know if you see this again.
Comment 3 Brian R. Bondy [:bbondy] 2011-11-09 10:23:23 PST
reed mentioned he seen this as well,
Comment 4 Brian R. Bondy [:bbondy] 2011-11-09 10:36:26 PST
Whoever sees this next please save the ICO file and attach to this ticket.
Comment 5 Justin Dolske [:Dolske] 2011-11-09 17:08:44 PST
Just hit this again, running from a newish profile (same profile I was using yesterday for testing, but couldn't get it to happen again).

Saved the icon, but loading the saved version (in another tab in that same session) looks fine. I also took a look at the about:cache entry for it, diffed it against a similar entry for the same icon elsewhere, and the hex dumps are the same.

I think it's less-likely that this is a server-side issue now.
Comment 6 Justin Dolske [:Dolske] 2011-11-09 17:15:21 PST
Created attachment 573383 [details]
about:cache entries

Note that in comment 0, it was http://blog.mozilla.com/favicon.ico that was green. This time it's http://www.mozilla.org/org/favicon.ico.

They look corrupted the same way, so I won't attach another screenshot. Interesting that to my eye it looks like it's the 100% red pixels that are shifted to 100% green. Color management, and/or some off-by-1 error?

I haven't connected to an external monitor recently, fwiw.
Comment 7 Brian R. Bondy [:bbondy] 2011-11-09 17:50:17 PST
thanks, that's what I needed to know.
Comment 8 Brian R. Bondy [:bbondy] 2011-11-12 19:03:27 PST
16x16
BMP 8BPP
So likely something in the color table or padding byte.
Comment 9 Brian R. Bondy [:bbondy] 2011-11-12 19:25:37 PST
*** Bug 687982 has been marked as a duplicate of this bug. ***
Comment 10 Brian R. Bondy [:bbondy] 2011-11-12 19:27:16 PST
This seems to be the same as an uninitialized value problem Joe Drew found earlier. 

Here is his valgrind output:

==54156== Conditional jump or move depends on uninitialised value(s)
==54156==    at 0x10748F204: mozilla::imagelib::nsBMPDecoder::WriteInternal(char const*, unsigned int) (nsBMPDecoder.cpp:382)
==54156==    by 0x107455880: mozilla::imagelib::Decoder::Write(char const*, unsigned int) (Decoder.cpp:123)
==54156==    by 0x107491D6A: mozilla::imagelib::nsICODecoder::WriteInternal(char const*, unsigned int) (nsICODecoder.cpp:400)
==54156==    by 0x107455880: mozilla::imagelib::Decoder::Write(char const*, unsigned int) (Decoder.cpp:123)
==54156==    by 0x107458A8D: mozilla::imagelib::RasterImage::WriteToDecoder(char const*, unsigned int) (RasterImage.cpp:2293)
==54156==    by 0x107458C8A: mozilla::imagelib::RasterImage::DecodeSomeData(unsigned int) (RasterImage.cpp:2625)
==54156==    by 0x107458EAD: mozilla::imagelib::imgDecodeWorker::Run() (RasterImage.cpp:2744)
==54156==    by 0x107459789: mozilla::imagelib::RasterImage::AddSourceData(char const*, unsigned int) (RasterImage.cpp:1281)
==54156==    by 0x1074598D4: mozilla::imagelib::RasterImage::WriteToRasterImage(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*) (RasterImage.cpp:2831)
==54156==    by 0x1088ECB6A: nsPipeInputStream::ReadSegments(unsigned int (*)(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*), void*, unsigned int, unsigned int*) (nsPipe3.cpp:799)
==54156==    by 0x10747E3CC: imgRequest::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned int, unsigned int) (imgRequest.cpp:1158)
==54156==    by 0x10746BB8B: ProxyListener::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned int, unsigned int) (imgLoader.cpp:2100)
==54156== 
==54156== Conditional jump or move depends on uninitialised value(s)
==54156==    at 0x10748F4D6: mozilla::imagelib::nsBMPDecoder::WriteInternal(char const*, unsigned int) (nsBMPDecoder.cpp:419)
==54156==    by 0x107455880: mozilla::imagelib::Decoder::Write(char const*, unsigned int) (Decoder.cpp:123)
==54156==    by 0x107491D6A: mozilla::imagelib::nsICODecoder::WriteInternal(char const*, unsigned int) (nsICODecoder.cpp:400)
==54156==    by 0x107455880: mozilla::imagelib::Decoder::Write(char const*, unsigned int) (Decoder.cpp:123)
==54156==    by 0x107458A8D: mozilla::imagelib::RasterImage::WriteToDecoder(char const*, unsigned int) (RasterImage.cpp:2293)
==54156==    by 0x107458C8A: mozilla::imagelib::RasterImage::DecodeSomeData(unsigned int) (RasterImage.cpp:2625)
==54156==    by 0x107458EAD: mozilla::imagelib::imgDecodeWorker::Run() (RasterImage.cpp:2744)
==54156==    by 0x107459789: mozilla::imagelib::RasterImage::AddSourceData(char const*, unsigned int) (RasterImage.cpp:1281)
==54156==    by 0x1074598D4: mozilla::imagelib::RasterImage::WriteToRasterImage(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*) (RasterImage.cpp:2831)
==54156==    by 0x1088ECB6A: nsPipeInputStream::ReadSegments(unsigned int (*)(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*), void*, unsigned int, unsigned int*) (nsPipe3.cpp:799)
==54156==    by 0x10747E3CC: imgRequest::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned int, unsigned int) (imgRequest.cpp:1158)
==54156==    by 0x10746BB8B: ProxyListener::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned int, unsigned int) (imgLoader.cpp:2100)
==54156==
Comment 11 Brian R. Bondy [:bbondy] 2011-11-12 19:27:46 PST
I don't know why yet but I'm almost positive that the problem is with bpc getting a value of 3 instead of 4.

bpc = (mBFH.bihsize == OS2_BIH_LENGTH) ? 3 : 4; // OS/2 Bitmaps have no padding byte
   
Will fix soon and suggest aurora and beta pushes.
Comment 12 Brian R. Bondy [:bbondy] 2011-11-19 10:59:49 PST
I found a consistent way to reproduce this, basically by forcing small amounts to be fed into WriteInternal at a time:

> void
> nsICODecoder::WriteInternal(const char* aBuffer, PRUint32 aCount) 
> {
>   const int BLOCK_SIZE = 55;
>   int numBlocks = aCount / BLOCK_SIZE;
>   int partialBlock = aCount % BLOCK_SIZE;
>   for (int i = 0; i < numBlocks; ++i) {
>     WriteInternal2(aBuffer + (BLOCK_SIZE * i), BLOCK_SIZE);
>   }
>   WriteInternal2(aBuffer + (BLOCK_SIZE * numBlocks), partialBlock);
> }
> void
> nsICODecoder::WriteInternal2(const char* aBuffer, PRUint32 aCount)
> {
>... normal WriteInternal code
Comment 13 Brian R. Bondy [:bbondy] 2011-11-19 11:40:53 PST
Created attachment 575687 [details] [diff] [review]
Fixed BMP code

Fixed. 
This is a regression that is in FF9 and above, so it will need to go into Beta and Aurora.

The problem has always been around for BMPs, but not for the duplicated BMP code that was in ICO code.
When we did the refactoring of PNG encoded ICOs to remove duplicated BMP code in ICO, we started to use the BMP code for ICOs.
This is why we see the bug today, it is using the existing BMP code which always had a bug in it.

Here is a description of the bug:
The BMP encoder, when creating the color table, loops through the buffer setting blue, green, red, and then padding, one byte at a time.
It increments to the next colorIndex after hitting the red byte.
When there is a padding byte, that is the 3rd of 4 bytes in the current color index.

Presumably the person that coded this originally did this because you have NO padding byte in the color table for OS2 BMPs
and you do have a padding byte in the color table for Windows BMPs.  That way they could increment colorNum once.

The problem with this is that if a single WriteInternal happens to end at the red, let's say for the 26th color, 
then it will increment the locally declared colorNum to be 26, but have no more data to process so exits.

The next entry of WriteInternal calculates us to be at colorNum = 25 since we aren't finished processing the color yet.
> PRUint8 colorNum = colorBytes / bytesPerColor;
> PRUint8 at = colorBytes % bytesPerColor;

After it processes that padding byte, it doesn't incrmeent to 25, it stays at 25 because the incrementing only happens on byte 2.
So we re-process a colortable entry of 25 again.
  
The fix is to increment colorNum when it reaches 3, when a padding byte is present, not when it is at 2.
The bug would happen anytime you have a BMP (or ICO that contains a BMP) with a color table (Bits per pixel of 1,4,8) and a WriteInternal that is split half way through the colorTable code, and it happened to end when processing the red byte.
Comment 14 Brian R. Bondy [:bbondy] 2011-11-19 11:48:35 PST
Repalce "The BMP encoder" with "The BMP decoder" above
Comment 15 Brian R. Bondy [:bbondy] 2011-11-19 11:57:44 PST
Comment on attachment 575687 [details] [diff] [review]
Fixed BMP code

Risk of introducing further regressions: None

Description of why I want it in Aurora and Beta:
This has always been a bug in our BMP code, but as of FF9 we started to use that BMP code from within ICO code.  So it is a regression in ICO because of that.  For further details please see Comment 13.  This can lead to some icons not displaying correctly, in particular, our beloved Mozilla dino icon.

If this gets Aurora+ and Beta+ I will not push to Aurora and Beta until Joe reviews it.
Comment 16 christian 2011-11-22 10:18:34 PST
[triage comment]

I think we want to take this, but we'll hold off on the approval for the r+.
Comment 17 Joe Drew (not getting mail) 2011-11-24 09:52:09 PST
Comment on attachment 575687 [details] [diff] [review]
Fixed BMP code

Review of attachment 575687 [details] [diff] [review]:
-----------------------------------------------------------------

A reftest here would not go awry. You'll have to write an HTTP reftest, similar to the one in layout/reftests/backgrounds/delay-image-response.sjs, but which sends only part of the image, delays a certain number of seconds, and then sends the rest.

::: image/decoders/nsBMPDecoder.cpp
@@ +405,3 @@
>                      break;
>                  case 3:
> +                    // This is a padding byte, only present in Windows BMPs

Here and in case 2, you don't give details on why you're incrementing the color index. You should - "We're done processing this colour. On to the next"
Comment 18 Brian R. Bondy [:bbondy] 2011-11-24 11:30:50 PST
Created attachment 576807 [details] [diff] [review]
Fixed BMP code, v2.  Updated with review nit.

Thanks Joe, I updated the comment. 

The tests for this will be covered in the context of bug 703913 which I filed to find similar existing problems in a more general way. I updated that bug with your suggestions about the sleep and send image in parts via HTTP, thanks for the suggestion.

This patch was pushed to try here:
https://tbpl.mozilla.org/?tree=Try&rev=927f03f10233
Comment 19 Brian R. Bondy [:bbondy] 2011-11-27 12:21:58 PST
Try results had a intemittent build failure, I repushed to try and it passed. 

Pushed to inbound:
http://hg.mozilla.org/integration/mozilla-inbound/rev/0f74e9a5ea15

Christian Legnitto in Comment 16 mentioned they probably want this for Beta (I definitely think it should be), so marking as Target milestone 9.
Comment 20 Brian R. Bondy [:bbondy] 2011-11-27 12:23:32 PST
Comment on attachment 576807 [details] [diff] [review]
Fixed BMP code, v2.  Updated with review nit.

See Comment 15 for the reasoning of why it should be in Beta and Aurora.
See Comment 16 for LegNeato's triage comment.
Comment 21 Marco Bonardo [::mak] 2011-11-28 05:29:24 PST
https://hg.mozilla.org/mozilla-central/rev/0f74e9a5ea15

I was also hit by this bug, as well as dietrich reported it on irc some days ago, so it should be quite visible among users.
Comment 22 Brian R. Bondy [:bbondy] 2011-11-29 19:57:13 PST
I'll be pushing these to Aurora and Beta tomorrow morning. Just don't want to do it tonight in case there are any problems and I'm not around.
Comment 24 Vlad [QA] 2011-12-16 05:45:31 PST
I have verified this using the steps from the description on:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:9.0) Gecko/20100101 Firefox/9.0 beta 6
Mozilla/5.0 (X11; Linux i686; rv:9.0) Gecko/20100101 Firefox/9.0 beta 6
Mozilla/5.0 (Windows NT 6.1; rv:9.0) Gecko/20100101 Firefox/9.0 beta 6

After several page reloads, the issue is no longer reproducing. The red dino favicon always appears red.

Setting resolution to Verified Fixed on Beta.
Comment 25 Vlad [QA] 2011-12-28 04:41:50 PST
Verified on:
Mozilla/5.0 (Windows NT 6.1; rv:10.0) Gecko/20100101 Firefox/10.0 beta 1
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0) Gecko/20100101 Firefox/10.0 beta 1
Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0 beta 1
Comment 26 Vlad [QA] 2011-12-28 04:42:49 PST
Verified on:
Mozilla/5.0 (Windows NT 6.1; rv:10.0) Gecko/20100101 Firefox/10.0 beta 1
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0) Gecko/20100101 Firefox/10.0 beta 1
Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0 beta 1

Note You need to log in before you can comment on or make changes to this bug.