Closed Bug 700930 Opened 13 years ago Closed 13 years ago

BMPs <= 8BPP can have bad color tables when color table is processed in parts

Categories

(Core :: Graphics: ImageLib, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla9
Tracking Status
firefox9 + verified
firefox10 --- verified

People

(Reporter: Dolske, Assigned: bbondy)

Details

(Keywords: verified-beta, Whiteboard: [qa!])

Attachments

(3 files, 1 obsolete file)

Attached image Green dino
I've noticed this happen a few times over the last couple weeks. The red dino favicon sometimes loads with a funny appearance, part of the icon has pixels set to bright green. Screenshot attached.

Works fine in Chrome. I suppose it's also possible that this could be a corrupted image on 1 of the load balancers, though. Can't say I've noticed anything unusual with other favicons recently.

Icon in the screenshot is from http://blog.mozilla.com/favicon.ico, but I think I've seen this for the dino head on Zimbra (which I have as an apptab).
I loaded the favicon.ico URL in a tab, did a shift-reload, and it seems fixed for the moment? I'll try and save it next time to see if it's a server-side issue.
Let me know if you see this again.
reed mentioned he seen this as well,
Whoever sees this next please save the ICO file and attach to this ticket.
OS: Mac OS X → All
Hardware: x86 → All
Just hit this again, running from a newish profile (same profile I was using yesterday for testing, but couldn't get it to happen again).

Saved the icon, but loading the saved version (in another tab in that same session) looks fine. I also took a look at the about:cache entry for it, diffed it against a similar entry for the same icon elsewhere, and the hex dumps are the same.

I think it's less-likely that this is a server-side issue now.
OS: All → Mac OS X
Hardware: All → x86
Attached file about:cache entries
Note that in comment 0, it was http://blog.mozilla.com/favicon.ico that was green. This time it's http://www.mozilla.org/org/favicon.ico.

They look corrupted the same way, so I won't attach another screenshot. Interesting that to my eye it looks like it's the 100% red pixels that are shifted to 100% green. Color management, and/or some off-by-1 error?

I haven't connected to an external monitor recently, fwiw.
OS: Mac OS X → All
Hardware: x86 → All
thanks, that's what I needed to know.
16x16
BMP 8BPP
So likely something in the color table or padding byte.
This seems to be the same as an uninitialized value problem Joe Drew found earlier. 

Here is his valgrind output:

==54156== Conditional jump or move depends on uninitialised value(s)
==54156==    at 0x10748F204: mozilla::imagelib::nsBMPDecoder::WriteInternal(char const*, unsigned int) (nsBMPDecoder.cpp:382)
==54156==    by 0x107455880: mozilla::imagelib::Decoder::Write(char const*, unsigned int) (Decoder.cpp:123)
==54156==    by 0x107491D6A: mozilla::imagelib::nsICODecoder::WriteInternal(char const*, unsigned int) (nsICODecoder.cpp:400)
==54156==    by 0x107455880: mozilla::imagelib::Decoder::Write(char const*, unsigned int) (Decoder.cpp:123)
==54156==    by 0x107458A8D: mozilla::imagelib::RasterImage::WriteToDecoder(char const*, unsigned int) (RasterImage.cpp:2293)
==54156==    by 0x107458C8A: mozilla::imagelib::RasterImage::DecodeSomeData(unsigned int) (RasterImage.cpp:2625)
==54156==    by 0x107458EAD: mozilla::imagelib::imgDecodeWorker::Run() (RasterImage.cpp:2744)
==54156==    by 0x107459789: mozilla::imagelib::RasterImage::AddSourceData(char const*, unsigned int) (RasterImage.cpp:1281)
==54156==    by 0x1074598D4: mozilla::imagelib::RasterImage::WriteToRasterImage(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*) (RasterImage.cpp:2831)
==54156==    by 0x1088ECB6A: nsPipeInputStream::ReadSegments(unsigned int (*)(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*), void*, unsigned int, unsigned int*) (nsPipe3.cpp:799)
==54156==    by 0x10747E3CC: imgRequest::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned int, unsigned int) (imgRequest.cpp:1158)
==54156==    by 0x10746BB8B: ProxyListener::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned int, unsigned int) (imgLoader.cpp:2100)
==54156== 
==54156== Conditional jump or move depends on uninitialised value(s)
==54156==    at 0x10748F4D6: mozilla::imagelib::nsBMPDecoder::WriteInternal(char const*, unsigned int) (nsBMPDecoder.cpp:419)
==54156==    by 0x107455880: mozilla::imagelib::Decoder::Write(char const*, unsigned int) (Decoder.cpp:123)
==54156==    by 0x107491D6A: mozilla::imagelib::nsICODecoder::WriteInternal(char const*, unsigned int) (nsICODecoder.cpp:400)
==54156==    by 0x107455880: mozilla::imagelib::Decoder::Write(char const*, unsigned int) (Decoder.cpp:123)
==54156==    by 0x107458A8D: mozilla::imagelib::RasterImage::WriteToDecoder(char const*, unsigned int) (RasterImage.cpp:2293)
==54156==    by 0x107458C8A: mozilla::imagelib::RasterImage::DecodeSomeData(unsigned int) (RasterImage.cpp:2625)
==54156==    by 0x107458EAD: mozilla::imagelib::imgDecodeWorker::Run() (RasterImage.cpp:2744)
==54156==    by 0x107459789: mozilla::imagelib::RasterImage::AddSourceData(char const*, unsigned int) (RasterImage.cpp:1281)
==54156==    by 0x1074598D4: mozilla::imagelib::RasterImage::WriteToRasterImage(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*) (RasterImage.cpp:2831)
==54156==    by 0x1088ECB6A: nsPipeInputStream::ReadSegments(unsigned int (*)(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*), void*, unsigned int, unsigned int*) (nsPipe3.cpp:799)
==54156==    by 0x10747E3CC: imgRequest::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned int, unsigned int) (imgRequest.cpp:1158)
==54156==    by 0x10746BB8B: ProxyListener::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned int, unsigned int) (imgLoader.cpp:2100)
==54156==
I don't know why yet but I'm almost positive that the problem is with bpc getting a value of 3 instead of 4.

bpc = (mBFH.bihsize == OS2_BIH_LENGTH) ? 3 : 4; // OS/2 Bitmaps have no padding byte
   
Will fix soon and suggest aurora and beta pushes.
I found a consistent way to reproduce this, basically by forcing small amounts to be fed into WriteInternal at a time:

> void
> nsICODecoder::WriteInternal(const char* aBuffer, PRUint32 aCount) 
> {
>   const int BLOCK_SIZE = 55;
>   int numBlocks = aCount / BLOCK_SIZE;
>   int partialBlock = aCount % BLOCK_SIZE;
>   for (int i = 0; i < numBlocks; ++i) {
>     WriteInternal2(aBuffer + (BLOCK_SIZE * i), BLOCK_SIZE);
>   }
>   WriteInternal2(aBuffer + (BLOCK_SIZE * numBlocks), partialBlock);
> }
> void
> nsICODecoder::WriteInternal2(const char* aBuffer, PRUint32 aCount)
> {
>... normal WriteInternal code
Summary: Mozilla dino favicon is moldy green and corrupted. → BMPs <= 8BPP can have bad color tables when processed partially
Attached patch Fixed BMP code (obsolete) — Splinter Review
Fixed. 
This is a regression that is in FF9 and above, so it will need to go into Beta and Aurora.

The problem has always been around for BMPs, but not for the duplicated BMP code that was in ICO code.
When we did the refactoring of PNG encoded ICOs to remove duplicated BMP code in ICO, we started to use the BMP code for ICOs.
This is why we see the bug today, it is using the existing BMP code which always had a bug in it.

Here is a description of the bug:
The BMP encoder, when creating the color table, loops through the buffer setting blue, green, red, and then padding, one byte at a time.
It increments to the next colorIndex after hitting the red byte.
When there is a padding byte, that is the 3rd of 4 bytes in the current color index.

Presumably the person that coded this originally did this because you have NO padding byte in the color table for OS2 BMPs
and you do have a padding byte in the color table for Windows BMPs.  That way they could increment colorNum once.

The problem with this is that if a single WriteInternal happens to end at the red, let's say for the 26th color, 
then it will increment the locally declared colorNum to be 26, but have no more data to process so exits.

The next entry of WriteInternal calculates us to be at colorNum = 25 since we aren't finished processing the color yet.
> PRUint8 colorNum = colorBytes / bytesPerColor;
> PRUint8 at = colorBytes % bytesPerColor;

After it processes that padding byte, it doesn't incrmeent to 25, it stays at 25 because the incrementing only happens on byte 2.
So we re-process a colortable entry of 25 again.
  
The fix is to increment colorNum when it reaches 3, when a padding byte is present, not when it is at 2.
The bug would happen anytime you have a BMP (or ICO that contains a BMP) with a color table (Bits per pixel of 1,4,8) and a WriteInternal that is split half way through the colorTable code, and it happened to end when processing the red byte.
Assignee: nobody → netzen
Attachment #575687 - Flags: review?(joe)
Repalce "The BMP encoder" with "The BMP decoder" above
Comment on attachment 575687 [details] [diff] [review]
Fixed BMP code

Risk of introducing further regressions: None

Description of why I want it in Aurora and Beta:
This has always been a bug in our BMP code, but as of FF9 we started to use that BMP code from within ICO code.  So it is a regression in ICO because of that.  For further details please see Comment 13.  This can lead to some icons not displaying correctly, in particular, our beloved Mozilla dino icon.

If this gets Aurora+ and Beta+ I will not push to Aurora and Beta until Joe reviews it.
Attachment #575687 - Flags: approval-mozilla-beta?
Attachment #575687 - Flags: approval-mozilla-aurora?
Summary: BMPs <= 8BPP can have bad color tables when processed partially → BMPs <= 8BPP can have bad color tables when color table is processed in parts
[triage comment]

I think we want to take this, but we'll hold off on the approval for the r+.
Comment on attachment 575687 [details] [diff] [review]
Fixed BMP code

Review of attachment 575687 [details] [diff] [review]:
-----------------------------------------------------------------

A reftest here would not go awry. You'll have to write an HTTP reftest, similar to the one in layout/reftests/backgrounds/delay-image-response.sjs, but which sends only part of the image, delays a certain number of seconds, and then sends the rest.

::: image/decoders/nsBMPDecoder.cpp
@@ +405,3 @@
>                      break;
>                  case 3:
> +                    // This is a padding byte, only present in Windows BMPs

Here and in case 2, you don't give details on why you're incrementing the color index. You should - "We're done processing this colour. On to the next"
Attachment #575687 - Flags: review?(joe) → review+
Thanks Joe, I updated the comment. 

The tests for this will be covered in the context of bug 703913 which I filed to find similar existing problems in a more general way. I updated that bug with your suggestions about the sleep and send image in parts via HTTP, thanks for the suggestion.

This patch was pushed to try here:
https://tbpl.mozilla.org/?tree=Try&rev=927f03f10233
Attachment #575687 - Attachment is obsolete: true
Attachment #575687 - Flags: approval-mozilla-beta?
Attachment #575687 - Flags: approval-mozilla-aurora?
Attachment #576807 - Flags: review+
Try results had a intemittent build failure, I repushed to try and it passed. 

Pushed to inbound:
http://hg.mozilla.org/integration/mozilla-inbound/rev/0f74e9a5ea15

Christian Legnitto in Comment 16 mentioned they probably want this for Beta (I definitely think it should be), so marking as Target milestone 9.
Target Milestone: --- → mozilla9
Comment on attachment 576807 [details] [diff] [review]
Fixed BMP code, v2.  Updated with review nit.

See Comment 15 for the reasoning of why it should be in Beta and Aurora.
See Comment 16 for LegNeato's triage comment.
Attachment #576807 - Flags: approval-mozilla-beta?
Attachment #576807 - Flags: approval-mozilla-aurora?
https://hg.mozilla.org/mozilla-central/rev/0f74e9a5ea15

I was also hit by this bug, as well as dietrich reported it on irc some days ago, so it should be quite visible among users.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Attachment #576807 - Flags: approval-mozilla-beta?
Attachment #576807 - Flags: approval-mozilla-beta+
Attachment #576807 - Flags: approval-mozilla-aurora?
Attachment #576807 - Flags: approval-mozilla-aurora+
I'll be pushing these to Aurora and Beta tomorrow morning. Just don't want to do it tonight in case there are any problems and I'm not around.
Whiteboard: [qa+]
I have verified this using the steps from the description on:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:9.0) Gecko/20100101 Firefox/9.0 beta 6
Mozilla/5.0 (X11; Linux i686; rv:9.0) Gecko/20100101 Firefox/9.0 beta 6
Mozilla/5.0 (Windows NT 6.1; rv:9.0) Gecko/20100101 Firefox/9.0 beta 6

After several page reloads, the issue is no longer reproducing. The red dino favicon always appears red.

Setting resolution to Verified Fixed on Beta.
Keywords: verified-beta
Whiteboard: [qa+] → [qa+][qa!:9]
Verified on:
Mozilla/5.0 (Windows NT 6.1; rv:10.0) Gecko/20100101 Firefox/10.0 beta 1
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0) Gecko/20100101 Firefox/10.0 beta 1
Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0 beta 1
Whiteboard: [qa+][qa!:9] → [qa+][qa!:9][qa!:10]
Verified on:
Mozilla/5.0 (Windows NT 6.1; rv:10.0) Gecko/20100101 Firefox/10.0 beta 1
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0) Gecko/20100101 Firefox/10.0 beta 1
Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0 beta 1
Whiteboard: [qa+][qa!:9][qa!:10] → [qa!]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: