Last Comment Bug 726004 - Fix bmp/ico orange (bug 721917) by listening for image data errors
: Fix bmp/ico orange (bug 721917) by listening for image data errors
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: ImageLib (show other bugs)
: unspecified
: x86 Mac OS X
: -- normal (vote)
: mozilla13
Assigned To: Justin Lebar (not reading bugmail)
:
Mentors:
Depends on: 715308 721917
Blocks:
  Show dependency treegraph
 
Reported: 2012-02-10 08:00 PST by Justin Lebar (not reading bugmail)
Modified: 2012-02-15 08:43 PST (History)
5 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
fixed


Attachments
Patch v1 (2.27 KB, patch)
2012-02-13 15:35 PST, Justin Lebar (not reading bugmail)
joe: review+
Details | Diff | Splinter Review

Description Justin Lebar (not reading bugmail) 2012-02-10 08:00:27 PST
It appears that the bmp decoder notices invalid bits, writes them to the output buffer, and then does not immediately return an error code.  This seems pretty bogus to me, and I think it may be what's causing bug 721917.

Why my patch to bug 715308 causes this, I do not know.
Comment 1 Justin Lebar (not reading bugmail) 2012-02-10 08:37:24 PST
Ah, I see.  The BMP decoder sends a data error, but not a decoder error.  And we listen only for decoder errors.  So this isn't a bug in the BMP decoder, but maybe I can fix it in RasterImage.
Comment 2 Justin Lebar (not reading bugmail) 2012-02-10 08:48:56 PST
https://tbpl.mozilla.org/?tree=Try&rev=63022c335913
Comment 3 Justin Lebar (not reading bugmail) 2012-02-10 12:43:24 PST
Damn, that did not fix it.  :(
Comment 4 Justin Lebar (not reading bugmail) 2012-02-12 11:18:22 PST
This push [1], which adds a sync-until-size decode to AddSourceData (because originally we'd do a sync decode there, but I changed it so we don't), also did not fix things 

However, this push [2] has had many green reftest runs.  The relevant change is

    1.65 @@ -3001,22 +3012,22 @@ RasterImage::DecodeWorker::DecodeSomeOfImage
    1.66      }
    1.67  
    1.68      // We keep decoding chunks until either:
    1.69      //  * we're an UNTIL_SIZE decode and we get the size,
    1.70      //  * we don't have any data left to decode,
    1.71      //  * the decode completes, or
    1.72      //  * we run out of time.
    1.73  
    1.74 -    if (aDecodeType == DECODE_TYPE_UNTIL_SIZE && aImg->mHasSize)
    1.75 -      break;
    1.80 +    /*if (aDecodeType == DECODE_TYPE_UNTIL_SIZE && aImg->mHasSize)
    1.81 +      break;*/

I'm not sure why this is the magic change...

[1] https://tbpl.mozilla.org/?tree=Try&rev=23e6d49c4a3e
[2] https://tbpl.mozilla.org/?tree=Try&rev=ab57bf32c9fa
Comment 5 Justin Lebar (not reading bugmail) 2012-02-12 11:24:52 PST
What's particularly strange is that all of the images in image/test/reftest/bmp/bmp-corrupted have size less than 4096, which is the value of image.mem.decode_bytes_at_a_time.  Therefore, DecodeSomeOfImage should always decode all of the available data.  The loop should not run twice, and therefore the early break shouldn't make any difference.

Perhaps in some error case, we bail early and don't recognize the error until a second pass through the loop.
Comment 6 Justin Lebar (not reading bugmail) 2012-02-12 11:42:33 PST
https://tbpl.mozilla.org/?tree=Try&rev=5b75286fd31f
Comment 7 Justin Lebar (not reading bugmail) 2012-02-13 07:02:48 PST
The patch below seems to fix the problem, or at least make it significantly more rare (I have 234 greens on try).

But I think returning NS_ERROR_FAILURE here may cause us to throw onerror for corrupt image data, which is not what we want to do (onerror is for network errors, according to bholley and jst).  I'll see if I can return NS_OK here.

diff --git a/image/src/RasterImage.cpp b/image/src/RasterImage.cpp
--- a/image/src/RasterImage.cpp
+++ b/image/src/RasterImage.cpp
@@ -3001,16 +3001,22 @@ RasterImage::DecodeWorker::DecodeSomeOfI
     }
 
     // We keep decoding chunks until either:
     //  * we're an UNTIL_SIZE decode and we get the size,
     //  * we don't have any data left to decode,
     //  * the decode completes, or
     //  * we run out of time.
 
+    if (aImg->mDecoder->HasError()) {
+      printf("Hit aImg->mDecoder->HasError().  Bailing.\n");
+      aImg->DoError();
+      return NS_ERROR_FAILURE;
+    }
+
     if (aDecodeType == DECODE_TYPE_UNTIL_SIZE && aImg->mHasSize)
       break;
 
   } while (aImg->mSourceData.Length() > aImg->mBytesDecoded &&
            !aImg->IsDecodeFinished() &&
            TimeStamp::Now() < deadline);
 
   aImg->mDecodeRequest.mDecodeTime += (TimeStamp::Now() - start);
Comment 8 Justin Lebar (not reading bugmail) 2012-02-13 15:35:52 PST
Created attachment 596822 [details] [diff] [review]
Patch v1
Comment 9 Justin Lebar (not reading bugmail) 2012-02-13 15:44:00 PST
AFAICT, this just happened to work before bug 715308.

What I think must have happened is, on these synchronous decodes (e.g. triggered by SourceDataComplete), we'd to through the decode loop *twice*.  The first time, we'd notice invalid data and bail.  Then the second time, DecodeSomeData() must have returned an error code, since we'd already pushed all our data into the decoder.  (It returns an error code when mDecoder is null, and I'm not sure exactly how that happens.)  So we wouldn't end up flushing invalidations, because we'd return early.

Now we break out of the loop after one iteration on synchronous until-size decodes, so we don't get the benefit of this behavior.

The simple thing to do is not paint if the decoder flags an error.  This means that some bogus images may be cut off even earlier than they would be before, but I think that's OK.  We make no guarantees about being able to paint part of corrupted images.
Comment 10 Justin Lebar (not reading bugmail) 2012-02-13 15:56:39 PST
Tryserver runs for patch v1: https://tbpl.mozilla.org/?tree=Try&rev=044938ec3603
Comment 11 Joe Drew (not getting mail) 2012-02-13 17:22:31 PST
Comment on attachment 596822 [details] [diff] [review]
Patch v1

Review of attachment 596822 [details] [diff] [review]:
-----------------------------------------------------------------

::: image/src/RasterImage.cpp
@@ +3018,5 @@
>    if (chunkCount && !aImg->mDecoder->IsSizeDecode()) {
>      Telemetry::Accumulate(Telemetry::IMAGE_DECODE_CHUNKS, chunkCount);
>    }
>  
> +  // Flush invalidations after we've decoded all the chunks we're going to.

Add a "(and hence paint)" after "Flush invalidations". And maybe add a blank line.
Comment 12 Justin Lebar (not reading bugmail) 2012-02-13 18:27:21 PST
> The first time, we'd notice invalid data and bail.

To be clear, we'd notice invalid data and bail out of the decoder, having written some possibly invalid data to the output buffer, but we wouldn't bail out of the decode worker's loop.

https://hg.mozilla.org/integration/mozilla-inbound/rev/e2efe112d306
Comment 13 Marco Bonardo [::mak] (Away 6-20 Aug) 2012-02-15 08:43:15 PST
https://hg.mozilla.org/mozilla-central/rev/e2efe112d306

Note You need to log in before you can comment on or make changes to this bug.