Use StreamingLexer in the GIF decoder

RESOLVED FIXED in Firefox 50

Status

()

defect
RESOLVED FIXED
4 years ago
3 years ago

People

(Reporter: seth, Assigned: benjaminster, Mentored)

Tracking

(Blocks 2 bugs)

unspecified
mozilla50
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox50 fixed)

Details

(Whiteboard: [gfx-noted])

Attachments

(1 attachment, 5 obsolete attachments)

The existing GIF decoder code is a real mess. It'd probably be simplified quite a bit by reworking it to use StreamingLexer.

By itself this isn't an urgent matter, but because using StreamingLexer and following the method-per-state approach used in bug 1196066 for nsICODecoder would make it easier to correctly implement different "modes" for the GIF decoder, this may be relevant to bug 1187118. In bug 1187118 we'd ideally like to have GIF decoder modes that did things like find the boundaries between frames without actually decoding them, or decoding a single frame in isolation, and I think this refactoring would make that work a lot easier.

This would be a good bug for a new contributor with good C++ skills.
Mentor: seth
Whiteboard: [gfx-noted]
I found this bug from :njn's blogpost about writing the BMP decoder and I'd be really interested in taking this on or assisting.

I assume that I'd have to pick up a bunch about contributing to Mozilla, appropriate style, getting a build-test system set up, but I've worked professionally in a large C++ codebase, written a GIF decoder (and encoder) from scratch in JavaScript, and Just Like Browsers A Lot.
Sure, that would be great!
(In reply to Ben Ellis from comment #1)
> I found this bug from :njn's blogpost about writing the BMP decoder and I'd
> be really interested in taking this on or assisting.

Oh dear... I've already started this and I came here to assign it to myself and I saw this. The rewriting of the code is definitely a one-person job and not something that can be shared easily. It would also be good to improve test coverage as part of this bug (e.g. with https://code.google.com/p/imagetestsuite/wiki/GIFTestSuite), which is conceivably doable by a different person but it would be easier if one person did it all.

The good news is that I haven't got that far; I'm probably about an hour into it. So I could post what I have so far and then hand it over to Ben. I'm loathe to turn away an eager new contributor, and there are plenty of other things I can work on.

On the other hand, this is a large and complex bug for a new contributor. First bugs typically only involve touching a few lines of actual code because all the other stuff -- getting the code, building, testing, posting patches, getting review, etc. -- is usually more than enough for a first bug. Furthermore, Seth tells me that bug 1187118, which this depends on, is a Q4 (or Q1 next year?) goal for the layout(?) team, and so we want some confidence that this bug to be completed with some in the near future. (Not to doubt you, Ben, but that's always a risk with new contributors and I don't know how much time you have to devote to this.)

So, we have a decision to make. Seth, perhaps you should make the call? If you want more certainty I can do it and we can work together to find Ben a new bug. If you're willing to tolerate a higher risk then we can assign it to Ben. What do you think?
Flags: needinfo?(seth)
(In reply to Nicholas Nethercote [:njn] from comment #3)

FWIW I think your concerns are all spot-on.  The rewrite might proceed through several phases (breaking out WriteInternal's buffer logic and big switch into a switch lambda and functions returning LexerTransition<>s, possibly correcting style issues in the body of those functions, possibly correcting style issues in the GIF state struct) but none of those phases can be worked in parallel.  I do have Firefox building and running on a laptop, but I have a bunch more reading to do (though everything at https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions is super straightforward so far) before I understand the expected style and contribution flow.
Ben, I'd love to have more contributors working on this code, and it sounds to me like you have the skills to work on this bug. The main issue, as Nick mentioned above, is that there are other things which depend on this and which we're trying to ship within the next two or three months.

So here's what I'd like to propose: Ben, why don't you go ahead and take the bug, and give it a shot, and aim to complete it within about a month. If at the end of a month you're running into problems, or if at any time you decide that you'd rather work on something else =), we can talk again and decide how to proceed. Would that work for you?
Flags: needinfo?(seth)
Flags: needinfo?(benjaminster)
And obviously I'll be happy to provide any help and support I can. (And I don't mean to speak for him, but I imagine Nick will too.) You can find us on IRC in #layout or #memshrink.
My just-started patch. Things to note:

- I've started breaking up the big switch in WriteInternal() into
  LexerTransition functions, and I've done the first three states. It's not
  even close to being in a compilable state. I have two (old and new)
  WriteInternal() functions at the moment.

- I've started using LittleEndian::read*() functions (from mfbt/Endian.h);
  please continue doing that where appropriate.

- Bug 1210291 is likely to land soon. It slightly tweaks StreamingLexer's
  handling of terminal states. Just a minor API change, but something to be
  aware of.

- I use comments starting with "njn:" (my initials) to mark things that I need
  to come back to and fix up. There are already plenty of them.

- I've been fixing style issues (including sub-standard comments) in the lines
  that I've moved out of the big switch. Might as well do it while those lines
  are showing up in patch diffs.

- A lot of the names used don't match the names in the GIF spec. (E.g.
  "global_colormap" for the Global Color Table.) I was planning to use the spec
  terminology where appropriate, though I haven't got far with that.

- GIF2.h doesn't need to be a separate file. It's contents can be moved into
  nsGIFDecoder2.{h,cpp}. Though it might be better to make that a separate
  follow-up patch, because smaller patches are easier to review and the rewrite
  of the big switch alone will be more than large enough.

Hopefully this is useful. It might be worth starting from scratch and then
using this patch as a reference rather than continuing it, because the first
couple of states are the easiest and therefore a good way to get your head
around things.
Assignee: nobody → n.nethercote
Status: NEW → ASSIGNED
Assignee: n.nethercote → nobody
Status: ASSIGNED → NEW
Ben: Seth set the "NEEDINFO" flag for you on this bug, which is a Bugzilla feature that indicates he would like you to respond. If you're still happy to take this bug, please say so. I'm not sure if you'll have permissions to change the assignee to yourself or not; if not just say so and I can do it for you. Thank you.
Thanks, Nick.  I would like to take the bug and it appears that I cannot assign it to myself.
Flags: needinfo?(benjaminster)
Assignee: nobody → benjaminster
I started out by writing a few LexerTransition-returning functions and getting everything to compile.  Tonight I translated the rest of the current code to functions returning LexerTransitions.  Once I'm happy with that initial translation and the change to WriteInternal to use the StreamingLexer, how should I post/share the patch for style feedback / guidance about what to clean up next?  What steps can or should I take to test locally or is there a way I can get my changes tested through the build system?

Sorry for the long silence.
Status: NEW → ASSIGNED
Ways to test:

1. Just visit sites with lots of GIFs and make sure they're rendering ok.

2. Run the GIF reftests with a command like this:  ./mach reftest image/test/reftest/gif

3. If you can import https://code.google.com/p/imagetestsuite/wiki/GIFTestSuite into the reftests, that would be *great*. You can see I did something similar with BMPSuite in bug 1204394. Docs on reftests are at https://developer.mozilla.org/en/docs/Creating_reftest-based_unit_tests.

4. After that there's pushing to try server to test on a range of machines, but that can wait a little longer.

Once you've done some testing and are confident things are correct, you can upload the patch to this bug (via the "Add an attachment" link above) and set the "feedback" checkbox to "?" and put ":njn" (me) as the person to give feedback.
Everything in comment 11 is correct. There are some additional tests you should run locally as well:

1. The "new" ImageLib unit tests: |./mach gtest 'Image*'|

2. The "old" ImageLib unit tests: |./mach xpcshell-test image/test/unit|

3. The ImageLib general mochitests: |./mach mochitest image/test/mochitest|

Combined with Nick's suggestions above, if these tests pass you can be quite confident that you didn't break anything.

A tip about the mochitests: you may get spurious failures if your desktop resolution isn't high enough that the entire mochitest window fits on-screen. That will be the case for a rMBP, for example, if you leave it set at the default resolution.

Please also request feedback from me (:seth) via the same mechanism that Nick explains above.
> A tip about the mochitests: you may get spurious failures if your desktop
> resolution isn't high enough that the entire mochitest window fits
> on-screen. That will be the case for a rMBP, for example, if you leave it
> set at the default resolution.

You can confirm if such failures are your fault by running the tests without the patch applied and seeing if they still occur.
After debugging with an informal but semi-broad set of GIFs, the patched version passed all the tests in the 3 suites listed above.

Will clean up and post patch tonight.
Posted patch gif-streaming-lexer.patch (obsolete) — Splinter Review
Focused on breaking the old `WriteInternal` case statement out into methods returning LexerTransition<>s.  More to do in terms of cleaning up names, style.
Attachment #8699763 - Flags: feedback?(seth)
Attachment #8699763 - Flags: feedback?(n.nethercote)
Comment on attachment 8699763 [details] [diff] [review]
gif-streaming-lexer.patch

Review of attachment 8699763 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you for the patch. It looks like an excellent start.

I did a quick pass focusing on style issues; it's good to get them out of the way so they're not a distraction when later reviewing for the more important stuff. (Also, Seth will be the final reviewer because he has a much deeper knowledge about this code than I do, so I figured I'd help in the earlier stages while I can.)

There are lots of nitpicky things. That's typical for a first patch, especially such a large one, so don't be discouraged!  Also, I only looked at the added code, not the removed code, so some of the things I complained about may have been pre-existing. If so, consider this an opportunity to fix up these pre-existing issues while you're in here :)

Have you done testing with any images not currently in the repository, such as GIFTestSuite? It might make sense to import any tests like that.

I'll cancel the feedback request from Seth. It makes sense for you to address my comments and upload a new patch and then ask him for feedback on the new version. That saves him from having to make comments that might overlap with mine.

Hope this all makes sense and seems reasonable. I'm going on vacation for the last two weeks of December so I won't be able to give any more feedback during that time :(

::: image/decoders/GIF2.h
@@ +83,5 @@
>                                  // for this image in a multi-image GIF
>  
> +    //ble: consider breaking this out into its own struct,
> +    //or putting this at the top level of a struct which
> +    //contains many of these other fields as sub-structs.

You've prefaced many, but not all, comments with "ble:". Some of these look like FIXME/TODO-style comments but some don't. What is the label intended to communicate? If it just shows that you wrote the comment, there's no need for that. Please make it clear and consistent which comments are TODO-style ones and which aren't.

Also, please put a space after '//' in comments. That needs changing on many comments in the patch.

::: image/decoders/nsGIFDecoder2.cpp
@@ +73,5 @@
>  // GIF Decoder Implementation
>  
>  nsGIFDecoder2::nsGIFDecoder2(RasterImage* aImage)
>    : Decoder(aImage)
> +  , mLexer(Transition::To(TYPE, 6))

Replace 6 with GIF_HEADER_LEN?

@@ +711,2 @@
>  
> +  Maybe<State> terminalState =

Looks like you haven't updated your copy of the code since bug 1210291 landed. Please do so. You'll have to make some simple changes. See https://hg.mozilla.org/mozilla-central/rev/793133ff5233 for examples, e.g. the changes to nsIconDecoder.{h,cpp}.

@@ +714,5 @@
> +      aBuffer,
> +      aCount,
> +      [=](State aState,
> +          const char* aData,
> +          size_t aLength) {

I'd put these three parameters on a single line, along with the '{'.

@@ +719,5 @@
>  
> +        switch(aState) {
> +          case TYPE:                           return ReadHeader(aData);
> +          case SCREEN_DESCRIPTOR:              return ReadScreenDescriptor(aData);
> +          case GLOBAL_COLOR_TABLE:             return ReadGlobalColorTable(aData);

The first three funcs called in this switch have a |Read| prefix but the rest don't. I don't mind which way you go, but consistency is preferable.

@@ +725,5 @@
> +          case IMAGE_HEADER:                   return ImageHeader(aData);
> +          case LOCAL_COLOR_TABLE:              return LocalColorTable(aData);
> +          case LZW_START:                      return ImageLZWStart(aData);
> +          case LZW:                            return ImageLZW(aData);
> +          case SUB_BLOCK:                      return SubBlock(aData);

There are four places where the decoding states are listed:
- this switch statement;
- the order of the decoding function definitions;
- the order of the decoding function declarations (in the header);
- the order of the values in the State enum.

Please use a consistent ordering. This switch statement starts with ReadHeader, which makes sense to me -- the order of the states in all the cases should mirror the order in which the states are reach, as much as that is possible/makes sense.

@@ +748,5 @@
>    }
>  
> +  if (*terminalState == State::FAILURE) {
> +    PostDataError();
> +    return;

Once you import the changes from bug 1210291 you can replace these seven lines of terminalState checking with these three:

>  if (terminalState == Some(TerminalState::FAILURE)) {
>    PostDataError();
>  }

(copied from the BMP and Icon decoders).

@@ +753,3 @@
>    }
>  
> +  // We want to flush before returning if we're on the first frame

Please end comments with a '.' so they are proper sentences. There are lots of comments that need addressing in this way. (Use upper-case letters at the start of comment sentences, too.)

@@ +759,5 @@
>      mLastFlushedPass = mCurrentPass;
>    }
>  }
>  
> +//ble: equiv of "case gif_lzw:"

Remove all these "equiv of" comments; they're not useful.

@@ +764,5 @@
> +LexerTransition<nsGIFDecoder2::State>
> +nsGIFDecoder2::ImageLZW(const char* aData)
> +{
> +  //ble: TODO: is it okay to make this case?
> +  if (!DoLzw((const uint8_t*)aData)) {

Use static_cast<>.

@@ +785,5 @@
> +    mColormap[mGIFStruct.tpixel] = 0;
> +  }
> +
> +  // Initialize LZW parser/decoder
> +  mGIFStruct.datasize = (uint8_t) aData[0];

Avoid the cast with |uint8_t(aData[0])|.

@@ +796,5 @@
> +  mGIFStruct.avail = clear_code + 2;
> +  mGIFStruct.oldcode = -1;
> +  mGIFStruct.codesize = mGIFStruct.datasize + 1;
> +  mGIFStruct.codemask = (1 << mGIFStruct.codesize) - 1;
> +  mGIFStruct.datum = mGIFStruct.bits = 0;

Chained assignments can be hard to read. I'd just make it two separate assignments on two separate lines.

@@ +811,5 @@
> +}
> +
> +//ble: equiv of "case gif_type:"
> +LexerTransition<nsGIFDecoder2::State>
> +nsGIFDecoder2::ReadHeader(const char* aData)

This should be the first decoding function (see comment about ordering above).

@@ +844,5 @@
> +  mGIFStruct.global_colormap_depth = (packedFields & TABLE_DEPTH_MASK) + 1;
> +  mGIFStruct.global_colormap_count = 1 << mGIFStruct.global_colormap_depth;
> +
> +  //ble: are the background color index and pixel aspect ratio okay to ignore?
> +  if(packedFields & COLOR_TABLE_MASK) {

Nit: space after |if|.

@@ +847,5 @@
> +  //ble: are the background color index and pixel aspect ratio okay to ignore?
> +  if(packedFields & COLOR_TABLE_MASK) {
> +    return Transition::To(
> +      State::GLOBAL_COLOR_TABLE,
> +      3 * mGIFStruct.global_colormap_count);

Nit: more typical Mozilla style is like this:

> return Transition::To(State::GLOBAL_COLOR_TABLE,
>                       3 * mGIFStruct.global_colormap_count);

@@ +848,5 @@
> +  if(packedFields & COLOR_TABLE_MASK) {
> +    return Transition::To(
> +      State::GLOBAL_COLOR_TABLE,
> +      3 * mGIFStruct.global_colormap_count);
> +  } else {

Mozilla style is no |else| after |return|. You could use a ?: expression to give the two cases the same indentation level and visual emphasis.

@@ +860,5 @@
> +{
> +  memcpy(
> +    mGIFStruct.global_colormap,
> +    aData,
> +    3 * mGIFStruct.global_colormap_count);

Assuming it doesn't fit on one line, typical Mozilla style would be this:

> memcpy(mGIFStruct.global_colormap, aData,
>        3 * mGIFStruct.global_colormap_count);

@@ +864,5 @@
> +    3 * mGIFStruct.global_colormap_count);
> +
> +  ConvertColormap(
> +    mGIFStruct.global_colormap,
> +    mGIFStruct.global_colormap_count);

Similar here.

@@ +903,5 @@
> +        FinishInternal();
> +        return Transition::Terminate(State::SUCCESS);
> +      } else {
> +        // No images decoded, there is nothing to display.
> +        return Transition::Terminate(State::FAILURE);

No |else| after |return|. Again, a ?: expression might be nice.

@@ +913,5 @@
> +LexerTransition<nsGIFDecoder2::State>
> +nsGIFDecoder2::Extension(const char* aData)
> +{
> +  uint8_t label = aData[0],
> +          blockLength = aData[1];

Since they're on separate lines I'd just do:

> uint8_t label       = aData[0];
> uint8_t blockLength = aData[1];

@@ +916,5 @@
> +  uint8_t label = aData[0],
> +          blockLength = aData[1];
> +
> +  //ble: if the extension block has no data, we transition right past it.
> +  if(blockLength == 0) {

Nit: space after |if|.

@@ +929,5 @@
> +      // bytes. If the GIF specifies a different length, we allow that, so
> +      // long as it's larger; the additional data will simply be ignored.
> +      return Transition::To(
> +        State::CONTROL_EXTENSION,
> +        std::max(blockLength, (uint8_t) 4));

Identation again. Also, use uint8_t(4), or just |4| if the compiler accepts it.

@@ +939,5 @@
> +        return Transition::To(
> +          State::MAYBE_NS_APPLICATION_EXTENSION,
> +          blockLength);
> +      } else {
> +        return Transition::To(State::APPLICATION_EXTENSION, blockLength);

No |else| after |return| again. And again, ?: might be nice.

@@ +962,5 @@
> +  if (blockLength == 0) {
> +    return Transition::To(State::IMAGE_START, 1);
> +  }
> +
> +  return Transition::To(State::SKIP_BLOCK, blockLength);

Could do a ?: expression here too... or not. Either way is ok. The same goes for several more decoder functions below.

@@ +977,5 @@
> +LexerTransition<nsGIFDecoder2::State>
> +nsGIFDecoder2::ControlExtension(const char* aData)
> +{
> +  mGIFStruct.is_transparent = aData[0] & 0x1;
> +  mGIFStruct.tpixel = (uint8_t) aData[3];

uint8_t(aData[3]);

@@ +1061,5 @@
> +  //ble: We consume a minimum of 3 bytes in accordance with what specs exist
> +  //     for the Netscape application extension block.
> +  return Transition::To(
> +    State::CONSUME_NETSCAPE_EXTENSION,
> +    std::max(blockLength, (uint8_t) 3));

Indentation again. uint8_t(3) again (or just |3|).

@@ +1073,5 @@
> +  //     an alternate value for the sub-block ID.
> +  uint8_t subBlockID = aData[0] & 7;
> +  if (subBlockID == 1) {
> +
> +    mGIFStruct.loop_count = LittleEndian::readUint16(aData + 1);

No need for a blank line after an |if|, here and below in this function.

@@ +1077,5 @@
> +    mGIFStruct.loop_count = LittleEndian::readUint16(aData + 1);
> +
> +    return Transition::To(NETSCAPE_EXTENSION_BLOCK, 1);
> +
> +  } else if (subBlockID == 2) {

No |else| after |return|, here and lower down in this function.

@@ +1101,5 @@
> +nsGIFDecoder2::ImageHeader(const char* aData)
> +{
> +  if (mGIFStruct.images_decoded == 1) {
> +
> +    if (!HasAnimation()) {

No blank line after |if| again.

@@ +1177,5 @@
> +    }
> +  }
> +
> +  uint8_t packedFields = aData[8];
> +  //Determine depth (log base 2 of # of colors in palette) and realDepth

Use "number" instead of "#", here and below.

@@ +1184,5 @@
> +  uint16_t depth, realDepth;
> +  {
> +    if (packedFields & 0x80) {
> +      //Get it from the local color table
> +      depth = (packedFields & 0x07) + 1;

No need for this local block.

@@ +1262,5 @@
> +    }
> +    const uint32_t size = 3 << depth;
> +    if (mColormapSize > size) {
> +      // Clear the part of the colormap which will be unused with this palette
> +      memset( ((uint8_t*)mColormap) + size, 0, mColormapSize - size);

Use static_cast<> (or reinterpret_cast<>, if necessary?)

@@ +1266,5 @@
> +      memset( ((uint8_t*)mColormap) + size, 0, mColormapSize - size);
> +    }
> +
> +    return Transition::To(State::LOCAL_COLOR_TABLE, size);
> +  } else {

No |else| after |return|.

@@ +1295,5 @@
> +}
> +
> +//ble: equiv of "case gif_sub_block:"
> +LexerTransition<nsGIFDecoder2::State>
> +nsGIFDecoder2::SubBlock(const char* aData) //after case gif_sub_block

This comment seems unnecessary now.

@@ +1311,5 @@
> +  if (!mGIFStruct.rows_remaining) {
> +
> +//ble: need someone more knowledgeable of the build system to say whether this
> +//     ever gets defined.
> +#ifdef DONT_TOLERATE_BROKEN_GIFS

https://dxr.mozilla.org/mozilla-central/search?q=DONT_TOLERATE_BROKEN_GIFS&redirect=false&case=true suggests that this is the only occurrence of that constant in the codebase. I think it's safe to assume it's undefined and remove it.

::: image/decoders/nsGIFDecoder2.h
@@ +57,5 @@
>    inline int ClearCode() const { return 1 << mGIFStruct.datasize; }
>  
> +  typedef enum {
> +    FAILURE,
> +    SUCCESS,

Once you update and get the changes from bug 1210291 you can remove these two states.

@@ +77,5 @@
> +    APPLICATION_EXTENSION,
> +    NETSCAPE_EXTENSION_BLOCK,
> +    CONSUME_NETSCAPE_EXTENSION,
> +    CONSUME_COMMENT
> +  } State;

Make the ordering consistent (as per above).

@@ +82,5 @@
> +
> +  LexerTransition<State> ImageLZW(const char* aData);
> +  LexerTransition<State> ImageLZWStart(const char* aData);
> +  LexerTransition<State> ReadScreenDescriptor(const char* aData);
> +  LexerTransition<State> ReadGlobalColorTable(const char* aData);

Ditto.
Attachment #8699763 - Flags: feedback?(seth)
Attachment #8699763 - Flags: feedback?(n.nethercote)
Attachment #8699763 - Flags: feedback+
Ben, will you have time to complete this work in the near future? I'm afraid we're at the point where the need for this is getting more urgent.
Flags: needinfo?(benjaminster)
I'll have some time to work on it this week.  Before a long break, I'd cleared up the obvious style / naming / ordering issues and merged my changes with the StreamingLexer terminal states change and a GIF image bounds change, but found that I'd started failing some tests after the merge.
Flags: needinfo?(benjaminster)
Posted patch gif-streaming-lexer-2.patch (obsolete) — Splinter Review
Style issues fixed, merged with some GIF and StreamingLexer changes.
Attachment #8699763 - Attachment is obsolete: true
Thanks for tackling this, Ben!

This patch needs to be rebased against the current version of nsGIFDecoder2, which might be a bit complex, unfortunately. I'll do this for you as soon as I can.

Beyond the rebasing, what else are you aware of that needs to be done before this patch is ready for review?
Flags: needinfo?(benjaminster)
Blocks: 1187118
Thanks Seth!

Aside from getting those tests passing again, there's nothing else I can think of that would remain necessary after a rebase.

We did talk about adding GIFTestSuite, but I really don't know that it provides enough information to make _browser_ test cases; the file descriptions include "frame X has a serious error in it" but nothing like "user agents rendering this should loop frames 0 through X-1"...
Flags: needinfo?(benjaminster)
(In reply to Ben Ellis from comment #21)
> Aside from getting those tests passing again, there's nothing else I can
> think of that would remain necessary after a rebase.

Sounds good. It'd be worth re-testing after the rebase, as the GIF decoder has actually been greatly simplified recently, and it's possible that some of those issues will disappear. (If you're curious as to what has changed, take a look at bug 1247152 and bug 1246851.)

> We did talk about adding GIFTestSuite, but I really don't know that it
> provides enough information to make _browser_ test cases; the file
> descriptions include "frame X has a serious error in it" but nothing like
> "user agents rendering this should loop frames 0 through X-1"...

Yeah, I just took a look and I see what you mean. I think it's still worth bringing those tests into the codebase, if you can find the time to do it. To decide what we *should* do, generally I look at what Firefox and other browsers (usually WebKit/Blink is enough) currently do and then decide based on that. For corrupt images (which presumably most of the GIFTestSuite images are) the rules of thumb I use are:

(1) If neither Firefox nor other browsers currently decodes them, the test should check that they *don't* decode.

(2) If both Firefox and other browsers currently decode them, the test should check that they *do* decode and that the rendering is correct.

(3) If Firefox currently decodes them but other browsers don't, the test should check that they *don't* decode. If it's an easy fix, we should make the test pass. Otherwise, we should mark the test as failing and file a followup bug. We want to minimize the number of corrupt images that we accept, subject to the limitation that we want to be web-compatible.

(4) If Firefox currently fails to decode them but other browsers decode them successfully, the test should check that they *do* decode and that the rendering is correct. Again, if it's an easy fix we should just fix it; otherwise we should mark the test as failing and fix it in a followup bug.

Hopefully that's clear enough. I'd love to get GIFTestSuite imported into the codebase. The more tests we have, the better confidence we have that changes don't break anything, which is great when we're making huge changes like your patch makes.
OK, I've been working on ImageLib again lately and it seemed like a good time to
do this rebase because we really need to get this done.

Ben, thank you so much for doing this work. Your patch was great - this was a
big task, and you pulled it off. I massively appreciate your contribution and I
hope you contribute more code to this project in the future.

You can consider this updated patch not only the rebase but also effectively my
review; I went ahead and made all of the changes I wanted to see. I've listed
them below:

- First and most importantly, I made all of this code comply with the Mozilla
  Coding Style Guide. Ben, you can find that document here:
    https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style
  Virtually all of my changes were style related.

- I renamed the states to better match the terminology in the spec, renamed the
  methods to match the state names, and ordered things so that the State enum
  values, the switch statement in WriteInternal(), and the order of the methods
  themselves are all the same. This made it easier to understand what the new
  code is doing, and should make it easier for maintainers in the future as
  well.

- I moved a few comments that got lost during the refactoring into the new code.
  Ben, thanks for adding the new comments that you added; they were valuable
  additions. I also added some of my own new comments whenever something
  confused me during the review process, since I figured the same things would
  confuse other people as well.

- I collapsed the various states that just skipped data sub-blocks into only two
  states, since that code was basically duplicated several times.

- I switched the code that reads the LZW image data sub-blocks to read the data
  in unbuffered mode, instead of waiting to read an entire sub-block before
  doing any decoding. This should improve our ability to display the image on
  slow connections a little bit. The original code didn't do this, so this is an
  improvement; it's so straightforward with StreamingLexer that there was really
  no reason not to make the change.

- I *may* have fixed those test failures. I didn't do so intentionally, but my
  initial attempt at rebasing broke the code a little and in the process of
  debugging that, I seem to have fixed existing test failures, or at least I'm
  not hitting them locally.

Nick, since I made all these changes I think it's best that you review as well.
I know this is a huge patch and the diff is nigh-useless; don't feel like it's
urgent that you get it done quickly. We don't want to land this until after the
merge anyway, at a minimum.
So I didn't request review when uploading the previous patch because I
remembered that I forgot to make a change I wanted to make. (That's what I get
for working on several things at once!)

This patch makes skipping over data sub-blocks we don't care about use
unbuffered reads, and also uses unbuffered reads for reading in color tables.
This change could easily save us several kilobytes of intermediate copies *per
frame* of an animated gif, so it's well worth doing. (And it wasn't too hard!)

Additionally, I've also added an explicit default buffer size for the
StreamingLexer. It happens to be the same as the normal default, but it's nice
to document that somebody bothered to think this through. This, again, should
keep us from performing unnecessary memory allocation and copies. 

I'd say this is ready for review now. I'm going to r? myself and r+ it to
reflect that I am r+'ing, in some sense, Ben's patch, but given that I made a
bunch of changes, Nick should review it too (as mentioned above).
Attachment #8759027 - Flags: review?(seth)
Attachment #8759027 - Flags: review?(n.nethercote)
Attachment #8686798 - Attachment is obsolete: true
Attachment #8730680 - Attachment is obsolete: true
Attachment #8759022 - Attachment is obsolete: true
Attachment #8759027 - Flags: review?(seth) → review+
Nice!

If you think we'd want a bit of a run of this change in nightly before it goes to dev edition, let's not land until after the weekend.  Assuming it gets through reviews before then, of course.
I will start fuzzing as soon as I can get my hangs on an inbound build that includes it.
(In reply to Milan Sreckovic [:milan] from comment #25)
> Nice!
> 
> If you think we'd want a bit of a run of this change in nightly before it
> goes to dev edition, let's not land until after the weekend.  Assuming it
> gets through reviews before then, of course.

Yep, that's my plan.
Comment on attachment 8759027 [details] [diff] [review]
Use StreamingLexer in the GIF decoder.

Review of attachment 8759027 [details] [diff] [review]:
-----------------------------------------------------------------

Lovely. So much nicer than the old code. Lots of comments below but they're all minor.

::: image/decoders/nsGIFDecoder2.cpp
@@ +516,5 @@
> +  // GIF89a are always accepted.
> +  if (strncmp(aData, "GIF87a", GIF_HEADER_LEN) == 0) {
> +    mGIFStruct.version = 87;
> +  } else if (strncmp(aData, "GIF89a", GIF_HEADER_LEN) == 0) {
> +    mGIFStruct.version = 89;

Thank you for changing this to |strcmp(...) == 0| which is much clearer than |!strncmp(...)|.

@@ -695,5 @@
> -            std::max(mGIFStruct.bytes_to_consume, 4u);
> -          break;
> -
> -        // The GIF spec also specifies the lengths of the following two
> -        // extensions' headers (as 12 and 11 bytes, respectively). Because

The new comment removes the mention of 12 bytes -- was that just incorrect?

@@ +543,5 @@
> +
> +  // We ignore several fields in the header. We don't care about the 'sort
> +  // flag', which indicates if the global color table's entries are sorted in
> +  // order of importance, as if we need to render this image for a device with a
> +  // narrower color gamut then GIF supports we'll handle that at a different

s/as if/because if/ or s/, as if/ - if/

s/then GIF/than GIF/

@@ +607,5 @@
> +
> +      if (mGIFStruct.images_decoded > 0) {
> +        // The file is corrupt, but one or more images have been decoded
> +        // correctly. In this case, we proceed as if the file were correctly
> +        // terminated and set the state to gif_done, so the GIF will display.

|gif_done| no longer exists.

@@ +672,5 @@
> +  mGIFStruct.disposal_method = (aData[0] >> 2) & 0x7;
> +
> +  if (mGIFStruct.disposal_method == 4) {
> +    // Some specs say 3rd bit (value 4), other specs say value 3. Let's choose
> +    // 3 (the more popular).

I don't understand this comment. Clarify?

@@ +702,5 @@
> +  if ((strncmp(aData, "NETSCAPE2.0", 11) != 0) &&
> +      (strncmp(aData, "ANIMEXTS1.0", 11) != 0)) {
> +    // This is an application extension we don't care about. Just skip it.
> +    return Transition::To(State::SKIP_SUB_BLOCKS, SUB_BLOCK_HEADER_LEN);
> +  }

I feel like this function would read more naturally if you switched the ordering, i.e. if you tried to match the two cases you cared about and then fell back to the "skip it" case. But I'll let you decide which is better, because I can see the "get the exceptional case out of the way first" reasoning too...

@@ +730,5 @@
> +  // Documentation for NETSCAPE2.0 / ANIMEXTS1.0 extensions can be found at:
> +  //   https://wiki.whatwg.org/wiki/GIF
> +  const uint8_t subBlockID = aData[0] & 7;
> +  switch (subBlockID) {
> +    case 1:

Use named constants for the 1 and 2 case values here?

@@ +731,5 @@
> +  //   https://wiki.whatwg.org/wiki/GIF
> +  const uint8_t subBlockID = aData[0] & 7;
> +  switch (subBlockID) {
> +    case 1:
> +      // This is looping extension.

"*the* looping extension"? (Though named constants might remove the need for this comment.)

@@ +737,5 @@
> +      return Transition::To(State::NETSCAPE_EXTENSION_SUB_BLOCK,
> +                            SUB_BLOCK_HEADER_LEN);
> +
> +    case 2:
> +        // This is the buffering extension. We allow, but ignore, this extension. 

This comment is over-indented by 1 char. It also has trailing whitespace. (Again, named constants might remove the need for this comment.)

@@ -987,5 @@
> -        // Still working on the same image: Process next LZW data block
> -        // Make sure there are still pixels left. If the GIF data
> -        // is corrupt, we may not get an explicit terminator.
> -        if (mGIFStruct.pixels_remaining <= 0) {
> -#ifdef DONT_TOLERATE_BROKEN_GIFS

The new code doesn't use DONT_TOLERATE_BROKEN_GIFS. I guess it's not needed any more?

@@ +801,5 @@
> +      // Setting the size led to an error.
> +      return Transition::TerminateFailure();
> +    }
> +
> +    // If we were doing a metadata decode, we're done.

s/were/are/ ?

@@ +832,5 @@
> +  uint8_t packedFields = aData[8];
> +
> +  if (packedFields & 0x80) {
> +    // Get the palette depth from the local color table.
> +    depth = (packedFields & 0x07) + 1;

Please make COLOR_TABLE_MASK and TABLE_DEPTH_MASK visible outside of ReadScreenDescriptor() and use them here instead of hard-wiring the constants.

@@ +843,5 @@
> +  // If the transparent color index is greater than the number of colors in the
> +  // color table, we bump up |realDepth| until either it's less than or equal to
> +  // the number of colors, or we can't go any higher because a higher depth
> +  // would require more than one byte per palette index.
> +  realDepth = depth;

Move the declaration of realDepth down here so you don't need the unused zero assignment?

@@ +853,5 @@
> +  // Create a mask used to ensure that color values fit within the colormap.
> +  mColorMask = 0xFF >> (8 - realDepth);
> +
> +  // Determine if this frame is interlaced or not.
> +  const bool isInterlaced = packedFields & 0x40;

Please define a constant for 0x40.

@@ +862,5 @@
> +  }
> +
> +  // While decoders can reuse frames, we unconditionally increment
> +  // mGIFStruct.images_decoded when we're done with a frame, so we both can and
> +  // need to zero out the colormap and image data after every new frame.

"both can and need to" reads weirdly. Maybe replace with just "can"?

@@ +881,5 @@
> +    mGIFStruct.local_colormap_size = 1 << depth;
> +
> +    if (mGIFStruct.images_decoded == 0) {
> +      // The first frame has a local color table. Allocate space for it as we
> +      // use a BGRA or BGRX surface for the first frame, and such surfaces don't

s/and/because/ ?

@@ +1016,5 @@
> +    if (mGIFStruct.images_decoded == 0) {
> +      result = mPipe.WritePixels<uint32_t>([&]{
> +        return YieldPixel<uint32_t>(data, length, &bytesRead);
> +      });
> +    } else {

I would stick with using the ?: operator for this assignment.

::: image/decoders/nsGIFDecoder2.h
@@ +117,5 @@
> +  // The StreamingLexer used to manage input. The initial size of the buffer is
> +  // chosen as a little larger than the maximum size of any fixed-length data we
> +  // have to read for a state. We read variable-length data in unbuffered mode
> +  // so the buffer shouldn't have to be resized during decoding.
> +  StreamingLexer<State, 16> mLexer;

Is it worth having an assertion at the end of decoding that confirms no resizing occurred?
Attachment #8759027 - Flags: review?(n.nethercote) → review+
Thanks for the thorough review! I know this one had to be rough.

(In reply to Nicholas Nethercote [:njn] from comment #28)
> The new comment removes the mention of 12 bytes -- was that just incorrect?

I think that was probably a reference to the plain text extension, which does indeed have a 12 byte header. We don't do anything with it, so we don't need to handle it differently from other unrecognized extensions.

> @@ +672,5 @@
> > +  mGIFStruct.disposal_method = (aData[0] >> 2) & 0x7;
> > +
> > +  if (mGIFStruct.disposal_method == 4) {
> > +    // Some specs say 3rd bit (value 4), other specs say value 3. Let's choose
> > +    // 3 (the more popular).
> 
> I don't understand this comment. Clarify?

I should've rewritten every comment in the file. =) As far as I can tell, this is just saying that specs exist that use |4| instead of |3| to mean that the disposal method should be "restore previous". (See the DisposalMethod enum in imgFrame.h for a list of the possibilities.) We're just normalizing here. I'll reword.

> @@ -987,5 @@
> > -        // Still working on the same image: Process next LZW data block
> > -        // Make sure there are still pixels left. If the GIF data
> > -        // is corrupt, we may not get an explicit terminator.
> > -        if (mGIFStruct.pixels_remaining <= 0) {
> > -#ifdef DONT_TOLERATE_BROKEN_GIFS
> 
> The new code doesn't use DONT_TOLERATE_BROKEN_GIFS. I guess it's not needed
> any more?

Seemed preferable to remove it since it isn't used, especially since it only handled one of a million ways that GIFs can be broken.

> ::: image/decoders/nsGIFDecoder2.h
> @@ +117,5 @@
> > +  // The StreamingLexer used to manage input. The initial size of the buffer is
> > +  // chosen as a little larger than the maximum size of any fixed-length data we
> > +  // have to read for a state. We read variable-length data in unbuffered mode
> > +  // so the buffer shouldn't have to be resized during decoding.
> > +  StreamingLexer<State, 16> mLexer;
> 
> Is it worth having an assertion at the end of decoding that confirms no
> resizing occurred?

It is; I think this is a great idea! What I'd like to do is having StreamingLexer itself assert this, though, so that the stack of the assertion failure shows which state caused the problem. I need to think about this a little, though, because decoders that read a row at a time would easily trip the assertion. Let me file a followup and we'll come back to this.
Blocks: 1278776
Here's the updated patch; it should address all of the concerns above.

I think we're ready to go if try looks OK.
Attachment #8759027 - Attachment is obsolete: true
https://hg.mozilla.org/mozilla-central/rev/026cf6432f44
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
Blocks: 773203
Depends on: 1279859
No longer blocks: 773203
Depends on: 1280109
Should we consider backing this out while we sort out the regressions?
Flags: needinfo?(seth)
(In reply to Milan Sreckovic [:milan] from comment #35)
> Should we consider backing this out while we sort out the regressions?

Taking a look.
Blocks: 1280712
OK, there are patches up for all known regressions.
Flags: needinfo?(seth)
Blocks: 1284031
Duplicate of this bug: 1193119
You need to log in before you can comment on or make changes to this bug.