Closed Bug 1204394 Opened 9 years ago Closed 9 years ago

Use StreamingLexer in the BMP decoder

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla44

Tracking Flags:

Tracking

Status

firefox44

---

fixed

People

(Reporter: seth, Assigned: n.nethercote, Mentored)

References

Details

(Whiteboard: [gfx-noted])

Attachments

(2 files, 4 obsolete files)

(part 1) - Fix indentation in nsBMPDecoder.cpp 9 years ago Nicholas Nethercote [inactive] 4.62 KB, patch		Details \| Diff \| Splinter Review
(part 2) - Using StreamingLexer in the BMP decoder 9 years ago Nicholas Nethercote [inactive] 57.44 KB, patch		Details \| Diff \| Splinter Review
(part 2) - Using StreamingLexer in the BMP decoder 9 years ago Nicholas Nethercote [inactive] 70.86 KB, patch		Details \| Diff \| Splinter Review
(part 1) - Using StreamingLexer in the BMP decoder 9 years ago Nicholas Nethercote [inactive] 84.88 KB, patch	seth : review+	Details \| Diff \| Splinter Review
(part 2) - Add bmpsuite to the BMP reftests 9 years ago Nicholas Nethercote [inactive] 480.54 KB, patch	seth : review+	Details \| Diff \| Splinter Review
(part 2) - Add bmpsuite to the BMP reftests 9 years ago Nicholas Nethercote [inactive] 517.63 KB, patch	n.nethercote : review+ gerv : superreview+	Details \| Diff \| Splinter Review

Seth Fowler [:seth] [:s2h]

Reporter

Description

•

9 years ago

It'd be nice to rework the BMP decoder to use StreamingLexer. A particular advantage of this is that currently, the ICO decoder has to actually synthesize a fake header to pass to BMP decoder, but if the BMP decoder used the StreamingLexer / method-per-state approach that the ICO decoder uses now, we could avoid that complexity easily and just start decoding in a different state for BMPs embedded in ICOs. This would be a good first bug for a new contributor with good C++ skills.

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

9 years ago

Whiteboard: [gfx-noted]

Nicholas Nethercote [inactive]

Assignee

Updated

•

9 years ago

Assignee: nobody → n.nethercote

Nicholas Nethercote [inactive]

Assignee

Comment 1

•

9 years ago

Attached patch (part 1) - Fix indentation in nsBMPDecoder.cpp (obsolete) — Details — Splinter Review

The are some 4-space indents in this file which mostly uses 2-space indents.

Attachment #8670619 - Flags: review?(seth)

Nicholas Nethercote [inactive]

Assignee

Comment 2

•

9 years ago

Attached patch (part 2) - Using StreamingLexer in the BMP decoder (obsolete) — Details — Splinter Review

This is far enough along that some feedback would be helpful. In particular, I have comments marked with "njn" in places where I'm unsure about things. It's definitely an improvement over the old code, about 280 lines shorter and easier to read. I'm still iffy about test coverage. This format has lots of possible combinations and I don't think they all have test coverage. (Indeed, I have code to crash on some combinations that I haven't yet encountered.) I also haven't thought carefully yet about avoiding problems (e.g. buffer overflows) with malformed files. There's some lack of clarity about what versions of these files are supported. The types mention OS2, WIN_V3, WIN_V5. But the code seems to indicate the following: - BITMAPCOREHEADER / OS21XBITMAPHEADER: supported - OS22XBITMAPHEADER: unsupported? not entirely sure - BITMAPINFOHEADER: supported - BITMAPV2INFOHEADER: supported - BITMAPV3INFOHEADER / BITMAPV4HEADER / BITMAPV5HEADER: partly supported -- the extra fields these versions have beyond what BITMAPV2INFOHEADER has are ignored, though. The patch is a nightmare to read. That seems unavoidable, unfortunately. It's probably easiest to review just by opening the old nsBMPEncoder.cpp and the new one side by side.

Attachment #8670621 - Flags: review?(seth)

Nicholas Nethercote [inactive]

Assignee

Comment 3

•

9 years ago

> There's some lack of clarity about what versions of these files are > supported. Ah, the version naming used at https://en.wikipedia.org/wiki/BMP_file_format is a bit confusing. The naming used at http://www.fileformat.info/format/bmp/egff.htm is clearer and matches up better with what's in the code.

Nicholas Nethercote [inactive]

Assignee

Updated

•

9 years ago

Attachment #8670621 - Flags: review?(seth)

Nicholas Nethercote [inactive]

Assignee

Comment 4

•

9 years ago

I talked with seth on IRC and no longer need feedback on part 2.

Nicholas Nethercote [inactive]

Assignee

Comment 5

•

9 years ago

Attached patch (part 2) - Using StreamingLexer in the BMP decoder (obsolete) — Details — Splinter Review

Updated version. Looking much better, but still a bit more work to do.

Nicholas Nethercote [inactive]

Assignee

Updated

•

9 years ago

Attachment #8670621 - Attachment is obsolete: true

Nicholas Nethercote [inactive]

Assignee

Comment 6

•

9 years ago

seth: I found http://entropymine.com/jason/bmpsuite/bmpsuite/html/bmpsuite.html which is a great suite of test BMP images, and AFAICT it covers lots of dark corners that our existing tests do not. (I haven't found any 16 bpp BMP files in our tests, for example.) Do you think it would be reasonable to import it into reftests? The code that generates the images is GPLv3, so hopefully this is legally ok (and I'm happy to check this).

Nicholas Nethercote [inactive]

Assignee

Updated

•

9 years ago

Attachment #8670619 - Attachment is obsolete: true

Attachment #8670619 - Flags: review?(seth)

Nicholas Nethercote [inactive]

Assignee

Comment 7

•

9 years ago

> The code that generates the images is GPLv3, so hopefully this is legally ok > (and I'm happy to check this). Looking more closely, https://github.com/jsummers/bmpsuite/blob/master/readme.txt says "Image files generated by this program are not covered by this license, and are in the public domain."

Nicholas Nethercote [inactive]

Assignee

Comment 8

•

9 years ago

Attached patch (part 1) - Using StreamingLexer in the BMP decoder — Details — Splinter Review

Apologies for the hugeness of this patch. The StreamingLexer/WriteInternal changes were large enough that I figured there was no point separating the other changes. Applying the patch and reading the resulting code might be easiest, at least for large chunks of it.

Attachment #8671748 - Flags: review?(seth)

Nicholas Nethercote [inactive]

Assignee

Updated

•

9 years ago

Attachment #8671235 - Attachment is obsolete: true

Nicholas Nethercote [inactive]

Assignee

Comment 9

•

9 years ago

Attached patch (part 2) - Add bmpsuite to the BMP reftests (obsolete) — Details — Splinter Review

This test suite has greatly increased my confidence in the new nsBMPDecoder code.

Attachment #8671749 - Flags: review?(seth)

Emanuel Hoogeveen [:ehoogeveen]

Comment 10

•

9 years ago

I was wondering, does our ICO decoder use the BMP decoder when the ICO file contains BMP data? I was thinking it would be nice if the test suite applies to both.

Nicholas Nethercote [inactive]

Assignee

Comment 11

•

9 years ago

(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #10) > I was wondering, does our ICO decoder use the BMP decoder when the ICO file > contains BMP data? It does. See comment 0, for example, though I haven't made that suggested change here.

Nicholas Nethercote [inactive]

Assignee

Comment 12

•

9 years ago

Try looks good: https://treeherder.mozilla.org/#/jobs?repo=try&revision=86f1a22a193f https://treeherder.mozilla.org/#/jobs?repo=try&revision=fdcc39cacded (Yes, I could have avoided the Linux overlap in the second job, but I didn't think of it until afterwards. Oh well.)

Seth Fowler [:seth] [:s2h]

Reporter

Comment 13

•

9 years ago

Comment on attachment 8671749 [details] [diff] [review] (part 2) - Add bmpsuite to the BMP reftests Review of attachment 8671749 [details] [diff] [review]: ----------------------------------------------------------------- This is great, Nick. Thanks for putting this together! ::: image/test/reftest/bmp/bmpsuite/README.mozilla @@ +22,5 @@ > + > +- The first line starts with "BMP:" and is the output of the MOZ_LOG call in > + nsBMPDecoder.cpp. It has basic image info. > + > +- Next is a quote from the bmpsuite docs, which describes the particulars of Do we know the license of the bmpsuite docs themselves? ::: image/test/reftest/bmp/bmpsuite/g/reftest.list @@ +76,5 @@ > +# "A 16-bit image with the default color format: 5 bits each for red, green, and > +# blue, and 1 unused bit. The whitest colors should (I assume) be displayed as > +# pure white: (255,255,255), not (248,248,248)." > +# [XXX: we do the cheap scaling that gives (248,248,248) for white, which > +# explains the overly generous fuzzy values. Chromium does proper scaling.] We should probably file a followup for this. @@ +112,5 @@ > +# "A 32-bit image with a BITFIELDS segment. As usual, there are 8 bits per color > +# channel, and 8 unused bits. But the color channels are in an unusual order, > +# so the viewer must read the BITFIELDS, and not just guess." > +# [XXX: we get this completely wrong because we don't handle BITFIELDS at all > +# in 32bpp BMPs. Chromium gets this right.] This too. ::: image/test/reftest/bmp/bmpsuite/q/reftest.list @@ +73,5 @@ > +# "A 16-bit image with an alpha channel. There are 4 bits for each color > +# channel, and 4 bits for the alpha channel. It’s not clear if this is valid, > +# but I can’t find anything that suggests it isn’t." > +# [XXX: we don't even try to do transparency for 16bpp. Chromium gets the > +# transparency right.] This too.

Attachment #8671749 - Flags: review?(seth) → review+

Seth Fowler [:seth] [:s2h]

Reporter

Comment 14

•

9 years ago

Comment on attachment 8671748 [details] [diff] [review] (part 1) - Using StreamingLexer in the BMP decoder Review of attachment 8671748 [details] [diff] [review]: ----------------------------------------------------------------- Consider all the changes outside of nsBMPDecoder.(cpp|h) r+'d. I'll make a second pass over this patch later to review those changes.

Nicholas Nethercote [inactive]

Assignee

Updated

•

9 years ago

Blocks: 1213613

Nicholas Nethercote [inactive]

Assignee

Comment 15

•

9 years ago

(In reply to Seth Fowler [:seth] [:s2h] from comment #13) > Comment on attachment 8671749 [details] [diff] [review] > (part 2) - Add bmpsuite to the BMP reftests > > Review of attachment 8671749 [details] [diff] [review]: > ----------------------------------------------------------------- > > This is great, Nick. Thanks for putting this together! > > ::: image/test/reftest/bmp/bmpsuite/README.mozilla > @@ +22,5 @@ > > + > > +- The first line starts with "BMP:" and is the output of the MOZ_LOG call in > > + nsBMPDecoder.cpp. It has basic image info. > > + > > +- Next is a quote from the bmpsuite docs, which describes the particulars of > > Do we know the license of the bmpsuite docs themselves? I guess it's GPLv3 because it's within the code which is GPLv3. You could probably argue that these quotes are fair use. I've also copied some of the reference PNGs. I also generated a couple of them myself (using Firefox's new "screenshot a DOM element" feature). I will ask Gerv about all these. > ::: image/test/reftest/bmp/bmpsuite/g/reftest.list > @@ +76,5 @@ > > +# "A 16-bit image with the default color format: 5 bits each for red, green, and > > +# blue, and 1 unused bit. The whitest colors should (I assume) be displayed as > > +# pure white: (255,255,255), not (248,248,248)." > > +# [XXX: we do the cheap scaling that gives (248,248,248) for white, which > > +# explains the overly generous fuzzy values. Chromium does proper scaling.] > > We should probably file a followup for this. > This too. > This too. I filed bug 1213613 for all of them. They're all related so I will likely fix them all in one bug.

Nicholas Nethercote [inactive]

Assignee

Comment 16

•

9 years ago

Gerv: I have a question about including third-party data and code into our codebase. bmpsuite (https://github.com/jsummers/bmpsuite/) is a really nice test suite for BMP images and I want to import parts of it. (See the attached "part 2" patch.) bmpsuite consists of (a) code to generate the test BMP images, (b) some reference PNG images used to compare against the test BMP images, and (c) some documentation. bmpsuite's readme.txt says it is GPLv3 but specifically says that the images it code generates are in the public domain. So, the parts of interest in my patch... (0a) I've documented that we're using content from bmpsuite in the README.mozilla file, and I linked to the github page, so attribution is present. (0b) I didn't include the bmpsuite code for generating the BMP images. (I just ran that code locally to generate them.) (1) I included the generated BMP images, which are public domain and so should be fine. (2) I included some quotes from the documentation ((c) above). I'd suggest they are fair use but Plan B would be to remove them and just point to the online documentation if necessary, which would be a mild inconvenience. Or I could rewrite the descriptions to not be direct quotes. (3) I included some of the reference PNG images ((b) above). Plan B would be to regenerate these via screen shots of our renderings of the BMP images, if necessary. If I did that there's a good chance they'd be exactly the same as the original, or at least, extremely similar. I'm confident that (1) is fine. What about (2) and (3)? If GPLv3 is compatible with MPL2 then we should be good. If not, things are a little trickier and Plan B might be necessary. What do you think? Thank you.

Flags: needinfo?(gerv)

Seth Fowler [:seth] [:s2h]

Reporter

Comment 17

•

9 years ago

Comment on attachment 8671748 [details] [diff] [review] (part 1) - Using StreamingLexer in the BMP decoder Review of attachment 8671748 [details] [diff] [review]: ----------------------------------------------------------------- This came out really well, Nick! Thanks for putting so much effort into this! r+ with the additional |mCurrentRow == 0| checks mentioned below and a name change for |EndOfRow()|. The other stuff is optional but recommended. =) ::: image/decoders/nsBMPDecoder.cpp @@ +249,5 @@ > + mBitFields.blueLeftShift = 8 - length; > +} > + > +void > +nsBMPDecoder::EndOfRow() This name sounds like a getter to me. Maybe "FinishRow()"? @@ +528,5 @@ > > +LexerTransition<nsBMPDecoder::State> > +nsBMPDecoder::ReadColorTable(const char* aData, size_t aLength) > +{ > + MOZ_ASSERT_IF(aLength != 0, mColors); Maybe also MOZ_ASSERT_IF(aLength != 0, mNumColors > 0)? @@ +572,5 @@ > + const uint8_t* p = reinterpret_cast<const uint8_t*>(aData); > + uint32_t* d = mDownscaler > + ? reinterpret_cast<uint32_t*>(mDownscaler->RowBuffer()) > + : reinterpret_cast<uint32_t*>(mImageData) + > + PIXEL_OFFSET(mCurrentRow, 0); It might be good to abstract this into a private method |GetRowBuffer()| which would do this calculation based upon |mCurrentRow| and |mCurrentPos| and return the appropriate pointer. There are two other places where we do this computation, so it'd be nice to share the code. We do this in some of the other decoders - for example, nsGIFDecoder2. |ReadPixelRow()| leaves |mCurrentPos| at 0, so even though it doesn't use |mCurrentPos| in its calculation, it doesn't seem like we need to handle it specially. @@ +681,5 @@ > + if (pixelsNeeded) { > + uint32_t* d = mDownscaler > + ? reinterpret_cast<uint32_t*>(mDownscaler->RowBuffer()) + mCurrentPos > + : reinterpret_cast<uint32_t*>(mImageData) + > + PIXEL_OFFSET(mCurrentRow, mCurrentPos); This is another place where a new |GetRowBuffer()| method could be called. @@ +701,5 @@ > + > + if (byte2 == RLE::ESCAPE_EOL) { > + EndOfRow(); > + mCurrentPos = 0; > + return Transition::To(State::RLE_SEGMENT, RLE_SEGMENT_LENGTH); We should probably be checking if |mCurrentRow == 0| here and transitioning to State::SUCCESS if so, rather than relying on the check at the the top of this method. @@ +765,5 @@ > + mDownscaler->CommitRow(); > + } > + } > + > + return Transition::To(State::RLE_SEGMENT, RLE_SEGMENT_LENGTH); Again, seems like we should be checking if |mCurrentRow == 0| here and transitioning to State::SUCCESS. @@ +785,5 @@ > + // which contains the color index of a single pixel. > + uint32_t* dst = mDownscaler > + ? reinterpret_cast<uint32_t*>(mDownscaler->RowBuffer()) + mCurrentPos > + : reinterpret_cast<uint32_t*>(mImageData) + > + PIXEL_OFFSET(mCurrentRow, mCurrentPos); This is another place where a new |GetRowBuffer()| method could be called. ::: image/decoders/nsBMPDecoder.h @@ +104,3 @@ > > + uint32_t mNumColors; // The number of used colors, i.e. the number of > + // entries in mColors, if it's present. Micronit: Are you intentionally using hanging indent here? @@ +117,5 @@ > > + int32_t mCurrentRow; // Index of the row of the image that's currently > + // being decoded: [height,1]. > + int32_t mCurrentPos; // Index into the current line; only used when > + // doing RLE decoding. (Also here?)

Attachment #8671748 - Flags: review?(seth) → review+

Andrew McCreight [:mccr8]

Comment 18

•

9 years ago

If you can set up the decoder to run stand-alone, then fuzzing it with something like American Fuzzy Lop ( https://en.wikipedia.org/wiki/American_fuzzy_lop_%28fuzzer%29 ) might be useful.

Nicholas Nethercote [inactive]

Assignee

Comment 19

•

9 years ago

Attached patch (part 2) - Add bmpsuite to the BMP reftests — Details — Splinter Review

Judging from the many existing cases in the Mozilla codebase, importing GPLv3 code is fine. I've updated the README.mozilla file in this patch to specifically describe the GPLv3 things we're including (some documentation quotes and some reference PNG images) and I also added a COPYING.txt file to the directory. So I think this should be good enough.

Nicholas Nethercote [inactive]

Assignee

Updated

•

9 years ago

Attachment #8671749 - Attachment is obsolete: true

Nicholas Nethercote [inactive]

Assignee

Comment 20

•

9 years ago

> @@ +572,5 @@ > > + const uint8_t* p = reinterpret_cast<const uint8_t*>(aData); > > + uint32_t* d = mDownscaler > > + ? reinterpret_cast<uint32_t*>(mDownscaler->RowBuffer()) > > + : reinterpret_cast<uint32_t*>(mImageData) + > > + PIXEL_OFFSET(mCurrentRow, 0); > > It might be good to abstract this into a private method |GetRowBuffer()| > which would do this calculation based upon |mCurrentRow| and |mCurrentPos| > and return the appropriate pointer. There are two other places where we do > this computation, so it'd be nice to share the code. We do this in some of > the other decoders - for example, nsGIFDecoder2. Good idea. That'll let me remove the LINE and PIXEL_OFFSET macros, too. > |ReadPixelRow()| leaves |mCurrentPos| at 0, so even though it doesn't use > |mCurrentPos| in its calculation, it doesn't seem like we need to handle it > specially. I'll add an assertion that mCurrentPos is zero in ReadPixelRow(). > > + uint32_t mNumColors; // The number of used colors, i.e. the number of > > + // entries in mColors, if it's present. > > Micronit: Are you intentionally using hanging indent here? I am -- it's better than the weird "<" prefix used in the old comments -- but it's not really necessary so I'll remove it. I've addressed the other comments as you suggested.

Nicholas Nethercote [inactive]

Assignee

Comment 21

•

9 years ago

Comment on attachment 8672833 [details] [diff] [review] (part 2) - Add bmpsuite to the BMP reftests Carrying over r+ from Seth, adding sr? request for Gerv.

Attachment #8672833 - Flags: superreview?(gerv)

Attachment #8672833 - Flags: review+

Nicholas Nethercote [inactive]

Assignee

Updated

•

9 years ago

Blocks: 1214476

Gervase Markham [:gerv]

Comment 22

•

9 years ago

Sorry I'm late to the party here. (In reply to Nicholas Nethercote [:njn] from comment #19) > Judging from the many existing cases in the Mozilla codebase, importing GPLv3 > code is fine. Not sure where you got that idea - we definitely have no GPLv3 code in the main Firefox or B2G codebases. We may have the odd bit in independent tools. But there's certainly not "many cases" AFAIK. > I've updated the README.mozilla file in this patch to > specifically describe the GPLv3 things we're including (some documentation > quotes and some reference PNG images) and I also added a COPYING.txt file to > the directory. So I think this should be good enough. This stuff is all standalone and for tests only, so that's OK, as long as it's appropriately documented. So that sounds fine. Gerv

Flags: needinfo?(gerv)

Nicholas Nethercote [inactive]

Assignee

Comment 23

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=cfa04aeeb2ce

Nicholas Nethercote [inactive]

Assignee

Comment 24

•

9 years ago

(In reply to Gervase Markham [:gerv] from comment #22) > > > > Judging from the many existing cases in the Mozilla codebase, importing GPLv3 > > code is fine. > > Not sure where you got that idea - we definitely have no GPLv3 code in the > main Firefox or B2G codebases. We may have the odd bit in independent tools. > But there's certainly not "many cases" AFAIK. If you search for "version 3 of the License" (https://dxr.mozilla.org/mozilla-central/search?q=%22version+3+of+the+License%22&redirect=false&case=true) you get 41 hits. But they do look to mostly (all?) be in tests or the build system, i.e. not part of released code. Thank you for the approval.

Nicholas Nethercote [inactive]

Assignee

Comment 25

•

9 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/0c84122fa187f772547c6fe7f553039979e47da8 Bug 1204394 (part 1) - Using StreamingLexer in the BMP decoder. r=seth. https://hg.mozilla.org/integration/mozilla-inbound/rev/609fa3c015842fa5bb9592e590164ea79088d247 Bug 1204394 (part 2) - Add bmpsuite to the BMP reftests. r=seth.

Carsten Book [:Tomcat]

Comment 26

•

9 years ago

https://hg.mozilla.org/mozilla-central/rev/0c84122fa187 https://hg.mozilla.org/mozilla-central/rev/609fa3c01584

Status: NEW → RESOLVED

Closed: 9 years ago

status-firefox44: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla44

Gervase Markham [:gerv]

Comment 27

•

9 years ago

(In reply to Nicholas Nethercote [:njn] from comment #24) > If you search for "version 3 of the License" > (https://dxr.mozilla.org/mozilla-central/ > search?q=%22version+3+of+the+License%22&redirect=false&case=true) you get 41 > hits. But they do look to mostly (all?) be in tests or the build system, > i.e. not part of released code. Just FYI, what's fooling you there is that many of these are autoconf files. While these are technically GPLv3, they have a broad exception to them which allows you to distribute them under any other license. See https://dxr.mozilla.org/mozilla-central/source/build/autoconf/config.guess for an example; see the para starting "As a special exception...". True GPLv3 files are much rarer. Gerv

Gervase Markham [:gerv]

Updated

•

9 years ago

Attachment #8672833 - Flags: superreview?(gerv) → superreview+

Alice0775 White

Updated

•

9 years ago

Depends on: 1237171

Loic

Updated

•

9 years ago

Depends on: 1267224

Seth Fowler [:seth] [:s2h]

Reporter

Updated

•

8 years ago

Blocks: 1284031

Jonathan Watt [:jwatt]

Updated

•

7 years ago