Closed
Bug 285872
Opened 19 years ago
Closed 19 years ago
GIF Decoder: replace gathering buffer with dynamic malloc to fixed 256 bytes hold
Categories
(Core :: Graphics: ImageLib, defect)
Core
Graphics: ImageLib
Tracking
()
VERIFIED
FIXED
People
(Reporter: alfredkayser, Assigned: alfredkayser)
References
Details
(Keywords: memory-footprint, perf)
Attachments
(1 file, 4 obsolete files)
20.35 KB,
patch
|
pavlov
:
superreview+
benjamin
:
approval1.8b4-
|
Details | Diff | Splinter Review |
As a followup to bug 274391, comment #27: The GIF decoder uses a dynamically allocated and resizing 'gathering' buffer to collect the 'blocks' of the gif file. Profiling learned that for GIF files bigger than 4K, each 4K crossing results in the allocation and resizing of a gathering buffer to get the bytes at the end of the prev. 4k together with the first bytes of the next 4K into a gathering buffer. In a large animated GIF of 2MB, this results in about 1000 Allocs/reallocs... Using a fixed hold buffer of 256 bytes, these allocs and reallocs can all be prevented. In a gif file all items are at most 256 bytes, except for the colormaps. The global_colormap can be directly copied to the available colormap in gifstruct, and the local_colormap directly into its own allready dynamically allocated colormap. So, by replacing the dynamical gathering buffer by a fixed 256 byte buffer we can save a lot allocs and reallocs. Also by preventing to switching to the 'gather_state' between each state, the state engine is twice as fast, as we only need to 'gather' the block at the end and start of each 4K (passed to gif_write by the streamers). Removing some other redundant states, as well as some redundant variables (clear_code is a static value, backgroundcolor is not needed, version can be PRPackedBool), about 28 bytes from the gif_struct is saved. Patch to be attached soon.
Assignee | ||
Comment 1•19 years ago
|
||
This patch replaces the dynamically allocated 'gather' buffer with a 256 bytes hold buffer allocated as part of the gif_struct. Global and local colormaps are directly copied into the resp. global_colormap of the gif_struct, and the dynamically allocated local_colormap. This also remove the now redundant 'gather' state, so that the decoder now directly switches to the next state, instead of going through the 'gather' state between each state. Also removed the not used member mBackgroundColor. The state engine is now managed by two variable: 'state' the state the decoder want to go to (or is active in), and the 'hold_more' integer indicating how many bytes should be available for the next state. So GETN(13, gif_init); requests to go to the state 'gif_init', ensuring that at least 13 bytes are available. In most cases this is in the current input buffer. If the input buffer is not enough the remain bytes are stored in the 'hold', and at entry of the next gif_write, first the hold is completed to satisfy 'hold_more', to go the next state. This way, at most 256 bytes are required in the hold buffer, as each next state block is never bigger than 256 bytes. (except for colormaps, but they are copied directly in place). This saves a lot of allocs (for a 80K gif image, about 80 allocs!), makes the state engine itself faster, copies colormaps directly, and less code as well.
Assignee | ||
Comment 2•19 years ago
|
||
Note, also fixed the order of the members and initialisation of nsGIFDecoder2 (to remove the needless compiler warnings). Also aligned comments of the members of gif_struct, that's why they all appear in the diff.
Assignee | ||
Updated•19 years ago
|
Attachment #177387 -
Flags: review?(pavlov)
Comment 3•19 years ago
|
||
I think the only 1 bucket is needed after bug 274391 converted 3 mallocs into 1, kGifAllocatorNBucket should be set to 1 at: http://lxr.mozilla.org/mozilla/source/modules/libpr0n/decoders/gif/nsGIFDecoder2.cpp#60 (I'm commenting here because the other bug is fixed)
Assignee | ||
Comment 4•19 years ago
|
||
Yes, nBuckets can be reduced to 1, because it seems that there is always only 1 gif decoder active, but further proof is required. It would be even better if the combination nsGIFdecoder and mGIFstruct can be cached as a whole, particulary if normally at most 1 is active. This seems to be related to the way the image decoders are 'created' in imgLoad.
Assignee | ||
Updated•19 years ago
|
Assignee | ||
Comment 5•19 years ago
|
||
As bug 196295 is progressing now, let's have this one after that one. This bug covers replacement of dynamic gathering buffer with one static array (256 bytes) in GIFstruct. This will then also remove some redundant states (deadcode), as well as the then redundant 'gif_write_ready' function. Also there some redundant and/or dead members of GIF2 class and GIFstruct that be can cleaned (tpixel, is_transparent, control_extension, interlaced, etc..)
Depends on: 196295
Assignee | ||
Comment 6•19 years ago
|
||
Comment on attachment 177387 [details] [diff] [review] First edition Removing review request, awaiting a new patch after bug 196295 has landed, which makes this one much easier to review!
Attachment #177387 -
Flags: review?(pavlov)
Assignee | ||
Updated•19 years ago
|
Attachment #177387 -
Flags: review?(tor)
What happened to the ordering relative to bug 196295?
Assignee | ||
Comment 8•19 years ago
|
||
This bug is not really dependent on 196295. I thought to wait reviews of this one to after that one. But seeing that 196295 is not progressing, I wanted to get at least the review started of this one. If 196295 gets checked in meanwhile, I will update this patch accordingly...
Comment on attachment 177387 [details] [diff] [review] First edition Reading through the patch, it seems be a collection of number of different cleanups/changes. Please address one problem at a time.
Attachment #177387 -
Flags: review?(tor) → review-
Assignee | ||
Comment 10•19 years ago
|
||
Assignee | ||
Updated•19 years ago
|
Attachment #177387 -
Attachment is obsolete: true
Assignee | ||
Comment 11•19 years ago
|
||
Summary of patch (V2): * Much smaller because of the removed other (very minor) optimizations * Replaced hold, gather_head, gather_request_size, gathered, post_gather_state with hold_more, hold_size, and hold[MAX_HOLD_SIZE]. * Logic is hold buffer is now: - Hold never holds more than 256 bytes (as all blocks, except colormaps are <=256) - hold_more is the number of bytes to get into the hold - hold_size is number of bytes allready in hold. * The hold buffer is only required between gif_writes, because in the loop in gif_write, as long as the input buffer as more bytes than needed (len => gs->hold_more) we consume directly from input buffer. If not enough is left in input buffer, it is pushed into the hold, awaiting the next call to gif_write. Then first the hold is completed with 'hold_more' for the current state. * The colormaps are bigger than 256 but can directly be copied from input into the corr. colormap. Except if input doesn't contain enough, than the rest is copied, and the next call to gif_write first completes the colormap copy. * This also means that the 'gif_write_ready' is no longer needed. * Because of the gather into hold change, also the start state needed to be changed, merging also the three states (gif_init, gif_type, gif_version) into one (gif_type) to check the first 6 bytes. * States gif_gather, gif_delay and gif_stop_animating are no longer required.
Assignee | ||
Updated•19 years ago
|
Attachment #187492 -
Flags: review?(tor)
Comment 12•19 years ago
|
||
Comment on attachment 187492 [details] [diff] [review] V2: Removed other cleanups; only do the gather buffer fix >+ const PRUint8 *q = buf; > >- const PRUint8 *q, *p = buf, *ep = buf + len; >+ // Add what we have sofar to the block >+ // If last call to me, left something in the hold first complete current block >+ // Or if we are filling the colormaps, >+ PRUint8* p = nsnull; >+ switch (gs->state) >+ { >+ case gif_global_colormap: p = gs->global_colormap; break; >+ case gif_image_colormap: p = gs->local_colormap; break; >+ default: if (gs->hold_size) p = gs->hold; break; >+ } Please split the case statements on seperate lines. >+ if (p) { >+ // Add what we have sofar to the block >+ PRUint32 l = len < gs->hold_more ? len : gs->hold_more; Use PR_MIN. >+ memcpy(p+gs->hold_size, buf, l); >+ buf += l; >+ len -= l; >+ gs->hold_size += l; >+ gs->hold_more -= l; > >- q = nsnull; /* Initialize to shut up gcc warnings */ >+ if (gs->hold_more > 0) { >+ // Block in hold is not complete yet >+ return PR_SUCCESS; >+ } >+ >+ gs->hold_size = gs->hold_more = 0; Only need to zero hold_size, since hold_more will already be zero. >+ q = gs->hold; >+ } >+ >+ // Invariant: q is start of current to be processed block >+ // buf points to start of next block >+ // len is number of bytes left in buffer (from position p) >+ // gs->hold_more is number of bytes required from buffer for this block (0 when current block in 'hold') >+ for (;len >= gs->hold_more; q=buf) { >+ // Eat the current block from the buffer, q keeps pointed at current block >+ buf += gs->hold_more; >+ len -= gs->hold_more; Something about this logic doesn't seem to add up. If you're working out of the hold buffer, len is going to be what was passed in minus hold_more, but you only want to process hold_size (before you zeroed it) bytes from the hold buffer.
Attachment #187492 -
Flags: review?(tor) → review-
Assignee | ||
Comment 13•19 years ago
|
||
Changes: > Please split the case statements on seperate lines: Ok, replaced by 'if' statements, is shorter but as readable. > Use PR_MIN: Ok > Only need to zero hold_size, since hold_more will already be zero: Ok > Something about this logic doesn't seem to add up: It does, but I improved the logic as follows: * First complete block in hold if required * If still not complete, adjust hold_size and hold_more and go to next gif_write * Otherwise, go to for loop, with 'hold_more' being number of bytes to be consumed from 'buf' for current block, and with 'q' pointing to start of current block (either hold, colormap or buf), and 'buf' to input buffer at start of bytes to be consumed... * At entrance of for loop, adjust 'buf' and 'len' with 'hold_more' to really consume the bytes from input buffer, and then process current block pointed to by 'q'. At next loop, set 'q' to the adjusted 'buf', etc... Note, 'hold' and 'hold_size' are only used from end of gif_write to start of next gif_write.
Assignee | ||
Updated•19 years ago
|
Attachment #187492 -
Attachment is obsolete: true
Attachment #187940 -
Flags: review?(tor)
Assignee | ||
Comment 14•19 years ago
|
||
Oops, don't fix too much. Colormap copying reverted to v2, and fully tested to work, including 'truecolor.gif'.
Attachment #187940 -
Attachment is obsolete: true
Attachment #187941 -
Flags: review?(tor)
Attachment #187940 -
Flags: review?(tor)
Comment 15•19 years ago
|
||
Comment on attachment 187941 [details] [diff] [review] v3b: Oops, colormap copying malfunction detected and fixed This buffer/loop control fails the test of being easy to understand for someone new to the code, but I'm not sure exactly how to fix that. Renaming hold_size might help ("held" or "held_bytes", maybe?). Some minor nits: there's a couple misspellings of "already" in comments, and some of the new code is missing spaces around operators. With those fixed, r=tor.
Attachment #187941 -
Flags: review?(tor) → review+
Comment 16•19 years ago
|
||
we need to triple check this patch for security holes before it lands
Assignee | ||
Comment 17•19 years ago
|
||
To help understanding the logic of the 'hold' buffer: * Renamed hold_more to bytes_to_consume: # of bytes for current state to consume from input buffer for the current state. * Renamed hold_size to bytes_in_hold: # of bytes already accumulated into the hold Fixed spelling errors. Tested with my set of large and animated gif's. As for the triple security check, it is important, but I believe my way of using a fixed hold buffer is safer than using a dynamically allocated gathering buffer. For the fixed hold buffer, it is important to make sure that never more bytes are put into it than allowed. Check all the 'GETN' statements where the 'bytes_to_consume' value is set. Three cases exists: * fixed size, for fixed sized elements (1, 2, 7, etc). * lzw, comment, blocks etc: length value is always limited to one byte so max. 255. * colormaps, max. length is always (2<<7)*3 = 768 bytes, copied into the resp. colormap buffers. global_colormap is fixed at 768, and local_colormap is realloc to the actual length.
Assignee | ||
Updated•19 years ago
|
Attachment #187941 -
Attachment is obsolete: true
Assignee | ||
Comment 18•19 years ago
|
||
Comment on attachment 189901 [details] [diff] [review] V4: Comments from Tor incorporated Stuart, can you do the superreview as well as the security check? Thanks in advance, Alfred
Attachment #189901 -
Flags: superreview?(pavlov)
Updated•19 years ago
|
Attachment #189901 -
Flags: superreview?(pavlov) → superreview+
Assignee | ||
Comment 19•19 years ago
|
||
Having received both r and sr, the next step is to commit it to the three(s). Who can do this for me?
Comment 20•19 years ago
|
||
At this point in the cycle you also need approval, which means we need to judge if it's safe to take this for gecko 1.8.
Updated•19 years ago
|
Attachment #189901 -
Flags: approval1.8b4?
Comment 21•19 years ago
|
||
Comment on attachment 189901 [details] [diff] [review] V4: Comments from Tor incorporated The risk/reward factor isn't good for getting this in 1.8... let's land it early in 1.9.
Attachment #189901 -
Flags: approval1.8b4? → approval1.8b4-
Assignee | ||
Comment 22•19 years ago
|
||
Who can check this for me into the 'trunk' three for 1.9?
Comment 23•19 years ago
|
||
checked in to trunk Checking in modules/libpr0n/decoders/gif/GIF2.cpp; /cvsroot/mozilla/modules/libpr0n/decoders/gif/GIF2.cpp,v <-- GIF2.cpp new revision: 1.53; previous revision: 1.52 done Checking in modules/libpr0n/decoders/gif/GIF2.h; /cvsroot/mozilla/modules/libpr0n/decoders/gif/GIF2.h,v <-- GIF2.h new revision: 1.21; previous revision: 1.20 done Checking in modules/libpr0n/decoders/gif/nsGIFDecoder2.cpp; /cvsroot/mozilla/modules/libpr0n/decoders/gif/nsGIFDecoder2.cpp,v <-- nsGIFDecoder2.cpp new revision: 1.66; previous revision: 1.65 done
Assignee | ||
Comment 24•19 years ago
|
||
Thanks!
Alfred, is there more work to do in this bug? If not, can you mark this fixed? Thanks.
Assignee | ||
Updated•19 years ago
|
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 26•19 years ago
|
||
As this is checked into trunk and verified to work, can I be so bold to request it to also check it in into the branch?
Flags: blocking1.8b5?
Flags: blocking1.8b4?
Updated•19 years ago
|
Flags: blocking1.8b5?
Flags: blocking1.8b5-
Flags: blocking1.8b4?
Flags: blocking1.8b4-
Updated•19 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Updated•19 years ago
|
Assignee: stuart → alfredkayser
Status: REOPENED → NEW
Updated•19 years ago
|
Status: NEW → RESOLVED
Closed: 19 years ago → 19 years ago
Resolution: --- → FIXED
Verified FIXED as best I can tell. This has been in the trunk for a few weeks now, without crashers or other strange bugs. Build 2005-09-11-19 on Windows XP SeaMonkey trunk.
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•