Closed Bug 1108084 Opened 10 years ago Closed 8 years ago

Don't store favicons that are more than 1MB

Categories

(Firefox for Android Graveyard :: Favicon Handling, defect)

All
Android
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: rnewman, Assigned: imjalpreet, Mentored)

References

Details

(Whiteboard: [good next bug][lang=java])

Attachments

(1 file, 2 obsolete files)

See Bug 1106347. Android can't retrieve these from sqlite, so we shouldn't store them.
Mentor: chriskitching, rnewman
Whiteboard: [good second bug][lang=java]
Hi, I would like to take this bug. Can you please guide me where to start from?
Flags: needinfo?(rnewman)
Jalpreet: Great! There are a few different layers to this bug. We want to fix them in order so that each layer is safe in the presence of invalid inputs. The lowest is LocalBrowserDB.updateFaviconForUrl. It should reject values that are too large before writing them to the database. The next is LoadFaviconTask#saveFaviconToDb. It should check the size of its argument, and if it's too large (>= 1MB) it should do something smarter: rather than blindly calling updateFaviconForUrl with the input, it should do one of several things. The simplest two are -- 1. Just log an error and return. This avoids us inserting into the DB, but it can leave stale data in the DB, and means that that cache still contains a large favicon and is out of sync with the DB. 2. Return a value to indicate that saving did not proceed because the input was invalid. You might opt for this to help in the next layer. The next layer is consistency in LoadFaviconTask#doInBackground. It might be worth another size-check at line 446. The final layer is to be smarter in LoadFaviconResult itself, and its callers FaviconDecoder#decodeFavicon and ICODecoder, which are the classes that eventually create the oversize array from component bitmaps. Rather than holding an arbitrarily large array of arbitrarily large bitmaps, encoding them into a multi-megabyte byte array for database storage, it should: * Avoid including huge individual bitmaps unless there is only one, and thus there are no smaller alternatives. Only the ICODecoder has this problem; PNGs only contain one image. * If there's only one, resize it down to a more manageable size. This will happen around FaviconDecoder.java:104 and again in ICODecoder#decode/decodeBitmapAtIndex. * Never return an array larger than 1MB from getBytesForDatabaseStorage. If it were to do so, it should instead return null to indicate failure. This is a classic example of fixing bugs inside-out, and we'll get a more robust system as a result. You'll of course need to get your Fennec toolchain working, if you haven't already; https://wiki.mozilla.org/Mobile/Fennec/Android should point you in the right direction.
Assignee: nobody → jalpreetnanda
Status: NEW → ASSIGNED
Flags: needinfo?(rnewman)
In the first layer, I am checking the size of the favicon by encodedFavicon.length, is this correct? or am I going in the wrong direction?, because this is going to be used in most of the layers.
Flags: needinfo?(rnewman)
You want the size in bytes of the blob that'll end up in the DB; that's the limitation. So yes, that's byte[].length.
Flags: needinfo?(rnewman)
I didn't get what's the limitation?
Flags: needinfo?(rnewman)
No need to set needinfo every time. The limitation is in the title of the bug, and also in Comment 2: 1MB.
Flags: needinfo?(rnewman)
Oh, Thanks I thought you are talking about some other limitation.
Attached patch Bug-1108084.diff (obsolete) — Splinter Review
Can you please check and give a feedback if I am going in the right direction? I have attached the diff for the first three layers.
Attachment #8535656 - Flags: feedback?(rnewman)
Comment on attachment 8535656 [details] [diff] [review] Bug-1108084.diff Review of attachment 8535656 [details] [diff] [review]: ----------------------------------------------------------------- This is the right start, but let's start moving towards good engineering practices: lifting constants, avoiding confusing choices, and working towards efficiency. ::: mobile/android/base/db/LocalBrowserDB.java @@ +1088,5 @@ > values.put(Favicons.URL, faviconUri); > values.put(Favicons.PAGE_URL, pageUri); > values.put(Favicons.DATA, encodedFavicon); > > + if((encodedFavicon.length)/1024 > 1024) { Lift out the magic constant to somewhere that makes sense: public static final int MAXIMUM_FAVICON_SIZE_BYTES = 1024 * 1024; // 1MB then use it at the start of this method before we do any work: if (encodedFavicon.length > MAXIMUM_FAVICON_SIZE_BYTES) { Log.w(LOGTAG, "Favicon too large; returning."); return; } ::: mobile/android/base/favicons/LoadFaviconTask.java @@ +107,5 @@ > > ContentResolver resolver = context.getContentResolver(); > + if ((encodedFavicon.length)/1024 > 1024){ > + Log.e("Invalid Size", "Size of favicon is too large"); > + return 2; No magic numbers. Either use an enum, or consider whether you need more than one thing at all -- if you want the cache to reflect the DB, surely we only care about whether the insertion succeeded or not? @@ +448,5 @@ > // Fetching bytes to store can fail. saveFaviconToDb will > // do the right thing, but we still choose to cache the > // downloaded icon in memory. > + if(loadedBitmaps.getBytesForDatabaseStorage() != null){ > + int result = saveFaviconToDb(loadedBitmaps.getBytesForDatabaseStorage()); Don't call this method twice. Fetch it then compare it.
Attachment #8535656 - Flags: feedback?(rnewman) → feedback+
(In reply to Richard Newman [:rnewman] from comment #2) > Rather than holding an arbitrarily large array of arbitrarily large bitmaps, > encoding them into a multi-megabyte byte array for database storage, it > should: > > * Avoid including huge individual bitmaps unless there is only one, and thus > there are no smaller alternatives. Only the ICODecoder has this problem; > PNGs only contain one image. > > * If there's only one, resize it down to a more manageable size. This will > happen around FaviconDecoder.java:104 and again in > ICODecoder#decode/decodeBitmapAtIndex. Can you please give a small explanation on how should I approach these two. Should I use bitmapsDecoded.compress() to resize it down ? or is there some other method? Can you please give a brief outline on how to proceed in the first point. Thanks for the help :)
(In reply to Jalpreet Singh Nanda [:imjalpreet] from comment #10) > Can you please give a small explanation on how should I approach these two. For the former, you'll need to read and understand how ICODecoder works, so you can figure out whether it's a problem at all, and how to solve it. You can get some clues from Bug 748100. (That's why this is a [good second bug], not a [good first bug]!) If you look in ICODecoder, around line 166, you'll see we try to get an icon that's just larger than the largest icon we'll want. That width is relative to the display density -- it's 32dp, so on an xxhdpi device that'll target icons about 96x96px. The worst case, then, is that we have a whole bunch of icons in the icon directory that are smaller than 96px, and one that's way bigger, and the total ends up larger than 1MB. Most likely it's the giant icon that's the problem -- golem.de's, for example, is 256x256, which would end up as a 196KB bitmap. Think this through -- is it possible to get a .ico that will yield a 1MB+ byte array when prepared for insertion? If so, you need to come up with an approach of selectively discarding images below the threshold, and resizing the one above the threshold, to get the total size down. Resizing is answered below... > Should I use bitmapsDecoded.compress() to resize it down ? or is there some > other method? Can you please give a brief outline on how to proceed in the > first point. Imagine we fetch favicon.ico for a site, and it's a 1024x1024 uncompressed image. There are two ways we can make that smaller: one is to use a smarter image format (e.g., PNG -- getBytesForDatabaseStorage does this already for single-image icons), and the other is to rescale the image down to a sane size. We already know the size we want: Favicons.largestFaviconSize x Favicons.largestFaviconSize. So if we're handling one image, and it's huge, when we get to decodeBitmapAtIndex on line 366, you can call http://developer.android.com/reference/android/graphics/Bitmap.html#createScaledBitmap(android.graphics.Bitmap,%20int,%20int,%20boolean) to create a smaller version, and use that instead. If I had to guess the input that triggered this bug, it would be: * An icon directory with more than one entry -- single-image icons are compressed, and a 1MB PNG would be huge, so we probably have multiple entries. * The directory contains one very very large entry and some smaller ones. It would be hard to have enough small entries to be more than 1MB. A 512x512 BMP in 24-bit color with an alpha channel would be more than 1MB.
Attached patch Bug-1108084 (obsolete) — Splinter Review
I have made a diff after all the changes. Can you please give a feedback on what more has to be done. Coming to the ICODecoder, I think the selective discarding of some .ico that are smaller than the threshold is already implemented. Secondly, the selection of only one .ico that is larger than the threshold is also implemented. The smallest ico that is larger than the threshold is selected. I have implemented the rest of the parts. *PLEASE CORRECT ME IF I AM WRONG*
Attachment #8535656 - Attachment is obsolete: true
Attachment #8536148 - Flags: feedback?(rnewman)
Comment on attachment 8536148 [details] [diff] [review] Bug-1108084 Review of attachment 8536148 [details] [diff] [review]: ----------------------------------------------------------------- Poking my nose in... ::: mobile/android/base/db/LocalBrowserDB.java @@ +1086,5 @@ > > public void updateFaviconForUrl(ContentResolver cr, String pageUri, > byte[] encodedFavicon, String faviconUri) { > + > + if((encodedFavicon.length) > MAXIMUM_FAVICON_SIZE_BYTES) { if ( You can ditch the brackets around encodedFavicon.length. ::: mobile/android/base/favicons/LoadFaviconTask.java @@ +116,2 @@ > BrowserDB.updateFaviconForUrl(resolver, pageUrl, encodedFavicon, faviconURL); > + return true; (Mostly a comment for Richard) I'm tempted to say we shouldn't bother with this because we've got the logic in getBytesForDatabaseStorage() ensuring we never get here with things that are too big. For someone to be calling this in a way that bypasses getBytesForDatabaseStorage(), they've also bypassed the favicon decoding/caching system. That would be a mistake. Should we really be writing code that will only serve a purpose when some future developer screws up? Doing so might make their mistake less obvious, too... @@ +451,5 @@ > // Fetching bytes to store can fail. saveFaviconToDb will > // do the right thing, but we still choose to cache the > // downloaded icon in memory. > + final byte[] faviconBytes; > + faviconBytes = loadedBitmaps.getBytesForDatabaseStorage(); Why split declaration and assignment here? @@ +452,5 @@ > // do the right thing, but we still choose to cache the > // downloaded icon in memory. > + final byte[] faviconBytes; > + faviconBytes = loadedBitmaps.getBytesForDatabaseStorage(); > + if(faviconBytes != null){ Space before (, space after ). See: https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style#Java_practices @@ +453,5 @@ > // downloaded icon in memory. > + final byte[] faviconBytes; > + faviconBytes = loadedBitmaps.getBytesForDatabaseStorage(); > + if(faviconBytes != null){ > + boolean result = saveFaviconToDb(faviconBytes); This may be clearer if inlined. @@ +457,5 @@ > + boolean result = saveFaviconToDb(faviconBytes); > + if (result) { > + return pushToCacheAndGetResult(loadedBitmaps); > + } > + Log.e(LOGTAG, "Favicon too large; returning."); This is not the *only* reason that saveFaviconToDb might return false (you just wrote every return statement in that function!), so this log statement is misleading. @@ +460,5 @@ > + } > + Log.e(LOGTAG, "Favicon too large; returning."); > + return null; > + } > + Log.e(LOGTAG, "Favicon too large; returning."); This is not the *only* reason that getBytesForDatabaseStorage() might return null, so this log message is also misleading. ::: mobile/android/base/favicons/decoders/FaviconDecoder.java @@ +102,5 @@ > > // We assume here that decodeByteArray doesn't hold on to the entire supplied > // buffer -- worst case, each of our buffers will be twice the necessary size. > + > + decodedImage = Bitmap.createScaledBitmap(decodedImage, Favicons.largestFaviconSize, Favicons.largestFaviconSize, true); This will stretch every favicon to be the largest favicon size. You almost certainly don't want to do this. Note that Favicons.largestFaviconSize refers to *pixel size* of the favicon, not payload size. ::: mobile/android/base/favicons/decoders/ICODecoder.java @@ +351,5 @@ > * Inner class to iterate over the elements in the ICO represented by the enclosing instance. > */ > private class ICOIterator implements Iterator<Bitmap> { > private int mIndex; > + Bitmap nextBitmap; No need to introduce a field here: use a local variable. @@ +364,5 @@ > if (mIndex > iconDirectory.length) { > throw new NoSuchElementException("No more elements in this ICO."); > } > + > + if (iconDirectory[mIndex].width > Favicons.largestFaviconSize){ "width" is not the property you are looking for. The IconDirectoryEntry class represents an ICO ICONDIRENTRY as described: http://en.wikipedia.org/wiki/ICO_%28file_format%29#Icon_resource_structure Each such entry represents one of the contained images. The width is the width of the contained image. You probably want the payloadSize property instead. @@ +367,5 @@ > + > + if (iconDirectory[mIndex].width > Favicons.largestFaviconSize){ > + nextBitmap = decodeBitmapAtIndex(mIndex); > + nextBitmap = Bitmap.createScaledBitmap(nextBitmap, Favicons.largestFaviconSize, Favicons.largestFaviconSize, true); > + mIndex++; Trailing whitespace. @@ +370,5 @@ > + nextBitmap = Bitmap.createScaledBitmap(nextBitmap, Favicons.largestFaviconSize, Favicons.largestFaviconSize, true); > + mIndex++; > + } > + > + return nextBitmap; I think you've broken this iterator. If the condition of your new if statement comes out as false, it doesn't look like we ever decode a bitmap: we just return the value held in the field, never making progress. ::: mobile/android/base/favicons/decoders/LoadFaviconResult.java @@ +20,5 @@ > */ > public class LoadFaviconResult { > private static final String LOGTAG = "LoadFaviconResult"; > > + public static final int MAXIMUM_FAVICON_SIZE_BYTES = 1024 * 1024; // 1MB You're defining this constant in three places. Using a constant instead of a magic number has two main advantages: 1) A handy name makes your meaning clearer. 2) Having a single point of definition makes changing the value easy. Having three identically-named constants in three different classes is just asking for someone to forget to update all three someday... Figure out which class it makes most sense to define the constant in, stick it there, and refer to it from all the places that need it. The Favicons class already has a bunch of this sort of constants: maybe that's a nice place to put it. (Richard?) @@ +73,5 @@ > + if (streamBytes.length > MAXIMUM_FAVICON_SIZE_BYTES){ > + Log.e(LOGTAG, "Favicon too large; returning."); > + return null; > + } > + return streamBytes; Rather than doing the same check twice, you might want to introduce a function so you only do your check once. /** * Some appropriate comment here. */ public byte[] getBytesForDatabaseStorage() { byte[] faviconBytes = getFaviconBytes(); if (faviconBytes.length > MAXIMUM_FAVICON_SIZE_BYTES) { Log.e(LOGTAG, "Favicon too large; returning."); return null; } return faviconBytes; } /** * Some appropriate comment here. */ public byte[] getBytesForDatabaseStorage() { // Everything that was in the original getBytesForDatabaseStorage() } That way, you decouple the "Get the bytes that represent this favicon" logic from the "make sure the bytes are suitable for our slightly stupid database" logic. This should make things more readable and slightly shorter.
Typo in the above code example. The second function definition should actually have read something like: /** * Some appropriate comment here. */ private byte[] getFaviconBytes() { // Everything that was in the original getBytesForDatabaseStorage() } The idea being to have a private "Get the bytes and sod the size" function and a public "Get the bytes and throw them away if they're too big" one.
(In reply to Jalpreet Singh Nanda [:imjalpreet] from comment #12) > Coming to the ICODecoder, I think the selective discarding of some .ico that > are smaller than the threshold is already implemented. Note that an ICODecoder instance refers to just one ICO file at a time. A single .ico file can contain many images (See: http://en.wikipedia.org/wiki/ICO_%28file_format%29#Icon_resource_structure ) What we discard are images contained within an ICO: not ICOs. The pruning that currently occurs: 1) Discards corrupt images. 2) Discards all images except the best (in terms of visual quality) one of each size. An ICO may provide a bunch of different images at the same size with different colour depth and whatnot: we pick the best one for each size and bin the rest. (nvidia's website has a favicon that does this). 3) Throw away all images except one larger than Favicons.largestFaviconSize (which is the largest size, in pixels, that we care about). We keep one larger because it's better to scale this one down than to scale a smaller one up. I don't believe we ever downscale this minimum-maximum to be exactly Favicons.largestFaviconSize pixels in size. This decision was made to save time when decoding icons, but maybe it's worthwhile doing that in getBytesForDatabaseStorage() to save more space. We might want to bin some sizes entirely. A good heuristic for doing so doesn't spring immediately to mind. (we could perhaps observe how things survive in the cache and use this to figure out which sizes are "most popular", but that probably counts as over-engineering).
(In reply to Richard Newman [:rnewman] from comment #11) > Imagine we fetch favicon.ico for a site, and it's a 1024x1024 uncompressed > image. There are two ways we can make that smaller: one is to use a smarter > image format (e.g., PNG -- getBytesForDatabaseStorage does this already for > single-image icons), and the other is to rescale the image down to a sane > size. This reminds me of something we talked about a while back: aggresively re-encoding ICOs. An ICO contains a set of images in either BMP or PNG format. Currently, we store multi-image ICOs verbatim in the database. We would save space (at *considerable* time cost) if we re-encoded every image in an ICO to be a PNG. This might be a good approach (perhaps only used as a last resort to reduce size if the ICO is deemed "too big", or something). A highly novel approach that would likely save even more space would be to re-encode ICOs as "spritesheets". Storing a concatenation of PNGs is inefficient: you get better compression ratios on larger images vs. a set of smaller images. So, in principle, you could "tile" the members of your ICO in a single PNG and bolt on a tEXt chunk explaining where to find the various members. This would be extremely fun, and probably reduce ICO sizes by something around 10% in the DB. :P This is also a completely terrible idea: Don't do this.
(In reply to Chris Kitching [:ckitching] from comment #13) > > @@ +364,5 @@ > > if (mIndex > iconDirectory.length) { > > throw new NoSuchElementException("No more elements in this ICO."); > > } > > + > > + if (iconDirectory[mIndex].width > Favicons.largestFaviconSize){ > > "width" is not the property you are looking for. > > The IconDirectoryEntry class represents an ICO ICONDIRENTRY as described: > http://en.wikipedia.org/wiki/ICO_%28file_format%29#Icon_resource_structure > > Each such entry represents one of the contained images. The width is the > width of the contained image. You probably want the payloadSize property > instead. > Can you please give a short expalanation on what is payloadSize? I think I have done all the other changes as suggested by you. I will attach the diff after I complete this part.
Flags: needinfo?(chriskitching)
(In reply to Jalpreet Singh Nanda [:imjalpreet] from comment #17) > Can you please give a short expalanation on what is payloadSize? Sorry for the slow reply, payloadSize is simply the number of bytes in the source ICO file which are used to store the corresponding image. It corresponds directly to the "size" parameter of an ICONDIRENTRY as described excellently by Wikipedia: http://en.wikipedia.org/wiki/ICO_%28file_format%29
Flags: needinfo?(chriskitching)
Thanks, I will get back if I have any doubts.
(In reply to Chris Kitching [:ckitching] from comment #13) > @@ +364,5 @@ > > if (mIndex > iconDirectory.length) { > > throw new NoSuchElementException("No more elements in this ICO."); > > } > > + > > + if (iconDirectory[mIndex].width > Favicons.largestFaviconSize){ > > "width" is not the property you are looking for. > > The IconDirectoryEntry class represents an ICO ICONDIRENTRY as described: > http://en.wikipedia.org/wiki/ICO_%28file_format%29#Icon_resource_structure > > Each such entry represents one of the contained images. The width is the > width of the contained image. You probably want the payloadSize property > instead. ************* if (iconDirectory[mIndex].payloadSize > LoadFaviconTask.MAXIMUM_FAVICON_SIZE_BYTES){ ... } ************* Would this be fine in place of the above if condition?
Flags: needinfo?(chriskitching)
Attachment #8536148 - Flags: feedback?(rnewman) → feedback?(chriskitching)
This is the latest patch.
Attachment #8536148 - Attachment is obsolete: true
Attachment #8536148 - Flags: feedback?(chriskitching)
Attachment #8540345 - Flags: feedback?(chriskitching)
(In reply to Jalpreet Singh Nanda [:imjalpreet] from comment #20) > Would this be fine in place of the above if condition? Yes. Sorry if I wasn't clear enough earlier! I'll get to reviewing your patch later on.
Flags: needinfo?(chriskitching)
No, you were clear, I just wanted to confirm.
Can you please tell if anything is still left to be completed?
Flags: needinfo?(rnewman)
Comment on attachment 8540345 [details] [diff] [review] Bug-1108084-Proposed_Patch Review of attachment 8540345 [details] [diff] [review]: ----------------------------------------------------------------- Looks like Chris is incommunicado. ::: mobile/android/base/db/LocalBrowserDB.java @@ +1085,5 @@ > > public void updateFaviconForUrl(ContentResolver cr, String pageUri, > byte[] encodedFavicon, String faviconUri) { > + > + if(encodedFavicon.length > LoadFaviconTask.MAXIMUM_FAVICON_SIZE_BYTES) { Nit: space between "if" and "(". ::: mobile/android/base/favicons/LoadFaviconTask.java @@ +458,5 @@ > + } > + Log.e(LOGTAG, "Favicon too large; returning."); > + return null; > + } > + Log.e(LOGTAG, "Favicon is null; returning."); That's not the only reason. You can remove lines 459, 460, and 462, and simplify this whole section to: final byte[] faviconBytes = loadedBitmaps.getBytesForDatabaseStorage(); if (faviconBytes == null) { return null; } if (saveFaviconToDb(faviconBytes)) { return pushToCacheAndGetResult(loadedBitmaps); } return null; ::: mobile/android/base/favicons/decoders/FaviconDecoder.java @@ +103,5 @@ > // We assume here that decodeByteArray doesn't hold on to the entire supplied > // buffer -- worst case, each of our buffers will be twice the necessary size. > + > + if (decodedImage.getWidth() > Favicons.largestFaviconSize) > + decodedImage = Bitmap.createScaledBitmap(decodedImage, Favicons.largestFaviconSize, Favicons.largestFaviconSize, true); Nit: always {} your conditionals, even one-liners. Also, you're assuming here that the bitmap is square. You should compute the width and height in order to preserve the aspect ratio before scaling. ::: mobile/android/base/favicons/decoders/ICODecoder.java @@ +364,5 @@ > if (mIndex > iconDirectory.length) { > throw new NoSuchElementException("No more elements in this ICO."); > } > + > + if (iconDirectory[mIndex].payloadSize > LoadFaviconTask.MAXIMUM_FAVICON_SIZE_BYTES ){ Nit: " ){" should be ") {". @@ +367,5 @@ > + > + if (iconDirectory[mIndex].payloadSize > LoadFaviconTask.MAXIMUM_FAVICON_SIZE_BYTES ){ > + Bitmap nextBitmap; > + nextBitmap = decodeBitmapAtIndex(mIndex); > + nextBitmap = Bitmap.createScaledBitmap(nextBitmap, Favicons.largestFaviconSize, Favicons.largestFaviconSize, true); Firstly, same aspect ratio issue as before. Secondly, is it always the case that a favicon larger than MAXIMUM_FAVICON_SIZE_BYTES is larger than largestFaviconSize^2? Is scaling the right choice here?
Attachment #8540345 - Flags: feedback?(chriskitching) → feedback+
Flags: needinfo?(rnewman)
Is this still relevant?
Flags: needinfo?(rnewman)
Yes. :Sebastian can take over mentoring.
Mentor: chriskitching, rnewman → s.kaspari
Flags: needinfo?(rnewman)
(In reply to bd339 from comment #27) > Is this still relevant? It might very well be. However I'm currently looking at the favicon storage in bug 1290014. This here will have to wait.
Depends on: 1290014
Fair enough. I will keep an eye out for how this goes then. I talked with ahunt and snorp about bug 961335 in IRC today. We talked about how it is perhaps not worth writing the tests discussed in bug 961335, because it seems we only display favicons in the history and bookmarks lists nowadays. At least ahunt said he changed his opinion on the usefulness of bug 961335 and that it would now be considered of low priority because the code is not often used (many websites don't have a favicon in the history / bookmarks lists). I'm new to fennec development so I don't know if said lack of favicons is related to your description of bug 1290014 (the part about some favicons never showing up in UI for example). Does this make bug 961335 more worthwhile or do you think it is just unrelated?
Fixed as part of bug 1290014.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: