Closed Bug 978932 Opened 10 years ago Closed 10 years ago

crash in java.lang.OutOfMemoryError: at org.mozilla.gecko.favicons.LoadFaviconTask.downloadAndDecodeImage(LoadFaviconTask.java)

Categories

(Firefox for Android Graveyard :: Favicon Handling, defect)

30 Branch
All
Android
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED
Firefox 34

People

(Reporter: aaronmt, Assigned: ramasamy.zephyr, NeedInfo)

References

Details

(Keywords: crash, Whiteboard: [lang=java][good first bug])

Crash Data

Attachments

(1 file, 2 obsolete files)

This bug was filed from the Socorro interface and is 
report bp-35237bf6-b6bf-4964-9d49-5b8802140302.
=============================================================

java.lang.OutOfMemoryError
	at org.mozilla.gecko.favicons.LoadFaviconTask.downloadAndDecodeImage(LoadFaviconTask.java:245)
	at org.mozilla.gecko.favicons.LoadFaviconTask.downloadFavicon(LoadFaviconTask.java:201)
	at org.mozilla.gecko.favicons.LoadFaviconTask.doInBackground$2d4c763b(LoadFaviconTask.java:370)
	at org.mozilla.gecko.favicons.LoadFaviconTask.doInBackground$42af7916(LoadFaviconTask.java:42)
	at org.mozilla.gecko.util.UiAsyncTask$BackgroundTaskRunnable.run(UiAsyncTask.java:48)
	at android.os.Handler.handleCallback(Handler.java:733)
	at android.os.Handler.dispatchMessage(Handler.java:95)
	at android.os.Looper.loop(Looper.java:136)
	at org.mozilla.gecko.util.GeckoBackgroundThread.run(GeckoBackgroundThread.java:32)

Can we protect against a crash here?
Sure.
Whiteboard: [mentor=rnewman][lang=java][good first bug]
Can I grab this one!? rnewman could explain how to fix it!?
Assignee: nobody → marcos.cezar.mendes
Flags: needinfo?(rnewman)
(In reply to Marcos Cezar Mendes da Costa Junior from comment #2)
> Can I grab this one!? rnewman could explain how to fix it!?

Sure! Step one is to get a working build:

https://wiki.mozilla.org/Mobile/Fennec/Android

You need to get to a point where you can reproduce the bug, which starts with building and running your own build of Firefox on your Android device.

When you're at that point, let me know!
Flags: needinfo?(rnewman)
I already compiled the code. What goes next!?
Great!

Next is the standard process for fixing a bug:

* Reproduce it.
* Apply a fix.
* Verify that the fix avoids the issue.
* Verify that the fix works in the wild.

This is the line that throws:

        // Allocate a buffer to hold the raw favicon data downloaded.
        byte[] buffer = new byte[bufferSize];

In this case, reproducing without code changes is difficult; it depends on the amount of memory available. You'll have to trigger an OOM at that location yourself, probably by doing something like allocating `new byte[10 * bufferSize]`.

The fix will probably be three things:

1. Verify that we're not doing something stupid here.
2. Catch an OOM error at the appropriate point in the stack.
3. Ensure that calling code does the appropriate thing when LoadFaviconTask fails.
Status: NEW → ASSIGNED
What use case thrigger this line of code?
Loading a favicon that you haven't seen before, that requires fetching from the web.
Mentor: rnewman
Whiteboard: [mentor=rnewman][lang=java][good first bug] → [lang=java][good first bug]
Attachment #8471829 - Flags: review?(rnewman)
rnewman, This is my first bug fix here. Please let me know if I am doing things wrongly. I reproduced the bug by altering the buffer size directly by some multiple as suggested by you.
Comment on attachment 8471829 [details] [diff] [review]
0001-Bug_978932-Handle_out_of_memory_for_favicon.patch

Review of attachment 8471829 [details] [diff] [review]:
-----------------------------------------------------------------

Nit: commit comment should read

  Bug 978932 - Catch OutOfMemoryError in downloadFavicon. r=rnewman

One more pass to avoid unnecessary changes, then this should be good to go! Thanks!

::: mobile/android/base/favicons/LoadFaviconTask.java
@@ +123,5 @@
>          if (visited.size() == MAX_REDIRECTS_TO_FOLLOW) {
>              return null;
>          }
>  
> +

Nit: don't introduce unnecessary changes.

@@ +223,5 @@
>              result = downloadAndDecodeImage(targetFaviconURI);
>          } catch (Exception e) {
>              Log.e(LOGTAG, "Error reading favicon", e);
> +        } catch (OutOfMemoryError e){
> +            Log.e(LOGTAG, "Insufficient memory to download  favicon");

Nit: put a single space between ) and {.
Nit: remove second space between "download" and "favicon".

@@ +278,3 @@
>          try {
> +            // Allocate a buffer to hold the raw favicon data downloaded.
> +            buffer = new byte[bufferSize];

The changes in this method aren't necessary: doing the allocation earlier is better, and the method will still throw. Just keep the OOME catch above.
Attachment #8471829 - Flags: review?(rnewman) → feedback+
(In reply to (Away Aug 14-24) Richard Newman [:rnewman] from comment #10)

> @@ +278,3 @@
> >          try {
> > +            // Allocate a buffer to hold the raw favicon data downloaded.
> > +            buffer = new byte[bufferSize];
> 
> The changes in this method aren't necessary: doing the allocation earlier is
> better, and the method will still throw. Just keep the OOME catch above.

I put there to close the content stream, thanks for the input.
(In reply to ramasamy.zephyr from comment #11)
> (In reply to (Away Aug 14-24) Richard Newman [:rnewman] from comment #10)
> 
> > @@ +278,3 @@
> > >          try {
> > > +            // Allocate a buffer to hold the raw favicon data downloaded.
> > > +            buffer = new byte[bufferSize];
> > 
> > The changes in this method aren't necessary: doing the allocation earlier is
> > better, and the method will still throw. Just keep the OOME catch above.
> 
> I put there to close the content stream, thanks for the input.

Should I go without changes above?
(In reply to ramasamy.zephyr from comment #11)

> I put there to close the content stream, thanks for the input.

Hmm, maybe.

On the one hand, closing the stream (which consumes the entity body and allows the connection to be reused) is good.

On the other hand:

* If we fail in the current code, we never retrieve the input stream, so it doesn't need to be consumed. (The resource itself is never closed, though.)

* Closing the stream without consuming the entire entity is bad, but we can't legitimately do so in an OOM case, because that'll probably fail, too.

* We really ought to be closing these responses, not just exhausting the stream -- we don't need to reuse the connection.


So you've pointed out a bug (yay!), but I don't think always grabbing the entity body is the correct fix.

Let's take a slightly different angle: let's extract most of the body of downloadAndDecodeImage into a method (perhaps 'decodeImageFromResponse') that takes only an HttpResponse as an argument, and make the body of downloadAndDecodeImage into just:

        // Try the URL we were given.
        final HttpResponse response = tryDownload(targetFaviconURL);
        if (response == null) {
            return null;
        }

        try {
            return decodeImageFromResponse(response);
        } finally {
            response.close();
        }

response.close() is equivalent to exhausting the stream regardless, and it also closes the connection, so we'll never leak resources.

decodeImageFromResponse will consume the stream it creates, if it creates one, but it isn't responsible for closing the response.

Thoughts?

If I'm unavailable, ckitching can review. (But let's make sure we test thoroughly!)
Component: General → Favicon Handling
(In reply to (Away Aug 14-24) Richard Newman [:rnewman] from comment #13)


> * If we fail in the current code, we never retrieve the input stream, so it
> doesn't need to be consumed. (The resource itself is never closed, though.)

OOME can happen after retrieve input stream as buffer is allocated dynamically if entityReportedLength is  not provided so I guess we have to close the stream.

 - final long entityReportedLength = entity.getContentLength();

> * Closing the stream without consuming the entire entity is bad, but we
> can't legitimately do so in an OOM case, because that'll probably fail, too.

 According to comments in consumeContent() of HttpEntity it says its deprecated infavour of InputStream close. Though it also says method is important if the entity is obtained from connection. 

> * We really ought to be closing these responses, not just exhausting the
> stream -- we don't need to reuse the connection.

>         // Try the URL we were given.
>         final HttpResponse response = tryDownload(targetFaviconURL);
>         if (response == null) {
>             return null;
>         }
> 
>         try {
>             return decodeImageFromResponse(response);
>         } finally {
>             response.close();
>         }
> 
> response.close() is equivalent to exhausting the stream regardless, and it
> also closes the connection, so we'll never leak resources.
> 
> decodeImageFromResponse will consume the stream it creates, if it creates
> one, but it isn't responsible for closing the response.

Yeah, the above approach seems to be clean and follow the same pattern followed in other parts of source code that is consumeContent() on HttpEntity from HttpResponse.

go ahead?
(In reply to ramasamy.zephyr from comment #14)

> OOME can happen after retrieve input stream as buffer is allocated
> dynamically if entityReportedLength is  not provided so I guess we have to
> close the stream.

But in that case we already do; buffer reallocations inside the try block already close the content stream.

> go ahead?

That's implied by my comment, yes.
Late to the party, possibly being stupid, but are we sure this is the right direction?

Isn't a better approach to add sanity checks to male sure we don't try and allocate a huge buffer if someone sends us a silly size header, and to make sure we abort download if the favicon turns out to be bigger than some sane limit.

After all, if you get an OOM for a reason other than "You did something  dumb" aren't we unrecoverably broken?
(In reply to Chris Kitching [:ckitching] from comment #16)

> After all, if you get an OOM for a reason other than "You did something 
> dumb" aren't we unrecoverably broken?

Crashing the browser when we're too low on memory to load a favicon is bad behavior; it's entirely possible that the favicon load is the peak of memory usage during the page load.

Even if we limit favicon sizes (and I agree that a sanity limit is a good idea), we still need to guard here. There's no acceptable upper bound on favicon size that's small enough that an allocation will never fail.
Attachment #8471829 - Attachment is obsolete: true
Attachment #8472540 - Flags: review?(rnewman)
Comment on attachment 8472540 [details] [diff] [review]
0001-Bug_978932-Handle_out_of_memory_for_favicon.patch

Review of attachment 8472540 [details] [diff] [review]:
-----------------------------------------------------------------

RNewman's gone on holiday, so as per his comment from earlier I'm taking over.

Hello!

Your patch looks prettymuch perfect. A possible improvement might be to add code that enforces an upper limit on favicon size (either aborting in response to the size header or when a certain excessive amount of downloading has occurred). That is, however, just the icing on the cake: your current patch should solve the crash.

Hooray!

If you want to go ahead and add the extra code do so and flag me for review, otherwise I'll get this landed in the near future.

::: mobile/android/base/favicons/LoadFaviconTask.java
@@ +257,5 @@
> +            entity.consumeContent();
> +        }
> +    }
> +
> +    // Copies the fetched stream to a buffer and then decode.

Use a javadoc-style comment here. (makes automating some tasks easier). Perhaps expand the comment to give more helpful details.
Attachment #8472540 - Flags: review?(rnewman) → review+
Using javadoc-style commenting for method.
Attachment #8473425 - Flags: review?(chriskitching)
I didn't see lot of methods following javadoc-style comments. So I assumed only important methods need those. 

Thanks for reviewing. Checking  favicon size upper bound, do u have any suggestion how to go about it. 

I can do an upper bound on sizeof(int) since we are using byte arrays to hold the favicons but > 2GB of favicon data, so I guess its not useful to do that.
Attachment #8472540 - Attachment is obsolete: true
Attachment #8473425 - Flags: review?(chriskitching) → review+
Mentor: rnewman
(In reply to ramasamy.zephyr from comment #21)
> I didn't see lot of methods following javadoc-style comments. So I assumed
> only important methods need those. 

Well, common sense applies. Currently we have a bit of a deficiency of them, so it's good to jam some extra documentation in when we can.

> Thanks for reviewing. Checking  favicon size upper bound, do u have any
> suggestion how to go about it. 
> I can do an upper bound on sizeof(int) since we are using byte arrays to
> hold the favicons but > 2GB of favicon data, so I guess its not useful to do
> that.

The idea was to define a limit that prevents us from blindly downloading data forever if a stupid remote server tries to give us an infinitely-sized favicon.
However, this is perhaps such an obscure edgecase as to not be worth bothering with. It's unclear that an effective DoS could be implemented atop this, so I think just dying on OOM as your patch currently does is sufficient.

Thanks for your contribution! It'll be landing shortlyish: in the meantime, perhaps some of the other favicon handling bugs are of interest to you:
https://bugzilla.mozilla.org/buglist.cgi?product=Firefox%20for%20Android&component=Favicon%20Handling&resolution=---&list_id=11031189
https://hg.mozilla.org/mozilla-central/rev/96fc001d76a7
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 34
Chris: why did the final patch not take the approach of response.close()?
Flags: needinfo?(chriskitching)
(In reply to Richard Newman [:rnewman] from comment #26)
> Chris: why did the final patch not take the approach of response.close()?

Drat. Didn't read enough of the context.
Sorry, I'll submit a corrected patch myself shortlyish.
Flags: needinfo?(chriskitching)
Assignee: marcos.cezar.mendes → ramasamy.zephyr
ni on ckitching for additional patch and consideration of uplift.
Flags: needinfo?(chriskitching)
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.