Closed Bug 1383499 Opened 7 years ago Closed 7 years ago

Bulbapedia Wiki page load stalls or crash Firefox

Categories

(Core :: Graphics: ImageLib, defect, P3)

ARM
Android
defect

Tracking

()

VERIFIED FIXED
mozilla57
Tracking Status
fennec + ---
firefox-esr52 --- unaffected
firefox54 --- wontfix
firefox55 --- wontfix
firefox56 --- fixed
firefox57 --- verified

People

(Reporter: lh.bennett, Assigned: aosmond)

References

Details

(Keywords: crash, Whiteboard: [gfx-noted])

Attachments

(3 files, 1 obsolete file)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0
Build ID: 20170722072631

Steps to reproduce:

Load this Bulbapedia page: http://goo.gl/h2FFQD


Actual results:

Stealth crash in Fx 54.01.

Permastall or crash in Nightly 56.


Expected results:

 Load normally.
Hmm, that page doesn't have excessively many animated images (according to about:memory on Desktop, they only take around 30 MB), but maybe we make additional temporary allocations while loading those images, which is then enough to still push us towards OOM?

On my phone (2 GB RAM), that page didn't crash Firefox, but still caused some weird behaviour with OOM-like symptoms, e.g. pressing the tab button triggered some animation, but the tabs tray itself remained invisible, although after switching to a different app and back, which restored UI responsiveness, it turned out that I still managed to opened some new tabs that way. Additionally, a black bar temporarily appeared instead of the address bar.

For a more extreme case of this, compare http://www.pokewiki.de/Metronom, whose animated PNGs take up about 200 MB on Desktop.

Incidentally, what phone are you using and how much RAM does it have? If you can retrieve some links to corresponding crash reports from about:memory, that would also be nice, although if these are indeed OOM crashes, chances are that gathering and sending the crash report failed, too.
I use a Nexus 6 running Android 7.1.1. Qualcomm Snapdragon 805, 3GB RAM.

I don't have any crash reports because none are recorded. About:Crashes is empty for both builds.

MFx 54.0.1 more often than not will just instantly disappear from the screen after loading that page. Using the app picker to reselect Firefox will just auto reload that webpage. But because Mobile Firefox loads slowly, there's enough time to kill the tab before anything happens.

Nightly 56 typically will freeze all interaction but will not disappear from view, however cycling it like you suggest will return responsiveness for a period of time and sometimes freeze again when scrolling. Other times it will act similarly to MFx54 and disappear. I assumed it to be a crash. But again, there's no crash report recorded.
Attached file log.txt
Device:
 - Nexus 6 (Android 7.0);

Hello,

 I can reproduce the issue Leman is talking about in FF Release & Nightly. Both have the same behavior (app closes). I've attached a log and a video with the issue. 


Video: https://goo.gl/wMvD5T
Status: UNCONFIRMED → NEW
tracking-fennec: --- → ?
Ever confirmed: true
OS: Unspecified → Android
Hardware: Unspecified → ARM
Version: unspecified → Trunk
Tested on a few more devices and here is a link from a crash report on a Note 4 (Android 5.0.1): https://crash-stats.mozilla.com/report/index/2d2cb9ea-e3a1-451f-8f7b-284400170726
(In reply to Bogdan Surd, QA [:BogdanS] from comment #3)
> Created attachment 8890260 [details]
> log.txt

Here's the crash callstack from log.txt. But my local build generates different callstack on PixelXL pasted bottom. Seems web page cause memory or fd exhausted and crash.

Hi snorp, would you help to check on this, please?


07-26 12:21:52.556: D/AndroidRuntime(9218): Shutting down VM
07-26 12:21:52.568: E/GeckoCrashHandler(9218): >>> REPORTING UNCAUGHT EXCEPTION FROM THREAD 1 ("main")
07-26 12:21:52.568: E/GeckoCrashHandler(9218): java.lang.NullPointerException: FileDescriptor must not be null
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.os.ParcelFileDescriptor.<init>(ParcelFileDescriptor.java:176)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.os.ParcelFileDescriptor$1.createFromParcel(ParcelFileDescriptor.java:994)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.os.ParcelFileDescriptor$1.createFromParcel(ParcelFileDescriptor.java:987)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.os.Parcel.readParcelable(Parcel.java:2470)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.util.MemoryIntArray.<init>(MemoryIntArray.java:78)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.util.MemoryIntArray.<init>(MemoryIntArray.java)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.util.MemoryIntArray$1.createFromParcel(MemoryIntArray.java:233)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.util.MemoryIntArray$1.createFromParcel(MemoryIntArray.java:231)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.os.Parcel.readParcelable(Parcel.java:2470)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.os.Parcel.readValue(Parcel.java:2364)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.os.Parcel.readArrayMapInternal(Parcel.java:2717)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.os.BaseBundle.unparcel(BaseBundle.java:269)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.os.BaseBundle.getString(BaseBundle.java:992)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.provider.Settings$NameValueCache.getStringForUser(Settings.java:1627)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.provider.Settings$System.getStringForUser(Settings.java:1911)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.provider.Settings$System.getIntForUser(Settings.java:1981)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.media.AudioManager.querySoundEffectsEnabled(AudioManager.java:1989)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.media.AudioManager.playSoundEffect(AudioManager.java:1906)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.view.ViewRootImpl.playSoundEffect(ViewRootImpl.java:5722)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.view.View.playSoundEffect(View.java:20191)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.view.View.performClick(View.java:5608)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.view.View$PerformClick.run(View.java:22262)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.os.Handler.handleCallback(Handler.java:751)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.os.Handler.dispatchMessage(Handler.java:95)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.os.Looper.loop(Looper.java:154)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at android.app.ActivityThread.main(ActivityThread.java:6077)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at java.lang.reflect.Method.invoke(Native Method)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:865)
07-26 12:21:52.568: E/GeckoCrashHandler(9218): 	at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:755)
07-26 12:21:52.603: W/google-breakpad(9218): ExceptionHandler::GenerateDump cloned child 
07-26 12:21:52.603: W/google-breakpad(9218): 9511
07-26 12:21:52.603: W/google-breakpad(9218): ExceptionHandler::SendContinueSignalToChild sent continue signal to child
07-26 12:21:52.605: W/google-breakpad(9511): ExceptionHandler::WaitForContinueSignal waiting for continue signal...



08-02 16:15:28.090  2247  2270 W GeckoConsole: [JavaScript Warning: "Loading failed for the <script> with source “https://sdk.adspruce.com/1/adspruce.js?pid=3574&sid=3”." {file: "https://bulbapedia.bulbagarden.net/wiki/Roserade_(Pok%C3%A9mon)" line: 1}]
08-02 16:15:28.227  2247  2268 E CursorWindow: Could not allocate CursorWindow '/data/user/0/org.mozilla.fennec_maliu/files/mozilla/etyqcv5w.default/browser.db' of size 2097152 due to error -24.
08-02 16:15:28.229  2247  2268 E GeckoCrashHandler: >>> REPORTING UNCAUGHT EXCEPTION FROM THREAD 1881 ("GeckoBackgroundThread")
08-02 16:15:28.229  2247  2268 E GeckoCrashHandler: android.database.CursorWindowAllocationException: Cursor window allocation of 2048 kb failed.
08-02 16:15:28.229  2247  2268 E GeckoCrashHandler:     at android.database.CursorWindow.<init>(CursorWindow.java:108)
08-02 16:15:28.229  2247  2268 E GeckoCrashHandler:     at android.database.AbstractWindowedCursor.clearOrCreateWindow(AbstractWindowedCursor.java:198)
08-02 16:15:28.229  2247  2268 E GeckoCrashHandler:     at android.database.sqlite.SQLiteCursor.fillWindow(SQLiteCursor.java:138)
08-02 16:15:28.229  2247  2268 E GeckoCrashHandler:     at android.database.sqlite.SQLiteCursor.getCount(SQLiteCursor.java:132)
08-02 16:15:28.229  2247  2268 E GeckoCrashHandler:     at android.database.AbstractCursor.moveToPosition(AbstractCursor.java:219)
08-02 16:15:28.229  2247  2268 E GeckoCrashHandler:     at android.database.AbstractCursor.moveToFirst(AbstractCursor.java:258)
08-02 16:15:28.229  2247  2268 E GeckoCrashHandler:     at android.database.CursorWrapper.moveToFirst(CursorWrapper.java:71)
08-02 16:15:28.229  2247  2268 E GeckoCrashHandler:     at org.mozilla.gecko.db.LocalBrowserDB.lookupHistoryGUIDByPageUri(LocalBrowserDB.java:581)
08-02 16:15:28.229  2247  2268 E GeckoCrashHandler:     at org.mozilla.gecko.db.LocalBrowserDB.deletePageMetadata(LocalBrowserDB.java:542)
08-02 16:15:28.229  2247  2268 E GeckoCrashHandler:     at org.mozilla.gecko.GlobalPageMetadata.doAddOrQueue(GlobalPageMetadata.java:96)
08-02 16:15:28.229  2247  2268 E GeckoCrashHandler:     at org.mozilla.gecko.GlobalPageMetadata.add(GlobalPageMetadata.java:80)
08-02 16:15:28.229  2247  2268 E GeckoCrashHandler:     at org.mozilla.gecko.BrowserApp.handleMessage(BrowserApp.java:2153)
08-02 16:15:28.229  2247  2268 E GeckoCrashHandler:     at org.mozilla.gecko.EventDispatcher$2.run(EventDispatcher.java:337)
08-02 16:15:28.229  2247  2268 E GeckoCrashHandler:     at android.os.Handler.handleCallback(Handler.java:751)
08-02 16:15:28.229  2247  2268 E GeckoCrashHandler:     at android.os.Handler.dispatchMessage(Handler.java:95)
08-02 16:15:28.229  2247  2268 E GeckoCrashHandler:     at android.os.Looper.loop(Looper.java:154)
08-02 16:15:28.229  2247  2268 E GeckoCrashHandler:     at org.mozilla.gecko.util.GeckoBackgroundThread.run(GeckoBackgroundThread.java:43)
08-02 16:15:28.230  2247  2268 E GeckoCrashHandler: Main thread (1) stack:
08-02 16:15:28.230  2247  2268 E GeckoCrashHandler:     android.os.MessageQueue.nativePollOnce(Native Method)
08-02 16:15:28.230  2247  2268 E GeckoCrashHandler:     android.os.MessageQueue.next(MessageQueue.java:323)
08-02 16:15:28.230  2247  2268 E GeckoCrashHandler:     android.os.Looper.loop(Looper.java:136)
08-02 16:15:28.231  2247  2268 E GeckoCrashHandler:     android.app.ActivityThread.main(ActivityThread.java:6121)
08-02 16:15:28.231  2247  2268 E GeckoCrashHandler:     java.lang.reflect.Method.invoke(Native Method)
08-02 16:15:28.231  2247  2268 E GeckoCrashHandler:     com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:889)
08-02 16:15:28.231  2247  2268 E GeckoCrashHandler:     com.android.internal.os.ZygoteInit.main(ZygoteInit.java:779)
08-02 16:15:28.232  2247  2268 W google-breakpad: ExceptionHandler::GenerateDump sys_pipe failed:
08-02 16:15:28.232  2247  2268 W google-breakpad: Too many open files
08-02 16:15:28.232  2247  2268 W google-breakpad:
08-02 16:15:28.254  2247  2268 W google-breakpad: ExceptionHandler::GenerateDump cloned child
08-02 16:15:28.254  3261  2268 W google-breakpad: ExceptionHandler::WaitForContinueSignal waiting for continue signal...
08-02 16:15:28.254  2247  2268 W google-breakpad: 3261
08-02 16:15:28.254  3261  2268 W google-breakpad: ExceptionHandler::WaitForContinueSignal sys_read failed:
08-02 16:15:28.254  2247  2268 W google-breakpad:
08-02 16:15:28.254  2247  2268 W google-breakpad: ExceptionHandler::SendContinueSignalToChild sys_write failed:
08-02 16:15:28.254  3261  2268 W google-breakpad: Bad file descriptor
08-02 16:15:28.254  2247  2268 W google-breakpad: Bad file descriptor
08-02 16:15:28.254  3261  2268 W google-breakpad:
08-02 16:15:28.254  2247  2268 W google-breakpad:
08-02 16:15:28.254  2247  2268 W google-breakpad: ExceptionHandler::SendContinueSignalToChild sent continue signal to child
Flags: needinfo?(snorp)
Yup, looks like OOM. Jamie this seems up your alley, want to take a look?
tracking-fennec: ? → +
Flags: needinfo?(snorp) → needinfo?(jnicol)
Sorry, was on holiday last week. Looking in to this.
Assignee: nobody → jnicol
Flags: needinfo?(jnicol)
Attached file testcase.html
The crash is because we are running out of available fds, and the culprit is all the animated images on the page. Attached is a simpler test case containing only them.

They seems to use around 200 fds each. Andrew knows more about this.
Flags: needinfo?(aosmond)
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=69384a669e1e9b5b4c189c96e8e8a662c0a6fc9f

This should do it. There is no need to use volatile buffers for APNG frames, as there is actually no benefit compared to single frame images, so we should just use the heap instead.
Assignee: jnicol → aosmond
Status: NEW → ASSIGNED
Flags: needinfo?(aosmond)
Has STR: --- → yes
Component: General → ImageLib
Priority: -- → P3
Product: Firefox for Android → Core
Whiteboard: [gfx-noted]
Keywords: crash
Comment on attachment 8897471 [details] [diff] [review]
Animated PNGs should allocate after-first frames on the heap instead of as volatile buffers., v1

>+    if (!aIsAnimated && !ClearSurface(mRawSurface, mFrameRect.Size(), mFormat)) {
>       NS_WARNING("Could not clear allocated buffer");
>       mAborted = true;
>       return NS_ERROR_OUT_OF_MEMORY;
>     }
>   }

Probably clearer if you make this

if (not animated) {
  if (not clearsurface) {
...
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=b9626cad1a27e28e2a57c1a5d394b0fe31e2fcde

Sadness. Because bug 1290293 was not able to fully land because it messes up image containers (if an image gets layerized, it no longer is able to be display as it decodes and needs the full frame first -- there are possibly workarounds, namely exposing a "valid rect", and each user of the container would need to check it, but I never got around to it...), we don't actually clear the unwritten pixels for truncated frames. Which means I cannot avoid the memset and make this a general case. Le sigh. So now I have special cased this for Android, which will have potentially slightly worse decode performance in exchange for not using any file handles.

(In reply to Timothy Nikkel (:tnikkel) from comment #10)
> Comment on attachment 8897471 [details] [diff] [review]
> Animated PNGs should allocate after-first frames on the heap instead of as
> volatile buffers., v1
> 
> >+    if (!aIsAnimated && !ClearSurface(mRawSurface, mFrameRect.Size(), mFormat)) {
> >       NS_WARNING("Could not clear allocated buffer");
> >       mAborted = true;
> >       return NS_ERROR_OUT_OF_MEMORY;
> >     }
> >   }
> 
> Probably clearer if you make this
> 
> if (not animated) {
>   if (not clearsurface) {
> ...

In principal, agreed. But the code changed and now has #ifdefs :).
Attachment #8897471 - Attachment is obsolete: true
Comment on attachment 8897512 [details] [diff] [review]
Animated PNGs should allocate after-first frames on the heap instead of as volatile buffers on Android., v2

Aren't the asserts from v1 of the patch still valid?
Attachment #8897512 - Flags: review+
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/edd9d2acbac7
Animated PNGs should allocate after-first frames on the heap instead of as volatile buffers on Android. r=tnikkel
https://hg.mozilla.org/mozilla-central/rev/edd9d2acbac7
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Whoops, I backed it out and forgot to add a backout comment. Reopening.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
...and I was looking at the wrong bug.
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
Depends on: 1391430
Device:
 - Nexus 6 (Android 7.1.1);
 - Samsung Galaxy Note 4 (Android 5.1.1)

Hello, 

I've verified this issue and it is no longer reproducible on any of the devices I've tested on. Marking as verified.
Status: RESOLVED → VERIFIED
Please nominate this for Beta approval when you get a chance.
Flags: needinfo?(aosmond)
@Ryan: Bug 1391430 must be fixed first, though.
(In reply to Jan Henning [:JanH] from comment #19)
> @Ryan: Bug 1391430 must be fixed first, though.

Bug 1391430 is now fixed. Can we get a beta approval request please?
Comment on attachment 8897512 [details] [diff] [review]
Animated PNGs should allocate after-first frames on the heap instead of as volatile buffers on Android., v2

Approval Request Comment
[Feature/Bug causing the regression]: Bug 962670
[User impact if declined]: We may run out of file handles on Android. This will break many things, showing up as a hang or a crash.
[Is this code covered by automated tests?]: Yes.
[Has the fix been verified in Nightly?]: Yes.
[Needs manual test from QE? If yes, steps to reproduce]: No.
[List of other uplifts needed for the feature/fix]: Bug 1391430
[Is the change risky?]: Combined with the fix for the regression in bug 1391430, no.
[Why is the change risky/not risky?]: We are simply swapping one buffer backed surface type for another. It stores the same data, is the same size, etc. Volatile buffer (which is no longer used) just has some additional special properties in that it can get evicted -- if anything, the new change makes it more consistent. There was a regression but that is because I failed to realize there was a layout change for the new surface type I used -- the accompanying fix uses a surface with the same layout.
[String changes made/needed]: None.
Flags: needinfo?(aosmond)
Attachment #8897512 - Flags: approval-mozilla-beta?
Comment on attachment 8897512 [details] [diff] [review]
Animated PNGs should allocate after-first frames on the heap instead of as volatile buffers on Android., v2

Avoid the crash and was verified. Beta56+.
Attachment #8897512 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.