OOM (too many open files) crash on https://www.mediamarkt.at/de/shop/abverkauf-markt.html

RESOLVED FIXED in Firefox 60

Status


defect
P3
normal
RESOLVED FIXED
2 years ago
Last year

People

(Reporter: ma.schroeder, Assigned: aosmond)

Tracking

57 Branch
mozilla60
Unspecified
Android
Points:
---

Firefox Tracking Flags

(firefox59 wontfix, firefox60 fixed)

Details

(Whiteboard: [gfx-noted])

Attachments

(3 attachments, 2 obsolete attachments)

User Agent: Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:57.0) Gecko/20100101 Firefox/57.0
Build ID: 20171130114045

Steps to reproduce:

Open this page: "https://www.mediamarkt.at/de/shop/abverkauf-markt.html" and let it load.
At a certain point before the site is fully loaded, Firefox will crash.
I opened the page from bookmarks, as a link via Telegram, and as a link from a Google search within the browser, always with the same result: Firefox crashes without any warning.

Tested and reproduced on two different devices (OnePlus 5T with Android 7.1.1 and Fairphone 2 with Android 6, both on stock firmware), both running FF57 for Android without extensions (except the H264 extension that is installed automatically).
Also tested on Chrome 63 on Android 7 and Firefox 57 (x64) on Fedora 27 without encountering any errors.


Actual results:

The page starts to load normally, but once enough of it has loaded, Firefox crashes without any warning or notice.
The app just gets closed; no alert is shown, neither from Firefox nor from Android itself. On starting the app again, recently open tabs are not lost and are reopened, but no message about a previous error or crash is shown either.


Expected results:

The site should load without Firefox crashing. If it is an error on the site that Firefox can't prevent, an error message should be shown (as is the case with, e.g., scripts that run for too long or are unresponsive).
Thanks for the report!

I was able to reproduce this with a Nexus 5 (Android 6.0.1) on the latest Nightly build (01/02). In about:crashes I found this report: https://crash-stats.mozilla.com/report/index/b20a2d88-23e3-4050-99fb-f90e20180103.
Marking the bug as New.
Status: UNCONFIRMED → NEW
Ever confirmed: true
It's an OOM crash (or some other resource exhaustion) by the looks of it, which I guess explains why the Crash Reporter might fail to appear. So the question is what on that page makes us consume so much memory?
OS: Unspecified → Android
Summary: Crash of the app without notice on one certain site → OOM crash on https://www.mediamarkt.at/de/shop/abverkauf-markt.html
Hi Joe, Wesly
Please help prioritize this.
Flags: needinfo?(wehuang)
Flags: needinfo?(jcheng)
Looking at a logcat, I'm seeing entries like
> 01-03 20:38:01.955 31883-31913/org.mozilla.fennec_jan W/Adreno-EGLSUB: <SwapBuffers:1339>: gsl_device_3d_add_fence_event failed
> 01-03 20:38:01.955 31883-31913/org.mozilla.fennec_jan W/Adreno-EGL: <qeglDrvAPI_eglSwapBuffers:3904>: EGL_BAD_SURFACE
> 01-03 20:38:01.972 31883-31913/org.mozilla.fennec_jan W/Adreno-GSL: <gsl_ldd_control:475>: ioctl fd 46 code 0xc0140933 (IOCTL_KGSL_TIMESTAMP_EVENT) failed: errno 24 Too many open files
> 01-03 20:38:01.972 31883-31913/org.mozilla.fennec_jan W/Adreno-GSL: <ioctl_kgsl_syncobj_create:2979>: (2f, b, 16446) fail 24 Too many open files

and 
> 01-03 20:38:05.042 31883-31905/org.mozilla.fennec_jan W/art: Large object allocation failed: ashmem_create_region failed for 'large object space allocation': Too many open files
> 01-03 20:38:05.042 31883-31905/org.mozilla.fennec_jan W/art: Throwing OutOfMemoryError "Failed to allocate a 262860 byte allocation with 9353674 free bytes and 82MB until OOM"

This might be something for the platform team?
Flags: needinfo?(snorp)
Summary: OOM crash on https://www.mediamarkt.at/de/shop/abverkauf-markt.html → OOM (too many open files) crash on https://www.mediamarkt.at/de/shop/abverkauf-markt.html
Oh, and I did watch the memory profiler (Android Studio now finally shows all memory usage there, not just the Java heap) while the page was loading, and while memory usage does jump up shortly before crashing, it seems nowhere near real OOM levels, so I suppose whatever is taking up all those file descriptors is indeed the real problem.
My gut feeling here is that we're either creating too many layers (and therefore shmem regions, each of which needs an fd) or this is related to volatile image stuff.
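To make the "file descriptors, not memory" diagnosis concrete, here is a minimal sketch of checking a process's descriptor headroom. It is illustrative only and not part of Firefox: the helper names `open_fd_count` and `fd_headroom` are made up for this example, and it assumes a Unix-like system that exposes `/proc/self/fd` or `/dev/fd`.

```python
import os
import resource


def open_fd_count():
    """Count file descriptors currently open in this process.

    Listing the directory briefly opens one descriptor itself, so the
    result can be one higher than the steady-state count.
    """
    for fd_dir in ("/proc/self/fd", "/dev/fd"):
        try:
            return len(os.listdir(fd_dir))
        except OSError:
            continue
    raise RuntimeError("no fd directory available on this platform")


def fd_headroom():
    """Return (used, soft_limit, remaining) for the current process.

    remaining is None when the soft limit is unlimited.
    """
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    used = open_fd_count()
    if soft_limit == resource.RLIM_INFINITY:
        return used, soft_limit, None
    return used, soft_limit, soft_limit - used


if __name__ == "__main__":
    used, limit, remaining = fd_headroom()
    print(f"open fds: {used}, soft limit: {limit}, remaining: {remaining}")
```

When `remaining` approaches zero, any allocation that needs a new descriptor fails with errno 24 (EMFILE, "Too many open files"), even if plenty of memory is still free, which would match the logcat entries above.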
Component: General → Graphics: Layers
Flags: needinfo?(snorp)
Product: Firefox for Android → Core
Version: Firefox 57 → 57 Branch
Flags: needinfo?(wehuang)
e10s isn't enabled on Android yet, so it seems unlikely that it is shmems for layers. However, it is possible that it is images; I will investigate. If it is imagelib, we could consider putting smaller images into non-volatile buffers on Android, since the benefit of volatile buffers is minimal compared to the risk of exhausting file handles. We already avoid volatile buffers for animated frames.
Flags: needinfo?(jcheng) → needinfo?(aosmond)
Priority: -- → P3
Whiteboard: [gfx-noted]
Crashes on my Android phone. Tested on desktop by changing my user agent. According to the memory report, the page stores ~2000 images, all of which pass the threshold to go into volatile memory instead of the heap. That easily blows the 1024 file descriptor budget that adb shell suggests we have. Most of them are small though, < 200kB; I think we can probably safely say anything under 256kB is not worth putting in volatile memory. That's an easy enough change.
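The arithmetic behind that can be sketched as follows. This is a rough model, not the imagelib code: it assumes each volatile buffer costs one file descriptor, uses a hypothetical workload of 2000 images of 150 kB each, and models the old behaviour as a threshold of 0 bytes (every image goes volatile), since the actual previous threshold isn't stated here.

```python
VOLATILE_MIN_BYTES = 256 * 1024  # threshold proposed in the comment above
FD_BUDGET = 1024                 # per-process limit reported via adb shell


def uses_volatile_buffer(size_bytes, min_volatile_bytes=VOLATILE_MIN_BYTES):
    """Images below the threshold stay on the heap and need no fd."""
    return size_bytes >= min_volatile_bytes


def fds_needed(image_sizes, min_volatile_bytes=VOLATILE_MIN_BYTES):
    """Assumption: one file descriptor per volatile buffer."""
    return sum(
        1 for size in image_sizes
        if uses_volatile_buffer(size, min_volatile_bytes)
    )


# Hypothetical workload: 2000 decoded images of 150 kB each.
sizes = [150 * 1024] * 2000

old_policy = fds_needed(sizes, min_volatile_bytes=0)  # everything volatile
new_policy = fds_needed(sizes)                        # 256 kB threshold

print(f"old policy: {old_policy} fds (budget {FD_BUDGET})")
print(f"new policy: {new_policy} fds (budget {FD_BUDGET})")
```

Under these assumptions the old policy needs about twice the descriptor budget for images alone, while the 256kB threshold keeps all of these small images on the heap.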
Assignee: nobody → aosmond
Status: NEW → ASSIGNED
Component: Graphics: Layers → ImageLib
Flags: needinfo?(aosmond)
Fix condition inversion.
Attachment #8952826 - Attachment is obsolete: true
Attachment #8952826 - Flags: review?(tnikkel)
Attachment #8952827 - Flags: review?(tnikkel)
Attachment #8952825 - Flags: review?(tnikkel) → review+
Attachment #8952827 - Flags: review?(tnikkel) → review+
Grrr, reftest failure on Android due to the Factory::AllowedSurfaceSize check failing; I forgot about that. We shouldn't be checking that to begin with for the image surfaces. Solution is to use SourceSurfaceAlignedRawData directly.

try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5ddd94cf79a8b216a28908402fb0b84cf502d8d7

(And here I spent time today on making a fancy "imgFrame allocation fails, retry after discarding surfaces from cache" workaround patch. Maybe it can land as a standalone.)
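For readers unfamiliar with the retry idea, the general pattern looks roughly like the sketch below. It is not the actual workaround patch: `SurfaceCache`, `allocate_with_retry`, and the `allocate` callback are hypothetical names used only to illustrate "allocation fails, discard cached surfaces, try once more".

```python
import errno


class SurfaceCache:
    """Hypothetical stand-in for a cache of decoded image surfaces."""

    def __init__(self):
        self._surfaces = []

    def add(self, surface):
        self._surfaces.append(surface)

    def discard_all(self):
        """Release every cached surface; return how many were freed."""
        freed = len(self._surfaces)
        for surface in self._surfaces:
            surface.close()  # assumed to release the surface's descriptor
        self._surfaces.clear()
        return freed


def allocate_with_retry(allocate, cache):
    """Call allocate(); on EMFILE, empty the cache and retry once."""
    try:
        return allocate()
    except OSError as err:
        if err.errno != errno.EMFILE:
            raise
        if cache.discard_all() == 0:
            raise  # nothing to free, so a retry cannot succeed
        return allocate()
```

A single retry keeps the failure path bounded: if the second attempt also fails, the error propagates to the caller unchanged.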
Attachment #8952825 - Attachment is obsolete: true
Attachment #8953189 - Flags: review+
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/99fdbfa4eb21
Part 1. Add preferences to control image frame allocations in volatile memory or the heap. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/6b4514506318
Part 2. Fix misleading image memory reporting on Android. r=tnikkel
https://hg.mozilla.org/mozilla-central/rev/99fdbfa4eb21
https://hg.mozilla.org/mozilla-central/rev/6b4514506318
Status: ASSIGNED → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
Since we're getting close to the 59 release date, and we don't have any evidence this is a commonly seen crash, wontfix for 59.