Develop a solution to avoid OOMing large contiguous allocations due to address space fragmentation on Firefox for Android

NEW
Unassigned

Firefox for Android
General
--
major
a year ago
a year ago

People

(Reporter: Jukka Jylänki, Unassigned)

(Reporter)

Description

a year ago
Asm.js and wasm applications fundamentally require large contiguous typed array allocations to run, as part of the memory load and store security model. With current applications in the wild, we know that in 32-bit browsers satisfying these large allocations can be unreliable even for moderately small heap sizes (a ~128MB allocation fails for some 5-7% of all visitors), and as a result sites can bleed visitors.
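
To make the failure mode concrete, here is a minimal TypeScript sketch of the kind of allocation such a page performs at startup. The helper name `allocateHeap`, the `Allocator` type and the candidate sizes are hypothetical and only for illustration; the one platform behavior relied on is that `new ArrayBuffer(n)` throws a `RangeError` when the engine cannot provide the requested block.

```typescript
// Injectable allocator so the failure path can be exercised without
// actually exhausting memory. Defaults to a real ArrayBuffer allocation.
type Allocator = (bytes: number) => ArrayBuffer;

const MB = 1024 * 1024;

// Hypothetical helper: try each candidate heap size (in MB) in the order
// given and return the first buffer that could be allocated.
function allocateHeap(
  candidateSizesMB: number[],
  allocate: Allocator = (bytes) => new ArrayBuffer(bytes),
): ArrayBuffer | null {
  for (const sizeMB of candidateSizesMB) {
    try {
      return allocate(sizeMB * MB);
    } catch (e) {
      // A RangeError means a block of this size was not available;
      // anything else is unexpected and is rethrown.
      if (!(e instanceof RangeError)) throw e;
    }
  }
  return null; // every candidate failed, so the page cannot start
}
```

A page would call something like `allocateHeap([128, 64])`. Whether a smaller fallback heap is usable at all depends on how the application was built.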

In bug 1266389 we identified a good test suite which reliably reproduces the issue: the suite is unable to run to completion on 32-bit Firefox and needed 64-bit Firefox instead.

In bugs 1277066 and 1304140 a new HTTP response header was implemented to allow pages to flag that they need a large memory allocation, so the browser can allocate a fresh process for those pages.
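
As a sketch of how a site might opt in, the snippet below adds such a header to a set of response headers. It assumes the header is `Large-Allocation`, with a value giving the expected allocation size in megabytes and `0` meaning the size is not known; this description does not name the header, so treat both the name and the value format as assumptions. `withLargeAllocation` is a hypothetical helper.

```typescript
// Hypothetical helper: return a copy of `headers` with the opt-in header added.
// Assumption: header name `Large-Allocation`, value = expected size in MB,
// where 0 means "size unknown".
function withLargeAllocation(
  headers: Record<string, string>,
  expectedHeapMB: number = 0,
): Record<string, string> {
  if (!Number.isInteger(expectedHeapMB) || expectedHeapMB < 0) {
    throw new RangeError("expectedHeapMB must be a non-negative integer");
  }
  return { ...headers, "Large-Allocation": String(expectedHeapMB) };
}

// Example: headers for the top-level document of a page with a 128MB heap.
const responseHeaders = withLargeAllocation({ "Content-Type": "text/html" }, 128);
```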

In bugs 1266393 and 1314098 it was verified that this proposed scheme will be an effective fix for desktop Firefox, which has E10S.

However, on Android we don't have E10S, so the proposed solution does not currently work. This prevents WebAssembly applications from reliably running on Firefox for Android.

What can we do to develop a reliable fix for address space fragmentation for Android? Can we do multiple content processes there at some point (is there an E10S for Android rollout plan?), or is there something else that could be done there?

Comment 1

This is a tricky one. One might suppose headless Gecko is a solution, except for concurrent access to the profile and all the nonsense about tab switching and such. Paging the relevant folks!

It's worth noting that 128MB is pretty much what the entire rest of the browser takes, so perhaps that's already an unrealistic workload for phones …

Comment 2

We're doing some e10s-related work on Android that could mitigate this, such as having a separate GPU process.
(Reporter)

Comment 3

a year ago
Ah, this thought comes up often when discussing this topic, so good to clarify:

The issue is that with address space fragmentation we can have a scenario where an asm.js/wasm page fails to load up in an existing browser instance (even with all previous pages closed), but when visiting the page as the first page fresh after a browser restart, the page is happily able to load up.

That is, because of fragmentation, even on a high-end phone that has 4GB of RAM, and with no other Android apps open, Firefox can get into a state where it is able to use only a small fraction of the total memory on the phone, since the allocation needs to be contiguous.
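
A toy model of this effect, as a TypeScript sketch: a handful of small long-lived allocations scattered across an address space leave almost all of it free in total, yet no single gap is large enough for a 128MB heap. The address space size, the allocation layout and the names `Range` and `freeSpace` are illustrative assumptions, not measurements or code from Firefox.

```typescript
// A half-open address range [start, end), expressed in MB for readability.
interface Range { start: number; end: number; }

// Given the size of an address space and the live allocations inside it,
// return the total free space and the largest contiguous free gap.
function freeSpace(spaceMB: number, live: Range[]): { total: number; largest: number } {
  const sorted = [...live].sort((a, b) => a.start - b.start);
  let cursor = 0; // end of the highest allocation seen so far
  let total = 0;
  let largest = 0;
  for (const r of sorted) {
    const gap = r.start - cursor;
    if (gap > 0) {
      total += gap;
      largest = Math.max(largest, gap);
    }
    cursor = Math.max(cursor, r.end);
  }
  const tail = spaceMB - cursor; // free space after the last allocation
  if (tail > 0) {
    total += tail;
    largest = Math.max(largest, tail);
  }
  return { total, largest };
}

// Twenty 1MB allocations, one every 100MB, in a 2048MB address space.
const scattered: Range[] = [];
for (let i = 1; i <= 20; i++) {
  scattered.push({ start: 100 * i, end: 100 * i + 1 });
}

const result = freeSpace(2048, scattered);
// result is { total: 2028, largest: 100 }: 2028MB is free overall,
// but a contiguous 128MB request cannot be satisfied.
const heapFits = result.largest >= 128; // false
```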

I'm currently testing asm.js & WebAssembly applications on a Samsung Galaxy S7 Edge, which has 4GB of physical RAM, using a suite of real-world asm.js and wasm applications that have shipped on the web. The current state is that at the start of the first test, Firefox is down to only 512MB of available memory out of that 4GB, and after going through each test, the amount of contiguous memory keeps going down until there's not enough memory to allocate for the asm.js/wasm heap. Numbers from a sample run of the suite, showing the total amount of free contiguous memory before each page visit, look as follows:

Fresh boot before 1st test: 512MB
Before 2nd test: 320MB
Before 3rd test: 256MB
Before 4th test: 160MB
Before 5th test: 192MB
Before 6th test: 128MB
Before 7th test: 112MB

and after that all the subsequent tests fail because they need more contiguous memory than is available. Closing all browser tabs does not help (the suite only ever has one tab open), but killing the Firefox Android activity and restarting it does get one back to that ~512MB state for one or two pages, after which the memory gets fragmented again.

In summary, on a phone with 4GB of RAM, it is possible that after navigating only ~7 pages or so, less than 128MB of the whole 4GB of RAM is accessible to wasm pages (contiguously available in an unfragmented state). This is quite inefficient and could put the WebAssembly architecture at risk on mobile, which is why this is such a critical issue to solve if WebAssembly pages are to run reliably on mobile devices.

(Btw, seeing only 512MB of memory available out of 4GB physical immediately at boot is a bit odd; that aspect is being discussed in bug 1324574.)