Closed Bug 1277066 Opened 8 years ago Closed 7 years ago

Workaround for OOM issues with large ArrayBuffer allocations on 32-bit Firefox

Categories

(Core :: DOM: Content Processes, defect)

defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 1331083

People

(Reporter: nika, Assigned: nika)

References

Details

(Whiteboard: btpp-active)

Attachments

(7 obsolete files)

Whiteboard: btpp-active
The big blocking question about the implementation of these iframes right now is how we want to implement them (shocker).

The WIP implementation which I have on that github branch creates the iframe as a nested content process within the primary content process. This means that the TabParent object for the iframe's content process actually lives within a content process.

AFAICT there is a decent amount of code which expects the TabParent object to live within the parent process, which is unfortunate. I'm not sure what the best way around this problem would be. My current thoughts are to either somehow create the tab parent within the parent process, and work around sending the events between the content processes with a different mechanism, or to try to forward the events which need to be in the parent process up from the content iframe.

Thinker, you mentioned that b2g didn't use directly nested iframes like this but instead would use other services to ask the parent process to make the app frame for them. Is there infrastructure for doing something like that already in the tree somewhere for me to look at?
Flags: needinfo?(tlee)
I might be missing some context here... But the plan was to put connected windows in the same process (iframes and windows from window.open). How do you plan to not break JS with this work? Especially in the same origin case, but even in the cross origin case there are sync JS calls that can be done. The only way that part can work is using CPOWS which we explicitly not want to.

Also, what is the main motivation of this project? The number of content processes will be quite limited. We won't be able to spawn one process for each tab or each domain... So why are we trying to do this? Is there some special use case here like running iframes of adds in their own process maybe?
Flags: needinfo?(michael)
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #2)
> I might be missing some context here... But the plan was to put connected
> windows in the same process (iframes and windows from window.open). How do
> you plan to not break JS with this work? Especially in the same origin case,
> but even in the cross origin case there are sync JS calls that can be done.
> The only way that part can work is using CPOWS which we explicitly not want
> to.
> 
> Also, what is the main motivation of this project? The number of content
> processes will be quite limited. We won't be able to spawn one process for
> each tab or each domain... So why are we trying to do this? Is there some
> special use case here like running iframes of adds in their own process
> maybe?

The goal of this patch is to provide a tool for game developers to avoid fragmented memory spaces on 32-bit windows when running their games. The idea is that if they run their games in an iframe explicitly marked as out of process with an attribute, then that iframe will run out of process. We plan to require that any iframe which has this attribute set has some sandbox flags set to avoid the problems related to same origin communication.

With regard to the number of content processes. Due to the intention of this patch, the content processes which are used for this are intended to be from a different process pool, and not re-used, such that a non-fragmented memory space is always available on initial load.
Flags: needinfo?(michael)
(In reply to Michael Layzell [:mystor] from comment #3)
> The goal of this patch is to provide a tool for game developers to avoid
> fragmented memory spaces on 32-bit windows when running their games.

Sounds great, thanks for the explanation!
Blocks: e10s-multi
(In reply to Michael Layzell [:mystor] from comment #1)
> Thinker, you mentioned that b2g didn't use directly nested iframes like this
> but instead would use other services to ask the parent process to make the
> app frame for them. Is there infrastructure for doing something like that
> already in the tree somewhere for me to look at?

For b2g, APPs call web activity to request the parent process.  I have heard people trying to remove web activity from Gecko, I am not sure current status of that.
Flags: needinfo?(tlee)
Summary: Implement Out of Process iframes → Workaround for OOM issues with large ArrayBuffer allocations on 32-bit Firefox
The strategy for working around the OOM issues with large ArrayBuffer allocations on 32-bit firefox has changed from out of process iframes to a toplevel navigate to new process and allocate strategy. I'll post more details about the design direction with early prototype patches.
Still to do:

- Try run (in progress: https://treeherder.mozilla.org/#/jobs?repo=try&revision=26ec2096046e09e679579aa1fc8f28eee8e74025)
- (a lot) more testing
  - Does this work nicely with back-forwards?
  - Does this actually allow developers to get large allocations more reliably?
- Decide whether or not you should be booted back to the main process if you navigate away from the loaded page
- Print the diagnostics to the console instead of using NS_WARNING
- Many more things I probably forgot
Attachment #8773430 - Attachment is obsolete: true
Attachment #8773431 - Attachment is obsolete: true
With this updated patch you are booted into a normal content process upon navigating away from a "fresh process" page, and reloads etc. of the page trigger new processes.

I also have (I believe) fixed back-forwards (from manual testing), but I haven't written the tests for that yet.

Diagnostics are now printed to the console, but not necessarily usefully.
Luke, can you look at this and see if the behavior is close to what we were aiming for?
Flags: needinfo?(luke)
TBH, I'm not familiar with any of this code so I can't really get a sense of what is going on.  I'm happy to read a prose explanation of what's going on, though :)
Flags: needinfo?(luke)
Hey, absolutely fantastic to see this already going full steam Michael!

I tried the try build with these patches on the emunittest test suite that I have been piecing together. You can run it with:

1. Visit https://s3.amazonaws.com/mozilla-games/emunittest-public/index.html
2. Scroll at the end of the page and click on "Run tests"
3. After the first test finishes, click on the banner at the top of the page to allow all popups on the site
4. Wait for the tests to complete (21 * ~30 seconds)
5. One the tests have completed, there will be no more tabs spawning, scroll to the end of the page to view the results log.

I added the <meta> tags to all asm.js tests in emunittest-public this morning, so this should be good to test. If you would like to download the suite for local offline execution without having to wait for downloads, you can fetch the whole suite as a zip file at

https://s3.amazonaws.com/mozilla-games/emunittest-public/emunittest-public-1.0.zip (warning: 922 MB download)

The try build has some kind of issue that it looks like the spawned test tab is unable to post results back up to the main window. The tab uses "opener.postMessage()" to post the results back up. Search for 'window.opener.postMessage(testResults, "*");' in https://s3.amazonaws.com/mozilla-games/emunittest-public/emtimer.js to find the code site. If I interpret correctly, it looks like the postMessage never gets delivered(?), or perhaps "window.opener" is null(?).

A second test case is at

1. Visit https://www.openwebgames.com
2. At the bottom, click on "Run the test"
3. Scroll down a bit, and at the right side, click on "Run test" and then "Continue"
4. Wait for the tests to finish

These two suites share some of the same demos.

Will Firefox for Android need some specific/separate code path, or do you think this will cover Firefox for Android case on the same go?
(In reply to Jukka Jylänki from comment #18)
> The try build has some kind of issue that it looks like the spawned test tab
> is unable to post results back up to the main window. The tab uses
> "opener.postMessage()" to post the results back up. Search for
> 'window.opener.postMessage(testResults, "*");' in
> https://s3.amazonaws.com/mozilla-games/emunittest-public/emtimer.js to find
> the code site. If I interpret correctly, it looks like the postMessage never
> gets delivered(?), or perhaps "window.opener" is null(?).

The <meta> tag has the side effect that it causes windows to be disconnected from any other windows they are currently connected to. This means that the window.opener will be a loopback window (as in, a reference to `window`).

To communicate back, we'll probably have to use some other form of communication which doesn't rely on direct window-to-window connections, and works asynchronously across processes. I don't think that we intend to make this <meta> tag support communicating through window.opener.

The <meta> tag also does not function within iframes, for similar reasons.

> Will Firefox for Android need some specific/separate code path, or do you
> think this will cover Firefox for Android case on the same go?

Firefox for Android doesn't render its content out of process with e10s as far as I know. This patch will probably have absolutely no impact on Firefox for Android, as the hint will be ignored due to the lack of e10s support.
Given that window.opener.postMessage() is asynchronous, couldn't that route through an IPC pipe to the parent process? (in the suite, I don't need to access anything else on window.opener object except do the postMessage)

Alternatively, I'd be happy to refactor the suite, but I don't know what to do instead?

Good points regarding Android. Thoughts:
 a) if Android did have e10s, would this patch then directly work?
 b) have there been discussions about developing e10s on Android as well? (or has that been deemed undesirable for some reason?)
(In reply to Jukka Jylänki from comment #20)
> Given that window.opener.postMessage() is asynchronous, couldn't that route
> through an IPC pipe to the parent process? (in the suite, I don't need to
> access anything else on window.opener object except do the postMessage)

The problem isn't that it is not possible, it's just it's not implemented, and it's not trivial to implement with the current way the reload is set up. The nsPIDOMWindowOuter object which you are holding onto in the original process would need to be modified to understand that its window has left the process and to proxy these requests for postMessage across that process boundary. It's not trivial to implement, but I can look into it if we deem cross-process communication with postMessage necessary. 

> Good points regarding Android. Thoughts:
>  a) if Android did have e10s, would this patch then directly work?
>  b) have there been discussions about developing e10s on Android as well?
> (or has that been deemed undesirable for some reason?)

I don't know enough about e10s-in-fennec to be able to give a useful answer. Andrew, do you know who would know what the plans are around that?
Flags: needinfo?(overholt)
(In reply to Michael Layzell [:mystor] from comment #21)
> (In reply to Jukka Jylänki from comment #20)
> >  b) have there been discussions about developing e10s on Android as well?
> > (or has that been deemed undesirable for some reason?)
> 
> I don't know enough about e10s-in-fennec to be able to give a useful answer.
> Andrew, do you know who would know what the plans are around that?

snorp knows.
Flags: needinfo?(overholt) → needinfo?(snorp)
We actually going to work on making e10s work in Fennec (or rather, GeckoView) in H2 this year.
Flags: needinfo?(snorp)
Thanks for the answers!

(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #23)
> We actually going to work on making e10s work in Fennec (or rather,
> GeckoView) in H2 this year.

Sorry, I'm not completely familiar with distinction. Is GeckoView part of Firefox for Android, so after this is done, FF for Android users would be running with content process separation as well? Is there a bug number to follow for this work?

 (In reply to Michael Layzell [:mystor] from comment #21)
> The problem isn't that it is not possible, it's just it's not implemented,
> and it's not trivial to implement with the current way the reload is set up.
> The nsPIDOMWindowOuter object which you are holding onto in the original
> process would need to be modified to understand that its window has left the
> process and to proxy these requests for postMessage across that process
> boundary. It's not trivial to implement, but I can look into it if we deem
> cross-process communication with postMessage necessary. 

Thanks, marked down bug 1294017 to focus on that discussion after this bug lands. Also marked down bug 1294023 to focus on the documentation side, we'll need to be able to educate the feature to asm.js/wasm developers when they start planning to use this. (and earlier, when discussing standardization/cross-browser related aspects)
Blocks: 1266389
It looks like we're going to try to take a path fairly similar to this one. I'm working on cleaning up the hack in my code so that we can land something like this behind a pref and start experimenting more with it.

Next step looks like it will be starting to proxy the nsGlobalWindow calls across the process boundary to recover APIs like window.postMessage
Depends on: 1299269
Depends on: 1267339, 1222516, 1304140, 1304141
No longer depends on: 1299269
Depends on: 1306458
Depends on: 1314098
No longer depends on: 1304141
(In reply to Jukka Jylänki from comment #24)
> Thanks for the answers!
> 
> (In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #23)
> > We actually going to work on making e10s work in Fennec (or rather,
> > GeckoView) in H2 this year.
> 
> Sorry, I'm not completely familiar with distinction. Is GeckoView part of
> Firefox for Android, so after this is done, FF for Android users would be
> running with content process separation as well? Is there a bug number to
> follow for this work?

No, GeckoView is a separate library intended for developers to embed Gecko into their Android app. In the long-term, this is also what Fennec would use, but that's quite a ways off.

There isn't a bug # for this, but GV is already using e10s in trunk, and e10s generally works on Android. Fennec, however, has a number of frontend-related problems presenting it from using e10s today.
Attachment #8775755 - Attachment is obsolete: true
Attachment #8775756 - Attachment is obsolete: true
Attachment #8775757 - Attachment is obsolete: true
Attachment #8775758 - Attachment is obsolete: true
Attachment #8776115 - Attachment is obsolete: true
It seems that we now have the desktop side solution well in sight. Marked down bug 1323639 to continue the discussion on the issue specificslly on Android side.
Marking as duplicate of the shipping issue.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: