Last Comment Bug 714861 - [B2G] Implement <iframe mozbrowser> as a separate proccess
: [B2G] Implement <iframe mozbrowser> as a separate proccess
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: Document Navigation (show other bugs)
: unspecified
: ARM Linux
: -- normal (vote)
: mozilla15
Assigned To: Justin Lebar (not reading bugmail)
:
:
Mentors:
Depends on:
Blocks: b2g-demo-phone b2g-e10s-work 749018
  Show dependency treegraph
 
Reported: 2012-01-03 10:18 PST by Justin Lebar (not reading bugmail)
Modified: 2013-05-13 04:26 PDT (History)
19 users (show)
justin.lebar+bug: in‑testsuite-
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
WIP v1 (see comment 9) (9.84 KB, patch)
2012-01-30 17:57 PST, Justin Lebar (not reading bugmail)
no flags Details | Diff | Splinter Review
WIP v2 (19.67 KB, patch)
2012-02-06 09:38 PST, Justin Lebar (not reading bugmail)
no flags Details | Diff | Splinter Review
WIP v3 (25.45 KB, patch)
2012-02-10 10:48 PST, Justin Lebar (not reading bugmail)
no flags Details | Diff | Splinter Review
WIP v3 (rebased to tip) (21.80 KB, patch)
2012-04-07 07:14 PDT, Justin Lebar (not reading bugmail)
no flags Details | Diff | Splinter Review
Testcase 1 (428 bytes, text/html)
2012-04-07 07:22 PDT, Justin Lebar (not reading bugmail)
no flags Details
Testcase 2 (227 bytes, text/html)
2012-04-07 07:23 PDT, Justin Lebar (not reading bugmail)
no flags Details
Gaia testcase (4.52 KB, patch)
2012-04-09 22:36 PDT, Justin Lebar (not reading bugmail)
no flags Details | Diff | Splinter Review
gets further along on device (23.63 KB, patch)
2012-04-10 10:00 PDT, Chris Jones [:cjones] inactive; ni?/f?/r? if you need me
no flags Details | Diff | Splinter Review
Part 1: content/layout changes. (13.48 KB, patch)
2012-04-10 10:53 PDT, Justin Lebar (not reading bugmail)
no flags Details | Diff | Splinter Review
Part 2: ipc changes. (6.61 KB, patch)
2012-04-10 10:54 PDT, Justin Lebar (not reading bugmail)
no flags Details | Diff | Splinter Review
Interdiff part 1, v1 --> v2 (2.85 KB, patch)
2012-04-11 20:31 PDT, Justin Lebar (not reading bugmail)
no flags Details | Diff | Splinter Review
Part 1, v2 - content/layout changes (13.63 KB, patch)
2012-04-11 20:33 PDT, Justin Lebar (not reading bugmail)
no flags Details | Diff | Splinter Review
Part 1 v3 - content changes (12.75 KB, patch)
2012-04-12 07:20 PDT, Justin Lebar (not reading bugmail)
bzbarsky: review+
Details | Diff | Splinter Review
Part 2, v2 - IPC changes (8.24 KB, patch)
2012-04-12 07:22 PDT, Justin Lebar (not reading bugmail)
no flags Details | Diff | Splinter Review
Part 2, v2.1 - IPC changes (8.16 KB, patch)
2012-04-12 07:28 PDT, Justin Lebar (not reading bugmail)
cjones.bugs: review+
Details | Diff | Splinter Review
Roll-up patch, v4 (17.89 KB, patch)
2012-04-19 16:10 PDT, Justin Lebar (not reading bugmail)
bugs: review+
Details | Diff | Splinter Review
Part 3: Don't leak the layer manager in IPC code. (2.51 KB, patch)
2012-04-23 08:37 PDT, Justin Lebar (not reading bugmail)
cjones.bugs: review+
Details | Diff | Splinter Review

Description Justin Lebar (not reading bugmail) 2012-01-03 10:18:19 PST
We want the b2g browser to be separate from the main gecko process.

We also want a bunch of properties for <iframe mozbrowser> (history isolation, window.top isolation, etc.) as tracked in bug 693515.  We'll also need most of these properties for b2g apps.

It seems like the simplest thing which could possibly work is to always run <iframe mozbrowser> in a content process.  That should get us most of bug 693515 for free, aiui.

If it later turns out that we really want <iframe mozbrowser> without content processes, then we can make that work.  But since we're pretty sure we want an out-of-process browser, I think we should do that first.

cjones, I'm happy to do this, but I don't know where to start.
Comment 1 Benjamin Smedberg [:bsmedberg] 2012-01-03 10:40:41 PST
How is this different from <xul:iframe> or the core bits of <xul:browser>? Can we just use those with remote="true"?
Comment 2 Justin Lebar (not reading bugmail) 2012-01-03 10:50:04 PST
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #1)
> How is this different from <xul:iframe> or the core bits of <xul:browser>?

It's basically the same, except that the parent is a content HTML window, not a chrome XUL window.

> Can we just use those with remote="true"?

Maybe?
Comment 3 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-01-03 14:18:03 PST
We want to extend the security checks here

http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsFrameLoader.cpp#1768

and a case where we look for "mozbrowser".  Everything should magically work after that (except that it won't!).  If/when it doesn't, we need to file bugs for implementing in gecko what the fennec frontend currently does.
Comment 4 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-01-03 14:18:53 PST
Although considering that touch events don't work well with the current browser, maybe everything will Just Work to the current level of functionality.
Comment 5 Benjamin Smedberg [:bsmedberg] 2012-01-04 13:05:28 PST
I guess what I don't understand is why you can't just use xul:iframe type="content", or what differences in behavior you want compared with xul:iframe type="content". Apart from our restriction that you can't use XUL elements from web content (which we could lift for this specific case), it seems easier to reuse the existing code than invent a different element type.
Comment 6 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-01-04 15:04:32 PST
We need to add a lot of new APIs to support the new "browser"-type iframe so there's other motivation for adding it, and we want to standardize those APIs.  I'm happy to reuse as much existing code in gecko as we can, but we can't build on XUL if we want to standardize.
Comment 7 Justin Lebar (not reading bugmail) 2012-01-27 15:13:50 PST
I'm not sure how this works with <xul:iframe type="content">, but we have the following problem with <iframe mozbrowser>:

EnsureFrameLoader is called from BindToTree.  But nsIContent::SetPrimaryFrame is not called until after BindToTree completes.  A remote iframe needs its nsIFrame set (via SetPrimaryFrame) before we can create its TabParent.  So this means that we can't create the TabParent until after SetPrimaryFrame runs, so in particular, we can't create the TabParent during BindToTree.

Is there some event I can bind to for the purposes of calling EnsureFrameLoader later in this cycle?

nsGenericHTMLFrameElement::EnsureFrameLoader
nsGenericHTMLFrameElement::LoadSrc
nsGenericHTMLFrameElement::BindToTree
nsINode::doInsertChildAt
nsGenericElement::InsertChildAt
nsINode::ReplaceOrInsertBefore
nsINode::ReplaceOrInsertBefore
nsINode::InsertBefore
nsINode::AppendChild

nsIContent::SetPrimaryFrame
nsSubDocumentFrame::Init
nsCSSFrameConstructor::InitAndRestoreFrame
nsCSSFrameConstructor::ConstructFrameFromItemInternal
nsCSSFrameConstructor::ConstructFramesFromItem
nsCSSFrameConstructor::ConstructFramesFromItemList
nsCSSFrameConstructor::ContentAppended
PresShell::ContentAppended
nsNodeUtils::ContentAppended
nsINode::doInsertChildAt
nsGenericElement::InsertChildAt
nsINode::ReplaceOrInsertBefore
nsINode::ReplaceOrInsertBefore
nsINode::InsertBefore
nsINode::AppendChild
Comment 8 Justin Lebar (not reading bugmail) 2012-01-27 15:45:21 PST
> Is there some event I can bind to for the purposes of calling EnsureFrameLoader later in this cycle?

I guess I can try calling EnsureFrameLoader() asynchronously.
Comment 9 Justin Lebar (not reading bugmail) 2012-01-30 17:56:24 PST
> I guess I can try calling EnsureFrameLoader() asynchronously.

This doesn't work even without forcing a child process, but I'm not sure why.  The iframe is blank, despite a successful LoadURI('http://mozilla.org') call.

I'll dig in tomorrow, but if anyone knows what's going on off the top of their head, I'd certainly appreciate a pointer.  I've attached my (tiny) patch.
Comment 10 Justin Lebar (not reading bugmail) 2012-01-30 17:57:58 PST
Created attachment 592945 [details] [diff] [review]
WIP v1 (see comment 9)
Comment 11 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-01-30 18:18:55 PST
Are you seeing the page load attempted in the content process?  On what OS are you testing?
Comment 12 Justin Lebar (not reading bugmail) 2012-01-30 18:20:32 PST
Comment 9 is explicitly without content processes.  In-process iframe, just asynchronously creating the frame loader.  I'm on Linux-64.
Comment 13 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-01-30 18:22:22 PST
Ah OK, I didn't parse your first sentence correctly.  This is beyond my ken, sorry :(.

bz, any ideas?
Comment 14 Boris Zbarsky [:bz] (still a bit busy) 2012-01-30 18:29:24 PST
I thought we generally did frame loader init off a script runner nowadays...  In any case, I'd start by trying to use a script runner for your LoadSrc call.
Comment 15 Justin Lebar (not reading bugmail) 2012-01-31 11:48:39 PST
Yay, that worked.  Thanks, bz.
Comment 16 Justin Lebar (not reading bugmail) 2012-02-01 12:54:06 PST
Well...the script runner works for the non-cross-process case.  But the script runner runs before SetPrimaryFrame runs.
Comment 17 Boris Zbarsky [:bz] (still a bit busy) 2012-02-01 13:02:41 PST
The script runner can run before SetPrimaryFrame runs, yes.  Why is that a problem?  We should be able to create frame loaders even when there is no nsIFrame, no?  And they need to survive destruction of the nsIFrame....
Comment 18 Justin Lebar (not reading bugmail) 2012-02-01 13:16:28 PST
Looks like just to get the DPI of the screen?  I'm not sure why I didn't read more than one line of context before now.

void
TabParent::SetOwnerElement(nsIDOMElement* aElement)
{
  mFrameElement = aElement;

  // Cache the DPI of the screen, since we may lose the element/widget later
  if (aElement) {
    nsCOMPtr<nsIWidget> widget = GetWidget();
    NS_ABORT_IF_FALSE(widget, "Non-null OwnerElement must provide a widget!");
    mDPI = widget->GetDPI();
  }
}

###!!! ABORT: Non-null OwnerElement must provide a widget!: 'widget', file ../../../src/dom/ipc/TabParent.cpp, line 111
mozilla::dom::TabParent::SetOwnerElement(nsIDOMElement*) (/home/jlebar/code/moz/ff-git/debug/dom/ipc/../../../src/dom/ipc/TabParent.cpp:112)
nsFrameLoader::TryRemoteBrowser() (/home/jlebar/code/moz/ff-git/debug/content/base/src/../../../../src/content/base/src/nsFrameLoader.cpp:1847)
nsFrameLoader::ReallyStartLoadingInternal() (/home/jlebar/code/moz/ff-git/debug/content/base/src/../../../../src/content/base/src/nsFrameLoader.cpp:456)
nsFrameLoader::ReallyStartLoading() (/home/jlebar/code/moz/ff-git/debug/content/base/src/../../../../src/content/base/src/nsFrameLoader.cpp:434)
nsDocument::MaybeInitializeFinalizeFrameLoaders() (/home/jlebar/code/moz/ff-git/debug/content/base/src/../../../../src/content/base/src/nsDocument.cpp:5470)
nsDocument::EndUpdate(unsigned int) (/home/jlebar/code/moz/ff-git/debug/content/base/src/../../../../src/content/base/src/nsDocument.cpp:3999)
nsHTMLDocument::EndUpdate(unsigned int) (/home/jlebar/code/moz/ff-git/debug/content/html/document/src/../../../../../src/content/html/document/src/nsHTMLDocument.cpp:2295)
~mozAutoDocUpdate (/home/jlebar/code/moz/ff-git/debug/content/xul/templates/src/../../../../../src/content/xul/templates/src/../../../base/src/mozAutoDocUpdate.h:67)
nsINode::ReplaceOrInsertBefore(bool, nsINode*, nsINode*) (/home/jlebar/code/moz/ff-git/debug/content/base/src/../../../../src/content/base/src/nsGenericElement.cpp:4041)
nsINode::ReplaceOrInsertBefore(bool, nsINode*, nsINode*, unsigned int*) (/home/jlebar/code/moz/ff-git/debug/js/xpconnect/src/../../../dist/include/nsINode.h:1344)
nsINode::InsertBefore(nsINode*, nsINode*, unsigned int*) (/home/jlebar/code/moz/ff-git/debug/js/xpconnect/src/../../../dist/include/nsINode.h:473)
nsINode::AppendChild(nsINode*, unsigned int*) (/home/jlebar/code/moz/ff-git/debug/js/xpconnect/src/../../../dist/include/nsINode.h:483)
nsIDOMNode_AppendChild (/home/jlebar/code/moz/ff-git/debug/js/xpconnect/src/dom_quickstubs.cpp:5480)
js::CallJSNative(JSContext*, int (*)(JSContext*, unsigned int, JS::Value*), js::CallArgs const&) (/home/jlebar/code/moz/ff-git/debug/js/src/../../../src/js/src/jscntxtinlines.h:311)
js::InvokeKernel(JSContext*, js::CallArgs, js::MaybeConstruct) (/home/jlebar/code/moz/ff-git/debug/js/src/../../../src/js/src/jsinterp.cpp:519)
js::Interpret(JSContext*, js::StackFrame*, js::InterpMode) (/home/jlebar/code/moz/ff-git/debug/js/src/../../../src/js/src/jsinterp.cpp:2801)
js::RunScript(JSContext*, JSScript*, js::StackFrame*) (/home/jlebar/code/moz/ff-git/debug/js/src/../../../src/js/src/jsinterp.cpp:475)
js::InvokeKernel(JSContext*, js::CallArgs, js::MaybeConstruct) (/home/jlebar/code/moz/ff-git/debug/js/src/../../../src/js/src/jsinterp.cpp:537)
js::Invoke(JSContext*, js::InvokeArgsGuard&, js::MaybeConstruct) (/home/jlebar/code/moz/ff-git/debug/js/src/../../../src/js/src/jsinterp.h:157)
js::Invoke(JSContext*, JS::Value const&, JS::Value const&, unsigned int, JS::Value*, JS::Value*) (/home/jlebar/code/moz/ff-git/debug/js/src/../../../src/js/src/jsinterp.cpp:569)
JS_CallFunctionValue (/home/jlebar/code/moz/ff-git/debug/js/src/../../../src/js/src/jsapi.cpp:5459)
nsXPCWrappedJSClass::CallMethod(nsXPCWrappedJS*, unsigned short, XPTMethodDescriptor const*, nsXPTCMiniVariant*) (/home/jlebar/code/moz/ff-inbound/debug/js/xpconnect/src/../../../../src/js/xpconnect/src/XPCWrappedJSClass.cpp:1510)
nsXPCWrappedJS::CallMethod(unsigned short, XPTMethodDescriptor const*, nsXPTCMiniVariant*) (/home/jlebar/code/moz/ff-inbound/debug/js/xpconnect/src/../../../../src/js/xpconnect/src/XPCWrappedJS.cpp:617)
PrepareAndDispatch (/home/jlebar/code/moz/ff-inbound/debug/xpcom/reflect/xptcall/src/md/unix/../../../../../../../src/xpcom/reflect/xptcall/src/md/unix/xptcstubs_x86_64_linux.cpp:153)
SharedStub (xptcstubs_x86_64_linux.cpp:0)
Comment 19 Justin Lebar (not reading bugmail) 2012-02-01 13:18:56 PST
If we ignore the DPI business, we get a bit further, but apparently don't have a docshell:

creating 1!
loading http://mozilla.org/, 1
WARNING: NS_ENSURE_TRUE(mDocShell) failed: file ../../../../src/embedding/browser/webBrowser/nsWebBrowser.cpp, line 662
WARNING: mWebNav->LoadURI failed. Eating exception, what else can I do?: file ../../../src/dom/ipc/TabChild.cpp, line 512
[TabChild] SHOW (w,h)= (0, 0)

__restore_rt (sigaction.c:0)
mozilla::layout::RenderFrameParent::AllocPLayers(mozilla::layers::LayerManager::LayersBackend*) (/home/jlebar/code/moz/ff-git/debug/layout/ipc/../../../src/layout/ipc/RenderFrameParent.cpp:645)
mozilla::layout::PRenderFrameParent::OnMessageReceived(IPC::Message const&, IPC::Message*&) (/home/jlebar/code/moz/ff-git/debug/ipc/ipdl/PRenderFrameParent.cpp:247)
mozilla::dom::PContentParent::OnMessageReceived(IPC::Message const&, IPC::Message*&) (/home/jlebar/code/moz/ff-git/debug/ipc/ipdl/PContentParent.cpp:1771)
mozilla::ipc::SyncChannel::OnDispatchMessage(IPC::Message const&) (/home/jlebar/code/moz/ff-git/debug/ipc/glue/../../../src/ipc/glue/SyncChannel.cpp:175)
mozilla::ipc::RPCChannel::OnMaybeDequeueOne() (/home/jlebar/code/moz/ff-git/debug/ipc/glue/../../../src/ipc/glue/RPCChannel.cpp:432)
void DispatchToMethod<mozilla::ipc::RPCChannel, bool (mozilla::ipc::RPCChannel::*)()>(mozilla::ipc::RPCChannel*, bool (mozilla::ipc::RPCChannel::*)(), Tuple0 const&) (/home/jlebar/code/moz/ff-git/debug/ipc/glue/../../../src/ipc/chromium/src/base/tuple.h:384)
RunnableMethod<mozilla::ipc::RPCChannel, bool (mozilla::ipc::RPCChannel::*)(), Tuple0>::Run() (/home/jlebar/code/moz/ff-git/debug/ipc/glue/../../../src/ipc/chromium/src/base/task.h:308)
mozilla::ipc::RPCChannel::RefCountedTask::Run() (/home/jlebar/code/moz/ff-git/debug/ipc/glue/../../dist/include/mozilla/ipc/RPCChannel.h:462)
mozilla::ipc::RPCChannel::DequeueTask::Run() (/home/jlebar/code/moz/ff-git/debug/ipc/glue/../../dist/include/mozilla/ipc/RPCChannel.h:485)
MessageLoop::RunTask(Task*) (/home/jlebar/code/moz/ff-git/debug/ipc/chromium/../../../src/ipc/chromium/src/base/message_loop.cc:319)
MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) (/home/jlebar/code/moz/ff-git/debug/ipc/chromium/../../../src/ipc/chromium/src/base/message_loop.cc:329)
MessageLoop::DoWork() (/home/jlebar/code/moz/ff-git/debug/ipc/chromium/../../../src/ipc/chromium/src/base/message_loop.cc:426)
mozilla::ipc::DoWorkRunnable::Run() (/home/jlebar/code/moz/ff-git/debug/ipc/glue/../../../src/ipc/glue/MessagePump.cpp:71)
nsThread::ProcessNextEvent(bool, bool*) (/home/jlebar/code/moz/ff-inbound/debug/xpcom/threads/../../../src/xpcom/threads/nsThread.cpp:657)
NS_ProcessNextEvent_P(nsIThread*, bool) (/home/jlebar/code/moz/ff-inbound/debug/xpcom/build/nsThreadUtils.cpp:245)
mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) (/home/jlebar/code/moz/ff-git/debug/ipc/glue/../../../src/ipc/glue/MessagePump.cpp:107)
MessageLoop::RunInternal() (/home/jlebar/code/moz/ff-git/debug/ipc/chromium/../../../src/ipc/chromium/src/base/message_loop.cc:209)
MessageLoop::RunHandler() (/home/jlebar/code/moz/ff-git/debug/ipc/chromium/../../../src/ipc/chromium/src/base/message_loop.cc:202)
MessageLoop::Run() (/home/jlebar/code/moz/ff-git/debug/ipc/chromium/../../../src/ipc/chromium/src/base/message_loop.cc:174)
nsBaseAppShell::Run() (/home/jlebar/code/moz/ff-git/debug/widget/xpwidgets/../../../src/widget/xpwidgets/nsBaseAppShell.cpp:191)
nsAppStartup::Run() (/home/jlebar/code/moz/ff-inbound/debug/toolkit/components/startup/../../../../src/toolkit/components/startup/nsAppStartup.cpp:220)
XRE_main (/home/jlebar/code/moz/ff-git2/debug/toolkit/xre/../../../src/toolkit/xre/nsAppRunner.cpp:3537)
do_main (/home/jlebar/code/moz/ff-git2/debug/browser/app/../../../src/browser/app/nsBrowserApp.cpp:205)
main (/home/jlebar/code/moz/ff-git2/debug/browser/app/../../../src/browser/app/nsBrowserApp.cpp:295)
__libc_start_main (/build/buildd/eglibc-2.13/csu/libc-start.c:258)
_start (dist/bin/firefox)
Comment 20 Boris Zbarsky [:bz] (still a bit busy) 2012-02-01 13:24:22 PST
Um.  Did e10s never get things working with display:none iframes?

I think you should ignore the SetPrimaryFrame issue; it's a red herring.  Make <iframe mozbrowser style="display: none" work correctly.   Then you get the rest for free.

Why is nsWebBrowser not initializing its docshell in the display:none scenario?
Comment 21 Justin Lebar (not reading bugmail) 2012-02-01 17:49:57 PST
Well, it looks like TabChild does not nsIBaseWindow::Create its nsWebBrowser until TabChild::RecvShow().  And there's no docshell in the nsWebBrowser until it's Create()'ed.
Comment 22 Boris Zbarsky [:bz] (still a bit busy) 2012-02-01 17:56:19 PST
Ho-hum.  That's kinda broken....  Can we Create earlier?
Comment 23 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-02-01 18:17:33 PST
We wait until RecvShow(), because BaseWindow::Create needs a fake widget, and we can't hook that up properly until we find a real widget in the parent process.

Or at least, that's why things are the way they are now.  It's just code, we can rejigger it.
Comment 24 Justin Lebar (not reading bugmail) 2012-02-02 08:51:42 PST
Playing whack-a-null here.

If we initialize the window (i.e., create(), etc.) at the beginning of TabChild::RecvLoadURL, we crash at

creating 1!
[TabChild] EnsureWindowInitialized aSize=(0,0)

Program dist/bin/firefox (pid = 11138) received signal 11.
Stack:
__restore_rt (sigaction.c:0)
mozilla::layout::RenderFrameParent::AllocPLayers(mozilla::layers::LayerManager::LayersBackend*) (/home/jlebar/code/moz/ff-git/debug/layout/ipc/../../../src/layout/ipc/RenderFrameParent.cpp:645)
mozilla::layout::PRenderFrameParent::OnMessageReceived(IPC::Message const&, IPC::Message*&) (/home/jlebar/code/moz/ff-git/debug/ipc/ipdl/PRenderFrameParent.cpp:247)
mozilla::dom::PContentParent::OnMessageReceived(IPC::Message const&, IPC::Message*&) (/home/jlebar/code/moz/ff-git/debug/ipc/ipdl/PContentParent.cpp:1771)
mozilla::ipc::SyncChannel::OnDispatchMessage(IPC::Message const&) (/home/jlebar/code/moz/ff-git/debug/ipc/glue/../../../src/ipc/glue/SyncChannel.cpp:175)
mozilla::ipc::RPCChannel::OnMaybeDequeueOne() (/home/jlebar/code/moz/ff-git/debug/ipc/glue/../../../src/ipc/glue/RPCChannel.cpp:432)
void DispatchToMethod<mozilla::ipc::RPCChannel, bool (mozilla::ipc::RPCChannel::*)()>(mozilla::ipc::RPCChannel*, bool (mozilla::ipc::RPCChannel::*)(), Tuple0 const&) (/home/jlebar/code/moz/ff-git/debug/ipc/glue/../../../src/ipc/chromium/src/base/tuple.h:384)
RunnableMethod<mozilla::ipc::RPCChannel, bool (mozilla::ipc::RPCChannel::*)(), Tuple0>::Run() (/home/jlebar/code/moz/ff-git/debug/ipc/glue/../../../src/ipc/chromium/src/base/task.h:308)
mozilla::ipc::RPCChannel::RefCountedTask::Run() (/home/jlebar/code/moz/ff-git/debug/ipc/glue/../../dist/include/mozilla/ipc/RPCChannel.h:462)
mozilla::ipc::RPCChannel::DequeueTask::Run() (/home/jlebar/code/moz/ff-git/debug/ipc/glue/../../dist/include/mozilla/ipc/RPCChannel.h:485)
MessageLoop::RunTask(Task*) (/home/jlebar/code/moz/ff-git/debug/ipc/chromium/../../../src/ipc/chromium/src/base/message_loop.cc:319)
MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) (/home/jlebar/code/moz/ff-git/debug/ipc/chromium/../../../src/ipc/chromium/src/base/message_loop.cc:329)
MessageLoop::DoWork() (/home/jlebar/code/moz/ff-git/debug/ipc/chromium/../../../src/ipc/chromium/src/base/message_loop.cc:426)
mozilla::ipc::DoWorkRunnable::Run() (/home/jlebar/code/moz/ff-git/debug/ipc/glue/../../../src/ipc/glue/MessagePump.cpp:71)
nsThread::ProcessNextEvent(bool, bool*) (/home/jlebar/code/moz/ff-inbound/debug/xpcom/threads/../../../src/xpcom/threads/nsThread.cpp:657)
NS_ProcessNextEvent_P(nsIThread*, bool) (/home/jlebar/code/moz/ff-inbound/debug/xpcom/build/nsThreadUtils.cpp:245)
mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) (/home/jlebar/code/moz/ff-git/debug/ipc/glue/../../../src/ipc/glue/MessagePump.cpp:107)
MessageLoop::RunInternal() (/home/jlebar/code/moz/ff-git/debug/ipc/chromium/../../../src/ipc/chromium/src/base/message_loop.cc:209)
MessageLoop::RunHandler() (/home/jlebar/code/moz/ff-git/debug/ipc/chromium/../../../src/ipc/chromium/src/base/message_loop.cc:202)
MessageLoop::Run() (/home/jlebar/code/moz/ff-git/debug/ipc/chromium/../../../src/ipc/chromium/src/base/message_loop.cc:174)
nsBaseAppShell::Run() (/home/jlebar/code/moz/ff-git/debug/widget/xpwidgets/../../../src/widget/xpwidgets/nsBaseAppShell.cpp:191)
nsAppStartup::Run() (/home/jlebar/code/moz/ff-inbound/debug/toolkit/components/startup/../../../../src/toolkit/components/startup/nsAppStartup.cpp:220)
XRE_main (/home/jlebar/code/moz/ff-git2/debug/toolkit/xre/../../../src/toolkit/xre/nsAppRunner.cpp:3537)
do_main (/home/jlebar/code/moz/ff-git2/debug/browser/app/../../../src/browser/app/nsBrowserApp.cpp:205)
main (/home/jlebar/code/moz/ff-git2/debug/browser/app/../../../src/browser/app/nsBrowserApp.cpp:295)
__libc_start_main (/build/buildd/eglibc-2.13/csu/libc-start.c:258)
_start (dist/bin/firefox)

which is probably the problem Chris was referring to in comment 23.
Comment 25 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-02-02 11:00:08 PST
This code wasn't designed to handle arbitrary display:none/default/none/etc transitions.  We'll need to refactor for the long run.

for now, whack/hack whatever you need to get the basic case up.  I assume that's what you're doing
Comment 26 Justin Lebar (not reading bugmail) 2012-02-03 12:51:44 PST
> for now, whack/hack whatever you need to get the basic case up.  I assume that's what 
> you're doing

Indeed!

When the content process starts up, I get a bunch of failed NS_DispatchToMainThread's from MessagePump::ScheduleWork (20 failures or so) because the main thread pointer is null.  It kind of looks like we could be missing messages in the child process as a result, but I'm new to all this code.

Are these failures expected?
Comment 27 Justin Lebar (not reading bugmail) 2012-02-03 13:18:14 PST
Actually, those null main thread warnings look benign.  All the messages seem to be delivered after the deluge of errors ends.
Comment 28 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-02-03 17:12:21 PST
No, that's not at all expected.
Comment 29 Justin Lebar (not reading bugmail) 2012-02-06 09:38:39 PST
Created attachment 594735 [details] [diff] [review]
WIP v2

Crashes on browser exit, scrollbars don't work, complete hack.  However, it loads a page!
Comment 30 Justin Lebar (not reading bugmail) 2012-02-07 20:01:32 PST
Hm, it doesn't work on Linux due to an apparent race (I got it to work once in the debugger when I set breakpoints in one of the processes).  I've been and will continue digging.
Comment 31 Justin Lebar (not reading bugmail) 2012-02-10 10:48:02 PST
Created attachment 596100 [details] [diff] [review]
WIP v3

This kind of works on Linux.  The cross-process content shows up, but not the first time.  Once the content appears, you can scroll and interact with the content.
Comment 32 Justin Lebar (not reading bugmail) 2012-04-07 07:14:32 PDT
Created attachment 613104 [details] [diff] [review]
WIP v3 (rebased to tip)
Comment 33 Justin Lebar (not reading bugmail) 2012-04-07 07:22:50 PDT
Created attachment 613105 [details]
Testcase 1

This WIP is a serious hack, in case that's not clear...

In about:config, search for "mozbrowser".  There should be two prefs.  Enable mozbrowser and add this attachment's origin (e.g. "https://bug714861.bugzilla.mozilla.org") to the whitelist pref.

This attachment should work on Mac, but it doesn't load the page on Linux.  If you wait a few seconds and then refresh the page, the inner page loads successfully on Linux.

The same thing happens if you set the iframe's source on the outer page's load, or on the outer page's load plus one trip through setTimeout(0) -- that's too soon.
Comment 34 Justin Lebar (not reading bugmail) 2012-04-07 07:23:52 PDT
Created attachment 613107 [details]
Testcase 2

Restart the browser, then open this testcase.  If you wait a second or two after loading the page and then press "go", it'll work on both Mac and Linux.
Comment 35 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-04-09 01:05:42 PDT
What appears to be happening is that the IO thread comes up and starts processing messages before nsThreadManager has been initialized (maybe all of XPCOM).  But without an XPCOM main thread to post them to, they're being dropped.
Comment 36 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-04-09 02:03:13 PDT
Alright, that's not the problem ... I'm testing in a --enable-application=b2g build.  What's *actually* happening is that the browser app itself gets loaded out of process, and then it tries to create another child content process for web content.  Something in the browser app's content process is apparently unhappy with that.  Something we need to fix, but not for this bug.
Comment 37 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-04-09 02:20:30 PDT
And FTR, the nested content-process creation is failing because in

  nsCOMPtr<nsIXULWindow> window(do_GetInterface(parentOwner));
  if (!window) {
    return false;
  }

(gdb) p parentOwner.mRawPtr
$2 = (nsDocShellTreeOwner *) 0xb29bd820

so apparently nsDocShellTreeOwner can't find an nsIXULWindow, not too surprising in a content process.

I'd like to get this landed but pref'd off asap.
Comment 38 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-04-09 02:27:53 PDT
Comment on attachment 613104 [details] [diff] [review]
WIP v3 (rebased to tip)

Lots of nits in here to pick, but the only thing that makes me particularly unhappy with the new |mForceChildProcess| logic in nsFrameLoader.  Why was that needed?  We should centralize those checks.

Reserving judgment pending answer to that question ;).
Comment 39 Justin Lebar (not reading bugmail) 2012-04-09 07:50:45 PDT
The key problem with this patch is that testcase 1 does not work on Linux.  Do you have any insight into that?
Comment 40 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-04-09 08:02:59 PDT
I didn't test a desktop FF build, so nothing to add.  There are only a finite number of things it could be.  The question is whether pursuing them is a good use of time now, since many of those causes are linux-x11 or firefox-mouse-interface specific.

This is mostly working on desktop-b2g except for IME, but that's not the fault of this patch.  About to test in phone-b2g.  The right way forward is to clean this up and land it pref'd off, so followup work can go on in parallel.  There's a bit ... ;)
Comment 41 Justin Lebar (not reading bugmail) 2012-04-09 08:46:40 PDT
I'm concerned that this patch is broken in a fundamental way, which is causing it to work only on the second load issued to a mozbrowser, at least on Linux.  Perhaps you don't observe this on desktop b2g because the first load into a mozbrowser is overwritten by a second load.

I am not comfortable landing code in gecko -- pref'ed off or not -- which has flaws we do not understand, particularly because there's a good chance that the flaws I've seen on desktop Linux apply also to desktop b2g.  I'd really appreciate some assistance from the domain expert in understanding this issue better.
Comment 42 Justin Lebar (not reading bugmail) 2012-04-09 13:50:04 PDT
> I am not comfortable landing code in gecko -- pref'ed off or not -- which has flaws we do not 
> understand, particularly because there's a good chance that the flaws I've seen on desktop Linux 
> apply also to desktop b2g.

I just tested on desktop b2g, and amazingly enough, testcase 1 works there.  (I hacked <iframe mozbrowser src="http://mozilla.org"> into the Gaia homescreen.)

If things similarly work on the phone, then I'm OK taking this.
Comment 43 Justin Lebar (not reading bugmail) 2012-04-09 15:12:08 PDT
Yup, it can load pages on the phone.

I'll whip this patch up into shape.
Comment 44 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-04-09 20:50:56 PDT
<browser remote> has never been a supported or even tested configuration in desktop-linux-x11/Firefox-mouse.  Let alone <iframe mozbrowser>.

While I agree with you in the abstract, we need to stay on target, and hop carefully from known-to-work lily pad to the next one we want to work, or we might end up falling into a deep and foul pond.  Or something like that.
Comment 45 Justin Lebar (not reading bugmail) 2012-04-09 21:16:11 PDT
(In reply to Justin Lebar [:jlebar] from comment #43)
> Yup, it can load pages on the phone.

I think this is actually untrue, unfortunately.  My upload failed and I didn't notice.  :(

> While I agree with you in the abstract, we need to stay on target

Agreed, but it's also important that things work the same on desktop b2g and mobile b2g, since so many people use desktop b2g for development.  My assumption was that desktop b2g would behave very much like desktop Firefox, but it appears that was wrong in this case!
Comment 46 Justin Lebar (not reading bugmail) 2012-04-09 22:36:26 PDT
Created attachment 613495 [details] [diff] [review]
Gaia testcase

This is a patch to Gaia demonstrating the bug in this patch.  On my SGSII, I get a blank iframe with mozbrowser.  On desktop b2g, I get some content in the iframe.

I don't think <iframe src="http://foo"> is an outlandish thing to want to work before we check this in, even pref'ed off.

Also, it seems that I can't unlock the homescreen with OOP mozbrowser applied.  But maybe something else is going on; I'm getting tired.

If you want to check this in ASAP, perhaps we should check it into a git branch.  That way, it goes in immediately, warts and all, and others can hack on it.

What do you think?
Comment 47 Justin Lebar (not reading bugmail) 2012-04-09 22:38:03 PDT
Also, there's a crash on shutdown, presumably in the child process:

__restore_rt (sigaction.c:0)
nsOfflineCacheDevice::RunSimpleQuery(mozIStorageStatement*, unsigned int, unsigned int*, char***) (/home/jlebar/code/moz/ff-git/debug-b2g/netwerk/cache/../../../src/netwerk/cache/nsDiskCacheDeviceSQL.cpp:2085)
nsOfflineCacheDevice::GetGroupsTimeOrdered(unsigned int*, char***) (/home/jlebar/code/moz/ff-git/debug-b2g/netwerk/cache/../../../src/netwerk/cache/nsDiskCacheDeviceSQL.cpp:2076)
nsOfflineCacheUpdate::EvictOneNonPinned() (/home/jlebar/code/moz/ff-git/debug/uriloader/prefetch/../../../src/uriloader/prefetch/nsOfflineCacheUpdate.cpp:1937)
nsOfflineCacheUpdate::LoadCompleted() (/home/jlebar/code/moz/ff-git/debug/uriloader/prefetch/../../../src/uriloader/prefetch/nsOfflineCacheUpdate.cpp:1450)
nsOfflineCacheUpdateItem::Run() (/home/jlebar/code/moz/ff-git/debug/uriloader/prefetch/../../../src/uriloader/prefetch/nsOfflineCacheUpdate.cpp:473)
nsThread::ProcessNextEvent(bool, bool*) (/home/jlebar/code/moz/ff-git/debug/xpcom/threads/../../../src/xpcom/threads/nsThread.cpp:656)
NS_ProcessNextEvent_P(nsIThread*, bool) (/home/jlebar/code/moz/ff-git/debug/xpcom/build/nsThreadUtils.cpp:245)
Spin (/home/jlebar/code/moz/ff-git/debug-b2g/toolkit/components/places/../../../../src/toolkit/components/places/Database.cpp:291)
mozilla::places::Database::Shutdown() (/home/jlebar/code/moz/ff-git/debug-b2g/toolkit/components/places/../../../../src/toolkit/components/places/Database.cpp:1863)
mozilla::places::Database::Observe(nsISupports*, char const*, unsigned short const*) (/home/jlebar/code/moz/ff-git/debug-b2g/toolkit/components/places/../../../../src/toolkit/components/places/Database.cpp:1956)
nsObserverList::NotifyObservers(nsISupports*, char const*, unsigned short const*) (/home/jlebar/code/moz/ff-git/debug/xpcom/ds/../../../src/xpcom/ds/nsObserverList.cpp:129)
nsObserverService::NotifyObservers(nsISupports*, char const*, unsigned short const*) (/home/jlebar/code/moz/ff-git/debug/xpcom/ds/../../../src/xpcom/ds/nsObserverService.cpp:185)
nsXREDirProvider::DoShutdown() (/home/jlebar/code/moz/ff-git/debug-b2g/toolkit/xre/../../../src/toolkit/xre/nsXREDirProvider.cpp:841)
~ScopedXPCOMStartup (/home/jlebar/code/moz/ff-git/debug-b2g/toolkit/xre/../../../src/toolkit/xre/nsAppRunner.cpp:1127)
XREMain::XRE_main(int, char**, nsXREAppData const*) (/home/jlebar/code/moz/ff-git/debug-b2g/toolkit/xre/../../../src/toolkit/xre/nsAppRunner.cpp:3871)
XRE_main (/home/jlebar/code/moz/ff-git/debug-b2g/toolkit/xre/../../../src/toolkit/xre/nsAppRunner.cpp:3925)
Comment 48 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-04-09 23:52:39 PDT
(In reply to Justin Lebar [:jlebar] from comment #45)
> (In reply to Justin Lebar [:jlebar] from comment #43)
> > Yup, it can load pages on the phone.
> 
> I think this is actually untrue, unfortunately.  My upload failed and I
> didn't notice.  :(
> 

I'm not against checking this in if it can't load content on the phone, but not being able to load content means there's not much motivation to landitlanditlandit.

I had some issues getting a gecko on device to test this patch, so I'll retry this afternoon and poke a little.  Likely something dumb.

> > While I agree with you in the abstract, we need to stay on target
> 
> Agreed, but it's also important that things work the same on desktop b2g and
> mobile b2g, since so many people use desktop b2g for development.  My
> assumption was that desktop b2g would behave very much like desktop Firefox,
> but it appears that was wrong in this case!

There are many many many things that are different between FF-mouse frontend and b2g.  I don't think that's a good thing, but it's the way it is.  Many of those differences affect cross-process compatibility.
Comment 49 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-04-10 00:42:27 PDT
I can reproduce the page-not-loading.  |ps| says that we're spawning the plugin-container subprocess, but debuggerd shows up not long afterward.  Something may be hanging on startup.

I'm not able to repro problems with the lockscreen.

Let me poke at this a little bit.
Comment 50 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-04-10 10:00:12 PDT
Created attachment 613651 [details] [diff] [review]
gets further along on device

Crashes below in the child process.  Progress!  Now we get to start undoing hackity hacks ...

(gdb) bt
#0  nsScreenGonk::GetPixelDepth (this=<value optimized out>, aPixelDepth=0xbec229b4) at /home/cjones/mozilla/b2g/glue/gonk-ics/frameworks/base/include/ui/FramebufferNativeWindow.h:55
#1  0x407606e6 in nsPrincipal::SubsumesIgnoringDomain (this=0x0, aOther=0xbec229b4, aResult=0x410dd1f0) at /home/cjones/mozilla/mozilla-central/caps/src/nsPrincipal.cpp:379
#2  0x40b36af6 in gfxAndroidPlatform::gfxAndroidPlatform (this=0x41c361c0) at /home/cjones/mozilla/mozilla-central/gfx/thebes/gfxAndroidPlatform.cpp:67
#3  0x40b2f8a8 in gfxPlatform::Init () at /home/cjones/mozilla/mozilla-central/gfx/thebes/gfxPlatform.cpp:299
#4  0x40b2fac8 in gfxPlatform::GetPlatform () at /home/cjones/mozilla/mozilla-central/gfx/thebes/gfxPlatform.cpp:255
#5  0x409e1238 in mozilla::widget::PuppetWidget::Create (this=0x41ceb440, aParent=0x0, aNativeParent=<value optimized out>, aRect=<value optimized out>, aHandleEventFunction=0, aContext=0x0, aInitData=0x0) at /home/cjones/mozilla/mozilla-central/widget/xpwidgets/PuppetWidget.cpp:127
Comment 51 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-04-10 10:01:54 PDT
Since this crash has nothing to do with the patch here and should be fixed in parallel, and we're loading OOP content successfully in desktop-b2g, then please continue on the track of landing what's here pref'd off.  We can start filing followups for bugs like these.
Comment 52 Justin Lebar (not reading bugmail) 2012-04-10 10:53:49 PDT
Created attachment 613666 [details] [diff] [review]
Part 1: content/layout changes.
Comment 53 Justin Lebar (not reading bugmail) 2012-04-10 10:54:24 PDT
Created attachment 613668 [details] [diff] [review]
Part 2: ipc changes.
Comment 54 Boris Zbarsky [:bz] (still a bit busy) 2012-04-10 14:59:53 PDT
Comment on attachment 613666 [details] [diff] [review]
Part 1: content/layout changes.

If the <iframe> is itself inside a display:none iframe, will we just not enter this code?  Otherwise, seems like you need to null-check GetShell().

Also, that should probably be GetCurrentDoc(), not OwnerDoc().  And possibly null-check that too....

I don't understand the EnsureMessageManager change.  Is that not usually called on random content subframes or something?

Please document the new method on nsFrameLoader.

Do we really want the LoadSrc to be completely async?  Is doing it off a scriptrunner not good enough? If so, why not?

Timothy or roc should review the view/widget bit.

I'd like smaug to take a look over this too...
Comment 55 Timothy Nikkel (:tnikkel) 2012-04-10 15:15:27 PDT
Comment on attachment 613666 [details] [diff] [review]
Part 1: content/layout changes.

What is the reason for the presshell change?
Comment 56 Justin Lebar (not reading bugmail) 2012-04-11 20:01:18 PDT
> I don't understand the EnsureMessageManager change.  Is that not usually called on random content 
> subframes or something?

>   nsCOMPtr<nsIDOMChromeWindow> chromeWindow =
>-    do_QueryInterface(mOwnerContent->OwnerDoc()->GetWindow());
>-  NS_ENSURE_STATE(chromeWindow);
>+    do_QueryInterface(OwnerDoc()->GetWindow());
>+  if (!chromeWindow) {
>+    nsCOMPtr<nsIDocShell> rootDocShell = GetRootDocShell(OwnerDoc());
>+    nsCOMPtr<nsIDOMWindow> rootWindow = do_GetInterface(rootDocShell);
>+    chromeWindow = do_GetInterface(rootWindow);
>+  }

Before, we assumed that mOwnerContent->OwnerDoc() was chrome.  But now it's
not, so we have to walk up to the root docshell to find chrome.

> Do we really want the LoadSrc to be completely async?  Is doing it off 
> a scriptrunner not good enough? If so, why not?

At one point during my testing, a script runner was not good enough.  But I'm having some difficulty re-creating whatever didn't work.  I'll get back to you.

> What is the reason for the presshell change?

>@@ -5079,11 +5079,10 @@ LayerManager* PresShell::GetLayerManager()
> {
>   NS_ASSERTION(mViewManager, "Should have view manager");
> 
>-  nsIView* rootView = mViewManager->GetRootView();
>-  if (rootView) {
>-    if (nsIWidget* widget = rootView->GetWidget()) {
>-      return widget->GetLayerManager();
>-    }
>+  nsIWidget *widget;
>+  mViewManager->GetRootWidget(&widget);
>+  if (widget) {
>+    return widget->GetLayerManager();
>   }

Without it, I get a null widget.  I don't pretend to understand what's going on here, but the conversation I had was:

> <jlebar> Does a content docshell's presshell's root view correspond to the view
>   of the root docshell?  I'm trying to get a widget pointer, but the "root view"
>   I get doesn't have a widget.
> <roc> you want nsIViewManager::GetRootWidget probably
> <roc> or better still, nsIFrame::GetNearestWidget

I used GetRootWidget because that seemed to correspond better to what the original code was doing.
Comment 57 Justin Lebar (not reading bugmail) 2012-04-11 20:18:36 PDT
> At one point during my testing, a script runner was not good enough.  But I'm having some difficulty 
> re-creating whatever didn't work.  I'll get back to you.

Using a script runner seems to work fine, in desktop Linux Firefox and desktop Linux b2g.  I'm not going to complain!
Comment 58 Timothy Nikkel (:tnikkel) 2012-04-11 20:22:34 PDT
(In reply to Justin Lebar [:jlebar] from comment #56)
> Without it, I get a null widget.  I don't pretend to understand what's going
> on here, but the conversation I had was:

What is the code that calls GetLayerManager trying to do? And where is it?
Comment 59 Justin Lebar (not reading bugmail) 2012-04-11 20:31:54 PDT
Created attachment 614263 [details] [diff] [review]
Interdiff part 1, v1 --> v2
Comment 60 Justin Lebar (not reading bugmail) 2012-04-11 20:33:11 PDT
Created attachment 614264 [details] [diff] [review]
Part 1, v2 - content/layout changes
Comment 61 Justin Lebar (not reading bugmail) 2012-04-11 20:42:09 PDT
(In reply to Timothy Nikkel (:tn) from comment #58)
> (In reply to Justin Lebar [:jlebar] from comment #56)
> > Without it, I get a null widget.  I don't pretend to understand what's going
> > on here, but the conversation I had was:
> 
> What is the code that calls GetLayerManager trying to do? And where is it?

Does this help?

#0  PresShell::GetLayerManager
#1  mozilla::layout::RenderFrameParent::GetLayerManager
#2  mozilla::layout::RenderFrameParent::AllocPLayers
#3  mozilla::layout::PRenderFrameParent::OnMessageReceived

The message received is, presumably, AllocPLayers.  It happens right after we show() the child.
Comment 62 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-04-11 23:35:25 PDT
Comment on attachment 613668 [details] [diff] [review]
Part 2: ipc changes.

>diff --git a/dom/ipc/TabParent.cpp b/dom/ipc/TabParent.cpp

> void
> TabParent::SetOwnerElement(nsIDOMElement* aElement)
> {

>+  // Calling GetWidget will cause us to save the screen's DPI, if we can.
>+  unused << GetWidget();

Please factor this out into a TryCacheDPI() helper, that internally
tries to get the widget and save the DPI.

> void
> TabParent::LoadURL(nsIURI* aURI)
> {

>+      printf("TabParent::LoadURL(%s) called before Show(); ignoring LoadURL.\n", spec.get());

NS_WARNING or printf_stderr plz.


> void
> TabParent::Show(const nsIntSize& size)
> {
>     // sigh
>+    mShown = true;
>+    printf("[TabParent] Show\n");

You can just nuke this.

> already_AddRefed<nsIWidget>
>-TabParent::GetWidget() const
>+TabParent::GetWidget()

With the TryCacheDPI() change you don't need to un-|const| this,
right?

>+  nsCOMPtr<nsIWidget> widget = frame->GetNearestWidget();
>+  if (widget && mDPI <= 0) {
>+    mDPI = widget->GetDPI();
>+  }
>+

(Prefer to move this into TryCacheDPI().)
Comment 63 Timothy Nikkel (:tnikkel) 2012-04-11 23:50:01 PDT
(In reply to Justin Lebar [:jlebar] from comment #61)
> (In reply to Timothy Nikkel (:tn) from comment #58)
> > (In reply to Justin Lebar [:jlebar] from comment #56)
> > > Without it, I get a null widget.  I don't pretend to understand what's going
> > > on here, but the conversation I had was:
> > 
> > What is the code that calls GetLayerManager trying to do? And where is it?
> 
> Does this help?
> 
> #0  PresShell::GetLayerManager
> #1  mozilla::layout::RenderFrameParent::GetLayerManager
> #2  mozilla::layout::RenderFrameParent::AllocPLayers
> #3  mozilla::layout::PRenderFrameParent::OnMessageReceived
> 
> The message received is, presumably, AllocPLayers.  It happens right after
> we show() the child.

What does the document hierarchy look like? Ie what document is mFrameLoader in and what is its parent document?

Maybe you just want nsContentUtils::LayerManagerForDocument?
Comment 64 Justin Lebar (not reading bugmail) 2012-04-12 07:14:15 PDT
> Ie what document is mFrameLoader in and what is its parent document?

We have

 chrome document
  content document (mozbrowser1.html)
   <iframe mozbrowser src="http://mozilla.org">
    content document (mozilla.org)

mFrameLoader is for the iframe, and OwnerDoc is mozbrowser1.html.

> Maybe you just want nsContentUtils::LayerManagerForDocument?

Yay, that works.  (I see that we needed to walk the docshell hierarchy, which we weren't doing before...)
Comment 65 Justin Lebar (not reading bugmail) 2012-04-12 07:20:52 PDT
Created attachment 614362 [details] [diff] [review]
Part 1 v3 - content changes

No longer any layout changes in this patch.
Comment 66 Justin Lebar (not reading bugmail) 2012-04-12 07:22:41 PDT
Created attachment 614363 [details] [diff] [review]
Part 2, v2 - IPC changes
Comment 67 Justin Lebar (not reading bugmail) 2012-04-12 07:25:36 PDT
Interdiff part 1 v2 --> v3:

 $ interdiff <(curl -L https://bugzilla.mozilla.org/attachment.cgi?id=614264) <(curl -L https://bugzilla.mozilla.org/attachment.cgi?id=614362)
Comment 68 Justin Lebar (not reading bugmail) 2012-04-12 07:28:46 PDT
Created attachment 614364 [details] [diff] [review]
Part 2, v2.1 - IPC changes

v2 --> v2.1: Remove unnecessary NULL out param.
Comment 69 Justin Lebar (not reading bugmail) 2012-04-12 07:29:53 PDT
Interdiff part 2 v1 --> v2.1:

 $ interdiff <(curl -L https://bugzilla.mozilla.org/attachment.cgi?id=613668) <(curl -L https://bugzilla.mozilla.org/attachment.cgi?id=614364)
Comment 70 Timothy Nikkel (:tnikkel) 2012-04-12 14:18:43 PDT
Comment on attachment 614362 [details] [diff] [review]
Part 1 v3 - content changes

If you don't change RenderFrameParent::GetLayerManager then I don't see how this is going to work.

Also comment 61 said it was the GetLayerManager call in RenderFrameParent::AllocPLayers that was problematic, not the one in nsFrameLoader::ShowRemoteFrame.
Comment 71 Boris Zbarsky [:bz] (still a bit busy) 2012-04-12 14:23:15 PDT
> Before, we assumed that mOwnerContent->OwnerDoc() was chrome.  But now it's
> not, so we have to walk up to the root docshell to find chrome.

Yes, I get that.  Do we not call that function on _other_ content subframes?  _That_ was my question.
Comment 72 Justin Lebar (not reading bugmail) 2012-04-12 14:24:02 PDT
(In reply to Timothy Nikkel (:tn) from comment #70)
> Comment on attachment 614362 [details] [diff] [review]
> Part 1 v3 - content changes
> 
> If you don't change RenderFrameParent::GetLayerManager then I don't see how
> this is going to work.

I do; it's just in the other patch.  :)

> Also comment 61 said it was the GetLayerManager call in
> RenderFrameParent::AllocPLayers that was problematic, not the one in
> nsFrameLoader::ShowRemoteFrame.

They both need to be changed; I just didn't see ShowRemoteFrame until I fixed AllocPLayers.
Comment 73 Boris Zbarsky [:bz] (still a bit busy) 2012-04-12 14:25:59 PDT
Comment on attachment 614362 [details] [diff] [review]
Part 1 v3 - content changes

>+++ b/content/base/src/nsFrameLoader.cpp
>@@ -907,16 +913,23 @@ nsFrameLoader::ShowRemoteFrame(const nsIntSize& size)
>+    if (!mOwnerContent ||
>+        !mOwnerContent->GetCurrentDoc() ||
>+        !nsContentUtils::LayerManagerForDocument(mOwnerContent->GetCurrentDoc()).get()) {

The need for .get() should have been a red flag here.  This code leaks the layer manager. ;)

>+      // This has to happen asynchronously to make OOP browser frames work.
>+      // There's no harm in doing it asynchronously for in-process browser
>+      // frames.

So I just remembered why scriptrunner might not have worked before: we needed an nsIFrame.  Maybe at this point we're flushing frames at some point in the load process, which is why it works now?

r=me modulo the memory leak and an explanation about message manager stuff.
Comment 74 Justin Lebar (not reading bugmail) 2012-04-12 14:32:29 PDT
> The need for .get() should have been a red flag here.  This code leaks the layer manager. ;)

Is that fixable in already_AddRefed (outside this patch, of course)?  Seems like if one of those gets destructed before it's assigned into a comptr, it should unref.  That's certainly what I expected it would do!  Maybe it's done this way as an optimization, real or imagined.
Comment 75 Boris Zbarsky [:bz] (still a bit busy) 2012-04-12 14:38:25 PDT
> Is that fixable in already_AddRefed

Not easily, imo.

Wrapping an nsCOMPtr<> around it should help here.
Comment 76 Justin Lebar (not reading bugmail) 2012-04-12 14:44:14 PDT
(In reply to Boris Zbarsky (:bz) from comment #71)
> > Before, we assumed that mOwnerContent->OwnerDoc() was chrome.  But now it's
> > not, so we have to walk up to the root docshell to find chrome.
> 
> Yes, I get that.  Do we not call that function on _other_ content subframes?
> _That_ was my question.

It did cross my mind that your question was too easy.  :)

EnsureMessageManager bails if (!mIsTopLevelContent && !mRemoteFrame).  So if I understand your question this time, I think the answer is no.  :)
Comment 77 Boris Zbarsky [:bz] (still a bit busy) 2012-04-12 14:47:09 PDT
Yes, perfect. Thanks.
Comment 78 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-04-17 10:31:40 PDT
smaug, do you have an ETA for review here?  This bug is blocking the followup work on making b2g multi-process.
Comment 79 Olli Pettay [:smaug] (reviewing overload) 2012-04-17 11:05:49 PDT
It is next in my review queue
Comment 80 Olli Pettay [:smaug] (reviewing overload) 2012-04-17 13:50:40 PDT
Comment on attachment 614362 [details] [diff] [review]
Part 1 v3 - content changes


>-    // FIXME get error codes from child
>-    mRemoteBrowser->LoadURL(mURIToLoad);
>+    if (mRemoteBrowserShown || ShowRemoteFrame(nsIntSize(0, 0))) {
>+      // FIXME get error codes from child
>+      mRemoteBrowser->LoadURL(mURIToLoad);
>+    }
>+    else {
} else {


> 
>   // FIXME/bug 589337: Show()/Hide() is pretty expensive for
>   // cross-process layers; need to figure out what behavior we really
>   // want here.  For now, hack.
>   if (!mRemoteBrowserShown) {
>+    if (!mOwnerContent ||
>+        !mOwnerContent->GetCurrentDoc() ||
>+        !nsContentUtils::LayerManagerForDocument(mOwnerContent->GetCurrentDoc()).get()) {
So isn't this leaking all the LayerManagers? Or do they have some bizarre addref/release setup?

> 
>+static already_AddRefed<nsIDocShell>
>+GetRootDocShell(nsIDocument *aDocument)
>+{
>+  nsCOMPtr<nsIWebNavigation> webNav = do_GetInterface(aDocument->GetWindow());
>+  nsCOMPtr<nsIDocShellTreeItem> treeItem = do_QueryInterface(webNav);
>+  NS_ENSURE_TRUE(treeItem, NULL);
Coding style says still nsnull


>+    bool isBrowserFrame = false;
>+    GetReallyIsBrowser(&isBrowserFrame);
>+    if (isBrowserFrame) {
>+      // This has to happen asynchronously to make OOP browser frames work.
>+      // There's no harm in doing it asynchronously for in-process browser
>+      // frames.
>+      nsContentUtils::AddScriptRunner(new LoadSrcRunner(this));
I don't understand why this is needed.
Frame loading starts already using a script runner.

>+    }
>+    else {
} else {
Comment 81 Justin Lebar (not reading bugmail) 2012-04-17 16:51:00 PDT
> Coding style says still nsnull

Have you been following the thread in dev.planning?  roc changed it to nsnull a few days ago, but bsmedberg thinks it should be NULL.  It was NULL until roc changed it.

The consensus in the thread seems to be that we should use something from standard C++, which points towards NULL.  But anyway, the fact that there's disagreement suggests to me we shouldn't be wasting times in reviews over it.

> So isn't this leaking all the LayerManagers? Or do they have some bizarre addref/release 
> setup?

Yes, it leaks.  See comment 74.

> Frame loading starts already using a script runner.

Yeah, that's part of the magic in this patch.  Is comment 7 sufficient explanation?
Comment 82 Olli Pettay [:smaug] (reviewing overload) 2012-04-18 03:05:31 PDT
(In reply to Justin Lebar [:jlebar] from comment #81)
> Yeah, that's part of the magic in this patch.  Is comment 7 sufficient
> explanation?
No. You add a new scriptrunner, but there is already a scriptrunner which starts the loading.
I don't understand why the new scriptrunner changes anything.

If you need nsIFrame to be there, wouldn't it be better to start loading then
the nsIFrame is actually created. Like when nsFrameLoader::Show is called.
Comment 83 Justin Lebar (not reading bugmail) 2012-04-18 18:04:09 PDT
> No. You add a new scriptrunner, but there is already a scriptrunner which starts the 
> loading.  I don't understand why the new scriptrunner changes anything.

You're totally right, it doesn't.  It did, but either something changed, or one of my later changes fixed this too.
Comment 84 Justin Lebar (not reading bugmail) 2012-04-19 16:10:48 PDT
Created attachment 616791 [details] [diff] [review]
Roll-up patch, v4
Comment 85 Justin Lebar (not reading bugmail) 2012-04-19 16:55:40 PDT
Pushed to birch for FF15.
https://hg.mozilla.org/projects/birch/rev/dd82458147fd
Comment 86 Phil Ringnalda (:philor) 2012-04-19 20:08:57 PDT
Backed out in https://hg.mozilla.org/projects/birch/rev/cc409a6f9195 - mochitest-3 on every platform, timeout in test_browserFrame4/5/6/7.html (and thus an end to the suite), debug Linux/Linux64, a leak of a BasicContainerLayer and friends.
Comment 87 Justin Lebar (not reading bugmail) 2012-04-19 20:11:28 PDT
That worked well.

The problem is that oop browser frames aren't pref'ed off on desktop.  Instead, we rely on the fact that browser frames are pref'ed off.  That works fine in general, but not for the test suite!

I'll push to try this time.
Comment 88 Justin Lebar (not reading bugmail) 2012-04-23 08:37:06 PDT
Created attachment 617489 [details] [diff] [review]
Part 3: Don't leak the layer manager in IPC code.

This is the same refptr.get() mistake I made elsewhere.  We're green on try after this.
Comment 89 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-04-25 04:36:46 PDT
Comment on attachment 617489 [details] [diff] [review]
Part 3: Don't leak the layer manager in IPC code.

landitlanditlandit
Comment 90 Ed Morley [:emorley] 2012-04-25 08:56:08 PDT
Landed on inbound as:
https://hg.mozilla.org/integration/mozilla-inbound/rev/949cb1977ef9

Unfortunately had to be backed out due to compile failures on multiple platforms:
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=949cb1977ef9
https://tbpl.mozilla.org/php/getParsedLog.php?id=11197414&tree=Mozilla-Inbound#error0

{
../../../dom/ipc/TabParent.cpp: In member function 'void mozilla::dom::TabParent::LoadURL(nsIURI*)':
../../../dom/ipc/TabParent.cpp:201: error: invalid conversion from 'int' to 'const char*'
../../../dom/ipc/TabParent.cpp:201: error:   initializing argument 1 of 'nsPrintfCString::nsPrintfCString(const char*, ...)'
make[6]: *** [TabParent.o] Error 1
}

https://hg.mozilla.org/integration/mozilla-inbound/rev/99649652cb34
Comment 91 Justin Lebar (not reading bugmail) 2012-04-25 08:58:57 PDT
Bitrotted by bug 745659.
Comment 92 Justin Lebar (not reading bugmail) 2012-04-25 13:25:34 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/9d00516b0ad7

Note You need to log in before you can comment on or make changes to this bug.