Closed
Bug 833935
Opened 11 years ago
Closed 11 years ago
Tab abnormally stops loading content or app is abnormally killed when viewing certain wired.com articles
Categories
(Firefox OS Graveyard :: General, defect)
Tracking
(blocking-b2g:tef+, firefox19 wontfix, firefox20 wontfix, firefox21 fixed, b2g18 fixed, b2g18-v1.0.0 fixed, b2g18-v1.0.1 fixed)
People
(Reporter: jsmith, Assigned: jduell.mcbugs)
References
Details
Attachments
(3 files)
7.22 KB,
patch
|
jdm
:
review+
|
Details | Diff | Splinter Review |
8.35 KB,
patch
|
jdm
:
review+
|
Details | Diff | Splinter Review |
2.69 KB,
patch
|
jdm
:
review+
|
Details | Diff | Splinter Review |
Break out of bug 832796 for the tab abnormally stopping load based on https://bugzilla.mozilla.org/show_bug.cgi?id=832796#c20. Jason indicates the following: "This means code *using* an HTTP channel was misbehaving by not passing security info (i.e. not setting a TabChild as the channel callbacks). Not a necko bug. We need to find the offending client code and fork a bug to fix it." Logcats can be found on that bug for more info.
Reporter | ||
Comment 1•11 years ago
|
||
I think I've seen this bug as well in app mode, so nominating for the same reasons why bug 832796 was +.
blocking-b2g: --- → tef?
Component: Gaia::Browser → General
QA Contact: nhirata.bugzilla
Reporter | ||
Updated•11 years ago
|
Summary: Tab abnormally stops loading content from certain wired.com articles → Tab abnormally stops loading content or app is abnormally killed when viewing certain wired.com articles
Updated•11 years ago
|
blocking-b2g: tef? → tef+
Comment 3•11 years ago
|
||
Hot damn: Breakpoint 3, mozilla::net::PNeckoChild::SendPHttpChannelConstructor (this=0x4276e680, actor=0x4397c000, browser=0x0, loadContext=...) at /run/media/jdm/ssd/b2g/B2G/objdir-gecko/ipc/ipdl/PNeckoChild.cpp:253 253 { #0 mozilla::net::PNeckoChild::SendPHttpChannelConstructor (this=0x4276e680, actor=0x4397c000, browser=0x0, loadContext=...) at /run/media/jdm/ssd/b2g/B2G/objdir-gecko/ipc/ipdl/PNeckoChild.cpp:253 #1 0x404b3fda in mozilla::net::HttpChannelChild::AsyncOpen (this=0x4397c000, listener=<value optimized out>, aContext=<value optimized out>) at /run/media/jdm/ssd/b2g/B2G/gecko/netwerk/protocol/http/HttpChannelChild.cpp:1043 #2 0x40684ef6 in nsScriptLoader::StartLoad (this=0x43971c40, aRequest=0x4391a340, aType=<value optimized out>) at /run/media/jdm/ssd/b2g/B2G/gecko/content/base/src/nsScriptLoader.cpp:330 #3 0x40684f88 in nsScriptLoader::PreloadURI (this=0x43971c40, aURI=<value optimized out>, aCharset=..., aType=..., aCrossOrigin=...) at /run/media/jdm/ssd/b2g/B2G/gecko/content/base/src/nsScriptLoader.cpp:1250 #4 0x40849c24 in nsHtml5TreeOpExecutor::PreloadScript (this=<value optimized out>, aURL=<value optimized out>, aCharset=..., aType=..., aCrossOrigin=...) at /run/media/jdm/ssd/b2g/B2G/gecko/parser/html/nsHtml5TreeOpExecutor.cpp:1058 #5 0x4084cc60 in nsHtml5SpeculativeLoad::Perform (this=<value optimized out>, aExecutor=0x45c47b50) at /run/media/jdm/ssd/b2g/B2G/gecko/parser/html/nsHtml5SpeculativeLoad.cpp:35 #6 0x4084a3ea in nsHtml5TreeOpExecutor::RunFlushLoop (this=0x45c47b50) at /run/media/jdm/ssd/b2g/B2G/gecko/parser/html/nsHtml5TreeOpExecutor.cpp:514 #7 0x4084c286 in nsHtml5ExecutorFlusher::Run (this=<value optimized out>) at /run/media/jdm/ssd/b2g/B2G/gecko/parser/html/nsHtml5StreamParser.cpp:127 #8 0x40c191b2 in nsThread::ProcessNextEvent (this=0x41a06880, mayWait=<value optimized out>, result=0xbec64fd7) at /run/media/jdm/ssd/b2g/B2G/gecko/xpcom/threads/nsThread.cpp:620 #9 0x40bf95de in NS_ProcessNextEvent_P (thread=0x43971c40, mayWait=false) at /run/media/jdm/ssd/b2g/B2G/objdir-gecko/xpcom/build/nsThreadUtils.cpp:237 #10 0x40b16c00 in mozilla::ipc::MessagePump::Run (this=0x41a022e0, aDelegate=0xbec658e8) at /run/media/jdm/ssd/b2g/B2G/gecko/ipc/glue/MessagePump.cpp:82 #11 0x40b16cb2 in mozilla::ipc::MessagePumpForChildProcess::Run (this=0x41a022e0, aDelegate=0xbec658e8) at /run/media/jdm/ssd/b2g/B2G/gecko/ipc/glue/MessagePump.cpp:231 #12 0x40c3af90 in MessageLoop::RunInternal (this=0x1000000) at /run/media/jdm/ssd/b2g/B2G/gecko/ipc/chromium/src/base/message_loop.cc:215 #13 0x40c3b046 in MessageLoop::RunHandler (this=0xbec658e8) at /run/media/jdm/ssd/b2g/B2G/gecko/ipc/chromium/src/base/message_loop.cc:208 #14 MessageLoop::Run (this=0xbec658e8) at /run/media/jdm/ssd/b2g/B2G/gecko/ipc/chromium/src/base/message_loop.cc:182 #15 0x40a9d6f0 in nsBaseAppShell::Run (this=0x4323f040) at /run/media/jdm/ssd/b2g/B2G/gecko/widget/xpwidgets/nsBaseAppShell.cpp:163 #16 0x4043c4dc in XRE_RunAppShell () at /run/media/jdm/ssd/b2g/B2G/gecko/toolkit/xre/nsEmbedFunctions.cpp:646 #17 0x40b16c80 in mozilla::ipc::MessagePumpForChildProcess::Run (this=0x41a022e0, aDelegate=0xbec658e8) at /run/media/jdm/ssd/b2g/B2G/gecko/ipc/glue/MessagePump.cpp:198 #18 0x40c3af90 in MessageLoop::RunInternal (this=0x4323f040) at /run/media/jdm/ssd/b2g/B2G/gecko/ipc/chromium/src/base/message_loop.cc:215 #19 0x40c3b046 in MessageLoop::RunHandler (this=0xbec658e8) at /run/media/jdm/ssd/b2g/B2G/gecko/ipc/chromium/src/base/message_loop.cc:208 #20 MessageLoop::Run (this=0xbec658e8) at /run/media/jdm/ssd/b2g/B2G/gecko/ipc/chromium/src/base/message_loop.cc:182 #21 0x4043c880 in XRE_InitChildProcess (aArgc=<value optimized out>, aArgv=<value optimized out>, aProcess=GeckoProcessType_Content) at /run/media/jdm/ssd/b2g/B2G/gecko/toolkit/xre/nsEmbedFunctions.cpp:485 #22 0x00008410 in main (argc=5, argv=0xbec65a44) at /run/media/jdm/ssd/b2g/B2G/gecko/ipc/app/MozillaRuntimeMain.cpp:48
Comment 4•11 years ago
|
||
It's interesting. I've seen failures in SendPHttpChannel, GetCookieString, SetCookieString, and WebSocketChild::SendAsyncOpen all stemming from the same problem, where the PBrowser argument is null. Having caught the websocket one in gdb, I can see that we actually have a loadgroup, but are surprisingly not able to obtain an nsITabChild from it. Boris, do you have any suggestions for how to figure out why this is?
Updated•11 years ago
|
Flags: needinfo?(bzbarsky)
Comment 5•11 years ago
|
||
I sorted this out with Boris today: it looks like loads are starting from pages/frames after the containing page has already been destroyed, so there's no tree owner for the docshell to retrieve the nsITabChild from. This presumably isn't a problem with non-IPC channels; the only good solution I see here is to abort the AsyncOpen call if a null nsITabChild is encountered. The downside here is that we lose the high visibility failure case for when our channels are simply set up incorrectly, but the alternative is a very ill-defined investigation into how to keep the loads from starting in the first place.
Flags: needinfo?(bzbarsky)
Assignee | ||
Comment 6•11 years ago
|
||
> the only good solution I see here is to abort the AsyncOpen call if a null
> nsITabChild is encountered
I was thinking the same thing. Just return NS_ERROR_ILLEGAL_VALUE or something. I think synchronous failure here should be ok--it's certainly easier to code.
Assignee | ||
Comment 7•11 years ago
|
||
We could potentially throw in a MOZ_ASSERT(TabChild) here too, and then we'd catch other cases where we don't setup the channels correctly. Of course, if we have cases in our test harness that hit this because of the containing-page-has-gone-away issue, we'd get random oranges. But I suspect we don't have any, so try throwing it in and let's see if it passes try OK. I also think issuing a printf_stderr("WARNING: child channel tried to open channel w/o security info" msg is worth issuing here too). I have a patch that does some of the infrastructure here. jdm is off IRC. At the risk of duplicating effort I'm going to steal this and just finish that patch.
Assignee: josh → jduell.mcbugs
Assignee | ||
Comment 8•11 years ago
|
||
Move the usingSecurity variables out from being static in NeckoParent/Child so we can check in other files whether security is being used or not. Could probably do a fix with less lines of code this is pretty much risk free, just boilerplate.
Attachment #706233 -
Flags: review?(josh)
Assignee | ||
Comment 9•11 years ago
|
||
This doesn't cover the appcache (see OfflineCacheUpdateChild.cpp, etc)--I need to look into that tomorrow.
Attachment #706234 -
Flags: review?(josh)
Assignee | ||
Comment 10•11 years ago
|
||
Try run: https://tbpl.mozilla.org/?tree=Try&rev=c3b7cbcf93df
Comment 11•11 years ago
|
||
Comment on attachment 706234 [details] [diff] [review] Part 2: don't kill child in release builds if navigation causes missing TabChild Review of attachment 706234 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/protocol/wyciwyg/WyciwygChannelChild.cpp @@ +602,5 @@ > URIParams originalURI; > SerializeURI(mOriginalURI, originalURI); > > mozilla::dom::TabChild* tabChild = GetTabChild(this); > + if (MissingRequiredTabChild(tabChild, "http")) { wyciwyg
Attachment #706234 -
Flags: review?(josh) → review+
Updated•11 years ago
|
Attachment #706233 -
Flags: review?(josh) → review+
Comment 12•11 years ago
|
||
I'll fix this up and land both patches. Also, I looked at OfflineCacheUpdateChild and it already has null tabchild detection and synchronous abort, so I'm updating it to use the same mechanism.
Comment 13•11 years ago
|
||
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/b662c6940cf4 remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/eddbffde3918 remote: https://hg.mozilla.org/releases/mozilla-b2g18/rev/b69e1466e761 remote: https://hg.mozilla.org/releases/mozilla-b2g18/rev/94a2d6fcdfde
Status: NEW → RESOLVED
Closed: 11 years ago
status-b2g18:
--- → fixed
status-firefox21:
--- → fixed
Resolution: --- → FIXED
Assignee | ||
Comment 14•11 years ago
|
||
jdm: how do you feel about landing this too as part of this bug? I realized that by putting static vars inside an inline function we'll wind up creating a copy of them per compilation unit, which seems... unintuitive. GCC at least isn't creating the static in compilation units that don't use the function, but I could imagine other compilers might. Anyway, seems messy. Can also get split out into a new bug, but it's only been a day since we landed part 1/2. Also: have you checked to make sure the JS TCP socket API won't kill the child if we try to open one when the containing page has gone away?
Attachment #706714 -
Flags: review?(josh)
Assignee | ||
Comment 15•11 years ago
|
||
This and bug 832796 need to be verified (i.e. no more crashes in B2G on wired.com?)
Keywords: qawanted
Reporter | ||
Updated•11 years ago
|
Comment 16•11 years ago
|
||
Comment on attachment 706714 [details] [diff] [review] Part 3: stop using static var per compilation unit Review of attachment 706714 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/ipc/NeckoCommon.cpp @@ +11,5 @@ > +namespace net { > + > +namespace NeckoCommonInternal { > + bool securityDisabled = true; > + bool registeredBool = false; Use a g prefix for the name, please.
Attachment #706714 -
Flags: review?(josh) → review+
Comment 17•11 years ago
|
||
(In reply to Jason Duell (:jduell) from comment #14) > Also: have you checked to make sure the JS TCP socket API won't kill the > child if we try to open one when the containing page has gone away? Good catch. I just looked, and we don't actually use the necko security stuff o.O We just merrily go on our way in the parent if no PBrowser is present. I'll fix that.
Comment 18•11 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/629d2d1485e2 https://hg.mozilla.org/releases/mozilla-b2g18/rev/2226f181c8c2 Pushed part 3.
Comment 19•11 years ago
|
||
FYI, this needs uplift to http://hg.mozilla.org/releases/mozilla-b2g18_v1_0_0
Comment 20•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/b662c6940cf4 https://hg.mozilla.org/mozilla-central/rev/eddbffde3918 https://hg.mozilla.org/mozilla-central/rev/629d2d1485e2
Reporter | ||
Updated•11 years ago
|
Whiteboard: [needs-b2g-v1.0 uplift]
Assignee | ||
Comment 21•11 years ago
|
||
jdm: I'm guessing we should fork the TCP socket fix to another bug at this point. Landed part3 on v1.0.0: https://hg.mozilla.org/releases/mozilla-b2g18_v1_0_0/rev/dc3e991c5ef0 Since I don't see a tracking flags for b2gv1.0.0, I'm assuming clearing the whiteboard is all I need to do for that.
Whiteboard: [needs-b2g-v1.0 uplift]
Updated•11 years ago
|
status-firefox19:
--- → wontfix
status-firefox20:
--- → wontfix
Target Milestone: --- → B2G C4 (2jan on)
Updated•11 years ago
|
status-b2g18-v1.0.0:
--- → fixed
Updated•11 years ago
|
status-b2g18-v1.0.1:
--- → fixed
You need to log in
before you can comment on or make changes to this bug.
Description
•