Closed Bug 1253788 Opened 8 years ago Closed 8 years ago

Crash while using web components from chrome:// content

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox48 --- fixed

People

(Reporter: fabrice, Assigned: heycam)

References

(Blocks 1 open bug)

Details

(Whiteboard: btpp-fixlater fixed-in-pine)

Attachments

(4 files, 1 obsolete file)

I'm working on a reduced test case (currently seeing that when loading gaia system app as chrome://xxx), but here's the stack trace:

Program received signal SIGSEGV, Segmentation fault.
nsINode::CompareDocumentPosition (this=0x0, aOtherNode=...) at /home/fabrice/dev/pine/dom/base/nsINode.cpp:865
865	  if (GetPreviousSibling() == &aOtherNode) {
(gdb) bt
#0  nsINode::CompareDocumentPosition (this=0x0, aOtherNode=...) at /home/fabrice/dev/pine/dom/base/nsINode.cpp:865
#1  0xb420c7ec in nsContentUtils::PositionIsBefore (aNode1=aNode1@entry=0xaab94b80, aNode2=<optimized out>) at ../../../../pine/dom/base/nsContentUtils.cpp:2317
#2  0xb4264b0a in mozilla::dom::ShadowRoot::InsertSheet (this=0xb074d620, aSheet=..., aLinkingContent=aLinkingContent@entry=0xaab94b80)
    at /home/fabrice/dev/pine/dom/base/ShadowRoot.cpp:154
#3  0xb4a70632 in mozilla::css::Loader::LoadInlineStyle (this=0xadb6dfd0, aElement=0xaab94b80, aBuffer=..., aLineNumber=1, aTitle=..., aMedia=..., 
    aScopeElement=aScopeElement@entry=0x0, aObserver=aObserver@entry=0x0, aCompleted=aCompleted@entry=0xbef0f4e3, aIsAlternate=aIsAlternate@entry=0xbef0f4e2)
    at /home/fabrice/dev/pine/layout/style/Loader.cpp:2009
#4  0xb42d2df6 in nsStyleLinkElement::DoUpdateStyleSheet (this=0xaab94bcc, aOldDocument=<optimized out>, aOldShadowRoot=<optimized out>, 
    aObserver=aObserver@entry=0x0, aWillNotify=aWillNotify@entry=0xbef0f7d6, aIsAlternate=aIsAlternate@entry=0xbef0f7d7, aForceUpdate=false)
    at /home/fabrice/dev/pine/dom/base/nsStyleLinkElement.cpp:432
#5  0xb42d2fe0 in nsStyleLinkElement::UpdateStyleSheetInternal (this=<optimized out>, aOldDocument=<optimized out>, aOldShadowRoot=<optimized out>, 
    aForceUpdate=<optimized out>) at /home/fabrice/dev/pine/dom/base/nsStyleLinkElement.cpp:237
#6  0xb3d06d2a in apply<nsMemoryReporterManager, nsresult (nsMemoryReporterManager::*)()> (m=<optimized out>, o=<optimized out>, this=<optimized out>)
    at ../../dist/include/nsThreadUtils.h:663
#7  nsRunnableMethodImpl<nsresult (nsMemoryReporterManager::*)(), true>::Run (this=<optimized out>) at ../../dist/include/nsThreadUtils.h:870
#8  0xb42151d4 in nsContentUtils::RemoveScriptBlocker () at ../../../../pine/dom/base/nsContentUtils.cpp:5136
#9  0xb42a56e2 in nsDocument::EndUpdate (this=this@entry=0xad1dc000, aUpdateType=1) at /home/fabrice/dev/pine/dom/base/nsDocument.cpp:4918
#10 0xb47673e6 in nsHTMLDocument::EndUpdate (this=0xad1dc000, aUpdateType=<optimized out>) at /home/fabrice/dev/pine/dom/html/nsHTMLDocument.cpp:2516
#11 0xb424aeec in mozAutoDocUpdate::~mozAutoDocUpdate (this=0xbef0f888, __in_chrg=<optimized out>) at /home/fabrice/dev/pine/dom/base/mozAutoDocUpdate.h:40
#12 0xb42b2960 in nsINode::ReplaceOrInsertBefore (this=this@entry=0xaff7bfb0, aReplace=<optimized out>, aNewChild=aNewChild@entry=0xb07f96f0, 
    aRefChild=<optimized out>, aError=...) at /home/fabrice/dev/pine/dom/base/nsINode.cpp:2174
#13 0xb4412a8a in InsertBefore (aError=..., aChild=0x0, aNode=..., this=0xaff7bfb0) at ../../../../pine/dom/base/nsINode.h:1755
#14 AppendChild (aError=..., aNode=..., this=0xaff7bfb0) at ../../../../pine/dom/base/nsINode.h:1759
#15 mozilla::dom::NodeBinding::appendChild (cx=0xae151900, obj=..., self=0xaff7bfb0, args=...) at NodeBinding.cpp:632
#16 0xb467cd3c in mozilla::dom::GenericBindingMethod (cx=0xae151900, argc=<optimized out>, vp=<optimized out>)
    at /home/fabrice/dev/pine/dom/bindings/BindingUtils.cpp:2731
#17 0xb530bdca in CallJSNative (args=..., native=0xb467cc71 <mozilla::dom::GenericBindingMethod(JSContext*, unsigned int, JS::Value*)>, cx=0xae151900)
    at ../../../../pine/js/src/jscntxtinlines.h:235
#18 js::Invoke (cx=0xae151900, args=..., construct=<optimized out>) at /home/fabrice/dev/pine/js/src/vm/Interpreter.cpp:478
#19 0xb5304b76 in Interpret (cx=0xae151900, state=...) at /home/fabrice/dev/pine/js/src/vm/Interpreter.cpp:2802
#20 0xb530bb8c in js::RunScript (cx=cx@entry=0xae151900, state=...) at /home/fabrice/dev/pine/js/src/vm/Interpreter.cpp:428
#21 0xb530bcc6 in js::Invoke (cx=cx@entry=0xae151900, args=..., construct=construct@entry=js::NO_CONSTRUCT)
    at /home/fabrice/dev/pine/js/src/vm/Interpreter.cpp:496
#22 0xb530c2e4 in js::Invoke (cx=cx@entry=0xae151900, thisv=..., fval=..., argc=<optimized out>, argv=0xbef10058, rval=...)
    at /home/fabrice/dev/pine/js/src/vm/Interpreter.cpp:530
#23 0xb52652c2 in JS::Call (cx=0xae151900, thisv=thisv@entry=..., fval=fval@entry=..., args=..., rval=...) at /home/fabrice/dev/pine/js/src/jsapi.cpp:2892
#24 0xb445c17c in mozilla::dom::AnyCallback::Call (this=this@entry=0xaf3a5b20, cx=0xae151900, aThisVal=..., value=..., value@entry=..., 
    aRetVal=aRetVal@entry=..., aRv=...) at PromiseBinding.cpp:94
#25 0xb496612c in mozilla::dom::AnyCallback::Call (this=0xaf3a5b20, value=value@entry=..., aRetVal=aRetVal@entry=..., aRv=..., aExecutionReason=<optimized out>, 
    aExceptionHandling=aExceptionHandling@entry=mozilla::dom::CallbackObject::eRethrowExceptions, aCompartment=0xacf0ec00)
    at ../../dist/include/mozilla/dom/PromiseBinding.h:224
---Type <return> to continue, or q <return> to quit---
#26 0xb496aee8 in mozilla::dom::WrapperPromiseCallback::Call (this=0xaf3a5820, aCx=0xb0801040, aValue=...)
    at /home/fabrice/dev/pine/dom/promise/PromiseCallback.cpp:338
#27 0xb4966ab4 in mozilla::dom::PromiseReactionJob::Run (this=<optimized out>) at /home/fabrice/dev/pine/dom/promise/Promise.cpp:106
#28 0xb4967494 in mozilla::dom::Promise::PerformMicroTaskCheckpoint () at /home/fabrice/dev/pine/dom/promise/Promise.cpp:937
#29 0xb3d04c0c in mozilla::CycleCollectedJSRuntime::AfterProcessTask (this=0xb1dc1c00, aRecursionDepth=<optimized out>)
    at /home/fabrice/dev/pine/xpcom/base/CycleCollectedJSRuntime.cpp:1348
#30 0xb403112c in XPCJSRuntime::AfterProcessTask (this=0xb1dc1c00, aNewRecursionDepth=1) at /home/fabrice/dev/pine/js/xpconnect/src/XPCJSRuntime.cpp:3690
#31 0xb3d28c1c in nsThread::ProcessNextEvent (this=0xb6a02550, aMayWait=<optimized out>, aResult=0xbef10517)
    at /home/fabrice/dev/pine/xpcom/threads/nsThread.cpp:1009
#32 0xb3d3a418 in NS_ProcessNextEvent (aThread=<optimized out>, aMayWait=aMayWait@entry=false) at /home/fabrice/dev/pine/xpcom/glue/nsThreadUtils.cpp:297
#33 0xb3ea5334 in mozilla::ipc::MessagePump::Run (this=0xb6a55460, aDelegate=0xb1dc71a0) at /home/fabrice/dev/pine/ipc/glue/MessagePump.cpp:95
#34 0xb3e942ae in MessageLoop::RunInternal (this=this@entry=0xb1dc71a0) at /home/fabrice/dev/pine/ipc/chromium/src/base/message_loop.cc:234
#35 0xb3e94360 in RunHandler (this=0xb1dc71a0) at /home/fabrice/dev/pine/ipc/chromium/src/base/message_loop.cc:227
#36 MessageLoop::Run (this=0xb1dc71a0) at /home/fabrice/dev/pine/ipc/chromium/src/base/message_loop.cc:201
#37 0xb49e8c7a in nsBaseAppShell::Run (this=0xb0595040) at /home/fabrice/dev/pine/widget/nsBaseAppShell.cpp:156
#38 0xb4d821f0 in nsAppStartup::Run (this=0xb0563970) at /home/fabrice/dev/pine/toolkit/components/startup/nsAppStartup.cpp:281
#39 0xb4da4c44 in XREMain::XRE_mainRun (this=this@entry=0xbef10688) at ../../../../pine/toolkit/xre/nsAppRunner.cpp:4327
#40 0xb4da4ea0 in XREMain::XRE_main (this=this@entry=0xbef10688, argc=argc@entry=1, argv=argv@entry=0xb6a2b0d0, aAppData=aAppData@entry=0xb6f74c98 <_ZL8sAppData>)
    at ../../../../pine/toolkit/xre/nsAppRunner.cpp:4424
#41 0xb4da5046 in XRE_main (argc=1, argv=0xb6a2b0d0, aAppData=0xb6f74c98 <_ZL8sAppData>, aFlags=<optimized out>)
    at ../../../../pine/toolkit/xre/nsAppRunner.cpp:4526
#42 0xb6f566f4 in do_main (argc=argc@entry=1, argv=argv@entry=0xb6a2b0d0) at ../../../../pine/b2g/app/nsBrowserApp.cpp:167
#43 0xb6f56840 in b2g_main (argc=1, argv=<optimized out>) at ../../../../pine/b2g/app/nsBrowserApp.cpp:299
#44 0xb6f565a6 in RunProcesses (aReservedFds=..., argv=0xbef11954, argc=1) at ../../../../pine/b2g/app/B2GLoader.cpp:233
#45 main (argc=1, argv=0xbef11954) at ../../../../pine/b2g/app/B2GLoader.cpp:300
Could you perhaps link to the specific version and line of code here?
Maybe the top 4 entries in the stack. That might be enough to reveal the issue.
Flags: needinfo?(fabrice)
Flags: needinfo?(wchen)
The gecko build used in based on m-c revision be593a64d7c6
Flags: needinfo?(fabrice)
Whiteboard: btpp-fixlater
I had a look at other similar uses of |GetOwnerNode()| within Gecko and found that they were all checking the return value for validity. In the case exposed by Fabrice, which I reproduce, we have GetOwnerNode() at https://hg.mozilla.org/projects/pine/file/efd11382e5f3/dom/base/ShadowRoot.cpp#l153 that return nullptr.

I have a patch ensuring the return value from there is non null in bug 1256506 attachment 8731579 [details] [diff] [review]: it works. I have not investigated if we are missing or breaking anything else, but at least Gecko does not die in suffering :)
Andrew, would you mind reviewing this? As explained in the comment before, All other uses of this |GetOwnerNode()| within Gecko checks for nullptr. With this I don't have any problem booting at all, but I would like to be sure this is not just hiding dust under the carpet :)
Attachment #8731722 - Flags: review?(overholt)
Comment on attachment 8731722 [details] [diff] [review]
0001-Bug-XXX-Avoid-segfault-on-nullptr-from-GetOwnerNode.patch

Redirecting review as suggested. First one to review gets a beer in London :)
Attachment #8731722 - Flags: review?(wchen)
Attachment #8731722 - Flags: review?(overholt)
Attachment #8731722 - Flags: review?(bugs)
Comment on attachment 8731722 [details] [diff] [review]
0001-Bug-XXX-Avoid-segfault-on-nullptr-from-GetOwnerNode.patch

I'm not familiar with stylesheets' ownership model and when it is ok for
GetOwnerNode to return null.
Cameron should know that stuff. 
(Perhaps William recalls it too :) )
Attachment #8731722 - Flags: review?(bugs) → review?(cam)
https://hg.mozilla.org/projects/pine/rev/e0ad375c9d14
Whiteboard: btpp-fixlater → btpp-fixlater fixed-in-pine
Comment on attachment 8731722 [details] [diff] [review]
0001-Bug-XXX-Avoid-segfault-on-nullptr-from-GetOwnerNode.patch

Review of attachment 8731722 [details] [diff] [review]:
-----------------------------------------------------------------

It would be nice to have a test case so I can investigate how we're getting a style without an owning node. If we do need to get this fixed now, it would be better to check for an owning node earlier in the method and early return with a warning if we don't have one.
Attachment #8731722 - Flags: review?(wchen)
It's possible in general for style sheets to have a null owning node, e.g. sheets that come from the nsLayoutStylesheetCache or the nsStyleSheetService.  But I'm not sure why sheets from ShadowRoot::mProtoBinding should have a null owning node.

Alexandre, could you provide a testcase, or see what CSSStyleSheet without an owning node we're passing to ShadowRoot::InsertSheet and where it came from?
Flags: needinfo?(lissyx+mozillians)
A testcase, for now, I don't think I can. Regarding finding the CSSStyleSheet, I am not sure how I should do this. However, once I have the system booting with chrome URLs and that patch applied, we do see a lot of errors in logcat where the system complains about stylesheet being of type "text/html" instead of css. This is happening on a lot of usecases of shadow root stylesheet insertion.

> 03-18 16:33:55.030  2325  2325 E GeckoConsole: [JavaScript Error: "The stylesheet chrome://gaia/content/system/index.html was not loaded because its MIME type, "text/html", is not "text/css"." {file: "chrome://gaia/content/shared/js/component_utils.js" line: 38}]

Which is: https://github.com/mozilla-b2g/gaia/blob/c4a51139beee99b3e117ae8db7c17f734f94eca1/shared/js/component_utils.js#L38

My guess is that those errors are just a consequence of my patch, and we have a real issue here where WebComponents are not working in our case.
Flags: needinfo?(lissyx+mozillians) → needinfo?(cam)
Now, more informations: I have retested loading without my workaround and this is crashing again, as expected. Yet what is interesting is that |print DumpJSStack()| in gdb shows https://github.com/mozilla-b2g/gaia/blob/c4a51139beee99b3e117ae8db7c17f734f94eca1/apps/system/js/pin_page_system_dialog.js#L33

if I do comment this line and only this one, then it loads properly. I still see told of JS error about mime type as documented above.
So it is when a custom element is appended, which presumably causes the style sheet to be added in the shadow tree.  Can you tell me how I can run that app on my desktop so that I can catch it in gdb too?  Thanks!
Flags: needinfo?(cam) → needinfo?(lissyx+mozillians)
Sure, this is what I am currently focusing on: getting a testcase and/or exposing this on mulet :)
Flags: needinfo?(lissyx+mozillians)
Good, so I am reproducing this with a mulet build:
 - get pine (https://hg.mozilla.org/projects/pine/),
 - get kanikani branch https://github.com/mozilla-b2g/gaia/tree/kanikani, run: make PRODUCTION=1 GAIA_OPTIMIZE=0 profile
 - the profile is produced in gaia/profile
 - on pine, change the path of chrome gaia in b2g/components/B2GComponents.manifest:
   - it is defined at file:///system/b2g/apps/
   - change it to a full path to your |apps| directory in the gaia profile generated
 - build mulet (ac_add_options --enable-application=b2g/dev)
 - run mulet: ./obj-mulet/dist/firefox/firefox -no-remote -profile path/to/gaia/profile

I am still working on a testcase though :)
So far, crash append here: https://github.com/mozilla-b2g/gaia/blob/c4a51139beee99b3e117ae8db7c17f734f94eca1/shared/elements/gaia-component/gaia-component.js#L144

The |this.createShadowRoot()| statement works, returns something that looks like a valid ShadowRoot. But then the subsequent call to .appendChild(node) triggers the crash.
Looks like the two |gaia-button| in PinPageSystemDialog have something to do with this crash. The code is at https://github.com/mozilla-b2g/gaia/blob/c4a51139beee99b3e117ae8db7c17f734f94eca1/apps/system/js/pin_page_system_dialog.js#L56 and what I have been able to observe is that:
 - with both <gaia-button> element present, crash happens
 - removing completely the first OR the second <gaia-button>, no crash
 - removing content of BOTH <gaia-button> elements, crash happens
Changing the |view| method to:
>    55   PinPageSystemDialog.prototype.view = function spd_view() {
>    56     return `<gaia-dialog id="${this.instanceID}">
>    57                 <gaia-header id="pin-page-header" action="close">
>    58                 </gaia-header>
>    59                   <gaia-button data-action="pin-page" class="centered">
>    60                   </gaia-button>
>    61                   <gaia-button data-action="pin-site" class="centered">
>    62                   </gaia-button>
>    63             </gaia-dialog>`;
>    64   };

This reproduces the crash.

Removing any of the elements inside |<gaia-dialog>| avoids the crash, be it |<gaia-header>| or one of the  |<gaia-button>|.
I might have a chrome mochitest reproducing the crash ...
Patch introduching a chrome mochitest exposing the crash
(In reply to Alexandre LISSY :gerard-majax from comment #19)
> Created attachment 8733865 [details] [diff] [review]
> ShadowRoot null pointer deref with chrome:// and WebComponents
> 
> Patch introduching a chrome mochitest exposing the crash

Tested only with Mulet from Pine for now.
Good, I can confirm I do reproduce the very same crash with a browser debug build and patch from attachment 8733865 [details] [diff] [review] applied.
Flags: needinfo?(cam)
The part of the patch touching dom/base/ShadowRoot.cpp is really only useful on a pine tree. On central, this part should fail to apply (since we have not landed that workaround) but can be safely ignored.
Thanks, I can reproduce with that mochitest.  I'll take a look.
So I think the null owner node is wrong, and that all sheets we add to the ShadowRoot's nsXBLPrototypeResources should have an owner node.

We have a number of inline <style> elements that are created and inserted into the document and ShadowRoots.  We use the document's URL as the sheet URL, which for the mochitest here is chrome://mochitests/content/chrome/dom/tests/mochitest/webcomponents/test_webcomponents_chrome.html.  When we add these sheets to a ShadowRoot, we can end up calling nsXBLPrototypeResources::FlushSkinSheets.  That method reloads any chrome: URL sheets, replacing them in nsXBLPrototypeResources::mStyleSheetList.  Before reloading, the sheet will have a correct mOwningNode.  After reloading, it won't (since we don't call SetStyleSheet again on the <style> element that pointed to the old sheet, something that ShadowRoot is responsible for doing), and it will attempt to load the .../test_web_components_chrome.html as the style sheet contents (which is obviously wrong).

Boris: are skin sheets still a thing, and if so, will they ever be inserted into a Web Component's nsXBLPrototypeResources, or can we just remove the FlushSkinSheets call to avoid this problem?  If they are something we need to worry about here I guess we need to identify them more accurately than just "are they chrome: URLs".
Flags: needinfo?(cam) → needinfo?(bzbarsky)
Comment on attachment 8731722 [details] [diff] [review]
0001-Bug-XXX-Avoid-segfault-on-nullptr-from-GetOwnerNode.patch

Review of attachment 8731722 [details] [diff] [review]:
-----------------------------------------------------------------

Per comment 24, we'll need a different fix.
Attachment #8731722 - Flags: review?(cam) → review-
> Boris: are skin sheets still a thing

I don't know...  I haven't been following what our current theming setup is in Firefox.

> and if so, will they ever be inserted into a Web Component's nsXBLPrototypeResources

If they exist at all, I don't see why they wouldn't be.

That said, having FlushSkinSheets do anything with _inline_ style sheets is just broken, right?  We should at least stop doing that.  If we do, is there still a problem for <link rel="stylesheet" href="chrome://"?
Flags: needinfo?(bzbarsky)
OK.  It shouldn't be a problem for the <link> since that'll have a real URL that can be reloaded.  Sounds like we should just record which are inline sheets and skip them in FlushSkinSheets then.
Attached patch patch (obsolete) — Splinter Review
Assignee: nobody → cam
Status: NEW → ASSIGNED
Flags: needinfo?(wchen)
Attachment #8734656 - Flags: review?(bzbarsky)
All WebComponents now gets displayed properly, and no more console log message about fetching CSS with wrong mime type. As expected, this was the same root cause and the patch fixes it completely :)
Great!  Thanks for doing the initial debugging, Alexandre.
Comment on attachment 8734656 [details] [diff] [review]
patch

>+  bool IsInline() const { return mSource == Source::Inline; }

Why could we not simply implement isInline as:

  bool isInline() const { return GetOriginalURI(); }

?  Seems like it would reduce the complexity a good bit...
Flags: needinfo?(cam)
Attached patch patch v2Splinter Review
Yes, that works too.  (I was unsure that a null original URI was used only for inline sheets, but looking at all the places we call SetURIs it seems to be the case.)
Flags: needinfo?(cam)
Attachment #8735052 - Flags: review?(bzbarsky)
Attachment #8734656 - Attachment is obsolete: true
Attachment #8734656 - Flags: review?(bzbarsky)
Comment on attachment 8735052 [details] [diff] [review]
patch v2

r=me.  Note that we had similar code in nsChromeRegistry::RefreshWindow that was null-checking (and then using, but in practice they're the same if not null) the original uri that might make sense to switch to the new boolean check.  Also might make sense to use the new check in nsStyleLinkElement::UpdateStyleSheet and maybe in nsDocument::OnAppThemeChanged.
Attachment #8735052 - Flags: review?(bzbarsky) → review+
(In reply to Boris Zbarsky [:bz] from comment #34)
> r=me.  Note that we had similar code in nsChromeRegistry::RefreshWindow that
> was null-checking (and then using, but in practice they're the same if not
> null) the original uri that might make sense to switch to the new boolean
> check.

Since those URIs are the same in practice (aside from being original URI being null for inline sheets), it might be better just to get rid of the concept of a sheet's original URI altogether and instead use a boolean to record whether it's inline.  I'll make those changes you suggest, using IsInline() and GetSheetURI() instead of GetOriginalURI(), and file a followup to consider getting rid of GetOriginalURI.
https://hg.mozilla.org/mozilla-central/rev/a1ff07269746
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
> Since those URIs are the same in practice (aside from being original URI being null for inline sheets)

Ah, this did not use to be the case.  The sheet URI used to take into account things like HTTP redirects, until that got changed.
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.