Crash in XPCWrappedNative::CallMethod while starting the addon manager
Categories
(Core :: XPConnect, defect, P3)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr60 | --- | wontfix |
firefox-esr68 | --- | wontfix |
firefox59 | + | wontfix |
firefox60 | + | wontfix |
firefox61 | - | wontfix |
firefox62 | - | wontfix |
firefox63 | --- | wontfix |
firefox64 | --- | wontfix |
firefox65 | --- | wontfix |
firefox66 | --- | wontfix |
firefox67 | --- | wontfix |
firefox68 | --- | wontfix |
firefox69 | --- | wontfix |
firefox70 | --- | wontfix |
People
(Reporter: philipp, Unassigned)
Details
(4 keywords)
Crash Data
This bug was filed from the Socorro interface and is report bp-8a94a204-a751-433b-b2e9-df8f50180315. ============================================================= Top 10 frames of crashing thread: 0 xul.dll XPCWrappedNative::CallMethod js/xpconnect/src/XPCWrappedNative.cpp:1234 1 xul.dll XPC_WN_CallMethod js/xpconnect/src/XPCWrappedNativeJSOps.cpp:929 2 xul.dll js::InternalCallOrConstruct js/src/vm/Interpreter.cpp:473 3 xul.dll InternalCall js/src/vm/Interpreter.cpp:522 4 xul.dll Interpret js/src/vm/Interpreter.cpp:3096 5 xul.dll js::RunScript js/src/vm/Interpreter.cpp:423 6 xul.dll js::InternalCallOrConstruct js/src/vm/Interpreter.cpp:495 7 xul.dll InternalCall js/src/vm/Interpreter.cpp:522 8 xul.dll js::fun_call js/src/jsfun.cpp:1189 9 xul.dll js::InternalCallOrConstruct js/src/vm/Interpreter.cpp:473 ============================================================= this startup crash signature is rising in 59.0 release builds on windows. the crash address is indicating a UAF situation in many cases.
Reporter | ||
Updated•6 years ago
|
Updated•6 years ago
|
Comment 1•6 years ago
|
||
Tracking this for 59 since it's a fairly high volume startup crash. mccr8, can you take a look? (since overholt is out on PTO)
Comment 2•6 years ago
|
||
I put off looking into this because the top of the stack is just us calling into C++ from JS, but I noticed just now that below all of the stuff about running JS on the stack, there's a call to nsXREDirProvider::DoStartup(). Which matches up with comment 1 which says it is a startup crash. In fact, 68% of the crashes with this signature have this exact same proto signature. The crash is on this call to observe: // Init the Extension Manager nsCOMPtr<nsIObserver> em = do_GetService("@mozilla.org/addons/integration;1"); if (em) { em->Observe(nullptr, "addons-startup", nullptr); Andrew, could you look at this? I'm not sure how easy it will be to investigate without a JS stack, but maybe you know of something that changed in this code. Thanks.
Updated•6 years ago
|
Comment 3•6 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #2) > I put off looking into this because the top of the stack is just us calling > into C++ from JS, but I noticed just now that below all of the stuff about > running JS on the stack, there's a call to nsXREDirProvider::DoStartup(). > Which matches up with comment 1 which says it is a startup crash. In fact, > 68% of the crashes with this signature have this exact same proto signature. > > The crash is on this call to observe: > // Init the Extension Manager > nsCOMPtr<nsIObserver> em = > do_GetService("@mozilla.org/addons/integration;1"); > if (em) { > em->Observe(nullptr, "addons-startup", nullptr); > > Andrew, could you look at this? I'm not sure how easy it will be to > investigate without a JS stack, but maybe you know of something that changed > in this code. Thanks. This is just the call that triggers add-on manager startup. It's unfortunately not a lot to go on, since that triggers the loading of all sorts of toolkit stuff, and a half dozen or so system add-ons. And, unfortunately, I don't see any useful clues in that stack... I wonder if we can find a cheap way to annotate crash reports with the name/interface of XPConnect methods that are being called...
Comment 4•6 years ago
|
||
I'm going to mark this as sec-moderate, as the crash happens so early in startup, I doubt there's any web content running, so I can't imagine it is exploitable.
Comment 5•6 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #4) > I'm going to mark this as sec-moderate, as the crash happens so early in > startup, I doubt there's any web content running, so I can't imagine it is > exploitable. At that point in startup, there really can't be any web content running. Not unless we have something like a legacy extension running that makes sync HTTP requests very early in startup. Which shouldn't be possible in 59 release.
Updated•6 years ago
|
Comment 6•6 years ago
|
||
Not much we're likely going to be able to do for this on 59 at this point, but it's still happening often enough that we should try to figure out something for 60 if at all possible.
Updated•6 years ago
|
Updated•6 years ago
|
Updated•6 years ago
|
Comment 7•6 years ago
|
||
This doesn't seem actionable, so I don't think there's any point in relman tracking it. Andrew, do you think someone on your team might add diagnostic info (suggested in comment 3)?
Updated•6 years ago
|
Comment 8•6 years ago
|
||
I'll take a look at this as I go through the "unactionable" JS crash list and see if there are any similarities.
Comment 10•6 years ago
|
||
I looked at a few minidumps and a lot of these are a mess. We seem to complete smash the stack in a number of cases. Sometimes we crash before the actual XPC call (but still after we start using method info and other virtual calls) and sometimes after. One thing that looks odd to me is that I would expect crashes a few instructions after a call, should still have the expected return address on stack that CPU used to |ret| from the call, but I usually do not see this. It is unclear how we got to were we crash. Often, further up the stack looks fine and has the data structures we expect either complete or with the lower-addressed (top-of-stack) words clobbered. I'm at a loss for how these got so corrupted.
Comment 11•6 years ago
|
||
My best idea at this point is to annotate crash reports with a small memory region (in non-release builds), and copy the interface::method_name into it whenever we're about to make an XPC call. That should be pretty cheap relative to the total XPC call overhead, and should hopefully at least give us something to go on.
Comment 12•6 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=1f0b7e9a6450c28f816ca52d8809c41b6b87b3e5&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=success&filter-searchStr=M(15)&selectedJob=179523198 I wonder if this intermittent failure is related.
Comment 13•6 years ago
|
||
Given the reduction of the spike in reports here, and comment 5, I'm going to mark this P3. Should we spin comment 11 out into a separate bug?
Comment 14•6 years ago
|
||
(In reply to Andrew Overholt [:overholt] from comment #13) > Should we spin comment 11 out into a separate bug? I don't think so. It's not worth adding the overhead unless we need it to track this down, so if we're going to do it, it should probably be done as part of this bug.
Comment 15•6 years ago
|
||
Still some crashes for 62/63. If we are going to add annotation as mentioned in comment 11 it would likely be for 64/65.
Updated•6 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Comment 16•5 years ago
|
||
Ted, this has been around for a long time and wontfix for many releases. Do you know if this is actionable at this point?
Updated•5 years ago
|
Comment 17•5 years ago
|
||
Crash rate seems to have dropped off. I don't see anything actionable here.
One thing that Nika pointed out to me is that the info described in Comment 11 is generally already in crash reports since it is on stack. We worked through one example and where able to determine the calling method. This was a one-off non-startup example so probably not representative.
With effort, we could look at stack locals in sample crashes and find names of methods and interfaces. I don't see enough crashes to justify myself doing the work right now.
Updated•5 years ago
|
Comment 18•2 years ago
|
||
There are crashes with this signature, but I'm not seeing any UAFs.
Comment 19•2 years ago
|
||
Since the bug is closed, the stalled keyword is now meaningless.
For more information, please visit auto_nag documentation.
Updated•2 years ago
|
Updated•2 months ago
|
Description
•