Closed Bug 1078674 Opened 5 years ago Closed 5 years ago

startup crash in nsDocument::AdoptNode with randomly-named library on stack

Categories

(Core :: DOM: Core & HTML, defect, critical)

x86
Windows NT
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla36
Tracking Status
firefox34 --- unaffected
firefox35 + verified
firefox36 + verified

People

(Reporter: kairo, Assigned: dmajor)

Details

(Keywords: crash)

Crash Data

Attachments

(1 file, 2 obsolete files)

This bug was filed from the Socorro interface and is 
report bp-f074f074-5ef2-4526-b686-a8cba2141006.
=============================================================

Top frames:
0 	xul.dll 	nsDocument::AdoptNode(nsIDOMNode*, nsIDOMNode**) 	content/base/src/nsDocument.cpp
Ø 1 	2f0ff925183b421098f5.dll 	2f0ff925183b421098f5.dll@0xdf6c 	
Ø 2 	2f0ff925183b421098f5.dll 	2f0ff925183b421098f5.dll@0xf6cd 	
3 	xul.dll 	nsObserverService::NotifyObservers(nsISupports*, char const*, wchar_t const*) 	xpcom/ds/nsObserverService.cpp
4 	xul.dll 	nsGlobalWindow::DispatchDOMWindowCreated() 	dom/base/nsGlobalWindow.cpp
5 	xul.dll 	nsRunnableMethodImpl<void ( mozilla::dom::XULDocument::*)(void), void, 1>::Run() 	xpcom/glue/nsThreadUtils.h
6 	xul.dll 	nsContentUtils::RemoveScriptBlocker() 	content/base/src/nsContentUtils.cpp
7 	xul.dll 	nsDocumentViewer::InitInternal(nsIWidget*, nsISupports*, nsIntRect const&, bool, bool, bool) 	layout/base/nsDocumentViewer.cpp
8 	xul.dll 	nsDocumentViewer::Init(nsIWidget*, nsIntRect const&) 	layout/base/nsDocumentViewer.cpp


Note the randomly-named DLL on frames 1 and 2, we see different DLLs names but all with @0xdf6c in those crash reports.

This started spiking on Nightly 35.0a1 on October 2, it doesn't really appear anywhere before the 2014100209 nightly build.

I also noticed that correlations say that oleacc.dll is loaded in 100% of the crash reports, which sounds to me like accessibility is on for all of them.

Even if this might be connected to some malware or such, it's still Firefox (or Nightly) crashing in large volume (this is >5% of our crashes in the last 3 days). Also, >90% of those crashes happen within 1 minute of uptime, so this can be classified as a startup crash.

See https://crash-stats.mozilla.com/report/list?signature=nsDocument%3A%3AAdoptNode%28nsIDOMNode%2A%2C%20nsIDOMNode%2A%2A%29 for more stats and reports.
|aResult| is garbage. This is similar to the binary crash we saw in bug 1038243.

Regression range: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=2399d1ae89e9&tochange=5d6ec4dddf14

This one looks suspicious:
c3fc41feeb22 Nick Lebedev — Bug 1055773 - Move hasAttributes method from Node to Element. r=bz

Nick, bz, any chance an IID got missed?
Oops, adding the CC this time:

> This one looks suspicious:
> c3fc41feeb22 Nick Lebedev — Bug 1055773 - Move hasAttributes method from
> Node to Element. r=bz
> 
> Nick, bz, any chance an IID got missed?
I don't think so, but if these people are getting their document pointer via a mechanism other than QI then an IID bump wouldn't affect them anyway... :(
Hm, it looks like NS_INODE_IID and NS_ELEMENT_IID (maybe others?) should have been bumped. I doubt it's going to make a difference, but it can't hurt?
Attached patch Update UUIDs (obsolete) — Splinter Review
It's a long shot, but on the off chance it works, it would be the most pleasant way to fix this bug.
Attachment #8505041 - Flags: review?(bzbarsky)
Comment on attachment 8505041 [details] [diff] [review]
Update UUIDs

Can't hurt..... Let's give it a shot.
Attachment #8505041 - Flags: review?(bzbarsky) → review+
Keywords: leave-open
Nope. Well it was worth a try. bp-ab379511-c8d5-4ca6-8c34-7c6102141016
[Tracking Requested - why for this release]:
This is now the #1 crash in yesterday's 35.0a2 data.
Summary: Nightly startup crash spike in nsDocument::AdoptNode with randomly-named library on stack → startup crash in nsDocument::AdoptNode with randomly-named library on stack
Could we catch the exception on this bad write and return a failure code? (Although it might just move the crash somewhere else)
Flags: needinfo?(bzbarsky)
I don't know...

How about we just undo the xpidl change, if drivers are still ok with it?  Or at least undo it enough to make the vtables look like they used to: add something in the slot after localName in nsIDOMNode.  The something doesn't have to be hasAttributes, really, or do anything interesting, as long as it puts the vtable back in the state it used to be in, right?

Assuming that fixes the crash, of course...
Flags: needinfo?(bzbarsky)
Attached patch Add dummy method to nsIDOMNode (obsolete) — Splinter Review
Like this?
Attachment #8522005 - Flags: review?(bzbarsky)
Comment on attachment 8522005 [details] [diff] [review]
Add dummy method to nsIDOMNode

Yeah, but might as well mark it [noscript] so it'll be clear it's just internal consumption.

If we're ok taking this on branches (what with the iid changes) and it fixes the crashes, we'll need to think about how to deal with this stuff going forward.  :(
Attachment #8522005 - Flags: review?(bzbarsky) → review+
At this point only Nightly and Aurora are affected. That should be a no-hassle uplift, right?
Hmm.  I think so, yes.  Let's give it a shot.  Then we _really_ need to think hard about this stuff, because if the crash would have just gone away when people recompile against the new beta....
Added noscript, and out of paranoia I made it a boolean to preserve stack offsets:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8c96466a1077
As landed on trunk

Approval Request Comment
[Feature/regressing bug #]: External application
[User impact if declined]: #1 aurora crash
[Describe test coverage new/current, TBPL]: a few days on m-c
[Risks and why]: low risk as it restores FF34 vtable offsets 
[String/UUID change made/needed]: Yes. UUIDs changed on nsIDOMNode and descendants.
Attachment #8505041 - Attachment is obsolete: true
Attachment #8522005 - Attachment is obsolete: true
Attachment #8523699 - Flags: review+
Attachment #8523699 - Flags: approval-mozilla-aurora?
This appears to have fixed the crash on nightly. I'm not seeing any replacement signatures either, but KaiRo can you double check my findings... Do you have a way to tell whether this just moved the crash to a new signature? (Explosiveness reports, maybe?)
Flags: needinfo?(kairo)
(In reply to David Major [:dmajor] (UTC+13) from comment #21)
> This appears to have fixed the crash on nightly. I'm not seeing any
> replacement signatures either, but KaiRo can you double check my findings...
> Do you have a way to tell whether this just moved the crash to a new
> signature? (Explosiveness reports, maybe?)

I don't see anything sticking out. That said, this also went away in Aurora 35 over the weekend and the patch didn't land there yet, right?
Flags: needinfo?(kairo)
> I don't see anything sticking out. That said, this also went away in Aurora
> 35 over the weekend and the patch didn't land there yet, right?

Correct... but the reports are starting to come in now. I see some from 11/16 and 11/17.
(In reply to David Major [:dmajor] (UTC+13) from comment #23)
> > I don't see anything sticking out. That said, this also went away in Aurora
> > 35 over the weekend and the patch didn't land there yet, right?
> 
> Correct... but the reports are starting to come in now. I see some from
> 11/16 and 11/17.

Fun... so people are barely hitting it on weekends but do during the week? Something that might only be installed on work computers or so?
Flags: needinfo?(release-mgmt)
Flags: needinfo?(release-mgmt)
Attachment #8523699 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/0985f515e551
Assignee: nobody → dmajor
Status: NEW → RESOLVED
Closed: 5 years ago
Keywords: leave-open
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
A quick check in Socorro for this crash over the past 2 weeks [1] shows no crashes on Firefox 36, and no crashes on Firefox 35 in builds after November 24th.

[1] - https://crash-stats.mozilla.com/report/list?product=Firefox&range_unit=days&range_value=14&signature=nsDocument%3A%3AAdoptNode%28nsIDOMNode%2A%2C+nsIDOMNode%2A%2A%29#tab-reports
Status: RESOLVED → VERIFIED
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.