Closed Bug 1078674 Opened 5 years ago Closed 5 years ago
startup crash in ns
Document::Adopt Node with randomly-named library on stack
This bug was filed from the Socorro interface and is report bp-f074f074-5ef2-4526-b686-a8cba2141006. ============================================================= Top frames: 0 xul.dll nsDocument::AdoptNode(nsIDOMNode*, nsIDOMNode**) content/base/src/nsDocument.cpp Ø 1 2f0ff925183b421098f5.dll 2f0ff925183b421098f5.dll@0xdf6c Ø 2 2f0ff925183b421098f5.dll 2f0ff925183b421098f5.dll@0xf6cd 3 xul.dll nsObserverService::NotifyObservers(nsISupports*, char const*, wchar_t const*) xpcom/ds/nsObserverService.cpp 4 xul.dll nsGlobalWindow::DispatchDOMWindowCreated() dom/base/nsGlobalWindow.cpp 5 xul.dll nsRunnableMethodImpl<void ( mozilla::dom::XULDocument::*)(void), void, 1>::Run() xpcom/glue/nsThreadUtils.h 6 xul.dll nsContentUtils::RemoveScriptBlocker() content/base/src/nsContentUtils.cpp 7 xul.dll nsDocumentViewer::InitInternal(nsIWidget*, nsISupports*, nsIntRect const&, bool, bool, bool) layout/base/nsDocumentViewer.cpp 8 xul.dll nsDocumentViewer::Init(nsIWidget*, nsIntRect const&) layout/base/nsDocumentViewer.cpp Note the randomly-named DLL on frames 1 and 2, we see different DLLs names but all with @0xdf6c in those crash reports. This started spiking on Nightly 35.0a1 on October 2, it doesn't really appear anywhere before the 2014100209 nightly build. I also noticed that correlations say that oleacc.dll is loaded in 100% of the crash reports, which sounds to me like accessibility is on for all of them. Even if this might be connected to some malware or such, it's still Firefox (or Nightly) crashing in large volume (this is >5% of our crashes in the last 3 days). Also, >90% of those crashes happen within 1 minute of uptime, so this can be classified as a startup crash. See https://crash-stats.mozilla.com/report/list?signature=nsDocument%3A%3AAdoptNode%28nsIDOMNode%2A%2C%20nsIDOMNode%2A%2A%29 for more stats and reports.
Just FYI: I searched with one of the .ddl names and found http://www.herdprotect.com/1a1476218c9a4d6ba557.dll-f8335ff647ed2878c571a47a452c52694227db9f.aspx and http://www.herdprotect.com/signer-outobox-603472def1c8b65b2e93e06c16768db7.aspx
|aResult| is garbage. This is similar to the binary crash we saw in bug 1038243. Regression range: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=2399d1ae89e9&tochange=5d6ec4dddf14 This one looks suspicious: c3fc41feeb22 Nick Lebedev — Bug 1055773 - Move hasAttributes method from Node to Element. r=bz Nick, bz, any chance an IID got missed?
Oops, adding the CC this time: > This one looks suspicious: > c3fc41feeb22 Nick Lebedev — Bug 1055773 - Move hasAttributes method from > Node to Element. r=bz > > Nick, bz, any chance an IID got missed?
I don't think so, but if these people are getting their document pointer via a mechanism other than QI then an IID bump wouldn't affect them anyway... :(
Hm, it looks like NS_INODE_IID and NS_ELEMENT_IID (maybe others?) should have been bumped. I doubt it's going to make a difference, but it can't hurt?
It's a long shot, but on the off chance it works, it would be the most pleasant way to fix this bug.
Attachment #8505041 - Flags: review?(bzbarsky)
Comment on attachment 8505041 [details] [diff] [review] Update UUIDs Can't hurt..... Let's give it a shot.
Attachment #8505041 - Flags: review?(bzbarsky) → review+
Nope. Well it was worth a try. bp-ab379511-c8d5-4ca6-8c34-7c6102141016
[Tracking Requested - why for this release]: This is now the #1 crash in yesterday's 35.0a2 data.
Summary: Nightly startup crash spike in nsDocument::AdoptNode with randomly-named library on stack → startup crash in nsDocument::AdoptNode with randomly-named library on stack
Could we catch the exception on this bad write and return a failure code? (Although it might just move the crash somewhere else)
I don't know... How about we just undo the xpidl change, if drivers are still ok with it? Or at least undo it enough to make the vtables look like they used to: add something in the slot after localName in nsIDOMNode. The something doesn't have to be hasAttributes, really, or do anything interesting, as long as it puts the vtable back in the state it used to be in, right? Assuming that fixes the crash, of course...
Attachment #8522005 - Flags: review?(bzbarsky)
Comment on attachment 8522005 [details] [diff] [review] Add dummy method to nsIDOMNode Yeah, but might as well mark it [noscript] so it'll be clear it's just internal consumption. If we're ok taking this on branches (what with the iid changes) and it fixes the crashes, we'll need to think about how to deal with this stuff going forward. :(
Attachment #8522005 - Flags: review?(bzbarsky) → review+
At this point only Nightly and Aurora are affected. That should be a no-hassle uplift, right?
Hmm. I think so, yes. Let's give it a shot. Then we _really_ need to think hard about this stuff, because if the crash would have just gone away when people recompile against the new beta....
Added noscript, and out of paranoia I made it a boolean to preserve stack offsets: https://hg.mozilla.org/integration/mozilla-inbound/rev/8c96466a1077
As landed on trunk Approval Request Comment [Feature/regressing bug #]: External application [User impact if declined]: #1 aurora crash [Describe test coverage new/current, TBPL]: a few days on m-c [Risks and why]: low risk as it restores FF34 vtable offsets [String/UUID change made/needed]: Yes. UUIDs changed on nsIDOMNode and descendants.
This appears to have fixed the crash on nightly. I'm not seeing any replacement signatures either, but KaiRo can you double check my findings... Do you have a way to tell whether this just moved the crash to a new signature? (Explosiveness reports, maybe?)
(In reply to David Major [:dmajor] (UTC+13) from comment #21) > This appears to have fixed the crash on nightly. I'm not seeing any > replacement signatures either, but KaiRo can you double check my findings... > Do you have a way to tell whether this just moved the crash to a new > signature? (Explosiveness reports, maybe?) I don't see anything sticking out. That said, this also went away in Aurora 35 over the weekend and the patch didn't land there yet, right?
> I don't see anything sticking out. That said, this also went away in Aurora > 35 over the weekend and the patch didn't land there yet, right? Correct... but the reports are starting to come in now. I see some from 11/16 and 11/17.
(In reply to David Major [:dmajor] (UTC+13) from comment #23) > > I don't see anything sticking out. That said, this also went away in Aurora > > 35 over the weekend and the patch didn't land there yet, right? > > Correct... but the reports are starting to come in now. I see some from > 11/16 and 11/17. Fun... so people are barely hitting it on weekends but do during the week? Something that might only be installed on work computers or so?
Attachment #8523699 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
A quick check in Socorro for this crash over the past 2 weeks  shows no crashes on Firefox 36, and no crashes on Firefox 35 in builds after November 24th.  - https://crash-stats.mozilla.com/report/list?product=Firefox&range_unit=days&range_value=14&signature=nsDocument%3A%3AAdoptNode%28nsIDOMNode%2A%2C+nsIDOMNode%2A%2A%29#tab-reports
You need to log in before you can comment on or make changes to this bug.