Closed Bug 975820 Opened 6 years ago Closed 6 years ago
Firefox nightly crashes immediately on startup in ntdll
Firefox crashes immediately on start-up. No crash submission dialog appears, so I have no crash IDs to show.

1. Start Firefox nightly.
2. Firefox immediately crashes with a dialog that says:
   Nightly
   A problem caused the program to stop working correctly. Please close the program.
   [ Close the program ]

There is a Firefox process running, but it is only using around 1.5 MB of RAM. Trying to launch Firefox in Safemode or Profile Manager results in the same immediate crash. The only way to make Firefox work is to start Windows in Windows Safemode. I tried renaming the ../Mozilla/ folders in my User Profile to different things but the crash still occurs, so I'm pretty sure this crash is unrelated to my profile.

Working build: 20140220120428 https://hg.mozilla.org/mozilla-central/rev/cc962df350a7
Crashing build: 20140220121328 https://hg.mozilla.org/mozilla-central/rev/b89a9d7b4ca0
Pushlog: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=cc962df350a7&tochange=b89a9d7b4ca0

Event viewer shows the crashes as:
Faulting application name: firefox.exe, version: 18.104.22.16864, time stamp: 0x5306692c
Faulting module name: ntdll.dll, version: 6.1.7601.18247, time stamp: 0x521ea8e7
Exception code: 0xc0000005
Fault offset: 0x0003c4e2
Faulting process id: 0x2a8
Faulting application start time: 0x01cf30ab716a796d
Faulting application path: C:\Users\<username>\Desktop\firefox-30.0a1.en-US.win32_1392927208\firefox\firefox.exe
Faulting module path: C:\Windows\SysWOW64\ntdll.dll
Report Id: af17fe59-9c9e-11e3-b328-5404a632d340

Windows 7 x64, 16GB RAM, i5-2500k (overclocked), GeForce 660Ti, 256 GB SSD. 32 bit version of Firefox.

Another person has reported the same problem at Mozillazine @ http://forums.mozillazine.org/viewtopic.php?p=13376663#p13376663

Latest nightly from today (23rd Feb) also crashes in the same way.
Summary: Firefox nightly crashes immediately on startip → Firefox nightly crashes immediately on startup
Summary: Firefox nightly crashes immediately on startup → Firefox nightly crashes immediately on startup in ntdll.dll
Crash stack from VS debugger
Part of crash disassembly from VS debugger
I happen to have Visual Studio installed on this PC, so I can add a (not very useful) crash stack and disassembly. Hope it helps...
> Faulting module name: ntdll.dll, version: 6.1.7601.18247, time stamp: 0x521ea8e7
> Exception code: 0xc0000005
> Fault offset: 0x0003c4e2

That offset is LdrLoadDll+5. It's no longer a valid instruction boundary after our new hook. Some code must have noted that offset very early in the process...

Alexander, could you save a dump file from Visual Studio? "Minidump with Heap" would be the most helpful.
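The offset arithmetic in the comment above can be sketched in a few lines of Python. This is only an illustration: it assumes the Event Viewer "Fault offset" is relative to the start of ntdll.dll and takes the comment's statement that it equals LdrLoadDll+5 at face value. The variable names are hypothetical.

```python
# Fault offset reported by Event Viewer (assumed to be relative to ntdll.dll's base).
FAULT_OFFSET = 0x0003C4E2

# Per the comment above, the fault lands 5 bytes into LdrLoadDll.
RESUME_DELTA = 5

# Implied offset of LdrLoadDll within this ntdll.dll build (an inference, not a lookup).
ldr_load_dll_offset = FAULT_OFFSET - RESUME_DELTA

print(hex(ldr_load_dll_offset))  # 0x3c4dd
```

The derived value only applies to the ntdll.dll version quoted in the report (6.1.7601.18247); other builds would place the function elsewhere.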
Seeing the same problem starting with the last few nightlies (noticed it after yesterday's update). Also tried renaming my .Mozilla folder to no avail. Took a dump file with the "Minidump with Heap" option set, but it is just a hair over the Bugzilla file size limit, so I cannot attach it here. Here is a Dropbox link to the file: https://www.dropbox.com/sh/l68cqai0g36hthc/mbQkqjMfs6
Thank you Barak! (By the way, you can get pretty good compression by zipping .dmp files) I think this is related to McAfee Host Intrusion Prevention. I will try to reproduce the issue locally.
I get the same error, so if it helps here's my config: Windows 7 x64 Enterprise, AMD HD5750, 4 GB RAM, 80 GB SSD (Fx is installed on a common HDD though). Avira Antivir and Comodo are running, but I don't have any McAfee products. OS language is de-DE and I'm using en-US nightly. Release version of Fx runs just fine with the same profile.
The code change that caused this crash has been removed while we investigate. Tomorrow's nightly should be fine.
Still crashing in 20140224131509 https://hg.mozilla.org/mozilla-central/rev/4b6103d24d1e Crashing has disappeared in 20140224174607 https://hg.mozilla.org/mozilla-central/rev/e3daaa4c73dd Pushlog: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=4b6103d24d1e&tochange=e3daaa4c73dd Fixed due to back-out of bug 951827. FWIW I don't run McAfee anything, but like Bur in comment 9 I am running a Comodo product - Comodo Firewall 5.12.256249.2599
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Ok, fixed for me with today's (Feb 25) Nightly. Thanks to all involved :)
Assignee: nobody → dmajor
Target Milestone: --- → Firefox 30
On an affected build, I have been unable to reproduce the crash with latest versions of Comodo and Avira, and McAfee HIPS requires a managed enterprise environment. Still, from Barak's memory dump it's clear that McAfee hooks LdrLoadDll (and many other ntdll functions) very early in the process, noting LdrLoadDll+5 as its resume-point. Presumably the other AVs do the same; it seems like an AV-ish thing to do. We'll have to revisit mozglue's new hook design. It seems six-byte jumps are not an option.
FWIW I strongly suspect my version (5.12) of Comodo Firewall was causing the problem, and I should note that it is over a year old and no longer supported. Comodo Firewall is now on version 6, which seems to be OK as per dmajor's testing in comment 13. Of course, it seems that McAfee was also implicated in this crash.
That's interesting, I'm also using Comodo 5.12 (unlike other products it never nagged me to update, so I forgot about it, go figure), so this might well be the problem besides McAfee. Unfortunately I can't get WinDbg installed; it quits with a nondescript error message, so I can't get a memory dump. Isn't there a simple windbg.exe that can be downloaded anywhere? I read it's more or less portable.
Yeah, it seems WinDbg gets more frustrating to install with every version. There is an old installer at http://download.cnet.com/Debugging-Tools-for-Windows/3000-2086_4-10907878.html that is relatively standalone. (Just make sure to use the real download button and not the fake ads)
Could some of you please give this build a try? http://firstname.lastname@example.org/try-win32/firefox-30.0a1.en-US.win32.zip This build contains another attempt at my code change from bug 973138, in a way that will hopefully play nice with the antivirus hooks. Let me know if it still crashes for you. Thanks!
(Sorry, wrong bug number, I meant to say that it's another attempt at bug 951827)
For me, the build at http://email@example.com/try-win32/firefox-30.0a1.en-US.win32.zip starts up normally and does not crash. I'll use it all day tomorrow to see if there's any weirdness. But now I sleep.
I have the same experience - starts up normally and seems to be working fine. I'll use it for next few days and will update if I have any issues.
Cool, so that's a positive result with both McAfee and Comodo -- thanks! Don't worry about pounding on that build exhaustively. If it hasn't crashed within the first few minutes then I feel pretty confident that we're getting along with the AVs. I'll put the code back into Nightly for broader testing.
Didn't see any strangeness with the build. Restarted it quite a few times, and no problems.
A little late, but it's working flawlessly for me as well. Just out of curiosity, is it possible to say who's "at fault" here? I don't want to blame anybody, it just interests me how things like this happen and what usually seems to prevent them from happening.
> Just out of curiosity, is it possible to say who's "at fault" here? I don't
> want to blame anybody, it just interests me how things like this happen and
> what usually seems to prevent them from happening.

Kind of neither, and kind of both. Applications make assumptions about the state of the process, and hooking system APIs like this can invalidate those assumptions. Either app's actions would be okay on their own, but put the two together and they collide. From the operating system's perspective, both the browser and the antivirus are doing questionable things here :-) But we do it to prevent bad software from doing even worse things.
Thanks, so the AV software hooks LdrLoadDll and when this DLL is called by any app, the AV gets involved and after that the original app returns to plus 5 bytes of the original position which causes a problem? Why does the AV assume it won't be a problem and what was changed in Fx source that made it a problem? If this is too OT, just let me know... :)
(In reply to Bur from comment #26) It's a bit of a long story; bug 951827 has more technical details. In short: bug 951827 comment 15 describes a problem we were having (which was itself caused by yet another change). We tried to fix it by putting an indirect jump at the start of LdrLoadDll, but indirect jumps are 6 bytes, so when the AV returns to the +5 offset, it jumps into the middle of an instruction.
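The size mismatch described above can be illustrated with a small Python sketch. It encodes the two standard 32-bit x86 jump forms (`E9 rel32`, 5 bytes, and `FF 25 disp32`, 6 bytes) and checks whether a resume point at +5 still falls on an instruction boundary. The addresses and helper names are hypothetical, and this is not mozglue's or any antivirus vendor's actual hooking code.

```python
import struct

def jmp_rel32(src: int, dst: int) -> bytes:
    """Encode `jmp rel32` (opcode E9); displacement is relative to the next instruction."""
    return b"\xE9" + struct.pack("<i", dst - (src + 5))

def jmp_indirect(ptr_addr: int) -> bytes:
    """Encode 32-bit `jmp dword ptr [ptr_addr]` (opcode FF 25 followed by disp32)."""
    return b"\xFF\x25" + struct.pack("<I", ptr_addr)

def resume_is_valid(patch: bytes, resume_offset: int) -> bool:
    """A resume offset is safe only if it does not land inside the patched jump."""
    return resume_offset >= len(patch)

# Hypothetical addresses, chosen only for illustration.
FUNC_START = 0x1000     # start of the hooked function
HOOK_TARGET = 0x2000    # where a relative jump would go
POINTER_SLOT = 0x3000   # memory slot holding the indirect jump's target

RESUME_OFFSET = 5       # the +5 resume point noted by the earlier hook

relative = jmp_rel32(FUNC_START, HOOK_TARGET)
indirect = jmp_indirect(POINTER_SLOT)

print(len(relative), resume_is_valid(relative, RESUME_OFFSET))  # 5 True
print(len(indirect), resume_is_valid(indirect, RESUME_OFFSET))  # 6 False
```

With the 6-byte form, offset +5 is the last byte of the jump's displacement, so execution resuming there decodes whatever that byte and the following bytes happen to form as an instruction.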
Verified per reporters' comments.
Status: RESOLVED → VERIFIED