Closed Bug 718389 Opened 13 years ago Closed 9 years ago

Startup crash @ PR_EnumerateAddrInfo | nsDNSRecord::GetNextAddr

Categories

(Core :: Networking: DNS, defect)

10 Branch
x86
Windows 7
defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox10 + ---

People

(Reporter: scoobidiver, Assigned: sworkman)

References

(Blocks 1 open bug)

Details

(Keywords: crash, Whiteboard: startupcrash)

Crash Data

It's a startup crash (98% before 1 min) that first appeared in 10.0b4.
It's currently #2 top crasher in 10.0b4.

Either it's a regression between Beta 3 and 4 or it's caused by new users in the Beta sample group.
For the first case, here is the Beta regression range:
hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=e45fb547926c&tochange=886b2220bff9
It might be caused by bug 694068, bug 707207, bug 704987, or bug 704988 that are about add-on updates.

Signature 	PR_EnumerateAddrInfo
UUID	aa467d75-25c5-42dd-b741-6ddb52120116
Date Processed	2012-01-16 01:49:07
Uptime	1
Last Crash	16 seconds before submission
Install Age	1.4 days since version was first installed.
Install Time	2012-01-14 15:01:33
Product	Firefox
Version	10.0
Build ID	20120111092507
Release Channel	beta
OS	Windows NT
OS Version	6.1.7601 Service Pack 1
Build Architecture	x86
Build Architecture Info	GenuineIntel family 6 model 23 stepping 10
Crash Reason	EXCEPTION_ACCESS_VIOLATION_READ
Crash Address	0x6802958c
App Notes 	
AdapterVendorID: 8086, AdapterDeviceID: 2a42, AdapterSubsysID: 00000000, AdapterDriverVersion: 8.15.10.1892
Has dual GPUs. GPU #2: AdapterVendorID2: 8086, AdapterDeviceID2: 2a43, AdapterSubsysID2: 0000000c, AdapterDriverVersion2: 8.15.10.1892
D3D10 Layers? D3D10 Layers-
EMCheckCompatibility	True

Frame 	Module 	Signature 	Source
0 	nspr4.dll 	PR_EnumerateAddrInfo 	nsprpub/pr/src/misc/prnetdb.c:2117
1 	xul.dll 	nsDNSRecord::GetNextAddr 	netwerk/dns/nsDNSService2.cpp:149
2 	xul.dll 	nsSocketTransport::OnSocketEvent 	netwerk/base/src/nsSocketTransport2.cpp:1470
3 	xul.dll 	nsSocketEvent::Run 	netwerk/base/src/nsSocketTransport2.cpp:97
4 	xul.dll 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:631
5 	xul.dll 	nsSocketTransportService::DoPollIteration 	netwerk/base/src/nsSocketTransportService2.cpp:734
6 	xul.dll 	nsSocketTransportService::Run 	netwerk/base/src/nsSocketTransportService2.cpp:639
7 	xul.dll 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:631
8 	xul.dll 	nsThreadStartupEvent::Run 	xpcom/threads/nsThread.cpp:201
9 	nspr4.dll 	_PR_NativeRunThread 	nsprpub/pr/src/threads/combined/pruthr.c:426
10 	nspr4.dll 	pr_root 	nsprpub/pr/src/md/windows/w95thred.c:122
11 	msvcr80.dll 	msvcr80.dll@0x29ba 	
12 	msvcr80.dll 	msvcr80.dll@0x2a46 	
13 	ntdll.dll 	ntdll.dll@0x6377a 	
14 	ntdll.dll 	ntdll.dll@0x6374d 	

More reports at:
https://crash-stats.mozilla.com/report/list?signature=PR_EnumerateAddrInfo
Dave can correct me if I'm wrong, but I don't know that there's enough info to point to the add-on hotfix being at fault. It didn't sound like the hotfix work would affect startup in any way.
Still tracking for FF10, of course.
I requested a correlation report for this signature so should have some more info soon.
The crash appears to be in a background DNS lookup thread. I can't think of anything the hotfix code does that would cause any new DNS lookups during startup so I'm not sure how this could be caused by the hotfix work right now.
Here is the correlation info I was able to get manually with the help of rhelmer - http://people.mozilla.org/~rhelmer/temp/Firefox-10.0b4-correlation/ is the link to the full reports which shows the module breakdown by version as well. 

Windows NT
  PR_EnumerateAddrInfo|EXCEPTION_ACCESS_VIOLATION_READ (593 crashes)
     89% (527/593) vs.  78% (9782/12616) testpilot@labs.mozilla.com (Mozilla Labs - Test Pilot, https://addons.mozilla.org/addon/13661)

Windows NT
  PR_EnumerateAddrInfo|EXCEPTION_ACCESS_VIOLATION_READ (593 crashes)
    100% (591/593) vs.  56% (7121/12616) urlmon.dll
    100% (591/593) vs.  74% (9342/12616) wininet.dll
     90% (536/593) vs.  73% (9238/12616) t2embed.dll
     68% (402/593) vs.  52% (6582/12616) lz32.dll
     68% (402/593) vs.  52% (6582/12616) wldap32.dll
     68% (402/593) vs.  53% (6697/12616) xpsp2res.dll
     76% (448/593) vs.  61% (7748/12616) imagehlp.dll
     68% (402/593) vs.  55% (6929/12616) iphlpapi.dll
     24% (143/593) vs.  11% (1449/12616) idmmkb.dll
    100% (593/593) vs.  88% (11056/12616) rasadhlp.dll
     68% (402/593) vs.  56% (7011/12616) hnetcfg.dll
     68% (402/593) vs.  56% (7046/12616) wshtcpip.dll
     68% (402/593) vs.  57% (7192/12616) ws2help.dll
     66% (393/593) vs.  57% (7133/12616) comres.dll
    100% (593/593) vs.  91% (11527/12616) dnsapi.dll
     88% (524/593) vs.  80% (10155/12616) nssckbi.dll
     88% (524/593) vs.  81% (10234/12616) nssdbm3.dll
     88% (524/593) vs.  81% (10236/12616) freebl3.dll
     96% (569/593) vs.  89% (11227/12616) winrnr.dll
    100% (591/593) vs.  93% (11705/12616) wintrust.dll
     70% (414/593) vs.  64% (8025/12616) mpr.dll
    100% (593/593) vs.  95% (11951/12616) browsercomps.dll
    100% (593/593) vs.  95% (11967/12616) softokn3.dll
    100% (593/593) vs.  95% (11977/12616) firefox.exe

Windows NT
  PR_EnumerateAddrInfo|EXCEPTION_ACCESS_VIOLATION_READ (593 crashes)
      0% (0/593) vs.   0% (6/12616) x86 with 0 cores
     40% (237/593) vs.  28% (3582/12616) x86 with 1 cores
     48% (284/593) vs.  54% (6857/12616) x86 with 2 cores
      0% (2/593) vs.   1% (108/12616) x86 with 3 cores
     10% (61/593) vs.  13% (1628/12616) x86 with 4 cores
      0% (0/593) vs.   0% (1/12616) x86 with 5 cores
      0% (0/593) vs.   1% (77/12616) x86 with 6 cores
      2% (9/593) vs.   3% (341/12616) x86 with 8 cores
      0% (0/593) vs.   0% (10/12616) x86 with 12 cores
      0% (0/593) vs.   0% (2/12616) x86 with 16 cores
      0% (0/593) vs.   0% (4/12616) x86 with 24 cores
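For anyone reading the correlation lines above: each one compares how often a module or add-on shows up in crashes with this signature against how often it shows up in all crashes on the same OS. Here is a minimal Python sketch of that arithmetic, using three rows copied from the report. The function and variable names are made up for illustration, and this is not the actual correlation script.

```python
def correlation_row(name, sig_hits, sig_total, all_hits, all_total):
    """Return (name, % in signature crashes, % in all crashes, difference)."""
    sig_pct = 100.0 * sig_hits / sig_total
    all_pct = 100.0 * all_hits / all_total
    return name, sig_pct, all_pct, sig_pct - all_pct

# Counts copied from the report above
rows = [
    correlation_row("urlmon.dll", 591, 593, 7121, 12616),
    correlation_row("idmmkb.dll", 143, 593, 1449, 12616),
    correlation_row("firefox.exe", 593, 593, 11977, 12616),
]

# Largest over-representation in this signature first
rows.sort(key=lambda r: r[3], reverse=True)
for name, sig_pct, all_pct, diff in rows:
    print(f"{sig_pct:3.0f}% vs. {all_pct:3.0f}%  (diff {diff:.0f})  {name}")
```

A large difference only says the module is over-represented in this signature; it does not by itself say the module causes the crash.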
I checked some reports. It seems related to:
* Cognizance Identity Manager: http://www.cognizancesecurity.com/products/overview.html
* Ultimate Arena plugin: http://forums.bukkit.org/threads/fun-mech-rpg-ultimatearena-v0-1-the-ultimate-arena-plugin-1597.47753/
* Bonjour services: http://www.apple.com/fr/support/bonjour/
* HP ProtectTools Security Manager: http://www.digitalpersona.com/
* SearchQu toolbar
* Generic.bfr!do!228D71124580: http://www.mcafee.com/threat-intelligence/malware/default.aspx?id=753443
* Generic.bfr!do!75C9306E9FD6: http://home.mcafee.com/virusinfo/virusprofile.aspx?key=753058#none
* Generic.bfr!dp!226400913471: http://www.mcafee.com/threat-intelligence/malware/default.aspx?id=758160
* Generic.bfr!dn!1915B25B9F5D: http://www.mcafee.com/threat-intelligence/malware/default.aspx?id=741859
* Generic.bfr!13F870874661: http://home.mcafee.com/virusinfo/virusprofile.aspx?key=684417#none
* Trojan.MulDrop3.26142: http://www.drwebhk.com/en/virus_techinfo/Trojan.MulDrop3.26142.html
Summary: Startup crash @ PR_EnumerateAddrInfo | nsDNSRecord::GetNextAddr → Startup crash @ PR_EnumerateAddrInfo | nsDNSRecord::GetNextAddr (mainly related to malware)
(In reply to Scoobidiver from comment #7)
> I checked some reports. It seems related to:

Thanks Marcia and Scoobidiver. Let's focus on the non-malware crashes in that list. When malware correlations are removed from the reports, is this still a top crasher?

Including Josh to help take a look at https://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=e45fb547926c&tochange=886b2220bff9 and try to figure out if anything introduced in beta 4 could be related to this crash on the DNS lookup thread.
Hard to say about the malware since it was not showing up in the correlation report.

(In reply to Alex Keybl [:akeybl] from comment #8)
> (In reply to Scoobidiver from comment #7)
> > I checked some reports. It seems related to:
> 
> Thanks Marcia and Scoobidiver. Let's focus on the non-malware crashes in
> that list. When malware correlations are removed from the reports, is this
> still a top crasher?
> 
> Including Josh to help take a look at
> https://hg.mozilla.org/releases/mozilla-beta/
> pushloghtml?fromchange=e45fb547926c&tochange=886b2220bff9 and try to figure
> out if anything introduced in beta 4 could be related to this crash on the
> DNS lookup thread.
(In reply to Alex Keybl [:akeybl] from comment #8)
> When malware correlations are removed from the reports, is this
> still a top crasher?
Comment 7 was based on about fifty crash reports I checked manually and I don't want to check manually 1300 crash reports to find non malware related causes.

In addition, for a startup crash, a top crasher rank has no meaning as it will be high at the beginning and low at the end when all users with the problem have switched to another browser or a previous version of Firefox.

How many different users hit it among 1M ADU? The worst case is 1300 (one crash per user); the best case (5 attempts per user) is about 300.
Amongst those 300 users (extrapolated to 30,000 for the final release), how many don't have malware? I don't know.
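The arithmetic behind that range, as a rough Python sketch. The figures come from the comment above. Note that 1300 / 5 is 260, which the comment rounds to about 300. The scale factor of 100 from Beta to release is only what the 300 to 30,000 extrapolation implies, not a measured number.

```python
crash_reports = 1300       # reports with this signature in 10.0b4
attempts_per_user = 5      # assumed number of tries before a user gives up

worst_case_users = crash_reports                      # one report per user
best_case_users = crash_reports / attempts_per_user   # five reports per user

release_scale = 100        # assumption implied by "300 -> 30,000"

print("affected Beta users:", best_case_users, "to", worst_case_users)
print("best case scaled to release:", round(best_case_users * release_scale))
```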
Assignee: nobody → sworkman
Just to clarify, this signature is NOT new with b4. It goes back much farther than that. The volume for b4 is much higher than normal but I did some searches for Dec, Nov and found 100+ signatures across all versions.
(In reply to Sheila Mooney from comment #11)
> Just to clarify, this signature is NOT new with b4. It goes back much
> farther than that. The volume for b4 is much higher than normal but I did
> some searches for Dec, Nov and found 100+ signatures across all versions.

Can we determine if any specific DLL correlations exploded in the beta 4 timeframe? I'd like to understand what caused this topcrash bug, and if it's malware, stop tracking it for release of 10.
The interesting thing is that it ONLY seems to spike on 10. It had risen a bit on 10a2 but there were lots of dups. It really exploded on b4. If it were malware, wouldn't we see some corresponding increase in 9.0.1 as well? We only have a handful of these signatures there. We still could have checked in something that tweaked a malware-related crash. I think it's still worth investigating since this has happened before.
(In reply to Scoobidiver from comment #7)

> * Ultimate Arena plugin:
> http://forums.bukkit.org/threads/fun-mech-rpg-ultimatearena-v0-1-the-
> ultimate-arena-plugin-1597.47753/

This is a Minecraft server plugin. While I guess it's not out of the question Firefox might be running on a Minecraft server, the plugins are Java classes. Any chance we've misidentified this one?

In particular, Xfire (a gaming match service) used to be called Ultimate Arena, and they used that name in their install path. Depending on how you got the string, wondering if their software might still have the old name somewhere.

Not super-important, but if it's implicated want to make sure we're looking at the right thing.
Quick Status update from looking into the code:

* First off, I checked the code in question, and there is not an immediately obvious bug which the stack trace points to.

* Checked through code history, but I don't see anything which was changed recently in either nsDNSService.cpp or prnetdb.c which resulted in the explosion of crashes in b4. The stack trace ends with a call to _pr_ipv6_is_present() (prnetdb.c:2117), which seems to be a strange place to crash with EXCEPTION_ACCESS_VIOLATION_READ. I checked print.c and pripv6.c as well, and no recent changes there either. So, it looks like some external force is exposing a weakness (or just causing a crash). I also checked code history going back to the time of bug 501446 (see next point), but no luck there either.

* There is a possible weakness in the code which was exposed in bug 501446 by NoScript ABE. However, bug 501446 comment 5 shows a slightly different stack trace, NoScript was fixed at that time, and we don't seem to be dealing with NoScript here. No fix was supplied in that bug, because NoScript was using our DNS Service incorrectly at the time. (As bug 501446 comment 31 points out, however, maybe there's a defensive measure we could take in our code.) But there's no conclusive evidence that we're seeing the same thing.

* I tried to reproduce the bug, but I got stuck on getting the right plugins (assuming this is caused by a plugin, as in bug 501446). The app/plugin correlations mentioned in comment 7 either cost money to get (e.g. Cognizance Identity Manager), aren't FF plugins (Ultimate Arena seems to be a Minecraft plugin, as geo said in comment 14) or don't have FF plugins (Bonjour Print Services on Windows). Since I can't reproduce it, I can't get a local stack trace to confirm exactly what it is. I tried reproducing with NoScript ABE to see if 501446 is happening again, but there was no crash with the recent version of it.

So, my best guess at present is, like I said, that there is a weakness in nsDNSRecord::GetNextAddr or PR_EnumerateAddrInfo which is being exposed by some external force, either a plugin or malware. The best approach I can suggest based on all that is to analyze the code by hand and look for weaknesses potentially related to 501446. But it's kind of a stab in the dark since I can't reproduce it, and the stack trace provided with the crashes doesn't point to an obvious issue. It'll be hard to say that the actual problem has been targeted and thus hard to say it's fixed.

To echo Alex's comment 12, if we can identify it as malware, then at least we can drop the priority and take more time to look at it.

Or if someone else can reproduce it, that would also be great :)
Keywords: qawanted
Version: 9 Branch → 10 Branch
(In reply to Steve Workman [:sworkman] from comment #15)
> * I tried to reproduce the bug, but I was stuck by getting the right plugins
> (assuming this is caused by a plugin like bug 501446. The app/plugin
> correlations mentioned in comment 7 either cost money to get (e.g.
> Cognizance Identity Manager), aren't FF plugins (Ultimate Arena seems to be
> a Minecraft plugin as geo said in comment 14) or don't have FF plugins
> (Bonjour Print Services on Windows). Since I can't reproduce it, I can't get
> a local stack trace to confirm exactly what it is. I tried reproducing with
> NoScript ABE to see if 501446 is happening again, but no crash with the
> recent version of it.

Thanks Steve. Sounds like there's nothing obvious in the push log for beta 4 that would cause this (so we can't speculatively back anything out in lieu of STR).

> To echo Alex's comment 12, if we can identify it as malware, then at least
> we can drop the priority and take more time to look at it.

Sheila spoke with me about this yesterday. She doesn't think that this is caused by new malware, but that malware may still be tickling something new. The evidence for this is that no other versions of Firefox started crashing significantly with this signature when we saw the crashes in beta 4.

I think there are three avenues that we can explore here:

1) Figuring out a way to evaluate whether a majority of users running into this problem have malware installed (and if that's out of the ordinary, de-prioritize this crasher)
2) Continue to try to reproduce the issue with QA's help
3) Give the push log for beta 4 one more look - it's still very suspicious that this (only) exploded in FF10 beta 4
Filed Bug 719476 for the updated correlation report.
Marcia is going to have them run another correlation report to see if there is something. We have a higher volume of crashes now, so there might be better info.

I really don't think we can ignore this until we are certain it's malware. Also, we need to explain why we have 1300+ of these in b4 but only a single crash in b3 and none in b2. We have the same number of users.
Just for the record, I don't think we should go too far with the belief that this could be malware-related, as we don't have strong evidence for that. Scoobidiver didn't give us good pointers on how he got to believe what he says in comment #7, I'd love to see some good evidence for that and how he got to that data. The unversioned DLLs that Marcia is pointing to in comment #6 might be a pointer but also might not as we have seen non-malware DLLs without versions as well (we even ship one ourselves).
And even if there's a relation, something makes this only be a problem in 10b4 so far, not in any release version, and usually, we are seeing malware hit majorly on release versions.
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #19)
> Scoobidiver didn't give us good pointers on how he got to believe what he
> says in comment #7, I'd love to see some good evidence for that and how he
> got to that data.
Depending on the crash report, there are unversioned DLLs, non Windows versioned DLLs or only Windows DLLs.
My feeling is that about 40-50% of crashes have unversioned DLLs.
For crashes with non Windows versioned DLLs, they might be the cause.
For crashes with only Windows DLLs, buggy LSPs might be the cause.
I took a look at some more of the reports and I am installing some of the malware that I see in a Win XP VM. Stay tuned.
I spent some time this afternoon looking at more individual crash reports. I did find some more suspicious dlls and extensions in individual reports which I traced back to various malware and toolbars which I have installed in my Win XP VM. But so far no crash.

Some of the ones that I see showing up are (beware, these have malware so please test in a VM)

http://www.ilivid.com/
http://www.televisionfanatic.com/

It seems that many of the ones that look suspect end up installing the searchqu toolbar which is known to be problematic in the crash realm.
(In reply to Marcia Knous [:marcia] from comment #22)
> I spent some time this afternoon looking at more individual crash reports. I
> did find some more suspicious dlls and extensions in individual reports
> which I traced back to various malware and toolbars which I have installed
> in my Win XP VM. But so far no crash.
> 

It's kind of a silly question, but can you confirm that you tested IPv6-enabled sites with an IPv6 stack (given the stack trace)?
In looking at early Beta 5 data, I don't see this signature showing up at all.

Patrick: I did some light testing of ipv6 sites on Friday but I was not able to reproduce the crash.
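For anyone repeating the IPv6 test, here is a quick way to check what the test machine's resolver hands back for a host. It is a Python sketch using the standard socket module, not the NSPR code path in the stack trace, and `list_addresses` is a hypothetical helper name.

```python
import socket

FAMILY_NAMES = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}

def list_addresses(host, port=80):
    """Walk every address the resolver returns for host, one entry at a time."""
    results = []
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        results.append((FAMILY_NAMES.get(family, str(family)), sockaddr[0]))
    return results

# has_ipv6 only says the platform supports IPv6, not that a route exists
print("platform supports IPv6:", socket.has_ipv6)
print(list_addresses("127.0.0.1"))
```

Passing the name of an IPv6-enabled site instead of the loopback address shows whether any IPv6 entries come back on that machine.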
(In reply to Marcia Knous [:marcia] from comment #24)
> In looking at early Beta 5 data, I don't see this signature showing up at
> all.

Because startup crashes are particularly scary, we reached out to some affected users with https://people.mozilla.com/~akeybl/Firefox_10_Diagnosis.exe. Basically each change in Beta 4 is backed out in a separate build, and we've asked users to launch each build until one works.

Cheng will let us know if/when we hear back.
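The idea behind the diagnosis tool, sketched in Python: one build per Beta 4 change, each with only that change backed out, tried in turn until one starts. The changeset names and the `launches_ok` callback are placeholders. The real tool is the executable linked above and may work differently.

```python
def find_suspect(changesets, launches_ok):
    """Return the first changeset whose backout build starts cleanly, else None.

    launches_ok(cs) should report whether the build with cs backed out
    starts without crashing on the affected machine.
    """
    for cs in changesets:
        if launches_ok(cs):
            return cs
    return None

# Toy run: pretend that backing out "change-C" makes the crash go away
beta4_changes = ["change-A", "change-B", "change-C", "change-D"]
suspect = find_suspect(beta4_changes, lambda cs: cs == "change-C")
print(suspect)
```

If no backout build starts, the function returns None, which would suggest the cause is not a single change in the Beta 4 range.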
Checked today and still only a single crash on b5. We will watch this closely. Could be something weird about the build.
Still only one of these on B5. I am going to leave it on the top crash list but if we don't see anything when 10 rolls out, I will remove the keyword.
(In reply to Sheila Mooney from comment #27)
> Still only one of these on B5.
I guess that all Beta testers with the underlying problem haven't upgraded to 10.0b5 (they should have downloaded it again). Usually, a spike in startup crashes in the Beta channel happens only in one Beta version.
(In reply to Scoobidiver from comment #28)
> (In reply to Sheila Mooney from comment #27)
> > Still only one of these on B5.
> I guess that all Beta testers with the underlying problem haven't upgraded
> to 10.0b5 (they should have downloaded it again). Usually, a spike in
> startup crashes in the Beta channel happens only in one Beta version.

That was one of the working theories. But having performed code inspections, attempts at reproduction, and having reached out to all affected users (who left their emails) with a way of simply bisecting the issue, we don't have any next actions.
This crash is back in 10.0.
In looking at the crash data for 10, I noticed a few reports that had the frankenfox theme issue which I asked for investigation in Bug 712824 - could be nothing but just noting.

https://crash-stats.mozilla.com/report/index/f650fb99-210a-4afb-a985-9216f2120201
https://crash-stats.mozilla.com/report/index/3cb917f0-c6fa-46b7-b957-930ed2120131
Still the volume seems to be restricted to 10b4 and 10. No spike in crashes on 9.0.1 at all. We seemed to be blocked on stuff to investigate further. Anybody have any ideas?
(In reply to Sheila Mooney from comment #32)
> Still the volume seems to be restricted to 10b4 and 10. No spike in crashes
> on 9.0.1 at all.

At least that doesn't really sound like malware, as then we should see it at least on 9 as well, where the majority of common users are now - at least if it's not a specific incompatibility of one malware with 10. Not sure how much that data point helps in the end, though.
(In reply to Sheila Mooney from comment #32)
> Still the volume seems to be restricted to 10b4 and 10. No spike in crashes
> on 9.0.1 at all. We seemed to be blocked on stuff to investigate further.
> Anybody have any ideas?

We spoke in the channel meeting about doing the same outreach in https://bugzilla.mozilla.org/show_bug.cgi?id=718389#c25 again for 10.0 users. I'll email Cheng.
(In reply to John Hesling [:John99] from comment #35)
> Is it of any interest to note a user is having both bug718389 and bug716786
Sorry the other bug is bug716386
(In reply to John Hesling [:John99] from comment #35)
> https://crash-stats.mozilla.com/report/index/cd7fb4a1-bab8-4f6d-88dc-
> 81f482120204
> https://crash-stats.mozilla.com/report/index/9616c909-8a07-41be-b31b-
> 84da22120203
Between these two crashes, one in 9.0.1 and one in 10.0, Google Desktop became compatible again although it is no longer supported (see http://googledesktop.blogspot.com/).

(In reply to John Hesling [:John99] from comment #36)
> (In reply to John Hesling [:John99] from comment #35)
> > Is it of any interest to note a user is having both bug718389 and bug716786
> Sorry the other bug is bug716386
They are probably the same bug.
The first correlation files for 10.0 were generated on January 26th:
https://crash-analysis.mozilla.com/crash_analysis/20120126/
Now all 10.0 correlation files are copies of these files!
It means that 10.0 correlations are not updated.

Nevertheless, here are correlations for non-Windows DLLs on January 26th:
    26% (42/162) vs.   5% (633/13378) igd10umd32.dll (Intel Graphics Accelerator Drivers)
         20% (32/162) vs.   1% (172/13378) 8.15.10.2202
          5% (8/162) vs.   0% (19/13378) 8.15.10.2266
          1% (2/162) vs.   0% (38/13378) 8.15.10.2353
     20% (32/162) vs.   0% (48/13378) pchook32.dll (BitDefender)
         20% (32/162) vs.   0% (46/13378) 14.0.13.41
          0% (0/162) vs.   0% (1/13378) 15.0.26.931
          0% (0/162) vs.   0% (1/13378) 15.0.32.1381
      9% (15/162) vs.   0% (19/13378) srvlsa.dll (virus?)
      7% (12/162) vs.   1% (103/13378) mfc80u.dll (MFCDLL Shared Library)
          7% (12/162) vs.   0% (64/13378) 8.0.50727.6195
      8% (13/162) vs.   2% (307/13378) RocketDock.dll (RocketDock)
      7% (11/162) vs.   1% (165/13378) AcSignIcon.dll (AutoCAD)
          2% (3/162) vs.   0% (24/13378) 17.2.56.0
          5% (8/162) vs.   0% (38/13378) 18.0.55.0
          0% (0/162) vs.   0% (15/13378) 18.1.49.0
          0% (0/162) vs.   0% (24/13378) 18.2.51.0
          0% (0/162) vs.   0% (1/13378) 19.0.42.200
      6% (9/162) vs.   0% (16/13378) keyManager.dll (Acer eDataSecurity Management)
          6% (9/162) vs.   0% (9/13378) 2.2.0.18
          0% (0/162) vs.   0% (7/13378) 2.5.26.32
      6% (9/162) vs.   0% (16/13378) BatchCrypto.dll (Acer eDataSecurity Management)
          0% (0/162) vs.   0% (7/13378) 2.5.15.4035
          6% (9/162) vs.   0% (9/13378) 2.5.3026.14
      6% (9/162) vs.   0% (16/13378) MsnChatHook.dll (2.5.3.11) (Acer eDataSecurity Management)
      6% (9/162) vs.   0% (17/13378) mfc80ESP.dll (Microsoft Visual Studio)
          0% (0/162) vs.   0% (1/13378) 8.0.50727.4053
          6% (9/162) vs.   0% (15/13378) 8.0.50727.6195
          0% (0/162) vs.   0% (1/13378) 8.0.50727.762
      6% (9/162) vs.   0% (21/13378) ShowErrMsg.dll (Acer eDataSecurity Management)
          0% (0/162) vs.   0% (7/13378) 2.5.23.4035
          6% (9/162) vs.   0% (9/13378) 2.5.3024.22
          0% (0/162) vs.   0% (5/13378) 3.1.1.1
      6% (9/162) vs.   0% (24/13378) CryptoAPI.dll (Acer eDataSecurity Management)
          0% (0/162) vs.   0% (3/13378) 2.2.0.11
          6% (9/162) vs.   0% (16/13378) 2.2.0.34
          0% (0/162) vs.   0% (5/13378) 3.0.59.32
Summary: Startup crash @ PR_EnumerateAddrInfo | nsDNSRecord::GetNextAddr (mainly related to malware) → Startup crash @ PR_EnumerateAddrInfo | nsDNSRecord::GetNextAddr
Depends on: 724799
Here are 10.0.1 correlations for non-Windows DLLs on February 13th:
     19% (84/432) vs.   8% (1966/25105) GrooveUtil.DLL (MS Office)
          5% (22/432) vs.   3% (646/25105) 12.0.4518.1014
          0% (1/432) vs.   0% (52/25105) 12.0.6211.1000
          1% (5/432) vs.   0% (114/25105) 12.0.6423.1000
          0% (2/432) vs.   0% (45/25105) 12.0.6550.5004
         10% (45/432) vs.   3% (777/25105) 12.0.6562.5000
          2% (9/432) vs.   1% (331/25105) 12.0.6606.1000
     33% (142/432) vs.  22% (5638/25105) mdnsNSP.dll (Bonjour)
          1% (4/432) vs.   2% (410/25105) 1.0.3.1
          1% (5/432) vs.   1% (217/25105) 1.0.6.2
          1% (4/432) vs.   0% (32/25105) 2.0.1.2
          1% (3/432) vs.   0% (105/25105) 2.0.2.0
          1% (5/432) vs.   0% (116/25105) 2.0.3.0
          1% (6/432) vs.   1% (337/25105) 2.0.4.0
          3% (13/432) vs.   1% (341/25105) 2.0.5.0
         19% (84/432) vs.  15% (3664/25105) 3.0.0.10
          4% (18/432) vs.   1% (319/25105) 3.0.0.2
     11% (46/432) vs.   2% (555/25105) GoogleDesktopNetwork3.dll (Google Desktop)
          0% (1/432) vs.   0% (12/25105) 
          1% (4/432) vs.   0% (12/25105) 5.1.703.26697
          2% (7/432) vs.   0% (11/25105) 5.1.706.29690
          1% (3/432) vs.   0% (12/25105) 5.7.802.22438
          0% (1/432) vs.   0% (11/25105) 5.8.809.23506
          6% (28/432) vs.   2% (406/25105) 5.9.1005.12335
          0% (2/432) vs.   0% (19/25105) 5.9.911.3589
      9% (37/432) vs.   3% (670/25105) BtMmHook.dll (Broadcom Bluetooth Software)
          2% (7/432) vs.   0% (11/25105) 6.0.1.3900
          0% (2/432) vs.   0% (11/25105) 6.0.1.6200
          1% (3/432) vs.   0% (13/25105) 6.0.1.6300
          0% (1/432) vs.   0% (4/25105) 6.2.0.4100
          0% (2/432) vs.   0% (2/25105) 6.2.0.6600
          0% (1/432) vs.   0% (2/25105) 6.2.0.7600
          0% (1/432) vs.   0% (6/25105) 6.2.0.8800
          0% (2/432) vs.   0% (24/25105) 6.2.0.9600
          0% (1/432) vs.   0% (13/25105) 6.2.0.9602
          1% (3/432) vs.   0% (6/25105) 6.2.0.9700
          0% (1/432) vs.   0% (11/25105) 6.2.1.100
          0% (1/432) vs.   0% (11/25105) 6.2.1.800
          1% (4/432) vs.   0% (8/25105) 6.3.0.3102
          0% (1/432) vs.   0% (2/25105) 6.3.0.4500
          0% (2/432) vs.   0% (39/25105) 6.3.0.5600
          1% (4/432) vs.   0% (57/25105) 6.3.0.6300
          0% (1/432) vs.   0% (4/25105) 6.3.0.7600

In 11% of crashes, Google Desktop is present; it is an extension that has been outdated since September 2011 (see http://googledesktop.blogspot.com/).
Firefox 10 makes extensions compatible by default so maybe Google Desktop became compatible and caused some of those crashes.
(In reply to Scoobidiver from comment #39)
> Here are are 10.0.1 correlations for non-Windows DLLs on February 13th:
[...]
> In 11% of crashes, there's Google Desktop that is an outdated extension from
> Sept, 2011 (see http://googledesktop.blogspot.com/).
> Firefox 10 makes extensions compatible by default so maybe Google Desktop
> became compatible and caused some of those crashes.

I used Google Desktop - I think it's not an extension but a separate program. I don't think the compatible-by-default has any effect on dlls from other programs that communicate with Firefox.
Comment 20 makes me wonder if there was some kind of OS upgrade that left users with no symbols and unversioned OS DLLs. I wonder if an MS Patch Tuesday updated DNS code lately and maybe tickled bugs.

The other thing that looks like it still needs checking out is Patrick's comment 23 about IPv6 testing.
(In reply to chris hofmann from comment #41)
> The other thing that looks like it still needs check out is patrick's
> comment in comment 23 about ipv6 testing.

Marcia and I chatted about this today.  I did some basic IPv6 tests on my Mac and Win7 VM, using 10.0.1 and ipv6.test-ipv6.com, which says it's v6 only.  Couldn't reproduce the issue.  Note, I didn't have any addons installed, and I tried with various combinations of "network.dns.disableIPv6" and "network.http.fast-fallback-to-IPv4".  No crashes.

I also forced IPv6 only on my MacOS X settings and also on the bridged connection from my Win7 VM (using Google's public DNS servers).  Again, with various combinations of the two prefs.  Still no crashes on either platform.

Same with IPv4 only with OS settings - no crash.

Might be worthwhile to run these kind of tests with plugins or external apps running.  Marcia said she had been looking through the crash report comments to see if there were any other leads to follow apart from correlations.  I took a quick look through, installing Yahoo toolbar, tried accessing Facebook, tried launching links from the desktop … no luck, no crash.
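To make the "various combinations" repeatable, the four combinations of the two prefs can be generated rather than toggled by hand. Here is a small Python sketch that prints user.js-style lines for each combination. Only the two pref names come from the comment above; the helper names are illustrative.

```python
from itertools import product

PREFS = ["network.dns.disableIPv6", "network.http.fast-fallback-to-IPv4"]

def pref_matrix(prefs):
    """Yield one dict per true/false combination of the given boolean prefs."""
    for values in product([False, True], repeat=len(prefs)):
        yield dict(zip(prefs, values))

def as_user_js(combo):
    """Render one combination as user.js lines."""
    return "\n".join(
        f'user_pref("{name}", {str(value).lower()});'
        for name, value in combo.items()
    )

combos = list(pref_matrix(PREFS))
for i, combo in enumerate(combos, 1):
    print(f"// combination {i}")
    print(as_user_js(combo))
```

Each block of output can be dropped into a fresh test profile, so every run starts from a known pref state.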
I hope this is not considered bugspam, but I note a user has these crashes in fx10.0.1 but not in fx3.6 on the same machine (this bug is branch 10, but I note a smattering of crashes on other branches). As this is not yet reproducible, I wondered if anyone may wish to take advantage of the fact that users with such crashes are in contact with us: https://support.mozilla.org/en-US/questions/919526
One user has found malware to cause this crash and has solved his problem https://support.mozilla.org/en-US/questions/918656#answer-311279
(In reply to John Hesling [:John99] from comment #43)
> https://support.mozilla.org/en-US/questions/919526
In his case, it might be caused by MyWinLocker.

(In reply to John Hesling [:John99] from comment #44)
> One user has found malware to cause this crash and has solved his problem
> https://support.mozilla.org/en-US/questions/918656#answer-311279
We know that some crashes are caused by malware (see comment 7) but we don't know the proportion.
Someone needs to develop a script to remove Windows and Firefox DLLs in crash report modules in order to know loaded software, when there are some.
(In reply to Scoobidiver from comment #45)
> Someone needs to develop a script to remove Windows and Firefox DLLs in
> crash report modules in order to know loaded software, when there are some.

Easier said than done. Graphics drivers and similar things we load change, and other libraries we might intentionally load might as well. And then there are security suites and add-ons loading different sorts of binaries. It's pretty hard to distinguish "friend or enemy" when looking into details.
The key is to figure out some method of automation for this, with the goal of filtering noise. We should have the data for this since we have debug symbols for Firefox and Windows versions, so we should have the .dll data from debugging symbols that would allow building in some automation.

We should investigate it from that end. Let's get a separate bug on file.
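A rough sketch of what such a filter could look like, in Python. The allowlist is a tiny hand-written sample of module names already seen in this bug. A real one would have to be built from the symbol data, and graphics drivers and security suites would still make the friend-or-enemy split fuzzy.

```python
KNOWN_MODULES = {
    # Firefox / NSS modules seen in the reports in this bug
    "firefox.exe", "xul.dll", "nspr4.dll", "softokn3.dll", "freebl3.dll",
    "nssckbi.dll", "nssdbm3.dll", "browsercomps.dll",
    # Windows modules seen in the reports in this bug
    "ntdll.dll", "urlmon.dll", "wininet.dll", "dnsapi.dll", "wintrust.dll",
}

def third_party_modules(modules, known=KNOWN_MODULES):
    """Return (name, version) pairs whose name is not in the known set."""
    known_lower = {name.lower() for name in known}
    return [(name, version) for name, version in modules
            if name.lower() not in known_lower]

# Sample module list; an empty version string means no version was recorded
report_modules = [
    ("xul.dll", ""),
    ("pchook32.dll", "14.0.13.41"),
    ("srvlsa.dll", ""),
    ("dnsapi.dll", ""),
    ("mdnsNSP.dll", "3.0.0.10"),
]

for name, version in third_party_modules(report_modules):
    print(name, version or "(unversioned)")
```

Matching on name alone is the weak point: a malicious DLL can reuse a known name, so a real filter would probably need to match on the debug identifier as well.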
For reasons I can't explain, we only have 5 of these so far on 10.0.2 but it was sitting at #3 on 10.0.1.
(In reply to Sheila Mooney from comment #48)
> For reasons I can't explain, we only have 5 of these so far on 10.0.2 but it
> was sitting at #3 on 10.0.1.
It's a startup crash so users who hit it in 10.0 or 10.0.1 coming from 9.0.1 won't upgrade to 10.0.2.
No longer depends on: 724799
So we have about 8 in FF11b4 which is way lower than the spike we saw in 10.0b4. There are only 11 in 10.0.2 in the past week. I am removing the top crash keyword because the spike seems to be gone and we really don't know what the next action is.
Keywords: topcrash
Crash Signature: [@ PR_EnumerateAddrInfo] [@ PR_EnumerateAddrInfo | nsDNSRecord::GetNextAddr] → [@ PR_EnumerateAddrInfo] [@ PR_EnumerateAddrInfo | nsDNSRecord::GetNextAddr(unsigned short, PRNetAddr*)]
Given comment 50 and the fact that this bug has gone nowhere in 3 months, removing qawanted. Please re-add if there is something more QA can do on this bug.
Keywords: qawanted
For what it's worth, last time I tried, I could still reproduce _a_ startup crash (I saw this crash signature previously) with Firefox 13 when it was in beta, but crash reporter wasn't coming up. I haven't tried with a newer version on the same computer, because there doesn't seem to be any point. The computer will be replaced soon anyway and until then I'm using other browsers.

Most people aren't going to continue to try and start Firefox if it crashes on start up. Not sure what that means for this bug... it's more of a CANTFIX than a WONTFIX or a WORKSFORME.
only one crash with this signature in past week, for version 14
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME