Closed Bug 524944 Opened 15 years ago Closed 5 years ago

Crash on Windows XP [@BaseThreadStart ][@ @0x0 | BaseThreadStart ][@ @0x0 | @0x10dc2bc1 | BaseThreadStart ][@ @0x0 | @0x10dc2bbd | BaseThreadStart ] with Trojan:Win32/Daonol.H and other malwares

Categories

(Firefox :: General, defect)

x86
Windows XP
defect
Not set
critical

Tracking

RESOLVED WORKSFORME
Tracking Status
firefox10 - ---
firefox47 --- wontfix
firefox48 --- wontfix
firefox49 --- wontfix
firefox-esr45 --- wontfix
blocking2.0 --- -

People

(Reporter: cbook, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: crash, Whiteboard: [crashkill][need investigation])

Attachments

(7 files)

from topcrash stats - currently topcrash #32 - windows only:

http://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A3.5.3&query_search=signature&query_type=exact&query=&date=&range_value=1&range_unit=weeks&do_query=1&signature=BaseThreadStart

seems not to be a start up crash - maybe the urls would be helpful ?
Whiteboard: [crashkill] need investigation
For reference, BaseThreadStart is a Win32 function, so ignoring Wine, you'll never see it on another platform. It typically means someone created a thread but passed bad parameters (unloaded library pointer, bogus function pointer, bad data, damaged heap). No time to look further atm.
699 total crashes for ^BaseThreadStart on 20091112-crashdata.csv
78 start up crashes inside 3 minutes

os breakdown
 387 BaseThreadStart Windows NT 5.1.2600 Service Pack 3
 259 BaseThreadStart Windows NT 5.1.2600 Service Pack 2
  20 BaseThreadStart Windows NT 5.1.2600 Service Pack 1
  13 BaseThreadStart Windows NT 5.1.2600
  10 BaseThreadStart Windows NT 5.1.2600 Dodatek Service Pack 3
   4 BaseThreadStart Windows NT 5.1.2600 Szervizcsomag 3
   3 BaseThreadStart Windows NT 5.1.2600 Dodatek Service Pack 2
   1 BaseThreadStart Windows NT 5.2.3790 Service Pack 2
   1 BaseThreadStart Windows NT 5.1.2600 Szervizcsomag 2
   1 BaseThreadStart Windows NT 5.1.2600 Service Pack 3, v.5857

distribution of all versions where the ^BaseThreadStart crash was found on 20091112-crashdata.csv
 377 Firefox 3.5.5
 194 Firefox 3.0.15
  27 Firefox 3.5.3
  25 Firefox 3.5.4
  11 Firefox 3.6b2
   9 Firefox 3.6b1
   6 Firefox 3.1b2
   6 Firefox 3.0.13
   5 Firefox 3.0.3
[and more]

domains of sites and browsing patterns look widespread

  97 //
  82 \N//
  33 http://apps.facebook.com
  25 http://www.facebook.com
  24 about:blank//
  11 http://www.google.it
  10 http://www.myspace.com
   9 http://www.youtube.com
   9 http://www.google.de
   8 http://www.google.com
   5 http://www.kwick.de
   5 http://home.myspace.com
   5 about:sessionrestore//
[and more]
one strange comment...

see: http://crash-stats.mozilla.com/report/index/800b31d4-c4e3-44bf-837a-d212d2091108

when i try to upgrade my fierfox it says i dont have a lisence code and wont do any updates now what do i do to fix it?? help 

.........

and many more that just express frustration with repeated crashes
automated correlation bot continuing to look for correlation in module, plugins, and addons....

searching http://people.mozilla.com/crash_analysis/
http://people.mozilla.com/crash_analysis/20091113/

  BaseThreadStart|EXCEPTION_ACCESS_VIOLATION (380 crashes)
     47% (178/380) vs.  36% (30553/84094) jqs@sun.com (Java Quick Starter, http://java.sun.com/javase/downloads/) (1.0)
bot says the stack is not too useful on this signature.
bot says: there are lots of related signatures that also might be junk or need fixing during processing.

signature list
 699 BaseThreadStart
 296 @0x0 | @0x10a72b75 | BaseThreadStart
 253 @0x0 | @0x10002b75 | BaseThreadStart
  33 @0x0 | @0x10a62b75 | BaseThreadStart
  15 @0x0 | @0x10a72a81 | BaseThreadStart
  15 @0x0 | @0x10002b61 | BaseThreadStart
  14 @0x0 | @0x10002a81 | BaseThreadStart
  13 @0x0 | @0x10b52b75 | BaseThreadStart
   8 @0x0 | @0x10b42b75 | BaseThreadStart
   8 @0x0 | @0x10002a95 | BaseThreadStart
   6 @0x0 | BaseThreadStart
   6 @0x0 | @0x10a72a29 | BaseThreadStart
   6 @0x0 | @0x10002a29 | BaseThreadStart
   5 @0x0 | @0x10002a4d | BaseThreadStart
   4 @0x0 | @0x10a72bd9 | BaseThreadStart
   4 @0x0 | @0x10a72a4d | BaseThreadStart
   4 @0x0 | @0x10a52b75 | BaseThreadStart
   3 memmove | @0x18c9274 | @0x18c87c3 | @0x18c8f25 | BaseThreadStart
   3 memcpy | @0x1992110 | @0x19987b8 | @0x1998f25 | BaseThreadStart
   3 memcpy | @0x18e2110 | @0x18e87b8 | @0x18e8f25 | BaseThreadStart
   3 @0x0 | @0x10a22b75 | BaseThreadStart
   2 strstr | @0x1ae8fc5 | BaseThreadStart
   2 memmove | @0x24c9274 | @0x24c87ce | @0x24c8f25 | BaseThreadStart
 [and more like this]
   2 memmove | @0x1599274 | @0x15987c3 | @0x1598f25 | BaseThreadStart
   2 memmove | @0x1509274 | @0x15087c3 | @0x1508f25 | BaseThreadStart
   2 memcpy | @0x26b2110 | @0x26b87b8 | @0x26b8f25 | BaseThreadStart
[and more like this]
   2 memcpy | @0x1822110 | @0x18287b8 | @0x1828f25 | BaseThreadStart
   2 memcpy | @0x1142110 | @0x11487b8 | @0x1148f25 | BaseThreadStart
   2 @0x0 | @0x10a72c0a | BaseThreadStart
   2 @0x0 | @0x10a72bbd | BaseThreadStart
   2 @0x0 | @0x10a72b61 | BaseThreadStart
   2 @0x0 | @0x10002bd9 | BaseThreadStart
   1 strstr | @0xb6e8fc5 | BaseThreadStart
  [and tons more like this ]
   1 strstr | @0x22a8fc5 | BaseThreadStart
[and others]
I'm tempted to split the 0x0 from the memcpy/memmove ones.

   2 memmove | @0x24c 9274 | @0x24c 87ce | @0x24c 8f25 | BaseThreadStart
   3 memmove | @0x18c 9274 | @0x18c 87c3 | @0x18c 8f25 | BaseThreadStart
   2 memmove | @0x159 9274 | @0x159 87c3 | @0x159 8f25 | BaseThreadStart
    3 memcpy | @0x199 2110 | @0x199 87b8 | @0x199 8f25 | BaseThreadStart
    3 memcpy | @0x18e 2110 | @0x18e 87b8 | @0x18e 8f25 | BaseThreadStart

So the leading digits (before the space) are basically a random load address, and the last 4 digits are a fixed offset for specific code. We need to figure out what module was loaded there. It's possible that it has since been unloaded.

I'd probably search all threads of all reports for 0x????8f25 - BaseThreadStart (not just crashing threads, and do it on the raw data [well, that means converting BaseThreadStart to some magic form too]). If we're lucky you'll only get a couple distinct libraries which use that offset as a calling out point from BaseThreadStart. The output I'd need is a list of libraries which satisfy that criteria and 2-3 reports for each so that I could see how they're used.
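As a rough illustration of the grouping described above, the sketch below keys each signature on the low 16 bits of the frame that BaseThreadStart called. The helper name `group_by_entry_offset` and the hard-coded sample are assumptions for illustration only; the signatures are copied from the list earlier in this bug. It also assumes module bases are 64 KB aligned (the usual Windows allocation granularity), so that the low 16 bits survive relocation. A real search would run over the raw dumps and all threads, not just these signatures.

```python
from collections import defaultdict

# (count, signature) pairs copied from the signature list above.
SAMPLE = [
    (3, "memmove | @0x18c9274 | @0x18c87c3 | @0x18c8f25 | BaseThreadStart"),
    (3, "memcpy | @0x1992110 | @0x19987b8 | @0x1998f25 | BaseThreadStart"),
    (3, "memcpy | @0x18e2110 | @0x18e87b8 | @0x18e8f25 | BaseThreadStart"),
    (2, "memmove | @0x24c9274 | @0x24c87ce | @0x24c8f25 | BaseThreadStart"),
    (2, "strstr | @0x1ae8fc5 | BaseThreadStart"),
    (296, "@0x0 | @0x10a72b75 | BaseThreadStart"),
    (253, "@0x0 | @0x10002b75 | BaseThreadStart"),
]


def group_by_entry_offset(rows):
    """Hypothetical helper: sum crash counts by the low 16 bits of the
    frame just before BaseThreadStart (the thread's entry point)."""
    totals = defaultdict(int)
    for count, signature in rows:
        frames = [frame.strip() for frame in signature.split("|")]
        if len(frames) < 2 or frames[-1] != "BaseThreadStart":
            continue
        caller = frames[-2]
        if not caller.startswith("@0x"):
            continue  # already symbolized, nothing to mask
        address = int(caller[1:], 16)
        totals[address & 0xFFFF] += count
    return dict(totals)


offsets = group_by_entry_offset(SAMPLE)
for offset, total in sorted(offsets.items(), key=lambda item: -item[1]):
    print(f"0x????{offset:04x}: {total} crashes")
```

Grouping this way makes the memcpy/memmove family (offset 8f25) and the @0x0 family (offset 2b75) show up as two separate entry points, regardless of load address.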
re comment 5 and the message about a "license file":

from irc conversation on sumo with bo

<chofmann> be on the look out for incompatibity/crash on start up problems between zone alarm and firefox 3.5.5.
<Bo>	choffman: I noticed some weirdness with ZoneAlarm's Exteme Security product and upgrading
<chofmann>	https://bugzilla.mozilla.org/show_bug.cgi?id=528798#c11
<Bo>	Not crashes, but an endless loop of failed updates and odd errors about a missing license agreement. I was going to try to tease out something reproducible later.
Summary: Crash at [@BaseThreadStart ] → Crash at [@BaseThreadStart ][@0x0 | @0x10b42bbd | BaseThreadStart]
Summary: Crash at [@BaseThreadStart ][@0x0 | @0x10b42bbd | BaseThreadStart] → Crash at [@BaseThreadStart ][@@0x0 | @0x10b42bbd | BaseThreadStart ]
There is somewhat of a correlation with the Java Quick Starter addon. Data for the past three days on the 3.6b4 branch:

For the [@@0x0 | @0x10b42bbd | BaseThreadStart ] signature
     37% (16/43) vs.  29% (3176/10873) jqs@sun.com (Java Quick Starter, ...)
     71% (10/14) vs.  29% (3702/12858) jqs@sun.com (Java Quick Starter, ...)
     89% (42/47) vs.  29% (3399/11574) jqs@sun.com (Java Quick Starter, ...)

For the [@BaseThreadStart ] signature
     56% (27/48) vs.  29% (3176/10873) jqs@sun.com (Java Quick Starter, ...)
     59% (29/49) vs.  29% (3702/12858) jqs@sun.com (Java Quick Starter, ...)
     59% (35/59) vs.  29% (3399/11574) jqs@sun.com (Java Quick Starter, ...)
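For anyone reading the correlation lines above, here is a small worked example of the arithmetic, using the three rows for the [@@0x0 | @0x10b42bbd | BaseThreadStart ] signature. The function name `correlation` is made up for this sketch, and treating the three rows as three separate days is an assumption based on the "past three days" wording.

```python
def correlation(sig_hits, sig_total, all_hits, all_total):
    """Return (rate within the signature, baseline rate, ratio of the two)."""
    sig_rate = sig_hits / sig_total
    base_rate = all_hits / all_total
    return sig_rate, base_rate, sig_rate / base_rate


# (crashes with jqs@sun.com, crashes in signature,
#  all crashes with jqs@sun.com, all crashes), copied from the rows above.
DAYS = [
    (16, 43, 3176, 10873),
    (10, 14, 3702, 12858),
    (42, 47, 3399, 11574),
]

rows = [correlation(*day) for day in DAYS]
for sig_rate, base_rate, ratio in rows:
    print(f"{sig_rate:4.0%} vs. {base_rate:4.0%}  (ratio {ratio:.1f})")
```

A ratio near 1 means the add-on is no more common in this signature than in crashes overall, so with a 29% baseline the add-on is too widespread for these numbers alone to single it out.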


Continuing investigation...
A lot of these crashes are happening with the following stack:

038d1cdc()	
ntdll.dll!_NtRegisterThreadTerminatePort@4()  + 0xc bytes	
kernel32.dll!_BaseThreadStart@8()  + 0x37 bytes	

...in case that'll help anyone
it could be that their module unloaded (or unloaded something it loaded)
So I have a whacky idea. Create a skidmark that lists all modules as they are loaded, and in what address space they are loaded. Never remove anything from this list even if a module is unloaded. This way we can hopefully see what module was at one time loaded at a particular address.
This is some skidmark code to record every module ever loaded, even if it later gets unloaded. Hopefully we can use this to figure out what dll was once loaded on the address that we're jumping to.

There's some pretty horrible hacks in here. Mostly the global boolean that's exposed by xpcom which is used to signal that we need to find newly loaded dlls. If there's a better way to do this I'm all ears.

Especially if anyone has opinions on better files in which to declare and instantiate the boolean, I'm all ears. I guess ideally I should create new files for it, but it seemed excessive for a single temporary debugging boolean.
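To make the idea concrete, here is a minimal model of the bookkeeping only: an append-only list of modules that is never pruned on unload, plus a lookup by address. This is not the patch itself (the patch is C++ and, per the review comments in this bug, walks modules with Module32First/Module32Next). The class names, module names, bases and sizes below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModuleRecord:
    name: str
    base: int
    size: int


class ModuleLog:
    """Append-only record of every module ever seen.

    Entries are never removed, so a module that was unloaded later
    can still be matched against a crash address.
    """

    def __init__(self):
        self._records = []

    def note_loaded(self, name, base, size):
        record = ModuleRecord(name, base, size)
        if record not in self._records:  # skip exact duplicates on rescans
            self._records.append(record)

    def modules_at(self, address):
        """Return every module that ever covered the given address."""
        return [r for r in self._records if r.base <= address < r.base + r.size]


log = ModuleLog()
log.note_loaded("example_a.dll", 0x10000000, 0x20000)  # hypothetical module
log.note_loaded("example_b.dll", 0x10A70000, 0x10000)  # hypothetical module
# Suppose example_b.dll is unloaded here: the log deliberately keeps its entry.
hits = log.modules_at(0x10A72BBD)
print([record.name for record in hits])
```

If no entry covers the crash address, either nothing was ever loaded there or the module was not visible to whatever enumerated the modules.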
Assignee: nobody → jonas
Attachment #415993 - Flags: review?(benjamin)
Comment on attachment 415993 [details] [diff] [review]
skidmark to record loaded modules

just a quibble: 
>+  // and display information about each module
you aren't displaying it, you're saving it :)
Katsuhiko Momoi provided the following stack in bug 525402:

Signature	@0x0 | @0x10a72bbd | BaseThreadStart
UUID	b62af969-f68c-402c-81b9-574d92091207
Time 	2009-12-07 23:26:55.167882
Uptime	537
Last Crash	82293 seconds before submission
Product	Firefox
Version	3.5.5
Build ID	20091102152451
Branch	1.9.1
OS	Windows NT
OS Version	5.1.2600 Service Pack 3
CPU	x86
CPU Info	GenuineIntel family 6 model 9 stepping 5
Crash Reason	EXCEPTION_ACCESS_VIOLATION
Crash Address	0x0
User Comments	Typed "Firefox in Japan" at www.google.co.jp and clicked on Search button. Right after that Firefox 3.5.x crashed. I can repro this same crash problem at the following search sites: www.yahoo.co.jp www.yahoo.com www.bing.com It does not seem to matter what I put into the search input field. Also it crashes on coming back to a search page. For example, 1. Go to maps.google.com and do some search. 2. Then using the "web" link at top left of the maps page, go to www.google.com -> crash!
Processor Notes 	
Crashing Thread
Frame 	Module 	Signature 	Source
0 		@0x0 	
1 		@0x10a72bbd 	
2 	kernel32.dll 	BaseThreadStart 	

Filename 	Version 	Debug Identifier 	Debug Filename
McVSSkt.Dll 	8.0.0.15 		
SKCHUI.DLL 	1.0.1038.0 	3A8110E61 	skchui.pdb
Apoint.dll 	5.4.102.189 		
Vxdif.dll 	6.0.1.59 	3DDA3D962 	Vxdif.pdb
Imghook.dll 	6.4.0.29 		
imjpcd.dic 	8.1.4202.0 	3F9FAF0A1 	imjpcd.pdb
imjp81.ime 	8.1.4206.0 	443B9BD31 	imjp81.pdb
imjpcic.dll 	8.1.4203.0 	400BA29E1 	imjpcic.pdb
imjp81k.dll 	8.1.4202.0 	3F9FAF862 	imjp81k.pdb
msi.dll 	3.1.4001.5512 	668AD4E2E4404B9CA952C0813B22C32F2 	msi.pdb

So we now have a living victim. Let's talk!
Here are other reports I recently submitted. They are all the same thing:

http://crash-stats.mozilla.com/report/index/bp-8588d080-f649-44dd-b8ac-7ac472091204

http://crash-stats.mozilla.com/report/index/bp-b62af969-f68c-402c-81b9-574d92091207

http://crash-stats.mozilla.com/report/index/e911d105-0d65-4ce8-a8bb-2cc112091205

http://crash-stats.mozilla.com/report/index/84736a96-1dc9-4222-964d-041c22091205

http://crash-stats.mozilla.com/report/index/bp-3d33ef9a-3e53-4f09-89f8-9d49b2091205

http://crash-stats.mozilla.com/report/index/bp-54ef3aea-c0dc-48ca-b0fa-754712091206

http://crash-stats.mozilla.com/report/index/c7582cba-460f-4aaa-99ad-4e07e2091206

===

It just so happens on web search sites with a search box but this feels like a more general bug.

Firefox 3.5.5 used to work OK on these web sites until maybe 1-2 weeks ago. So something may have changed on my side to cause this.

Here are the Windows auto updates I had accepted recently:

1. Windows XP Update for Windows XP (KB976098)  Tuesday, November 24, 2009 Automatic Updates  
2. Windows XP Update for Windows XP (KB973687)  Tuesday, November 24, 2009 Automatic Updates  
3. Windows XP Update for Microsoft XML Core Services 4.0 Service Pack 2 (KB973688)  Tuesday, November 24, 2009 Automatic Updates  
4. Office 2002/XP Security Update for Microsoft Word 2002 (KB973444)  Tuesday, November 10, 2009 Automatic Updates  
5. Office 2002/XP Security Update for Microsoft Excel 2002 (KB973471)  Tuesday, November 10, 2009 Automatic Updates  
6. Windows XP Windows Malicious Software Removal Tool - November 2009 (KB890830)  Tuesday, November 10, 2009 Automatic Updates  
7. Windows XP Security Update for Windows XP (KB969947)  Tuesday, November 10, 2009 Automatic Updates  

I have 2 other Windows XP laptops and this crash problem does not occur with them at all.
A strange fact that I am not sure what to make of.

At www.google.com and www.yahoo.com, only the web search page causes this crash. For example, at www.google.com (top left), there are links (among others) for searches in:

web - images - video - maps - shopping

and yet only the "web" search page causes this crash.

Similarly at www.yahoo.com, there are links above the search home page for:

web - images - video - local - shopping

FFox 3.5.5 crashes only with the web search page here also. 
===

What is common on web search front end at these sites that is not found on other types of search?
And at www.bing.com, the same thing. This web site has searches for

web (the main search box)

On the left side the following additional search links are available (among others):

images
video
shopping
maps

And my FFox only crashes at the web search but not at the others.
katsuhiko: please follow the instructions at:
https://developer.mozilla.org/en/How_to_get_a_stacktrace_with_WinDbg

Specifically (quit firefox and) run firefox from windbg. attach the resulting log file. I'm especially worried based on your description that this is some spyware/malware that targets search engines. (but we'll cross that bridge if we come to it.)
Does the crash happen even if you start firefox in safe mode? If not, can you try disabling your extensions one at a time to see which one is causing the crashes?
Comment on attachment 415993 [details] [diff] [review]
skidmark to record loaded modules

>diff --git a/widget/src/windows/nsAppShell.cpp b/widget/src/windows/nsAppShell.cpp

>+#ifdef XRE_WANT_DLL_BLOCKLIST

This ifdef doesn't make sense to me. The only place XRE_WANT_DLL_BLOCKLIST is defined is in nsAppRunner.cpp, so this entire block should be ignored. I think you just want to remove this ifdef altogether, or perhaps copy the ifdef in nsWindowsWMain.cpp, #if defined(_MSC_VER) && defined(_M_IX86)

>diff --git a/xpcom/base/nsDebugImpl.cpp b/xpcom/base/nsDebugImpl.cpp

>+NS_COM PRBool sXPCOMHasLoadedNewDLLs = 0;

This, and the declaration in nsError.h, should be #ifdef XP_WIN
Attachment #415993 - Flags: review?(benjamin) → review+
Attached file: WinDbg log
Went to www.google.com, entered a search term and then clicked on the search button.
About the above WinDbg debug log, taken following the instructions timeless mentioned above: I am not sure if this log is useful. It was very hard to get any recent Firefox or Minefield build to start from the debug command line. I also kept seeing an error that the loaded symbols are not correct.
I engaged "Go" from the WinDbg Debug menu several times before Firefox/Minefield came up.
Comment on attachment 416718 [details]
WinDbg log

thanks, this is great

So, this line is absolutely unexpected, could you please contact someone via http://support.mozilla.com/chat, identify yourself and this bug and indicate that timeless needs someone to investigate this file:
>ModLoad: 00d70000 00d81000   C:\DOCUME~1\KATSUH~1\LOCALS~1\Temp\qcwe.dat

-- I don't have the resources to investigate it and i'm pretty sure we shouldn't just upload it to bugzilla --

>(5b0.d18): Access violation - code c0000005 (first chance)
>First chance exceptions are reported before any exception handling.
>This exception may be expected and handled.
>eax=77606034 ebx=00000000 ecx=00000000 edx=7c90e514 esi=00000000 edi=00dc0000
>eip=774fd070 esp=0012e410 ebp=0012e418 iopl=0         nv up ei pl zr na pe nc
>cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010246
>ole32!CoTaskMemAlloc+0x10:
>774fd070 ff510c          call    dword ptr [ecx+0Ch]  ds:0023:0000000c=????????

This is also unexpected, when you get here (if it happens consistently), please do:

!analyze -v -f

and then g, I've never seen a crash like this which was handled, but based on what happens in your log, it seems like someone expected this. I'd love to know why.

>0:000> g

note to self: These are expected because you're Japanese and have an ime:
>ModLoad: 47e30000 47e86000   C:\WINDOWS\system32\imjp81.ime
>ModLoad: 63220000 632f0000   C:\WINDOWS\system32\imjp81k.dll

I don't know these by heart, i'm guessing windows zero conf, if they're not, someone please correct me:

>ModLoad: 73030000 73040000   C:\WINDOWS\system32\WZCSAPI.DLL
>ModLoad: 7db10000 7db9c000   C:\WINDOWS\system32\WZCSvc.DLL

I do not recognize these at all:

>ModLoad: 72810000 7281b000   C:\WINDOWS\system32\EapolQec.dll
>ModLoad: 726c0000 726d6000   C:\WINDOWS\system32\QUtil.dll


Expected:
>ModLoad: 3b100000 3b11b000   C:\WINDOWS\IME\IMJP8_1\Dicts\IMJPCD.DIC
>ModLoad: 631c0000 6321b000   C:\WINDOWS\IME\imjp8_1\IMJPCIC.DLL

Again, this strange duck:
>ModLoad: 00d70000 00d81000   C:\DOCUME~1\KATSUH~1\LOCALS~1\Temp\qcwe.dat

Again, I'd like to see the analysis for this:
>(53c.964): Access violation - code c0000005 (first chance)
>First chance exceptions are reported before any exception handling.
>This exception may be expected and handled.
>eax=77606034 ebx=00000000 ecx=00000000 edx=7c90e514 esi=00000000 edi=00dc0000
>eip=774fd070 esp=0012e410 ebp=0012e418 iopl=0         nv up ei pl zr na pe nc
>cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010246
>ole32!CoTaskMemAlloc+0x10:
>774fd070 ff510c          call    dword ptr [ecx+0Ch]  ds:0023:0000000c=????????
>1:001> g

This is odd:
>Missing image name, possible paged-out or corrupt data.
>Missing image name, possible paged-out or corrupt data.
>0:000> g

And by this point, we're in trouble. Well, at least we have one suspect, that's a good start!
>(53c.bec): Access violation - code c0000005 (first chance)
>First chance exceptions are reported before any exception handling.
>This exception may be expected and handled.
>eax=00000107 ebx=10b4de8c ecx=fffffef7 edx=01f30568 esi=00019000 edi=01f30568
>eip=00000000 esp=0713fe54 ebp=0713fe94 iopl=0         nv up ei pl nz na po nc
>cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010202
>Missing image name, possible paged-out or corrupt data.
>Missing image name, possible paged-out or corrupt data.
>Missing image name, possible paged-out or corrupt data.
>00000000 ??              ???
>1:020> kp
>ChildEBP RetAddr  
>WARNING: Frame IP not in any known module. Following frames may be wrong.
>0713fe50 10b42953 0x0
>0713fe94 10b42bbe <Unloaded_on.dll>+0x10b42952
>0713ffb4 7c80b729 <Unloaded_on.dll>+0x10b42bbd
>0713ffec 00000000 kernel32!BaseThreadStart+0x37

>1:020> lm
>start    end        module name

Of note, our suspect does not appear in lm. I like it as a suspect!

>Unloaded modules:
>3d930000 3da16000   wininet.dll
>00000001 4ae94822   on.dll  
>001e97b7 00839820   .dll    
>Missing image name, possible paged-out or corrupt data.
>006d0072 00d900d3   Unknown_Module_006d0072
>Missing image name, possible paged-out or corrupt data.
>00740061 00740061   Unknown_Module_00740061
>Missing image name, possible paged-out or corrupt data.
>006c006c 006c006c   Unknown_Module_006c006c

Sadly, this behavior (vaguely garbage views) is expected
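A quick arithmetic check on the stack above, offered as one possible reading of the log rather than anything the debugger states: the offsets printed against <Unloaded_on.dll> are what you would get if WinDbg treated the "on.dll" row of the unloaded-module list (start 00000001) as a real module. The values are copied from the kp and "Unloaded modules" output; the variable names are made up for this sketch.

```python
# (return address from the previous frame, offset WinDbg printed for the frame)
FRAMES = [
    (0x10B42953, 0x10B42952),
    (0x10B42BBE, 0x10B42BBD),
]

# The "on.dll" row from the unloaded-modules list: (start, end).
ON_DLL = (0x00000001, 0x4AE94822)

# Base address implied by each frame: address minus printed offset.
implied_bases = {address - offset for address, offset in FRAMES}
print([hex(base) for base in implied_bases])

# Do both addresses fall inside the range claimed by the on.dll row?
in_range = all(ON_DLL[0] <= address < ON_DLL[1] for address, _ in FRAMES)
print(in_range)
```

Under that reading, the <Unloaded_on.dll> label says nothing about which module really lived at 0x10b4xxxx; it only reflects a garbage entry in the unloaded-module list.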

Thanks a lot for the log. Please comment once you've spoken with support and please do provide another log for the earlier exception.
Thanks, timeless. I looked into "C:\DOCUME~1\KATSUH~1\LOCALS~1\Temp\qcwe.dat" and it turned out to be a Trojan. See:

http://www.microsoft.com/security/portal/Threat/Encyclopedia/Entry.aspx?Name=Trojan%3AWin32%2FDaonol.H

I tried 3-4 virus removers. Microsoft Security Essentials with updates recognizes this threat and removes it, but qcwe.dat keeps on coming back. I have a log of MSE showing something like 40-50 removals of the same thing. The same goes for the registry key "midi9" mentioned in this article. The summary of all this is that the removal programs I tried could not eliminate the file that keeps on producing the registry key and the qcwe.dat file.

I found another article somewhere (I don't remember the URL) that said that there should be an executable in a directory below the one containing "qcwe.dat". 


So I decided to remove this executable, the problem registry key and .dat file manually -- the MS article above mentions where to find the .dat file and that was useful.  

I removed some suspicious directories under the "qcwe.dat" directory -- not sure which one contained the offending executable, however. Deleted the registry key "midi9" and deleted the file "qcwe.dat". After several tries, these offending key and file stopped coming back. After 2 reboots, that state seems to be holding.  

And most importantly, Firefox/Minefield don't crash on entering a search term any more.
Summary: Crash at [@BaseThreadStart ][@@0x0 | @0x10b42bbd | BaseThreadStart ] → Crash at [@BaseThreadStart ][@@0x0 | @0x10b42bbd | BaseThreadStart ] Trojan:Win32/Daonol.H
Comment on attachment 415993 [details] [diff] [review]
skidmark to record loaded modules

>+  MODULEENTRY32 module;

>+  PRBool done = !Module32First(hModuleSnap, &module);

>+    done = !Module32Next(hModuleSnap, &module);
Don't these need to be the W variants?
Attached patch: bustage fix
sicking's patch, that I just pushed, to unbust Thunderbird and SeaMonkey's Windows builds.
has this signature shifted to 

[@ @0x0 | @0x10002bbd | BaseThreadStart ]

in 3.6b5?

http://crash-stats.mozilla.com/report/list?range_value=2&range_unit=weeks&signature=%400x0%20|%20%400x10002bbd%20|%20BaseThreadStart&version=Firefox%3A3.6b5
Summary: Crash at [@BaseThreadStart ][@@0x0 | @0x10b42bbd | BaseThreadStart ] Trojan:Win32/Daonol.H → Crash at [@BaseThreadStart ][@@0x0 | @0x10b42bbd | BaseThreadStart ] Trojan:Win32/Daonol.H
Chofmann: Looks like the same crash, just with the offending dll loaded to a different base address.

Katsuhiko: Awesome that you found, and fixed!, the source of your crash. We should look into whether we can block that trojan.

I still went ahead and landed the skidmark on m-c. It'd be good to verify whether others are seeing the same crash.

http://hg.mozilla.org/mozilla-central/rev/eeba84d24038
http://hg.mozilla.org/mozilla-central/rev/f6f1982d758d
Summary: Crash at [@BaseThreadStart ][@@0x0 | @0x10b42bbd | BaseThreadStart ] Trojan:Win32/Daonol.H → Crash at [@BaseThreadStart ] [@@0x0 | @0x10b42bbd | BaseThreadStart ] Trojan:Win32/Daonol.H [@ @0x0 | @0x10002bbd | BaseThreadStart ]
Depends on: 536269
Attached patch: Branch skidmark
Not sure if we need this on branch yet. But attaching so that I don't lose it.
Since the skidmark has lived on m-c for a few days now I looked at a few minidumps to see if it had generated any valuable data. The skidmark code seems to work, however it doesn't appear that a dll has ever been loaded at the address where we're crashing.

I can think of two explanations:
1. There has never in fact been data loaded at the address where the crash
   occurs. Someone is simply registering a bad address for a callback.
2. A dll is (or was) loaded at the given address, but somehow it's hiding itself
   to avoid detection.

I'm not actually sure that 2 is possible, but I wouldn't be surprised if it's something that viruses and their like attempt to do.
I think the skidmark patch caused a new topcrash in CollectNewLoadedModules: bug 537417.
are there any possible defenses against the things in comment 33?

what are possible next steps?

looks like the signature has moved to @0x0 | @0x10b42bc5 | BaseThreadStart in 3.6.3
Summary: Crash at [@BaseThreadStart ] [@@0x0 | @0x10b42bbd | BaseThreadStart ] Trojan:Win32/Daonol.H [@ @0x0 | @0x10002bbd | BaseThreadStart ] → Crash at [@BaseThreadStart ] [@@0x0 | @0x10b42bbd | BaseThreadStart ] Trojan:Win32/Daonol.H [@ @0x0 | @0x10002bbd | BaseThreadStart ] [@ @0x0 | @0x10b42bc5 | BaseThreadStart ] (3.6.3)
assuming that the dll hooks in using winapi startup hooks, it could hide long before sicking's code starts looking.
There is a spike in recent trunks with BaseThreadStart signature.
It is now #17 top crasher in 3.6.9, 3.6.10 and b7pre/20100928 build.

Here is a list of useful comments :
"again, waking up after hibernation. i am afraid I have somehow corrupted ram"
"On startup -- possibly while restoring a multi-tab session."
"crashing every three min. help help"
"This is a constant problem where browser crashes every 2-3mins. I have the same problem with IE8."
"redirect malware that I can't remove"
"I clicked 'allow' on NoScript for spaceparanoisonline.com"
"My Gmail account and the Weather Channel were up. I have had the crash problem with Explorer 8,also, ever since I installed XP Service Pack 3."
"I'm having problems with my mozilla firefox when I am on the internet I will only be on it for a few seconds and it takes me off the internet. this has been going on for the past few days between 2 to 3 days at the most." (26/09/10)
"I just installed "Unity Player""
blocking2.0: --- → ?
3748 crashes for BaseThreadStart on sept 26

On those comments about "it crashes every 3 minutes": there is a small spike in the distribution of time since last crash, but it happens just after 2 minutes (around 122 seconds), and it's still a small percentage of the overall crash volume.

seconds since last crash   count
91 2
93 2
95 1
103 1
104 1
107 1
109 2
112 1
115 2
121 6
122 83
123 87
124 66
125 49
126 54
127 26
128 41
129 39
130 21
131 25
132 19
133 20
134 22
135 22
136 22
137 7
138 6
139 7
140 11
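To put numbers on the excerpt above, the sketch below finds the peak bucket and how much of the listed range sits between 122 and 136 seconds. The data is copied from the table; the 122-136 window is a choice made for this illustration, and the shares are relative to this excerpt only, not to the full day's crash volume.

```python
# seconds since last crash -> count, copied from the table above.
HISTOGRAM = {
    91: 2, 93: 2, 95: 1, 103: 1, 104: 1, 107: 1, 109: 2, 112: 1, 115: 2,
    121: 6, 122: 83, 123: 87, 124: 66, 125: 49, 126: 54, 127: 26, 128: 41,
    129: 39, 130: 21, 131: 25, 132: 19, 133: 20, 134: 22, 135: 22, 136: 22,
    137: 7, 138: 6, 139: 7, 140: 11,
}

# Bucket with the highest count.
peak_seconds, peak_count = max(HISTOGRAM.items(), key=lambda item: item[1])

# Crashes inside the (arbitrarily chosen) 122-136 second window.
window_total = sum(n for secs, n in HISTOGRAM.items() if 122 <= secs <= 136)
excerpt_total = sum(HISTOGRAM.values())

print(f"peak: {peak_count} crashes at {peak_seconds} seconds")
print(f"122-136 s: {window_total} of {excerpt_total} listed crashes")
```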
oops, the labels on the graph should have been reversed.
BaseThreadStart crashes have spiked both in recent mozilla-central nightlies and in 4.0b6, so I think the spike is something unrelated to code changes.  Here are the crash numbers for 4.0b6:

[dbaron@dm-peep01 crash_analysis]$ ls 201009*/*.csv.gz | sort | tail -15 | while read FNAME; do echo -n $(echo $FNAME | cut -d/ -f1); echo -n " "; zcat $FNAME | cut -f1,8 | grep "^BaseThreadStart     4\.0b6$" | wc -l ; done;
20100916 27
20100917 56
20100918 74
20100919 31
20100920 111
20100921 139
20100922 102
20100923 87
20100924 152
20100925 234
20100926 217
20100927 222
20100928 310
20100929 300
20100930 479

Compared with the total number of crashes in those builds (to get a sense of usage):

[dbaron@dm-peep01 crash_analysis]$ ls 201009*/*.csv.gz | sort | tail -15 | while read FNAME; do echo -n $(echo $FNAME | cut -d/ -f1); echo -n " "; zcat $FNAME | cut -f8 | grep "^4\.0b6$" | wc -l ; done;
20100916 14139
20100917 17252
20100918 19097
20100919 21877
20100920 22254
20100921 23145
20100922 24477
20100923 22291
20100924 20218
20100925 22321
20100926 23744
20100927 20604
20100928 25035
20100929 24850
20100930 23602
This is based on the CSV data in http://people.mozilla.com/crash_analysis/
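Dividing the first list by the second gives the share of 4.0b6 crashes with this signature per day, which is what the comparison above is getting at. A small sketch with the numbers copied from the two shell outputs; the variable names are made up for illustration.

```python
# Dates 20100916 .. 20100930, matching the two listings above.
DAYS = [f"201009{day:02d}" for day in range(16, 31)]

# BaseThreadStart crashes on 4.0b6 per day (first listing).
BASETHREADSTART = [27, 56, 74, 31, 111, 139, 102, 87,
                   152, 234, 217, 222, 310, 300, 479]

# All 4.0b6 crashes per day (second listing).
ALL_CRASHES = [14139, 17252, 19097, 21877, 22254, 23145, 24477, 22291,
               20218, 22321, 23744, 20604, 25035, 24850, 23602]

shares = {
    day: hits / total
    for day, hits, total in zip(DAYS, BASETHREADSTART, ALL_CRASHES)
}

for day, share in shares.items():
    print(f"{day}  {share:6.2%}")
```

The share grows roughly tenfold over the two weeks while total crash volume stays in the same range, which fits the view that the spike is not explained by usage growth alone.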
I just pulled a minidump from one of the crashes on the Sept. 29 mozilla nightly, and the crash address was substantially lower than any of the DLL base addresses in the skidmark.  Are there modules that we're missing?

Also, should we consider removing the skidmark code at some point?
there is an overall increase in the volume of SocketSend (bug 601097?) and SocketConnect (bug 422044 ?) signatures in the last couple of days.  wonder if they are related and just bystanders of corruption that happens here at BaseThreadStart or before.
blocking2.0: ? → -
BaseThreadStart crashes seem to be on the rise now in 4.0 as well. :(
in a sample of 100 crash reports for [@BaseThreadStart ]  there is some evidence of unversioned .dlls and the malware markers that we have seen in a lot of top crashes lately, but the correlation is not as strong.

here is the list of unversioned .dlls that show in that sample of 100 reports and the details are in the attachment.

   9 RadioWMPCore.dll
   8 UnlockerHook.dll
   6 iwucuraqilaquvac.dll
   4 sshnas21.dll
   4 lpxpcom.dll
   2 pr_imon.dll
   2 modemInst.dll
   2 correct.dll
   2 RocketDock.dll
   1 userlib.dll
   1 spAutofill.dll
   1 sp.DLL
   1 rpnpshimswf.dll
   1 rpchromebrowserrecordhelper.dll
   1 mcafeemn.dll
   1 gasretyw0.dll
   1 dadkeyb.dll
   1 UberIcon.dll
   1 UKHook40.dll
   1 RadioWMPCoreGecko19.dll
   1 MonKey.dll
   1 HKMLLoad.dll
   1 FFExternalAlert.dll
   1 CafeHook.dll
   1 BTKeyInd.dll
   1 BRNstFF.dll
I renamed the summary to take into account the crash signatures in the latest crash stats:
* 3.6.16: BaseThreadStart
* 4.0:    BaseThreadStart, @0x0 | BaseThreadStart, @0x0 | @0x10dc2bc1 | BaseThreadStart, @0x0 | @0x10dc2bbd | BaseThreadStart

With combined signatures, it is #132 top crasher in 4.0 over the last 3 days.
Summary: Crash at [@BaseThreadStart ] [@@0x0 | @0x10b42bbd | BaseThreadStart ] Trojan:Win32/Daonol.H [@ @0x0 | @0x10002bbd | BaseThreadStart ] [@ @0x0 | @0x10b42bc5 | BaseThreadStart ] (3.6.3) → Crash on Windows XP [@BaseThreadStart ][@ @0x0 | BaseThreadStart ][@ @0x0 | @0x10dc2bc1 | BaseThreadStart ][@ @0x0 | @0x10dc2bbd | BaseThreadStart ] with Trojan:Win32/Daonol.H and other malwares
The BaseThreadStart crash signature is #6 top crasher in 5.0b3, #27 in 4.0.1, and #17 in 3.6.17.

The spike seems related mainly to the win32sta.dll malware and maybe also to the Unlocker program (http://emptyloop.com/unlocker/):
* 3.6.17:
67% (153/230) vs.   1% (469/40867) win32sta.dll
* 4.0.1:
64% (334/523) vs.   1% (1021/124735) win32sta.dll
7% (35/523) vs.   2% (2017/124735) UnlockerHook.dll
* 5.0:
46% (44/95) vs.   1% (127/19956) win32sta.dll
Crash Signature: [@BaseThreadStart ] [@ @0x0 | BaseThreadStart ] [@ @0x0 | @0x10dc2bc1 | BaseThreadStart ] [@ @0x0 | @0x10dc2bbd | BaseThreadStart ]
Jonas says over IRC that we can remove the skidmark code here.  If nobody objects I'll work up a patch for that next week.
Crash Signature: [@BaseThreadStart ] [@ @0x0 | BaseThreadStart ] [@ @0x0 | @0x10dc2bc1 | BaseThreadStart ] [@ @0x0 | @0x10dc2bbd | BaseThreadStart ] → [@BaseThreadStart ] [@ @0x0 | BaseThreadStart ] [@ @0x0 | @0x10dc2bc1 | BaseThreadStart ] [@ @0x0 | @0x10dc2bbd | BaseThreadStart ]
This exploded on all versions recently, and is #11 on 9.* and #8 on 10.* in yesterday's data, which doesn't look too good. Bug 601587 has exploded in the last days on all versions as well.
Keywords: topcrash
The spike is correlated to JScript:
  BaseThreadStart|EXCEPTION_ACCESS_VIOLATION_READ (799 crashes)
     49% (390/799) vs.   1% (1935/129459) jscript.dll

One comment says: "This crash happens invariably when I start Grab++ through Orbit Download Manager"
The latest release of OrbitDownloadHelper was in June 2011: http://www.orbitdownloader.com/changelog.htm
Is there something to block here?
(In reply to Sheila Mooney from comment #50)
> Is there something to block here?

If we're sure that the current spike is malware (which it feels like, but we don't know for sure yet), then there's nothing to block, I guess.
If this exploded on all versions, I don't think it needs to be tracked for FF10 specifically. There are only 3 ways in which a crasher can affect all versions of Firefox -

1) Changes on the web (Can we pull URLs?)
2) Changes to third party add-ons (Are there any new correlations?)
3) Malware (if neither of the above yields results, and we have evidence of unversioned DLLs, etc.)
The crashes spike periodically. The current spike appears to be over.
This isn't very high volume at the moment. Plus, it's not clear what we can do about it anyway. I am taking it off the top crash list for now since it's down at #76 on 10.0 and #78 on 10.0.1.
Keywords: topcrash
On yesterday's 10.0.2 data, this is back high, #11 in https://crash-stats.mozilla.com/topcrasher/byversion/Firefox/10.0.2/1/browser currently.
Keywords: topcrash
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #55)
> On yesterday's 10.0.2 data, this is back high, #11 in
> https://crash-stats.mozilla.com/topcrasher/byversion/Firefox/10.0.2/1/
> browser currently.

If we're trying to make progress on this bug, I think we need to figure out which of the states in Comment#52 we're running into with the latest spike. Can we confirm that the most recent spike was highly correlated with a specific DLL (maybe malware)?
(In reply to Alex Keybl [:akeybl] from comment #56)
> Can we confirm that the most recent spike was highly correlated with a
> specific DLL (maybe malware)?

When I checked, I couldn't. The spike is again at the same time as bug 601587 is spiking, and across versions, which feels a lot like malware, but I have no real evidence for that.
If it were caused by one or two DLLs, it would appear clearly in the correlation reports, but it's caused by malware with different DLL names, so it flies under the radar.

Nevertheless, I see some correlations:
* Feb 28:
+ 10.0.2:
  BaseThreadStart|EXCEPTION_ACCESS_VIOLATION_READ (151 crashes)
     13% (19/151) vs.   0% (49/67404) AcroFF007.dll (variant of Win32/Spy.Banker.XBB trojan)
      8% (12/151) vs.   0% (48/67404) AcroFF005.dll (variant of Win32/Spy.Banker.XBB trojan)
     11% (17/151) vs.   5% (3333/67404) datamngr.dll (SearchQu toolbar and other Bandoo's toolbar)
          8% (12/151) vs.   4% (2924/67404) 1.0.0.1
          1% (1/151) vs.   0% (64/67404) 3.0.0.46577
          3% (4/151) vs.   0% (155/67404) 3.0.0.53061
      5% (8/151) vs.   0% (22/67404) AcroFF004.dll (variant of Win32/Spy.Banker.XBB trojan)

  BaseThreadStart|EXCEPTION_ACCESS_VIOLATION_WRITE (35 crashes)
     11% (4/35) vs.   2% (1415/67404) pshook.dll (Punto Switcher or Power Strip)
      6% (2/35) vs.   0% (2/67404) ksnnela.dll (2.0.1.3212)
      6% (2/35) vs.   0% (2/67404) ghjnhzj.dll

+ 11.0:
  BaseThreadStart|EXCEPTION_ACCESS_VIOLATION_READ (39 crashes)
    23% (9/39) vs.   6% (2053/33254) datamngr.dll (SearchQu toolbar and other Bandoo's toolbar)
         23% (9/39) vs.   5% (1772/33254) 1.0.0.1
    21% (8/39) vs.   5% (1660/33254) mgAdaptersProxy.dll (SweetIM - emoticon software)
         10% (4/39) vs.   0% (41/33254) 3.5.0.8
         10% (4/39) vs.   1% (215/33254) 3.6.0.3
    10% (4/39) vs.   0% (126/33254) StylerHelper.dll (1.3.1.1) (Styler)
      5% (2/39) vs.   0% (2/33254) OKJ.001 (Ardamax keylogger)

* Feb 29:
+ 10.0.2:
  BaseThreadStart|EXCEPTION_ACCESS_VIOLATION_READ (663 crashes)
      8% (50/663) vs.   2% (1064/69282) Iminent.WinCore.dll (Iminent - emoticon software)
          7% (48/663) vs.   1% (691/69282) 3.47.0.0

  BaseThreadStart|EXCEPTION_ACCESS_VIOLATION_EXEC (277 crashes)
      8% (21/277) vs.   2% (1064/69282) Iminent.WinCore.dll (Iminent - emoticon software)
          5% (15/277) vs.   1% (691/69282) 3.47.0.0
          2% (6/277) vs.   0% (303/69282) 4.52.52.0

+ 11.0:
  BaseThreadStart|EXCEPTION_ACCESS_VIOLATION_READ (447 crashes)
     19% (85/447) vs.   1% (340/34358) SC2Hook.dll (2.0.0.9) (SuperCopier2)
     14% (64/447) vs.   1% (264/34358) jscript.dll (5.6.0.8837) (MS JScript)
      9% (40/447) vs.   0% (41/34358) 38aa76e-5689.tmp
      6% (27/447) vs.   0% (27/34358) 1415e6a0-5689.tmp
      5% (24/447) vs.   0% (30/34358) SpSubLSP.dll (2.0.0.45) (SpamSubtract)
      5% (24/447) vs.   0% (62/34358) IadHide4.dll (6.2.3.66) (BackWeb)
      5% (24/447) vs.   0% (103/34358) nview.dll (6.14.1.4303) (NVIDIA Windows Manager)

  BaseThreadStart|EXCEPTION_ACCESS_VIOLATION_EXEC (181 crashes)
     14% (26/181) vs.   0% (26/34358) GDX32.dll
      9% (16/181) vs.   0% (21/34358) IESetting.dll (8.0.6001.17184) (Internet Explorer)
      9% (16/181) vs.   0% (132/34358) StylerHelper.dll (1.3.1.1) (Styler)
      9% (16/181) vs.   1% (340/34358) SC2Hook.dll (2.0.0.9) (SuperCopier2)
      9% (16/181) vs.   2% (568/34358) mfc42.dll (6.2.4131.0) (MFC support)
      6% (11/181) vs.   0% (47/34358) ophookSE2.dll (12.0.0.1) (OmniPage)
Depends on: 665775
Scoobidiver, thanks for the analysis, that's great. So it looks like there's definitely some identifiable malware involved, though not all of the correlations you point out sound like malware in the end.
Depends on: 732003
So this crash is at #63 on Fx 12, #149 on 13b3, #82 on 14a2 and #47 on trunk. Although still around, it doesn't seem nearly as high as it used to be. If it's malware it could be fading out. I am removing the top crash keyword.
Keywords: topcrash
With combined signatures, it's #26 top browser crasher in 14.0.1, #4 in 15.0b5, #35 in 16.0a2.

It's slightly correlated to MindSpark Toolbar in 15.0:
      5% (114/2169) vs.   0% (115/62570) u4hkstub.dll (1.0.0.1)
      5% (114/2169) vs.   0% (129/62570) 60hkstub.dll (1.0.0.1)
Keywords: topcrash
Whiteboard: [crashkill] need investigation → [crashkill][need investigation][startupcrash]
Right now (yesterday's data), this signature is spiking across a number of Firefox versions.
It's #75 top crasher in 15.0.1 so no longer a top crasher.
Keywords: topcrash
Depends on: 830765
I can't reproduce this issue on FF 19b3, but it looks like this problem is still happening, as Socorro stats show for last week.

https://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A3.5.3&query_search=signature&query_type=exact&reason_type=contains&range_value=1&range_unit=weeks&hang_type=any&process_type=any&do_query=1&signature=BaseThreadStart

About 80 crashes in the last week on FF 19b3.
I am surprised there are still users of Firefox 3.5.3.

The right links are:
https://crash-stats.mozilla.com/report/list?signature=BaseThreadStart
https://crash-stats.mozilla.com/report/list?signature=%400x0%20|%20BaseThreadStart

Here are some correlations per extension in 18.0.1:
  BaseThreadStart|EXCEPTION_ACCESS_VIOLATION_READ (40 crashes)
     10% (4/40) vs.   0% (73/176971) {184AA5E6-741D-464a-820E-94B3ABC2F3B4} (TrojanSpy:Win32/Bafi.A) (see http://www.microsoft.com/security/portal/threat/encyclopedia/entry.aspx?Name=TrojanSpy%3AWin32%2FBafi.A)
Crash Signature: [@BaseThreadStart ] [@ @0x0 | BaseThreadStart ] [@ @0x0 | @0x10dc2bc1 | BaseThreadStart ] [@ @0x0 | @0x10dc2bbd | BaseThreadStart ] → [@ BaseThreadStart] [@ @0x0 | BaseThreadStart]
Whiteboard: [crashkill][need investigation][startupcrash] → [crashkill][need investigation]
Version: 3.5 Branch → Trunk
I think the best thing we can do there is to find the piece of malware involved and submit it to AV vendors, just like we did with other malware files, e.g. in bug 801394 comment#36.
(In reply to Ioana Budnar [QA] from comment #66)
> According to the above stats, this issue was fixed (or at least partly
> fixed) on Firefox 20 and above. There is exactly one crash on Fx20a2.
It also happens on 20.0 (6 crashes) and 21.0 (6 crashes).
Based on Aurora and Nightly populations, you can even forecast those results:
* 549*10*134,180/97,126,183=7.6 for Aurora
* 549*10*85,016/97,126,183=4.8 for Nightly
So it's pretty stable.
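
The forecast arithmetic above can be reproduced with a short sketch. The comment does not label its numbers, so the parameter names here are assumptions: the code treats 549 and 10 as an opaque base count and scaling factor, and the other figures as a channel population divided by a reference population of 97,126,183.

```python
def forecast(base_count, factor, channel_population, reference_population):
    """Scale a crash count by the ratio of two populations.

    Parameter names are hypothetical; only the arithmetic comes from the
    comment above.
    """
    return base_count * factor * channel_population / reference_population


aurora = forecast(549, 10, 134_180, 97_126_183)
nightly = forecast(549, 10, 85_016, 97_126_183)
print(f"Aurora: {aurora:.1f}, Nightly: {nightly:.1f}")
# prints "Aurora: 7.6, Nightly: 4.8"
```

Both values are close to the 6 crashes reported for 20.0 and 21.0, which is the sense in which the rate looks stable.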
Removing QAwanted since QA couldn't reproduce this issue.
Keywords: qawanted
Crash volume for signature 'BaseThreadStart':
 - nightly (version 50): 0 crashes from 2016-06-06.
 - aurora  (version 49): 0 crashes from 2016-06-07.
 - beta    (version 48): 63 crashes from 2016-06-06.
 - release (version 47): 524 crashes from 2016-05-31.
 - esr     (version 45): 6 crashes from 2016-04-07.

Crash volume on the last weeks:
             Week N-1   Week N-2   Week N-3   Week N-4   Week N-5   Week N-6   Week N-7
 - nightly          0          0          0          0          0          0          0
 - aurora           0          0          0          0          0          0          0
 - beta            14         10         18          6          8          5          0
 - release         78         65         67         74         89         89         28
 - esr              0          1          1          0          1          1          0

Affected platform: Windows
Crash volume for signature 'BaseThreadStart':
 - nightly(version 50):0 crashes from 2016-06-06.
 - aurora (version 49):1 crash from 2016-06-07.
 - beta   (version 48):67 crashes from 2016-06-06.
 - release(version 47):598 crashes from 2016-05-31.
 - esr    (version 45):6 crashes from 2016-04-07.

Crash volume on the last weeks:
            W. N-1  W. N-2  W. N-3  W. N-4  W. N-5  W. N-6  W. N-7
 - nightly       0       0       0       0       0       0       0
 - aurora        0       0       1       0       0       0       0
 - beta          5      14      11      18       6       8       5
 - release      84      78      77      67      74      89      89
 - esr           2       0       1       1       0       1       1

Affected platform: Windows

(In reply to Robert Kaiser from comment #67)
> I think the best thing we can do there is to find the piece of malware
> involved and submit it to AV vendors, just like we did with other malware
> files, e.g. in bug 801394 comment#36.

The only crashes with the signatures are XP which is no longer supported.
Close?

Flags: needinfo?(dolske)

(In reply to Wayne Mery (:wsmwk) from comment #73)
> The only crashes with the signatures are XP which is no longer supported.
> Close?

As the summary says, this always was an XP bug. May I - as just a bystander nowadays - say "let's kill it with fire"? ;-)

I think we can close this bug now, all of the crash reports are FF 51/52 & older.

Status: NEW → RESOLVED
Closed: 5 years ago
Flags: needinfo?(dolske)
Resolution: --- → WORKSFORME