Closed
Bug 598007
Opened 15 years ago
Closed 10 years ago
Start-up crash under Windows XP [@ nsDiskCacheMap::Open(nsILocalFile*) ]
Categories
(Core :: Networking: Cache, defect)
RESOLVED
WORKSFORME
People
(Reporter: scoobidiver, Assigned: jduell.mcbugs)
References
(Blocks 1 open bug)
Details
(Keywords: crash, user-doc-needed)
Crash Data
Build : Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b7pre) Gecko/20100919
Firefox/4.0b7pre
This is a residual crash signature that exists in trunk builds.
It is #57 top crasher for 4.0b7pre for the last two weeks.
Signature nsDiskCacheMap::Open(nsILocalFile*)
UUID c6ceb095-65f9-447c-8768-85eed2100920
Time 2010-09-20 06:20:51.277412
Uptime 1
Last Crash 2 seconds before submission
Install Age 40712 seconds (11.3 hours) since version was first installed.
Product Firefox
Version 4.0b7pre
Build ID 20100919042023
Branch 2.0
OS Windows NT
OS Version 5.1.2600 Service Pack 3
CPU x86
CPU Info GenuineIntel family 6 model 23 stepping 6
Crash Reason EXCEPTION_ACCESS_VIOLATION_READ
Crash Address 0xffffffff80000000
App Notes AdapterVendorID: 10de, AdapterDeviceID: 0622
Crashing Thread
Frame Module Signature Source
0 @0x80000000
1 xul.dll nsDiskCacheMap::Open netwerk/cache/nsDiskCacheMap.cpp:155
2 xul.dll nsDiskCacheDevice::OpenDiskCache
3 xul.dll nsDiskCacheDevice::Init netwerk/cache/nsDiskCacheDevice.cpp:384
4 xul.dll nsCacheService::CreateDiskDevice netwerk/cache/nsCacheService.cpp:1305
5 xul.dll nsCacheService::SearchCacheDevices netwerk/cache/nsCacheService.cpp:1718
6 xul.dll nsCacheService::ActivateEntry netwerk/cache/nsCacheService.cpp:1627
7 xul.dll nsCacheService::ProcessRequest netwerk/cache/nsCacheService.cpp:1490
8 xul.dll nsProcessRequestEvent::Run netwerk/cache/nsCacheService.cpp:913
9 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:547
10 xul.dll nsThread::ThreadFunc xpcom/threads/nsThread.cpp:263
11 nspr4.dll _PR_NativeRunThread nsprpub/pr/src/threads/combined/pruthr.c:426
12 nspr4.dll pr_root nsprpub/pr/src/md/windows/w95thred.c:122
13 mozcrt19.dll _callthreadstartex obj-firefox/memory/jemalloc/crtsrc/threadex.c:348
14 mozcrt19.dll _threadstartex obj-firefox/memory/jemalloc/crtsrc/threadex.c:326
15 kernel32.dll BaseThreadStart
Updated•15 years ago
Assignee: nobody → honzab.moz
Keywords: regression
Comment 1•15 years ago
Looks like something has been rotten here for a long time, apparently a race condition; there are similar, much older reports from this area of code:
http://crash-stats.mozilla.com/report/index/b0cc7822-a4cb-429e-b758-cdea22100906
http://crash-stats.mozilla.com/report/index/9e270a67-27f6-4b4c-b98b-9b16f2100918
http://crash-stats.mozilla.com/report/index/97cc456a-1361-48b9-bd8f-bb9662100830
http://crash-stats.mozilla.com/report/index/07761768-0cbf-492d-8b3e-7d0242100907
Something just woke this up so that it happens more often. Will look for the regression range.
| Assignee | ||
Comment 2•15 years ago
I wouldn't be surprised if this has something to do with the smart_size changes in bug 559942, though I don't exactly see how.
If we're lucky this *might* be fixed by bug 596476 or 595413. Alas, the former won't make it into Beta7.
Comment 3•15 years ago
#2 top crash in 4.0b7pre early data from yesterday. We should figure out how to mitigate this. Is bug 596476 still on track? Sounds like bug 595413 is now fixed as of Sept 15. Need to check deeper to see if that fix has helped, but it looks like it may not have.
blocking2.0: --- → ?
Comment 4•15 years ago
Actually, it looks like this got worse on builds from Sept 17,
maybe after trunk users got the patches in
https://bugzilla.mozilla.org/show_bug.cgi?id=596476#c7 or
https://bugzilla.mozilla.org/show_bug.cgi?id=595413#c8
date tl crashes at, count build, count build, ...
nsDiskCacheMap::Open.nsILocalFile..
20100910 2 3.62010011514, 2 ,,
20100911 ,,
20100912 2 3.0b12007110904, 2 ,,
20100913 16 ,, 13 3.0b12007110904, 2 3.0.52008120122, 1 3.0b22007121120,
20100914 14 ,, 8 3.0b22007121120, 5 3.0b12007110904, 1 3.6.92010082415,
20100915 2 ,, 1 3.6.92010082415, 1 3.62010011514,
20100916 2 3.6.92010082415, 2 ,,
20100917 12 ,, 10 4.0b7pre2010091704, 1 4.0b62010091408, 1 3.0b12007110904,
20100918 8 ,, 7 4.0b7pre2010091704, 1 3.0b12007110904,
20100919 46 ,, 42 4.0b7pre2010091704, 3 4.0b62010091408, 1 3.6.102010091412,
20100920 79 , 67 4.0b7pre2010091704, 7 4.0b7pre2010091904, 4 3.0b12007110904,
20100921 163 , 67 4.0b7pre2010091704, 60 4.0b7pre2010092004, 31 4.0b7pr20100919
| Assignee | ||
Comment 5•15 years ago
Notes:
1) every instance of this crash is happening only on "Windows NT 5.1.2600 Service Pack 3". I'm guessing this reduces the importance of this bug, though we should obviously fix it.
2) It seems to be causing fewer crashes in the last few days. But that could be an artifact of jitter in the number of NT 5.1 boxes running beta7--I don't know how many such boxes there are out there, so we may have high variance.
3) Honza is correct in comment 1 that this bug has been triggered for a while. The main change seems to be that it used to get (infrequently) hit via a synchronous codepath from AsyncOpen->Connect->OpenCacheEntry->ProcessRequest, whereas now it's getting (more frequently) hit via the async cache read path created in bug 513008 (eliminate sync reads from cache). So if we get desperate, that's the bug to back out (I'd really hate to back that out, though)
Still looking into the cause of this by poring over the stack trace. One thing I don't understand is the segfault happening at nsDiskCacheMap.cpp:155: that's a function call, and shouldn't segfault. How accurate are our crash stack traces? (I notice there's a frame 0 with just an addr listed.) I assume we're crashing in OpenBlockFiles() somewhere.
Comment 6•15 years ago
(In reply to comment #5)
> So if we get
> desperate, that's the bug to back out (I'd really hate to back that out,
> though)
I would like to avoid that too.
> How accurate are our crash stack traces
On Windows, the address recorded in a stack frame is the return address, so it points just after the line being executed. You therefore have to find the line that was actually executing "manually" by going upward in the source code.
| Assignee | ||
Comment 7•15 years ago
> On windows you get the cursor on a stack put after the line it is being
> executed. So, you have to find a line executed "manually" by going upward in
> the source code.
Sorry, having trouble understanding. So the stack trace is
0 @0x80000000
1 xul.dll nsDiskCacheMap::Open netwerk/cache/nsDiskCacheMap.cpp:155
2 xul.dll nsDiskCacheDevice::OpenDiskCache
And line 155 is a call to OpenBlockFiles(). So are you saying that means the crash happened somewhere in OpenDiskCache, or in nsDiskCacheMap::Open somewhere above line 155?
Status: NEW → ASSIGNED
Comment 8•15 years ago
Is this breakpad, or MSVC? breakpad often skips the next-to-top frame when the top frame is a numeric address.
Comment 9•15 years ago
This should probably block b7. It's now the #1 topcrash in b7pre and a regression from b6. Can someone mark blocking status so we make sure it's on the release radar?
| Assignee | ||
Comment 11•15 years ago
Oddly enough, sometimes the error is EXCEPTION_ACCESS_VIOLATION_READ, and sometimes EXCEPTION_ACCESS_VIOLATION_EXEC. Has anyone ever seen that before? Given that all errors are on x86 systems, which don't even support separate read/exec page permissions, is that a red herring?
FWIW this looks like the same as bug 595957 (which goes back as far as 3.0b1): it also seems to be affecting only Windows NT machines in Russia, and has essentially the same stack trace. The only difference I see is that async cache reads weren't landed yet in 3.6.x, and that some of the errors are EXCEPTION_ACCESS_VIOLATION_WRITE, which we don't seem to be getting any more with b7pre.
Very weird. I'd love to hear ideas on how to proceed (other than staring at code, which I'm still doing). Do we have a Windows NT box somewhere?
Summary: start-up crash under Windows XP [@ nsDiskCacheMap::Open(nsILocalFile*) ] → start-up crash under Windows NT [@ nsDiskCacheMap::Open(nsILocalFile*) ]
| Assignee | ||
Comment 12•15 years ago
Wild guess #1: This is a problem with appending ASCII to a Cyrillic filename, and/or passing a Cyrillic filename to an NSPR I/O function. I don't understand charsets (and maybe XPCOM) well enough to know.
nsresult
nsDiskCacheMap::GetBlockFileForIndex(PRUint32 index, nsILocalFile ** result)
{
    if (!mCacheDirectory) return NS_ERROR_NOT_AVAILABLE;
    nsCOMPtr<nsIFile> file;
    nsresult rv = mCacheDirectory->Clone(getter_AddRefs(file));
    if (NS_FAILED(rv)) return rv;
    char name[32];
    ::sprintf(name, "_CACHE_%03d_", index + 1);
    rv = file->AppendNative(nsDependentCString(name));
    if (NS_FAILED(rv)) return rv;
    nsCOMPtr<nsILocalFile> localFile = do_QueryInterface(file, &rv);
    NS_IF_ADDREF(*result = localFile);
    return rv;
}
The IDL for AppendNative says that the argument must be in the native charset of the filesystem (in our error case, Russian Cyrillic). If for some reason converting ASCII "_CACHE_001_" to wchar and appending it (AppendNative does the conversion to wchar) returns NS_OK, but then the QI back to nsILocalFile fails, we'll return NS_OK without having touched 'result', which is a stack variable and thus garbage, which could then segfault when OpenBlockFiles calls Open() with it.
But ascii usually converts to wchar fine, right? And I don't see any reason why the QI back to nsILocalFile could fail: mCacheDirectory is an nsCOMPtr<nsILocalFile>, so we're just going from that to nsIFile and back. There's nothing fancy about nsLocalFileWin.cpp's implementation of QI:
    NS_IMPL_THREADSAFE_ISUPPORTS4(nsLocalFile, nsILocalFile,
                                  nsIFile, nsILocalFileWin, nsIHashable)
Wild guess #2: We could get past GetBlockFileForIndex OK, and die in nsDiskCacheBlockFile::Open(), which passes the file to OpenNSPRFileDesc(), which calls the Windows SDK functions GetFileInfo() and CreateFileW(). MSDN doesn't mention GetFileInfo() supporting unicode. Perhaps some of our Russian users have home directories with characters in them that trigger some sort of crash (only on Windows NT)?
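The out-parameter concern in wild guess #1 can be sketched in isolation. This is a minimal standalone illustration, not the Mozilla code: `Status`, `File`, and `GetFileForIndex` are hypothetical names invented for the example. It only shows that clearing the out-parameter up front means a caller never sees stack garbage, whichever return path is taken.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-ins for nsresult and nsILocalFile; not Mozilla types.
enum class Status { Ok, NotAvailable };

struct File {
    std::string name;
};

// Builds the block-file name for `index` (the on-disk number is index + 1,
// as in the "_CACHE_%03d_" pattern quoted above) and hands back a new File.
// The out-parameter is cleared first, so every early return leaves it null
// instead of whatever happened to be on the caller's stack.
Status GetFileForIndex(bool haveDirectory, unsigned index, File** result) {
    *result = nullptr;
    if (!haveDirectory) return Status::NotAvailable;

    char name[32];
    std::snprintf(name, sizeof(name), "_CACHE_%03u_", index + 1);
    *result = new File{name};  // caller takes ownership in this sketch
    return Status::Ok;
}
```

A caller that checks the status and also treats a null `result` as failure is then safe even if the two ever disagree.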
Comment 13•15 years ago
AppendNative should be fine here, it's always ASCII-compatible. The obvious way to check is to create a profile in a Cyrillic-named directory and run against it.
GetFileInfo is not a win32 API, it's http://mxr.mozilla.org/mozilla-central/source/xpcom/io/nsLocalFileWin.cpp#473 and it is unicode-safe.
This is WinXP, so if you don't have a VM of it, we can arrange for one, or you can get somebody in the QA lab to run some experiments for you.
Updated•15 years ago
Summary: start-up crash under Windows NT [@ nsDiskCacheMap::Open(nsILocalFile*) ] → start-up crash under Windows XP [@ nsDiskCacheMap::Open(nsILocalFile*) ]
Comment 14•15 years ago
sample of OS versions from yesterday
87 Windows NT 5.1. nsDiskCacheMap::Open(nsILocalFile*)
83 0.954023 Windows NT5.1.2600 Service Pack 3
4 0.045977 Windows NT5.1.2600 Service Pack 2
Comment 15•15 years ago
Can we get an ETA for a patch here? Or, will this be fixed by bug 596476? Also, are we still sure this should block beta 7?
| Assignee | ||
Comment 16•15 years ago
I no longer think bug 596476 is relevant--this is much older than smart sizing.
I can't give an ETA, because I still have no clue what's going on. I've asked for help from the Mozilla Russia folks, and am trying to repro on an XP box I've set up with Cyrillic.
Re: blocking beta 7: this only appears to affect Russian Windows XP boxes. It also seems to have tapered off in frequency from 300 crashes/day on 9/17 to 20-30 per day in the last few days.
http://tinyurl.com/28hrqvz
Alas, I have no idea why the decline is happening, so it could go back up.
I wouldn't personally keep the train at the station for this, but I'm not a release driver and don't know how much we care about the Russian audience for the beta.
Comment 17•15 years ago
Leaving this as a blocker so we keep investigating (though it's not clear to me that it actually needs to block, or that it's even something we can fix), but this should not block beta7, not given the decline in crashes and the fact that this has been around seemingly forever.
blocking2.0: beta7+ → betaN+
Comment 18•15 years ago
It's been around forever in low volume, but the crashes happening now are almost exclusively 4.0b7pre. Also, we are under some kind of spike related to crashes from Russia or Cyrillic problems noted in bug 599126 and bug 597260, but those seem unconnected in time and in the releases they apply to.
Here are the latest stats on which builds were hit by this in the last few days.
date tl crashes at, count build, count build, ...
nsDiskCacheMap::Open.nsILocalFile..
20100920 79 ,, 67 4.0b7pre2010091704, 7 4.0b7pre2010091904, 4 3.0b12007110904,
1 4.0b7pre2010091804,
20100921 163 ,, 67 4.0b7pre2010091704, 60 4.0b7pre2010092004,
31 4.0b7pre2010091904, 3 3.0b12007110904,
1 3.6.92010082415, 1 3.6.102010091412,
20100922 136 ,, 66 4.0b7pre2010091704, 27 4.0b7pre2010092104,
19 3.0b12007110904, 9 4.0b7pre2010092204,
6 4.0b7pre2010091904, 4 3.0b22007121120,
3 4.0b7pre2010091804, 1 4.0b7pre2010092004,
1 3.0.52008120122,
20100923 87 ,, 32 4.0b7pre2010091704, 27 3.0b12007110904,
8 4.0b7pre2010092204, 8 4.0b7pre2010092004,
4 4.0b7pre2010091904, 3 4.0b7pre2010092104,
2 4.0b7pre2010091804, 1 4.0b7pre2010092304,
1 3.6.62010062523, 1 3.6.102010091412,
Comment 19•15 years ago
Hmm.. I don't see that creation/access to nsCacheService::mDiskDevice would be synchronized... There is some nsCacheService::mLock and the ref counter is thread safe, but what happens when we enter the code on two threads concurrently?
Comment 20•15 years ago
Exactly: executing nsCacheService::SearchCacheDevices.
Comment 21•15 years ago
There are a lot of comments in German in the latest crashes.
I tried creating an account with some Czech letters in the name, but had no luck reproducing.
Comment 22•15 years ago
(In reply to comment #19)
> Hmm.. I don't see that creation/access to nsCacheService::mDiskDevice would be
> synchronized... There is some nsCacheService::mLock and the ref counter is
> thread safe, but what happens when we enter the code on two threads
> concurrently?
Taking that back... I just checked that all code paths leading to access to mDiskDevice are protected by nsCacheService::mLock.
(In reply to comment #21)
> I was trying to create an account with some Czech letters in the name, no luck
> to reproduce.
And the system was Windows XP SP3 [5.1.2600]
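The property checked above (every path that creates or touches the device holds the service lock) can be sketched as follows. This is a minimal standalone illustration using `std::mutex`, not the nsCacheService implementation; `CacheService`, `DiskDevice`, `EnsureDiskDevice`, and `CreateCount` are hypothetical names for the example.

```cpp
#include <memory>
#include <mutex>

// Hypothetical stand-in for the disk cache device; not the Mozilla class.
struct DiskDevice {
    bool initialized = false;
    void Init() { initialized = true; }
};

class CacheService {
public:
    // Returns the device, creating it on first use. The whole
    // check-then-create sequence runs under mLock, so two threads entering
    // concurrently cannot both construct and initialize a device.
    DiskDevice* EnsureDiskDevice() {
        std::lock_guard<std::mutex> guard(mLock);
        if (!mDiskDevice) {
            std::unique_ptr<DiskDevice> device(new DiskDevice());
            device->Init();
            mDiskDevice = std::move(device);
            ++mCreateCount;
        }
        return mDiskDevice.get();
    }

    // Number of times a device was constructed; stays at 1 if locking holds.
    int CreateCount() {
        std::lock_guard<std::mutex> guard(mLock);
        return mCreateCount;
    }

private:
    std::mutex mLock;
    std::unique_ptr<DiskDevice> mDiskDevice;
    int mCreateCount = 0;
};
```

If any caller read or assigned the device pointer without taking the lock, the check and the creation could interleave across threads, which is the scenario comment 19 was worried about.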
Comment 23•15 years ago
I installed the multilingual user interface package for Russian, created a user account with Cyrillic characters, and created a profile in a folder with Cyrillic characters. I've been trying to reproduce through general browsing, but no luck so far. None of the comments I saw say much in the way of reproducing the problem.
Comment 24•15 years ago
Just a thought, referring to bug #595957, comment #4: Is there any way we could get hold of the fx-binaries from a user who has experienced this and check if there is a trojan involved? (Or alternatively: Is there a way we can guarantee that no trojan is mucking things up in this particular case?)
Comment 25•15 years ago
It's possible that malware is involved, but malware is rarely behind the #1 top crash, and even more rarely behind a #1 top crash that affects trunk users.
Another area to look at: make sure we've looked at all the changes on trunk just prior to Sept 17 that could have affected cache operations, given that this ramped up exclusively on 4.0b7pre builds.
Honza started that in comment 1 but it's not clear that anything conclusive was found.
Summary: start-up crash under Windows XP [@ nsDiskCacheMap::Open(nsILocalFile*) ] → spike in 4.0b7pre start-up crash under Windows XP [@ nsDiskCacheMap::Open(nsILocalFile*) ]
Comment 26•15 years ago
I don't see any mention of
b47978b94fc9
2010-09-16 20:21 -0700 Bjarne Herland - Bug 596808 - nsDiskCacheDevice::Init() called twice resulting in no disk cache available r=jduell, a=betaN
which landed shortly before this started appearing. I wonder if it might be worth backing that out for b7 or for a few days on trunk to see if it makes the volume drop back down. What would be the trade-off there?
Comment 27•15 years ago
If we're thinking about investigating and trying the backout of bug 596808, we should flip the blocking "betaN+" flag to blocking b7+ so it gets on the radar to hold the release.
Comment 28•15 years ago
Let's try the backout and see what it does to the stats.
blocking2.0: betaN+ → beta7+
Whiteboard: [trying a backout]
Comment 29•15 years ago
You might have found the issue although I don't see the relevance of Cyrillic profiles...
The patch for bug #596808 was supposed to initialize the disk-device earlier than it used to. I believe the issue here is that this actually fails (because of the check for existence of the disk-device object in nsCacheService::OnProfileChanged()!) and that this has consequences for later requests which actually create and initialize the disk-device. The reason the patch resolves bug #596808 is simply that it avoids initializing the disk-device twice (it fails).
IMO, the solution is to ensure the disk-device is created in nsCacheService::OnProfileChanged(). I can come up with a patch for this later, or Honza or Michal could do it.
Yeah, there was a huge spike in this crash on the 17th, although there were a few on the 14th:
http://crash-stats.mozilla.com/report/list?range_value=4&range_unit=weeks&signature=nsDiskCacheMap%3A%3AOpen%28nsILocalFile*%29&branch=2.0&product=Firefox
Since it seems like a startup crash, it probably does seem important to fix for beta7.
For what it's worth, there's also a second cache change in the one-day window when this started:
http://hg.mozilla.org/mozilla-central/rev/26e2971eeec9
Comment 32•15 years ago
Could someone offer an explanation why the number of these crashes drops dramatically in nightlies *after* the 17th (see also comment #16) ?
Also observe that a crash-profile described with nsDiskCacheDevice::OpenDiskCache() is on the top-crasher list of 3.6.9, mainly on WinNT 5.1 SP3 (with lots of Cyrillic fonts in the comments). The stacks from these crashes look very similar to the stacks for this issue.
I'm not so convinced that the patch for bug #596808 is the culprit anymore. IMO it is likely that the earlier initialization performed in this patch exposes something lurking in other parts of the code, and I believe we should try to track down and fix the real issue. It might be worth backing it out to see if it makes a difference in the stats but there are not many crashes with this signature anymore, so I'm not convinced we will see anything.
Comment 33•15 years ago
Is it possible the decline came because people were crashing on startup so they stopped using the browser? Seems like a reasonable reaction to me.
| Reporter | ||
Comment 34•15 years ago
> Is it possible the decline came because people were crashing on startup so
> they stopped using the browser? Seems like a reasonable reaction to me.
According to crash stats, the number of users increases :
2010-09-27 1,824 40,509 100% 4.5%
2010-09-26 1,876 32,320 100% 5.8%
2010-09-25 1,867 30,232 100% 6.18%
2010-09-24 1,864 33,378 100% 5.58%
2010-09-23 2,081 32,737 100% 6.36%
2010-09-22 2,431 30,958 100% 7.85%
2010-09-21 2,571 28,803 100% 8.93%
2010-09-20 2,040 25,458 100% 8.01%
2010-09-19 1,653 20,031 100% 8.25%
2010-09-18 1,714 18,371 100% 9.33%
2010-09-17 2,519 20,792 100% 12.12%
2010-09-16 738 18,556 100% 3.98%
2010-09-15 22 11,565 100% 0.19%
2010-09-14 1,081 2,601 100% 41.56%
Comment 35•15 years ago
> Could someone offer an explanation why the number of these crashes drops
> dramatically in nightlies *after* the 17th (see also comment #16) ?
That's an interesting point, but I'm not sure we can say the crashes have "dropped" without understanding how fast people might be rolling forward. The core of our nightly testers move forward pretty routinely and aggressively, but we have had several tech press articles with "feature X lands on mozilla nightlies" lately. One of these articles might have skewed the pool of users on builds from the 17th, or changed the nightly tester composition, and maybe more people got stuck on Sept 17 or just gave up. Here are updated stats.
Crashes are showing up on 0924 and 0925 builds, but it's true they are still 1/2 the rate of 0917.
20100916 2 3.6.92010082415 2 ,
20100917 12 10 4.0b7pre2010091704,
1 4.0b62010091408, 1 3.0b12007110904,
20100918 8 7 4.0b7pre2010091704,
1 3.0b12007110904,
20100919 46 42 4.0b7pre2010091704,
3 4.0b62010091408, 1 3.6.102010091412,
20100920 79 67 4.0b7pre2010091704,
7 4.0b7pre2010091904, 4 3.0b12007110904,
1 4.0b7pre2010091804,
20100921 163 67 4.0b7pre2010091704,
60 4.0b7pre2010092004, 31 4.0b7pre2010091904,
3 3.0b12007110904, 1 3.6.92010082415,
1 3.6.102010091412,
20100922 136 66 4.0b7pre2010091704,
27 4.0b7pre2010092104, 19 3.0b12007110904,
9 4.0b7pre2010092204, 6 4.0b7pre2010091904,
4 3.0b22007121120, 3 4.0b7pre2010091804,
1 4.0b7pre2010092004, 1 3.0.52008120122,
20100923 87 32 4.0b7pre2010091704,
27 3.0b12007110904, 8 4.0b7pre2010092204,
8 4.0b7pre2010092004, 4 4.0b7pre2010091904,
3 4.0b7pre2010092104, 2 4.0b7pre2010091804,
1 4.0b7pre2010092304, 1 3.6.62010062523,
1 3.6.102010091412,
20100924 66 34 4.0b7pre2010091704,
15 4.0b7pre2010092404, 6 3.0b12007110904,
5 4.0b7pre2010092004, 3 4.0b7pre2010091904,
2 4.0b7pre2010091804, 1 3.6.102010091412,
20100925 87 47 4.0b7pre2010091704,
20 4.0b7pre2010092404, 6 3.0b12007110904,
5 4.0b7pre2010092304, 4 4.0b7pre2010092312,
3 3.6.102010091412, 2 3.0.52008120122,
20100926 85 40 4.0b7pre2010091704,
25 4.0b7pre2010092504, 10 4.0b7pre2010092004,
5 4.0b7pre2010092404, 4 3.0b12007110904,
1 3.6.102010091412,
20100927 89 51 4.0b7pre2010091704,
12 4.0b7pre2010092204, 9 4.0b7pre2010092604,
9 4.0b7pre2010092312, 4 3.0b12007110904,
3 4.0b7pre2010092404, 1 3.6.102010091412,
(In reply to comment #35)
> that's an interesting point, but I'm not sure we can to say the crashes have
> "dropped", without understand how fast people might be rolling forward. The
> core of our nightly testers move forward pretty routinely and agressively, but
> we have had several tech press articles with "feature X lands on mozilla
> nightlies" lately. One of these articles might have skewed the pool of users
> on builds from the 17, or changed the nightly tester composition, and maybe
> more people got stuck on sept 17 or just gave up. here are updated stats.
It seems likely that people got stuck on the Sept. 17 build, since this seems to be a startup crash.
| Assignee | ||
Comment 37•15 years ago
Status:
Spent much of the day staring at minidumps w/dbaron and sicking. Didn't get much traction.
Just checked in a version bump of the HTTP cache:
http://hg.mozilla.org/mozilla-central/rev/a9d1ad0bc386
This will cause nightly users to have their cache re-created. We wanted to do this anyway so that nightly users get the fallocate optimization from bug 592520. But also, since landing 592520 coincided with the crash spike for beta7 (comment 31), we may wind up seeing either a spike or a dropoff in the crash count. Seemed worth trying.
I'm also planning to land the patches for bug 596476 tomorrow--they clean up the smart size logic, and might help reduce the crash rate if we're lucky, though they're almost definitely not going to completely fix this.
I think we were able to rule a few things out from the minidumps:
The most notable thing we ruled out is that it's related to having Cyrillic characters in the username. In a bunch of the minidumps (maybe even all?), there were parts of file paths for a cache map file on the stack, and those paths were for the user name Admin.
It's perhaps also of interest that the crashes for this bug are *off* the main thread, and during the crash, the main thread is waiting for the cache lock. This made it seem like the bug on making nsCacheProfilePrefObserver::GetSmartCacheSize (which runs off the main thread) not call NS_GetSpecialDirectory might help, although we couldn't really see how.
We didn't come to a conclusion about whether or not this is the same as bug 595957. They have a whole bunch of similarities, though: most user comments are Cyrillic, spiked around the same time (although not exactly). It's possible that both are related to malware circulating in Russia, the Ukraine, and Poland.
Comment 39•15 years ago
(In reply to comment #35)
> crashes are showing up on 0924 and 0925 builds, but its true they are still 1/2
> the rate of 0917
I'm sorry, but we're probably looking at different data... I tend to look at the link provided in comment #30, then choose the "Table" tab. I see 4 crashes on the 14th, 536 on the 17th, 43 on the 24th and 25 on the 25th. Am I looking at the wrong thing?
(In reply to comment #37)
> Just checked in a version bump of the HTTP cache:
Brilliant idea! :) If we see another spike, I'd suggest to bump again and back out #596808 (it should probably be fixed more thoroughly anyway).
(In reply to comment #38)
> The most notable is that it's related to having Cyrillic characters in the
> username. In a bunch of the minidumps (maybe even all?), there were parts of
> file paths for a cache map file on the stack, and those paths were for the user
> name Admin.
Admin means elevated privileges on Windows, right? Virus/Malware...?
A few holes in the story still:
- do we know if the users who experience this crash run the beta again without
the crash, or do we even know that this is on the first run? (The version-bump
may provide insight here.)
- is there really no relation to this crash
http://crash-stats.mozilla.com/report/list?range_value=2&range_unit=weeks&signature=nsDiskCacheDevice%3A%3AOpenDiskCache%28%29&version=Firefox%3A3.6.9
which also has a spike on the 17th
Comment 40•15 years ago
(In reply to comment #38)
> We didn't come to a conclusion about whether or not this is the same as bug
> 595957. They have a whole bunch of similarities, though: most user comments
> are Cyrillic, spiked around the same time (although not exactly). It's
> possible that both are related to malware circulating in Russia, the Ukraine,
> and Poland.
Sorry - I missed the fact that this is the same as the 3.6.9-crash I was referring to in previous comment.
AFAICS the crashes for these two issues seem to both revolve around the statement
rv = mCacheMap.Open(mCacheDirectory)
Comment 41•15 years ago
The theory about the bad off-main-thread usage of the directory service is very likely, bug 597658. There's a patch in bug 596476 to fix it.
Depends on: 596476
| Assignee | ||
Comment 42•15 years ago
> do we know if the users who experience this crash run the beta again without
> the crash, or do we even know that this is on the first run?
We're seeing a lot of repeat crashes with the same hour:minute timestamp--usually from 2-6 in a row, which suggests it may be users crashing repeatedly and then giving up.
> the crashes for these two issues seem to both revolve around
>
> rv = mCacheMap.Open(mCacheDirectory)
which is calling nsDiskCacheMap::OpenBlockFiles(), which calls nsDiskCacheMap::GetBlockFileForIndex() three times (to get nsILocalFiles for _CACHE_001,2, and then 3). I believe we kept seeing "_CACHE_001_" in the disassembly on the stack; if true we're dying after the first call. I'm going to write a patch for the potential segfault mentioned in comment 12 just in case that helps.
| Assignee | ||
Comment 43•15 years ago
I take it back. The code mentioned in comment 12 already returns any error from QI, so that theory is bunk.
Will land 596476 once I get jst's (or anyone's) +r for the directory service patch.
Oh, hmm--we're still seeing crashes from the build after my cache version bump (build 20100928041914): the crash stack (and exception addr) are still the same, but the exception is now always EXCEPTION_ACCESS_VIOLATION_EXEC (before it was almost always a READ exception, with a few EXEC's thrown in).
(In reply to comment #39)
> (In reply to comment #35)
> > crashes are showing up on 0924 and 0925 builds, but its true they are still 1/2
> > the rate of 0917
>
> I'm sorry, but we're probably looking at different data... I tend to look at
> the link provided in comment #30, then choose the "Table" tab. I see 4 crashes
> on the 14th, 536 on the 17th, 43 on the 24th and 25 on the 25th. Am I looking
> at the wrong thing?
There are two different notions of time: (1) the build ID, and (2) the crash date. chofmann is saying that *for current crash dates*, half the crashes are still from the build ID of the 17th.
The data in comment 35 are a matrix showing *both* of these notions of time. Each entry is a date-of-crash, formatted like:
date-of-crash total-count-on-date crashes-that-date-on-build-id-1 build-id-1,
crashes-that-date-on-build-id-2 build-id-2, etc.
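For anyone who wants to slice these rows programmatically, here is a rough sketch of a parser. It assumes the order the rows in comment 35 actually use (a count first, then the build ID it belongs to) and a cleaned-up row without the stray `,,` separators. `CrashRow` and `ParseCrashRow` are hypothetical names for this example, not part of any crash-stats tooling.

```cpp
#include <map>
#include <sstream>
#include <string>

// One row of the matrix: a crash date, the total for that date, and the
// per-build breakdown (build ID -> number of crashes on that date).
struct CrashRow {
    std::string crashDate;
    int total = 0;
    std::map<std::string, int> perBuild;
};

// Parses a row such as
//   "20100922 136 66 4.0b7pre2010091704, 27 4.0b7pre2010092104"
// Assumes whitespace-separated tokens and "count build" pairs, where the
// build token may carry a trailing comma.
CrashRow ParseCrashRow(const std::string& line) {
    CrashRow row;
    std::istringstream in(line);
    in >> row.crashDate >> row.total;

    int count = 0;
    std::string build;
    while (in >> count >> build) {
        if (!build.empty() && build.back() == ',') build.pop_back();
        row.perBuild[build] += count;
    }
    return row;
}
```

Summing `perBuild` for one build ID across several rows gives the per-build view, while `total` gives the per-day view; those are the two notions of time described above.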
Comment 45•15 years ago
Thanks for the clarification! So that means that e.g on Sept.22nd there were 136 total crashes with this signature, 66 from the 0917-build, 27 from the 0921-build, 9 from the 0922-build etc... ok. (Quite useful, I must say :) )
However, IMO it still doesn't explain why the builds after 0917 produce fewer crashes...
Comment 46•15 years ago
Well... because nightly users are stuck on the 09-17 build, I'll bet!
Comment 47•15 years ago
Why would they be stuck? In particular: if it crashes at startup, why would anyone continue using it?
Comment 48•15 years ago
They keep hitting their Minefield icon in the taskbar and then remember that it crashes. They're stuck because if we can't launch, we can't update.
Anyway, let's land the fix for the problem we know about and see if this crash signature goes away.
Comment 49•15 years ago
(In reply to comment #48)
> They keep hitting their Minefield icon in the taskbar and then remember that it
> crashes. They're stuck because if we can't launch, we can't update.
How would we ever get them back? :)
Seriously: So the theory is that a number of nightly users have the 0917-build installed and do not manage to upgrade from it? In fact, there are so many of these that the crashes they generate after 11 days (and builds) still dominate this type of crash? Counter-intuitive to me, but I'll accept it if established experience says that this is how it works...
> Anyway, let's land the fix we know is a problem and see if this crash signature
> goes away.
Definitely! :)
(In reply to comment #43)
> Oh, hmm--we're still seeing crashes from the build after my cache version bump
> (build 20100928041914): the crash stack (and exception addr) are still the
> same, but the exception is now always EXCEPTION_ACCESS_VIOLATION_EXEC (before
> it was almost always a READ exception, with a few EXEC's thrown in).
But the number of crashes did not jump? I.e. the act of re-creating the cache does not seem to be the problem (yet)?
Does anyone know what EXCEPTION_ACCESS_VIOLATION_EXEC actually means? Illegal instruction?
Comment 50•15 years ago
(In reply to comment #49)
> In fact, there are so many of
> these that the crashes they generate after 11 days (and builds) still dominate
> this type of crash?
... and, btw, they all use Cyrillic keyboards?
| Assignee | ||
Comment 51•15 years ago
Landed 596476--let's see from the nightlies tomorrow if the directory service was indeed the culprit.
> what does EXCEPTION_ACCESS_VIOLATION_EXEC mean?
I believe it means a bad address was used as an instruction (instead of a read/write). Really not sure what that means here. A little odd given that the stack frame and addr are the same.
So far 16 crashes today with the build from last night. Hard to say if this is an improvement, as our slavic XP user base may or may not be trying it in large numbers (some may be stuck on the build from 17th, or given up on nightlies, etc.)
> How would we ever get [those users] back? :)
We can let them switch to Chrome for a while, then realize FF 4 is better.
Comment 52•15 years ago
(In reply to comment #51)
> Landed 596476--let's see from the nightlies tomorrow if the directory service
> was indeed the culprit.
>
> > what does EXCEPTION_ACCESS_VIOLATION_EXEC mean?
>
> I believe it means a bad address was used as an instruction (instead of a
> read/write). Really not sure what that means here. A little odd given that
> the stack frame and addr are the same.
That's exactly what it means:
http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/minidump_processor.cc#723
If you look at:
http://crash-stats.mozilla.com/report/index/deb500a9-e87b-4f4b-926c-b0b0b2100924
The top of the stack is the crash address, yes, which means that something caused us to jump to a bad address in non-executable memory. Saved by DEP!
Interestingly, frame 1 is missing source info, which probably means it's in a "cold" block of that function. I've investigated this in the past, when VC++ does PGO optimization it will separate functions out into "hot" and "cold" blocks, and put all the hot blocks in one set of pages, and the cold ones in another set of pages. Unfortunately VC2005 then fails to write out source line info in the PDB for the cold blocks. (VC2010 fixes this, at least.)
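As a rough illustration of the READ/EXEC distinction discussed above: on Windows, the first element of ExceptionInformation in an access-violation exception record is 0 for a read, 1 for a write, and 8 for a DEP (execute) violation, and the crash reason string is derived from that code. The sketch below is a simplified Python model of that mapping, not Breakpad's actual source; the function name `classify_access_violation` is made up for this example.

```python
# Access-type codes Windows stores in ExceptionInformation[0] of an
# EXCEPTION_ACCESS_VIOLATION record: 0 = read, 1 = write, 8 = DEP/execute.
ACCESS_TYPES = {0: "READ", 1: "WRITE", 8: "EXEC"}


def classify_access_violation(access_type: int) -> str:
    """Return a crash-reason string for an access violation (hypothetical helper)."""
    suffix = ACCESS_TYPES.get(access_type)
    if suffix is None:
        # Unknown code: fall back to the generic reason.
        return "EXCEPTION_ACCESS_VIOLATION"
    return f"EXCEPTION_ACCESS_VIOLATION_{suffix}"


if __name__ == "__main__":
    for code in (0, 1, 8):
        print(code, classify_access_violation(code))
```

Under this model, the same bad address can show up as either READ or EXEC depending on whether the CPU was loading data from it or trying to fetch an instruction from it.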
Comment 53•15 years ago
So something made us try executing from a bad address? Could this be caused by e.g. calling a method on a dangling pointer to an object?
Comment 54•15 years ago
re comment 45
> So that means that on day A there were X crashes on Y build.... . (Quite useful, I must say :)
Bug 600534 tracks trying to get this view in the web interface of socorro
Comment 55•15 years ago
(In reply to comment #53)
> So something made us try executing from a bad address? Could this be caused by
> e.g. calling a method on a dangling pointer to an object?
I'd say no, unless the object has virtual methods, and nsDiskCacheBlockFile doesn't have any. To me this looks more like stack corruption, where a RET jumps to a bad address and produces the EXEC violation. But I'm not much of an expert in deep debugging...
| Assignee | ||
Comment 56•15 years ago
Well, we're at 10 crashes so far today with the build from last night, so the directory service fix hasn't made this go away. We're still down from the 9/17 spike, but it's hard to know what sort of prevalence we'd see if this shipped in beta7.
Error is now back to EXCEPTION_ACCESS_VIOLATION_READ. (Is it just me, or does the combo of same crash stack + different access error + Slavic XP only == probably some sort of malware problem?)
Will look at some minidumps of yesterday and today as soon as I can get my hands on some.
Comment 57•15 years ago
(In reply to comment #56)
>(Is it just me, or does
> the combo of same crash stack + different access error + Slavic XP only ==
> probably some sort of malware problem?)
No (i.e. it's not just you).
Could we conclude that there was no new spike after the new version-bump?
Still 31 crashes in the 2010-09-30 build.
| Assignee | ||
Comment 59•15 years ago
21 of those 31 crashes for the 9/30 build are in rapid succession, so probably just a very persistent user crashing over and over at startup.
We have alas made very little headway on this bug. Opening the crashdumps causes my copy of devstudio to load the blue screen of death, which is making it hard for me at least to get anywhere.
Given the crash levels are pretty low since 9/17 do we want to mark this betaN?
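The "rapid succession" observation above can be checked mechanically by grouping crash timestamps into bursts. The following is a minimal sketch under stated assumptions: the five-minute gap threshold and the function name `group_bursts` are illustrative choices, not anything Socorro provides.

```python
from datetime import datetime, timedelta


def group_bursts(timestamps, max_gap=timedelta(minutes=5)):
    """Group crash times into bursts whose consecutive gaps are <= max_gap.

    The 5-minute default is an arbitrary assumption for illustration.
    """
    bursts = []
    for ts in sorted(timestamps):
        if bursts and ts - bursts[-1][-1] <= max_gap:
            bursts[-1].append(ts)  # close enough to the previous crash
        else:
            bursts.append([ts])    # start a new burst
    return bursts


if __name__ == "__main__":
    base = datetime(2010, 9, 30, 6, 0, 0)
    # Made-up sample: three crashes a minute apart, then one two hours later.
    sample = [base, base + timedelta(minutes=1), base + timedelta(minutes=2),
              base + timedelta(hours=2)]
    print([len(b) for b in group_bursts(sample)])
```

A single large burst suggests one user restarting repeatedly rather than many independent users.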
Comment 60•15 years ago
FWIW, I found this blog post in Russian: http://translate.google.com/translate?hl=ru&sl=ru&tl=en&u=http%3A%2F%2Fsibilev.net%2F%3Fp%3D3573 that describes either this bug or probably Bug 595957 (Firefox constantly crashes on startup and tries to send a crash report). The cause of the Firefox startup crash is a virus loaded through HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon\Userinit
Once the virus is removed, the Firefox crash is gone. A few users on our local forum report that the virus removal method described in this blog post fixed their Firefox startup crash.
| Assignee | ||
Comment 61•15 years ago
Alexander,
Thanks very much for this information!
I am not clear from the translation of the blog post whether the "DrWeb" software mentioned was part of the problem (caused the virus), or was just part of an attempt to fix it.
It looks like the problem here is that a malicious program is somehow inserted into HKEY_LOCAL_MACHINE \ SOFTWARE \ Microsoft \ Windows NT \ CurrentVersion \ Winlogon \ Userinit, and then presumably run at startup. It's not clear how it winds up affecting Firefox, but when the registry key is removed, the crashes go away. One possibility is that the filesystem I/O syscalls are being intercepted, but presumably it could be lots of different things.
We've looked over the "interesting_modules" file for the crash, and we don't see any clear .dll file that's associated with this. So we don't have a .dll name that we can block. Not sure if there's anything else to do here.
At JST's behest, marking INVALID and removing as blocker, since the crash numbers have stayed low since the spike on 9/17:
http://crash-stats.mozilla.com/report/list?range_value=2&range_unit=weeks&signature=nsDiskCacheMap%3A%3AOpen(nsILocalFile*)&version=Firefox%3A4.0b7pre
Assignee: honzab.moz → jduell.mcbugs
Status: ASSIGNED → RESOLVED
blocking2.0: beta7+ → ---
Closed: 15 years ago
Keywords: regression
Resolution: --- → INVALID
Whiteboard: [trying a backout]
Comment 62•15 years ago
(In reply to comment #61)
> I am not clear from the translation of the blog post whether the "DrWeb"
> software mentioned was part of the problem (caused the virus), or was just part
> of an attempt to fix it.
For clarity, Dr.Web is antivirus software, popular in Russia - http://www.drweb.com/?lng=en
Comment 64•15 years ago
http://technet.microsoft.com/en-us/library/cc939862.aspx
Specifies the programs that Winlogon runs when a user logs on. By default, Winlogon runs Userinit.exe, which runs logon scripts, reestablishes network connections, and then starts Explorer.exe, the Windows user interface.
Think of it as like ~/.profile or something, a happy place for bad guys to ask to run really early.
Comment 65•15 years ago
We are getting 10,000-14,000 crashes per day on this. Let's not call it invalid, and let's try to figure out what we can do to drive those numbers lower.
date crashes at
HeapDestroy bug 597960
20101001 5943
20101002 6788
20101003 5717
20101004 5288
20101005 4787
172:crashdata chofmann$ ./stacktrend.sh nsDiskCacheDevice::OpenDiskCache 201010*
date crashes at
nsDiskCacheDevice::OpenDiskCache bug 595957
20101001 5503
20101002 7564
20101003 7252
20101004 6486
20101005 6324
172:crashdata chofmann$ ./stacktrend.sh nsDiskCacheMap::Open.nsILocalFile.. 201010*
date crashes at
nsDiskCacheMap::Open.nsILocalFile..
20101001 63
20101002 20
20101003 23
20101004 38
20101005 44
I suggested that we post something on SUMO and try and drive traffic to that article with press, but cww says visits by Russian users to SUMO are low. What are the support venues that we should be hitting?
Should we try and ramp up some press on this with instructions on how to repair?
I've sent mail to contacts at Kaspersky to see whether they can get involved in blocking/repairing this malware; are there other contacts like that we should reach out to?
The e-mail responder feature should be going online with Socorro 1.7 tomorrow night. This is a good candidate to use for responding to users who add e-mail addresses to crash reports on these three signatures.
Comment 66•15 years ago
(In reply to comment #65)
> I suggested that we post something on SUMO and try and drive traffic to that
> article with press, but cww says visits by Russian users to SUMO are low.
> What are the support venues that we should be hitting?
I guess it's possible to have the crash reporter detect that Firefox has crashed several times on startup. Once it has detected that, the crash reporter could launch another browser (most probably Internet Explorer) and open a SUMO article explaining what to do when Firefox cannot be launched.
Some pitfalls:
1) Most viruses block user access to the web sites of antivirus companies. They could easily block access to SUMO too. We could ship the SUMO page bundled with the browser, but that would hurt distribution size.
2) Some computers don't have another browser installed (thanks to the European Union browser choice initiative). Thankfully Russia is not part of the EU.
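The detection step of this proposal could look roughly like the sketch below. It assumes the crash reporter keeps a list of recent crash uptimes in seconds (crash reports already carry an Uptime field); the 60-second startup window, the threshold of three crashes, and both function names are hypothetical choices for illustration, not existing crash reporter behavior.

```python
def consecutive_startup_crashes(uptimes_s, startup_window_s=60):
    """Count the most recent crashes in a row that happened during startup.

    uptimes_s: uptimes (seconds) of past crashes, oldest first.
    startup_window_s: assumed cutoff below which a crash counts as "startup".
    """
    count = 0
    for uptime in reversed(uptimes_s):
        if uptime >= startup_window_s:
            break  # a longer session interrupts the streak
        count += 1
    return count


def should_open_help(uptimes_s, min_crashes=3, startup_window_s=60):
    """Decide whether to show a help page in another browser (hypothetical policy)."""
    return consecutive_startup_crashes(uptimes_s, startup_window_s) >= min_crashes


if __name__ == "__main__":
    print(should_open_help([3600, 1, 2, 1]))  # three startup crashes in a row
    print(should_open_help([1, 1, 3600]))     # last session ran for an hour
```

Pitfall 1 above still applies: this only decides when to show help, not whether the help page is reachable.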
Comment 67•15 years ago
Yeah, we can't ensure that we will reach users with any of these channels, but we need to try!
The impact of this is probably far greater than the 10,000-14,000 users I mentioned above. That's the number of users that crash per day from the buggy malware. We will probably run into more as we find more signatures related to this problem. The larger problem might be for users where the malware runs as designed and does not crash and the system is compromised.
Comment 68•15 years ago
(In reply to comment #67)
> The larger problem might be for users where the malware runs
> as designed and does not crash and the system is compromised.
Well, Firefox already has a safe browsing feature, thanks to Google. It would be helpful to add some basic virus/malware detection to Firefox that could indicate that the system has been compromised (maybe some antivirus company would be interested). At least Mozilla developers wouldn't waste their precious time on bugs caused by malware.
Though I guess this discussion doesn't belong in this bug.
Comment 69•15 years ago
(In reply to comment #67)
> The larger problem might be for users where the malware runs
> as designed and does not crash and the system is compromised.
Do we have a name for the malware concerned at this point? e.g. could we get a copy from Dr.Web for sandboxed analysis?
Comment 70•15 years ago
Alexander L. Slovesnik: where do most Russians go for tech support (and by extension, where do they go for tech support with Firefox?) Is the front page of http://mozilla-russia.org/ a good place for a notice about this issue? You seem to have a much more active community than we do.
FWIW, since it's a startup crash, the primary driver of traffic to SUMO -- the built-in Help button -- is not usable so we should be looking at messaging in other places.
I have no sense for the scale of this problem either... is it bad enough that we should try to send an official notice to the Technology ministry in Russia? Should we try to get Microsoft to release a security update to address this? Comment 67 makes it seem like a much larger percentage of users are affected than the crash reports we see.
Updated•15 years ago
Blocks: malware-attacks
Comment 71•15 years ago
translated version of the support forum at http://mozilla-russia.org/
http://translate.google.com/translate?js=n&prev=_t&hl=en&ie=UTF-8&layout=2&eotf=1&sl=ru&tl=en&u=http%3A%2F%2Fmozilla-russia.org%2F
shows the symptoms of this bug
translated post name posts views
Mozilla Firefox will not start - Chara [ 1 2 3 4 ] 80 24660
firefox does not run a report of an unexpected error - axe 14 1195
Permanent fall browser - KReoN 8 154
Comment 72•15 years ago
(In reply to comment #70)
> Alexander L. Slovesnik: where do most Russians go for tech support (and by
> extension, where do they go for tech support with Firefox?) Is the front page
> of http://mozilla-russia.org/ a good place for a notice about this issue? You
> seem to have a much more active community than we do.
The Russian Mozilla forum is http://forum.mozilla-russia.org/. I've created a post in our local FAQ about this issue at http://forum.mozilla-russia.org/viewtopic.php?id=46369
However, malware removal is a very tricky business and I'm reluctant to turn the Mozilla support forum into a malware removal forum. Antivirus company support and specialized forums are better qualified to deal with malware issues.
> I have no sense for the scale of this problem either... is it bad enough that
> we should try to send an official notice to the Technology ministry in Russia?
FWIW, it's not only a Russian problem. On http://crash-stats.mozilla.com/report/list?signature=nsDiskCacheDevice::OpenDiskCache%28%29 there are some comments in Italian and German.
> Should we try to get Microsoft to release a security update to address this?
> Comment 67 makes it seem like a much larger percentage of users are affected
> than the crash reports we see.
Nothing indicates that this is a Microsoft issue.
Comment 73•15 years ago
Microsoft, as part of monthly security updates, pushes out a malware scanner... I don't know if it works really well but if chofmann is right, this is affecting tons of users (who are not crashing) and causing loss of personal data and we should leverage whatever resources we can to help them.
Another question: Is there anything you think Mozilla should do to help? You probably have a better sense of your locale than we do and I'd be happy to do what we can. However, you are much better qualified to say what steps/outreach is necessary.
Comment 74•15 years ago
(In reply to comment #73)
> Microsoft, as part of monthly security updates, pushes out a malware scanner...
> I don't know if it works really well but if chofmann is right, this is
> affecting tons of users (who are not crashing) and causing loss of personal
> data and we should leverage whatever resources we can to help them.
Unfortunately, a lot of users disable Microsoft Update on pirated Windows installations.
> Another question: Is there anything you think Mozilla should do to help? You
> probably have a better sense of your locale than we do and I'd be happy to do
> what we can. However, you are much better qualified to say what
> steps/outreach is necessary.
I've posted a kind of plan in comment 66. Additionally, Mozilla could contact antivirus companies (http://translate.google.com/translate?js=n&prev=_t&hl=en&ie=UTF-8&layout=2&eotf=1&sl=ru&tl=en&u=http%3A%2F%2Fwww.anti-malware.ru%2Frussian_antivirus_market_2009_2010 shows some stats on the antivirus market in Russia) to ask them for any data on the Firefox start-up crash issue. I guess they can correlate Firefox crash statistics with malware spread statistics.
Comment 75•15 years ago
The plan in comment 66 is a good long term idea but would minimally require a new version of Firefox to work (and maybe a lot of work in the socorro backend). Is there anything that we can do without making changes to Firefox?
Comment 76•15 years ago
yes, https://bugzilla.mozilla.org/show_bug.cgi?id=585593 outlines the plan for changes going into socorro that will allow finding the crash signatures like in this bug and the two other related bugs, then pulling e-mail address where users provided them, then e-mailing with instructions on how to avoid the crash they just hit.
this won't require any changes to firefox.
The message that we construct for the e-mail ought to have information in Russian and English, and it sounds like maybe Italian and German too, with links to instructions on how to avoid the crash in each of these languages.
Comment 77•15 years ago
(In reply to comment #76)
> yes, https://bugzilla.mozilla.org/show_bug.cgi?id=585593 outlines the plan for
> changes going into socorro that will allow finding the crash signatures like in
> this bug and the two other related bugs, then pulling e-mail address where
> users provided them, then e-mailing with instructions on how to avoid the crash
> they just hit.
Can you estimate the percentage of users who have provided their e-mail addresses in crash reports? Are we talking about 1%, 10% or 90%?
Comment 78•15 years ago
Yeah, the projections for the number of users that we can reach with this technique are low, but it's still one more tool to get the word out.
Some quick checks indicate that we might be able to reach just over 1,000 users per day who are hitting these crashes. Here is a sample from Oct 6:
HeapDestroy
6319 reports - no e-mail provided
516 yes, have e-mail address
nsDiskCacheDevice::OpenDiskCache
8269 no e-mail provided
549 yes, have e-mail
nsDiskCacheMap::Open.nsILocalFile..
68 no e-mail
this is probably a good bug to test the rollout of the e-mail responder system.
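To put the Oct 6 sample in terms of the percentage asked about in comment 77, the arithmetic can be sketched as follows. The counts are copied from the sample above; the with-e-mail count for nsDiskCacheMap::Open is not given there, so it is assumed to be 0 here, and the helper name `email_share` is made up for this example.

```python
# (no_email, with_email) per signature, from the Oct 6 sample above.
# The with_email value for nsDiskCacheMap::Open is NOT in the sample;
# 0 is an assumption made only so the arithmetic can run.
samples = {
    "HeapDestroy": (6319, 516),
    "nsDiskCacheDevice::OpenDiskCache": (8269, 549),
    "nsDiskCacheMap::Open": (68, 0),
}


def email_share(no_email, with_email):
    """Fraction of reports that include an e-mail address."""
    total = no_email + with_email
    return with_email / total if total else 0.0


reachable = sum(with_email for _, with_email in samples.values())
total_reports = sum(n + w for n, w in samples.values())

if __name__ == "__main__":
    for name, (n, w) in samples.items():
        print(f"{name}: {email_share(n, w):.1%} with e-mail")
    print(f"overall: {reachable} of {total_reports} "
          f"({reachable / total_reports:.1%})")
```

On these numbers the reachable share is in the single-digit percent range, which matches the "just over 1,000 users per day" estimate.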
Comment 79•15 years ago
I'm no expert in runtime C++ and only use Windows if I'm forced to, but would it be possible to add exception handling (possibly for Windows only) in the appropriately coarse-grained places in the code which loads/bootstraps Firefox modules? Just to catch stuff like this and pop up some reasonable message?
Comment 80•15 years ago
not like that. we don't know if a library is poisoning our process and running away, or if a process is attacking our process, or if a kernel driver is ruining us.
there's also another minor detail... a rogue piece of code could hurt any random file i/o, not just the one we pick.
ignoring that, assuming the process actually does care about us, this is a losing battle.
Comment 81•15 years ago
still currently running at about ten thousand crashes per day on
Bug 597960 - crash under Windows XP [@ HeapDestroy ] mainly on start-up
Plus another 8,000 per day with the nsDiskCacheDevice::OpenDiskCache.. signature
plus another 100 or so per day on this signature would bring the total to 19,000 crashes per day of the crash reports we process.
Comment 82•15 years ago
I'm not seeing anything at 10,000 crashes a day on http://crash-stats.mozilla.com/products/Firefox/versions/4.0b8pre - where are we seeing this volume?
Comment 83•15 years ago
This is one of several bugs where we are affected by what is possibly the same malware, which spans all releases. This particular signature applies only to trunk, so it's low volume. One of the bugs is duped against this bug, so I figured we were concentrating comments here. Maybe we should spin up a tracking bug to cover common stats and attributes of all the bugs. Here is the first comment for the tracking bug:
this bug's stats.
date tl crashes at, count build, count build, ...
nsDiskCacheMap::Open.nsILocalFile..
20101020 33 12 4.0b7pre^\2010100204,
10 4.0b8pre^\2010101804, 6 4.0b8pre^\2010102004,
2 4.0b8pre^\2010101904, 1 4.0b8pre^\2010101104,
1 4.0b8pre^\2010100704, 1 4.0b7pre^\2010100304,
20101021 60 53 4.0b7pre^\2010100204,
2 4.0b4^\2010081813, 2 3.6.10^\2010091412,
1 4.0b8pre^\2010101604, 1 4.0b8pre^\2010100904,
1 3.6.11^\2010101211,
Bug 595957 - Sept 10-12, Spike in Firefox Crashes for Russian Users [@ nsDiskCacheDevice::OpenDiskCache() ] (edit)
date tl crashes at, count build, count build, ...
nsDiskCacheDevice::OpenDiskCache..
20101020 4063 2173 3.6.10^\2010091412,
343 4.0b6^\2010091408, 260 3.0.19^\2010031422,
150 3.6.11^\2010101211, 149 3.6^\2010011514,
141 3.5.13^\2010091413, 125 3.5.5^\2009110215,
89 3.6.3^\2010040108, 72 3.6.8^\2010072215,
59 3.0b5^\2008032620, 55 3.0.1^\2008070208,
47 4.0b4^\2010081813, 47 4.0b2^\2010072019,
31 3.6.9^\2010082415, 25 3.0.5^\2008120122,
<releases where volume is less than 30 crashes per day snipped>
Bug 597960 - crash under Windows XP [@ HeapDestroy ]
date tl crashes at, count build, count build, ...
HeapDestroy
20101020 17985 7901 3.6.10^\2010091412,
1517 3.5.13^\2010091413, 901 3.6.8^\2010072215,
844 3.6.11^\2010101211, 762 4.0b6^\2010091408,
751 3.6^\2010011514, 522 3.6.3^\2010040108,
503 3.0.19^\2010031422, 334 3.0.6^\2009011913,
297 3.5.6^\2009120122, 283 3.6.6^\2010062523,
222 3.5.5^\2009110215, 201 3.5.3^\2009082410,
184 3.5.2^\2009072922, 183 3.7a1pre^\2009082804,
148 3.0.1^\2008070208, 137 4.0b7pre^\2010100204,
131 3.0.5^\2008120122, 111 3.0^\2008052906,
101 4.0b4^\2010081813, 96 4.0a1pre^\2008051003,
<releases where volume is under 100 per day snipped>
Comment 84•15 years ago
Similar crash in thunderbird. All are win XP.
bp-0d4166c4-99dc-42c3-a130-2e3e42101109
"Opens up again but breaks down right with the 1st click. Firefox doesn't even open anymore."
0 @0xf195b58c
1 thunderbird.exe nsDiskCacheMap::OpenBlockFiles netwerk/cache/src/nsDiskCacheMap.cpp:617
2 thunderbird.exe nsDiskCacheMap::Open netwerk/cache/src/nsDiskCacheMap.cpp:155
3 thunderbird.exe nsDiskCacheDevice::OpenDiskCache netwerk/cache/src/nsDiskCacheDevice.cpp:896
4 thunderbird.exe nsDiskCacheDevice::Init netwerk/cache/src/nsDiskCacheDevice.cpp:374
5 thunderbird.exe nsCacheService::CreateDiskDevice netwerk/cache/src/nsCacheService.cpp:966
6 thunderbird.exe nsCacheService::SearchCacheDevices netwerk/cache/src/nsCacheService.cpp:1362
7 thunderbird.exe nsCacheService::ActivateEntry netwerk/cache/src/nsCacheService.cpp:1271
8 thunderbird.exe nsCacheService::ProcessRequest netwerk/cache/src/nsCacheService.cpp:1151
9 thunderbird.exe nsCacheService::OpenCacheEntry netwerk/cache/src/nsCacheService.cpp:1236
10 thunderbird.exe nsCacheSession::OpenCacheEntry netwerk/cache/src/nsCacheSession.cpp:98
11 thunderbird.exe nsHttpChannel::OpenCacheEntry netwerk/protocol/http/src/nsHttpChannel.cpp:1832
bp-ec0241d2-852e-42fe-b18d-b13fe2101110 (e.biehl)
bp-45bb010c-5f03-4f11-b2cb-3e5022101111 (g.birkle)
Comment 85•15 years ago
We'd like to use this as a test pilot for reaching out to users suffering from a crash where there's a known workaround but not a fix in place in Firefox.
Based on the Russian forum thread, here's my attempt to translate the instructions to English. Can anyone confirm that this is an accurate translation (and clarification)?
1. Open regedit (click Start, then Run..., and then type "regedit" and press Enter).
2. Locate the key: HKEY_LOCAL_MACHINE \ SOFTWARE \ Microsoft \ Windows NT \ CurrentVersion \ Winlogon.
3. Find the entry called "Userinit". It should only have the value of "C:\WINDOWS\system32\userinit.exe". If there is a comma and more text after it, this is a virus. Remember the part after the comma, which might look like this: "C:\WINDOWS\system32\3abcde04.exe".
4. Open My Computer and navigate to the folder containing the virus. In the example above, this is "C:\Windows\system32".
5. Completely remove the virus file by selecting it ("3abcde04.exe" in the example above) and pressing the Delete key while holding down the Shift key.
6. Go back to regedit and remove the part of the entry "Userinit" so it only includes "C:\WINDOWS\system32\userinit.exe".
7. Restart the computer.
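The check in step 3 can also be expressed as a small script. This is a minimal sketch, assuming the expected value is exactly the path given in the steps above; the helper names are made up for this example, and it only reports extra entries, it does not delete files or edit the registry. The registry read uses Python's standard `winreg` module and therefore only works on Windows.

```python
EXPECTED = r"C:\WINDOWS\system32\userinit.exe"
WINLOGON_KEY = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon"


def extra_userinit_entries(value, expected=EXPECTED):
    """Return Userinit entries other than the expected userinit.exe path.

    Entries are comma-separated; empty entries (e.g. from a trailing
    comma) are ignored. Comparison is case-insensitive.
    """
    entries = [e.strip() for e in value.split(",") if e.strip()]
    return [e for e in entries if e.lower() != expected.lower()]


def read_userinit():
    """Read the Userinit value from the registry (Windows only)."""
    import winreg  # imported here so the parser above works on any OS
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, WINLOGON_KEY) as key:
        value, _type = winreg.QueryValueEx(key, "Userinit")
    return value


if __name__ == "__main__":
    # Example value taken from step 3 above.
    infected = EXPECTED + r",C:\WINDOWS\system32\3abcde04.exe"
    print(extra_userinit_entries(infected))
```

Any path this reports is only a candidate for the manual inspection described in steps 4 to 6; legitimate software occasionally adds entries here too, so the output should not be treated as proof of infection.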
Comment 86•15 years ago
(In reply to comment #85)
> Based on the Russian forum thread, here's my attempt to translate the
> instructions to English. Can anyone confirm that this is an accurate
> translation (and clarification)?
Translation looks good.
Updated•14 years ago
Crash Signature: [@ nsDiskCacheMap::Open(nsILocalFile*) ]
| Reporter | ||
Comment 87•14 years ago
It's now a low volume crash: only 11 crashes in 8.0 over the last week.
Keywords: topcrash
Summary: spike in 4.0b7pre start-up crash under Windows XP [@ nsDiskCacheMap::Open(nsILocalFile*) ] → Start-up crash under Windows XP [@ nsDiskCacheMap::Open(nsILocalFile*) ]
Updated•10 years ago
Crash Signature: [@ nsDiskCacheMap::Open(nsILocalFile*) ] → [@ nsDiskCacheMap::Open(nsILocalFile*) ]
[@ nsDiskCacheMap::Open ]
Comment 88•10 years ago
zero examples with nsDiskCacheMap::Open in signature in the past week for any version
Status: REOPENED → RESOLVED
Closed: 15 years ago → 10 years ago
Resolution: --- → WORKSFORME