Closed
Bug 189778
Opened 23 years ago
Closed 23 years ago
Mozilla crash breaks Promise controller IDE mirror.
Categories
(Core :: Networking: Cache, defect)
Tracking
()
VERIFIED
WORKSFORME
People
(Reporter: mjennings.usa, Assigned: gordon)
Details
(Keywords: crash, stackwanted)
Attachments
(2 files, 1 obsolete file)
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.0.2) Gecko/20021216
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.0.2) Gecko/20021216
This extremely serious failure has happened twice, once while typing into a
browser window (version 1.2.1), and another time while trying to access a
Mozilla Mail spellchecker that is apparently not installed correctly (version
1.0.2).
Each time the Promise Technology RAID controller hard disk mirror was broken,
and became "critical". Normally this would mean that there was a hard drive
failure. However, in both cases, there was nothing wrong with the hard drives.
Apparently the Mozilla crash is destroying information about the mirror stored
by the Promise FastTrak TX2 controller in the hidden sectors at the beginning
of the drives.
In both cases all instances of Mozilla Mail and Mozilla browser crashed. The
crash is caused by rapid keyboard or mouse input, apparently. Then there is
intense disk access. Then the TalkBack application appears. (See the TalkBack
report on Sunday, January 19, 2003. It is the only one with this email
address: Microsoft-BUGS at myrealbox dot com. The crash reported then, Moz
1.0.2, was the second that destroyed the Promise Technology RAID mirror.)
In both cases there were several instances of Mozilla browser open with
several tabs in each. In both cases the Mozilla Mail application was running
and one or more email messages were being composed. (The user has
numerous duties that often require switching to another subject before a
former one is resolved.)
In both cases the Mozilla crash immediately preceded the FastCheck monitoring
utility reporting that the RAID mirror was critical.
Operating system: Windows XP with SP1. No other problems except for the normal
quirkiness of Windows XP.
The controller is configured with two 40 GB Western Digital hard drives. The
driver date is 06/11/2002. The driver version is 2.0.0.26. The controller is
installed in PCI Slot 2 (PCI bus 1, device 10, function 0).
Controller Info: IRQ: 9 BusMaster Base: 0xDF90, ROM Base Addr: 0xFD9E0000,
Hardware Type: FastTrak-100 TX/LP (6268/6270)
Note: Contrary to what is reported by the controller utility above, the
controller is a FastTrak-100 TX2.
The motherboard is an Intel 815EEA2 with an 866 MHz Pentium 3 processor.
Promise Technology
http://www.promise.com/
Promise Technology FastTrak100 TX2 RAID controller
http://www.promise.com/support/download/download2_eng.asp?productId=8&category=All&os=100
Reproducible: Couldn't Reproduce
Steps to Reproduce:
This is a major crash. I don't know how to make it happen other than stressing
Mozilla with many instances open. (The crash happens regularly then, but these
are the only times the crash broke the mirror.)
Could you give the precise talkback ID? (run talkback.exe in mozilla/components/)
| Reporter | ||
Comment 2•23 years ago
This JPEG image shows the TalkBack IDs of all the crashes of Mozilla that were
logged. The latest crash shown (Sunday, January 19, 2003) was the second of the
two crashes that destroyed the Promise Technology RAID mirror.
The first crash that destroyed the RAID mirror caused an intense amount of disk
access that continued for more than 30 seconds. The user, fearing a virus
infection, turned off the power. There was therefore no TalkBack report.
Other than the mirror being broken, no data seems to have been lost in the
second crash that destroyed the RAID mirror. During the first crash that
affected the RAID mirror, all files and new folders created over two days were
lost.
| Reporter | ||
Comment 3•23 years ago
Latest Talkback incident: TB16361837M. This is the second crash that destroyed
the RAID mirror.
[It is unfortunate that the TalkBack incident numbers must be laboriously
copied by hand. They cannot be selected and copied to the clipboard.]
See the attached .JPG image file which shows all the Mozilla crashes that have
been logged by TalkBack.
The JPEG image only shows the TalkBack IDs of the crashes of Mozilla that were
logged. The latest crash shown (Sunday, January 19, 2003) was the second of
the two crashes that destroyed the Promise Technology RAID mirror.
The first crash that destroyed the RAID mirror caused an intense amount of
disk access that continued for more than 30 seconds. The user, fearing a virus
infection, turned off the power. There was therefore no TalkBack report.
Other than the mirror being broken, no data seems to have been lost in the
second crash that destroyed the RAID mirror. During the first crash that
affected the RAID mirror, all files and new folders created over two days were
lost.
I can't possibly see how Mozilla can cause a RAID array to go critical....
Maybe Mozilla is crashing as a *result* of the array going critical?
Stack should help
Keywords: crash, stackwanted
my guess :)
Assignee: asa → gordon
URL: Any.
Component: Browser-General → Networking: Cache
QA Contact: asa → tever
Comment 7•23 years ago
WFM 20030120 Linux with a Promise FastTrak 100. I would try it under XP but I
don't want my mirror to get killed. ;)
Comment #4:
>Maybe Mozilla is crashing as a *result* of the array going critical?
I think this is correct.
Comment 8•23 years ago
I'm not going to dupe this myself, but this looks very much like bug 185251, the
spell check error. The stack in the talkback instance doesn't have line numbers,
but the pattern is the same; this one ends with the spellchk.dll.
spellchk.dll + 0x126a (0x0628126a)
spellchk.dll + 0x102a (0x0628102a)
xpcom.dll + 0x3120a (0x6118120a)
composer.dll + 0x73af (0x601373af)
xpcom.dll + 0x377c1 (0x611877c1)
xpc3250.dll + 0x12634 (0x60d62634)
xpc3250.dll + 0x15b87 (0x60d65b87)
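Each frame above gives a module name, an offset, and the absolute address in parentheses, so the load address of each DLL can be recovered by subtraction. A minimal Python sketch of that arithmetic, using only the frames quoted above (the name `module_bases` is just for illustration):

```python
# Frames copied from the Talkback stack above: (module, offset, absolute address).
FRAMES = [
    ("spellchk.dll", 0x126A, 0x0628126A),
    ("spellchk.dll", 0x102A, 0x0628102A),
    ("xpcom.dll", 0x3120A, 0x6118120A),
    ("composer.dll", 0x73AF, 0x601373AF),
    ("xpcom.dll", 0x377C1, 0x611877C1),
    ("xpc3250.dll", 0x12634, 0x60D62634),
    ("xpc3250.dll", 0x15B87, 0x60D65B87),
]


def module_bases(frames):
    """Return {module: base address}, where base = absolute address - offset."""
    bases = {}
    for module, offset, address in frames:
        base = address - offset
        # Every frame from the same module should agree on the base address.
        if bases.setdefault(module, base) != base:
            raise ValueError(f"inconsistent base for {module}")
    return bases


if __name__ == "__main__":
    for module, base in module_bases(FRAMES).items():
        print(f"{module:14s} loaded at 0x{base:08x}")
```

If the frames of one module disagreed on the base address, that would hint at a corrupted stack record; here the function raises in that case rather than silently picking one value.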
The odd thing is that I don't see how such a crash could cause a driver to
fail. The driver should be protected. But I've seen stranger things. Just for
good measure, if you haven't already, check to see if there's a new version of
the driver.
Also odd is that the TB listed a jmp instruction as the crash site. This is
somewhat unusual, since the address looked ordinary. It's possible that under
stress your system is having problems. When the jmp was executed, it may have
failed to swap in the code due to a disk error and thus generated the exception
in Mozilla code.
Comment 9•23 years ago
I meant to strike the first part of the message before I posted, but forgot. I
really don't think this is a spell check issue, given the jmp instruction crash.
My money is on a flaky driver/controller/memory causing code to fail to be paged
back in.
| Reporter | ||
Comment 10•23 years ago
First, the problem would not occur if Mozilla did not crash. Or, if a RAID
mirror failure did occur, there would be no suspicion it was caused by Mozilla
if Mozilla did not crash at the same time that the break of the mirror
occurred.
At present, the crashing causes us not to want to use Mozilla on all our
machines. It's a serious issue. The crashing seems to happen because keyboard
or mouse input is overloaded somehow.
We have plenty of hard drives, computers, and controllers here. We can run any
tests.
The biggest cause of hardware failure is bad contacts. This problem is
eliminated from consideration because the contacts had been renewed a few days
before. Contact renewal is accomplished by pulling all cards and cables out
about a millimeter and pushing them back on again.
There does not seem to be a hardware failure. Both drives have passed Western
Digital's Diagnostic Quick Check. (W.D. is the manufacturer of the drives.)
One of the drives of the mirror passed W.D.'s Rigorous check. The other drive
of the mirror booted perfectly and is being used to write this email message.
Promise Technology Technical Support says they know of no instances in which
software is causing a break of a mirror. But, it is possible if something
writes to the hidden sectors on the hard drive, where the mirror information
is stored.
The latest drivers were being used before the crash. The RAID controller BIOS
was upgraded from 2.00.0.2 to 2.00.0.24 today.
Comment 11•23 years ago
Ideally, with a properly written driver, software running in user mode should
never cause a fault in a driver, even in a crashing situation.
The instruction that caused the failure is jmp and it is to a hard coded address:
60437524 e875260000 call 60439b9e
60437529 e95c020000 jmp 6043778a
6043752e 817f042d010000 cmp dword ptr [edi+0x4],0x12d
For Windows to fault on the jmp instruction, one of two things had to happen.
Either the memory was swapped out and it was unable to read it from disk, or
the CPU had trouble reading the memory itself. Given the close proximity of the
addresses, my bet would be with the CPU unable to read that address in memory.
I'd also check your event log to see if you have any errors in there that might
help identify the source of the problems.
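As a sanity check on the disassembly above, the two branch instructions can be decoded by hand. On x86, opcode `e8` (near call) and opcode `e9` (near jmp) are each followed by a 32-bit little-endian signed displacement, taken relative to the address of the next instruction. The Python sketch below recomputes the targets from the raw bytes quoted in the listing. The helper names `rel32_target` and `same_page` are illustrative, and the 4 KiB page size is an assumption (the standard x86 page size, ignoring large pages):

```python
import struct

PAGE_SIZE = 4096  # standard 4 KiB x86 page (assumed; large pages not considered)


def rel32_target(address, raw_hex):
    """Decode a 5-byte near call (e8) or near jmp (e9) and return its target."""
    raw = bytes.fromhex(raw_hex)
    if len(raw) != 5 or raw[0] not in (0xE8, 0xE9):
        raise ValueError("not a rel32 call/jmp")
    (disp,) = struct.unpack("<i", raw[1:])  # signed little-endian displacement
    # The displacement is relative to the next instruction (address + 5).
    return (address + 5 + disp) & 0xFFFFFFFF


def same_page(a, b):
    """True if both addresses fall within the same page."""
    return a // PAGE_SIZE == b // PAGE_SIZE


if __name__ == "__main__":
    call_target = rel32_target(0x60437524, "e875260000")
    jmp_target = rel32_target(0x60437529, "e95c020000")
    print(f"call -> 0x{call_target:08x}")
    print(f"jmp  -> 0x{jmp_target:08x}")
    print("jmp and its target on the same page:", same_page(0x60437529, jmp_target))
```

The decoded targets agree with the ones Talkback printed, and the jmp target falls on the same 4 KiB page as the jmp itself, which fits the "close proximity" observation above. This checks only the encoding arithmetic; it says nothing about what actually caused the fault.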
| Reporter | ||
Comment 12•23 years ago
Running Memtest for more than 2 hours showed no errors. Thanks to Will Dormann
for suggesting Memtest. There is every indication that the hardware is healthy.
| Reporter | ||
Comment 13•23 years ago
This bug report includes a .JPG showing numerous crashes. Is anyone else
reporting crashes of Mozilla?
David Bradley said,
"The instruction that caused the failure is jmp and it is to a hard coded address:
60437524 e875260000 call 60439b9e
60437529 e95c020000 jmp 6043778a
6043752e 817f042d010000 cmp dword ptr [edi+0x4],0x12d"
Is there any easy way to find what is loaded there?
| Reporter | ||
Comment 14•23 years ago
Additional Comment #13 should have said, "I don't see any similar crashes in the
bug reports. Does anyone know of one?"
The event log showed nothing near the time of the crash.
There may be bad drivers. There are many unsigned drivers. I can try another system.
The problem is reproducible on this system. If I open 20 instances of Mozilla,
each with 5 to 20 tabs, there is (virtual memory) disk access that seems
unreasonable and dysfunctional. The system becomes less responsive to keyboard
input.
The jmp instruction target is Hex 6043778a. This is decimal 1,615,034,250. There
is 256 MB in the system.
Comment 15•23 years ago
The address is just an address and no indication of how much memory may be being
used.
Stats at time of crash:
Virtual memory: 215,834,624
Working Set: 110,997,504
Peak Working Set: 120,541,184
Also note, virtual memory is not a great statistic, as you can allocate 1gig on
a system with 128megs and 128meg swap file and it will return just fine. Address
space is not the same as memory usage.
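To make the distinction concrete, here is a short Python sketch that compares the jmp target address with the figures quoted in this bug. It treats the reporter's "256 MB" as 256 MiB, which is an assumption, and the variable names are illustrative:

```python
MIB = 1024 * 1024

jmp_target = int("6043778a", 16)  # address from the Talkback report
physical_ram = 256 * MIB          # reporter's "256 MB", assumed to mean MiB

# Process statistics quoted above, in bytes.
stats = {
    "Virtual memory": 215_834_624,
    "Working set": 110_997_504,
    "Peak working set": 120_541_184,
}


def to_mib(n_bytes):
    """Convert a byte count to MiB."""
    return n_bytes / MIB


if __name__ == "__main__":
    print(f"jmp target as decimal: {jmp_target:,}")
    print(f"target is numerically beyond installed RAM: {jmp_target > physical_ram}")
    for name, value in stats.items():
        print(f"{name}: {to_mib(value):.1f} MiB")
```

The address is a position in the process's virtual address space, so it can be numerically far larger than the installed RAM while the process's actual footprint stays well below it.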
If Talkback is to be believed, the failure occurred on the jump, and that means
that for whatever reason the CPU wasn't able to jump to that location. It's not
impossible that Talkback didn't record the data properly. The other
incidents listed no longer exist in the Talkback database.
Comment 16•23 years ago
One last thing, is the spellchk.dll supposed to work with Mozilla 1.0.2? I
thought it was only out for 1.3a or something like that. Still seems strange
that if this was the cause it would crash at a jmp instruction.
Whiteboard: TB16361837M
| Reporter | ||
Comment 17•23 years ago
If the Jmp instruction is real, it is a location in virtual memory.
There is the possibility that the Jmp instruction is just data being interpreted
as instructions.
I am preparing to do a thorough test of Mozilla 1.2.1 on a machine with a clean
installation of Windows XP and Mozilla. If the problem does not occur on the new
machine, it seems reasonable to believe that it is caused by a faulty driver.
Certainly I stress Mozilla considerably. I often open 10 or more instances of
Mozilla, each with 5 or more tabs. The crash only occurs under these conditions,
and during hurried input.
Doesn't TalkBack store the crash information on the user's computer? I am
surprised that TalkBack would throw information away. (Programmers have
difficulty loving themselves, it seems.)
Comment 18•23 years ago
>If the Jmp instruction is real, it is a location in virtual memory.
>There is the possibility that the Jmp instruction is just data being
>interpreted as instructions.
I've seen this, where a function returns to some erroneous place because the
stack was trashed. I dismissed that possibility because the assembler for
several instructions before and after seemed reasonable.
I'm posting the info from Talkback in case anyone else might have ideas. The
stack isn't all that useful, which is why I didn't post it originally, but the
other stuff might spark some ideas.
Comment 19•23 years ago
There appears to be an updated driver at:
http://www.promise.com/support/download/download2_eng.asp?productId=8&category=All&os=100
Reporter could you please install it? If you already have installed it, could
you please reinstall your driver?
This smells like a driver issue.
| Reporter | ||
Comment 20•23 years ago
To Andrew Hagen:
The system had the latest Promise Technology driver installed at the time of the
crash. Re-installing the driver made no difference in the file date.
I will do a new and thorough test using a clean install of Windows XP on another
machine, using the latest driver.
It seems reasonable to guess that this is a bug in the Windows XP virtual
memory, but that is only one of several plausible guesses.
| Reporter | ||
Comment 21•23 years ago
Mozilla Bugzilla
Bugzilla Bug 189778
http://bugzilla.mozilla.org/show_bug.cgi?id=189778
I seem to have found the problem. Thanks to Gary Hegan on the
microsoft.public.windowsxp.general newsgroup, I found that the Intel Application
Accelerator software was failing. I un-installed this software that is supposed
to be used with Intel motherboards, and the problem disappeared.
The problem caused hard disk channel parity errors that were listed in Windows
XP's System Event Viewer. These caused Mozilla to crash, but no other obvious
problems:
1) A parity error was detected on \Device\Ide\IdeChnDr0.
2) An error was detected on device \Device\Harddisk0\D during a paging
operation.
At first, I did not think to look in the System Event log, because I saw no
other problems than those with Mozilla.
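For anyone triaging a similar crash, a quick filter over exported System log messages can surface entries like the two above. The following Python sketch assumes the messages have already been exported from Event Viewer as plain strings. The function name and the keyword list are illustrative and are not part of any Windows API:

```python
# Substrings taken from the two error messages quoted above (lower-cased).
DISK_ERROR_HINTS = ("parity error", "paging operation")


def find_disk_errors(messages):
    """Return the messages that look like disk or paging errors."""
    return [
        message
        for message in messages
        if any(hint in message.lower() for hint in DISK_ERROR_HINTS)
    ]


if __name__ == "__main__":
    sample = [
        r"A parity error was detected on \Device\Ide\IdeChnDr0.",
        r"An error was detected on device \Device\Harddisk0\D during a paging operation.",
        "The Event log service was started.",  # hypothetical unrelated entry
    ]
    for message in find_disk_errors(sample):
        print(message)
```

A match only says that the log deserves a closer look near the time of the crash; it does not identify which driver is at fault.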
The Intel Application Accelerator is MUCH worse than Intel says. I found that
Intel technical support is very poorly trained. They have very little knowledge
of major issues, not just this one.
Uninstalling Intel Application Accelerator fixed the problem, and seems to have
had NO bad effects. The system is faster than before. The normal drivers seem to
be fine. There is no need for an "Application Accelerator", which is a
bad name for a hard disk driver.
The Intel Application Accelerator is VERY trouble-prone:
Intel(R) Application Accelerator - Top Technical Issues
http://www.intel.com/support/chipsets/iaa/
Intel Application Accelerator Known Compatibility Issues
http://www.intel.com/support/chipsets/iaa/compat.htm
ftp://download.intel.com/support/chipsets/iaa/iaa_compat2.pdf
If you move a hard drive to another computer, there are problems:
http://www.intel.com/support/chipsets/iaa/harddrive.htm
(Note broken link.)
If you upgrade to Windows XP, there are problems:
http://www.intel.com/support/chipsets/iaa/xptable.htm
http://www.intel.com/support/chipsets/iaa/tti001.htm
This web page:
http://www.intel.com/support/chipsets/iaa/ident.htm
has only one purpose: To link to this web page:
http://www.intel.com/support/chipsets/iaa/reasons.htm
Humorous Microsoft Support article:
Device Settings Are Hard to Find in Windows XP:
http://support.microsoft.com/default.aspx?scid=kb;en-us;Q310751
The Intel Application Accelerator won't work with some Intel motherboards:
http://support.intel.com/support/chipsets/iaa/sb/CS-001410-prd663.htm
The Intel Application Accelerator won't work with some Intel motherboards:
http://support.intel.com/support/chipsets/iaa/sb/CS-001448-prd663.htm
Windows XP System Event Viewer:
start C:\WINDOWS\SYSTEM32\eventvwr.msc
__________________
microsoft.public.windowsxp.general
Re: Hard Disk: Parity errors, unusual Disk activity
Subject: Intel Application Accelerator failure
Gary,
You are correct.
Wow, I lost 20 hours to Intel being flaky.
I NEVER would have found this without your help.
Michael
________________________________________________________________________________
Gary Hegan wrote:
> It is an error caused by Intel's Application Accelerator being installed. I
> had the same problem. I removed it and no longer get the warnings.
>
> --
> Regards
> (-: Gary Hegan :-)
>
>
>
>
> Michael Jennings wrote in message news: e4KMEI8xCHA.1420@TK2MSFTNGP12...
>
>>I have been receiving these messages in System Event Viewer:
>>
>>A parity error was detected on \Device\Ide\IdeChnDr0.
>>
>>An error was detected on device \Device\Harddisk0\D during a paging operation.
>>
>>This is on an Intel D815EEA2 motherboard. This is the second D815EEA2
>>motherboard that has failed with IDE controller problems in the last two
>>months. It seems very unlikely that two identical motherboards would fail in
>>an identical way, after years of error-free use.
>>
>>The hard disks are Western Digital WD400BB, 40 Gigabytes. W.D. Diagnostics
>>show they are error free.
>>
>>Question #1: Is this a real error, or an error in Windows XP?
>>
>>Question #2: Is this error specific to drive D? Is that what the second error
>>message says?
>>
>>_______________________________________
>>
>>System Event Viewer:
>>start C:\WINDOWS\SYSTEM32\eventvwr.msc
Status: UNCONFIRMED → RESOLVED
Closed: 23 years ago
Resolution: --- → WORKSFORME
Comment 22•23 years ago
That's good news. Thank you for posting that informative follow-up.
| Comment hidden (typo) |
Updated•1 year ago
|
Attachment #9402764 -
Attachment is obsolete: true