Closed
Bug 494671
Opened 16 years ago
Closed 15 years ago
[SeaMonkey-Ports, MacOSX] cb-seamonkey-osx-*: "mochitest-plain: T-FAIL CRASH L-FAIL", possibly related to 'libSystem.B.dylib'
Categories: SeaMonkey :: Release Engineering, defect
Tracking: Not tracked
Status: RESOLVED DUPLICATE of bug 528817
People: Reporter: sgautherie, Unassigned
Details: Keywords: crash, Whiteboard: [Needs a Parallels fix]
These boxes are always orange, with either this bug or a more random(!?) "T-FAIL + timeout";
this bug is about the CRASH case, which I may have found a "pattern" for.
***
The crashed thread's stack varies in length,
but it always ends at the same instruction(s), albeit with different register values:
{
Operating system: Mac OS X
10.5.6 9G71
CPU: x86
GenuineIntel family 6 model 7 stepping 6
2 CPUs
Crash reason: EXC_ARITHMETIC / EXC_I386_DIV
Crash address: 0xffff0315
Thread 0 (crashed)
0 0xffff0315
eip = 0xffff0315 esp = 0xbfffd0e0 ebp = 0xbfffd0e8 ebx = 0x4a191a57
esi = 0x00006877 edi = 0x00000000 eax = 0xfffd9563 ecx = 0x3b9aca00
edx = 0xffffffff efl = 0x00210246
1 libSystem.B.dylib + 0x29e78
eip = 0x91635e79 esp = 0xbfffd0f0 ebp = 0xbfffd128
[...]
}
Maybe these boxes have a bad "libSystem.B.dylib" or the like?
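Editorial sketch of what the register dump above would mean for a divide error. The dump does not show the faulting instruction, so this assumes (hypothetically) an unsigned 32-bit `div` with `ecx` as the divisor and `edx:eax` as the 64-bit dividend. On x86 such a `div` faults when the divisor is zero or the quotient does not fit in 32 bits. The helper name `div32_faults` is made up for illustration.

```python
MASK32 = 0xFFFFFFFF

def div32_faults(edx: int, eax: int, ecx: int) -> bool:
    """Return True if an unsigned x86 `div ecx` would raise a divide error.

    `div r/m32` divides the 64-bit value EDX:EAX by the operand and faults
    when the divisor is zero or the quotient does not fit in 32 bits.
    """
    if ecx == 0:
        return True
    dividend = ((edx & MASK32) << 32) | (eax & MASK32)
    return dividend // ecx > MASK32

# Register values copied from the Thread 0 dump in comment 0.
# ASSUMPTION: the instruction at 0xffff0315 is an unsigned `div ecx`.
ecx = 0x3b9aca00
print(ecx == 10**9)  # the divisor equals one billion
print(div32_faults(edx=0xffffffff, eax=0xfffd9563, ecx=ecx))
```

Under that assumption the values are consistent with EXC_I386_DIV: the divisor is 1,000,000,000 and the high dword is 0xffffffff, so the quotient overflows 32 bits. If the instruction were instead a signed `idiv`, the same registers would describe a small negative dividend and would not overflow, so this reading is only a hypothesis.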
Comment 1 • 16 years ago (Reporter)
(In reply to comment #0)
> Maybe these boxes have a bad "libSystem.B.dylib" or the like?
All but one of the other threads (whose number varies too) are also running 'libSystem.B.dylib'...
The remaining thread (running code) seems to be more or less random:
{
(I didn't look at older builds...)
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1242728092.1242735151.31723.gz&fulltext=1
OS X 10.5 comm-1.9.1 unit test on 2009/05/19 03:14:52
0 libgklayout.dylib!nsLineLayout::CombineTextDecorations(nsPresContext*, unsigned char, nsIFrame*, nsRect&, int, float) [nsLineLayout.cpp:e49c05fc9122 : 98 + 0x1]
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1242757522.1242764852.14409.gz&fulltext=1
OS X 10.5 comm-1.9.1 unit test on 2009/05/19 11:25:22
0 libgklayout.dylib!nsTextFrameUtils::TransformText(unsigned char const*, unsigned int, unsigned char*, nsTextFrameUtils::CompressionMode, unsigned char*, gfxSkipCharsBuilder*, unsigned int*) [nsTextFrameUtils.cpp:82d4f0cd8238 : 211 + 0x0]
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1242950367.1242956368.31389.gz
OS X 10.5 comm-1.9.1 unit test on 2009/05/21 16:59:27
0 libxpconnect.dylib!WrappedNativeJSGCThingTracer [xpcwrappednativescope.cpp:e8fc03d9a29e : 356 + 0x4]
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1243116825.1243122991.1031.gz
OS X 10.5 comm-1.9.1 unit test on 2009/05/23 15:13:45
0 libxpcom_core.dylib!NS_LogDtor_P [nsTraceRefcntImpl.cpp:a7eb03446bed : 274 + 0x3b]
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1243127633.1243131058.12166.gz
OS X 10.5 comm-1.9.1 unit test on 2009/05/23 18:13:53
(no "other" thread)
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1243155596.1243159462.20067.gz&fulltext=1
OS X 10.5 comm-1.9.1 unit test on 2009/05/24 01:59:56
0 libgklayout.dylib!oggz_get_stream [oggz.c:a7eb03446bed : 328 + 0xe]
}
Summary: [SeaMonkey-Ports, MacOSX] cb-seamonkey-osx-*: "mochitest-plain: T-FAIL CRASH L-FAIL" → [SeaMonkey-Ports, MacOSX] cb-seamonkey-osx-*: "mochitest-plain: T-FAIL CRASH L-FAIL", possibly related to 'libSystem.B.dylib'
Comment 2 • 16 years ago (Reporter)
Oh, I forgot the crashing test, in the same order:
*** 43138 INFO Running /tests/layout/base/tests/test_bug441782-2e.html...
*** 43305 INFO Running /tests/layout/base/tests/test_bug441782-1e.html...
*** 43554 INFO Running /tests/layout/base/tests/test_bug441782-5b.html...
*** 43533 INFO Running /tests/layout/base/tests/test_bug441782-2c.html...
*** 13110 INFO Running
/tests/content/canvas/test/test_2d.drawImage.9arg.sourcesize.html...
*** 27974 INFO TEST-PASS | /tests/content/media/video/test/test_timeupdate1.html | Check currentTime of 0.7329999804496765 is greater than last time of 0.6990000009536743
(It looked like we would have a culprit, but not anymore.)
Comment 3 • 16 years ago (Reporter)
This time it crashed on the "other" thread:
{
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1243197733.1243203288.28479.gz&fulltext=1
OS X 10.5 comm-1.9.1 unit test on 2009/05/24 13:42:13
Crash reason: EXC_BAD_ACCESS / KERN_PROTECTION_FAILURE
Crash address: 0x165cb4
Thread 0 (crashed)
0 libmozjs.dylib!JS_CallTracer [jsgc.cpp:e376af3a7490 : 1130 + 0x0]
}
Comment 4 • 16 years ago
Even the hangs we're seeing frequently seem to almost always be in video tests somewhere. I wonder what it means that both hangs and crashes are at different places but almost always in video mochitests.
Comment 5 • 16 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1243403266.1243407750.548.gz&fulltext=1 has:
Crash reason: EXC_ARITHMETIC / EXC_I386_DIV
Crash address: 0xffff0315
Thread 2 (crashed)
0 0xffff0315
eip = 0xffff0315 esp = 0xb0206dd0 ebp = 0xb0206dd8 ebx = 0x4a1ce42b
esi = 0x0000926b edi = 0x00000000 eax = 0xfa99b621 ecx = 0x3b9aca00
edx = 0xffffffff efl = 0x00010246
1 libSystem.B.dylib + 0x29e78
eip = 0x90b7ee79 esp = 0xb0206de0 ebp = 0xb0206e18
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1243424728.1243430169.9162.gz&fulltext=1 has:
Crash reason: EXC_ARITHMETIC / EXC_I386_DIV
Crash address: 0xffff0315
Thread 22 (crashed)
0 0xffff0315
eip = 0xffff0315 esp = 0xb1b9cd90 ebp = 0xb1b9cd98 ebx = 0x4a1d3bc6
esi = 0x000078f3 edi = 0x00000000 eax = 0xeba436bf ecx = 0x3b9aca00
edx = 0xffffffff efl = 0x00010246
1 libSystem.B.dylib + 0x29e78
eip = 0x91635e79 esp = 0xb1b9cda0 ebp = 0xb1b9cdd8
I still suspect this to be largely a Parallels virtualization issue.
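Editorial sketch: one way to check that two stack dumps share a signature is to compare the crash reason, the crash address, and the module offset of frame 1. The function name `crash_signature` and the choice of these three fields are assumptions for illustration, not a tool used in this bug. The sample text is trimmed from the two dumps in this comment.

```python
import re

def crash_signature(report: str):
    """Extract (reason, crash address, frame-1 module offset) from a stack dump."""
    reason = re.search(r"Crash reason:\s*(.+)", report)
    address = re.search(r"Crash address:\s*(0x[0-9a-fA-F]+)", report)
    # Frame 1 looks like "1 libSystem.B.dylib + 0x29e78".
    frame1 = re.search(r"^\s*1\s+(\S+ \+ 0x[0-9a-fA-F]+)", report, re.MULTILINE)
    return (
        reason.group(1).strip() if reason else None,
        address.group(1).lower() if address else None,
        frame1.group(1) if frame1 else None,
    )

SAMPLE_A = """Crash reason: EXC_ARITHMETIC / EXC_I386_DIV
Crash address: 0xffff0315
Thread 2 (crashed)
0 0xffff0315
1 libSystem.B.dylib + 0x29e78
"""

SAMPLE_B = """Crash reason: EXC_ARITHMETIC / EXC_I386_DIV
Crash address: 0xffff0315
Thread 22 (crashed)
0 0xffff0315
1 libSystem.B.dylib + 0x29e78
"""

print(crash_signature(SAMPLE_A) == crash_signature(SAMPLE_B))
```

The thread number and register values are deliberately left out of the signature, since they differ between otherwise identical crashes.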
Comment 6 • 16 years ago (Reporter)
(In reply to comment #5)
> I still suspect this to be largely a Parallels virtualization issue.
To help figure out what the situation is with this bug (and, more globally, with these boxes), I suggest disabling mochitest-plain for a while...
Comment 7 • 16 years ago
So you're crashing inside gettimeofday, which sucks:
http://mxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/md/unix/unix.c#3020
Looking at the symbols we have on Socorro, the 10.5.6 symbols in libSystem.B.dylib.sym confirm that:
PUBLIC 29e47 0 gettimeofday
I can only think this is a Parallels issue or an OS X issue exposed by running on Parallels.
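Editorial sketch of the lookup described above: given the `PUBLIC` records of a Breakpad symbol file, a module offset resolves to the nearest record at or below it. Only the `gettimeofday` record comes from this bug; the two neighbouring records and the helper names `parse_public_records` and `resolve` are hypothetical, added so the search has something to bracket.

```python
import bisect

def parse_public_records(sym_text: str):
    """Return sorted (address, name) pairs from 'PUBLIC addr param_size name' lines."""
    records = []
    for line in sym_text.splitlines():
        parts = line.split(None, 3)
        if len(parts) == 4 and parts[0] == "PUBLIC":
            records.append((int(parts[1], 16), parts[3]))
    records.sort()
    return records

def resolve(records, offset: int):
    """Map a module offset to (symbol name, offset into symbol), or None."""
    addresses = [addr for addr, _ in records]
    i = bisect.bisect_right(addresses, offset) - 1
    if i < 0:
        return None
    addr, name = records[i]
    return name, offset - addr

SYM_TEXT = """PUBLIC 29d00 0 hypothetical_symbol_before
PUBLIC 29e47 0 gettimeofday
PUBLIC 2a000 0 hypothetical_symbol_after
"""

records = parse_public_records(SYM_TEXT)
name, delta = resolve(records, 0x29e78)  # frame 1: libSystem.B.dylib + 0x29e78
print(name, hex(delta))
```

With the real record, offset 0x29e78 lies 0x31 bytes past the start of `gettimeofday`. Whether it truly belongs to that function also depends on where the next real symbol starts, which is not shown in this bug.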
Comment 8 • 16 years ago
Ted, thanks, that's very good to know, might be just what we need to file a ticket with the Parallels people. Phong, can you look into doing that?
Comment 9 • 16 years ago
After a bit more digging through OSX source:
gettimeofday calls __commpage_gettimeofday:
http://www.opensource.apple.com/source/Libc/Libc-498.1.5/sys/gettimeofday.c
which is a little stub that calls into a fixed address:
http://www.opensource.apple.com/source/Libc/Libc-498.1.5/i386/sys/i386_gettimeofday.s
which is in the "comm page" to talk to the kernel:
http://www.opensource.apple.com/source/xnu/xnu-1228.9.59/osfmk/i386/cpu_capabilities.h
#define _COMM_PAGE_GETTIMEOFDAY (_COMM_PAGE_START_ADDRESS+0x2e0) /* used by gettimeofday() */
_COMM_PAGE_START_ADDRESS is 0xFFFF0000 on i386, so _COMM_PAGE_GETTIMEOFDAY = 0xFFFF02E0. Your crash is at 0xffff0315, and the next address defined in that header isn't until 0x4e0, so it sure looks like you're crashing inside the code at that address.
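Editorial sketch of the address arithmetic above, using only the constants quoted in this comment. The variable names are illustrative; the 0x4e0 bound is the "next address defined in that header" as stated here, not something re-checked against the header.

```python
COMM_PAGE_START_ADDRESS = 0xFFFF0000  # i386 value quoted above
GETTIMEOFDAY_OFFSET = 0x2E0           # _COMM_PAGE_GETTIMEOFDAY offset
NEXT_DEFINED_OFFSET = 0x4E0           # next entry, per the comment above
CRASH_ADDRESS = 0xFFFF0315            # from the crash reports

start = COMM_PAGE_START_ADDRESS + GETTIMEOFDAY_OFFSET
end = COMM_PAGE_START_ADDRESS + NEXT_DEFINED_OFFSET
inside = start <= CRASH_ADDRESS < end

print(hex(start))                  # 0xffff02e0
print(hex(CRASH_ADDRESS - start))  # 0x35
print(inside)                      # True
```

So the crash address sits 0x35 bytes into the region reserved for the comm page gettimeofday routine.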
Comment 10 • 16 years ago
Here's the source for the code that lives at that comm page address:
http://www.opensource.apple.com/source/xnu/xnu-1228.9.59/osfmk/i386/commpage/commpage_gettimeofday.s
My assembler-fu is weak, so I'll leave it at that.
Comment 11 • 16 years ago (Reporter)
I disabled bug 494769 yesterday, to hopefully (help) work around this too, ftb.
There was one occurrence (only/yet) of this since then:
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1243563863.1243574516.25364.gz&fulltext=1
OS X 10.5 comm-1.9.1 unit test on 2009/05/28 19:24:23
{
35093 INFO TEST-PASS | nodeCloneFalseNoCopyTextAssert1
35095 INFO Running /tests/dom/tests/mochitest/dom-level1-core/test_hc_nodeclonegetparentnull.html...
NEXT ERROR TEST-UNEXPECTED-FAIL | (automation.py) | Exited with code 3 during test run
INFO | (automation.py) | Application ran for: 0:24:28.216440
NEXT ERROR TEST-UNEXPECTED-FAIL | (automation.py) | Browser crashed (minidump found)
[...]
Thread 0 (crashed)
0 0xffff0315
1 libSystem.B.dylib + 0x29e78
2 libnspr4.dylib!_PR_UNIX_GetInterval [unix.c:560662a707ba : 3020 + 0x12]
3 libgklayout.dylib!PresShell::ProcessReflowCommands(int) [nsPresShell.cpp:560662a707ba : 6740 + 0x4]
}
with no "other" thread.
Depends on: 494769
Comment 12 • 16 years ago (Reporter)
(In reply to comment #11)
> I disabled bug 494769 yesterday, to hopefully (help) work around this too, ftb.
Scratch that: it seems that bug helped bug 493450, but not this one (at all).
> There was one occurrence (only/yet) of this since then:
Actually, all builds had this failure :-<
Comment 13 • 16 years ago (Reporter)
{
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1243644913.1243649587.1904.gz
OS X 10.5 comm-1.9.1 unit test on 2009/05/29 17:55:13
17374 INFO Running /tests/content/canvas/test/test_size.attributes.style.html...
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1243663806.1243671111.1552.gz
OS X 10.5 comm-1.9.1 unit test on 2009/05/29 23:10:06
43529 INFO Running /tests/layout/base/tests/test_bug441782-3c.html...
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1243677738.1243684848.27313.gz
OS X 10.5 comm-1.9.1 unit test on 2009/05/30 03:02:18
28055 INFO TEST-PASS | /tests/content/media/video/test/test_wav_ended2.html | Expect at least one playing event
}
Let's wait for Parallels to be fixed!
Whiteboard: [Needs future Parallels fix/upgrade]
Comment 14 • 16 years ago
We'll need to watch this for a bit more, but it looks like the Parallels and system upgrades in bug 494462 might have fixed this. I'd like to see a day or so of non-crash data before closing the bug here though.
Comment 15 • 16 years ago (Reporter)
(In reply to comment #14)
Ftr, the last occurrence of this bug was:
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1244638160.1244643373.14097.gz
OS X 10.5 comm-1.9.1 unit test on 2009/06/10 05:49:20
Comment 16 • 16 years ago
Hrm, looks like our old "friend" is still here :(
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1244757887.1244764036.27165.gz&fulltext=1
Operating system: Mac OS X
10.5.7 9J61
CPU: x86
GenuineIntel family 6 model 7 stepping 6
2 CPUs
Crash reason: EXC_ARITHMETIC / EXC_I386_DIV
Crash address: 0xffff0315
Thread 16 (crashed)
0 0xffff0315
eip = 0xffff0315 esp = 0xb1497df0 ebp = 0xb1497df8 ebx = 0x4a319621
esi = 0x0000c463 edi = 0x00000000 eax = 0xfec1fc3f ecx = 0x3b9aca00
edx = 0xffffffff efl = 0x00010246
1 libSystem.B.dylib + 0x29f38
eip = 0x94d63f39 esp = 0xb1497e00 ebp = 0xb1497e38
2 libclient.dylib + 0x11813e
eip = 0x10eb113f esp = 0xb1497e40 ebp = 0xb1497ed8
3 libclient.dylib + 0x5f5ae
eip = 0x10df85af esp = 0xb1497ee0 ebp = 0xb1497f18
4 libclient.dylib + 0x5f0f9
eip = 0x10df80fa esp = 0xb1497f20 ebp = 0xb1497f98
5 libclient.dylib + 0x2c4934
eip = 0x1105d935 esp = 0xb1497fa0 ebp = 0xb1497fc8
6 libSystem.B.dylib + 0x7d1ff
eip = 0x94db7200 esp = 0xb1497fd0 ebp = 0xb1497fe8
Comment 17 • 16 years ago (Reporter)
(In reply to comment #8)
> might be just what we need to file a ticket with the Parallels people.
> Phong, can you look into doing that?
Did you?
No longer depends on: 494462
Whiteboard: [Needs a Parallels fix]
Target Milestone: seamonkey2.0b1 → ---
Comment 18 • 16 years ago
Serge: Please let me drive this, I tend to communicate with people outside the bugs as well, so such a question here might be redundant and result in just noise. Phong and I agreed to wait for the updates we just did this week before actually filing this, as we had some hope they may have fixed this.
Meanwhile, we had more crashes with the same signature as comment #16:
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1244777538.1244787463.2153.gz&fulltext=1
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1244784829.1244791814.10506.gz&fulltext=1
This appears to just be the 10.5.7 version of what ted investigated with 10.5.6 before.
Phong, based on that, can you please file a ticket with Parallels on this issue as well?
Comment 19 • 16 years ago
I have an open ticket with Parallels about the issue.
Comment 20 • 16 years ago (Reporter)
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1244829896.1244835872.12801.gz
OS X 10.5 comm-1.9.1 unit test on 2009/06/12 11:04:56
Fwiw, it's not often but the following extra 'Invalid memory access' log is (still) happening too:
{
[...]
43744 INFO Running /tests/layout/base/tests/test_bug467672-3f.html...
2009-06-12 12:37:22.168 seamonkey-bin[84275:11503] Invalid memory access of location 00000000 eip=ffff0315
}
Comment 21 • 16 years ago
The crashes continue to happen, though less frequently now that the VMs have been reduced to one CPU, and they still have the same signature:
http://tinderbox.mozilla.org/showlog.cgi?tree=SeaMonkey-Ports/1246265696.1246270857.4697.gz&fulltext=1#err1
http://tinderbox.mozilla.org/showlog.cgi?tree=SeaMonkey-Ports/1246155324.1246159961.31658.gz&fulltext=1#err1
http://tinderbox.mozilla.org/showlog.cgi?tree=SeaMonkey-Ports/logfile=1246085376.1246091178.7012.gz&fulltext=1#err1
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1245851552.1245858248.20089.gz&fulltext=1#err10
Comment 22 • 16 years ago (Reporter)
(In reply to comment #21)
> The crashes continue to happen, though less frequently now that the
> VMs have been reduced to one CPU, and they still have the
> same signature:
Fwiw, note that some of these crashed at a _different_ location...
Comment 23 • 16 years ago
(In reply to comment #22)
> Fwiw, note that some of these crashed at a _different_ location...
Not really: we have moved from 10.5.6 to 10.5.7, so the location looks different, but I'm pretty sure it's still the same location in terms of code; it just moved in the binary with the patched kernel.
Comment 24 • 15 years ago
Looks like this is more than just Seamonkey?
OS X 10.5.2 mozilla-central test opt mochitests on 2009/10/01 09:54:21
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1254416061.1254416350.18354.gz&fulltext=1#err2
Comment 25 • 15 years ago (Reporter)
(In reply to comment #24)
> Looks like this is more than just Seamonkey?
Supposedly unlikely:
Firefox VMs don't run on Parallels, do they?
> OS X 10.5.2 mozilla-central test opt mochitests on 2009/10/01 09:54:21
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1254416061.1254416350.18354.gz&fulltext=1#err2
And not the same crash:
{
Crash reason: EXC_BAD_ACCESS / KERN_PROTECTION_FAILURE
Crash address: 0xc
Thread 31 (crashed)
0 XUL + 0x6c9db0
}
Comment 26 • 15 years ago (Reporter)
Comment 27 • 15 years ago
Looks like bug 528817 fixed this. As long as bug 537308 does not show any regression, we can consider this one fixed.
Comment 28 • 15 years ago
Dupe of bug 528817, but in the meantime we abandoned Parallels for OSX VMs completely.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → DUPLICATE
Updated • 15 years ago
Component: Project Organization → Release Engineering
Updated • 15 years ago
QA Contact: organization → release