memory corruption crash: bus error whilst parsing property




18 years ago
17 years ago


(Reporter: calum.mackay, Assigned: dbaron)

18 years ago
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; SunOS sun4u; en-US; rv:0.9.3+) Gecko/20010904
BuildID:    2001090410

app crashed shortly after starting; not sure which web page was
being viewed as had several browser instances running.

app was being run under watchmalloc

t@1 (l@1) terminated by signal BUS (invalid address alignment)
(dbx) where                                                                  
current thread: t@1
=>[1] realfree(0xd98400, 0xd983f8, 0xd98300, 0xff394000, 0xd982f8, 0x101), at
  [2] malloc_unlocked(0xff39432c, 0xd7fe78, 0xff394000, 0x40, 0xd982b0, 0x0), at
  [3] malloc(0x3c, 0xffbee418, 0x33, 0x0, 0x0, 0x0), at 0xff380e64
  [4] __builtin_new(0x3c, 0xfd614a2c, 0x51, 0xa0041, 0xfd7bace0, 0xff1d7e70), at
  [5] AppendValue__18CSSDeclarationImpl13nsCSSPropertyRC10nsCSSValue(0xd7fe28,
0x50, 0xffbee1a8, 0xfd612d94, 0x2e42c, 0x0), at 0xfd612de0
0xd7fe28, 0x50, 0xffbee1a8, 0xffbee328, 0x0), at 0xfd628b9c
0xffbee418, 0xd7fe28, 0x50, 0xffbee328, 0xff18e914), at 0xfd62919c
  [8] ParseDeclaration__13CSSParserImplRiP17nsICSSDeclarationiT1(0xdc0320,
0xffbee418, 0xd7fe28, 0x1, 0xffbee328, 0xffbee32c), at 0xfd627660
  [9] ParseDeclarationBlock__13CSSParserImplRii(0xdc0320, 0xffbee418, 0x1,
0xfd632104, 0xd84250, 0xff1a1574), at 0xfd626f6c
  [10] ParseRuleSet__13CSSParserImplRi(0xdc0320, 0xffbee418, 0x1, 0xfd63b7c4,
0x3, 0xd842d0), at 0xfd625450
0xdc0328, 0x3, 0xffbee52c, 0xfd623cfc, 0xff0000), at 0xfd623e6c
0xd84258, 0xd62ba0, 0xffbee530, 0xffbee52c, 0x1a979c), at 0xfd620e94
0x84c330, 0xd842b8, 0xd62ba0, 0x0, 0xfe98994c), at 0xfd6210a0
0x84c330, 0x0, 0x0, 0xffbee5c8, 0xdbe760), at 0xfd620548
  [15] OnStopRequest__14nsStreamLoaderP10nsIRequestP11nsISupportsUi(0x84c330,
0xcaad58, 0x0, 0x0, 0xfdeeae88, 0x59), at 0xfdeeaec8
0xcaad58, 0x0, 0x0, 0xfdeea41c, 0xb00), at 0xfdeea460
  [17] OnStopRequest__13nsHttpChannelP10nsIRequestP11nsISupportsUi(0xcaad58,
0xcaad7c, 0x0, 0x0, 0xfdf17fc8, 0xb00), at 0xfdf18098
  [18] HandleEvent__20nsOnStopRequestEvent(0xda77c8, 0xfdf34efc, 0x109158, 0x1,
0x4d0, 0xa7c), at 0xfdf34f90
  [19] HandlePLEvent__23nsARequestObserverEventP7PLEvent(0xda77c8, 0xfdeda924,
0x4d0, 0xa7c, 0x4d0, 0xa7c), at 0xfdeda940
  [20] PL_HandleEvent(0xda77c8, 0x30fc0, 0x0, 0xff394000, 0x0, 0x0), at 0xff1a449c
  [21] PL_ProcessPendingEvents(0x109118, 0x116c4, 0x0, 0x0, 0x0, 0x0), at 0xff1a43cc
  [22] ProcessPendingEvents__16nsEventQueueImpl(0x10fa98, 0xff1a5480,
0xdeadbeef, 0xdeadbeef, 0x3, 0xd65df0), at 0xff1a54b0
  [23] 0xfdcb2d38(0x10fa98, 0x5, 0x1, 0xfdcb2d1c, 0xd65dd0, 0x19), at 0xfdcb2d37
  [24] 0xfdcb29ec(0x16eaa8, 0x1, 0x2f55a0, 0xfdcb29c8, 0x1, 0x0), at 0xfdcb29eb
  [25] 0xfee2429c(0x125c60, 0xffbeec30, 0x2f55a0, 0x0, 0x0, 0x0), at 0xfee2429b
  [26] 0xfee25d08(0x46c, 0x46c, 0x4d0, 0xa7c, 0x4d0, 0xa7c), at 0xfee25d07
  [27] 0xfee265a8(0xfee4c594, 0xfee4c500, 0x4d0, 0xa7c, 0x4d0, 0xa7c), at 0xfee265a7
  [28] g_main_run(0x17fbe0, 0xff09a894, 0x12a610, 0x0, 0x0, 0xfe9b0b6c), at
  [29] gtk_main(0x2c00, 0xfdcb30a4, 0x13a070, 0xfe94f120, 0xfe8c0dd4, 0x0), at
  [30] Run__10nsAppShell(0x157f28, 0xfdcb329c, 0x157f28, 0xfe94ee50, 0x0,
0xffe), at 0xfdcb32d4
  [31] Run__17nsAppShellService(0x13c058, 0xfe949688, 0x13c058, 0xfe94a554,
0xffbeed98, 0xff219640), at 0xfe94969c
  [32] 0x188a8(0x0, 0xffbef054, 0x0, 0x5, 0x100d4, 0x0), at 0x188a7
  [33] main(0x0, 0xffbef054, 0xffbef05c, 0x2ed48, 0x0, 0x0), at 0x19318

Reproducible: Couldn't Reproduce

Comment 1

18 years ago
I don't think there's much I can do about this without a Purify trace showing
where the corruption occurred, rather than a stack trace showing where the crash
occurred later...

Comment 2

18 years ago
I suppose not; I am running under Solaris' watchmalloc, which makes
crashes occur sooner rather than later, but I don't have access to
purify. The watchmalloc library does have a WATCH facility, but this
makes things around 100 times slower and it was too painful: after
several minutes I didn't even have a window up.

This is a shame, since I'm seeing about a dozen crashes a day with
mozilla with the recent builds (not all in the same area of course);
are you saying there's no point me even logging the bugs?

It's getting to the point where I shall have to give up testing
mozilla, since it's becoming unusable; that's a shame too since I think
it's an excellent tool.

Comment 3

18 years ago
I'm surprised you're seeing so many crashes.  I wonder if your GDK is compiled
so that g_malloc and g_free aren't compatible with malloc and free (which it is
in a debugging mode).  Someone else reported problems due to that (see bug
95599), on Solaris, and it leads to attempts to free unallocated memory that
aren't generally fatal, but I suppose they could be with watchmalloc.
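
To illustrate the kind of mismatch described above, here is a minimal sketch in C. It does not use GLib: `dbg_malloc`, `dbg_free`, `dbg_size`, and `dbg_offset` are hypothetical stand-ins for a debug-mode `g_malloc`/`g_free` pair that keeps a small header in front of each block. Under that assumption, the pointer handed to the caller is not the pointer the system `malloc` returned, so passing it to plain `free` would be an attempt to free memory that `malloc` never handed out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header a debugging allocator might keep in front of each
   block. The union pads the header so user data stays suitably aligned. */
typedef union {
    size_t size;          /* bytes requested by the caller */
    max_align_t align_;   /* alignment padding only, never read */
} dbg_header;

/* Stand-in for a debug-mode g_malloc: returns a pointer just past the header. */
void *dbg_malloc(size_t n)
{
    dbg_header *h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->size = n;
    return h + 1;
}

/* Read the requested size back out of the header. */
size_t dbg_size(const void *p)
{
    assert(p != NULL);
    return ((const dbg_header *)p - 1)->size;
}

/* Stand-in for a debug-mode g_free: steps back to the real block first. */
void dbg_free(void *p)
{
    if (p != NULL)
        free((dbg_header *)p - 1);
}

/* Distance between the caller's pointer and the block malloc() returned.
   Because it is nonzero, free(p) on a dbg_malloc() pointer is invalid. */
size_t dbg_offset(void)
{
    return sizeof(dbg_header);
}
```

The point of the sketch is only the pairing rule: memory from `dbg_malloc` must go back through `dbg_free`, and memory from `malloc` through `free`. Mixing the two usually goes unnoticed with a forgiving allocator, but a strict one can abort on it.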

Comment 4

18 years ago
That's very interesting; I'm running with an old 0.5.1 GDK 1.2 from
Netscape's 6.0 release. I have a locally built 0.9.1 GDK but I have a
few other problems with that (which I will look into separately).

Do we know if the ns GDK shipped with 6.0 would show this problem if
used with mozilla?

I will also turn off watchmalloc and see if that makes a difference; I
had originally turned it on since I was seeing crashes, but perhaps it
has made things more noticeable.

Comment 5

18 years ago
Marking NEW.
Severity: normal → critical
Ever confirmed: true
Keywords: crash

Comment 6

18 years ago
I don't know anything about the GDK shipped with the N6.0 release, and I'm not
sure who would.

Comment 7

18 years ago
watchmalloc() is _very_ picky about memory stuff. One wrong step and it kills
your process. That's its job.

If an application has problems with watchmalloc then it is (AFAIK) the
application's fault ...

Comment 8

18 years ago
However, without Purify output showing the problem, it's nearly impossible to
tell what the problem is.

Comment 9

18 years ago
I am not sure whether Purify is really able to catch everything that can be
caught using watchmalloc ... ;-/

Comment 10

18 years ago
Well, without steps to reproduce and without a stack pointing to a problem, it's
hard to even do that.  If only watchmalloc shows the problem, then maybe
debugging using MALLOC_DEBUG=WATCH in the environment would be helpful?
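
For reference, a minimal sketch of a launcher that sets up the environment this suggestion relies on. It assumes the usual Solaris arrangement in which watchmalloc is interposed with `LD_PRELOAD=watchmalloc.so.1` and configured through `MALLOC_DEBUG`. The helper names `enable_watchmalloc` and `run_watched` are hypothetical, and nothing here is taken from the Mozilla tree.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Set the two environment variables watchmalloc is driven by.
   Returns 0 on success, -1 if either setenv() call fails. */
int enable_watchmalloc(const char *options)
{
    assert(options != NULL);
    if (setenv("LD_PRELOAD", "watchmalloc.so.1", 1) != 0)
        return -1;
    return setenv("MALLOC_DEBUG", options, 1);
}

/* Replace the current process with the target program, run under
   MALLOC_DEBUG=WATCH. Only returns (with -1) if something failed. */
int run_watched(char *const argv[])
{
    if (enable_watchmalloc("WATCH") != 0)
        return -1;
    execvp(argv[0], argv);
    return -1;
}
```

A caller would pass a NULL-terminated argument vector such as `{"./mozilla", NULL}` to `run_watched`; the path is only an example.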

Comment 11

18 years ago
Yeah, but using WATCH just makes it unusable; couldn't even get a window up
after several minutes waiting.

Still, I'm now using the real GDK and not using watchmalloc (since there's
no real point if we can't track things down from the stack trace alone),
and it's much more stable.

So perhaps not worth wasting any more of your time on this one; it can serve
as a placeholder in case others see it.

Comment 12

18 years ago
> Yeah, but using WATCH just makes it unusable; couldn't even get a window up
> after several minutes waiting.

Uhm... did you read the manual page? The option "WATCH" adds r/w "hooks" to
_every_ allocated page, which causes a slowdown by a factor of 400 or more.
This means that if you normally wait 20 s to see a window, the option WATCH will
increase this time to 20 s * 400 = 8000 s, or about 2.2222 h ...

Debugging with option WATCH is painful but would give exact stack traces where
the problem occurred (as it catches write and _read_ faults) ...
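
The estimate above is easy to check. Here is a small sketch of the arithmetic in C, where `slowed_hours` is a hypothetical helper and the factor of 400 is simply the figure quoted in this comment, not a measurement:

```c
#include <assert.h>

/* Estimated wall-clock hours for a task that normally takes base_seconds
   when everything runs `factor` times slower. */
double slowed_hours(double base_seconds, double factor)
{
    assert(base_seconds >= 0.0 && factor > 0.0);
    return base_seconds * factor / 3600.0;
}
```

With a 20 second start-up and a factor of 400 this gives 8000 seconds, a little over 2.2 hours. With the factor of about 100 mentioned earlier in this bug, it would still be over half an hour.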

Comment 13

18 years ago
Yes, I do know how WATCH works :)

It's more than painful, it's impossible, since I have no way of
reproducing this problem on demand.

Running with watchmalloc enabled causes about a half-dozen crashes per
day, all different.  I was dutifully logging bugs on the new ones, but
have now decided there's no point, so I've turned off watchmalloc.

I can't have mozilla running 400 times slower all day, just to catch

Comment 14

18 years ago
It is recommended to use the WATCH option on a machine with more than one CPU
and "bind" Mozilla to one CPU (see pbind(1M)). Option "WATCH" usually spams the
MMU with tons of rubbish - getting one "free" CPU for remaining OS tasks should
speed up the debugging process a _lot_.

Comment 15

18 years ago
Yep, I know; not practical for us, where we're on SunRay clients of a big
SunFire server; plus, as I said, I can't reproduce *this* problem; it
only happened once.

And again, I'm not trying to debug this problem, just report it. But there's
probably no point, since no-one else can do much with it.
I'm just trying to help with mozilla development by being a test user, not
a developer. If I were a developer, I would be running it under Purify all the
time, but that's another story.

I'm going to mark this bug as WONTFIX since I don't think there's any
information in this bug report that can lead to (1) a fix for the problem
reported or (2) a way to tell if the problem still exists.
Last Resolved: 17 years ago
Resolution: --- → WONTFIX