Closed
Bug 343283
Opened 18 years ago
Closed 18 years ago
SeaMonkey segfaults on FreeBSD when using PrefBar since version 3.3 (wrong handling of malloc?)
Categories
(Core :: DOM: Core & HTML, defect, P2)
Tracking
()
VERIFIED
FIXED
mozilla1.9alpha1
People
(Reporter: Manuel.Spam, Assigned: mrbkap)
References
()
Details
(Keywords: fixed1.8.1, verified1.8.0.7)
Attachments
(3 files)
1.53 KB,
patch
|
jst
:
review+
jst
:
superreview+
|
Details | Diff | Splinter Review |
1.22 KB,
patch
|
jst
:
review+
jst
:
superreview+
dveditz
:
approval1.8.0.7+
mtschrep
:
approval1.8.1+
|
Details | Diff | Splinter Review |
91.23 KB,
image/png
|
Details |
User-Agent: Mozilla/5.0 (X11; U; Linux i686; de-AT; rv:1.8.0.2) Gecko/20060409 SeaMonkey/1.0.1 Build Identifier: This bug had been originally filed for PrefBar (see URL above). Thank you very much to j. schade who did really great work to find the reason of this bug. His unchanged comments follow: ------- Comment #18 From j. schade 2006-06-26 15:02:23 ------- Well, since it was raining yesterday here in the Caribbean I finally found time to look at this issue. Here is what I found out so far: Using newer perfbars (>=3.3) gives me a sig11 on FreeBSD-5.4. It happens in libgklayout.so: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 1 (LWP 100116)] 0x29a9599b in nsGlobalWindow::RunTimeout (this=0x8393500, aTimeout=0x8935000) at nsGlobalWindow.cpp:6378 6378 timeout->mArgv[timeout->mArgc] = Current language: auto; currently c++ (gdb) p timeout->mArgc $1 = 0 (gdb) p timeout->mArgv $2 = (jsval *) 0x800 (gdb) p timeout->mArgv[timeout->mArgc] Error accessing memory address 0x800: Bad address. The corresponding code in nsGlobalWindow.cpp: ... PRTime lateness = now - timeout->mWhen; timeout->mArgv[timeout->mArgc] = INT_TO_JSVAL((jsint)(lateness / PR_USEC_PER_MSEC)); jsval dummy; scx->CallEventHandler(mJSObject, timeout->mFunObj, timeout->mArgc + 1, timeout->mArgv, &dummy); ... Now, where does the funny 0x800 come from? Throwing in some debugging printf's revealed that when timeout->mArgv had "some reasonable" values (like 0x83655c0, 0x88ef8e0, 0x843d070, ...) nothing bad happens. The special value 0x800 was created when the following code (in the same file) was called with an argc of 1: ... } else if (funobj) { /* Leave an extra slot for a secret final argument that indicates to the called function how "late" the timeout is. */ timeout->mArgv = (jsval *) PR_MALLOC((argc - 1) * sizeof(jsval)); if (!timeout->mArgv) { timeout->Release(scx); ... This gives us a malloc() with 0 as arg. On FreeBSD, a malloc(0) returns some constant value (which is 0x800 on my architecture) which, in fact, does not actually point to a usable memory region. While one can argue what result SHOULD be returned in case of doing a malloc(0), firefox seems to behave properly if it receives NULL as a result (since the timeout gets released immediately in this case). The behaviour of FreeBSD can be changed according to the the malloc(3) manpage by setting some malloc flags: ... V Attempting to allocate zero bytes will return a NULL pointer instead of a valid pointer. (The default behavior is to make a minimal allocation and return a pointer to it.) This option is provided for System V compatibility. This option is incompatible with the ``X'' option. ... So I gave it a try by running MALLOC_OPTIONS=V firefox and firefox didn't crash anymore and prefbar was running. Therefore I come to the point that: 1. It isn't prefbar's fault, it only reveals the problem. 2. It is an interaction of firefox and FreeBSD's funky malloc() implementation. ------- Comment #20 From j. schade 2006-06-28 13:23:25 ------- > If I understand you right, then the problem is solved immediately if you use > "MALLOC_OPTIONS=V firefox"? Yes. I have added it to /usr/X11R6/lib/firefox/run-mozilla.sh. Other options are the use of /etc/malloc.conf or _malloc_options (see the malloc manpage). > If so, then this is in fact a core bug. I'm unable to fix it with PrefBar code > so I could finally mark this heavy bug fixed. I would say "yes". However, I would like to see others confirm that it works on their systems as well. > Do you allow me to paste your whole comment as new core-bug to > bugzilla.mozilla.org? Of course. However, I don't know if we should consider it a firefox bug. IF malloc(0) returns NULL (as it does with MALLOC_OPTIONS=V) firefox behaves properly. If it is wise to return something Non-NULL (as it is done in FreeBSD) is a different question. I will send a message to the -hackers list and we will see. ------- Comment #21 From j. schade 2006-06-29 08:33:38 ------- OK, I have just learned from the freebsd-hackers list (thanks to the people who answered there) that: The C standard explicitly allows both behaviours, and requires the implementation to choose one of them. If it returns a non-NULL pointer, though, that pointer can *only* be used for passing back to free(). It may *not* be dereferenced. So firefox is wrong, and actually broken. So we can safely consider it as a bug in firefox. A patch could be: --- mozilla/dom/src/base/nsGlobalWindow.cpp.ORI Fri Apr 21 23:36:30 2006 +++ mozilla/dom/src/base/nsGlobalWindow.cpp Thu Jun 29 14:29:12 2006 @@ -6079,7 +6079,7 @@ indicates to the called function how "late" the timeout is. */ timeout->mArgv = (jsval *) PR_MALLOC((argc - 1) * sizeof(jsval)); - if (!timeout->mArgv) { + if (!timeout->mArgv || argc==1) { timeout->Release(scx); return NS_ERROR_OUT_OF_MEMORY; However, I have no idea if the whole (argc - 1) construct could be wrong since I don't know what this function is exactly doing :-) Reproducible: Didn't try Steps to Reproduce: To reproduce this bug you may need to install PrefBar Version 3.3 or higher. Actual Results: SeaMonkey segfaults on FreeBSD if you try to setup PrefBar Expected Results: No crash ;-)
Reporter | ||
Updated•18 years ago
|
OS: Other → FreeBSD
Updated•18 years ago
|
Assignee: nobody → general
Component: General → DOM
QA Contact: general → ian
Assignee | ||
Updated•18 years ago
|
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P2
Target Milestone: --- → mozilla1.9alpha
Assignee | ||
Comment 1•18 years ago
|
||
This seems worth fixing on all branches. Writing to memory returned from malloc(0) is bad, and that it doesn't crash on all platforms is really lucky.
Assignee: general → mrbkap
Status: NEW → ASSIGNED
Attachment #227858 -
Flags: superreview?(jst)
Attachment #227858 -
Flags: review?(jst)
Comment 2•18 years ago
|
||
It seems that fixing this improves overall stability as well. Earlier, I had randomly appearing segfaults from time to time, especially when clicking around with the mouse like crazy. I'm still using my MALLOC_OPTIONS=V fix here on FreeBSD and now it seems that these problems are gone as well.
Assignee | ||
Comment 3•18 years ago
|
||
(In reply to comment #2) > It seems that fixing this improves overall stability as well. When you say "this", do you mean this specific problem, or using MALLOC_OPTIONS=V on FreeBSD? If you mean the latter, then I hope that you'll file bugs on the remaining crashes, since they are real bugs, and that we don't crash is IMO an unlucky artifact of malloc implementations. If you mean the former, then we really should fix this bug on all branches.
Comment 4•18 years ago
|
||
> When you say "this", do you mean this specific problem, or using
> MALLOC_OPTIONS=V on FreeBSD? If you mean the latter, then I hope that you'll
I am using only MALLOC_OPTIONS=V, not the patch. I am currently not able to
recompile firefox to test the patch (holiday's over :-().
When the patch has been incorporated and I have found the time to recompile
the whole thing I will happily remove MALLOC_OPTIONS=V and see what happens.
I only wanted to point out that MALLOC_OPTIONS=V seems to improve overall
stability. Since the code in question seems to have to do with timeout
handling, there is IMHO a good possibility that dereferencing the return
value of THIS malloc(0) could have been the cause for other sporadic crashes
(not only the crash of the prefbar extension). However, I really don't know how
many other similar malloc(0) constructs elsewhere are being "fixed" when using
MALLOC_OPTIONS=V. I was already happy being able to track down this one :-).
Assignee | ||
Comment 5•18 years ago
|
||
(In reply to comment #4) > When the patch has been incorporated and I have found the time to recompile > the whole thing I will happily remove MALLOC_OPTIONS=V and see what happens. Great! Thanks for tracking this down.
Comment 6•18 years ago
|
||
Comment on attachment 227858 [details] [diff] [review] Fix for trunk r+sr=jst
Attachment #227858 -
Flags: superreview?(jst)
Attachment #227858 -
Flags: superreview+
Attachment #227858 -
Flags: review?(jst)
Attachment #227858 -
Flags: review+
Assignee | ||
Comment 7•18 years ago
|
||
This bug fixes a use of malloc(0)'d memory, which is bad.
Flags: blocking1.8.1?
Flags: blocking1.8.0.6?
Updated•18 years ago
|
Flags: blocking1.8.1? → blocking1.8.1+
Assignee | ||
Comment 8•18 years ago
|
||
Fix checked into trunk.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 9•18 years ago
|
||
Attachment #228870 -
Flags: superreview?(jst)
Attachment #228870 -
Flags: review?(jst)
Attachment #228870 -
Flags: approval1.8.1?
Comment 10•18 years ago
|
||
Comment on attachment 228870 [details] [diff] [review] Fix for branch r+sr=jst
Attachment #228870 -
Flags: superreview?(jst)
Attachment #228870 -
Flags: superreview+
Attachment #228870 -
Flags: review?(jst)
Attachment #228870 -
Flags: review+
Updated•18 years ago
|
Attachment #228870 -
Flags: approval1.8.1? → approval1.8.1+
Assignee | ||
Comment 12•18 years ago
|
||
Comment on attachment 228870 [details] [diff] [review] Fix for branch This falls into the 'stability' catagory of fixes. I'm not sure if it's severe enough for 1.8.0, but I thought I'd try anyway.
Attachment #228870 -
Flags: approval1.8.0.6?
Comment 13•18 years ago
|
||
Since the patch didn't show up in recent firefox sources yet (talking about 1.5.0.6) I patched nsGlobalWindow.cpp manually and removed the MALLOC_OPTION statement. Everything seems to work well so far (prefbar can be installed and no crashes). BTW, what has to be done to import this bugfix into the firefox sources officially?
Comment 14•18 years ago
|
||
(In reply to comment #13) > BTW, what has to be done to import this bugfix into the > firefox sources officially? The patch has been landed on the trunk (for Firefox 3) and the 1.8 branch (for Firefox 2). It's waiting for approval to be landed on the 1.8.0 branch for Firefox 1.5.0.7 (the approval1.8.0.7? flag).
Updated•18 years ago
|
Flags: blocking1.8.0.7? → blocking1.8.0.7+
Comment 15•18 years ago
|
||
Comment on attachment 228870 [details] [diff] [review] Fix for branch approved for 1.8.0 branch, a=dveditz for drivers
Attachment #228870 -
Flags: approval1.8.0.7? → approval1.8.0.7+
Comment 17•18 years ago
|
||
two questions: 1) Are you running BSD builds or using the ELF binaries? 2) Does this apply to Firefox as well as SeaMonkey?
Comment 18•18 years ago
|
||
(In reply to comment #17) > two questions: > > 1) Are you running BSD builds or using the ELF binaries? Who is "you"? In case it's me: I run my own self-compiled binaries from the FreeBSD ports system. > > 2) Does this apply to Firefox as well as SeaMonkey? I debugged the bug with firefox. Don't know about SeaMonkey. I now run a patched version of firefox and the bug is gone.
Comment 19•18 years ago
|
||
Hi js, > (In reply to comment #17) > > 1) Are you running BSD builds or using the ELF binaries? > > Who is "you"? In case it's me: I run my own self-compiled > binaries from the FreeBSD ports system. well, by "you" I meant the Reporter. Sorry if that sounded rude. :) > > 2) Does this apply to Firefox as well as SeaMonkey? > > I debugged the bug with firefox. Don't know about SeaMonkey. > I now run a patched version of firefox and the bug is gone. I guess the reporter was running SeaMonkey in an ELF binary, according to the description, so I guess that answers that question. thanks!
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 20•18 years ago
|
||
(In reply to comment #17) > 2) Does this apply to Firefox as well as SeaMonkey? This patch applies equally to SeaMonkey, Firefox or any other Gecko embedding. It is in core code that is shared by everybody. I'm guessing that you didn't mean to REOPEN this bug. I'm re-marking this as fixed, or is there a reason it shouldn't be?
Status: REOPENED → RESOLVED
Closed: 18 years ago → 18 years ago
Resolution: --- → FIXED
Comment 21•18 years ago
|
||
Updated•18 years ago
|
Status: RESOLVED → VERIFIED
Updated•18 years ago
|
Keywords: fixed1.8.0.7 → verified1.8.0.7
Updated•5 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•