Closed
Bug 94734
Opened 23 years ago
Closed 22 years ago
crash on a bugzilla search
Categories
(Core :: Networking: HTTP, defect, P1)
Tracking
()
VERIFIED
FIXED
mozilla1.0.1
People
(Reporter: bobbell, Assigned: darin.moz)
References
()
Details
(Keywords: 64bit, crash, Whiteboard: [adt2 RTM] [ETA 07/31])
Attachments
(6 files)
425 bytes, text/plain
8.24 KB, text/plain
81.55 KB, text/plain
413 bytes, patch
214 bytes, text/plain
772 bytes, patch (blizzard: review+, darin.moz: superreview+, brendan: approval+)
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; OSF1 alpha; en-US; rv:0.9.3) Gecko/20010804
BuildID: 2001080416

When I do a bugzilla search, I get the message instructing me to wait. When the screen should refresh with the results, mozilla crashes.

Reproducible: Always

Steps to Reproduce:
1. Go to bugzilla.mozilla.org
2. Search for a bug (e.g., "corrupt font")
3. Wait for results

Actual Results: mozilla crashed instead of displaying results. The "Just a moment" (or whatever the exact text is) screen did display, but not the results.

Expected Results: display the results, as my copy of Navigator 4.76 does.
Stack trace (note mozilla was built without symbols):
$ dbx /usr/local/mozilla/mozilla-bin core
dbx version 5.1
Type 'help' for help.
Core file created by program "mozilla-bin"
warning: /usr/local/mozilla/mozilla-bin has no symbol table -- very little is
supported without it
thread 0x8 signal Segmentation fault at >*[__nxm_thread_kill, 0x3ff805cb1d8]
ret zero, (ra), 1
(dbx) t
> 0 __nxm_thread_kill(0xb, 0x0, 0x3ff805b7914, 0x3ffc01b2000, 0x3ffc01b2000)
[0x3ff805cb1d8]
1 pthread_kill(0x0, 0x11fffaab8, 0x0, 0x11fffc010, 0x1) [0x3ff805b7934]
2 (unknown)() [0x3ff805cf854]
3 (unknown)() [0x3ff807f369c]
4 exc_raise_signal_exception(0xb0ffe0003, 0x86, 0x0, 0x3ffbfcb9504, 0x1)
[0x3ff807f3a08]
5 (unknown)() [0x3ff805b9470]
DBX Fault: Segmentation fault
(dbx)
Comment 2•23 years ago
|
||
Reporter, can you try a recent nightly build available at: http://ftp.mozilla.org/pub/mozilla/nightly/latest/mozilla-alpha-dec-osf4.0f.tar.gz
Comment 4•23 years ago
|
||
Marking NEW. BTW, this seems to be the first OSF/1 bug report.
Status: UNCONFIRMED → NEW
Ever confirmed: true
This bug is still being produced with Mozilla Build 2001101719. I've verified that other co-workers can also reproduce this bug.
Comment 6•23 years ago
|
||
We must have a stack trace to assign this to the correct component. Browser-General will never fix a bug... On i386 Linux, Windows and Mac we have Talkback. Reporter: Is it possible for you to build a debug build, or an optimized build with symbols?
Yes, I believe I can do a debug build. Please specify exactly what parameters you would like me to build Mozilla with, as I don't normally compile it myself and therefore don't know how such a build should be done. Also specify whether you would like me to build Mozilla from a nightly release or from the last milestone.
Comment 8•23 years ago
|
||
Please read http://www.mozilla.org/build (should contain the build instructions) Please build a nightly build if possible.. Thanks !
Reporter | ||
Comment 10•23 years ago
|
||
Reporter | ||
Comment 11•23 years ago
|
||
I built mozilla from nightly source, using '-O -g3' flags passed to the Compaq
Tru64 'cc' and 'cxx' compilers. I didn't do anything else special. I've
attached the script 'buildit' that I used to build mozilla.
NOTE: I first ran mozilla on the machine where I built it. This started it with
a fairly clean profile. The bug was reproduced. However, because that machine
was running very low on disk space, I subsequently ran it from a second machine,
NFS mounting mozilla from the build machine. This picked up my own profile, and
also made it easier to debug the crash. The bug was reproduced in the same
fashion. Running mozilla from the second machine via NFS from the first is how
I gathered the information in this bug report.
When mozilla ran it generated a lot of output to the terminal window from which
I ran it. I've attached the file 'mozilla.err', which is a copy-and-paste of
this text output.
Mozilla still crashes as expected when retrieving results of a search from
bugzilla.mozilla.org. Below you will find a debugging session on that dump. I
can give you the dump if you want, or you can tell me what I should investigate.
/usr/bin/dbx /usr/local/rpm/tmp/mozilla/dist/bin/mozilla-bin core
dbx version 5.1
Type 'help' for help.
Core file created by program "mozilla-bin"
thread 0xf signal Segmentation fault at >*[__nxm_thread_kill, 0x3ff805cb1d8]
ret zero, (ra), 1
(/usr/bin/dbx) t
> 0 __nxm_thread_kill(0xb, 0x0, 0x3ff805b7914, 0x3ffc01b2000, 0x3ffc01b2000)
[0x3ff805cb1d8]
1 pthread_kill(0x0, 0x11fffaef8, 0x0, 0x11fffc010, 0x1) [0x3ff805b7934]
2 (unknown)() [0x3ff805cf854]
3 (unknown)() [0x3ff807f369c]
4 exc_raise_signal_exception(0xb0ffe0003, 0x86, 0x0, 0x3ffbf938b40, 0x1)
[0x3ff807f3a08]
5 (unknown)() [0x3ff805b9470]
6 OnDataAvailable__16nsMultiMixedConvXP10nsIRequestP11nsISupportsP14nsIInputStreamUiUi() ["nsMultiMixedConv.cpp":545, 0x3ffbf938b40]
7 OnDataAvailable__18nsDocumentOpenInfoXP10nsIRequestP11nsISupportsP14nsIInputStreamUiUi() ["nsURILoader.cpp":259, 0x3ffbf78de08]
8 OnDataAvailable__19nsStreamListenerTeeXP10nsIRequestP11nsISupportsP14nsIInputStreamUiUi() ["nsStreamListenerTee.cpp":56, 0x3ffbf91abc8]
9 OnDataAvailable__13nsHttpChannelXP10nsIRequestP11nsISupportsP14nsIInputStreamUiUi() ["nsHttpChannel.cpp":2359, 0x3ffbf9a3104]
10 HandleEvent__22nsOnDataAvailableEventXv() ["nsStreamListenerProxy.cpp":192,
0x3ffbf917ea0]
11 HandlePLEvent__23nsARequestObserverEventXP7PLEvent()
["nsRequestObserverProxy.cpp":79, 0x3ffbf8f04f4]
12 PL_HandleEvent(self = (unallocated - symbol optimized away))
["plevent.c":590, 0x3ffbfee1668]
13 PL_ProcessPendingEvents(self = (unallocated - symbol optimized away))
["plevent.c":520, 0x3ffbfee1440]
14 ProcessPendingEvents__16nsEventQueueImplXv() ["nsEventQueue.cpp":388,
0x3ffbfee573c]
15 event_processor_callback__XPvi17GdkInputCondition() ["nsAppShell.cpp":184,
0x3ffbf1568b4]
16 our_gdk_io_invoke__XP11_GIOChannel12GIOConditionPv() ["nsAppShell.cpp":76,
0x3ffbf15622c]
17 (unknown)() [0x300030155c4]
18 (unknown)() [0x3000301786c]
19 (unknown)() [0x3000301808c]
20 g_main_run(0x1, 0x1, 0x300018d5cf0, 0x0, 0x300018d5dac) [0x3000301827c]
21 gtk_main(0x3ffbf147ed0, 0x0, 0x11fffbe00, 0x3ffbfee8dd0, 0x140105a80)
[0x300018d5da8]
22 Run__10nsAppShellXv() ["nsAppShell.cpp":364, 0x3ffbf157240]
23 Run__17nsAppShellServiceXv() ["nsAppShellService.cpp":302, 0x3ffbe0a3a9c]
24 main1__XiPPcP11nsISupports() ["nsAppRunner.cpp":1303, 0x12001466c]
25 main() ["nsAppRunner.cpp":1629, 0x1200154f4]
(/usr/bin/dbx)
Please tell me if you need more information. I may also attempt recompiling
with '-g' instead of '-g3' to gather even more debugging information.
Comment 12•23 years ago
|
||
Thanks for your stack trace !!! -> Networking
Assignee: asa → darin
Component: Browser-General → Networking: HTTP
QA Contact: doronr → tever
Assignee | ||
Comment 13•23 years ago
|
||
reporter: can you reproduce this problem in a more recent nightly build?
Reporter | ||
Comment 14•23 years ago
|
||
I just reproduced this with 2002013113. I used an optimized build, from a custom RPM. I'll look into generating a debug build again, but this may take a little time.
Assignee | ||
Comment 15•23 years ago
|
||
bobbell: before you bother building debug, you might try capturing an HTTP log while reproducing the crash. here's how:
setenv NSPR_LOG_MODULES nsHttp:5
setenv NSPR_LOG_FILE http.log
then just attach http.log to this bug report. thx!
Reporter | ||
Comment 16•23 years ago
|
||
Reporter | ||
Comment 17•23 years ago
|
||
Since it seems that a debug build will take a little longer to generate, I've attached the log as per the instructions given. I went straight to http://bugzilla.mozilla.org/ and searched for "foo bar" (no quotes). I was given the "please wait" screen, and then mozilla crashed (as usual) instead of displaying the results of the search.
Comment 18•23 years ago
|
||
*** Bug 124557 has been marked as a duplicate of this bug. ***
Comment 19•22 years ago
|
||
I am seeing this with Mozilla 0.9.9 on Linux alpha (from redhat rawhide RPM),
although it seems to only happen when the query returns a long list of bugs
(>70). It also crashes on RedHat's bugzilla, but occurs there even with short
lists.
It crashes in nsMultiMixedConv::OnDataAvailable (as in OSF).
I did the NSPR_LOG_MODULES a couple times with similar results to attachment
67747, although one time it gave me:
1026[120246da0]: nsHttpHandler::ReclaimConnection
[conn=20b17e70(bugzilla.redhat.com:80) keep-alive=1]
1026[120246da0]: adding connection to idle list [conn=20b17e70]
1026[120246da0]: active connection count is now 0
1026[120246da0]: nsHttpHandler::ProcessTransactionQ_Locked
1026[120246da0]: >> unable to process transaction queue at this time
I can attach the full log if you want to see it.
I also did sniffit and it appears to crash in different places (one time in the
bug list, one time in the footer).
OS --> All, please
Comment 20•22 years ago
|
||
It does seem to happen sometimes with shorter lists, but more consistently with longer lists.
Assignee | ||
Comment 21•22 years ago
|
||
1026[120246da0]: >> unable to process transaction queue at this time is neither an error nor a warning. it just indicates that there are no pending transactions to process.
Comment 22•22 years ago
|
||
I was unable to reproduce this crash with an unoptimized build from CVS a few days ago. Currently attempting to recompile with optimization...
Comment 23•22 years ago
|
||
no crash with optimization on. I also built from SRPM 20020327 (since original crasher for me was 0.9.9 RPM). Also no crash. RedHat's bugzilla also works, which was a more consistent crasher than Mozilla's bugzilla. worksforme bobbell: if you still can't get it to compile (bug 134221), you might try the latest nightly (20020312), assuming you have OSF v4.x
Reporter | ||
Comment 24•22 years ago
|
||
I am running Tru64 UNIX V5.1, though I do find it intriguing that a V4.0F build is occurring nightly. It should still run on V5.1, so I may give it a shot. Also note that bug 134221 does not prevent compiling; it prevents the use of the compiled program.
Comment 25•22 years ago
|
||
Mozilla 1.0RC1 Alpha-Linux RPM is still crashing here.
Comment 26•22 years ago
|
||
Mozilla 1.0rc1 from SRPM works fine...
Comment 28•22 years ago
|
||
*** Bug 143696 has been marked as a duplicate of this bug. ***
Comment 29•22 years ago
|
||
I just came across this problem today on Tru64 UNIX 5.0. After debugging it for a little while, I found the bug. The code that was causing grief was:
if (!mPartChannel || !(cursor[bufLen-1] == nsCRT::LF) )
bufLen is declared as PRUint32. Changing the expression to:
if (!mPartChannel || !(cursor[int(bufLen-1)] == nsCRT::LF) )
solved the problem.
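The difference between the two expressions quoted above can be sketched in isolation. This is a minimal illustration, not the Mozilla source: `indexAsUnsigned` and `indexAsInt` are hypothetical helper names, and `std::uint32_t` stands in for `PRUint32`. It only computes the index value each form produces and never dereferences anything.

```cpp
#include <cstdint>

// Hypothetical helpers, for illustration only.
// std::uint32_t stands in for PRUint32.

// Index produced by the original form, cursor[bufLen-1].
// The subtraction happens in 32-bit unsigned arithmetic, so a length
// of 0 wraps around to 4294967295 instead of becoming -1.
std::uint64_t indexAsUnsigned(std::uint32_t bufLen) {
    return static_cast<std::uint64_t>(bufLen - 1);
}

// Index produced by the cast form, cursor[int(bufLen-1)].
// On the two's-complement platforms discussed here, converting the
// wrapped value to int yields -1 (guaranteed only since C++20).
std::int64_t indexAsInt(std::uint32_t bufLen) {
    return static_cast<std::int64_t>(static_cast<int>(bufLen - 1));
}
```

With a 64-bit pointer, an index of 4294967295 lands roughly 4 GB past `cursor`, whereas with a 32-bit pointer the same addition wraps back to one element before it. That would be consistent with the crash showing up only on 64-bit systems, although the cast only changes which out-of-range element is read.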
Comment 30•22 years ago
|
||
(and there was much rejoicing) is that not a compiler bug, then? nominating for mozilla1.0.1 since we finally know what's going on and have a fix.
Keywords: mozilla1.0.1
Comment 31•22 years ago
|
||
Can I have review for this fix.
Assignee | ||
Comment 32•22 years ago
|
||
Shanmugavelu: can you explain how your patch solves this crash? is bufLen sometimes zero? is that the problem?
Comment 33•22 years ago
|
||
Yes. The problem is that bufLen in the expression
if (!mPartChannel || !(cursor[bufLen-1] == nsCRT::LF) )
is 0. bufLen has been defined as a PRUint32.
(ladebug) p bufLen
0
(ladebug) whatis bufLen
PRUint32 bufLen
Assignee | ||
Comment 34•22 years ago
|
||
then the question becomes: "is cursor[-1] valid?" if not, then we need to protect against evaluating this expression when bufLen == 0.
Comment 35•22 years ago
|
||
proper testcase output: 11ffff814 11ffff814 11ffff814 11ffff814
actual testcase output: 11ffff814 51ffff814 11ffff814 11ffff814
(that's with cc, I don't have c++ on OSF)
this bug also bites on Alpha-Linux with C, but not C++ (at least for me, gcc-2.96-87). Blizzard might be using an older compiler to make RPMS. dunno.
also, if bufLen is positive, the bug doesn't bite.
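One possible reading of the two outputs above, offered as an interpretation rather than something stated in the report: the odd address differs from the expected one by exactly 2^32 elements of 4 bytes each, which is what a wrapped 32-bit unsigned index would give if it is not wrapped again at pointer width. The sketch below only checks that arithmetic. The 4-byte element size is an assumption, since the testcase itself lives in an attachment that is not reproduced here, and all names are hypothetical.

```cpp
#include <cstdint>

// Addresses copied from the testcase output quoted above.
constexpr std::uint64_t kExpected = 0x11ffff814ULL;
constexpr std::uint64_t kObserved = 0x51ffff814ULL;

// Assumption for illustration: the testcase indexes 4-byte elements.
constexpr std::uint64_t kElemSize = 4;

// Distance between the two addresses, in bytes.
constexpr std::uint64_t byteDelta() { return kObserved - kExpected; }

// The same distance expressed in (assumed) 4-byte elements.
constexpr std::uint64_t elemDelta() { return byteDelta() / kElemSize; }

// The gap is exactly 2^32 elements under the assumption above.
static_assert(elemDelta() == (1ULL << 32), "gap is 2^32 elements");
```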
Comment 36•22 years ago
|
||
actually, I can get the bad behavior in c++ with gcc-2.96-101 (so Blizzard is actually using a newer compiler). It seems like this might have been caused (on Linux) by fixing Redhat bug 58746: http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=58746 For all I know, this is proper behavior in 64-bit land (since it happens on OSF and Linux), although it then ought to show up on Solaris/Sun. adding 64bit keyword.
Keywords: 64bit
Assignee | ||
Comment 37•22 years ago
|
||
dereferencing memory you don't own is never valid. simply casting bufLen - 1 to a signed integer is not a solution. we need to understand how it is that bufLen can be 0 since the original author clearly didn't think that was possible. if we decide that it is rightly possible, then the code needs to avoid bufLen - 1 when bufLen == 0, else we need to fix the code that is leading to bufLen == 0.
Assignee | ||
Updated•22 years ago
|
Status: NEW → ASSIGNED
Priority: -- → P3
Target Milestone: Future → mozilla1.0.1
Comment 38•22 years ago
|
||
I also see bufLen==0 on i686, but it just doesn't crash. Here's what happens during the bad pass through OnDataAvailable:
buffer="Set-Cookie: LASTORDER=bugs.bug_id ; path=/; expires=Sun, 30-Jun-2029 00:00:00 GMT\nSet-Cookie: BUGLIST=\n\n"
mFirstOnData is false
mProcessingHeaders is true
on return from ParseHeaders, bufLen is 0 and done is true, so mProcessingHeaders is set to false.
because mProcessingHeaders is false, it does:
if (!mPartChannel || !(cursor[bufLen-1] == nsCRT::LF) )
bufAmt = PR_MIN(mTokenLen - 1, bufLen);
which causes the crash. note that cursor[-1] is actually part of the buffer it had allocated earlier (it is not dereferencing memory it doesn't own).
Comment 39•22 years ago
|
||
Assignee | ||
Comment 40•22 years ago
|
||
this isn't going to make 1.0.1 ... -> 1.2alpha
Target Milestone: mozilla1.0.1 → mozilla1.2alpha
Comment 41•22 years ago
|
||
cc'ing valeski, who seems to have written most of the relevant code here (according to CVS blame) valeski: could you explain the desired/expected behavior in the particular case here where it crashes? thanks.
Comment 42•22 years ago
|
||
*** Bug 159619 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 44•22 years ago
|
||
Comment on attachment 86721
alternate fix
sr=darin
this patch looks very good. bufAmt shouldn't change from zero if bufLen is zero, so adding this check definitely doesn't change the intended logic of the block.
Attachment #86721 -
Flags: superreview+
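The patch itself is in attachment 86721 and is not reproduced in this thread. As a rough sketch of the kind of check being described, assuming a free function in place of the member code: `computeBufAmt`, its parameters, and `kLF` are hypothetical names, `std::uint32_t` stands in for `PRUint32`, and `std::min` stands in for `PR_MIN`.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for nsCRT::LF.
constexpr char kLF = '\n';

// Sketch only; this is not the code from attachment 86721.
// Returns how many trailing bytes to hold back for the next pass.
// When bufLen is 0 the last-byte test is skipped entirely, so
// cursor[bufLen - 1] is never evaluated and the result stays 0.
// Assumes tokenLen >= 1, mirroring the original mTokenLen - 1.
std::uint32_t computeBufAmt(bool hasPartChannel, const char* cursor,
                            std::uint32_t bufLen, std::uint32_t tokenLen) {
    std::uint32_t bufAmt = 0;
    if (bufLen > 0 && (!hasPartChannel || cursor[bufLen - 1] != kLF)) {
        bufAmt = std::min(tokenLen - 1, bufLen);
    }
    return bufAmt;
}
```

For a non-empty buffer this takes the same branch as the unguarded expression quoted earlier in the thread. For an empty one, the unguarded form would compute a minimum against 0, so leaving `bufAmt` at 0 gives the same result without touching `cursor`.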
Assignee | ||
Comment 45•22 years ago
|
||
-> going to shoot for getting this into both 1.1 and 1.0.1
Priority: P3 → P1
Target Milestone: mozilla1.2alpha → mozilla1.1beta
Comment 46•22 years ago
|
||
Comment on attachment 86721
alternate fix
r=blizzard
Attachment #86721 -
Flags: review+
Comment 47•22 years ago
|
||
Comment on attachment 86721
alternate fix
a=brendan@mozilla.org for trunk and branch. /be
Attachment #86721 -
Flags: approval+
Assignee | ||
Comment 48•22 years ago
|
||
fixed-on-trunk
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Whiteboard: [adt1 RTM]
Assignee | ||
Comment 49•22 years ago
|
||
-> mozilla1.0.1 waiting for ADT approval.
Target Milestone: mozilla1.1beta → mozilla1.0.1
Comment 50•22 years ago
|
||
Lowering to adt2 since it appears to only affect 64bit machines.
Whiteboard: [adt1 RTM] → [adt2 RTM]
Assignee | ||
Comment 51•22 years ago
|
||
can someone with a 64-bit system confirm that this patch makes bugzilla usable again? i don't think tever has access to such a machine, and we really need to get this verified ASAP. thx!
Comment 52•22 years ago
|
||
Tested this on a Tru64 UNIX system. Works fine.
Assignee | ||
Comment 53•22 years ago
|
||
marking VERIFIED per previous comment.
Status: RESOLVED → VERIFIED
Comment 54•22 years ago
|
||
adt1.0.1+ (on ADT's behalf) approval for checkin to the 1.0 branch. pls check this in asap, then replace the "mozilla1.0.1+" with "fixed1.0.1". thanks!
Comment 56•22 years ago
|
||
*** Bug 162446 has been marked as a duplicate of this bug. ***
Updated•22 years ago
|
Keywords: fixed1.0.1 → verified1.0.1