Closed Bug 377394 Opened 17 years ago Closed 17 years ago

sigbus on solaris 2.8 in nsTextFrame::MeasureText() while displaying basic html page

Product/Component: Core :: Layout
Version: 1.8 Branch
Hardware: Sun
OS: Solaris
Type: defect
Priority: Not set
Severity: critical
Status: RESOLVED DUPLICATE of bug 161826
Reporter: luc
Assignee: Unassigned
Keywords: crash
Whiteboard: [branch-only]
Attachments: 4 files

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; fr; rv:1.8.1.3) Gecko/20060601 Firefox/2.0.0.3 (Ubuntu-edgy)
Build Identifier: Mozilla/5.0 (X11; U; SunOS sun4u; en-US; rv:1.8.1.3) Gecko/20070412 BonEcho/2.0.0.3

I compiled Firefox 2.0.0.3 myself on a SPARC machine (SunOS antlia 5.8 Generic_117350-25 sun4u sparc SUNW,Sun-Blade-1000), along with all the underlying libraries (atk, pango, gtk+ ...).
When I try to display some very simple web pages, Firefox dumps core due to a SIGBUS. The crash seems to occur in nsTextFrame::MeasureText().
I have reduced the crash to a very simple HTML file; it seems to be sensitive to really small details, such as removing a line break between elements.

Reproducible: Always

Steps to Reproduce:
1. Launch Firefox on the provided boo.html file.
Actual Results:  
firefox crashes and dumps core.

Expected Results:  
page displayed

I was only able to test this build over a remote connection through ssh (I do not have physical access to the machine). I have checked that the connection is not the cause, since I can display other graphical applications, and Firefox itself with a few other files. I have also checked that the behaviour is the same with two very different X servers on my display (one PC running a recent GNU/Linux distribution, and an old SPARC running Solaris 2.8 and OpenWindows).

I will attach the boo.html file, an extract of gdb core analysis and an extract of a truss command. I can provide the complete core, gdb and truss files upon request.

I am not sure this bug can be reproduced on any machine other than mine. I can perform tests upon request, of course. I have searched as much as I could on my side (since I compiled it myself, I suppose I did something wrong).
I can provide the complete core file upon request (about 4M as a bzip2 archive).
This is an extract of a simple truss -f command (the -f flag follows forks, so there is a PID number at the front of each line).
Component: General → Layout
Product: Firefox → Core
QA Contact: general → layout
Version: unspecified → 1.8 Branch
Add the following line to your .mozconfig file:
ac_add_options --enable-debug
then rebuild Firefox and run it in a debugger (./firefox -d gdb -g),
make it crash, then attach the output of "bt full" in gdb.  Thanks.
Here is the full backtrace with debug mode enabled.
A gdb-wrapped launch using 'firefox -d gdb -g' failed somewhere else with a corrupted stack. This trace was obtained by a direct launch 'firefox boo.html' and a post-mortem core analysis.
(In reply to comment #5)
> A gdb-wrapped launch using 'firefox -d gdb -g' failed somewhere else with a
> corrupted stack.

Yeah, that happens sometimes on the first launch after a rebuild.
Next time, try running just 'firefox' and then quitting it.
Then you should be able to use 'firefox -d gdb -g' I think.
Anyway, thanks for the stack!
It seems the bug is that TransformTextToUnicode() was called with
aNumChars=0.  Bad things will happen if you do that...
http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/layout/generic/nsTextFrame.cpp&rev=MOZILLA_1_8_BRANCH&root=/cvsroot#5122

There is only one call site:
http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/layout/generic/nsTextFrame.cpp&rev=MOZILLA_1_8_BRANCH&root=/cvsroot&mark=5702#5661

I guess you could debug this further by adding
#define DEBUG_WORD_WRAPPING 1
at the top of layout/generic/nsTextFrame.cpp
and see what it prints just before the crash...
Group: security
Severity: normal → critical
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Other → Solaris
Hardware: Other → Sun
Keywords: crash
Let's avoid fixing this until after new textframe has landed...
(In reply to comment #8)
> Let's avoid fixing this until after new textframe has landed...

roc, this is a branch-only bug AFAICT.
Whiteboard: [branch-only]
I failed to compile with DEBUG_WORD_WRAPPING due to an undefined "stop" variable (http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/layout/generic/nsTextFrame.cpp&rev=MOZILLA_1_8_BRANCH&root=/cvsroot&mark=6311#6293).

I did not pursue this further because I ran some other tests, described below.
It appears TransformTextToUnicode() was not called with aNumChars=0 but with aNumChars equal to the length of the last word in the link text (which was "1" in my first test). aNumChars is decremented by the backward conversion loop. I checked with other word lengths and always got a SIGBUS somewhere for lengths 1, 2, 3 and 4.
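
For readers unfamiliar with this kind of loop, here is a minimal sketch of a backward, in-place widening of 8-bit characters to 16-bit code units. It is not the code from nsTextFrame.cpp: the names MakeBuffer and WidenInPlace are hypothetical, and the buffer is deliberately declared as 16-bit storage so that every store is aligned. Walking from the last character to the first is what makes the in-place conversion safe, since the two bytes written for character i never cover a narrow byte that has not been read yet.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copy narrow text into the FRONT of a buffer that has
// room for one 16-bit unit per character. The vector guarantees that the
// storage is suitably aligned for 16-bit stores.
std::vector<std::uint16_t> MakeBuffer(const char* text) {
  const std::size_t n = std::strlen(text);
  std::vector<std::uint16_t> buf(n);
  if (n > 0) {
    std::memcpy(buf.data(), text, n);
  }
  return buf;
}

// Hypothetical helper: widen `count` 8-bit characters stored at the start of
// `buf` into 16-bit units, in place, going backwards.
void WidenInPlace(std::vector<std::uint16_t>& buf, std::size_t count) {
  assert(count <= buf.size());
  // Reading the same storage through unsigned char is permitted aliasing.
  const unsigned char* narrow =
      reinterpret_cast<const unsigned char*>(buf.data());
  for (std::size_t i = count; i-- > 0;) {
    // The store covers bytes 2*i and 2*i+1; the bytes still to be read are
    // at indices below i, so nothing unread is overwritten.
    buf[i] = narrow[i];
  }
}
```

With a buffer built by MakeBuffer("link 1"), calling WidenInPlace(buf, buf.size()) leaves six 16-bit units holding the original characters. The sketch only works because the storage starts on a 16-bit boundary; the same loop run over a char buffer at an odd address would perform misaligned stores.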

The comment in TransformTextToUnicode() (lines 5131-5132) is exactly what happens to me: I crash here. I followed the link in the comment (http://bugzilla.mozilla.org/show_bug.cgi?id=36146#c44), this is exactly what happens to me: a crash due to a misaligned pointer as shown in my attachment with the truss command (siginfo: SIGBUS BUS_ADRALN addr=0xFFBEB755).

The loop is probably trying to store a two-byte value at an odd address, which is not allowed on SPARC (I also wonder how this could work in place; it seems to me that PRUnichar is a typedef for PRUint16).
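
To make the alignment point concrete, here is a small sketch that checks whether an address is suitable for a two-byte store and shows one portable way to write a 16-bit value at an arbitrary address. The helper names IsAligned and StoreU16 are hypothetical and are not part of the Mozilla tree; the address used in the usage note is the one reported by truss above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true if `addr` is a multiple of `alignment`.
bool IsAligned(std::uintptr_t addr, std::size_t alignment) {
  assert(alignment != 0);
  return addr % alignment == 0;
}

// Hypothetical helper: store a 16-bit value at any address, aligned or not.
// memcpy copies byte by byte as far as the language is concerned, so the
// compiler must not assume that `dst` is aligned for a 16-bit access.
void StoreU16(unsigned char* dst, std::uint16_t value) {
  std::memcpy(dst, &value, sizeof value);
}
```

IsAligned(0xFFBEB755u, 2) returns false because the address is odd, which is consistent with the BUS_ADRALN signal: a direct two-byte store through a PRUnichar pointer at that address traps on SPARC, whereas x86 tolerates it. A byte-wise store such as StoreU16 avoids the trap, at some cost in speed.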

Comment 48 in the same thread (bug 36146) shows that fixing this is not straightforward.

This is really a problem for me (I simply cannot use firefox at all on this platform), but I understand a new implementation of this feature is due. If it is due soon, I can wait for it.
Ok, so this is a dupe of bug 161826 then.
Group: security
Status: NEW → RESOLVED
CC list accessible: false
Closed: 17 years ago
Not accessible to reporter
Resolution: --- → DUPLICATE