Closed Bug 221975 Opened 21 years ago Closed 21 years ago

Browser crashes with sys3178 error "floating divide-by-zero exception" GKLAYOUT.DLL when accessing any or all URLs above.

Categories

(Core :: Layout: Images, Video, and HTML Frames, defect)

x86
OS/2
defect
Not set
critical

RESOLVED FIXED

People

(Reporter: austyg, Assigned: jdunn)

Details

(Keywords: crash)

Attachments

(4 files)

User-Agent:       Mozilla/5.0 (OS/2; U; Warp 4; en-US; rv:1.4.1) Gecko/20031010
Build Identifier: Warpzilla 1.5rc2.

Loading any of those URLs one by one, or all simultaneously by way of a bookmark
group that contains them all, crashes the browser. A SYS3178 error is displayed
by the system and logged in the POPUPLOG.OS2 file.

Reproducible: Always

Steps to Reproduce:
1. Start the browser.
2. Enter http://themes.mozdev.org/themes/pinball.html in the location bar.
3. Hit Enter.  (LOL)
OR
1. Create a bookmark group containing these 5 URLs:
   http://themes.mozdev.org/themes/pinball.html
   http://themes.mozdev.org/themes/earlyblue.html
   http://themes.mozdev.org/themes/orbit.html
   http://themes.mozdev.org/themes/micromozilla.html
   http://themes.mozdev.org/themes/rain.html
2. Click on that bookmark group.
3. Watch the browser crash.
Actual Results:  
10-11-2003  12:47:53  SYS3178  PID 004e  TID 0001  Slot 0079
D:\USOF\MOZILLA\MOZILLA.EXE
c0000095
1d8230b2
EAX=00000000  EBX=00000000  ECX=00000000  EDX=00000000
ESI=00000000  EDI=00000001
DS=0053  DSACC=d0f3  DSLIM=1fffffff
ES=0053  ESACC=d0f3  ESLIM=1fffffff
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:1d8230b2  CSACC=d0df  CSLIM=1fffffff
SS:ESP=0053:0013cf74  SSACC=d0f3  SSLIM=1fffffff
EBP=0013cf9c  FLG=00002246

GKLAYOUT.DLL 0001:000830b2

------------------------------------------------------------

10-11-2003  12:53:54  SYS3178  PID 004f  TID 0001  Slot 0079
D:\USOF\MOZILLA\MOZILLA.EXE
c0000095
1d8230b2
EAX=00000000  EBX=00000000  ECX=00000000  EDX=00000000
ESI=00000000  EDI=00000001
DS=0053  DSACC=d0f3  DSLIM=1fffffff
ES=0053  ESACC=d0f3  ESLIM=1fffffff
FS=150b  FSACC=00f3  FSLIM=00000030
GS=0000  GSACC=****  GSLIM=********
CS:EIP=005b:1d8230b2  CSACC=d0df  CSLIM=1fffffff
SS:ESP=0053:0013c8b4  SSACC=d0f3  SSLIM=1fffffff
EBP=0013c8dc  FLG=00002246

GKLAYOUT.DLL 0001:000830b2


Expected Results:  
URLs are loaded and displayed either singly or  as a group of bookmarks without
the browser crashing.

OS: other → OS/2
1.5rc2 and 1.4.1 are different builds. In which are you seeing this crashing? Both?

http://themes.mozdev.org/themes/pinball.html WFM in both 1.4.1 and 1.5rc2 for
OS/2, using Modern, and no add-ins except for ft2lib in 1.4.1.
Felix: happens only in 1.5rc2
It seems that the crash has to do with the number of web pages loaded, so that
loading 5 in sequence will crash it as well as loading the 5 URLs as a group of
bookmarks. By the way, it seems this crash is specific to these URLs in
mozdev.org . . . ironic, isn't it? Moz crashes at its own web site >_<
nsImageFrame::Paint() [line1331]
  mComputedSize = {0,0}, which leads to divide-by-zero.
Keywords: crash
*** Bug 224617 has been marked as a duplicate of this bug. ***
Confirming on release 1.5
Status: UNCONFIRMED → NEW
Ever confirmed: true
.
Assignee: general → jdunn
Component: Browser-General → Image: Layout
Isn't floating point divide by zero supposed to give Infinity? It does at least
on linux (gcc 3.3.2).
OK, so I understand this issue better now.  There are two problems:

1) We have an issue with the GCC compiler on OS/2 that the FPU control word gets
reset every so often by the Presentation Manager.  That's actually being tracked
in a different bug.

2) There is a cross-platform issue here in that we are glossing over
divide-by-zero errors by simply masking that exception in the FPU control word.
 This doesn't seem right to me.  Any piece of code that could possibly divide by
zero should be wrapped in an 'if'.  Is there any valid reason to mask that
exception in the control word?
fwiw it turns out my comment 7 was wrong, it's undefined in C++ what happens if
you divide by zero.
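To make the two comments above concrete: the C++ standard leaves division by zero undefined, but on implementations whose float follows IEC 559 (IEEE 754), and where the divide-by-zero exception is masked in the FPU control word, the division typically produces infinity instead of trapping. The following is only a minimal sketch under that assumption. The function names are hypothetical and are not Mozilla code.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: reports whether float follows IEC 559 (IEEE 754)
// on this implementation. Only then is "1/0 == infinity" to be expected.
inline bool PlatformUsesIeee754Floats() {
  return std::numeric_limits<float>::is_iec559;
}

// Hypothetical helper: performs the division at run time. The volatile
// copies keep the compiler from folding the division at compile time.
inline float DivideAtRuntime(float numerator, float denominator) {
  volatile float n = numerator;
  volatile float d = denominator;
  return n / d;
}

// Returns true when the quotient is +/- infinity. With the divide-by-zero
// exception unmasked (as described for OS/2 in this bug), the division
// would trap before this function could return.
inline bool QuotientIsInfinite(float numerator, float denominator) {
  return std::isinf(DivideAtRuntime(numerator, denominator));
}
```

Relying on this behaviour is fragile because it depends on the FPU state at the time of the division, which is why checking the divisor explicitly is the safer choice.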
Attached patch: Patch
We do _what_ with the divide-by-zero exceptions?  <sigh>.... Anyway, this just
skips all the image-painting rigmarole when the area to be painted is 0x0 (we
still paint the borders, as we should).
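The idea of skipping the image painting for an empty area can be sketched as below. This is not the attached patch: the exact expression in nsImageFrame::Paint() is not quoted in this bug, so the Size type, the scale computation, and the function names here are assumptions made purely for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-in for a frame size; not the real Gecko type.
struct Size {
  std::int32_t width;
  std::int32_t height;
};

// True when there is no image area to paint (for example a computed size of {0,0}).
inline bool IsEmptyArea(const Size& size) {
  return size.width <= 0 || size.height <= 0;
}

// Computes hypothetical scale factors for painting the image.
// Returns false, leaving the outputs untouched, when the computed size is
// empty, so the caller can skip image painting and only draw the borders.
inline bool ComputeScale(const Size& computed, const Size& intrinsic,
                         float& scaleX, float& scaleY) {
  if (IsEmptyArea(computed)) {
    return false;  // nothing to paint, and no division is attempted
  }
  scaleX = static_cast<float>(intrinsic.width) / static_cast<float>(computed.width);
  scaleY = static_cast<float>(intrinsic.height) / static_cast<float>(computed.height);
  return true;
}
```

Checking the size before dividing keeps the code correct regardless of whether the platform masks the floating-point divide-by-zero exception.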
Comment on attachment 134867
Same as diff -w

roc, would you review?
Attachment #134867 - Flags: superreview?(roc)
Attachment #134867 - Flags: review?(roc)
Attachment #134867 - Flags: superreview?(roc) → superreview+
Attachment #134867 - Flags: review?(roc) → review+
Checked in.   Javier, could you test and resolve as appropriate?
This patch fixes the issue in nsImageFrame.cpp.

However, we found a similar issue in BasicTableLayoutStrategy.cpp that also
leads to a crash (divide by zero). In AllocateUnconstrained() the following code
seems to be the culprit:
      float percent = (divisor == 0)
        ? (1.0f / ((float)numColsAllocated))
        : ((float)oldWidth) / ((float)divisor);

In our case numColsAllocated is 0 and it is not being checked.

bernd, what would be a reasonable thing to do in that code if numColsAllocated
is 0?  How do we even end up with that being 0 when numCols is not?  If that's a
valid situation, I think the right thing to do is an early return before that
last for loop.
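The early-return idea suggested above could look roughly like the sketch below. It is not the patch that was later attached. ComputeColumnPercent is a hypothetical helper that wraps the quoted expression, and plain integer types stand in for nscoord and PRInt32.

```cpp
#include <cstdint>

// Hypothetical helper wrapping the quoted percent computation.
// Returns false, leaving percent untouched, when no columns were allocated,
// so the caller can return early instead of dividing by zero.
inline bool ComputeColumnPercent(std::int32_t oldWidth, std::int32_t divisor,
                                 std::int32_t numColsAllocated, float& percent) {
  if (numColsAllocated <= 0) {
    return false;  // nothing to distribute among columns
  }
  percent = (divisor == 0)
      ? (1.0f / static_cast<float>(numColsAllocated))
      : (static_cast<float>(oldWidth) / static_cast<float>(divisor));
  return true;
}
```

With this guard, both branches of the conditional have a non-zero denominator: the first because numColsAllocated is positive, the second because divisor is non-zero.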
Blocks: 225046
*** Bug 225046 has been marked as a duplicate of this bug. ***
I would say if divisor is zero numColsAllocated is also zero
    
  nscoord divisor          = 0;
  PRInt32 numColsAllocated = 0;
  PRInt32 totalAllocated   = 0;
  for (colX = 0; colX < numCols; colX++) {
    nsTableColFrame* colFrame = mTableFrame->GetColFrame(colX);
    if (!colFrame) continue;
    PRBool skipColumn = aExclude0Pro && (e0ProportionConstraint == colFrame->GetConstraint());
    if (FINISHED != aAllocTypes[colX] && !skipColumn ) {
      divisor += mTableFrame->GetColumnWidth(colX);
      numColsAllocated++;
    }
  }

I only wonder why this problem, which is pretty old, appears only now.
Bernd, see comment 8.  We're trapping divide-by-0 CPU exceptions for some
reason, but that doesn't quite work on OS2...
Attached patch: patch
The attached patch should fix the issue. However, I can't even descend into the
function with the testcase in bug 225046. So I am really curious how the
testcase there passed the following if statement:
  // allocate the rest to auto columns, with some exceptions
  if ( (tableIsAutoWidth && (perAdjTableWidth - totalAllocated > 0)) ||
       (!tableIsAutoWidth && (totalAllocated < maxWidth)) ) {

The table is autowidth and
  // An auto table returns a new table width based on percent cells/cols if they exist
  nscoord perAdjTableWidth = 0;
  if (mTableFrame->HasPctCol()) {
the perAdjTableWidth is zero as the table there has no percent constrained
columns. This means totalAllocated has been negative on OS/2 at least.
So what we do here is IMHO paranoid wallpapering. Did I mention that I am
paranoid?
So let's fix the issue here in order to prevent a crash, but if people with OS/2
could debug how they got totalAllocated negative that would be really great.
Turning on all the commented-out debug code by defining DEBUG_TABLE_STRATEGY
would be a great help.
*** Bug 225105 has been marked as a duplicate of this bug. ***
Comment on attachment 135067
patch

Boris, could you review it? I hoped the OS/2 folks would test it, but we should
not wait endlessly.
Attachment #135067 - Flags: superreview?(bz-vacation)
Attachment #135067 - Flags: review?(bz-vacation)
bernd, I am in the process of testing this.  I'll let you know what I find.
Comment on attachment 135067
patch

Yeah, this seems reasonable... r+sr=bzbarsky
Attachment #135067 - Flags: superreview?(bz-vacation) → superreview+
Attachment #135067 - Flags: review?(bz-vacation) → review+
fix checked in
Attached file crash log (ZIP)
Zip file of crash log with DEBUG_TABLE_STRATEGY defined.  Bernd, maybe you can
tell what's going on here.
I guess you see the crash with or without my patch.
I managed once to see the following assert:

###!!! ASSERTION: null col frame: 'PR_FALSE', file e:/moz_src/mozilla/layout/html/table/src/nsTableFrame.cpp, line 4010
Break: at file e:/moz_src/mozilla/layout/html/table/src/nsTableFrame.cpp, line 4010

which is probably completely offline, as I am currently running with the patch
for bug 4510.

This would explain a crash in the layout strategy when printing the column info,
and gives a hint of a frame construction issue.

When switching in viewer rapidly between res0 and the url I see frequently

WARNING: Content has no document., file e:/moz_src/mozilla/layout/html/base/src/nsTextFrame.cpp, line 5294
WARNING: Reflow of frame failed in nsLineLayout, file e:/moz_src/mozilla/layout/html/base/src/nsLineLayout.cpp, line 1015
WARNING: Content has no document., file e:/moz_src/mozilla/layout/html/base/src/nsTextFrame.cpp, line 5294
WARNING: Reflow of frame failed in nsLineLayout, file e:/moz_src/mozilla/layout/html/base/src/nsLineLayout.cpp, line 1015
###!!! ASSERTION: reflow dirty lines failed: 'NS_SUCCEEDED(rv)', file e:/moz_src/mozilla/layout/html/base/src/nsBlockFrame.cpp, line 816
Break: at file e:/moz_src/mozilla/layout/html/base/src/nsBlockFrame.cpp, line 816

So my guess is that the LayoutStrategy is just a good place to crash, but the
problem seems to be somewhere else.
Actually, your patch may have fixed this.  Unfortunately, the bug is so hard to
reproduce, that I may just not be hitting the right code paths.  Let's see what
the users say about the nightlies.  If the gklayout crashes disappear, then
we'll close this bug out.  Thanks.
1.6b (2003112708) appears to have corrected this problem.

I routinely experienced this problem. For a test case, go to:
http://www.tvtome.com/LALaw/eplist.html and click on the season 2 episode "Open
Heart Perjury." With the 1.6a milestone, this failed 100%. With the noted
nightly, Mozilla no longer crashes.
Marking FIXED based on user feedback.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Product: Core → Core Graveyard
Product: Core Graveyard → Core