Closed Bug 83805 Opened 23 years ago Closed 23 years ago

M092, Trunk & N610 crashes [@ nsSliderFrame::DoLayout]

Categories

(Core :: XUL, defect)

x86
Windows 98
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla0.9.5

People

(Reporter: dbaron, Assigned: eric)

References

Details

(Keywords: crash, topcrash, topembed, Whiteboard: fixed have R and SR. approved by dbaron(drivers))

Crash Data

Attachments

(4 files)

In talkback reports there are a number of crashes at nsSliderFrame::DoLayout. 
These seem like they might be related to the same problem as bug 82194 -- that
the scrollbar thumb is being loaded from a separate XBL binding that's
asynchronous.  However, since the fix to bug 82194 was to try and see if that
can be made to work, there may need to be a fix here as well.

I'm not completely sure why it's crashing where it is, though, since it seems to
be crashing the *second* time |thumbBox| is dereferenced.

The top of the stack is:
nsSliderFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSliderFrame.cpp line 386]
nsBox::Layout [d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBox.cpp line 985]
nsSprocketLayout::Layout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSprocketLayout.cpp line 417]
nsContainerBox::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsContainerBox.cpp line 553]
nsBoxFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBoxFrame.cpp line 979]
nsBox::Layout [d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBox.cpp line 985]

For more information see
http://ftp.mozilla.org/pub/data/crash-data/detailed-crash-analysis-all.html#nsSliderFrame::DoLayout
wouldn't a null pointer on the second instance of a variable being 
dereferenced be a good indication that the object was torn down under your 
feet? there was just a discussion in one of these bug around here about async 
loads of stylesheets I think and it doing just that....
Taking.
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9.2
giving
Assignee: trudelle → hyatt
Status: ASSIGNED → NEW
Hmmm.  This one baffles me.  There's a null check for thumbbox right at the top 
of DoLayout, so I don't really see how this could crash.
Status: NEW → ASSIGNED
This was reported by Talkback to last crash with build 2001060606 on the Trunk
and build 2001060511 on the M091 branch.  

And since I see nsSprocketLayout:Layout in the stack, I thought I would mention
there is a topcrash logged for a crash there in bug 83669...in case looking
there might help understand this problem.

Here are some user comments from the Trunk Talkback topcrash report:
(31352500)
URL: www.google.com
(31351931)
URL: www.google.com
(31351931)
Comments: Opening New window and downloading a PDF going to Acrobat externally.
     (31281820)	Comments: I clicked on 'help' and then 'about netscape 6'.Also there is no
'cancel' button mentioned in the'Agent setup wizard'.
     (31135691)	Comments: After starting mozilla
     (31113182)	Comments: i cleared the url and started to type a new one
     (31069066)	URL: http://bugzilla.mozilla.org/showattachment.cgi?attach_id=35969
(31069066)
Comments: Replying to email

And a couple of instances of this crash reported on the M091 branch:

-   31400162  2001060511  Netscape6.10B1  Win32  2001-06-06 10:22:17   damn
focals page again... how will I ever get a raise if I can't fill out the form! 
barrowma@netscape.com  nsSliderFrame::DoLayout ecf605bb

-  31400160  2001060511  Netscape6.10B1  Win32  2001-06-06 10:20:01   trying to
load one of the focal forms  barrowma@netscape.com  nsSliderFrame::DoLayout ecf605bb

And a full stack from the most recent crash reported on the Trunk:

Incident ID 31402008
Stack Signature nsSliderFrame::DoLayout ecf605bb
Trigger Time 2001-06-06 11:16:00
Email Address
Client IP Address 209.1.108.123
User Comments
Build ID 2001060606
Product ID MozillaTrunk
Platform ID Win32
Stack Trace
nsSliderFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSliderFrame.cpp, line 386]
nsBox::Layout [d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBox.cpp, line 985]
nsSprocketLayout::Layout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSprocketLayout.cpp, line 417]
nsContainerBox::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsContainerBox.cpp, line 553]
nsBoxFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBoxFrame.cpp, line 979]
nsBox::Layout [d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBox.cpp, line 985]
nsSprocketLayout::Layout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSprocketLayout.cpp, line 417]
nsContainerBox::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsContainerBox.cpp, line 553]
nsBoxFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBoxFrame.cpp, line 979]
nsBox::Layout [d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBox.cpp, line 985]
nsSprocketLayout::Layout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSprocketLayout.cpp, line 417]
nsContainerBox::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsContainerBox.cpp, line 553]
nsBoxFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBoxFrame.cpp, line 979]
nsBox::Layout [d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBox.cpp, line 985]
nsSprocketLayout::Layout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSprocketLayout.cpp, line 417]
nsContainerBox::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsContainerBox.cpp, line 553]
nsBoxFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBoxFrame.cpp, line 979]
nsBox::Layout [d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBox.cpp, line 985]
nsSprocketLayout::Layout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSprocketLayout.cpp, line 417]
nsContainerBox::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsContainerBox.cpp, line 553]
nsBoxFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBoxFrame.cpp, line 979]
nsBox::Layout [d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBox.cpp, line 985]
nsPopupSetFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsPopupSetFrame.cpp, line 291]
nsBox::Layout [d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBox.cpp, line 985]
nsSprocketLayout::Layout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSprocketLayout.cpp, line 417]
nsContainerBox::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsContainerBox.cpp, line 553]
nsBoxFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBoxFrame.cpp, line 979]
nsMenuFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsMenuFrame.cpp, line 805]
nsBox::Layout [d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBox.cpp, line 985]
nsSprocketLayout::Layout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSprocketLayout.cpp, line 417]
nsContainerBox::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsContainerBox.cpp, line 553]
nsBoxFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBoxFrame.cpp, line 979]
nsBox::Layout [d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBox.cpp, line 985]
nsSprocketLayout::Layout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSprocketLayout.cpp, line 417]
nsContainerBox::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsContainerBox.cpp, line 553]
nsBoxFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBoxFrame.cpp, line 979]
nsBox::Layout [d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBox.cpp, line 985]
nsSprocketLayout::Layout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSprocketLayout.cpp, line 417]
nsContainerBox::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsContainerBox.cpp, line 553]
nsBoxFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBoxFrame.cpp, line 979]
nsBox::Layout [d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBox.cpp, line 985]
nsSprocketLayout::Layout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSprocketLayout.cpp, line 417]
nsContainerBox::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsContainerBox.cpp, line 553]
nsBoxFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBoxFrame.cpp, line 979]
nsBox::Layout [d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBox.cpp, line 985]
nsSprocketLayout::Layout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSprocketLayout.cpp, line 417]
nsContainerBox::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsContainerBox.cpp, line 553]
nsBoxFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBoxFrame.cpp, line 979]
nsBox::Layout [d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBox.cpp, line 985]
nsSprocketLayout::Layout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSprocketLayout.cpp, line 417]
nsContainerBox::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsContainerBox.cpp, line 553]
nsBoxFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBoxFrame.cpp, line 979]
nsBox::Layout [d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBox.cpp, line 985]
nsSprocketLayout::Layout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSprocketLayout.cpp, line 417]
nsContainerBox::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsContainerBox.cpp, line 553]
nsBoxFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBoxFrame.cpp, line 979]
nsBox::Layout [d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBox.cpp, line 985]
nsSprocketLayout::Layout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSprocketLayout.cpp, line 417]
nsContainerBox::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsContainerBox.cpp, line 553]
nsBoxFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBoxFrame.cpp, line 979]
nsBox::Layout [d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBox.cpp, line 985]
nsSprocketLayout::Layout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSprocketLayout.cpp, line 417]
nsContainerBox::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsContainerBox.cpp, line 553]
nsBoxFrame::DoLayout
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsBoxFrame.cpp, line 979] 

Keywords: crash
Summary: crashes [@ nsSliderFrame::DoLayout] → Trunk and M091 crashes [@ nsSliderFrame::DoLayout]
*** Bug 84470 has been marked as a duplicate of this bug. ***
*** Bug 85858 has been marked as a duplicate of this bug. ***
Keywords: nsenterprise
Here are some additional user comments from the latest Talkback reports:

 Source File :
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/layout/xul/base/src/nsSliderFrame.cpp
line : 386

     (31770510)	URL: www.txcountydata.com/valueInfo.asp
(31770510)
Comments: opening additional browser window
     (31756073)	URL: www.downloadit.de
(31756073)
Comments: switch between mail und navigator while the navigator window was open
at start and closed before I changed to mail
     (31716703)	Comments: loading another page just after printing
     (31715684)	Comments: Opened new browser window.   (Ctrl-N)

Also adding qawanted to see if we can get this reproduced.
Keywords: qawanted
I can't reproduce this.
Keywords: nsBranch
Target Milestone: mozilla0.9.2 → mozilla0.9.3
Dave: I just come across the similar problem and filed a bug
(http://bugzilla.mozilla.org/show_bug.cgi?id=90516).

As I comment in 95016, it crashed at line 379.
mRatio = float(ourmaxpos)/float(maxpos + ourmaxpos)
where maxpos and ourmaxpos are ZEROs.


I hope this helps.
For bug 95016, if maxpos and ourmaxpos are ZEROs at line 379, assuming the pixel 
to twip stuff didn't break and that no-one has been corrupting our local 
variables, that would appear to imply that (at a minimum) maxpospx was 0 at line 
361, which would mean it was set to 0 at line 357 by the 
GetMaxPosition(scrollbar) call. On the face of the code, I don't see any 
guarentee that GetMaxPosition(scrollbar) might not on occasion return zero, so I 
think the code near line 379 needs to check for this to avoid divide by zero 
errors. I'm not convinced that this is related to bug 83805, where it seems that 
thumbBox was non-null at line 322 (and presumably 348) but had mysteriously 
become null or dangling (the comments are unclear) by line 384. The only 
possible common cause I can see would be if something was corrupting our local 
stack variables, and sometimes it corrupts something in the maxpos calculation 
while other times it hits the thumbBox pointer. On the other hand if thumbBox is 
really becoming null (as opposed to just dangling), it is hard to see how this 
could happen other than by something corrupting local variables.

In the absence of a testcase I thought I'd try looking at a recent talkback 
incident details:

http://climate/reports/incidenttemplate.cfm?bbid=32429078

[WARNING: link may go dead soon: if so use topcrash data to get a new incident 
number]

(specifically

 Trigger Type:  Program Crash 

 Trigger Reason:  Invalid operation 

 Thread ID:   

 Call Stack:    (Signature = nsSliderFrame::DoLayout 26e00dd1) 
     
   nsSliderFrame::DoLayout 
                                   
[d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSliderFrame.cpp, line 390]

plus

http://climate/reports/incidenttemplate.cfm?setvar=DeveloperDeveloperTabSet:Code
+Around+the+PC&bbid=32429078#DeveloperMachineState

 x86 Registers: 
 EAX: 01e5ded8   EBX: 0000000f   ECX: 0068d918   EDX: 0068d92c
 ESI: 01e5dc20   EDI: 00000000   ESP: 0068d8d4   EBP: 0068d944
 EIP: 603998ad   cf PF af ZF sf of IF df nt RF vm   IOPL: 0 

 CS: 0137  DS: 013f  SS: 013f  ES: 013f  FS: 4cdf  GS: 0000


 Code Around the PC: 
 -> 603998ad d95e44           fstp    dword ptr [esi+0x44]
    603998b0 8b08             mov     ecx,[eax]
    603998b2 ddd8             fstp    st(0)
    603998b4 ff5118           call    dword ptr [ecx+0x18]
    603998b7 837de800         cmp     dword ptr [ebp-0x18],0x0
    603998bb 7e45             jle     60399902
    603998bd d945e0           fld     dword ptr [ebp-0x20]
    603998c0 d84e44           fmul    dword ptr [esi+0x44]
    603998c3 d9ee             fldz
    603998c5 d8d9             fcomp   st(1)
    603998c7 dfe0             fstsw
    603998c9 9e               sahf
    603998ca 7708             ja      603998d4
    603998cc d809             fmul    dword ptr [ecx]
)

I'm not sure that thumbBox is null: I don't see any registers that look like a 
null pointer. In particular EAX: 01e5ded8 ECX: 0068d918 ESI: 01e5dc20 ESP: 
0068d8d4 all look pretty plausible.

The code in this region is:

 385 nscoord flex = 0;
 384 thumbBox->GetFlex(aState, flex);
 385   
 386 if (flex > 0) 
 387 {
 388    nscoord thumbsize = NSToCoordRound(ourmaxpos * mRatio);
 389 
 390    if (thumbsize > thumbcoord)
 
Guessing at a disassambly

 385 nscoord flex = 0;
 384 thumbBox->GetFlex(aState, flex);
 -> 603998ad d95e44           fstp    dword ptr [esi+0x44]
    603998b0 8b08             mov     ecx,[eax]
    603998b2 ddd8             fstp    st(0)
    603998b4 ff5118           call    dword ptr [ecx+0x18]

 385   
 386 if (flex > 0)
    603998b7 837de800         cmp     dword ptr [ebp-0x18],0x0
    603998bb 7e45             jle     60399902
 388    nscoord thumbsize = NSToCoordRound(ourmaxpos * mRatio);
    603998bd d945e0           fld     dword ptr [ebp-0x20]
    603998c0 d84e44           fmul    dword ptr [esi+0x44]
 389 
 390    if (thumbsize > thumbcoord) 
    603998c3 d9ee             fldz
    603998c5 d8d9             fcomp   st(1)
    603998c7 dfe0             fstsw

In which case it looks like we are falling over before we actually call 
thumbBox->GetFlex(), somewhere in the process of setting up its call parameters 
- it looks like it was aState rather than flex. Frankly I'm puzzled as to what 
the "Invalid operation" could be, unless we suddenly lost access to the memory 
around 01e5dc20 and it's a page fault.

Anyone more familiar with x86 assembly code want to take this and run with it?
Since we still don't have a solid test case, here is some more user comments and 
urls from Talkback data:

From N610 branch reports:
Source File : 
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/layout/xul/base/src/nsSlider
Frame.cpp line : 386
     (32742909) Comments: Right-clicked on a link in a message and then chose 
"Open in a new window".
     (32739727) Comments: opening a link from mail
     (32738178) Comments: Had just clicked File>Open
     (32549750) URL: http://freesms.kataweb.it/servlet/RequestGateway
     (32549750) Comments: open frame in new window

From MozillaTrunk reports:
Source File : 
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/layout/xul/base/src/nsSlider
Frame.cpp line : 406
     (32781590) Comments: Typing something into the URLbar.
     (32772859) Comments: MOZILLA caused an exception 10H in module GKLAYOUT.DLL 
at 0177:6039a0c0.Registers:EAX=04804ed4 CS=0177 EIP=6039a0c0 
EFLGS=00010246EBX=0000000f SS=017f ESP=0068bea4 EBP=0068bf14ECX=0068bee8 DS=017f 
ESI=04804c4c FS=198fEDX=0068befc ES=017f
     (32772859) Comments:  EDI=00000000 GS=0000Bytes at CS:EIP:d9 5e 44 8b 08 dd 
d8 ff 51 18 83 7d e8 00 7e 45 Stack dump:04804ed4 0068cea0 0068befc 00000000 
00000001 04804c4c 04ebc640 0068cea0 0068bfe0 60331b8b 04ec45d0 04804c18 60f0753a 
00000000 00000000 00000000 
     (32757176) URL: http://www.hc-sc.gc.ca/hpb/lcdc/fedlab/html/vr_tour.html 
(the pages it links to)
     (32602771) Comments: STARTING
     (32578701) URL: http://info.pihla.net/test.html
     (32578701) Comments: 1. Press the button2. Windows opens
     (32566450) Comments: print then canceledit -> preferences then cancelprint 
then canceledit -> preferences
     (32482902) Comments: click on print on nav bar
     (32482854) Comments: clicking on edit to go to preferences
     (32444360) Comments: Click on printing a page then go to File -> new 
browser window
     (32429360) Comments: Printing then  clicking on File -> New web browser

And from M092 reports:
Source File : 
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/layout/xul/base/src/nsSlider
Frame.cpp line : 386
     (32806005) Comments: This error is reproducible. I get it when I first open 
a Print.. dialog
     (32779662) Comments: right clicked on Open in new window.
     (32747319) URL: mail.yahoo.com
     (32747319) Comments: Reading mail from an IMAP server
     (32677102) URL: http://www.evendi.de
     (32677102) Comments: I was leaving the site through another link to 
www.theregister.co.uk
     (32639256) Comments: Clicked on link before page finished loading.  Didn't 
go to new page.  Right clicked Reload. Crash.
     (32610602) Comments: opened in a new window
     (32555827) Comments: Starting program
Summary: Trunk and M091 crashes [@ nsSliderFrame::DoLayout] → M092, Trunk & N610 crashes [@ nsSliderFrame::DoLayout]
I can reproduce this every time by just opening the Open Web Location dialog
while a page is printing.
(Hmm, and that doesn't work for me). Are you in a mozilla build? Opt or debug?
(I'm mozilla build, opt. with symbols).
I'm on a 2001071103 Win98 nightly. I go to File | Print... when at, say, this 
page, press OK (leaving the default settings). Then I wait a couple seconds 
until I hear my printer going. Then I go to File | Open Web Location...

Boom.

Has this been reproduced on anything but Windows 98?
So, it sure didn't take long for my FULLY REPRODUCIBLE! testcase to become 
completely benign on my machine.

But then I created a new profile and immediately reproduced it again with the 
same steps. This time my printer was off-line (for some reason) so the document 
didn't even print, but I still crashed.

Can you try with a new profile and, even better, on Windows 98?

By the way, to further answer your question, this is a Mozilla branch build.
So far, twice in a row now, I can consistency reproduce the crash using my steps 
if, with a new profile, I keep hitting No at the win32 integration alert that 
appears upon startup.  Once I hit Yes, I can no longer reproduce it from that 
point on.

And, come to think of it, I finally got tired of that dialog and hit Yes last 
time -- right before I started being unable to reproduce the crash.

In Advanced > System prefs, I've got everything checked off.
> has this been reproduced on anything but windows98?

looks like the answer to that is _no_ from search through 
a sample of talkback data

number of rpts: build id : date rpted : OS ver. : stack sig : source line 

   1 2001062817 2001-07-04 Windows 98  4.90 build 73010104   
      nsSliderFrame::DoLayout ecf605bb 
      d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSliderFrame.cpp 386
   1 2001062817 2001-07-05 Windows 98  4.10 build 67766446 
      nsSliderFrame::DoLayout 808ad5da 
      d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSliderFrame.cpp 386
   2 2001062817 2001-07-05 Windows 98  4.10 build 67766446
      nsSliderFrame::DoLayout ecf605bb
      d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSliderFrame.cpp 386
   1 2001062817 2001-07-05 Windows 98  4.90 build 73010104 
      nsSliderFrame::DoLayout 6d11c8b9 
      d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSliderFrame.cpp 386
   2 2001062817 2001-07-06 Windows 98  4.10 build 67766446 
      nsSliderFrame::DoLayout ecf605bb 
      d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSliderFrame.cpp 386
   1 2001062817 2001-07-08 Windows 98  4.10 build 67766446 
      nsSliderFrame::DoLayout ecf605bb
      d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSliderFrame.cpp 386
   1 2001062817 2001-07-09 Windows 98  4.10 build 67766222 
      nsSliderFrame::DoLayout ecf605bb 
      d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSliderFrame.cpp 386
   1 2001062817 2001-07-10 Windows 98  4.10 build 67766446 
      nsSliderFrame::DoLayout 4aa54c7d 
      d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSliderFrame.cpp 386
   1 2001062817 2001-07-10 Windows 98  4.10 build 67766446 
      nsSliderFrame::DoLayout ecf605bb 
      d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSliderFrame.cpp 386
   1 2001062817 2001-07-10 Windows 98  4.90 build 73010104
      nsSliderFrame::DoLayout 808ad5da 
      d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSliderFrame.cpp 386
   1 2001062817 2001-07-11 Windows 98  4.10 build 67766446 
      nsSliderFrame::DoLayout ecf605bb 
      d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSliderFrame.cpp 386
   2 2001062817 2001-07-12 Windows 98  4.10 build 67766222 
      nsSliderFrame::DoLayout 93dc7d7e
      d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSliderFrame.cpp 386
   1 2001062817 2001-07-12 Windows 98  4.10 build 67766446 
      nsSliderFrame::DoLayout ecf605bb 
      d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSliderFrame.cpp 386
   1 2001062817 2001-07-13 Windows 98  4.10 build 67766446 
      nsSliderFrame::DoLayout d5ffbd94 
      d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSliderFrame.cpp 386
   1 2001062817 2001-07-13 Windows 98  4.10 build 67766446
      nsSliderFrame::DoLayout ecf605bb 
      d:\builds\seamonkey\mozilla\layout\xul\base\src\nsSliderFrame.cpp 386

> Anyone more familiar with x86 assembly code want to take this and run with it?

morse?
The x86 code posted in the talkback report is not consistent with the stacktrace 
posted in this bug report.  So the guess at the disassembly posted above is all 
wrong.  Below is the correct disassembly corresponding to the stacktrace and it 
doesn't at all agree with the x86 code in the traceback report.  Could it 
be that we are talking about two different problems here?

380:    // if there is more room than the thumb need stretch the
381:    // thumb
382:
383:    nscoord flex = 0;
0292F4F3   mov         dword ptr [flex],0
384:    thumbBox->GetFlex(aState, flex);
0292F4FA   lea         eax,[flex]
0292F4FD   push        eax
0292F4FE   mov         ecx,dword ptr [aState]
0292F501   push        ecx
0292F502   mov         edx,dword ptr [thumbBox]
0292F505   mov         eax,dword ptr [edx]
0292F507   mov         ecx,dword ptr [thumbBox]
0292F50A   push        ecx
0292F50B   call        dword ptr [eax+18h]
385:
386:    if (flex > 0)
0292F50E   cmp         dword ptr [flex],0
0292F512   jle         nsSliderFrame::DoLayout+2C5h (0292f565)
387:    {
388:      nscoord thumbsize = NSToCoordRound(ourmaxpos * mRatio);
0292F514   fild        dword ptr [ourmaxpos]
0292F517   mov         edx,dword ptr [this]
0292F51A   fmul        dword ptr [edx+44h]
0292F51D   push        ecx
0292F51E   fstp        dword ptr [esp]
0292F521   call        NSToCoordRound (02826c60)
0292F526   add         esp,4
0292F529   mov         dword ptr [thumbsize],eax
389:
390:      if (thumbsize > thumbcoord)
0292F52C   mov         eax,dword ptr [thumbcoord]
0292F52F   mov         ecx,dword ptr [thumbsize]
0292F532   cmp         ecx,dword ptr [eax]
0292F534   jle         nsSliderFrame::DoLayout+2ACh (0292f54c)
391:      {
392:          // if the thumb is flexible make the thumb bigger.
393:        if (isHorizontal)
0292F536   cmp         dword ptr [isHorizontal],0
0292F53A   je          nsSliderFrame::DoLayout+2A4h (0292f544)
394:           thumbSize.width = thumbsize;
0292F53C   mov         edx,dword ptr [thumbsize]
0292F53F   mov         dword ptr [thumbSize],edx
395:         else
0292F542   jmp         nsSliderFrame::DoLayout+2AAh (0292f54a)
396:            thumbSize.height = thumbsize;
0292F544   mov         eax,dword ptr [thumbsize]
0292F547   mov         dword ptr [ebp-60h],eax
397:
398:      } else {
0292F54A   jmp         nsSliderFrame::DoLayout+2C3h (0292f563)
399:          ourmaxpos -= thumbcoord;
0292F54C   mov         ecx,dword ptr [thumbcoord]
0292F54F   mov         edx,dword ptr [ourmaxpos]
0292F552   sub         edx,dword ptr [ecx]
0292F554   mov         dword ptr [ourmaxpos],edx
400:          mRatio = float(ourmaxpos)/float(maxpos);
0292F557   fild        dword ptr [ourmaxpos]
0292F55A   fidiv       dword ptr [maxpos]
0292F55D   mov         eax,dword ptr [this]
0292F560   fstp        dword ptr [eax+44h]
401:      }
402:    } else {
0292F563   jmp         nsSliderFrame::DoLayout+2DCh (0292f57c)
403:        ourmaxpos -= thumbcoord;
0292F565   mov         ecx,dword ptr [thumbcoord]
0292F568   mov         edx,dword ptr [ourmaxpos]
0292F56B   sub         edx,dword ptr [ecx]
0292F56D   mov         dword ptr [ourmaxpos],edx
404:        mRatio = float(ourmaxpos)/float(maxpos);
0292F570   fild        dword ptr [ourmaxpos]
0292F573   fidiv       dword ptr [maxpos]
0292F576   mov         eax,dword ptr [this]
0292F579   fstp        dword ptr [eax+44h]
405:    }
Hyatt said:

> Hmmm.  This one baffles me.  There's a null check for thumbbox right at
> the top of DoLayout, so I don't really see how this could crash.

There are actually several ways that this particular statement can crash.  
First, the test for thumbbox guarentees that it is not zero but it does not 
guarentee that it points to valid memory.  Furthermore, thumbbox is not the only 
thing being dereferenced in this statement.  If you look at the disassembly that 
I posted, you'll see that aState is being dereferenced as well.  And, 
furthermore, thumbbox is being double dereferenced, so not only must it not be 
null, but it must not point to a location containing null.

Can someone who has a reproducible testcase and a debugger check to see that 
aState is not null and that thumbbox does not point to a null.
I just meant I didn't think we were looking at the same crash.  (The original 
assertion a long time ago was that thumbFrame was null).  I of course see 
plenty of other ways in which you could get a crash... but we're talking about 
a different bug than the one I originally fixed.
*** Bug 91319 has been marked as a duplicate of this bug. ***
Is there a simple null-check (or 2 or N) that could be added to narrow down the
occurances of this crash.  Maybe we won't exactly kill this, but the branch
would be a better release for it.  
shd we not be checking for a zero denominator before doing

mRatio = float(ourmaxpos)/float(maxpos + ourmaxpos);

in this code? what shd we return in that case? is there any reason we are not
doing that check? 
Attaching a patch for the branch that can't be any worse that what we now have 
and maybe will make things a bit better.

cc'ing blake and vishy for r and sr
cc'ing selmer for branch approval
r=vishy for the branch only. 
I'm worried about returning from the routine.  Sure, it might prevent the crash,
but it probably horks things up a lot.  

All this code is about finding the mRatio.  I think ourmaxpos and maxpos should
just both be prevented from getting set to zero by instead setting them to one.
 I believe if you do that you'll get through all the subsequent code without
danger of divide by zero (unless thumbcoord can actually have exactly the value
1 as well.)  

Another alternative is to change your patch to just set mRatio itself to 1 and
skip all the subsequent fooling around on the assumption that 1 is the best we
can do when the numbers are goofy.  (This should be safe because all of that
code is trying to optimize the mRatio and the new mRatio is the only reason it's
there.)

Anyway, returning probably results in a missing scroll thumb or some other
artifact that ain't pretty and there's no reason to fail that badly when there's
a simple way to avoid it.
Saari/Blake could you r=/sr= the second patch for the branch only? thanks, Vishy
r=saari. Someone that knows this code should consider initializing mRatio, but
since that didn't happen before, I don't think this will make things any worse.
PDT+.  Ok, maybe I'm biased :-)
Whiteboard: PDT+
sr=blake for branch only, under the assumption that there's nothing worse than a
crash (except reformatting the user's hard drive, resetting their default
browser to IE, uninstalling ourselves, etc.).

I agree with saari that mRatio should be initialized (later on, as part of a
trunk fix).
checked in the 07/19/01 00:22 patch on the Netscape branch. removing PDT+ as 
this does not block the netscape release anymore. 
Whiteboard: PDT+ → zero divide fixed on branch.
Oops, sorry I fell asleep last night and wasn't able to react sooner.  I got up 
early this morning to see what the status was and I saw that vishy already 
checked it in so thanks.

Setting the ratio to 1 was my original thought to and I couldn't decide whether 
to do that or to simply return.  So that's good too.
Thanks for the help on this one, guys.  Reassigning back to evaughan, who can
fix this on the trunk.
Assignee: hyatt → evaughan
Status: ASSIGNED → NEW
*** Bug 90516 has been marked as a duplicate of this bug. ***
Target Milestone: mozilla0.9.3 → mozilla0.9.4
*** Bug 88990 has been marked as a duplicate of this bug. ***
*** Bug 88289 has been marked as a duplicate of this bug. ***
Is there a time frame on when this will be checked into the trunk? Hopefully 
before 0.9.3?
needs landing on trunk and evaughan isn't around. -->hyatt
Assignee: evaughan → hyatt
Latest nightly build fixed my problems. Thank you.
Anyone can do this.  Load balance. Thanks.
Assignee: hyatt → saari
I've just checked with lxr that the patch has not yet been checked into the trunk. 
I wonder why?
I'll take this.
Assignee: saari → dprice
I've attached a new patch, can I get an R= and an SR=

Thanks
dprice, your patch is almost identical to Vishy's except you redundantly set
mRatio to 1 before overwriting it in the if block.  The trunk is waiting for
someone to say what the right behavior is if any of the denominators become
zero.  Are you authoritatively saying it should just be set to 1?
I believe evaughn is the only one who is able to speak authoratively about the
code in question.  In a brief conversation with hyatt, he suggested setting it
to 1  Requests for additional feedback went unanswered, so I figured I'd toss
this up and see where it landed.
May God have mercy on us all. The 212 bug spam-o-rama is Now!
QA Contact: aegis → jrgm
Keywords: nsBranchnsbranch
Whiteboard: zero divide fixed on branch. → zero divide fixed on branch.[want for 0.9.4]
-> evaughn  He knows the code
Assignee: dprice → evaughan
sr=hyatt
r=saari
Status: NEW → ASSIGNED
Whiteboard: zero divide fixed on branch.[want for 0.9.4] → fixed have R and SR awaiting approval
a=dbaron on behalf of drivers
Whiteboard: fixed have R and SR awaiting approval → fixed have R and SR. approved by dbaron(drivers)
Set milestone to mozilla0.9.5
Target Milestone: mozilla0.9.4 → mozilla0.9.5
if we can get this in on the 0.9.4 branch before mozilla releases
then lets try.  this could use a good milestone testing verfication...

otherwise the option is to take the fix on the 0.9.4 branch after
the mozilla release in some additional work netscape will do there
that leads up to the next netscape release.

after we get this landed somewhere and we get a chance for many
people to pound on it to make sure the fix is right and has no
side effects then we can look at the risk/benefit of it landing
on older branches.

this has the nsbranch keyword so its on the right path to 
get put in the right branches in the plan I outlined above.

Keywords: topembed
oops... my long winded comment just above was applied to the
wrong bug.  it should have been applied to bug 93371.   

this one look a big less risk so if that the case lets 
try and get it too...
Was checked in.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Blocks: 99225
Is this fixed on 0.9.4 as well, I could not tell from the comments? This was
marked fixed AFTER the branch was cut so I doubt it...
Looks like it was on the branch:

----------------------------
revision 1.76
date: 2001/08/31 20:37:56;  author: evaughan%netscape.com;  state: Exp;  lines:
+11 -4
b=83805
r=saari
sr=hyatt
a=dbaron

Fixes a divide by 0 bug in the scrollbar.
----------------------------

	MOZILLA_0_9_4_RELEASE: 1.76
	MOZILLA_0_9_4_BRANCH: 1.76.0.2
	MOZILLA_0_9_4_BASE: 1.76
Marking verified fixed.  Talkback data shows this crash last occurred with
MozillaTrunk build 2001080112.
Status: RESOLVED → VERIFIED
Crash Signature: [@ nsSliderFrame::DoLayout]
Keywords: qawanted
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: