Closed Bug 146308 Opened 20 years ago Closed 20 years ago

N700 M100 M1BR crashes [@ nsHTMLReflowState::ComputeContainingBlockRectangle]

Categories

(Core :: Layout, defect, P1)

x86
Windows ME
defect

Tracking

()

VERIFIED FIXED
mozilla1.2beta

People

(Reporter: greer, Assigned: alexsavulov)

References

Details

(Keywords: crash, topcrash+, Whiteboard: [ADT2 RTM] [ETA 09/21])

Crash Data

Attachments

(3 files, 4 obsolete files)

This one looks like some analysis has already been done in bug 139198. Talkback
data for M1RC2 and for N70PR1 show Windows users crashing at the
nsHTMLReflowState::ComputeContainingBlockRectangle signature.

cc'ing timeless (who reported bug 139198), who may have insight into how to
reproduce/fix this one. timeless, feel free to dupe this one if you think the
new bug is unnecessary.

Stack Trace:

nsHTMLReflowState::ComputeContainingBlockRectangle
[d:\builds\seamonkey\mozilla\layout\html\base\src\nsHTMLReflowState.cpp, line 1490]
nsHTMLReflowState::InitConstraints
[d:\builds\seamonkey\mozilla\layout\html\base\src\nsHTMLReflowState.cpp, line 1637]
nsHTMLReflowState::Init
[d:\builds\seamonkey\mozilla\layout\html\base\src\nsHTMLReflowState.cpp, line 268]
nsHTMLReflowState::nsHTMLReflowState
[d:\builds\seamonkey\mozilla\layout\html\base\src\nsHTMLReflowState.cpp, line 209]
ViewportFrame::Reflow
[d:\builds\seamonkey\mozilla\layout\html\base\src\nsViewportFrame.cpp, line 587]
PresShell::InitialReflow
[d:\builds\seamonkey\mozilla\layout\html\base\src\nsPresShell.cpp, line 2680]
HTMLContentSink::StartLayout
[d:\builds\seamonkey\mozilla\content\html\document\src\nsHTMLContentSink.cpp,
line 3993]
HTMLContentSink::OpenBody
[d:\builds\seamonkey\mozilla\content\html\document\src\nsHTMLContentSink.cpp,
line 3232]
CNavDTD::OpenBody [d:\builds\seamonkey\mozilla\htmlparser\src\CNavDTD.cpp, line
3153]
CNavDTD::OpenContainer [d:\builds\seamonkey\mozilla\htmlparser\src\CNavDTD.cpp,
line 3395]
CNavDTD::HandleDefaultStartToken
[d:\builds\seamonkey\mozilla\htmlparser\src\CNavDTD.cpp, line 1324]
CNavDTD::HandleStartToken
[d:\builds\seamonkey\mozilla\htmlparser\src\CNavDTD.cpp, line 1730]
CNavDTD::HandleToken [d:\builds\seamonkey\mozilla\htmlparser\src\CNavDTD.cpp,
line 908]
CNavDTD::BuildModel [d:\builds\seamonkey\mozilla\htmlparser\src\CNavDTD.cpp,
line 521]
nsParser::BuildModel [d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp,
line 1869]
nsParser::ResumeParse [d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp,
line 1733]
nsParser::OnDataAvailable
[d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp, line 2369]
nsDocumentOpenInfo::OnDataAvailable
[d:\builds\seamonkey\mozilla\uriloader\base\nsURILoader.cpp, line 244]
nsStreamIOChannel::OnDataAvailable
[d:\builds\seamonkey\mozilla\netwerk\base\src\nsInputStreamChannel.cpp, line 508]
nsOnDataAvailableEvent::HandleEvent
[d:\builds\seamonkey\mozilla\netwerk\base\src\nsStreamListenerProxy.cpp, line 203]
PL_HandleEvent [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line 597]
PL_ProcessPendingEvents [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c,
line 530]
_md_EventReceiverProc [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line
1078]
KERNEL32.DLL + 0x248f7 (0xbff848f7)
0x00648bfa
0x00058f64 

--------------------
M1RC2 (nsHTMLReflowState::ComputeContainingBlockRectangle):       17
 Unique Users  8
  10 (2002051008) Windows 98  4.90 build 73010104
   3 (2002051008) Windows NT  5.1 build 2600
   2 (2002051008) Windows NT  5.0 build 2195
   1 (2002051008) Windows NT  4.0 build 1381
   1 (2002051008) Windows 98  4.10 build 67766446

Trunk (nsHTMLReflowState::ComputeContainingBlockRectangle):        0
 Unique Users  0
  
N70PR1 (nsHTMLReflowState::ComputeContainingBlockRectangle):       11
 Unique Users  2
   6 (2002051220) Windows 98  4.90 build 73010104
   5 (2002051220) Windows 98  4.10 build 67766446
Keywords: crash, qawanted
Keywords: topcrash
Summary: M1RC2 crashes [@ nsHTMLReflowState::ComputeContainingBlockRectangle] → M1RC3 crashes [@ nsHTMLReflowState::ComputeContainingBlockRectangle]
Assignee: attinasi → waterson
Priority: -- → P1
Target Milestone: --- → mozilla1.0.1
Taking.
This is currently the #4 topcrash for Netscape 7.0 PR1, adding N7PR1 to summary.
 I will attach recent Talkback reports...since there is A LOT of data.
Summary: M1RC3 crashes [@ nsHTMLReflowState::ComputeContainingBlockRectangle] → M1RC3 N70PR1 crashes [@ nsHTMLReflowState::ComputeContainingBlockRectangle]
The disassembly makes it look like the crash is really a few lines earlier, and
that aContainingBlockRS is null.
David: you are right. The parameter is passed as null. 
   aContainingBlockRS = 0x00000000  (*aContainingBlockRS) = Data not available
and we are immediately dereferencing it to get the computed width and height.

1479 void
1480 nsHTMLReflowState::ComputeContainingBlockRectangle(nsIPresContext*          aPresContext,
1481                                                    const nsHTMLReflowState* aContainingBlockRS,
1482                                                    nscoord&                 aContainingBlockWidth,
1483                                                    nscoord&                 aContainingBlockHeight)
1484 {
1485   // Unless the element is absolutely positioned, the containing block is
1486   // formed by the content edge of the nearest block-level ancestor
1487   aContainingBlockWidth = aContainingBlockRS->mComputedWidth;
1488   aContainingBlockHeight = aContainingBlockRS->mComputedHeight;

x86 Registers:
EAX: 0068f578 EBX: 0068f5d0 ECX: 0068f5d0 EDX: 0068f578
ESI: 00000000 EDI: 0068f57c ESP: 0068f534 EBP: 0068f540
EIP: 603dac2c cf PF af ZF sf of IF df nt RF vm   IOPL: 0
CS: 0177 DS: 017f SS: 017f ES: 017f FS: 50d7 GS: 0000
Code Around the PC: 603dac2c 8b4628           mov     eax,[esi+0x28]
603dac2f bbff7fffff       mov     ebx,0xffff7fff
603dac34 8902             mov     [edx],eax
603dac36 8b462c           mov     eax,[esi+0x2c]
603dac39 8907             mov     [edi],eax
603dac3b 8b411c           mov     eax,[ecx+0x1c]
603dac3e 23c3             and     eax,ebx
603dac40 83f804           cmp     eax,0x4
603dac43 0f8588000000     jne     603dacd1
603dac49 8b461c           mov     eax,[esi+0x1c]

nominating for nsbeta1
Keywords: nsbeta1
Whiteboard: [ADT2 RTM]
I suspect this may have something to do with the porting of bzbarsky's changes
from trunk to branch.
I can't usefully investigate this till tuesday or wednesday... 
see also useless bug 139198
so we know that aContainingBlockRS is null which means cbrs is null in
InitConstraints(). here is where cbrs is init-ed:
1625     const nsHTMLReflowState* cbrs =
1626       GetContainingBlockReflowState(parentReflowState);

calling this fn, there are two ways null can be returned: either aParentRS
(the param passed in) was null, or parentReflowState (a field of aParentRS) was
null. from the useless bug, it looks like timeless deduced that it's the member var,
and i concur, since otherwise execution would not have gotten this far.
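The two null-return paths described above can be sketched with a small model. This is a simplified illustration, not the real Gecko function: ReflowStateModel, isContainingBlock, FindContainingBlock and LookupFromRootIsNull are hypothetical names, and the real qualifying test on the frame is not shown in this bug.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for nsHTMLReflowState.
struct ReflowStateModel {
  const ReflowStateModel* parentReflowState;  // null for the root reflow state
  bool isContainingBlock;  // assumption: stands in for the real frame check
};

// Walk up the parent chain and return the first state that qualifies as a
// containing block. Returns nullptr if aParentRS is null, or if the chain
// ends (parentReflowState == nullptr) before any state qualifies.
const ReflowStateModel* FindContainingBlock(const ReflowStateModel* aParentRS) {
  for (const ReflowStateModel* rs = aParentRS; rs; rs = rs->parentReflowState) {
    if (rs->isContainingBlock) {
      return rs;
    }
  }
  return nullptr;
}

// Models the lookup made while constructing the viewport's kid reflow state:
// the walk starts at the root state, whose own parent is null.
bool LookupFromRootIsNull(bool rootQualifies) {
  ReflowStateModel root{nullptr, rootQualifies};
  return FindContainingBlock(&root) == nullptr;
}
```

In this model, LookupFromRootIsNull(false) yields a null result, which is the value that would then be dereferenced in ComputeContainingBlockRectangle.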
This crashes RC2?  The viewport frame changes I made on the branch were made on
the 1.0.1 branch (May 30).  My previous change to this code was March 12 (on the
trunk and branch both)....

I also do not see any recent branch changes to the HTMLReflowState code...
This crash is happening in Mozilla 1.0 RC3  and Netscape 7.0 PR1.
Ok... does it also crash RC2?  What about RC1?  Is it visible on the trunk, or
just branch?

It would be good to have a window on the branch for when this broke if it's
branch-only.
For internal folk, you can look at these query results to get an idea of when
and where this crash has been happening:
http://climate.mcom.com/reports/VeryFastSearchStackSigNEW.cfm?stacksig=nsHTMLReflowState%3A%3AComputeContainingBlockRectangle

Looking at current Talkback data, this is/was a crash for:
- Almost every Netscape 6.x release going back to Netscape 6.10
- Mozilla 1.0 RC1 (2002041717   Mozilla1.0)
- Mozilla 1.0 RC2 (2002051008   Gecko1.0)
- Mozilla 1.0 RC3 (2002052308   Gecko1.0)
- Netscape 7.0 PR1 (2002051220   Gecko1.0)

And we have already started getting a few incidents for Mozilla 1.0 (2002053015
  Gecko1.0).  So it might be difficult to find out exactly when this started.  

There aren't too many MozillaTrunk crashes, but that's because we don't have as
many users testing nightly builds...the most recent crash was with build
2002050908   MozillaTrunk.
Target Milestone: mozilla1.0.1 → mozilla1.1beta
this null pointer goes back to nsPresShell.cpp in InitialReflow(). in
here, on line 2679 (using xemacs line count) it calls Reflow() (in
nsViewportFrame.cpp) passing in reflowState. reflowState is being init-ed a few
lines above using the first constructor in nsHTMLReflowState.cpp. _in_ the
constructor, parentReflowState is init-ed to NULL. once inside Reflow(), another
nsHTMLReflowState object is created passing in reflowState.

nsHTMLReflowState   kidReflowState(aPresContext, aReflowState, kidFrame,
                                   availableSpace);
where aReflowState is reflowState. now within _this_ constructor
parentReflowState is assigned to the address of reflowState. so, when initially
inside GetContainingBlockReflowState() the param looks like this:
 nsHTMLReflowObj                  nsHTMLReflowObj
 ---------------------            ---------------------
| parentReflowState = +--------->| parentReflowState = +------>null
|                     |          |                     |
 ---------------------            ---------------------

because a null pointer is returned from GetContainingBlockReflowState(), could
this be an off by one error?
Target Milestone: mozilla1.1beta → mozilla1.0.1
We're reflowing the root frame, so the reflow state is supposed to have a null
parent.  The question is why we're trying to dereference it.  We ought to be
using one of the reflow state constructors for the root frame.  Are we?
Since we understand where the crash is happening, I am marking this bug as
topcrash+.
Keywords: topcrash → topcrash+
dbaron: oh, huh, the fn name GetContainingBlockReflowState() suggested that it
was trying to grab an existing block (not a null block). *shrug*

this is where the root frame is created in InitialReflow():
2618  nsIFrame* rootFrame;
2619  mFrameManager->GetRootFrame(&rootFrame);
i'm trying to investigate if this is an actual call to "one of the reflow state
constructors" since i'm not familiar with the code. it's a deep dig woohoo!
actually this is where the root frame gets created in nsPresShell.cpp:
2811       mStyleSet->ConstructRootFrame(mPresContext, root, rootFrame);
2812       mFrameManager->SetRootFrame(rootFrame);

dbaron: i'm not precisely sure what you mean by "one of the reflow state
constructors for the root frame". the rootFrame is of type |nsIFrame*| which is
what the object |nsHTMLReflowState| is _composed_ of among other things. a
|nsIFrame*| object is passed in as a param into one of the reflow constructors.
so, the answer to your question is no, if i understand your question <--- my
insurance :)
any progress on this?  It would be great to get this fixed for 1.0.1.
Target Milestone: mozilla1.0.1 → mozilla1.1beta
Status: NEW → ASSIGNED
yo waterson! XXX would like to see this fixed for 1.0.1. pls update the ETA in
the status whiteboard. thanks!
Blocks: 143047
Keywords: nsbeta1 → nsbeta1+
Whiteboard: [ADT2 RTM] → [ADT2 RTM] [ETA Needed]
i don't see this crash on the trunk but it is the 6th topcrasher for the branch
from the talkback data.
Summary: M1RC3 N70PR1 crashes [@ nsHTMLReflowState::ComputeContainingBlockRectangle] → M1RC3 N70PR1 M1BR crashes [@ nsHTMLReflowState::ComputeContainingBlockRectangle]
waterson: chris, any updates on this one?
according to brendan (in gila logs) and others on irc, waterson has been
reassigned to some secret non mozilla project. we'd love to have him back.
thanks for the heads-up timeless. looking for a new owner to take waterson's
patch to completion ...
karnaze told me about this one today. since i'm w/ the layout troop i'll look at
this too. if someone else is looking at it: don't stop trying, you might be
successful sooner than i will :-)
Alex volunteered to take this. Thanks!
Assignee: waterson → alexsavulov
Status: ASSIGNED → NEW
retargeting after discussion with Tom Greer.
Target Milestone: mozilla1.1beta → mozilla1.2beta
seems to be a frequent crash on talkback. i tried the URLs listed in the
talkback report and cannot make it crash. from the talkback reports the crash is
caused by the argument

nsHTMLReflowState* aContainingBlockRS

that is passed as null in the call to

nsHTMLReflowState::ComputeContainingBlockRectangle

(this was already mentioned in the bug by others)

working on it...
Updating summary with N700 since this is a topcrasher for Netscape 7.0.  I will
attach the latest Talkback data.
Summary: M1RC3 N70PR1 M1BR crashes [@ nsHTMLReflowState::ComputeContainingBlockRectangle] → N700 M100 M1BR crashes [@ nsHTMLReflowState::ComputeContainingBlockRectangle]
This attachment has A LOT of user comments that might help us reproduce this
crash.
Data from talkback:

nsHTMLReflowState::ComputeContainingBlockRectangle
        this = Register variable - data not available
        aPresContext = 0x0086dd80
        aContainingBlockRS = 0x00000000
        aContainingBlockWidth = 0x006af574
        aContainingBlockHeight = 0x006af578

So the crash occurs very probably here:

    if (NS_FRAME_GET_TYPE(aContainingBlockRS->mFrameType)
         == NS_CSS_FRAME_TYPE_INLINE) {


The crash occurs almost always with the following stack:

nsHTMLReflowState::ComputeContainingBlockRectangle   
nsHTMLReflowState::InitConstraints   
nsHTMLReflowState::Init   
nsHTMLReflowState::nsHTMLReflowState   
ViewportFrame::Reflow
PresShell::InitialReflow  
.
.
.

except in incident id=10436989 (and maybe some others)

nsHTMLReflowState::ComputeContainingBlockRectangle  
nsHTMLReflowState::InitConstraints   
nsHTMLReflowState::Init   
nsHTMLReflowState::nsHTMLReflowState   
nsBoxToBlockAdaptor::Reflow   
nsBoxToBlockAdaptor::DoLayout   
nsBox::Layout
.
.
.

From the current trunk code it is impossible to have a null pointer there, since:

PresShell::InitialReflow
...
    nsHTMLReflowState reflowState(mPresContext, rootFrame,
                                  eReflowReason_Initial, rcx, maxSize);

[ this causes reflowState.mCBReflowState = &reflowState ]
...
    rootFrame->Reflow(mPresContext, desiredSize, reflowState, status);

that is

ViewportFrame::Reflow
...
      nsHTMLReflowState   kidReflowState(aPresContext, aReflowState,
                                         kidFrame, availableSpace);


[ aReflowState is &reflowState declared in InitialReflow above, that means
  kidReflowState.parentReflowState will be &reflowState ]

[ the constructor for kidReflowState will do this : ]

nsHTMLReflowState::nsHTMLReflowState
...
  nsHTMLReflowState::Init
...
    nsHTMLReflowState::InitConstraints
...
    const nsHTMLReflowState* cbrs = parentReflowState->mCBReflowState;

[ parentReflowState is &reflowState, that means cbrs will be &reflowState ]


Thus, based on the current code, the crash is not possible, since both
nsHTMLReflowState objects involved here are local objects that are created
statically, so no dangling pointers are possible. I will have to see how
many different reporters we really have here. This might be a bogus installation
or so. I invite anyone who wants to spend a little time to take a look and see
if i'm right or wrong. For now, I say it is not a crash in an intact installation.
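The trunk reasoning above can be checked against a tiny model. This is only a sketch under assumptions: CBStateModel and KidContainingBlockIsRoot are hypothetical names, the root constructor pointing mCBReflowState at itself follows the bracketed note above, and the child simply inheriting its parent's mCBReflowState is a simplification of whatever the trunk code really does.

```cpp
#include <cassert>

// Hypothetical model of the two pointer fields discussed above.
struct CBStateModel {
  const CBStateModel* parentReflowState;
  const CBStateModel* mCBReflowState;

  // Root reflow state: no parent, and it acts as its own containing block.
  CBStateModel() : parentReflowState(nullptr), mCBReflowState(this) {}

  // Child reflow state. Assumption: it inherits the parent's containing
  // block; the real code decides this per frame.
  explicit CBStateModel(const CBStateModel* aParent)
      : parentReflowState(aParent), mCBReflowState(aParent->mCBReflowState) {}

  // Copying would leave mCBReflowState pointing at the source object.
  CBStateModel(const CBStateModel&) = delete;
  CBStateModel& operator=(const CBStateModel&) = delete;
};

// Mirrors the call sequence: InitialReflow builds the root state on the
// stack, ViewportFrame::Reflow builds the kid state from it, and
// InitConstraints reads parentReflowState->mCBReflowState.
bool KidContainingBlockIsRoot() {
  CBStateModel reflowState;
  CBStateModel kidReflowState(&reflowState);
  const CBStateModel* cbrs = kidReflowState.parentReflowState->mCBReflowState;
  return cbrs == &reflowState;
}
```

In this model cbrs can never be null for the viewport's kid, which matches the conclusion drawn for the trunk; it says nothing about the branch code, which obtains cbrs differently.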


Comparing the following information (local variable values, source code around
the crash, call stack, and assembly language instructions), the crash seems to be
happening due to dereferencing aContainingBlockRS (NULL value).

Alex: Can you add a null check to avoid the crash, and continue debugging the problem?


Local Variables and Params from Talkback incident.

nsHTMLReflowState::ComputeContainingBlockRectangle  
      aContainingBlockRS = 0x00000000           
           (*aContainingBlockRS) = Data not available        

Source code around the Crash.

// Called by InitConstraints() to compute the containing block rectangle for
// the element. Handles the special logic for absolutely positioned elements
void
nsHTMLReflowState::ComputeContainingBlockRectangle(nsIPresContext*          aPresContext,
                                                   const nsHTMLReflowState* aContainingBlockRS,
                                                   nscoord&                 aContainingBlockWidth,
                                                   nscoord&                 aContainingBlockHeight)
{
  // Unless the element is absolutely positioned, the containing block is
  // formed by the content edge of the nearest block-level ancestor
  aContainingBlockWidth = aContainingBlockRS->mComputedWidth;
  aContainingBlockHeight = aContainingBlockRS->mComputedHeight;

  if (NS_FRAME_GET_TYPE(mFrameType) == NS_CSS_FRAME_TYPE_ABSOLUTE) {
    // See if the ancestor is block-level or inline-level

Equivalent Assembly code. 

603db719 55               push    ebp
603db71a 8bec             mov     ebp,esp
603db71c 8b5510           mov     edx,[ebp+0x10]
603db71f 53               push    ebx
603db720 56               push    esi
603db721 8b750c           mov     esi,[ebp+0xc]
603db724 57               push    edi
603db725 8b7d14           mov     edi,[ebp+0x14]
603db728 8b4628           mov     eax,[esi+0x28] <--- Crash is happening here.
603db72b bbff7fffff       mov     ebx,0xffff7fff
603db730 8902             mov     [edx],eax
603db732 8b462c           mov     eax,[esi+0x2c]
603db735 8907             mov     [edi],eax
603db737 8b411c           mov     eax,[ecx+0x1c]
603db73a 23c3             and     eax,ebx
603db73c 83f804           cmp     eax,0x4
603db73f 0f8588000000     jne     603db7cd
603db745 8b461c           mov     eax,[esi+0x1c]
603db748 23c3             and     eax,ebx
603db74a 83f801           cmp     eax,0x1

Any idea if this was fixed by the patches to bug 143706?
namachi:

are you able to reproduce the crash? i couldn't do it. regarding a null-check
patch: i don't know if a simple null check will solve the problem. from the
talkback data, i see a bunch of startup crashes, which makes sense because
most of them are happening in the first PresShell::InitialReflow. now, just
placing a null check there may make it crash somewhere else, so it makes
us move from one crash to another.

ok, i'm checking dbaron's patches now .....
Surely that was supposed to be a comment on bug 165062?
damn, yes! thx Boris!
still cannot make it crash. if someone has an installation that crashes, i would
like to have the zipped installation dir with all its contents. let me know if
you have one, we'll figure out a way to transfer those binaries. thx!

for now, i could add the following null check:


nsHTMLReflowState::ComputeContainingBlockRectangle(
        nsIPresContext*          aPresContext,
        const nsHTMLReflowState* aContainingBlockRS,
        nscoord&                 aContainingBlockWidth,
        nscoord&                 aContainingBlockHeight)
{
+ if (!aContainingBlockRS) {
+   aContainingBlockWidth = NS_UNCONSTRAINEDSIZE;
+   aContainingBlockHeight = NS_UNCONSTRAINEDSIZE;
+   return;
+ }
  // Unless the element is absolutely positioned, the containing block is
  // formed by the content edge of the nearest block-level ancestor
  aContainingBlockWidth = aContainingBlockRS->mComputedWidth;
  aContainingBlockHeight = aContainingBlockRS->mComputedHeight;


however, i don't think this will help a lot.
Alex: 
  I don't have a reproducible testcase for this crash. But this is one of the
topmost crashes in Mozilla. We need to fix this asap. AFAICT, in this case a null
check should avoid the crash. Your patch will avoid the crash and may
cause an acceptable rendering issue. Please go ahead and attach the patch, and
let's get r/sr and try it out in the Trunk builds.
i need to check first if NS_UNCONSTRAINEDSIZE is acceptable for that part. (well,
i guess it is more acceptable than a crash :-)
This is only a null check patch. There was no way i could repro this bug.

Regarding dbaron's question:
The code added by the patch for bug 143706 does not get called in the initial reflow
(where the crash occurs). The constructors used there are:

    nsHTMLReflowState reflowState(mPresContext, rootFrame,
				  eReflowReason_Initial, rcx, maxSize);

in the PresShell::InitialReflow and

      nsHTMLReflowState   kidReflowState(aPresContext, aReflowState,
					 kidFrame, availableSpace);

in the ViewportFrame::Reflow.

I didn't add an assertion in the patch because InitConstraints already has
an assertion built in before calling ComputeContainingBlockRectangle. There is
only one other place where this method gets called:

nsAbsoluteContainingBlock::ReflowAbsoluteFrame

but that one gets a reference passed in as an arg and passes it on as a
pointer to that reference.

I'm convinced that there is some problem due to a corrupt file or profile or
something similar. This crash gets reported a lot of times in talkback and the
incidents are reaching back to release 6.1:
2002082310   Gecko1.0       (NS 7.0 release)
2002061815   Gecko1.0       ? dunno
2002051220   Gecko1.0       (NS 7.0 PR1)
2002051008   Gecko1.0       ? dunno
2002050814   Netscape6.23
2002031420   Netscape6.22
2001112815   Netscape6.21
2001102218   Netscape6.20
2001072700   Netscape6.10

In all this time there is not one report from Netscape/AOL, tuckson testers or
whatsoever (come on, such a frequent crash and no one can reproduce it?), and
most of the crashes are duplicates from the same users.

further talkback analysis to follow...
another interesting thing is that there are reports that show _really ancient_
releases, however with recent dates. on the other hand, judging by the ID number,
it looks like the talkback system is messing up the dates reported. example:

Netscape6.10      NETSCP6.EXE 6.1.0.0
Netscape Win32 (2001072700)
Trigger Time: 08/12/2002 10:14:30
Incident ID: 9255690

here is a recent one:

Gecko1.0      NETSCP.EXE 7.0.0.0
Netscape Win32 (2002082310)
Trigger Time: 09/10/2002 16:36:15
Incident ID: 10716284

There must be some kind of a problem with the date displayed by talkback.

The other thing is that this is a crash on the branch only, so i would have to pull
the branch to see if i can reproduce it there, but i doubt it, since i tried all
kinds of releases. I also started to install beginning with 6.2, then 6.21 on top
of it and so on until 7.0 and couldn't reproduce the crash.

pulling the 1_0 branch again....
Alex, regarding your comment above: The buildID matches the date of the build
itself, but the Trigger time is simply the time when that crash incident was
submitted to our servers. 
In your first example, the user is using the NS 6.1 build (from 7/27/2001) and
had a crash while using it on 8/12/2002 at 10:14am.

In both examples the dates, times, builds and incident IDs look normal. Have I
misunderstood your point?
Alex: Trigger Time is based on the system time on the user's machine. It is quite
possible that the clock on the user's machine is not correct.
greer, namachi:
it seems strange to me that almost all the incidents with this signature that
are displayed when i run the search for all incidents containing this stack
signature occurred within the time period between the beginning of august and now
(approximately). see all the build numbers involved there. they range from 6.1 to
7.0. also the incident numbers (or IDs) stretch over almost a million positions.
Is that normal? On the other hand, i see that there are approx. 30K incidents
reported in a single day, so it might be normal.

Hmmm, ok, let's see:

- there are no trunk crashes
- there are almost only Netscape builds involved (i found a couple of Mozilla
Branch and Mozilla 1.0 entries)
- the build numbers are reaching so far back that the trunk should have
registered those crashes too right?
- there are approx. 200 reports a day
- there is not even one netscape employee report there
- the crashes are reported on Win9x as well as all NT's and Linux but no Mac

I'm still tempted to say this is an old profile or a corrupt installation
issue. I will try the following: install 7.0 and get a gecko dll from a
previous release, then see if i can repro the crash.

In the meantime, someone who can extract the unique email addresses could
try to get one of the people who reported the bug to send us a
zipped installation directory with the binaries, if he/she has a fast connection.
I'd like to have a look at those binaries.
Attachment #98668 - Attachment is obsolete: true
Comment on attachment 98796 [details] [diff] [review]
revided patch, i forgot the "return" (d'oh) 

This patch will only push the crash out further. Look in InitConstraints() and
see how many times |cbrs| is used after ComputeContainingBlockRectangle() is
called. If we're really going to wallpaper the problem because no one can
reproduce it, I think the best thing to do would be to check |cbrs| for null
after the call to GetContainingBlockReflowState() in InitConstraints() and if
it is null, initialize everything to something reasonable like |if (nsnull ==
parentReflowState)| case does and return immediately.
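The suggestion above (guard once in InitConstraints rather than inside ComputeContainingBlockRectangle) could look roughly like the following. This is a sketch only: CBSource, Constraints, InitConstraintsModel and kUnconstrainedSize are hypothetical stand-ins, and the fallback values are placeholders, since the values used by the |nsnull == parentReflowState| case are not shown in this bug.

```cpp
#include <cassert>

// Stand-in for NS_UNCONSTRAINEDSIZE; the numeric value is an assumption.
constexpr int kUnconstrainedSize = 1 << 30;

// Hypothetical slice of the containing block's reflow state.
struct CBSource {
  int mComputedWidth;
  int mComputedHeight;
};

// Hypothetical result of the constraint setup.
struct Constraints {
  int containingBlockWidth;
  int containingBlockHeight;
  bool usedFallback;
};

// Check cbrs once, up front. If it is null, fill in defaults and return
// immediately so that no later code in the function can dereference it.
Constraints InitConstraintsModel(const CBSource* cbrs) {
  if (!cbrs) {
    return {kUnconstrainedSize, kUnconstrainedSize, true};
  }
  return {cbrs->mComputedWidth, cbrs->mComputedHeight, false};
}
```

The point of placing the check here is that every later use of cbrs sits behind the single early return, instead of each use needing its own null check.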
Comment on attachment 98796 [details] [diff] [review]
revided patch, i forgot the "return" (d'oh) 

oh i see now what you mean. damn, with these talkback crashers one loses his
mind
Attachment #98796 - Attachment is obsolete: true
ok. let's use the same values as used for the root frame. i moved the assertion
into a DEBUG block and removed an unnecessary re-declaration of cbrs after

	if (NS_FRAME_REPLACED(NS_CSS_FRAME_TYPE_INLINE) == mFrameType) {

since parentReflowState does not change.
after a talk with Kin, it looks like the patch is not what we want. it looks like
this is a branch-only problem and i need to check the branch now. i got a reference
to bug 143706 that might have solved the problem already.
what i wanted to say is that i need to check again if the patch for bug 143706
fixed the crash.
Attachment #98814 - Attachment is obsolete: true
we have to get those 2 patches (see bug 143706) on the branch. will check this
with adt. (where the hell was my mind when i was looking all the time at the
trunk and not at the branch?)
Please port the patches from bug 143706 to the branch and attach them here so we
can get reviews and such. Looks like a straightforward port, but we want to be
extra careful on the branch.
ok, will attach them soon.
ok, i made this patch from dbaron's two patches for bug 143706. need r=, sr=, a=
to check into branch.
Whiteboard: [ADT2 RTM] [ETA Needed] → [ADT2 RTM] [ETA 09/19]
Comment on attachment 99621 [details] [diff] [review]
patch made of the patches for bug 143706 for the branch only

This patch looks like it was applied to the branch by hand rather than by
applying the patches or by using "cvs up -j<version1> -j<version2> <file>".  It
(in addition to numerous whitespace differences with the original patch) seems
to be missing the last line of InitCBReflowState, which I think happens to be
the actual line that fixes this crash (by changing the behavior in the "this
shouldn't happen" case).
Attachment #99621 - Flags: needs-work+
This is what I get when I merge the patches from bug 143706 to the branch by
doing:

cvs up -j3.30 -j3.32 base/public/nsHTMLReflowState.h
cvs up -j1.139 -j1.142 html/base/src/nsHTMLReflowState.cpp
cvs up -j3.221 -j3.222 html/table/src/nsTableOuterFrame.cpp

I haven't tested it at all.
Attachment #99621 - Attachment is obsolete: true
Er, actually, the missing line isn't the error case, but rather the normal case.
i didn't have time to run queries on bonsai to get version numbers. a hint that
there was a missing line would have been enough.
Enough for what?  Not to demonstrate why one should always merge using cvs up
-jA -jB (or at a minimum, by applying patches that are exactly what was checked
in to the trunk) to avoid the risk of introducing regressions on the branch by
unintentionally checking in something other than what's on the trunk?  I'm also
not sure that there weren't other changes missed -- I didn't examine the patch
that closely (diffs of diffs are hard to read, especially when the diffs differ
in whitespace in ways that change which lines are marked in the first level
diff).  (I also really don't understand the comment about not having time if you
did have time to merge the patch by hand, which takes much longer than issuing a
few cvs log commands (or bonsai queries) and cvs up commands.)
how do you want to know how i merge patches by hand?
Alex: Can you verify this patch:
http://bugzilla.mozilla.org/attachment.cgi?id=99625&action=view ? We need to make
sure it contains all the changes needed to fix the topcrash in the Mozilla 1.0
Branch. Let us get r=/sr=.
The patch (attachment 99625) is good to go. Does anyone want to review it
additionally? Waterson was the initial sr=, bzbarsky and attinasi were the r='s.
Attinasi is gone, and afaik bzbarsky is not available right now. Kin?
Comment on attachment 99625 [details] [diff] [review]
patches from 143706 merged using cvs up -j

r/sr=kin@netscape.com
Attachment #99625 - Flags: superreview+
Comment on attachment 99625 [details] [diff] [review]
patches from 143706 merged using cvs up -j

so far i ran tests on the branch that hit all three return paths in
InitCBReflowState() and have not seen any problems. i also ran the regression
test suite and there were no suspect problems so far. if anyone still wants to
do additional reviews go ahead. 

r= alexsavulov
Attachment #99625 - Flags: review+
Comment on attachment 99625 [details] [diff] [review]
patches from 143706 merged using cvs up -j

a=rjesup@wgate.com for 1.0 branch.  Please change mozilla1.0.2+ to fixed1.0.2
when this is checked in.
Attachment #99625 - Flags: approval+
Keywords: adt1.0.2
Whiteboard: [ADT2 RTM] [ETA 09/19] → [ADT2 RTM] [ETA 09/21]
fixed on the branch. (no trunk fix necessary)
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Please verify this on the branch
Keywords: adt1.0.2 → adt1.0.2+
chris:

if you cannot reproduce the bug with a branch build before the patch was
applied, the only way to verify this is to see if the code is checked into the
branch. i'm mentioning this because i wasn't able to reproduce this crash before.
With Alex's help, I can verify that this checkin was made to the branch. Verified.
Status: RESOLVED → VERIFIED
Crash Signature: [@ nsHTMLReflowState::ComputeContainingBlockRectangle]