Closed Bug 485351 Opened 15 years ago Closed 15 years ago

Hang [@ gfxWindowsFont::ComputeMetrics] and eating up all GDI resources with percentage height, mathml and binding

Categories

(Core :: Layout, defect, P2)

x86
Windows XP
defect

Tracking

()

VERIFIED FIXED
mozilla1.9.2a1

People

(Reporter: martijn.martijn, Assigned: jfkthame)

References

Details

(4 keywords, Whiteboard: [fixed1.9.1b4])

Attachments

(5 files)

The testcase that is attached to this bug is hanging in current trunk build and if I don't close it quickly enough, it also makes my computer unusable. I noticed that it basically eats up all available GDI resources, it seems.
With Firefox 3.0.6 I just seem to hang.
Attached file stack from debug build
This is a stack from the debug build, that I get with the hang.
I also noticed this assertion:
###!!! ASSERTION: Failed to make HFONT: 'mFont', file c:/mozilla-build-1.3/src/g
fx/thebes/src/gfxWindowsFonts.cpp, line 751
I presume that assertion happens when all GDI resources are exhausted.

 	ntdll.dll!_KiFastSystemCallRet@0() 	
 	gdi32.dll!_NtGdiGetWidthTable@28()  + 0xc bytes	
 	gdi32.dll!_pcfLocateCFONT@24()  + 0x32e1 bytes	
 	gdi32.dll!_GetTextExtentPointWInternal@20()  + 0xcc bytes	
 	gdi32.dll!_GetTextExtentPoint32W@16()  + 0x18 bytes	
>	thebes.dll!gfxWindowsFont::ComputeMetrics()  Line 933	C++
 	thebes.dll!gfxWindowsFont::GetMetrics()  Line 776	C++
 	thebes.dll!gfxFont::Measure(gfxTextRun * aTextRun=0x0854ec30, unsigned int aStart=0, unsigned int aEnd=1, gfxFont::BoundingBoxType aBoundingBoxType=TIGHT_HINTED_OUTLINE_EXTENTS, gfxContext * aRefContext=0x08b0fd38, gfxFont::Spacing * aSpacing=0x00000000)  Line 510 + 0x13 bytes	C++
 	thebes.dll!gfxWindowsFont::Measure(gfxTextRun * aTextRun=0x0854ec30, unsigned int aStart=0, unsigned int aEnd=1, gfxFont::BoundingBoxType aBoundingBoxType=TIGHT_HINTED_OUTLINE_EXTENTS, gfxContext * aRefContext=0x08b0fd38, gfxFont::Spacing * aSpacing=0x00000000)  Line 1027 + 0x24 bytes	C++
 	thebes.dll!gfxWindowsFont::Measure(gfxTextRun * aTextRun=0x0854ec30, unsigned int aStart=0, unsigned int aEnd=1, gfxFont::BoundingBoxType aBoundingBoxType=TIGHT_HINTED_OUTLINE_EXTENTS, gfxContext * aRefContext=0x08b0fd38, gfxFont::Spacing * aSpacing=0x00000000)  Line 1022 + 0x32 bytes	C++
 	thebes.dll!gfxTextRun::AccumulateMetricsForRun(gfxFont * aFont=0x04e2dcd0, unsigned int aStart=0, unsigned int aEnd=1, gfxFont::BoundingBoxType aBoundingBoxType=TIGHT_HINTED_OUTLINE_EXTENTS, gfxContext * aRefContext=0x08b0fd38, gfxTextRun::PropertyProvider * aProvider=0x0012c554, unsigned int aSpacingStart=0, unsigned int aSpacingEnd=1, gfxFont::RunMetrics * aMetrics=0x0012c4c0)  Line 1875	C++
 	thebes.dll!gfxTextRun::MeasureText(unsigned int aStart=0, unsigned int aLength=1, gfxFont::BoundingBoxType aBoundingBoxType=TIGHT_HINTED_OUTLINE_EXTENTS, gfxContext * aRefContext=0x08b0fd38, gfxTextRun::PropertyProvider * aProvider=0x0012c554)  Line 1948	C++
 	gklayout.dll!nsTextFrame::ComputeTightBounds(gfxContext * aContext=0x08b0fd38)  Line 5678	C++
 	gklayout.dll!nsFrame::ComputeSimpleTightBounds(gfxContext * aContext=0x08b0fd38)  Line 3281 + 0x29 bytes	C++
 	gklayout.dll!nsBlockFrame::ComputeTightBounds(gfxContext * aContext=0x08b0fd38)  Line 783 + 0x10 bytes	C++
 	gklayout.dll!nsMathMLContainerFrame::ReflowChild(nsIFrame * aChildFrame=0x08ec7724, nsPresContext * aPresContext=0x08dbd5b0, nsHTMLReflowMetrics & aDesiredSize={...}, const nsHTMLReflowState & aReflowState={...}, unsigned int & aStatus=0)  Line 928	C++
 	gklayout.dll!nsMathMLTokenFrame::Reflow(nsPresContext * aPresContext=0x08dbd5b0, nsHTMLReflowMetrics & aDesiredSize={...}, const nsHTMLReflowState & aReflowState={...}, unsigned int & aStatus=0)  Line 169 + 0x22 bytes	C++
 	gklayout.dll!nsLineLayout::ReflowFrame(nsIFrame * aFrame=0x08ec766c, unsigned int & aReflowStatus=0, nsHTMLReflowMetrics * aMetrics=0x00000000, int & aPushedFrame=0)  Line 852 + 0x2d bytes	C++
etc...
In general, current trunk builds seem to leak GDI resources like a sieve.
MathML... Could we be leaking as a regression from bug 479276 maybe?
Is there any way we can get automated detection of leaking GDI objects?
(In reply to comment #3)
> MathML... Could we be leaking as a regression from bug 479276 maybe?

I've done a build with that patch backed-out, and it seems to be in an infinite (or very-near-infinite) loop; nsBlockReflowContext is spewing messages to the console about crazy metrics, and it looks like a pair of <div> blocks are continually growing in height.

So far it hasn't actually crashed, so that does raise the question of whether 479276 was causing it to consume GDI resources in a way that it doesn't otherwise.

Aha.... I'm suspicious of this line (not from that patch):

  http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/src/gfxWindowsFonts.cpp#724

Note jtd's comment there; I think he's right. This means that we leak a Windows DC for each font where we test the character map in this way. Bug 479276 causes MathML code to create lots of additional temporary font instances, and that's why it makes the failure show up, but it's not really the source of the leak.

I'm currently building with a patch for this, to check whether it helps. Note that this will *not* resolve the fact that the testcase here hangs (which it did with 3.0.6 as well, apparently), it will only resolve the exhaustion of GDI resources.
That's fine... thanks for dealing with this!
The function that is not returning is nsStackLayout::Layout with aBox=0x354fcd0, which is the interior deck.  It's two hbox children and the button Block frame descendants are very high.
This resolves the resource exhaustion issues; it does *not* resolve the infinite loop in layout, which is not a regression from bug 479276 (it is also present in 3.0.6).

Besides the device-context leak already noted, which would only occur with non-TrueType fonts, the problem was the use of nsRefPtr<gfxWindowsFont> for temporary fonts being created just for measurement purposes. Because these transient fonts are not kept in the global cache, Release() never actually led to them being deleted properly.
Attachment #369608 - Flags: review?(roc)
Loading the testcase with a trunk build on OS X, I get an assertion:

###!!! ASSERTION: A Box's child is constantly growing!!!!!: 'passes < 10', file /Users/jonathan/mozdev/mozsrc/layout/xul/base/src/nsSprocketLayout.cpp, line 627

Not sure why I'm not hitting this same assertion on Windows, unless perhaps it's a recent addition and my trees aren't quite in sync.
This is likely what I experienced today with my local Win2K m-c builds.

With Firefox, I worked around it by using
|skip == table-width-1.xhtml table-width-1-ref.xhtml|
in <reftests\mathml\reftest.list>.

With SeaMonkey, it already failed (randomly) in <reftests\bugs> directory...
(In reply to comment #4)
> Is there any way we can get automated detection of leaking GDI objects?

We can query GDI objects in use; I think I filed a bug a long time ago to get this added to Talos, but can't find it now.
Comment on attachment 369608 [details] [diff] [review]
patch to resolve GDI resource exhaustion
[Checkin: Comment 12]


http://hg.mozilla.org/mozilla-central/rev/098c0921cf99
Attachment #369608 - Attachment description: patch to resolve GDI resource exhaustion → patch to resolve GDI resource exhaustion [Checkin: Comment 12]
Blocks: 437356
Flags: blocking1.9.1?
Whiteboard: [needs landing] → [needs 1.9.1 landing]
Target Milestone: --- → mozilla1.9.2a1
Attachment #369608 - Flags: approval1.9.1?
[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.2a1pre) Gecko/20090328 Minefield/3.6a1pre] (home, optim default) (W2Ksp4)
(http://hg.mozilla.org/mozilla-central/rev/262d44d6e425)
[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.2a1pre) Gecko/20090328 SeaMonkey/2.0b1pre] (experimental/_m-c_, home, optim default) (W2Ksp4)
(http://hg.mozilla.org/mozilla-central/rev/262d44d6e425
 +http://hg.mozilla.org/comm-central/rev/054adad6539d)

I didn't test this fix specifically, but these new builds don't have comment 10 issue(s) anymore :-)
Assignee: nobody → jfkthame
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
I think the hang (minus the resource leak) is a dup of one of the other <stack>-related bugs.
Flags: blocking1.9.1? → blocking1.9.1+
Priority: -- → P2
Comment on attachment 369608 [details] [diff] [review]
patch to resolve GDI resource exhaustion
[Checkin: Comment 12]

Does not apply cleanly to 1.9.1:
{
patching file gfx/thebes/src/gfxWindowsFonts.cpp
Hunk #2 FAILED at 1010
1 out of 2 hunks FAILED
}
Attachment #369608 - Flags: approval1.9.1?
As we haven't taken bug 479276 on 1.9.1, the issue of not releasing temporary gfxWindowsFont instances doesn't exist there. So only one half of this fix is relevant to the branch.
Comment on attachment 370267 [details] [diff] [review]
patch for 1.9.1, only fixing the DC leak in TestCharacterMap
[Checkin: Comment 18]

Roc, marking this for re-review as it's now only half the patch it was. Do you still want to land it on 1.9.1, considering that we didn't take 479276 there, and so we didn't run into the major resource issue? The part of this that applies to 1.9.1 would still be correct, just not that important by itself.
Attachment #370267 - Flags: review?(roc)
Pushed to 1.9.1
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/180363e4c829
Keywords: fixed1.9.1
Whiteboard: [needs 1.9.1 landing]
Whiteboard: [fixed1.9.1b4]
[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.2a1pre) Gecko/20090402 Minefield/3.6a1pre] (home, optim default) (W2Ksp4)
(http://hg.mozilla.org/mozilla-central/rev/c4f6bb1db611)

V.Fixed
Status: RESOLVED → VERIFIED
Attachment #370267 - Attachment description: patch for 1.9.1, only fixing the DC leak in TestCharacterMap → patch for 1.9.1, only fixing the DC leak in TestCharacterMap [Checkin: Comment 18]
verified FIXED on Shiretoko: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b4pre) Gecko/20090422 Shiretoko/3.5b4pre ID:20090422042031
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: