Closed Bug 159298 Opened 22 years ago Closed 21 years ago

win98 GDI handle leak on table background changes

Categories

(Core Graveyard :: GFX, defect, P2)

x86
Windows 98
defect

Tracking

(Not tracked)

RESOLVED FIXED
Future

People

(Reporter: will, Assigned: kmcclusk)

References

()

Details

(Keywords: fixed1.4.2, hang, memory-leak)

Attachments

(3 files, 1 obsolete file)

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.1b) Gecko/20020721
BuildID:    2002072104

When holding shift and running the mouse over the table (which changes the
background color) the systems resources will drop off quickly (190mb ram
consumed in 30sec)

The screen will gray and mozilla will crash. Everything goes back to normal when
mozilla is killed.

Reproducible: Always
Steps to Reproduce:
1.Load
http://honors.tntech.edu:8080/play/template/ColorTable.vm?hues=10&saturations=10&brightnesses=10
2.Hold shift and move the mouse over the colored cells
3.Continue for several seconds watching system resources

Actual Results:  The browser turned funny colors and died.

Expected Results:  Not dying.
observing system resources, and behaviour of windows, Mozilla is leaking GDI
handles, which causes the system to run out, and so no other process can get a
handle either. Closing Mozilla causes the handles to be freed, and thus freeing
up the handles and fixing other apps.
Component: XP Apps → JavaScript Engine
Keywords: mlk
wfm using build 20020809 on Win2k (trunk).
do you still leak with latest version and can you post Talkback ID when crashing
('mozilla/bin/components/talkback.exe') ?
Keywords: crash, stackwanted
==> Browser General
Assignee: sgehani → Matti
Component: JavaScript Engine → Browser-General
Keywords: mlk
QA Contact: paw → asa
worksforme, linux trunk build 20020808
will@himinbi.org: the URL you originally posted is no longer active.  Can you
attach the page to this bug as a testcase?  This may be a duplicate of Bug
157160 which was fixed on 2002-08-13; can you test this against mozilla 1.1 or
1.2a and see if this problem still appears?

Changed Summary to be more descriptive
Summary: win98 large javascript memory leak → win98 GDI handle leak on table background changes
This page is somewhat long (~350 lines) because the included stylesheet and
javascript are now inline. If the link works (which is should, the servlet
container went down for two days, which messed it up) then that source is more
easily readable.
No longer have a win98 box to test on. Does not cause any problems in other
versions of windows to the best of my knowledge.
Confirming with Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.3a) Gecko/20021115
Status: UNCONFIRMED → NEW
Ever confirmed: true
jasajudeju@telia.com: do you have a stacktrace or a talkback-id for your crash?
 I'm very interested in this bug, which appears to win98-only, but tracking it
down has been elusive.

kmcclusk:  I have seen several bugs (win98 only, not NT4/5/XP) that indicated a
handle leak, probably GDI handles.  Is there code in GFX (or elsewhere) that is
win98-specific which might be causing handle leaks?
It does'nt really crash, it just consumes so much of the systems resources that
the program and most of Windows freezes. Fortunately I can still move the mouse
around and it is no problem to close the program manually, so Talkback is never
invoked.

Some info about my system:
Athlon 1.2 GHz, 256 MB RAM, Windows 98FE, GeForce2 graphic card.
This might be caused by MS Knowledgebase article K149289: Selecting a previously
selected HFONT (handle to a font object) back into a device context may result
in a 52-byte memory leak if and only if the mapping mode is set to MM_ISOTROPIC
or MM_ANISOTROPIC, and the window or viewport extents have changed.

http://support.microsoft.com/default.aspx?scid=KB;en-us;149289&

Assignee: Matti → kmcclusk
Component: Browser-General → GFX Compositor
QA Contact: asa → ian
QA Contact: ian → nobody
changing keywords per comment 10.

Hasse - can you still reproduce this with a recent build of mozilla? if nobody
can reproduce, this should be resolved as WORKSFORME.
Keywords: crash, stackwantedhang
I can still reproduce this with build 20021230.
Priority: -- → P2
Target Milestone: --- → Future
Re Comment #11.

Mozilla doesn't appear to use MM_ISOTROPIC or MM_ANISOTROPIC. It does, however,
use MM_HIMETRIC in /embedding/browser/activex/src/control/ControlSite.cpp, line
1263 and MM_TEXT in /mailnews/mapi/old/tests/mapitest/main.cpp, line 225 (for
WIN16 only).
Keywords: mlk
OK, first problem, any use of a 16-bit GDI call is really bad. Due to a design
feature (well, that's what Microsoft calls it), a 16-bit GDI call will not
release memory (GDI resources) even after Mozilla is closed.

An example of this is the use of EnumFontFamilies in nsDeviceContextWin.cpp
which is a 16-bit call according to MSDN. Using EnumFontFamiliesEx is better
(although I'm not if it does the same thing).
Filed bug #192342 for my example in Comment #15. Also adding it to dependencies.
Depends on: 192342
Me too Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.3) Gecko/20030307 (latest).
System like comment #10: Win98SE; Athlon Thunderbird 1.3; 384MB RAM; GeForce2 MX.
I'm seeing this bug with Moz1.3.1 (build 20030425). I also saw it in 1.3, and
it's present in the new Firebird0.6 (20030516). Moz1.2 seemed to be free of it,
but it was present in earlier milestones.

It's taken me a long time to find this bug in bugzilla - the description is
wrong. (And I haven't reported it as a new bug because I've been unable to pin
down the trigger or develop a working test case.) Certainly, Will Holcomb's test
case demonstrates it very nicely with changing table backgrounds, but I've seen
it triggered by sites with no dynamic JS/ECMAScript content. Usually it seems to
have something to do with opening many Mozilla windows and/or tabs, and the
presence of embedded Java applets may be a factor. But I've also seen it with as
few as 2 Mozilla windows open, and with Firebird it occured with just one window.


I'm running Win98SE on an old PII with 320Mb of RAM and an 8Mb ATI graphics
card. Till I found this bug report, I suspected it was another ATI driver problem.
Re comment #18.

Dan, can you supply some of the URL's that you've seen the problem on. I'm
currenty running on a Win98SE machine as well, with an AMD 1.2 CPU and an old
TNT video card, so I might be able to do some digging.
Adding Dan (from Comment #18) to cc list.
David, I can't find any specific site that'll generate this problem as quickly
and reliably as Will's original test case link.

The occasion I mentioned when I saw it in Firebird, I was visiting a message
board I frequent. But I've been there with Firebird a good number of times over
the last couple of days without the bug re-appearing.



The attachment I just uploaded shows the bug in Firebird, triggered from Will's
testcase. The other open windows are Mozilla1.3.1, Eudora and Digiguide (a TV
listings programme which is apparently based in part on Mozilla). After
capturing the screenshot to the windows clipboard, I got a low memory error
message from windows. That's the first of those I've seen since upgrading to
320Mb. :-)

Firebird seems to dislike http://www.powerbasic.com - I just started up the
browser and went to that url. I started scrolling through the page before all
its graphics had finished downloading, don't know if that makes any difference.

Reproducibility: with Firebird, at that URL, fairly high - after closing the
browser and starting to type this report, I re-started Firebird and repeated the
above procedure withthe same results. But I did not re-boot in between.

Firebird seems to be far more susceptible to this bug than Mozilla.


Other applications running: Mozilla 1.3.1, Eudora and DigiGuide, as in the
screen-shot I posted yesterday.

d.
Blocks: 204374
My goodness, this is horrible! It can bring a Win9x system to its knees in a
matter of seconds.

Requesting blocking1.4 status, although it's a bit late in the day for that...
Can anyone using the gdi tool mentioned in bug 199443
tell me what resources are being leaked by this testcase?
I played around and it doesn't appear to be bitmaps.
I was thinking it was regions but I couldn't verify.
My next thought was fonts.

Anyhow i can't get the tool in bug 199443 to work on my win98 machine.
I have access to 98SE with a working version of the tool.

Does 98SE exhibit this bug?

I'll happily test, but I haven't been able to reproduce this bug before.

Could somebody post step-by-step instructions to reproduce this, including which
build to use?
I tested with GDIUsage.exe on Win98 SE (german OEM)

A short mousemove over the tabeles increased my GDI.brushes by 308, 
regions by 10, after i closed the tab regions went down to 4.
THANKS - looks like I need to focus on brushes
I have Win98 SE (Spanish). Where can I find "GDIUsage.exe"?
The GDIUsage.exe tool is the third attachment to Bug 199443

http://bugzilla.mozilla.org/attachment.cgi?id=121792&action=view

After my test in comment 27 started displaying pictures black and white 
(see other GDI Bugs), i couldn't save a screenshot because The Gimp hung
on startup. AVGuard, ATItools, taskplaner did the same, hat so kill some
by Taskmanager. Mozilla stood rock-solid, closed fine. (had to restart win 98).
Only issue: On restart, it did not open the last Page (slashdot), but the
last loaded advertisingbanner.
Got it...  we are leaking brushes.
I hacked nsRenderingContextWin :: SetupSolidBrush
(which is called every 100+times a mouse over)
and forced a  "::SelectObject(mDC, gStockWhiteBrush);"
before calling "HBRUSH tbrush = gSolidBrushCache.GetSolidBrush(mColor);"
which does a "::DeleteObject(mCache[mIndexOldest].mBrush);"
By doing this, I stopped leaking in this testcase.

So now just have to find the leak and a fix

Thanks for everyone's input
Attached patch Fix GDI leak (obsolete) — Splinter Review
When changing brushes, we would first check to see if the brush
was in the cache, if not we would create a new one adding the
new one to the brush cache.  Well if the brush cache was full, 
we would delete the oldest brush in the cache and then select
the new brush as *current* brush.

Ok, for SOME reason on win98 this was causing a GDI leak (it
didn't on win2k).  So I changed the code and *selected* the
new brush and then deleted the oldest brush in the cache.
This fixed the leak.  

NOTE: Using printfs I was unable to find any instance where
the oldest brush in the cache was the currently selected
brush so theoretically this shouldn't have leaked.  However,
in playing around with scenarios I found that if we just
select the new brush before deleteing the oldest one in our
cache, everything seems to work fine.

NOTE: I think this testcase is somewhat unique since we are
changing the brush about 100X every mouseover and so was just
really stressing the code.

If someone could verify that they too see that this patch
fixes the problem on win98 (I have tested it but alway like
when someone else backs me up) that would be great.
The patch looks good to me, from my limited Win GDI knowledge.

However, I'm rather curious as to why the brush is changing more than 100 times
on every mouseover. Maybe this is a subject for a seperate bug?

I'd also suggest that this should get into 1.4final if at all possible.
Flags: blocking1.4?
What about the case where (i < BRUSH_CACHE_SIZE)? It looks like your patch will
create a new brush, but will not select it. It probably needs something like the
following:


if (!result) {
    // We didn't find it in the set of existing brushes, so create a
    // new brush
    result = (HBRUSH)::CreateSolidBrush(PALETTERGB_COLORREF(aColor));

    // If there's an empty slot in the cache, then just add it there
    if (i >= BRUSH_CACHE_SIZE) {
     ...
    } else {
    	::SelectObject(theHDC, result);
    }


You should also add a comment just before the ::SelectObject line that states
SelectObject must be done before DeleteObject otherwise it causes a WIN98 GDI leak. 

if (i >= BRUSH_CACHE_SIZE) {
       ... 
       ::SelectObject(theHDC, result);
       ::DeleteObject(mCache[mIndexOldest].mBrush);
Ok, I updated the patch adding comment and catching
the missed case where i < BRUSH_CACHE_SIZE.  (sorry about missing it)

NOTE: instead of using an else clause, I just select it before the if clause.
Attachment #126481 - Attachment is obsolete: true
We should try to get this for 1.5b and 1.4.1. Who can review?
Flags: blocking1.4.x+
What is happening with this bug? Apperantly theres been a proposed fix for one
and a half month, but noone cares to review it? :-(
Wouldn't it be stupid to release 1.5b (and especially 1.4.1) without the fix it
if works?
Comment on attachment 127232 [details] [diff] [review]
Updated: Fix GDI leak

requesting review... (not my patch)
Attachment #127232 - Flags: review?(ere)
Comment on attachment 127232 [details] [diff] [review]
Updated: Fix GDI leak

Looks good to me.
Attachment #127232 - Flags: review?(ere) → review+
Comment on attachment 127232 [details] [diff] [review]
Updated: Fix GDI leak

requesting sr (not my patch)
Attachment #127232 - Flags: superreview?(roc+moz)
I'm confused about one thing. If, in Win98, you have to remove the brush from
the DC before deleting it, how do we know that there aren't other DCs other than
theDC which could have the brush selected --- so could still leak?
Not 100% sure, but if the other DC has it, then THAT DC should release it, no?
Also, wouldn't this case exist the way it is implemented now?  If so, then
we make it no worse.  Probably not the answer you wanted but I am not 100% of all
the ways this situation can be exploited.  This fix takes care of the
problem by just changing the order things are done, so *theoretically*
it shouldn't mess anything up (more than it already is).
Comment on attachment 127232 [details] [diff] [review]
Updated: Fix GDI leak

I agree that this will reduce the impact of the problem enormously.

> if the other DC has it,
> then THAT DC should
> release it, no?

Well sure, it just appears that you need to release it before it's deleted in
order to avoid a leak.

In theory we could avoid this problem by a reference count for each brush in
the cache, counting how many DCs it's selected into and refusing to evict a
brush that's selected into more than just theDC.
Attachment #127232 - Flags: superreview?(roc+moz) → superreview+
Comment on attachment 127232 [details] [diff] [review]
Updated: Fix GDI leak

Requesting 1.4.1 approval (not my patch).
Attachment #127232 - Flags: approval1.4.x?
Does everyone agree this is the right thing?

Is this patch on the trunk?
Flags: blocking1.4.1+ → blocking1.4.1-
Comment on attachment 127232 [details] [diff] [review]
Updated: Fix GDI leak

too late for 1.4.1. if you'd like to see this in the next 1.4.x release please
set the approval1.4.2? flag.
Attachment #127232 - Flags: approval1.4.1? → approval1.4.1-
Comment on attachment 127232 [details] [diff] [review]
Updated: Fix GDI leak

what's the plan for this patch? it's not been committed to what used to be
trunk(now 1.5branch). shouldn't this block 1.5 if the problem is as serious as
described in some comments?
yes, we should get it on the trunk (for 1.6) and the branch for 1.5.

jdunn, can you do the deed?
Flags: blocking1.5+
Comment on attachment 127232 [details] [diff] [review]
Updated: Fix GDI leak

approval1.4.2? Or does noone care to check this into the 1.4 branch anyways?
Attachment #127232 - Flags: approval1.4.2?
I checked the patch into the trunk... sorry for missing 1.4.1.

Let me know if you want it in the 1.4 branch (will hold off on 
marking this fixed till I hear about the branch)
Setting blocking 1.4.2. This should be on the 1.4-BRANCH
Flags: blocking1.4.2?
This needs to bake on the trunk for a few weeks first.
Comment on attachment 127232 [details] [diff] [review]
Updated: Fix GDI leak

Requesting 1.5 branch approval since the bug is labeled as a 1.5 blocker and
has a reviewed patch.
Attachment #127232 - Flags: approval1.5?
Has anyone found any bad side effects from this landing on the trunk? 
Comment on attachment 127232 [details] [diff] [review]
Updated: Fix GDI leak

too late for 1.5.
Attachment #127232 - Flags: approval1.5? → approval1.5-
Are we totally comfortable with this patch now?

If so, I'll approve it for 1.4.2
Comment on attachment 127232 [details] [diff] [review]
Updated: Fix GDI leak

I'll allow this for 1.4.2. It better not break anything :)
Attachment #127232 - Flags: approval1.4.2? → approval1.4.2+
Has approval now.
Flags: blocking1.4.2?
Neil Rashbrook checked this into the 1.4-branch. Marking fixed now.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Keywords: fixed1.4.2
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: