threadmanager checkin 2006-05-10 causes increased memory usage by ~10MB per minute with shockwave flash (memory leak?)

RESOLVED FIXED in mozilla1.9alpha6

Status

()

Core
Plug-ins
--
critical
RESOLVED FIXED
12 years ago
10 years ago

People

(Reporter: nivtwig, Assigned: jimm)

Tracking

(Blocks: 1 bug, 5 keywords)

Trunk
mozilla1.9alpha6
x86
Windows XP
dogfood, memory-leak, perf, regression, testcase
Points:
---
Dependency tree / graph
Bug Flags:
blocking1.9 +

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(3 attachments, 3 obsolete attachments)

(Reporter)

Description

12 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20060626 Minefield/3.0a1
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20060626 Minefield/3.0a1

Firefox trunk memory usage grows by ~10MB per minute on http://www.ynet.co.il/articles/0,7340,L-2813378,00.html, when the same page is opened 7 times on 7 tabs (to see the problem faster), it is probably a memory leak. 
After some time (several hours) the memory use grows to 600MB and Firefox is practically unusable /hangs due to having to use swap.
This problem doesn't happen in Bon Echo alpha 3, so it is a Firefox problem, not a flash plugin problem.
I am using Flash 9 beta (version 9.0 b296) since the flash 8 plugin had a huge memory leak problem that was reported and fixed in bug 334322, therefore you should use the flash 9 beta, or disable flash to make sure that what you are seeing is due to Firefox.
I used a new clean profile without extensions.


Reproducible: Always

Steps to Reproduce:
1. With a new clean profile, open Firefox to about:blank or a simple html page, check the memory usage using task manager. Close task manager.
2. open 7 tabs and in each of them open the same URL http://www.ynet.co.il/articles/0,7340,L-2813378,00.html, wait for the tabs to finish loading. (This is meant to simplify but simulate opening 7 tabs of different articles of this major Israeli news site)
3. Note the memory usage of task manager at the start
4. Notice the memory usage keeps growing
5. Close task manager ( to allow faster operation)
6. Open task manager and check the memory use every several minutes (lets say 5)
7. Notice the memory use keeps growing at a pace of about ~10MB per minute
8. Close all tabs, replace the last tab with about:blank
9. Notice the memory at the end is much higher than at the start. In order to check that there are no cached pages, navigate to several other "simple" html sites that should not cause memory leaks, the memory usage will still stay higher than at the start, therefore it is probably a memory leak.


Actual Results:  
Memory usage at the end much larger than at the start.
Constantly growing memory usage when the above URL is open . 


Expected Results:  
No growth in memory used during the test and at the end.
(Reporter)

Updated

12 years ago
Version: unspecified → Trunk
(Reporter)

Comment 1

12 years ago
The URL in the first post is a mistake (although it will probably also show the problem, since the incorrect URL is also an article from the same site, with the same structure. I just didn't test it), the correct URL I tested with was actually http://www.ynet.co.il/articles/0,7340,L-3267772,00.html . I updated the URL field to the correct URL.
(Reporter)

Comment 2

12 years ago
Added qawanted - can help with creating a reduced testcase, assigning to correct component, and creating a memory leak report with a leak detection tool.
Keywords: qawanted

Comment 3

12 years ago
somebody else had a big leak on a .il site because of the flash ads, turn out it was a flash bug, got fixed in Flash Player 9 Beta 3 (if I remember well).


hahaha, after finding the bug, it turn out it was you, see https://bugzilla.mozilla.org/show_bug.cgi?id=334322

Comment 4

12 years ago
sorry for bugspam,

my last comments wasn't usefull, bug 334322 is allready mentionned in the original bug report.

Comment 5

12 years ago
Confirming. Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20060629 Minefield/3.0a1 with Flash 9.0.7.0 consumes about 8 MB/minute when seven tabs are opened to the URL given in comment #1. Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1a3) Gecko/20060629 SeaMonkey/1.1a with Flash  9.0.7.0 consumes about 2 MB/minute with the same test. It's not an actual memory leak, because the memory is immediately released when the tab is closed or the user navigates to another page.

Opera 9.00 with Flash 9.0.7.0 consumes about 1 MB after 20 minutes, then releases the memory on its own. This makes it unlikely that it's a problem with the Flash plugin or with the SWF file itself.
Severity: critical → normal
Status: UNCONFIRMED → NEW
Component: General → General
Ever confirmed: true
Product: Firefox → Core
QA Contact: general → general

Updated

12 years ago
Component: General → Plug-ins
QA Contact: general → plugins
(Reporter)

Comment 6

12 years ago
(In reply to comment #5)
>  It's not an
> actual memory leak, because the memory is immediately released when the tab is
> closed or the user navigates to another page.

That is not what I see. Yes, memory is released when the tabs are closed, but it never goes back to what it was at the start suggesting that it is a memory leak, please recheck. 

I tested again with:
the latest trunk build Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20060630 Minefield/3.0a1 
and with
Flash 9,0,16 (just noticed that Flash 9 was officially released so I installed it)
I used browser.sessionhistory.max_total_viewers = 0 to avoid cached pages.


Initial memory was ~33M, 
after 7 tabs were loaded ~153M, 
after waiting ~10 minutes memory grew to ~223M,
after closing all tabs and navigating to several "simple" pages - ~123M
( The figures are only the "Mem usage" column of task manager, the "VM size" column was always similer - about 10M less than the "Mem usage")

Memory at the end was above 90M higher than at the start with the same URL open, suggesting it is a leak.

Keywords: mlk

Comment 7

12 years ago
(In reply to comment #6)
> That is not what I see. Yes, memory is released when the tabs are closed, but
> it never goes back to what it was at the start suggesting that it is a memory
> leak, please recheck. 

That increase in memory use can be caused by lots of things that aren't memory leaks, such as the caches being filled or memory fragmentation. If a leak detection tool reports a leak, or repeating the same set of steps over and over results in unbounded memory growth, that would indicate an actual memory leak.
(Reporter)

Comment 8

12 years ago
(In reply to comment #7)

> ... If a leak
> detection tool reports a leak, or repeating the same set of steps over and over
> results in unbounded memory growth, that would indicate an actual memory leak.
> 

According to your criteria it seems like a severe memory leak. I tested abd repeated the same set of steps over and over (closing the 7 tabs, reopening the 7 tabs), it results in unbounded growth, with very little memory released.
I opened 7 tabs, waited until it reached ~420MB mem usage,
closed the 7 tabs, the memory usage was reduced to ~350MB,
Opened again the 7 tabs, when all tabs were loaded the memory continued to grow immediately, not after some time indicating it is not due to fragmentation. The memory continued to grow to 700MB mem usage and 900MB VM size .
Conclusion : there is no way to release the memory and it hangs Firefox and prevents (real) memory from other applications.

Please confirm this test. If confirmed, since memory is almost not released, only grows, when closing tabs and reopening, shouldn't this bug's Severity be Critical instead of Normal ? 
It is probably a severe memory leak. Even if not a leak, it causes loss of data since you have to close Firefox to avoid freezes due to swapping, therefore I think it should be marked Critical according to the Severity definitions I read (but I am not very knowledgeable of the rules here ...)

Comment 9

12 years ago
I am also able to confirm what appears to be excessive memory consumption on rendering this page in BonEcho/2.0b1 ID:2006070805 .
Page: http://www.ynet.co.il/articles/0,7340,L-3267772,00.html
Memory consumption is about 20-30mb per tab with this page, most of which is recovered upon closing the tab.  CPU usage is also fairly high on a 1.2ghz PC, although that may be due to flash ads.  I am using Flash 9.0 r16.

The page uses the Unicode character encoding, and is fairly lengthy.

Comment 10

12 years ago
Just to add that this memory grows happens to me a lot, and it doesn't relate to any specific url.
At some point the application won't open a new window, just a new tab.

Comment 11

12 years ago
I see this problem when I am accessing one of my web pages that has a Java streaming video applet running.  I can readily duplicate the problem and share details with anyone who needs them.  This web page is on my LAN.  I am having problems making it available to the internet, but will do so if needed. - Tim

Comment 12

12 years ago
I can confirm it here. I didn't believe until I did a Ctrl-Alt-Del and found out how resource-hungry Firefox is. I'm using 2.0 Beta 2. Have a look for yourself:

http://img219.imageshack.us/img219/9348/firefoxjq4.png

I have 2 GB of RAM, and after a few hours with Firefox minimized, my computer becomes absolutely unusable, each time I switch from one application to another one, it keeps seeking on the HDD and the mouse freezes or keeps jumping. Eventually I need to reboot manually.

Comment 13

12 years ago
This bug is trunk only, so shouldn't block bug 320915, which is for 1.8 branch memory problems.
No longer blocks: 320915

Updated

11 years ago
Keywords: regression
Summary: Memory usage grows by ~10MB per minute (probably memory leak) → Memory usage grows by ~10MB per minute on page with flash (probably memory leak)
So is this bug also reproducible with flash disabled?  Or no?
Flags: blocking1.9?
Gavin, can you reproduce this?  If so, would you be willing to do some refcount-log leak debugging?
(In reply to comment #15)
> Gavin, can you reproduce this?  If so, would you be willing to do some
> refcount-log leak debugging?

Using Flash 9.0 r16, I can reproduce the always-increasing memory usage after loading http://www.ynet.co.il/articles/0,7340,L-3267772,00.html. I wasn't able to reproduce the always-increasing memory usage with Flash disabled, so it seems to me like Flash it at fault here again. In both cases (flash enabled and disabled), XPCOM_MEM_LEAK_LOG shows leaks of nsFrame, nsBox, and nsStyleContext objects after starting the browser and letting that page load complete, but I don't really know enough about leak debugging to get any more information about that, and either way I'm not sure that it's really relevant to this bug.

Comment 17

11 years ago
(In reply to comment #16)
> (In reply to comment #15)
> > Gavin, can you reproduce this?  If so, would you be willing to do some
> > refcount-log leak debugging?
> 
> Using Flash 9.0 r16, I can reproduce the always-increasing memory usage after
> loading http://www.ynet.co.il/articles/0,7340,L-3267772,00.html. I wasn't able
> to reproduce the always-increasing memory usage with Flash disabled, so it
> seems to me like Flash it at fault here again. In both cases (flash enabled and
> disabled), XPCOM_MEM_LEAK_LOG shows leaks of nsFrame, nsBox, and nsStyleContext
> objects after starting the browser and letting that page load complete, but I
> don't really know enough about leak debugging to get any more information about
> that, and either way I'm not sure that it's really relevant to this bug.


worth testing v9 update 1 (beta) even though release notes don't indicate any fixes to memory issues?

http://labs.adobe.com/downloads/flashplayer9.html 
http://labs.adobe.com/technologies/flashplayer9/releasenotes.html
(Reporter)

Comment 18

11 years ago
(In reply to comment #16)
> I wasn't able to reproduce the always-increasing memory usage with Flash disabled, so it
> seems to me like Flash it at fault here again. 

As written in the original bug report (I am the bug reporter), This problem doesn't happen in the Firefox2 branch (back then it was Bon Echo alpha3), only on the trunk, so I think it is a Firefox problem, not a flash plugin problem. You can also test the same page with Opera browser since it works with the same flash plugin, and see that the problem doesn't happen there.

(In reply to comment #18)
> As written in the original bug report (I am the bug reporter), This problem
> doesn't happen in the Firefox2 branch (back then it was Bon Echo alpha3), only
> on the trunk, so I think it is a Firefox problem, not a flash plugin problem.

The two are not mutually exclusive: it could be a problem with both, or how they interact. The fact that I couldn't reproduce it with Flash disabled leads me to believe Flash has some effect. Are you saying that you can reproduce it with Flash disabled?
(Reporter)

Comment 20

11 years ago
(In reply to comment #19)
> (In reply to comment #18)
> > As written in the original bug report (I am the bug reporter), This problem
> > doesn't happen in the Firefox2 branch (back then it was Bon Echo alpha3), only
> > on the trunk, so I think it is a Firefox problem, not a flash plugin problem.
> 
> The two are not mutually exclusive: it could be a problem with both, or how
> they interact. The fact that I couldn't reproduce it with Flash disabled leads
> me to believe Flash has some effect. Are you saying that you can reproduce it
> with Flash disabled?
> 

I didn't try to disable flash, because I don't know what is the right way to do it (Use adblock plus, or just remove the flash plugin from the plugins dir? ).

If it isn't reproducible with flash disabled, I conclude as you that flash has some effect, but not that the flash plugin is the problem. A possible explanation is: When you disable flash you probably disable some parts of firefox that interact with it, that were probably the cause for the leak.
Add to that the fact that in Opera (Which uses the same flash plugin) and the Firefox 2 branch it doesn't happen, and the most likely conclusion is that the problem is with the firefox trunk, not the flash plugin.

PS: I am not an expert in Valgrind on Mozilla, but is it possible to run valgrind on Firefox trunk ocmpiled build ? (i.e. not debugging build with symbols, just the standard build).  I think after the Firefox memory goes up, and you exit firefox, valgrind results should tell you if there were large leaks. It seems to me the easiest thing to do in order to determine just if there are large leaks (not where they are, which needs the debugging build).
I don't have a Linux system to do it, but I think anybody with a Linux can do this test easily.

very similar behavior with seamonkey Mozilla/5.0 (Windows; U; Windows NT 5.0; fr-FR; rv:1.9a1) Gecko/20061020 SeaMonkey/1.5a
running flash player 9r16

increase is more likely 100KBytes per seconds, can reach more than 600MBytes just by having a few tabs opened on pages with flash content and leave the browser open for half a day

will try with flash disabled
Now running Flash Player 9 r28: 
grows 3,5 MBytes per minute with a single tab with the given URL
Windows 2000
Mozilla/5.0 (Windows; U; Windows NT 5.0; fr-FR; rv:1.9a1) Gecko/20061025 SeaMonkey/1.5a
(Reporter)

Updated

11 years ago
Severity: normal → critical
(Reporter)

Comment 23

11 years ago
There was no progress on this bug for a very long time . I am not sure of your processes, but I guess making its severity Critical might make the treatment faster ... :)
I asked on comment #8, why the Severity was changed to Normal instead of Critical, but didn't get an answer for a long time. Therefore, I changed it back to Critical. If someone thinks this is not correct, please change back and let me know why.

Comment 24

11 years ago
severe memory leak (which this is) qualifies by BMO definition as critical.

Keywords: perf
So for what it's worth, here is where this bug currently stands:

1)  I haven't been able to reproduce it with Flash 7 on Linux
2)  No one on Windows seems to be stepping up to help out.
3)  I haven't gotten Flash 9 on Linux to work yet.

What would probably help some with this bug is if someone could create a minimal-ish testcase (presumably using one of the Flash files from the original site)...  That would at least limit where one should look for possible leaks if trying to find them by code inspection.
The other thing that would seriously help if this is a trunk regression is locating _when_ it regressed.  That would give us a decent idea of what checkins could be responsible.

Comment 27

11 years ago
Tested Firefox 2.0 with Flash 8,0,22,0 on Windows XP SP2.  Initial total memory usage was 476 Mb (including several other running applications, but without performing any interactions with these).  With this bug entry page open, I opened 7 tabs of http://www.ynet.co.il/articles/0,7340,L-2813378,00.html -- memory climbed for a bit until it got to 526 Mb, and then stayed that way (note:  I did not close Task Manager), except for an occasional temporary bump up to 529 Mb.  With all those tabs open, I opened 6 or 7 tabs of http://www.ynet.co.il/articles/0,7340,L-3267772,00.html -- memory climbed for a bit until it got to 626 Mb, and then stayed that way, except for an occasional temporary bump up to 627 Mb.  I closed all tabs except for this bug entry page -- memory dropped back down to 526 Mb, but then stayed that way for a long time.  At some point while I was typing it got down to 508 Mb.  Opening 14 tabs of each of 2 other bug pages only put memory usage up to 512 Mb, which dropped back down immediately after closing them.

I looked at the memory usage ascribed to firefox.exe (sorry, did not catch what it was at the start), and saw that it was ~64,000 K.  I opened 7 tabs of the 2nd news site URL above again, and memory climbed back to 584 Mb.  Closing them dropped memory back to 521 Mb.  Opening 7 tabs of another bug page again did not compensate for the gradual drop in memory usage that occurred meanwhile (ended up at 519 Mb).  Closing them got memory down to 515 Mb.  After this, the memory usage ascribed to firefox.exe was ~70,712 K.

Conclusion:  under Windows XP with Flash 8, some memory is not returned, but at least it does not leak continuously under this test case.
Lucius, see comment 13.  This bug postdates the release of Firefox 2.

Comment 29

11 years ago
If a problem is occurring since Firefox 2.0, then this is not the same as the original bug (reported on 2006-06-26).

Addressing an omission in my last post (attempting to test the originally reported bug):  total memory usage after closing Firefox entirely was 456 Mb; opening Firefox with just this bug page got memory up to 473 Mb, with 29,404 Kb ascribed to Firefox, but gradually climbing by a few Kb/second even without doing anything other than typing this post (but too slow to see on the Task Manager total memory usage without waiting for a long time; found no correlation between periods of typing/inactivity and this slow memory usage rise).  Opening 7 tabs of the 2nd Israeli news site URL above got total memory usage up to 573 Mb, with 132,572 K ascribed to firefox.exe (again, climbing very slowly, but then dropped later to 132,516 K).  Closing the news site tabs got total memory usage down to 506 Mb, with 63,068 K ascribed to firefox.exe (fluctuated slightly).
> If a problem is occurring since Firefox 2.0, then this is not the same as the
> original bug (reported on 2006-06-26).

Please read the original comment carefully.  Especially the first line.  The one that says what version it's reported against.  Then please look up the way Mozilla version numbers work and the version numbers associated with Firefox 2 (specifically, the Gecko version, which is 1.8.1).

Gecko 1.9 (and Firefox 3) development started in August 2005 (yes, that's a 5, not a 6).  It's been proceeding in parallel with Firefox 2 development.  The bug was filed in June 2006 on the code that will become Firefox 3.  At that point, approximately 8 months of development had take place on the code.  Most of those changes were NOT made to the branch that became Firefox 2.

Please, no more Firefox 2 related comments in this bug.  It's simply not relevant.

Comment 31

11 years ago
plus


Gavin in comment #16:
> (In reply to comment #15)
> > Gavin, can you reproduce this?  If so, would you be willing to do some
> > refcount-log leak debugging?
>
> I don't really know enough about leak debugging to get any more information 

gavin, bz, who would be such a person?  
leak log analyzer is clean, so is a different type of analysis needed?

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20061116 Minefield/3.0a1 ID:2006111604
Leaked 0 out of 7 DOM windows
Leaked 0 out of 35 documents
Leaked 0 out of 3 docshells

I tested with clalit2.swf (there may be more which fail)
http://www.ynet.co.il/banner2/html/clalit2.swf 
niv, are all other image files on that web page clean?

this regressed in the same time frame as stated in Ria's bug 354087 comment 2.
for Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9a1) Gecko/20060510 Minefield/3.0a1 ...
03:47 build works fine
14:46 build fails

so does this implicate bug 341430?
Wayne, the refcount leaks gavin was seeing are probably not this bug -- I see those on linux too and I can't reproduce this bug, as I said.  I'll admit that those leaks confuse me -- I can't tell how we can leak frames without leaking content nodes.  This is where a small testcase would come in handy...

So this regressed in a range when nothing bug threadmanager landed, basically?  That's kinda unfortunate.  :(
Blocks: 326273
Depends on: 341430
And at risk of sounding repetitive, a minimal-ish testcase really really would help.  gmail does not such a testcase make, nor does the original site this is reported on... but this site should be a lot more amenable to being minimized than gmail is.  ;)

Comment 34

11 years ago
I meant to mention threadmanager, not bug 341430 :( but I'll leave it there for now
(Reporter)

Comment 35

11 years ago
> 
> I tested with clalit2.swf (there may be more which fail)
> http://www.ynet.co.il/banner2/html/clalit2.swf 
> niv, are all other image files on that web page clean?
> 

I don't  understand your question - why image files (and not flash files), and I don't understand what do you mean by "clean" ?
In addition, you chose a specific flash file. The ads on that page are changed all the time dynamically, so on a refresh of the page you will get different flash ads. I don't think it is a problem with the flash ads or plugin due to the reasons mentioned above - Opera and Firefox2 doesn't have its memory grow with the same flash plugin.
(In reply to comment #33)
> And at risk of sounding repetitive, a minimal-ish testcase really really would
> help.  gmail does not such a testcase make, nor does the original site this is
> reported on... but this site should be a lot more amenable to being minimized
> than gmail is.  ;)
> 

Boris 
I propose the following, don't know if you would call it a mimimal test case:
http://olivier.vit.free.fr/testcase%20memleak/index.html
It includes only the flash animations from the original page of this bug report, without any fancy javascript and just minimal HTML

I went back to early builds of this year, and found that the bug shows up in build 2006-05-25 (May!) but not in 2006-04-25.
On this test page opened in 3 tabs, doing nothing else in the meantime, memory usage is icreasing by 8MBytes per minute.

With the same test scenario and build 2006-04-25, memory usage is stable.

This is with running Flash Player 7,0,63 !

May be the Flash Player version can minimize the effects of some bug in Mozilla, but since Firefox 2.0 with the same Player version and the same test page doesn't show any symptom of this bug, I would believe that the issue is in the trunk code rather than in Flash Player code.

(yes a month isn't a precise target to find the reason of the regression, but my time is over for today)
(In reply to comment #36)
Tested with player 9,0,28 (these tests are on Windows 2000)
=> Very same memory usage growth : 8MBytes per minute
=> much higher CPU usage (50% when tab not hidden, instead of 20% with player 7,0,63)

Comment 38

11 years ago
(In reply to comment #35)
> I don't  understand your question - why image files (and not flash files), 
I include flash in that category

> I don't understand what do you mean by "clean" ?
clean as in ones that don't show the problem

> In addition, you chose a specific flash file. The ads on that page are changed
I picked a specific one that doesn't change


> all the time dynamically, so on a refresh of the page you will get different
> flash ads. I don't think it is a problem with the flash ads or plugin due to
> the reasons mentioned above - Opera and Firefox2 doesn't have its memory grow
> with the same flash plugin.

FIREFOX2 DOES NOT HAVE THE PATCHES THAT TRUNK HAS - so please stop citing tests with FF2

Comment 39

11 years ago
(In reply to comment #36)
> I propose the following, don't know if you would call it a mimimal test case:
> http://olivier.vit.free.fr/testcase%20memleak/index.html
> It includes only the flash animations from the original page of this bug
> report, without any fancy javascript and just minimal HTML

strip it down to one image and you have a better minimal testcase.
strip away even more or even ALL html and, if the problem is still seen, then you have an even better testcase

> I went back to early builds of this year, and found that the bug shows up in
> build 2006-05-25 (May!) but not in 2006-04-25.
> On this test page opened in 3 tabs, doing nothing else in the meantime, memory
> usage is icreasing by 8MBytes per minute.
> 
> With the same test scenario and build 2006-04-25, memory usage is stable.
> 
> This is with running Flash Player 7,0,63 !
> 
> May be the Flash Player version can minimize the effects of some bug in
> Mozilla, but since Firefox 2.0 with the same Player version and the same test
> page doesn't show any symptom of this bug, I would believe that the issue is in
> the trunk code rather than in Flash Player code.

on the contrary, if my test in bug 354087 comment 13 is accurate, then a change in player affects amount of memory in FF.


> (yes a month isn't a precise target to find the reason of the regression, but
> my time is over for today)

in comment 31 I already state a reduced regression window of 11 hours on 2006-05-10

Comment 40

11 years ago
using clalit2.swf I turned off "play" in the SW player and it doesn't change the rate of memory loss - 700k/min. Same for image quality.  (I don't know anything about player or SW, so I don't know if that *should* affect anything.)

The same player and image in Seamonkey trunk is 4k/min.
(In reply to comment #39)
> on the contrary, if my test in bug 354087 comment 13 is accurate, then a change
> in player affects amount of memory in FF.
It doesn't change the fact that memory increases, while other trunk builds don't show *any* memory usage increase

> 
> 
> > (yes a month isn't a precise target to find the reason of the regression, but
> > my time is over for today)
> 
> in comment 31 I already state a reduced regression window of 11 hours on
> 2006-05-10
> 
Great !

Then what is missing for a 6 months lasting critical bug ? a windows developer instead of a unix one ?
Would this be a management bug then ? :-)
Olivier: is there a single flash animation that causes the problem?  Or do you need multiple flash animations on the page to trigger it?  As in, if you start removing the <embed>s one by one from the page you put up in comment 36, at what point does the memory growth disappear?  Are there some Flash animations that cause the problem and others that do not?

Wayne: you're getting the problem in Firefox but not in Seamonkey with the same Gecko???

Olivier: a Windows developer would sure help.  ;)  But the real issue is that now we know that it's most likely a timing bug, if threadmanager is what regressed it.  That makes life kinda hard.  :(

Comment 43

11 years ago
please, let's stick to technical issues so the bug makes progress rather than pissing people off.  What in fact usually precedes fixing is *minimal* testcase(s) plus developer(s) _with right skills_.  

Having this in the right component might also help - I'm not sure plugins is the right place.  bz, any suggestion/theory?

replaced http://www.ynet.co.il/articles/0,7340,L-3267772,00.html
with http://www.ynet.co.il/banner2/html/clalit2.swf which is sufficient to demonstrate the problem AFAICT

bz, If you are having trouble with FF+linux+player perhaps Adobe need to be informed?  release notes do NOT indicate stopper problems with linux + FF. http://labs.adobe.com/technologies/flashplayer9/releasenotes.html#install


(In reply to comment #41)
> (In reply to comment #39)
> > on the contrary, if my test in bug 354087 comment 13 is accurate, then a change
> > in player affects amount of memory in FF.
> It doesn't change the fact that memory increases, while other trunk builds
> don't show *any* memory usage increase

My statement doesn't mean FF bears no blame**, but it does mean player isn't a saint.  **The fact that there is a regression window implicating a core checkin makes that clear.


(In reply to comment #42)
> Wayne: you're getting the problem in Firefox but not in Seamonkey with the same Gecko???

:( my bad. I just rechecked and what I tested with was SM 1.1b. I just checked with 1.5 (same Gecko) and it DOES have the same problem as FF
Wayne, thanks a ton for the minimal testcase!  And good to know that this affects trunk Gecko across the board.

I do think plug-ins is as good a component as any for this at the moment.

I've got Flash 9 for Linux working, and can't seem to reproduce the problem with that on the reduced testcases in this bug -- running for a few minutes showed no change in memory usage.

David, do you know what sort of memory-leak-debugging tools we have on Windows?
Keywords: qawanted
I did just look at the Windows appshell code in bug 326273 and nothing jumps out at me....
Er, I meant appshell and widget code.

Updated

11 years ago
Summary: Memory usage grows by ~10MB per minute on page with flash (probably memory leak) → Memory usage grows by ~10MB per minute on page with shockwave flash (probably memory leak)
(Reporter)

Comment 47

11 years ago
(In reply to comment #43)
> replaced http://www.ynet.co.il/articles/0,7340,L-3267772,00.html
> with http://www.ynet.co.il/banner2/html/clalit2.swf which is sufficient to
> demonstrate the problem AFAICT

I am having problems with the new testcase you submitted (http://www.ynet.co.il/banner2/html/clalit2.swf ), it doesn't play the flash animation, just opens a dialog which asks me whether to save it, or cancel. I am using Windows, is the behavior different on Linux?
Hmm.  I get the same on Linux; I'd assumed the behavior was different on Windows... ;)
Same for me
Some server headers have probably changed on their site, but I didn't managed to log them

I copied the Flash to my hosting 
http://olivier.vit.free.fr/testcase%20memleak/clalit2.swf
Then I noticed this Flash is doing a POST request:
POST http://213.8.137.51/ERate/eventreport.asp

Then I wonder if it is a good test case since this Flash is doing more than just loading and being there...

Comment 50

11 years ago
choose save it and run locally. on startup there is also a prompt about ip address that must be passed, perhaps there's a better testcase but it works for me. 
Created a very simple Flash
http://olivier.vit.free.fr/testcase%20memleak/testcase.swf
Just some text in a single key frame, no animation

Memory usage growth is 300KBytes/min for a single browser window with a single instance of this Flash loaded
Flash Player 9r28
Mozilla/5.0 (Windows; U; Windows NT 5.0; fr-FR; rv:1.9a1) Gecko/20061025 SeaMonkey/1.5a

Note: memory doesn't increase at the very begining, but after a few 20-30 seconds

Comment 52

11 years ago
Created attachment 246279 [details]
minimal Flash testcase

Flash testcase from comment #51
Just some text in a single key frame, no animation

Updated

11 years ago
Keywords: testcase
Yeah, that's pretty definitely not leaking with Linux, either with Flash 7 or with the test builds of Flash 9...

I assume that that flash file doesn't make any POST requests, right?  ;)

I almost wonder whether it would make sense to try breakpointing in malloc() with that testcase loaded and see what sort of callers one runs into...  Might at least give us an idea of what's allocating the memory...

Comment 54

11 years ago
For me, it took several minutes with the testcase open before I could see memory usage increase on Windows XP. I'm guessing there are lots of little allocations, and at first the allocations were filling up the memory fragmentation. It's only after the memory holes are filled up that Firefox needs to get more memory from the OS. If you've been using Firefox for a while, you might need to leave the testcase open much longer before the memory usage rises.
Created attachment 246296 [details]
Source code of the minimal test case
Steve, I started Firefox and loaded the testcase (without loading anything else, other than the start page), then waited for about 30 minutes...
I have been running the steps described in http://dbaron.org/log/2006-01#e20060110a
with NSPR_LOG_MODULES=DOMLeak:5,DocumentLeak:5,nsDocShellLeak:5
and the result of the log processing is "no leak" for each of them, with the minimal test case, as well as with http://olivier.vit.free.fr/testcase%20memleak/index.html

I suspect that some other modules/components should be declared to get logged, which names should be used ?


Apart from this, when looking at the seamonkey process with sysinternals.com's process explorer http://www.microsoft.com/technet/sysinternals/ProcessesAndThreads/ProcessExplorer.mspx
only the memory working set 'private' is increasing (performance tab), along with page faults
But I'm unsure of this would mean anything ...
(In reply to comment #56)
> Steve, I started Firefox and loaded the testcase (without loading anything
> else, other than the start page), then waited for about 30 minutes...
> 

Hmmm, Trunk build on Windows or Linux ?
> I suspect that some other modules/components should be declared to get logged,

Nope.  Those are what you can log.  And we already knew they were not getting leaked, given the XPCOM_MEM_LEAK_LOG output.

It sounds like someone who can reproduce this needs to run under Purify or some such...

> Hmmm, Trunk build on Windows or Linux ?

Trunk Linux.  This was the basis for me saying I can't reproduce on Linux.
Another test case suggestion if you don't mind

I noticed that when you start watching a video on Google Video in a tab, and close the tab at the begining of the show, the video actually continues to play in the background (since you continue hearing the sound until the end of the video)

Wouldn't this bring evidence that the Flash content is being left flying somewhere in the background and never being released properly ?

Here is a sample video to play with http://video.google.fr/videoplay?docid=-1750972330502570908&q=Inarritu

my VM usage raised above 1GBytes tonight, with a lot of tabs opened on commercial sites with animated flash banners, only one hour of surf but the browser left opened for more than 6 hours...
Sorry :
with Mozilla/5.0 (Windows; U; Windows NT 5.0; fr-FR; rv:1.9a1) Gecko/20061025 SeaMonkey/1.5a and Flash Player 9 r28 

Comment 62

11 years ago
Olivier, that happens to be a SeaMonkey-only CTho-induced bug of which I can't find the number for.

Comment 63

11 years ago
(In reply to comment #60)
> Another test case suggestion if you don't mind
>
> http://video.google.fr/videoplay?docid=-1750972330502570908&q=Inarritu

not able reproduce this using Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20061119 SeaMonkey/1.5a, nor with Firefox.
(In reply to comment #62)
> Olivier, that happens to be a SeaMonkey-only CTho-induced bug of which I can't
> find the number for.
> 

Neil-induced CTho-fixed, actually - bug 356703, which is fallout from bug 354927 which was fixing a regression from bug 350416.  Granted, the whole chain of bugs was my fault though ;).
I have about:blank set for every new tab/window/startup
Would this be related to the memory leak ?

bz, does it match your preferences ?

(can't use recent builds since they crash on https for me because of bug 330271)
Yes, I have about:blank set as my homepage.

Comment 67

11 years ago
This couldn't be somehow related to the plugin monitoring mouse activity, could it?

If I point Firefox to:
http://www.softwaretipsandtricks.com/forum/windows-xp/23281-corrupt-profile-lost-outlook.html 
(which displays a random Flash ad at the top), the memory usage usually stabilizes. But if I methodically and slowly move the mouse cursor in and out of the Flash ad, the memory usage for Firefox increments by small amounts (~10-20K) each time I do so, and isn't released again. At least it does this for some ads.

This may not be the major memory hog here, but it's something at least.

Comment 68

11 years ago
just to note that the problem is also happening in Gran Paradiso 
(Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20061204 GranParadiso/3.0a1)

Since flash plugin causes lots of problems in the trunk is it a good idea to have a tracking bug for flash problems in the trunk
in the beginning it might include bug 342810 and bug 341380 ?

To be able to use Minefield in daily use currently I have to delete the flash plugin - I just temporarily install it to test things, and then delete it again.

Comment 69

11 years ago
with seamonkey trunk 2006120908 on Windows 2000 with Flash 9.0.16 memory usage (Virtual Memory) doesn't increase in my case. At least not with the speed I see in Minefield (it's about 6 MB/minute), while in seamonkey with 10 flash tabs it's about 500KB/min.
Neither with the minimalish Flash testcase attached here, nor with the testcase in bug 341380

Is nsIThreadManager used in SeaMonkey ?
Yes.  All trunk Gecko apps use the thread manager.
(In reply to comment #67)
> This couldn't be somehow related to the plugin monitoring mouse activity, could
> it?
>
I often see the my machine complaining for lack of free virtual memory after a few hours of idle time (because of this bus), then I hardly believe it has anything to do with mouse activity
Could bug 363427 be related to these memory leaks ?

Comment 73

11 years ago
(In reply to comment #72)
> Could bug 363427 be related to these memory leaks ?

In which nightly build did bug 363427 begin? It would be useful to know if it began at the same time.

Comment 74

11 years ago
clarified summary re: bug 326273
Keywords: qawanted
Summary: Memory usage grows by ~10MB per minute on page with shockwave flash (probably memory leak) → threadmanager checkin 2006-05-10 causes increased memory usage by ~10MB per minute with shockwave flash (memory leak?)
(Reporter)

Comment 75

11 years ago
There was no progress on this Critical bug for a long time. It has a minimal testcase, and the checkin where the regression occured is known. What is needed now as I understand is a Windows developer to debug it.

I also think it should be marked "blocking 1.9" (not with question mark as it is currently), but I don't think I have the authority to do that...
This sound bad, but we need to be able to reproduce this in order to fix it. Jst said he'd give it a look.
Assignee: nobody → jst
Flags: blocking1.9? → blocking1.9+
Bug 377419 [now morphed to another issue] lead me here.

If it matters, "I have about:blank set for every new window/startup", but my tabs open as "blank".
(In reply to comment #77)
> Bug 377419 [now morphed to another issue] lead me here.
> 
> If it matters, "I have about:blank set for every new window/startup", but my
> tabs open as "blank".

My bad: I have <about:> for windows/startup, nothing("=" <about:blank>) fot tabs.

Comment 79

11 years ago
sicking in comment #76:
> This sound bad, but we need to be able to reproduce this in order to fix it.
> Jst said he'd give it a look.

trivial to reproduce. Bug 341430 on which this bug is marked dependent is untouched, marked by dbaron in June 06 as target milestone 1.9a1/dec 2006.

FWIW, after continuous testing of trunk for many months I gave up in January and stopped all FF trunk testing because of this bug. Perhaps other people stopped as well.

Comment 80

11 years ago
(In reply to comment #79)
> trivial to reproduce. Bug 341430 on which this bug is marked dependent is
> untouched, marked by dbaron in June 06 as target milestone 1.9a1/dec 2006.

correction(sorry): dbaron is the assignee, but it was darin who marked it P1/1.9a1. 

bz, why do you think bug 341430 has a causal relationship to this bug?
I think it might, not that it does.  As for why, it's a hunch...
(Reporter)

Comment 82

11 years ago
(In reply to comment #79)
> FWIW, after continuous testing of trunk for many months I gave up in January
> and stopped all FF trunk testing because of this bug. Perhaps other people
> stopped as well.
> 
I stopped testing the trunk on a regular basis too because of this bug for almost a year now. It causes me to have to restart Firefox after one to several hours of using it.

I am repeating my comment #75 : There was no progress on this Critical bug for a long time. It has a minimal testcase therefore is easy to reproduce, and the checkin where the regression occured is known. What is needed
now as I understand is a Windows developer to debug it.

Jonas said in comment #76 that jst will give it a look, but there was no progress. If jst is too busy, can it be assigned to someone else?

For what it's worth, there are currently 300-some bugs that are blocking1.9+, with several hundred nominations still to go, and jst has 14 of them on his plate right now (including this one), in addition to whatever else he has to do...

Unfortunately, everyone else is in a similar position (e.g. Jonas has 13 blocker bugs on his plate, etc).  So there's no one better to reassign to, really.

This is definitely on the radar as a must-fix, but so are 300 other things.  :(

Comment 84

11 years ago
I think the point of the previous posters is that perhaps trunk builds are not being adequately tested because of this bug. Maybe this one should not be only a must-fix, but an ASAP fix so the trunk can get more testing? With this bug out of the way, more users would test the builds, and we could test more thoroughly by running the browser for longer periods of time.

Comment 85

11 years ago
bz, steve, good points. yes, what I tried to say is it's a problem for the project (not for me personally) because it affects testing. So much so that it might be considered a blocker.

Can adobe be of any help, since according to msintov in bug 341480 they are still working on mozilla specific issues?
(Reporter)

Comment 86

11 years ago
(In reply to comment #85)
> Can adobe be of any help, since according to msintov in bug 341480 they are
> still working on mozilla specific issues?
> 

I don't see a connection between msintov or adobe and the bug you mentioned (bug 341480), searched for "sintov" and it doesn't appear anywhere in the bug page. Can you recheck the bug number?

Comment 87

11 years ago
(In reply to comment #85)
> bz, steve, good points. yes, what I tried to say is it's a problem for the
> project (not for me personally) because it affects testing. So much so that it
> might be considered a blocker.
> 
> Can adobe be of any help, since according to msintov in bug 341480 they are
> still working on mozilla specific issues?
> 

(In reply to comment #86)
> (In reply to comment #85)
> > Can adobe be of any help, since according to msintov in bug 341480 they are
> > still working on mozilla specific issues?
> > 
> 
> I don't see a connection between msintov or adobe and the bug you mentioned
> (bug 341480), searched for "sintov" and it doesn't appear anywhere in the bug
> page. Can you recheck the bug number?
> 

Hi,
just to note that you probably mean bug 341380. (not bug 341480)

Comment 88

11 years ago
I meant bug 341380.   341480 is a typo
This is likely to be related to bug 379530.

Comment 90

11 years ago
(In reply to comment #89)
> This is likely to be related to bug 379530.

jst, Is there a non-obvious connection? In this bug, memory consumption got dramatically worse when using flash after the threadmanager checkin on trunk. But bug 379530 is against 2.0.

(non-flash bug 341430 is also linked to threadmanager checkin on trunk)

Updated

11 years ago
Duplicate of this bug: 374746

Updated

11 years ago
Target Milestone: --- → mozilla1.9beta1
The flashblock extension would mitigate this problem dramatically, 
except for bug 345711
Keywords: dogfood
Jim, can you take a crack at figuring out what's up here?
Assignee: jst → jmathies
(Assignee)

Comment 94

11 years ago
Hey all – For those of you still following this bug, a “Windows guy” is now officially on the case. Some prelim info I’ve found so far:

The Flash files in this page are little stubs which download Flash video files which they then play. Some paint graphics on top of the video. Some reproducible urls:

http://i.total-media.net/yn/ads/yafora/04.06.07/shwepps_683x60_ynet_site.swf
http://i.total-media.net/yn/ads/clalit/button%2011.02.07/Clalit_logo_118x20_ynet.swf
http://i.total-media.net/yn/ads/YnetCenter/Car/21.05.07/ynet_59x585_loader.swf
http://i.total-media.net/yn/ads/Machon%20Mor/17.04.07/fix/116%20116.swf
http://i.total-media.net/yn/ads/Neviot/31.05.07/ynet_268x195_health_loader.swf

The problem is easily reproducible – load up any one of these ads independently, shrink minefield down to a thumbnail size so that drawing doesn’t impact memory usage, minimize the window (or open a tab on top), and wait. I’m experiencing 10-30k increase per ad per every two or three seconds or so. Leaving things running for an extended period of time will definitely eat up a lot of memory. 

Closing the swf tab after waiting a bit shows the memory is not freed, at least not entirely. Over a period of time, 30MB -> 10 minutes -> 41MB -> close tab -> 38MB.

Netscape plug-in logging interfaces show no activity after the Flash movie begins to play, including to NPN_MemAlloc.

Leak checking shows nsRunnable as the culprit, which I guess could pretty much be anything. It does not appear to be related to the event queue though, nsEventQueue is not going crazy.

Comment 95

11 years ago
(In reply to comment #94)

[meta]

Please try in the future to write your comments
in plain text, rather than with a word processor.

As received in a plain text email client, your
m-dash, left-quote, right-quote and other
non-ASCII characters are an unreadable mess.

Thanks.

xanthian.

Kent, my plain text email client shows those characters just fine, as the email has the proper content type header with character set. There's no reason for anyone to avoid any non-ascii characters these days. Very likely all you need to do is tweak your terminal settings a bit and utf8 will work just fine.
Jim, are you saying we're actually leaking nsRunnable objects here?  If so, any idea where they were allocated?  Which leak checking tool did you use?
(Assignee)

Comment 98

11 years ago
Hey Boris - this is the first leak bug I've worked on, so I started here - 

http://www.mozilla.org/projects/xpcom/MemoryTools.html

I built a debug version of minefield and ran it with XPCOM_MEM_BLOAT_LOG. According to the paper, anything with excessively large numbers in the bytes leaked and objects rem columns might be suspect. The result I had after a 10 minutes run looked like this:

               Leaked   Total   Rem
nsRunnable 12  4326516  361596  360543

Most everything else had zeros or very very small numbers. I take that to imply there may be a problem in a class that implements that interface. (That's just a guess at this point.) I'd welcome comments / suggestions on this if you have any.

What I'd like to do is try to trace malloc somehow. It's obviously allocating memory at a fast clip - I would hope this would point to the problem.

Kent - sorry for the characters man, I edit everything in my email client to check for spelling. I'll try to get in the habit of popping things into notepad before I post them.
Hey Jim, those are great findings so far! What heppens if you randomly break in nsRunnable's ctor while you're likely to be "leaking" those, where do they get created from? Try a bunch and see if you get consistent stacks, if so, that's probably a good place to start looking.
Yeah, that definitely indicates that we're leaking instances of nsRunnable.  There are a whole bunch of classes inheriting from it, but the leak logging comes from the refcounting on nsIRunnable, of course.

I guess the refcount logger stack-walking code still doesn't work on Windows, right?  Given that, I'd try what jst suggested.  If that doesn't help, we can add explicit inherited nsISupports impls to subclasses of nsRunnable so they'll show up in the refcount logs in their own right.  Maybe we should do that anyway, conditioned on NS_BUILD_REFCNT_LOGGING, to catch future problems like this.
Actually, belay that.  I may see what the problem is... or at least what _a_ problem is.  Patch coming up in a second, but I can't test-compile it on windows, so it might be a little exciting. ;)
Created attachment 267535 [details] [diff] [review]
Does applying this help?

This is the only OS-specific plugin-related nsRunnable impl I see around.  And it's only used for Flash, and if ProcessFlashMessageDelayed is called we're definitely leaking it -- it's addrefed once by GetPluginWindowEvent and again by the assignment of the return value into the nsCOMPtr, then only released by the nsCOMPtr.  This patch fixes that by preventing the extraneous addref from the nsCOMPtr.
(Assignee)

Comment 103

11 years ago
Thanks for the help guys!

Boris - ProcessFlashMessageDelayed was called repeatedly due to a quirk in Flash related to WM_USER messages that get thrown around with video. 

I'm taking this for a spin, clean compile and so far about 10 minutes in memory usage is holding even. Looks like you solved it. The add ref stuff looks obvious now – thanks for the Mozilla ref counting primer. :)

The only question I have left is why this bug was reported in a way that made it seem so site specific. I’d think that any site with Flash video would have reproduced it. I’ll take a look at a few sites like YouTube and Google Video in the morning to see if the same behavior was occurring.

I’ll post back with a final report tomorrow. 
Guys, thanks for this exceptional collaboration with great results !
(let's hope similar interactions will happen faster in the future)

I wonder however how the rise of this bug can be explained: which code change did introduce this memory leak as visible ? does it match the suspected code check-in window ?

Comment 106

11 years ago
(In reply to comment #96)
> Kent, my plain text email client shows those
> characters just fine, as the email has the proper
> content type header with character set. There's no
> reason for anyone to avoid any non-ascii
> characters these days.

Yes, there is, it's called "avoiding computer virus
infections".

I'm using mailx across putty under SunOS from a
WinXP system, that latter a virus writer's premium
target.

Mailx was probably last updated before MIME types or
content type headers were invented, so it isn't
hiding the problem of emailing RFC-non-compliant
non-7-bit-pure ASCII like your email client is, but
it has one _huge_ advantage; more than 20 years of
having the security holes fixed.

Beyond which, it is too old and little used to be of
interest to malicious hacking today,

Compared to having the Outlook Express/Thunderbird -
style bells and whistles, that's a lot more
important.

Also, I have a couple of decades of investment in
scripts written around its ideosyncracies. For
example, I get 600 or so spams a day, one script
cuts out 95% of them before I bother to look.

What I see where the original poster's oriented
quotes are is five characters of trash per each
non-7-bit-pure-ASCII character, a real reading
impediment.

There's no reason to use _any_ of the characters
causing the problem: oriented quotes are
unnecessary for ordinary communication, m-dash
and n-dash can be replaced by hyphens, bullet
characters by asterixs, ellipses typed out, and
so on.

xanthian.

Sorry this is so off-topic; meanwhile, Boris,
well done on the fix. I can hardly wait to see
this in a nightly build to see how much of the
uncontrolled memory consumption by SeaMonkey
has been fixed.

(Reporter)

Comment 107

11 years ago
> The only question I have left is why this bug was reported in a way that made
> it seem so site specific. I’d think that any site with Flash video would have
> reproduced it. I’ll take a look at a few sites like YouTube and Google Video
> in the morning to see if the same behavior was occurring.
> 
> I’ll post back with a final report tomorrow. 
> 

Jim, thanks to you and all others who helped finally solve this.
Well done !

I am the original bug reporter, in answer to your question, it is not site specific. The URL I gave was just an example that was chosen on purpose because this site contains many flash ads, the reason was to show that the bug causes very fast growth of memory usage, otherwise it would have been more problematic for others to confirm the bug.

In addition, you probably didn't notice the minimal very simple flash testcase that is attached to this bug, from comment #51, it only contains some text with no animation, and caused the problem.

Comment 108

11 years ago
(In reply to comment #103)
>...
> The only question I have left is why this bug was reported in a way that made
> it seem so site specific. I’d think that any site with Flash video would have
> reproduced it. I’ll take a look at a few sites like YouTube and Google Video
> in the morning to see if the same behavior was occurring.

great work Jim. 

To expand on niv's comment, the problem is indeed quite pervasive among flash sites. But, I avoided linking lots of other cases/bugs to this bug because a) it was not known whether the problem in this case was flash or gecko because of the long history of problems with the flash plugin (going back to gecko 1.7, eg bug 289198), b) a secondary symptom is high cpu which some people don't link to being a memory issue, and c) many of the bugs are not well defined. So we were in a holding pattern until someone narrowed the cause of this problem.

Problems with youtube and others are widely reported in the forums for example, but most of them will be users of branch builds, not trunk.  It may be too much to hope that this patch fixes all flash memory problems :)

The following should be checked to see how they are affected by bz's patch:
Trunk Flash bugs - Bug 354087, bug 377419
Branch flash bugs - bug 289198, bug 358586 comment 2, Bug 373759, bug 319143
Misc bugs - Bug 340260, bug 341430, bug 341380 (adobe says they will issue a fix)

and then the bugs that are not well defined.
Branch examples - bug 370202, bug 378401, bug 378830, bug 367115, bug 366895 
trunk examples - bug 363427

I'm sure I have missed some, but this should be the bulk of potentially related bugs
(Assignee)

Comment 109

11 years ago
This bug’s testcase.swf - Memory use is stable.

Bug 354087 - Stable memory usage with this patch. 
Bug 377419 - Stable memory usage but has high CPU usage (48%) when visible, 2% when hidden. 
Bug 341380 - Memory is stable, CPU is sitting at 10%-15% on the test case swf.
Bug 363427 - I'd say not related - crashes in FULLSOFT.DLL would not be caused by this bug afaict.

I also checked a couple major sites  -

my yahoo - Stable memory usage with flash ads present.
youtube - I sat on one video for about 20 minutes, hitting reply every time it ended. Memory was stable once the video file was fully played (cached.)
Comment on attachment 267535 [details] [diff] [review]
Does applying this help?

OK, let's get this reviewed.  ;)
Attachment #267535 - Flags: superreview?(jst)
Attachment #267535 - Flags: review?(jst)
Comment on attachment 267535 [details] [diff] [review]
Does applying this help?

It's amazing how trivial a fix can be once the problem has been tracked down. Thanks Jim and bz!

r+sr=jst
Attachment #267535 - Flags: superreview?(jst)
Attachment #267535 - Flags: superreview+
Attachment #267535 - Flags: review?(jst)
Attachment #267535 - Flags: review+
Checked in the patch.  Please resolve bugs (including this one) as needed?
(Assignee)

Updated

11 years ago
Status: NEW → RESOLVED
Last Resolved: 11 years ago
Resolution: --- → FIXED
(Assignee)

Updated

11 years ago
Duplicate of this bug: 354087

Comment 114

11 years ago
Thanks guys! Would the patch be considered for the 1.8 branch at some point? Not only because of the bug's severity for some people, but also the FUD it's been used to generate regarding Firefox and memory leaks.

Comment 115

11 years ago
This is a trunk-only regression that was never present on the branches.
(In reply to comment #111)
> (From update of attachment 267535 [details] [diff] [review])
> It's amazing how trivial a fix can be once the problem has been tracked down.
> Thanks Jim and bz!
> 
> r+sr=jst
> 

Yes !
But it is also *amazing* that the process to assign the right persons and skills on such critical bugs is so *slow* (this bug was reported ONE YEAR ago!!, yes, ONE YEAR !)
How can the mozilla organization maintain some trust from users and reporters by neglecting their time spent and energy (people involved in this discussion have spent a total of weeks of workload, mainly in vain: the trouble was obvious, and reporters are not there to compile debug builds and run a debugger, are they ?)

Then please improve your process for assignments.
Please also improve the decision making process on new features and their upcoming drawbacks and impacts on end users (see the new installer for SM for example, or other long lasting bugs on attachments, cache, etc...), but this is another topic :-)
Olivier, the fact of the matter is that there is a shortage of "the right persons and skills".  If every single developer with a Windows setup and some knowledge of the code is stuck dealing with critical bugs (e.g. branch security bugs), we as a project have a problem.  Unfortunately we often have this problem...  See how many bugs get fixed in each branch update.  :(
Boris,
How many of these bugs are introduced by new 'nice to have' features or code architecture changes 'that look fancy to software architects but end users don't care' ? Priorities and decision making process are key, along with end user focus.

I really consider to revert to Mozilla 1.7 since later versions are just unusable, or why not Netscape 4.7 ?? (I kept an archive of it:-))
The branch bugs?  Very few.  Almost all are in code dating back to 1998 that predates the current code review setup.

Of course Mozilla 1.7 happens to have all those security bugs in unfixed form, since it's not being maintained anymore... ;)

You're right that priorities are key, and fixing security bugs in the shipped code trumps trunk work, unfortunately.  :(

Comment 120

11 years ago
I'm sure this fix hasn't caught up with the nightly builds process, but just to
notice that what seems to be this bug is alive and well in the week-old SeaMonkey new version alpha:
Build identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a6pre) Gecko/2007060714 SeaMonkey/2.0a1pre
where with a useful test page open:
http://www.garfield.com/comics/comics_todays.html
one can watch, using the MS-windows XP task manager Processes tab, the memory
held by SeaMonkey grow by ~100Kbytes/3_seconds while the user sits with hands
folded.

It would be a Good Thing if that build were checked to be better behaved on
this URL when this bug 342810 fix is included in the nightly build.

The problem is almost certainly Flash-oriented, since installing SeaMonkey 2.0
over SeaMonkey 1.5 somehow invalidated the Flash plugin, the page originally
came up unable to use Flash, and the buggy memory growth did not occur. Once I
reinstalled Flash and relaunched SeaMonkey, the buggy behavior was evident.

xanthian.
(In reply to comment #120)
> I'm sure this fix hasn't caught up with the nightly builds process, but just to

Then what is the point reporting it ?

> notice that what seems to be this bug is alive and well in the week-old
> SeaMonkey new version alpha:
> Build identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a6pre)
> Gecko/2007060714 SeaMonkey/2.0a1pre

Of course, the patch was checked in at "2007-06-07 14:41", which is after this build !

> where with a useful test page open:
> http://www.garfield.com/comics/comics_todays.html
> one can watch, using the MS-windows XP task manager Processes tab, the memory
> held by SeaMonkey grow by ~100Kbytes/3_seconds while the user sits with hands
> folded.

What's the point reporting a bug after it is fixed ?

> It would be a Good Thing if that build were checked to be better behaved on
> this URL when this bug 342810 fix is included in the nightly build.

You're welcome to do so !

> The problem is almost certainly Flash-oriented, since installing SeaMonkey 2.0
> over SeaMonkey 1.5 somehow invalidated the Flash plugin, the page originally
> came up unable to use Flash, and the buggy memory growth did not occur. Once I
> reinstalled Flash and relaunched SeaMonkey, the buggy behavior was evident.

Your installation issues have nothing to do with this bug ... Please, try a support area (newsgroup, ...).

***

To those who might not know,
fixing this (kind of) bug has been very recently discussed in newsgroup (at least),
and all there is to do here and now is thank those who made the actual fix possible at last.

This has now been done too;
then, please, stop commenting,
unless you find something wrong with the patch.

Comment 122

11 years ago
Hi serge. which newsgroup ?

xanthian in comment #120:
> I'm sure this fix hasn't caught up with the nightly builds process, but just to
> notice that what seems to be this bug is alive and well in the week-old
> SeaMonkey new version alpha:

you obviously know that is the case since you read that checkin just happened at 14:42:08 PDT 2007-06-07 and know the build won't appear until about 5:30 PTD today. So why the comment? When you check http://www.garfield.com/comics/comics_todays.html if you see a problem with that or another URL of a different magnitude of memory loss than what is seen prior to this page, there is a strong likelihood it is NOT this bug - there are other existing flash issues/bugs to be explored.


(In reply to comment #116)
> (In reply to comment #111)
> > (From update of attachment 267535 [details] [diff] [review] [details])
> > It's amazing how trivial a fix can be once the problem has been tracked down.
> > Thanks Jim and bz!
> 
> Yes !
> But it is also *amazing* that the process to assign the right persons and
> skills on such critical bugs is so *slow* (this bug was reported ONE YEAR
> ago!!, yes, ONE YEAR !)

actually a year and a month, since the problem was first reported afaict by Eugene in bug 340260. But, um, that bug didn't get attention because it was filed as a suite bug and not of firefox bug.  

Then this bug appeared and bz hooked the two together.  bz did a yoman's job as usual trying the guide the bug and bring the right people resources to bear and for whatever odd reason it never worked out. Lack of such resources will always be a problem -- witness the high percentage of regression bugs of 326273 that are not fixed. 

But I do see room for process improvements. Consider:  

a) it took 4 months to get blocking 1.9+ for a well-defined, critical bug,

b) the bug may have been overlooked or deprioritized because it involved a plugin, 

c) what was the plan (was there a plan) to explicitly monitor for important regressions which were inevitable after the landing of such a large patch as was bug 326273 (a huge undertaking in itself)?

d) there is an apparent need to consider pervasive plugins and perhaps even add-ons in the development process, and more importantly expanded/better automated regression testing - expanded testing capabilities is being explored for tinderboxes iirc - is that correct?

and on the lighter side, 
e) why wan't jim hired sooner? :)  sign that man up for bug 341430!

Updated

11 years ago
Target Milestone: mozilla1.9beta1 → mozilla1.9alpha6
(In reply to comment #122)
> Hi serge. which newsgroup ?

About a better handling of such regressions ?
The one I was involved in was mozilla.dev.planning.

Comment 124

11 years ago
(In reply to comment #121)

It is simple awesome to see a developer
complain about being furnished with
additional debugging information. The
usual whine is about not getting enough.

Appropriate response sent as private email.

> installation issues

Umm, the download nightly from mozilla.org is
an _installer_; guess who that makes responsible
for the 2.0 alpha SeaMonkey installation
nerfing recognition of my existing flash
plugin that worked with the 1.5a SeaMonkey
build?

(In reply to comment #122)

> So why the comment?

A "bug" is its symptoms, not some one of its code
error instances. [I've fixed "a (crashy) bug" that
had 704 separate instances in the eBay source code.]
If the garfield.com behavior survives the next build,
then bz fixed one but not all instances of the error
that displays itself as the bug. Thus it is worth
reporting new bug instances even into the teeth of a
point fix in the source code, so developers can
check that the bug is thoroughly squashed, and not
merely one of its instances fixed that happens to
present against a single regression test.

This isn't rocket science; a good set of real world
regression tests against a bug is _always_ valuable
to developers to acquire.

(In reply to comment #116)

The code is maintained by volunteers, for the most
part, though a bit of money comes in from AOL/Netscape.

There are critical bugs over six years old still active
against the source code, a thirteen month turnaround is
fairly normal, not something about which to overexcite
yourself. Take a deep breath and count how much money
you've spent to purchase/support the product, and be at
peace. You _could_ be running Internet Explorer, and
offering your computer as a sacrifice to every malware
writing team on the planet, instead.

xanthian.
(In reply to comment #124)
> A "bug" is its symptoms, not some one of its code
> error instances. [I've fixed "a (crashy) bug" that
...
> that displays itself as the bug. Thus it is worth
> reporting new bug instances even into the teeth of a
> point fix in the source code, so developers can
> check that the bug is thoroughly squashed, and not
> merely one of its instances fixed that happens to
> present against a single regression test.

That doesn't mean that separate problems should be reported in the same bug report.  Generally, a new report should be opened to avoid the confusion that results when different problems are discussed in the same report; otherwise bugzilla would have only 10 or so bug reports:
 1) Firefox sometimes crashes
 2) Firefox sometimes uses too much memory
 3) Firefox user interface is confusing
 etc.
When it's not clear whether a new bug report should be used or comments added to an old one, it's better to err on the side of a new bug report, since it's easy to mark a bug as duplicate, but hard to split comments out, especially when trying to understand one of the many problems described in a single report.
(In reply to comment #124)

> (In reply to comment #116)
> 
> The code is maintained by volunteers, for the most
> part, though a bit of money comes in from AOL/Netscape.
> 
> There are critical bugs over six years old still active
> against the source code, a thirteen month turnaround is
> fairly normal, not something about which to overexcite
> yourself. 

Volunteers more than others prefer not wasting their time. This is valid for reporter, triagers, developers, and all the others. And improving the process is a key factor, along with tools and some level of documentation, to shorten software development cycle and respect contributors.
I see no reason to waste my time because it is free software. The assumption that my (our/your) time is free is just wrong. Donating time is as much valuable as donating money, and becomes even more effective when the process keeps improving.

>Take a deep breath and count how much money
> you've spent to purchase/support the product, and be at
> peace. 

I answered about my time.
I expect to see public reports of the annual incomes and expenses of the Mozilla Foundation before I give a cent to it. They are not on http://www.mozilla.org/foundation/donate.html .
And see how much money, and resources, were given for example by Google, Adobe and others.
(In reply to comment #124)
> (In reply to comment #121)
> 
> It is simple awesome to see a developer

I did not develop this feature.

> complain about being furnished with
> additional debugging information. The

I fail to see what "debugging information" your comment 120 contains.

> usual whine is about not getting enough.

That's all you seem to be doing here and now.

> Appropriate response sent as private email.

Your email being
{{
Serge,

Are you always this much of an asshole,
[...]

xanthian.
}}

This is the second and probably the last time I care about any of your comment.

> > installation issues
> 
> Umm, the download nightly from mozilla.org is
> an _installer_; guess who that makes responsible

Obviously, NOT this bug.

> (In reply to comment #122)
> 
> If the garfield.com behavior survives the next build,
> {...}
> 
> This isn't rocket science; a good set of real world
> regression tests against a bug is _always_ valuable
> to developers to acquire.

Who cares about "if"s, after the patch was checked in ?
Test the fix first !

Comment 128

11 years ago
(In reply to comment #127)

> Who cares about "if"s, after the patch was checked in ?
> Test the fix first !

Behavior remains as described in comment #120 as of the
Seamonkey 2.0pre 2007060902 build.

xanthian.

Comment 129

11 years ago
(In reply to comment #128)
> Behavior remains as described in comment #120 as of the
> Seamonkey 2.0pre 2007060902 build. 
> xanthian.

what version of flash?


(comment #120)
> http://www.garfield.com/comics/comics_todays.html
> one can watch, using the MS-windows XP task manager Processes tab, the memory
> held by SeaMonkey grow by ~100Kbytes/3_seconds 

http://www.garfield.com/comics/comics_todays.html
Shockwave Flash 8.0 r22 (from C:\WINNT\system32\Macromed\Flash\NPSWF32.dll)
and Shockwave Flash 9.0 r45
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a6pre) Gecko/2007060809 SeaMonkey/2.0a1pre

takes 5MB in the first few seconds, but then after 2 minutes and 7MB it is flatlined, even when loaded in two tabs

in addition, it gives up it's memory when replaced by about:blank

the bug is dead, long live the bug.

Comment 130

11 years ago
(In reply to comment #129)
> (In reply to comment #128)
> > Behavior remains as described in comment #120 as of the
> > Seamonkey 2.0pre 2007060902 build. 
> > xanthian.
> 
> what version of flash?

more to the point I should have asked, what memory usage at what points in time?
(Assignee)

Comment 131

11 years ago
Hey all – I’m attaching an updated patch for this. While doing some debugging this weekend I found that the changes from last year also broke performance oriented cache logic for the nsRunable objects that were leaked. This patch fixes that without sacrificing the recent ref counting fix. I’ve also removed some cruft from the code that wasn’t necessary. 

I’ve run the same leak checks to make sure I’ve not introduced new memory leak issues into the source. 

This patch replaces the patch from last week. 

Overall – I’m still not happy with our performance. On a typical flash laden site, IE7 still beats Mozilla in term of performance by about 50%. This seems odd considering the two models are fairly similar – maybe worthy of some additional research.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Jim, how much do the IE and Mozilla flash plugins actually have in common?  Last I checked, the IE plugin used the ActiveX APIs and was pretty different from the NPAPI one in some respects.
(Assignee)

Comment 133

11 years ago
Created attachment 267875 [details] [diff] [review]
Updated patch
Attachment #267535 - Attachment is obsolete: true
Does the new patch have anything to do with memory leaks, or should it be on a different bug?  (It looks like the OS2 part won't compile because you didn't remove the aMsg param from the definition of Init.)
(Assignee)

Comment 135

11 years ago
Hey David - sorry if I handled that wrong. I wasn't sure if I should re-open it. But this effectively replaces the old patch, addressing the same memory leak so the same qa should take place to make sure that has stayed fixed.

I'll patch the os2 stuff and resubmit.

Boris - Under the hood, it's quite similar. Active-x controls are contained windows - windows created by the browser which host the control. They both have message queues, act on standard windowing events like paint, and user input messaging, etc. Neither involves any scripting or anything, the code paths are compiled c++ in both. Seems like we should be able to get about the same level of performance.  

Having written both I found more similarities than differences. You can (and I'd bet Adobe probably does) use the same code base to handle all the internal workings of the control. The dfference is in the shim that acts as the host.  Firefox has the nsapi and ie uses a COM based window control.
(Assignee)

Comment 136

11 years ago
Created attachment 267881 [details] [diff] [review]
fixed os2 compile
Attachment #267875 - Attachment is obsolete: true
(Assignee)

Updated

11 years ago
Attachment #267881 - Flags: superreview?(jst)
Attachment #267881 - Flags: review?(jst)

Comment 137

11 years ago
Comment on attachment 267881 [details] [diff] [review]
fixed os2 compile

>Index: modules/plugin/base/src/nsPluginNativeWindowOS2.cpp
>@@ -134,18 +127,19 @@ public: 
>-  already_AddRefed<nsIRunnable> GetPluginWindowEvent(HWND aWnd, ULONG aMsg,
>-                                                     MPARAM mp1, MPARAM mp2);
>+  PluginWindowEvent* GetPluginWindowEvent(HWND aWnd,
>+                                          MPARAM mp1, 
>+                                          MPARAM mp2);

You're removing the already_AddRefed<> decoration. That looks extremely dangerous.
(In reply to comment #129)
> takes 5MB in the first few seconds, but then after 2 minutes and 7MB it is
> flatlined, even when loaded in two tabs
> in addition, it gives up it's memory when replaced by about:blank

That's much like what I saw while testing for bug 377419 comment 16.
(Although the (released/allocated) memory seemed to increase a little over time.)

(v1.8.1 branch pattern seemed more like: flat for a while, then a big allocation, then flat (forever ?); and released if you leave the page.)


(In reply to comment #131)
> I found that the changes from last year also broke
> performance oriented cache logic for the nsRunable objects that were leaked.

Would that be somehow related to bug 377419 perf regression ?

> Overall – I’m still not happy with our performance. On a typical flash
> laden site, IE7 still beats Mozilla in term of performance by about 50%. This

(I don't know about IE7, but first thing would be to get back to what perf we have with Gecko 1.8.1 branch.)
timeless, he's also removing the addref...
(Assignee)

Comment 140

11 years ago
> You're removing the already_AddRefed<> decoration. That looks extremely
> dangerous.

Yes, although it doesn't leak.  The threadmanager changes broke the caching, and added the memory leak. We fixed the leak, but after digging through it it ended up being the broken caching that caused the whole problem. I can put this back in and still make it work if that's a safer way to do it. Really all I did here was (re)modeled this to look more like the original code in bonecho.

> Would that be somehow related to bug 377419 perf regression ?

Man, what is up with that site! :) There's very little Flash and it still uses 35% CPU for me with my local test version that has this patch in place. I'll take it for now and see if I can track down the cause. 

> (I don't know about IE7, but first thing would be to get back to what perf we
> have with Gecko 1.8.1 branch.)

We're there with this patch. I did a quick comparison of the amount of assembly involved in this and in bonecho - it's pretty darn close. We added a little overhead due to the threadmanager changes but not enough to make a difference. 
(In reply to comment #140)
> > Would that be somehow related to bug 377419 perf regression ?
> 
> Man, what is up with that site! :) There's very little Flash and it still uses
> 35% CPU for me with my local test version that has this patch in place. I'll
> take it for now and see if I can track down the cause. 

Thanks ;->
(Notice that I mentioned it only as Flash related: my timeframe and your test confirm that that regression is unrelated to this bug.)

> > first thing would be to get back to what perf we
> > have with Gecko 1.8.1 branch.)
> 
> We're there with this patch. I did a quick comparison of the amount of assembly
> involved in this and in bonecho - it's pretty darn close.

Good, then. Thanks for looking further into it :-)
I was thinking more about bug 377419 like regression(s).
Comment on attachment 267881 [details] [diff] [review]
fixed os2 compile

Jim, in PluginWindowEvent::Run() I see this change:

NS_TRY_SAFE_CALL_VOID(::CallWindowProc(win->GetWindowProc(), 
                         hWnd, 
-                        GetMsg(), 
+                        WM_USER+1, 
                         GetWParam(), 
                         GetLParam()),
                         nsnull, inst);

How is this the right thing to do? Seems to me that it's not safe to say the message is always WM_USER+1. That's the case for nsDelayedPopupsEnabledEvent::Run(), but it's not clear to me that it's true here.
(Assignee)

Comment 143

11 years ago
That's just some over zealous optimization that doesn't need to be there because the compiler does it for us. :) (Msg is always WM_USER+1, it's never anything else.) I'll post an update with proper msg handling later tonight - I just had to clobber my source tree and recompile after a checkout.
(Assignee)

Comment 144

11 years ago
Created attachment 268183 [details] [diff] [review]
updated msg handling
Attachment #267881 - Attachment is obsolete: true
Attachment #268183 - Flags: superreview?(jst)
Attachment #268183 - Flags: review?(jst)
Attachment #267881 - Flags: superreview?(jst)
Attachment #267881 - Flags: review?(jst)
Comment on attachment 268183 [details] [diff] [review]
updated msg handling

 void PluginWindowEvent::Init(const PluginWindowWeakRef &ref, HWND aWnd,
                              ULONG aMsg, MPARAM mp1, MPARAM mp2)
 {
-  NS_ASSERTION(aWnd!=NULL && aMsg!=0, "invalid plugin event value");
-  NS_ASSERTION(mWnd==NULL && mMsg==0 && mWParam==0 && mLParam==0,"event already in use");
+  NS_ASSERTION(aWnd!=NULL, "invalid plugin event value");
+  NS_ASSERTION(mWnd==NULL,"event already in use");

While you're here, you wanna add the missing space after the comma in the second assert there (both files)?

r+sr=jst

Let me know if you need help landing this.
Attachment #268183 - Flags: superreview?(jst)
Attachment #268183 - Flags: superreview+
Attachment #268183 - Flags: review?(jst)
Attachment #268183 - Flags: review+
I just landed the last fix (with minor tweaks to the whitespace in those assertions). Marking FIXED. Thanks again Jim!
Status: REOPENED → RESOLVED
Last Resolved: 11 years ago11 years ago
Resolution: --- → FIXED

Updated

11 years ago
Keywords: qawanted
Duplicate of this bug: 340260
No longer blocks: 354087
Blocks: 341430
No longer depends on: 341430
Regressions generally go in the "depends on" field.
No longer blocks: 341430, 381699
Depends on: 341430, 381699
Opps, these weren't regressions of this bug. Never mind me.
Blocks: 341430, 381699
No longer depends on: 341430, 381699
You need to log in before you can comment on or make changes to this bug.