Closed Bug 303163 Opened 19 years ago Closed 19 years ago

[FIX]Huge memory leak at http://www.cbsnews.com/

Categories

(Core :: DOM: Core & HTML, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla1.8beta4

People

(Reporter: mmoy, Assigned: bzbarsky)

References

()

Details

(4 keywords)

Attachments

(2 files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8b4) Gecko/20050802 Firefox/1.0+
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8b4) Gecko/20050802 Firefox/1.0+

Memory gets consumed at a rate of 3 MB per second on this site with the latest
trunk build. I've tested Trunk builds going back to 6/3/2005 and the problem
goes back at least to there.

There appears to be some kind of ad or image loop with messages repeating in the
status bar.

Reproducible: Always

Steps to Reproduce:
1. Go to http://www.cbsnews.com/
2.
3.

Actual Results:  
Firefox quickly consumed 300 MB of memory until I stopped it.

Expected Results:  
Loaded the webpage using a reasonably amount of memory
Confirmed: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b4)
Gecko/20050802 Firefox/1.0+

Firefox's memory usage shoots through the roof with no indication of stopping.
IE and Opera show no noticable increase in memory usage.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: blocking1.8b4?
I suspect this is a dupe of bug 303043.
Keywords: mlk
(In reply to comment #2)
> I suspect this is a dupe of bug 303043.

I'll try the patch to see if it fixes the problem.
> I've tested Trunk builds going back to 6/3/2005 and the problem
> goes back at least to there.

If this is the case, then it's nothing to do with bug 303043 (feedview)
Disabling JavaScript prevents the memory leak. This bug also goes back way
beyond any feedview checkins.
Yeah, I just found that out when I tried to patch my DPA2 directory tree and
there was no such directory.
Regression window of: 2005-03-07-07 and 2005-03-08-07. I'd guess bug 285188,
since it deals with recursion protection. CCing jst and bzbarsky.
Keywords: regression
Tried reproducing this with the latest build, new profile with default settings,
flash installed, AMD Athlon 2000+, 512 MB RAM. After 15 minutes I could only
conclude that this problem must be hardware related.
Screenshot: http://img239.imageshack.us/img239/9262/cbs6mk.jpg
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b4) Gecko/20050802
Firefox/1.0+ ID:2005080222
I've reproduced this on one of my Pentium 4 systems running Windows XP and
also on my Athlon 64 3200+ system on both the Windows XP 32-bit and Windows XP
64-bit operating systems.

There are other confirmations of this problem in two threads at MozillaZine.

Some can reproduce the problem and some can't. Those that can't usually have some
kind of ad-blocking settings in the browser, on their system or elsewhere.
(In reply to comment #9)

Sorry about not being clear; I tested it on a new profile (= first run) without
extensions and with default prefs.
On another system (AMD Duron 900, 240 MB RAM, same testing conditions) I get a
quite other result, visible here:
http://img325.imageshack.us/img325/5991/cbs22rx.jpg
(In reply to comment #10)
> (In reply to comment #9)
> 
> Sorry about not being clear; I tested it on a new profile (= first run) without
> extensions and with default prefs.
> On another system (AMD Duron 900, 240 MB RAM, same testing conditions) I get a
> quite other result, visible here:
> http://img325.imageshack.us/img325/5991/cbs22rx.jpg

There are other ways to block ad sites external to Firefox.
WFM: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b4) Gecko/20050802
Firefox/1.0+
should be: OS All
grows'n'grows with Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8b4)
Gecko/20050803 Firefox/1.0+

and I get a lot of these messages over and over again:
[...]
CSS Error (http://www.cbsnews.com/common/css/main.css :4.53): Error in parsing
value for property 'visibility'.  Declaration dropped.
CSS Error
(http://www.cbsnews.com/sections/home/main100.shtml?display=high&width=320&height=304
:0.132): Unknown property 'layer-background-color'.  Declaration dropped.
CSS Error
(http://www.cbsnews.com/sections/home/main100.shtml?display=high&width=320&height=304
:170.191): Error in parsing value for property 'cursor'.  Declaration dropped.
CSS Error
(http://www.cbsnews.com/sections/home/main100.shtml?display=high&width=320&height=304
:171.197): Error in parsing value for property 'cursor'.  Declaration dropped.
CSS Error
(http://www.cbsnews.com/sections/home/main100.shtml?display=high&width=320&height=304
:0.36): Expected color but found 'null'.  Error in parsing value for property
'background-color'.  Declaration dropped.
CSS Error
(http://www.cbsnews.com/sections/home/main100.shtml?display=high&width=320&height=304
:0.36): Expected color but found 'null'.  Error in parsing value for property
'background-color'.  Declaration dropped.
WARNING: NS_ENSURE_TRUE(NS_SUCCEEDED(rv)) failed, file
/home/jklippel/mozilla/dom/src/base/nsDOMClassInfo.cpp, line 6207
JavaScript error: , line 0: uncaught exception: [Exception... "Failure" 
nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame ::
http://www.cbsnews.com/common/js/hbe.js :: $x :: line 49"  data: no]
WARNING: NS_ENSURE_TRUE(NS_SUCCEEDED(rv)) failed, file
/home/jklippel/mozilla/dom/src/base/nsDOMClassInfo.cpp, line 6207
JavaScript error: , line 0: uncaught exception: [Exception... "Failure" 
nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame ::
http://www.cbsnews.com/common/js/hbe.js :: $x :: line 49"  data: no]
WARNING: NS_ENSURE_TRUE(NS_SUCCEEDED(rv)) failed, file
/home/jklippel/mozilla/dom/src/base/nsDOMClassInfo.cpp, line 6207
JavaScript error: , line 0: uncaught exception: [Exception... "Failure" 
nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame ::
http://www.cbsnews.com/common/js/hbe.js :: $x :: line 49"  data: no]
[...]
If it locks up the browser, should that give it a "crash" key word?
worcester12345@yahoo.com: absolutely not. crash means crash not hang. we have a 
keyword for hang, oddly enough, it's "hang".
Severity: critical → major
(In reply to comment #15)
> worcester12345@yahoo.com: absolutely not. crash means crash not hang. we have a 
> keyword for hang, oddly enough, it's "hang".

Thanks. Even more oddly, it wouldn't let me add that key word when it used to
before. MORE ODDLY STILL, it said I can't change severity even though I never
went near that field.
Running Build Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b4)
Gecko/20050803 Firefox/1.0+ with multiple extensions (Forecastfox 0.8.1.2,
Adblock 0.5.2.039, statusbarclock 1.7.1, and stop-or-reload 0.1), I get maximum
CPU usage until the page completes loading, and then back to baseline.  I also
don't see any incremental increases in memory usage.  I get only the following
in my javascript console (no infinite looping of error messages):

Error: Error in parsing value for property 'visibility'.  Declaration dropped.
Source File: http://www.cbsnews.com/common/css/main.css
Line: 4
Error: Unknown property 'layer-background-color'.  Declaration dropped.
Source File: http://www.cbsnews.com/sections/home/main100.shtml
Line: 0
Error: Error in parsing value for property 'cursor'.  Declaration dropped.
Source File: http://www.cbsnews.com/sections/home/main100.shtml
Line: 151
Error: Error in parsing value for property 'cursor'.  Declaration dropped.
Source File: http://www.cbsnews.com/sections/home/main100.shtml
Line: 152
Error: Expected color but found 'null'.  Error in parsing value for property
'background-color'.  Declaration dropped.
Source File: http://www.cbsnews.com/sections/home/main100.shtml
Line: 0
Error: Expected color but found 'null'.  Error in parsing value for property
'background-color'.  Declaration dropped.
Source File: http://www.cbsnews.com/sections/home/main100.shtml
Line: 0
Error: Error in parsing value for property 'display'.  Declaration dropped.
Source File: http://www.cbsnews.com/sections/home/main100.shtml
Line: 0
Error: uncaught exception: [Exception... "Failure"  nsresult: "0x80004005
(NS_ERROR_FAILURE)"  location: "JS frame ::
http://www.cbsnews.com/common/js/hbe.js :: $x :: line 49"  data: no]
Error: replyToComment is not defined
Source File: https://bugzilla.mozilla.org/show_bug.cgi?id=303163
Line: 1
Adding "hang" keyword, as per comment 14 thru comment 16.

Can confirm a lot of JavaScript activity, looks like the relevent offender is this:

Error: uncaught exception: [Exception... "Failure"  nsresult: "0x80004005
(NS_ERROR_FAILURE)"  location: "JS frame ::
http://www.cbsnews.com/common/js/hbe.js :: $x :: line 49"  data: no]
Keywords: hang
Brendan,

Can you take a look?
Assignee: nobody → brendan.eich
Flags: blocking1.8b4? → blocking1.8b4+
Keywords: qawanted
If I press the stop button I get the following dialog:

Warning: Unresponsive script
Assignee: brendan.eich → brendan
I'm on leave and won't have time to look at this further until some time next
week, so assigning to me is not the best way to get fast relief.

The NS_ENSURE_TRUE lines cited in comment 13 mean that the DOM was not able to
register event handlers set by the $x function, whose definition is mostly on
line 49 in hbe.js (this JS is badly written and poorly laid out, but indent -kr
sort of beautifies it to the point where it can be read without going blind).

If there was a JS error thrown when compiling, it's being trodden upon by the
DOM or XPConnect NS_ERROR_FAILURE exception, which generates the "uncaught
exception" lines.

The CSS errors cited earlier in comment 13 are irrelevant.

Cc'ing sicking, soliciting comments from DOM folks.

/be
Bob may be able to diagnose this further.

/be
[I got some free time due to baby nap / Mom and toddler outing convergence,
hallelujah!]

The JS error-as-exception being stomped by XPConnect (which takes the DOM's
NS_ERROR_FAILURE rv and turns it into an XPConnect 'Exception... "Failure"' etc.
shown in comment 13) is "syntax error".  The stack property of this original JS
exception is

onclick()@:0
$x()@http://www.cbs news.com/common/js/hbe.js:49
anonymous([object Event])@http://www.cbsnews.com/common/js/hbe.js:53
@:0

and it's now clear that this bug, or at least the syntax error where there was
no syntax error before, is a regression caused by bug 299209.

But 299209 fixed JS to report a top-level anonymous function declaration as a
syntax error, per ECMA-262, and to avoid complicating code generation for a
useless expression that would be optimized away.  So 'function () {...}' is an
error when not part of a larger expression.

However, the DOM uses the JS API to compile event handler source strings as if
they were top-level programs, when (as in this case, when function $x sets an
onclick handler by assignment) event handler source strings must be parsed as
expressions, not as top-level declarations or statements.

This requires either a new JS API or for the DOM to wrap the string it compiles
in parentheses to force expression context.  Whatever the solution, this bug
ought to be fixed for 1.8, but it need not block branching.

Of course, continuing after I debugged this regression led to no runaway memory
use, and cbsnews.com loaded and seems to work.  I'm running FC3 Linux.

I'll file a separate bug on the problem of XPConnect stomping on a pending JS
exception.

/be
Component: General → JavaScript Engine
Flags: blocking-aviary1.5+
OS: Windows XP → All
Product: Firefox → Core
Hardware: PC → All
Target Milestone: --- → mozilla1.8beta4
Version: unspecified → 1.0 Branch
There are at least four distinct bugs here:

1.  The runaway memory growth problem, still not understood.

2.  The tendency of XPConnect to stomp on a pending JS exception, which I've
reported as bug 303720.

3.  This is new to me, or worse than I'd heard recently (I heard we recompile
event handlers on every dispatch, or some such):

When you get an event handler property (e.g. in 'if (d.onclick) ...' where
d=a.links[b] as in function $x from line 49 of cbsnews.com's hbe.js file), the
nsDOMClassInfo.cpp code in nsEventReceiverSH, and the code it calls in
nsEventListenerManager.cpp, conspire to get the event handler's body source from
its DOM attribute value, compile it, and return the compiled function object
reference.

I'm in gdb right now debugging cbsnews.com and I see the value of the onclick
attribute of a certain HTMLAnchorElement is "check(" -- that's it.  That is
indeed a syntax error, not a valid JS program (which a function body, that is,
an event handler body, must be).  But where did this bogus value come from?

Groveling around in gdb shows it to be the onclick value of document.links[28]
in http://www.cbsnews.com/sections/home/main100.shtml, but the source I wget
from that URL has no occurrences of "check(" or even the word "check" in it, and
the 29th link counting from 1 has no onclick attribute.  Maybe it was computed
by a script, but I didn't catch any scripted assignment via a conditional
breakpoint in nsEventReceiverSH::NewResolve.

There could be a further bug in our code here, or a bug in cbsnews.com content.

If the bug is in cbsnews.com content computing a bogus onclick="check(" setting,
then this is at least a tech evangelism bug.  It's allowed and often useful for
our DOM implementation to complain about malformed event handler attribute
values, although we could use a bug to address our manic recompilation of the
same DOM event handler attribute value on every get (which may be the same bug
as the one I'd heard rumors of, where we recompile on every event handler
invocation!).

I have filed bug 303728 to track the "check(" mystery, but I'd like another bug
on file about our frenetic recompilation on every get of a property that
reflects an event handler attribute.  Anyone know of such a bug?

Note that this item does not cause any malfunction in my testing, only a perf
hit that we haven't escalated based on profiling, and of course the diagnostic
output on the JS console that led to me turning over some rocks in the woods,
and finding creepy-crawlies, this fine afternoon.

4.  The bug that I wrongly concluded in my last comment was at work here: that
bug 299209 has unwittingly changed the JS API incompatibly.  I have filed bug
303723 on that API compatibility vs. bona-fide bugfix conundrum.

I will leave this bug tracking item 1, the runaway memory problem, but I am not
the right owner for it.  We need to start with someone who can reproduce it.

/be
Assignee: brendan → nobody
Component: JavaScript Engine → General
Product: Core → Firefox
Target Milestone: mozilla1.8beta4 → ---
Version: 1.0 Branch → Trunk
Bonsai query for the regression window in comment 7 at Brendan's request:
http://tinyurl.com/c2tmh

Probably best if someone double checks that I've got it right.
(In reply to comment #24)
>
> attribute of a certain HTMLAnchorElement is "check(" -- that's it.  That is
> indeed a syntax error, not a valid JS program (which a function body, that is,
> an event handler body, must be).  But where did this bogus value come from?

<http://www.cbsnews.com/common/js/nav.js>
We take the else branch in function makeHome() which document.writes a link with 
 onclick="check(\"other\");" 
The escaping works in JS, but not when it is parsed as HTML. They don't make the
same mistake on the IE or NS branches.
(In reply to comment #27)
> <http://www.cbsnews.com/common/js/nav.js>
> We take the else branch in function makeHome() which document.writes a link with 
>  onclick="check(\"other\");" 
> The escaping works in JS, but not when it is parsed as HTML. They don't make the
> same mistake on the IE or NS branches.

Bob, you rock my world! ;-)

So that leaves the memory runaway condition, whatever it is.  Did you reproduce
that on any of your systems?

/be
(In reply to comment #28)

> 
> So that leaves the memory runaway condition, whatever it is.  Did you reproduce
> that on any of your systems?

Not yet. It may be related to specific ads which aren't served for every visit.
I only have winxpsp2 to test with, but will keep trying.

Anyone have pointers to the mozillazine threads mentioned in comment 9? All I
found was a thread on crashes not that relevant to this bug.
I can reproduce the runaway memory condition everytime whether it's on one of my
builds or on the official nightlies. One minor point is that I'm running Windows
XP 64-bit edition but I don't think that matters as there was a report on Linux
along with a report from someone else that I'm pretty sure is running Windows 32.

I setup a profile with no extensions and no themes though there may be a few
pref settings turned on or off.

If there are some debugging flags or settings that you want enabled for logging or
if you'd like me to try an instrumented build, let me know.

One of the threads is at:
http://forums.mozillazine.org/viewtopic.php?t=300482&postdays=0&postorder=asc&postsperpage=15&start=45

The other one is in my thread at the unofficial builders site but I think that
the one above is more useful as it has Cusser's testing in it.
I'm on Win32 (Windows XP Pro SP2). 

Complete hardware profile available:
http://www.cusser.net/misc/pc_info.txt
(In reply to comment #30)

Since turning off JavaScript stops the memory increase, please open Venkman,
then open the offending page, then when the memory increase is apparent, click
the Stop button in Venkman and give us where it stops and the stack. Thanks.
I tried installing and running Venkman but was unable to get it to work. I saw
lots of comments indicating that it's broken on the trunk.

Do you have a link to a working XPI for Venkman?
(In reply to comment #33)

<http://hacksrus.com/~ginda/venkman/> links to venkman 0.9.85. A new bug appears
to have appeared in Firefox trunk builds today 20050808. I can start venkman and
break the js execution, but after closing venkman it can not be reopened and
when firefox is exited, the process does not exit and must be killed. You can
either deal with that or run a 20050806 build which does not appear to have this
problem.
I tried Venkman on an older build from June. How do you use it?

I still had the problem with having to kill it using the Task Manager to get rid
of the process.

Could someone provide some easy-to-use directions on using it or provide another
way to debug the memory leak?
Dbaron - can you take a look or figure out the right owner?
Assignee: nobody → dbaron
in a recent post on mozillazine
http://forums.mozillazine.org/viewtopic.php?p=1684773#1684773

a user reports this:

"I've been seeing more crashes lately and tonight disabled javascript and looked
at an offending page. This ad script is the offender:

<!-- Casale Media 2005 (C) -->
<!-- Ad Format: Pop Under -->
<!-- Domain(s): imageshack.us, exs.cx -->
<script language="Javascript"><!--
var d=new Date();var r=(d.getTime()%8673806982)+Math.random();var
u=escape(window.location.href);
var host=' language="Javascript" src="http://as.casalemedia.com/s?s=';
document.write('<scr'+'ipt'+host+'51779&u='+u+'&f=1&id='+r+'"></scr'+'ipt>');
//--></script>

<!-- Casale Media 2005 (C) -->


Using Adblock, I blocked the site as.casalemedia.com, enabled javascript and
reloaded the page without problem.

Hope this helps. Love Firefox, hate the crashes, but not enough to go back to IE."

Does this help with the memory leak at all?
The problem is that this page bypasses our frame recursion protection by
setting document.location instead of setting src on the iframe.  Since the
recursion protection is done in the iframe code, this ends up in an infinite
load loop, with the number of live docshells/windows/documents growing without
bound.

My proposed solution is twofold:

1) Make contentWindow and contentDocument on frames that have reached the
   recursion limit return null.  This we can do for 1.8.
2) Move the recursion protection into docshell so that it happens no matter
   what.  This we should probably do for 1.9.
Assignee: dbaron → general
Component: General → DOM
Product: Firefox → Core
QA Contact: general → ian
Assignee: general → bzbarsky
Keywords: qawanted
Priority: -- → P1
Summary: Huge memory leak at http://www.cbsnews.com/ → [FIX]Huge memory leak at http://www.cbsnews.com/
Target Milestone: --- → mozilla1.8beta4
Blocks: 285188
Attachment #193376 - Flags: superreview?(jst)
Attachment #193376 - Flags: review?(jst)
I filed bug 305448 on the leak that's left here once this is fixed.
Comment on attachment 193376 [details] [diff] [review]
Proposed patch to do step 1

Yeah, looks good. r+sr=jst
Attachment #193376 - Flags: superreview?(jst)
Attachment #193376 - Flags: superreview+
Attachment #193376 - Flags: review?(jst)
Attachment #193376 - Flags: review+
Comment on attachment 193376 [details] [diff] [review]
Proposed patch to do step 1

Requesting branch approval for this regression fix.  This is reasonably safe,
and restores some frame recursion protection that we had lost.
Attachment #193376 - Flags: approval1.8b4?
Attachment #193376 - Flags: approval1.8b4? → approval1.8b4+
Fixed, trunk and branch.  I filed bug 305524 on item 2 of comment 38.
Status: NEW → RESOLVED
Closed: 19 years ago
Keywords: fixed1.8
Resolution: --- → FIXED
no crash on minimal testcase firefox 1.5 rc2 winxp/linux
Keywords: fixed1.8verified1.8
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: