516458 - Excessive memory consumption/OOM on http://bbarker.co.uk

Reporter

Description

•

15 years ago

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.9.0.14) Gecko/2009090216 Ubuntu/9.04 (jaunty) Firefox/3.0.14
Build Identifier: firefox-3.7a1pre.en-US.linux-i686.tar.bz2

I'm filing this as a Javascript bug as it seems likely, but the fault may lie elsewhere.

Viewing this URL on a Thinkpad T42p with 1.5GB memory running Ubuntu Jaunty: http://bbarker.co.uk/animate/map.php?file=Edinburgh_to_Holy_Island_cycle.gpx

results in resident memory rising until desktop becomes unresponsive at around 1GB resident, usage is still increasing (highest I've seen is 1.2GB resident then top becomes unresponsive). At this point either firefox is killed, or xorg hangs. 

Confirmed with firefox-3.0.14 and firefox-3.7a1pre nightly.

I already reported this bug in Ubuntu: https://bugs.launchpad.net/ubuntu/+source/firefox-3.0/+bug/429476

Reproducible: Always

Steps to Reproduce:
1.go to http://bbarker.co.uk/animate/map.php?file=Edinburgh_to_Holy_Island_cycle.gpx
2. that's it
3.
Actual Results:  
resident memory increases, eventually firefox is killed or desktop hangs

Expected Results:  
Chromium uses about 80MB resident memory for the same page.

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 1

•

15 years ago

Firefox 3.5.3 with 32-bit Windows XP SP 3: Memory consumption rises to 200 MB, then goes down to 20-30 MB, but virtual memory size is higher than 700 MB.

Andreas Gal :gal

Comment 2

•

15 years ago

Confirmed. For some reason the GC doesn't kick in and our RSS size grows out of control. We have a patch for much better GC scheduling. Either we land that, or we have to investigate how to better address this problem here with the current heuristics. I saw a JSON failure instead of a crash, so my suspicion would be a ton of strings being created.

Status: UNCONFIRMED → NEW

Ever confirmed: true

David Mandelin [:dmandelin]

Assignee

Updated

•

15 years ago

Summary: crash on viewing page → Excessive memory consumption/OOM on http://bbarker.co.uk

Andreas Gal :gal

Updated

•

15 years ago

Depends on: 506125

David Mandelin [:dmandelin]

Assignee

Comment 3

•

15 years ago

I tried it on MacOS 10.5 with a debug build and it worked (mostly). When I first loaded the page, CPU went to 100% and FF became unresponsive for at least 10 seconds. During this time memory usage (Real Memory in Activity Monitor) went up to about 1GB but then it went back down to reasonable levels. I clicked "Continue" to the slow script dialog. Eventually the map with the track did come up, and then it worked normally.

I found that Safari had somewhat similar behavior: slow startup, but then worked normally. I tried Chrome on Windows and startup was pretty quick, and then it worked normally.

So what actually should be the bug here? Is it that memory usage goes up to ~1GB during the startup phase?

Andreas Gal :gal

Comment 4

•

15 years ago

Do we GC during that 1GB phase? If we don't, its a bug. If we do, I am not sure its a bug.

David Mandelin [:dmandelin]

Assignee

Comment 5

•

15 years ago

In a run of this page from start until the animated map appears, this is the log of the times at which a GC was done:

2:214:GC t=0.175912
2:370:GC t=0.925824
2:595:GC t=1.469851
2:645:GC t=1.532591
2:2284:GC t=23.293265
2:10953:GC t=85.411238
2:11016:GC t=87.652958
2:11034:GC t=87.910751
2:11038:GC t=88.201925
2:11039:GC t=88.234485
2:11040:GC t=88.316330
2:11041:GC t=88.328484

As you can see, we go 20 seconds without any GC, and then 60 more after that until the next GC. Here are the last few calls to IsGCThresholdReached before the 23s GC:

IsGCThresholdReached t=18.488329 mb=7437487/33554432, gb=2427140/4390912
IsGCThresholdReached t=18.640668 mb=7439791/33554432, gb=2427140/4390912
IsGCThresholdReached t=18.869077 mb=7443295/33554432, gb=2427140/4390912
IsGCThresholdReached t=19.843841 mb=7446751/33554432, gb=2431236/4390912
IsGCThresholdReached t=19.930377 mb=7447423/33554432, gb=2431236/4390912
IsGCThresholdReached t=21.518867 mb=7450891/33554432, gb=2431236/4390912

'mb' is the mallocBytes (cur/max) and 'gb' is the gcBytes (cur/max). The last few calls just before the 85s GC are:

IsGCThresholdReached t=84.934737 mb=7226436/33554432, gb=7086000/34246720
IsGCThresholdReached t=84.935768 mb=7230504/33554432, gb=7090096/34246720
IsGCThresholdReached t=84.936678 mb=7234144/33554432, gb=7090096/34246720
IsGCThresholdReached t=84.937600 mb=7238212/33554432, gb=7094192/34246720
IsGCThresholdReached t=84.938453 mb=7241852/33554432, gb=7094192/34246720
IsGCThresholdReached t=84.942108 mb=7245272/33554432, gb=7098288/34246720

Whatever is causing the memory usage, it doesn't show up in the metrics used by IsGCThresholdReached, so we are not GC'ing.

David Mandelin [:dmandelin]

Assignee

Comment 6

•

15 years ago

Here is the key part of a trace-malloc snapshot with 1.4GB allocated. Almost all of the memory is created through nsXMLHttpRequest::ConvertBodyToText, which I believe is called to access the property |requestText|.

What I don't understand is why the nsAStrings returned by ConvertBodyToText are not freed in XPCWrappedNative::CallMethod after the call. Who would know about this?

1411376623 malloc
  1407026664 my_malloc_logger (pldhash.c:)
    1394617050 malloc_zone_malloc (/usr/lib/libSystem.B.dylib)
      1393036801 malloc (/usr/lib/libSystem.B.dylib)
        1376888066 nsAString_internal::MutatePrep(unsigned int, unsigned short**, unsigned int*) (pldhash.c:)
          1376865590 nsAString_internal::ReplacePrep(unsigned int, unsigned int, unsigned int) (pldhash.c:)
            1376853800 nsAString_internal::Assign(unsigned short const*, unsigned int) (pldhash.c:)
              1376771840 nsXMLHttpRequest::ConvertBodyToText(nsAString_internal&) (nsUnicharUtils.cpp:)
                1376771840 NS_InvokeByIndex_P (pldhash.c:)
                  1376771840 XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) (/Users/dmandelin/sources/tracemonkey/obj-tm/js/src/xpconnect/src/dom_quickst
ubs.h:)
                    1376771840 XPC_WN_GetterSetter(JSContext*, JSObject*, unsigned int, long*, long*) (/Users/dmandelin/sources/tracemonkey/obj-tm/js/src/xpconnect/src/dom_quickstu
bs.h:)
                      1376771840 js_Invoke (/Users/dmandelin/sources/tracemonkey/obj-tm/js/src/jsautokw.h:)
                        1376771840 js_InternalInvoke (/Users/dmandelin/sources/tracemonkey/obj-tm/js/src/jsautokw.h:)
                          1376771840 js_InternalGetOrSet (/Users/dmandelin/sources/tracemonkey/obj-tm/js/src/jsautokw.h:)
                            1376771840 js_NativeGet (/Users/dmandelin/sources/tracemonkey/obj-tm/js/src/jsautokw.h:)
                              1375691234 js_Interpret (/Users/dmandelin/sources/tracemonkey/obj-tm/js/src/jsautokw.h:)
                              1080606 js_GetPropertyHelper (/Users/dmandelin/sources/tracemonkey/obj-tm/js/src/jsautokw.h:)

Andreas Gal :gal

Comment 7

•

15 years ago

Nasty. Sounds like bug 506125 would fix this very nicely.

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 8

•

15 years ago

So do we know if the data is leaked in the sense that we loose track of it without freeing it, or is it leaked in the sense that we are keeping track of it but think it's being used and so we don't free it. Or finally, are things leaking in the sense that if we only ran a GC it would get freed, but for some reason a GC isn't running.

cc'ing jst who is the resident expert on XPConnect string stuff :)

Andreas Gal :gal

Comment 9

•

15 years ago

From my observation memory use drops back down. The problem is that we don't count this in our current cheesy memory pressure heuristics so we don't know its time to GC (which would free all this stuff). Hence my pointer to the actual working-set based heuristics bug.

David Mandelin [:dmandelin]

Assignee

Comment 10

•

15 years ago

(In reply to comment #8)
> So do we know if the data is leaked in the sense that we loose track of it
> without freeing it, or is it leaked in the sense that we are keeping track of
> it but think it's being used and so we don't free it. Or finally, are things
> leaking in the sense that if we only ran a GC it would get freed, but for some
> reason a GC isn't running.

I don't know the full answer yet. The memory does get freed eventually, so presumably we don't lose track of it. But I haven't yet figured out exactly what event causes the memory to get freed.

In particular, what I don't know is whether these DOM strings are (a) sent back to JS as wrapped DOM strings, or (b) converted to JS strings. If (a), then running js_GC would do the trick, and the problem is that the JS GC doesn't know it could free a lot of memory if it ran. If (b), then something weird is going on. So I guess it's probably (a).

Peter Van der Beken [:peterv]

Comment 11

•

15 years ago

The DOM strings are probably wrapped in JSStrings and I don't think the JS engine gets notified about that memory (JS_NewExternalString just sets the length using initFlat). The wrapping would happen in XPCStringConvert::ReadableToJSVal.

David Mandelin [:dmandelin]

Assignee

Comment 12

•

15 years ago

Attached patch Patch (obsolete) — Details — Splinter Review

(In reply to comment #11)
> The DOM strings are probably wrapped in JSStrings and I don't think the JS
> engine gets notified about that memory (JS_NewExternalString just sets the
> length using initFlat). The wrapping would happen in
> XPCStringConvert::ReadableToJSVal.

Thanks! That's exactly what I needed to know. This patch accounts for the memory in the 'gcMallocBytes' statistic. It prevents the memory spike on this test--it doesn't go over 125MB or so on my machine with this site. The only thing I'm not sure about is if I need to take a lock or increment a thread-local counter instead. I was hoping Igor could educate me on that. :-)

Assignee: general → dmandelin

Status: NEW → ASSIGNED

Attachment #401579 - Flags: review?(igor)

Igor Bukanov

Comment 13

•

15 years ago

Comment on attachment 401579 [details] [diff] [review]
Patch

>diff -r 7e985a173416 -r 892ea3ad67be js/src/jsapi.cpp
>--- a/js/src/jsapi.cpp	Fri Sep 18 14:27:07 2009 -0700
>+++ b/js/src/jsapi.cpp	Fri Sep 18 17:33:53 2009 -0700
>@@ -2626,6 +2626,7 @@
>     if (!str)
>         return NULL;
>     str->initFlat(chars, length);
>+    cx->runtime->gcMallocBytes += length * sizeof(jschar);
>     return str;

This is not thread-safe. Use cx->updateMallocCounter(size) here. r+ with that fixed. Also it would be nice to have a a diff with -p -U8.

Attachment #401579 - Flags: review?(igor) → review-

Andreas Gal :gal

Comment 14

•

15 years ago

gcMallocBytes is just an approximate counter. Precision is not required here so collisions and loss of increments can be tolerated.

Igor Bukanov

Comment 15

•

15 years ago

(In reply to comment #14)
> gcMallocBytes is just an approximate counter. Precision is not required here so
> collisions and loss of increments can be tolerated.

updateMallocCounter is the function to use here. Optimizing that function is for another bug.

David Mandelin [:dmandelin]

Assignee

Comment 16

•

15 years ago

Attached patch Patch 2 (updates per review) — Details — Splinter Review

Updated per review & pushed to TM as 9131d01c2d2a.

Attachment #401579 - Attachment is obsolete: true

David Mandelin [:dmandelin]

Assignee

Updated

•

15 years ago

Whiteboard: fixed-in-tracemonkey

Brendan Eich [:brendan]

Comment 17

•

15 years ago

That should be (length + 1) * sizeof(jschar), right? rs=me on followup.

/be

David Mandelin [:dmandelin]

Assignee

Comment 18

•

15 years ago

(In reply to comment #17)
> That should be (length + 1) * sizeof(jschar), right? rs=me on followup.

I intentionally left the 1 out because it's really just an approximation anyway so I figured I'd save a cycle (which I admit is equally pointless if not more). I suppose it does hurt accuracy notably if someone allocates a million short strings, so pushed to TM as ee059a50f204.

Robert Sayre

Comment 19

•

15 years ago

http://hg.mozilla.org/mozilla-central/rev/ee059a50f204

Status: ASSIGNED → RESOLVED

Closed: 15 years ago

Resolution: --- → FIXED

Robert Sayre

Comment 20

•

15 years ago

http://hg.mozilla.org/releases/mozilla-1.9.2/rev/1c790b1d7dab

status1.9.2: --- → beta1-fixed

Flags: wanted1.9.2+

Launchpad

Updated

•

14 years ago

See Also: → https://launchpad.net/bugs/429476

Patch 15 years ago David Mandelin [:dmandelin] 338 bytes, patch	igor : review-	Details \| Diff \| Splinter Review
Patch 2 (updates per review) 15 years ago David Mandelin [:dmandelin] 657 bytes, patch		Details \| Diff \| Splinter Review