Closed Bug 740162 Opened 12 years ago Closed 11 years ago

Large heap-unclassified (>4GB) on jenkins landing page due to XHR strings

Categories

(Core :: General, defect)

14 Branch
x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 826521

People

(Reporter: developer, Assigned: justin.lebar+bug)

References

(Blocks 1 open bug)

Details

(Whiteboard: [MemShrink:P2])

Attachments

(3 files)

Attached file about:memory
User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:14.0) Gecko/20120328 Firefox/14.0a1
Build ID: 20120328050845

Steps to reproduce:

This is similar to bug 712822, except that it looks like an instance of jenkins is causing the problem.  I had 4 tabs open when I noticed the browser eating up memory (I have the memchaser extension installed and I could see the resident memory >2GB).  I closed all the tabs except jenkins and the memory still didn't go away.  I had previously seen this, and when I closed jenkins the memory did go away.  This time I left jenkins open and opened up about:memory.  I'll attach it, but it shows heap-unclassified at around 4.3 GB.  Clicking the clear memory buttons at the bottom didn't seem to do anything.  I usually keep jenkins open as the first or second tab, and it's been a day or two since I last saw it eat up memory, so it seems to be a rare situation.  I also rarely have jenkins as the main tab, and in both cases it was not the main tab.

I'll see if I can duplicate it again.
Blocks: DarkMatter
Whiteboard: [MemShrink]
This is integer underflow.  I don't see something negative in about:memory, so it could be a race condition in how we calculate about:memory.
Marco - jenkins is only accessible internally.  We are running 1.431 from http://jenkins-ci.org/ if you want to set it up yourself, but I'm not sure that would help.

Justin - It was definitely using up a lot of memory.  htop showed firefox using 85% of the memory.  So, I don't think it is just mistaken calculation.
Taking a closer look, you're right, it looks like it's probably not underflowing.  Sorry!  RSS is 3+ GB.

Perhaps this is orphaned DOM nodes, bug 704623.  Hard to tell without running DMD on the page, though!
Some of the numbers are close to 4.2GB, which is what you'd see if a small negative 32-bit integer was interpreted as a positive integer.  However, we have these:

4,456,817,720 B ── heap-allocated
4,520,681,472 B ── heap-committed

which are reported directly by jemalloc and should be trustworthy.
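
As a side check on the underflow theory, here is a minimal sketch of the arithmetic.  The helper names ReportedBytes and FitsIn32Bits are made up for illustration and are not part of any Mozilla code.  It shows how a small negative 32-bit count turns into a value just under 4.3 billion, and that the heap-allocated figure above is too large to be a wrapped 32-bit quantity at all.

```cpp
#include <cstdint>

// Hypothetical helper: reinterpret a signed 32-bit byte count as unsigned,
// which is what a reporter would show if a counter dipped below zero.
constexpr uint32_t ReportedBytes(int32_t aSignedBytes) {
  return static_cast<uint32_t>(aSignedBytes);
}

// Hypothetical helper: any value above UINT32_MAX cannot be the result of
// 32-bit wraparound.
constexpr bool FitsIn32Bits(uint64_t aBytes) {
  return aBytes <= UINT32_MAX;
}

// A counter that underflowed by one 4 KiB page reads as 4,294,963,200 bytes.
static_assert(ReportedBytes(-4096) == 4294963200u, "wraps to just under 2^32");

// The jemalloc heap-allocated value from about:memory exceeds 2^32 - 1.
static_assert(!FitsIn32Bits(4456817720ull), "too large for a 32-bit wrap");
```

Since 4,456,817,720 is larger than 4,294,967,295, that particular number cannot be explained by a negative 32-bit value being misread, which is consistent with treating it as a real allocation.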

The fact that you're seeing this on an internal installation of jenkins is unfortunate, as it makes reproduction much harder.  I can believe that jenkins does stuff quite unlike normal web pages, which would explain why heap-unclassified is so high.
I just saw it again.  memchaser showed >2GB in resident memory.  I closed all the other tabs that I had open and then opened about:memory.  Clicking "minimize memory" actually minimized memory this time, though.  One interesting thing about this memory problem is that it is continually climbing.  I had htop open once I noticed it and the resident memory was climbing about 6M every second.  I would close a tab and that would clear out some memory but within a few seconds it would all be back.  I can't imagine anything running in the background jenkins tab that is eating up 6M every second.
And I just saw it again.  It was climbing more than 6M every second, more like 20M every second.  Then after a while (1-2 minutes), it just went away.
I think I can duplicate this consistently now.  I've disabled all my addons and duplicated it again.  The attached about:memory is the result of that.  I'll try and get DMD working and attach the output of that.
Attached file DMD output
I'm attaching the DMD output.

It is definitely looking like Jenkins.  We have an internal site that has a button that when clicked causes the machine to make a curl request to jenkins to grab a file.  If I click that button with Jenkins open, the memory jumps.  If I click that button through a different browser on a different machine, the memory jumps.  If jenkins is closed, the memory doesn't jump.  It appears as if jenkins sees the curl request and decides to send the same data to the browser.  I'll just have to stop leaving jenkins open.
Jenkins may be doing something wrong, but this may still be a valid bug -- we shouldn't have such high heap-unclassified.

It's XHR strings:

==7574==  2,147,483,648 bytes (2,147,483,644 requested / 4 slop)
==7574==  97.10% of the heap (97.10% cumulative unreported)
==7574==    at 0x4C2813B: realloc (vg_replace_malloc.c:632)
==7574==    by 0x67E31AD: moz_realloc (mozalloc.cpp:145)
==7574==    by 0x961DA1C: nsStringBuffer::Realloc(nsStringBuffer*, unsigned long) (nsSubstring.cpp:239)
==7574==    by 0x961DCAB: nsAString_internal::MutatePrep(unsigned int, unsigned short**, unsigned int*) (nsTSubstring.cpp:135)
==7574==    by 0x961E9BB: nsAString_internal::SetCapacity(unsigned int) (nsTSubstring.cpp:542)
==7574==    by 0x874FE01: nsXMLHttpRequest::AppendToResponseText(char const*, unsigned int) (nsXMLHttpRequest.cpp:879)
==7574==    by 0x8753800: nsXMLHttpRequest::StreamReaderFunc(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*) (nsXMLHttpRequest.cpp:1977)
==7574==    by 0x95D6C50: nsPipeInputStream::ReadSegments(unsigned int (*)(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*), void*, unsigned int, unsigned int*) (nsPipe3.cpp:799)
==7574==    by 0x8753E08: nsXMLHttpRequest::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned int, unsigned int) (nsXMLHttpRequest.cpp:2072)
==7574==    by 0x8685D2F: nsCORSListenerProxy::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned int, unsigned int) (nsCrossSiteListenerProxy.cpp:656)
==7574==    by 0x8202FA3: nsHttpChannel::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned int, unsigned int) (nsHttpChannel.cpp:4608)
==7574==    by 0x8118341: nsInputStreamPump::OnStateTransfer() (nsInputStreamPump.cpp:514)
Summary: Large heap-unclassified (>4GB) on jenkins landing page → Large heap-unclassified (>4GB) on jenkins landing page due to XHR strings
Ben, do you have any idea how hard it would be to write a memory reporter for this?  It's easy to get SizeOf an nsXMLHttpRequest, but is there a way I can go from a window to its list of nsXMLHttpRequests?
I don't think we have a nice way of doing this currently... You could probably do something hacky with smaug's EventTarget weak hash table, though, since all XHRs are EventTargets. Let me know if you want details on that.

Otherwise, maybe sicking has some ideas here?
We could certainly keep a list of all XHRs and iterate over that -- but is there any way to go from an XHR object to its document/window?
Yes, but... In comment 11 you wanted to go from window to XHR, and then in comment 13 you want to go from XHR to window? Both are possible, XHR keeps a weak ref to its owner window.
(In reply to ben turner [:bent] from comment #14)
> Yes, but... In comment 11 you wanted to go from window to XHR, and then in
> comment 13 you want to go from XHR to window? Both are possible, XHR keeps a
> weak ref to its owner window.

Either way works for the purposes of memory reporters.
What does mOwner() == NULL mean?
(This may not make sense to anyone other than njn): Doing this memory report by keeping a global list of XHR objects and then going from XHR to window object is complicated by the ghost windows split.  When we're running the XHR memory reporter, we don't know whether its window is a ghost, so we don't know what the path should be!

I can conceive of machinery which would allow us to use a list of XHR elements to generate this memory report: We'd basically register a "window memory sub-reporter" with the window memory reporter.  When we're doing the window memory report, we call this sub-reporter, which gives us a series of reports per-window.  We'd then attach each of these reports to the actual window report.
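
A rough sketch of what that sub-reporter machinery could look like, to make the idea concrete.  Everything here is hypothetical: Window, Report, SubReporter, WindowReporter and the "ghost-windows/" and "active-windows/" path prefixes are invented for illustration and do not match the real window memory reporter.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these are real Gecko types.
struct Window {
  std::string name;
  bool isGhost;
};

struct Report {
  std::string path;
  size_t bytes;
};

// A sub-reporter is asked, once per window, for a (leaf name, bytes) pair.
using SubReporter =
    std::function<std::pair<std::string, size_t>(const Window&)>;

class WindowReporter {
 public:
  void RegisterSubReporter(SubReporter aSub) {
    assert(aSub != nullptr);  // An empty std::function could not be called.
    mSubReporters.push_back(std::move(aSub));
  }

  // Only the window reporter knows whether a window is a ghost, so it
  // builds the path prefix and attaches each sub-report beneath it.
  std::vector<Report> Collect(const std::vector<Window>& aWindows) const {
    std::vector<Report> reports;
    for (const Window& win : aWindows) {
      const std::string prefix =
          (win.isGhost ? "ghost-windows/" : "active-windows/") + win.name + "/";
      for (const SubReporter& sub : mSubReporters) {
        const std::pair<std::string, size_t> r = sub(win);
        reports.push_back({prefix + r.first, r.second});
      }
    }
    return reports;
  }

 private:
  std::vector<SubReporter> mSubReporters;
};
```

The point of the design is that the XHR code never has to decide the report path itself.  It only answers "how many bytes for this window", and the ghost versus non-ghost decision stays in one place.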

Anyway, I think it's doable, but pretty complicated.  I'm curious what comment 12 would entail.
P2 per Memshrink
Assignee: nobody → justin.lebar+bug
Whiteboard: [MemShrink] → [MemShrink:P2]
(In reply to Justin Lebar [:jlebar] from comment #16)
> What does mOwner() == NULL mean?

mOwner == NULL means (possibly among other things) that this is a C++ created XHR.
The memory is now being classified with bug 826521.  I see a large event-target line item under the jenkins window item.
Thanks, Trev!

jlebar, I think I avoided the problems you mentioned in comment 17 because I did what bent suggested in comment 12 -- I iterate over the event targets table and measure all the event targets that are XHRs.
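
For readers following along, the approach can be sketched roughly as below.  This is not the code from bug 826521; EventTarget, Xhr, OtherTarget and MeasureXhrs are simplified, hypothetical stand-ins, and the per-window table is modelled as a plain vector.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the real classes.
struct EventTarget {
  virtual ~EventTarget() = default;
};

struct Xhr : EventTarget {
  std::vector<char16_t> responseText;  // Heap buffer holding the response.

  size_t SizeOfIncludingThis() const {
    // The object itself plus the heap storage reserved for the response.
    return sizeof(*this) + responseText.capacity() * sizeof(char16_t);
  }
};

struct OtherTarget : EventTarget {};

// Walk one window's event targets and sum only the ones that are XHRs.
size_t MeasureXhrs(const std::vector<std::unique_ptr<EventTarget>>& aTargets) {
  size_t total = 0;
  for (const auto& target : aTargets) {
    assert(target != nullptr);  // Assumption: the table holds no null entries.
    if (const auto* xhr = dynamic_cast<const Xhr*>(target.get())) {
      total += xhr->SizeOfIncludingThis();
    }
  }
  return total;
}
```

Because the walk starts from a window's own table, each measurement is already attributed to the right window, so the ghost-window question from comment 17 never comes up.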
Status: UNCONFIRMED → RESOLVED
Closed: 11 years ago
Resolution: --- → DUPLICATE
Note that for worker-xhr there is both an xhr object on the main thread and one on the worker thread (they are entirely different C++ classes). Both can hold significant amounts of memory.
(In reply to Jonas Sicking (:sicking) from comment #22)
> Note that for worker-xhr there is both an xhr object on the main thread and
> one on the worker thread (they are entirely different C++ classes). Both can
> hold significant amounts of memory.

Well, they both share the same nsString, so presumably the actual buffer is not duplicated.
> Well, they both share the same nsString, so presumably the actual buffer is
> not duplicated.

https://hg.mozilla.org/mozilla-central/rev/de2ab911692d is the patch that added the measurement of nsXHR::mResponseText.  It is a shared string, and so I did some non-typical stuff to measure it in nsXMLHttpRequest::SizeOfEventTargetIncludingThis -- normally we don't even try to measure shared strings because of the risk of double-counting.  If that code isn't valid, please let me know!
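
To illustrate the double-counting risk in general terms, here is a small standard-library analogy.  It uses std::shared_ptr in place of the real string buffer sharing, and the names SharedText, BufferBytes, SizeOfIfUnshared and SizeOfEvenIfShared are hypothetical.  It is not the code from the patch above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Analogy only: a reference-counted text buffer with possibly many owners.
using SharedText = std::shared_ptr<std::u16string>;

// Bytes of character data held by the buffer.
size_t BufferBytes(const SharedText& aText) {
  assert(aText != nullptr);
  return aText->size() * sizeof(char16_t);
}

// Conservative policy: report the buffer only when this is the sole owner.
// Shared buffers are skipped, so they can never be counted twice, but they
// may not be counted at all.
size_t SizeOfIfUnshared(const SharedText& aText) {
  return aText.use_count() == 1 ? BufferBytes(aText) : 0;
}

// Permissive policy: always report the buffer.  This is only correct if the
// caller guarantees that exactly one owner is ever measured.
size_t SizeOfEvenIfShared(const SharedText& aText) {
  return BufferBytes(aText);
}
```

With two owners of one 100-character buffer, the conservative policy reports 0 bytes from each owner, while calling the permissive one on both reports the same 200 bytes twice.  Measuring a shared string is therefore only safe when it is known that a single owner does the reporting.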
I'm not sure we're always able to do sharing. Especially in the case when .response returns an ArrayBuffer or a JSON object.
(In reply to Jonas Sicking (:sicking) from comment #25)
> I'm not sure we're always able to do sharing. Especially in the case when
> .response returns an ArrayBuffer or a JSON object.

In those cases we're definitely not sharing. It's just when response is text.