Bug 751557 - Adding new jobs in a Jenkins CI causes a lot of leaks and ends-up in an OOM [@ mozalloc_abort(char const*) | Abort ]
Status: NEW
Whiteboard: [MemShrink:P2]
Keywords: hang, mlk
Product: Tech Evangelism Graveyard
Classification: Graveyard
Component: Other
Version: unspecified
Hardware: All Mac OS X
Importance: -- major
Target Milestone: ---
Assigned To: other
Mentors:
URL: http://mm-ci-production.qa.scl3.mozil...
Depends on: 600307
Blocks: 702495 791695
Reported: 2012-05-03 07:02 PDT by Henrik Skupin (:whimboo)
Modified: 2015-10-13 07:29 PDT
CC: 21 users
See Also:
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
MemChaser log (65.30 KB, text/plain), 2012-05-03 09:48 PDT, Henrik Skupin (:whimboo)
MemChaser log (946.02 KB, text/plain), 2012-05-09 06:08 PDT, Henrik Skupin (:whimboo)
Detailed MemChaser log (1.29 MB, text/plain), 2012-10-01 08:34 PDT, Henrik Skupin (:whimboo)
about:memory?verbose (160.51 KB, text/plain), 2013-01-25 05:29 PST, Henrik Skupin (:whimboo)
cc-edges.log (953.05 KB, text/plain), 2013-01-25 05:41 PST, Henrik Skupin (:whimboo)
memchaser.log (380.24 KB, text/plain), 2014-06-24 10:22 PDT, Henrik Skupin (:whimboo)

Description Henrik Skupin (:whimboo) 2012-05-03 07:02:38 PDT
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:14.0) Gecko/20120502 Firefox/14.0a2

I have already seen this issue some weeks ago and talked to Olli. Since then it has only happened sporadically. The last time was at the end of last week, when I didn't have time for a deeper analysis. Today it happened again, and I tried to find steps to reproduce, but so far I haven't had any luck.

The following information is everything I have at the moment:
* It happens in my daily profile
* Resident memory usage grows up to 2GB within an hour or so
* GC durations go up to around 4s!!!! and make the browser unusable
* It happens when I have the following Jenkins pages for Mozmill CI loaded (they are only reachable via the MV VPN)

http://release3.qa.mtv1.mozilla.com:8080 (daily tests)
http://release8-osx-107.qa.mtv1.mozilla.com:8080/ (l10n tests)

At first I thought it was related to auto-refresh being enabled, but that doesn't seem to be the reason. Another assumption is that it happens when new job requests are added via the Jenkins API; that's what I will test next. Otherwise I will try to reduce my profile as much as possible so I have a minimized profile at hand.

In the CC log I sent to Olli there were about 32,544 nsDOMStorageItem objects and over 100,000 nsXPCWrappedJS (nsISupports) objects listed. For the first issue Olli already filed bug 751529; for the latter no action has been taken yet. I will leave it to Olli, since he is currently analyzing my CC log.

I hope that I can come up with some better STR soon.
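For anyone wanting to reproduce per-type counts like the ones above, here is a minimal sketch in Python. It assumes the plain-text cc-edges log format of that era (node lines starting with an address and an [rc=N] or [gc]/[gc.marked] marker followed by the type name, edge lines starting with ">"); adjust the parsing if your log differs.

#!/usr/bin/env python
# Minimal sketch: tally node types in a cycle-collector edge dump (cc-edges-*.log).
# Assumed node line format (edge lines start with ">" and are skipped):
#   0x7f8a4c0e3b20 [rc=5] nsXPCWrappedJS (nsISupports)
#   0x7f8a4c0e3c40 [gc.marked] JS Object (Window)
import re
import sys
from collections import Counter

NODE_RE = re.compile(r"^0x[0-9a-fA-F]+ \[(?:rc=\d+|gc(?:\.marked)?)\] (.+)$")

def tally(path):
    counts = Counter()
    with open(path) as log:
        for line in log:
            match = NODE_RE.match(line.strip())
            if match:
                counts[match.group(1)] += 1
    return counts

if __name__ == "__main__":
    for name, count in tally(sys.argv[1]).most_common(20):
        print("%8d  %s" % (count, name))

Run against a cc-edges log, it prints the twenty most common node types, which is enough to spot suspicious piles of nsDOMStorageItem or nsXPCWrappedJS objects.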
Comment 1 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-05-03 08:09:00 PDT
Henrik, while you investigate STR, can you also check to see if:

A) This is any better or worse in Firefox 15 Nightly
B) This has gotten worse over time (for example, Firefox 12, 11, 10, etc)
Comment 2 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2012-05-03 08:42:08 PDT
CC times or GC times are 4s?  Though I guess both could be bad given how much memory it is using up.
Comment 3 Henrik Skupin (:whimboo) 2012-05-03 09:48:28 PDT
Created attachment 620738 [details]
MemChaser log

It's GC only. But I wanted to attach a MemChaser log, which can hopefully give some more information about what is getting executed.
Comment 4 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2012-05-03 10:17:18 PDT
There are a number of long CCs (one is 12 seconds!), but the GCs do look a little longer.  I'd guess that GC times are long because the page is using an obscene amount of memory.  If you could get the contents of about:memory?verbose when you are having this problem, that might help.
Comment 5 Bill McCloskey (:billm) 2012-05-03 10:27:13 PDT
To some extent, the long GCs are a symptom of all the xpconnect stuff going on. We're spending ~100ms in root marking, which is very unusual. Another ~90ms is in mark_other. This means that we have to enumerate a bunch of xpconnect gray roots, and we're also spending a lot of time in gray marking.

I'm not sure why the sweeping times are so long. That's been a persistent issue for us, though, so it's known.
Comment 6 Henrik Skupin (:whimboo) 2012-05-09 06:08:23 PDT
Created attachment 622357 [details]
MemChaser log

Updated MemChaser log, now with a memory consumption of about 2GB of RAM. The log shows the constant rise in usage over time.

I think that this leak happens when new jobs are added to the queue. I will have to test this now on my local instance. If I can prove it, I will give detailed instructions on how to reproduce.
Comment 7 Henrik Skupin (:whimboo) 2012-05-11 05:23:26 PDT
The same effect can also be seen in Safari. I had the Jenkins website open on our master for a couple of days now and Safari claimed about 3.25GB of memory.
Comment 8 Henrik Skupin (:whimboo) 2012-05-11 08:23:59 PDT
For people with VPN access, simply open the Jenkins URL referenced above and keep the browser open for a couple of days. You will have to catch the time when new job requests get added, which means when builds have finished. That's early in the day Pacific time, so you should keep the browser open.

When you have the Jenkins web page open, please make sure to not change any configuration aspects. Thanks.
Comment 9 Henrik Skupin (:whimboo) 2012-05-11 08:25:22 PDT
Olli, would you mind giving an update based on the latest CC log I sent you? You mentioned about 170,000 XPCVariant objects; is that something we should file as a new bug?
Comment 10 Justin Lebar (not reading bugmail) 2012-05-29 16:13:07 PDT
Over to tech evang; given comment 7, this doesn't sound like our bug.

It would be great if pages which created giant heaps didn't cause GC pauses, but we already have bugs on that.
Comment 11 Henrik Skupin (:whimboo) 2012-10-01 08:34:52 PDT
Created attachment 666566 [details]
Detailed MemChaser log

This is a detailed MemChaser log which I have been meaning to attach for ages. Sorry for the delay. It hopefully contains all the information we need to sort out this issue.
Comment 12 Henrik Skupin (:whimboo) 2012-10-01 08:37:23 PDT
I had Jenkins open for an hour or so and Firefox's memory usage rose from 500MB to 1.1GB. Closing the Jenkins tab froze Firefox for about 1 minute. As the log shows, there are CC cycles of about 27s in duration, but also iGCs of 6s.

I wonder if we can do something here beyond leaving it in the Tech Evangelism component.
Comment 13 Olli Pettay [:smaug] (high review load, please consider other reviewers) 2012-10-01 08:40:18 PDT
The MemChaser log doesn't give much information. GC and CC logs are needed:
https://wiki.mozilla.org/Performance:Leak_Tools#Cycle_collector_heap_dump


IIRC you gave me some logs once, but I don't have them on this laptop.
Comment 14 Olli Pettay [:smaug] (high review load, please consider other reviewers) 2012-10-01 08:45:56 PDT
But Justin is right. If the web app leaks, then it leaks, which will at some point slow down performance in any browser.
We can of course try to make CC and GC faster, but the web app needs to be fixed.
Comment 15 Henrik Skupin (:whimboo) 2012-10-01 08:46:49 PDT
Olli, if you have VPN access to MV please open http://10.250.73.243:8080/ and enable 'auto refresh' at the top right corner. Leave the tab open for a while and you will notice the effect really soon.
Comment 16 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2012-10-01 08:49:51 PDT
The non-closing CC times could probably be improved, as the CC is looking at 200k GC objects and not freeing any.

DOM merging could potentially help the closing CC, as there are 1.2 million ref counted nodes in the graph, but it depends on what exactly they are.

I'm not sure if anything can be done for the GC. The page is just allocating a huge amount of memory and never letting it go. A large percentage of the GCs are nonincremental because the page is allocating so much (and I guess the GC is falling behind).
Comment 17 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2012-10-01 08:50:39 PDT
Olli, I can get a log for you, if you don't regularly VPN in, as I'll be in the MV office.
Comment 18 Henrik Skupin (:whimboo) 2012-10-01 10:20:46 PDT
Downloadable from Dropbox below is a GC/CC log of the issue, with an iGC of about 200ms and a low CC duration. That should make it easier for Olli to process.

https://www.dropbox.com/s/820qcp5vtactep4/cc_gc_log.zip
Comment 19 Henrik Skupin (:whimboo) 2012-10-01 12:44:01 PDT
Olli, here is an updated version of the logs with the XUL cache pref removed:

https://www.dropbox.com/s/820qcp5vtactep4/cc_gc_log.zip
Comment 20 Olli Pettay [:smaug] (high review load, please consider other reviewers) 2012-10-01 14:49:49 PDT
Henrik, which version of FF are you using?
Comment 21 Bill McCloskey (:billm) 2012-10-01 15:17:37 PDT
Henrik, we think an addon might be involved. Could you try this again with a fresh profile?

Also, can you try to get minimal steps to reproduce this problem? I tried it myself, but I'm not getting the same kind of objects that are showing up in your CC log.
Comment 22 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2012-10-01 15:20:37 PDT
We're seeing some weird objects, and we're not sure where they came from. Our theory is that they are weak map related, because we don't log weak maps very well right now.
Comment 23 Henrik Skupin (:whimboo) 2012-10-01 15:38:15 PDT
Right now this log has been made with Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:17.0) Gecko/17.0 Firefox/17.0 ID:20120930042009

I can try a fresh profile with memchaser installed only.
Comment 24 Bill McCloskey (:billm) 2012-10-01 15:57:20 PDT
Actually, could you turn off MemChaser as well? I'm afraid it might affect things, since it uses Jetpack. Could you just use the CC times in the error console instead?
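(For reference, a minimal way to get GC/CC timings into the Error Console without any add-on is a logging pref. Sketch of a user.js entry, assuming the pref name used by builds of that era:

// Assumed pref: logs GC and CC timing summaries to the Error Console.
user_pref("javascript.options.mem.log", true);

The same pref can also be flipped in about:config.)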
Comment 25 Henrik Skupin (:whimboo) 2012-10-01 16:12:19 PDT
Here are the logs with a fresh profile:

https://www.dropbox.com/s/k5nu8ygv4w61yij/gg_cc_log_memchaser.zip
https://www.dropbox.com/s/uikivvn8ive8yxn/gg_cc_log_plain.zip

Without MemChaser I'm not able to reproduce this issue.
Comment 26 Olli Pettay [:smaug] (high review load, please consider other reviewers) 2012-10-02 04:39:59 PDT
I haven't managed to reproduce this using a trunk build.
The CC graph stays really small (<500) almost all the time; only the automatic page reload increases it temporarily, and CC times are between 6ms and 46ms.
Comment 27 Justin Lebar (not reading bugmail) 2012-10-02 08:37:26 PDT
> Without memchaser I'm not able to reproduce this issue.

Updating the bug metadata to reflect that this is caused by MemChaser.

This would be a significant issue if the problem turned out to be Jetpack and not simply MemChaser.
Comment 28 Nicholas Nethercote [:njn] (on vacation until July 11) 2012-10-02 16:06:50 PDT
Kris says that MemChaser is using version 1.9 of the add-on SDK, which is pretty recent. So yeah, the real question here is whether this is an add-on SDK problem or something MemChaser-specific.
Comment 29 Kris Maglione [:kmag] 2012-10-03 14:46:40 PDT
If what we're worried about is memory usage after http://10.250.73.243:8080/ is closed, it looks like it's mostly heap-unclassified:

https://gist.github.com/3830049

I'm getting about 60MB growth in heap-unclassified with MemChaser installed, and a bit of growth elsewhere, as opposed to nothing without it installed. I'll try stripping down MemChaser to narrow down the cause when I get a chance.
Comment 30 Trev 2012-10-16 08:53:55 PDT
This might be similar to Bug 740162 where I also had massive amounts of memory being used by jenkins.  I didn't have any addons, though.
Comment 31 Henrik Skupin (:whimboo) 2012-11-06 04:01:55 PST
I have uploaded a 0.5pre XPI of MemChaser, which I will test over the next few days to see if it changes the behavior here:

https://github.com/downloads/mozilla/memchaser/memchaser-0.5pre-201211061117.xpi
Comment 32 Henrik Skupin (:whimboo) 2013-01-25 05:29:40 PST
Created attachment 706358 [details]
about:memory?verbose

Looks like we have a lot of orphaned nodes for both the Jenkins instances:

* 97,530,664 B (20.59%) ── orphan-nodes
* 110,193,800 B (23.27%) ── orphan-nodes
Comment 33 Henrik Skupin (:whimboo) 2013-01-25 05:30:50 PST
I should add that this is with a profile with all extensions disabled.
Comment 34 Henrik Skupin (:whimboo) 2013-01-25 05:41:41 PST
Created attachment 706361 [details]
cc-edges.log

cc log from the same state as the about:memory dump.
Comment 35 Henrik Skupin (:whimboo) 2013-01-25 06:24:59 PST
So it is indeed the Jenkins CI page which is leaking, or at least keeping orphaned elements around. Closing the tab and minimizing memory a couple of times restores the previous memory usage.

The latest GC/CC logs can be found here:
http://people.mozilla.com/~hskupin/downloads/

For everyone interested, here the dump of the conversation between Olli and myself on IRC:

14:29 smaug: whimboo: any chance you could create new GC/CC logs? hopefully immediately when you start to see the leak, so that the logs don't get huge
14:29 smaug will be offline soon, but could perhaps analyze the logs tomorrow
14:30 whimboo: smaug: would your addon help here
14:30 whimboo: ?
14:31 smaug: whimboo: if you know what to look at, then yes
14:31 smaug: you could check if documents are leaked
14:31 smaug: whimboo: it is possible that a gc log is needed too, but my tool is for cc logs only
14:31 whimboo: k
14:31 whimboo: i will check right now
14:31 smaug: my local cc analyzing scripts do just the same things as about:cc
14:32 whimboo: smaug: do you want to put this xpi up somewhere?
14:32 whimboo: i always have trouble to find it
14:34 smaug: whimboo: well, I was going to put it to AMO but that was too difficult
14:34 smaug: perhaps I should get it to tree
14:34 smaug: make it a bit prettier first
14:34 whimboo: that would be great
14:34 smaug: it is very tiny addon
14:35 whimboo: smaug: k. so i will let you have a look at
14:35 whimboo: i will produce the cc now
14:38 whimboo: smaug: cc is about 1MB. that's ok?
14:38 whimboo: gc 314MB
14:38 smaug: 1MB is ok
14:38 smaug: 314GC, huh
14:39 smaug: that is less than tiny
14:39 smaug: er, "less than tiny" doesn't make sense
14:40 whimboo: heh. so i will attach the CC log to the bug
14:41 whimboo: smaug: log is up
14:42 whimboo: smaug: so good thing is with Nightly: 82.10 MB (13.19%) ── heap-unclassified
14:42 whimboo: that's way lesser as with aurora
14:45 smaug: whimboo: that is not too big cc graph
14:45 smaug: 8000 objects
14:45 smaug: sure, too big, but still...
14:45 smaug: but ok, looking
14:45 whimboo: sweet
14:45 whimboo: if it's too large I think i can give you a smaller one later
14:46 whimboo: given that I know how to produce it
14:46 smaug: so we're leaking all the XBL document
14:46 smaug: whimboo: do you have XUL cache disabled?
14:46 smaug: whimboo: it is small enough
14:47 whimboo: oops. that was a copy of an old profile
14:47 whimboo: yes it was disabled
14:47 whimboo: but should we leak in such a case?
14:47 smaug: whimboo: disabling XUL cache is not supported setup
14:47 smaug: yes, we don't care about that case
14:48 smaug: since everything behaves differently anyway
14:48 smaug: it is development only setup
14:48 whimboo: ok, so let me restart then
14:48 smaug: whimboo: we're talking about nglayout.debug.disable_xul_cache here, right :)
14:49 whimboo: i know :)
14:50 whimboo: smaug: looks like it's starting again
14:50 whimboo: 2.03 MB (01.32%) ── orphan-nodes
14:51 whimboo: smaug: shall i wait until it's a bit higher?
14:52 smaug: whimboo: you could wait a bit...
14:53 whimboo: smaug: k, how long would you prefer? until it reaches about 10MB or so?
14:54 smaug: whimboo: do you have about:cc installed?
14:54 whimboo: smaug: yes
14:54 whimboo: but not sure what to do with it
14:54 smaug: could you run cc and see the size of the graph
14:55 smaug: graph size ~10000 should be enough
14:55 smaug: whimboo: run CC few times to get stable results
14:55 whimboo: smaug: where do i see that?
14:55 smaug: whimboo: in the addon
14:55 whimboo: smaug: sure.. number of objects?
14:56 whimboo: or roots, edges
14:56 smaug: whimboo: number of objects
14:56 whimboo: k, at 353 right now
14:56 smaug: whimboo: I'd expect something like 10000
14:57 smaug: though, the previous log had 8000
14:57 smaug: whimboo: if you get it to 5000
14:57 whimboo: smaug: orphaned nodes are already at 15MB
14:57 smaug: sure
14:57 smaug: JS keeping nodes alive
14:57 whimboo: so you still want to have the 10.000 limit?
14:57 smaug: but the js is black (certainly alive) so those nodes can't be released
14:58 smaug: whimboo: 5000 could be enough
14:58 whimboo: k lets wait when I reach that
14:59 whimboo: smaug: so when I run cc it always drops down to about 200 again
14:59 whimboo: seems I can't get over 500
14:59 smaug: whimboo: ahaa
15:00 smaug: whimboo: so do you actually get leak
15:00 smaug: I mean if you close the tab
15:00 smaug: now that xul cache is enabled
15:01 whimboo: smaug: shall I close the tab and run cc?
15:01 smaug: whimboo: yes, and check what about:memory says
15:02 whimboo: smaug: hm, i think the addon initializes wrongly
15:02 whimboo: when I tried to run 'Run cycle collector' it did nothing
15:02 whimboo: now after closing the tab I get 220000 objects
15:03 smaug: that is very possible
15:03 smaug: whimboo: run cc few times
15:03 whimboo: oh, right
15:03 whimboo: yes. it cleans up everything
15:03 whimboo: back at 200 objects
15:03 smaug: so it was garbage that we collected
15:03 whimboo: and about:memory doesn't list the page anymore
15:04 whimboo: that means it's really the website which is leaking memory
15:04 smaug: and memory usage is back to normal levels?
15:04 whimboo: just a sec. will repeat with the other tab
15:04 smaug: well, "leaking", but yes, the website keeps objects alive
15:05 whimboo: smaug: err, now I'm at 1GB
15:06 whimboo: is that because of your addon?
15:06 whimboo: seems like
15:06 whimboo: 391.25 MB (61.12%) -- top(about:cc, id=28)/active/window(about:cc)
15:07 whimboo: smaug: wohoo. back to normal.. 240MB
15:07 smaug: whimboo: well, about:cc may use a lot of memory if the graph is huge
15:08 whimboo: smaug: so if I can help to analyze which js is holding that please tell me the steps
15:08 smaug: thinking...
15:09 smaug: whimboo: perhaps the best is to get cc and gc logs
15:09 smaug: before they are too huge
15:09 smaug: especially gc log is needed
15:09 smaug: since it is probably page keeping some js alive which ends up keeping nodes alive ...
15:10 whimboo: ok, will give those in a bit
15:11 whimboo: smaug: is 50MB for GC fine?
15:12 smaug: whimboo: uh, that is quite big one
15:12 whimboo: gna. let me restart then
15:13 smaug: whimboo: so, I need to go soon, but could you record this conversation to the bug, so that it is clear that it is the page which is leaking
15:13 whimboo: smaug: i can't get it under 40MB
15:14 smaug: whimboo: put it to some server and I'll download it from there?
15:14 whimboo: sure
15:14 whimboo: thanks
15:14 whimboo: will comment in the bug
15:14 smaug: and put cc log too
15:14 smaug: thanks
15:14 whimboo: sure
Comment 36 Henrik Skupin (:whimboo) 2013-01-25 06:25:31 PST
Comment on attachment 706361 [details]
cc-edges.log

This was collected with XUL cache disabled.
Comment 37 Henrik Skupin (:whimboo) 2013-01-25 06:26:45 PST
Updated URLs for our own pages:

http://10.250.73.243:8080/
http://mm-ci-master.qa.scl3.mozilla.com:8080/

Open one of them and wait until jobs are running. The number of orphaned nodes will increase constantly and reach 10MB after a couple of minutes.
Comment 38 Henrik Skupin (:whimboo) 2013-02-03 04:25:08 PST
Not sure why I haven't commented on this bug with the location of the CC and GC log I have uploaded to my people account. Olli, whenever you have time please have a look at those. 

http://people.mozilla.com/~hskupin/downloads/cc-edges.88911.log
http://people.mozilla.com/~hskupin/downloads/gc-edges.88911.log
Comment 39 Henrik Skupin (:whimboo) 2013-02-20 12:20:48 PST
Requesting info-needed from Olli for my latest cc/gc logs for his analysis. I hope you will have a bit of time soon. Thanks.
Comment 40 Henrik Skupin (:whimboo) 2013-03-13 16:13:41 PDT
Olli, can you find a bit of time to have a look at this?
Comment 41 Henrik Skupin (:whimboo) 2013-03-30 04:03:02 PDT
I got an old version of Firefox (15.0) to crash in js::gc::MarkObjectUnbarried after adding hundreds of job requests to the Jenkins instance. I have now updated Firefox to the latest release and will check again after Easter.
Comment 42 Henrik Skupin (:whimboo) 2013-03-30 04:03:37 PDT
Seems like Kris isn't working on this. Back into the pool.
Comment 43 Henrik Skupin (:whimboo) 2013-04-04 04:01:30 PDT
I got the crash again today with the latest release of Firefox:
bp-6e5c9d5a-08dd-4a6b-95ed-6c6182130404

As the crash info shows, Firefox crashes due to an OOM. Is there something we can do to prevent this from happening?
Comment 44 Olli Pettay [:smaug] (high review load, please consider other reviewers) 2013-07-22 09:05:07 PDT
Sorry about not dealing with this. Henrik, could you re-test and report whether there are any changes?
Comment 45 Henrik Skupin (:whimboo) 2013-07-22 11:00:55 PDT
Yes, this is still an issue with Aurora. I know that information about Nightly would be better, but I can't deliver those today. If you need those I can check tomorrow. Otherwise I would really appreciate if you could test it yourself.
Comment 46 Henrik Skupin (:whimboo) 2013-08-13 07:37:53 PDT
I left our Jenkins page open for half a day with the latest Aurora (26.0a2) and now I have a memory usage of 1.6GB, of which the page itself already uses 650MB! 480MB of that is orphaned nodes which cannot even be released via GC.

I would love it if we could get some tracking on this bug. It has been waiting for months now. :(

1,004.78 MB (63.90%) -- window-objects
│  ├────642.13 MB (40.83%) -- top(http://mm-ci-master.qa.scl3.mozilla.com:8080/, id=83)
│  │    ├──637.28 MB (40.53%) -- active
│  │    │  ├──636.97 MB (40.51%) -- window(http://mm-ci-master.qa.scl3.mozilla.com:8080/)
│  │    │  │  ├──483.43 MB (30.74%) -- dom
│  │    │  │  │  ├──482.45 MB (30.68%) ── orphan-nodes
│  │    │  │  │  └────0.98 MB (00.06%) ++ (4 tiny)
│  │    │  │  ├──145.40 MB (09.25%) -- js-compartment(http://mm-ci-master.qa.scl3.mozilla.com:8080/)
│  │    │  │  │  ├──130.26 MB (08.28%) -- gc-heap
│  │    │  │  │  │  ├──129.73 MB (08.25%) -- objects
│  │    │  │  │  │  │  ├───79.33 MB (05.04%) ── ordinary
│  │    │  │  │  │  │  ├───34.48 MB (02.19%) ── function
│  │    │  │  │  │  │  └───15.92 MB (01.01%) ── dense-array
│  │    │  │  │  │  └────0.54 MB (00.03%) ++ (3 tiny)
│  │    │  │  │  └───15.13 MB (00.96%) ++ (5 tiny)
│  │    │  │  └────8.15 MB (00.52%) ++ (3 tiny)
│  │    │  └────0.31 MB (00.02%) ++ window(about:blank)
│  │    └────4.85 MB (00.31%) ++ js-zone(0x7fbb3c6f4800)
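For anyone re-measuring this, here is a small sketch that sums the orphan-nodes entries in a saved about:memory report. It assumes the plain-text tree format shown above with sizes reported in MB (entries reported in B, like the ones in comment 32, would need extra handling):

#!/usr/bin/env python
# Minimal sketch: sum the "orphan-nodes" entries in a saved about:memory text dump.
# Assumes lines of the form shown above, e.g.
#   ...482.45 MB (30.68%) ── orphan-nodes
import re
import sys

ORPHAN_RE = re.compile(r"([\d,]+\.\d+) MB \(\d+\.\d+%\) .*orphan-nodes")

def total_orphan_mb(path):
    total = 0.0
    with open(path) as report:
        for line in report:
            match = ORPHAN_RE.search(line)
            if match:
                total += float(match.group(1).replace(",", ""))
    return total

if __name__ == "__main__":
    print("orphan-nodes total: %.2f MB" % total_orphan_mb(sys.argv[1]))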
Comment 47 Olli Pettay [:smaug] (high review load, please consider other reviewers) 2013-08-13 07:58:33 PDT
Yeah, I know. I or someone should find time to find the memory leak in Jenkins.
(Of course it would be nice if the Jenkins devs would do it.)
Comment 48 Justin Lebar (not reading bugmail) 2013-08-13 11:10:12 PDT
Henrik, have you tried contacting the Jenkins developers?  This pretty clearly is a leak in their page.
Comment 49 Henrik Skupin (:whimboo) 2013-08-21 15:58:23 PDT
Sure, I even talked with the lead developer. But sadly they never followed up on it, and on IRC I didn't receive helpful answers.

But let's wait here for a moment. We are currently upgrading our local Jenkins instance to the latest LTS (see https://github.com/mozilla/mozmill-ci/pull/260), which means nearly a year has passed since the last upgrade. Let's see if the new version is better. I will report back in the next couple of days.
Comment 50 Henrik Skupin (:whimboo) 2013-09-02 07:32:18 PDT
So we updated to the latest Jenkins LTS version today in our production environment, but the issue still persists. With that in mind I will try to reproduce this with a local Jenkins installation with plugins disabled; one of those might also be causing the problem.
Comment 51 Henrik Skupin (:whimboo) 2014-01-08 01:19:17 PST
I'm not sure where all this unclassified memory comes from, but I suspect it is from this bug. Has anything regressed here in terms of collecting orphaned nodes? I'm not sure what I should do to get more details.

2,202.88 MB (100.0%) -- explicit
├──1,035.84 MB (47.02%) ── heap-unclassified
Comment 52 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2014-01-08 07:50:31 PST
Orphaned nodes held alive by JS should be tracked in about:memory now.
Comment 53 Henrik Skupin (:whimboo) 2014-01-08 08:52:26 PST
Andrew, what do you mean by now? Which version of Firefox? I'm currently running 28.0a2 and the memory situation is kinda bad as shown above.
Comment 54 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2014-01-08 08:55:24 PST
I'm suggesting that the heap-unclassified probably isn't orphan nodes, but it could be.  The only way to figure out what heap-unclassified is, is to run a DMD build.
  https://wiki.mozilla.org/Performance/MemShrink/DMD#Build
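For reference, the rough shape of a DMD run as I understand it; this is only a sketch, the wiki page above is authoritative and the exact steps have changed between releases:

# mozconfig: build with DMD support
ac_add_options --enable-dmd

# then enable DMD at runtime via the environment and load the Jenkins page;
# the wiki describes how to dump the reports and post-process them with dmd.py
DMD=1 ./mach run

The interesting output is the breakdown of heap-unclassified bytes by allocation stack, which is exactly what comment 51 is missing.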
Comment 55 Nicholas Nethercote [:njn] (on vacation until July 11) 2014-01-08 17:48:40 PST
And orphan nodes get reported -- look for "orphan-nodes" in about:memory. It's possible that some are being missed, but in general it's not immediately obvious that high heap-unclassified implies orphan nodes. As mccr8 says, DMD is the way to analyze this.
Comment 56 Olli Pettay [:smaug] (high review load, please consider other reviewers) 2014-06-02 00:08:57 PDT
We need some DMD analysis here, I think.
But I could take a look at fresh cc logs - both "concise" and "verbose" (when created using about:memory).

If possible, create logs with a clean profile (no addons).
Comment 57 Olli Pettay [:smaug] (high review load, please consider other reviewers) 2014-06-23 13:41:31 PDT
Someone who can reproduce this needs to do some DMD analysis here.
Comment 58 Henrik Skupin (:whimboo) 2014-06-23 22:52:47 PDT
Early last week we updated our Jenkins installation to the most recent LTS version, so I can now check whether anything has changed. I set up a new profile with only MemChaser installed. I will watch the memory usage through the day and will report back with logs as requested.

I will also try to get a DMD build working.
Comment 59 Henrik Skupin (:whimboo) 2014-06-24 10:22:46 PDT
Created attachment 8445286 [details]
memchaser.log

Ok, so this is still a problem with the latest Jenkins. Memory usage of a fresh profile with only MemChaser installed went up from about 150MB to 1900MB, and Firefox was barely responding at that point. For now I have the MemChaser logs; I will provide the other requested information once Firefox is again in a state where it uses about 1GB of memory.
Comment 60 Henrik Skupin (:whimboo) 2014-06-25 07:42:12 PDT
(In reply to Olli Pettay [:smaug] from comment #56)
> But I could take a look at fresh cc logs - both "concise" and "verbose"
> (when created using about:memory).

Ok, so I saved a concise log:
https://www.dropbox.com/s/2o3xhyttkit5wcw/concise.zip

Sadly all the memory was released right afterwards, so for a verbose log I will have to wait again until memory usage reaches a certain level. It looks like it really is a memory leak on that site.
Comment 61 Henrik Skupin (:whimboo) 2014-06-26 01:58:14 PDT
Here is the verbose cc log:
https://www.dropbox.com/s/7y9w1cpiq03kwf0/verbose.zip

Olli, can you please have a look?
Comment 62 Olli Pettay [:smaug] (high review load, please consider other reviewers) 2014-07-02 12:26:02 PDT
I did look at the logs, and that was discussed on IRC
http://logs.glob.uno/?c=mozilla%23developers&s=26%20Jun%202014&e=26%20Jun%202014#c995967
Comment 63 Henrik Skupin (:whimboo) 2015-03-20 07:30:29 PDT
I still see this problem with current Nightly builds. The important new fact is that even when I close the tab, memory doesn't go down but rises from about 2GB to 5GB! Then Firefox sits at 100% CPU load, memory goes up and down, and the application is no longer usable. It completely hangs.
Comment 64 Syd Polk :sydpolk 2015-03-20 08:41:50 PDT
I have seen this as well.
Comment 65 Olli Pettay [:smaug] (high review load, please consider other reviewers) 2015-03-20 09:19:43 PDT
Could you perhaps upload new GC/CC logs?
