Bug 894993 (Opened 6 years ago, Closed 6 years ago)

[Session Restore] SessionWorker takes lots of memory.

Categories

(Core :: JavaScript Engine, defect)

Version: 25 Branch
Type: defect
Priority: Not set

Tracking

Status: VERIFIED FIXED
Target Milestone: mozilla25
blocking-b2g: -
Tracking Status
firefox24 --- unaffected
firefox25 + verified
b2g18 --- affected
b2g18-v1.0.0 --- affected
b2g18-v1.0.1 --- affected
b2g-v1.1hd --- affected

People

(Reporter: Yoric, Assigned: bent.mozilla)

References

(Blocks 1 open bug)

Details

(Keywords: memory-footprint, perf, regression, Whiteboard: [MemShrink:P1] [c= s=2013.07.26])

Attachments

(1 file)

No description provided.
From the time I posted to now, the memory went from 83MB to 96MB.
This session began when I updated to this nightly, which I think happened 3h or 4h ago.
I have 4 app tabs: about:memory, Gmail (though I'm not logged in all the time), MozillaZine, and Facebook. Besides those, I have 1 or 2 tabs open for normal browsing.
Note to readers: This is a spinoff from https://bugzilla.mozilla.org/show_bug.cgi?id=891360#c12 .

If I understand Guilherme's observations correctly, some of the memory holding strings is never reclaimed until he hits 'Minimize memory usage'.
│    ├──110.64 MB (10.67%) -- worker(resource:///modules/sessionstore/SessionWorker.js, 0x7ff71d761800)
│    │  ├──106.94 MB (10.31%) -- zone(0x7ff715c72000)
│    │  │  ├───72.04 MB (06.95%) ++ string-chars/huge
│    │  │  ├───34.75 MB (03.35%) -- compartment(web-worker)
│    │  │  │   ├──34.28 MB (03.31%) -- objects-extra
│    │  │  │   │  ├──34.25 MB (03.30%) ── elements/non-asm.js
│    │  │  │   │  └───0.03 MB (00.00%) ── slots
│    │  │  │   └───0.47 MB (00.05%) ++ (6 tiny)
│    │  │  └────0.15 MB (00.01%) ++ (3 tiny)
│    │  └────3.70 MB (00.36%) ++ (4 tiny)

That's from my about:memory page. My sessionstore.js is about 1.8 MB.

Note that the PageThumbsWorker isn't really great at memory usage either:

│    ├───64.59 MB (06.23%) -- worker(resource://gre/modules/PageThumbsWorker.js, 0x7ff6e66f7800)
│    │   ├──61.46 MB (05.93%) -- zone(0x7ff6e6a15800)
│    │   │  ├──60.13 MB (05.80%) -- compartment(web-worker)
│    │   │  │  ├──58.01 MB (05.59%) -- objects-extra
│    │   │  │  │  ├──57.55 MB (05.55%) ── elements/non-asm.js
│    │   │  │  │  └───0.47 MB (00.04%) ++ (4 tiny)
│    │   │  │  └───2.11 MB (00.20%) ++ (6 tiny)
│    │   │  └───1.33 MB (00.13%) ++ (4 tiny)
│    │   └───3.13 MB (00.30%) ++ (4 tiny)
After hitting "Minimize memory usage" three times, both of these workers now moved to "Other measurements":

32.29 MB (100.0%) -- decommitted
├──26.78 MB (82.93%) ── js-non-window/gc-heap/decommitted-arenas
└───5.51 MB (17.07%) -- workers/workers()
    ├──2.17 MB (06.73%) ── worker(resource:///modules/sessionstore/SessionWorker.js, 0x7ff71d761800)/gc-heap/decommitted-arenas
    ├──2.16 MB (06.69%) ── worker(resource://gre/modules/PageThumbsWorker.js, 0x7ff6e66f7800)/gc-heap/decommitted-arenas
... And just a couple of minutes later the SessionWorker is almost as heavy as before, holding multiple session states as strings:

84.56 MB (08.57%) -- worker(resource:///modules/sessionstore/SessionWorker.js, 0x7ff71d761800)
Whiteboard: regr
[Sorry for the previous bugspam; pressed return by accident.]

Nominating for tracking-firefox25 because this is a regression from bug 891360.
Whiteboard: regr → [MemShrink]
Version: Trunk → 25 Branch
Assignee: nobody → ttaubert
I don't think I'm the right assignee for this bug. I have no idea about string GC in workers and why we're taking up so much memory here. We'll need help from someone a little more familiar with this topic.
Maybe khuey knows something about this, or knows whom to ask?
Yeah we're terrible at GCing on workers.
Assignee: ttaubert → nobody
Component: Session Restore → DOM: Workers
Product: Firefox → Core
Ben, khuey threw you under the bus - is this something you can help fix in the 25 cycle? Alternatively, can you suggest an alternate assignee?

A bunch of Session Store improvements we're working on depend on being able to move this I/O to a worker, and since session objects (or their serialization) can be rather large, we need to be GCing stuff properly.
Flags: needinfo?(bent.mozilla)
I think khuey is applying his usual editorial flourish a little too thickly here...

We have four mechanisms in place to GC a worker currently:

1 and 2: Workers GC on a timer while running (every 30 seconds) and on another timer when idle (after 5 seconds; this one actually gives memory back to the OS). If a worker constantly runs for short periods and then goes idle again, it's possible it never GCs from either timer.

3 and 4: Workers also GC when the JS engine "thinks" it should based on the normal memory preferences that we have. We check this immediately when the event queues are empty and then the JS engine also does this from time to time internally while running. We could tweak those preferences or start using the worker-specific overrides too.
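The two timer mechanisms above can be modeled roughly like this (a simplified sketch for illustration only; the 30 s / 5 s values come from the comment above, and the function name and state shape are hypothetical, not the actual Gecko internals):

```javascript
// Simplified model of the two worker GC timers described above.
// The numbers are from the comment (periodic GC every 30 s while running,
// shrinking GC after 5 s of idleness); everything else is hypothetical.
const PERIODIC_GC_MS = 30 * 1000;
const IDLE_SHRINKING_GC_MS = 5 * 1000;

function shouldGC(state) {
  // state = { running, msSinceLastGC, msIdle }
  if (state.running && state.msSinceLastGC >= PERIODIC_GC_MS) {
    return "periodic"; // mechanism 1: GC on a timer while running
  }
  if (!state.running && state.msIdle >= IDLE_SHRINKING_GC_MS) {
    return "shrinking"; // mechanism 2: shrinking GC when idle
  }
  return null; // neither timer has fired
}
```

In this model, a worker that wakes briefly and then idles for less than 5 seconds never satisfies either condition, which is the failure mode the comment warns about.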

This is obviously not bulletproof. The most obvious problem for ChromeWorkers is that if the worker is constantly blocked on I/O, the GC will be delayed or never run. I have some plans to fix that, but currently it's not on my work list.

So what is the SessionWorker code doing? Is it somehow missing all four of these mechanisms?

Maybe there's a bug in one of them, or maybe it's as simple as tweaking the prefs. Maybe we need to check the memory allocated at every event.
Flags: needinfo?(bent.mozilla)
The SessionWorker code is only doing trivial stuff with the strings:
* receive string;
* run a few tests on the string;
* optionally, some logging using dump();
* convert the string to an ArrayBuffer using TextEncoder.

The string is never stored anywhere.

Also, that ChromeWorker is idle most of the time (i.e. except on startup, it only receives a message every ~15 s and stays awake for only a few milliseconds), so there's a good chance we don't benefit from 1 and 2.

Note that this ChromeWorker calls js-ctypes at every call (the strings that annoy us are not passed as arguments), so this might play in somewhere.
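The string path listed above looks roughly like this (an illustrative sketch, not the actual SessionWorker.js source; the function name and the specific validation check are placeholders):

```javascript
// Illustrative sketch of the string handling described above; not the
// real SessionWorker.js. The incoming string is checked, optionally
// logged, encoded to bytes, and never stored anywhere.
function handleSessionMessage(stateString, { verbose = false } = {}) {
  // 1. Run a few sanity tests on the string (placeholder check).
  if (typeof stateString !== "string" || stateString.length === 0) {
    throw new TypeError("expected a non-empty session state string");
  }
  // 2. Optional logging, as with dump() in the worker.
  if (verbose) {
    console.log(`session state: ${stateString.length} chars`);
  }
  // 3. Convert the string to bytes via TextEncoder (UTF-8);
  //    the resulting Uint8Array is a view on an ArrayBuffer.
  const bytes = new TextEncoder().encode(stateString);
  return bytes; // the string itself is not retained
}
```

Nothing here keeps a reference to the string, which is why the 72 MB of `string-chars/huge` in the measurement above points at the engine failing to collect, not at the worker holding on to data.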
Ok, I poked around a little, and it looks like the idle timer is working as expected (though I forgot that the idle timer still just calls JS_MaybeGC; it doesn't force a shrinking GC). The JS engine just doesn't think it's worth GCing...
nbp, do you think you could look at this?  This is a big problem at the moment, because it's blocking the session restore rewrite feature.
Whiteboard: [MemShrink] → [MemShrink:P1]
Assignee: nobody → nicolas.b.pierron
(In reply to Nicholas Nethercote [:njn] from comment #14)
> nbp, do you think you could look at this?  This is a big problem at the
> moment, because it's blocking the session restore rewrite feature.

Not within 1 or 2 weeks.
Flags: needinfo?(n.nethercote)
gwagner and I looked a little more here and it definitely seems like a JS engine problem... The runtime's gcMallocBytes is a large negative number (0xff6b8004) and gcMaxMallocBytes is 0x08000000. We should have GC'd.
Ok, this is definitely a JS engine bug. Here's what's going wrong:

  JSRuntime::updateMallocCounter(JS::Zone * zone, unsigned int nbytes)
  JSRuntime::updateMallocCounter(unsigned int nbytes)
  js::MallocProvider<JSRuntime>::malloc_(unsigned int bytes)
  js::MallocProvider<JSRuntime>::new_<JS::Zone,JSRuntime *>(JSRuntime * p1)
  JSRuntime::init(unsigned int maxbytes)
  JS_NewRuntime(unsigned int maxbytes, JSUseHelperThreads useHelperThreads)
  `anonymous namespace'::CreateJSContextForWorker()

The JSRuntime constructor sets gcMallocBytes to 0. Then, JSRuntime::init allocates a Zone and subtracts its size from 0, leaving a negative number. From that point on the malloc trigger is basically dead.

I think this only happens on workers because we don't force a full GC at any point during normal operation, we just use JS_MaybeGC.
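The failure mode described here can be modeled with a countdown counter (a simplified illustration of the reported bug, not the actual JSRuntime code, which keeps this counter in C++; the constants are from comment 17):

```javascript
// Simplified model of the malloc-trigger bug described above.
// The counter is meant to start at maxMallocBytes and count down;
// a GC is requested when a subtraction makes it cross zero.
function makeMallocCounter(initialBytes) {
  let counter = initialBytes;
  let gcRequested = false;
  return {
    updateMallocCounter(nbytes) {
      const before = counter;
      counter -= nbytes;
      // Trigger only on the positive -> non-positive transition.
      if (before > 0 && counter <= 0) gcRequested = true;
    },
    get gcRequested() { return gcRequested; },
    get counter() { return counter; },
  };
}

const MAX_MALLOC_BYTES = 0x08000000; // gcMaxMallocBytes from comment 17

// Correct behavior: start at the max, so allocations eventually trigger.
const good = makeMallocCounter(MAX_MALLOC_BYTES);
good.updateMallocCounter(MAX_MALLOC_BYTES + 1); // gcRequested becomes true

// The bug: the constructor started the counter at 0, JSRuntime::init then
// subtracted the Zone's size, and the trigger edge could never fire again.
const buggy = makeMallocCounter(0);
buggy.updateMallocCounter(4096);              // Zone allocation during init
buggy.updateMallocCounter(200 * 1024 * 1024); // later allocations
// buggy.gcRequested stays false: the malloc trigger is dead
```

This matches the observed state in comment 17: a large negative `gcMallocBytes` with no GC ever requested.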
Component: DOM: Workers → JavaScript Engine
Flags: needinfo?(n.nethercote)
Attached patch changes.patch
Ok, this fixes the JS engine and then switches the worker idle timer back to a forced shrinking GC.
Assignee: nicolas.b.pierron → bent.mozilla
Status: NEW → ASSIGNED
Attachment #780655 - Flags: review?(wmccloskey)
Attachment #780655 - Flags: review?(khuey)
(In reply to ben turner [:bent] from comment #18)
> switches the worker idle timer back to a forced shrinking GC.

... because JS_MaybeGC is just not tuned right for workers. It won't try to GC based on time if there has never been a GC before, and it won't try to GC based on allocation until we've allocated 128mb (!!!). We need to fix JS_MaybeGC to be a bit smarter, and I'll talk with gwagner about this next week. The worker change here should be reverted once we fix it.
Comment on attachment 780655 [details] [diff] [review]
changes.patch

Oops, sorry about this.
Attachment #780655 - Flags: review?(wmccloskey) → review+
> ... because JS_MaybeGC is just not tuned right for workers. It won't try to GC based on time
> if there has never been a GC before, and it won't try to GC based on allocation until we've
> allocated 128mb (!!!). We need to fix JS_MaybeGC to be a bit smarter, and I'll talk with
> gwagner about this next week. The worker change here should be reverted once we fix it.

Where does 128mb come from? I've always had trouble understanding how the GC prefs are set in workers, but that seems very high for the initial trigger bytes, even in the browser.
(In reply to Bill McCloskey (:billm) from comment #21)
> Where does 128mb come from?

http://mxr.mozilla.org/mozilla-central/source/modules/libpref/src/init/all.js#828

We can override it for workers but guessing the right number could be tough.
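For context, the default being pointed at looks something like this in all.js (the pref name and value here are recalled from the 2013 tree and should be treated as an assumption; verify against the mxr link above):

```javascript
// Hedged sketch of the all.js default referenced above, not a verbatim
// copy: the threshold JS_MaybeGC consults before considering an
// allocation-based GC, expressed in MB.
pref("javascript.options.mem.high_water_mark", 128);
```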
https://hg.mozilla.org/mozilla-central/rev/f1e8ada1ddf7
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla25
Do we need this on b2g18?
The zone stuff is on trunk only I think (right billm?)

We should make the same worker changes (remove the one JS_MaybeGC) on b2g18 too.
Flags: needinfo?(wmccloskey)
Yes, zones are not on b2g18.
Flags: needinfo?(wmccloskey)
blocking-b2g: --- → leo?
Would it be hard to write a sketch of the patch for b2g18 while this is still leo?? That might make it easier to argue this case to release drivers. Although maybe that doesn't really matter, since Leo is cherry-picking off b2g18 anyway.
Keywords: perf
Whiteboard: [MemShrink:P1] → [MemShrink:P1] [c= s=2013.07.26]
Ben, this does not seem like a blocker, and we are not taking any new perf improvements at this point unless we have already committed. Also, this is 1.0.1-affected, so I'm not sure why we would need this on 1.1.
blocking-b2g: leo? → -
> we are not taking any new perf improvements at this point, unless we have already committed

We've been taking a lot of perf improvements at leo's behest.

It would be frustrating if they hit this bug in their last round of testing and we then had to scramble to fix it. Perhaps you can coordinate with them to confirm that this is not desired.
(In reply to Justin Lebar [:jlebar] from comment #29)
> Would it be hard to write a sketch of the patch for b2g18

It's identical to the m-c patch, RuntimeService.cpp changes only.
(In reply to ben turner [:bent] from comment #32)
> (In reply to Justin Lebar [:jlebar] from comment #29)
> > Would it be hard to write a sketch of the patch for b2g18
> 
> It's identical to the m-c patch, RuntimeService.cpp changes only.

Don't we want the resetGCMallocBytes change as well? That's broken on b2g18 as far as I can tell.
Hm, I guess so. Zones trigger the bug but don't cause it.
Guilherme, are you still experiencing this issue in Firefox 25?
Flags: needinfo?(guijoselito)
I'm a Nightly user and don't have Firefox 25 (but I can download it if you need).
But I remember that the first nightly with this patch made it behave like it does now: the memory consumption is always
4.27 MB (11.16%) -- workers/workers()
    ├──1.43 MB (03.72%) ── worker(resource://gre/modules/PageThumbsWorker.js, 0x64da800)/gc-heap/decommitted-arenas
    ├──1.43 MB (03.72%) ── worker(resource://gre/modules/osfile/osfile_async_worker.js, 0x6826000)/gc-heap/decommitted-arenas
    └──1.42 MB (03.71%) ── worker(resource:///modules/sessionstore/SessionWorker.js, 0x7252400)/gc-heap/decommitted-arenas

100% of the time, the memory is 4 MB.
Flags: needinfo?(guijoselito)
I think that's enough to call this verified. Thank you Guilherme.
Status: RESOLVED → VERIFIED