Open Bug 864927 Opened 8 years ago Updated 3 years ago

optimize in-chrome workers to be more memory efficient

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

REOPENED

People

(Reporter: justin.lebar+bug, Assigned: nihsanullah)

References

(Depends on 3 open bugs)

Details

(Whiteboard: [MemShrink:P2][~5MB][Async:blocker])

Attachments

(1 file)

The RIL JS worker thread uses a lot of memory.  I strongly suspect that if we rewrote it in C++, we'd be able to reduce the component's memory usage significantly.

Here's a dump of the worker thread's memory usage.  Most or all of unused-arenas will go away after bug 829482, but even after that this worker uses way too much memory.

> 2.83 MB (08.26%) -- worker(resource://gre/modules/ril_worker.js, 0x42fd6400)
> ├──1.56 MB (04.56%) -- gc-heap
> │  ├──1.53 MB (04.47%) ── unused-arenas
> │  └──0.03 MB (00.09%) -- (2 tiny)
> │     ├──0.03 MB (00.09%) ── chunk-admin
> │     └──0.00 MB (00.00%) ── unused-chunks
> ├──0.61 MB (01.77%) -- compartment(web-worker)
> │  ├──0.29 MB (00.84%) -- gc-heap
> │  │  ├──0.09 MB (00.27%) ── unused-gc-things [2]
> │  │  ├──0.08 MB (00.24%) -- shapes
> │  │  │  ├──0.04 MB (00.11%) ── tree
> │  │  │  ├──0.03 MB (00.10%) ── dict
> │  │  │  └──0.01 MB (00.03%) ── base
> │  │  ├──0.06 MB (00.19%) -- objects
> │  │  │  ├──0.03 MB (00.10%) ── non-function
> │  │  │  └──0.03 MB (00.09%) ── function
> │  │  ├──0.04 MB (00.13%) ── scripts
> │  │  └──0.01 MB (00.03%) ── sundries [2]
> │  ├──0.16 MB (00.47%) ── script-data
> │  ├──0.05 MB (00.15%) -- objects-extra
> │  │  ├──0.04 MB (00.12%) ── slots
> │  │  └──0.01 MB (00.04%) ── elements
> │  ├──0.03 MB (00.09%) ── type-inference/script-main
> │  ├──0.03 MB (00.08%) ── other-sundries [2]
> │  ├──0.03 MB (00.07%) ── analysis-temporary
> │  └──0.02 MB (00.07%) -- shapes-extra
> │     ├──0.01 MB (00.04%) ── compartment-tables
> │     └──0.01 MB (00.03%) ── dict-tables
> ├──0.41 MB (01.20%) -- runtime
> │  ├──0.13 MB (00.36%) ── gc-marker
> │  ├──0.11 MB (00.33%) ── script-sources
> │  ├──0.06 MB (00.18%) ── atoms-table
> │  ├──0.06 MB (00.18%) ── jaeger-code
> │  ├──0.04 MB (00.10%) ── runtime-object
> │  ├──0.00 MB (00.01%) ── dtoa
> │  ├──0.00 MB (00.01%) ── stack
> │  ├──0.00 MB (00.01%) ── temporary
> │  ├──0.00 MB (00.00%) ── contexts
> │  ├──0.00 MB (00.00%) ── script-filenames
> │  ├──0.00 MB (00.00%) ── unused-code
> │  ├──0.00 MB (00.00%) ── ion-code
> │  ├──0.00 MB (00.00%) ── math-cache
> │  └──0.00 MB (00.00%) ── regexp-code
> └──0.25 MB (00.72%) -- compartment(web-worker-atoms)
>    ├──0.14 MB (00.41%) -- gc-heap
>    │  ├──0.13 MB (00.38%) ── strings
>    │  ├──0.01 MB (00.02%) ── unused-gc-things
>    │  └──0.00 MB (00.00%) ── sundries
>    ├──0.10 MB (00.30%) ── string-chars/non-huge
>    └──0.00 MB (00.01%) ── other-sundries
(In reply to Justin Lebar [:jlebar] from comment #0)
> The RIL JS worker thread uses a lot of memory.  I strongly suspect that if
> we rewrote it in C++, we'd be able to reduce the component's memory usage
> significantly.
> 

I think it would be easier to make the GC parameters tunable for workers?
> I think it would be easier to make the GC parameters tunable for workers?

If you want to try that, I don't think it could hurt.  But what I see here to my naive eyes doesn't look like GC parameter problems (aside from unused-arenas, of course); it looks like we have a bunch of things inside a runtime / component, and they each eat up a tiny bit of space.
Whiteboard: [MemShrink]
There are four relevant workers here:

 * RIL worker (this bug)
 * Net worker (bug 864931)
 * wifi worker x2 (bug 864932)
(In reply to Justin Lebar [:jlebar] from comment #0)
> The RIL JS worker thread uses a lot of memory.  I strongly suspect that if
> we rewrote it in C++, we'd be able to reduce the component's memory usage
> significantly.
> 
> Here's a dump of the worker thread's memory usage.  Most or all of
> unused-arenas will go away after bug 829482, but even after that this worker
> uses way too much memory.
> 
> > 2.83 MB (08.26%) -- worker(resource://gre/modules/ril_worker.js, 0x42fd6400)
Justin, the commercial RIL is implemented in C++. Could you please also analyze the commercial
RIL's memory usage? That would help us understand clearly what we'd gain by rewriting the
RIL worker.
commercial ril

35.66 MB (100.0%) -- explicit
├───9.34 MB (26.18%) ++ js-non-window
├───7.07 MB (19.82%) -- workers/workers()
│   ├──2.45 MB (06.88%) -- worker(resource://gre/modules/wifi_worker.js, 0x44658400)
│   │  ├──1.77 MB (04.97%) -- gc-heap
│   │  │  ├──1.73 MB (04.86%) ── unused-arenas
│   │  │  └──0.04 MB (00.11%) ++ (3 tiny)
│   │  └──0.68 MB (01.90%) ++ (3 tiny)
│   ├──2.32 MB (06.51%) -- worker(resource://gre/modules/net_worker.js, 0x44185400)
│   │  ├──1.74 MB (04.89%) -- gc-heap
│   │  │  ├──1.70 MB (04.78%) ── unused-arenas
│   │  │  └──0.04 MB (00.11%) ++ (3 tiny)
│   │  └──0.58 MB (01.62%) ++ (3 tiny)
│   └──2.29 MB (06.43%) -- worker(resource://gre/modules/wifi_worker.js, 0x42e44400)
│      ├──1.79 MB (05.01%) -- gc-heap
│      │  ├──1.75 MB (04.92%) ── decommitted-arenas
│      │  └──0.03 MB (00.09%) ++ (3 tiny)
│      └──0.51 MB (01.43%) ++ (3 tiny)
├───5.67 MB (15.90%) ── heap-unclassified
├───4.31 MB (12.09%) ++ window-objects
├───3.61 MB (10.13%) ++ images
├───2.73 MB (07.65%) ── freetype
├───1.84 MB (05.15%) ++ (11 tiny)
└───1.09 MB (03.07%) ── xpti-working-set

and the library file takes less than 500 KB.


moz ril

39.84 MB (100.0%) -- explicit
├───9.93 MB (24.91%) -- workers/workers()
│   ├──2.86 MB (07.17%) -- worker(resource://gre/modules/ril_worker.js, 0x42d39400)
│   │  ├──1.54 MB (03.86%) -- gc-heap
│   │  │  ├──1.50 MB (03.76%) ── unused-arenas
│   │  │  └──0.04 MB (00.10%) ++ (3 tiny)
│   │  ├──0.66 MB (01.66%) ++ compartment(web-worker)
│   │  ├──0.41 MB (01.03%) ++ runtime
│   │  └──0.25 MB (00.62%) ++ compartment(web-worker-atoms)
│   ├──2.45 MB (06.15%) -- worker(resource://gre/modules/wifi_worker.js, 0x42d40c00)
│   │  ├──1.77 MB (04.45%) -- gc-heap
│   │  │  ├──1.73 MB (04.35%) ── unused-arenas
│   │  │  └──0.04 MB (00.10%) ++ (3 tiny)
│   │  └──0.68 MB (01.70%) ++ (3 tiny)
│   ├──2.32 MB (05.83%) -- worker(resource://gre/modules/net_worker.js, 0x43f81400)
│   │  ├──1.74 MB (04.37%) -- gc-heap
│   │  │  ├──1.70 MB (04.27%) ── unused-arenas
│   │  │  └──0.04 MB (00.10%) ++ (3 tiny)
│   │  └──0.58 MB (01.45%) ++ (3 tiny)
│   └──2.29 MB (05.76%) -- worker(resource://gre/modules/wifi_worker.js, 0x42d39800)
│      ├──1.79 MB (04.48%) -- gc-heap
│      │  ├──1.75 MB (04.40%) ── decommitted-arenas
│      │  └──0.03 MB (00.08%) ++ (3 tiny)
│      └──0.51 MB (01.28%) ++ (3 tiny)
├───9.52 MB (23.89%) ++ js-non-window
├───5.19 MB (13.02%) ── heap-unclassified
├───4.79 MB (12.02%) ++ images
├───4.26 MB (10.69%) ++ window-objects
├───2.73 MB (06.85%) ── freetype
├───1.69 MB (04.23%) ++ (10 tiny)
├───1.08 MB (02.71%) ── xpti-working-set
└───0.67 MB (01.68%) -- startup-cache
    ├──0.67 MB (01.68%) ── data
    └──0.00 MB (00.00%) ── mapping
Note that heap-unclassified is ~500 KB bigger with the commercial RIL.  That may or may not be noise in the data; DMD would tell us.

500 KB sounds like a lot for the RIL, but in any case, if we never ship the Mozilla RIL (can someone confirm this?), I'm happy to WONTFIX this bug.  We can file a separate bug on the commercial RIL taking up 500 KB...
We can re-open this one if we ever start shipping this RIL.  For now, WONTFIX.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Whiteboard: [MemShrink] → [MemShrink][tarako][~5MB]
Where does the 5MB figure come from?  RIL in JS is good for stability; I would rather keep it in JS than rewrite it in C++, and I would rather squeeze memory out of other modules before starting on this bug.
Yeah, from discussing with Andreas, he'd rather keep it in JS too. The issue we face and should fix, though, is that on multi-SIM devices like Tarako we use one worker per SIM, and each one uses around 2.5MB (maybe a bit less now that we removed js-ctypes from the worker).

Could we refactor to use a single worker with multiple sims?
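The refactor suggested above (one worker serving multiple SIMs) would mean routing each message through a per-SIM dispatch table, so the script sources, bytecode, and atoms are loaded once rather than per worker. A minimal, hypothetical sketch of that dispatch pattern follows; all names here are illustrative, not the actual RIL code:

```javascript
// Sketch: one worker servicing several SIM slots instead of one worker per
// slot. Each message carries a clientId selecting per-SIM mutable state;
// all code and shared tables exist only once. (Illustrative names only.)
class RilDispatcher {
  constructor(numClients) {
    this.clients = Array.from({ length: numClients }, (_, id) => ({
      id,
      pendingParcels: [], // per-SIM state; everything else is shared
    }));
  }

  // A real worker would receive this via onmessage.
  handleMessage(msg) {
    const client = this.clients[msg.clientId];
    if (!client) throw new Error("unknown clientId " + msg.clientId);
    client.pendingParcels.push(msg.parcel);
    return client.pendingParcels.length;
  }
}

const dispatcher = new RilDispatcher(2);
dispatcher.handleMessage({ clientId: 0, parcel: "REQUEST_GET_SIM_STATUS" });
dispatcher.handleMessage({ clientId: 1, parcel: "REQUEST_DIAL" });
console.log(dispatcher.clients[0].pendingParcels.length); // 1
```

The memory win comes purely from sharing the runtime and compiled code; the per-SIM state above is the only part that must remain duplicated.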
OK

(In reply to Thinker Li [:sinker] from comment #8)
> Where does the 5MB figure come from?  RIL in JS is good for stability; I
> would rather keep it in JS than rewrite it in C++, and I would rather
> squeeze memory out of other modules before starting on this bug.
(In reply to Fabrice Desré [:fabrice] from comment #9)
> Yeah, from discussing with Andreas, he'd rather keep it in JS too. The
> issue we face and should fix, though, is that on multi-SIM devices like
> Tarako we use one worker per SIM, and each one uses around 2.5MB (maybe a
> bit less now that we removed js-ctypes from the worker).
> 
> Could we refactor to use a single worker with multiple sims?
It is possible. Vicamo will provide more details.

By the way, is it really pointless to rewrite the RIL worker in C++? Rewriting the RIL worker in C++ is a very large task; if it has to become a goal in the future, we need a plan to start on it now, or we will just pass this problem along to the future again.
(In reply to Ken Chang[:ken] from comment #11)
> (In reply to Fabrice Desré [:fabrice] from comment #9)
> > Could we refactor to use a single worker with multiple sims?
> It is possible. Vicamo will provide more details.

Filed as bug 960894.

> By the way, is it really pointless to rewrite the RIL worker in C++?
> Rewriting the RIL worker in C++ is a very large task; if it has to become
> a goal in the future, we need a plan to start on it now, or we will just
> pass this problem along to the future again.

If the memory pressure here is not already sufficient to make us reconsider this bug, there will probably be no other reason strong enough to reopen it.  I wouldn't call it a useless move: it's interesting, and it could be a chance to refactor our code in a more radical way.  But after asking Thinker and Cervantes for advice, that still isn't a strong enough reason.

WIP last night: https://github.com/vicamo/b2g_mozilla-central/tree/bugzilla/864927/rebuild-from-scratch
Whiteboard: [MemShrink][tarako][~5MB] → [MemShrink:P1][tarako][~5MB]
Summary: Rewrite RIL worker in C++ → optimize in-chrome workers to be more memory efficient
I am sure we can win back most of the memory here with a little bit of JS engine work. At minimum it should be possible to start a ChromeWorker and disable the JIT from the get go if you know you won't need it (just interpreter). Also, we want workers to be able to share heaps more efficiently as well as code storage. I am sure the JS peeps have more ideas to improve this case.
Assignee: nobody → nihsanullah
Product: Firefox OS → Core
Naveed, please discuss a plan with the team and file dependent bugs as needed. It would be good if we could get some initial improvements (or, at minimum, an analysis) here quickly.
Component: General → JavaScript Engine
(In reply to Justin Lebar (not reading bugmail) from comment #0)
> > 2.83 MB (08.26%) -- worker(resource://gre/modules/ril_worker.js, 0x42fd6400)
> > ├──1.56 MB (04.56%) -- gc-heap
> > │  ├──1.53 MB (04.47%) ── unused-arenas

Is this still present? I thought we reduced the GC limit to start after 1 MB, and we are now supposed to decommit unused pages (Bug 941792)

> > ├──0.61 MB (01.77%) -- compartment(web-worker)

I am afraid there is no simple solution for this part.

> > │  └──0.02 MB (00.07%) -- shapes-extra
> > │     └──0.01 MB (00.03%) ── dict-tables

Is there any reason to have objects with a lot of properties in the ril_worker?

> > ├──0.41 MB (01.20%) -- runtime

Bug 916021 is opened to address these issues.
Depends on: 941783, 916021
Attached file memory-reports
Here's an up to date memory report from a dual sim device (hence the two ril workers for now - fix pending to share a single worker).
(In reply to Fabrice Desré [:fabrice] from comment #16)
> Here's an up to date memory report from a dual sim device (hence the two ril
> workers for now - fix pending to share a single worker).

Here's some of the output from that report.

│   ├──2,954,804 B (07.51%) -- worker(resource://gre/modules/ril_worker.js, 0x436c9c00)
│   │  ├──1,677,088 B (04.26%) -- runtime
│   │  │  ├────942,560 B (02.40%) ── script-sources
│   │  │  ├────336,528 B (00.86%) ── script-data
│   │  │  ├────262,144 B (00.67%) ── atoms-table
│   │  │  ├─────65,536 B (00.17%) ++ code
│   │  │  ├─────40,960 B (00.10%) ── runtime-object
│   │  │  ├─────16,384 B (00.04%) ── gc-marker
│   │  │  ├──────4,096 B (00.01%) ── dtoa
│   │  │  ├──────4,096 B (00.01%) ── interpreter-stack
│   │  │  ├──────4,096 B (00.01%) ── temporary
│   │  │  ├────────688 B (00.00%) ── contexts
│   │  │  ├──────────0 B (00.00%) ── math-cache
│   │  │  └──────────0 B (00.00%) ── regexp-data
│   │  ├────622,576 B (01.58%) ++ zone(0x446c5c00)
│   │  ├────428,080 B (01.09%) ++ zone(0x446c5000)
│   │  ├────194,292 B (00.49%) ++ zone(0x446c5800)
│   │  └─────32,768 B (00.08%) ++ gc-heap
│   └──2,918,164 B (07.42%) -- worker(resource://gre/modules/ril_worker.js, 0x446c4800)
│      ├──1,677,088 B (04.26%) -- runtime
│      │  ├────942,560 B (02.40%) ── script-sources
│      │  ├────336,528 B (00.86%) ── script-data
│      │  ├────262,144 B (00.67%) ── atoms-table
│      │  ├─────65,536 B (00.17%) ++ code
│      │  ├─────40,960 B (00.10%) ── runtime-object
│      │  ├─────16,384 B (00.04%) ── gc-marker
│      │  ├──────4,096 B (00.01%) ── dtoa
│      │  ├──────4,096 B (00.01%) ── interpreter-stack
│      │  ├──────4,096 B (00.01%) ── temporary
│      │  ├────────688 B (00.00%) ── contexts
│      │  ├──────────0 B (00.00%) ── math-cache
│      │  └──────────0 B (00.00%) ── regexp-data
│      ├────585,936 B (01.49%) ++ zone(0x449c0000)
│      ├────428,080 B (01.09%) ++ zone(0x449bec00)
│      ├────194,292 B (00.49%) ++ zone(0x449bf800)
│      └─────32,768 B (00.08%) ++ gc-heap

The obvious thing: do we really need two workers? Because a ton of that stuff is entirely duplicated. The script-sources are identical, the script-data is identical, the atoms-tables are probably identical or almost identical, etc, etc.

Also, I've lost track of the various source/parsing optimizations we have going now... can we say "don't bother storing the source for this worker, I promise I won't call toSource() on it"? Relatedly, I bet that script source isn't compressed because we only have one core on these phones.

And I've been wondering for a while whether our atoms-tables could be made smaller. I've never quite worked out exactly which strings get atomized and why.

Finally, it's not shown in the entries I've cut and pasted above, but the various zone sub-trees for these workers seem reasonable. There's nothing that sticks out as obviously wasteful, e.g. no big chunks of unused GC heap or anything like that -- indeed, the "Other Measurements" section has this:

3,096,576 B (100.0%) -- decommitted
├──2,564,096 B (82.80%) -- workers/workers()
│  ├──1,298,432 B (41.93%) ── worker(resource://gre/modules/ril_worker.js, 0x446c4800)/gc-heap/decommitted-arenas
│  └──1,265,664 B (40.87%) ── worker(resource://gre/modules/ril_worker.js, 0x436c9c00)/gc-heap/decommitted-arenas
└────532,480 B (17.20%) ── js-non-window/gc-heap/decommitted-arenas

which shows that decommitting is working well.


(In reply to Andreas Gal :gal from comment #13)
> I am sure we can win back most of the memory here with a little bit of JS
> engine work. At minimum it should be possible to start a ChromeWorker and
> disable the JIT from the get go if you know you won't need it (just
> interpreter).

The JITs aren't a problem -- generated code is only 64 KiB in each worker.
(In reply to Nicholas Nethercote [:njn] from comment #17)
> (In reply to Fabrice Desré [:fabrice] from comment #16)
> > Here's an up to date memory report from a dual sim device (hence the two ril
> > workers for now - fix pending to share a single worker).
> 
...
> The obvious thing: do we really need two workers? Because a ton of that
> stuff is entirely duplicated. The script-sources are identical, the
> script-data is identical, the atoms-tables are probably identical or almost
> identical, etc, etc.

As I said, work is underway (in bug 960894) to have a single worker.

> Also, I've lost track of the various source/parsing optimizations we have
> going now... can we say "don't bother storing the source for this worker, I
> promise I won't call toSource() on it"? Relatedly, I bet that script source
> isn't compressed because we only have one core on these phones.

I don't think we can do that per worker. It would be nice to be able to declare "no source" the way we have "use strict".
nbp, does that sound feasible, or stupid?
Flags: needinfo?(nicolas.b.pierron)
This is chrome JS so anything goes. I recommend we just disable toSource in general for chrome via a pref. At least for FFOS we can definitely make that work. I am pretty sure we don't depend on it.
If I read the code correctly, for chrome we already load the source lazily and do not keep it around. Is "script-sources" maybe the bytecode?
> Is "script-sources" maybe the bytecode?

If you hover over an entry in about:memory you get a tool-tip describing it. "script-sources" is definitely the source code.

Bytecode is either in "script-data" (in the zone) or "scripts/malloc-heap/data" (in the compartments)... I don't recall how that split works now. Till would know.

So maybe the lazy-loading isn't working, or isn't enabled in that repository, or maybe something is causing it to be loaded?
Oh boy can we save a ton of memory if comment 21 is right. Looking at the source we should be using lazy script loading for chrome. Something must be broken.
mozJSComponentLoader.cpp:810

        CompileOptions options(cx);
        options.setPrincipals(nsJSPrincipals::get(mSystemPrincipal))
               .setNoScriptRval(mReuseLoaderGlobal ? false : true)
               .setVersion(JSVERSION_LATEST)
               .setFileAndLine(nativePath.get(), 1)
               .setSourcePolicy(mReuseLoaderGlobal ?
                                CompileOptions::NO_SOURCE :
                                CompileOptions::LAZY_SOURCE);
pref("jsloader.reuseGlobal", true); so we are supplying NO_SOURCE here
(In reply to Fabrice Desré [:fabrice] from comment #18)
> (In reply to Nicholas Nethercote [:njn] from comment #17)
> > Also, I've lost track of the various source/parsing optimizations we have
> > going now... can we say "don't bother storing the source for this worker, I
> > promise I won't call toSource() on it"? Relatedly, I bet that script source
> > isn't compressed because we only have one core on these phones.
> 
> I don't think we can do that by worker. That would be nice to be able to do
> "no source" like we have "use strict".
> nbp, does that look feasible/stupid ?

This question was already asked in Bug 941786.

The current state is that we keep a (not especially well) compressed version of the source on the runtime so that functions can be parsed on demand; as long as we do not execute a function, we only keep a small LazyScript stub that remembers its properties, such as its captured variables and its inner functions.

Since the bytecode is larger than the source, keeping the source in addition to these LazyScripts should be smaller than keeping the bytecode around without the source, as long as not all functions are in constant use.

Lately, the JS engine also performs relazification, which evicts the script-data of functions that are no longer used.

I think a "no source" annotation makes sense if all the code is constantly active, in which case most of the script-data is kept in addition to the script-source anyway. In most cases, though, many functions are unused most of the time, and keeping the LazyScripts makes more sense than removing the source; otherwise we would just trade the script-source for even more script-data in the memory report.

Bug 944659 suggests removing the source even when we have LazyScripts, at the cost of loading the source again later, but that might be critical in terms of performance, as Luke mentioned (bug 944659 comment 59).

(In reply to Nicholas Nethercote [:njn] from comment #17)
> │   ├──2,954,804 B (07.51%) -- worker(resource://gre/modules/ril_worker.js,
> 0x436c9c00)
> │   │  ├──1,677,088 B (04.26%) -- runtime
> │   │  ├────…
> │   │  ├────428,080 B (01.09%) ++ zone(0x446c5000)
> │   │  ├────194,292 B (00.49%) ++ zone(0x446c5800)

We have bugs opened to either merge the self-hosted zone with the atoms zone (see bug 921213 comment 1) or share the atoms compartment with the main thread.

> And I've been wondering for a while whether our atoms-tables could be made
> smaller. I've never quite worked out exactly which strings get atomized and
> why.

I am not sure if Bug 921176 is related to the atoms table, but the problem sounds similar.
Flags: needinfo?(nicolas.b.pierron)
The use of require() might be to blame for some of the memory usage.

This function works as follows:
- load the module's source code with an XHR;
- do trivial rewriting on that source code to enforce modularization;
- create an object URL for the rewritten source code;
- call importScripts() on said object URL;
- revoke the object URL.

This is certainly inefficient in terms of source caching. Once bug 568953 has landed for good, we should be able to either rewrite require() using ES6 modules or move away from require() altogether.
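The flow described above can be sketched roughly as follows. `wrapModuleSource` and the `self.modules` registry are hypothetical illustrations of the "trivial rewriting" step, not the actual require() implementation:

```javascript
// Rough sketch of the require() flow described above (illustrative only).
function wrapModuleSource(name, source) {
  // "Trivial rewriting ... to enforce modularization": wrap the module body
  // in a closure so its top-level bindings don't leak into the worker global.
  return (
    "self.modules = self.modules || {};\n" +
    "self.modules[" + JSON.stringify(name) + "] = (function (exports) {\n" +
    source + "\n" +
    "return exports;\n" +
    "})({});\n"
  );
}

// In a real worker the wrapped source would then be loaded via a blob URL:
//   const url = URL.createObjectURL(new Blob([wrapped]));
//   importScripts(url);       // compiles a fresh copy of the source
//   URL.revokeObjectURL(url);
// Each blob URL looks like a distinct source to the engine, which is why
// this pattern defeats source caching.

const wrapped = wrapModuleSource("demo", "exports.answer = 42;");
console.log(wrapped.includes('self.modules["demo"]')); // true
```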
Whiteboard: [MemShrink:P1][tarako][~5MB] → [MemShrink:P1][tarako][~5MB][Async:blocker]
I removed require.js. It reduces script-sources only slightly (about 10%).
Why are we not sharing the script sources cache between the two workers?
(In reply to Nicholas Nethercote [:njn] from comment #17)
> The obvious thing: do we really need two workers? Because a ton of that
> stuff is entirely duplicated. The script-sources are identical, the
> script-data is identical, the atoms-tables are probably identical or almost
> identical, etc, etc.

In the long run, bug 941818 should enable us to share much more of this. We can only share script-sources and bytecode within a runtime, and have one atoms compartment per runtime, so not creating a runtime per worker would get our memory usage down by a lot, I bet.

> 
> Also, I've lost track of the various source/parsing optimizations we have
> going now... can we say "don't bother storing the source for this worker, I
> promise I won't call toSource() on it"?

We can't right now. As nbp says, whether it's a good idea depends on how much lazy parsing and relazification (bug 886193) happens in such a worker: if most of the scripts are going to be run infrequently, not saving the source will cost us memory because it disables relazification and thus collection of JSScripts and their bytecode - which usually is a lot larger than the source.

> Relatedly, I bet that script source
> isn't compressed because we only have one core on these phones.

We should really switch to LZ4 for this compression (bug 837110), at least for the single-core case. While we now have LZ4 in MFBT, I abandoned that project last time around because LZ4 doesn't yet support interruptible compression. However, we could use it right now by compressing JS source in independent chunks - which has the advantage of allowing us to decompress individual chunks instead of the whole source for a file each time a function's source is requested. This is quite a bit of work, however, because we need to teach the parser to deal with the chunks, first.

(In reply to Nicholas Nethercote [:njn] from comment #21)
> Bytecode is either in "script-data" (in the zone) or
> "scripts/malloc-heap/data" (in the compartments)... I don't recall how that
> split works now. Till would know.

It's script-data in the runtime: all bytecode is shared per-runtime, so that identical functions from different globals (say, dozens of jquery usages) share the same bytecode.

> So maybe the lazy-loading isn't working, or isn't enabled in that
> repository, or maybe something is causing it to be loaded?

(In reply to Andreas Gal :gal from comment #24)
> pref("jsloader.reuseGlobal", true); so we are supplying NO_SOURCE here

The other option, LAZY_SOURCE, wouldn't help us here, either: LAZY_SOURCE means that we don't do lazy parsing or relazification, but that the devtools *might* get to the source, and toSource *might* work. A source hook is invoked, but it's not guaranteed to provide the source: if it doesn't, "[no source]" is returned as the function's body.

The only ways to enable lazy functions here would be to change this to SAVE_SOURCE and eat the source memory usage, or use LOCAL_SOURCE after the patches in bug 944659 have landed. Whether that's a good idea remains to be seen, however: the jank caused by retrieving the sources from, maybe on-disk, archives might be too much. nbp already mentioned bug 944659 comment 59.
Status: REOPENED → NEW
(In reply to Andreas Gal :gal from comment #28)
> Why are we not sharing the script sources cache between the two workers?

Because it's not trivial. Ideally, we would share this kind of read-only data between processes, even. Especially on b2g, but also with e10s, that would probably save a lot of memory.

Last May, I proposed a scheme for doing just that in bug 876173. It's certainly even less trivial, so doing more sharing across workers might be a good intermediate step. Or doing bug 941818.
Can you please talk to Naveed about giving this priority? Clearly this is very important for us so we can continue to use JS inside Gecko for FFOS.
Naveed, some of the tasks mentioned here need prioritization.

Specifically, and in no particular order, the following would probably help quite a bit:
- lz4 compression of JS source on single-core systems (bug 837110)
- lazy-parsing for self-hosted JS (bug 900292)
- don't store source for local scripts, but still lazily parse them (bug 944659)
- sharing read-only data between workers or even processes (no bug for the first, bug 876173 for the second)
- Implement DOM workers using JS helper threads (bug 941818)

Theoretically, I could work on the first two, but that'd mean neglecting Shumway stuff. The third is almost ready, but might not work out performance-wise. The last two are large projects, and probably things someone from the e10s team would be best suited for.
Flags: needinfo?(nihsanullah)
Thanks for the summary, Till. Fabrice also mentioned bug 960894 which will merge N RIL workers into one; that bug appears to be making good progress and so is likely to be the first big win here.
Also, you are all wonderful people. Thanks for looking into this.
Removing the [tarako] whiteboard tag; for tarako we will track bug 960894.
Whiteboard: [MemShrink:P1][tarako][~5MB][Async:blocker] → [MemShrink:P1][~5MB][Async:blocker]
I think this is basically covering the same ground as bug 916021. Since that bug has more dependencies, I'm going to mark this as a duplicate of that one.
Status: NEW → RESOLVED
Closed: 8 years ago → 7 years ago
No longer depends on: 916021, 941783
Resolution: --- → DUPLICATE
Duplicate of bug: 916021
> I think this is basically covering the same ground as bug 916021. Since that
> bug has more dependencies, I'm going to mark this as a duplicate of that one.

Actually, this bug is more general, since JSRuntime is only one part of a worker's memory consumption. So I'll reverse the direction of the duplication shortly, assuming nobody objects. Sorry for the bugspam.
Blocks: 941783
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
See Also: → 962585
Depends on: 970177
For those who aren't CC'd to the blockers, bhackett's been making some great progress on worker runtime size in bug 964057 and bug 964059. See bug 964059 comment 13 for some numbers.
This seems like less of a problem now, and lots of the blocking bugs have been fixed... downgrading to MemShrink:P2. Please comment if you disagree.
Whiteboard: [MemShrink:P1][~5MB][Async:blocker] → [MemShrink:P2][~5MB][Async:blocker]
Flags: needinfo?(nihsanullah)