Memory usage of b2g process rises significantly when loading web page, due to gc-able screenshots in browser compartment

RESOLVED FIXED in Firefox 18

Status

()

defect
RESOLVED FIXED
7 years ago
7 years ago

People

(Reporter: cjones, Unassigned)

Tracking

unspecified
mozilla19
ARM
Gonk (Firefox OS)
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(blocking-basecamp:+, firefox18 fixed, firefox19 fixed)

Details

(Whiteboard: [MemShrink:P1][slim:>10MB])

Attachments

(2 attachments, 2 obsolete attachments)

STR
 (1) Note idle USS of b2g process.  It should be in the neighborhood of 60-65MB.
 (2) Open browser, load nytimes.
 (3) Note usage of b2g process while nytimes loads.  The USS goes as high as 80-85MB.

The USS of the process loading nytimes stays between 25-35MB.

This is not all expected, and might explain a lot of lowmemkills of network-loaded content.

This should be compared with b2g process USS while loading packaged apps.
Hah!  Caught you!

Memory usage went up to almost 90MB in the b2g process.  There's a smattering of things that are high but known but then

│  │  ├──16.55 MB (19.84%) -- window(app://browser.gaiamobile.org/index.html)
│  │  │  ├──16.02 MB (19.22%) -- js/compartment(app://browser.gaiamobile.org/index.html)
│  │  │  │  ├──14.95 MB (17.93%) ── string-chars

HOLY MOTHER OF GOD WHAT'S THAT?!?!

Are we holding onto full-resolution page snapshots as data: URIs somewhere perhaps?
Closing the browser app reclaims this memory.
Whiteboard: [MemShrink]
> Are we holding onto full-resolution page snapshots as data: URIs somewhere perhaps?

I thought that must have been it, but I commented out the screenshot code in the browser and I still see the large string-chars value.

It seems to vary quite a bit; I saw it at 1.7mb once (before I disabled screenshots), and now I see it at 10mb (after I disabled screenshots).
This actually may be a GC issue.

I ran ./get-about-memory.py && ./get-about-memory.py --minimize.  The second one runs a GC/CC before collecting the dumps.

Before the GC, I see

│   │  │  ├──3.86 MB (06.45%) -- js/compartment(app://browser.gaiamobile.org/index.html)
│   │  │  │  ├──2.97 MB (04.96%) ── string-chars

and after, I see

│   │  │  ├──0.74 MB (01.33%) ++ js/compartment(app://browser.gaiamobile.org/index.html)
│   │  │  └──0.42 MB (00.75%) -- (4 tiny)
│   │  │     ├──0.24 MB (00.43%) ++ layout
│   │  │     ├──0.12 MB (00.21%) ── style-sheets
│   │  │     ├──0.06 MB (00.11%) ++ dom
│   │  │     └──0.00 MB (00.00%) ── property-tables

the string-chars have completely disappeared.
This also may not be restricted to the browser app.  I see the homescreen using a lot of string-chars sometimes:

│   ├──3.03 MB (15.25%) -- js/compartment(app://homescreen.gaiamobile.org/index.html)
│   │  ├──1.20 MB (06.05%) ── string-chars

unclear what it's doing (I'd guess screenshots again, but we got that wrong once...).
(In reply to Chris Jones [:cjones] [:warhammer] from comment #1)
> Are we holding onto full-resolution page snapshots as data: URIs somewhere
> perhaps?

FWIW, yes we do keep screenshots in the DOM as data URIs (which is what the API provides) for the tabs list. They seem to be between 7k and 70k each depending on the complexity of the web page. We could optimise this by storing scaled down versions of the screenshots, but I don't know whether there is a tradeoff with the resources needed to do the scaling operation so regularly. Also not sure if this should be done in-app using canvas, or on the platform where we pass the size of the screenshot we want in the API call.

However, it sounds like this may not be the issue. I'm not familiar with the tools you're using to analyse resource usage this, but I'm happy to learn more tools if it helps make the browser more performant. I appreciate that the browser has a big responsibility, living right in the main B2G process.

Also, please point out if we're doing anything really dumb!
What types of allocations does "string-chars" cover, exactly?
Most JS strings consist of a header and then a pointer to the array of characters. "string-chars" adds up the sizes of all the char arrays. The headers are counted in a different category, "gc-heap/strings" I think.

In general you can mouse over each thing in about:memory and it gives a little summary.
Summary: Memory usage of b2g process rises significantly when loading web page → Memory usage of b2g process rises significantly when loading web page, due to strings-chars
Summary: Memory usage of b2g process rises significantly when loading web page, due to strings-chars → Memory usage of b2g process rises significantly when loading web page, due to string-chars
billm suggested that we could debug this by doing a gc dump and then looking through the strings therein.  If we have 17mb of strings, it shouldn't be too hard to find the offenders.
Bill/Andrew, which of the GC/CC dump expressions should I be using here?  Is it the allTraces() one?

https://wiki.mozilla.org/Performance:Leak_Tools#Cycle_collector_heap_dump
It doesn't matter since you only are going to be using the GC part of the dump. The one without allTraces is faster.
You can view the strings with something like

> $ zcat gc-edges.13705.log.gz | grep string | awk '{print length,$0}' | sort -rn | vim -

(This will sort the strings in reverse order by length.)

I don't see anything obvious in here.  Just a lot of strings.

I'll see if I can get a gc dump when we have 10+mb of strings (that's trickier to reproduce).  Maybe that will reveal something.
In this dump, the main process has 10mb of string-chars, 8.9mb of which are due to the browser app compartment.  (I loaded nytimes.com in the browser to generate this dump.)

Note that bug 799536 is fixed in the build which I used to generate this data.
Attachment #669573 - Attachment is obsolete: true
Using the commands in comment 12 on your latest log only generates a log that is about 1.2MB, so I'm not sure what else is going on. (It is obviously only going to be a rough metric, but it should be closer.) Strings must be getting truncated in the dumping code. Maybe patching that would get a more accurate picture, if in fact it is a few huge strings that are the problem.  I can try looking for that.
Since it's /still/ not clear to me which strings are coming from the browser compartment (even though I know that 90% of them weighted by number of chars are from the browser), I'm going to try to annotate the dumps with the compartment URI.

> Strings must be getting truncated in the dumping code.

Ah, that's possible!  I can fix that.
Also, if it helps, you can easily get a string's compartment as follows:

char buffer[256];
(*rt->compartmentNameCallback)(rt, string->compartment(), buffer, sizeof(buffer));

You could add this to JS_GetTraceThingInfo for every string.
> Also, if it helps, you can easily get a string's compartment as follows:

That helps a lot; thanks.
Hm...maybe our reporting or the gc dump is wrong somehow.

I don't see many megabytes of strings associated with the browser compartment in the dump I'll attach momentarily.  In fact, I see only 90 strings.  All but two have length less than 100; the other two have lengths 378 and 2014.

I do see some long strings in this dump.  The settings app has two 44kb base-64-encoded jpegs, and the system app has a handful of 7kb pngs (presumably screenshots).  That's bad, but not this bad.

The sum of all the string lengths in the dump I'll attach momentarily is 524668.  At 2 bytes per char, that's ~1mb.  But my about:memory dump says the main process has 7mb of string-chars.  (That's true in about:memory dumps I took before and after the gc/cc dump.)

The plot thickens...
billm and mccr8 informed me that the gc dump only shows /reachable/ objects, which explains what's going on.  (Shows how much I know.)

When I force a GC/CC, the string chars fall down to 1mb in the main process, just as expected from comment 19.

There are still some problems here, though:

1) We're allocating 10mb of GC'able strings in the main process / browser-app for some unknown reason.
2) We don't trigger a GC after these allocations.  billm says that it currently takes 32mb to trigger a GC, which explains this.

(1) may be due to aggressive screenshotting, although I thought I ruled that out in comment 3.  billm is writing a patch to iterate over all live strings, so we can figure this out.

For (2), we probably want to decrease this high water mark -- 32mb is far too high.  I filed bug 799656 on this.  We may also want to do something else -- for example, the content process will GC 5s after a page finishes loading, so maybe we want to do the same in the main process.  But we can wait until we see where these strings are coming from before we make that decision.
(In reply to Justin Lebar [:jlebar] from comment #21)
> billm and mccr8 informed me that the gc dump only shows /reachable/ objects,
> which explains what's going on.  (Shows how much I know.)
> 
> When I force a GC/CC, the string chars fall down to 1mb in the main process,
> just as expected from comment 19.
> 
> There are still some problems here, though:
> 
> 1) We're allocating 10mb of GC'able strings in the main process /
> browser-app for some unknown reason.
> 2) We don't trigger a GC after these allocations.  billm says that it
> currently takes 32mb to trigger a GC, which explains this.

> 
> For (2), we probably want to decrease this high water mark -- 32mb is far
> too high.  I filed bug 799656 on this.  

This only solves the problem if we allocate a ton of long strings. Short strings don't modify the water mark trigger. For them we would have to tighten the GC heap growth.
billm gave me http://pastebin.mozilla.org/1862246 to dump all the strings.  I'll try it as soon as I can.
Summary: Memory usage of b2g process rises significantly when loading web page, due to string-chars → Memory usage of b2g process rises significantly when loading web page, due to gc-able string-chars in the browser compartment
Looks like it is screenshots, and I messed something up in comment 3.  I see 6 base-64 PNGs with length 775510 in the browser.

I need to respin with a huger buffer so that I can actually open one of these images, but what else could it possibly be?
6 * 0.75MB = 4.5MB.  Does that account for what you're seeing in about:memory?  I was seeing ~15MB in comment 1.

It's possible we have different numbers of screenshots, or there are copies (ugh!) floating around.
The patch I sent to Justin prints the length of a string in characters, but a jschar is 2 bytes.
Hmm ... 320 x 480 x 4 (RGBA) is 614400 bytes.  Base64-encoded would 819200.  So it seems we're either not really compressing (which is OK), or we're hitting some large allocation class.  It's strange that all the screenshots are the same size.
(In reply to Bill McCloskey (:billm) from comment #26)
> The patch I sent to Justin prints the length of a string in characters, but
> a jschar is 2 bytes.

Ah, ok.  So up to 9MB.

But in comment 27, that means we're allocating 1.55MB for screenshots that would be at maximum 819200 uncompressed and base64 encoded.  Ouch!
Oh right, the base64 encoding would be 2 bytes per character.  /me sobs
bad screenshots!

Opening the browser right after a clean flash doesn't impact the main b2g process.
After visiting 4 sites, rebooting the phone and starting the browser, the main b2g process increases by 15-25MB without loading a single site.
I know nothing about whether we need these screenshots or where they come from. However, typed arrays might be a better way to store them. You wouldn't have to worry about sizeof(jschar) or about base64ing anything.
Copying the screenshots out of IDB probably explains the remainder of the overhead.

bent, sicking can IDB store typed arrays directly?
jlebar, does the browser screenshot API allow specifying a scale or maximum dimension?
(In reply to Chris Jones [:cjones] [:warhammer] from comment #33)
> jlebar, does the browser screenshot API allow specifying a scale or maximum
> dimension?

Not at the moment, but it seems obvious we're going to have to add it.  We'll probably also want to do lossy compression on them.

Everything is going through the messagemanager atm, so typed arrays is not necessarily an easy thing to do.  But we have options.

I think all of the screenshots are the same size because they're identical.  But I've been caught up with other things and haven't had a chance to actually look at them.
Yes, IDB supports storing both ArrayBuffers and ArrayBufferViews directly.

Will that help a lot though? If we're not GCing, then ArrayBuffers won't be freed either.
Interestingly, it looks like the settings app holds the homescreen background as a string (to the tune of ~45kb).  That string may or may not go away on gc; I'm not sure.

Anyway, yes, these are screenshots.  980x1118, RGBA, lossless PNGs.  I think we can improve on this a bit.
(In reply to Justin Lebar [:jlebar] from comment #36)
> Anyway, yes, these are screenshots.  980x1118, RGBA, lossless PNGs.  I think
> we can improve on this a bit.

Also base64 encoded in 2-byte jschar strings.

o_O

I would shoot for smaller / better encoding (typed array) before lossy compression.

Updated

7 years ago
Blocks: 796644
Whiteboard: [MemShrink] → [MemShrink][slim:>10MB]
Khuey just landed canvas-as-blob on inbound.  bug 648610.  Cool.
Depends on: 648610
Whiteboard: [MemShrink][slim:>10MB] → [MemShrink:P1][slim:>10MB]
Whiteboard: [MemShrink:P1][slim:>10MB] → [MemShrink][slim:>10MB]
Whiteboard: [MemShrink][slim:>10MB] → [MemShrink:P1][slim:>10MB]
We could also do arraybuffers if you need it.  That would let you do manual memory management ...
No longer depends on: 648610
There are essentially three things we need to do here, I think.

1) Reduce the dimensions of the screenshots.  That's bug 800170.
2) Reduce the overhead of storing the screenshots.  That's bug 802647.
3) Reduce the number of unreachable screenshot objects which sit around in memory.

There are a number of different ways we can accomplish (3).  Bug 802662 and bug 802665 are two options, and we may want to try both.

Alternatively, we could explore what Kyle suggested in comment 40, which involves making the screenshot objects mutable so we can do manual memory management on them.  The trick with that approach is that we pass these objects through the message manager, and the semantics of passing a mutable object that way aren't entirely clear to me.

We also need to convert the ArrayBuffer into a blob before we display it.  I presume this makes a copy of the buffer?  At that point, we're kind of screwed anyway.
blocking-basecamp: --- → ?
Reducing startup time and mem usage is p1 blocker, and this is the biggest known win left.
blocking-basecamp: ? → +
Summary: Memory usage of b2g process rises significantly when loading web page, due to gc-able string-chars in the browser compartment → Memory usage of b2g process rises significantly when loading web page, due to gc-able screenshots in browser compartment
I'm going to call this fixed; I can't force Gecko to create tons of blobs/screenshot strings in my testing.  If we find new bugs, we can reopen this metabug, or we can just block on slim-fast.
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED
Not convinced that this shouldn't be WORKSFORME, but meh. Setting the tracking flags so it at least doesn't show up in my bug queries for bb+ bugs needing Aurora landing.
Target Milestone: --- → mozilla19
You need to log in before you can comment on or make changes to this bug.