Closed Bug 1436615 Opened 8 years ago Closed 7 years ago

Lots of memory used for activity stream PNGs encoded as data URIs

Tracking

()

Status:

RESOLVED FIXED

Milestone:

Firefox 62

Iteration:

62.2 - Jun 4

Tracking Flags:

Tracking

Status

firefox62

---

fixed

People

(Reporter: mccr8, Assigned: imjching)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [MemShrink:P1][fxperf:p3])

Attachments

(2 files)

example part of about:memory using ~900MB for about:newtab 8 years ago Randell Jesup [:jesup] (needinfo me) 31.81 KB, text/plain		Details
Link to GitHub pull-request: https://github.com/mozilla/activity-stream/pull/4150 7 years ago GitHub Bugzilla PR Linker 52 bytes, text/x-github-pull-request		Details \| Review

Andrew McCreight [:mccr8]

Reporter

Description

•

8 years ago

I was looking in about:memory, and I see megabytes of strings that look like they are based 64 encoded data URIs for PNGs. The largest one is 1MB. I'm not entirely sure what all of these are, but I extracted the largest string, and another smaller one, pasted them into my URL bar, and they were both images from my about:newtab page. edited down, the about:memory entries look like this: 261.90 MB (100.0%) -- explicit ├───58.07 MB (22.17%) ++ window-objects ├───46.15 MB (17.62%) -- js-non-window │ ├──30.12 MB (11.50%) -- zones │ │ ├──26.46 MB (10.10%) -- zone(0x7f97bfe79000) │ │ │ ├───9.20 MB (03.51%) ++ compartment([System Principal], shared JSM global) │ │ │ ├───7.22 MB (02.76%) ++ (22 tiny) │ │ │ ├───7.04 MB (02.69%) -- strings │ │ │ │ ├──1.64 MB (00.62%) ++ string(<non-notable strings>) │ │ │ │ ├──1.00 MB (00.38%) ++ string(length=573258, copies=1, "data:image/png;base64,[...]" (truncated)) │ │ │ │ ├──0.50 MB (00.19%) ++ string(length=286406, copies=1, "data:image/png;base64,[...]" (truncated)) │ │ │ │ ├──0.50 MB (00.19%) ++ string(length=313370, copies=1, "data:image/png;base64,[...]" (truncated)) │ │ │ │ ├──0.50 MB (00.19%) ++ string(length=393482, copies=1, "data:image/png;base64,[...]" (truncated)) │ │ │ │ ├──0.50 MB (00.19%) ++ string(length=480114, copies=1, "data:image/png;base64,[...]" (truncated)) │ │ │ │ ├──0.25 MB (00.10%) ++ string(length=143278, copies=1, "data:image/png;base64,[...]" (truncated)) These images shouldn't be kept around in such a memory inefficient format.

Eric Rahm [:erahm]

Comment 1

•

8 years ago

I've seen similar entries recently, for example in bug 1427594, comment 6.

Tim Spurway [:tspurway]

Updated

•

8 years ago

Flags: needinfo?(usarracini)

Ursula Sarracini (:ursula)

Comment 2

•

8 years ago

Getting rid of data URIs are definitely something we've wanted to do for a while. Bug 1184701 would let us use the moz-page-thumbs protocol which would mean we could just use img tags with the src being the URL, and then we could ditch all these data URIs. In order for us to make the protocol work in the content process and with Activity Stream, we'd need to sort out bug 1385306 first. If bug 1385306 gets done and we want Activity Stream working with the protocol, we need to get moz-page-thumbs:// working with either a codeBase principal or a null principal, which means we need to let the protocol be loadable by anyone or somehow special case about:newtab, which brings into question security. There's still a lot of questions around what the best approach is here... maybe using indexedDB and Blobs would avoid all of this? Overall there's a non trivial amount of work to get us into a state where we don't use data URIs but it's on our radar and we definitely want to do this.

Flags: needinfo?(usarracini)

Tim Spurway [:tspurway]

Updated

•

8 years ago

Priority: -- → P3

Ed Lee :Mardak

Comment 3

•

8 years ago

Potentially we can avoid some message passing of the full data uris if we cache them in indexedDB. Although now there would be copies of the data uri there and additional messaging and caching overhead.

Eric Rahm [:erahm]

Comment 4

•

8 years ago

(In reply to Ursula Sarracini (:ursula) from comment #2) > Getting rid of data URIs are definitely something we've wanted to do for a > while. Bug 1184701 would let us use the moz-page-thumbs protocol which would > mean we could just use img tags with the src being the URL, and then we > could ditch all these data URIs. In order for us to make the protocol work > in the content process and with Activity Stream, we'd need to sort out bug > 1385306 first. If bug 1385306 gets done and we want Activity Stream working > with the protocol, we need to get moz-page-thumbs:// working with either a > codeBase principal or a null principal, which means we need to let the > protocol be loadable by anyone or somehow special case about:newtab, which > brings into question security. There's still a lot of questions around what > the best approach is here... maybe using indexedDB and Blobs would avoid all > of this? > > Overall there's a non trivial amount of work to get us into a state where we > don't use data URIs but it's on our radar and we definitely want to do this. Can you just use blobs?

Whiteboard: [MemShrink] → [MemShrink:P1]

Eric Rahm [:erahm]

Updated

•

8 years ago

Updated

•

8 years ago

Whiteboard: [MemShrink:P1] → [MemShrink:P1][fxperf]

Ed Lee :Mardak

Updated

•

8 years ago

See Also: → https://github.com/mozilla/activity-stream/issues/2324

Mike Conley (:mconley) (:⚙️)

Updated

•

8 years ago

Whiteboard: [MemShrink:P1][fxperf] → [MemShrink:P1][fxperf:p3]

Mike Conley (:mconley) (:⚙️)

Comment 6

•

8 years ago

ni?ing over to ursula for comment 4

Flags: needinfo?(usarracini)

Ursula Sarracini (:ursula)

Comment 7

•

8 years ago

(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #4) > Can you just use blobs? Can you expand on what you mean here? Can we send blobs over IPC? Store the blobs in indexedDB?

Flags: needinfo?(usarracini) → needinfo?(erahm)

Eric Rahm [:erahm]

Comment 8

•

8 years ago

(In reply to Ursula Sarracini (:ursula) from comment #7) > (In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment > #4) > > Can you just use blobs? > > > Can you expand on what you mean here? Can we send blobs over IPC? Store the > blobs in indexedDB? I'm not really a blob expert, but yes IPC supports them [1 for example] and IDB seems to as well [2]. This is a somewhat old mozilla hacks article [3], but I think it covers the basics of blob support in idb and using createObjectURL to get a URL you can use to map an img's src attribute to a blob. Let me know if I can help out further, odds are I'll redirect you to someone with more knowledge though :) [1] https://searchfox.org/mozilla-central/rev/f5fb323246bf22a3a3b4185882a1c5d8a2c02996/dom/file/ipc/IPCBlobUtils.h#12-210 [2] https://searchfox.org/mozilla-central/rev/f5fb323246bf22a3a3b4185882a1c5d8a2c02996/dom/indexedDB/ActorsParent.cpp#6708-6736 [3] https://hacks.mozilla.org/2012/02/storing-images-and-files-in-indexeddb/

Flags: needinfo?(erahm)

Andrew Sutherland [:asuth] (he/him)

Comment 9

•

8 years ago

Blob enthusiast here. They key features of Blobs are that they are a handle on immutable cross-global memory that, under ideal conditions, does not duplicate the memory used when sent via postMessage(). That is, if you postMessage(k1MegabyteString) to 10 separate globals, you will end up with 1 megabyte used in each global. You will also put 1 megabyte of data over the wire if IPC is involved. If you do that a second time, each of those globals will end up with ANOTHER copy of the string and there will be the same wire overhead. However, if you do `const blobString = new Blob([k1MegabyteString])` and postMessage that blob: - Bad news: creation of the blob will probably create its own copy of the string. So you need to drop references to the string and let it be GC'ed if you want to be a responsible memory user. - Good news: Under ideal situations, at most one copy of that backing memory will end up in each process, and the wire size of the blob for IPC is super tiny. Because when you postMessage the Blob around, you are just postMessaging a handle to the underlying memory. - Also good news: Many APIs are happy to provide you with a Blob directly so you don't need to round-trip through JS strings or ArrayBuffers. For example, HTMLCanvasElement.toBlob() is more efficient than toDataURL(). If you need to manipulate Blobs, you can create composite Blobs from bits and pieces of other blobs using Blob.slice() and new Blob(["a string maybe", anArrayMaybe, otherBlob, otherOtherBlob.slice(100, 200)]). You may have detected some hand-waving there about "under ideal conditions". The short story of that is: - IndexedDB-sourced Blobs are perfectly ideal. They are references to on-disk files. Passing them around never duplicates anything in memory. The catch here is that if you do an IDB put({ fooBlob }) and then an IDB get() that returns `diskBackedFooBlob`, `fooBlob` is still a memory-backed Blob, not the ideal disk-backed `diskBackedFooBlob`. - Because of Blob/File semantics, it is possible for us to further optimize things so that everything is an ideal condition. For example, we have a bug on file to morph `fooBlob` so that it is the same as `diskBackedFooBlob` after the IDB transaction commits. I think there may also be some cross-process optimizations for memory-backed Blobs that IDB already does that we could adopt in general. Additionally, we have a mechanism to spill huge blobs generated by XHR and MediaRecorder to temporary files on disk, etc. etc.

Andrew Sutherland [:asuth] (he/him)

Comment 10

•

8 years ago

One last thing to mention that may be relevant here: IndexedDB file-backed Blobs are reference counted over all the values stored in an IndexedDB database and in memory which means free data de-duplication and cleanup. That is: const sitePic = await asyncCanvasToBlobUsingPromises(canvas); idb.put({ key: "visit-1", image: sitePic }); // later, using either the same sitePic from memory or retrieved from IDB idb.put({ key: "visit-2", image: sitePic }); On disk, we store both values as structured clones that reference the single on-disk Blob. Once both values are removed from the database and all in-memory references to the IndexedDB-sourced Blobs are forgotten, we will purge the sitePic from disk. This is done using SQLite transactions and is fully atomic. This means that if you're dealing with large images that may be referenced by multiple object structures, you don't need to build your own management layer that does its own reference counting or garbage collection sweeps or dealing with edge-cases related to crashes while your accounting was mutating something non-atomically. You get it for free. Plus, you don't have to worry if there are any in-memory references to the picture that might break/etc. Caveats: - De-duplication happens based on object identity, not content-addressible hashing or something magic. If you new Blob(), you're going to get another copy of the Blob on disk. - Each IDB database is its own island. If you get() a blob from IDB database A and put() it into IDB databse B, there will be two copies on disk.

Randell Jesup [:jesup] (needinfo me)

Comment 11

•

8 years ago

Attached file example part of about:memory using ~900MB for about:newtab — Details

This 900MB is basically all the memory used in that Content Process.... also, what's up with two about:newtabs with different Id's? (perhaps a newtab is actually open) - and what's with the 3rd copy in js-non-window??

GitHub Bugzilla PR Linker

Comment 12

•

7 years ago

Attached file Link to GitHub pull-request: https://github.com/mozilla/activity-stream/pull/4150 — Details

Ed Lee :Mardak

Updated

•

7 years ago

Assignee: nobody → jlim

Iteration: --- → 62.2 - Jun 4

Priority: P3 → P1

Ed Lee :Mardak

Updated

•

7 years ago

Blocks: 1445085

Ed Lee :Mardak

Comment 14

•

7 years ago

> what's up with two about:newtabs with different Id's? (perhaps a newtab is actually open) - and what's with the 3rd copy in js-non-window?? about:newtab is preloaded per window so that the next new tab will be immediately shown. mconley has been working on ways to be smarter about when to preload in bug 1353013.

Comment 15

•

7 years ago

Commits pushed to master at https://github.com/mozilla/activity-stream https://github.com/mozilla/activity-stream/commit/fc7a8a59059136ff23a25f9fecb2cf8df0d3180f Bug 1436615 - Lots of memory used for activity stream PNGs encoded as data URIs Signed-off-by: Jay Lim <jlim@mozilla.com> https://github.com/mozilla/activity-stream/commit/a645823cbed8b3a202a25d9f9b5027d2a4dbe860 Merge pull request #4150 from imjching/master Bug 1436615 - Lots of memory used for activity stream PNGs encoded as data URIs

Tim Spurway [:tspurway]

Updated

•

7 years ago

Status: NEW → RESOLVED

Closed: 7 years ago

Resolution: --- → FIXED

Ed Lee :Mardak

Updated

•

7 years ago

Blocks: 1466971

Ed Lee :Mardak

Comment 16

•

7 years ago

https://hg.mozilla.org/mozilla-central/rev/c9ae7e47a6ef

status-firefox62: --- → fixed

Target Milestone: --- → Firefox 62

Ed Lee :Mardak

Updated

•

7 years ago

Blocks: 1467485

Nobody; OK to take it and work on it

Updated

•

6 years ago

Component: Activity Streams: Newtab → New Tab Page

Ed Lee :Mardak

Updated

•

6 years ago

Updated

•

5 years ago

Blocks: 1647486

You need to log in before you can comment on or make changes to this bug.