Closed Bug 1477213 Opened 6 years ago Closed 6 years ago

Reduce memory usage of gNameToHistogramIDMap in content process

Categories

(Toolkit :: Telemetry, enhancement, P1)

enhancement
Points:
2

Tracking

()

RESOLVED FIXED
mozilla63
Tracking Status
firefox63 --- fixed

People

(Reporter: janerik, Assigned: janerik)

References

(Blocks 1 open bug)

Details

(Whiteboard: [overhead:130k])

Attachments

(3 files, 2 obsolete files)

bug 1475557 comment 2 shows that |gNameToHistogramIDMap| is using a significant amount of memory as it is dynamically allocated and filled for every process.

Its initialization is based on fully static data (known at compile time).
It is immutable after initialization.

It should be possible to reduce dynamic memory allocation here, especially in content processes
Blocks: 1475557
Assignee: nobody → jrediger
Status: NEW → ASSIGNED
Priority: -- → P1
A first PoC using a Perfect Hash Table generated at compile-time shows the reduce of memory usage across all processes (bug 1475557 comment 4).

This try run[1] has a second PoC implementation using the (already vendored) PHF library in Rust. Telemetry tests (and xpcshell tests in general) are still passing.

[1]: https://treeherder.mozilla.org/#/jobs?repo=try&revision=0b44c356754453f737a2327e3d8d9a8dcf1fc583

This drops the `cpp_guard` from histograms, leading to a slight increase in (unused) storage slots (5 Kb), but reduces the name->id map to nothing (it's all static at compile time now, a drop of 130 Kb.
This means it's a total reduce of 125 Kb for every process.
There are currently 42 `cpp_guard`s defined in Histograms.yaml
I couldn't find any code that actually relies on guarded histograms to exist/not exist. We might be able to drop them.
Blocks: 1475556
No longer blocks: 1475557
By using a hash table based on a perfect hash function, generated at build time,
we can avoid dynamic memory allocation for every process at start.

Downside:
This drops the "cpp_guard" in order to have a fixed and stable
number of histograms to put into the hash table.
This slightly increases the allocated memory for storage slots, but the
savings for dropping the dynamic hash table are outweighing.

It looks like no one actually depends on the cpp_guards to exclude histograms at runtime.
The map is now immutable and thus we don't need exclusive access to read it.


Depends On D2319
I don't want to block this bug, but we already have an in-tree DAFSA implementation for handling static look-ups like this:

https://searchfox.org/mozilla-central/source/xpcom/ds/make_dafsa.py

And Nika is adding a perfect hash table implementation in bug 1477432.

It would be nice if we didn't need to ship 3 different static lookup table implementations...
Nathan pointed out that we already depend on the Rust phf crates in Stylo, so I guess I withdraw my objection.
Nathan brought up another point about Rust phf: Its data structures require relocations, and while relocated data is generally shared on Windows, it's not currently shared on Linux or OS-X.

DAFSA doesn't have that problem, and should also generate much more compact data structures for tables with lots of common prefixes and suffixes, like histogram IDs.
Thanks, :kmag. I initially used rust-phf as it was in-tree. If we have another more suitable thing already I'm fine with switching over to it. I'll take a look at `make_dafsa.py`
make_dafsa.py is not usable as it only maps to {0,1,2,4}.
We do have perfecthash.py[1] in tree (and the required minimal C++ part for it as well). It works as a replacement as well.

[1]: https://searchfox.org/mozilla-central/source/xpcom/reflect/xptinfo/perfecthash.py
(In reply to Jan-Erik Rediger [:janerik] from comment #8)
> make_dafsa.py is not usable as it only maps to {0,1,2,4}.

I know. I'm going to have to change that for another bug.

In any case, I think the Rust solution is good enough for now, but switching to one of the other options might be good as a follow-up.
Depends on: 1479484
Attachment #8994502 - Attachment is obsolete: true
Attachment #8994503 - Attachment is obsolete: true
`cpp_guard` was only used for operating system checks.
We can check this value in our Python build scripts and exclude histograms that shouldn't be included for the target OS.
By using a hash table based on a perfect hash function, generated at build time,
we can avoid dynamic memory allocation for every process at start.


Depends On D2927
Is there a follow-up bug filed yet for using `platforms` over `cpp_guard` for Scalars.yaml and Events.yaml?
Comment on attachment 8998477 [details]
Bug 1477213 - Remove now-empty shallow sizes. r?gfritzsche

Georg Fritzsche [:gfritzsche] has approved the revision.
Attachment #8998477 - Flags: review+
Blocks: 1482912
> const char* bytes = PromiseFlatCString(aKey).get();
> size_t length = aKey.Length();

Yuck, we should be able to do better than that. =/
:froydnj: What do you suggest? Shouldn't this be rather cheap? AIUI the nsACString should already be null-terminated 8-bit chars and length will give us the number of bytes there.
Flags: needinfo?(nfroyd)
nsACString is explicitly *not* null-terminated; it's a character buffer with an explicit length.  Which is all this code really wants, so there's no need to use PromiseFlagCString, which will perform a bunch of needless work, possibly including allocation.

The pattern in comment 17 is also explicitly warned against in xpcom/string/nsTPromiseFlatString.h.  We should rewrite the code generator first.
Flags: needinfo?(nfroyd)
Thanks, I'll rewrite to not use the PromiseFlatCString then.
(In reply to Jan-Erik Rediger [:janerik] from comment #20)
> Thanks, I'll rewrite to not use the PromiseFlatCString then.

The canonical method to get the the raw string in this case is |BeginReading|.
(In reply to Nathan Froyd [:froydnj] from comment #19)
> nsACString is explicitly *not* null-terminated; it's a character buffer with
> an explicit length.

Nit: It's explicitly *not guaranteed to be* null-terminated. But it may be, and callers can check IsTerminated() to find out if it is.

Not that it matters here, since the caller doesn't care whether it's null terminated or not.

That said, this code is super dangerous:

  const char* bytes = PromiseFlatCString(aKey).get();

If aKey were not already null terminated, this would create a copy, and that copy would be destroyed at the end of the expression, leaving `bytes` pointing to data which has already been freed.

I'm kind of surprised our Clang plugin doesn't complain about this...
Points: --- → 2
Try run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f21f62f681b4c3e081a71d992cb09cf40f7a9387
Gist updated with current generated code.

Fixed the dangerous use of PromiseFlatCString as well as the other review comments.
Comment on attachment 8998475 [details]
Bug 1477213 - Replace cpp_guard with operating_systems for compile-time OS check. r?gfritzsche

Georg Fritzsche [:gfritzsche] has approved the revision.
Attachment #8998475 - Flags: review+
Comment on attachment 8998476 [details]
Bug 1477213 - Replace Name->ID map with pre-generated hash table. r?gfritzsche

Georg Fritzsche [:gfritzsche] has approved the revision.
Attachment #8998476 - Flags: review+
Comment on attachment 8998476 [details]
Bug 1477213 - Replace Name->ID map with pre-generated hash table. r?gfritzsche

Kris Maglione [:kmag] has approved the revision.
Attachment #8998476 - Flags: review+
Pushed by jrediger@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/600af91dfc45
Replace cpp_guard with operating_systems for compile-time OS check. r=gfritzsche
Pushed by jrediger@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/fcd411537143
Replace Name->ID map with pre-generated hash table. r=kmag,gfritzsche
Pushed by jrediger@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/afeaf5a5abe3
Remove now-empty shallow sizes. r=gfritzsche
https://hg.mozilla.org/mozilla-central/rev/600af91dfc45
https://hg.mozilla.org/mozilla-central/rev/fcd411537143
https://hg.mozilla.org/mozilla-central/rev/afeaf5a5abe3
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Whiteboard: [overhead:130k]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: