Open Bug 1441736 Opened 2 years ago Updated 5 months ago

Look into ways of reducing the memory atoms use

Categories

(Core :: DOM: Core & HTML, enhancement, P3)

enhancement

Tracking

()

People

(Reporter: bzbarsky, Unassigned)

References

(Depends on 2 open bugs, Blocks 1 open bug)

Details

(Whiteboard: [overhead:meta])

Attachments

(2 files)

No description provided.
Attached patch test.patchSplinter Review
Logging patch.
Bug 1257126 shows some previous work in this direction; it's a mixture of memory use reduction and code clean-ups.

Bug 1441430 will give us more detail about atom memory usage in about:memory.

- Bug 1411469 is about statically allocating static atoms. On 64-bit it could save up to 150 KiB per process. I have some draft patches from a couple of months ago. They touch a *lot* of code and when I was working on them browser startup was badly broken for reasons unknown.

- Various ideas for dynamic atoms:

  - Store 8-bit chars instead of 16-bit chars.

  - Store the chars at the end of the nsAtom struct instead of having a pointer to a separate nsStringBuffer object.

  - Do we really need the mHash field? If it's not used often, we should just recompute it when necessary. (Note: with the current nsAtom layout, removing it won't actually reduce memory usage due to struct padding and mozjemalloc rounding. But it might help in conjunction with other changes.)

  - Do we really need a word-sized refcount? Would 32-bits suffice?

  - On 64-bit a dynamic nsAtom is currently 24 bytes, which mozjemalloc rounds up to 32, and there is also the nsStringBuffer, which is 8 bytes plus at least 2 bytes per char. (And the nsStringBuffer is potentially shared, though I don't know if that's common in practice.) If we implemented all of the above ideas it would drop to 8 bytes plus 1 byte per char in the atom; a lot of them would fit in 16 bytes, and those that don't would get rounded up by mozjemalloc to 32 bytes. Overall this might save up to ~200 KiB(?) in larger processes, and less in smaller ones.

- None of the above help with the atoms hash table.

  - On 64-bit 25% of the space used by the hashtable is padding, because each entry has a layout of [4 bytes hash value; 4 bytes padding; 8 bytes nsAtom pointer]. Changing PLDHashTable to store keys separately from values would fix that, saving about 50 KiB per process. It would also have cache implications, and I'm not sure how it would interact with bug 1402910.

  - Also, using a full pointer to identify an atom when there are a few thousand of them in a process is space-inefficient. It would be nice to do better -- 32 bits per atom? 16 bits per atom? -- but that would likely be difficult because use of `nsAtom*` is so widespread.

With the exception of bug 1411469, all of the above will improve things but still won't get us away from atoms memory usage being proportional to the number of processes. Doing that (via shared memory?) would be the "real" fix.
> Do we really need the mHash field?

We did with the old style system.  It was used a _lot_ for the selector-matching Bloom filter.

I do know servo/components/style/gecko_string_cache/mod.rs still uses mHash.  And it looks to me like each_relevant_element_hash() in servo's Bloom filter uses that, just like Gecko's did.

> And the nsStringBuffer is potentially shared, though I don't know if that's common in practice.

We could measure, but it really depends on workload.  Web pages that get .localName on nodes would definitely share it.
Depends on: 1441430
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #4)
> > Do we really need the mHash field?
> 
> We did with the old style system.  It was used a _lot_ for the
> selector-matching Bloom filter.
> 
> I do know servo/components/style/gecko_string_cache/mod.rs still uses mHash.
> And it looks to me like each_relevant_element_hash() in servo's Bloom filter
> uses that, just like Gecko's did.

Yes, we definitely need it in the new style system.
Depends on: 1442433
> - Bug 1411469 is about statically allocating static atoms. On 64-bit it
> could save up to 150 KiB per process. I have some draft patches from a
> couple of months ago. They touch a *lot* of code and when I was working on
> them browser startup was badly broken for reasons unknown.

In bug 1442433 I am removing the refcount from static atoms. On 64-bit this reduces memory usage by 42 KiB; on 32-bit it has no effect (due to mozjemalloc rounding up request sizes).
I was also contemplating a more aggressive change: completely split the storage of static and dynamic atoms.

- Static atoms would be statically allocated and const (and thus shareable), and stored in a way that lets them be searched without extra heap memory. E.g. put them in a sorted static array and use binary search.

- Dynamic atoms would still be in a hash table, but the hash table would be smaller because it wouldn't contain the static atoms.

- Normal atom lookups would potentially involve searching both structures.

I'm not sure if it's worth the effort, but it's an interesting thought in terms of being as aggressive as possible.
It seems like there are really two separate issues:

(1) Static const atoms shareable atoms. Seems like this is something we definitely want if we can make it work.

(2) Storing static atoms in a separate table. This depends on whether the cost of the second table lookup can be made cheap enough. With 2700 static atoms, we're looking at log2(2700) = ~11 comparisons. Is that cheap enough relative to the other work we do?
(And note that we still need some superclass, because consumers need to be able to operate on an nsAtom* without worrying about whether it's static or dynamic).
Depends on: 1447951
>   - On 64-bit a dynamic nsAtom is currently 24 bytes, which mozjemalloc
> rounds up to 32, and there is also the nsStringBuffer, which is 8 bytes plus
> at least 2 bytes per char. (And the nsStringBuffer is potentially shared,
> though I don't know if that's common in practice.)

I just did some measurements and there is very little sharing going on. I filed bug 1447951.
Depends on: 1449395
Priority: -- → P3
Whiteboard: [overhead:meta]
Component: DOM → DOM: Core & HTML
See Also: → 1541208
You need to log in before you can comment on or make changes to this bug.