Closed Bug 1411469 Opened 8 years ago Closed 7 years ago

Static static atoms

Categories

(Core :: XPCOM, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
mozilla61
Tracking Status
firefox61 --- fixed

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 1 obsolete file)

Here's how static atoms currently look, ignoring the atom tables (64-bit): > static const char16_t foobar_buffer[7] (.rodata, 2(N+1) bytes) > /-----------------------------------------\ <----+ > | u"foobar" // the actual chars | <--+ | > \-----------------------------------------/ | | > | | > nsAtom (heap, 24 bytes, rounds up to 32) | | > /------------------------------------------\ | | > | uintptr_t mRefCnt | | | > | uint32_t mLength: 30 = 6 | | | > | uint32_t mKind: 2 = AtomKind::StaticAtom | | | > | uint32_t mHash = ... | | | > | char16_t* mString = @--------------------|---+ | > \------------------------------------------/ | > | > static nsAtom* MyAtoms::foobar (.bss, 8 bytes) | > /---\ <---------------------------------------+ | > | @-|-----------------------------------------|--| > \---/ | | > | | > static nsStaticAtomSetup (.d.r.ro, 16 bytes) | | > (this is an element in sMyAtomsSetup[]) | | > /-----------------------------------\ | | > | const char16_t* const mString = O-|---------|--+ > | nsAtom** const mAtom = @----------|---------+ > \-----------------------------------/ Bad things: - The nsStaticAtomSetup array is annoying; it's only used at startup. It provides a way of iterating over all the static atoms, and a way of associating the nsAtom pointer to its the buffer. - The nsAtom is heap allocated, even though it's static data that never changes. - On 64-bit bits, the heap allocation request of 24 bytes is rounded up to 32 bytes. Here's a minimal setup (64-bit): > static const char16_t foobar_buffer[7] (.rodata, 2(N+1) bytes) > /-----------------------------------------\ > | u"foobar" // the actual chars | <--+ > \-----------------------------------------/ | > | > nsAtom (which section??, 24 bytes) | > (this is an element in sAtoms[]) | > /------------------------------------------\ | > | uintptr_t mRefCnt | | > | uint32_t mLength: 30 = 6 | | > | uint32_t mKind: 2 = AtomKind::StaticAtom | | > | uint32_t mHash = ... | | > | char16_t* mString = @--------------------|---+ > \------------------------------------------/ Things to note: - Each static nsAtom is actually |static|; no heap allocation. (Hence the bug title of "Static static atoms".) - No need for a separate nsAtom pointer, because we can directly use the address baked into the binary. The static size is unchanged (we drop 8 + 16 bytes, but add 24 bytes) but the heap size reduces by 32 bytes per static atom. (On 32-bit the static size increases by 4 per static atom but the heap size reduces by 16 per static atom.) There are some complications. - We need a way to iterate over all the atoms at startup, in order to insert them into the table. So we have to put the atoms themselves into an array, `sAtoms`. And then to provide direct access to specific atoms within the array, we need an enum `Atoms` that names all the atoms. This means that we have to change references like `nsGkAtoms::foo` to `nsGkAtoms::sAtoms[Atoms::foo]`. That's a bit gross, so we can add getters, so we'll end up with something like `nsGkAtoms::foo()`. All this is definitely doable. - The Rust bindings for nsGkAtoms in servo/components/style/gecko/generated/atom_macro.rs will need changing, but I think that'll be doable. - There's the question of what to do with the mHash field for static atoms. If we can compute them at hash time we might be able to embed them into the binary, and then the nsAtom array could be made read-only, which would be nice. Or we could make the nsAtom array writable and update mHash at start-up when we insert the static atoms into the atom table. - And we want to avoid static constructors for the static nsAtoms.
Attached patch Make static atoms |static| (obsolete) — — Splinter Review
Nathan, I've converted just the static atoms defined by nsDirectoryService.{h,cpp} to the new scheme, as a proof of concept. Things to note. - Lots of "njn:" comments in the patch. A lot of them identify things that can be removed once all atoms have been converted to the new scheme. - There are new NS_STATIC_ATOM_* macros for generating the now-required enum and getters. - I confirmed with objdump that the static data size is unchanged, as expected. But the text size went up by 1256 bytes. Also, the static atom array ends up in the .bss section. I think this means that the static atom array must be getting initialized via static constructors? I was hoping to be able to bake it into the binary as .data. The presence of the ThreadSafeAutoRefCnt field in nsAtom really complicates this; it took me a while to work out how to even initialize sAtoms[] as a literal, and I had to introduce a copy constructor for nsAtom to do so. Any suggestions on how to improve this would be welcome. (BTW, how do I check for static constructors in the binary? I learned this once but have forgotten.) - I'm currently dynamically setting the mHash field in NS_RegisterStaticAtoms2. I'm not sure if doing it at compile time is possible. I'd be happy to hear your thoughts on all this. Thank you.
Attachment #8921713 - Flags: review?(nfroyd)
Assignee: nobody → n.nethercote
Status: NEW → ASSIGNED
(In reply to Nicholas Nethercote [:njn] from comment #1) > - I confirmed with objdump that the static data size is unchanged, as > expected. > But the text size went up by 1256 bytes. Also, the static atom array ends > up > in the .bss section. > > I think this means that the static atom array must be getting initialized > via > static constructors? I was hoping to be able to bake it into the binary as > .data. The presence of the ThreadSafeAutoRefCnt field in nsAtom really > complicates this; it took me a while to work out how to even initialize > sAtoms[] as a literal, and I had to introduce a copy constructor for nsAtom > to do so. Any suggestions on how to improve this would be welcome. The static atom array is getting initialized via static constructors, because nsAtom has a non-trivial destructor, so we need a static constructor to initialize all the nsAtom objects, and then register a static destructor to call the user-defined destructor on all of the objects. It's not clear to me why the compiler couldn't statically initialize the atoms and then just have a static constructor to register the destructor(s), but such is life. (GCC here actually registers a single function that loops over sAtoms and calls ~nsAtom on each one, which is kind of clever.) The only way around this that I can think of is to separate the actual data fields from nsAtom out into a separate class that *doesn't* have a destructor, and have nsAtom inherit from this new class. Then static atoms could be instances of this class, and we'd do some dodgy casting to make all the types work out. > (BTW, how do I check for static constructors in the binary? I learned this > once but have forgotten.) You can check for them like so: readelf -sW $OBJECT_FILE | grep _GLOBAL__sub_I and you'll get output if OBJECT_FILE contains a static constructor. Something like: objdump -dr $OBJECT_FILE | sed -n -e '/<_GLOBAL__sub_I/,/^$/p' | c++filt will dump out the disassembly of the static constructor in an OBJECT_FILE suspected to contain static constructors. > - I'm currently dynamically setting the mHash field in > NS_RegisterStaticAtoms2. > I'm not sure if doing it at compile time is possible. I think this would require making a lot of mfbt/HashFunctions.h constexpr-clean. Then we could just call HashString (or whatever the right function is) and all the computation would happen at compile time. We'd probably have to carefully ensure that the call of HashString happened in a context where it could be constexpr-evaluated, which might be its own little adventure.
Priority: -- → P3
(In reply to Nicholas Nethercote [:njn] from comment #0) > Here's a minimal setup (64-bit): > > > static const char16_t foobar_buffer[7] (.rodata, 2(N+1) bytes) > > /-----------------------------------------\ > > | u"foobar" // the actual chars | <--+ > > \-----------------------------------------/ | > > | > > nsAtom (which section??, 24 bytes) | > > (this is an element in sAtoms[]) | > > /------------------------------------------\ | > > | uintptr_t mRefCnt | | > > | uint32_t mLength: 30 = 6 | | > > | uint32_t mKind: 2 = AtomKind::StaticAtom | | > > | uint32_t mHash = ... | | > > | char16_t* mString = @--------------------|---+ > > \------------------------------------------/ This nsAtom still looks like 32 bytes (8 + 4 + 4 + 4 + 8) on 64-bit, which rounds up to 32?
Comment on attachment 8921713 [details] [diff] [review] Make static atoms |static| Review of attachment 8921713 [details] [diff] [review]: ----------------------------------------------------------------- Ideally the big comment below is informative and insightful. I think the idea is reasonable, but there are issues to consider before scaling this up: 1. Is saving the heap size really important? 2. How much heap size would we save, and would that savings be worth it for the larger .data (or .bss) section? (The considerations for .data are different, because .data figures into binary size on disk, whereas .bss does not.) 3. Can we reduce size in other ways, e.g. eliminating pointers in nsStaticAtomSetup and/or slimming it down? 4. I think the AtomClass::FOO() syntax is a little gross. I'd prefer a solution where we keep AtomClass::FOO. ::: xpcom/ds/nsAtom.h @@ +102,5 @@ > + {} > + > + // This constructor is *only* used for initializing static atoms, which are > + // |static|, and so it must be public. > + nsAtom(const nsAtom&& aOther) Why is this constructor necessary? ::: xpcom/ds/nsAtomTable.cpp @@ +628,5 @@ > } > > +// njn: rename this once all static atoms are converted to the new form > +// njn: if I manage to do the hashes statically, aAtoms can become |nsAtom* > +// aAtoms|. Is that `const nsAtom* aAtoms`? ::: xpcom/ds/nsStaticAtom.h @@ +28,5 @@ > +// enum class Atoms { // This enum must be called "Atoms". > +// #define MY_ATOM(name_, value_) NS_STATIC_ATOM_ENUM(name_) > +// #include "MyAtomList.h" > +// #undef MY_ATOM > +// COUNT You may want to make this value a little more distinguishing so as to not accidentally collide. @@ +101,5 @@ > + > +// Getter for the atom. The enum must be called |Atoms| and the atoms array > +// must be called |sAtoms| for this to work. > +#define NS_STATIC_ATOM_GETTER(name_) \ > + nsAtom* name_() { return &sAtoms[static_cast<size_t>(Atoms::name_)]; } I tried reworking static atoms a while back, being offended that we had: struct nsStaticAtomSetup { const char16_t* const mString; nsAtom** const mAtom; }; The `mAtom` member is unnecessary overhead, because we have storage for the atom (8 bytes) and we have an indirection to the storage (8 bytes). That indirection also requires a relocation on Linux (24 bytes on x86-64, 8 bytes on 32-bit platforms), which makes it really expensive! (The `mString` member has the same issues with relocations, but is trickier to address.) So I tried setting things up so static atom definers would have: struct nsStaticAtomSetup { const char16_t* const mString; }; static const nsStaticAtomSetup sSetupStorage[...]; static nsAtom* sAtomStorage[...]; and then we'd pass *both* of those in to NS_RegisterStaticAtoms. Less indirection is better, and we save a pointer-sized thing per static atom, plus a relocation for that pointer. So I had similar problems to what you're trying to do here, because I no longer had individually-addressable named atom locations. I wound up doing the roughly equivalent: #define NS_STATIC_ATOM_GETTER(name_) \ nsAtom* const name_ = &sAtoms[static_cast<size_t>(Atoms::name_)]; and that meant that I didn't have to modify a bunch of code that used the atoms. I did have to sprinkle `const` around various places to placate the compiler, but that seemed better than making named entities require function calls to access. Note too that we do things like &nsGkAtoms::FOO in various places, and doing &nsGkAtoms::FOO() is not going to work out too well. I was not ambitious enough to try to stop that. I think the compiler was mostly smart enough to inline uses of `name_` and resolve the address arithmetic at compile-time...though I also see that I had to define a class something like: // We had `nsAtom* sAtomStorage[]` because I was not ambitious enough to try // to make things really static, I was trying for size reductions along other // dimensions. struct AtomProxy { explicit constexpr AtomProxy(size_t i) : mIndex(i) {} operator nsAtom*() const { return sAtomStorage[mIndex]; } nsAtom* operator->() const { return sAtomStorage[mIndex]; } nsIAtom** operator&() const { return &sAtomStorage[mIndex]; } const size_t mIndex; }; and then generate variables like: const nsGkAtoms::AtomProxy nsGkAtoms::name_(nsGkAtoms::AtomsEnum::name_); which IIRC was to make static constructors go away; I can't remember which compiler was motivating that decision. Maybe things have improved since then. GCC and (I think) clang could generate good code by not requiring storage for AtomProxy::mIndex, but MSVC, IIRC, could not do quite as good a job. That may have been with MSVC 2013, though. Perhaps you have already discovered layout/style/{nsCSSAnonBoxes,nsCSSPseudoElements}.{h,cpp}, which do their own interesting things with nsAtom, and which required some special handling.
Attachment #8921713 - Flags: review?(nfroyd)
> > > nsAtom (which section??, 24 bytes) | > > > (this is an element in sAtoms[]) | > > > /------------------------------------------\ | > > > | uintptr_t mRefCnt | | > > > | uint32_t mLength: 30 = 6 | | > > > | uint32_t mKind: 2 = AtomKind::StaticAtom | | > > > | uint32_t mHash = ... | | > > > | char16_t* mString = @--------------------|---+ > > > \------------------------------------------/ > > This nsAtom still looks like 32 bytes (8 + 4 + 4 + 4 + 8) on 64-bit Did you overlook the bitfields? mLength+mKind is only 4 bytes, so it's 8 + 4 + 4 + 8 = 24. On 32-bit it's 4 + 4 + 4 + 4 = 16.
Ah, the bitfields! That's what I was missing.
> 1. Is saving the heap size really important? I'm trying to get things down to the absolute minimum. For each atom, that's: - the chars - the nsAtom object itself - the entry in gAtomTable Both for conceptual simplicity and for memory savings. (This bug would do most of that; bug 529808 will get rid of the entry in gStaticAtomTable.) The advantage of not heap allocating the nsAtom object is that we don't need a separate nsAtom pointer, saving 8 bytes per static atom. > 2. How much heap size would we save, and would that savings be worth it for > the larger .data (or .bss) section? (The considerations for .data are > different, because .data figures into binary size on disk, whereas .bss does > not.) As per comment 0: > The static size is unchanged (we drop 8 + 16 bytes, but add 24 bytes) but the > heap size reduces by 32 bytes per static atom. (On 32-bit the static size > increases by 4 per static atom but the heap size reduces by 16 per static > atom.) > 3. Can we reduce size in other ways, e.g. eliminating pointers in > nsStaticAtomSetup and/or slimming it down? Yes, but the savings will be smaller. > 4. I think the AtomClass::FOO() syntax is a little gross. I'd prefer a > solution where we keep AtomClass::FOO. So would I, but we need an array of *something* for the iteration at start-up. If it's not the nsAtom objects themselves in the array, then it's something else, which takes up extra space. > > + // This constructor is *only* used for initializing static atoms, which are > > + // |static|, and so it must be public. > > + nsAtom(const nsAtom&& aOther) > > Why is this constructor necessary? I couldn't get it to compile without it. The array initialization required the use of copy or move constructors. > > +// njn: rename this once all static atoms are converted to the new form > > +// njn: if I manage to do the hashes statically, aAtoms can become |nsAtom* > > +// aAtoms|. > > Is that `const nsAtom* aAtoms`? Yes! > You may want to make this value a little more distinguishing so as to not > accidentally collide. Sure. > I tried reworking static atoms a while back, being offended that we had: > > [...] Right, we have the same motivation :) > I wound up doing the roughly equivalent: > > #define NS_STATIC_ATOM_GETTER(name_) \ > nsAtom* const name_ = &sAtoms[static_cast<size_t>(Atoms::name_)]; > > and that meant that I didn't have to modify a bunch of code that used the > atoms. At the cost of an extra pointer per static atom. > Note too that we do things like &nsGkAtoms::FOO in various places, and doing > &nsGkAtoms::FOO() is not going to work out too well. I was not ambitious > enough to try to stop that. Interesting. That may be a more compelling problem with the getter functions than the extra parentheses. > Perhaps you have already discovered > layout/style/{nsCSSAnonBoxes,nsCSSPseudoElements}.{h,cpp}, which do their > own interesting things with nsAtom, and which required some special handling. Yeah, they already require their own versions of the NS_STATIC_ATOM_* macros, nothing much changes there.
Just to follow up our face-to-face conversation: > #define NS_STATIC_ATOM_GETTER(name_) \ > nsAtom* const name_ = &sAtoms[static_cast<size_t>(Atoms::name_)]; So this should give us the best of both worlds: - a direct handle to each atom, so no need for a getter method; - it shouldn't take up any space (assuming a sufficiently smart compiler, which GCC apparently is).
I'm making progress here, but I've hit some interesting link errors. Here's a representative case: > mScratchArray.AppendElement(nsGkAtoms::insertafter); where mScratchArray is an nsTArray<RefPtr<nsAtom>>. Here's the link error: > 0:22.86 /home/njn/moz/mi2/layout/xul/tree/nsTreeBodyFrame.cpp:0: error: undefined reference to 'nsGkAtoms::insertafter' Most references to nsGkAtoms members are fine. The common thing shared by all the link errors is that they involve an nsTArray::AppendElement() call. The signature of the relevant function is this: > template<class Item, typename ActualAlloc = Alloc> > elem_type* AppendElement(Item&& aItem); I suspect there's some kind of problem with passing the address (defined as per comment 8) as an rvalue reference argument. Introducing a temporary is enough to fix it. Or even just a cast! > mScratchArray.AppendElement((nsStaticAtom*)nsGkAtoms::insertafter); I would add a comment explaining the cast, but there are are about 40 such callsites, and I don't want to repeat the comment that many times... froydnj, any thoughts about this? Can you see any other way around this?
Flags: needinfo?(nfroyd)
I don't have any good ideas right now. What does the assembly look like for a function that has this sort of call in it?
Flags: needinfo?(nfroyd)
I have hit another interesting pickle, what appears to be an MSVC bug. I have this: > static nsStaticAtom sAtoms[static_cast<size_t>(Atoms::AtomsCount)]; > > #define NS_STATIC_ATOM_HANDLE(name_) \ > static constexpr nsStaticAtom* const name_ = &sAtoms[static_cast<size_t>(Atoms::name_)]; MSVC's value for name_ is computed as if sizeof(nsStaticAtom) is 1, when really it's 24 (on 64-bit). Unsurprisingly, this breaks the world. I tried changing it to the following, but got the same result: > static constexpr nsStaticAtom* const name_ = sAtoms + static_cast<size_t>(Atoms::name_); In both cases if I multiply the index by sizeof(nsStaticAtom) it works correctly. But obviously that's bogus and won't work on other platforms. I also tried casting to char* and doing the pointer arithmetic in 1 byte chunks, but reinterpret_casts aren't allowed in constexpr expressions. So I'm stuck. I can't think of any other way to formulate this expression. Nathan, any thoughts?
Flags: needinfo?(nfroyd)
I believe you can do: _name = static_cast<nsStaticAtom*>(static_cast<void*>(static_cast<char*>(static_cast<void*>(&sAtoms[0])) + sizeof(nsStaticAtom) * static_cast<size_t>(Atoms::name_))) (That is, carefully convert to a char* -- AFAICT from the standard, static_cast is usable in constexpr -- then do the arithmetic, then convert back via static_cast. I think that works, but I have not tried it.) This is...um, horrible. Have you filed a bug with Microsoft?
Flags: needinfo?(nfroyd)
It works! Thank you. I'll report the bug to Microsoft if I can make a small test case that reproduces.
Actually... it works in MSVC and GCC, but clang doesn't like it. I have this: > #define NS_STATIC_ATOM_HANDLE(type_, name_) \ > static constexpr type_* const name_ = \ > static_cast<type_*>( \ > static_cast<void*>( \ > static_cast<char*>(static_cast<void*>(sAtoms)) + \ > sizeof(type_) * static_cast<size_t>(Atoms::name_) \ > ) \ > ); In reference to the `static_cast<char*>`, clang says "cast from 'void *' is not allowed in a constant expression". (Though I tried the same thing in a small test program and it *doesn't* complain there, I don't understand why.) Item 13 in http://en.cppreference.com/w/cpp/language/constant_expression suggests that clang is correct, though I'm no language lawyer. Nathan, any other ideas?
Flags: needinfo?(nfroyd)
> (Though I tried the same thing in a small test program and it *doesn't* complain there, I don't understand why.) Oh, I was using GCC when I was compiling that. clang does complain about this test program.
I guess I can do the obvious thing for non-MSVC compilers, and then use the horrible thing for MSVC.
(In reply to Nicholas Nethercote [:njn] from comment #16) > I guess I can do the obvious thing for non-MSVC compilers, and then use the > horrible thing for MSVC. And then hope that MS fixes the array index bug before they tighten up the use of void* casts in constexpr. O_o
I guess casts from void* not being allowed makes sense, otherwise the lack of reinterpret_cast is trivially worked around. I don't have any ideas beyond hacking around things for MSVC. Does the bad behavior happen for both MSVC 2015 and 2017?
Flags: needinfo?(nfroyd)
The MSVC bug is known (see bug 1408695).
(In reply to Nicholas Nethercote [:njn] from comment #0) > - We need a way to iterate over all the atoms at startup, in order to insert > them into the table. So we have to put the atoms themselves into an array, Could this be achieved with linker tricks, ala kPStaticModules? (In reply to Nicholas Nethercote [:njn] from comment #1) > (BTW, how do I check for static constructors in the binary? I learned this > once but have forgotten.) Please check Windows builds as well, since MSVC can have a mind of its own about these things. There's probably a nicer way to do this with dumpbin, but I normally just attach windbg and do "x xul!*initializer*YOURVARIABLE*" and see if it gets hits like "xul!mozilla::dom::`dynamic initializer for 'gSetDOMProxyInformation'' (void)" (In reply to Nathan Froyd [:froydnj] from comment #12) > static_cast<nsStaticAtom*>(static_cast<void*>(static_cast<char*>(static_cast< > void*>(&sAtoms[0])) + sizeof(nsStaticAtom) * > static_cast<size_t>(Atoms::name_))) Why the need to route through void*? Could you just cast from type => char* => type? (There's probably an obvious reason but my coffee hasn't kicked in.)
> > - We need a way to iterate over all the atoms at startup, in order to insert > > them into the table. So we have to put the atoms themselves into an array, > > Could this be achieved with linker tricks, ala kPStaticModules? Yes, but I prefer to avoid linker tricks when possible. Also, there are some O(n) membership tests that can be converted to O(1) "is it within the array bounds?" tests. I'm not sure if that would be possible with the linker tricks? > Please check Windows builds as well, since MSVC can have a mind of its own > about these things. Ok. Note that I still don't have this working -- on Linux I get a browser window but no web content rendering. > Why the need to route through void*? Could you just cast from type => char* > => type? (There's probably an obvious reason but my coffee hasn't kicked in.) That requires reinterpret_cast, which is forbidden in constexpr.
Blocks: 1441292
Depends on: 1443706
Depends on: 1444031
Attachment #8921713 - Attachment is obsolete: true
In case it helps, this is how static atoms currently work: > static const char16_t foobar_buffer[7] (.rodata, 2(N+1) bytes) > /-----------------------------------------\ <----+ > | u"foobar" // the actual chars | <--+ | > \-----------------------------------------/ | | > | | > nsAtom (heap, 16 bytes) | | > /------------------------------------------\ | | <-+ > | uint32_t mLength: 30 = 6 | | | | > | uint32_t mKind: 2 = AtomKind::StaticAtom | | | | > | uint32_t mHash = ... | | | | > | char16_t* mString = @--------------------|---+ | | > \------------------------------------------/ | | > | | > static nsAtom* MyAtoms::foobar (.bss, 8 bytes) | | > /---\ <---------------------------------------+ | | > | @-|-----------------------------------------|------+ > \---/ | | | > | | | > static nsStaticAtomSetup (.d.r.ro, 16 bytes) | | | > (this is an element in sMyAtomsSetup[]) | | | > /-----------------------------------\ | | | > | const char16_t* const mString = @-|---------|--+ | > | nsStaticAtom** const mAtom = @----|---------+ | > \-----------------------------------/ | > | > AtomTableEntry (heap, ~2 x 16 bytes[*]) | > (this entry is part of gAtomTable) | > /-------------------------\ | > | uint32_t mKeyHash = ... | | > | nsAtom* mAtom = @-------|--------------------------+ > \-------------------------/ > > [*] Each hash table is half full on average, so each entry takes up > approximately twice its actual size. > > static shared bytes: 2(N+1) > static unshared bytes: 8 + 16 = 24 > heap bytes: 16 + ~32 = ~48 > total bytes: 2(N+1) + ~72 * num_processes And this is how they work with the attached patch: > static constexpr PodStaticAtom<7> foobar_atom > (.rodata, RoundUp(8 + 2(N+1), 4) bytes) > /--------------------------------------\ <-------+ > | uint32_t mLength: 30 = 6 | | > | uint32_t mKind: 2 = AtomKind::Static | | > | uint32_t mHash = ... | | > | u"foobar" | | > \--------------------------------------/ | > | > static nsAtom* MyAtoms::foobar (.bss, 8 bytes) | > /---\ <---------------------------------------+ | > | @-|-----------------------------------------|--+ > \---/ | | > | | > static nsStaticAtomSetup (.d.r.ro, 16 bytes) | | > (this is an element in sMyAtomsSetup[]) | | > /-----------------------------------\ | | > | nsStaticAtom* const mAtom = @-----|---------|--+ > | nsStaticAtom** const mAtomp = @---|---------+ | > \-----------------------------------/ | > | > AtomTableEntry (heap, ~2 x 16 bytes[*]) | > (this entry is part of gAtomTable) | > /-------------------------\ | > | uint32_t mKeyHash = ... | | > | nsAtom* mAtom = @-------|----------------------+ > \-------------------------/ > > [*] Each hash table is half full on average, so each entry takes up > approximately twice its actual size. > > static shared bytes: RoundUp(8 + 2(N+1), 4) > static unshared bytes: 8 + 16 = 24 > heap bytes: ~32 > total bytes: RoundUp(8 + 2(N+1), 4) + ~56 * num_processes This isn't as reduced as comment 0 suggested, but it's enough to satisfy the title of this bug. Removing the other bits can be follow-ups.
I did some measurements with `file`. It says that the "text" size (which I assume includes .rodata) has increased by 19304 bytes, and the other segments changed only negligibly. This is good! We have ~2700 prefs, and I was expecting a roughly 2700*8 = 21600 byte increase.
> This isn't as reduced as comment 0 suggested, but it's enough to satisfy the title of this bug. Removing the other bits > can be follow-ups. I realized something important about that most aggressive approach. We have a number of atoms that are registered twice. E.g. some or all of the nsHTMLTags ones also appear in nsGkAtoms: "abbr", "acronym", etc. In the face of such duplication, if we use .rodata addresses we will get major problems, because we will be using atoms with different addresses but the same string. Currently we don't have problems because we have a level of indirection, i.e. the MyAtoms::foobar pointer in comment 23's picture. If we have MyAtoms::foobar and YourAtoms::foobar, they'll both end up pointing to the same atom (whichever one gets registered first). In order to safely remove the level of indirection, we'd have to eliminate atom duplication, which would involve disallowing this case: https://searchfox.org/mozilla-central/source/xpcom/ds/nsAtomTable.cpp#706
Comment on attachment 8957077 [details] Bug 1411469 - Statically allocate static atoms. Clearing review. I have some ideas how to make this better.
Attachment #8957077 - Flags: review?(nfroyd)
We are going to require VS 2017 15.6 that fixed the constexpr pointer math bug.
Comment on attachment 8957077 [details] Bug 1411469 - Statically allocate static atoms. https://reviewboard.mozilla.org/r/226034/#review234060 Code analysis found 2 defects in this patch: - 2 defects found by clang-tidy You can run this analysis locally with: - `./mach static-analysis check path/to/file.cpp` (C/C++) If you see a problem in this automated review, please report it here: http://bit.ly/2y9N9Vx ::: xpcom/ds/nsAtomTable.cpp:105 (Diff revision 2) > > MOZ_ASSERT(mString[mLength] == char16_t(0), "null terminated"); > MOZ_ASSERT(Equals(aString), "correct data"); > } > > -// This constructor is for static atoms. > +nsDynamicAtom::nsDynamicAtom(const nsAString& aString) Error: Bad implicit conversion constructor for 'nsdynamicatom' [clang-tidy: mozilla-implicit-constructor] ::: xpcom/ds/nsAtomTable.cpp:114 (Diff revision 2) > - > MOZ_ASSERT(mString[mLength] == char16_t(0), "null terminated"); > - MOZ_ASSERT(NS_strlen(mString) == mLength, "correct storage"); > + MOZ_ASSERT(Equals(aString), "correct data"); > } > > -nsAtom::~nsAtom() > +nsDynamicAtom::~nsDynamicAtom() Warning: Use '= default' to define a trivial destructor [clang-tidy: modernize-use-equals-default]
> > -// This constructor is for static atoms. > > +nsDynamicAtom::nsDynamicAtom(const nsAString& aString) > > Error: Bad implicit conversion constructor for 'nsdynamicatom' [clang-tidy: > mozilla-implicit-constructor] This is bogus -- the declaration in nsAtom.h has `explicit` on it. > > -nsAtom::~nsAtom() > > +nsDynamicAtom::~nsDynamicAtom() > > Warning: Use '= default' to define a trivial destructor [clang-tidy: > modernize-use-equals-default] This is also bogus -- the destructor is not trivial.
Blocks: 1447951
Comment on attachment 8957077 [details] Bug 1411469 - Statically allocate static atoms. https://reviewboard.mozilla.org/r/226034/#review236114 This is surprisingly straightforward. ::: commit-message-9a217:16 (Diff revision 2) > +The change to static atoms means they can be made constexpr and stored in > +read-only memory instead of on the heap. On 64-bit this reduces the per-process > +overhead by 16 bytes; on 32-bit the saving is 12 bytes. (Further reductions > +will be possible in follow-up patches.) To be clear, the per-process overhead is reduced by 16 bytes *per atom*, yes? All of this work for 16 bytes of reduction would not seem worth it otherwise. :)
Attachment #8957077 - Flags: review?(nfroyd) → review+
> This is surprisingly straightforward. That's not the reaction I was expecting! But I'm glad you think so. My follow-up patch to remove nsStaticAtomSetup is... less straightforward :) > To be clear, the per-process overhead is reduced by 16 bytes *per atom*, > yes? All of this work for 16 bytes of reduction would not seem worth it > otherwise. :) Per atom, yes.
I created https://github.com/servo/servo/pull/20433 for the Servo part of this.
(In reply to Nicholas Nethercote [:njn] from comment #34) > I created https://github.com/servo/servo/pull/20433 for the Servo part of > this. And the attached commit is ready for autolanding once that PR lands.
Comment on attachment 8957077 [details] Bug 1411469 - Statically allocate static atoms. https://reviewboard.mozilla.org/r/226034/#review236498 Code analysis found 2 defects in this patch: - 2 defects found by clang-tidy You can run this analysis locally with: - `./mach static-analysis check path/to/file.cpp` (C/C++) If you see a problem in this automated review, please report it here: http://bit.ly/2y9N9Vx ::: xpcom/ds/nsAtomTable.cpp:105 (Diff revision 3) > > MOZ_ASSERT(mString[mLength] == char16_t(0), "null terminated"); > MOZ_ASSERT(Equals(aString), "correct data"); > } > > -// This constructor is for static atoms. > +nsDynamicAtom::nsDynamicAtom(const nsAString& aString) Error: Bad implicit conversion constructor for 'nsdynamicatom' [clang-tidy: mozilla-implicit-constructor] ::: xpcom/ds/nsAtomTable.cpp:114 (Diff revision 3) > - > MOZ_ASSERT(mString[mLength] == char16_t(0), "null terminated"); > - MOZ_ASSERT(NS_strlen(mString) == mLength, "correct storage"); > + MOZ_ASSERT(Equals(aString), "correct data"); > } > > -nsAtom::~nsAtom() > +nsDynamicAtom::~nsDynamicAtom() Warning: Use '= default' to define a trivial destructor [clang-tidy: modernize-use-equals-default]
Pushed by ecoal95@gmail.com: https://hg.mozilla.org/integration/autoland/rev/51793742d828 Statically allocate static atoms. r=froydnj
(In reply to Code Review Bot [:reviewbot] from comment #36) > ::: xpcom/ds/nsAtomTable.cpp:105 > (Diff revision 3) > > > > MOZ_ASSERT(mString[mLength] == char16_t(0), "null terminated"); > > MOZ_ASSERT(Equals(aString), "correct data"); > > } > > > > -// This constructor is for static atoms. > > +nsDynamicAtom::nsDynamicAtom(const nsAString& aString) > > Error: Bad implicit conversion constructor for 'nsdynamicatom' [clang-tidy: > mozilla-implicit-constructor] This is https://github.com/mozilla-releng/services/issues/999, I think. > ::: xpcom/ds/nsAtomTable.cpp:114 > (Diff revision 3) > > - > > MOZ_ASSERT(mString[mLength] == char16_t(0), "null terminated"); > > - MOZ_ASSERT(NS_strlen(mString) == mLength, "correct storage"); > > + MOZ_ASSERT(Equals(aString), "correct data"); > > } > > > > -nsAtom::~nsAtom() > > +nsDynamicAtom::~nsDynamicAtom() > > Warning: Use '= default' to define a trivial destructor [clang-tidy: > modernize-use-equals-default] I filed https://github.com/mozilla-releng/services/issues/1006 for this issue.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
> > I guess I can do the obvious thing for non-MSVC compilers, and then use the > > horrible thing for MSVC. > > And then hope that MS fixes the array index bug before they tighten up the > use of void* casts in constexpr. O_o Oh dear: MSVC 15.6.2 disallows void* casts, but hasn't fixed the indexing bug. Sigh.
> Oh dear: MSVC 15.6.2 disallows void* casts, but hasn't fixed the indexing bug. Sigh. For posterity: Bryce filed https://developercommunity.visualstudio.com/content/problem/223146/constexpr-code-gneration-bug.html over in bug 1445089.
Depends on: 1455178
Depends on: 1481555
See Also: → 1541208
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: