Closed Bug 196506 Opened 22 years ago Closed 22 years ago

Excessive inlining in string libs: Substring()

Categories

(Core :: XPCOM, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: dwitte, Assigned: jag+mozilla)

References

Details

Attachments

(1 file, 3 obsolete files)

So during my rewrite of cookie code, alecf noticed that one patch in particular had increased codesize by ~4k. This was determined to be due to converting cookie code to use the string classes instead of char*. We kinda let that one lie, but I've just finished another patch for cookies which increases codesize by ~3k again. After digging around, mvl found the culprit... Substring(). It turns out that Substring() is inlined. Substring() calls either nsDependent{C}Substring or nsDependentSingleFragment{C}Substring, whose ctors are themselves inlined. Which call other ctors/functions that are inlined. This would be fine if the resulting functions were small, but it turns out that each instance of any of the above three functions are 200-300bytes in opt builds! nsDependent{C}Substring: ~200 bytes per instance nsDependentSingleFragment{C}Substring: ~300 bytes per instance Substring: either of the above, depending what type is handed to it We did some further testing and found that un-inlining the two ctors mentioned above, results in a ~57k codesize reduction. mvl is unable to reliably measure any impact on Ts/Tp/Txul (which jag recommended measuring), due to too much fluctuation. However, a simple test on a 1,000,000-iteration |while| loop that calls Substring() on an nsString and an nsCString, showed: non-inline is 0.6% slower than inline (with a standard deviation of ~0.3%, averaged over 6 runs). Details: platform is linux, compiler is gcc 3.2.2 debian prerelease, compiled with -o2. This can be expected to make even less difference in "real" code, and could even result in a speedup due to less icache blowage etc, but that's purely speculation... my point is that we shouldn't blindly expect unlining to reduce performance, without testing it out. I'm attaching a patch (used to get that 57k reduction figure) that uninlines all four ctors mentioned above. Since it's been difficult to get reliable Ts/Tp/Txul figures on dev boxes, it might be worth checking this into the tree and letting the tboxes get a few build cycles in, just to see what difference it makes. What do you guys think?
Attached patch uninlining patch (obsolete) — Splinter Review
Blocks: 188224
Comment on attachment 116670 [details] [diff] [review] uninlining patch it'd be nice if the lines tried to honor 80cols :)
Attachment #116670 - Flags: review+
Checked in for dwitte. We'll watch tinderbox to check for issues.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
We have issues. bbaetz kindly checked in during the wee hours of Monday morning, to see what impact the patch will have across the platforms. However, it killed all the clbr builds (dep was fine), due to a linker error... embedstring.lib(nsDependentSubstring.obj) : error LNK2001: unresolved external symbol "unsigned int __cdecl Distance(class nsReadingIterator<char> const &,class nsReadingIterator<char> const &)" (?Distance@@YAIABV?$nsReadingIterator@D@@0@Z) no idea what's going on here, perhaps we need to add nsStringIteratorUtils.h or nsReadableUtils.h to the string embedding makefile, string/embed/Makefile.in? BTW, we got a few good tbox results: a) from linux comet dep: -78k codesize, *no change* in Txul/Ts (if anything, Txul dropped by an amount slightly outside the experimental error, but this could be noise)... b) from MacOSX monkey dep: no change in Tp (within exp error), perhaps a slight increase in Ts (~1%?) c) WINNT beast dep: no change in Txul, inconclusive data for Ts (fluctuates all over the place). interestingly, codesize was +755b on this platform (->0 after stripping?), so i guess this compiler already doesn't inline those functions.
reopening
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
nsEmbedString is a very small implementation of the AString and ACString. If you un-inline stuff, you may have to make a copy of that method in string/embed/.
ah, okay. I'm not very familiar with string/embed - can you explain why it broke there (if you know)? What would I need to change - just add something to a makefile, or copy-paste the function impl's completely?
Basically, string/embed has a small implementation of the AString and ACString. It compiles the bare minimun source files in string/* to implement these two classes. Now, if you change an inline method in a header file file, the embed string implementation of this method may no longer be available to the library or the consumer of the library. Therefore, you might have to copy this no-longer-inline method into the string/embed library.
Attached patch fix embedstring (obsolete) — Splinter Review
Patch which includes copying Distance() into nsEmbedString.cpp I am not sure if it is the right thing to do. Maybe it would be better to create a new file?
Comment on attachment 116784 [details] [diff] [review] fix embedstring this actually seems like the right thing to do, as long as you add a comment explaining where the code came from. sr=alecf as long as the r= is from dougt
Attachment #116784 - Flags: superreview+
I just had another thought about this - is there any way we could just stop inling Substring()? If substring (and not EVERY use of nsDependentString) is the cause of the bloat, then it would make sense to make Substring() a non-inline call, and then nsDependentString would be inlined inside of that. That said, if this causes no noticable performance degredation, then we should surely stop inlining nsDependentString, as the patch does.
Attached patch Include comment (obsolete) — Splinter Review
Now with comment. Carrying forward sr=
Attachment #116784 - Attachment is obsolete: true
Attachment #116795 - Flags: superreview+
Attachment #116795 - Flags: review?(dougt)
Initially I thought it'd be nice to uninline all of them, for consistency (they're the same functions really...), but I was hoping you guys'd have a comment on that. Looks like we're getting no degradation (luna even showed some slight improvements), so doesn't look like it's a big deal. If the iterator-based ctors are small (the length-based ones aren't), maybe we could leave those as inline.
Attachment #116670 - Attachment is obsolete: true
Attachment #116795 - Flags: review?(dougt) → review+
Speaking of inlining, has anyone tried to inline ns{C}AutoString constructors? That might be a perf win.
I don't think inlining those constructors would be a win -- I think there's enough inside of them that we can't inline the whole thing, and, given that, we'd be better off leaving them non-inline and perhaps inlining or simplifying the things inside of them (see, e.g., bug 188828).
okay, this caused bustage again - even though it was tested on linux. This is strange. the link error looks like libembedstring and libstring are picking up multiple definitions of Distance() now, instead of not enough (as previously). they're finding them in nsReadableUtils.o - looks like we need some Makefile changes to go along with this patch. embedstring_s.lib(nsEmbedString.obj) : error LNK2005: "unsigned int __cdecl Distance(class nsReadingIterator<unsigned short> const &,class nsReadingIterator<unsigned short> const &)" (?Distance@@YAIABV?$nsReadingIterator@G@@0@Z) already defined in string_s.lib(nsReadableUtils.obj) embedstring_s.lib(nsEmbedString.obj) : error LNK2005: "unsigned int __cdecl Distance(class nsReadingIterator<char> const &,class nsReadingIterator<char> const &)" (?Distance@@YAIABV?$nsReadingIterator@D@@0@Z) already defined in string_s.lib(nsReadableUtils.obj) embedstring_s.lib(nsEmbedString.obj) : warning LNK4006: "unsigned int __cdecl Distance(class nsReadingIterator<unsigned short> const &,class nsReadingIterator<unsigned short> const &)" (?Distance@@YAIABV?$nsReadingIterator@G@@0@Z) already defined in string_s.lib(nsReadableUtils.obj); second definition ignored embedstring_s.lib(nsEmbedString.obj) : warning LNK4006: "unsigned int __cdecl Distance(class nsReadingIterator<char> const &,class nsReadingIterator<char> const &)" (?Distance@@YAIABV?$nsReadingIterator@D@@0@Z) already defined in string_s.lib(nsReadableUtils.obj); second definition ignored Creating library mozilla.lib and object mozilla.exp mozilla.exe : fatal error LNK1169: one or more multiply defined symbols found dougt, is there any preferred way you'd like us to fix this?
Attached patch new patchSplinter Review
after talking with dbaron, we agreed that all that a solution was to mask out the the CalculateLength, Distance_Impl, and Distance(*) methods from the non-standalone version of the embedstring. To do this, I added a new build flag STRING_STANDALONE which is included defined in the string/embed/standalone makefile. If this define is true, the CalculateLength, Distance_Impl, and Distance(*) methods are compilied into the lib, otherwise they are not.
Attachment #116795 - Attachment is obsolete: true
Comment on attachment 117648 [details] [diff] [review] new patch sounds great to me. sr=alecf, assuming r=dbaron
Attachment #117648 - Flags: superreview+
Comment on attachment 117648 [details] [diff] [review] new patch >Index: embed/Makefile.in >=================================================================== >RCS file: /cvsroot/mozilla/string/embed/Makefile.in,v >retrieving revision 1.1 >diff -u -1 -0 -r1.1 Makefile.in >--- embed/Makefile.in 14 Jan 2003 03:34:58 -0000 1.1 >+++ embed/Makefile.in 18 Mar 2003 22:53:19 -0000 >@@ -91,11 +91,11 @@ > GARBAGE += $(addprefix $(srcdir)/,$(STRING_LCSRCS)) > endif > > SRCS_IN_OBJDIR = 1 > > include $(topsrcdir)/config/rules.mk > > export:: $(STRING_CSRCS) > $(INSTALL) $^ . > >-DEFINES += -DXPCOM_GLUE >+DEFINES += -DXPCOM_GLUE Better not to add extra whitespace at the end (which is what this change is). Also, I would slightly prefer that the macro be called EMBEDSTRING_STANDALONE rather than STRING_STANDALONE, but if you want to leave it that's fine as well. r=dbaron
Attachment #117648 - Flags: review+
neat, thanks a lot for fixing it up dougt :) let's hope it was worth it, after all that. by the way - now that this low-hanging fruit has been picked, do you have any opinion on optimizing any of the string impls? it just irks me that these beasts are 300+ bytes to start with, when you can do the same thing in char* for probably 4. maybe one day when i'm really bored, i'll start playing in string/ ...
third time lucky, though not without bustage.
Status: REOPENED → RESOLVED
Closed: 22 years ago22 years ago
Resolution: --- → FIXED
Dan Witte wrote: > third time lucky, though not without bustage Is anyone looking at the "nebiros" bustage ? AFAIK it misses a small |#include "nsReadableUtils.h"| to go back to "green" state again...
Component: String → XPCOM
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: