Closed Bug 206682 Opened 22 years ago Closed 22 years ago

ToNewUTF8String is inefficient

Categories

(Core :: XPCOM, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla1.5alpha

People

(Reporter: dbaron, Assigned: dbaron)

Details

(Whiteboard: [patch])

Attachments

(2 files, 10 obsolete files)

37.37 KB, patch
jag+mozilla
: review+
Details | Diff | Splinter Review
28.46 KB, patch
Details | Diff | Splinter Review
ToNewUTF8String is inefficient -- it consructs an nsCAutoString temporary and does a double-copy. Patch coming.
Attached patch patch (obsolete) — Splinter Review
compiled but not yet tested (the rest of my tree is still rebuilding)
My build seems fine, and runs. The test code: NS_NAMED_LITERAL_STRING(str, "foo bar aoeunhc uech eunthoue noeu neuonhc"); for (int i = 0; i < 10000000; ++i) { delete ToNewUTF8String(str); } used to take 8.6 seconds and now takes 6.7 seconds and the test code: NS_NAMED_LITERAL_STRING(str, "foo bar aoeunhc uech eunthoue noeu neuonhc"); NS_NAMED_LITERAL_STRING(str2, "foo"); for (int i = 0; i < 10000000; ++i) { delete ToNewUTF8String(str + str2); } used to take 11.6 seconds and now takes 6.7 seconds.
Comment on attachment 123936 [details] [diff] [review] patch >+ // D800- DBFF - High Surrogate >+ // N = (H- D800) *400 + 10000 + ... >+ PRUint32 ucs4 = 0x10000 | ((0x03FF & c) << 10); You have to use '0x10000 + ((0x03FF & c) << 10)' to avoid hiding the 17th bit set by '((0x03FF & c) << 10)'. I made the same mistake before :-) >+ if (! (c & 0xFF80)) >+ mSize += 1; >+ else if (! (c & 0xF800)) >+ mSize += 2; Wouldn't it be nicer to add some comments as below? The same is true of other places where these if-clause are used. >+ if (! (c & 0xFF80)) // U+0000 - U+007F >+ mSize += 1; >+ else if (! (c & 0xF800)) // U+0100 - U+07FF >+ mSize += 2;
Ah, will do. I just moved that code, though. :-)
Attached patch patch (obsolete) — Splinter Review
Attachment #123936 - Attachment is obsolete: true
Attachment #123947 - Flags: review?(jaggernaut) → review+
Comment on attachment 123947 [details] [diff] [review] patch sr=alecf
Attachment #123947 - Flags: superreview?(alecf) → superreview+
Attached patch patch (obsolete) — Splinter Review
Forgot to null-terminate (applies to already-checked-in, but unused, code as well). Somehow my build mostly worked.
Attachment #123947 - Attachment is obsolete: true
Comment on attachment 123989 [details] [diff] [review] patch carrying over my sr=alecf
Attachment #123989 - Flags: superreview+
Comment on attachment 123989 [details] [diff] [review] patch I'd like to ask for approval for this patch. It's pretty safe -- it would break a lot of things if it had bugs (like what happened in my tree this morning). It also contains two bug fixes that I think we want -- one to the handling of surrogates (I guess we don't use those much yet, eh?), and the other to code that isn't used yet, but that's already in the tree, and may start being used soon and transferred to the branch. I guess I could check in just that fix, but I don't see any reason to delay landing the rest of the patch.
Attachment #123989 - Flags: approval1.4?
Comment on attachment 123989 [details] [diff] [review] patch This should go in, imho.
Attachment #123989 - Flags: review+
Attachment #123989 - Flags: approval1.4? → approval1.4+
Fix checked in to trunk, 2003-05-22 14:25 -0700.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Backed out, since it slowed down NS_ConvertUCS2toUTF8. (I left the terminating null fix in.)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(It also caused a codesize increase since the inlined constructor, which is probably the most common, required two function calls (base class constructor and |Init| instead of one.)
Comment on attachment 123989 [details] [diff] [review] patch Now that you backed it out, ..... >+ if (! (c & 0xFF80)) // U+0000 - U+007F ... >+ else if (! (c & 0xF800)) // U+0100 - U+07FF ... >+ else if (0xD800 == (0xFC00 & c)) // U+D800 - U+DBFF ... >+ else if (0xDC00 == (0xFC00 & c)) // U+DC00 - U+DFFF ... >+ else // U+0800 - U+D7FF, U+E000 - U+FFFF Wouldn't we be a bit better off by checking non-surrogate code points (which are much more common) before surrogate code points considering a potential branch penalty? Can we just leave that to the optimization? else if (0xD800 != (0xF800 & c)) // U+0800 - U+D7FF, U+E000 - U+FFFF ... else if (0xD800 == (0xFC00 & c)) // U+D800 - U+DBFF ... else // U+DC00 - U+DFFF ....
Attached patch patch (obsolete) — Splinter Review
Here's a patch against the current trunk (since I didn't back out the whole thing), with the idea in the comment above incorporated.
Attachment #123989 - Attachment is obsolete: true
Attached patch patch (obsolete) — Splinter Review
This is somewhat faster. Measuring using the const PRUnichar* constructor: * it gets 42.8% of the loss back by having a separate nsASingleFragmentString& Init method that uses nsASingleFragmentString::const_char_iterator * it gets 15.2% of the loss back by only calling BeginReading and EndReading once each * it gets 5.6% of the loss back by replacing EndReading with the result of BeginReading + Length() * it gets 2.8% of the loss back by using the results of BeginReading and Length directly on the sinks and avoiding copy_string for a total of 66.4% of the performance loss back. (The last digit is probably not significant throughout.)
Attachment #124026 - Attachment is obsolete: true
Attached patch patch (obsolete) — Splinter Review
And skipping the nsDependentString and changing the Init to take (const PRUnichar*, PRUint32), gets another 12.4%.
Attachment #125198 - Attachment is obsolete: true
Most of the rest of the difference seems to be in the difference between having the conversion loop inline or having it in write(). I suspect this may have something to do with member variables. I do get a little boost from using SetCapacity and manually adding the terminator over using SetLength.
Using a temporary instead of |mBuffer| in the loop in |write| didn't help, so I don't really understand what's going on. (Yes, I tried using the code from |write| directly in |Init|, and that *was* faster.)
Attached patch patch (obsolete) — Splinter Review
Just a tiny bit faster, since I can't explain the rest of the difference.
Attachment #125200 - Attachment is obsolete: true
Attached patch patch (diff -c) (obsolete) — Splinter Review
Same thing, context diff (easier to read in this case).
Adding |__attribute__((always_inline))| (a gcc-only extension) to the sink's |write| method (add it before the function name, or with |inline| preceding to a constructor) helps a little bit, but it doesn't help for any of the other methods on the sink and it doesn't help nearly as much as I'd hoped.
Attached patch patch (obsolete) — Splinter Review
OK, this one is actually slightly faster than the old code on my gcc 3.3. I had to do a bunch of things the compiler should have done for me, and who knows if they're the same things that help other compilers. Note the |#ifdef __GNUC__| use of |__attribute__((always_inline))|. The trick was using the assignment from |mBuffer| to |out| and back at the same time as the |__attribute__((always_inline))| -- either one alone didn't help much but the two together helped a lot. Now I'm beginning to wonder if this is too ugly to check in...
Attachment #125274 - Attachment is obsolete: true
Attachment #125279 - Attachment is obsolete: true
Attached patch patch (obsolete) — Splinter Review
This does most of the same things for UTF8toUCS2 conversions, so that they can be faster too.
Attachment #125287 - Attachment is obsolete: true
Attached patch patchSplinter Review
Same thing, but with typo fixed.
Attachment #125289 - Attachment is obsolete: true
diff -up, please.
Status: REOPENED → ASSIGNED
Attached patch patch (diff -u)Splinter Review
I specifically did a diff -c since the nsString.cpp changes are unreadable with a diff -u, but here's the unreadable one anyway, since you seem to prefer it.
+#ifdef __GNUC__ +#define GCC_ALWAYS_INLINE __attribute__((always_inline)) +#else +#define GCC_ALWAYS_INLINE +#endif Wouldn't that #define be better named "ALWAYS_INLINE", i.e. drop GCC from the name? That way we could use the same macro for other compilers as well, if and when they support similar attributes. Just my 2 cents.
Perhaps NS_ALWAYS_INLINE?
Comment on attachment 125325 [details] [diff] [review] patch r=jag jst's comment makes sense though, so we should probably do that. I'm not too happy that we need this, but this is the right thing to do and I'd rather take the always_inline than the slowdown. Should we measure on other platforms before checking in, or just check in and back out if it regresses Tp, Txul etc. too badly?
Attachment #125325 - Flags: review?(jaggernaut) → review+
Comment on attachment 125339 [details] [diff] [review] patch (diff -u) NS_ALWAYS_INLINE sounds good to me. +NS_ConvertUTF8toUCS2::NS_ConvertUTF8toUCS2( const char* aCString ) + { + Init(aCString, nsCharTraits<char>::length(aCString)); + } + You made NS_ConvertUCS2toUTF8(const char *) null save, don't you want to do the same here? With that, sr=jst
Attachment #125339 - Flags: superreview?(jst) → superreview+
> You made NS_ConvertUCS2toUTF8(const char *) null save, don't you want to do the > same here? No, since someone already went to the trouble of making it not be null-safe. I'd rather switch NS_ConvertUTF8toUCS2 at some point in the future...
Fix checked in to trunk, 2003-06-10 21:27 -0700.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago22 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.5alpha
Component: String → XPCOM
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: