js_short_strncpy was introduced with the patch for bug 578205. As a quick fix for bug 583779, the function was changed to just do a simple loop copying jschars instead of special-casing small copies. We should figure out what's actually fastest.
Created attachment 462713 [details]
Test program

I tried the attached test program on my machine (running OS X), compiled with -O3. I tried to trick it into having reasonable branch prediction and compiler optimization behavior. It should be run with command line argument "1". I tested the different approaches by commenting out different parts of my_memcpy. Here are the results (running times, in seconds):

                                          64-bit  32-bit
memcpy                                    2.43    2.76
Simple loop                               1.32    1.26
Switch block with only length=1 case      1.41    1.27
Switch block with length=1 and length=2   1.46    1.30
Full switch block                         1.32    1.29

Looks like, at least on my machine and for this case, handling common cases with a switch doesn't end up helping.
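For readers without the attachment, here is a rough sketch of the kinds of copy variants being compared. It is not the attached test program and not the actual js_short_strncpy. The names jschar_t, copy_memcpy, copy_loop, copy_switch and variants_agree are made up for illustration, and jschar is assumed to be a 16-bit unsigned type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for jschar, assumed here to be a 16-bit code unit.
typedef uint16_t jschar_t;

// Variant: hand the copy to the library memcpy.
static void copy_memcpy(jschar_t* dest, const jschar_t* src, size_t nchars) {
    std::memcpy(dest, src, nchars * sizeof(jschar_t));
}

// Variant: simple loop copying one character at a time.
static void copy_loop(jschar_t* dest, const jschar_t* src, size_t nchars) {
    for (size_t i = 0; i < nchars; ++i)
        dest[i] = src[i];
}

// Variant: switch that special-cases length 1 and length 2,
// falling back to the simple loop for every other length.
static void copy_switch(jschar_t* dest, const jschar_t* src, size_t nchars) {
    switch (nchars) {
      case 1:
        dest[0] = src[0];
        return;
      case 2:
        dest[0] = src[0];
        dest[1] = src[1];
        return;
      default:
        copy_loop(dest, src, nchars);
        return;
    }
}

// Sanity check (illustrative only): all three variants must produce the
// same destination buffer for a given length, up to kMax characters.
static bool variants_agree(size_t nchars) {
    const size_t kMax = 64;
    if (nchars > kMax)
        return false;
    jschar_t src[kMax], a[kMax], b[kMax], c[kMax];
    for (size_t i = 0; i < kMax; ++i) {
        src[i] = static_cast<jschar_t>(i + 1);
        a[i] = b[i] = c[i] = 0;
    }
    copy_memcpy(a, src, nchars);
    copy_loop(b, src, nchars);
    copy_switch(c, src, nchars);
    return std::memcmp(a, b, sizeof(a)) == 0 &&
           std::memcmp(a, c, sizeof(a)) == 0;
}
```

The sketch only shows that the variants are functionally interchangeable. Which one is fastest depends on the compiler, the platform's memcpy, and the surrounding code, so it would have to be timed separately on each target.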
Looks like the overhead of the outer code and the short maximum length of short strings makes optimizing this pointless, except for ... avoiding memcpy, which is known sucky on macosx. Thanks for doing the analysis!
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → WONTFIX
Yeah, we should consider what the numbers look like on other platforms. We encountered memcmp being wildly different in performance across platforms, for example.... :(