Bug 560349 - memset is slow and lame with gcc
Status: RESOLVED FIXED
Keywords: perf
Product: Core
Classification: Components
Component: JavaScript Engine
Version: Trunk
Platform: x86 Mac OS X
Importance: -- normal
Target Milestone: mozilla11
Assigned To: Nathan Froyd [:froydnj]
:
: Jason Orendorff [:jorendorff]
Mentors:
Depends on:
Blocks:
Reported: 2010-04-19 14:18 PDT by Andreas Gal :gal
Modified: 2012-02-01 13:59 PST

Attachments
patch (757 bytes, patch)
2011-12-08 17:04 PST, Nathan Froyd [:froydnj]
nfroyd: review+

Description Andreas Gal :gal 2010-04-19 14:18:12 PDT
GCC optimizes memset when the size is a compile-time constant, e.g. memset(a, 0, sizeof(T)).

GCC does not optimize memset when only the element size is constant and the count is a runtime value, e.g. memset(a, 0, n * sizeof(T)).

In general, the call into memset is really expensive for small n.

PodZero should be changed to call memset only with a constant sizeof(T) and loop over the n elements. We should also have a PodCompare that is a substitute for memcmp.
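
A minimal sketch of the idea described above, not the patch attached to this bug. The names PodZeroSketch and PodCompareSketch are hypothetical, and the sketch assumes T is a trivially copyable type. Every memset/memcmp call gets the compile-time-constant size sizeof(T), which is the form the description says GCC optimizes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical sketch: zero nelem elements, one constant-size memset per element.
template <typename T>
inline void PodZeroSketch(T* t, std::size_t nelem) {
    static_assert(std::is_trivially_copyable<T>::value, "T must be plain old data");
    assert(t != nullptr || nelem == 0);
    for (T* end = t + nelem; t != end; ++t) {
        std::memset(t, 0, sizeof(T));  // size is a compile-time constant
    }
}

// Hypothetical sketch: compare element by element with a constant-size memcmp.
// Returns 0 if all elements are bytewise equal, otherwise the first nonzero
// memcmp result (same sign as one memcmp over the whole range).
template <typename T>
inline int PodCompareSketch(const T* a, const T* b, std::size_t nelem) {
    static_assert(std::is_trivially_copyable<T>::value, "T must be plain old data");
    assert((a != nullptr && b != nullptr) || nelem == 0);
    for (std::size_t i = 0; i < nelem; ++i) {
        int r = std::memcmp(a + i, b + i, sizeof(T));
        if (r != 0) {
            return r;
        }
    }
    return 0;
}
```

Whether the per-element loop actually beats a single library call depends on the compiler, the optimization level, and n, so it is worth measuring before relying on it.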
Comment 1 Luke Wagner [:luke] 2010-04-19 14:23:06 PDT
PodCopy, too, to complete the trilogy.
Comment 2 Ryan VanderMeulen [:RyanVM] 2011-11-24 10:18:53 PST
Still a valid non-TM bug?
Comment 3 Ryan VanderMeulen [:RyanVM] 2011-12-07 20:05:17 PST
Andreas/bz/Luke, ping?
Comment 4 Boris Zbarsky [:bz] (still a bit busy) 2011-12-07 20:50:42 PST
Yes.  This stuff is used all over the VM.  This bug needs an owner.....
Comment 5 Nathan Froyd [:froydnj] 2011-12-08 17:04:32 PST
Created attachment 580256
patch

So this is the trivial patch with a possibly overly-long explanatory comment.  Luke, WDYT?

I didn't add PodCopy because we already have that.  PodCompare didn't seem worth it for the very small number of memcmps in the codebase.
Comment 6 Luke Wagner [:luke] 2011-12-08 17:15:16 PST
Comment on attachment 580256
patch

I was about to say "but this will lose the benefits of memset when nelem is large" but then I checked and none of the uses of binary PodZero would seem to have big nelem.  So this looks great; if it ever matters, we'll just use memset directly.
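
For illustration only, here is one way the large-nelem concern could be handled inside a helper instead of at the call site. This is an alternative sketch, not what the reviewed patch does: the name PodZeroHybridSketch and the cutoff kInlineZeroLimit are hypothetical, and the value 128 is an arbitrary assumption that would need measuring.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical cutoff; the bug does not specify one.
constexpr std::size_t kInlineZeroLimit = 128;

// Hypothetical sketch: per-element constant-size memset for small counts,
// a single variable-size memset call for large counts.
template <typename T>
inline void PodZeroHybridSketch(T* t, std::size_t nelem) {
    static_assert(std::is_trivially_copyable<T>::value, "T must be plain old data");
    assert(t != nullptr || nelem == 0);
    if (nelem < kInlineZeroLimit) {
        for (T* end = t + nelem; t != end; ++t) {
            std::memset(t, 0, sizeof(T));  // constant size
        }
    } else {
        std::memset(t, 0, nelem * sizeof(T));  // one library call for big ranges
    }
}
```

The extra branch costs little next to a library call, but it only pays off if some caller really passes a large nelem, which the comment above says is not currently the case.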
Comment 7 Nathan Froyd [:froydnj] 2011-12-08 17:44:50 PST
Comment on attachment 580256
patch

Converting luke's feedback+ into r+ for such a trivial patch.
Comment 8 Ed Morley [:emorley] 2011-12-15 02:33:55 PST
In my queue with a few other checkin-neededs that are being sent to try first and then onto inbound :-)
https://tbpl.mozilla.org/?tree=Try&rev=fd440327d5e4
Comment 10 Ed Morley [:emorley] 2011-12-16 06:19:02 PST
https://hg.mozilla.org/mozilla-central/rev/b9a619e265d5