GCC optimizes memset when the whole size is a compile-time constant, e.g. memset(a, 0, sizeof(T)). GCC does not optimize memset when only the element size is constant, e.g. memset(a, 0, n * sizeof(T)). In general, the call into memset is really expensive for small n. PodZero should be changed to only call memset with a constant sizeof(T) and loop around that. We should also have a PodCompare that is a substitute for memcmp.
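For illustration, here is a minimal sketch of the shape proposed above. It is not the patch attached to this bug; the signatures are assumptions, and whether the compiler actually expands the constant-size memset inline depends on the compiler version and optimization flags.

```cpp
#include <cstddef>
#include <cstring>

// Zero a single T. sizeof(T) is a compile-time constant, so the compiler
// can typically expand this memset inline instead of emitting a call.
// (Assumed signature; T is expected to be a plain-old-data type.)
template <typename T>
inline void PodZero(T* t) {
    std::memset(t, 0, sizeof(T));
}

// Zero nelem consecutive Ts by looping over constant-size memsets rather
// than issuing one memset(t, 0, nelem * sizeof(T)) with a runtime size.
template <typename T>
inline void PodZero(T* t, std::size_t nelem) {
    for (T* end = t + nelem; t != end; ++t)
        std::memset(t, 0, sizeof(T));
}
```

The loop only pays off for small nelem, which is the case this bug is about.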
PodCopy, too, to complete the trilogy.
Still a valid non-TM bug?
Yes. This stuff is used all over the VM. This bug needs an owner.
Created attachment 580256 (patch). So this is the trivial patch with a possibly overly long explanatory comment. Luke, WDYT? I didn't add PodCopy because we already have that. PodCompare didn't seem worth it for the very small number of memcmps in the codebase.
Comment on attachment 580256 (patch). I was about to say "but this will lose the benefits of memset when nelem is large", but then I checked and none of the uses of the two-argument PodZero seem to have a big nelem. So this looks great; if it ever matters, we'll just use memset directly.
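To make that fallback concrete, a caller that does have a large nelem could bypass the looping PodZero with something like the following. ZeroLargeArray is a hypothetical name used only for this sketch; nothing like it is part of the patch.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper, not part of the patch: for large arrays, a single
// memset call with a runtime size lets the library's bulk-fill path do the
// work, and the fixed call overhead is amortized over many bytes.
template <typename T>
inline void ZeroLargeArray(T* t, std::size_t nelem) {
    std::memset(t, 0, nelem * sizeof(T));
}
```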
Comment on attachment 580256 (patch). Converting Luke's feedback+ into r+ for such a trivial patch.
In my queue with a few other checkin-neededs that are being sent to try first and then onto inbound :-) https://tbpl.mozilla.org/?tree=Try&rev=fd440327d5e4