Closed
Bug 697450
Opened 14 years ago
Closed 12 years ago
Several WebGL texel unpack/pack functions are not inlined spontaneously, at least by GCC
Categories
(Core :: Graphics: CanvasWebGL, defect)
Core
Graphics: CanvasWebGL
Tracking
()
RESOLVED
FIXED
mozilla10
People
(Reporter: bjacob, Unassigned)
References
Details
Attachments
(1 file)
|
20.30 KB,
patch
|
jgilbert
:
review+
|
Details | Diff | Splinter Review |
WebGL texture conversions are performance critical and our code relies on per-texel pack/unpack functions to get inlined, or else we're slow.
The problem is that here, with GCC 4.6.1 on linux x86_64, many of them are not getting inlined. Here's what nm has to say:
bjacob@cahouette:~/build/firefoxrelease$ nm --radix=d -S -C dist/bin/libxul.so | grep -v constprop | grep -v isra | grep WebGL | grep Conver
0000000009548754 0000000000022455 t mozilla::WebGLContext::ConvertImage(unsigned long, unsigned long, unsigned long, unsigned long, unsigned char const*, unsigned char*, int, bool, int, bool, unsigned long)
0000000009524400 0000000000000022 t mozilla::(anonymous namespace)::WebGLTexelConversions::unpackRA8ToRGBA8(unsigned char const*, unsigned char*)
0000000009524372 0000000000000028 t mozilla::(anonymous namespace)::WebGLTexelConversions::unpackBGRA8ToRGBA8(unsigned char const*, unsigned char*)
0000000009524344 0000000000000028 t mozilla::(anonymous namespace)::WebGLTexelConversions::unpackRGBA8ToRGBA8(unsigned char const*, unsigned char*)
0000000009524461 0000000000000057 t mozilla::(anonymous namespace)::WebGLTexelConversions::packRGBA8ToR8Unmultiply(unsigned char const*, unsigned char*)
0000000009524422 0000000000000039 t mozilla::(anonymous namespace)::WebGLTexelConversions::packRGBA8ToR8Premultiply(unsigned char const*, unsigned char*)
0000000009524562 0000000000000060 t mozilla::(anonymous namespace)::WebGLTexelConversions::packRGBA8ToRA8Unmultiply(unsigned char const*, unsigned char*)
0000000009524518 0000000000000044 t mozilla::(anonymous namespace)::WebGLTexelConversions::packRGBA8ToRA8Premultiply(unsigned char const*, unsigned char*)
0000000009524699 0000000000000095 t mozilla::(anonymous namespace)::WebGLTexelConversions::packRGBA8ToRGB8Unmultiply(unsigned char const*, unsigned char*)
0000000009524622 0000000000000077 t mozilla::(anonymous namespace)::WebGLTexelConversions::packRGBA8ToRGB8Premultiply(unsigned char const*, unsigned char*)
0000000009524877 0000000000000099 t mozilla::(anonymous namespace)::WebGLTexelConversions::packRGBA8ToRGBA8Unmultiply(unsigned char const*, unsigned char*)
0000000009524794 0000000000000083 t mozilla::(anonymous namespace)::WebGLTexelConversions::packRGBA8ToRGBA8Premultiply(unsigned char const*, unsigned char*)
0000000009525552 0000000000000046 t mozilla::(anonymous namespace)::WebGLTexelConversions::packRGBA8ToUnsignedShort565(unsigned char const*, unsigned short*)
0000000009524976 0000000000000055 t mozilla::(anonymous namespace)::WebGLTexelConversions::packRGBA8ToUnsignedShort4444(unsigned char const*, unsigned short*)
0000000009525258 0000000000000059 t mozilla::(anonymous namespace)::WebGLTexelConversions::packRGBA8ToUnsignedShort5551(unsigned char const*, unsigned short*)
0000000009525697 0000000000000117 t mozilla::(anonymous namespace)::WebGLTexelConversions::packRGBA8ToUnsignedShort565Unmultiply(unsigned char const*, unsigned short*)
0000000009525134 0000000000000124 t mozilla::(anonymous namespace)::WebGLTexelConversions::packRGBA8ToUnsignedShort4444Unmultiply(unsigned char const*, unsigned short*)
0000000009525424 0000000000000128 t mozilla::(anonymous namespace)::WebGLTexelConversions::packRGBA8ToUnsignedShort5551Unmultiply(unsigned char const*, unsigned short*)
0000000009525598 0000000000000099 t mozilla::(anonymous namespace)::WebGLTexelConversions::packRGBA8ToUnsignedShort565Premultiply(unsigned char const*, unsigned short*)
0000000009525031 0000000000000103 t mozilla::(anonymous namespace)::WebGLTexelConversions::packRGBA8ToUnsignedShort4444Premultiply(unsigned char const*, unsigned short*)
0000000009525317 0000000000000107 t mozilla::(anonymous namespace)::WebGLTexelConversions::packRGBA8ToUnsignedShort5551Premultiply(unsigned char const*, unsigned short*)
Now if I force inlining of all these pack/unpack functions, this becomes:
bjacob@cahouette:~/build/firefoxrelease$ nm --radix=d -S -C dist/bin/libxul.so | grep -v constprop | grep WebGL | grep Conver
0000000009560046 0000000000019224 t mozilla::WebGLContext::ConvertImage(unsigned long, unsigned long, unsigned long, unsigned long, unsigned char const*, unsigned char*, int, bool, int, bool, unsigned long)
So the code size is only 19k anymore, down from roughly 24k. My explanation for that is that inlining these functions allows the compiler to get read of the temporary buffer used in between them:
UnpackType tmp[4];
unpackingFunc(reinterpret_cast<const SrcType*>(src_row_ptr), tmp);
packingFunc(tmp, reinterpret_cast<DstType*>(dst_row_ptr));
so inlining can allow to get rid of a good part of the (useless) code there.
The impact on performance is that texture conversions are 2.5x faster with inlining in profiles on Google MapsGL.
Attachment #569685 -
Flags: review?(jgilbert)
| Reporter | ||
Updated•14 years ago
|
Blocks: mapsgl-performance
| Reporter | ||
Comment 1•14 years ago
|
||
Note: I did try just 'inline' and GCC still wasn't inlining! Whence the always_inline.
Comment 2•14 years ago
|
||
Comment on attachment 569685 [details] [diff] [review]
force-inline texel conversion helpers
Looks good.
Attachment #569685 -
Flags: review?(jgilbert) → review+
| Reporter | ||
Comment 3•14 years ago
|
||
It was a mistake on my part to omit the .constprop symbols: they actually represent inlined functions. The true code size is 42k.
| Reporter | ||
Comment 4•14 years ago
|
||
Comment 5•14 years ago
|
||
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla10
Comment 6•14 years ago
|
||
There's a reason NS_ALWAYS_INLINE exists.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
| Reporter | ||
Comment 7•14 years ago
|
||
http://mxr.mozilla.org/mozilla-central/source/configure.in#3898
if I understand correctly, NS_ALWAYS_INLINE is basically:
#ifdef __GNUC__
#define NS_ALWAYS_INLINE __attribute__((always_inline))
#else
#define NS_ALWAYS_INLINE
#endif
this seems rather useless: doesn't do anything outside of GCC (and compatible compilers) so the name is very misleading. Here I really want at least 'inline' in the general case, and if possible __forceinline on MSVC
Also, this code is shared with WebKit so it makes sense to avoid dependencies.
Please re-close unless you have other objections.
Comment 8•12 years ago
|
||
This was fixed with the addition of MOZ_ALWAYS_INLINE to MFBT. Returning to prior 'fixed' status.
Status: REOPENED → RESOLVED
Closed: 14 years ago → 12 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•