Uploading ImageBitmap to texture with gl.texImage2D() is slow
Categories (Core :: Graphics: Canvas2D, defect, P3)
People (Reporter: hogehoge, Unassigned)
References (Blocks 1 open bug)
Details (Whiteboard: [gfx-noted])
Attachments (1 file)
Comment 1•6 years ago (Reporter)
Comment 3•6 years ago
Comment 4•6 years ago
:takahirox,
A few things still need to be figured out. Before the ImageBitmap data is uploaded through WebGLContext::texImage2D(), the following steps happen:
1. THREE.ImageBitmapLoader.load -> generates an image Blob.
2. createImageBitmap -> produces the ImageBitmapData.
3. WebGLContext::texImage2D() -> uploads this ImageBitmapData to the WebGL texture.
It looks like this profiling number includes all three steps. However, only WebGLContext::texImage2D() involves the WebGL context; the other two are more likely related to the image decoder. We should figure out which step is the main bottleneck.
Comment 5•6 years ago
ImageBitmapData and HTMLImageElement take different code paths: ImageBitmapData is created asynchronously, while HTMLImageElement loading uses a synchronous path. In the general case the async approach should be faster. I need to dig deeper to confirm whether they use the same image decoder.
ImageBitmapData
xul.dll!mozilla::dom::CreateImageBitmapFromBlob::Create(mozilla::dom::Promise * aPromise, nsIGlobalObject * aGlobal, mozilla::dom::Blob & aBlob, const mozilla::Maybe<mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> > & aCropRect, nsIEventTarget * aMainThreadEventTarget) Line 1484 C++
xul.dll!mozilla::dom::AsyncCreateImageBitmapFromBlob(mozilla::dom::Promise * aPromise, nsIGlobalObject * aGlobal, mozilla::dom::Blob & aBlob, const mozilla::Maybe<mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> > & aCropRect) Line 1267 C++
xul.dll!mozilla::dom::ImageBitmap::Create(nsIGlobalObject * aGlobal, const mozilla::dom::HTMLImageElementOrSVGImageElementOrHTMLCanvasElementOrHTMLVideoElementOrImageBitmapOrBlobOrCanvasRenderingContext2DOrImageData & aSrc, const mozilla::Maybe<mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> > & aCropRect, mozilla::ErrorResult & aRv) Line 1335 C++
xul.dll!nsGlobalWindowInner::CreateImageBitmap(JSContext * aCx, const mozilla::dom::HTMLImageElementOrSVGImageElementOrHTMLCanvasElementOrHTMLVideoElementOrImageBitmapOrBlobOrCanvasRenderingContext2DOrImageData & aImage, mozilla::ErrorResult & aRv) Line 7097 C++
HTMLImageElement
xul.dll!imgLoader::LoadImage(nsIURI * aURI, nsIURI * aInitialDocumentURI, nsIURI * aReferrerURI, mozilla::net::ReferrerPolicy aReferrerPolicy, nsIPrincipal * aTriggeringPrincipal, unsigned __int64 aRequestContextID, nsILoadGroup * aLoadGroup, imgINotificationObserver * aObserver, nsINode * aContext, mozilla::dom::Document * aLoadingDocument, unsigned int aLoadFlags, nsISupports * aCacheKey, unsigned int aContentPolicyType, const nsTSubstring<char16_t> & initiatorType, bool aUseUrgentStartForChannel, imgRequestProxy * * _retval) Line 2060 C++
xul.dll!nsContentUtils::LoadImage(nsIURI * aURI, nsINode * aContext, mozilla::dom::Document * aLoadingDocument, nsIPrincipal * aLoadingPrincipal, unsigned __int64 aRequestContextID, nsIURI * aReferrer, mozilla::net::ReferrerPolicy aReferrerPolicy, imgINotificationObserver * aObserver, int aLoadFlags, const nsTSubstring<char16_t> & initiatorType, imgRequestProxy * * aRequest, unsigned int aContentPolicyType, bool aUseUrgentStartForChannel) Line 3439 C++
xul.dll!nsImageLoadingContent::LoadImage(nsIURI * aNewURI, bool aForce, bool aNotify, nsImageLoadingContent::ImageLoadType aImageLoadType, bool aLoadStart, mozilla::dom::Document * aDocument, unsigned int aLoadFlags, nsIPrincipal * aTriggeringPrincipal) Line 986 C++
xul.dll!nsImageLoadingContent::LoadImage(const nsTSubstring<char16_t> & aNewURI, bool aForce, bool aNotify, nsImageLoadingContent::ImageLoadType aImageLoadType, nsIPrincipal * aTriggeringPrincipal) Line 868 C++
xul.dll!mozilla::dom::HTMLImageElement::AfterMaybeChangeAttr(int aNamespaceID, nsAtom * aName, const nsAttrValueOrString & aValue, const nsAttrValue * aOldValue, nsIPrincipal * aMaybeScriptedPrincipal, bool aValueMaybeChanged, bool aNotify) Line 405 C++
xul.dll!mozilla::dom::HTMLImageElement::AfterSetAttr(int aNameSpaceID, nsAtom * aName, const nsAttrValue * aValue, const nsAttrValue * aOldValue, nsIPrincipal * aMaybeScriptedPrincipal, bool aNotify) Line 287 C++
Comment 6•6 years ago
I am trying a synchronous way of loading the ImageBitmap in my patch. Performance improves slightly, but not noticeably. This is probably because the patch is not fully synchronous: CreateImageBitmapFromBlob::DecodeAndCropBlob() still loads the MIME type and the bitmap data asynchronously.
Comment 7•6 years ago
Here is a new profile from my Windows PC:
Firefox, async ImageBitmap
Texture uploading time [ms] 25.00
image / min / max / avg / count
8192x4096 JPG 4.4MB / 216.00 / 239.00 / 224.71 / 7
2048x2048 PNG 4.5MB / 25.00 / 31.00 / 27.43 / 7
Firefox, sync ImageBitmap
Texture uploading time [ms] 214.00
image / min / max / avg / count
8192x4096 JPG 4.4MB / 213.00 / 238.00 / 217.89 / 9
2048x2048 PNG 4.5MB / 24.00 / 32.00 / 26.13 / 8
Chrome, ImageBitmap
Texture uploading time [ms] 26.00
image / min / max / avg / count
8192x4096 JPG 4.4MB / 80.70 / 96.10 / 88.57 / 13
2048x2048 PNG 4.5MB / 16.50 / 27.40 / 25.83 / 13
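For the smaller image (2048x2048 PNG), the difference between Firefox and Chrome is negligible; for the 8192x4096 JPG, Firefox's average is roughly 2.5 times Chrome's.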
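The min / max / avg / count columns reported here can be reproduced from raw per-upload timings with a small helper. The following is only a sketch: `UploadStats` and `summarize` are hypothetical names and are not taken from the test page.

```typescript
// Hypothetical summary type mirroring the "min / max / avg / count" columns.
interface UploadStats {
  min: number;
  max: number;
  avg: number;
  count: number;
}

// Reduce a list of per-upload timings (in ms) to the four reported columns.
function summarize(samplesMs: number[]): UploadStats {
  if (samplesMs.length === 0) {
    throw new Error("need at least one sample");
  }
  let min = Infinity;
  let max = -Infinity;
  let sum = 0;
  for (const ms of samplesMs) {
    if (ms < min) min = ms;
    if (ms > max) max = ms;
    sum += ms;
  }
  return { min, max, avg: sum / samplesMs.length, count: samplesMs.length };
}
```

Reporting min and max next to the average helps here because a single slow upload can noticeably shift the mean when only 5 to 13 samples are collected.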
Comment 8•6 years ago (Reporter)
> There are some items that need to be figured out. Before uploading an imageBitmapData to WebGLContext::texImage2D(), we will do
> 1. THREE.ImageBitmapLoader.load -> Generate an ImageBlob.
> 2. createImageBitmap -> get the ImageBitmapData.
> 3. WebGLContext::texImage2D() -> Upload this ImageBitmapData to WebGL texture.
In the example I measure only the elapsed time of step 3 (search for "isTexture" in the code). The problem I want to fix here is the long main-thread blocking time, so I am looking only at step 3; my understanding is that steps 1 and 2 run asynchronously.
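Measuring only the synchronous upload call, as described in this comment, can be sketched as follows. `timeBlocking` is a hypothetical helper, not the code from the example page; it reports how long a call blocks the calling thread and says nothing about work the GPU finishes later.

```typescript
// Hypothetical helper: run a synchronous call and report how long it blocked the caller.
// The clock is injectable; Date.now is the default, and in a page one would
// likely pass () => performance.now() for sub-millisecond resolution.
function timeBlocking<T>(
  call: () => T,
  now: () => number = Date.now
): { value: T; elapsedMs: number } {
  const start = now();
  const value = call();
  return { value, elapsedMs: now() - start };
}

// Assumed usage in a page, with a real WebGL context `gl` and a decoded `bitmap`:
// const { elapsedMs } = timeBlocking(
//   () => gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, bitmap),
//   () => performance.now()
// );
```

Because the fetch and createImageBitmap() steps are awaited before this call, their cost stays outside the measured interval.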
Comment 9•6 years ago
(In reply to Takahiro Aoyagi (:takahirox) from comment #8)
> There are some items that need to be figured out. Before uploading an imageBitmapData to WebGLContext::texImage2D(), we will do
> 1. THREE.ImageBitmapLoader.load -> Generate an ImageBlob.
> 2. createImageBitmap -> get the ImageBitmapData.
> 3. WebGLContext::texImage2D() -> Upload this ImageBitmapData to WebGL texture.
> In the example I measure only the elapsed time of step 3 (search for "isTexture" in the code). The problem I want to fix here is the long main-thread blocking time, so I am looking only at step 3; my understanding is that steps 1 and 2 run asynchronously.
I think you are right. I will dig into what WebGLContext::texImage2D() does and then provide some data to help pin down the problem.
Comment 10•6 years ago
Although the format and internalFormat passed to texImage2D() [1] are RGB8, the SourceSurface from ImageBitmapData is decoded as BGRX8, so we have to convert the source image to RGB8 at [2]. If we skip the conversion, the elapsed time decreases. Apart from this, I don't see anything unusual. Our implementation of texImage2D() is very straightforward: it validates the arguments and calls the ANGLE function.
Firefox, ImageBitmap, conversion skipped ("early return the conversion")
Texture uploading time [ms] 19.00
image / min / max / avg / count
8192x4096 JPG 4.4MB / 123.00 / 166.00 / 147.13 / 8
2048x2048 PNG 4.5MB / 19.00 / 26.00 / 20.63 / 8
Firefox, ImageBitmap, with conversion ("no early return the conversion")
Texture uploading time [ms] 26.00
image / min / max / avg / count
8192x4096 JPG 4.4MB / 178.00 / 316.00 / 225.86 / 7
2048x2048 PNG 4.5MB / 24.00 / 28.00 / 25.71 / 7
Chrome, ImageBitmap
Texture uploading time [ms] 17.80
image / min / max / avg / count
8192x4096 JPG 4.4MB / 79.80 / 97.80 / 91.00 / 7
2048x2048 PNG 4.5MB / 17.80 / 27.70 / 25.20 / 7
[1] https://developer.mozilla.org/en-US/docs/Web/API/WebGLRenderingContext/texImage2D
[2] https://dxr.mozilla.org/mozilla-central/rev/c2593a3058afdfeaac5c990e18794ee8257afe99/dom/canvas/TexUnpackBlob.cpp#313
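To give a feel for the per-pixel work a BGRX8 to RGB8 conversion implies, here is an illustrative repacking loop. It is not Gecko's conversion code from [2]; `bgrxToRgb` is a hypothetical function that assumes tightly packed rows with no stride padding.

```typescript
// Illustrative only: repack tightly packed BGRX8 pixels (4 bytes each)
// into RGB8 (3 bytes each). Assumes no row padding in either buffer.
function bgrxToRgb(src: Uint8Array, width: number, height: number): Uint8Array {
  const pixelCount = width * height;
  if (src.length < pixelCount * 4) {
    throw new Error("source buffer too small");
  }
  const dst = new Uint8Array(pixelCount * 3);
  for (let i = 0; i < pixelCount; i++) {
    const s = i * 4; // offset of pixel i in the BGRX source
    const d = i * 3; // offset of pixel i in the RGB destination
    dst[d] = src[s + 2];     // R
    dst[d + 1] = src[s + 1]; // G
    dst[d + 2] = src[s];     // B; the X byte is dropped
  }
  return dst;
}
```

For the 8192x4096 test image this loop would visit about 33.5 million pixels on the calling thread, which is consistent with the conversion showing up clearly in the timings for the large JPG and much less for the 2048x2048 PNG.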
Comment 11•6 years ago (Reporter)
Thanks for the investigation. Can we do the conversion in createImageBitmap()? My understanding is that ImageBitmap upload with texImage2D() should be fast because the data is decoded asynchronously and is ready for upload beforehand, so texImage2D() can skip any image data handling. I think createImageBitmap() is the best place to convert, unless there are any problems.
Comment 12•6 years ago
(In reply to Takahiro Aoyagi (:takahirox) from comment #11)
> Thanks for the investigation. Can we do the conversion in createImageBitmap()? My understanding is that ImageBitmap upload with texImage2D() should be fast because the data is decoded asynchronously and is ready for upload beforehand, so texImage2D() can skip any image data handling. I think createImageBitmap() is the best place to convert, unless there are any problems.
Agreed. We should figure out why the ImageBitmapData from createImageBitmap() is BGRX8, and check what format Image uses.
Comment 13•6 years ago
For Image, we also do the conversion (WebGL perf warning: "texImage2D: Conversion requires pixel reformatting. (26->15)"). I will add a profiler in Gecko to figure out where the bottleneck is.
Comment 14•6 years ago
On my Windows PC, Chrome seems to have become slower recently. Firefox's profiling results are unchanged, but it is now faster than Chrome.
Chrome, Image
Texture uploading time [ms] 72.57
image / min / max / avg / count
8192x4096 JPG 4.4MB / 433.48 / 436.37 / 434.56 / 7
2048x2048 PNG 4.5MB / 71.53 / 72.57 / 72.03 / 7
Chrome, ImageBitmap
Texture uploading time [ms] 16.55
image / min / max / avg / count
8192x4096 JPG 4.4MB / 210.75 / 257.78 / 222.22 / 6
2048x2048 PNG 4.5MB / 16.55 / 17.24 / 16.85 / 6
Nightly, Image
Texture uploading time [ms] 16.00
image / min / max / avg / count
8192x4096 JPG 4.4MB / 151.00 / 193.00 / 164.00 / 5
2048x2048 PNG 4.5MB / 16.00 / 17.00 / 16.80 / 5
Nightly, ImageBitmap
Texture uploading time [ms] 17.00
image / min / max / avg / count
8192x4096 JPG 4.4MB / 193.00 / 194.00 / 193.40 / 5
2048x2048 PNG 4.5MB / 14.00 / 17.00 / 15.60 / 5
Comment 15•6 years ago
You might also try uploading as RGBA instead of RGB. RGB will often incur an unpack in the driver to RGBX anyways.
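A minimal sketch of the RGBA upload suggested here, assuming a WebGL 1 style `texImage2D` call with an image source. `GlLike` and `uploadAsRgba` are hypothetical names; `GlLike` lists only the members this sketch uses and is intended to be satisfied by a real `WebGLRenderingContext`.

```typescript
// Hypothetical minimal view of a WebGL context, limited to what the sketch needs.
interface GlLike {
  readonly TEXTURE_2D: number;
  readonly RGBA: number;
  readonly UNSIGNED_BYTE: number;
  texImage2D(
    target: number,
    level: number,
    internalformat: number,
    format: number,
    type: number,
    source: unknown
  ): void;
}

// Upload a decoded image source with both internalformat and format set to RGBA
// (rather than RGB), so no 3-channel texture is requested from the driver.
function uploadAsRgba(gl: GlLike, source: unknown): void {
  gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, source);
}
```

The caller is assumed to have created and bound the target texture with `gl.bindTexture` before invoking this function.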
Comment 16•6 years ago
My GPU is an NVIDIA GTX 1080 Ti.
Chrome, ImageBitmap, RGB
Texture uploading time [ms] 254.64
image / min / max / avg / count
8192x4096 JPG 4.4MB / 228.09 / 254.86 / 249.28 / 5
JS at rAF time spent / 0.00 / 1.00 / 0.51 / 5
Chrome, ImageBitmap, RGBA
Texture uploading time [ms] 123.39
image / min / max / avg / count
8192x4096 JPG 4.4MB / 122.59 / 123.97 / 123.29 / 13
JS at rAF time spent / 0.00 / 1.00 / 0.13 / 5
Nightly, ImageBitmap, RGB
Texture uploading time [ms] 179.00
image / min / max / avg / count
8192x4096 JPG 4.4MB / 179.00 / 230.00 / 211.22 / 9
JS at rAF time spent / 0.00 / 1.00 / 0.40 / 9
WebGLContext::TexImage elapsed time 155.694404
->TexOrSubImage::TexUnpackSurface avg. elapsed time 141.756672
-->Conversion TexOrSubImage::TexUnpackSurface avg. elapsed time 41.727599
-->Driver TexOrSubImage::TexUnpackSurface avg. elapsed time 100.029205
Nightly, ImageBitmap, RGBA
Texture uploading time [ms] 127.00
image / min / max / avg / count
8192x4096 JPG 4.4MB / 125.00 / 138.00 / 128.89 / 9
JS at rAF time spent / 0.00 / 1.00 / 0.17 / 9
WebGLContext::TexImage elapsed time 80.424603
->TexOrSubImage::TexUnpackSurface avg. elapsed time 68.367779
-->Conversion TexOrSubImage::TexUnpackSurface avg. elapsed time 52.362530
-->Driver TexOrSubImage::TexUnpackSurface avg. elapsed time 16.005117
Texture uploading time includes "JS interpreter time" + "WebGLContext::TexImage()" + "CPU wait for GPU time".
So, according to these stats, with an RGB internalFormat the driver spends more time in its GPU function calls, and the CPU also has to wait for the GPU to unpack the pixels.
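The Nightly breakdown can be turned into a driver share of the unpack time with a few lines of arithmetic. This sketch just restates the figures from this comment; `UnpackBreakdown` and `driverShare` are hypothetical names.

```typescript
// Timings in ms, copied from the Nightly breakdown in this comment.
interface UnpackBreakdown {
  unpackTotalMs: number; // TexOrSubImage::TexUnpackSurface avg. elapsed time
  conversionMs: number;  // "Conversion" line
  driverMs: number;      // "Driver" line
}

const rgbBreakdown: UnpackBreakdown = {
  unpackTotalMs: 141.756672,
  conversionMs: 41.727599,
  driverMs: 100.029205,
};

const rgbaBreakdown: UnpackBreakdown = {
  unpackTotalMs: 68.367779,
  conversionMs: 52.36253,
  driverMs: 16.005117,
};

// Fraction of the unpack time that is spent inside the driver call.
function driverShare(b: UnpackBreakdown): number {
  return b.driverMs / b.unpackTotalMs;
}

console.log("RGB driver share:", driverShare(rgbBreakdown).toFixed(2));
console.log("RGBA driver share:", driverShare(rgbaBreakdown).toFixed(2));
```

With these figures the driver accounts for roughly 71% of the unpack time for RGB and roughly 23% for RGBA, while the conversion cost itself stays in the same range (about 42 ms versus about 52 ms).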
Comment 17•6 years ago
Cool, it's encouraging that this matches my prediction about RGBA8 vs RGB8. Please continue to prefer RGBA8 formats over RGB8.
Comment 18•6 months ago
So, any luck on getting some progress on this issue? The lag is very annoying (my code is similar, works fine on Chrome and Safari, even on mobile).