Open Bug 1246410 (slow-canvas-to-webgl) Opened 8 years ago Updated 2 days ago

Low performance of texImage2D with canvas2d and WebGL

Categories

(Core :: Graphics: CanvasWebGL, defect, P3)

42 Branch
defect

Tracking

()

Performance Impact low
Tracking Status
firefox44 --- wontfix
firefox45 - wontfix
firefox46 - wontfix
firefox47 - wontfix
firefox48 --- wontfix
b2g-v2.6 --- wontfix
b2g-master --- wontfix

People

(Reporter: ivan.kuckir, Unassigned)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

(Keywords: perf, regression, testcase, Whiteboard: gfx-noted, [ft:conndevices], [webvr],)

Attachments

(2 files)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36

Steps to reproduce:

Here is a demo, which sends data to GPU using texImage2D(). First, it reads from Image, then from Canvas, then from Uint8Array. It prints 3 execution times into console.
http://www.ivank.net/veci/texImage2d_test.html




Actual results:

In Chrome, execution takes 1735 ms, 27 ms, 793 ms.
In Firefox, execution takes  361 ms, 3758 ms, 50 ms.


Expected results:

Creating texture from Canvas is 100x slower in FF than in Chrome. I think Chrome is so fast, because it stores Canvas content on the GPU already. Is it possible to make texImage2D with Canvas faster in FF?
Component: Untriaged → Canvas: 2D
Keywords: testcase
Product: Firefox → Core
The perf about Canvas regressed with FF42:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=ecd916f018fdbad3a2332cc2c5d84b562d699f87&tochange=1304d7aa36ce42147ace62bd59294ddbbfcb3612

I suspect:
Matt Woodrow — Bug 1176363 - Part 2: Allow mapping of SourceSurfaceRawData from multiple threads. r=bas
Andreas Pehrson — Bug 1176363 - Part 1: Make a raw copy of each Canvas::CaptureStream frame. r=mattwoodrow
Blocks: 1176363
Flags: needinfo?(matt.woodrow)
Keywords: perf, regression
Version: 44 Branch → 42 Branch
Flags: needinfo?(pehrsons)
via local build,
Last Good: ecd916f018fd   830 1823 238, 819 1868 242
First Bad: 1ce930b301cb   820 6200 247, 819 6206 239

Regressed by :Bug 1106138
Blocks: 1106138
No longer blocks: 1176363
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(kyle_fung)
Component: Canvas: 2D → Canvas: WebGL
Kyle isn't an intern any more, I doubt he'll be able to look at this.

This patch switched us from using gfxUtils::CreateUnpremultipliedDataSurface to unpremultiply, to using the unpack routine in WebGLTexelConversions instead.

The former uses a lookup table for the conversion and the latter doesn't, so it's not surprising that this is slower.

The WebGLTexelConversions code also adds two extra copies of the data (unpacking into unpackedSrc, and converting into unpackedDst). It's believable/hopeful that the compiler is removing these again, but I'm not sure if we should rely on this without checking the generated assembly.

Jeff, what do you want to do about this?


It should also be possible to copy directly in GPU memory for windows with d2d, though we'd need to add code for getting external d3d11 textures into ANGLE, and for dealing with unpremultiplication.
Flags: needinfo?(pehrsons)
Flags: needinfo?(matt.woodrow)
Flags: needinfo?(kyle_fung)
Flags: needinfo?(jgilbert)
Whiteboard: gfx-noted
(In reply to Matt Woodrow (:mattwoodrow) from comment #3)
> Kyle isn't an intern any more, I doubt he'll be able to look at this.
> 
> This patch switched us from using gfxUtils::CreateUnpremultipliedDataSurface
> to unpremultiply, to using the unpack routine in WebGLTexelConversions
> instead.
> 
> The former uses a lookup table for the conversion and the latter doesn't, so
> it's not surprising that this is slower.
> 
> The WebGLTexelConversions code also adds two extra copies of the data
> (unpacking into unpackedSrc, and converting into unpackedDst). It's
> believable/hopeful that the compiler is removing these again, but I'm not
> sure if we should rely on this without checking the generated assembly.
> 
> Jeff, what do you want to do about this?
> 
> 
> It should also be possible to copy directly in GPU memory for windows with
> d2d, though we'd need to add code for getting external d3d11 textures into
> ANGLE, and for dealing with unpremultiplication.

1, presently) We should ensure the fast-path works (canvas2d->webgl w/ unpack-premult-alpha), and print a warning in the console when you're not on the fast-path.

2, eventually) We want to something like what we do for <video> but for <canvas>, however I don't think this will give us the correct frame in the current code. (it'll give us the current frontbuffer, when IIRC we want the backbuffer here)
Flags: needinfo?(jgilbert)
Not a new regression, past the beta mid cycle of 45, not tracking.
Matt or Jeff, how bad is this regression? Can you help find someone to work on it? 

Tracking for 47; I'm not tracking for 46 since no one is assigned and this regression from 42 has already shipped. But it would be good to fix.
Flags: needinfo?(matt.woodrow)
It's quite bad (more than 3x), but I'm not sure about how common this case is.

Over to Jeff for a fix, this is his area of expertise.
Flags: needinfo?(matt.woodrow) → needinfo?(jgilbert)
I do not think this is an important regression, as this is not a common use-case.

I think I can throw together a patch for medium-fast-pathing pure premult/unpremult.

Really we should look at surfacing a warning to the web console when people incur a premult/unpremult that they don't need to do.

I don't want to look at doing gpu->gpu copy/CoW until have more time. (likely in the 48+ timeframe)
Flags: needinfo?(jgilbert)
Assignee: nobody → jgilbert
Peter, do Morris or Ethan have time to look at this before Jeff has a chance?
Flags: needinfo?(howareyou322)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #6)
> Matt or Jeff, how bad is this regression? Can you help find someone to work
> on it? 
> 
> Tracking for 47; I'm not tracking for 46 since no one is assigned and this
> regression from 42 has already shipped. But it would be good to fix.

This isn't a regression. I don't think we should track this unless it ends up prioritized by something else.

Not that we shouldn't fix it, but it's not an emergency.
(In reply to Milan Sreckovic [:milan] from comment #9)
> Peter, do Morris or Ethan have time to look at this before Jeff has a chance?

yes, Ethan will fix WebGL crash and then work on this.
Flags: needinfo?(howareyou322)
I'm going to drop tracking on this, as this isn't a regression from any active Firefox version.
We should fix it, but absent another priority pushing us to uplift the fix, there's no need for this to be anything more than a bugfix bug.
Aggressively wontfixing Beta and older.
blocking-b2g: --- → 2.6?
Whiteboard: gfx-noted → gfx-noted, [ft:conndevices]
Platform triage decision: This is not something that will block Fx47 release, wontfix for 47 and 48.
Let me show you, where I use this feature.

Go to http://www.Photopea.com , click somewhere and drag, notice the response (FPS).

Now, enable Rulers (View->Rulers  or  Ctrl+R) and do the same thing. The FPS is about 5 times lower.

The rulers (and some other parts, e.g. selection contours, guides, ...) can be redrawn at every frame (60 times per second, when zooming, moving etc). I draw it all into a separate context2D and then send it into a WebGL texture.

I have thousands of users, who use Photopea inside Firefox. All of them are getting a bad experience because of you. Can you please fix it or propose another solution?
Assigning to Ethan based on comment 11.
Assignee: jgilbert → ethlin
Comment on attachment 8756259 [details] [diff] [review]
wip

Do we have to worry about alpha=0 division?
Attached patch wipSplinter Review
This is a wip to use shader for the premul conversion. I haven't checked the correctness.
(In reply to Milan Sreckovic [:milan] from comment #18)
> Comment on attachment 8756259 [details] [diff] [review]
> wip
> 
> Do we have to worry about alpha=0 division?

Probably not. Unpremult means the only valid value would be (0,0,0,0), which technically (and correctly) is (NaN,NaN,NaN,0). This is probably fine.
(In reply to Ethan Lin[:ethlin] from comment #17)
> Created attachment 8756259 [details] [diff] [review]
> wip
> 
> This is a wip to use shader for the premul conversion. I haven't checked the
> correctness.

I do not think this is strictly necessary.
What we should do first is use a gpu-gpu copy for the 'upload' fastpath, and generate a warning when we can't use the fastpath.

We talk about looking at gpu-size premult after that, but it's not strictly-speaking necessary. (If desired, the client app should do it)
blocking-b2g: 2.6? → ---
I use g.texImage2D (g.TEXTURE_2D, 0, g.RGBA, g.RGBA, g.UNSIGNED_BYTE, tcv2)
to upload textures prepared on a 2D canvas and now (on FF nightly 49.0a1) receive strange error messages 
"Error: WebGL: texImage2D: Chosen format/type incured an expensive reformat: 0x1908/0x1401".
I came across this bug and assume that this is the newly introduced warning discussed above.
I tried to enable and disable g.UNPACK_PREMULTIPLY_ALPHA_WEBGL without any effect on the
error message. I think this message should be flaged as warning and not as error message in the
log. Also it would be very helpful to include some more information on what format/type to use
in order to get on the fast track.
Is it possible to get a status update on this issue, or advice for client app developers who encounter it?

I'm building a 360º video player for which I'm converting each frame of an HTMLVideoElement > Canvas > WebGL Texture for output in a webgl context.

The difference in performance between Firefox and Chrome for this use case appears to be substantial (on OS X, at least), and is likely made worse by the relatively high resolution of the source media required for 360 stereo videos.

I'll follow up with a small standalone test case and actual performance metrics on my hardware shortly.
Looking at the test from comment 0

On OS X (MacBook Pro, Retina): Chrome: 915, 695, 256.  Firefox: 361, 904, 144.
Windows 7: Chrome: 934, 5, 169. Firefox: 284, 726, 29.

In Firefox I also see a lot of these:
Error: WebGL: texImage2D: Incurred CPU-side conversion, which is very slow.  texImage2d_test.html:28:30
Error: WebGL: texImage2D: Incurred CPU pixel conversion, which is very slow.  texImage2d_test.html:28:30
Error: WebGL: texImage2D: Chosen format/type incurred an expensive reformat: 0x1908/0x1401  texImage2d_test.html:28:30

Cameron - yes, please, a standalone test case would be great.
Attached file webgl_video_test.html
This is about as simple as I can reduce it to, I think.

It creates a video and canvas element, sets up the webgl context, loads the video, offers simplistic play/pause controls, and renders the video frame as webgl texture.

My setup:

MacBook Pro (13-inch, Late 2011)
OS X 10.11.6
2.8 GHz Intel Core i7
8 GB 1333 MHz DDR3
Intel HD Graphics 3000 512 MB

I tested with Chrome 53.0.2785.143 (64-bit) and Firefox Developer Edition 51.0a2 (2016-09-30) (64-bit) and performance in Chrome is significantly better than Firefox.

I recorded profiling data from both browsers, but while Chrome shows time spent in specific gl calls (i.e. teximage2d) I can't see a way to gather that data in Firefox. 

Firefox did log the following error messages:

Error: WebGL: texImage2D: Failed to hit GPU-copy fast-path. Falling back to CPU upload.
Error: WebGL: texImage2D: Incurred CPU-side conversion, which is very slow.
Error: WebGL: texImage2D: Incurred CPU pixel conversion, which is very slow.
Error: WebGL: texImage2D: Chosen format/type incurred an expensive reformat: 0x1907/0x1401

Here's a 4098x2048 H.264/AAC 30fps video which you can download and use with the attached test case. It's just an example I found online, as I'm unable to share the source material I'm working with.

http://photocreations.ca/3D/fountain-L-4k.mp4
I was focusing on other issues. I'll start to look this again. The optimization may only work for MacOS. I'm not sure if we really want it at this moment.
Basically we need to do something similar to TexUnpackImage[1] for canvas element. Currently TexUnpackImage is only for video element. I'll study how to do the same thing for canvas.

[1] https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.cpp#474
(In reply to Ethan Lin[:ethlin] from comment #27)
> Basically we need to do something similar to TexUnpackImage[1] for canvas
> element. Currently TexUnpackImage is only for video element. I'll study how
> to do the same thing for canvas.
> 
> [1]
> https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.
> cpp#474

Just to confirm, the repro I attached is using an HTMLVideoElement as the pixel source for texImage2D. It appears to have the same performance issues as when using HTMLCanvasElement as the pixel source.
Bug 1307342 also seems to be related to this bug.
Whiteboard: gfx-noted, [ft:conndevices] → gfx-noted, [ft:conndevices], [webvr]
Just wanted to add that with the fast path appears never to be taken for regular images as well. This is a separate bug but not sure where to put it (feedback welcome).

User agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:50.0) Gecko/20100101 Firefox/50.0
FF version: 50.1.0

Page to reproduce errors: https://developer.mozilla.org/en-US/docs/Web/API/WebGL_API/Tutorial/Using_textures_in_WebGL

Errors received:
* Error: WebGL: texImage2D: Incurred CPU-side conversion, which is very slow.
* Error: WebGL: texImage2D: Chosen format/type incurred an expensive reformat: 0x1908/0x1401

Error scan be observed on many other sites as well. Using the right format, UNPACK_PREMULTIPLY_ALPHA_WEBGL = false and UNPACK_FLIP_Y_WEBGL = false appears not to have any effect.
Summary: Low performance of texImage2D with canvas → Low performance of texImage2D with canvas and WebGL
Assign to me before finding the right owner.
Assignee: ethlin → howareyou322
Since FF 51.x the error is now this one.
Error: WebGL: texImage2D: Conversion requires pixel reformatting.
But seems to be the same.
Appears with all three.js tests.
Whiteboard: gfx-noted, [ft:conndevices], [webvr] → gfx-noted, [ft:conndevices], [webvr], [qf]
> Error: WebGL: texImage2D: Conversion requires pixel reformatting.

I got similar error with FF52.0.1(32bit).
(My code is modified one of "Basic Usage Example" in https://github.com/pixijs/pixi.js but seems same issue.)

It happens when gl.texImage2D() is called.
So, I picked up related Mozilla sources.

And (not checked the values but) I guess srcFormat and dstFormat for ConvertIfNeeded()in TexUnpackSurface::TexOrSubImage() are not same.

* srcFormat and dstFormat should be WebGLTexelFormat::RGBA8(or RGB565) to match, I think.

srcFormat (== out_*texelFormat in GetFormatForSurf()) is one of
WebGLTexelFormat::BGRA8
WebGLTexelFormat::BGRX8
WebGLTexelFormat::RGBA8
WebGLTexelFormat::RGBX8
WebGLTexelFormat::RGB565
WebGLTexelFormat::A8

dstFormat (== returned value of FormatForPackingInfo()) is one of 
(case LOCAL_GL_UNSIGNED_BYTE)
WebGLTexelFormat::R8
WebGLTexelFormat::A8
WebGLTexelFormat::RA8
WebGLTexelFormat::RGB8
WebGLTexelFormat::RGBA8
WebGLTexelFormat::RG8

----
Related sources

Error message (Conversion requires pixel reformatting.) is here.
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.cpp#316-318
>TexUnpackBlob::ConvertIfNeeded(...)
>{
>    ...
>    if (srcFormat != dstFormat) {
>        webgl->GeneratePerfWarning("%s: Conversion requires pixel reformatting.",
>                                   funcName);
>    }
>    ...
>}

TexUnpackSurface::TexOrSubImage() calls TexUnpackBlob::ConvertIfNeeded() with srcFormat and dstFormat as parameters.
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.cpp#804-805

srcFormat and dstFormat are here.
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.cpp#762,768
>const auto dstFormat = FormatForPackingInfo(dstPI);
>if (!GetFormatForSurf(mSurf, &srcFormat, &srcBPP)) {...}

* TexUnpackBytes::TexOrSubImage() also calls ConvertIfNeeded() but format parameters are same.
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.cpp#445-446

GetFormatForSurf()
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.cpp#701-704

FormatForPackingInfo()
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.cpp#98-104

----
WebGLTexture::TexImage() calls webgl::TexUnpackBlob::TexOrSubImage()
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/WebGLTextureUpload.cpp#1297

WebGLTexture::TexSubImage() also calls TexOrSubImage().
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/WebGLTextureUpload.cpp#1385

----
WebGLContext::TexImage() calls WebGLTexture::TexImage()
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/WebGLContextTextures.cpp#389,394-395

----
WebGLContext::TexImage2D() (== "WebGL: texImage2D" in error message) calls WebGLContext::TexImage()
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/WebGLContext.h#1180,1187-1190
Whiteboard: gfx-noted, [ft:conndevices], [webvr], [qf] → gfx-noted, [ft:conndevices], [webvr], [qf:p1]
Daosheng will work on this to improve video upload.
Assignee: howareyou322 → dmu
I think we mutated this bug along the way; it started as a canvas 2D performance comparison to Chrome, and is currently also covering the regression in video+WebGL combination.  The later is what comment 34 is about.
Assignee: dmu → jgilbert
Flags: needinfo?(jgilbert)
Hi confirming this issue on Windows and macOS with this three.js demo. I have a very powerful I7 and a GTX1060 card. I doubt I should be seeing these errors ? 

https://threejs.org/examples/webgl_video_panorama_equirectangular.html

I get

Error: WebGL warning: texImage2D: Failed to hit GPU-copy fast-path. Falling back to CPU upload.  
Error: WebGL warning: texImage2D: Conversion requires pixel reformatting.

I also get however errors like this 

THREE.WebGLShader: gl.getShaderInfoLog() fragment WARNING: 0:1: extension 'GL_ARB_gpu_shader5' is not supported
(In reply to Daniel from comment #36)
> Hi confirming this issue on Windows and macOS with this three.js demo. I
> have a very powerful I7 and a GTX1060 card. I doubt I should be seeing these
> errors ? 
> 
> https://threejs.org/examples/webgl_video_panorama_equirectangular.html
> 
> I get
> 
> Error: WebGL warning: texImage2D: Failed to hit GPU-copy fast-path. Falling
> back to CPU upload.  
> Error: WebGL warning: texImage2D: Conversion requires pixel reformatting.
> 
> I also get however errors like this 
> 
> THREE.WebGLShader: gl.getShaderInfoLog() fragment WARNING: 0:1: extension
> 'GL_ARB_gpu_shader5' is not supported

We are working on GL_VIDEO format support for avoiding the conversion for WebGL video at Bug 1322746.
Whiteboard: gfx-noted, [ft:conndevices], [webvr], [qf:p1] → gfx-noted, [ft:conndevices], [webvr], [qf:p2]
Whiteboard: gfx-noted, [ft:conndevices], [webvr], [qf:p2] → gfx-noted, [ft:conndevices], [webvr], [qf:p1]
¡Hola Jeff!

Ended up here from https://support.mozilla.org/es/questions/1161556 where a fellow user reports this message:

"Error: WebGL warning: texImage2D: Conversion requires pixel reformatting."

Is this the same bug or shall a new one be filed?

¡Gracias!
Alex
Depends on: 1374710
Status: NEW → ASSIGNED
Whiteboard: gfx-noted, [ft:conndevices], [webvr], [qf:p1] → gfx-noted, [ft:conndevices], [webvr], [qf:p2]
Whiteboard: gfx-noted, [ft:conndevices], [webvr], [qf:p2] → gfx-noted, [ft:conndevices], [webvr], [qf:p3]
Alias: slow-canvas-to-webgl
Assignee: jgilbert → nobody
Blocks: 1222877
Status: ASSIGNED → NEW
Flags: needinfo?(jgilbert)
Summary: Low performance of texImage2D with canvas and WebGL → Low performance of texImage2D with canvas2d and WebGL

I am observing heavy lag with Firefox for webgl and webgl2 canvas in Linux, when the same demo has smooth performance on Chrome.

This affects pretty much all PixiJS webgl2 applications as well as manual webgl demos. I came across this bug when I wanted to report a new one. Can anyone with a better understanding of the related internals confirm whether or not this is the same issue?

Examples:

Please note that everything goes smooth in a small window. The demos quickly become laggy in a maximalized/fullscreen window on a high resolution (retina) display.

Observed with system specs:

  • OS: Linux Mint 19.1 Cinnamon (AKA Ubuntu 18.04.1 LTS) 4.15.0-50-generic x64
  • Video: NVIDIA GK104 [GeForce GTX 660 Ti] Driver: Proprietary 396.54 from Driver Manager
  • Monitor: DELL P2415Q 3840x2160
  • Firefox: Firefox Quantum 67 x64
  • Chrome: Chrome 76.0.3806.1 64-bit

Originally reported at https://github.com/pixijs/pixi.js/issues/5751

(In reply to Redsandro from comment #39)

I am observing heavy lag with Firefox for webgl and webgl2 canvas in Linux, when the same demo has smooth performance on Chrome.

This affects pretty much all PixiJS webgl2 applications as well as manual webgl demos. I came across this bug when I wanted to report a new one. Can anyone with a better understanding of the related internals confirm whether or not this is the same issue?

Examples:

Please note that everything goes smooth in a small window. The demos quickly become laggy in a maximalized/fullscreen window on a high resolution (retina) display.

Observed with system specs:

  • OS: Linux Mint 19.1 Cinnamon (AKA Ubuntu 18.04.1 LTS) 4.15.0-50-generic x64
  • Video: NVIDIA GK104 [GeForce GTX 660 Ti] Driver: Proprietary 396.54 from Driver Manager
  • Monitor: DELL P2415Q 3840x2160
  • Firefox: Firefox Quantum 67 x64
  • Chrome: Chrome 76.0.3806.1 64-bit

Originally reported at https://github.com/pixijs/pixi.js/issues/5751

This is bug 1010527.

This is bug 1010527.

Thank you. My search skills are existent yet unhoned. I will amend there.

Performance Impact: --- → P3
Whiteboard: gfx-noted, [ft:conndevices], [webvr], [qf:p3] → gfx-noted, [ft:conndevices], [webvr],
Severity: normal → S3

Hi,

first post here.

I do think this issue is still happening. You can very easily reproduce it via babylonjs https://playground.babylonjs.com/#5ZCGRM%234896
The firefox profiler points me to WebGLTexelConversion leading me to this issue.
There is already a forum post on babylonjs here: https://forum.babylonjs.com/t/performance-of-updating-dynamic-texture-on-firefox/47624
But it seems like they cannot do much from their end.
Test it out on chrome and firefox and you will see a great difference in fps. Especially increasing the resolution factor does not influence chrome as much as it does on firefox.

My use case is moving a canvas of a mapping library into babylonjs context. Another person mentioned the use of custom canvas element in a game. And I have also seen use cases on threejs.

Let me know if I should open a separate issue for this.

Best!

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: