Low performance of texImage2D with canvas and WebGL

NEW
Assigned to

Status

()

Core
Canvas: WebGL
a year ago
4 days ago

People

(Reporter: Ivan Kuckir, Assigned: jgilbert, NeedInfo)

Tracking

(Blocks: 1 bug, {perf, regression, testcase})

42 Branch
perf, regression, testcase
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox44 wontfix, firefox45- wontfix, firefox46- wontfix, firefox47- wontfix, firefox48 wontfix, b2g-v2.6 affected, b2g-master affected)

Details

(Whiteboard: gfx-noted, [ft:conndevices], [webvr], [qf:p1])

Attachments

(2 attachments)

(Reporter)

Description

a year ago
User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36

Steps to reproduce:

Here is a demo, which sends data to GPU using texImage2D(). First, it reads from Image, then from Canvas, then from Uint8Array. It prints 3 execution times into console.
http://www.ivank.net/veci/texImage2d_test.html




Actual results:

In Chrome, execution takes 1735 ms, 27 ms, 793 ms.
In Firefox, execution takes  361 ms, 3758 ms, 50 ms.


Expected results:

Creating texture from Canvas is 100x slower in FF than in Chrome. I think Chrome is so fast, because it stores Canvas content on the GPU already. Is it possible to make texImage2D with Canvas faster in FF?

Updated

a year ago
Component: Untriaged → Canvas: 2D
Keywords: testcase
Product: Firefox → Core

Comment 1

a year ago
The perf about Canvas regressed with FF42:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=ecd916f018fdbad3a2332cc2c5d84b562d699f87&tochange=1304d7aa36ce42147ace62bd59294ddbbfcb3612

I suspect:
Matt Woodrow — Bug 1176363 - Part 2: Allow mapping of SourceSurfaceRawData from multiple threads. r=bas
Andreas Pehrson — Bug 1176363 - Part 1: Make a raw copy of each Canvas::CaptureStream frame. r=mattwoodrow
Blocks: 1176363
status-firefox44: --- → affected
status-firefox45: --- → affected
status-firefox46: --- → affected
status-firefox47: --- → affected
tracking-firefox45: --- → ?
tracking-firefox46: --- → ?
tracking-firefox47: --- → ?
Flags: needinfo?(matt.woodrow)
Keywords: perf, regression
Version: 44 Branch → 42 Branch

Updated

a year ago
Flags: needinfo?(pehrsons)

Comment 2

a year ago
via local build,
Last Good: ecd916f018fd   830 1823 238, 819 1868 242
First Bad: 1ce930b301cb   820 6200 247, 819 6206 239

Regressed by :Bug 1106138
Blocks: 1106138
No longer blocks: 1176363
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(kyle_fung)

Updated

a year ago
Component: Canvas: 2D → Canvas: WebGL
Kyle isn't an intern any more, I doubt he'll be able to look at this.

This patch switched us from using gfxUtils::CreateUnpremultipliedDataSurface to unpremultiply, to using the unpack routine in WebGLTexelConversions instead.

The former uses a lookup table for the conversion and the latter doesn't, so it's not surprising that this is slower.

The WebGLTexelConversions code also adds two extra copies of the data (unpacking into unpackedSrc, and converting into unpackedDst). It's believable/hopeful that the compiler is removing these again, but I'm not sure if we should rely on this without checking the generated assembly.

Jeff, what do you want to do about this?


It should also be possible to copy directly in GPU memory for windows with d2d, though we'd need to add code for getting external d3d11 textures into ANGLE, and for dealing with unpremultiplication.
Flags: needinfo?(pehrsons)
Flags: needinfo?(matt.woodrow)
Flags: needinfo?(kyle_fung)
Flags: needinfo?(jgilbert)
Whiteboard: gfx-noted
(Assignee)

Comment 4

a year ago
(In reply to Matt Woodrow (:mattwoodrow) from comment #3)
> Kyle isn't an intern any more, I doubt he'll be able to look at this.
> 
> This patch switched us from using gfxUtils::CreateUnpremultipliedDataSurface
> to unpremultiply, to using the unpack routine in WebGLTexelConversions
> instead.
> 
> The former uses a lookup table for the conversion and the latter doesn't, so
> it's not surprising that this is slower.
> 
> The WebGLTexelConversions code also adds two extra copies of the data
> (unpacking into unpackedSrc, and converting into unpackedDst). It's
> believable/hopeful that the compiler is removing these again, but I'm not
> sure if we should rely on this without checking the generated assembly.
> 
> Jeff, what do you want to do about this?
> 
> 
> It should also be possible to copy directly in GPU memory for windows with
> d2d, though we'd need to add code for getting external d3d11 textures into
> ANGLE, and for dealing with unpremultiplication.

1, presently) We should ensure the fast-path works (canvas2d->webgl w/ unpack-premult-alpha), and print a warning in the console when you're not on the fast-path.

2, eventually) We want to something like what we do for <video> but for <canvas>, however I don't think this will give us the correct frame in the current code. (it'll give us the current frontbuffer, when IIRC we want the backbuffer here)
Flags: needinfo?(jgilbert)
Not a new regression, past the beta mid cycle of 45, not tracking.
status-firefox44: affected → wontfix
tracking-firefox45: ? → -
Matt or Jeff, how bad is this regression? Can you help find someone to work on it? 

Tracking for 47; I'm not tracking for 46 since no one is assigned and this regression from 42 has already shipped. But it would be good to fix.
tracking-firefox46: ? → -
tracking-firefox47: ? → +
Flags: needinfo?(matt.woodrow)
It's quite bad (more than 3x), but I'm not sure about how common this case is.

Over to Jeff for a fix, this is his area of expertise.
Flags: needinfo?(matt.woodrow) → needinfo?(jgilbert)
(Assignee)

Comment 8

a year ago
I do not think this is an important regression, as this is not a common use-case.

I think I can throw together a patch for medium-fast-pathing pure premult/unpremult.

Really we should look at surfacing a warning to the web console when people incur a premult/unpremult that they don't need to do.

I don't want to look at doing gpu->gpu copy/CoW until have more time. (likely in the 48+ timeframe)
Flags: needinfo?(jgilbert)
(Assignee)

Updated

a year ago
Assignee: nobody → jgilbert
Peter, do Morris or Ethan have time to look at this before Jeff has a chance?
Flags: needinfo?(howareyou322)
(Assignee)

Comment 10

a year ago
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #6)
> Matt or Jeff, how bad is this regression? Can you help find someone to work
> on it? 
> 
> Tracking for 47; I'm not tracking for 46 since no one is assigned and this
> regression from 42 has already shipped. But it would be good to fix.

This isn't a regression. I don't think we should track this unless it ends up prioritized by something else.

Not that we shouldn't fix it, but it's not an emergency.
(In reply to Milan Sreckovic [:milan] from comment #9)
> Peter, do Morris or Ethan have time to look at this before Jeff has a chance?

yes, Ethan will fix WebGL crash and then work on this.
Flags: needinfo?(howareyou322)
(Assignee)

Comment 12

a year ago
I'm going to drop tracking on this, as this isn't a regression from any active Firefox version.
We should fix it, but absent another priority pushing us to uplift the fix, there's no need for this to be anything more than a bugfix bug.
tracking-firefox47: + → ---
tracking-firefox48: --- → ?
(Assignee)

Comment 13

a year ago
Aggressively wontfixing Beta and older.
status-firefox45: affected → wontfix
status-firefox46: affected → wontfix
status-firefox48: --- → affected
tracking-firefox48: ? → ---
Blocks: 1217740

Updated

a year ago
blocking-b2g: --- → 2.6?

Updated

a year ago
Whiteboard: gfx-noted → gfx-noted, [ft:conndevices]
Platform triage decision: This is not something that will block Fx47 release, wontfix for 47 and 48.
status-firefox47: affected → wontfix
status-firefox48: affected → wontfix
tracking-firefox47: --- → -
(Reporter)

Comment 15

11 months ago
Let me show you, where I use this feature.

Go to http://www.Photopea.com , click somewhere and drag, notice the response (FPS).

Now, enable Rulers (View->Rulers  or  Ctrl+R) and do the same thing. The FPS is about 5 times lower.

The rulers (and some other parts, e.g. selection contours, guides, ...) can be redrawn at every frame (60 times per second, when zooming, moving etc). I draw it all into a separate context2D and then send it into a WebGL texture.

I have thousands of users, who use Photopea inside Firefox. All of them are getting a bad experience because of you. Can you please fix it or propose another solution?
Assigning to Ethan based on comment 11.
Assignee: jgilbert → ethlin
Comment on attachment 8756259 [details] [diff] [review]
wip

Do we have to worry about alpha=0 division?

Comment 18

11 months ago
Created attachment 8756259 [details] [diff] [review]
wip

This is a wip to use shader for the premul conversion. I haven't checked the correctness.
(Assignee)

Comment 19

11 months ago
(In reply to Milan Sreckovic [:milan] from comment #18)
> Comment on attachment 8756259 [details] [diff] [review]
> wip
> 
> Do we have to worry about alpha=0 division?

Probably not. Unpremult means the only valid value would be (0,0,0,0), which technically (and correctly) is (NaN,NaN,NaN,0). This is probably fine.
(Assignee)

Comment 20

11 months ago
(In reply to Ethan Lin[:ethlin] from comment #17)
> Created attachment 8756259 [details] [diff] [review]
> wip
> 
> This is a wip to use shader for the premul conversion. I haven't checked the
> correctness.

I do not think this is strictly necessary.
What we should do first is use a gpu-gpu copy for the 'upload' fastpath, and generate a warning when we can't use the fastpath.

We talk about looking at gpu-size premult after that, but it's not strictly-speaking necessary. (If desired, the client app should do it)

Updated

11 months ago
blocking-b2g: 2.6? → ---
status-b2g-v2.6: --- → affected
status-b2g-master: --- → affected

Comment 21

11 months ago
I use g.texImage2D (g.TEXTURE_2D, 0, g.RGBA, g.RGBA, g.UNSIGNED_BYTE, tcv2)
to upload textures prepared on a 2D canvas and now (on FF nightly 49.0a1) receive strange error messages 
"Error: WebGL: texImage2D: Chosen format/type incured an expensive reformat: 0x1908/0x1401".
I came across this bug and assume that this is the newly introduced warning discussed above.
I tried to enable and disable g.UNPACK_PREMULTIPLY_ALPHA_WEBGL without any effect on the
error message. I think this message should be flaged as warning and not as error message in the
log. Also it would be very helpful to include some more information on what format/type to use
in order to get on the fast track.

Comment 22

7 months ago
Is it possible to get a status update on this issue, or advice for client app developers who encounter it?

I'm building a 360º video player for which I'm converting each frame of an HTMLVideoElement > Canvas > WebGL Texture for output in a webgl context.

The difference in performance between Firefox and Chrome for this use case appears to be substantial (on OS X, at least), and is likely made worse by the relatively high resolution of the source media required for 360 stereo videos.

I'll follow up with a small standalone test case and actual performance metrics on my hardware shortly.
Looking at the test from comment 0

On OS X (MacBook Pro, Retina): Chrome: 915, 695, 256.  Firefox: 361, 904, 144.
Windows 7: Chrome: 934, 5, 169. Firefox: 284, 726, 29.

In Firefox I also see a lot of these:
Error: WebGL: texImage2D: Incurred CPU-side conversion, which is very slow.  texImage2d_test.html:28:30
Error: WebGL: texImage2D: Incurred CPU pixel conversion, which is very slow.  texImage2d_test.html:28:30
Error: WebGL: texImage2D: Chosen format/type incurred an expensive reformat: 0x1908/0x1401  texImage2d_test.html:28:30

Cameron - yes, please, a standalone test case would be great.

Comment 24

7 months ago
Created attachment 8796615 [details]
webgl_video_test.html

This is about as simple as I can reduce it to, I think.

It creates a video and canvas element, sets up the webgl context, loads the video, offers simplistic play/pause controls, and renders the video frame as webgl texture.

My setup:

MacBook Pro (13-inch, Late 2011)
OS X 10.11.6
2.8 GHz Intel Core i7
8 GB 1333 MHz DDR3
Intel HD Graphics 3000 512 MB

I tested with Chrome 53.0.2785.143 (64-bit) and Firefox Developer Edition 51.0a2 (2016-09-30) (64-bit) and performance in Chrome is significantly better than Firefox.

I recorded profiling data from both browsers, but while Chrome shows time spent in specific gl calls (i.e. teximage2d) I can't see a way to gather that data in Firefox. 

Firefox did log the following error messages:

Error: WebGL: texImage2D: Failed to hit GPU-copy fast-path. Falling back to CPU upload.
Error: WebGL: texImage2D: Incurred CPU-side conversion, which is very slow.
Error: WebGL: texImage2D: Incurred CPU pixel conversion, which is very slow.
Error: WebGL: texImage2D: Chosen format/type incurred an expensive reformat: 0x1907/0x1401

Here's a 4098x2048 H.264/AAC 30fps video which you can download and use with the attached test case. It's just an example I found online, as I'm unable to share the source material I'm working with.

http://photocreations.ca/3D/fountain-L-4k.mp4

Comment 25

7 months ago
Profile data from Firefox:

https://cleopatra.io/#report=53ac2e9af0a7ba63abdf43fa53b624aee1422380

Comment 26

7 months ago
I was focusing on other issues. I'll start to look this again. The optimization may only work for MacOS. I'm not sure if we really want it at this moment.

Comment 27

7 months ago
Basically we need to do something similar to TexUnpackImage[1] for canvas element. Currently TexUnpackImage is only for video element. I'll study how to do the same thing for canvas.

[1] https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.cpp#474

Comment 28

7 months ago
(In reply to Ethan Lin[:ethlin] from comment #27)
> Basically we need to do something similar to TexUnpackImage[1] for canvas
> element. Currently TexUnpackImage is only for video element. I'll study how
> to do the same thing for canvas.
> 
> [1]
> https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.
> cpp#474

Just to confirm, the repro I attached is using an HTMLVideoElement as the pixel source for texImage2D. It appears to have the same performance issues as when using HTMLCanvasElement as the pixel source.
Bug 1307342 also seems to be related to this bug.

Updated

4 months ago
Whiteboard: gfx-noted, [ft:conndevices] → gfx-noted, [ft:conndevices], [webvr]

Comment 30

3 months ago
Just wanted to add that with the fast path appears never to be taken for regular images as well. This is a separate bug but not sure where to put it (feedback welcome).

User agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:50.0) Gecko/20100101 Firefox/50.0
FF version: 50.1.0

Page to reproduce errors: https://developer.mozilla.org/en-US/docs/Web/API/WebGL_API/Tutorial/Using_textures_in_WebGL

Errors received:
* Error: WebGL: texImage2D: Incurred CPU-side conversion, which is very slow.
* Error: WebGL: texImage2D: Chosen format/type incurred an expensive reformat: 0x1908/0x1401

Error scan be observed on many other sites as well. Using the right format, UNPACK_PREMULTIPLY_ALPHA_WEBGL = false and UNPACK_FLIP_Y_WEBGL = false appears not to have any effect.
Summary: Low performance of texImage2D with canvas → Low performance of texImage2D with canvas and WebGL

Comment 31

3 months ago
Assign to me before finding the right owner.
Assignee: ethlin → howareyou322

Comment 32

3 months ago
Since FF 51.x the error is now this one.
Error: WebGL: texImage2D: Conversion requires pixel reformatting.
But seems to be the same.
Appears with all three.js tests.
Blocks: 1207170
Whiteboard: gfx-noted, [ft:conndevices], [webvr] → gfx-noted, [ft:conndevices], [webvr], [qf]

Comment 33

a month ago
> Error: WebGL: texImage2D: Conversion requires pixel reformatting.

I got similar error with FF52.0.1(32bit).
(My code is modified one of "Basic Usage Example" in https://github.com/pixijs/pixi.js but seems same issue.)

It happens when gl.texImage2D() is called.
So, I picked up related Mozilla sources.

And (not checked the values but) I guess srcFormat and dstFormat for ConvertIfNeeded()in TexUnpackSurface::TexOrSubImage() are not same.

* srcFormat and dstFormat should be WebGLTexelFormat::RGBA8(or RGB565) to match, I think.

srcFormat (== out_*texelFormat in GetFormatForSurf()) is one of
WebGLTexelFormat::BGRA8
WebGLTexelFormat::BGRX8
WebGLTexelFormat::RGBA8
WebGLTexelFormat::RGBX8
WebGLTexelFormat::RGB565
WebGLTexelFormat::A8

dstFormat (== returned value of FormatForPackingInfo()) is one of 
(case LOCAL_GL_UNSIGNED_BYTE)
WebGLTexelFormat::R8
WebGLTexelFormat::A8
WebGLTexelFormat::RA8
WebGLTexelFormat::RGB8
WebGLTexelFormat::RGBA8
WebGLTexelFormat::RG8

----
Related sources

Error message (Conversion requires pixel reformatting.) is here.
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.cpp#316-318
>TexUnpackBlob::ConvertIfNeeded(...)
>{
>    ...
>    if (srcFormat != dstFormat) {
>        webgl->GeneratePerfWarning("%s: Conversion requires pixel reformatting.",
>                                   funcName);
>    }
>    ...
>}

TexUnpackSurface::TexOrSubImage() calls TexUnpackBlob::ConvertIfNeeded() with srcFormat and dstFormat as parameters.
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.cpp#804-805

srcFormat and dstFormat are here.
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.cpp#762,768
>const auto dstFormat = FormatForPackingInfo(dstPI);
>if (!GetFormatForSurf(mSurf, &srcFormat, &srcBPP)) {...}

* TexUnpackBytes::TexOrSubImage() also calls ConvertIfNeeded() but format parameters are same.
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.cpp#445-446

GetFormatForSurf()
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.cpp#701-704

FormatForPackingInfo()
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/TexUnpackBlob.cpp#98-104

----
WebGLTexture::TexImage() calls webgl::TexUnpackBlob::TexOrSubImage()
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/WebGLTextureUpload.cpp#1297

WebGLTexture::TexSubImage() also calls TexOrSubImage().
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/WebGLTextureUpload.cpp#1385

----
WebGLContext::TexImage() calls WebGLTexture::TexImage()
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/WebGLContextTextures.cpp#389,394-395

----
WebGLContext::TexImage2D() (== "WebGL: texImage2D" in error message) calls WebGLContext::TexImage()
https://dxr.mozilla.org/mozilla-central/source/dom/canvas/WebGLContext.h#1180,1187-1190
Whiteboard: gfx-noted, [ft:conndevices], [webvr], [qf] → gfx-noted, [ft:conndevices], [webvr], [qf:p1]
Daosheng will work on this to improve video upload.
Assignee: howareyou322 → dmu
I think we mutated this bug along the way; it started as a canvas 2D performance comparison to Chrome, and is currently also covering the regression in video+WebGL combination.  The later is what comment 34 is about.
Assignee: dmu → jgilbert
Flags: needinfo?(jgilbert)

Comment 36

8 days ago
Hi confirming this issue on Windows and macOS with this three.js demo. I have a very powerful I7 and a GTX1060 card. I doubt I should be seeing these errors ? 

https://threejs.org/examples/webgl_video_panorama_equirectangular.html

I get

Error: WebGL warning: texImage2D: Failed to hit GPU-copy fast-path. Falling back to CPU upload.  
Error: WebGL warning: texImage2D: Conversion requires pixel reformatting.

I also get however errors like this 

THREE.WebGLShader: gl.getShaderInfoLog() fragment WARNING: 0:1: extension 'GL_ARB_gpu_shader5' is not supported
(In reply to Daniel from comment #36)
> Hi confirming this issue on Windows and macOS with this three.js demo. I
> have a very powerful I7 and a GTX1060 card. I doubt I should be seeing these
> errors ? 
> 
> https://threejs.org/examples/webgl_video_panorama_equirectangular.html
> 
> I get
> 
> Error: WebGL warning: texImage2D: Failed to hit GPU-copy fast-path. Falling
> back to CPU upload.  
> Error: WebGL warning: texImage2D: Conversion requires pixel reformatting.
> 
> I also get however errors like this 
> 
> THREE.WebGLShader: gl.getShaderInfoLog() fragment WARNING: 0:1: extension
> 'GL_ARB_gpu_shader5' is not supported

We are working on GL_VIDEO format support for avoiding the conversion for WebGL video at Bug 1322746.
You need to log in before you can comment on or make changes to this bug.