CanvasRenderingContext2D::DrawImage takes lots of time when scrolling Google Spreadsheet

ASSIGNED
3 years ago
19 days ago

People

(Reporter: smaug, Assigned: jerry, NeedInfo)

Tracking

(Blocks: 2 bugs, {perf})

Trunk
Points:
---

Firefox Tracking Flags

(platform-rel +)

Details

(Whiteboard: [qf:p3][platform-rel-Google][platform-rel-GoogleSuite][platform-rel-GoogleSheets])

Attachments

(5 attachments, 1 obsolete attachment)

Once bug 1237058 is fixed, profiles show that quite a bit of time, ~20%-30%, of scrolling
https://docs.google.com/spreadsheets/d/10UeyRoiWV2HjkWwAU51HXyXAV7YLi4BjDm55mr5Xv6c/edit?pref=2&pli=1#gid=368835050 is spent in and under CanvasRenderingContext2D::DrawImage (on Linux).

Perhaps someone more familiar with gfx could take a look to see if there is something to optimize.
Flags: needinfo?(milan)
Blocks: 1260981
Jerry, I can find somebody to look at this, but since you've already done Google Doc performance work, thought I'd ask if you're interested.
Flags: needinfo?(milan) → needinfo?(hshih)
We'd want to see if we're having the same problem on Windows.
Version: 36 Branch → Trunk
Whiteboard: [platform-rel-Google][platform-rel-GoogleDocs]
Hi Olli,

Do you already have profiler data for the expensive CanvasRenderingContext2D::DrawImage() call?
Assignee: nobody → hshih
Status: NEW → ASSIGNED
Flags: needinfo?(hshih) → needinfo?(bugs)
I use Zoom for profiling. I could upload a Zoom profile somewhere if that is useful.
The Gecko profiler wasn't working the last time I tried it a couple of weeks ago (some issue with cleopatra IIRC).
Flags: needinfo?(bugs)
Does that profile help?
Flags: needinfo?(hshih)
platform-rel: --- → ?
It looks like Google Spreadsheet uses the same canvas as both the source and the drawing destination.

https://hg.mozilla.org/mozilla-central/annotate/51377a64158941f89ed73f388ae437cfa494c030/dom/canvas/CanvasRenderingContext2D.cpp#l4733

Maybe we can still do something for this case.
Flags: needinfo?(hshih)
See Also: → bug 1083672, bug 1090323
First of all, I'm trying to reduce the surface initialization cost.
Since we always create a cropped sourceSurface in [1], the operation that fills it with (0,0,0,0) could be skipped.
I think the performance boost might be small, but let's do it first.

The cost for surface initialization:
https://cleopatra.io/#report=2209670eadb28cd5ea74293d3cf4150522055cd7&selection=0,1,2,3,3,4,5,6,4,7,8,9,10,11,12,13,14,15,16,11,12,17,18,11,12,19,20,21,22,23,24,25,26,27,28,29,30,31


[1]
https://hg.mozilla.org/mozilla-central/annotate/51377a64158941f89ed73f388ae437cfa494c030/dom/canvas/CanvasRenderingContext2D.cpp#l4372
Created attachment 8765182 [details] [diff] [review]
P1: create a DrawTarget with the existing Surface content data. v1

Merge the DrawTarget creation and CopySurface() into one function.
Then we can override this function to avoid the DrawTarget content initialization cost on some platforms.
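
The idea behind P1 can be modelled outside Gecko. The following is a minimal sketch in Python, not the patch itself: `create_then_copy` and `create_with_data` are hypothetical names, and a `bytearray` stands in for the DrawTarget's pixel storage. It only illustrates that the zero fill in the two-step path is overwritten right away, so a merged function can produce the same content in a single pass.

```python
def create_then_copy(src: bytes) -> bytearray:
    """Two-step path: allocate a zero-filled buffer, then overwrite it."""
    dst = bytearray(len(src))  # every byte is initialized to 0 here...
    dst[:] = src               # ...and immediately overwritten here
    return dst


def create_with_data(src: bytes) -> bytearray:
    """Merged path: build the buffer directly from the source bytes."""
    return bytearray(src)      # single pass, no separate zero fill


# Stand-in for the cropped surface content (hypothetical data).
pixels = bytes(range(256)) * 4

# Both paths end with identical content; only the redundant fill differs.
assert create_then_copy(pixels) == create_with_data(pixels)
```

Whether a backend can actually skip the fill depends on its allocation API, which is why the patch leaves the merged function overridable per backend.
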
Created attachment 8765184 [details] [diff] [review]
P2: DrawTargetSkia::CreateSimilarDrawTargetWithSurfaceData() impl. v1
Created attachment 8765185 [details] [diff] [review]
P3: DrawTargetD2D1::CreateSimilarDrawTargetWithSurfaceData() impl. v1
I'm using this canvas drawImage() test:
https://bug1083672.bmoattachments.org/attachment.cgi?id=8505998

Performance on Windows is extremely slow when I use an Intel HD 530.

copy from A to A (Windows)
  firefox (d2d1): 3700 op/s
  google chrome: 8300 op/s

copy from A to A (Mac)
  firefox (skagl): 51913 op/s
  google chrome: 1208 op/s
The DrawTargetD2D1::Snapshot()[1] call is heavy. It calls flush() the first time we use Snapshot().

[1]
https://hg.mozilla.org/mozilla-central/annotate/c2da34d96746288b5fee27bf6542a12c9f410988/gfx/2d/DrawTargetD2D1.cpp#l89
Component: Graphics → Canvas: 2D
platform-rel: ? → +
Whiteboard: [platform-rel-Google][platform-rel-GoogleDocs] → [platform-rel-Google][platform-rel-GoogleSuite][platform-rel-GoogleSheets]
This should be measured again after bug 1335149 lands, just in case it changes the profile.
I will rerun the test from comment 12 with the bug 1335149 fix later.
Created attachment 8835301 [details]
scrollTest.html
Attachment #8834827 - Attachment is obsolete: true
There are two test cases in this experiment:

Test case 1 : https://bug1083672.bmoattachments.org/attachment.cgi?id=8505998
  The original test case from Bug 1083672. It contains two test items.
  1. The first one calls the canvas drawImage function with the same source and destination (copy a rect at position x in a canvas and draw it at position x' in the same canvas). We call this A->A.
  2. The second one does the same task as the first with a slight difference: we copy the source from the original canvas (canvas A) to another canvas (canvas B), then copy it from canvas B back to the destination position in canvas A. We call this A->B->A.
  The test case measures how many copy operations can be done within a specific timeout.
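
The "operations within a timeout" measurement can be sketched as follows. This is a minimal Python model, not the code of the attached test: `ops_per_second` and its `timeout_s` parameter are hypothetical names, and the measured operation is an arbitrary stand-in for a drawImage call.

```python
import time


def ops_per_second(op, timeout_s=0.05):
    """Run `op` repeatedly until `timeout_s` elapses and return the rate."""
    count = 0
    start = time.perf_counter()
    deadline = start + timeout_s
    while time.perf_counter() < deadline:
        op()
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed


# Stand-in workload; the real test would issue one copy operation here.
rate = ops_per_second(lambda: sum(range(100)))
print(f"{rate:.2f} Op/s")
```

Dividing by the measured elapsed time rather than the nominal timeout keeps the rate honest when the last operation overruns the deadline.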

Test case 2 : attachment 8835301 [details] The spreadsheet scrolling simulation test.
  Like test case 1, this test case contains A->A and A->B->A tests. The difference is that we try to simulate the real situation of spreadsheet scrolling:
   1. The size of the canvas is similar to the real screen size.
   2. We replace rect(0, 0, width, height - 50) with rect(0, 50, width, height - 50) in the canvas to simulate scrolling down in the spreadsheet. 50 is the number of pixels moved by one scroll step (this value can be changed through the "pace" variable in the test case).
   The tests measure how many copy operations can be done within a specific timeout.
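
The two scroll variants can be modelled on a plain row buffer. This is a minimal sketch in Python, assuming the canvas can be represented as a list of pixel rows; the function names are hypothetical and the attachment's actual code is not reproduced here. It shows that both variants move the rows starting at `pace` up to row 0 and leave the bottom `pace` rows untouched.

```python
def make_canvas(width, height):
    """Build a canvas where every pixel in row y holds the value y."""
    return [[y] * width for y in range(height)]


def scroll_a_to_a(canvas, pace):
    """A->A: copy rows pace..height-1 onto rows 0..height-pace-1 in place."""
    height = len(canvas)
    # Ascending order is safe: row y reads row y + pace, which has not
    # been overwritten yet because writes only touch rows <= y.
    for y in range(height - pace):
        canvas[y] = list(canvas[y + pace])


def scroll_a_to_b_to_a(canvas, pace):
    """A->B->A: copy the source rows to a scratch buffer, then back."""
    scratch = [list(row) for row in canvas[pace:]]  # A -> B
    for y, row in enumerate(scratch):               # B -> A
        canvas[y] = row


a = make_canvas(4, 10)
b = make_canvas(4, 10)
scroll_a_to_a(a, 3)
scroll_a_to_b_to_a(b, 3)
assert a == b  # both variants produce the same pixels
```

The model says nothing about speed: in a real backend the A->A case may need an internal snapshot of the source because the regions overlap, which is the cost the experiments below try to measure.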


The following data was measured with an m-c release build at changeset id f4f37462211 on Windows 10.

Experiment 1 : Does the fix in Bug 1335149 affect the performance of drawImage?
In Bug 1335149, we removed some flush() calls from the codebase. This experiment tests whether that change affects performance.

  Group 1 with fix
    Test case 1 / with Bug 1335149 / backends direct2d1.1
      A->A
      Result: 8618.16 Op/s. 
      A->B->A
      Result: 26910.40 Op/s. 

    Test case 1 / with Bug 1335149 / backends skia
      A->A
      Result: 36074.09 Op/s.
      A->B->A
      Result: 25321.54 Op/s. 

  Group 2 without fix
    Test case 1 / without Bug 1335149 / backends direct2d1.1
      A->A
      Result: 8691.31 Op/s. 
      A->B->A
      Result: 26025.11 Op/s. 

    Test case 1 / without Bug 1335149 / backends skia
      A->A
      Result: 37341.39 Op/s. 
      A->B->A
      Result: 27387.14 Op/s. 

Conclusion : There are no significant performance changes with this fix. The difference on the skia backend is probably measurement noise.
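
To put the two groups side by side, the relative change of each result can be computed from the figures above. This is a small Python sketch; `percent_change` is a hypothetical helper, and the numbers are copied from Group 1 (with fix) and Group 2 (without fix).

```python
def percent_change(baseline, measured):
    """Relative change of `measured` against `baseline`, in percent."""
    return (measured - baseline) / baseline * 100.0


# (backend, test) -> (without fix, with fix), in Op/s, from Experiment 1.
results = {
    ("direct2d1.1", "A->A"): (8691.31, 8618.16),
    ("direct2d1.1", "A->B->A"): (26025.11, 26910.40),
    ("skia", "A->A"): (37341.39, 36074.09),
    ("skia", "A->B->A"): (27387.14, 25321.54),
}

changes = {
    key: percent_change(without_fix, with_fix)
    for key, (without_fix, with_fix) in results.items()
}

for (backend, test), change in changes.items():
    print(f"{backend:12s} {test:8s} {change:+.2f}%")
```

With a single run per configuration there is no variance estimate, so these percentages cannot by themselves separate a real regression from run-to-run noise.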


Experiment 2 : Is A->B->A better than A->A in a real situation?
We use test case 2 to simulate spreadsheet scrolling, which is characterized by large drawImage copies, and check whether A->B->A is still faster than A->A.
  Test case 2 / backends direct2d1.1
    A->A
    Result: 60.68 Op/s. 
    A->B->A
    Result: 60.92 Op/s. 

  Test case 2 / backends skia
    A->A
    Result: 60.93 Op/s. 
    A->B->A
    Result: 60.99 Op/s. 

Conclusion : A->B->A is not better than A->A in this case. The reason might be the copy size.


Additional Experiment 3 : Check the relation between performance and copy size.
In this case, we modify the canvas size and copy size in test case 1 to check whether the performance improvement is related to the copy size.
Group 1 : canvas size (2048, 1024) copy size (64, 64)
  Test case 1 / backends direct2d1.1
    A->A
    Result: 9512.20 Op/s. 
    A->B->A
    Result: 27548.17 Op/s. 
    Improvement rate (A->B->A / A->A)
    Result: 2.8

  Test case 1 / backends skia
    A->A
    Result: 37894.37 Op/s. 
    A->B->A
    Result: 30116.53 Op/s. 
    Improvement rate (A->B->A / A->A)
    Result: 0.79

Group 2 : canvas size (2048, 1024) copy size (512, 512)
  Test case 1 / backends direct2d1.1
    A->A
    Result: 3806.89 Op/s. 
    A->B->A
    Result: 4002.40 Op/s. 
    Improvement rate (A->B->A / A->A)
    Result: 1.05

  Test case 1 / backends skia
    A->A
    Result: 1635.74 Op/s. 
    A->B->A
    Result: 2671.75 Op/s. 
    Improvement rate (A->B->A / A->A)
    Result: 1.6

Group 3 : canvas size (2048, 1024) copy size (1024, 512)
  Test case 1 / backends direct2d1.1
    A->A
    Result: 2221.91 Op/s. 
    A->B->A
    Result: 1847.12 Op/s. 
    Improvement rate (A->B->A / A->A)
    Result: 0.83

  Test case 1 / backends skia
    A->A
    Result: 983.47 Op/s. 
    A->B->A
    Result: 1609.18 Op/s. 
    Improvement rate (A->B->A / A->A)
    Result: 1.6

Conclusion : With the direct2d1.1 backend, the bigger the copy size, the smaller the benefit of A->B->A, until it becomes even worse than A->A; with the skia backend, however, the result seems to be the opposite.
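
The improvement rates can be recomputed from the raw figures. This is a small Python sketch; `improvement_rate` is a hypothetical helper and the Op/s values are copied from the three groups above. Note that the first direct2d1.1 ratio comes out at about 2.9, where the table lists 2.8.

```python
def improvement_rate(a_to_a, a_to_b_to_a):
    """Ratio (A->B->A / A->A); values above 1 mean A->B->A is faster."""
    return a_to_b_to_a / a_to_a


# (copy size, backend, A->A Op/s, A->B->A Op/s) from Experiment 3.
measurements = [
    ("64x64", "direct2d1.1", 9512.20, 27548.17),
    ("64x64", "skia", 37894.37, 30116.53),
    ("512x512", "direct2d1.1", 3806.89, 4002.40),
    ("512x512", "skia", 1635.74, 2671.75),
    ("1024x512", "direct2d1.1", 2221.91, 1847.12),
    ("1024x512", "skia", 983.47, 1609.18),
]

rates = {
    (size, backend): improvement_rate(a_to_a, a_to_b_to_a)
    for size, backend, a_to_a, a_to_b_to_a in measurements
}

for (size, backend), rate in rates.items():
    print(f"{size:9s} {backend:12s} {rate:.2f}")
```
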
George has been looking at (unrelated?) canvas performance issues.
Flags: needinfo?(gwright)
Thanks, Kevin.


Anyway, we might be able to reduce the time spent in the ExtractSubrect() function a little bit.
https://dxr.mozilla.org/mozilla-central/rev/25a94c1047e793ef096d8556fa3c26dd72bd37d7/dom/canvas/CanvasRenderingContext2D.cpp#4663
Gecko creates a new drawTarget with initialized content and then copies other data over it.
The initialization of that drawTarget is redundant. I will file a bug for this change.
Flags: needinfo?(hshih)
Blocks: 1340130
Whiteboard: [platform-rel-Google][platform-rel-GoogleSuite][platform-rel-GoogleSheets] → [qf:p3][platform-rel-Google][platform-rel-GoogleSuite][platform-rel-GoogleSheets]