Closed Bug 1872705 Opened 9 months ago Closed 9 months ago

A canvas demo that takes ~1min to render will not draw anything on screen unless you switch away from the tab

Categories

(Core :: Graphics: Canvas2D, defect, P1)

Firefox 123
x86_64
Windows 11
defect

Tracking

()

VERIFIED FIXED
123 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox121 --- unaffected
firefox122 + verified
firefox123 + verified

People

(Reporter: mayankleoboy1, Assigned: bobowen)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: regression)

Attachments

(4 files)

  1. Keep the testcase tab in the foreground all the time. Dont click on anything, dont siwtch tabs, dont resize window. Just wait for 1 minute. Open Windows taskmanager and monitor GPU usage as a proxy for the testcase to complete.
  2. Use latest Nightly on Windows11x64 (which should have D2D canvas)
  3. Open the attached testcase
  4. Choose the following :
    Char -> emoji
    Font size ->3
  5. Keep the testcase tab in the foreground all the time. Dont click on anything, dont siwtch tabs, dont resize window. Just wait for 1 minute. Keep monitoing the GPU usage in the taskmanager. Once the processing is complete, hte GPU use should reduce.

AR: Nothing is drawn on the screen . If you switch tabs, the content will get drawn. Sometimes, even switching tabs doesnt help. Then you can click on canvas->open image in new tab->go to the image tab->return back to the testcase tab.
ER: Not so

Regressed by :
Bug 1863914: Use multiple shmem buffers for remote canvas recording. r=aosmond
Differential Revision: https://phabricator.services.mozilla.com/D193207

This is a slightly modified version of the testcase from bug 1870986. The modification is to change the following :

Original : const DPF = 512; // dots per frame
Modified: const DPF = 512000; // dots per frame

What this does (i think) is wait to process 512000 dots before drawing on the screen. Which means that the complete image is processed and then rendered.

Attached file about:support

Set release status flags based on info from the regressing bug 1863914

:bobowen, since you are the author of the regressor, bug 1863914, could you take a look? Also, could you set the severity field?

For more information, please visit BugBot documentation.

Flags: needinfo?(bobowencode)
Severity: -- → S2
OS: Unspecified → Windows 11
Priority: -- → P1
Hardware: Unspecified → x86_64
Version: unspecified → Firefox 123

FWIW, on the latest Nightly I get the following in the gfx-log:

(#0) GP+[GFX1-]: RemoteTexture ready timeout
(#1) GP+[GFX1-]: RemoteTexture ready timeout
(#2) GP+[GFX1-]: remote texture for rendering does not exist id:39
(#3) GP+[GFX1-]: RemoteTexture ready timeout
(#4) GP+[GFX1-]: RemoteTexture ready timeout

The bug is marked as tracked for firefox122 (beta) and tracked for firefox123 (nightly). We have limited time to fix this, the soft freeze is in 8 days. However, the bug still isn't assigned.

:bhood, could you please find an assignee for this tracked bug? Given that it is a regression and we know the cause, we could also simply backout the regressor. If you disagree with the tracking decision, please talk with the release managers.

For more information, please visit BugBot documentation.

Flags: needinfo?(bhood)
Assignee: nobody → lsalzman
Flags: needinfo?(bhood)

From looking at profiles on Release and Beta, it looks like the back pressure from waiting for the ring buffer to clear is helping.
Potentially, in the new code without that we are getting loads more draw events, creating a backlog in the canvas worker thread and hitting some sort of timeout.

lsalzman - any ideas as to what might be the best option here?

Flags: needinfo?(bobowencode) → needinfo?(lsalzman)

Exploring the back pressure idea, I've created a WIP patch that set's a maximum on the number of buffers.
This seems to fix the issue and is fairly simple.
Not sure how big we'd want to make this, in this patch the total recycled buffers would be 8 MB.
I imagine we'd want it to be configurable.

See Also: → 1873588

(In reply to Bob Owen (:bobowen) from comment #9)

Try push where I thought there might be a mac animometer issue:
https://treeherder.mozilla.org/perfherder/compare?originalProject=try&originalRevision=27b539227c8a3ffa45311c05026e9b1a57802b5a&newProject=try&newRevision=95568777446e24d0d875a6f2f20d3cbc95e78241&framework=13&page=1

Second try push where it seems to have disappeared (I added in the pref, so I'll update the WIP patch):
https://treeherder.mozilla.org/perfherder/compare?originalProject=try&originalRevision=27b539227c8a3ffa45311c05026e9b1a57802b5a&newProject=try&newRevision=f6b68014fe0f95a79c5d00d16b70d73bc695363d&framework=13&page=1&showOnlyComparable=1

These try pushes do look like they fix the issue. After i run the attached testcase, the final image is drawn on the screen.

OK as this seems to help with bug 1873588 as well, I'm going to put the patch up for review.

Assignee: lsalzman → bobowencode
Status: NEW → ASSIGNED
Attachment #9371446 - Attachment description: WIP: Bug 1872705: Set a maximum number of recycled canvas buffers. → Bug 1872705: Set a maximum number of recycled canvas buffers. r=#gfx-reviewers
Flags: needinfo?(lsalzman)
Attachment #9371446 - Attachment description: Bug 1872705: Set a maximum number of recycled canvas buffers. r=#gfx-reviewers → Bug 1872705: Set a maximum number of recycled canvas buffers. r=lsalzman
Pushed by bobowencode@gmail.com: https://hg.mozilla.org/integration/autoland/rev/bc06c1cf74de Set a maximum number of recycled canvas buffers. r=gfx-reviewers,lsalzman

Do either of you think this is something worth uplifting to Beta?
We're getting quite late in the cycle now, but it "looks" fairly simple. :-)

Flags: needinfo?(lsalzman)
Flags: needinfo?(aosmond)
Flags: needinfo?(lsalzman)
Flags: needinfo?(bobowencode)
Flags: needinfo?(aosmond)
Flags: needinfo?(bobowencode)
Attachment #9372016 - Flags: approval-mozilla-beta?

(In reply to Bob Owen (:bobowen) from comment #15)

If the wait fails we already do that:
https://searchfox.org/mozilla-central/rev/52f45d12530f9f8da61c6c7237003b5d1499b6d1/gfx/layers/CanvasDrawEventRecorder.cpp#152

Oh good, thanks for sanity checking me :).

Status: ASSIGNED → RESOLVED
Closed: 9 months ago
Resolution: --- → FIXED
Target Milestone: --- → 123 Branch
Duplicate of this bug: 1873668
Duplicate of this bug: 1873600

:bobowen the uplift request is missing an uplift request form.
You can add it by selecting the Change Uplift Request form Action in the phab revision

Flags: needinfo?(bobowencode)

Uplift Approval Request

  • Code covered by automated testing: yes
  • Steps to reproduce for manual QE testing: Follow steps in description.
  • User impact if declined: Examples of issues in this bug and bugs 1873600, 1873668 will continue to occur. i.e. Canvas JS that can cause very large amounts of drawing events within a transaction can overwhelm the canvas worker thread in the GPU process.
  • Risk associated with taking this patch: Low to medium
  • Needs manual QE test: yes
  • Fix verified in Nightly: no
  • String changes made/needed: None
  • Is Android affected?: yes
  • Explanation of risk level: Fairly simple patch to limit the number of shared memory buffers used and use existing wait mechanism if limit is hit.
Flags: qe-verify+

(In reply to Donal Meehan [:dmeehan] from comment #21)

:bobowen the uplift request is missing an uplift request form.
You can add it by selecting the Change Uplift Request form Action in the phab revision

Sorry, it's the first time I've requested via Phabricator.

Flags: needinfo?(bobowencode)

(In reply to Bob Owen (:bobowen) from comment #23)

(In reply to Donal Meehan [:dmeehan] from comment #21)

:bobowen the uplift request is missing an uplift request form.
You can add it by selecting the Change Uplift Request form Action in the phab revision

Sorry, it's the first time I've requested via Phabricator.

No problem, thanks!

Duplicate of this bug: 1873588
Attachment #9372016 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
QA Whiteboard: [qa-triaged]

I was able to reproduce the issue on Win11x64 using Firefox build 123.0a1(20240102095659). Before the fix attached testcase did not load for at least ~2 minutes and until tab was changed twice.
After the fix on Nightly 123.0a1(20240111210611) it took less than 30 seconds for the attached testcase to load without having to change tabs, but I got a infobar message :' This page is slowing down Nightly. To speed up your browser, stop this page.'
Pending beta9 to confirm the fix on Beta.

Bob Owen is the infobar message expected? Thank you.

Flags: needinfo?(bobowencode)

(In reply to Monica Chiorean from comment #27)

I was able to reproduce the issue on Win11x64 using Firefox build 123.0a1(20240102095659). Before the fix attached testcase did not load for at least ~2 minutes and until tab was changed twice.
After the fix on Nightly 123.0a1(20240111210611) it took less than 30 seconds for the attached testcase to load without having to change tabs, but I got a infobar message :' This page is slowing down Nightly. To speed up your browser, stop this page.'
Pending beta9 to confirm the fix on Beta.

Bob Owen is the infobar message expected? Thank you.

I wouldn't say it is expected, but it is quite likely because the test page has very heavy use of JS.

Flags: needinfo?(bobowencode)

On Beta 122.0b9(20240112091933) it took less than 30 seconds for the attached testcase to load without having to change tabs. Mark as verified.

Flags: qe-verify+
See Also: 1873588
Status: RESOLVED → VERIFIED
Duplicate of this bug: 1871865
Blocks: 1831609
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: