Closed Bug 1850775 Opened 1 year ago Closed 1 year ago

We block in CanvasChild::OnTextureForwarded() during Charts-chartjs

Categories

(Core :: Graphics: Canvas2D, enhancement)

All
Windows
enhancement

Tracking

()

RESOLVED FIXED
123 Branch
Tracking Status
firefox121 --- wontfix
firefox122 --- wontfix
firefox123 --- fixed

People

(Reporter: mstange, Assigned: lsalzman)

References

(Blocks 1 open bug)

Details

(Whiteboard: [sp3])

Attachments

(2 obsolete files)

Profile: https://share.firefox.dev/47V3VFt

https://speedometer-preview.netlify.app/?suite=Charts-chartjs

With the larger buffer from bug 1841891 we now no longer block during fill/stroke, but some blocking has moved into the paint that happens during CanvasChild::OnTextureForwarded().

If we can remove this wait, our Charts-chartjs score should improve by a lot, up to 10% on the Windows machine that we're using for profiling.

Type: defect → enhancement

Made it back to this and got a fairly hacked together example of how this might work (waiting in the gpu webrender thread instead of in content process for the lock in the Canvas thread).
Hopefully there isn't a fault in this experimental version that means the results below are not valid:
https://treeherder.mozilla.org/perfherder/comparesubtest?originalProject=try&newProject=try&newRevision=1658777b297094afd7307c87ce215c2f6744640b&originalSignature=4586009&newSignature=4586009&framework=13&application=firefox&originalRevision=f4d355aaae63e220e65e3a5c96755aab227e3363&page=1&filter=chartjs

It's a bit messy because it involves linking together parts that are very loosely coupled at the moment.

Can you give an overview of how it works?

Flags: needinfo?(bobowencode)

(In reply to Jeff Muizelaar [:jrmuizel] from comment #2)

Can you give an overview of how it works?

Sorry, as I was making changes I decided to finish working on the use of multiple shmems instead of a single ring buffer, because there was a small amount of overlap.
This was also partly because we were still seeing crashes in bug 1851636 since the buffer switching. Although looking now they only appear to be in Nightly for some reason.
Here's a comparison of one of the latest versions

For the waiting in the GPU part, I pass up the canvas checkpoint as part of the TimedTexture and then I've added a mechanism so we can wait for those checkpoints in the GPU process as well. This is done before we take the Lock in the RenderTextureHost.
In the latest version I've also changed it so we wait for the unlock instead of the lock, so it could be used later for textures that don't have their own internal locking.
I still need to do some clean-up in general and from the rebase from aosmond's recent changes, but I should have the patches up soon.

Flags: needinfo?(bobowencode)

(In reply to Bob Owen (:bobowen) from comment #3)

(In reply to Jeff Muizelaar [:jrmuizel] from comment #2)

Can you give an overview of how it works?

Sorry, as I was making changes I decided to finish working on the use of multiple shmems instead of a single ring buffer, because there was a small amount of overlap.

Is there a bug for the multiple shmem work?

(In reply to Jeff Muizelaar [:jrmuizel] from comment #4)

(In reply to Bob Owen (:bobowen) from comment #3)

(In reply to Jeff Muizelaar [:jrmuizel] from comment #2)

Can you give an overview of how it works?

Sorry, as I was making changes I decided to finish working on the use of multiple shmems instead of a single ring buffer, because there was a small amount of overlap.

Is there a bug for the multiple shmem work?

I was thinking I would just use this bug, but I could split it out.

I've found some sort of deadlock now that I've rebased on top of the CanvasManagerParent changes, just trying to track that down now.

(In reply to Bob Owen (:bobowen) from comment #6)

I've found some sort of deadlock now that I've rebased on top of the CanvasManagerParent changes, just trying to track that down now.

Turned out to be a change I'd made earlier to try and speed up getImageData was flooding the CanvasWorkers with mapping of snapshots when we didn't actually need them. With that change reverted chartjs seems slightly faster as well.

Hopefully not too far from review now.

This replaces the use of a single large ring buffer.
The buffers are still processed in parallel and are recycled to reduce
allocation. Events that do not fit in the default sized buffer have a separate
buffer created to fit them. These large buffers are not recycled.
Separate shared memory is used for readback, with a single shmem cached for this
purpose. Generally only one cached shmem should be required, because the
operations that usually readback the data do it straight away.

This moves the wait that currently happens for the texture write locks in the
content process (blocking the main thread) up to the GPU process, to just before
the texture is locked for rendering.
The canvas recording checkpoint is passed up as part of the TimedTexture and the
wait is done using a new mechanism provided by the CanvasManagerParent and
CanvasTranslator.

Depends on D192273

Hitting an issue on a lot of tests where we are forwarding images, but the external image has not been registered yet.
This is basically to do with the use of TextureHost wrappers that have an external image ID but have not actually registered the image.

I have a couple of possible solutions, one of which is an improvement that removes the need for the external image lookup and the CanvasTranslator lookup. I was going to leave this idea until later, but I think given this issue it makes sense to do it now.

https://treeherder.mozilla.org/jobs?repo=try&duplicate_jobs=visible&revision=da3a1d5b9fa49fc31d506983554c8c21096a39ea

Assignee: nobody → bobowencode
Attachment #9361085 - Attachment description: WIP: Bug 1850775 p1: Use multiple shmem buffers for remote canvas recording. → Bug 1850775 p1: Use multiple shmem buffers for remote canvas recording. r=aosmond!
Status: NEW → ASSIGNED
Attachment #9361086 - Attachment description: WIP: Bug 1850775 p2: Wait for Canvas Drawing in the GPU process before locking. → Bug 1850775 p2: Wait for Canvas Drawing in the GPU process before locking. r=aosmond!

I came across about some jank visiting a news website (with animated ads) that might be related, the profile I took shows 7.2s in mozilla::layers::CanvasChild::OnTextureForwarded which might be related here?

(This might then better be a defect to be appropriately triaged? The web site I was visiting is very popular in Germany.)

Flags: needinfo?(lsalzman)
Flags: needinfo?(lsalzman) → needinfo?(bobowencode)

(In reply to Jens Stutte [:jstutte] from comment #12)

I came across about some jank visiting a news website (with animated ads) that might be related, the profile I took shows 7.2s in mozilla::layers::CanvasChild::OnTextureForwarded which might be related here?

(In reply to Jens Stutte [:jstutte] from comment #13)

(This might then better be a defect to be appropriately triaged? The web site I was visiting is very popular in Germany.)

I don't seem to get a similar problem, so I can't test these changes, but I guess it might depend on the ads that are served.
Either way there are patches up for review, so I'm not sure reclassifying the bug will make much difference.

Flags: needinfo?(bobowencode)

Yes, this bug has a defined purpose and it may or may not address the issue on the news site. Jens, please file a new bug about the issue you found. If you can reproduce it, please get a profile with the "Graphics" preset, too.

See Also: → 1863884

After an Element chat with lsalzman, it seems like we may not need the second patch as the issue it is fixing should be addressed by the use of RemoteTextureMap in his changes.

So, I'm going to move the first patch to a separate bug specific to that change and I'll set the patch on this bug to a WIP one in case we do decide we want to land it.

Depends on: 1863914
Attachment #9361086 - Attachment description: Bug 1850775 p2: Wait for Canvas Drawing in the GPU process before locking. r=aosmond! → WIP: Bug 1850775: Wait for Canvas Drawing in the GPU process before locking.
Attachment #9361085 - Attachment is obsolete: true

This ended up being fixed by the work in bug 1829026.

Assignee: bobowencode → lsalzman
Depends on: 1829026
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 123 Branch
Depends on: 1871467
Attachment #9361086 - Attachment is obsolete: true
See Also: → 1812362
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: