1265824 - Add a ClientStorageTextureSource

BufferTextureHost already has most of what you need to get client storage working, you just need to add a ClientStorageTextureSource that wraps the BufferTextureHost the same way we do with the basic compositor (see BufferTextureHost::EnsureWrappingTextureSource). So we don't need for an extra TextureClient/Host pair.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Updated

•

9 years ago

Blocks: 1282797

Matt Woodrow (:mattwoodrow)

Comment 6

•

9 years ago

Attached patch WIP: client storage support — Details — Splinter Review

Based on m-c 300620:1828937da949 This code holds the read lock to the textures for quite a few frames after they have been used (probably more than necessary) to ensure that OpenGL has finished reading from them before we attempt to write to the system memory again. This unfortunately interacts really poorly with our tile recycling, and we end up doing a lot of allocations (and spending time in the VM mapping and zeroing memory). We probably need to figure out the minimum number of frames we need to hold them for, and then fix the tile pool to recycle nicely for this. Alternatively we should try blocking the compositor until previous frames have fully completed (using the apple fence extension?) so that our existing recycling assumptions work. The patch also has code to temporarily disable alignment to partial tiles. Aligning to partial tiles means that we get lots of invalidations for partial tiles, and spend a lot of time reading from the old front buffer into our new back which was interacting badly with the above recycling issues. This is just experimental, probably not something we want to do for real (since we can get partial tiles in other ways, like the document boundary).

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 7

•

9 years ago

It looks like CoreAnimations is actually using TexSubImage in addition to TexImage (this makes some sense). I'll investigate more into what they're doing.

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 8

•

9 years ago

(In reply to Jeff Muizelaar [:jrmuizel] from comment #7) > It looks like CoreAnimations is actually using TexSubImage in addition to > TexImage (this makes some sense). I'll investigate more into what they're > doing. Yeah, they seem to be doing full TexSubImage2Ds. This explains why we were seeing so much more allocation traffic then Safari. They are also not TexSubImage2ding the same textures every frame. They seem to be alternating between sets of textures per frame. (i.e. it appears to be double buffering, but could be something more sophisticated).

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 9

•

9 years ago

The basic picture here seems to be to: 1. texture->TexImage2D(size, buffer_addr); 2. texture->Draw() 3. Swap(); 4. texture2->TexImage2D(size, buffer_addr2); 5. texture2->Draw(); 6. Swap(); 8. texture1->TexSubImage2D(size, buffer_addr); 9. texture1->Draw(); 10. Swap(); 11. texture1->TexSubImage2D(size, buffer_addr2); 12. texture1->Draw(); etc. The modifications of the buffer_addr*[] seem to be protected by fences but the exact mechanism is not clear to me. Typically the same buffer is always used with TexImage2D and TexSubImage2D as otherwise a copy will occur. Coreanimation does not always flip between sets of textures. Sometimes it uses the same ones in a row. I don't know why.

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 10

•

9 years ago

It turns out all of the weirdness I was seeing was because of the out-of-order dtrace printing. When sorting the logs entries by the order they happened everything seems to make sense. Here's my current understanding. Frame1 0. fence1->FinishFence() 1. texture->TexImage2D(size, buffer_addr); 2. texture->Draw() 3. Swap(); 4. fence1->SetFence(); Frame2 5. fence2->FinishFence(); 6. texture2->TexImage2D(size, buffer_addr2); 7. texture2->Draw(); 8. read_unlock(buffer_addr); 9. Swap(); 10. fence2->SetFence(); Frame3 11. fence1->FinishFence(); 12. texture1->TexSubImage2D(size, buffer_addr); 13. texture1->Draw(); 14. read_unlock(buffer_addr2); 15. Swap(); 16. fenec1->SetFence(); Frame4 17. fence2->FinishFence(); 18. texture1->TexSubImage2D(size, buffer_addr2); 19. texture1->Draw(); 20. read_unlock(buffer_addr); 21. Swap(); 22. fence2->SetFence();

Matt Woodrow (:mattwoodrow)

Comment 11

•

9 years ago

The other interesting thing to know would be when do they write to buffer_addr(2)? I assume it's between the fence and the TexImag2D call.

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 12

•

9 years ago

My understanding is that the write to buffer_addr will happen sometime after the read_unlock(buffer_addr2) and before the next TexSubImage2D(size, buffer_addr2). Unfortunately, confirming this is difficult as the writes happen in another process.

Markus Stange [:mstange]

Comment 13

•

9 years ago

In particular, assuming comment 12 is true, the timing of the write violates the client storage spec. The spec says that, before you write to the memory, you need to bind a different buffer to your texture with another call to glTexImage2D or delete the texture. It does not look like CoreAnimation is doing that. Also, Jeff's investigations are being done on 10.9. CoreAnimation's behavior may well have changed in the meantime.

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 14

•

9 years ago

(In reply to Nicolas Silva [:nical] from comment #5) > BufferTextureHost already has most of what you need to get client storage > working, you just need to add a ClientStorageTextureSource that wraps the > BufferTextureHost the same way we do with the basic compositor (see > BufferTextureHost::EnsureWrappingTextureSource). > So we don't need for an extra TextureClient/Host pair. Since we now need the texture's lifetime to persist as long as the buffer it seems like we need a different approach. What would you recommend nical?

Flags: needinfo?(nical.bugzilla)

Nicolas Silva [:nical]

Comment 15

•

9 years ago

(In reply to Jeff Muizelaar [:jrmuizel] from comment #14) > Since we now need the texture's lifetime to persist as long as the buffer it > seems like we need a different approach. What would you recommend nical? I am not sure I understand what you mean by that (or in what way it is different from what we had before, which I don't remember in detail). As long as all you need to share between the content and compositor processes is a shmem, there is no need to add a new TextureClient/Host pair. The ClientStorage TextureSource can do all of the API-specific things and there are hooks in BufferTextureHost for all of the interesting steps the TextureSource would need to be notified.

Flags: needinfo?(nical.bugzilla)

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 16

•

9 years ago

I have a test app at https://github.com/jrmuizel/client-storage that is happily uploading about 4k*4k of textures per frame with no noticeable cpu usage being spent in texture upload.

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 17

•

9 years ago

I found the missing piece for synchronization. CoreAnimation calls FinishObjectAPPLE on the textures before read_unlocking them.

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 18

•

9 years ago

I've updated the repo above to use FinishObjectAPPLE for synchronization. It works although is slow because we call FinishObjectAPPLE immediately after Swap() which causes us to actually block. CoreAnimation doesn't call FinishObjectAPPLE until the next frame. Here's the updated timeline: Frame1 1. texture1->TexImage2D(size, buffer_addr1); 2. texture1->Draw() 3. Swap(); Frame2 4. texture2->TexImage2D(size, buffer_addr2); 5. texture2->Draw(); 6. texture1->FinishObject(); 7. read_unlock(buffer_addr1); 8. Swap(); Frame3 9. texture1->TexSubImage2D(size, buffer_addr1); 10. texture1->Draw(); 11. texture2->FinishObject(); 12. read_unlock(buffer_addr2); 13. Swap(); Frame4 14. texture2->TexSubImage2D(size, buffer_addr2); 15. texture2->Draw(); 16. texture1->FinishObject(); 17. read_unlock(buffer_addr1); 18. Swap();

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 19

•

9 years ago

Nical, CoreAnimation doesn't try to unlock the buffer until the end/beginning of the next frame (i.e. 15ms after the Swap) what's the best way for us to do this?

Flags: needinfo?(nical.bugzilla)

Nicolas Silva [:nical]

Comment 20

•

9 years ago

IIUC the current Compositor::ReadUnlockAfterComposition helper does exactly that. It is called for all textures that are not already ReadUnlock'ed (uploads etc.) when they are not current on any compositable (as soon as they have been replaced by another texture), and as a result the compositor ReadUnlocks the textures at the end of the frame where they have been replaced. We should try that with IOSurfaces and see. It sounds like what you want and it would work the same way other shared gpu textures, which is nice.

Flags: needinfo?(nical.bugzilla)

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 21

•

9 years ago

(In reply to Nicolas Silva [:nical] from comment #20) > IIUC the current Compositor::ReadUnlockAfterComposition helper does exactly > that. It is called for all textures that are not already ReadUnlock'ed > (uploads etc.) when they are not current on any compositable (as soon as > they have been replaced by another texture), and as a result the compositor > ReadUnlocks the textures at the end of the frame where they have been > replaced. > > We should try that with IOSurfaces and see. It sounds like what you want and > it would work the same way other shared gpu textures, which is nice. Is there a way for the TextureSource to know before this is happening? We need a place to call FinishObject on the texture. From my current reading it looks like the unlocking happens without the involvement of the TextureSource.

Flags: needinfo?(nical.bugzilla)

Matt Woodrow (:mattwoodrow)

Comment 22

•

9 years ago

Won't we need to have triple buffering for this to work correctly? If we have content drawing and compositing running simultaneously, then we have one buffer being drawn to, one buffer getting TexSubImage2D called on it, and a 3rd waiting in the pipeline for us to read_unlock it.

Nicolas Silva [:nical]

Comment 23

•

9 years ago

(In reply to Jeff Muizelaar [:jrmuizel] from comment #21) > Is there a way for the TextureSource to know before this is happening? We > need a place to call FinishObject on the texture. From my current reading it > looks like the unlocking happens without the involvement of the > TextureSource. You can just add methods to TextureSource and have the TextureHost call them. (In reply to Matt Woodrow (:mattwoodrow) from comment #22) > Won't we need to have triple buffering for this to work correctly? Sure, but that's outside the scope of TextureClient/Host/Source. The buffering logic is in whichever CompositableClient uses the texture, and we already have made most of them play nice with gralloc which is somewhat similar, although now that I think about it we have fences with gralloc so we can afford to block production instead of adding more buffers I guess.

Flags: needinfo?(nical.bugzilla)

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 24

•

9 years ago

(In reply to Matt Woodrow (:mattwoodrow) from comment #22) > Won't we need to have triple buffering for this to work correctly? > > If we have content drawing and compositing running simultaneously, then we > have one buffer being drawn to, one buffer getting TexSubImage2D called on > it, and a 3rd waiting in the pipeline for us to read_unlock it. CoreAnimation does not triple buffer which suggests we shouldn't have to either. The time between a buffer getting TexSubImage2Ded and previous buffer being unlocked should be short enough that we can block production for this period of time.

Matt Woodrow (:mattwoodrow)

Comment 25

•

9 years ago

(In reply to Jeff Muizelaar [:jrmuizel] from comment #24) > > CoreAnimation does not triple buffer which suggests we shouldn't have to > either. The time between a buffer getting TexSubImage2Ded and previous > buffer being unlocked should be short enough that we can block production > for this period of time. Ok, that seems reasonable. We don't currently have infrastructure (within the tiling code at least) to wait for a lock to become available, the code just checks and creates a new tile if the existing one is locked.

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 26

•

9 years ago

In talking about this with mstange we thought it might be able to use the same mach connection that we use for shared memory allocations to do this blocking.

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 27

•

8 years ago

Attached patch WIP towards better reuse/finishing (obsolete) — Details — Splinter Review

Jeff Muizelaar [:jrmuizel]

Reporter

Updated

•

8 years ago

Summary: Add a ClientStorage TextureClient/TextureHost pair → Add a ClientStorageTextureSource

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 28

•

8 years ago

Attached patch WIP (obsolete) — Details — Splinter Review

This version actually seems to work. As long as we're not creating or destroying textures, we don't seem to spend a noticeable amount of time doing upload. It currently has a hack where we spin on the readlock to wait on the compositor. Currently, we spend a substantial amount of time spinning but the the cpu time in the compositor stays low.

Attachment #8798180 - Attachment is obsolete: true

Nathan Froyd [:froydnj]

Comment 29

•

8 years ago

I had a friend ping me about janky scrolling performance in Firefox, and when talking with kats yesterday, he pointed me to this bug. It would be great to get this fixed.

(no longer active)

Updated

•

8 years ago

Blocks: 1342209

Markus Stange [:mstange]

Comment 30

•

8 years ago

Attached image CoreAnimation timeline — Details

Jeff and I refreshed our minds today again about how CoreAnimation does this. Here's a picture and a piece of text. CoreAnimation has two buffers per "tile". When content wants to paint to a buffer, it sends a blocking message to the mach ipc thread on the Compositor. Once that message is replied to, it starts to paint. Every composite does these things, in this order: - Iterate over all textures used in the previous composite and call glFinishObjectAPPLE on them. - Signal a condition variable to record that these textures are now available, and possibly to unblock the libdispatch thread that might be waiting on it. - Composite the current frame, uploading the new textures via glTexSubimage() in the process. - Swap buffers. The mach IPC thread has a handler for the message that content sends when it wants to paint. This handler: - Checks the condition variable for that set of textures - If it has already been signaled, which means that those textures have already been glFinishObjectAPPLE'd, then it directly replies to the message from content, and content can paint. - If the condition variable has not been signaled, then it kicks off a task using GCD. This task will run on a libdispatch thread. This task does one thing: it blocks on the condition variable. - Once the condition variable has been signaled, and the waiting task is unblocked, that task sends the reply to the blocking message from content. This will unblock content so that it can paint. This achieves the following: - Don't write to textures that the compositor is using. - Don't allocate a third buffer if both buffers are in use; instead, make content wait. - Don't block the mach IPC thread if content needs to wait - do the waiting on a different thread. The current plan is to implement something extremely similar to this, because it's a known-to-work state. There might be things we can simplify later on (e.g. don't use mach ipc and take advantage of our own IPDL / IPC mechanisms), but we can do that once we have something that works. The patch that is currently attached in this bug already has a lot of this, but it doesn't implement the part that would make content wait until it can paint. The same is true for https://github.com/jrmuizel/client-storage/ - it's missing the waiting. Jeff would like to finish this bug, once he has time, but if somebody else is available to work on it, please do, and don't wait for Jeff.

Jean Gong :jgong

Updated

•

8 years ago

Whiteboard: [gfx-noted] → [gf:p1]

(no longer active)

Updated

•

8 years ago

Blocks: 1343558

Naveed Ihsanullah [:naveed]

Comment 31

•

8 years ago

Is this still relevant with Web Render? If so I will make it a qf:p1

Flags: needinfo?(ehsan)

Naveed Ihsanullah [:naveed]

Updated

•

8 years ago

Whiteboard: [gf:p1] → [gf:p1][qf]

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 32

•

8 years ago

It will probably not be relevant with webrender.

(no longer active)

Comment 33

•

8 years ago

Even if it's not relevant with Web Render, do we want to keep janking until Web Render ships?

Flags: needinfo?(ehsan) → needinfo?(jmuizelaar)

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 34

•

8 years ago

No. I'd rather stop janking before WebRender ships.

Flags: needinfo?(jmuizelaar)

(no longer active)

Comment 35

•

8 years ago

Bas, can you please help find an owner for this? Thanks!

Flags: needinfo?(bas)

Whiteboard: [gf:p1][qf] → [gfx-noted][qf:p1]

Bas Schouten (:bas.schouten)

Comment 36

•

8 years ago

Milan, neither me, nical or jamie have OS X machines, do we have someone that could deal with this?

Flags: needinfo?(bas) → needinfo?(milan)

(no longer active)

Comment 37

•

8 years ago

Attached video Video demontrating the jank — Details

Here is a video demonstrating how bad this bug can be in practice. Look at the tab throbber to get a sense of the UI jank caused by this. I was trying to scroll constantly while capturing the video. The places in the video when Firefox wasn't scrolling was where we were janking.

(no longer active)

Updated

•

8 years ago

Blocks: 1347559

Milan Sreckovic [:milan] (needinfo for best results)

Comment 38

•

8 years ago

(In reply to Naveed Ihsanullah [:naveed] from comment #31) > Is this still relevant with Web Render? If so I will make it a qf:p1 Even if it's Mac only?

Flags: needinfo?(milan)

OS: Unspecified → Mac OS X

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

8 years ago

Flags: needinfo?(nihsanullah)

Naveed Ihsanullah [:naveed]

Comment 39

•

8 years ago

Yes this should be [qf:p1]. This is bad enough as a user experience that it isn't really shippable in the current form. The compositor jank janks the entire window and the impact is across many sites.

Flags: needinfo?(nihsanullah)

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 40

•

8 years ago

Attached patch Elaborated sketch — Details — Splinter Review

Here's a somewhat elaborated sketch that I worked on over the weekend. One thing missing is that the frame/transaction id probably needs to live in shared memory so that the client side can check for completion without needing to do any ipc.

Assignee: nobody → jmuizelaar

Attachment #8799032 - Attachment is obsolete: true

Milan Sreckovic [:milan] (needinfo for best results)

Comment 41

•

8 years ago

Re-tagging for QF triage as this is OS X only.

Whiteboard: [gfx-noted][qf:p1] → [gfx-noted][qf]

Peter Chang[:pchang]

Updated

•

8 years ago

Blocks: 1355419

Jeff Muizelaar [:jrmuizel]

Reporter

Updated

•

8 years ago

No longer blocks: 1355419

Naveed Ihsanullah [:naveed]

Comment 42

•

8 years ago

Even though this is OSX only the jank is severe enough that users may leave over it. That hits our P1 criteria.

Whiteboard: [gfx-noted][qf] → [gfx-noted][qf:p1]

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

8 years ago

Assignee: jmuizelaar → nobody

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

8 years ago

Priority: -- → P3

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

8 years ago

Assignee: nobody → milan

Peter Chang[:pchang]

Updated

•

8 years ago

Blocks: 1353137

Andrew Overholt [:overholt]

Comment 43

•

8 years ago

Given this is OS X-only, it seems like it's time to not have this as qf:p1.

Whiteboard: [gfx-noted][qf:p1] → [gfx-noted][qf]

Naveed Ihsanullah [:naveed]

Comment 44

•

8 years ago

According to Milan this will require resources that are already stretched with the current qf:p1 load for FF57. Moving to qf:p2 and will re-evaluate for FF58.

Whiteboard: [gfx-noted][qf] → [gfx-noted][qf:p2]

Blake Wu [:bwu][:blakewu]

Updated

•

7 years ago

Blocks: 1404047

Firefox Bug Husbandry Bot

Updated

•

7 years ago

Keywords: perf

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

7 years ago

Blocks: 1383292

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

7 years ago

Priority: P3 → P2

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

7 years ago

Comment 45

•

7 years ago

I will do the remaining work.

Assignee: milan → hshih

Status: NEW → ASSIGNED

Naveed Ihsanullah [:naveed]

Updated

•

7 years ago

Whiteboard: [gfx-noted][qf:p2] → [gfx-noted][qf:p3]

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

7 years ago

Comment 46

•

7 years ago

Jerry, I don't know if the patches in bug 1005441 are relevant, take a look.

Flags: needinfo?(hshih)

Jerry Shih[:jerry] (UTC+8) (inactive)

Comment 47

•

7 years ago

That patch was written before the tiling landed. I think we could do the whole things in CreateWrappingDataSourceSurface()(just like the wip attachment 8768908 [details] [diff] [review]). I'm getting a weird screen result with my local patch. The tiled buffers are updated by content side unexpectedly. Maybe the problem comes from the tile-buffer recycling.

Flags: needinfo?(hshih)

Jerry Shih[:jerry] (UTC+8) (inactive)

Comment 48

•

7 years ago

Attached patch P1: remove the unsed member in GLTextureSource. — Details — Splinter Review

The "mExternallyOwned" is used for gralloc buffer. We don't use the gralloc buffer now. MozReview-Commit-ID: 7Gurpa3kdp0

Jerry Shih[:jerry] (UTC+8) (inactive)

Comment 49

•

7 years ago

Attached patch P2: add a new texture source type "DirectMapTextureSource". — Details — Splinter Review

The DirectMapTextureSource could let the compositor to read the buffer directly. That could get rid of some memory copy operations during texture uploading. MozReview-Commit-ID: CHhoR96P7VG

Jerry Shih[:jerry] (UTC+8) (inactive)

Comment 50

•

7 years ago

Attached patch P3: pass the texture direct mapping capability from CompositorOGL to KnowsCompositor. — Details — Splinter Review

The client side can't get the GL context in CompositorOGL. So, it can't know the texture direct mapping capability directly. This patch adds the texture direct mapping info in TextureFactoryIdentifier. Then, the client side could get the info form the TextureFactoryIdentifier. MozReview-Commit-ID: KEazDVg0p9Y

Jerry Shih[:jerry] (UTC+8) (inactive)

Comment 51

•

7 years ago

Attached patch P4: pass the texture direct mapping info to all texture creating functions. — Details — Splinter Review

MozReview-Commit-ID: He7UNvBCa8z

Jerry Shih[:jerry] (UTC+8) (inactive)

Comment 52

•

7 years ago

Attached patch P5: implement CreateDataTextureSourceAroundYCbCr() and CreateDataTextureSourceAround() for CompositorOGL. — Details — Splinter Review

This patch will use DirectMapTextureSource to wrap the DataSourceSurface data for gpu access. That could improve the texture uploading performance. MozReview-Commit-ID: CGPFcCsR1RY

Jerry Shih[:jerry] (UTC+8) (inactive)

Comment 53

•

7 years ago

Attached patch P6: handle the texture uploading and readLock related things for direct mapping texture source. — Details — Splinter Review

MozReview-Commit-ID: BC065h1Ac6k

Jerry Shih[:jerry] (UTC+8) (inactive)

Comment 54

•

7 years ago

Attached patch P7: don't recycle the texture source if it's direct mapping. — Details — Splinter Review

There is no benefit to use recycle the old one. That will have an additional memory copy from the new texture host to the old texture source. We should just wrap the new texture host to a direct mapping texture source to get better texture uploading performance. MozReview-Commit-ID: 9ONQAhvKmuh

Jerry Shih[:jerry] (UTC+8) (inactive)

Comment 55

•

7 years ago

Wait for the reftest and talos's result.

Jerry Shih[:jerry] (UTC+8) (inactive)

Comment 56

•

7 years ago

There are a lot of tests failed. Wait for the updated try.

Jerry Shih[:jerry] (UTC+8) (inactive)

Comment 57

•

7 years ago

The talos result has a lot of regressions. I'm checking the profile data. https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&newProject=try&newRevision=0ebbaef03c6ee06f350afa9af012087d6211a74f&framework=1&filter=osx&selectedTimeRange=172800

Jerry Shih[:jerry] (UTC+8) (inactive)

Comment 58

•

7 years ago

The performance regression comes from the TileClient buffer allocation. It's addressed in comment 24 and comment 25. In my patch, TileClient will allocate a new buffer when the frontbuffer is still read-locked. The buffer allocation costs a lot. But I can see the client_storage improves a lot for texImage2D() calls in profiler. I will try to add a waiting code as comment 24 and 25 or try to have triple buffers for TileClient.

James Cheng[:JamesCheng]

Updated

•

7 years ago

Blocks: 1427325

Jerry Shih[:jerry] (UTC+8) (inactive)

Comment 59

•

7 years ago

Here are the comparison between the original texture uploading path and the client-storage. Test with the full-screen video from https://www.csfd.cz/film/402192-zmensovani/videa/ a) The original path: https://perfht.ml/2mL3nwO b) client-storage https://perfht.ml/2mLWlrO This version use client-storage for texture uploading. And it also creates a glSync object after the teximage2d/texSubImage2d call. Then, we will wait the glSync object at the end of next composition. I don't have a good way to make the client side waiting for the readlock. So, I just use a simple spinlook at client side. We can see the waiting time in profiler if the compositor doesn't handle the glSync object. We could adjust the glSync waiting position to another position.

Jerry Shih[:jerry] (UTC+8) (inactive)

Comment 60

•

7 years ago

The original idea for client-storage is that: 1) before a tiled layer painting, send a sync message to parent 2) parent side starts to wait glsync object. Then, go back to child side. 3) start to paint. This model will have problem if there are a lot of painting layers. If we try to reduce the message number, we should have a preprocessing step to find all tiled painting layer. And send a message to wait all sync object for these layers. And If the parent side is busy, it can't handle the message soon. Then, the child side could spend more time for waiting. That's why I don't create a message to trigger the glsync object waiting.

Eric Rahm [:erahm]

Updated

•

7 years ago

Blocks: 1424325

Alex Thayer [:alexical] (she/her)

Assignee

Updated

•

7 years ago

Assignee: bignose1007+bugzilla → dothayer

Whiteboard: [gfx-noted][qf:p3] → [gfx-noted][qf:p3][fxperf:p1]

Mike Conley (:mconley) (:⚙️)

Updated

•

7 years ago

Blocks: 1457413

Alex Thayer [:alexical] (she/her)

Assignee

Comment 61

•

7 years ago

Quick update on where this is right now, to any interested parties: I've set up the IPC on the mach connection we use for shared memory (per comment 26). I went through a bunch of iterations on this, because I was trying to sort out a way to simply sync on a transaction ID. Transaction IDs didn't work out - as far as I can tell this is because we can reuse tiles across many transactions, meaning we might free up some tiles with a transaction but leave the rest alive. I ended up resorting to tracking a set of texture serial IDs per process, and sending an array of required texture IDs up to the parent process when are going to need them. This waits on a condition variable until items are removed from the set, at which point it checks again. In any case, this does seem to perform rather well, and makes scrolling on sites like gsmarena.com (from Bug 1457413) very smooth. However, it does regress some of the motionmark tests in talos. We seem to be trading memcpy time in the compositor for allocation time in the content process due to the double buffering (?). Additionally, and more importantly, I'm running into intermittent crashes on talos from timeouts waiting on texture serials to free up. My assumption is that if we're waiting on a texture serial to get the back buffer, then we've already sent up the front buffer, meaning the back buffer should stop being used within a matter of time, but that doesn't seem to be the case. Are there cases where both the front buffer and the back buffer should be locked and won't unlock until we allocate a new back buffer and send it up? It could be a bug in my code, but I just wanted to make sure that assumption was correct. In the mean time I'm just sending up talos runs with printf's scattered everywhere trying to understand better where this is coming from. Feel free to call out anything in the above that seems off.

Alex Thayer [:alexical] (she/her)

Assignee

Comment 62

•

7 years ago

Resolved the crashes, but still seeing the motionmark regressions. Not sure what to do about them. Going to sleep on it.

Matt Woodrow (:mattwoodrow)

Comment 63

•

7 years ago

(In reply to Doug Thayer [:dthayer] (PTO on May 17) from comment #62) > Resolved the crashes, but still seeing the motionmark regressions. Not sure > what to do about them. Going to sleep on it. We have existing bugs open where we're really slow due to allocating (and memset(0)ing) tiles. The solution to both problems might be to investigate the tile pool size/allocation strategy and see if we can do better.

Alex Thayer [:alexical] (she/her)

Assignee

Comment 64

•

7 years ago

(In reply to Matt Woodrow (:mattwoodrow) from comment #63) > We have existing bugs open where we're really slow due to allocating (and > memset(0)ing) tiles. > > The solution to both problems might be to investigate the tile pool > size/allocation strategy and see if we can do better. Yeah, it seemed like a problem that I was just exacerbating with my changes. From my profiling, it seems the tile pool isn't in the hot path here, since we end up using ClientSingleTiledLayerBuffer for most of the heavy lifting in these tests. I'm not sure what it would take to keep a pool for the ClientSingleTiledLayerBuffer, since its tiles are all different sizes. Maybe we could benefit from using a set of predefined tile sizes and finding the smallest predefined size that fits our requested size, assuming our requested size isn't something like 5x1000. In either case I don't know what I can do for this within the scope of this bug, so I'm not sure how to go forward here.

Mike Conley (:mconley) (:⚙️)

Updated

•

7 years ago

Blocks: 1460264

Mike Conley (:mconley) (:⚙️)

Updated

•

7 years ago

Whiteboard: [gfx-noted][qf:p3][fxperf:p1] → [gfx-noted][qf:p3][fxperf:p1][qf:f64]

Comment hidden (mozreview-request)

Alex Thayer [:alexical] (she/her)

Assignee

Comment 73

•

7 years ago

The patches still need some cleanup, but I'm throwing them up here in the mean time in case anyone wants to pull them down and take a look.

Alex Thayer [:alexical] (she/her)

Assignee

Comment 74

•

7 years ago

Animated gifs on the new reddit (not on the old reddit) are busted with this. If I switch the textures from GL_TEXTURE_RECTANGLE to GL_TEXTURE_2D it avoids the problem, but that doesn't allow us to avoid the driver copy that we avoid with glTextureRangeAPPLE. Trying to figure out why we're running into this, and what's special about the new reddit's animated gifs.

Comment hidden (mozreview-request)

Alex Thayer [:alexical] (she/her)

Assignee

Comment 78

•

7 years ago

Hey Jeff - do you have any thoughts on either of these two things? - What can we do to mitigate the cost of the extra allocations we're now doing? I'm specifically interested in the talos regressions of the motionmark tests, which are the only regression I'm seeing on try. It seems like we don't need to be allocating so much - we don't use a texture pool because we hit the ClientSingleTiledLayerBuffer, since the tests tend to just be large numbers of tiny layers. Could we explore using a pool or set of pools for ClientSingleTiledLayerBuffer? I don't know enough context around this and what exploration has already been done here. - Can you think of any reasons why using GL_TEXTURE_RECTANGLE_ARB in the DirectMapTextureSource (this looks roughly like the sketch you posted a while ago) would interfere with animated gifs sometimes? It seems to me to be interfering with MacIOSurfaceTextureHost planes, but I'm not sure. Those also use GL_TEXTURE_RECTANGLE_ARB, but I don't see why that would result in any conflict. See comment 74.

Flags: needinfo?(jmuizelaar)

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 79

•

7 years ago

So the animated gif thing is actually related to videos which perhaps makes some sense. We should try to figure out more about what scenario's cause it. I'm don't have a concrete suggestion for the motionmark problem yet. I'm going to think about it more.

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 80

•

7 years ago

Can you post before and after profiles of this patch on the motionmark bouncing clipped rectangle test?

Jeff Muizelaar [:jrmuizel]

Reporter

Updated

•

7 years ago

Flags: needinfo?(dothayer)

Comment hidden (mozreview-request)

Alex Thayer [:alexical] (she/her)

Assignee

Comment 82

•

7 years ago

My assertion that it was due to allocations appears to be off? Maybe I accidentally fixed a bug since I last profiled that test :/ Anyway, here they are: before: https://perfht.ml/2IMPdZu after: https://perfht.ml/2xdet5G Nothing jumps out at me at the outset - the profiles look very similar to me. But maybe you can pick out something I can't. Do you happen to know how the scores are calculated for this test?

Flags: needinfo?(dothayer)

Comment hidden (obsolete)

Alex Thayer [:alexical] (she/her)

Assignee

Comment 84

•

7 years ago

wait - wrong profiles (ignore the above comment) before: https://perfht.ml/2IJqp07 after: https://perfht.ml/2xbarKT

Mark

Comment 85

•

7 years ago

I had a play with the latest Try https://treeherder.mozilla.org/#/jobs?repo=try&author=dothayer@mozilla.com&selectedJob=180086432 and it looks great on gsmarena.com. It also seems to fix some of my high GPU power consumption complaints on other pages. But I sometimes see large black (unrendered?) areas. I just got it now, I switched to the (already loaded) gsmarena.com tab and it appeared completely black with just the toolbar. I cmd-R 'd to reload and it refused to render, it just stayed black. Sometimes it shows a small broken thumbnail of the whole page at the center of the blackness. Scrolling or mouseovering shows the status bar hints bottom left but nothing else. This profile is me scrolling and reloading the blank gsmarena.com tab https://perfht.ml/2GNUjyx This is on a MacBookPro11,1 @ 1440x900 resolution. Mostly default settings but gfx.compositor.glcontext.opaque TRUE. hope that's useful to you, keep up the good work.

Alex Thayer [:alexical] (she/her)

Assignee

Comment 86

•

7 years ago

I moved the glFinishObjectAPPLE calls to before composition rather than after, and my paint times for the motionmark tests went down dramatically, so I did another talos run: https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=7867306bde5158bee26b76f10b3a771d69d60932&newProject=try&newRevision=5ad1e5e4b865f79e3b1c147b0d71d928f7aba159&framework=1 This is interesting because we see more green on it than before, which is good. But the red has also increased. Notably damp seems to have regressed, as well as displaylist_mutate.html, and motionmark is still having some trouble, though it seems it may be less pronounced. I couldn't reproduce the regression on displaylist_mutate.html (I did set the ASAP mode prefs), but here's the profiles anyway: displaylist_mutate.html (before): https://perfht.ml/2ILghbm displaylist_mutate.html (after): https://perfht.ml/2LuE7Wx I could reproduce the motionmark regressions, and I'm seeing longer than expected composites in the after profile, but nothing jumps out as me as a smoking gun: CSS_bouncing_blend_circles (before) https://perfht.ml/2xdf6vT CSS_bouncing_blend_circles (after) https://perfht.ml/2xkx53s I don't quite know the best way of getting detailed easy to reproduce profiles for the damp tests. Thoughts / recommendations? Just talos-test with --geckoProfile? That didn't yield anything useful for the motionmark tests, but maybe I'm getting something wrong? In all cases I would have expected moving glFinishObjectAPPLE before composition to make glFinishObjectAPPLE show up in the profiles, which it doesn't. I'm also surprised that the GPU has finished with these textures at the beginning of the composition in which they are no longer referenced. I would have expected a bit more latency than that? But maybe my mental model of all of this is off.

Alex Thayer [:alexical] (she/her)

Assignee

Comment 87

•

7 years ago

Eh - looking at the graphs I'm thinking damp and displaylist_mutate might be due to chance on the baseline. motionmark is still not looking great though. Something like a 3-5% regression? Still not sure why.

Mike Conley (:mconley) (:⚙️)

Updated

•

7 years ago

Flags: needinfo?(mconley)

Alex Thayer [:alexical] (she/her)

Assignee

Comment 88

•

7 years ago

Here's some profiles of the motionmark_animometer Design test, before and after: motionmark_animometer Design (before) https://perfht.ml/2J3c3YA motionmark_animometer Design (after) https://perfht.ml/2xs2SQ6 tldr; the before shows a ton of time spent in the compositor in memmoves copying the textures all the way down through the driver. The second profile gets rid of this, but trades it for time spent in glFinishObject and time spent in the content process waiting on the IPC response which happens once the glFinishObject completes. Markus, do you have any ideas for how to tackle this problem? Locally my throughput for the test is higher in my version than in the baseline, but we spend more time that I would like to blocking in the content process. I can't think of a way to solve this within the scope of this bug, but maybe I'm missing something. It seems like the GPU just has a lot of work we've given it, and so if we block on that work then we're going to spend a lot of time blocking.

Flags: needinfo?(mstange)

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 89

•

7 years ago

Just responding quickly, I'd recommend trying to produce a more descriptive timeline of what things are being waited on when. When I was initially figuring out how to best work with client storage I used the logviz.html that I'll attach. It's super simple and you'll probably need to modify to work best, but it basically just takes the first int of a line and interprets that as a timestamp and draws the text of the line to a canvas. Try it out with the following content in the text box to get a feeling: 4 First 90000 Second 300000 Third

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 90

•

7 years ago

Attached file logviz.html — Details

Flags: needinfo?(jmuizelaar)

Mike Conley (:mconley) (:⚙️)

Comment 91

•

7 years ago

Looking over these profiles, I don't think I'm going to be much help, I'm afraid. :( I will say, however, that these patches seem to address bug 1453080 for me, which I'm very excited about.

Flags: needinfo?(mconley)

Alex Thayer [:alexical] (she/her)

Assignee

Comment 92

•

7 years ago

Attached file logviz.html — Details

Created a visualization of a log of textures locking/unlocking/etc. per Jeff's recommendation. This log is captured from a particular motionmark animometer test[1]. To explain what's going on with all the colors and layout: the two columns on the right are the logs of the parent and content process. They have colored bars to the left indicating when certain events begin and end, roughly mirroring perf.html's view. Grey is a RefreshDriverTick, red is a LayerTransaction, green is a Rasterize, blue is SyncTextures (waiting on texture IDs to be notified as unlocked by the parent process), and yellow is a glFinishObject call. Regarding the text on the left, some of the log entries have numbers after them - these numbers represent the affected texture IDs. The view on the right is a set of per-texture timelines. Each column corresponds to a particular TextureHost/Client (with the ID of that texture at the top). The grey bar is the time that it's locked out by the parent - beginning and ending in TiledLayerBufferComposite::UseTiles. The blue is us waiting in SyncTextures in the content process for those textures, and the yellow is the glFinishObject call for that texture. Everything looked more or less as I expected it to in the 1x timescale view. However, if you look at the 100x view you'll see large periods of time with certain textures being locked, in a way that it actually creates a clear visual pattern. I suspect that this is just due to these textures finding themselves stuck behind the textures we're glFinishObject'ing, and the ordering just makes that pattern look the way it does. It's still interesting to think about. If anyone has any thoughts about what this shows or what else I should include in the log, let me know. [1]: https://browserbench.org/MotionMark/developer.html?test-interval=15&display=minimal&tiles=big&controller=fixed&frame-rate=30&kalman-process-error=1&kalman-measurement-error=4&time-measurement=performance&suite-name=Animometer&test-name=Design&complexity=17

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 93

•

7 years ago

The log looks really nice. Can you include uses of the textures (both glBindTexture and glTexSubImage) and calls to glSwapBuffers in the log?

Flags: needinfo?(dothayer)

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Updated

•

7 years ago

Blocks: 1461666

Alex Thayer [:alexical] (she/her)

Assignee

Comment 94

•

7 years ago

Attached file logviz.html — Details

Added info for glSwapBuffers / glBindTexture / glTex(Sub)Image2D. The orange lines all the way across are glSwapBuffers calls, the blue per-texture lines are glBindTexture calls, and the green lines are glTexSubImage2D calls.

Flags: needinfo?(dothayer)

Mike Conley (:mconley) (:⚙️)

Updated

•

7 years ago

Blocks: 1459875

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 95

•

7 years ago

My guess is that the finishing might be getting blocked by texture upload because we're interleaving finishObject and texSubImage2d. Can we make sure we do all of the finishing before starting any new uploads?

Flags: needinfo?(dothayer)

Alex Thayer [:alexical] (she/her)

Assignee

Comment 96

•

7 years ago

Attached file logviz.html — Details

It's a similar story with the glFinishObject calls all preceding the glTexSubImage2D calls. Attaching a log if you want to take a look for yourself - in short, the bars all look the same, but the glTexSubImage2D lines are just moved to the bottom of the cycle.

Flags: needinfo?(dothayer)

Alex Thayer [:alexical] (she/her)

Assignee

Comment 97

•

7 years ago

At the risk of sounding ignorant, it feels like the big problem here is that we're blitting so many overlapping buffers on the GPU? I don't know too much about the background behind this, but it doesn't look like that's what Chrome or Safari are doing - it looks like they're squashing the layers in this test before compositing?

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 98

•

7 years ago

(In reply to Doug Thayer [:dthayer] (PTO on June 4) from comment #97) > At the risk of sounding ignorant, it feels like the big problem here is that > we're blitting so many overlapping buffers on the GPU? I don't know too much > about the background behind this, but it doesn't look like that's what > Chrome or Safari are doing - it looks like they're squashing the layers in > this test before compositing? It is possible that we're just running into the limits of the GPU. It would be nice if we could confirm this somehow though. Yes, I Safari at least is not using separate layers for all the items.

Alex Thayer [:alexical] (she/her)

Assignee

Comment 99

•

7 years ago

(In reply to Jeff Muizelaar [:jrmuizel] on parental leave until at least June 25 from comment #98) > It is possible that we're just running into the limits of the GPU. It would > be nice if we could confirm this somehow though. I'm trying to sort out a way of confirming it. The best I can come up with is that if we remove the glFinishObject calls, the framerate score remains the same. I'm attaching two more visualizations of logs with and without glFinishObject to demonstrate the similarities. The time spent in glSwapBuffers seems to be what's really gating our score here (let me know if that seems right). The key problem with the approach though is that when the GPU is our bottleneck, we're not free with this approach to, say, run scripts, since we're blocking on this in the main thread of the content process, no?

Alex Thayer [:alexical] (she/her)

Assignee

Comment 100

•

7 years ago

Attached file withoutFinishObject.html — Details

Alex Thayer [:alexical] (she/her)

Assignee

Comment 101

•

7 years ago

Attached file withFinishObject.html — Details

Jean Gong :jgong

Updated

•

7 years ago

Whiteboard: [gfx-noted][qf:p3][fxperf:p1][qf:f64] → [gfx-noted][fxperf:p1][qf:p3:f64]

Alex Thayer [:alexical] (she/her)

Assignee

Comment 102

•

7 years ago

If I force it all into one layer by always returning false for ActiveLayerTracker::IsStyleAnimated, our performance on the test goes up dramatically. Matt, do you have an idea of how hard it would be / what difficulties we might run into trying to implement a heuristic that would avoid the layer explosion for cases with many overlapping active layers?

Flags: needinfo?(matt.woodrow)

Matt Woodrow (:mattwoodrow)

Comment 103

•

7 years ago

(In reply to Doug Thayer [:dthayer] (PTO on June 4) from comment #102) > If I force it all into one layer by always returning false for > ActiveLayerTracker::IsStyleAnimated, our performance on the test goes up > dramatically. Matt, do you have an idea of how hard it would be / what > difficulties we might run into trying to implement a heuristic that would > avoid the layer explosion for cases with many overlapping active layers? So it sounds like this testcase has a bunch of stacked layers that it manipulates the transform on to get the 3d effect. It also modifies the colour of each layer, so we have to repaint everything, and layerization isn't helping much. The transform changes come from JS, so we don't get to have async animations either. It sounds like IsStyleAnimated is returning true in the hope that we can just push the layers around without repainting, but the colour change cancels that out. I think we should be able to detect that, either by counting style changes to other properties, or checking if the frame is invalidated? Jamie is the go-to guy for layerization optimizations at the moment, so ni? him too.

Flags: needinfo?(matt.woodrow) → needinfo?(jnicol)

Comment hidden (off-topic)

Alex Thayer [:alexical] (she/her)

Assignee

Updated

•

7 years ago

Depends on: 1467619

Alex Thayer [:alexical] (she/her)

Assignee

Comment 105

•

7 years ago

The remaining talos regressions after Bug 1467619 and Bug 1468401 seem to all be things like rasterflood_gradient which run in ASAP mode, causing us to check if textures are locked a tiny bit too early (around .15 to .3 ms locally). This incurs a small hit to the overall test score (I'm seeing about 3%). I'm not sure what to do about this. We're not actually waiting on the GPU - we're just waiting on the compositor to wake up for that frame and check the textures to unlock them. Short of scheduling the compositor to check this early, I don't know what options are open here. Still investigating, but any thoughts/opinions are welcome.

Alex Thayer [:alexical] (she/her)

Assignee

Comment 106

•

7 years ago

Hmm, I think I just need to send an async message via our SharedMemory backchannel up to the compositor to recheck its textures before flushing async paints.

Karl Dubost💡 :karlcow

Comment 107

•

7 years ago

We were wondering if this issue was related. https://webcompat.com/issues/9930 It's about this site https://code.visualstudio.com/docs/?dv=osx And the reduced test case to reproduce is http://la-grange.net/2018/05/30/test/svg-animation.html

Mark

Comment 108

•

7 years ago

I filed bug 1469594 based on the reduced test case in comment 107. But I think my new bug is not related to the question in comment 107.

Mike Conley (:mconley) (:⚙️)

Updated

•

7 years ago

Blocks: 1467305

Comment hidden (mozreview-request)

Alex Thayer [:alexical] (she/her)

Assignee

Updated

•

7 years ago

Attachment #8979708 - Attachment is obsolete: true

Alex Thayer [:alexical] (she/her)

Assignee

Updated

•

7 years ago

Attachment #8980356 - Attachment is obsolete: true

Alex Thayer [:alexical] (she/her)

Assignee

Updated

•

7 years ago

Attachment #8980357 - Attachment is obsolete: true

Alex Thayer [:alexical] (she/her)

Assignee

Comment 118

•

7 years ago

I wanted to make sure I get these up before the weekend. Markus, feel free to kick any/all of these to someone else if it's a better fit. In any case, there's still two talos regressions here: tresize (~2%) - Due to the CGLLockContext call in the resize handler. It seems we lock things up, which doesn't go over well when we're trying to glFinishObject our textures and wait on that in the compositor tsvgx (~8%) - This seems to be limited to ASAP mode, as far as I can tell. The compositor is stuck in SwapBuffers, and the content process is waiting on the compositor from SyncTextures. I can't really come up with a good way around this - I'm open to suggestions, but given the real world benefits of this change I think it's worth it to eat the regression Anyway, the plan was to enable this on Nightly with the pref, and see how it works out with peoples' real workflows.

Matt Woodrow (:mattwoodrow)

Updated

•

7 years ago

Blocks: 1340126

Mike Conley (:mconley) (:⚙️)

Comment 119

•

7 years ago

Hey mattwoodrow - would you feel comfortable reviewing any of this stuff, or should we wait until mstange is back?

Flags: needinfo?(matt.woodrow)

Matt Woodrow (:mattwoodrow)

Comment 120

•

7 years ago

mozreview-review

Comment on attachment 8979705 [details] Bug 1265824 - Remove the unsed member in GLTextureSource https://reviewboard.mozilla.org/r/245854/#review259476

Attachment #8979705 - Flags: review+

Matt Woodrow (:mattwoodrow)

Comment 121

•

7 years ago

mozreview-review

Comment on attachment 8979706 [details] Bug 1265824 - Add a new texture source type "DirectMapTextureSource" https://reviewboard.mozilla.org/r/245856/#review259504 ::: gfx/layers/opengl/TextureHostOGL.h:316 (Diff revision 2) > +private: > + bool UpdateInternal(gfx::DataSourceSurface* aSurface, > + nsIntRegion* aDestRegion, > + gfx::IntPoint* aSrcOffset, > + bool aInit); > + GLsync mSync; What is this for? I can't see it used in any of the patches. Can we get rid of it?

Matt Woodrow (:mattwoodrow)

Comment 122

•

7 years ago

mozreview-review

Comment on attachment 8979707 [details] Bug 1265824 - Pass the texture direct mapping info to all texture creating functions https://reviewboard.mozilla.org/r/245858/#review259508 Can we add this to the TextureAllocationFlags instead of passing an extra bool to all the ::Create functions? All the TextureFactoryIdentifier changes look great.

Matt Woodrow (:mattwoodrow)

Comment 123

•

7 years ago

mozreview-review

Comment on attachment 8979709 [details] Bug 1265824 - implement CreateDataTextureSourceAroundYCbCr() and CreateDataTextureSourceAround() for CompositorOGL https://reviewboard.mozilla.org/r/245862/#review259510

Attachment #8979709 - Flags: review+

Matt Woodrow (:mattwoodrow)

Comment 124

•

7 years ago

mozreview-review

Comment on attachment 8979710 [details] Bug 1265824 - handle the texture uploading and readLock related things for direct mapping texture source https://reviewboard.mozilla.org/r/245864/#review259512 ::: gfx/layers/composite/TextureHost.cpp:1013 (Diff revision 2) > if (!mHasIntermediateBuffer && EnsureWrappingTextureSource()) { > + if (mFirstSource && mFirstSource->IsDirectMap()) { > + // The direct mapping texture source still needs to call Update() to > + // upload the new data. Only skip the uploading if we have the same > + // serial id. > + if (mFirstSource->GetUpdateSerial() == mUpdateSerial) { It looks like MaybeUpload already checks the serial, so you shouldn't need to do so here.

Attachment #8979710 - Flags: review+

Matt Woodrow (:mattwoodrow)

Comment 125

•

7 years ago

mozreview-review

Comment on attachment 8979711 [details] Bug 1265824 - Don't recycle the texture source if it's direct mapping https://reviewboard.mozilla.org/r/245866/#review259514 I might have misunderstood how some of this work, but I thought we still had actual GPU memory associated with the texture? The client storage code just lets OpenGL manage when they want to upload into it etc. If that's the case, then recycling those should save an allocation shouldn't it?

Matt Woodrow (:mattwoodrow)

Comment 126

•

7 years ago

mozreview-review

Comment on attachment 8979712 [details] Bug 1265824 - Wait on texture handles with IPC https://reviewboard.mozilla.org/r/245868/#review259516 I haven't gone through this one properly yet, will try do so tomorrow. It's probably worth putting StaticMonitor into a new file (and patch, with the OffTheBooksCondVar changes) so that other people can use it. It's going to need a review from an xpcom peer regardless, so separate patch is probably easiest. I also see that this switches us from TEXTURE_RECTANGLE_ARB to TEXTURE_2D. Why the change here? I thought generally all apple extensions required TEXTURE_RECTANGLE.

Matt Woodrow (:mattwoodrow)

Comment 127

•

7 years ago

mozreview-review

Comment on attachment 8987174 [details] Bug 1265824 - Guard client storage stuff behind a pref https://reviewboard.mozilla.org/r/252418/#review259518

Attachment #8987174 - Flags: review+

Matt Woodrow (:mattwoodrow)

Comment 128

•

7 years ago

mozreview-review

Comment on attachment 8987175 [details] Bug 1265824 - Don't use client storage for overly large textures https://reviewboard.mozilla.org/r/252420/#review259520

Matt Woodrow (:mattwoodrow)

Comment 129

•

7 years ago

(In reply to Mike Conley (:mconley) (:⚙️) (Catching up on needinfos / reviews) from comment #119) > Hey mattwoodrow - would you feel comfortable reviewing any of this stuff, or > should we wait until mstange is back? I'll do what I can! Still would really like mstange to have a look over them, even if it's just a rubber-stamp.

Matt Woodrow (:mattwoodrow)

Updated

•

7 years ago

Flags: needinfo?(matt.woodrow)

Alex Thayer [:alexical] (she/her)

Assignee

Comment 130

•

7 years ago

(In reply to Matt Woodrow (:mattwoodrow) from comment #126) > Comment on attachment 8979712 [details] > Bug 1265824 - Wait on texture handles with IPC > > https://reviewboard.mozilla.org/r/245868/#review259516 > > I haven't gone through this one properly yet, will try do so tomorrow. > > It's probably worth putting StaticMonitor into a new file (and patch, with > the OffTheBooksCondVar changes) so that other people can use it. It's going > to need a review from an xpcom peer regardless, so separate patch is > probably easiest. > > I also see that this switches us from TEXTURE_RECTANGLE_ARB to TEXTURE_2D. > Why the change here? I thought generally all apple extensions required > TEXTURE_RECTANGLE. I couldn't measure or perceive any benefits from using TEXTURE_RECTANGLE_ARB - simply using TEXTURE_2D was enough to get rid of all of the memmoves in my profiles. Switching from TEXTURE_2D to TEXTURE_RECTANGLE_ARB however did cause visual regressions with some videos on reddit that I couldn't get to the bottom of. I would be more interested in discovering the cause of those regressions and potentially fixing them if I knew there were some performance gain, but as far as I can tell there isn't. This was surprising to me since the documentation indicates I should see a win, but I don't. The other surprise was that I couldn't perceive any improvement from the glTextureRangeAPPLE call, though I did see gains from using GL_STORAGE_CACHED_APPLE, which as far as I could tell is supposed to be used in conjunction with glTextureRangeAPPLE. AFAIK glTextureRangeAPPLE is supposed to be used to map *multiple* textures within contiguous buffers, which is a bit tricky to manage, though I did play around with it. I think it might be interesting playing around with this, but for all the dependent bugs the texture upload cost was reduced to near nothing, so I'm not sure it's worth the effort.

Comment hidden (mozreview-request)

Alex Thayer [:alexical] (she/her)

Assignee

Updated

•

7 years ago

Attachment #8979711 - Attachment is obsolete: true

Attachment #8979711 - Flags: review?(mstange)

Alex Thayer [:alexical] (she/her)

Assignee

Updated

•

7 years ago

Attachment #8987175 - Attachment is obsolete: true

Attachment #8987175 - Flags: review?(mstange)

Nathan Froyd [:froydnj]

Comment 139

•

7 years ago

mozreview-review

Comment on attachment 8988938 [details] Bug 1265824 - Add StaticMonitor https://reviewboard.mozilla.org/r/254060/#review261006 ::: xpcom/base/StaticMonitor.h:80 (Diff revision 1) > + Atomic<OffTheBooksMutex*> mMutex; > + Atomic<OffTheBooksCondVar*> mCondVar; This is a little weird, but I assume that having separate member variables rather than a single `Atomic<OffTheBooksMonitor*>` is slightly less work, or something? I think to have `Atomic<OffTheBooksMonitor*>`, we'd need all the stuff you implemented here, plus an extra `OffTheBooksMonitor` class, so it would be more code. WDYT? ::: xpcom/base/StaticMonitor.h:89 (Diff revision 1) > + // Disallow copy constructor, but only in debug mode. We only define > + // a default constructor in debug mode (see above); if we declared > + // this constructor always, the compiler wouldn't generate a trivial > + // default constructor for us in non-debug mode. > +#ifdef DEBUG > + StaticMonitor(StaticMonitor& aOther); Please make the argument here `const`. Bonus points if you file a follow-up for fixing `StaticMutex` in the same way, since I imagine some of the code was copy-pasta'd into this bug. ::: xpcom/base/StaticMonitor.h:93 (Diff revision 1) > +#ifdef DEBUG > + StaticMonitor(StaticMonitor& aOther); > +#endif > + > + // Disallow these operators. > + StaticMonitor& operator=(StaticMonitor* aRhs); I think this should be `operator=(const StaticMonitor&)`? Having a declaration for `operator=(StaticMonitor*)` is a bit strange, as the compiler won't auto-generate a method with such a signature.

Attachment #8988938 - Flags: review?(nfroyd) → review+

Markus Stange [:mstange]

Comment 140

•

7 years ago

(In reply to Doug Thayer [:dthayer] from comment #118) > In any case, there's still two talos regressions here: > > tresize (~2%) - Due to the CGLLockContext call in the resize handler. It > seems we lock things up, which doesn't go over well when we're trying to > glFinishObject our textures and wait on that in the compositor > tsvgx (~8%) - This seems to be limited to ASAP mode, as far as I can tell. > The compositor is stuck in SwapBuffers, and the content process is waiting > on the compositor from SyncTextures. I can't really come up with a good way > around this - I'm open to suggestions, but given the real world benefits of > this change I think it's worth it to eat the regression I agree, these regressions absolutely seem worth eating. We have tons of evidence of the positive effects of this change, and the tsvgx regression, being ASAP-mode-only, is not user facing. I'll try to review these patches over the next several days.

Flags: needinfo?(mstange)

Markus Stange [:mstange]

Comment 141

•

7 years ago

mozreview-review

Comment on attachment 8979706 [details] Bug 1265824 - Add a new texture source type "DirectMapTextureSource" https://reviewboard.mozilla.org/r/245856/#review261264 ::: gfx/layers/composite/TextureHost.h:189 (Diff revision 3) > + // The direct-map cpu buffer should be alive when gpu uses it. And it > + // should not be updated while gpu reads it. This Sync() function > + // implements this synchronized behavior. This comment should probably mention that this method blocks until the GPU is done with this texture. The current comment did not make this obvious to me. There's a comment in the DirectMapTextureSource implementation that mentions this, but I think it should be mentioned here as well.

Markus Stange [:mstange]

Comment 142

•

7 years ago

I realized that, while I have a good idea of what we want to achieve here, I don't actually know the code that's being changed here all that well. Matt / nical / sotaro are actually a much better choice of reviewer for this code. I've looked through all patches except the one that adds the waiting and they all look good to me. I'll try to look at the waiting patch soon.

Matt Woodrow (:mattwoodrow)

Comment 143

•

7 years ago

mozreview-review

Comment on attachment 8979706 [details] Bug 1265824 - Add a new texture source type "DirectMapTextureSource" https://reviewboard.mozilla.org/r/245856/#review262396

Attachment #8979706 - Flags: review?(matt.woodrow) → review+

Matt Woodrow (:mattwoodrow)

Comment 144

•

7 years ago

mozreview-review

Comment on attachment 8979707 [details] Bug 1265824 - Pass the texture direct mapping info to all texture creating functions https://reviewboard.mozilla.org/r/245858/#review262398

Attachment #8979707 - Flags: review?(matt.woodrow) → review+

Matt Woodrow (:mattwoodrow)

Comment 145

•

7 years ago

mozreview-review

Comment on attachment 8979712 [details] Bug 1265824 - Wait on texture handles with IPC https://reviewboard.mozilla.org/r/245868/#review262400 Looking really good! I'm excited to have this landed. Sorry for the slow reviews, was travelling last week. We might need someone else to review the ipc/glue bits, though I'm not sure anyone still knows that code. ::: gfx/layers/client/MultiTiledContentClient.cpp:264 (Diff revision 5) > oldRetainedTiles.Clear(); > > nsIntRegion paintRegion = aPaintRegion; > nsIntRegion dirtyRegion = aDirtyRegion; > + > + AutoTArray<uint64_t, 10> syncTextureSerials; Can you put this block into a helper? Update is a bit long already :) ::: gfx/layers/opengl/TextureHostOGL.cpp:364 (Diff revision 5) > { > gl()->MakeCurrent(); > > if (aInit) { > gl()->fGenTextures(1, &mTextureHandle); > - gl()->fBindTexture(LOCAL_GL_TEXTURE_RECTANGLE_ARB, mTextureHandle); > + gl()->fBindTexture(LOCAL_GL_TEXTURE_2D, mTextureHandle); Would be great if you could fold the GL_TEXTURE_2D changes into the initial patch that adds DirectMapTextureSource. ::: ipc/glue/SharedMemoryBasic_mach.h:54 (Diff revision 5) > > + static void RegisterTextureSourceProvider(layers::TextureSourceProvider* textureSourceProvider); > + static void UpdateTextureLocks(base::ProcessId aProcessId); > + static bool WaitForTextures(base::ProcessId aProcessId, const nsTArray<uint64_t>& aTextureIds); > + static void SetTexturesLocked(pid_t pid, const nsTArray<uint64_t>& textureIds); > + static void SetTexturesUnlocked(pid_t pid, const nsTArray<uint64_t>& textureIds); Hmm, I'm not a huge fan of adding texture logic into the SharedMemoryBasic class. My understanding of this code is that it's currently code that sets up a mach ipc connection between processes, and then some code on top of that which shares memory handles across the connection. Adding texture locking on the same connection makes the class feel a bit awkward. Do you think you could try split the existing statics out into a separate class/namespace, and then have the texture code use that directly? Would be nice to put most (all?) of the texture-specific logic into gfx/layers somewhere. I think we're going to want to extend this again in the future (to support sharing IOSurface objects between processes), so it'd be nice to have a full abstraction around the mach ipc bits, but I don't want to block this landing on that. Just a basic separation here would be a good step in the right direction. ::: ipc/glue/SharedMemoryBasic_mach.mm:125 (Diff revision 5) > kCleanupMsg, > }; > > +const int kTextureLockTimeout = 32; // We really don't want to wait more than > + // two frames for a texture to unlock. This > + // in any case be very uncommon. Missing a word from this comment. ::: ipc/glue/SharedMemoryBasic_mach.mm:281 (Diff revision 5) > if (gMemoryCommPorts.find(pid) == gMemoryCommPorts.end()) { > // We don't have the ports open to communicate with that pid, so we're going to > // ask our parent process over IPC to set them up for us. > if (gParentPid == 0) { > // If we're the top level parent process, we have no parent to ask. > - LOG_ERROR("request for ports for pid %d, but we're the chrome process\n", pid); > + // LOG_ERROR("request for ports for pid %d, but we're the chrome process\n", pid); Why is this error disabled? ::: ipc/glue/SharedMemoryBasic_mach.mm:547 (Diff revision 5) > +} > + > +void > +SharedMemoryBasic::RegisterTextureSourceProvider(layers::TextureSourceProvider* textureSourceProvider) > +{ > + gTextureSourceProvider = textureSourceProvider; I don't think we can use a global here. Compositors are one-per-window, so if you open a second window I'd expect this value to get overwritten.

Attachment #8979712 - Flags: review?(matt.woodrow)

Comment hidden (mozreview-request)

Matt Woodrow (:mattwoodrow)

Comment 154

•

7 years ago

mozreview-review

Comment on attachment 8979712 [details] Bug 1265824 - Wait on texture handles with IPC https://reviewboard.mozilla.org/r/245868/#review262992 ::: gfx/layers/TextureSourceProvider.cpp:31 (Diff revision 6) > + if (bufferTexture && bufferTexture->IsDirectMap()) { > - texture->ReadUnlock(); > + texture->ReadUnlock(); > + auto actor = texture->GetIPDLActor(); > + if (actor) { > + base::ProcessId pid = actor->OtherPid(); > + nsTArray<uint64_t>* textureIds = texturesIdsToUnlockByPid.LookupOrAdd(pid); Might be worth making this code OSX only too, since it's needless overhead (if we're not going to process the textureIdsToUnlockByPid hashtable) and will affect Windows. ::: gfx/layers/TextureSync.cpp:64 (Diff revision 6) > + > + return &gProcessTextureIds.at(pid); > +} > + > +bool > +WaitForTextureIdsToUnlock(pid_t pid, uint64_t* textureIds, uint32_t textureIdsLength) You could use mozilla::Span to wrap the pointer/length array and get nicer for each syntax on the loop. ::: gfx/layers/TextureSync.cpp:82 (Diff revision 6) > + > + if (allCleared) { > + return true; > + } > + > + if (lock.Wait(TimeDuration::FromMilliseconds(kTextureLockTimeout)) == CVStatus::Timeout) { Is it possible for us to get woken up (not due to timeout), but still not have all the textures we're waiting for unlocked? And if so, would it be possible for that to happen repeatedly, such that the total time we spend blocking here is considerably longer than 32ms? ::: gfx/layers/TextureSync.cpp:132 (Diff revision 6) > +} > + > +void > +TextureSync::RegisterTextureSourceProvider(TextureSourceProvider* textureSourceProvider) > +{ > + RefPtr<TextureSourceProvider> ref(textureSourceProvider); You shouldn't need to construct an explicit RefPtr here I don't think, just calling gTextureSourceProviders.AppendElement(textureSourceProvider) should do the right thing. ::: gfx/layers/client/ClientLayerManager.cpp:321 (Diff revision 6) > ClientLayerManager::EndTransactionInternal(DrawPaintedLayerCallback aCallback, > void* aCallbackData, > EndTransactionFlags) > { > + if (mForwarder) { > + mForwarder->UpdateTextureLocks(); Copying the commit message comment about this into an inline comment would be worth it imo. ::: gfx/layers/client/MultiTiledContentClient.cpp:300 (Diff revision 6) > oldRetainedTiles.Clear(); > > nsIntRegion paintRegion = aPaintRegion; > nsIntRegion dirtyRegion = aDirtyRegion; > + > + MaybeSyncTextures(paintRegion, newTiles, scaledTileSize); Not worth blocking landing this, but I think we should consider a non-blocking approach as a follow-up. We have existing code (in GetBackBuffer) that checks if a tile is locked, and reallocates from the pool if it's not available (rather than blocking and waiting). Blocking is a bit sad, since it's dead time that we can't do anything else (like run JS), and I suspect that it's frequently faster to just copy valid pixels from the locked texture into an unlocked one in the pool. I think it'd be worth trying unifying this with the existing lock code, and to see if the non-blocking approach helps.

Attachment #8979712 - Flags: review?(matt.woodrow) → review+

Comment hidden (mozreview-request)

Alex Thayer [:alexical] (she/her)

Assignee

Updated

•

7 years ago

Attachment #8991332 - Attachment is obsolete: true

Alex Thayer [:alexical] (she/her)

Assignee

Comment 158

•

7 years ago

mozreview-review-reply

Comment on attachment 8988938 [details] Bug 1265824 - Add StaticMonitor https://reviewboard.mozilla.org/r/254060/#review261006 > This is a little weird, but I assume that having separate member variables rather than a single `Atomic<OffTheBooksMonitor*>` is slightly less work, or something? I think to have `Atomic<OffTheBooksMonitor*>`, we'd need all the stuff you implemented here, plus an extra `OffTheBooksMonitor` class, so it would be more code. WDYT? Yeah - it felt a little awkward but it felt like just going with less code was the right thing. I think it works out to be a little bit less work on average if two threads were to call it around the same time, since there should in theory be a smaller window for racing on creating both variables.

Alex Thayer [:alexical] (she/her)

Assignee

Comment 159

•

7 years ago

(In reply to Matt Woodrow (:mattwoodrow) from comment #145) > We might need someone else to review the ipc/glue bits, though I'm not sure > anyone still knows that code. Jed, is there any chance you could take a quick look at the SharedMemoryBasic_mach.* and TextureSync.* (WRT the IPC bits) changes here? I realize this is a stretch, but I'm struggling to find a proper owner of this code who's still active, and you did the most recent review. If you can think of someone else who'd be better equipped to review it, let me know.

Flags: needinfo?(jld)

Jed Davis [:jld] ⟨⏰|UTC-8⟩ ⟦he/him⟧

Comment 160

•

7 years ago

I'd actually been hoping to delete SharedMemoryBasic_mach entirely and use shm_open instead, in order to make bug 1473156 (blocking the main thread for 50-100ms on every process launch while the Mach IPC gets bootstrapped) more tractable. On the other hand, if we need to keep it or something like it around to support other uses of Mach IPC, it could be refactored a little to do the setup asynchronously, because it's already creating a dedicated thread per child process and managing per-other-process state. This also goes nicely with the idea in comment #145 to break the core port-wrangling out of SharedMemoryBasic (which I agree with; wedging in graphics hooks like that is… not ideal). So that isn't as much of a problem as I first thought. I'll flag myself for review to take a closer look at the patch; I don't know this code well (yet) but I have been actively working on shared memory in general.

Flags: needinfo?(jld)

Jed Davis [:jld] ⟨⏰|UTC-8⟩ ⟦he/him⟧

Comment 161

•

7 years ago

mozreview-review

Comment on attachment 8979712 [details] Bug 1265824 - Wait on texture handles with IPC https://reviewboard.mozilla.org/r/245868/#review263206

Attachment #8979712 - Flags: review?(jld)

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Updated

•

7 years ago

Blocks: 1373773

Alex Thayer [:alexical] (she/her)

Assignee

Comment 162

•

7 years ago

Hey Jed, I know you're likely busy, so don't feel pressured, but I am hoping to get this landed and any kinks worked out before I go on paternity leave in the next few weeks (unsure when), so if there's any chance you could take a look soonish I would really appreciate it. Or if any of it looks like bits that are not time-sensitive which I could get out of the way as followups, let me know.

Flags: needinfo?(jld)

Jed Davis [:jld] ⟨⏰|UTC-8⟩ ⟦he/him⟧

Comment 163

•

7 years ago

mozreview-review

Comment on attachment 8979712 [details] Bug 1265824 - Wait on texture handles with IPC https://reviewboard.mozilla.org/r/245868/#review264558 I'm not a huge fan of how this is set up with `MemoryPorts` escaping into layout, but I'm planning to refactor this anyway so it won't make much difference in the end. ::: ipc/glue/SharedMemoryBasic_mach.h:15 (Diff revision 7) > #include "base/file_descriptor_posix.h" > #include "base/process.h" > > #include "SharedMemory.h" > #include <mach/port.h> > +#include "chrome/common/mach_ipc_mac.h" Style nit: This could use forward declarations instead of depending on the header, I think?

Attachment #8979712 - Flags: review?(jld) → review+

Jed Davis [:jld] ⟨⏰|UTC-8⟩ ⟦he/him⟧

Updated

•

7 years ago

Flags: needinfo?(jld)

Comment hidden (mozreview-request)

Mozilla Autoland

Comment 172

•

7 years ago

We're sorry, Autoland could not rebase your commits for you automatically. Please manually rebase your commits and try again. hg error in cmd: hg rebase -s 58fbf85eef5898898cde3f3e7934d84111ee22f2 -d bdf8618dc2c3: rebasing 473397:58fbf85eef58 "Bug 1265824 - Remove the unsed member in GLTextureSource r=mattwoodrow" rebasing 473398:30a9dbb81dd5 "Bug 1265824 - Add a new texture source type "DirectMapTextureSource" r=mattwoodrow" rebasing 473399:d7889c23e19a "Bug 1265824 - Pass the texture direct mapping info to all texture creating functions r=mattwoodrow" rebasing 473400:ee7c181cd1db "Bug 1265824 - implement CreateDataTextureSourceAroundYCbCr() and CreateDataTextureSourceAround() for CompositorOGL r=mattwoodrow" rebasing 473401:68d8bb87264e "Bug 1265824 - handle the texture uploading and readLock related things for direct mapping texture source r=mattwoodrow" rebasing 473402:746e0cb37cb2 "Bug 1265824 - Add StaticMonitor r=froydnj" rebasing 473403:fddd4696126c "Bug 1265824 - Wait on texture handles with IPC r=jld,mattwoodrow" merging gfx/layers/client/MultiTiledContentClient.cpp merging gfx/layers/moz.build warning: conflicts while merging gfx/layers/client/MultiTiledContentClient.cpp! (edit, then use 'hg resolve --mark') unresolved conflicts (see hg resolve, then hg rebase --continue)

Comment hidden (mozreview-request)

Pulsebot

Comment 181

•

7 years ago

Pushed by dothayer@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/367abce30668 Remove the unsed member in GLTextureSource r=mattwoodrow https://hg.mozilla.org/integration/autoland/rev/efce316a4425 Add a new texture source type "DirectMapTextureSource" r=mattwoodrow https://hg.mozilla.org/integration/autoland/rev/984133e9614b Pass the texture direct mapping info to all texture creating functions r=mattwoodrow https://hg.mozilla.org/integration/autoland/rev/4731dc56702d implement CreateDataTextureSourceAroundYCbCr() and CreateDataTextureSourceAround() for CompositorOGL r=mattwoodrow https://hg.mozilla.org/integration/autoland/rev/be68741ff4ce handle the texture uploading and readLock related things for direct mapping texture source r=mattwoodrow https://hg.mozilla.org/integration/autoland/rev/51795de4adaf Add StaticMonitor r=froydnj https://hg.mozilla.org/integration/autoland/rev/b5ba15b1a70f Wait on texture handles with IPC r=jld,mattwoodrow https://hg.mozilla.org/integration/autoland/rev/1099d6f15f9f Guard client storage stuff behind a pref r=mattwoodrow

Eliza Balazs [:ebalazs_]

Comment 182

•

7 years ago

Flags: needinfo?(dothayer)

Comment hidden (mozreview-request)

Alex Thayer [:alexical] (she/her)

Assignee

Comment 185

•

7 years ago

Quick update - just trying to diagnose this crashtest failure: https://treeherder.mozilla.org/logviewer.html#?job_id=188974850&repo=try&lineNumber=49882

Flags: needinfo?(dothayer)

Alex Thayer [:alexical] (she/her)

Assignee

Comment 186

•

7 years ago

I think I've got it boiled down to a texture size issue. I was missing logic in a couple of places to ensure we don't get to a DirectMapTextureSource for textures that exceed the max texture size. Unfortunately verifying this is excruciatingly slow because I can't reproduce the issue locally and we're strapped for OSX resources on try. Hopefully this next run is green by the end of the day though.

Comment hidden (mozreview-request)

Matt Woodrow (:mattwoodrow)

Comment 197

•

7 years ago

mozreview-review

Comment on attachment 8994090 [details] Bug 1265824 - Remove CreateForYCbCrWithBufferSize https://reviewboard.mozilla.org/r/258694/#review265600

Attachment #8994090 - Flags: review?(matt.woodrow) → review+

Matt Woodrow (:mattwoodrow)

Comment 198

•

7 years ago

mozreview-review

Comment on attachment 8994091 [details] Bug 1265824 - Plug holes in texture size restrictions https://reviewboard.mozilla.org/r/258696/#review265602

Attachment #8994091 - Flags: review?(matt.woodrow) → review+

Pulsebot

Comment 199

•

7 years ago

Pushed by dothayer@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/bf9d73b214fc Remove the unsed member in GLTextureSource r=mattwoodrow https://hg.mozilla.org/integration/autoland/rev/6aecd088e02c Add a new texture source type "DirectMapTextureSource" r=mattwoodrow https://hg.mozilla.org/integration/autoland/rev/295dd1a6a09f Pass the texture direct mapping info to all texture creating functions r=mattwoodrow https://hg.mozilla.org/integration/autoland/rev/71b23fbe1fec implement CreateDataTextureSourceAroundYCbCr() and CreateDataTextureSourceAround() for CompositorOGL r=mattwoodrow https://hg.mozilla.org/integration/autoland/rev/39ae151b3d8c handle the texture uploading and readLock related things for direct mapping texture source r=mattwoodrow https://hg.mozilla.org/integration/autoland/rev/239ab9f9ef52 Add StaticMonitor r=froydnj https://hg.mozilla.org/integration/autoland/rev/c141fb67cf9a Wait on texture handles with IPC r=jld,mattwoodrow https://hg.mozilla.org/integration/autoland/rev/7c90215a2eca Guard client storage stuff behind a pref r=mattwoodrow https://hg.mozilla.org/integration/autoland/rev/27c7daabd1a3 Remove CreateForYCbCrWithBufferSize r=mattwoodrow https://hg.mozilla.org/integration/autoland/rev/391c8e7897df Plug holes in texture size restrictions r=mattwoodrow

Cosmin Sabou [:CosminS]

Comment 200

•

7 years ago

Backed out 10 changesets (bug 1265824) for causing reftests failures on global-composite-operation.html. Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-searchStr=reftest&fromchange=391c8e7897dfa02bc87a0bcff064b3e090711248&tochange=5d7c6ffc08ef5910ec64883c68b17b9f40a8859d&selectedJob=189557313 Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=189557313&repo=autoland&lineNumber=22473 https://treeherder.mozilla.org/logviewer.html#?job_id=189560220&repo=autoland&lineNumber=22071 https://treeherder.mozilla.org/logviewer.html#?job_id=189559424&repo=autoland&lineNumber=23901 Backout link: https://hg.mozilla.org/integration/autoland/rev/5d7c6ffc08ef5910ec64883c68b17b9f40a8859d

Flags: needinfo?(dothayer)

Comment hidden (mozreview-request)

Pulsebot

Comment 209

•

7 years ago

Pushed by dothayer@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/3b0192c401cf Remove the unsed member in GLTextureSource r=mattwoodrow https://hg.mozilla.org/integration/autoland/rev/cf95b1f62eae Add a new texture source type "DirectMapTextureSource" r=mattwoodrow https://hg.mozilla.org/integration/autoland/rev/1fe8be31198a Pass the texture direct mapping info to all texture creating functions r=mattwoodrow https://hg.mozilla.org/integration/autoland/rev/f428582ce4dc implement CreateDataTextureSourceAroundYCbCr() and CreateDataTextureSourceAround() for CompositorOGL r=mattwoodrow https://hg.mozilla.org/integration/autoland/rev/ee6b49ee41c9 handle the texture uploading and readLock related things for direct mapping texture source r=mattwoodrow https://hg.mozilla.org/integration/autoland/rev/ce33d2c68119 Add StaticMonitor r=froydnj https://hg.mozilla.org/integration/autoland/rev/c7ea73331fc8 Wait on texture handles with IPC r=jld,mattwoodrow https://hg.mozilla.org/integration/autoland/rev/71422954f047 Guard client storage stuff behind a pref r=mattwoodrow https://hg.mozilla.org/integration/autoland/rev/1b574f2c0ec0 Remove CreateForYCbCrWithBufferSize r=mattwoodrow https://hg.mozilla.org/integration/autoland/rev/96ce98b72056 Plug holes in texture size restrictions r=mattwoodrow

Raul Gurzau (:RaulG)

Comment 210

•

7 years ago

bugherder

Status: ASSIGNED → RESOLVED

Closed: 7 years ago

status-firefox63: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla63

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Comment 211

•

7 years ago

\o/

Alex Thayer [:alexical] (she/her)

Assignee

Updated

•

7 years ago

Flags: needinfo?(dothayer)

Jeff Muizelaar [:jrmuizel]

Reporter

Updated

•

7 years ago

Blocks: 1400787

Hiroyuki Ikezoe (:hiro)

Comment 212

•

7 years ago

\o/

Mark

Comment 213

•

7 years ago

How do I enable this? I'm on Nightly Mac 2018-07-24 and I created a new pref called gfx.allow-texture-direct-mapping and set it true and restarted Firefox. But I didn't see any dramatic differences between true and false on gsmarena.com. IIRC when I was playing with the Try version some months ago I did see an improvement. Perhaps I have done something wrong. Is there any definitive test I can try e.g. a benchmark to see whether ClientStorageTextureSource is working or not? Thanks

Alex Thayer [:alexical] (she/her)

Assignee

Comment 214

•

7 years ago

(In reply to Mark from comment #213) > How do I enable this? I'm on Nightly Mac 2018-07-24 and I created a new pref > called gfx.allow-texture-direct-mapping and set it true and restarted > Firefox. But I didn't see any dramatic differences between true and false on > gsmarena.com. IIRC when I was playing with the Try version some months ago I > did see an improvement. Perhaps I have done something wrong. > > Is there any definitive test I can try e.g. a benchmark to see whether > ClientStorageTextureSource is working or not? > > Thanks Make sure to try with today's build. There are two builds per day; it came out on build id 20180724223402, and wasn't in the build yet for 20180724100052, so you could be running a 2018-07-24 build and still not see it. Anyway, to be certain, grab a profile with the https://perf-html.io profiler and search for "DirectMapTextureSource" in the Compositor thread. If you're seeing jank in the Compositor and no DirectMapTextureSource, you're likely not using client storage.

Mark

Comment 215

•

7 years ago

So it turns out that patience IS a virtue... It works on today's Nightly. I see DirectMapTextureSource and significant improvement in gsmarena.com jank at lower resolutions, still struggling a bit at 1680x1050. I'll have a fiddle and see what else I see. Doug, Good luck with your parental leave!

Alex Thayer [:alexical] (she/her)

Assignee

Comment 216

•

7 years ago

(In reply to Mark from comment #215) > So it turns out that patience IS a virtue... > > It works on today's Nightly. I see DirectMapTextureSource and significant > improvement in gsmarena.com jank at lower resolutions, still struggling a > bit at 1680x1050. I'll have a fiddle and see what else I see. Yeah - regarding gsmarena.com, one thing you might still see is a lot of checkerboarding, and large Rasterize times in the content process spend in ShmemTextureData::Create. From my testing this was an existing problem that was just being hidden by the previous jank. You'll see this for a while when scrolling while the texture pool fills up. I think there's more we can do here with our texture pooling, but I think it's a separate bug. (Looking into filing one about what I'm seeing right now.) Does this match what you're seeing, or are you seeing something else? Does your issue go away after 10-15 seconds of scrolling up and down on gsmarena? > Doug, Good luck with your parental leave! Thanks! :)

Alex Thayer [:alexical] (she/her)

Assignee

Comment 217

•

7 years ago

(Actually - I think Matt's comment in the gsmarena bug[1] is a better account of what's still going on. This bug should help the upload performance but there's still a lot left unchanged. I still think some of this can be shaved off with pooling adjustments though.) [1]: https://bugzilla.mozilla.org/show_bug.cgi?id=1457413#c18

Mark

Comment 218

•

7 years ago

(In reply to Doug Thayer [:dthayer] (On parental leave soon) from comment #216) > Does this match what > you're seeing, or are you seeing something else? Does your issue go away > after 10-15 seconds of scrolling up and down on gsmarena? When gfx.allow-texture-direct-mapping is false there is a huge jank as the sidebar scrolls up, often get 0 frames of sidebar animation, instead the whole scroll freezes for ~100s of ms and the sidebar appears. When gfx.allow-texture-direct-mapping is true the jank gets substantially better, it's still far from butter smooth even at 1280x800 default resolution. But in both cases there's a lot of checkerboarding particularly if I fling back up to the top of the page. Not sure about the 10-15 seconds, maybe, maybe not.

Ionuț Goldan [:igoldan]

Updated

•

7 years ago

Depends on: 1478565

Ionuț Goldan [:igoldan]

Comment 219

•

7 years ago

Big AWSY perf improvement: == Change summary for alert #14548 (as of Tue, 24 Jul 2018 01:22:55 GMT) == Improvements: 8% Resident Memory osx-10-10 opt stylo 707,995,740.35 -> 648,727,504.31 For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=14548

Dan Mosedale (:dmosedale, :dmose)

Comment 220

•

7 years ago

Thanks so much to Doug as well as to everyone else who helped out here! My Mac with lots of tabs feels like it runs the fans much less, and unlike before this push, Firefox processes almost never show up as big CPU-eaters in Activity Monitor. I wonder if this effects enough people to deserve a release note, in the interest of re-attracting Mac users who tried after Quantum but couldn't take all the CPU usage.

Michal K

Comment 221

•

7 years ago

Many many thanks for this implementation. This solution improves my Firefox's behaviour perceivably, especially if I have iMac retina with 5K resolution. I'm not sure how significant improvement can this implementation bring, but it noticeably improves my tab switching (almost instant), page scrolling (less stuttering) and video playback (better smoothness, less cpu usage) on my machine. Once again, great job! I would say it really deserves to be mentioned in release notes.

Michal K

Comment 222

•

7 years ago

I'm sorry for additional spamming :-( but I'm curious. Are there any plans with bugs, which this feature blocks?

Martin Stránský [:stransky] (ni? me)

Updated

•

7 years ago

Blocks: 1406230

Alex Thayer [:alexical] (she/her)

Assignee

Updated

•

7 years ago

Depends on: 1477914

Jed Davis [:jld] ⟨⏰|UTC-8⟩ ⟦he/him⟧

Updated

•

7 years ago

Updated

•

7 years ago

Blocks: 1453080

Sylvestre Ledru [:Sylvestre]

Comment 224

•

7 years ago

Release Note Request (optional, but appreciated) [Why is this notable]: Important improvement on Mac [Affects Firefox for Android]: No [Suggested wording]: Improved the reactivity of Firefox on Mac OS X (I am sure we can find a better wording) [Links (documentation, blog post, etc)]: Are we going to write something about that?

relnote-firefox: --- → ?

Pascal Chevrel:pascalc

Comment 225

•

7 years ago

I am adding that to Nightly 63 release notes with this wording: Improved the reactivity of Firefox on MacOS Looking at the patch (https://hg.mozilla.org/mozilla-central/rev/71422954f047#l2.12) this is not riding the trains and I am not seeing a bug about it in the dependencies. Is this something planned for 63 or will it be activated in a later release? If the former, please note that the Nightly Soft Freeze is on August 23 and preferences should not be activated after the soft freeze until the merge to beta (https://wiki.mozilla.org/Release_Management/Release_Process#Nightly_soft_code_freeze). Mike, do you know what the release status of this feature is? Thanks

Flags: needinfo?(jnicol) → needinfo?(mconley)

Keywords: feature

Mike Conley (:mconley) (:⚙️)

Comment 226

•

7 years ago

Redirecting to mstange. Hey Markus, this has been enabled on Nightly for a few weeks now, and I don't think we've observed any ill effects. Do you think this is stable enough to ride the 63 train out to release?

Flags: needinfo?(mconley) → needinfo?(mstange)

Markus Stange [:mstange]

Comment 227

•

7 years ago

I haven't seen any reports of problems, so yes, let's turn it on!

Flags: needinfo?(mstange)

Mike Conley (:mconley) (:⚙️)

Updated

•

7 years ago

Blocks: 1481913

Sören Hentzschel

Updated

•

7 years ago

Depends on: 1483533

Ritu Kothari (:ritu) (Inactive, please n-i to RyanVM, jcristau, or pascal)

Comment 228

•

6 years ago

Added to Beta 63 release notes.

relnote-firefox: ? → 63+

Markus Stange [:mstange]

Updated

•

6 years ago

Depends on: 1539680

Diana Rus

Updated

•

6 years ago

Regressions: 1562888

Kevin Brosnan [Ex-Mozilla]

Updated

•

6 years ago

No longer regressions: 1562888

Dave Hunt [:davehunt] [he/him] ⌚BST

Updated

•

3 years ago

Performance Impact: --- → P3

Whiteboard: [gfx-noted][fxperf:p1][qf:p3:f64] → [gfx-noted][fxperf:p1]

Snapshot of etherpad 9 years ago Kartikaya Gupta (email:kats@mozilla.staktrace.com) 2.98 KB, text/plain		Details
WIP: client storage support 9 years ago Matt Woodrow (:mattwoodrow) 20.06 KB, patch		Details \| Diff \| Splinter Review
WIP towards better reuse/finishing 8 years ago Jeff Muizelaar [:jrmuizel] 14.11 KB, patch		Details \| Diff \| Splinter Review
WIP 8 years ago Jeff Muizelaar [:jrmuizel] 25.18 KB, patch		Details \| Diff \| Splinter Review
CoreAnimation timeline 8 years ago Markus Stange [:mstange] 2.32 MB, image/jpeg		Details
Video demontrating the jank 8 years ago (no longer active) 5.03 MB, video/webm		Details
Elaborated sketch 8 years ago Jeff Muizelaar [:jrmuizel] 37.68 KB, patch		Details \| Diff \| Splinter Review
P1: remove the unsed member in GLTextureSource. 7 years ago Jerry Shih[:jerry] (UTC+8) (inactive) 3.84 KB, patch		Details \| Diff \| Splinter Review
P2: add a new texture source type "DirectMapTextureSource". 7 years ago Jerry Shih[:jerry] (UTC+8) (inactive) 8.87 KB, patch		Details \| Diff \| Splinter Review
P3: pass the texture direct mapping capability from CompositorOGL to KnowsCompositor. 7 years ago Jerry Shih[:jerry] (UTC+8) (inactive) 8.36 KB, patch		Details \| Diff \| Splinter Review
P4: pass the texture direct mapping info to all texture creating functions. 7 years ago Jerry Shih[:jerry] (UTC+8) (inactive) 36.72 KB, patch		Details \| Diff \| Splinter Review
P5: implement CreateDataTextureSourceAroundYCbCr() and CreateDataTextureSourceAround() for CompositorOGL. 7 years ago Jerry Shih[:jerry] (UTC+8) (inactive) 5.57 KB, patch		Details \| Diff \| Splinter Review
P6: handle the texture uploading and readLock related things for direct mapping texture source. 7 years ago Jerry Shih[:jerry] (UTC+8) (inactive) 3.88 KB, patch		Details \| Diff \| Splinter Review
P7: don't recycle the texture source if it's direct mapping. 7 years ago Jerry Shih[:jerry] (UTC+8) (inactive) 1.25 KB, patch		Details \| Diff \| Splinter Review
Bug 1265824 - Remove the unsed member in GLTextureSource 7 years ago Alex Thayer [:alexical] (she/her) 59 bytes, text/x-review-board-request	mattwoodrow : review+	Details
Bug 1265824 - Add a new texture source type "DirectMapTextureSource" 7 years ago Alex Thayer [:alexical] (she/her) 59 bytes, text/x-review-board-request	mattwoodrow : review+	Details
Bug 1265824 - Pass the texture direct mapping info to all texture creating functions 7 years ago Alex Thayer [:alexical] (she/her) 59 bytes, text/x-review-board-request	mattwoodrow : review+	Details
Bug 1265824 - pass the texture direct mapping info to all texture creating functions. 7 years ago Alex Thayer [:alexical] (she/her) 59 bytes, text/x-review-board-request		Details
Bug 1265824 - implement CreateDataTextureSourceAroundYCbCr() and CreateDataTextureSourceAround() for CompositorOGL 7 years ago Alex Thayer [:alexical] (she/her) 59 bytes, text/x-review-board-request	mattwoodrow : review+	Details
Bug 1265824 - handle the texture uploading and readLock related things for direct mapping texture source 7 years ago Alex Thayer [:alexical] (she/her) 59 bytes, text/x-review-board-request	mattwoodrow : review+	Details
Bug 1265824 - Don't recycle the texture source if it's direct mapping 7 years ago Alex Thayer [:alexical] (she/her) 59 bytes, text/x-review-board-request		Details
Bug 1265824 - Wait on texture handles with IPC 7 years ago Alex Thayer [:alexical] (she/her) 59 bytes, text/x-review-board-request	mattwoodrow : review+	Details
Bug 1265824 - Avoid extra copies in driver 7 years ago Alex Thayer [:alexical] (she/her) 59 bytes, text/x-review-board-request		Details
Bug 1265824 - Use glFinishObjectAPPLE instead of glClientWaitSync 7 years ago Alex Thayer [:alexical] (she/her) 59 bytes, text/x-review-board-request		Details
logviz.html 7 years ago Jeff Muizelaar [:jrmuizel] 1.62 KB, text/html		Details
logviz.html 7 years ago Alex Thayer [:alexical] (she/her) 1.96 MB, text/html		Details
logviz.html 7 years ago Alex Thayer [:alexical] (she/her) 6.21 MB, text/html		Details
logviz.html 7 years ago Alex Thayer [:alexical] (she/her) 4.20 MB, text/html		Details
withoutFinishObject.html 7 years ago Alex Thayer [:alexical] (she/her) 8.29 MB, text/html		Details
withFinishObject.html 7 years ago Alex Thayer [:alexical] (she/her) 4.61 MB, text/html		Details
Bug 1265824 - Guard client storage stuff behind a pref 7 years ago Alex Thayer [:alexical] (she/her) 59 bytes, text/x-review-board-request	mattwoodrow : review+	Details
Bug 1265824 - Don't use client storage for overly large textures 7 years ago Alex Thayer [:alexical] (she/her) 59 bytes, text/x-review-board-request		Details
Bug 1265824 - Add StaticMonitor 7 years ago Alex Thayer [:alexical] (she/her) 59 bytes, text/x-review-board-request	froydnj : review+	Details
ipc changes 7 years ago Alex Thayer [:alexical] (she/her) 59 bytes, text/x-review-board-request		Details
Bug 1265824 - Remove CreateForYCbCrWithBufferSize 7 years ago Alex Thayer [:alexical] (she/her) 59 bytes, text/x-review-board-request	mattwoodrow : review+	Details
Bug 1265824 - Plug holes in texture size restrictions 7 years ago Alex Thayer [:alexical] (she/her) 59 bytes, text/x-review-board-request	mattwoodrow : review+	Details