Closed Bug 1300121 Opened 4 years ago Closed 3 years ago

Invalidation issue with PDF.js and copy-on-write canvas when switching tabs.

Categories

(Core :: Graphics: Layers, defect)

51 Branch
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla53
Tracking Status
firefox53 --- fixed

People

(Reporter: nical, Assigned: nical)

References

()

Details

Attachments

(6 files, 1 obsolete file)

Loic reported some issues around this with persistent buffer provider shared enabled on windows.

> No crash for me with this build when scrolling the PDF, but I got some
> invalidation areas (black background for 1 or 2 pages of the PDF) when
> switching from another tab to the PDF tab.
Attached image screenshot1.jpg
Attached image screenshot2.jpg
Version: unspecified → 51 Branch
Loic, there have been some changes in this area and the bug might have been fixed in the process. could you test this build https://archive.mozilla.org/pub/firefox/try-builds/nsilva@mozilla.com-303c493ab06f92d39ef1bf541e6e780963f8678f/try-win32/

It also has some additional logging. If the bug is still reproducible, could you look at the "failures" entries in the graphics section of the about:support after the bug has occured?

Thanks!
Flags: needinfo?(epinal99-bugzilla2)
Attached file about_support.txt
Invalisation is still visible with this build, when switching, some pages of the PDF have a black background.

about:support>graphics attached.
Flags: needinfo?(epinal99-bugzilla2)
Thanks, the content of log isn't what I suspected which is interesting.

Could you try this one? https://archive.mozilla.org/pub/firefox/try-builds/nsilva@mozilla.com-26f85a4a55c5c6315f4b0fd478760ebc5fd1e09a/try-win32/

If by any chance this fixes the issue, please still check the failures entries in about:support.

Thanks again!
Flags: needinfo?(epinal99-bugzilla2)
PDD linked for the record.
Still broken.

failures	CP+[GFX1-]: [BufferProvider] BorrowDT: Failed, allocating too many textures.
Failure Log
(#0) 	CP+[GFX1-]: [BufferProvider] BorrowDT: Failed, allocating too many textures.
(#1) 	CP+[GFX1-]: [BufferProvider] BorrowDT: Failed, allocating too many textures.
(#2) 	CP+[GFX1-]: [BufferProvider] BorrowDT: Failed, allocating too many textures.
(#3) 	CP+[GFX1-]: [BufferProvider] BorrowDT: Failed, allocating too many textures.
(#4) 	CP+[GFX1-]: [BufferProvider] BorrowDT: Failed, allocating too many textures.
(#5) 	CP+[GFX1-]: [BufferProvider] BorrowDT: Failed, allocating too many textures.
(#6) 	CP+[GFX1-]: [BufferProvider] BorrowDT: Failed, allocating too many textures.
(#7) 	CP+[GFX1-]: [BufferProvider] BorrowDT: Failed, allocating too many textures.
Flags: needinfo?(epinal99-bugzilla2)
I have another build ready for you to try when you have time, with more potential fixes: https://archive.mozilla.org/pub/firefox/try-builds/nsilva@mozilla.com-ac7d576f2a9e12613d84afbe855839cba9c18bab/try-win32/
I got crash when scrolling the PDF so it's hard to tell if invalidation is still present.
https://crash-stats.mozilla.com/report/index/c4720808-c9d3-43cd-b7a9-7925e2161003
https://crash-stats.mozilla.com/report/index/a1353f83-6f82-44fd-86be-9fee52161003

failures	CP+[GFX1]: Unexpected BufferProvider over-production.
Failure Log
(#0) 	CP+[GFX1]: Unexpected BufferProvider over-production.
(#1) 	CP+[GFX1]: Unexpected BufferProvider over-production.
(#2) 	CP+[GFX1]: Unexpected BufferProvider over-production.
(#3) 	CP+[GFX1]: Unexpected BufferProvider over-production.
(#4) 	CP+[GFX1]: Unexpected BufferProvider over-production.
(#5) 	CP+[GFX1]: Unexpected BufferProvider over-production.
Yet another build for you Loic! https://archive.mozilla.org/pub/firefox/try-builds/nsilva@mozilla.com-871c7e383ac967dc1d301b2c7035204020597534/try-win32/

It hardens the code around canvas some more, hopefully fixing the crash from the previous attempt.
No crash anymore but still invalidation. On some pages, the rendering stops and only a part of the PDF is displayed. 

failures	CP+[GFX1]: Unexpected BufferProvider over-production.
Failure Log
(#0) 	CP+[GFX1]: Unexpected BufferProvider over-production.
(#1) 	CP+[GFX1]: Unexpected BufferProvider over-production.
(#2) 	CP+[GFX1]: Unexpected BufferProvider over-production.
New build for you Loic: https://archive.mozilla.org/pub/firefox/try-builds/nsilva@mozilla.com-b74f0d74fe8e8bab8131833483828042955045f0/try-win64/

Thank you so much for your help.
Flags: needinfo?(epinal99-bugzilla2)
It seems to be fixed, I scrolled up/down many times, switched bewtween tabs after 2-3 min and I can't reproduce the invalidation with black areas.
I tried with e10s on/off, HWA on/off same result, no invalidation.

Good job! :)
Flags: needinfo?(epinal99-bugzilla2)
I dunno if it's important but I still see:

failures	CP+[GFX1]: Unexpected BufferProvider over-production.
Failure Log
(#0) 	CP+[GFX1]: Unexpected BufferProvider over-production.
(#1) 	CP+[GFX1]: Unexpected BufferProvider over-production.

(in both win32/64 builds)
(In reply to Loic from comment #13)
> It seems to be fixed, 

\o/

> I dunno if it's important but I still see:
> 
> failures	CP+[GFX1]: Unexpected BufferProvider over-production.

I'll keep an eye on this. As long as it happens only at punctually (apparently during tab switch) and not continuously the effects of getting in that state are manageable but it does mean canvas is consuming more memory than it should.

Thanks again.
In fact, I see these failures only when HWA is enabled. The rendering of the PDF when scrolling is even faster with HWA disabled (at least with this PDF).
This seems to have helped here but was motivated by another bug which I can't find. The other bug was about non-gecko memory allocations accumulating while the window is minimized or something along those lines.
Attachment #8805089 - Flags: review?(bas)
The smallest but perhaps most impactful change in this patch is to make IsTargetValid return true if and only if we have a DT to draw into. the method used to return tru if we had a valid target OR if we had a valid provider but no target. In most places we use this method what we really want to know is whether we can use the DT right after calling EnsureTarget. The only place where we want to keep the previous behaviour is GetCanvasLayer.
Attachment #8805110 - Flags: review?(bas)
And if all fails, fallback to the basic provider rather than not getting any DrawTarget at all.
Attachment #8805112 - Flags: review?(bas)
Attachment #8805112 - Flags: review?(bas) → review+
Comment on attachment 8805110 [details] [diff] [review]
Make switching between buffer providers more robust and fix IsTargetValid

Review of attachment 8805110 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/canvas/CanvasRenderingContext2D.cpp
@@ +1632,5 @@
> +  bool needsClear = !canDiscardContent;
> +  if (newTarget->GetBackendType() == gfx::BackendType::SKIA) {
> +    // Skia expects the unused X channel to contains 0 even for opaque operations
> +    // so we can't skip clearing in that case, even if we are going to cover the
> +    // entire canvas in the next drawing operation.

This comment confuses me, can you explain this a little more?
Comment on attachment 8805089 [details] [diff] [review]
Flush the D3D11 immediate context if we don't present the frmae to avoid resources queing up in the driver.

Review of attachment 8805089 [details] [diff] [review]:
-----------------------------------------------------------------

This is probably a good idea anyway, yes.
Attachment #8805089 - Flags: review?(bas) → review+
(In reply to Bas Schouten (:bas.schouten) from comment #20)
> Comment on attachment 8805110 [details] [diff] [review]
> Make switching between buffer providers more robust and fix IsTargetValid
> 
> Review of attachment 8805110 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: dom/canvas/CanvasRenderingContext2D.cpp
> @@ +1632,5 @@
> > +  bool needsClear = !canDiscardContent;
> > +  if (newTarget->GetBackendType() == gfx::BackendType::SKIA) {
> > +    // Skia expects the unused X channel to contains 0 even for opaque operations
> > +    // so we can't skip clearing in that case, even if we are going to cover the
> > +    // entire canvas in the next drawing operation.
> 
> This comment confuses me, can you explain this a little more?

With the BGRX format, the skia backend relies on the (theoretically unused) X channel to be filled with 0xFF (and not 0, my comment is wrong) when performing drawing operations on top of it. This code runs when we just allocated a new buffer provider to render our canvas into so we don't know what's in it. usually we'd avoid clearing the buffer if we know that the next drawing command is an opaque operation which will cover all of the pixels, but in this case (skia) we can't skip clearing the target because the next drawing operation does rely the X channel only containing 0xFF which we don't know for sure.

On second thought it may make more sense to move this logic inside PersistentBufferProvider and let CanvasRenderingContext2D assume that whenever it gets a DrawTarget, the latter is in a usable state.
(In reply to Nicolas Silva [:nical] from comment #22)
> (In reply to Bas Schouten (:bas.schouten) from comment #20)
> > Comment on attachment 8805110 [details] [diff] [review]
> > Make switching between buffer providers more robust and fix IsTargetValid
> > 
> > Review of attachment 8805110 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > ::: dom/canvas/CanvasRenderingContext2D.cpp
> > @@ +1632,5 @@
> > > +  bool needsClear = !canDiscardContent;
> > > +  if (newTarget->GetBackendType() == gfx::BackendType::SKIA) {
> > > +    // Skia expects the unused X channel to contains 0 even for opaque operations
> > > +    // so we can't skip clearing in that case, even if we are going to cover the
> > > +    // entire canvas in the next drawing operation.
> > 
> > This comment confuses me, can you explain this a little more?
> 
> With the BGRX format, the skia backend relies on the (theoretically unused)
> X channel to be filled with 0xFF (and not 0, my comment is wrong) when
> performing drawing operations on top of it. This code runs when we just
> allocated a new buffer provider to render our canvas into so we don't know
> what's in it. usually we'd avoid clearing the buffer if we know that the
> next drawing command is an opaque operation which will cover all of the
> pixels, but in this case (skia) we can't skip clearing the target because
> the next drawing operation does rely the X channel only containing 0xFF
> which we don't know for sure.
> 
> On second thought it may make more sense to move this logic inside
> PersistentBufferProvider and let CanvasRenderingContext2D assume that
> whenever it gets a DrawTarget, the latter is in a usable state.

That's what I'm thinking, and thanks for clearing that up, the confusion for me was the 0 vs 255.
Same patch with the comment fixed.
Attachment #8805110 - Attachment is obsolete: true
Attachment #8805110 - Flags: review?(bas)
Attachment #8807079 - Flags: review?(bas)
Comment on attachment 8807079 [details] [diff] [review]
Make switching between buffer providers more robust and fix IsTargetValid (v2)

Review of attachment 8807079 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/canvas/CanvasRenderingContext2D.cpp
@@ +1629,5 @@
>    MOZ_ASSERT(newTarget);
>    MOZ_ASSERT(newProvider);
>  
> +  bool needsClear = !canDiscardContent;
> +  if (newTarget->GetBackendType() == gfx::BackendType::SKIA) {

Hrm, that should only be needed the first time though right? After this there's going to be 0xFF in there, and this will never change.
Attachment #8807079 - Flags: review?(bas) → review+
(In reply to Bas Schouten (:bas.schouten) from comment #25)
> Hrm, that should only be needed the first time though right? After this
> there's going to be 0xFF in there, and this will never change.

Yep, this code only runs when we allocated a new provider, not when we already have one and it successfully gave us a draw target.
Pushed by nsilva@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c10416d7247d
Flush the D3D11 immediate context if a composition is cancelled to avoid resources queing up in the driver. r=Bas
https://hg.mozilla.org/integration/mozilla-inbound/rev/f96d497c78a7
Make switching between canvas buffer providers a tad more robust. r=Bas
https://hg.mozilla.org/integration/mozilla-inbound/rev/900960e68528
Make PersistentBufferProviderShared::BorrowDrawTarget more robust agaist bad edge cases and if all fails, fallback to the basic provider. r=Bas
You need to log in before you can comment on or make changes to this bug.