Do Async Canvas layers update

RESOLVED FIXED in mozilla22

Status

()

Core
Graphics: Layers
RESOLVED FIXED
5 years ago
4 years ago

People

(Reporter: BenWa, Assigned: snorp)

Tracking

(Blocks: 3 bugs)

Trunk
mozilla22
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(relnote-firefox 22+, fennec+)

Details

(Whiteboard: [games:p1])

Attachments

(2 attachments, 10 obsolete attachments)

14.71 KB, text/plain
Details
16.05 KB, patch
BenWa
: review+
Details | Diff | Splinter Review
(Reporter)

Description

5 years ago
We spend 12% of our time in the latest eideticker run waiting for sync transaction:
http://people.mozilla.com/~bgirard/cleopatra/?zippedProfile=profiles/sps-profile-1357145803.48.zip&videoCapture=videos/video-1357146004.13.webm#

Updates containing canvas updates must be made synchronous because they require a surface swap. We have already solved this problem for web content by using TiledThebesLayer. By solving this for canvas we should be able to remove this sync wait and gain this 12%. However we might have to do an extra copy to send the canvas asynchronously.

Doing this change now may conflict with the layers refactor but I'm not sure.
(Reporter)

Comment 1

5 years ago
The support was initialized implemented for TiledThebesLayer in bug 748649.
Depends on: 748649
(Reporter)

Updated

5 years ago
tracking-fennec: --- → ?
(Reporter)

Updated

5 years ago
Blocks: 829752
(Reporter)

Comment 2

5 years ago
Just to be clear this bug will help in demos that run at about 25 FPS or higher. We pay a small cost for each transaction so if we do less then 25 FPS that cost become negligible.
No longer blocks: 829752
tracking-fennec: ? → +
Now that we have (or will shortly have) bug 716859 (streaming webgl buffers), this will become a high priority.  Not having this is essentially blocking the largest part of the speedup that we get from that bug.

BenWa, are you the possibly the best person to work on this?  How much work do you think it is?  This is likely going to end up being a big blocker for MWC and GDC WebGL demos as well, both on B2G and Android.
OS: Mac OS X → All
Hardware: x86 → All
Whiteboard: [games:p1]
Blocks: 835165
(Reporter)

Comment 4

5 years ago
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #3)
> Now that we have (or will shortly have) bug 716859 (streaming webgl
> buffers), this will become a high priority.  Not having this is essentially
> blocking the largest part of the speedup that we get from that bug.

Agreed for anything that is at ~25FPS or more is likely gated on this. For WebGL the benefit may be seen earlier if we blocking on glFinish

> 
> BenWa, are you the possibly the best person to work on this?  How much work
> do you think it is?  This is likely going to end up being a big blocker for
> MWC and GDC WebGL demos as well, both on B2G and Android.

Doing this for WebGL or 2D is different patch. I would imagine Jeff is the best person. Is patches are supposed to make sending a resolving buffer easy thus it should just be a matter of using a NoSwap variant transaction edit in theory.
Sounds like this is important enough to spend some time on figuring out the details.  It may be that we could split this into a few separate pieces (e.g., WebGL vs. 2D if that's what BenWa meant) and deliver it sooner, rather than later?  Or at least shorten the calendar days from starting to finishing.
Created attachment 710240 [details] [diff] [review]
Update WebGL canvases asynchronously
Depends on: 716859
Attachment #710240 - Flags: review?(jgilbert)
Attachment #710240 - Flags: feedback?(bgirard)
Try run here: https://tbpl.mozilla.org/?tree=Try&rev=b0ac0d27297e

Comment 8

5 years ago
Try run for b0ac0d27297e is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=b0ac0d27297e
Results (out of 12 total builds):
    failure: 12
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jwillcox@mozilla.com-b0ac0d27297e
Created attachment 712998 [details] [diff] [review]
Update WebGL canvases asynchronously
Attachment #710240 - Attachment is obsolete: true
Attachment #710240 - Flags: review?(jgilbert)
Attachment #710240 - Flags: feedback?(bgirard)
Attachment #712998 - Flags: review?(jgilbert)
Attachment #712998 - Flags: feedback?(vladimir)
Attachment #712998 - Flags: feedback?(bgirard)
With the original patch, we had a problem where the content thread can produce surfaces much faster than the compositor can render them. This led to us throwing away a lot of frames, wasting resources. The new patch blocks content when the compositor has a surface ready to go but is not rendering it yet (the "transit" surface). I get about 53-55fps in playcanvas on the Nexus 7, and without the patch I get 48-50.
(Reporter)

Comment 11

5 years ago
Comment on attachment 712998 [details] [diff] [review]
Update WebGL canvases asynchronously

Review of attachment 712998 [details] [diff] [review]:
-----------------------------------------------------------------

This change looks good to me but I can't really see the ownership model of buffers. Perhaps I'll have a better idea when the streaming buffers has landed.

::: gfx/layers/ipc/ShadowLayers.h
@@ +217,5 @@
>    void PaintedImage(ShadowableLayer* aImage,
>                      const SharedImage& aNewFrontImage);
>    void PaintedCanvas(ShadowableLayer* aCanvas,
>                       bool aNeedYFlip,
> +                     bool aNeedNewBackBuffer,

Either add PaintedCanvasNoSwap or replace bool by an enum. We're trying to get rid of bool param in gfx.
Attachment #712998 - Flags: feedback?(bgirard) → feedback+
Created attachment 715855 [details] [diff] [review]
Bug 829747 - Update WebGL canvases asynchronously
Attachment #715855 - Attachment is obsolete: true
Created attachment 715869 [details] [diff] [review]
Bug 829747 - Update WebGL canvases asynchronously
Attachment #715869 - Attachment is obsolete: true
Created attachment 715870 [details] [diff] [review]
Update WebGL canvases asynchronously

Ugh, fighting with git-bz is a pain. This patch addresses some feedback from here and irc.
Attachment #712998 - Attachment is obsolete: true
Attachment #712998 - Flags: review?(jgilbert)
Attachment #712998 - Flags: feedback?(vladimir)
Attachment #715870 - Flags: review?(jgilbert)
Attachment #715870 - Flags: feedback?(bgirard)

Updated

5 years ago
Blocks: 829752
Blocks: 843685
(Reporter)

Comment 15

5 years ago
Comment on attachment 715870 [details] [diff] [review]
Update WebGL canvases asynchronously

This patch does the right thing but as we discuss over Vidyo we also want to be throttling requestAnimationFrame instead of waiting on the compositor to let the main thread do useful work instead of blocking. This will be done in a follow up bug.
Attachment #715870 - Flags: feedback?(bgirard) → feedback+
Created attachment 717198 [details] [diff] [review]
Update WebGL canvases asynchronously

Fixed bit rot and rebased against streaming patch that landed in m-c
Attachment #715870 - Attachment is obsolete: true
Attachment #715870 - Flags: review?(jgilbert)
Attachment #717198 - Flags: review?(jgilbert)
Comment on attachment 717198 [details] [diff] [review]
Update WebGL canvases asynchronously

Review of attachment 717198 [details] [diff] [review]:
-----------------------------------------------------------------

Let me know if I'm missing what's trying to be done here, but it looks like we're trying to convert from generic triple-buffering to triple-buffered chaining, where we block the producer if the pipeline is full. (That is, we want a new frame, our current mProducer is done, and mStaging still has a frame in it that the compositor hasn't consumed yet)

As such, I don't think we need a new mTransit buffer slot; we should just need to block SwapProducer if we're trying to swap and mStaging still has a frame in it.

::: gfx/gl/SurfaceStream.cpp
@@ +388,5 @@
>      if (mProducer) {
>          RecycleScraps(factory);
>  
> +        // When finished, transit is null and likely moved to cons
> +        WaitForCompositor();

We probably only want to call this if mStaging is non-null.

@@ +393,3 @@
>  
> +        if (mStaging) {
> +            // This means we're about to render another frame before sending a layer transaction. Boo.

It sounds like this comment should be a warning. Also, I'm not positive about timings, but could not we assert that mStaging is null after WaitForCompositor returns, as our SwapConsumer call Compositor-side is also mutexed, and so might guarantee that mStaging is null. This might have changed with this new async transaction stuff.

@@ +412,5 @@
> +{
> +    SAMPLE_LABEL("SurfaceStream_TripleBuffer", "WaitForCompositor");
> +
> +    // We are assumed to be locked
> +    while (mTransit)

Why don't we just wait on mStaging, if we just want to queue up an extra frame in the pipeline to do chaining?

::: gfx/gl/SurfaceStream.h
@@ +169,5 @@
>      : public SurfaceStream
>  {
>  protected:
>      SharedSurface* mStaging;
> +    SharedSurface* mTransit;

I don't think we need an mTransit. We should just block when we enter SwapProducer while mStaging is already filled, then block until the Consumer has consumed mStaging into mConsumer. (Either with a Layers-Finish or just waiting on mStaging to go null)

@@ +184,5 @@
>                                          const gfxIntSize& size);
>  
>      virtual SharedSurface* SwapConsumer_NoWait();
>  
> +    virtual SharedSurface* SwapTransit();

Swap transit shouldn't be necessary outside of the IPC stuff, since with same-process stuff, we know where all buffers are at all times, so there's never a buffer that's maybe-consumed like mTransit, thus we shouldn't need this call.

SwapTransit could have been SwapConsumer in the IPC SurfStream, but I chose to use a new SwapTransit function so it was more clear that it was different, and that it was still being called on the Producer side, whereas SwapConsumer is meant to be called by the Consumer.

::: gfx/layers/basic/BasicCanvasLayer.cpp
@@ +430,5 @@
>      // and forward Gralloc handle. Signal WaitSync complete with XPC mutex?
>      if (!isCrossProcess) {
>        FirePreTransactionCallback();
>        GLScreenBuffer* screen = mGLContext->Screen();
> +      screen->Stream()->SwapTransit();

Not needed. Just make sure we called FirePreTransactionCallback.
Attachment #717198 - Flags: review?(jgilbert) → review-
(In reply to Jeff Gilbert [:jgilbert] from comment #17)
> Comment on attachment 717198 [details] [diff] [review]
> Update WebGL canvases asynchronously
> 
> Review of attachment 717198 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Let me know if I'm missing what's trying to be done here, but it looks like
> we're trying to convert from generic triple-buffering to triple-buffered
> chaining, where we block the producer if the pipeline is full. (That is, we
> want a new frame, our current mProducer is done, and mStaging still has a
> frame in it that the compositor hasn't consumed yet)
> 
> As such, I don't think we need a new mTransit buffer slot; we should just
> need to block SwapProducer if we're trying to swap and mStaging still has a
> frame in it.

This was my thought too, at first, but in practice it was problematic. I ended up deadlocked because although mStaging had a frame, it had not been sent to the compositor. I am not entirely sure why this is the case, but it probably has something to do with layout interrupting the draw process? In any case, this is why I added mTransit. If we can do something to be assured mStaging has been sent, then we could eliminate mTransit.
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #18)
> (In reply to Jeff Gilbert [:jgilbert] from comment #17)
> > Comment on attachment 717198 [details] [diff] [review]
> > Update WebGL canvases asynchronously
> > 
> > Review of attachment 717198 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > Let me know if I'm missing what's trying to be done here, but it looks like
> > we're trying to convert from generic triple-buffering to triple-buffered
> > chaining, where we block the producer if the pipeline is full. (That is, we
> > want a new frame, our current mProducer is done, and mStaging still has a
> > frame in it that the compositor hasn't consumed yet)
> > 
> > As such, I don't think we need a new mTransit buffer slot; we should just
> > need to block SwapProducer if we're trying to swap and mStaging still has a
> > frame in it.
> 
> This was my thought too, at first, but in practice it was problematic. I
> ended up deadlocked because although mStaging had a frame, it had not been
> sent to the compositor. I am not entirely sure why this is the case, but it
> probably has something to do with layout interrupting the draw process? In
> any case, this is why I added mTransit. If we can do something to be assured
> mStaging has been sent, then we could eliminate mTransit.

We should figure out why. BenWa? Is there any case where we should deadlock on the second Content layer update between composites?
Flags: needinfo?(bgirard)
(Reporter)

Comment 20

5 years ago
(In reply to Jeff Gilbert [:jgilbert] from comment #19)
> (In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #18)
> > (In reply to Jeff Gilbert [:jgilbert] from comment #17)
> > > Comment on attachment 717198 [details] [diff] [review]
> > > Update WebGL canvases asynchronously
> > > 
> > > Review of attachment 717198 [details] [diff] [review]:
> > > -----------------------------------------------------------------
> > > 
> > > Let me know if I'm missing what's trying to be done here, but it looks like
> > > we're trying to convert from generic triple-buffering to triple-buffered
> > > chaining, where we block the producer if the pipeline is full. (That is, we
> > > want a new frame, our current mProducer is done, and mStaging still has a
> > > frame in it that the compositor hasn't consumed yet)
> > > 
> > > As such, I don't think we need a new mTransit buffer slot; we should just
> > > need to block SwapProducer if we're trying to swap and mStaging still has a
> > > frame in it.
> > 
> > This was my thought too, at first, but in practice it was problematic. I
> > ended up deadlocked because although mStaging had a frame, it had not been
> > sent to the compositor.

It would be nice to have a trace of the threads that are deadlocked.

> > I am not entirely sure why this is the case, but it
> > probably has something to do with layout interrupting the draw process? In

layout doesn't interrupt the draw process exactly but sometime we stop painting. Even in this case it should be invisible to non Thebes layer.

> > any case, this is why I added mTransit. If we can do something to be assured
> > mStaging has been sent, then we could eliminate mTransit.
> 
> We should figure out why. BenWa? Is there any case where we should deadlock
> on the second Content layer update between composites?

Not that I can think of. This is the first time I hear of this deadlock.
Flags: needinfo?(bgirard)
(In reply to Benoit Girard (:BenWa) from comment #20)
> (In reply to Jeff Gilbert [:jgilbert] from comment #19)
> > (In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #18)
> > > (In reply to Jeff Gilbert [:jgilbert] from comment #17)
> > > > Comment on attachment 717198 [details] [diff] [review]
> > > > Update WebGL canvases asynchronously
> > > > 
> > > > Review of attachment 717198 [details] [diff] [review]:
> > > > -----------------------------------------------------------------
> > > > 
> > > > Let me know if I'm missing what's trying to be done here, but it looks like
> > > > we're trying to convert from generic triple-buffering to triple-buffered
> > > > chaining, where we block the producer if the pipeline is full. (That is, we
> > > > want a new frame, our current mProducer is done, and mStaging still has a
> > > > frame in it that the compositor hasn't consumed yet)
> > > > 
> > > > As such, I don't think we need a new mTransit buffer slot; we should just
> > > > need to block SwapProducer if we're trying to swap and mStaging still has a
> > > > frame in it.
> > > 
> > > This was my thought too, at first, but in practice it was problematic. I
> > > ended up deadlocked because although mStaging had a frame, it had not been
> > > sent to the compositor.
> 
> It would be nice to have a trace of the threads that are deadlocked.

It's not a deadlock in the typical sense. What I tried was to wait in LockProducerImpl() for the compositor to make mStaging null by waiting on a Monitor.

while (mStaging)
    monitor.Wait();

The compositor calls SwapConsumer() to get a new buffer, which recycles the current mConsumer, and moves mStaging to mConsumer (mStaging becomes null). This never happened, though, because the layer transaction was never completed, causing the hang/deadlock. So the question we need answered is: is it legit for a layer transaction to not occur after a nsRefreshDriver tick? I think it is, hence the reason for mTransit; to track when we have actually performed such a transaction.
Created attachment 718528 [details] [diff] [review]
Update WebGL canvases asynchronously

I figured out what the problem was with just using mStaging. Early in the stream lifetime you can have a buffer in mStaging with no mConsumer. WaitForCompositor() was always assuming that if you have one in mStaging then you also have one in mConsumer and therefore we need to wait, but obviously that is not he case. Head -> desk.

We also talked about the case where you paint the canvas layer from some non-shadow layer (like for a screenshot), but we are fine there. The transaction pre/post callbacks are only hooked up when we are painting on a window in WebGLContext.
Attachment #717198 - Attachment is obsolete: true
Attachment #718528 - Flags: review?(jgilbert)
Review ping
Alright, Jeff and I talked about this some more, and the current patch is probably not quite right. Maybe.

In the general case, we actually do want to wait on the compositor in SwapProducer() if we have mStaging and no mConsumer. If we don't wait, we'll have to throw mStaging away, which is always bad. The problem with waiting is that we are not assured that mStaging has even been sent to the compositor at that point because SwapProducer could just be called three times in a row, without any layer transaction. This scenario actually happens in the wild because we do an initial SwapProducer when the context is initialized and then the web page will typically resize the canvas which causes another one (GLContext::ResizeScreenBuffer). So now we have mStaging and mProducer both populated and no pending layer transaction. Now the page will render some stuff in response to requestAnimationFrame. At the end of that we trigger a layer transaction and SwapProducer gets called once more (in PreTransactionCallback). Now we're hosed, because we have mStaging populated but not sent to the compositor, and current mProducer needs moved to mStaging. So we want to wait here, but we can't because it will just deadlock, as nothing is ever going to move mStaging to mConsumer.

I see three solutions:

1) Never call WaitForCompositor() if the incoming size in SwapProducer differs from the current size of mProducer. We know then that it's a resize operation and not a "real" swap. This is a hack.

2) In BasicCanvasLayer::Paint, call something similar to my old SwapTransit method that would just mark mStaging as in-flight. With that information, we can know whether or not we can safely wait in SwapProducer(). SwapConsumer() would then reset this flag.

3) Don't do a SwapProducer() for resize. Instead, implement Resize as a first class method in SurfaceStream. For TripleBuffer, we would just throw away the current mProducer and create a new surface at the new size. This would have the side-effect that resizing your canvas will forfeit the current contents, which I think is fair.

Jeff, do you have an opinion on which of these is best? I'm leaning towards 2 or 3 (or maybe both of those). Of course, if you have other solutions I'm open to those as well.
Flags: needinfo?(jgilbert)
#3 is the right answer here. I am increasingly sure this is the issue: For Resize, we call SwapProd regardless of if the Canvas is marked dirty, filling our buffer chain and causing it to block without issuing a composite.

The trick is that resize is not supposed to swap. Really, a resize should just kill the current buffer and make a new one of the new size. So we should do this: Add a Resize() function next to SwapProducer and SwapConsumer. Resize will trash the current Prod buffer, and replace it with the new one.

SwapProd should then be able to call WaitForCompositor at will.

Further, we should change the GLScreenBuffer init to call Resize to allocate the initial buffer, since this is closer to what we want.
Flags: needinfo?(jgilbert)
Created attachment 721690 [details] [diff] [review]
Update WebGL canvases asynchronously

This patch implements adds Resize to GLScreenBuffer and SurfaceStream, and uses it in lieu of Swap[Producer] where necessary. And it works.
Attachment #718528 - Attachment is obsolete: true
Attachment #718528 - Flags: review?(jgilbert)
Attachment #721690 - Flags: review?(jgilbert)
Current patch deadlocks pretty much immediately on Mac, and not sure why yet.
Ok so debugging shows that PreTransactionCallback is firing, which causes us to SwapProducer, but the UpdateSurface is never called. The transaction is apparently interrupted. Then on the next PreTransactionCallback, we deadlock. I think we need to implement solution #2 now, unfortunately.
Alternatively, some way say that we should never wait on the compositor when we aren't using async compositing...
Created attachment 722369 [details] [diff] [review]
Update WebGL canvases asynchronously

Alright, this version adds a new SurfaceStream_TripleBuffer_Async class which is only used with OMTC. WaitForCompositor is only implemented there. This fixes the behavior on Mac.
Attachment #721690 - Attachment is obsolete: true
Attachment #721690 - Flags: review?(jgilbert)
Attachment #722369 - Flags: review?(jgilbert)
Comment on attachment 722369 [details] [diff] [review]
Update WebGL canvases asynchronously

Review of attachment 722369 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, though the ScopedBindFramebuffer should move closer to the code which actually messes with framebuffers.

Someone else should review the ShadowLayers* files and the change in CanvasLayerOGL.cpp.

::: gfx/gl/GLScreenBuffer.cpp
@@ -330,2 @@
>  {
> -    ScopedBindFramebuffer autoFB(mGL);

I think you should keep this and drop it from Swap() and Resize(), since this is the function which'll actually switch out the bindings.

@@ +363,5 @@
> +
> +bool
> +GLScreenBuffer::Swap(const gfxIntSize& size)
> +{
> +    ScopedBindFramebuffer autoFB(mGL);

This should really be in Attach(), since we don't modify any GL bindings in this function.[1]

@@ +397,5 @@
>  
> +bool
> +GLScreenBuffer::Resize(const gfxIntSize& size)
> +{
> +    ScopedBindFramebuffer autoFB(mGL);

See [1].

::: gfx/layers/ipc/ShadowLayers.cpp
@@ +298,5 @@
>  }
>  
> +void
> +ShadowLayerForwarder::PaintedCanvasNoSwap(ShadowableLayer* aCanvas,
> +                                    bool aNeedYFlip,

Indent is messed up here.
Attachment #722369 - Flags: review?(jgilbert)
Attachment #722369 - Flags: review?(bgirard)
Attachment #722369 - Flags: review+
(Reporter)

Comment 32

5 years ago
Comment on attachment 722369 [details] [diff] [review]
Update WebGL canvases asynchronously

Review of attachment 722369 [details] [diff] [review]:
-----------------------------------------------------------------

NoSwap changes are fine.
Attachment #722369 - Flags: review?(bgirard) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/7268c16cad16
Backed out for causing B2G test timeouts:
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=7268c16cad16&jobname=b2g

https://hg.mozilla.org/integration/mozilla-inbound/rev/b7f53f9b0205
It's also possible this was caused by a mozharness (infra) change (bug 851049). 

Retriggered a few things either side to be sure.
Before each timeout I see the following:

11:46:19     INFO -  [Parent 787] ###!!! ABORT: actor has been |delete|d: file /builds/slave/m-in-ics_a7_g-0000000000000000/build/obj-b2g/ipc/ipdl/PGrallocBufferParent.cpp, line 387

This seems bad. Benoit, do you know why this could be happening?
Flags: needinfo?(bjacob)
(In reply to Ed Morley [:edmorley UTC+0] from comment #35)
> It's also possible this was caused by a mozharness (infra) change (bug
> 851049). 
> 
> Retriggered a few things either side to be sure.

(Ruled out).
Assignee: nobody → snorp
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #36)
> Before each timeout I see the following:
> 
> 11:46:19     INFO -  [Parent 787] ###!!! ABORT: actor has been |delete|d:
> file
> /builds/slave/m-in-ics_a7_g-0000000000000000/build/obj-b2g/ipc/ipdl/
> PGrallocBufferParent.cpp, line 387
> 
> This seems bad. Benoit, do you know why this could be happening?

I am actually looking into similar issues with the graphics branch on B2G right now. I can't be specific without more data about your crash (stacks etc) but in my case, the problem is apparently that we're doing dodgy manual deletion of PGrallocBufferParent's:

1) they're manually deleted (as in operator delete[]) in ShadowLayersParent::DeallocPGrallocBuffer

2) they're also deleted by calling Send__delete__ in TextureHost::~TextureHost via ISurfaceAllocator::DestroySharedSurface

There doesn't seem to be a lot of coordination between these places. I get crashes similar to yours where a PGrallocBufferParent is passed to Send__delete__ as in 2) after having been manually deleted as in 1).
Flags: needinfo?(bjacob)
Created attachment 726845 [details]
GDB session showing the double deletion of a PGrallocBufferParent

That's what I'm talking about. See 0x43d56198. It's first manually deleted, then passed to Send__delete__. The bogus NeckoParent::DeallocPCookieService frame should read ShadowLayersParent::DeallocPGrallocBuffer instead.
Was able to reproduce with a local build on unagi. Got the following stack trace:

#0  0x4197f1e6 in mozalloc_abort (msg=<value optimized out>)
    at /Volumes/Slow/source/B2G/gecko/memory/mozalloc/mozalloc_abort.cpp:30
#1  0x414e7654 in Abort (aSeverity=3, 
    aStr=0x419f6847 "actor has been |delete|d", aExpr=<value optimized out>, 
    aFile=<value optimized out>, aLine=387)
    at /Volumes/Slow/source/B2G/gecko/xpcom/base/nsDebugImpl.cpp:430
#2  NS_DebugBreak_P (aSeverity=3, aStr=0x419f6847 "actor has been |delete|d", 
    aExpr=<value optimized out>, aFile=<value optimized out>, aLine=387)
    at /Volumes/Slow/source/B2G/gecko/xpcom/base/nsDebugImpl.cpp:417
#3  0x41342b34 in mozilla::layers::PGrallocBufferParent::Write (
    this=<value optimized out>, __v=<value optimized out>, __msg=0x487eb2e0, 
    __nullable=true)
    at /Volumes/Slow/source/B2G/objdir-gecko/ipc/ipdl/PGrallocBufferParent.cpp:387
#4  0x41342cc6 in mozilla::layers::PGrallocBufferParent::Send__delete__ (
    actor=0x487d2c18)
    at /Volumes/Slow/source/B2G/objdir-gecko/ipc/ipdl/PGrallocBufferParent.cpp:75
#5  0x41568f08 in mozilla::layers::ShadowLayerManager::PlatformDestroySharedSurface (this=<value optimized out>, aSurface=0x48764244)
    at /Volumes/Slow/source/B2G/gecko/gfx/layers/ipc/ShadowLayerUtilsGralloc.cpp:268
#6  0x415664d8 in mozilla::layers::ShadowLayerManager::DestroySharedSurface (
---Type <return> to continue, or q <return> to quit---
    this=0x0, aSurface=0x41a05cd9, aDeallocator=0x48784180)
    at /Volumes/Slow/source/B2G/gecko/gfx/layers/ipc/ShadowLayers.cpp:590
#7  0x4156792a in mozilla::layers::ShadowLayersParent::DestroySharedSurface (
    this=0x48784180, aSurface=0x41a05cd9)
    at /Volumes/Slow/source/B2G/gecko/gfx/layers/ipc/ShadowLayersParent.cpp:545
#8  0x4154fe0e in mozilla::layers::ShadowCanvasLayerOGL::DestroyFrontBuffer (
    this=0x48764000)
    at /Volumes/Slow/source/B2G/gecko/gfx/layers/opengl/CanvasLayerOGL.cpp:489
#9  0x4154f5fc in mozilla::layers::ShadowCanvasLayerOGL::Destroy (this=0x0)
    at /Volumes/Slow/source/B2G/gecko/gfx/layers/opengl/CanvasLayerOGL.cpp:504
#10 0x4154f5dc in mozilla::layers::ShadowCanvasLayerOGL::Disconnect (this=0x0)
    at /Volumes/Slow/source/B2G/gecko/gfx/layers/opengl/CanvasLayerOGL.cpp:496
#11 0x4156785a in mozilla::layers::ShadowLayerParent::ActorDestroy (
    this=0x487eb3c0, why=1215840640)
    at /Volumes/Slow/source/B2G/gecko/gfx/layers/ipc/ShadowLayerParent.cpp:60
#12 0x4131cb00 in mozilla::plugins::PPluginBackgroundDestroyerParent::DestroySubtree (this=0x487eb3c0, 
    why=mozilla::ipc::IProtocolManager<mozilla::ipc::RPCChannel::RPCListener>::Deletion)
    at /Volumes/Slow/source/B2G/objdir-gecko/ipc/ipdl/PPluginBackgroundDestroyerParent.cpp:326
#13 0x4134578c in mozilla::layers::PLayerParent::OnMessageReceived (
    this=0x487eb3c0, __msg=<value optimized out>)
---Type <return> to continue, or q <return> to quit---
    at /Volumes/Slow/source/B2G/objdir-gecko/ipc/ipdl/PLayerParent.cpp:172
#14 0x41341980 in mozilla::layers::PCompositorParent::OnMessageReceived (
    this=0x48803ef0, __msg=...)
    at /Volumes/Slow/source/B2G/objdir-gecko/ipc/ipdl/PCompositorParent.cpp:346
#15 0x413098c6 in mozilla::ipc::AsyncChannel::OnDispatchMessage (
    this=0x48803ef8, msg=...)
    at /Volumes/Slow/source/B2G/gecko/ipc/glue/AsyncChannel.cpp:494
#16 0x4130e7f2 in mozilla::ipc::RPCChannel::OnMaybeDequeueOne (this=0x48803ef8)
    at /Volumes/Slow/source/B2G/gecko/ipc/glue/RPCChannel.cpp:402
#17 0x412f196a in DispatchToMethod<mozilla::dom::ContentParent, void (mozilla::dom::ContentParent::*)()> (this=<value optimized out>)
    at /Volumes/Slow/source/B2G/gecko/ipc/chromium/src/base/tuple.h:383
#18 RunnableMethod<mozilla::dom::ContentParent, void (mozilla::dom::ContentParent::*)(), Tuple0>::Run (this=<value optimized out>)
    at /Volumes/Slow/source/B2G/gecko/ipc/chromium/src/base/task.h:307
#19 0x4130d1a8 in mozilla::ipc::RPCChannel::RefCountedTask::Run (
    this=<value optimized out>)
    at ../../dist/include/mozilla/ipc/RPCChannel.h:425
#20 mozilla::ipc::RPCChannel::DequeueTask::Run (this=<value optimized out>)
    at ../../dist/include/mozilla/ipc/RPCChannel.h:448
#21 0x41505588 in MessageLoop::RunTask (this=0x46881df0, task=0x46881d14)
    at /Volumes/Slow/source/B2G/gecko/ipc/chromium/src/base/message_loop.cc:334
#22 0x415063b6 in MessageLoop::DeferOrRunPendingTask (this=0x48803ef8, 
---Type <return> to continue, or q <return> to quit---
    pending_task=<value optimized out>)
    at /Volumes/Slow/source/B2G/gecko/ipc/chromium/src/base/message_loop.cc:342
#23 0x41506f94 in MessageLoop::DoWork (this=0x46881df0)
    at /Volumes/Slow/source/B2G/gecko/ipc/chromium/src/base/message_loop.cc:442
#24 0x41507214 in base::MessagePumpDefault::Run (this=0x45b79ec0, 
    delegate=0x46881df0)
    at /Volumes/Slow/source/B2G/gecko/ipc/chromium/src/base/message_pump_default.cc:23
#25 0x41505544 in MessageLoop:
Created attachment 727187 [details] [diff] [review]
Update WebGL canvases asynchronously

Turns out that we apparently do want to return a null_t() sometimes from OpPaintCanvas. This was somehow causing the crash. New patch always sends whatever reply we have for synchronous OpPaintCanvas (and of course never does for async).
Attachment #722369 - Attachment is obsolete: true
Attachment #727187 - Flags: review?(bgirard)
Created attachment 727712 [details] [diff] [review]
Update WebGL canvases asynchronously

Ok this one should be the final version. No Try issues.
Attachment #727187 - Attachment is obsolete: true
Attachment #727187 - Flags: review?(bgirard)
Attachment #727712 - Flags: review?(bgirard)
(Reporter)

Updated

5 years ago
Attachment #727712 - Flags: review?(bgirard) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/b43bbab312e7
Backed out for ABORTs on mochitest-1:
https://tbpl.mozilla.org/php/getParsedLog.php?id=20985450&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=20986464&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=20986665&tree=Mozilla-Inbound
{
03-22 11:05:11.317 E/Gecko   ( 1814): mozalloc_abort: [Parent 1814] ###!!! ABORT: unknown union type: file PLayersParent.cpp, line 2075
}

remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/b4f1a45e51ea
Forgot a one-liner from my B2G tree. Repushed with that.

https://hg.mozilla.org/integration/mozilla-inbound/rev/79ecb92e936a
https://hg.mozilla.org/mozilla-central/rev/79ecb92e936a
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22

Updated

4 years ago
relnote-firefox: --- → ?
Nom'd for rel note: This improves perf of WebGL Canvas updates.
Bug 858039 appears be related to this changes.

Updated

4 years ago
relnote-firefox: ? → 22+
This bug has been listed as part of the Aurora 22 release notes in either [1], [2], or both. If you find any of the information or wording incorrect in the notes, please email release-mgmt@mozilla.com asap. Thanks!

[1] https://www.mozilla.org/en-US/firefox/22.0a2/auroranotes/
[2] https://www.mozilla.org/en-US/mobile/22.0a2/auroranotes/
Blocks: 863223

Updated

4 years ago
Blocks: 710398
You need to log in before you can comment on or make changes to this bug.