Last Comment Bug 598868 - Tracking: Support for pushing video frames directly to the compositor, bypassing content-main-thread(s)
: Tracking: Support for pushing video frames directly to the compositor, bypass...
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: Graphics: Layers (show other bugs)
: unspecified
: x86_64 Linux
: P1 normal with 4 votes (vote)
: ---
Assigned To: Nicolas Silva [:nical]
: graphics-layers
Mentors:
: 706172 745872 (view as bug list)
Depends on: 598869 598870 613442 686742 687372 761424 761849 763234 773440 773451 773469 774357 775244 781024 797893 866652
Blocks: 598867 706172 745148 767480 774026
  Show dependency treegraph
 
Reported: 2010-09-23 00:53 PDT by Chris Jones [:cjones] inactive; ni?/f?/r? if you need me
Modified: 2013-04-29 05:21 PDT (History)
32 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
+
+


Attachments
Push video frames directly to compositor (59.61 KB, patch)
2011-09-13 15:48 PDT, Oleg Romashin (:romaxa)
no flags Details | Diff | Splinter Review
Push video frames directly to compositor (64.27 KB, patch)
2011-09-13 23:28 PDT, Oleg Romashin (:romaxa)
no flags Details | Diff | Splinter Review
Push video frames directly to compositor (65.68 KB, patch)
2011-09-14 17:43 PDT, Oleg Romashin (:romaxa)
no flags Details | Diff | Splinter Review
Push video frames directly to compositor (51.09 KB, patch)
2011-09-18 14:50 PDT, Oleg Romashin (:romaxa)
no flags Details | Diff | Splinter Review
Isolate MediaBridge only from Image/Container objects (49.13 KB, patch)
2011-09-19 12:01 PDT, Oleg Romashin (:romaxa)
no flags Details | Diff | Splinter Review
part 1 - Moves OptimalFormatFor from ShadowLayers.cpp to gfxAsurface.h (3.08 KB, patch)
2012-05-29 11:09 PDT, Nicolas Silva [:nical]
no flags Details | Diff | Splinter Review
Move OptimalShmemType from ShadowLayers.cpp to SharedMemory.h (4.06 KB, patch)
2012-05-29 11:34 PDT, Nicolas Silva [:nical]
no flags Details | Diff | Splinter Review
Part 3 - PImageBridge protocol (14.15 KB, patch)
2012-05-30 16:07 PDT, Nicolas Silva [:nical]
no flags Details | Diff | Splinter Review
Part 3 - PImageBridge protocol (with lots of hacks) (47.95 KB, patch)
2012-06-06 12:31 PDT, Nicolas Silva [:nical]
no flags Details | Diff | Splinter Review
Part 3 - PImageBridge protocol (WIP v3) (69.69 KB, patch)
2012-06-21 08:46 PDT, Nicolas Silva [:nical]
no flags Details | Diff | Splinter Review
Part 3 - PImageBridge protocol v4 (68.53 KB, patch)
2012-06-22 12:15 PDT, Nicolas Silva [:nical]
no flags Details | Diff | Splinter Review
PImageBridge protocol v5 (70.71 KB, patch)
2012-06-25 16:13 PDT, Nicolas Silva [:nical]
no flags Details | Diff | Splinter Review
PImageBridge protocol (69.60 KB, patch)
2012-06-26 10:01 PDT, Nicolas Silva [:nical]
no flags Details | Diff | Splinter Review
PImageBridge protocol (69.08 KB, patch)
2012-06-26 11:32 PDT, Nicolas Silva [:nical]
no flags Details | Diff | Splinter Review
PImageBridge protocol (69.88 KB, patch)
2012-07-03 14:52 PDT, Nicolas Silva [:nical]
no flags Details | Diff | Splinter Review
PImageBridge protocol (79.62 KB, patch)
2012-07-05 12:50 PDT, Nicolas Silva [:nical]
no flags Details | Diff | Splinter Review
PImageBridge protocol (79.68 KB, patch)
2012-07-06 08:30 PDT, Nicolas Silva [:nical]
no flags Details | Diff | Splinter Review
PImageBridge protocol (81.12 KB, patch)
2012-07-06 15:02 PDT, Nicolas Silva [:nical]
no flags Details | Diff | Splinter Review
PImageBridge protocol (85.16 KB, patch)
2012-07-10 21:19 PDT, Nicolas Silva [:nical]
roc: review+
Details | Diff | Splinter Review
PImageBridge protocol (87.58 KB, patch)
2012-07-11 17:41 PDT, Nicolas Silva [:nical]
no flags Details | Diff | Splinter Review
PImageBridge protocol (87.90 KB, patch)
2012-07-11 19:15 PDT, Nicolas Silva [:nical]
no flags Details | Diff | Splinter Review
For reference: last patch rebased on bug 763234 (91.08 KB, patch)
2012-07-12 05:36 PDT, Chris Jones [:cjones] inactive; ni?/f?/r? if you need me
no flags Details | Diff | Splinter Review
PImageBridge protocol (87.96 KB, patch)
2012-07-12 14:08 PDT, Nicolas Silva [:nical]
no flags Details | Diff | Splinter Review
PImageBridge protocol (89.91 KB, patch)
2012-07-12 15:12 PDT, Nicolas Silva [:nical]
no flags Details | Diff | Splinter Review
PImageBridge protocol (89.86 KB, patch)
2012-07-12 15:25 PDT, Nicolas Silva [:nical]
no flags Details | Diff | Splinter Review
PImageBridge protocol (89.85 KB, patch)
2012-07-12 17:33 PDT, Nicolas Silva [:nical]
no flags Details | Diff | Splinter Review
PImageBridge protocol (89.40 KB, patch)
2012-07-12 19:25 PDT, Nicolas Silva [:nical]
cjones.bugs: review-
Details | Diff | Splinter Review
PImageBridge protocol (89.82 KB, patch)
2012-07-13 08:50 PDT, Nicolas Silva [:nical]
cjones.bugs: review+
Details | Diff | Splinter Review
PImageBridge protocol (89.97 KB, patch)
2012-07-13 12:41 PDT, Nicolas Silva [:nical]
cjones.bugs: review+
Details | Diff | Splinter Review

Description Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2010-09-23 00:53:57 PDT
See bug 598867.
Comment 1 Oleg Romashin (:romaxa) 2011-09-13 15:48:21 PDT
Created attachment 560053 [details] [diff] [review]
Push video frames directly to compositor

Here some hacky version which does work (until you destroy some layer), 
Getting ABORT: layers should have been cleaned up by now: '0 == ManagedPLayerChild().Length()' - but I think that is because current decode call is blocking and block somehow layer destruction...

This impl, has tonns of hacks and a bit dirty, but it would be nice to understand what is better way to avoid these hacks:
1) Create Image NoCopy flag which trigerring MediaBridge mode.
2) Getting Parent ShadowableLayer from Container
3) Shadow/Shadowable layers tracking system, using unique ID.
4) hacky way to access RenderFrameParent.
Comment 2 Oleg Romashin (:romaxa) 2011-09-13 23:28:44 PDT
Created attachment 560107 [details] [diff] [review]
Push video frames directly to compositor

Fixed destroy crash (holding Layer ref in container was wrong decision).
Fixed A/V sync, copy frames data from decode thread, and send last uploaded frame from state machine data (discussed with cpearce)
Comment 3 Oleg Romashin (:romaxa) 2011-09-14 17:43:07 PDT
Created attachment 560289 [details] [diff] [review]
Push video frames directly to compositor

Fixed unique ID creation... count ID's on ShadowableLayer side, and send to parent as ImageSpecificAttributes.
Added yuv->neon->rgb->rgbTextureUpload path support
Partially implemented Swap.
Comment 4 Oleg Romashin (:romaxa) 2011-09-16 10:36:41 PDT
Ok, here is latest update version of the media bridge
http://hg.mozilla.org/users/romaxa_gmail.com/cross-video-layer/file/default/media_bridge.diff
Created ShadowableImageRefLayer in order to avoid mess with existing ImageLayer.
Fixed rendering in SW mode (Surface based) and maked sure that HW rendering works in both modes (delayed and non-delayed conversion)
Fixed problems with protocol creation in content main thread.
Still using ImageContainer as hook for passing ShadowImage layers to bridge protocol
Would be nice to get some feedback on current implementation..

Also started plugin bridge impl 
http://hg.mozilla.org/users/romaxa_gmail.com/cross-video-layer/file/default/plugin_bridge.diff
Comment 5 Oleg Romashin (:romaxa) 2011-09-18 14:50:05 PDT
Created attachment 560824 [details] [diff] [review]
Push video frames directly to compositor
Comment 6 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2011-09-18 21:41:43 PDT
I'm not convinced nsBuiltinDecoderReader.cpp needs to know about the MediaBridge. Why can't the ImageContainer and Image objects be completely responsible for pushing frames through IPC?
Comment 7 Oleg Romashin (:romaxa) 2011-09-18 22:32:23 PDT
Hmm, It is probably possible, if in Image every time when decoder call SetData instead of copy data into Image we call media bridge and copy data into Bridge Shmem, and when SetCurrentImage called we ask MEdiaBridge to send last copied frame to Chrome process... Is that what you wanted here?
Comment 8 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2011-09-18 22:38:54 PDT
Yes!
Comment 9 Oleg Romashin (:romaxa) 2011-09-18 22:40:21 PDT
ok, make sense, will do that
Comment 10 Oleg Romashin (:romaxa) 2011-09-19 12:01:36 PDT
Created attachment 560975 [details] [diff] [review]
Isolate MediaBridge only from Image/Container objects
Comment 11 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-04-18 22:16:08 PDT
*** Bug 745872 has been marked as a duplicate of this bug. ***
Comment 12 Nicolas Silva [:nical] 2012-04-19 12:08:31 PDT
I am a little bit worried about the number of threads involved here.
Lets sum up, please tell me if I am wrong:

 - We have 1 state machine thread (that is used by several (1 per video) nsBuiltinDecoderStateMachine instances)
 - A decode thread per video, up to a limit defined by MAX_DECODE_THREADS = 25.
 - 1 audio thread per video playing (it looks like it is destroyed when the video is paused and a new thread is created when video plays again)

Now, your patch seems to e adding two threads:
 - one on the decoding side, dedicated to a singleton
 - one on the compositing side, dedicated to another singleton
all the communication between the video decoders and the layer managers goes through one ipdl connection between these two singletons, and frames are sort of multiplexed/demultiplexed there. 

Why do we need these two threads?
I have been told that ipdl protocols can coexist on the same thread so isn't it possible to have the MediaBridgeParents working on the compositor threads and the MediaBridgeChilds working on the decode thread?
Meaning we have one ipdl connection per video instead of one connection.

Hope I didn't miss something too important. Anyway we might want to keep the number of threads low.
Comment 13 Chris Pearce (:cpearce) 2012-04-19 14:55:53 PDT
(In reply to Nicolas Silva [:nical] from comment #12)
> I am a little bit worried about the number of threads involved here.
> Lets sum up, please tell me if I am wrong:

Almost right. ;)

>  - 1 audio thread per video playing (it looks like it is destroyed when the
> video is paused and a new thread is created when video plays again)

The audio thread is destroyed when the audio stream is completely finished playing, not when it's paused (unfortunately). We're planning on making the audio threads go away in the not-too-distant future...

You're otherwise correct regarding threads in the nsBuiltinDecoder* architecture.
Comment 14 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-04-19 23:21:37 PDT
(In reply to Nicolas Silva [:nical] from comment #12)
> Why do we need these two threads?
> I have been told that ipdl protocols can coexist on the same thread so isn't
> it possible to have the MediaBridgeParents working on the compositor threads
> and the MediaBridgeChilds working on the decode thread?
> Meaning we have one ipdl connection per video instead of one connection.

I think that should be possible.

There is another possibility. We have recently added a mode to ImageContainer where it acts as a lock around a shared image resource. We might be able to use that and avoid IPDL completely for SetCurrentImage calls happening off the main thread.
Comment 15 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-04-20 10:19:03 PDT
(In reply to Nicolas Silva [:nical] from comment #12)
> I am a little bit worried about the number of threads involved here.
> Lets sum up, please tell me if I am wrong:
> 
>  - We have 1 state machine thread (that is used by several (1 per video)
> nsBuiltinDecoderStateMachine instances)
>  - A decode thread per video, up to a limit defined by MAX_DECODE_THREADS =
> 25.
>  - 1 audio thread per video playing (it looks like it is destroyed when the
> video is paused and a new thread is created when video plays again)
> 
> Now, your patch seems to e adding two threads:
>  - one on the decoding side, dedicated to a singleton
>  - one on the compositing side, dedicated to another singleton
> all the communication between the video decoders and the layer managers goes
> through one ipdl connection between these two singletons, and frames are
> sort of multiplexed/demultiplexed there. 
> 
> Why do we need these two threads?

The child-side thread is to work around bug 598870.  Alternatively, we can fix that.  I don't remember the parent-side threading design, but you're right that the parent-side thread should just be the compositor.

> Meaning we have one ipdl connection per video instead of one connection.
> 

There's not much reason to have connection-per-video.  There's a non-zero cost to each IPC channel, and non-zero setup/teardown cost.  But, that's orthogonal to the threading issues.
Comment 16 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-04-20 10:25:07 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) (offline April 15-18) from comment #14)
> There is another possibility. We have recently added a mode to
> ImageContainer where it acts as a lock around a shared image resource. We
> might be able to use that and avoid IPDL completely for SetCurrentImage
> calls happening off the main thread.

This implementation is tenable for plugins where the plugin process is trusted as much as the browser process (and neither have "system" privileges).  It's not tenable for the b2g model where content processes are assumed compromised, and their parent process has high privileges.  A compromised content process can DoS its parent process in this model (and possibly other bad things).  This won't be the only issue we need to sort out as we move towards that, but we need to get back on the track of building towards a security-conscious model.
Comment 17 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-04-20 16:15:36 PDT
(In reply to Chris Jones [:cjones] [:warhammer] from comment #16)
> (In reply to Robert O'Callahan (:roc) (Mozilla Corporation) (offline April
> 15-18) from comment #14)
> > There is another possibility. We have recently added a mode to
> > ImageContainer where it acts as a lock around a shared image resource. We
> > might be able to use that and avoid IPDL completely for SetCurrentImage
> > calls happening off the main thread.
> 
> This implementation is tenable for plugins where the plugin process is
> trusted as much as the browser process (and neither have "system"
> privileges).  It's not tenable for the b2g model where content processes are
> assumed compromised, and their parent process has high privileges.

The high-privilege process could do a try-lock with a timeout to avoid hanging.

But OK --- there's no compelling reason not to take the IPDL path, and it's probably less complex. So let's do that.
Comment 18 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-04-20 16:17:44 PDT
(In reply to Chris Jones [:cjones] [:warhammer] from comment #15)
> The child-side thread is to work around bug 598870.  Alternatively, we can
> fix that.

I think we should just fix that. The state machine thread doesn't block for long so it shouldn't be hard to have it handle IPC messages.

So we shouldn't need any new threads here.
Comment 19 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-04-20 16:52:51 PDT
Last time I looked, the state-machine thread didn't run an XPCOM or other event loop, and its inner loop was quite tight.  It looked fairly complicated to disentagle that into an event-drive model.

Are you OK with landing the IPC bits here with a separate thread and event-loop-ifiying the state machine in parallel, to minimize risk?
Comment 20 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-04-20 17:22:30 PDT
(In reply to Chris Jones [:cjones] [:warhammer] from comment #19)
> Last time I looked, the state-machine thread didn't run an XPCOM or other
> event loop, and its inner loop was quite tight.  It looked fairly
> complicated to disentagle that into an event-drive model.

As far as I can tell, it returns to the event loop after each step of nsBuiltinDecoderStateMachine::RunStateMachine.
Comment 21 Chris Pearce (:cpearce) 2012-04-22 15:29:14 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) (offline April 22-25) from comment #20)
> (In reply to Chris Jones [:cjones] [:warhammer] from comment #19)
> > Last time I looked, the state-machine thread didn't run an XPCOM or other
> > event loop, and its inner loop was quite tight.  It looked fairly
> > complicated to disentagle that into an event-drive model.

We refactored. The state machine runs as a normal nsIThread/nsIEventTarget, and we dispatch one event per frame per playing media to update the current frame and current timestamp (so about one event every 40ms per playing media or so). There should be plenty of spare cycles on that thread to do whatever you want to do.
Comment 22 Randell Jesup [:jesup] 2012-04-22 19:17:18 PDT
(In reply to Chris Pearce (:cpearce) from comment #21)
> (In reply to Robert O'Callahan (:roc) (Mozilla Corporation) (offline April
> 22-25) from comment #20)
> > (In reply to Chris Jones [:cjones] [:warhammer] from comment #19)
> > > Last time I looked, the state-machine thread didn't run an XPCOM or other
> > > event loop, and its inner loop was quite tight.  It looked fairly
> > > complicated to disentagle that into an event-drive model.
> 
> We refactored. The state machine runs as a normal nsIThread/nsIEventTarget,
> and we dispatch one event per frame per playing media to update the current
> frame and current timestamp (so about one event every 40ms per playing media
> or so). There should be plenty of spare cycles on that thread to do whatever
> you want to do.

Minor point: how many videos on the web are 60fps?  (or 50fps in PAL countries?)  In particular, what about network HDTV replays/feeds, especially sports?  (And HD will be your toughest decode.) For 60fps, that would be once per ~16ms - which may still be fine, but it's a lot less than 40ms (and obviously a lot of other things are happening.)  Not knowing this particular code, is there any significant downside to missing the next-frame deadline, or is it a minor annoyance, or a don't-care?
Comment 23 Chris Pearce (:cpearce) 2012-04-22 19:32:18 PDT
(In reply to Randell Jesup [:jesup] from comment #22)
> (In reply to Chris Pearce (:cpearce) from comment #21)
> Not knowing this
> particular code, is there any significant downside to missing the next-frame
> deadline, or is it a minor annoyance, or a don't-care?

It's probably on the minor-annoyance side of things. We'd just drop frames up to the current frame in order to catch up, and HTMLMediaElement.currentTime wouldn't be up to date. Audio is (currently) handled on another thread, so we won't glitch audio.
Comment 24 Nicolas Silva [:nical] 2012-05-29 11:09:00 PDT
Created attachment 628021 [details] [diff] [review]
part 1 - Moves OptimalFormatFor from ShadowLayers.cpp to gfxAsurface.h

I am working on off-main-thread video. Part of what I am doing is similar to What Romaxa has done, other parts are different, I'll come back to detailing that later.

First, I need to use a few pieces of code that have been written for the compositor protocols and that I need for video. I am moving them so that they can be accessible from more source files. I am trying to break this into very small patches to make it easier for review and integration.

So, this one moves OptimalFormatFor to gfxASurface since it is going to be used by the MediaBridge code and it does not depend on ShadowLayers specificities.
Comment 25 Nicolas Silva [:nical] 2012-05-29 11:34:13 PDT
Created attachment 628031 [details] [diff] [review]
Move OptimalShmemType from ShadowLayers.cpp to SharedMemory.h
Comment 26 Nicolas Silva [:nical] 2012-05-29 12:41:22 PDT
Some details about what I am doing right now on omtc video, to keep people informed and receive feedbacks:

Pushing images directly to the compositor without using the main thread is to be done with the PMediaBridge protocol. This protocol is separated from the PCompositor/PLayers/etc protocols so that it can have different entry points.

Roughly, this protocol should work as follows:
 - MediaBridgeChild pushes SharedImages along with a layerID to the compositor
 - MediaBridgeParent disptaches the SharedImage to the correct ShadowImageLayer
 - if the layer previously had a SharedImage obtained through PMediaBridge, the SharedImage is Sent back to the child side through an async message ReleasedSurface(image,layerID).
 - when the child receives a released shared image, it can reuse it for futur frames.

MediaBridgeChild does not know about video specifics. Instead a class VideoMediaBridgeChild inherits from MediaBridgeChild and handles video specific tasks. the IPC work is splitted between these two classes so that we can reuse the parent class (for instance, async plugin compositing and, of course, async animated gifs). Most of the protocol doesn't need to know about video, it just pushes images to the compositor.

VideoMediaBridgeChild has a list of VideoFrameContainers, and each VideoFrameContainer has a reference to a VideoMediaBridgeChild, so all the OMTC specific bits are handled in VideoFrameContainer (there's one per video) and VideoMediaBridgeChild (there's one per compositor).

Now, there could be two ways to handle pushing frames.
 - As I said above, asynchronously pushing frames to the compositor (loosing ownership of the frame) and receiving back frames when the compositor does not use it anymore.
 - Synchronously swapping double buffered shared images in the compositor, like it is done in the PCompositor protocol.

So far I am going for the first approach. I like it better because it is asynchronous, and it allows us to (in the future) push several frames in advance to the compositor along with their presentation times and let the compositor handle synchronization. Right now synchronization is done on the decoding side, and with the first implementation of omtc video I will keep it that way.
The second approach still has some pros, like it makes us allocate less shared images. We only need two shared images (back and front), while the first approach can make us allocate a bit more because we may want to push a frame C after frame B has been pushed, while it is possible that the child side has not yet received frame A back from the compositor. second approach is also a easier to implement since it has already been done for PCompositor.

It is not too late for me to switch to the second approach if requested, though my understanding from discussions with Chris Jones is that the first approach is preferred (because it will allow us to handle synchronization the the compositor as a future improvement).

I hope the description is understandable. Please ask me to rephrase/explain if need be, and let me know if you see problems with this design. 

Right now I am trying to separate what has been done into understandable patches (most of the work right now is in a big, hacky and unfinished patch at the moment)
Comment 27 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-05-29 13:41:58 PDT
I think the first approach is fine. The whole thing generally sounds pretty good.

I don't know why we need video-specific MediaBridge code though. What are the "video specific tasks"? It would be ideal if we can keep the same ImageContainer API as now.
Comment 28 Nicolas Silva [:nical] 2012-05-29 14:23:52 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #27)
> I think the first approach is fine. The whole thing generally sounds pretty
> good.
> 
> I don't know why we need video-specific MediaBridge code though. What are
> the "video specific tasks"? It would be ideal if we can keep the same
> ImageContainer API as now.

So far not many things. The VideoMediaBridge mainly has a list of VideoFrameContainers, and passes the SharedImage it receives from the compositor to the right VideoFrameContainer which stores the SharedImage pool (since video instances may have different image sizes and I assume we can only reuse SharedImages of the same size as the decoded frame).
It is also, likely that the video specific code will have to react to some events like destruction or switching to another compositor (when a tab is dragged from a window to another).

I guess, since these things are likely to be relevant to plugins as well, I can remove the VideoMediaBridge class, add a MediaBridgeObserver class instanciated on the child side for each layer that uses the protocol and have VideoFrameContainer inherit from it.
MediaBridge would then contain a list of observer and send them notifications when needed, instead of VideoMediaBridge having a list of VideoFrameContainer.

This way, the organization is similar but more code is handled by the generic MediaBridge code.

about the ImageContainer API, my understanding is that the most important part is the layers::Image class (the one that the video decoding code uses). So the way I see things, non-omtc scenarios would use the same code paths as now, the state machine passes an Image to the VideoFrameContainer, which puts it in its ImageContainer and the main thread can access the ImageContainer like it does now. In the case of omtc, the Image data is copied in a SharedImage when VideoFrameContainer::SetCurrentFrame is called (instead or in addition to being copied in the ImageContainer if we really need to), and sent to the compositor.

Is this in accordance with keeping the ImageContainer API? It looks like it is to me but I am not sure what use of ImageContainer you are referring to in particular.

After we have omtc video working, it would be nice to have the decoder ask the VideoFrameContainer to allocate the buffer in which the frame is decoded. This way we could avoid this extra copy by using a shmem buffer directly. 

Also, maybe it would be good send video frames to the main thread only n demand rather than systematically (for instance when js code renders the video in a canvas, or when the video frame needs some extra processing in the main thread like an svg filter).

But the last two items are improvements and we can come back to them whenever omtc video is working.
Comment 29 Randell Jesup [:jesup] 2012-05-29 14:26:47 PDT
I also like the first:

The best a/v synchronization code I've used/written preemptively decoded up to ~3-4 frames ASAP (which may not happen, if audio jitter is low), and queued them (in a cross-thread messageport) for the av-sync/compositor.  Frames were decoded as soon as the data was available if a buffer was available, while audio went through normal adaptive jitter buffering.  Compositor called avsync to get the "correct" frame, which looked at the current audio timestamp and selected the best video frame from those available, skipping or duplicating frames as needed.  The number of frames to preemptively decode depends on the competing demands of memory and likelihood of a frame dup/skip being forced by a decode missing the deadline (because of preemption, etc).
Comment 30 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-05-29 16:32:51 PDT
(In reply to Nicolas Silva [:nical] from comment #28)
> So far not many things. The VideoMediaBridge mainly has a list of
> VideoFrameContainers, and passes the SharedImage it receives from the
> compositor to the right VideoFrameContainer which stores the SharedImage
> pool (since video instances may have different image sizes and I assume we
> can only reuse SharedImages of the same size as the decoded frame).

Video code should not have to deal with reusing of SharedImages. That should be handled at the layers level. So video creates a new Image, and underneath you would recycle buffers if possible. Layers already works like this.

> It is also, likely that the video specific code will have to react to some
> events like destruction or switching to another compositor (when a tab is
> dragged from a window to another).

I don't think this should be necessary.

> about the ImageContainer API, my understanding is that the most important
> part is the layers::Image class (the one that the video decoding code uses).
> So the way I see things, non-omtc scenarios would use the same code paths as
> now, the state machine passes an Image to the VideoFrameContainer, which
> puts it in its ImageContainer and the main thread can access the
> ImageContainer like it does now. In the case of omtc, the Image data is
> copied in a SharedImage when VideoFrameContainer::SetCurrentFrame is called
> (instead or in addition to being copied in the ImageContainer if we really
> need to), and sent to the compositor.

Is there a reason why we have to have different code paths? Why can't we just have exactly the layers API we currently have, and just make it work with async compositing?
Comment 31 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-05-29 16:33:51 PDT
Matt Woodrow's patches in bug 695845 are very relevant here.
Comment 32 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-05-29 16:37:44 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #30)
> Is there a reason why we have to have different code paths? Why can't we
> just have exactly the layers API we currently have, and just make it work
> with async compositing?

Actually there is one API change we need. We need to split SetCurrentImage into two versions:
-- SetCurrentImageAsync(Image*), which can be called on any thread and can update the image outside of a layer transaction
-- SetCurrentImageInTransaction(Image*, LayerManager*), which can only be called on the main thread and forces the image update to happen as part of the current transaction on the given layer manager. In this case, the ImageContainer would only be in use by the given layer manager; sharing it with other layer managers would be a bug.
Comment 33 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-05-29 16:41:16 PDT
We could require that for a given ImageContainer, either SetCurrentImageAsync or SetCurrentImageInTransaction is always used --- they can't be mixed --- if that helps implementation.

Unfortunately we can't require that SetCurrentImageAsync is always called from the same thread, since the media engine may destroy decoder threads while playback is paused.
Comment 34 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-05-29 16:51:22 PDT
(In reply to Nicolas Silva [:nical] from comment #28)
> After we have omtc video working, it would be nice to have the decoder ask
> the VideoFrameContainer to allocate the buffer in which the frame is
> decoded. This way we could avoid this extra copy by using a shmem buffer
> directly. 

Yes, there are some known optimization possibilities here.

The way I'd imagined this working is that we'd add to PlanarYCbCrImage an alternative initialization protocol that allocates buffers and returns pointers that the decoder can fill in. E.g.

class PlanarYCbCrImage {
  ...
  struct PreallocationParams {
    PRInt32 mYStride;
    gfxIntSize mYSize;
    // Chroma buffers
    PRInt32 mCbCrStride;
    gfxIntSize mCbCrSize;
    // Picture region
    PRUint32 mPicX;
    PRUint32 mPicY;
    gfxIntSize mPicSize;
    StereoMode mStereoMode;
  };
  struct AllocatedData {
    PRUint8* mYChannel;
    PRUint8* mCbChannel;
    PRUint8* mCrChannel;
  };
  AllocatedData BeginSetDataWithPreallocation(const PreallocationParams& aParams);
  // The AllocatedData pointers have to remain valid after EndSetDataWithPreallocation, until the Image is released.
  void EndSetDataWithPreallocation();

Other than that, no other API changes would be needed. With that API, the Ogg and WebM decoders could be modified to decode directly into the AllocatedData buffers.

Yes, that belongs in another bug.
Comment 35 Nicolas Silva [:nical] 2012-05-30 11:07:32 PDT
> Is there a reason why we have to have different code paths? Why can't we
> just have exactly the layers API we currently have, and just make it work
> with async compositing?

I think the layers api is meant to be used from the main thread, and since it does more than swapping images (for example moving/resizing/etc the layers), using the layers PCompositor API not what we want to do with video of plugins that just care about changing the image without knowing about the rest. Also the PCompositor protocol way is to swap images synchronously rather than pushing images asynchronously.
Or are we talking about the same layers api? You proably refer to ImageContainer specifically. 

> Actually there is one API change we need. We need to split SetCurrentImage into two 
> versions:
> -- SetCurrentImageAsync(Image*), which can be called on any thread and can update the image > outside of a layer transaction
> -- SetCurrentImageInTransaction(Image*, LayerManager*), which can only be called on the 
> main thread and forces the image update to happen as part of the current transaction on the > given layer manager. In this case, the ImageContainer would only be in use by the given 
> layer manager; sharing it with other layer managers would be a bug.

Yes, this looks like what I have in mind as well. These two do have different code paths though, right ? I am placing the MediaBridge code in the class inherited by VideoFrameContainer rather than in ImageContainer though, is this the problem?

I can have MediaBridgeChild hold a list of ImageContainers directly and have the ImageContainers hold a reference to the MediaBridgeChild. ImageContainer would also be responsible for the SharedImage pool.
Comment 36 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-05-30 15:39:23 PDT
(In reply to Nicolas Silva [:nical] from comment #35)
> > Is there a reason why we have to have different code paths? Why can't we
> > just have exactly the layers API we currently have, and just make it work
> > with async compositing?
> 
> I think the layers api is meant to be used from the main thread, and since
> it does more than swapping images (for example moving/resizing/etc the
> layers), using the layers PCompositor API not what we want to do with video
> of plugins that just care about changing the image without knowing about the
> rest. Also the PCompositor protocol way is to swap images synchronously
> rather than pushing images asynchronously.
> Or are we talking about the same layers api? You proably refer to
> ImageContainer specifically.

Yes. Most of the LayerManager API is main-thread only. But ImageContainers were specifically designed to be used across threads (and are).

> Yes, this looks like what I have in mind as well. These two do have
> different code paths though, right ? I am placing the MediaBridge code in
> the class inherited by VideoFrameContainer rather than in ImageContainer
> though, is this the problem?

Yes.

> I can have MediaBridgeChild hold a list of ImageContainers directly and have
> the ImageContainers hold a reference to the MediaBridgeChild. ImageContainer
> would also be responsible for the SharedImage pool.

We shouldn't call it MediaBridge. It's just an implementation detail of ImageContainers and is not specific to "Media".

Do you agree that we can do this with the API change in comment #32 and no other API changes (and no other changes to non-layers code)?
Comment 37 Nicolas Silva [:nical] 2012-05-30 16:07:22 PDT
Created attachment 628527 [details] [diff] [review]
Part 3 - PImageBridge protocol

Here is a sort of skeleton of the protocol, mostly not implemented yet, to illustrate better the direction I am taking. It is not actually called by any running code.

> We shouldn't call it MediaBridge. It's just an implementation detail of 
> ImageContainers and is not specific to "Media".

Yes. Actually, in the process of cleaning things and splitting the work in progress I renamed it PImageBridge this morning. Is it okay?

> Do you agree that we can do this with the API change in comment #32 and no 
> other API changes (and no other changes to non-layers code)?

Yes, I agree, if adding Get/SetImageBridge and Get/SetLayerID to ImageContainer is not considered as API change (as in this patch). They should be called only by ImageBridgeChild so it is more an internal change.
Similar changes to be made on the parent side.
Comment 38 Nicolas Silva [:nical] 2012-05-30 16:16:26 PDT
About the data structures in PImageBridge.ipdl duplicated from PLayers.ipdl and prefixed with 'Media' I did this as a temporary workaround the fact that at the moment an IPDL protocol cannot use structures defined in another protocol. I think Chris Jones is working on this. It will be changed to using the same structures as PLayers instead (and remove the 'Media' that is just here to avoid naming clashes).
Comment 39 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-05-30 23:22:28 PDT
Comment on attachment 628527 [details] [diff] [review]
Part 3 - PImageBridge protocol

Review of attachment 628527 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.

Are we going to have problems with SetCurrentImageAsync calls being sent from different threads for the same ImageContainer? ISTR that an IPDL protocol instance always wants to used from the same thread.

::: gfx/layers/ImageLayers.h
@@ +285,5 @@
>   * (because layers can only be used on the main thread) and we want to
>   * be able to set the current Image from any thread, to facilitate
>   * video playback without involving the main thread, for example.
>   */
> +class THEBES_API ImageContainer : public LinkedListElement<ImageContainer> 

LinkedList isn't thread-safe so I think you need to do something else here.

@@ +498,5 @@
> +  PRInt32 GetLayerID() const { return mLayerID; }
> +  void SetLayerID(PRInt32 aLayerID)
> +  { 
> +    mLayerID = aLayerID; 
> +  }

Make these private and friend the necessary ImageBridge classes.
Comment 40 Nicolas Silva [:nical] 2012-05-31 07:22:25 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #39)
> Looks good.
> 
> Are we going to have problems with SetCurrentImageAsync calls being sent
> from different threads for the same ImageContainer? ISTR that an IPDL
> protocol instance always wants to used from the same thread.

This should not be a problem. ImageBridgeChild has its own thread that it shares with other potential instances of ImageBridgeChild (there's one instance per compositor).

Sending an image to the compositor is achieved with 
void AsyncSendSharedImage(SurfaceDescriptor& aSurface);
That creates a 'send' task and post it in the VideoBridgeChild thread's message loop.
So it can be called from any thread. (I just added the method so it does not appear in the uploaded version of the patch yet)

> 
> ::: gfx/layers/ImageLayers.h
> @@ +285,5 @@
> >   * (because layers can only be used on the main thread) and we want to
> >   * be able to set the current Image from any thread, to facilitate
> >   * video playback without involving the main thread, for example.
> >   */
> > +class THEBES_API ImageContainer : public LinkedListElement<ImageContainer> 
> 
> LinkedList isn't thread-safe so I think you need to do something else here.

I would like to ensure that the linked list is accessed only from the ImageBridgeChild thread, in which case we would not need synchronization. I added comments and assertions to make it clearer. If I don't manage to ensure this, I'll add whatever mutex we need to have thread safety.

> 
> @@ +498,5 @@
> > +  PRInt32 GetLayerID() const { return mLayerID; }
> > +  void SetLayerID(PRInt32 aLayerID)
> > +  { 
> > +    mLayerID = aLayerID; 
> > +  }
> 
> Make these private and friend the necessary ImageBridge classes.

Yes, done.
Comment 41 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-05-31 16:35:19 PDT
One thing you should think about, but probably not work on in this bug, is handling the "remote" image types. We will want that to work somehow so that OOP plugins can asynchronously render and be composited without synchronizing with the main thread. (Which will be excellent!) That will really need its own separate machinery since we don't directly get messages when the plugin updates its async surface. Instead we need to forward information to the compositor thread when the remote image machinery (shared memory and cross-process mutex) is set up.
Comment 42 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-05-31 16:51:52 PDT
I never understood how we triggered a recomposite with the locking impl.  But I agree it makes sense to roll these together.
Comment 43 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-06-01 18:39:28 PDT
Nico and I discussed synchronously proxying the copy-out-of-decoder-into-shmem work over to whichever thread is doing IPC.  (If we're not already publishing the frame from the same thread that does IPC.)  This eliminates extraneous copies related to IPC and means we probably don't need new ImageContainer interfaces (except for referring to "async containers" through some ID).  We also wouldn't need to change the memory model of Image/ImageContainer.
Comment 44 Dietrich Ayala (:dietrich) 2012-06-05 13:58:09 PDT
Required for smooth video on B2G.
Comment 45 Nicolas Silva [:nical] 2012-06-06 12:31:57 PDT
Created attachment 630671 [details] [diff] [review]
Part 3 - PImageBridge protocol (with lots of hacks)

A new snapshot of the async video work.

It's unfinished and very hacky, so it's more intended for early feedbacks than reviews.

In particular, I have not yet implemented dispatching the frames to the right layers, so it only works with simple pages containing one video.
But it works as a proof of concept and can be tested with this page http://people.mozilla.org/~bgirard/Video.html (clicking on the link below the video triggers some nasty JS that occupies the main thread, yet the video stays smooth (try scrolling at the same time)).

To make it work you need to apply Part 2 (attachment 628031 [details] [diff] [review]) and the patch in bug 761849.
Also enable layers acceleration, disable xrender, and enable OMTC (with the MOZ_USE_OMTC environment variable).
Comment 46 Nicolas Silva [:nical] 2012-06-21 08:46:05 PDT
Created attachment 635338 [details] [diff] [review]
Part 3 - PImageBridge protocol (WIP v3)

Here's the current state of async video. It has changed a lot since the last version I uploaded on bugzilla, and it is still a workin progress, but most of the stuff is there and it should (hopefully) not change too much before the final version.
It still has a bunch of TODOs, some printfs for debug, etc.

Basically it works well, but the destruction sequence is not completely done (so it can crash sometimes when closing Firefox)

If you want to try, you need to apply the patch from bug 763234 first

Feedbacks are welcome :)
Comment 47 Nicolas Silva [:nical] 2012-06-22 12:15:42 PDT
Created attachment 635857 [details] [diff] [review]
Part 3 - PImageBridge protocol v4

Here is a cleaner version of async video with a few fixes.

It's going through try right now.

I tested it on Fennec and it works well on my Galaxy Nexus.
I'll test on b2g very soon, I hope it will work well there. I didn't have to change anything for it to work on Fennec so it should be alright.

I think The patch is almost finished.
Comment 48 Nicolas Silva [:nical] 2012-06-25 16:13:57 PDT
Created attachment 636524 [details] [diff] [review]
PImageBridge protocol v5

Fixed a small memory leak and took into accounts the recent changes in bug 763234 's patch
Comment 49 Nicolas Silva [:nical] 2012-06-26 10:01:33 PDT
Created attachment 636761 [details] [diff] [review]
PImageBridge protocol

I cleaned up a few things little bit, and replaced anonymous namespaces with static (No big changes).
Comment 50 Nicolas Silva [:nical] 2012-06-26 11:32:31 PDT
Created attachment 636806 [details] [diff] [review]
PImageBridge protocol

Some more cleanup (again, no big changes).
Comment 51 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-06-29 08:32:50 PDT
Sorry for the review lag.  We've been test-driving this patch for a couple of days, will have comments based on that.  No big issues so far AFAIK.
Comment 52 Nicolas Silva [:nical] 2012-07-03 14:52:16 PDT
Created attachment 638873 [details] [diff] [review]
PImageBridge protocol

Some refactoring of BasicLayers has landed so I had to rebase the patch. Apart from rebasing I didn't change anything in the patch since the previous version.
Comment 53 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-07-03 17:42:56 PDT
Comment on attachment 638873 [details] [diff] [review]
PImageBridge protocol

Review of attachment 638873 [details] [diff] [review]:
-----------------------------------------------------------------

It seems that you're making SetCurrentImage always async for ImageContainers that have had an ImageContainerChild set up for them. I thought you were going to do something like comment #32, but that's not here.

::: content/html/content/src/nsHTMLMediaElement.cpp
@@ +3001,5 @@
> +  
> +  ImageContainer* container = mVideoFrameContainer->GetImageContainer();
> +  if (mozilla::layers::ImageBridgeChild::IsCreated()) { 
> +    mozilla::layers::ImageBridgeChild::GetSingleton()->CreateImageContainerChild(container);
> +  }

Why is this needed here? Can't construction of an ImageContainer do this automatically?

::: gfx/layers/ImageLayers.h
@@ +489,5 @@
> +  }
> +  void SetImageContainerChild(ImageContainerChild * aChild)
> +  {
> +    mImageContainerChild = aChild;
> +  }

Add comments saying who is supposed to call these? I assume that normal users of ImageContainers should not be calling these?

::: gfx/layers/ipc/ImageContainerChild.h
@@ +15,5 @@
> +
> +class ImageBridgeCopyAndSendTask;
> +
> +
> +class ImageContainerChild : public PImageContainerChild

Needs a big comment explaining what this is for and how it is used. Also need to describe somewhere how the lifetime of these objects is managed.

@@ +38,5 @@
> +  void SendImageAsync(ImageContainer* aContainer, Image* aImage);
> +
> +  
> +  /**
> +   *  Returns true if the method is called in the ImagebridgeChild thread.

ImageBridgeChild
Comment 54 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-07-03 19:18:11 PDT
(In reply to Chris Jones [:cjones] [:warhammer] from comment #51)
> Sorry for the review lag.  We've been test-driving this patch for a couple
> of days, will have comments based on that.  No big issues so far AFAIK.

Update: we're digging through a bug that's causing intermittent flashes of black frames.  Seems to be happening with both OMX decoder and camera stream.  Initial testing suggests a problem in the composition pipeline, but gralloc/GL problems haven't been ruled out 100%.
Comment 55 Nicolas Silva [:nical] 2012-07-04 10:48:01 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #53)
> Comment on attachment 638873 [details] [diff] [review]
> PImageBridge protocol
> 
> Review of attachment 638873 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> It seems that you're making SetCurrentImage always async for ImageContainers
> that have had an ImageContainerChild set up for them. I thought you were
> going to do something like comment #32, but that's not here.

At first I was keeping SetCurrentImage and SetCurrentImgeAsync separate but it turned out that it was looking like

if (container->GetImageContainerCHild()) {
  container->SetCurrentImageAsync();
} else {
  container->SetCurrentImage();
}

Since SetCurrentImage can be called from any thread, I don't really see why we should have two different methods because what really matters is whether the ImageContainer is back by an ImageContainerChild. It does not make sense to use SetCurrentImage (not Async) when the ImageContainer uses the ImageBridge protocol (It wont work actually, because during the transaction, if the ImageContainer uses ImageBridge, it will send the ImageID rather than the image's data).

Right now the behaviours in SetCurrentImage is: if we have an ImageBridge we use it, else we use the normal path (image will be forwarded during transaction).

> ::: content/html/content/src/nsHTMLMediaElement.cpp
> @@ +3001,5 @@
> > +  
> > +  ImageContainer* container = mVideoFrameContainer->GetImageContainer();
> > +  if (mozilla::layers::ImageBridgeChild::IsCreated()) { 
> > +    mozilla::layers::ImageBridgeChild::GetSingleton()->CreateImageContainerChild(container);
> > +  }
> 
> Why is this needed here? Can't construction of an ImageContainer do this
> automatically?

Do you mean have a constructor call looking like 
  new ImageContainer(ImageContainer::USE_IMAGEBRIDGE); 
or similary
  new ImageContainer(true);

Or do you mean that all ImageContainers should use ImageBridge in every case?

> 
> ::: gfx/layers/ImageLayers.h
> @@ +489,5 @@
> > +  }
> > +  void SetImageContainerChild(ImageContainerChild * aChild)
> > +  {
> > +    mImageContainerChild = aChild;
> > +  }
> 
> Add comments saying who is supposed to call these? I assume that normal
> users of ImageContainers should not be calling these?

the getter can be called by anyone, but the setter is for Image{Bridge/Container}Child only. I moved it to protected and added the two concerned classes as friend classes of ImageContainer.


I'm fixing these things and i'll upload a new patch in a few minutes.
Comment 56 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-07-04 16:24:39 PDT
(In reply to Nicolas Silva [:nical] from comment #55)
> At first I was keeping SetCurrentImage and SetCurrentImgeAsync separate but
> it turned out that it was looking like
> 
> if (container->GetImageContainerCHild()) {
>   container->SetCurrentImageAsync();
> } else {
>   container->SetCurrentImage();
> }

Where would we do that? For video, we would always call GetCurrentImageAsync, and for everything else we would always call SetCurrentImage.

The key question is whether we will ever have an ImageContainer that we want to sometimes update as part of a transaction and sometimes update asynchronously. Right now, we don't: videos and async-drawing-model plugins always update asynchronously, everything else always updates synchronously. For future things, like off-main-thread animated images, I'm not so sure. But probably we shouldn't worry about that.

So, how about we have a flags word parameter for the ImageContainer constructor, with an ENABLE_ASYNC flag, and say that when that flag is set, only SetCurrentImageAsync calls are allowed, and when it's not set, only SetCurrentImage calls are allowed? At some point in the future, if we need to, we could relax things and fix the implementation so that SetCurrentImage calls are allowed even if ENABLE_ASYNC is set.
Comment 57 Nicolas Silva [:nical] 2012-07-04 17:10:16 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #56)
> Where would we do that? For video, we would always call
> GetCurrentImageAsync, and for everything else we would always call
> SetCurrentImage.
> 


SetCurrentImageAsync only works with OMTC, so if video will always call SetCurrentImageAsync, then video does not work without OMTC.  Or am I missing something?

Right now OMTC is disabled by default on desktop, so SetCurrentImage[Async] has to use the old (or rather current) code path in this case (the most common case actually).

When we don't have OMTC we don't have ImageBridge either, so in this case container->GetImageContainerChild() will always return nsnull.

I don't think we can get arround checking for GetImageContainerChild and if we can't send the frame asynchronously we want to at least send the frame event if it means using main thread.

The way my patch work is when we create the ImageContainer for video, if possible we give it an ImageBridge protocol but it may not work (for instance if OMTC is not enabled). Then when we set the current image, the best scenario is async using ImageBridge, and fallback to synchronous if we don't have ImageBridge.


> The key question is whether we will ever have an ImageContainer that we want
> to sometimes update as part of a transaction and sometimes update
> asynchronously. 

Very true.

> Right now, we don't: videos and async-drawing-model plugins
> always update asynchronously, everything else always updates synchronously.
> For future things, like off-main-thread animated images, I'm not so sure.
> But probably we shouldn't worry about that.

Yes, So the decision of prefering asynchronous compositing can be done when initializing the ImageContainer, by giving it or not an ImageContainerChild, and the presence or absence of the ImageContainerChild is what's going to decide between sync vs async compositing.

In the current version of the patch trying to use the "SetCurrentImage in transaction" way would not work if the ImageContainer has an ImageContainerChild.

> 
> So, how about we have a flags word parameter for the ImageContainer
> constructor, with an ENABLE_ASYNC flag, and say that when that flag is set,
> only SetCurrentImageAsync calls are allowed, and when it's not set, only
> SetCurrentImage calls are allowed? At some point in the future, if we need
> to, we could relax things and fix the implementation so that SetCurrentImage
> calls are allowed even if ENABLE_ASYNC is set.

I don't really see the point in having two incompatible ways to send images because I think that when one is not possible (async) we want to default on the other one (sync).

If we really want at some point to be able to switch between sync and async on the same ImageContainer (my understanding is that it is not the case at the moment), we can easily add something like "bool mDisableAsync" in ImageContainer that would enforce the synchronous code paths even if the container has an ImageBridge. We can't enforce asynchronous completely though, without enforcing OMTC.

Or, we can have SetCurrentImage that tries asyncrnous and falls back to syncrhonous, and SetCurrentImageForTransaction (or something like this) that will enforce synchronous
(using mDisableAsync as mentionned above, which would also relax the restriction I metionned about not being able to use the synchronous code path when ImageContainer has an ImageBridge).
Comment 58 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-07-04 17:38:17 PDT
The semantics of SetCurrentImage and SetCurrentImageAsync would be quite different and callers shouldn't substitute one for the other.

The semantics of SetCurrentImage would be that, if the ImageContainer is in use by an ImageLayer, the caller must currently be in a transaction for that layer's LayerManager and the image change will take effect as part of that transaction. Therefore SetCurrentImage must be called on the main thread, since only the main thread participates in layer transactions. SetCurrentImage should really have some assertions added to check those preconditions.

The semantics of SetCurrentImageAsync would be that the image is updated "as soon as possible" independently of any transactions. Therefore it can be called on any thread, inside or outside a transaction.

Currently some call sites to SetCurrentImage want the first behavior, while others want the second. It's an accident that we happen to implement both behaviors with the same method, because pre-OMTC the same implementation happens to produce both behaviors. We should clarify the expectations of callers by having them call different methods.
Comment 59 Nicolas Silva [:nical] 2012-07-05 07:39:34 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #58)
> The semantics of SetCurrentImage and SetCurrentImageAsync would be quite
> different and callers shouldn't substitute one for the other.
> 
> The semantics of SetCurrentImage would be that, if the ImageContainer is in
> use by an ImageLayer, the caller must currently be in a transaction for that
> layer's LayerManager and the image change will take effect as part of that
> transaction. Therefore SetCurrentImage must be called on the main thread,
> since only the main thread participates in layer transactions.
> SetCurrentImage should really have some assertions added to check those
> preconditions.
> 
> The semantics of SetCurrentImageAsync would be that the image is updated "as
> soon as possible" independently of any transactions. Therefore it can be
> called on any thread, inside or outside a transaction.
> 
> Currently some call sites to SetCurrentImage want the first behavior, while
> others want the second. It's an accident that we happen to implement both
> behaviors with the same method, because pre-OMTC the same implementation
> happens to produce both behaviors. We should clarify the expectations of
> callers by having them call different methods.

Ah! ok, so if I understand correctly, the use of SetCurrentImage in VideoFrameContainer before async video does not correspond to the "in transaction" SetCurrentImage that you mentionned, does it?
This in transaction SetCurrentImage has yet to be implemented then, since the current implementation (the one before async video) specifies that it can be used from any thread and is effectively used off the main thread.

What would the main-thread-in-transaction implementation do differently compared to the current cross thread implementation (without async video)?
Comment 60 Nicolas Silva [:nical] 2012-07-05 11:36:17 PDT
How about we keep the async version of SetCurrentImage as 'SetCurrentImage', and add a 'SetCurrentImageInTransaction' for the cases where we are in the main-thread and in a transaction?

I mean rather than having SetCurrentImage and SetCurrentImageAsync.

I think the most restrictive version should have the most explicit name.
Comment 61 Nicolas Silva [:nical] 2012-07-05 12:50:51 PDT
Created attachment 639432 [details] [diff] [review]
PImageBridge protocol

I added some documetation, in particular a big documentation bloc in ImageBridgeChild.h that should clarify the way ImageBridge works.

Also added SetCurrentImageInTransaction that does what the old SetCurrentImage did without trying ImageBridge. SetCurrentImage tries to use ImageBridge and falls back to what the old SetCurrentImage was doing, as I proposed in my last comment.

SetCurrentImageInTransaction is not used anywhere yet and I suppose its implementation will be modified to take advantage of it not being so general purpose, but these should belong to another bug.

ImageContainer's constructor now takes a flag that can be ENABLE_ASYNC or DISABLE_ASYNC. In practive you choose by calling LayerManager::CreateImageContainer() or LayerManager::CreateAsynchronousImageContainer(). I did this rather than CreateImageContainer(aFlag) because the latter would requiere the caller to inclue ImageLayers just for two constants, and including too many files is part of what makes C++ compilation so slow (I'm sure adding up this kind of behavior through the entire code base matters). But if this is a problem I can still change, i'm not really attached to this.

Ther night be some other small changes here and there, but it is mostly that.
Comment 62 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-07-05 14:30:28 PDT
(In reply to Nicolas Silva [:nical] from comment #59)
> Ah! ok, so if I understand correctly, the use of SetCurrentImage in
> VideoFrameContainer before async video does not correspond to the "in
> transaction" SetCurrentImage that you mentionned, does it?

Right.

> This in transaction SetCurrentImage has yet to be implemented then, since
> the current implementation (the one before async video) specifies that it
> can be used from any thread and is effectively used off the main thread.

True, although many users of SetCurrentImage are currently relying on the fact that when called during a transaction, it takes effect as part of the current transaction.

> What would the main-thread-in-transaction implementation do differently
> compared to the current cross thread implementation (without async video)?

The current implementation is adequate, although as I said above it could use some more assertions to verify preconditions.
Comment 63 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-07-05 15:04:57 PDT
(In reply to Nicolas Silva [:nical] from comment #61)
> SetCurrentImageInTransaction is not used anywhere yet and I suppose its
> implementation will be modified to take advantage of it not being so general
> purpose, but these should belong to another bug.

It seems to me that all current users of GetCurrentImage except for video should be converted to SetCurrentImageInTransaction, otherwise you'll be breaking them.

(In reply to Nicolas Silva [:nical] from comment #61)
> ImageContainer's constructor now takes a flag that can be ENABLE_ASYNC or
> DISABLE_ASYNC. In practive you choose by calling
> LayerManager::CreateImageContainer() or
> LayerManager::CreateAsynchronousImageContainer(). I did this rather than
> CreateImageContainer(aFlag) because the latter would requiere the caller to
> inclue ImageLayers just for two constants, and including too many files is
> part of what makes C++ compilation so slow (I'm sure adding up this kind of
> behavior through the entire code base matters). But if this is a problem I
> can still change, i'm not really attached to this.

No, that's good.
Comment 64 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-07-05 16:58:44 PDT
Comment on attachment 639432 [details] [diff] [review]
PImageBridge protocol

Review of attachment 639432 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/html/content/src/nsHTMLMediaElement.cpp
@@ +90,5 @@
>  #endif
>  
> +#ifdef LOG
> +#undef LOG
> +#endif

Why this?

@@ +2998,5 @@
>  
>    mVideoFrameContainer =
> +    new VideoFrameContainer(this, LayerManager::CreateAsynchronousImageContainer());
> +
> +  mVideoFrameContainer->GetImageContainer();

This call to GetImageContainer doesn't do anything.

@@ +3004,5 @@
> +  ImageContainer* container = mVideoFrameContainer->GetImageContainer();
> +  if (ImageBridgeChild::IsCreated()) {
> +    ImageBridgeChild::GetSingleton()->CreateImageContainerChild(container);
> +  }
> +*/

Just remove the commented-out code.

::: gfx/layers/ImageLayers.cpp
@@ +139,5 @@
> +
> +void
> +ImageContainer::SetCurrentImageInTransaction(Image *aImage)
> +{
> +  ReentrantMonitorAutoEnter mon(mReentrantMonitor);

assert NS_IsMainThread and !mImageContainerChild.

I think we should share the body of SetCurrentImageInTransaction with SetCurrentImage's non-mImageContainerChild path, probably a private SetCurrentImageInternalSync or something like that.

::: gfx/layers/ImageLayers.h
@@ +295,2 @@
>  
> +  enum { DISABLE_ASYNC, ENABLE_ASYNC };

I would just make this a bitfield so ENABLE_ASYNC = 0x01, no DISABLE_ASYNC (it's implicitly zero).

@@ +295,4 @@
>  
> +  enum { DISABLE_ASYNC, ENABLE_ASYNC };
> +
> +  ImageContainer(int flag = DISABLE_ASYNC);

Default to 0.

@@ +323,5 @@
>     * mReentrantMonitor.
> +   *
> +   * If this ImageContainer has an ImageContainerChild for async video: 
> +   * Schelude a task to send the image to the compositor using the 
> +   * PImageBridge protcol without using the main thread.

Note that this must not be called if ENABLE_ASYNC has not been set.

@@ +341,5 @@
> +   *
> +   * Implementations must call CurrentImageChanged() while holding
> +   * mReentrantMonitor.
> +   */
> +  void SetCurrentImageInTransaction(Image* aImage);

Note that this must not be called if ENABLE_ASYNC has been set.

@@ +496,5 @@
> +   * Returns a pointer to this container's ImageContainerChild if it has one.
> +   * if the return value is not nil, it means that this ImageContainer is backed
> +   * by the ImageBridgeProtocol and its images are transfered directly to the 
> +   * compositor thread/process outside of a transaction and without using the
> +   * main thread.

Who needs to call this and GetImageContainerChild? ImageContainer clients should not need to call this I hope.

::: gfx/layers/Layers.cpp
@@ +390,5 @@
> +    gfxMatrix maskTranslation;
> +    bool maskIs2D = mMaskLayer->GetTransform().CanDraw2D(&maskTranslation);
> +    NS_ASSERTION(maskIs2D, "How did we end up with a 3D transform here?!");
> +#endif
> +    mMaskLayer->mEffectiveTransform.PreMultiply(mMaskLayer->GetTransform());

What's changing in this hunk? Seems like it shouldn't be in the patch.

::: gfx/layers/Layers.h
@@ +412,5 @@
>    /**
>     * Can be called anytime, from any thread.
>     */
>    static already_AddRefed<ImageContainer> CreateImageContainer();
> +  static already_AddRefed<ImageContainer> CreateAsynchronousImageContainer();

Add a comment explaining what the difference is.

::: gfx/layers/ipc/CompositorParent.cpp
@@ +578,5 @@
>    return true;
>  }
>  
> +
> +typedef std::map<PRUint32,CompositorParent*> CompositorMap;

Shouldn't this be a hashmap? Or better still, an nsTHashtable? nsDataHashtable<nsUint32Hashkey, CompositorParent*> would be good.

@@ +617,5 @@
> +
> +CompositorParent* CompositorParent::RemoveCompositor(PRUint32 id)
> +{
> +  CompositorMap::iterator it = sCompositorMap->find(id);
> +  if (it == sCompositorMap->end()) return nsnull;

Two lines

::: gfx/layers/ipc/CompositorParent.h
@@ +85,5 @@
> +  
> +  /**
> +   * Creates a global map referencing each compositor by ID.
> +   *
> +   * This map is used by the ImageBirdge protocol to trigger

typo

::: gfx/layers/ipc/ImageBridgeChild.h
@@ +53,5 @@
> + *   - (B) Since the ImageContainer does not use ImageBridge, the image data is swaped.
> + *  
> + * - During composition:
> + *   - (A) The ShadowImageLayer has an image ID, it looks up the ID in the global
> + *   table to see if there is an image. If there is no image, nothing is rendered.

The ID is for an ImageContainer, not for an Image. You need to make this clearer everywhere.

@@ +72,5 @@
> +friend class mozilla::layers::ImageContainer;
> +friend class mozilla::layers::ImageBridgeDestroyTask;
> +friend class mozilla::layers::ImageBridgeDeleteTask;
> +friend class mozilla::layers::ImageBridgeConnectionTask;
> +friend class mozilla::layers::ImageBridgeCreateContainerChildTask;

Indent these.

@@ +143,5 @@
> +   * Creates an ImageContainerChild and it's associated ImageContainerParent.
> +   *
> +   * The creation happens synchronously on the ImageBridgeChild thread, so if 
> +   * this function is called on another thread, the current thread will be 
> +   * paused until the creation is done.

But the ImageBridgeChild thread doesn't do a synchronous message to the compositor during this time right? We want calling CreateImageContainerChild on the main thread to not block the main thread waiting for a composite to end, or anything like that. Assuming that's true, make it clear in this comment that calling this from the main thread is actually OK.

::: gfx/layers/ipc/ImageBridgeParent.cpp
@@ +25,5 @@
> +  ImageContainerParent::DestroySharedImageMap();
> +}
> +
> +namespace {
> +  PRUint64 GenImageID() {

As I mentioned earlier, refer to ImageContainer IDs everywhere instead of Image IDs, otherwise things are going to be very confusing.

::: gfx/layers/ipc/ImageContainerChild.cpp
@@ +34,5 @@
> + *
> + * The values for the two constants are arbitrary and should be tweaked if it 
> + * happens that we run into shared memory problems.
> + */
> +enum {POOL_MAX_SHARED_IMAGES = 5, MAX_ACTIVE_SHARED_IMAGES=10};

these could just be static const int

@@ +139,5 @@
> +             data->mCbCrSize.width);
> +      memcpy(tempBufferV->Data() + i * tempBufferV->Stride(),
> +             data->mCrChannel + i * data->mCbCrStride,
> +             data->mCbCrSize.width);
> +    }

Ugh. We really need to eliminate copies :-(. If you and cjones are OK with this for now, I can live with it, but we really need ImageFactory to create some kind of SharedImage subclass of Image directly.

@@ +166,5 @@
> +
> +  if (mSharedImagePool.Length() >= POOL_MAX_SHARED_IMAGES) {
> +    return false;
> +  }
> +  if (img->type()==SharedImage::TYUVImage) {

Spaces around ==

@@ +264,5 @@
> +    return;
> +  }
> +
> +  // Sending images and (potentially) allocating shmems must be done 
> +  // on the ImageBrdgeChild thread.

typo

::: gfx/layers/ipc/ImageContainerChild.h
@@ +120,5 @@
> +  SharedImage* ImageToSharedImage(Image* toCopy);
> +
> +  bool AddSharedImageToPool(SharedImage* img);
> +  SharedImage* PopSharedImageFromPool();
> +  void ClearSharedImagePool();

We probably need comments on these methods.

::: gfx/layers/opengl/ImageLayerOGL.cpp
@@ +793,5 @@
> +
> +  nsRefPtr<gfxSharedImageSurface> surfY =
> +    gfxSharedImageSurface::Open(yuv.Ydata());
> +  nsRefPtr<gfxSharedImageSurface> surfU =
> +    gfxSharedImageSurface::Open(yuv.Udata());

This is just like code in ShadowImageLayerOGL::Init. Can't we share it?

@@ +848,5 @@
> +                                                  mOGLManager->GetCompositorID());
> +    PRUint32 imgVersion = ImageContainerParent::GetSharedImageVersion(mImageID);
> +    if (imgVersion != mImageVersion) {
> +      SharedImage* img = ImageContainerParent::GetSharedImage(mImageID);
> +      if (!UploadTextureFromSharedImage(img)) return;

Two lines

::: gfx/thebes/gfxPlatform.cpp
@@ +280,5 @@
> +            = mozilla::layers::ImageBridgeChild::GetSingleton();
> +        mozilla::layers::ImageBridgeParent* imageBridgeParent
> +            = new mozilla::layers::ImageBridgeParent(
> +                CompositorParent::CompositorLoop());
> +        imageBridgeChild->ConnectAsync(imageBridgeParent);

Some typedefs would get rid of the namespace prefixes here and make this more readable.
Comment 65 Kan-Ru Chen [:kanru] (UTC+8) 2012-07-05 21:17:53 PDT
Comment on attachment 639432 [details] [diff] [review]
PImageBridge protocol

Review of attachment 639432 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/ipc/ImageContainerChild.cpp
@@ +16,5 @@
> +namespace layers {
> +
> +/*
> + * - POOL_MAX_SHARED_IMAGES is the maximum number number of shared images to
> + * store in the InageContainerChild's pool.

ImageContainerChild

@@ +45,5 @@
> +bool ImageContainerChild::RecvReleasedSharedImage(const SharedImage& aImage)
> +{
> +  SharedImage* img = new SharedImage(aImage);
> +  if (!AddSharedImageToPool(img) || mStop) {
> +    DeallocSharedImageData(this, *img);

Do you mean DestroySharedImage? Otherwise mActiveImageCount will never decrease.
Comment 66 Nicolas Silva [:nical] 2012-07-06 08:30:20 PDT
Created attachment 639685 [details] [diff] [review]
PImageBridge protocol

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #63)
> It seems to me that all current users of GetCurrentImage except for video
> should be converted to SetCurrentImageInTransaction, otherwise you'll be
> breaking them.

At the moment they are not broken because calling SetCurrentImage without having an ImageBridge is exactly like calling SetCurrentImageInTransaction. I can change change them in this bug if you want, I meant to do it in another bug to avoid making this one too big and painful for review.

> many users of SetCurrentImage are currently relying on the fact that when called during
> a transaction, it takes effect as part of the current transaction.

So right now with async video what happens in the transaction is that if the layer container has an Image bridge the swap operation sends the image ID rather than the image data (regardless of what flavour of SetCurrentImage was called earlier). As it is, it's not a good idea to call SetCurrentImageInTransaction as it is harmless, but does not send the image to the compositor.
If we change this to enable switching between in-transaction and async on the same ImageContainer, what worries me is that for ImageBridge to work again, we have to do another transaction which will Swap the image data with the image ID. I guess async video should not trigger transactions continuously, so we'd need to make sure that we trigger a layer transaction when we switch back to async. The whole point being to be freed from the main thread, I think being able to do this would defeat the purpose of async video.
Ans it seems to be complicated for something that we don't need at the moment (and may never need?).

> The current implementation is adequate, although as I said above it could use some more > assertions to verify preconditions.

I added an assertion to check that we are on the main thread, I don't think we can check that we are in a transaction, can we?

> Do you mean DestroySharedImage? Otherwise mActiveImageCount will never decrease.

Yes, fixed.
Comment 67 Nicolas Silva [:nical] 2012-07-06 10:33:37 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #64)

I'm fixing these items.

> Who needs to call this and GetImageContainerChild? ImageContainer clients
> should not need to call this I hope.

I replaced it by IsAsync() and GetSharedImageID() 
which are the only things that GetImageContainerChild was meant to provide to ImageContainer users.

> 
> ::: gfx/layers/Layers.cpp
> @@ +390,5 @@
> > +    gfxMatrix maskTranslation;
> > +    bool maskIs2D = mMaskLayer->GetTransform().CanDraw2D(&maskTranslation);
> > +    NS_ASSERTION(maskIs2D, "How did we end up with a 3D transform here?!");
> > +#endif
> > +    mMaskLayer->mEffectiveTransform.PreMultiply(mMaskLayer->GetTransform());
> 
> What's changing in this hunk? Seems like it shouldn't be in the patch.

huh, I don't know, I doesn't seem to have white-space changes either, i'll find out. 


> @@ +143,5 @@
> > +   * Creates an ImageContainerChild and it's associated ImageContainerParent.
> > +   *
> > +   * The creation happens synchronously on the ImageBridgeChild thread, so if 
> > +   * this function is called on another thread, the current thread will be 
> > +   * paused until the creation is done.
> 
> But the ImageBridgeChild thread doesn't do a synchronous message to the
> compositor during this time right? We want calling CreateImageContainerChild
> on the main thread to not block the main thread waiting for a composite to
> end, or anything like that. Assuming that's true, make it clear in this
> comment that calling this from the main thread is actually OK.

Actually, it does use a synchonous message to create the ImageContainerChild/Parent pair.
This is because the container ID is generated on the Compositor side, where we can make sure there is no collision. 
 

> 
> @@ +139,5 @@
> > +             data->mCbCrSize.width);
> > +      memcpy(tempBufferV->Data() + i * tempBufferV->Stride(),
> > +             data->mCrChannel + i * data->mCbCrStride,
> > +             data->mCbCrSize.width);
> > +    }
> 
> Ugh. We really need to eliminate copies :-(. If you and cjones are OK with
> this for now, I can live with it, but we really need ImageFactory to create
> some kind of SharedImage subclass of Image directly.

Yes, I want to eliminate this copy and I have a clear idea of how i want to do it, but as a separate patch. This one is already big enough, and I'd like to see it pass reviews before I implement stuff on top of it. Rebasing all the time is a pain. The current state of async video is still an improvement over non-async video anyway.
By the way, to avoid this copy in a graceful way, i might need to change the Image class a bit, so it's probably worth chatting about it a bit before I do it and it conflicts with other goals and i redo it entirely (just like what happened with async video).
Comment 68 Nicolas Silva [:nical] 2012-07-06 10:53:45 PDT
> >  #endif
> >  
> > +#ifdef LOG
> > +#undef LOG
> > +#endif
> 
> Why this?

Some headers are defining a LOG macro, like somewhere in the IPDL code. At some point this file was including a file that included (etc) one of these definitions of LOG. As a result it was generating a warning on my machine, and an error on try servers thanks to -Werror. I don't know if every one else uses warning as errors, I think it is not the default config that you can find on the wiki because I would have not removed it and I didn't have it before. 
Anyway, since then I removed the include that was injecting the LOG macro so it is not strictly necessary to undef the it. But as LOG is a name that is very likely to clash, I kept the #undef to avoid the error showing up again as a side effect if someone ever changes one of the numerous headers included in nsHTMLMediaElement.cpp to directly or indirectly include or define a LOG macro.
Comment 69 Nicolas Silva [:nical] 2012-07-06 11:10:54 PDT
> > +typedef std::map<PRUint32,CompositorParent*> CompositorMap;
> 
> Shouldn't this be a hashmap? Or better still, an nsTHashtable? 
> nsDataHashtable<nsUint32Hashkey, CompositorParent*> would be good.

On mobile there is only one compositor and on desktop one per window. Do we gain something from using a hashmap rather than a map with so few elements?
Comment 70 Nicolas Silva [:nical] 2012-07-06 15:02:57 PDT
Created attachment 639821 [details] [diff] [review]
PImageBridge protocol

I fixed most of the items you guys pointed (I need to check again, I may have forgotten some, there was quite a few of them!). It includes a lot of renaming to make it clear that IDs refer to containers and not images, some documentation.
I also found and fixed a memory leak.
Comment 71 Nicolas Silva [:nical] 2012-07-06 20:07:43 PDT
*** Bug 706172 has been marked as a duplicate of this bug. ***
Comment 72 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-07-08 17:30:33 PDT
(In reply to Nicolas Silva [:nical] from comment #66)
> At the moment they are not broken because calling SetCurrentImage without
> having an ImageBridge is exactly like calling SetCurrentImageInTransaction.
> I can change change them in this bug if you want, I meant to do it in
> another bug to avoid making this one too big and painful for review.

You can get the benefits of smaller patches and less pain by doing it in this bug, but in a separate patch. Let's do that :-).

> If we change this to enable switching between in-transaction and async on
> the same ImageContainer,

Let's not worry about that now.

> I guess async video should not trigger transactions
> continuously,

Right. I think for now, video should just never call SetCurrentImageInTransaction.

> I added an assertion to check that we are on the main thread, I don't think
> we can check that we are in a transaction, can we?

We could but it would be hard. Don't worry about it.

(In reply to Nicolas Silva [:nical] from comment #67)
> (In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #64)
> > But the ImageBridgeChild thread doesn't do a synchronous message to the
> > compositor during this time right? We want calling CreateImageContainerChild
> > on the main thread to not block the main thread waiting for a composite to
> > end, or anything like that. Assuming that's true, make it clear in this
> > comment that calling this from the main thread is actually OK.
> 
> Actually, it does use a synchonous message to create the
> ImageContainerChild/Parent pair.
> This is because the container ID is generated on the Compositor side, where
> we can make sure there is no collision.

That's kinda bad. It means allocating an ImageContainer can block while the compositor thread does a composite.

How about having the ImageBridgeParent allocate an ImageBride ID for the child (assuming we need to support multiple ImageBridges, for multiple content processes), and have the ImageBridgeChild be responsible for allocating an ImageContainer ID, and then using a pair of IDs to identify the ImageContainer?

> By the way, to avoid this copy in a graceful way, i might need to change the
> Image class a bit, so it's probably worth chatting about it a bit before I
> do it and it conflicts with other goals and i redo it entirely (just like
> what happened with async video).

Here's the plan I've been carrying in my head for minimal-copying when using our builtin decoders:
-- PlanarYCbCrImage gets a new method Allocate(Data* aData) where the data pointers are null when passed in, and the method allocates memory and fills in the data pointers. Then there are two modes of operation:
1) "caller-allocate", the current setup, where the caller already has data in memory and just calls SetData() to copy that data into the image
2) "callee-allocate", the new option, where the caller calls Allocate(), fills in the buffers allocated by the image, then calls SetData() when that's finished. The image must keep the buffers alive and readable by the caller as long as the Image object lives.
-- The default implementation of Allocate() just allocates buffers in main memory.
-- For IPC we implement Allocate() to allocate buffers in shared memory.
-- Modify our decoder libraries to decode into buffers provided by the caller. I've already worked out a plan for this with Tim Terriberry. Basically we extend the codec API to allow the caller to pass in two callbacks: "create a frame buffer", and "release a frame buffer". The codec will need to keep some frames around longer than others to be used as references when decoding other frames.
-- Implement those callbacks in our video decoder: create a frame buffer creating a PlanarYCbCrImage and calling Allocate(), release a frame buffer by derefing the PlanarYCbCrImage.
Comment 73 Nicolas Silva [:nical] 2012-07-09 08:10:36 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #72)

> (In reply to Nicolas Silva [:nical] from comment #67)
> > (In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #64)
> > > But the ImageBridgeChild thread doesn't do a synchronous message to the
> > > compositor during this time right? We want calling CreateImageContainerChild
> > > on the main thread to not block the main thread waiting for a composite to
> > > end, or anything like that. Assuming that's true, make it clear in this
> > > comment that calling this from the main thread is actually OK.
> > 
> > Actually, it does use a synchonous message to create the
> > ImageContainerChild/Parent pair.
> > This is because the container ID is generated on the Compositor side, where
> > we can make sure there is no collision.
> 
> That's kinda bad. It means allocating an ImageContainer can block while the
> compositor thread does a composite.
> 
> How about having the ImageBridgeParent allocate an ImageBride ID for the
> child (assuming we need to support multiple ImageBridges, for multiple
> content processes), and have the ImageBridgeChild be responsible for
> allocating an ImageContainer ID, and then using a pair of IDs to identify
> the ImageContainer?

The current stsrem has one very nice effect, which is that a video stream can change compositor effortlessly. That's what happens when you drag a tab containing a video from a window to another. If I change the ID to use a pair <ImageBridgeID,ContainerID> it means that this case will be harder to handle.
But I admit blocking the main thread kinda sux. Though it happens only when you create the video stream. I'll look in the profiler to see if it is noticeable.

> 
> > By the way, to avoid this copy in a graceful way, i might need to change the
> > Image class a bit, so it's probably worth chatting about it a bit before I
> > do it and it conflicts with other goals and i redo it entirely (just like
> > what happened with async video).
> 
> Here's the plan I've been carrying in my head for minimal-copying when using
> our builtin decoders:
> -- PlanarYCbCrImage gets a new method Allocate(Data* aData) where the data
> pointers are null when passed in, and the method allocates memory and fills
> in the data pointers. Then there are two modes of operation:
> 1) "caller-allocate", the current setup, where the caller already has data
> in memory and just calls SetData() to copy that data into the image
> 2) "callee-allocate", the new option, where the caller calls Allocate(),
> fills in the buffers allocated by the image, then calls SetData() when
> that's finished. The image must keep the buffers alive and readable by the
> caller as long as the Image object lives.
> -- The default implementation of Allocate() just allocates buffers in main
> memory.
> -- For IPC we implement Allocate() to allocate buffers in shared memory.
> -- Modify our decoder libraries to decode into buffers provided by the
> caller. I've already worked out a plan for this with Tim Terriberry.
> Basically we extend the codec API to allow the caller to pass in two
> callbacks: "create a frame buffer", and "release a frame buffer". The codec
> will need to keep some frames around longer than others to be used as
> references when decoding other frames.
> -- Implement those callbacks in our video decoder: create a frame buffer
> creating a PlanarYCbCrImage and calling Allocate(), release a frame buffer
> by derefing the PlanarYCbCrImage.

This sounds very good to me, and it gets rid of two copies, though there is one thing to keep in mind is that shared images can only be allocated/deallocated in the ImageBridgeChild thread (or Compositor thread for the parent side). And also, why do you want to add Allocate in PlanarYCbCrImage rather than doing the work in ImageContainer::CreateImage?

What I had in mind was to add SharedPlanarYCbCrImage inheriting from PlanarYCbCrImage that would be allocated by ImageContainer, and when we allocate an Image through ImageContainer, if it has a shared image available in its ImageContainerChild's pool it returns a SharedPlanarYCbCrImage, and if not it just returns a PlanarYCbCrImage. Later, when we call SetCurrentImage on the ImageContainer, if the argument is shared the image container wil send it directly through the image bridge, and if it is not, it will send the image to the ImageBridgeChild thread and the allocation copy will happen there (just like it happens now with async video.

I don't know what would happen on emergency ipc shutdown. Does it exit firefox? because if it doesn't, we my run into the case where the ipdl system deallocates its shmems on emergency shutdown, while the decoder is decoding into it...
Comment 74 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-07-09 23:31:01 PDT
(In reply to Nicolas Silva [:nical] from comment #73)
> The current stsrem has one very nice effect, which is that a video stream
> can change compositor effortlessly. That's what happens when you drag a tab
> containing a video from a window to another. If I change the ID to use a
> pair <ImageBridgeID,ContainerID> it means that this case will be harder to
> handle.

I thought there was one ImageBridge per process. If so, why does <ImageBridgeID,ContainerID> make it harder to handle cross-window tab dragging?

> This sounds very good to me, and it gets rid of two copies, though there is
> one thing to keep in mind is that shared images can only be
> allocated/deallocated in the ImageBridgeChild thread (or Compositor thread
> for the parent side).

That's OK, we should be able to do a sync call from the decoder thread to the ImageBridgeChild thread.

> And also, why do you want to add Allocate in
> PlanarYCbCrImage rather than doing the work in ImageContainer::CreateImage?

Because Allocate()'s interface has to be format-specific, like SetData's. There are alternative approaches:
-- define at least one variant of the CreateImage API that forces preallocation, and provide getters on PlanarYCbCrImage to retrieve pointers to the preallocated buffers. That seems more complicated and uglier than Allocate().
-- define a new PLANAR_YCBCR_PREALLOCATED image format, and provide getters on PlanarYCbCrImagePreallocated to retrieve pointers to the preallocated buffers. That seems even worse.

> What I had in mind was to add SharedPlanarYCbCrImage inheriting from
> PlanarYCbCrImage that would be allocated by ImageContainer, and when we
> allocate an Image through ImageContainer, if it has a shared image available
> in its ImageContainerChild's pool it returns a SharedPlanarYCbCrImage, and
> if not it just returns a PlanarYCbCrImage. Later, when we call
> SetCurrentImage on the ImageContainer, if the argument is shared the image
> container wil send it directly through the image bridge, and if it is not,
> it will send the image to the ImageBridgeChild thread and the allocation
> copy will happen there (just like it happens now with async video.

I think it would be better to have a special ImageFactory for ImageLayers with shadows that allocates SharedPlanarYCbCrImages always. SharedPlanarYCbCrImage::Allocate would retrieve buffers from the pool or create new ones.

> I don't know what would happen on emergency ipc shutdown. Does it exit
> firefox? because if it doesn't, we my run into the case where the ipdl
> system deallocates its shmems on emergency shutdown, while the decoder is
> decoding into it...

I don't know anything about emergency IPC shutdowns. They certainly can't be allowed to arbitrarily deallocate resources while threads are still using them.
Comment 75 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-07-10 14:05:30 PDT
Comment on attachment 639821 [details] [diff] [review]
PImageBridge protocol

General comment: use uint64_t for IDs and don't use a linear scan to
reuse old IDs.  Attempting to reuse old IDs can lead to subtle bugs,
and there's no need with uint64_t: we can allocate a million IDs per
second for >500 years before overflowing.

I'm not going to call those out specifically below.

Another general comment: you're creating specialized Task subclasses
for what can be done with existing helpers.  To dispatch a task that
just calls a method on an object

 loop->PostTask(FROM_HERE, NewRunnableMethod(obj, &Method, ...));

to dispatch a task that just calls a function

 loop->PostTask(FROM_HERE, NewRunnableFunction(obj, &Method, ...));

Please make these changes below; I'm not going to call them out
specifically in this review.


>diff --git a/gfx/layers/ImageLayers.cpp b/gfx/layers/ImageLayers.cpp

>+ImageContainer::ImageContainer(int flag) 

>+  if ((flag == ENABLE_ASYNC) && ImageBridgeChild::IsCreated()) {
>+    ImageBridgeChild::GetSingleton()->CreateImageContainerChild(this);
>+  }

I don't think this is going to be high enough overhead to worry about
initially, but if necessary we can make the IPC initialization of this
guy lazier.

I think it would be clearer if CreateImageContainerChild() returned
ImageContainerChild*, and we assigned it to mImageContainerChild here.

>diff --git a/gfx/layers/ImageLayers.h b/gfx/layers/ImageLayers.h

>+  /**
>+   * Returns true if this ImageContainer uses the ImageBridge IPDL protocol.
>+   */
>+  inline bool IsAsync() const {
>+    return mImageContainerChild != nsnull;

From what threads can this be called?

>+  }
>+
>+  /**
>+   * If this ImageContainer uses ImageBridge, returns the ID associated to
>+   * this container, for use in the ImageBridge protocol.
>+   * Returns 0 if this ImageContainer does not use ImageBridge. Note that
>+   * 0 is always an invalid ID for asynchronous image containers. 
>+   */
>+  PRUint32 GetAsyncContainerID() const;

From what threads can this be called?

>+  /**
>+   * Should be called by ImageContainerChild or ImageBridgeChild when 
>+   * initializing/destroying the asynchronous image bridge protocol.
>+   */
>+  void SetImageContainerChild(ImageContainerChild * aChild)
>+  {
>+    mImageContainerChild = aChild;
>+  }
>+

I'm not 100% sure we need this, but I may need another pass through to
convince myself.  If we can get away with this only being called on
~ImageContainer(), then we don't need it.  At any rate, with the
change requested above, let's make this ClearImageContainerChild().

One issue here is that the threading model isn't clear.  Not 100% sure
yet when this happens outside of ~ImageContainer(), so maybe not a bad
problem.  If we call this concurrent with clients on the state-machine
thread, from another thread, we're gonna have a bad time.

>+  // ImageBridgeProtocol
>+  ImageContainerChild * mImageContainerChild;

This is worth a more descriptive comment.

>diff --git a/gfx/layers/Layers.h b/gfx/layers/Layers.h

>   /**
>    * Can be called anytime, from any thread.
>+   *
>+   * Create an Image container which forwards its images to the compositor within
>+   * layer transactions on the main thread.
>    */
>   static already_AddRefed<ImageContainer> CreateImageContainer();

Can this still be created by any thread?

I don't think it's true that this is only used within layer
transactions on the content main thread.  The plugin async drawing
code violates that, right?

>+  
>+  /**
>+   * Can be called anytime, from any thread.
>+   *
>+   * Create an Image container which forwards its images to the compositor 
>+   * asynchronously using the ImageBridge IPDL protocol.
>+   */
>+  static already_AddRefed<ImageContainer> CreateAsynchronousImageContainer();
> 

This description isn't entirely true ... the returned image container
may or may not use ImageBridge.  I'm not sure I like this name, but if
roc is happy I'm happy, and I can't think anything much better atm.

>diff --git a/gfx/layers/Makefile.in b/gfx/layers/Makefile.in

>@@ -112,27 +112,37 @@ EXPORTS_mozilla/layers =\
>         CompositorCocoaWidgetHelper.h \
>         CompositorChild.h \
>         CompositorParent.h \
>         ShadowLayers.h \
>         ShadowLayersChild.h \
>         ShadowLayersParent.h \
>         ShadowLayersManager.h \
>         RenderTrace.h \
>+        ImageBridgeChild.h \
>+        ImageBridgeParent.h \
>+        ImageContainerChild.h \
>+        ImageContainerParent.h \
>+        SharedImageUtils.h \
>+
>         $(NULL)

Drop the newline before $(NULL) and please keep these alphabetized.

> 
> CPPSRCS += \
>         CompositorCocoaWidgetHelper.cpp \
>         CompositorChild.cpp \
>         CompositorParent.cpp \
>         ShadowLayers.cpp \
>         ShadowLayerChild.cpp \
>         ShadowLayersChild.cpp \
>         ShadowLayerParent.cpp \
>         ShadowLayersParent.cpp \
>+        ImageBridgeChild.cpp \
>+        ImageBridgeParent.cpp \
>+        ImageContainerChild.cpp \
>+        ImageContainerParent.cpp \

Keep alphabetized plz.

>diff --git a/gfx/layers/basic/BasicImageLayer.cpp b/gfx/layers/basic/BasicImageLayer.cpp

>+  if (mContainer->IsAsync()) {
>+    PRUint32 imageID = mContainer->GetAsyncContainerID();

containerID

>diff --git a/gfx/layers/ipc/CompositorParent.cpp b/gfx/layers/ipc/CompositorParent.cpp

>   if (aBackendType == LayerManager::LAYERS_OPENGL) {

This should work for basic layer managers too.  Please file a
followup.

>+void CompositorParent::CreateCompositorMap()
>+{
>+  if (sCompositorMap == nsnull) {
>+    sCompositorMap = new CompositorMap;

Nit: |new CompositorMap();|.  It doesn't matter here but the above
idiom has a specific usage that's confusing in this context.

>+void CompositorParent::DestroyCompositorMap()
>+{
>+  if (sCompositorMap != nsnull) {
>+    delete sCompositorMap;

You should probably assert here that the map is empty, or at least
warn if it's not.

>+CompositorParent* CompositorParent::GetCompositor(PRUint32 id)
>+{
>+  CompositorMap::iterator it = sCompositorMap->find(id);
>+  if (it == sCompositorMap->end()) {
>+    return nsnull;
>+  }
>+  return it->second;

 return it != sCompositorMap->end() ? it->second : nsnull;

>+CompositorParent* CompositorParent::RemoveCompositor(PRUint32 id)
>+{
>+  CompositorMap::iterator it = sCompositorMap->find(id);
>+  if (it == sCompositorMap->end()) {

This should also assert or warn in this case, I think.

>diff --git a/gfx/layers/ipc/ImageBridgeChild.cpp b/gfx/layers/ipc/ImageBridgeChild.cpp
>diff --git a/gfx/layers/ipc/ImageBridgeChild.h b/gfx/layers/ipc/ImageBridgeChild.h
>diff --git a/gfx/layers/ipc/ImageBridgeParent.cpp b/gfx/layers/ipc/ImageBridgeParent.cpp
>diff --git a/gfx/layers/ipc/ImageContainerChild.cpp b/gfx/layers/ipc/ImageContainerChild.cpp
>diff --git a/gfx/layers/ipc/ImageContainerChild.h b/gfx/layers/ipc/ImageContainerChild.h
>diff --git a/gfx/layers/ipc/ImageContainerParent.cpp b/gfx/layers/ipc/ImageContainerParent.cpp
>diff --git a/gfx/layers/ipc/ImageContainerParent.h b/gfx/layers/ipc/ImageContainerParent.h

These generally look like what I was expecting but I'm going to have
come back and make a second pass over them.

>diff --git a/gfx/layers/ipc/PImageBridge.ipdl b/gfx/layers/ipc/PImageBridge.ipdl

>+sync protocol PImageBridge
>+{

>+  sync PImageContainer() returns (PRUint32 id);

Nit: uint64_t.

>diff --git a/gfx/layers/ipc/PImageContainer.ipdl b/gfx/layers/ipc/PImageContainer.ipdl

>+/**
>+ * The PImageBridge protocol is used to allow isolated threads or processes to push
>+ * frames directly to the compositor thread/process without relying on the main thread
>+ * which might be too busy dealing with content script.
>+ */

Please describe PImageContainer's role in that.

>+sync protocol PImageContainer
>+{

>+child:
>+  // Give back the child thread/process ownership to a SharedImage
>+  async ReleasedSharedImage(SharedImage image);

Nit: "ReleaseImage"

Is it possible for the compositor to release several images, if we had
a queue and there were multiple "stale" images?  You might want to
send SharedImage[].

>+
>+parent:
>+
>+  // Send a SharedImage to the compositor giving to the compositor ownership 
>+  // of the image.
>+  async PushSharedImage(SharedImage image);

Nit: "PublishImage"

>+  sync __delete__();

There's a race here between __delete__ and ReleaseImage.  You
need a two-phase destruction protocol.  Writing a state machine helps
prove this to yourself [1].  In this case, what you want is

 child:
   async ReleaseImage(SharedImage image);
   async __delete__();

 parent:
   async PublishImage(SharedImage image);
   async Destroy();


 state PUBLISHING:
   send PublishImage goto PUBLISHING;
   send Destroy goto DESTROYING;
   recv ReleaseImage goto PUBLISHING;

 state DESTROYING:
   recv ReleaseImage goto DESTROYING;
   recv __delete__;

(I'm not 100% sure this protocol will pass type checking.  If not,
ping me.)


>diff --git a/gfx/layers/ipc/ShadowLayers.h b/gfx/layers/ipc/ShadowLayers.h

>+  void SetCompositorID(PRUint32 aID)
>+  {
>+    mCompositorID = aID;
>+  }

This can only be set once.  Assert that mCompositorID is "null" (not
set) here, which I think is == 0.

> protected:
>   ShadowLayerManager() {}

Need to initialize mCompositorID here.

> /**
>  * A ShadowableLayer is a Layer can be shared with a parent context
>  * through a ShadowLayerForwarder.  A ShadowableLayer maps to a
>  * Shadow*Layer in a parent context.
>  *
>@@ -583,18 +593,24 @@ public:
>                     SharedImage* aNewBack) = 0;
> 
>   virtual ShadowLayer* AsShadowLayer() { return this; }
> 
>   MOZ_LAYER_DECL_NAME("ShadowImageLayer", TYPE_SHADOW)
> 
> protected:
>   ShadowImageLayer(LayerManager* aManager, void* aImplData)
>-    : ImageLayer(aManager, aImplData)
>+    : ImageLayer(aManager, aImplData), 
>+      mImageContainerID(0),
>+      mImageVersion(0)
>   {}
>+
>+  // ImageBridge protocol:
>+  PRUint32 mImageContainerID;
>+  PRUint32 mImageVersion;
> };
> 
> 
> class ShadowColorLayer : public ShadowLayer,
>                          public ColorLayer
> {
> public:
>   virtual ShadowLayer* AsShadowLayer() { return this; }

>diff --git a/gfx/layers/ipc/SharedImageUtils.h b/gfx/layers/ipc/SharedImageUtils.h

>+template<typename Allocator>
>+bool AllocateSharedBuffer(Allocator* protocol,
>+                          const gfxIntSize& aSize,
>+                          gfxASurface::gfxContentType aContent,
>+                          gfxSharedImageSurface** aBuffer)

In bug 765734 we're moving away from directly allocating shmem, but we
can reconcile these in a followup.

>diff --git a/gfx/layers/ipc/ipdl.mk b/gfx/layers/ipc/ipdl.mk

> IPDLSRCS = \
>   PCompositor.ipdl \
>   PLayer.ipdl \
>   PLayers.ipdl \
>+  PImageBridge.ipdl \
>+  PImageContainer.ipdl \
>+  LayersSurfaces.ipdlh \

Alphabetical plz.

>   $(NULL)

>diff --git a/gfx/layers/opengl/ImageLayerOGL.cpp b/gfx/layers/opengl/ImageLayerOGL.cpp

> void
> ShadowImageLayerOGL::Swap(const SharedImage& aNewFront,
>                           SharedImage* aNewBack)

>+    if (aNewFront.type() == SharedImage::TSharedImageID) {
>+      // We are using ImageBridge protocol. The image data will be queried at render
>+      // time in the parent side.
>+      mImageContainerID = aNewFront.get_SharedImageID().id();
>+      mImageVersion = 0;

This isn't a critical bug, but I think here we want to do

  newID = aNewFront.get_SharedImageID().id();
  if (newID != mImageContainerID) {
    mImageContainerID = newID;
    mImageVersion = 0;
  }

otherwise we'll uselessly reupload with every "normal" layers
transaction on the content thread.

>+bool ShadowImageLayerOGL::UploadTextureFromSharedImage(SharedImage* img)
>+{

The duplication of this (rather nasty) code makes me unhappy, and in
addition we don't handle TSurfaceDescriptor any more.  Let's please
share it.

> void
> ShadowImageLayerOGL::RenderLayer(int aPreviousFrameBuffer,
>                                  const nsIntPoint& aOffset)
> {
>   mOGLManager->MakeCurrent();
>+  if (mImageContainerID) {
>+    ImageContainerParent::SetCompositorIDForImage(mImageContainerID,
>+                                                  mOGLManager->GetCompositorID());

I'm not convinced this implementation is correct.  I think it's
vulnerable to the following (edge case) bug

 - LayerManager M1 "owns" image container C
 - on the content side, C switches from M1 to LayerManager M2
 - txns for M1 and M2 are pushed to compositor
 - somehow, M1 ends up calling ShadowImageLayerOGL::RenderLayer
   *after* M2, perhaps because of an unlucky composition scheduling
 - C remains associated with M1
 - there are no more txns pushed for M2: we're only doing async video
   playback

In this case, I think we could go an unbounded amount of time without
recompositing for M2.  Nothing reassociates C with M2.

In reality, we push layers txns all the time, even with just video
playing, so I don't think this would be a serious bug in practice.
But if a better solution presents itself I'd rather use that here.

>diff --git a/gfx/thebes/gfxPlatform.cpp b/gfx/thebes/gfxPlatform.cpp

>     if (useOffMainThreadCompositing) {
>+        CompositorParent::CreateCompositorMap();
>         CompositorParent::CreateThread();
>+
>+        ImageBridgeChild::Create(new base::Thread("ImageBridgeChild"));
>+        ImageBridgeChild* imageBridgeChild = ImageBridgeChild::GetSingleton();
>+        ImageBridgeParent* imageBridgeParent = new ImageBridgeParent(
>+                CompositorParent::CompositorLoop());
>+        imageBridgeChild->ConnectAsync(imageBridgeParent);

Let's sweep all this goo into CompositorParent::StartUp().

We need to be able to pref off async video independently of omtc.
Let's add another pref for it, let's say "layers.async-video.enabled",
defaulting to false for now.

>+    mozilla::layers::ImageBridgeChild* bridge = 
>+        ImageBridgeChild::GetSingleton();
>+    if (bridge) {
>+        base::Thread* imageBridgeThread = bridge->GetThread();
>+        ImageBridgeChild::Destroy();
>+        delete imageBridgeThread;
>+    }
>+
>     CompositorParent::DestroyThread();
>-
>+    CompositorParent::DestroyCompositorMap();

Let's sweep this into CompositorParent::ShutDown().

[1] https://developer.mozilla.org/en/IPDL/Tutorial#Protocol_state_machines

Review part 1.
Comment 76 Nicolas Silva [:nical] 2012-07-10 21:19:58 PDT
Created attachment 640906 [details] [diff] [review]
PImageBridge protocol

In this version I think I have everything fixed, except the potential race you pointed. I'll look closely into this one tomorrow.
Comment 77 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-07-10 21:49:06 PDT
Comment on attachment 640906 [details] [diff] [review]
PImageBridge protocol

Review of attachment 640906 [details] [diff] [review]:
-----------------------------------------------------------------

Followup work:
1) Change appropriate callers to use SetCurrentImageInTransaction
2) When we successfully handle a SetCurrentImage via ImageBridge, we should tell the caller not to bother triggering a new main-thread layer transaction. That'll let us reduce main-thread activity when we're doing nothing but play video.

I only skimmed *Parent/*Child and IPDL stuff, assuming cjones is reviewing those.

::: content/html/content/src/nsHTMLMediaElement.cpp
@@ +88,5 @@
>  #endif
>  
> +#ifdef LOG
> +#undef LOG
> +#endif

Remove this change

::: gfx/layers/ImageLayers.cpp
@@ +92,5 @@
> +  mRemoteDataMutex(nsnull),
> +  mCompositionNotifySink(nsnull),
> +  mImageContainerChild(nsnull)
> +{
> +  if ((flag == ENABLE_ASYNC) && ImageBridgeChild::IsCreated()) {

no parens needed around == subexpression

::: gfx/layers/ImageLayers.h
@@ +587,5 @@
> +  // be created, most likely because off-main-thread compositing is not enabled.
> +  // In this case the ImageContainer is perfectly usable, but it will forward 
> +  // frames to the compositor through transactions in the main thread rather than 
> +  // asynchrmously using the ImageBridge IPDL protocol.
> +  ImageContainerChild * mImageContainerChild;

ImageContainerChild *mImageContainerChild;

::: gfx/layers/Layers.cpp
@@ +390,5 @@
> +    gfxMatrix maskTranslation;
> +    bool maskIs2D = mMaskLayer->GetTransform().CanDraw2D(&maskTranslation);
> +    NS_ASSERTION(maskIs2D, "How did we end up with a 3D transform here?!");
> +#endif
> +    mMaskLayer->mEffectiveTransform.PreMultiply(mMaskLayer->GetTransform());

You don't seem to be changing anything in this hunk.

::: gfx/layers/ipc/ImageContainerChild.h
@@ +148,5 @@
> +  ImageContainer* mImageContainer;
> +  nsIntSize mSize;
> +  nsTArray<SharedImage*> mSharedImagePool;
> +  bool mStop; 
> +  int mActiveImageCount;

put bool last

::: gfx/layers/ipc/ImageContainerParent.cpp
@@ +66,5 @@
> +  : image(aImage), id(aID), version(1), compositorID(0) {}
> +  SharedImage*  image;
> +  PRUint64      id;
> +  PRUint32      version;
> +  PRUint64      compositorID;

Put fields in order: 64-bits, pointers, 32-bits
Comment 78 Nicolas Silva [:nical] 2012-07-11 17:41:19 PDT
Created attachment 641288 [details] [diff] [review]
PImageBridge protocol
Comment 79 Nicolas Silva [:nical] 2012-07-11 19:15:02 PDT
Created attachment 641317 [details] [diff] [review]
PImageBridge protocol

This version has a way to throw away the shared images, useful when the image container is still allocated but inactive for a while.

call ImageContainer::SetCurrentImage(nsnull); instead of providing an image.

As a future enhancement the image bridge will probably set a timeout before getting rid of all its shared memory.
Comment 80 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-07-12 02:43:49 PDT
Comment on attachment 628031 [details] [diff] [review]
Move OptimalShmemType from ShadowLayers.cpp to SharedMemory.h

This doesn't really make sense in SharedMemory.h.  Let's just expose this through ShadowLayers.h.
Comment 81 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-07-12 02:48:00 PDT
(In reply to Chris Jones [:cjones] [:warhammer] from comment #75)
> Comment on attachment 639821 [details] [diff] [review]
> PImageBridge protocol
> 
> we can allocate a million IDs per second for >500 years before overflowing.

Sorry, that's a million per millisecond.  A million per second is >500,000 years.
Comment 82 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-07-12 04:33:21 PDT
Comment on attachment 641317 [details] [diff] [review]
PImageBridge protocol

This looks much nicer, thanks for sticking with it!

>diff --git a/gfx/layers/ImageLayers.cpp b/gfx/layers/ImageLayers.cpp

>+ImageContainer::ImageContainer(int flag) 

>+  if (flag == ENABLE_ASYNC && ImageBridgeChild::IsCreated()) {
>+    ImageBridgeChild::GetSingleton()->CreateImageContainerChild(this);

> I think it would be clearer if CreateImageContainerChild() returned
> ImageContainerChild*, and we assigned it to mImageContainerChild here.
> 

Didn't address this: let's do

 mImageContainerChild = blahblah->Create();

Let's go ahead and file a followup on de-sync'ing this while the code
is still fresh in our minds.  There's an IPDL trick we can use.

>diff --git a/gfx/layers/ImageLayers.h b/gfx/layers/ImageLayers.h

>+   * Can be called from any thread.
>+  inline bool IsAsync() const {

>+   * Can be called from ay thread.
>+  PRUint64 GetAsyncContainerID() const;
> 

These aren't actually thread safe.  Read on below.

>+  void SetImageContainerChild(ImageContainerChild * aChild)
>+  {
>+    mImageContainerChild = aChild;

This call can race with IsAsync()/GetAsyncContainerID(), for the
latter in an sg:crit way I think.  We need to do away with this
setter.  To keep things thread safe and sane, we need to

 - have ImageContainer hold a strong ref to its mImageContainerChild
 - only release this ref in ~ImageContainer
 - if the ImageContainerChild shuts down before ImageContainer
   releases its ref to the ImageContainerChild, then
   ImageContainerChild will just live on in a "zombie" state,
   everything no-ops.  I think |mStop| is handling this properly in
   your existing code.

Shared-memory multithreading is hard!

>+  ImageContainerChild *mImageContainerChild;

Strong ref.

>diff --git a/gfx/layers/ipc/CompositorParent.cpp b/gfx/layers/ipc/CompositorParent.cpp

>+CompositorParent* CompositorParent::GetCompositor(PRUint64 id)
>+{
>+  CompositorMap::iterator it = sCompositorMap->find(id);
>+  if (it == sCompositorMap->end()) {
>+    return nsnull;
>+  }
>+  return it->second;
>+}
>+

>  return it != sCompositorMap->end() ? it->second : nsnull;
> 

Didn't address this.

>diff --git a/gfx/layers/ipc/CompositorParent.h b/gfx/layers/ipc/CompositorParent.h

>+  static void CreateCompositorMap();
>+  static void DestroyCompositorMap();

These should be private.

>diff --git a/gfx/layers/ipc/ImageBridgeChild.cpp b/gfx/layers/ipc/ImageBridgeChild.cpp

Throughout here, we shouldn't need to use the reentrant Monitor, just
the plain Monitor.  (And if there's a valid reason, please document.)

>diff --git a/gfx/layers/ipc/PImageBridge.ipdl b/gfx/layers/ipc/PImageBridge.ipdl

>diff --git a/gfx/layers/ipc/PImageContainer.ipdl b/gfx/layers/ipc/PImageContainer.ipdl

>+child:

>+  async ReturnImage(SharedImage image);

>+parent:

>+  sync Stop();

>+  sync __delete__();

Unfortunately this implementation doesn't work either.  ReturnImage
can still race with __delete__, and when it does we lose.  We need to
do something like I described in the last comment, a two-phase
shutdown.  I'm not sure I understand all the constraints on the
current impl so it might be best to chat in person tomorrow.

It may not have been clear, but the state machine I wrote in previous
review comment is actually part of the IPDL language, IPDL syntax.
Let's work this out and add it to this file for self-checking
documentation.

>diff --git a/gfx/layers/ipc/ipdl.mk b/gfx/layers/ipc/ipdl.mk

> IPDLSRCS = \
>+  LayersSurfaces.ipdlh \
>   PCompositor.ipdl \
>   PLayer.ipdl \
>   PLayers.ipdl \
>+  PImageBridge.ipdl \
>+  PImageContainer.ipdl \

Alphabetical plz ;).

>   $(NULL)
>diff --git a/gfx/layers/opengl/ImageLayerOGL.cpp b/gfx/layers/opengl/ImageLayerOGL.cpp

> void
> ShadowImageLayerOGL::RenderLayer(int aPreviousFrameBuffer,
>                                  const nsIntPoint& aOffset)
> {
>   mOGLManager->MakeCurrent();
>+  if (mImageContainerID) {
>+    ImageContainerParent::SetCompositorIDForImage(mImageContainerID,
>+                                                  mOGLManager->GetCompositorID());

Let's file a followup for the correctness bug I mentioned in the
previous comment.

>+    PRUint32 imgVersion = ImageContainerParent::GetSharedImageVersion(mImageContainerID);
>+    if (imgVersion != mImageVersion) {
>+      SharedImage* img = ImageContainerParent::GetSharedImage(mImageContainerID);
>+      if (img && (img->type() == SharedImage::TYUVImage)) {
>+        UploadSharedYUVToTexture(img->get_YUVImage());
>+  
>+        mImageVersion = imgVersion;
>+      }

Not handling type SurfaceDescriptor here.  If it's valid to assume
that then let's assert or warn.  Please to be asking mattwoodrow or
roc.

So the two remaining blocking issues are
 - ownership model of ImageContainer<-ImageContainerChild
 - IPC destruction protocol for PImage{Bridge,Container}

Almost there! :)
Comment 83 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-07-12 05:36:42 PDT
Created attachment 641435 [details] [diff] [review]
For reference: last patch rebased on bug 763234

Slightly nontrivial so here's what it looks like.  This doesn't attempt to add gralloc support to PImageBridge; we'll do that in a followup.

Based on Kan-Ru's work from https://github.com/cgjones/platform-demo-mc .
Comment 84 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-07-12 05:37:25 PDT
Er, that should be bug 765734.
Comment 85 Nicolas Silva [:nical] 2012-07-12 07:52:56 PDT
(In reply to Chris Jones [:cjones] [:warhammer] from comment #82)
> Comment on attachment 641317 [details] [diff] [review]
> PImageBridge protocol
> >diff --git a/gfx/layers/ipc/PImageContainer.ipdl b/gfx/layers/ipc/PImageContainer.ipdl
> 
> >+child:
> 
> >+  async ReturnImage(SharedImage image);
> 
> >+parent:
> 
> >+  sync Stop();
> 
> >+  sync __delete__();
> 
> Unfortunately this implementation doesn't work either.  ReturnImage
> can still race with __delete__, and when it does we lose.  We need to
> do something like I described in the last comment, a two-phase
> shutdown.  I'm not sure I understand all the constraints on the
> current impl so it might be best to chat in person tomorrow.

Really? Stop is step 1, __delete__ is step 2. Stop is sync, so any message from parent to child will be put in the child side message loop before Stop finishes. after Stop no one is allowed to send any message and the child side schedules in its own loop a task that will send __delete__, so this task will be executed after any message received from the parent.
After Stop, no message except __delete__ will be sent. I don't see where the race is.

> It may not have been clear, but the state machine I wrote in previous
> review comment is actually part of the IPDL language, IPDL syntax.
> Let's work this out and add it to this file for self-checking
> documentation.

I understood, but how can we express the fact that after SendStop the child schedules a task in its own loop that will do Send__delete__ ? This is the key thing here.

Sorry for the fixes I missed, I'm on them now.

> Almost there! :)

\o/
Comment 86 Nicolas Silva [:nical] 2012-07-12 12:44:15 PDT
(In reply to Chris Jones [:cjones] [:warhammer] from comment #80)
> Comment on attachment 628031 [details] [diff] [review]
> Move OptimalShmemType from ShadowLayers.cpp to SharedMemory.h
> 
> This doesn't really make sense in SharedMemory.h.  Let's just expose this
> through ShadowLayers.h.

How does this make more sense in ShadowLayer.h than SharedMemory.h? Ideally ShadowLayers.h and ImageBridge should not even have to know about this i think.
Comment 87 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-07-12 13:32:47 PDT
The definition of "optimal" here is wrt layers surface sharing.  SharedMemory is just a dumb shared memory segment, it has no concept of optimal.
Comment 88 Nicolas Silva [:nical] 2012-07-12 14:08:58 PDT
Created attachment 641589 [details] [diff] [review]
PImageBridge protocol
Comment 89 Nicolas Silva [:nical] 2012-07-12 15:12:36 PDT
Created attachment 641624 [details] [diff] [review]
PImageBridge protocol
Comment 90 Nicolas Silva [:nical] 2012-07-12 15:25:48 PDT
Created attachment 641627 [details] [diff] [review]
PImageBridge protocol

rebased
Comment 91 Nicolas Silva [:nical] 2012-07-12 17:33:04 PDT
Created attachment 641669 [details] [diff] [review]
PImageBridge protocol

meh, forgot to remove the inline keyword somewhere which didn't blow up on my machine but failed on the try servers.
Comment 92 Nicolas Silva [:nical] 2012-07-12 19:25:30 PDT
Created attachment 641699 [details] [diff] [review]
PImageBridge protocol

One of the includes did not work on osx for some reason. Fixed.
Comment 93 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-07-13 01:24:23 PDT
Comment on attachment 641699 [details] [diff] [review]
PImageBridge protocol

>diff --git a/gfx/layers/ImageLayers.h b/gfx/layers/ImageLayers.h

> class THEBES_API ImageContainer {
>   NS_INLINE_DECL_THREADSAFE_REFCOUNTING(ImageContainer)
>+friend class mozilla::layers::ImageContainerChild;
>+friend class mozilla::layers::ImageBridgeChild;

This isn't necessary anymore, right?

>diff --git a/gfx/layers/ipc/ImageBridgeChild.cpp b/gfx/layers/ipc/ImageBridgeChild.cpp

>+#include "mozilla/ReentrantMonitor.h"

I don't know the media code well enough to say whether a reentrant
monitor is required here, but it's almost always not.  I guess roc can
speak to that if he wants.

>+void ImageBridgeChild::StartUp()
>+{

>+  imageBridgeChild->ConnectAsync(imageBridgeParent);

Can we just do 

  GetMessageLoop()->PostTask(
    FROM_HERE,
    NewRunnableMethod(this, &Open,
                      parent->GetIPCChannel(), imageBridgeParent->GetMessageLoop(),
                      AsyncChannel::Child));

here instead of going through a couple of layers of indirection?
Posting a helper function is fine too.

>+void ImageBridgeChild::DestroyBridge()
>+{

>+  sImageBridgeChildSingleton->GetMessageLoop()->PostTask(FROM_HERE, 
>+                  NewRunnableFunction(&StopImageBridgeSync, &barrier));
>+  barrier.Wait();

This isn't correct because monitors can have spurious wakeups.  You
need something like

  bool done = false;
  Monitor mon;
  MonitorAutoEnter enter(mon);
  // post task
  while (!done) {
    enter.Wait();
  }

>+ImageContainerChild* ImageBridgeChild::CreateImageContainerChild()

already_AddRefed<ImageContainerChild>

>+  ReentrantMonitor barrier("CreateImageContainerChild Lock");
>+  ReentrantMonitorAutoEnter autoMon(barrier);
>+

It's not possible to reenter this helper.  Use Monitor.

>+  GetMessageLoop()->PostTask(FROM_HERE, NewRunnableFunction(&CreateContainerChildSync, &result, &barrier));
>+

Note here that the first ref to |result| "floats" from bridge thread
to here.  Take the ref like so

  nsRefPtr<ImageContainerChild> container(dont_AddRef(result));

>+  // should stop the thread until the ImageContainerChild has been created on 

s/should stop the/blocks this/ ;), or we're in trouble ;).

>+  // the other thread
>+  barrier.Wait();

Same thing here --- need a bool cond that you want on.

>+  return result;

return container.forget();

>+}
>+
>+ImageContainerChild* ImageBridgeChild::CreateImageContainerChildNow()

already_AddRefed<ImageContainerChild>

>+  ImageContainerChild* ctnChild = new ImageContainerChild();

nsRefPtr<ImageContainerChild>

>+  PRUint64 id = 0;
>+  SendPImageContainerConstructor(ctnChild, &id);
>+  ctnChild->SetID(id);
>+  return ctnChild;

return ctnChild.forget();

>diff --git a/gfx/layers/ipc/ImageBridgeChild.h b/gfx/layers/ipc/ImageBridgeChild.h

>+  /**
>+   * Creates the ImageBridgeChild manager protocol.
>+   */
>+  static bool CreateWithThread(base::Thread* aThread);
>+

Probably want to call this StartUpOnThread() for consistency.

>diff --git a/gfx/layers/ipc/ImageContainerParent.cpp b/gfx/layers/ipc/ImageContainerParent.cpp

>+bool ImageContainerParent::Recv__delete__()

>diff --git a/gfx/layers/ipc/ImageContainerParent.h b/gfx/layers/ipc/ImageContainerParent.h

>+class ImageContainerParent : public PImageContainerParent
>+{

>+  // Overriden from PImageContainerParent (see ImageContainer.ipdl)
>+  bool RecvPublishImage(const SharedImage& aImage);
>+  // Overriden from PImageContainerParent (see DoStop)
>+  bool RecvStop();
>+  // Overriden from PImageContainerParent (see ImageContainer.ipdl)
>+  bool Recv__delete__();
>+  // Overriden from PImageContainerParent (see ImageContainer.ipdl)
>+  bool RecvFlush();

Use MOZ_OVERRIDE, for example

 bool RecvFlush() MOZ_OVERRIDE;

>diff --git a/gfx/layers/ipc/PImageContainer.ipdl b/gfx/layers/ipc/PImageContainer.ipdl

>+  // After receiving this message, the ImageContainerParent will not return images
>+  // back to the child side (to avoid a race between ReturnImage and __delete__) 
>+  sync Stop();
>+
>+  sync __delete__();
>+

This doesn't implement the simpler and more efficient protocol we
discussed on IRC, and you didn't write out the protocol state machine.
I don't think you'll be able to write one for this protocol since it
relies on event-loop semantics for correctness, actually.

But this works and we're in a pinch, so I'm going to strategically
assent :).

A few more things to fix above, but those I think this is ready to go,
pref'd off.  Let's try to land tomorrow.  \o/

r=me with fixes above.
Comment 94 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-07-13 04:07:06 PDT
Comment on attachment 641699 [details] [diff] [review]
PImageBridge protocol

I just noticed that we're doing AddCompositor() on the main thread but everything else happens on the compositor thread.  Can't land with sg:crit data hazards.
Comment 95 Nicolas Silva [:nical] 2012-07-13 08:50:07 PDT
Created attachment 641921 [details] [diff] [review]
PImageBridge protocol
Comment 96 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-07-13 11:51:29 PDT
Comment on attachment 641921 [details] [diff] [review]
PImageBridge protocol


>diff --git a/gfx/layers/ipc/CompositorParent.cpp b/gfx/layers/ipc/CompositorParent.cpp

>@@ -77,16 +81,19 @@ CompositorParent::CompositorParent(nsIWi

>+  CompositorLoop()->PostTask(FROM_HERE, NewRunnableFunction(&AddCompositor, 
>+                                                          this, &mCompositorID));

This is pretty scary ... in this case we get away with it because the
only thing that can destroy the CP is initialized on the compositor
thread after this task has been processed.  Please document this and
add a FIXME comment.

>diff --git a/gfx/layers/ipc/ImageBridgeChild.cpp b/gfx/layers/ipc/ImageBridgeChild.cpp

>+static void StopImageBridgeSync(ReentrantMonitor *aBarrier, bool *done)

aDone

>+{
>+  *done = false;
>+  ReentrantMonitorAutoEnter autoMon(*aBarrier);
>+

You can't modify done outside the monitor.  Just remove the *done =
false line, it's not necessary.

>+// dispatched function
>+static void DeleteImageBridgeSync(ReentrantMonitor *aBarrier, bool *done)
>+{
>+  *done = false;
>+  ReentrantMonitorAutoEnter autoMon(*aBarrier);

Same changes here.

>+  *done=true;

 *aDone = true;

>+// dispatched function
>+static void CreateContainerChildSync(nsRefPtr<ImageContainerChild>* result, 
>+                                     ReentrantMonitor* barrier,
>+                                     bool *done)
>+{
>+  *done=false;

And here.

>+  ReentrantMonitorAutoEnter autoMon(*barrier);
>+  *result = sImageBridgeChildSingleton->CreateImageContainerChildNow();
>+  *done=true;

and here

r=me this those.  We really need to tighten up the cross-thread memory
management here, but I'm ok us doing that later since this will land
pref'd off.
Comment 97 Nicolas Silva [:nical] 2012-07-13 12:41:51 PDT
Created attachment 642007 [details] [diff] [review]
PImageBridge protocol
Comment 98 Nicolas Silva [:nical] 2012-07-13 13:10:46 PDT
Lets' land it :)

We need to land Bug 763234 first.

Also, I propose that we don't close this bug and we use it as a tracking bug for the follow up bugs.
Comment 99 Ryan VanderMeulen [:RyanVM] 2012-07-13 18:08:51 PDT
I applied bug 763234 and am getting many failed hunks when I try to apply this patch. Also, this needs a real commit message before landing.
Comment 100 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-07-14 19:56:01 PDT
Nico, I'm rebasing this, will land in a few minutes.
Comment 101 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-07-14 20:29:49 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/f00bc02921e3

\o/
Comment 102 Ryan VanderMeulen [:RyanVM] 2012-07-15 10:20:24 PDT
https://hg.mozilla.org/mozilla-central/rev/f00bc02921e3
Comment 103 Nicolas Silva [:nical] 2012-07-15 13:31:26 PDT
oh, too bad I also rebased it yesterday but i wanted to make sure try was green before submitting. I'm sorry that you had to do it yourself.

Anyway it's really cool that this has finally landed :D
Comment 104 JP Rosevear [:jpr] 2012-07-18 10:25:04 PDT
Is there still a reason to leave this bug open as per white board?
Comment 105 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-07-18 11:55:50 PDT
See comment #98, although I agree it would be slightly better to close this bug and open a new tracking bug.
Comment 106 Joe Drew (not getting mail) 2012-08-16 13:35:48 PDT
We don't usually block on tracking bugs, so I'm going to close this out as fixed and open a new bug with the open dependencies of this bug. Just makes it easier to keep things tracked.

Note You need to log in before you can comment on or make changes to this bug.