Sandboxing support for Video camera access

RESOLVED FIXED in Firefox 43

Status

()

Core
Audio/Video
RESOLVED FIXED
3 years ago
2 months ago

People

(Reporter: gcp, Assigned: gcp)

Tracking

Trunk
mozilla43
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox43 fixed)

Details

Attachments

(4 attachments, 10 obsolete attachments)

1.32 MB, application/gzip
Details
16.80 KB, patch
mrbkap
: review+
Details | Diff | Splinter Review
5.33 KB, patch
mrbkap
: review+
Details | Diff | Splinter Review
127.55 KB, patch
mrbkap
: review+
Details | Diff | Splinter Review
Comment hidden (empty)
(Assignee)

Updated

3 years ago
Blocks: 1104615
(Assignee)

Comment 1

3 years ago
Created attachment 8549670 [details] [diff] [review]
camera-proxy

WIP patch showing the approach.
Assignee: nobody → gpascutto
(Assignee)

Comment 2

3 years ago
Created attachment 8553036 [details] [diff] [review]
wip-cameras

This is functional. Going to clean up a bit and check out a shutdown crash.
Attachment #8549670 - Attachment is obsolete: true
(Assignee)

Comment 3

3 years ago
Created attachment 8556496 [details] [diff] [review]
wip-cameras

Seems to be fully working. Going to make a look a cleaning up MediaEngineWebRTC.cpp.
Attachment #8553036 - Attachment is obsolete: true
(Assignee)

Comment 4

3 years ago
Created attachment 8557964 [details] [diff] [review]
Patch 1. v1 Proxy all video provider access to chrome process

This includes support for Screen/Window/Application sharing and cleans up MediaEngineWebRTC a bit. (MediaEngineWebRTCVideo.cpp is probably no longer needed after this)
Attachment #8556496 - Attachment is obsolete: true
Attachment #8557964 - Flags: review?(rjesup)
Attachment #8557964 - Flags: review?(bent.mozilla)
(Assignee)

Comment 6

3 years ago
Note this currently only supports a single engine active at the same time.

What still needs to be added is to make both CamerasParent and CamerasChild keep a hashtable of capEngine+capture_id combinations, create some object implementing ExternalRenderer instances on the CamerasParent side, and adding those to the AddRenderer call in StartCapture. On the CamerasChild side the same hashtable needs to keep the link to the original MediaEngineRemoteVideoSource that wants to receive the frame.
(Assignee)

Comment 7

3 years ago
Created attachment 8558053 [details] [diff] [review]
Patch 2. v1 Add multiple callback support

These lists are tiny so we might as well use arrays. Some kind of RefPtr use would be better though, but I grew tired of fighting with all our STL/boost replacements.
(Assignee)

Comment 8

3 years ago
The original patches here are now quite bitrotted.

Rebased against m-c, including a bunch of changes to MediaEngineWebRTC and constraints.

https://github.com/gcp/gecko-dev/tree/sandbox-av
https://github.com/gcp/gecko-dev/compare/sandbox-av

This also includes a bunch of audio work (that won't compile yet).
(Assignee)

Updated

3 years ago
Blocks: 1136347
Comment on attachment 8557964 [details] [diff] [review]
Patch 1. v1 Proxy all video provider access to chrome process

Review of attachment 8557964 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with a number of nits; you decide if the updates require a re-review (if so, please generate an interdiff given the size of the patch).  
Only one I'm not sure if it's a blocker is the "xxx needs a lock" comment

::: dom/media/systemservices/CamerasChild.cpp
@@ +12,5 @@
> +#include "mozilla/ipc/PBackgroundChild.h"
> +#include "prlog.h"
> +
> +#include "CamerasUtils.h"
> +#include "CamerasChild.h"

include CamerasChild.h first, please.

@@ +14,5 @@
> +
> +#include "CamerasUtils.h"
> +#include "CamerasChild.h"
> +
> +PRLogModuleInfo *gCamerasChildLog;

Can this live inside if defined(PR_LOGGING)?

@@ +27,5 @@
> +#define LOG_ENABLED() (false)
> +#endif
> +
> +namespace mozilla {
> +namespace camera {

Given these are all effectively global functions, couldn't they be static functions in CamerasChild:: instead of global functions in namespace camera?  This is in keeping with the style guide for added points.  (part of the reason for the style guide rule is that unified compilation means that "usings" in c++ code "escape" your file, and you can get collisions even if you have different sub-namespaces.  (I do note that there's no "using mozilla::camera" in this patch, which helps).

@@ +29,5 @@
> +
> +namespace mozilla {
> +namespace camera {
> +
> +static PCamerasChild* sCameras;

Is there any way to avoid leaking this, or do we care?  I note it's not cleared in Shutdown(), or is this handled by IPC? (in which case we should see if we can clear out this ptr when it becomes invalid/dangerous)

@@ +187,5 @@
> +
> +void Shutdown()
> +{
> +  LOG((__PRETTY_FUNCTION__));
> +  if (sCamerasChildThread) {

Are we always called on the thread that created sCameras?  (probably)

@@ +190,5 @@
> +  LOG((__PRETTY_FUNCTION__));
> +  if (sCamerasChildThread) {
> +    nsRefPtr<ShutdownRunnable> runnable = new ShutdownRunnable();
> +    sCamerasChildThread->Dispatch(runnable, NS_DISPATCH_SYNC);
> +    sCamerasChildThread = nullptr;

clear sCameras?

@@ +217,5 @@
> +bool
> +CamerasChild::RecvFrameSizeChange(const int& w, const int& h)
> +{
> +  LOG((__PRETTY_FUNCTION__));
> +  // XXX: Needs a lock

a) needs lock :-) ; b) what else needs to be locked?  RecvDeliverFrame?

::: dom/media/systemservices/CamerasParent.cpp
@@ +4,5 @@
> + * License, v. 2.0. If a copy of the MPL was not distributed with this file,
> + * You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +#include "mozilla/Assertions.h"
> +#include "CamerasParent.h"

CameraParent.h first

@@ +10,5 @@
> +#include "MediaEngine.h"
> +#include "nsThreadUtils.h"
> +#include "prlog.h"
> +
> +PRLogModuleInfo *gCamerasParentLog;

See Child comments

@@ +101,5 @@
> +                                   uint32_t time_stamp,
> +                                   int64_t ntp_time,
> +                                   int64_t render_time)
> +{
> +

nit: remove blank line

@@ +108,5 @@
> +    mShmemInitialized = true;
> +  }
> +
> +  if (!mShmem.IsWritable()) {
> +    LOG(("Video shmem is not writeable in DeliverFrame"));

Is it a hard requirement that the shmem be returned before the camera captures the next frame?  THreading/priorities/etc might cause us to call DeliverFrame twice close together I imagine.  Not a blocker if we think it'll be really, really rare even under load.

@@ +284,5 @@
> +
> +void
> +CamerasParent::CloseActiveEngine()
> +{
> +  if(mPtrViEBase) {

in reverse order of opening (good practice).  Also: name is CloseActiveEngine, but it ignores mEngine and just closes the ViE modules.

::: dom/media/systemservices/CamerasParent.h
@@ +38,5 @@
> +                           int64_t render_time,
> +                           void *handle) MOZ_OVERRIDE;
> +  virtual bool IsTextureSupported() MOZ_OVERRIDE { return false; };
> +
> +  //

missing comment?

@@ +82,5 @@
> +  webrtc::VideoEngine* mAppEngine;
> +
> +  // Need this to avoid unneccesary WebRTC calls while enumerating.
> +  // XX: Nothing actually uses this
> +  bool mCameraEngineInit;

cleanup then?

::: dom/media/systemservices/moz.build
@@ +47,5 @@
>          ]
>      ]
>  
> +if CONFIG['_MSC_VER']:
> +    DEFINES['__PRETTY_FUNCTION__'] = '__FUNCSIG__'

while largely perfunctory, needs a build peer

::: dom/media/webrtc/MediaEngineRemoteVideoSource.cpp
@@ +299,5 @@
> +      mozilla::camera::GetCaptureCapability(mCapEngine,
> +                                            uniqueId.get(),
> +                                            candidateSet[i], cap);
> +      if (!SatisfiesConstraintSet(*aConstraintSets[j], cap)) {
> +        candidateSet.RemoveElementAt(i);

Check this hasn't diverged given Constraints changes recently

@@ +311,5 @@
> +
> +void
> +MediaEngineRemoteVideoSource::ChooseCapability(
> +    const VideoTrackConstraintsN &aConstraints,
> +    const MediaEnginePrefs &aPrefs)

Ditto for ChooseCapability -- plus it would be good not to have multiple copies of the same code!

::: dom/media/webrtc/MediaEngineRemoteVideoSource.h
@@ +4,5 @@
> + * License, v. 2.0. If a copy of the MPL was not distributed with this file,
> + * You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +#ifndef MEDIAENGINEREMOTEVIDEOSOURCE_H_
> +#define MEDIAENGINEREMOTEVIDEOSOURCE_H_

not reabable; either use style guide version or mirror other uses here

::: dom/media/webrtc/MediaEngineWebRTC.h
@@ +271,1 @@
>    Mutex mMutex;

I think it's no longer needed (if it ever was)

::: ipc/glue/moz.build
@@ +131,5 @@
>  
>  LOCAL_INCLUDES += [
>      '/dom/broadcastchannel',
>      '/dom/indexedDB',
> +    '/media/webrtc/trunk',

If this truly needed?
Attachment #8557964 - Flags: review?(rjesup) → review+
Comment on attachment 8557964 [details] [diff] [review]
Patch 1. v1 Proxy all video provider access to chrome process

Ok, we talked about this a bunch today with jesup on irc and I think there's a few medium-ish sized issues to work out. Canceling review for the time being, please needinfo me if you get stuck anywhere (or ping on irc)!
Attachment #8557964 - Flags: review?(bent.mozilla)
(Assignee)

Comment 15

3 years ago
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/gpascutto@mozilla.com-c41735dd44a4/try-linux64-debug/try_ubuntu64_vm-debug_test-mochitest-3-bm121-tests1-linux64-build322.txt.gz

15:47:13     INFO -  -535382272[7f09ee8ada40]: CamerasChild: 7f0a09e3ab00
15:47:13     INFO -  195032832[7f0a0de37c40]: CamerasParent: 7f09f14af120
15:47:13     INFO -  195032832[7f0a0de37c40]: virtual bool mozilla::camera::CamerasParent::RecvNumberOfCaptureDevices(const int&, int*)
...stuff happens...
16:04:12     INFO -  -663738624[7f36d918fa00]: ~CamerasChild: 7f36d44f6aa0
16:04:12     INFO -  -174425664[7f36e6f20140]: Releasing MediaManager backend
16:04:12     INFO -  195032832[7f0a0de37c40]: virtual void mozilla::camera::CamerasParent::ActorDestroy(mozilla::ipc::IProtocolManager<mozilla::ipc::IProtocol>::ActorDestroyReason)
16:04:12     INFO -  [Parent 1937] WARNING: bad Shmem: file /builds/slave/try-l64-d-00000000000000000000/build/src/obj-firefox/ipc/ipdl/PCamerasParent.cpp, line 705
16:04:12     INFO -  195032832[7f0a0de37c40]: ~CamerasParent: 7f09f0696fa0
16:04:12     INFO -  [Parent 1937] WARNING: bad Shmem: file /builds/slave/try-l64-d-00000000000000000000/build/src/obj-firefox/ipc/ipdl/PCamerasParent.cpp, line 705
16:04:12     INFO -  -174425664[7f36e6f20140]: Releasing MediaManager singleton and thread
16:04:15     INFO -  2319 INFO TEST-OK | dom/media/tests/mochitest/test_dataChannel_basicAudio.html | took 3341ms

Cool.

...stuff happens...
16:04:16     INFO -  -535382272[7f09ee8ada40]: Selected audio device
16:04:16     INFO -  -535382272[7f09ee8ada40]: Audio device 0 allocated
16:04:16     INFO -  195032832[7f0a0de37c40]: virtual bool mozilla::camera::CamerasParent::RecvNumberOfCaptureDevices(const int&, int*)
16:04:16     INFO -  195032832[7f0a0de37c40]: bool mozilla::camera::CamerasParent::EnsureInitialized(int)

This gets called without the CamerasParent constructor call being in the log. It's actually being called without CamerasChild being there, too. Looks like the code that spins up and tears down the Cameras() thing in CamerasChild is broken.
(Assignee)

Comment 16

3 years ago
The problem appears to be that during these tests the CamerasChild is created in two different processes, which breaks because it's currently supposed to be a singleton.
(Assignee)

Comment 17

3 years ago
I understand from past discussion this would be expected in e10s mode because for example Loop is in chrome yet uses WebRTC, but the problem is occurring in non-e10s try and e10s is passing.
(Assignee)

Comment 18

3 years ago
Some of the logging is extremely confusing because the timestamps for some of the processes are wrong and interleaved, which makes it hard to see causally what's going on. I think the above wrt multiple processes or Cameras instances may be wrong. I've added an atomic counter and asserts to make sure it is.

I've found since that when only running the test_dataChannel_basicVideo.html succeeds on try. So the problem is that one test leaves things in a state that is broken for following ones.

Specifically, a passing try run looks like:
14:40:38     INFO -  256 INFO Analyzing element: pcLocal_local1_video
14:40:38     INFO -  257 INFO Analyzing element: pcLocal_remote2_video
14:40:38     INFO -  1976088384[7fde74661260]: 7fde3e7fe100(7fde3dafaa80): OnChannelConnected - Dispatching
14:40:38     INFO -  258 INFO TEST-PASS | dom/media/tests/mochitest/test_dataChannel_basicVideo.html | data channel is 'open' after 'onopen'
14:40:38     INFO -  259 INFO canplaythrough fired for media element pcRemote_remote2_video
14:40:38     INFO -  260 INFO canplaythrough fired for media element pcLocal_remote2_video
14:40:38     INFO -  261 INFO TEST-PASS | dom/media/tests/mochitest/test_dataChannel_basicVideo.html | Media flowing for pcLocal_local1_video
14:40:38     INFO -  262 INFO TEST-PASS | dom/media/tests/mochitest/test_dataChannel_basicVideo.html | Media flowing for pcLocal_remote2_video


And a failure looks like:
12:21:54     INFO -  2380 INFO Run step 44: PC_LOCAL_CHECK_MEDIA_FLOW_PRESENT
12:21:54     INFO -  2381 INFO Analyzing element: pcLocal_local1_video
12:21:54     INFO -  2382 INFO Analyzing element: pcLocal_remote2_video
12:21:54     INFO -  2383 INFO TEST-PASS | dom/media/tests/mochitest/test_dataChannel_basicVideo.html | data channel is 'open' after 'onopen'
12:21:54     INFO -  2384 INFO TEST-UNEXPECTED-FAIL | dom/media/tests/mochitest/test_dataChannel_basicVideo.html | Test timed out. - expected PASS

Inserting more logging in those places may be instructive.
(Assignee)

Comment 19

3 years ago
I got a useful log that indicates that one CamerasParent instance is being created while another one is active. It's not clear to me how that can happen. It would be expected if we were in e10s with multiple content processess - but we aren't. Maybe it's due to a CamerasChild being created on the chrome side. Maybe it's due to a shutdown-restart cycle of the subsystem and the deletion of the object in the parent being slow.

In any case moving the pointers of the "this is the parent object" from a global singleton to sitting in the callbacks themselves makes try happy. But it would be useful to understand what is actually going on.
(Assignee)

Comment 25

3 years ago
Created attachment 8623655 [details] [diff] [review]
Patch 1. v2 Proxy all camera/screen access to the main process

Major rework:

1) Move all the access to globals in the child to a singleton object to clean up the code and enforce proper locking.
2) Remove all use of globals in the parent, it's all instance members now. (This should allow multiple content processes)
3) Add a dedicated IPC thread to the child. (This should also ensure we can safely synchronously initialize PBackground in that thread)
4) Add a dedicated WebRTC access thread to the parent. This moves all camera access etc away from the main thread. On Windows this is an UI thread, so move that to Chrome-style threads.
5) Make all the IPC asynchronous. The sync API to webrtc API consumers is maintained by sending reply messages and blocking until their reception in the child. This is possible because (3) moved the receiving of those to another thread.
6) Fix or work around all the dragons in shutdown/reinit.

Outstanding issues:

1) The thread ping-pong caused by dispatching incoming requests to the webrtc thread and then re-dispatching the reply to the IPC thread makes it a pain to extract some error conditions. Currently, we ignore whether IPC calls succeed or fail, both because handling them would complicate things more and because there's not much to recover if child and parent can't communicate. See all the unsued << SendBlah stuff. In general this would benefit from followup work for a nicer way to deal with the chained thread dispatches that currently lead to continuation-passing-style programming. I know we've recently started adding MediaPromises, so maybe using that or something similar for follow-up work could be nice.

2) There's an extra frame copy happing because the WebRTC DeliverFrame callback is sync, but we don't copy the frame data into a Shmem until we're back on the IPC thread, meaning there's an indeterminate time inbetween...a time long enough so that another DeliverFrame could potentially come in, so we need to keep a buffer of what we already got. I think we need some infra with multiple Shmem buffers to properly handle this. That will likely be useful for audio as well, so IMHO it should go in a followup bug as this one is already complicated enough.
Attachment #8557964 - Attachment is obsolete: true
Attachment #8558053 - Attachment is obsolete: true
Attachment #8623655 - Flags: review?(rjesup)
Attachment #8623655 - Flags: review?(bent.mozilla)
(Assignee)

Comment 26

3 years ago
Comment on attachment 8623655 [details] [diff] [review]
Patch 1. v2 Proxy all camera/screen access to the main process

Review of attachment 8623655 [details] [diff] [review]:
-----------------------------------------------------------------

::: build/automation.py.in
@@ +506,4 @@
>      env['XRE_NO_WINDOWS_CRASH_DIALOG'] = '1'
>  
>      # Set WebRTC logging in case it is not set yet
> +    env.setdefault('NSPR_LOG_MODULES', 'signaling:3,mtransport:4,datachannel:4,jsep:4,MediaPipelineFactory:4,MediaManager:4,CamerasParent:4,CamerasChild:4')

I'll remove these, debugging leftovers.

::: build/automationutils.py
@@ +347,4 @@
>    env.setdefault('MOZ_DISABLE_NONLOCAL_CONNECTIONS', '1')
>  
>    # Set WebRTC logging in case it is not set yet
> +  env.setdefault('NSPR_LOG_MODULES', 'signaling:3,mtransport:4,datachannel:4,jsep:4,MediaPipelineFactory:4,MediaManager:4,CamerasParent:4,CamerasChild:4')

Same.

::: testing/mozbase/mozrunner/mozrunner/base/device.py
@@ +25,4 @@
>              'MOZ_CRASHREPORTER_NO_REPORT': '1',
>              'MOZ_CRASHREPORTER_SHUTDOWN': '1',
>              'MOZ_HIDE_RESULTS_TABLE': '1',
> +            'NSPR_LOG_MODULES': 'signaling:5,mtransport:5,datachannel:5,jsep:5,MediaPipelineFactory:5,MediaManager:5,CamerasParent:5,CamerasChild:5',

Same.
Comment on attachment 8623655 [details] [diff] [review]
Patch 1. v2 Proxy all camera/screen access to the main process

Review of attachment 8623655 [details] [diff] [review]:
-----------------------------------------------------------------

BTW, I'm working on this review (it's not a simple one)
Comment on attachment 8623655 [details] [diff] [review]
Patch 1. v2 Proxy all camera/screen access to the main process

Review of attachment 8623655 [details] [diff] [review]:
-----------------------------------------------------------------

Initial partial review

::: dom/media/systemservices/CamerasChild.cpp
@@ +16,5 @@
> +#include "mozilla/Mutex.h"
> +#include "mozilla/SyncRunnable.h"
> +#include "MediaUtils.h"
> +#include "nsThreadUtils.h"
> +#include "prlog.h"

mozilla/Logging.h now

@@ +19,5 @@
> +#include "nsThreadUtils.h"
> +#include "prlog.h"
> +
> +#undef LOG
> +#undef LOG_ENABLED

are these needed?

@@ +41,5 @@
> +  CamerasSingleton()
> +    : mCamerasMutex("CamerasSingleton::mCamerasMutex"),
> +      mCameras(nullptr),
> +      mCamerasChildThread(nullptr) {
> +#if defined(PR_LOGGING)

remove PR_LOGGING

@@ +43,5 @@
> +      mCameras(nullptr),
> +      mCamerasChildThread(nullptr) {
> +#if defined(PR_LOGGING)
> +    if (!gCamerasChildLog)
> +      gCamerasChildLog = PR_NewLogModule("CamerasChild");

braces

@@ +54,5 @@
> +    LOG(("~CamerasSingleton: %p", this));
> +  }
> +
> +  static CamerasSingleton& getInstance() {
> +    static CamerasSingleton instance;

GetInstance() perhaps.  Ditto other methods.  lowerUpper() is JS-style ;-)

Does this imply a static constructor/destructor invocation?
ClearOnShutdown() with a static UniquePtr perhaps, and new here on the first usage?  Also: what's the thread safety of the CamerasSingleton?  Is getInstance referenced from multiple threads?  The mutex stuff below implies it is, so that complicates things and might add locking to each getInstance(), which would be a pain/overhead if it's done constantly instead of cached.

At least comment as to the lifetime, and the locking/threading model.

@@ +78,5 @@
> +  // We don't want this to happen in the middle of preparing IPC.
> +  // We will be alive on destruction, so this needs to be off the books.
> +  mozilla::OffTheBooksMutex mCamerasMutex;
> +
> +  CamerasChild *mCameras;

Not at least a UniquePtr<>?  Or is this alive too late for that due to the static instance?  Or (more likely) it gets deleted indirectly via the IPC code, and we have the risk of a dangling raw ptr here.  Comment as to lifetime at minimum.

@@ +85,5 @@
> +
> +class InitializeIPCThread : public nsRunnable
> +{
> +public:
> +  explicit InitializeIPCThread()

no explicit, that's only needed for 1-arg constructors (normally)

@@ +105,5 @@
> +    MOZ_RELEASE_ASSERT(existingBackgroundChild);
> +
> +    // Create CamerasChild
> +    mCamerasChild =
> +      static_cast<CamerasChild*>(existingBackgroundChild->SendPCamerasConstructor());

Comment that this is an indirect return value via GetCamerasChild on an outstanding reference to this and the use of SyncRunnable to wrap it, since that's not a normal pattern.

@@ +119,5 @@
> +private:
> +  CamerasChild* mCamerasChild;
> +};
> +
> +static CamerasChild* Cameras(bool trace) {

Do we need trace, instead of keying it off say ::Verbose logging?

@@ +121,5 @@
> +};
> +
> +static CamerasChild* Cameras(bool trace) {
> +  if (!gCamerasChildLog)
> +    gCamerasChildLog = PR_NewLogModule("CamerasChild");

braces

@@ +127,5 @@
> +  OffTheBooksMutexAutoLock lock(CamerasSingleton::getMutex());
> +  if (!CamerasSingleton::getChild()) {
> +    LOG(("No sCameras, setting up"));
> +    MOZ_ASSERT(!CamerasSingleton::getThread());
> +    LOG(("Spinning up IPC Thread"));

combine with previous log

@@ +129,5 @@
> +    LOG(("No sCameras, setting up"));
> +    MOZ_ASSERT(!CamerasSingleton::getThread());
> +    LOG(("Spinning up IPC Thread"));
> +    nsresult rv = NS_NewNamedThread("Cameras IPC",
> +      getter_AddRefs(CamerasSingleton::getThread()));

I find this a bit odd (getter_AddRefs(getFoo()) as an output), but ok

@@ +140,5 @@
> +    // 1) Creation of PBackground on the thread created earlier
> +    // 2) Creation of PCameras by sending a message to the parent on that thread
> +    nsRefPtr<InitializeIPCThread> runnable = new InitializeIPCThread();
> +    nsRefPtr<SyncRunnable> sr = new SyncRunnable(runnable);
> +    sr->DispatchToThread(CamerasSingleton::getThread());

What thread does this occur on?  I hope it's not MainThread, since we're a) doing this Dispatch under a mutex lock (deadlock risk), and b) blocking the sending thread *and* not processing the event loop for the thread.

SyncRunnable is mostly intended for non-nsThread sources that don't have an event loop and don't want to be silently converted to an nsThread (and perhaps then leak the response).

NrUdpSocketIPC::create_i() (and the UDPSocketChild code it calls) does with this by a) not holding the mutex, and b) doing this stuff on a newly-created thread, and using a monitor to signal completion.  That is also not wonderful; it's blocking STS thread waiting for the IPC channel to open on the new thread.  Better would be to block additional create() calls on STS and make create_i() return runnable (the same one likely) to STS to finish the create() (or multiple create()s).  (Ignoring that it doesn't create separate runnables currently and uses WrapRunnable; that could change, or use another WrapRunnable in return.)

So my suggestion: 
* Dispatch the runnable as NS_DISPATCH_NORMAL
* In Run(), have it send itself back to the original thread
* on the original thread in Run(), install the CameraChild and unblock whatever was waiting on Cameras().  However, this would require some redesign to make Cameras() async, or to guarantee we called CreateCameras() at some natural blocking point before we might need to call Cameras().

Also: is there any way to shut down the Camera thread/etc when unused?  It doesn't appear so... We've been planning to shut down MediaManager's thread (and close related things like engines) when we're idle for more than a very short while (at least on mobile, likely on desktop too).  Even if we can't shutdown the thread, we could close out resources.

@@ +188,5 @@
> +  return true;
> +}
> +
> +int
> +CamerasChild::NumberOfCapabilities(CaptureEngine aCapEngine,

Identify where this gets called (thread)

@@ +192,5 @@
> +CamerasChild::NumberOfCapabilities(CaptureEngine aCapEngine,
> +                                   const char* deviceUniqueIdUTF8)
> +{
> +  LOG((__PRETTY_FUNCTION__));
> +  LOG(("NumberOfCapabilities for %s", deviceUniqueIdUTF8));

combine

@@ +227,5 @@
> +  return Cameras(true)->NumberOfCaptureDevices(aCapEngine);
> +}
> +
> +int
> +CamerasChild::NumberOfCaptureDevices(CaptureEngine aCapEngine)

Merge with version with UniqueID (virtually identical)

@@ +279,5 @@
> +
> +int
> +CamerasChild::GetCaptureCapability(CaptureEngine aCapEngine,
> +                                   const char* unique_idUTF8,
> +                                   const unsigned int capability_number,

Note: we have an open bug that capability_number is a horrid index for grabbing the right device, so this will need to change eventually.  But not now

@@ +320,5 @@
> +  mReplyCapability.maxFPS = ipcCapability.maxFPS();
> +  mReplyCapability.expectedCaptureDelay = ipcCapability.expectedCaptureDelay();
> +  mReplyCapability.rawType = static_cast<webrtc::RawVideoType>(ipcCapability.rawType());
> +  mReplyCapability.codecType = static_cast<webrtc::VideoCodecType>(ipcCapability.codecType());
> +  mReplyCapability.interlaced = ipcCapability.interlaced();

It'd be nice if we didn't need all this boilerplate, but oh well.

@@ +350,5 @@
> +{
> +  LOG((__PRETTY_FUNCTION__));
> +  MonitorAutoLock monitor(mReplyMonitor);
> +  nsRefPtr<nsIRunnable> runnable =
> +    media::NewRunnableFrom([this, aCapEngine, list_number]() -> nsresult {

ditto about the index/number

@@ +474,5 @@
> +  monitor.Wait();
> +  if (mReplySuccess) {
> +    return 0;
> +  } else {
> +    return -1;

arguable nit: I'd write return mReplySuccess ? 0 : -1;, but it's all equivalent  If you leave it, remove the 'else'.  Repeat for other instances of this pattern

@@ +515,5 @@
> +                                     cb);
> +}
> +
> +int
> +CamerasChild::StartCapture(CaptureEngine aCapEngine,

There are many instances of almost the same Blah() wrapping a NewRunnableFrom() for SendBlah() and returning the result after waiting on the monitor; I wonder if this can be made a template or otherwise factored.  But it's ok as-is

@@ +603,5 @@
> +
> +class ShutdownRunnable : public nsRunnable {
> +public:
> +  ShutdownRunnable(nsRefPtr<nsIRunnable> aReplyEvent,
> +                   nsRefPtr<nsIThread> aReplyThread)

nsCOMPtrs

@@ +618,5 @@
> +  }
> +
> +private:
> +  nsRefPtr<nsIRunnable> mReplyEvent;
> +  nsRefPtr<nsIThread> mReplyThread;

These should be nsCOMPtrs...

@@ +645,5 @@
> +    LOG(("Shutdown called without PBackground thread"));
> +  }
> +  LOG(("Erasing sCameras & thread refs (original thread)"));
> +  CamerasSingleton::getChild() = nullptr;
> +  CamerasSingleton::getThread() = nullptr;

I still find the use of getFoo() being an assignable value confusing (because of 'get').  Maybe Child() and Thread()?

@@ +704,5 @@
> +    mIPCIsAlive(true),
> +    mReplyMonitor("mozilla::cameras::CamerasChild::mReplyMonitor")
> +{
> +  if (!gCamerasChildLog)
> +    gCamerasChildLog = PR_NewLogModule("CamerasChild");

braces

@@ +724,5 @@
> +webrtc::ExternalRenderer* CamerasChild::Callback(CaptureEngine aCapEngine,
> +                                                 int capture_id)
> +{
> +  for (unsigned int i = 0; i < mCallbacks.Length(); i++) {
> +    CapturerElement ce = mCallbacks[i];

c++11's looping (for (auto& ce : mCallbacks)) might be nice here; just a nit

::: dom/media/systemservices/CamerasChild.h
@@ +44,5 @@
> +  int id;
> +  webrtc::ExternalRenderer* callback;
> +};
> +
> +// statically mirror webrtc.org API

Comment which class(es) we're mirroring for help in maintaining
(Assignee)

Updated

3 years ago
Blocks: 1177242
Comment on attachment 8623655 [details] [diff] [review]
Patch 1. v2 Proxy all camera/screen access to the main process

Review of attachment 8623655 [details] [diff] [review]:
-----------------------------------------------------------------

A bunch of nits, but looks close.  (gcp and I talked over a bunch of the comments in the previous partial review live as well).

::: dom/media/systemservices/CamerasParent.cpp
@@ +18,5 @@
> +#undef LOG_ENABLED
> +
> +PRLogModuleInfo *gCamerasParentLog;
> +#define LOG(args) MOZ_LOG(gCamerasParentLog, mozilla::LogLevel::Debug, args)
> +#define LOG_ENABLED() MOZ_LOG_TEST(gCamerasParentLog, 5)

is 5 correct?

@@ +88,5 @@
> +    mBuffer = (unsigned char*)malloc(size);
> +    // There is an extra copy here. We can't copy into Shmem yet
> +    // because we have no idea if the next frame will arrive before
> +    // us copying the data into the Shmem will have finished. We need
> +    // multiple Shmem to avoid this.

Can we file a bug to get rid of this extra copy?  video-frame copies are bad...  I'd prefer to resolve this before enabling this.

Do we just need to double-buffer the SHmems?  I'm also fine if we do this only when in an odd case, and normally we avoid it (if so, we should have some sort of log message when the odd case happens, or some other way to know).

@@ +93,5 @@
> +    memcpy(mBuffer, buffer, size);
> +  };
> +
> +  virtual ~DeliverFrameRunnable() {
> +    free(mBuffer);

remove (use UniquePtr)

@@ +120,5 @@
> +private:
> +  CamerasParent *mParent;
> +  CaptureEngine mCapEngine;
> +  int mCapId;
> +  unsigned char* mBuffer;

UniquePtr

@@ +278,5 @@
> +    // Stop the callers
> +    while (mCallbacks.Length()) {
> +      auto capEngine = mCallbacks[0]->mCapEngine;
> +      auto capNum = mCallbacks[0]->mCapturerId;
> +      LOG(("Forcing shutdown of engne %d, capturer %d", capEngine, capNum));

engine

@@ +282,5 @@
> +      LOG(("Forcing shutdown of engne %d, capturer %d", capEngine, capNum));
> +      {
> +        MutexAutoUnlock unlock(mCallbackMutex);
> +        RecvStopCapture(capEngine, capNum);
> +        RecvReleaseCaptureDevice(capEngine, capNum);

We're going to continue the loop with the mutex relocked after this - can the mCallbacks list change while we had this unlocked?

@@ +329,5 @@
> +    return false;
> +  }
> +
> +  nsRefPtr<nsIRunnable> webrtc_runnable =
> +    media::NewRunnableFrom([=]() -> nsresult {

a few lines of comments here (or before this function) about what this following block does would be good, since this is the first of a bunch of similar functions using nested NewRunnableFrom()s.

Also comment on how duplicated code could be reduced by using function templates or other things, and perhaps file a low-priority enhancement bug request

@@ +417,5 @@
> +                                   webrtcCaps.rawType,
> +                                   webrtcCaps.codecType,
> +                                   webrtcCaps.interlaced);
> +          LOG(("Capability: %d %d %d %d %d %d",
> +               webrtcCaps.width,

are all of these ints?

@@ +701,5 @@
> +    mWebRTCThread(nullptr),
> +    mChildIsAlive(true)
> +{
> +  if (!gCamerasParentLog)
> +    gCamerasParentLog = PR_NewLogModule("CamerasParent");

braces

::: dom/media/systemservices/CamerasParent.h
@@ +116,5 @@
> +  bool mShmemInitialized;
> +  mozilla::ipc::Shmem mShmem;
> +
> +  // PBackground parent thread
> +  nsIThread* mPBackgroundThread;

Does this need to be a raw ptr?  Otherwise nsCOMPtr<nsIThread>

@@ +119,5 @@
> +  // PBackground parent thread
> +  nsIThread* mPBackgroundThread;
> +
> +  // webrtc processing thread
> +  base::Thread* mWebRTCThread;

webrtc isn't really correct.  MediaManagerThread, or VideoCaptureThread

@@ +122,5 @@
> +  // webrtc processing thread
> +  base::Thread* mWebRTCThread;
> +
> +  // Shutdown handling
> +  bool mChildIsAlive;

Maybe DISABLE_COPY_AND_ASSIGN(CamerasParent)?

::: dom/media/systemservices/CamerasUtils.cpp
@@ +31,5 @@
> +  NS_DECL_ISUPPORTS
> +
> +private:
> +  ~WorkerBackgroundChildCallback()
> +  { }

~WorkerBackgroundChildCallback() {}

::: dom/media/systemservices/CamerasUtils.h
@@ +32,5 @@
> +  }
> +
> +private:
> +  ~ThreadDestructor() {}
> +  nsCOMPtr<nsIThread> mThread;

For classes that we have no belief should be copy/assigned we should DISABLE_COPY_AND_ASSIGN.  Minor point for general style

@@ +47,5 @@
> +  }
> +
> +private:
> +  ~RunnableTask() {}
> +  nsRefPtr<nsIRunnable> mRunnable;

nsCOMPtr

::: dom/media/webrtc/MediaEngineRemoteVideoSource.cpp
@@ +39,5 @@
> +  if (mozilla::camera::GetCaptureDevice(mCapEngine,
> +                                        mCaptureIndex,
> +                                        deviceName, kMaxDeviceNameLength,
> +                                        uniqueId, kMaxUniqueIdLength)) {
> +    return;

LOG?

::: dom/media/webrtc/MediaEngineRemoteVideoSource.h
@@ +28,5 @@
> +#include "MediaStreamGraph.h"
> +
> +#include "MediaEngineWrapper.h"
> +#include "mozilla/dom/MediaStreamTrackBinding.h"
> +// WebRTC library includes follow

add a blank line before this (super-nit)

@@ +59,5 @@
> +                           void *handle) override;
> +  virtual bool IsTextureSupported() override { return false; };
> +
> +  //
> +  MediaEngineRemoteVideoSource(int aIndex, mozilla::camera::CaptureEngine aCapEngine,

Perhaps fill in the comment?

::: dom/media/webrtc/MediaEngineWebRTC.h
@@ +182,4 @@
>  
>    nsCOMPtr<nsIThread> mThread;
>  
> +  //XXX: what is this really locking?

Need to resolve this.  perhaps it's not locking anything anymore after your changes

::: dom/media/webrtc/moz.build
@@ -21,2 @@
>  
>  if CONFIG['MOZ_WEBRTC']:

moz.build needs build peer review
Attachment #8623655 - Flags: review?(rjesup) → review-
Comment on attachment 8623655 [details] [diff] [review]
Patch 1. v2 Proxy all camera/screen access to the main process

Review of attachment 8623655 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/systemservices/PCameras.ipdl
@@ +18,5 @@
> +  int codecType;
> +  bool interlaced;
> +};
> +
> +async protocol PCameras

I think this protocol would be very well suited to use the "request pattern" that we have been using in lots of other components. Basically any request/response pair get packaged as an actor. This way you guarantee 1 request gets one response, order is maintained, and you don't have to pass extra "ids" around manually. So, as an example:

  struct NumberOfCaptureDevicesRequestParams
  { };

  struct NumberOfCapabilitiesRequestParams
  { };

  struct CaptureCapabilityRequestParams
  {
    nsCString unique_idUTF8;
    int capability_number;
  };

  union RequestParams
  {
    NumberOfCaptureDevicesRequestParams;
    NumberOfCapabilitiesRequestParams;
    CaptureCapabilityRequestParams;
  };

  protocol PCameras
  {
    manages PCamerasRequest;

    // other stuff

  parent:
    PCamerasRequest(int engine, RequestParams params);
  };

  union ResponseParams
  {
    // For NumberOfCaptureDevices, NumberOfCapabilities, etc.
    int;

    // For CaptureCapability.
    CaptureCapability;
  };

  protocol PCamerasRequest
  {
    manager PCameras;

  child:
    __delete__(ResponseParams params);
  };

@@ +37,5 @@
> +  async ReplyFailure();
> +  async ReplySuccess();
> +
> +parent:
> +  async NumberOfCaptureDevices(int engine);

Most of these requests include this "engine" argument... I wonder if we shouldn't make PBackground manage multiple PCameraEngine subprotocols rather than one PCameras subprotocol? Then you wouldn't have to pass these ids all over the place. Does that make sense?

@@ +50,5 @@
> +  async StopCapture(int engine, int numdev);
> +  // transfers frame back
> +  async ReleaseFrame(Shmem s);
> +
> +  async __delete__();

One thing that's tough to get right with these protocols is when to send the "__delete__" message. We can't tolerate this message racing with another message in the opposite direction (e.g. parent sends "DeliverFrame" at the same time child sends "__delete__"), so we normally have a two-step destruction sequence. This usually looks like:

  protocol PCameras
  {
    // ...
  parent:
    AllDone();

  child:
    __delete__();
  };

Then instead of the child deleting itself it waits for the parent to do it. Any messages that were already sent by the parent get processed or dropped, as appropriate.
Comment on attachment 8623655 [details] [diff] [review]
Patch 1. v2 Proxy all camera/screen access to the main process

Sorry, I won't be able to help further with this review, but the main protocol-level comments I have are already recorded here.

I'd suggest following up with mrbkap or billm for any other IPDL questions.
Attachment #8623655 - Flags: review?(bent.mozilla)
(Assignee)

Comment 36

3 years ago
Created attachment 8638064 [details] [diff] [review]
Sandboxing support for Video camera access

Addressed review comments from jesup. Added comments where appropriate.
I extracted some code that was generic into functions, but in general
it's pretty hard to do as every call has small differences that must
be dealth with.
This patch adds a ShmemBufferPool that deals with having multiple frames
in flight while still avoiding extra copies. It also deals with frame size
changes by (if necessary) temporarily doing the extra copy once and then
making sure the pool has the right size on the next frame.

Try found a subtle race condition with multiple streams, which is fixed, 
but necessitated an additional mutex.

I incorporated bent's suggestion for race-free delete. The PCamerasEngine idea                    
doesn't really make sense to me as the calls/protocols are all 100% identical 
except for the engine enum. I feel the same about the ""request pattern"
suggestion. Is this a suggestion for the next time we write something like this, 
or for a potential future improvement, or a hard requirement for r+? 
It looks like it requires rewriting most code here in which case I'd loved to 
have heard this comment on the initial review 4 months ago...
Attachment #8638064 - Flags: review?(rjesup)
Attachment #8638064 - Flags: review?(mrbkap)
(Assignee)

Updated

3 years ago
Attachment #8623655 - Attachment is obsolete: true
Comment on attachment 8638064 [details] [diff] [review]
Sandboxing support for Video camera access

Review of attachment 8638064 [details] [diff] [review]:
-----------------------------------------------------------------

Almost r+ (the ShmemPool(MaxEngine) is wrong I think), *but* since this moves ALL video capture to via-Shmem & IPC (not just when sandboxed), before it lands I'd like to see the deltas (for windows, mac, linux, most-recent-Android) in:
a) CPU use (master and content processes)
b) Latency (i.e. capture time until inserted into content graph with AppendToTrack (measured to right before that call).
c) Memory use (while capturing, and after capturing ends to ensure we've cleaned up everything).  If possible also direct measurement via os tools of Shmem usage.
d) (bonus points) Shmem pool fullness levels during use (avg/max).

I don't expect no added CPU/delay/etc; inherently this puts more layers/process-switches between capture and usage and uses more buffers; but we need to know these numbers to commit to this.  We may want/need to tweak these (for example maybe the number of Shmem buffers), if not for initial landing then before it uplifts.

::: dom/media/systemservices/CamerasParent.cpp
@@ +206,5 @@
> +    new DeliverFrameRunnable(mParent, mCapEngine, mCapturerId,
> +                             Move(shMemBuffer), buffer, size, time_stamp,
> +                             ntp_time, render_time);
> +  MOZ_ASSERT(mParent);
> +  nsIThread * thread = mParent->GetBackgroundThread();

whitespace nit

@@ +731,5 @@
> +}
> +
> +CamerasParent::CamerasParent()
> +  : mCallbackMutex("CamerasParent.mCallbackMutex"),
> +    mShmemPool(CaptureEngine::MaxEngine),

One shmem buffer per engine???  Why is this the value chosen?

::: dom/media/systemservices/CamerasParent.h
@@ +124,5 @@
> +  // PBackground parent thread
> +  nsCOMPtr<nsIThread> mPBackgroundThread;
> +
> +  // webrtc processing thread
> +  base::Thread* mVideoCaptureThread;

Update comment?

::: dom/media/systemservices/ShmemPool.cpp
@@ +89,5 @@
> +void ShmemPool::Put(ShmemBuffer&& aShmem)
> +{
> +  MutexAutoLock lock(mMutex);
> +  MOZ_ASSERT(mPoolFree < mShmemPool.Length());
> +  mShmemPool[mPoolFree] = Move(aShmem);

it's very tempting to use a model where we're elastic and will allocate above the limit, but will always free back down to the limit when frames come back (to here).  This means that temporary glitches won't drop frames, though using a too-small Shmem setting will result in constant reallocation of Shmems, which is expensive.  However, this is not mandatory.

::: dom/media/webrtc/MediaEngineRemoteVideoSource.cpp
@@ +322,5 @@
> +  switch(mMediaSource) {
> +  case dom::MediaSourceEnum::Camera:
> +#ifdef XP_MACOSX
> +    // Mac doesn't support capabilities.
> +    //

Please check this code against the current MACOSX ifdefed code to ensure the settings haven't changed.  a "blink test" doesn't show any differences at a practical level.
Comment on attachment 8638064 [details] [diff] [review]
Sandboxing support for Video camera access

clearing review pending answers
Attachment #8638064 - Flags: review?(rjesup)
(Assignee)

Comment 39

3 years ago
(In reply to Randell Jesup [:jesup] from comment #37)

> c) Memory use (while capturing, and after capturing ends to ensure we've
> cleaned up everything).  If possible also direct measurement via os tools of
> Shmem usage.

We won't clean up much of anything until MediaManager gets cleaned up.

> > +CamerasParent::CamerasParent()
> > +  : mCallbackMutex("CamerasParent.mCallbackMutex"),
> > +    mShmemPool(CaptureEngine::MaxEngine),
> 
> One shmem buffer per engine???  Why is this the value chosen?

The value here is really arbitrary, but having at least one Shmem buffer per engine allows us to capture from all available sources at a given fps without running out of buffers if all the callbacks arrive near-simultaneously.

> > +void ShmemPool::Put(ShmemBuffer&& aShmem)
> > +{
> > +  MutexAutoLock lock(mMutex);
> > +  MOZ_ASSERT(mPoolFree < mShmemPool.Length());
> > +  mShmemPool[mPoolFree] = Move(aShmem);
> 
> it's very tempting to use a model where we're elastic and will allocate
> above the limit, but will always free back down to the limit when frames
> come back (to here).  This means that temporary glitches won't drop frames,
> though using a too-small Shmem setting will result in constant reallocation
> of Shmems, which is expensive.  However, this is not mandatory.

We can do a follow-up to detect if there are (for example) 100 gets without going over the current available buffers - 1, and then free the last one up. That should give us the best of both worlds.

> Please check this code against the current MACOSX ifdefed code to ensure the
> settings haven't changed.  a "blink test" doesn't show any differences at a
> practical level.

IIRC I rebased this quite intensively but I'll double check to be sure.
(Assignee)

Comment 40

3 years ago
Whoops, looks like I overwrote part of my reply.

>The value here is really arbitrary, but having at least one Shmem buffer per engine allows us to 
>capture from all available sources at a given fps without running out of buffers if all the callbacks 
>arrive near-simultaneously.

Note that these buffers are just *holders* for Shmem and won't actually contain or allocate any memory until they are actually given out (i.e. we run out of them) and we know the buffer size we should allocate.
(Assignee)

Comment 41

3 years ago
...and note it works like a stack (ShmemStack sounds kinda weird tho) so if you only ever use 1 or 2 buffers simultaneously that's also the max amount to get allocated.
(In reply to Gian-Carlo Pascutto [:gcp] from comment #39)
> (In reply to Randell Jesup [:jesup] from comment #37)
> 
> > c) Memory use (while capturing, and after capturing ends to ensure we've
> > cleaned up everything).  If possible also direct measurement via os tools of
> > Shmem usage.
> 
> We won't clean up much of anything until MediaManager gets cleaned up.

We could drop Shmem buffers when there are no active captures.  (See below)

> > > +CamerasParent::CamerasParent()
> > > +  : mCallbackMutex("CamerasParent.mCallbackMutex"),
> > > +    mShmemPool(CaptureEngine::MaxEngine),
> > 
> > One shmem buffer per engine???  Why is this the value chosen?
> 
> The value here is really arbitrary, but having at least one Shmem buffer per
> engine allows us to capture from all available sources at a given fps
> without running out of buffers if all the callbacks arrive
> near-simultaneously.

(And as you note these are only holders for Shmems, and not filled until they first time N are needed at once, so it's high-water-mark in terms of memory use.)

So it appears that we could have this be larger (or allow it to just grow as needed), and then contract if we're staying lower.

One big thing I noticed going over this again:

If we capture some 3K screen capture frames, all frames in the list may be 3K frames (likely all will end up 3K frames even if there are other streams being captured at say 320x240).  If we then close that capture, and go call someone in Hello at 640x480, we're wasting a LOT of shmem.  So, perhaps we need to insert-sort the freelist, and have Get() search from the small end of the list.  a small amount of overhead (since these lists will be short, or should be) to save memory.  Also, release buffers when all captures close.

Question: do streams ever need two buffers per stream, perhaps if things are 'busy'?  Is there any reason not to size the array larger (2x engines, or larger arbitrary number (100?)?  The array objects themselves should be fairly small, so that part of the overhead should be small.

> 
> > > +void ShmemPool::Put(ShmemBuffer&& aShmem)
> > > +{
> > > +  MutexAutoLock lock(mMutex);
> > > +  MOZ_ASSERT(mPoolFree < mShmemPool.Length());
> > > +  mShmemPool[mPoolFree] = Move(aShmem);
> > 
> > it's very tempting to use a model where we're elastic and will allocate
> > above the limit, but will always free back down to the limit when frames
> > come back (to here).  This means that temporary glitches won't drop frames,
> > though using a too-small Shmem setting will result in constant reallocation
> > of Shmems, which is expensive.  However, this is not mandatory.
> 
> We can do a follow-up to detect if there are (for example) 100 gets without
> going over the current available buffers - 1, and then free the last one up.
> That should give us the best of both worlds.

Yes, that might be good or in combination with the above.
(Assignee)

Comment 43

3 years ago
>Question: do streams ever need two buffers per stream, perhaps if things are 'busy'?

In theory I guess this is possible if some part of the browser starts lagging. I practice I see <= 1 buffers per stream (you can do 2 streams with one buffer if you're not unlucky with callback timing).

You can make it bigger but I'm not sure there's really a point. I mean if we're not getting the frames out in real-time and they start buffering up, I'm not sure buffering much more is actually going to make for a better experience.

Tweaking the pool size or making it dynamic is IMHO 100% premature optimization.

>If we then close that capture, and go call someone in Hello at 640x480, we're wasting a LOT of 
>shmem.  So, perhaps we need to insert-sort the freelist, and have Get() search from the small end of 
>the list.

Sounds like more complexity for purely hypothetical gains. We should clean up the buffers if all capturers close though  (But does this need explicit code or can we have MediaManager shut down?), and perhaps try a freeing pass when a single capturer closes.

But IMHO that's all work for nice-to-have follow-up bugs.
Update: I'm reading through this patch today and tomorrow. I hope to have answers for gcp by tomorrow afternoon.
Comment on attachment 8638064 [details] [diff] [review]
Sandboxing support for Video camera access

Review of attachment 8638064 [details] [diff] [review]:
-----------------------------------------------------------------

After reading through this patch, I have a few comments that definitely need to be addressed before checkin. On IRC, gcp was asking if bent's suggestion about the request pattern was a necessary requirement, and I don't believe that it should make any substantive difference, as we can only ever have one request in flight at a time, meaning that we don't have to worry about keeping track of what reply belongs to which request. It might be a nice followup, however, as this stuff is already very complicated and addressing that comment would relieve at least one source of confusion.

Another general comment I have is that I'm a little lost in threads here. It could be because I was reading the patch too quickly, but a comment or assertion about which functions go on which threads (or even just a general comment in the .h file about what functions run on which threads) would go a long way towards clearing up my confusion.

I'm sorry for taking so long to get to this review. I promise to be more responsive in the future (it'll help that I've now actually read through both this bug and the patch, so I won't have to re-learn as much the next time).

::: dom/media/systemservices/CamerasChild.cpp
@@ +125,5 @@
> +private:
> +  CamerasChild* mCamerasChild;
> +};
> +
> +static CamerasChild* Cameras() {

Nit: The function name usually goes on the next line.

@@ +195,5 @@
> +  return true;
> +}
> +
> +bool
> +CamerasChild::DispatchToParent(nsCOMPtr<nsIRunnable> aRunnable,

This isn't right: there's no reason to take an nsCOMPtr<>, especially by value. This should take a raw pointer, which in XPCOM means that the caller must already have a strong reference to the pointed-to object. This happens in a few other places that should be fixed as well.

@@ +200,5 @@
> +                               MonitorAutoLock& aMonitor)
> +{
> +  {
> +    OffTheBooksMutexAutoLock lock(CamerasSingleton::Mutex());
> +    CamerasSingleton::Thread()->Dispatch(aRunnable, NS_DISPATCH_NORMAL);

I think I saw jesup asking about this but I didn't see a response: holding a lock while dispatching risks deadlocks. Have you verified that this is safe?

@@ +208,5 @@
> +  if (!mIPCIsAlive) {
> +    return false;
> +  }
> +  // Wait for a reply
> +  aMonitor.Wait();

In all of these functions, we need to protect against spurious wakeups (which are rare, but can happen if we get a signal at the wrong time). This means that you need an explicit "received reply" boolean and do something like:

do {
  aMonitor.Wait();
} while (!mReceivedResponse && mIPCIsAlive);
return mReplySuccess;

@@ +232,5 @@
> +      }
> +      return NS_ERROR_FAILURE;
> +    });
> +  // Prevent concurrent use of the reply variables. Note
> +  // that his is unlocked while waiting for the reply to be

that *t*his...

@@ +496,5 @@
> +                                 webrtcCaps,
> +                                 cb);
> +}
> +
> +int

Why do these functions return int and not bool?

@@ +601,5 @@
> +  if (CamerasSingleton::Thread()) {
> +    LOG(("Dispatching actor deletion"));
> +    // Delete the parent actor
> +    nsCOMPtr<nsIRunnable> deleteRunnable =
> +      media::NewRunnableFrom([=]() -> nsresult {

There are a few instances of this in both this file and CamerasParent.cpp. We had a long thread on m.d.platform about capturing (especially refcounted) objects in lambdas and one of the outcomes was to ban both [=] and [&] in favor of explicitly listing the closed-over variables. I would also appreciate a comment that we know that the CamerasChild pointer is safe from being destroyed even without a strong reference around the lambda because the IPC layer holds a strong reference to it until after the SendAllDone function returns.

::: dom/media/systemservices/CamerasChild.h
@@ +75,5 @@
> +{
> +  friend class mozilla::ipc::BackgroundChildImpl;
> +
> +public:
> +  NS_INLINE_DECL_REFCOUNTING(CamerasChild)

What threads use CamerasChild? I was expecting there to be threadsafe refcounting, but maybe we only ever refcount it on the background thread? Please add comments about that, if so.

@@ +141,5 @@
> +  // The monitor below isn't sufficient for this, as it will drop
> +  // the lock when Wait-ing for a response, allowing us to send a new
> +  // request. The Notify on receiving the response will then unblock
> +  // both waiters and one will be guaranteed to get the wrong result.
> +  Mutex mRequestMutex;

Please also comment that mRequestMutext must always be taken before mReplyMonitor in order to avoid deadlocking.

::: dom/media/webrtc/MediaEngineRemoteVideoSource.cpp
@@ +53,5 @@
> +  return;
> +}
> +
> +void
> +MediaEngineRemoteVideoSource::Shutdown() {

Nit (here and a few other places): Gecko style places this brace on the next line.
Attachment #8638064 - Flags: review?(mrbkap)
(Assignee)

Comment 50

3 years ago
(In reply to Blake Kaplan (:mrbkap) (please use needinfo!) from comment #45)

> @@ +200,5 @@
> > +                               MonitorAutoLock& aMonitor)
> > +{
> > +  {
> > +    OffTheBooksMutexAutoLock lock(CamerasSingleton::Mutex());
> > +    CamerasSingleton::Thread()->Dispatch(aRunnable, NS_DISPATCH_NORMAL);
> 
> I think I saw jesup asking about this but I didn't see a response: holding a
> lock while dispatching risks deadlocks. Have you verified that this is safe?

We discussed this in person in Whistler. Basically the thread that's being dispatched to exists for no other task than to fire these runnables on to do IPC. Note the dispatch here is async, not sync, so the thread firing it can't deadlock by ending up reentrant.

IIRC jesup's concern was about the SyncRunnable used earlier in the setup process. That one blocks the MediaManager thread (fine) and runs the runnable on the freshly-created-and-doing-nothing-else IPC thread (also fine).

The lock here does nothing else than to ensure the Thread() member is valid when we try to Dispatch to it.

> In all of these functions, we need to protect against spurious wakeups
> (which are rare, but can happen if we get a signal at the wrong time). This
> means that you need an explicit "received reply" boolean and do something
> like:

We should probably put a reminder in the Monitor class header.

> Why do these functions return int and not bool?

They match the upstream (webrtc.org) API. 
https://dxr.mozilla.org/mozilla-central/source/media/webrtc/trunk/webrtc/video_engine/include/vie_capture.h?from=vie_capture.h
(Assignee)

Comment 51

3 years ago
Created attachment 8648000 [details] [diff] [review]
Proxy video capture device access to the main thread

Addressed review comments. +glandium for moz.build changes.
Attachment #8648000 - Flags: review?(rjesup)
Attachment #8648000 - Flags: review?(mrbkap)
Attachment #8648000 - Flags: review?(mh+mozilla)
(Assignee)

Updated

3 years ago
Attachment #8638064 - Attachment is obsolete: true
(Assignee)

Comment 52

3 years ago
Created attachment 8648002 [details]
memory-reports.tar.gz

Memory reports before and after running the capture, per OS (those with OS not specified are Linux).
(Assignee)

Comment 53

3 years ago
(In reply to Randell Jesup [:jesup] from comment #37)

> a) CPU use (master and content processes)
> b) Latency (i.e. capture time until inserted into content graph with
> AppendToTrack (measured to right before that call).
> c) Memory use (while capturing, and after capturing ends to ensure we've
> cleaned up everything).  If possible also direct measurement via os tools of
> Shmem usage.
> d) (bonus points) Shmem pool fullness levels during use (avg/max).

As for (d), on every system tested, the average Shmem pool fullness is 1 (buffer) and the maximum is equal to the number of capture sources active. This is the expected result as more than 1 buffer is only needed when a DeliverFrame arrives while a previous frame (typically from another engine) is still in flight.

Linux (4.1, GTK3) results, measured with top, averaging 30 second intervals.

    CPU usage with IPC
    1st run firefox 25.6% - 30.5% - 31.2%
    2nd run firefox 12.2% - 15.0% - 16.3%    
    (I get pretty high variance also for normal runs. When comparing to the other benchmarks
     and taking into account the Linux system is actually the fastest of the bunch, I suspect
     this is dominated by an inefficient display stack. Note the expected shift from web content
     to firefox process doesn't happen as the load from the capturer shifts, which reinforces
     it's just completely a negligible factor here.)
    Web Content 3-5%
   
    CPU usage Normal 
    firefox         18.5% - 24.7% - 25.0%
    Web Content     3-5%
   
    CPU usage with IPC, 1080p video
    firefox         34.2% - 32.2% - 39.3%
    Web Content     3-5%
    (Compare to below. Now you can see the influence of moving the capturer!)   
   
    CPU usage Normal, 1080p video
    firefox      19.0%   19.4%   21.5%
    Web Content  21.1%   17.9%   19.1%
   
    IPC latency on 1080p:
    2015-08-12 19:38:24.033608 UTC - -525342976[7f06e7fd0c40]: Deliverframe
    2015-08-12 19:38:24.034974 UTC - 1966470912[7ff2769f96e0]: AppendToTrack
    2015-08-12 19:38:24.097659 UTC - -525342976[7f06e7fd0c40]: Deliverframe
    2015-08-12 19:38:24.098978 UTC - 1966470912[7ff2769f96e0]: AppendToTrack
    2015-08-12 19:38:24.165497 UTC - -525342976[7f06e7fd0c40]: Deliverframe
    2015-08-12 19:38:24.166377 UTC - 1966470912[7ff2769f96e0]: AppendToTrack
   
    IPC latency on 1080p + full 100% CPU load on all cores:
    2015-08-13 12:29:24.369627 UTC - -438311168[7efddc9ec8e0]: Deliverframe
    2015-08-13 12:29:24.370713 UTC - 106956544[7fe40ae71fa0]: AppendToTrack
    2015-08-13 12:29:24.401447 UTC - -438311168[7efddc9ec8e0]: Deliverframe   
    2015-08-13 12:29:24.402611 UTC - 106956544[7fe40ae71fa0]: AppendToTrack
    2015-08-13 12:29:24.433327 UTC - -438311168[7efddc9ec8e0]: Deliverframe
    2015-08-13 12:29:24.434666 UTC - 106956544[7fe40ae71fa0]: AppendToTrack

Mac (10.10.4)

    Normal
    firefox:      4.8%  3.9%  4.2%
    web content:  9.7%  9.6%  9.9%

    IPC
    firefox:      11.9% 11.4% 9.3%
    web content:   4.9% 5.0%  5.1%

    IPC latency
    2015-08-13 14:17:22.541883 UTC - 473436160[121573080]: AppendToTrack
    2015-08-13 14:17:22.607663 UTC - 854605824[12b77f9f0]: DeliverFrame
    2015-08-13 14:17:22.607932 UTC - 473436160[121573080]: AppendToTrack
    2015-08-13 14:17:22.675128 UTC - 854605824[12b77f9f0]: DeliverFrame
    2015-08-13 14:17:22.675470 UTC - 473436160[121573080]: AppendToTrack
    2015-08-13 14:17:22.745731 UTC - 854605824[12b77f9f0]: DeliverFrame
    2015-08-13 14:17:22.745951 UTC - 473436160[121573080]: AppendToTrack
    2015-08-13 14:17:22.822058 UTC - 854605824[12b77f9f0]: DeliverFrame
    2015-08-13 14:17:22.822289 UTC - 473436160[121573080]: AppendToTrack

    IPC latency 100% with CPU load on all cores
    2015-08-13 14:22:35.466783 UTC - 952918016[12ae48680]: DeliverFrame
    2015-08-13 14:22:35.467093 UTC - 473436160[121573080]: AppendToTrack
    2015-08-13 14:22:35.531751 UTC - 952918016[12ae48680]: DeliverFrame
    2015-08-13 14:22:35.531974 UTC - 473436160[121573080]: AppendToTrack
    2015-08-13 14:22:35.599106 UTC - 952918016[12ae48680]: DeliverFrame
    2015-08-13 14:22:35.599361 UTC - 473436160[121573080]: AppendToTrack
    2015-08-13 14:22:35.664352 UTC - 952918016[12ae48680]: DeliverFrame
    2015-08-13 14:22:35.664673 UTC - 473436160[121573080]: AppendToTrack
    2015-08-13 14:22:35.730292 UTC - 952918016[12ae48680]: DeliverFrame
    2015-08-13 14:22:35.730569 UTC - 473436160[121573080]: AppendToTrack

Windows 7:

    Normal 1080p
    firefox:        1%
    web content:    2%
    (yay windows graphics drivers!)

    IPC 1080p:
    firefox:        2-3%
    web content:    1%

    IPC latency:
    2015-08-14 09:35:13.852000 UTC - 0[cb83ef0]: DeliverFrame
    2015-08-14 09:35:13.857000 UTC - 219644[967d3a0]: AppendToTrack
    2015-08-14 09:35:13.857000 UTC - 0[cb83ef0]: DeliverFrame
    2015-08-14 09:35:13.859000 UTC - 219644[967d3a0]: AppendToTrack
    2015-08-14 09:35:13.885000 UTC - 0[cb83ef0]: DeliverFrame
    2015-08-14 09:35:13.886000 UTC - 219644[967d3a0]: AppendToTrack
    2015-08-14 09:35:13.919000 UTC - 0[cb83ef0]: DeliverFrame
    2015-08-14 09:35:13.921000 UTC - 219644[967d3a0]: AppendToTrack
    2015-08-14 09:35:13.949000 UTC - 0[cb83ef0]: DeliverFrame
    2015-08-14 09:35:13.950000 UTC - 219644[967d3a0]: AppendToTrack

    IPC latency, 100% load on all cores
    2015-08-14 09:42:26.605000 UTC - 0[615190]: DeliverFrame
    2015-08-14 09:42:26.608000 UTC - 221336[967f3a0]: AppendToTrack
    2015-08-14 09:42:26.625000 UTC - 0[615190]: DeliverFrame
    2015-08-14 09:42:26.627000 UTC - 221336[967f3a0]: AppendToTrack
    2015-08-14 09:42:26.658000 UTC - 0[615190]: DeliverFrame
    2015-08-14 09:42:26.659000 UTC - 221336[967f3a0]: AppendToTrack
    2015-08-14 09:42:26.689000 UTC - 0[615190]: DeliverFrame
    2015-08-14 09:42:26.690000 UTC - 221336[967f3a0]: AppendToTrack
    2015-08-14 09:42:26.737000 UTC - 0[615190]: DeliverFrame
    2015-08-14 09:42:26.737000 UTC - 221336[967f3a0]: AppendToTrack


I'm not sure if there's much use running Android given the Linux results.

Overall conclusions: CPU usage mostly seems negligible compared to the noise. Latency is typically 1-2 milliseconds even in heavy load (which is *much* better than I'd have expected). 

Still need to look at the memory results.
(In reply to Gian-Carlo Pascutto [:gcp] from comment #53)

> Linux (4.1, GTK3) results, measured with top, averaging 30 second intervals.
> 
>     CPU usage with IPC
>     1st run firefox 25.6% - 30.5% - 31.2%
>     2nd run firefox 12.2% - 15.0% - 16.3%  

What do 1st and 2nd run here mean exactly?

>     CPU usage with IPC, 1080p video
>     firefox         34.2% - 32.2% - 39.3%
>     Web Content     3-5%
>     (Compare to below. Now you can see the influence of moving the
> capturer!)   
>    
>     CPU usage Normal, 1080p video
>     firefox      19.0%   19.4%   21.5%
>     Web Content  21.1%   17.9%   19.1%

That's interesting ... and surprising.  It'd be interesting to see what's eating the CPU time there.  Note that it's possible Gecko profiler won't capture it depending on what thread it's on.  I'm guessing the 1080p is MJPEG?  Could be decoding that is the main hit.

> Mac (10.10.4)
> 
>     Normal
>     firefox:      4.8%  3.9%  4.2%
>     web content:  9.7%  9.6%  9.9%
> 
>     IPC
>     firefox:      11.9% 11.4% 9.3%
>     web content:   4.9% 5.0%  5.1%

So a small increase in CPU (from 13.5-14.7 to 14.4-16.4 total)


> 
> Windows 7:
> 
>     Normal 1080p
>     firefox:        1%
>     web content:    2%
>     (yay windows graphics drivers!)
> 
>     IPC 1080p:
>     firefox:        2-3%
>     web content:    1%

A very small increase.

> 
>     IPC latency:
>     2015-08-14 09:35:13.852000 UTC - 0[cb83ef0]: DeliverFrame
>     2015-08-14 09:35:13.857000 UTC - 219644[967d3a0]: AppendToTrack
>     2015-08-14 09:35:13.857000 UTC - 0[cb83ef0]: DeliverFrame
>     2015-08-14 09:35:13.859000 UTC - 219644[967d3a0]: AppendToTrack
>     2015-08-14 09:35:13.885000 UTC - 0[cb83ef0]: DeliverFrame
>     2015-08-14 09:35:13.886000 UTC - 219644[967d3a0]: AppendToTrack
>     2015-08-14 09:35:13.919000 UTC - 0[cb83ef0]: DeliverFrame
>     2015-08-14 09:35:13.921000 UTC - 219644[967d3a0]: AppendToTrack
>     2015-08-14 09:35:13.949000 UTC - 0[cb83ef0]: DeliverFrame
>     2015-08-14 09:35:13.950000 UTC - 219644[967d3a0]: AppendToTrack
> 
>     IPC latency, 100% load on all cores
>     2015-08-14 09:42:26.605000 UTC - 0[615190]: DeliverFrame
>     2015-08-14 09:42:26.608000 UTC - 221336[967f3a0]: AppendToTrack
>     2015-08-14 09:42:26.625000 UTC - 0[615190]: DeliverFrame
>     2015-08-14 09:42:26.627000 UTC - 221336[967f3a0]: AppendToTrack
>     2015-08-14 09:42:26.658000 UTC - 0[615190]: DeliverFrame
>     2015-08-14 09:42:26.659000 UTC - 221336[967f3a0]: AppendToTrack
>     2015-08-14 09:42:26.689000 UTC - 0[615190]: DeliverFrame
>     2015-08-14 09:42:26.690000 UTC - 221336[967f3a0]: AppendToTrack
>     2015-08-14 09:42:26.737000 UTC - 0[615190]: DeliverFrame
>     2015-08-14 09:42:26.737000 UTC - 221336[967f3a0]: AppendToTrack

A little more variance than Mac or linux (up to 5ms), and maybe a slight increase in latency with 1080p, but overall not a problem.

> I'm not sure if there's much use running Android given the Linux results.

Yeah, not sure there.  A very different gfx pipeline than desktop I think.

> Overall conclusions: CPU usage mostly seems negligible compared to the
> noise. Latency is typically 1-2 milliseconds even in heavy load (which is
> *much* better than I'd have expected). 

Agreed!
Comment on attachment 8648000 [details] [diff] [review]
Proxy video capture device access to the main thread

Review of attachment 8648000 [details] [diff] [review]:
-----------------------------------------------------------------

Please file the followup bug(s) for releasing memory when not in use, or when the resolution drops a lot (less critical)
Attachment #8648000 - Flags: review?(rjesup) → review+
(Assignee)

Comment 56

3 years ago
(In reply to Randell Jesup [:jesup] from comment #54)

> >     CPU usage Normal, 1080p video
> >     firefox      19.0%   19.4%   21.5%
> >     Web Content  21.1%   17.9%   19.1%
> 
> That's interesting ... and surprising.  It'd be interesting to see what's
> eating the CPU time there.  Note that it's possible Gecko profiler won't
> capture it depending on what thread it's on.  I'm guessing the 1080p is
> MJPEG?  Could be decoding that is the main hit.

System wide:
  11.82%  [kernel]                            [k] system_call
   9.38%  libxul.so                           [.] decode_mcu
   5.96%  [kernel]                            [k] copy_user_enhanced_fast_string
   5.14%  libc-2.19.so                        [.] __memcpy_sse2_unaligned
   5.11%  libc-2.19.so                        [.] syscall
   3.30%  libxul.so                           [.] jsimd_idct_ifast_sse2.column_end
   2.32%  libxul.so                           [.] jpeg_fill_bit_buffer
   2.20%  [kernel]                            [k] ktime_get_ts64
   2.05%  libxul.so                           [.] jsimd_idct_ifast_sse2.columnDCT
   1.71%  [kernel]                            [k] native_read_tsc
   1.37%  [kernel]                            [k] sys_clock_gettime
   1.36%  libxul.so                           [.] decompress_onepass
   1.33%  libc-2.19.so                        [.] memset


Firefox:
   9.40%  [.] decode_mcu                                                                                         
   3.29%  [.] jsimd_idct_ifast_sse2.column_end                                                                   
   2.31%  [.] jpeg_fill_bit_buffer                                                                               
   2.03%  [.] jsimd_idct_ifast_sse2.columnDCT                                                                    
   1.36%  [.] decompress_onepass                                                                                 
   0.68%  [.] jsimd_idct_ifast_sse2                                                                              
   0.30%  [.] CopyRow_ERMS                                                                                       
   0.28%  [.] jpeg_huff_decode                                                                                   
   0.25%  [.] InterpolateRow_SSSE3                                                                               
   0.14%  [.] jsimd_idct_ifast                                                                                   
   0.03%  [.] jzero_far       

So yes, MJPEG decoding. On Windows the camera (Logitech C920) might be using H264 or the MJPEG decoding might be done in the kernel and not be credited to the Firefox process.
Windows could easily be hiding it, and likely is.  And I don't think it does (or can do) uncompressed HD over the USB connection; so it's MJPEG or some other compressed format.

decode_mcu() is MJPEG code and looks to be the worst offender.  Without knowing the tool, it's hard to make inferences about the above stats in detail (like which threads - compositor/gfx vs stuff inside webrtc, etc)

Sounds like we have some frame copies in there too eating a noticable amount; hard to say which ones without stacks.

None of this is a blocker -- but interesting.  It may well be worth filing & attaching profiles for the parts that are in compositor/gfx.
Comment on attachment 8648000 [details] [diff] [review]
Proxy video capture device access to the main thread

Review of attachment 8648000 [details] [diff] [review]:
-----------------------------------------------------------------

I have one more big question that needs to be answered (and it applies both to CamerasParent as well as the cases CamerasChild) about lifetimes and holding objects alive as well as a few more nits to pick.

It sounds like we might have to implement bent's suggestion about the requests and ordering due to performance requirements and the ability to send multiple requests at the same time. As I've said though, that's OK to do in a followup bug.

::: dom/media/systemservices/CamerasChild.cpp
@@ +538,5 @@
> +                           webrtcCaps.rawType,
> +                           webrtcCaps.codecType,
> +                           webrtcCaps.interlaced);
> +  nsCOMPtr<nsIRunnable> runnable =
> +    media::NewRunnableFrom([this, aCapEngine, capture_id, capCap]() -> nsresult {

I think this is my last big question: do we have anything that guarantees that |this| will live until the runnable runs? It might be necessary to do something like:

nsRefPtr<CamerasChild> self(this);

and capture self instead.

::: dom/media/systemservices/CamerasUtils.h
@@ +21,5 @@
> +{
> +  DISALLOW_COPY_AND_ASSIGN(ThreadDestructor);
> +
> +public:
> +  explicit ThreadDestructor(nsCOMPtr<nsIThread> aThread)

This should take |nsIThread* aThread|.

@@ +40,5 @@
> +
> +class RunnableTask : public Task
> +{
> +public:
> +  explicit RunnableTask(nsRefPtr<nsRunnable> aRunnable)

And this should take |nsRunnable* aRunnable|.

::: dom/media/webrtc/MediaEngineRemoteVideoSource.cpp
@@ +32,5 @@
> +  Init();
> +}
> +
> +void
> +MediaEngineRemoteVideoSource::Init() {

Nit: these two functions' opening braces need to go on their own lines.
Attachment #8648000 - Flags: review?(mrbkap)
Attachment #8648000 - Flags: review?(mh+mozilla) → review+
(Assignee)

Comment 59

3 years ago
(In reply to Blake Kaplan (:mrbkap) (please use needinfo!) from comment #58)

> It sounds like we might have to implement bent's suggestion about the
> requests and ordering due to performance requirements and the ability to
> send multiple requests at the same time. 

No? At the least it's not clear to me why you believe this seems likely at this point.

Note that it's possible for multiple DeliverFrame's to be in-flight at the same time (this is evident by the ShmemPool usage ever going >1). The request-response style messages are only needed for device enumeration/start/stop and those are in the other direction so they won't even stop the DeliverFrame stream.

> > +  nsCOMPtr<nsIRunnable> runnable =
> > +    media::NewRunnableFrom([this, aCapEngine, capture_id, capCap]() -> nsresult {
> 
> I think this is my last big question: do we have anything that guarantees
> that |this| will live until the runnable runs? It might be necessary to do
> something like:
> 
> nsRefPtr<CamerasChild> self(this);
> 
> and capture self instead.

Bah. This sucks because then suddenly CamerasChild has multiple owners in different threads. Also, even if "this" is being kept alive it's being used to try to do an IPC call on. If the other owner (the IPC code) has deleted it's reference then that function call isn't going to work anyhow.

However, I don't think this scenario is possible. CamerasChild is owned by the IPC layer. It gets destructed by its parent when its own Shutdown() function sends out the SendAllDone message. Before this happens, the very first thing it does is set the flag to disable further IPC, and then take the same lock that anyone issuing IPC needs too, before Dispatching the deletion of the IPC thread. So here it's not possible for any of those Runnables to end up sitting in the queue AFTER the Runnable that will actually cause CamerasChild to be deleted.

It's not possible for the IPC layer itself to spontaneously decide it wants to kill off some protocols, right?
Flags: needinfo?(mrbkap)
(In reply to Gian-Carlo Pascutto [:gcp] from comment #59)
> No? At the least it's not clear to me why you believe this seems likely at
> this point.

That comment was based on a conversation I had with blassey a little bit ago. I guess that's not the case.

> It's not possible for the IPC layer itself to spontaneously decide it wants
> to kill off some protocols, right?

That's actually possible on the parent side instead of the child. If the child crashes, then we deref all of the actors. My comment still holds on the parent side, unfortunately.
Flags: needinfo?(mrbkap) → needinfo?(gpascutto)
(Assignee)

Updated

3 years ago
Flags: needinfo?(gpascutto)
(Assignee)

Comment 62

3 years ago
Created attachment 8652941 [details] [diff] [review]
Sandboxing support for Video camera access

Addressed outstanding review comments.
Attachment #8652941 - Flags: review?(mrbkap)
(Assignee)

Updated

3 years ago
Attachment #8648000 - Attachment is obsolete: true
(Assignee)

Comment 63

3 years ago
Created attachment 8652943 [details] [diff] [review]
0001-Bug-1104616-Ensure-CamerasParent-is-alive-in-thread-.patch

Just the last changes split out for ease of reviewing.
Attachment #8652943 - Flags: review?(mrbkap)
(Assignee)

Updated

3 years ago
Attachment #8652941 - Attachment is obsolete: true
Attachment #8652941 - Flags: review?(mrbkap)
(Assignee)

Updated

3 years ago
Attachment #8652943 - Attachment is obsolete: true
Attachment #8652943 - Flags: review?(mrbkap)
(Assignee)

Comment 65

3 years ago
Created attachment 8653164 [details] [diff] [review]
0001-Bug-1104616-Ensure-CamerasParent-is-alive-in-thread-.patch

Addressing review comments. Part1/2
Attachment #8653164 - Flags: review?(mrbkap)
(Assignee)

Comment 66

3 years ago
Created attachment 8653165 [details] [diff] [review]
0002-Bug-1104616-Don-t-attempt-to-dispatch-responses-if-t.patch

Addressing review comments, part 2/2.
Attachment #8653165 - Flags: review?(mrbkap)
(Assignee)

Comment 67

3 years ago
Created attachment 8653168 [details] [diff] [review]
Complete diff, all patches rolled up.

For reference. Review whatever is easiest for you.
Attachment #8653168 - Flags: review?(mrbkap)

Updated

3 years ago
Attachment #8653164 - Flags: review?(mrbkap) → review+

Updated

3 years ago
Attachment #8653165 - Flags: review?(mrbkap) → review+

Updated

3 years ago
Attachment #8653168 - Flags: review?(mrbkap) → review+
(Assignee)

Comment 69

3 years ago
Follow up bug for reducing capture buffers automatically:
https://bugzilla.mozilla.org/show_bug.cgi?id=1200193
(Assignee)

Comment 70

3 years ago
(In reply to Randell Jesup [:jesup] from comment #57)
> Without
> knowing the tool, it's hard to make inferences about the above stats in
> detail (like which threads - compositor/gfx vs stuff inside webrtc, etc)
> 
> Sounds like we have some frame copies in there too eating a noticable
> amount; hard to say which ones without stacks.

FWIW this was gathered with Linux's kernel perf tool so it's whole-system. The copies don't appear to be inside a Firefox process but on the kernel/libc side.
https://hg.mozilla.org/mozilla-central/rev/be4b0c0068a9
Status: NEW → RESOLVED
Last Resolved: 3 years ago
status-firefox43: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
Depends on: 1200477
Why did you make your own copy of SynchronouslyCreatePBackground instead of factoring the code out into a common location?
Flags: needinfo?(gpascutto)
(Assignee)

Comment 73

2 years ago
I don't remember. Might simply have forgotten to clean up that part.
Flags: needinfo?(gpascutto)
See Also: → bug 785592
You need to log in before you can comment on or make changes to this bug.