Closed Bug 1104616 Opened 10 years ago Closed 9 years ago

Sandboxing support for Video camera access

Categories

(Core :: Audio/Video, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla43
Tracking Status
firefox43 --- fixed

People

(Reporter: gcp, Assigned: gcp)

References

Details

Attachments

(4 files, 10 obsolete files)

1.32 MB, application/gzip
Details
16.80 KB, patch
mrbkap
: review+
Details | Diff | Splinter Review
5.33 KB, patch
mrbkap
: review+
Details | Diff | Splinter Review
127.55 KB, patch
mrbkap
: review+
Details | Diff | Splinter Review
No description provided.
Blocks: 1104615
Attached patch camera-proxy (obsolete) — Splinter Review
WIP patch showing the approach.
Assignee: nobody → gpascutto
Attached patch wip-cameras (obsolete) — Splinter Review
This is functional. Going to clean up a bit and check out a shutdown crash.
Attachment #8549670 - Attachment is obsolete: true
Attached patch wip-cameras (obsolete) — Splinter Review
Seems to be fully working. Going to make a look a cleaning up MediaEngineWebRTC.cpp.
Attachment #8553036 - Attachment is obsolete: true
This includes support for Screen/Window/Application sharing and cleans up MediaEngineWebRTC a bit. (MediaEngineWebRTCVideo.cpp is probably no longer needed after this)
Attachment #8556496 - Attachment is obsolete: true
Attachment #8557964 - Flags: review?(rjesup)
Attachment #8557964 - Flags: review?(bent.mozilla)
Note this currently only supports a single engine active at the same time. What still needs to be added is to make both CamerasParent and CamerasChild keep a hashtable of capEngine+capture_id combinations, create some object implementing ExternalRenderer instances on the CamerasParent side, and adding those to the AddRenderer call in StartCapture. On the CamerasChild side the same hashtable needs to keep the link to the original MediaEngineRemoteVideoSource that wants to receive the frame.
These lists are tiny so we might as well use arrays. Some kind of RefPtr use would be better though, but I grew tired of fighting with all our STL/boost replacements.
The original patches here are now quite bitrotted. Rebased against m-c, including a bunch of changes to MediaEngineWebRTC and constraints. https://github.com/gcp/gecko-dev/tree/sandbox-av https://github.com/gcp/gecko-dev/compare/sandbox-av This also includes a bunch of audio work (that won't compile yet).
Blocks: 1136347
Comment on attachment 8557964 [details] [diff] [review] Patch 1. v1 Proxy all video provider access to chrome process Review of attachment 8557964 [details] [diff] [review]: ----------------------------------------------------------------- r+ with a number of nits; you decide if the updates require a re-review (if so, please generate an interdiff given the size of the patch). Only one I'm not sure if it's a blocker is the "xxx needs a lock" comment ::: dom/media/systemservices/CamerasChild.cpp @@ +12,5 @@ > +#include "mozilla/ipc/PBackgroundChild.h" > +#include "prlog.h" > + > +#include "CamerasUtils.h" > +#include "CamerasChild.h" include CamerasChild.h first, please. @@ +14,5 @@ > + > +#include "CamerasUtils.h" > +#include "CamerasChild.h" > + > +PRLogModuleInfo *gCamerasChildLog; Can this live inside if defined(PR_LOGGING)? @@ +27,5 @@ > +#define LOG_ENABLED() (false) > +#endif > + > +namespace mozilla { > +namespace camera { Given these are all effectively global functions, couldn't they be static functions in CamerasChild:: instead of global functions in namespace camera? This is in keeping with the style guide for added points. (part of the reason for the style guide rule is that unified compilation means that "usings" in c++ code "escape" your file, and you can get collisions even if you have different sub-namespaces. (I do note that there's no "using mozilla::camera" in this patch, which helps). @@ +29,5 @@ > + > +namespace mozilla { > +namespace camera { > + > +static PCamerasChild* sCameras; Is there any way to avoid leaking this, or do we care? I note it's not cleared in Shutdown(), or is this handled by IPC? (in which case we should see if we can clear out this ptr when it becomes invalid/dangerous) @@ +187,5 @@ > + > +void Shutdown() > +{ > + LOG((__PRETTY_FUNCTION__)); > + if (sCamerasChildThread) { Are we always called on the thread that created sCameras? (probably) @@ +190,5 @@ > + LOG((__PRETTY_FUNCTION__)); > + if (sCamerasChildThread) { > + nsRefPtr<ShutdownRunnable> runnable = new ShutdownRunnable(); > + sCamerasChildThread->Dispatch(runnable, NS_DISPATCH_SYNC); > + sCamerasChildThread = nullptr; clear sCameras? @@ +217,5 @@ > +bool > +CamerasChild::RecvFrameSizeChange(const int& w, const int& h) > +{ > + LOG((__PRETTY_FUNCTION__)); > + // XXX: Needs a lock a) needs lock :-) ; b) what else needs to be locked? RecvDeliverFrame? ::: dom/media/systemservices/CamerasParent.cpp @@ +4,5 @@ > + * License, v. 2.0. If a copy of the MPL was not distributed with this file, > + * You can obtain one at http://mozilla.org/MPL/2.0/. */ > + > +#include "mozilla/Assertions.h" > +#include "CamerasParent.h" CameraParent.h first @@ +10,5 @@ > +#include "MediaEngine.h" > +#include "nsThreadUtils.h" > +#include "prlog.h" > + > +PRLogModuleInfo *gCamerasParentLog; See Child comments @@ +101,5 @@ > + uint32_t time_stamp, > + int64_t ntp_time, > + int64_t render_time) > +{ > + nit: remove blank line @@ +108,5 @@ > + mShmemInitialized = true; > + } > + > + if (!mShmem.IsWritable()) { > + LOG(("Video shmem is not writeable in DeliverFrame")); Is it a hard requirement that the shmem be returned before the camera captures the next frame? THreading/priorities/etc might cause us to call DeliverFrame twice close together I imagine. Not a blocker if we think it'll be really, really rare even under load. @@ +284,5 @@ > + > +void > +CamerasParent::CloseActiveEngine() > +{ > + if(mPtrViEBase) { in reverse order of opening (good practice). Also: name is CloseActiveEngine, but it ignores mEngine and just closes the ViE modules. ::: dom/media/systemservices/CamerasParent.h @@ +38,5 @@ > + int64_t render_time, > + void *handle) MOZ_OVERRIDE; > + virtual bool IsTextureSupported() MOZ_OVERRIDE { return false; }; > + > + // missing comment? @@ +82,5 @@ > + webrtc::VideoEngine* mAppEngine; > + > + // Need this to avoid unneccesary WebRTC calls while enumerating. > + // XX: Nothing actually uses this > + bool mCameraEngineInit; cleanup then? ::: dom/media/systemservices/moz.build @@ +47,5 @@ > ] > ] > > +if CONFIG['_MSC_VER']: > + DEFINES['__PRETTY_FUNCTION__'] = '__FUNCSIG__' while largely perfunctory, needs a build peer ::: dom/media/webrtc/MediaEngineRemoteVideoSource.cpp @@ +299,5 @@ > + mozilla::camera::GetCaptureCapability(mCapEngine, > + uniqueId.get(), > + candidateSet[i], cap); > + if (!SatisfiesConstraintSet(*aConstraintSets[j], cap)) { > + candidateSet.RemoveElementAt(i); Check this hasn't diverged given Constraints changes recently @@ +311,5 @@ > + > +void > +MediaEngineRemoteVideoSource::ChooseCapability( > + const VideoTrackConstraintsN &aConstraints, > + const MediaEnginePrefs &aPrefs) Ditto for ChooseCapability -- plus it would be good not to have multiple copies of the same code! ::: dom/media/webrtc/MediaEngineRemoteVideoSource.h @@ +4,5 @@ > + * License, v. 2.0. If a copy of the MPL was not distributed with this file, > + * You can obtain one at http://mozilla.org/MPL/2.0/. */ > + > +#ifndef MEDIAENGINEREMOTEVIDEOSOURCE_H_ > +#define MEDIAENGINEREMOTEVIDEOSOURCE_H_ not reabable; either use style guide version or mirror other uses here ::: dom/media/webrtc/MediaEngineWebRTC.h @@ +271,1 @@ > Mutex mMutex; I think it's no longer needed (if it ever was) ::: ipc/glue/moz.build @@ +131,5 @@ > > LOCAL_INCLUDES += [ > '/dom/broadcastchannel', > '/dom/indexedDB', > + '/media/webrtc/trunk', If this truly needed?
Attachment #8557964 - Flags: review?(rjesup) → review+
Comment on attachment 8557964 [details] [diff] [review] Patch 1. v1 Proxy all video provider access to chrome process Ok, we talked about this a bunch today with jesup on irc and I think there's a few medium-ish sized issues to work out. Canceling review for the time being, please needinfo me if you get stuck anywhere (or ping on irc)!
Attachment #8557964 - Flags: review?(bent.mozilla)
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/gpascutto@mozilla.com-c41735dd44a4/try-linux64-debug/try_ubuntu64_vm-debug_test-mochitest-3-bm121-tests1-linux64-build322.txt.gz 15:47:13 INFO - -535382272[7f09ee8ada40]: CamerasChild: 7f0a09e3ab00 15:47:13 INFO - 195032832[7f0a0de37c40]: CamerasParent: 7f09f14af120 15:47:13 INFO - 195032832[7f0a0de37c40]: virtual bool mozilla::camera::CamerasParent::RecvNumberOfCaptureDevices(const int&, int*) ...stuff happens... 16:04:12 INFO - -663738624[7f36d918fa00]: ~CamerasChild: 7f36d44f6aa0 16:04:12 INFO - -174425664[7f36e6f20140]: Releasing MediaManager backend 16:04:12 INFO - 195032832[7f0a0de37c40]: virtual void mozilla::camera::CamerasParent::ActorDestroy(mozilla::ipc::IProtocolManager<mozilla::ipc::IProtocol>::ActorDestroyReason) 16:04:12 INFO - [Parent 1937] WARNING: bad Shmem: file /builds/slave/try-l64-d-00000000000000000000/build/src/obj-firefox/ipc/ipdl/PCamerasParent.cpp, line 705 16:04:12 INFO - 195032832[7f0a0de37c40]: ~CamerasParent: 7f09f0696fa0 16:04:12 INFO - [Parent 1937] WARNING: bad Shmem: file /builds/slave/try-l64-d-00000000000000000000/build/src/obj-firefox/ipc/ipdl/PCamerasParent.cpp, line 705 16:04:12 INFO - -174425664[7f36e6f20140]: Releasing MediaManager singleton and thread 16:04:15 INFO - 2319 INFO TEST-OK | dom/media/tests/mochitest/test_dataChannel_basicAudio.html | took 3341ms Cool. ...stuff happens... 16:04:16 INFO - -535382272[7f09ee8ada40]: Selected audio device 16:04:16 INFO - -535382272[7f09ee8ada40]: Audio device 0 allocated 16:04:16 INFO - 195032832[7f0a0de37c40]: virtual bool mozilla::camera::CamerasParent::RecvNumberOfCaptureDevices(const int&, int*) 16:04:16 INFO - 195032832[7f0a0de37c40]: bool mozilla::camera::CamerasParent::EnsureInitialized(int) This gets called without the CamerasParent constructor call being in the log. It's actually being called without CamerasChild being there, too. Looks like the code that spins up and tears down the Cameras() thing in CamerasChild is broken.
The problem appears to be that during these tests the CamerasChild is created in two different processes, which breaks because it's currently supposed to be a singleton.
I understand from past discussion this would be expected in e10s mode because for example Loop is in chrome yet uses WebRTC, but the problem is occurring in non-e10s try and e10s is passing.
Some of the logging is extremely confusing because the timestamps for some of the processes are wrong and interleaved, which makes it hard to see causally what's going on. I think the above wrt multiple processes or Cameras instances may be wrong. I've added an atomic counter and asserts to make sure it is. I've found since that when only running the test_dataChannel_basicVideo.html succeeds on try. So the problem is that one test leaves things in a state that is broken for following ones. Specifically, a passing try run looks like: 14:40:38 INFO - 256 INFO Analyzing element: pcLocal_local1_video 14:40:38 INFO - 257 INFO Analyzing element: pcLocal_remote2_video 14:40:38 INFO - 1976088384[7fde74661260]: 7fde3e7fe100(7fde3dafaa80): OnChannelConnected - Dispatching 14:40:38 INFO - 258 INFO TEST-PASS | dom/media/tests/mochitest/test_dataChannel_basicVideo.html | data channel is 'open' after 'onopen' 14:40:38 INFO - 259 INFO canplaythrough fired for media element pcRemote_remote2_video 14:40:38 INFO - 260 INFO canplaythrough fired for media element pcLocal_remote2_video 14:40:38 INFO - 261 INFO TEST-PASS | dom/media/tests/mochitest/test_dataChannel_basicVideo.html | Media flowing for pcLocal_local1_video 14:40:38 INFO - 262 INFO TEST-PASS | dom/media/tests/mochitest/test_dataChannel_basicVideo.html | Media flowing for pcLocal_remote2_video And a failure looks like: 12:21:54 INFO - 2380 INFO Run step 44: PC_LOCAL_CHECK_MEDIA_FLOW_PRESENT 12:21:54 INFO - 2381 INFO Analyzing element: pcLocal_local1_video 12:21:54 INFO - 2382 INFO Analyzing element: pcLocal_remote2_video 12:21:54 INFO - 2383 INFO TEST-PASS | dom/media/tests/mochitest/test_dataChannel_basicVideo.html | data channel is 'open' after 'onopen' 12:21:54 INFO - 2384 INFO TEST-UNEXPECTED-FAIL | dom/media/tests/mochitest/test_dataChannel_basicVideo.html | Test timed out. - expected PASS Inserting more logging in those places may be instructive.
I got a useful log that indicates that one CamerasParent instance is being created while another one is active. It's not clear to me how that can happen. It would be expected if we were in e10s with multiple content processess - but we aren't. Maybe it's due to a CamerasChild being created on the chrome side. Maybe it's due to a shutdown-restart cycle of the subsystem and the deletion of the object in the parent being slow. In any case moving the pointers of the "this is the parent object" from a global singleton to sitting in the callbacks themselves makes try happy. But it would be useful to understand what is actually going on.
Major rework: 1) Move all the access to globals in the child to a singleton object to clean up the code and enforce proper locking. 2) Remove all use of globals in the parent, it's all instance members now. (This should allow multiple content processes) 3) Add a dedicated IPC thread to the child. (This should also ensure we can safely synchronously initialize PBackground in that thread) 4) Add a dedicated WebRTC access thread to the parent. This moves all camera access etc away from the main thread. On Windows this is an UI thread, so move that to Chrome-style threads. 5) Make all the IPC asynchronous. The sync API to webrtc API consumers is maintained by sending reply messages and blocking until their reception in the child. This is possible because (3) moved the receiving of those to another thread. 6) Fix or work around all the dragons in shutdown/reinit. Outstanding issues: 1) The thread ping-pong caused by dispatching incoming requests to the webrtc thread and then re-dispatching the reply to the IPC thread makes it a pain to extract some error conditions. Currently, we ignore whether IPC calls succeed or fail, both because handling them would complicate things more and because there's not much to recover if child and parent can't communicate. See all the unsued << SendBlah stuff. In general this would benefit from followup work for a nicer way to deal with the chained thread dispatches that currently lead to continuation-passing-style programming. I know we've recently started adding MediaPromises, so maybe using that or something similar for follow-up work could be nice. 2) There's an extra frame copy happing because the WebRTC DeliverFrame callback is sync, but we don't copy the frame data into a Shmem until we're back on the IPC thread, meaning there's an indeterminate time inbetween...a time long enough so that another DeliverFrame could potentially come in, so we need to keep a buffer of what we already got. I think we need some infra with multiple Shmem buffers to properly handle this. That will likely be useful for audio as well, so IMHO it should go in a followup bug as this one is already complicated enough.
Attachment #8557964 - Attachment is obsolete: true
Attachment #8558053 - Attachment is obsolete: true
Attachment #8623655 - Flags: review?(rjesup)
Attachment #8623655 - Flags: review?(bent.mozilla)
Comment on attachment 8623655 [details] [diff] [review] Patch 1. v2 Proxy all camera/screen access to the main process Review of attachment 8623655 [details] [diff] [review]: ----------------------------------------------------------------- ::: build/automation.py.in @@ +506,4 @@ > env['XRE_NO_WINDOWS_CRASH_DIALOG'] = '1' > > # Set WebRTC logging in case it is not set yet > + env.setdefault('NSPR_LOG_MODULES', 'signaling:3,mtransport:4,datachannel:4,jsep:4,MediaPipelineFactory:4,MediaManager:4,CamerasParent:4,CamerasChild:4') I'll remove these, debugging leftovers. ::: build/automationutils.py @@ +347,4 @@ > env.setdefault('MOZ_DISABLE_NONLOCAL_CONNECTIONS', '1') > > # Set WebRTC logging in case it is not set yet > + env.setdefault('NSPR_LOG_MODULES', 'signaling:3,mtransport:4,datachannel:4,jsep:4,MediaPipelineFactory:4,MediaManager:4,CamerasParent:4,CamerasChild:4') Same. ::: testing/mozbase/mozrunner/mozrunner/base/device.py @@ +25,4 @@ > 'MOZ_CRASHREPORTER_NO_REPORT': '1', > 'MOZ_CRASHREPORTER_SHUTDOWN': '1', > 'MOZ_HIDE_RESULTS_TABLE': '1', > + 'NSPR_LOG_MODULES': 'signaling:5,mtransport:5,datachannel:5,jsep:5,MediaPipelineFactory:5,MediaManager:5,CamerasParent:5,CamerasChild:5', Same.
Comment on attachment 8623655 [details] [diff] [review] Patch 1. v2 Proxy all camera/screen access to the main process Review of attachment 8623655 [details] [diff] [review]: ----------------------------------------------------------------- BTW, I'm working on this review (it's not a simple one)
Comment on attachment 8623655 [details] [diff] [review] Patch 1. v2 Proxy all camera/screen access to the main process Review of attachment 8623655 [details] [diff] [review]: ----------------------------------------------------------------- Initial partial review ::: dom/media/systemservices/CamerasChild.cpp @@ +16,5 @@ > +#include "mozilla/Mutex.h" > +#include "mozilla/SyncRunnable.h" > +#include "MediaUtils.h" > +#include "nsThreadUtils.h" > +#include "prlog.h" mozilla/Logging.h now @@ +19,5 @@ > +#include "nsThreadUtils.h" > +#include "prlog.h" > + > +#undef LOG > +#undef LOG_ENABLED are these needed? @@ +41,5 @@ > + CamerasSingleton() > + : mCamerasMutex("CamerasSingleton::mCamerasMutex"), > + mCameras(nullptr), > + mCamerasChildThread(nullptr) { > +#if defined(PR_LOGGING) remove PR_LOGGING @@ +43,5 @@ > + mCameras(nullptr), > + mCamerasChildThread(nullptr) { > +#if defined(PR_LOGGING) > + if (!gCamerasChildLog) > + gCamerasChildLog = PR_NewLogModule("CamerasChild"); braces @@ +54,5 @@ > + LOG(("~CamerasSingleton: %p", this)); > + } > + > + static CamerasSingleton& getInstance() { > + static CamerasSingleton instance; GetInstance() perhaps. Ditto other methods. lowerUpper() is JS-style ;-) Does this imply a static constructor/destructor invocation? ClearOnShutdown() with a static UniquePtr perhaps, and new here on the first usage? Also: what's the thread safety of the CamerasSingleton? Is getInstance referenced from multiple threads? The mutex stuff below implies it is, so that complicates things and might add locking to each getInstance(), which would be a pain/overhead if it's done constantly instead of cached. At least comment as to the lifetime, and the locking/threading model. @@ +78,5 @@ > + // We don't want this to happen in the middle of preparing IPC. > + // We will be alive on destruction, so this needs to be off the books. > + mozilla::OffTheBooksMutex mCamerasMutex; > + > + CamerasChild *mCameras; Not at least a UniquePtr<>? Or is this alive too late for that due to the static instance? Or (more likely) it gets deleted indirectly via the IPC code, and we have the risk of a dangling raw ptr here. Comment as to lifetime at minimum. @@ +85,5 @@ > + > +class InitializeIPCThread : public nsRunnable > +{ > +public: > + explicit InitializeIPCThread() no explicit, that's only needed for 1-arg constructors (normally) @@ +105,5 @@ > + MOZ_RELEASE_ASSERT(existingBackgroundChild); > + > + // Create CamerasChild > + mCamerasChild = > + static_cast<CamerasChild*>(existingBackgroundChild->SendPCamerasConstructor()); Comment that this is an indirect return value via GetCamerasChild on an outstanding reference to this and the use of SyncRunnable to wrap it, since that's not a normal pattern. @@ +119,5 @@ > +private: > + CamerasChild* mCamerasChild; > +}; > + > +static CamerasChild* Cameras(bool trace) { Do we need trace, instead of keying it off say ::Verbose logging? @@ +121,5 @@ > +}; > + > +static CamerasChild* Cameras(bool trace) { > + if (!gCamerasChildLog) > + gCamerasChildLog = PR_NewLogModule("CamerasChild"); braces @@ +127,5 @@ > + OffTheBooksMutexAutoLock lock(CamerasSingleton::getMutex()); > + if (!CamerasSingleton::getChild()) { > + LOG(("No sCameras, setting up")); > + MOZ_ASSERT(!CamerasSingleton::getThread()); > + LOG(("Spinning up IPC Thread")); combine with previous log @@ +129,5 @@ > + LOG(("No sCameras, setting up")); > + MOZ_ASSERT(!CamerasSingleton::getThread()); > + LOG(("Spinning up IPC Thread")); > + nsresult rv = NS_NewNamedThread("Cameras IPC", > + getter_AddRefs(CamerasSingleton::getThread())); I find this a bit odd (getter_AddRefs(getFoo()) as an output), but ok @@ +140,5 @@ > + // 1) Creation of PBackground on the thread created earlier > + // 2) Creation of PCameras by sending a message to the parent on that thread > + nsRefPtr<InitializeIPCThread> runnable = new InitializeIPCThread(); > + nsRefPtr<SyncRunnable> sr = new SyncRunnable(runnable); > + sr->DispatchToThread(CamerasSingleton::getThread()); What thread does this occur on? I hope it's not MainThread, since we're a) doing this Dispatch under a mutex lock (deadlock risk), and b) blocking the sending thread *and* not processing the event loop for the thread. SyncRunnable is mostly intended for non-nsThread sources that don't have an event loop and don't want to be silently converted to an nsThread (and perhaps then leak the response). NrUdpSocketIPC::create_i() (and the UDPSocketChild code it calls) does with this by a) not holding the mutex, and b) doing this stuff on a newly-created thread, and using a monitor to signal completion. That is also not wonderful; it's blocking STS thread waiting for the IPC channel to open on the new thread. Better would be to block additional create() calls on STS and make create_i() return runnable (the same one likely) to STS to finish the create() (or multiple create()s). (Ignoring that it doesn't create separate runnables currently and uses WrapRunnable; that could change, or use another WrapRunnable in return.) So my suggestion: * Dispatch the runnable as NS_DISPATCH_NORMAL * In Run(), have it send itself back to the original thread * on the original thread in Run(), install the CameraChild and unblock whatever was waiting on Cameras(). However, this would require some redesign to make Cameras() async, or to guarantee we called CreateCameras() at some natural blocking point before we might need to call Cameras(). Also: is there any way to shut down the Camera thread/etc when unused? It doesn't appear so... We've been planning to shut down MediaManager's thread (and close related things like engines) when we're idle for more than a very short while (at least on mobile, likely on desktop too). Even if we can't shutdown the thread, we could close out resources. @@ +188,5 @@ > + return true; > +} > + > +int > +CamerasChild::NumberOfCapabilities(CaptureEngine aCapEngine, Identify where this gets called (thread) @@ +192,5 @@ > +CamerasChild::NumberOfCapabilities(CaptureEngine aCapEngine, > + const char* deviceUniqueIdUTF8) > +{ > + LOG((__PRETTY_FUNCTION__)); > + LOG(("NumberOfCapabilities for %s", deviceUniqueIdUTF8)); combine @@ +227,5 @@ > + return Cameras(true)->NumberOfCaptureDevices(aCapEngine); > +} > + > +int > +CamerasChild::NumberOfCaptureDevices(CaptureEngine aCapEngine) Merge with version with UniqueID (virtually identical) @@ +279,5 @@ > + > +int > +CamerasChild::GetCaptureCapability(CaptureEngine aCapEngine, > + const char* unique_idUTF8, > + const unsigned int capability_number, Note: we have an open bug that capability_number is a horrid index for grabbing the right device, so this will need to change eventually. But not now @@ +320,5 @@ > + mReplyCapability.maxFPS = ipcCapability.maxFPS(); > + mReplyCapability.expectedCaptureDelay = ipcCapability.expectedCaptureDelay(); > + mReplyCapability.rawType = static_cast<webrtc::RawVideoType>(ipcCapability.rawType()); > + mReplyCapability.codecType = static_cast<webrtc::VideoCodecType>(ipcCapability.codecType()); > + mReplyCapability.interlaced = ipcCapability.interlaced(); It'd be nice if we didn't need all this boilerplate, but oh well. @@ +350,5 @@ > +{ > + LOG((__PRETTY_FUNCTION__)); > + MonitorAutoLock monitor(mReplyMonitor); > + nsRefPtr<nsIRunnable> runnable = > + media::NewRunnableFrom([this, aCapEngine, list_number]() -> nsresult { ditto about the index/number @@ +474,5 @@ > + monitor.Wait(); > + if (mReplySuccess) { > + return 0; > + } else { > + return -1; arguable nit: I'd write return mReplySuccess ? 0 : -1;, but it's all equivalent If you leave it, remove the 'else'. Repeat for other instances of this pattern @@ +515,5 @@ > + cb); > +} > + > +int > +CamerasChild::StartCapture(CaptureEngine aCapEngine, There are many instances of almost the same Blah() wrapping a NewRunnableFrom() for SendBlah() and returning the result after waiting on the monitor; I wonder if this can be made a template or otherwise factored. But it's ok as-is @@ +603,5 @@ > + > +class ShutdownRunnable : public nsRunnable { > +public: > + ShutdownRunnable(nsRefPtr<nsIRunnable> aReplyEvent, > + nsRefPtr<nsIThread> aReplyThread) nsCOMPtrs @@ +618,5 @@ > + } > + > +private: > + nsRefPtr<nsIRunnable> mReplyEvent; > + nsRefPtr<nsIThread> mReplyThread; These should be nsCOMPtrs... @@ +645,5 @@ > + LOG(("Shutdown called without PBackground thread")); > + } > + LOG(("Erasing sCameras & thread refs (original thread)")); > + CamerasSingleton::getChild() = nullptr; > + CamerasSingleton::getThread() = nullptr; I still find the use of getFoo() being an assignable value confusing (because of 'get'). Maybe Child() and Thread()? @@ +704,5 @@ > + mIPCIsAlive(true), > + mReplyMonitor("mozilla::cameras::CamerasChild::mReplyMonitor") > +{ > + if (!gCamerasChildLog) > + gCamerasChildLog = PR_NewLogModule("CamerasChild"); braces @@ +724,5 @@ > +webrtc::ExternalRenderer* CamerasChild::Callback(CaptureEngine aCapEngine, > + int capture_id) > +{ > + for (unsigned int i = 0; i < mCallbacks.Length(); i++) { > + CapturerElement ce = mCallbacks[i]; c++11's looping (for (auto& ce : mCallbacks)) might be nice here; just a nit ::: dom/media/systemservices/CamerasChild.h @@ +44,5 @@ > + int id; > + webrtc::ExternalRenderer* callback; > +}; > + > +// statically mirror webrtc.org API Comment which class(es) we're mirroring for help in maintaining
Blocks: 1177242
Comment on attachment 8623655 [details] [diff] [review] Patch 1. v2 Proxy all camera/screen access to the main process Review of attachment 8623655 [details] [diff] [review]: ----------------------------------------------------------------- A bunch of nits, but looks close. (gcp and I talked over a bunch of the comments in the previous partial review live as well). ::: dom/media/systemservices/CamerasParent.cpp @@ +18,5 @@ > +#undef LOG_ENABLED > + > +PRLogModuleInfo *gCamerasParentLog; > +#define LOG(args) MOZ_LOG(gCamerasParentLog, mozilla::LogLevel::Debug, args) > +#define LOG_ENABLED() MOZ_LOG_TEST(gCamerasParentLog, 5) is 5 correct? @@ +88,5 @@ > + mBuffer = (unsigned char*)malloc(size); > + // There is an extra copy here. We can't copy into Shmem yet > + // because we have no idea if the next frame will arrive before > + // us copying the data into the Shmem will have finished. We need > + // multiple Shmem to avoid this. Can we file a bug to get rid of this extra copy? video-frame copies are bad... I'd prefer to resolve this before enabling this. Do we just need to double-buffer the SHmems? I'm also fine if we do this only when in an odd case, and normally we avoid it (if so, we should have some sort of log message when the odd case happens, or some other way to know). @@ +93,5 @@ > + memcpy(mBuffer, buffer, size); > + }; > + > + virtual ~DeliverFrameRunnable() { > + free(mBuffer); remove (use UniquePtr) @@ +120,5 @@ > +private: > + CamerasParent *mParent; > + CaptureEngine mCapEngine; > + int mCapId; > + unsigned char* mBuffer; UniquePtr @@ +278,5 @@ > + // Stop the callers > + while (mCallbacks.Length()) { > + auto capEngine = mCallbacks[0]->mCapEngine; > + auto capNum = mCallbacks[0]->mCapturerId; > + LOG(("Forcing shutdown of engne %d, capturer %d", capEngine, capNum)); engine @@ +282,5 @@ > + LOG(("Forcing shutdown of engne %d, capturer %d", capEngine, capNum)); > + { > + MutexAutoUnlock unlock(mCallbackMutex); > + RecvStopCapture(capEngine, capNum); > + RecvReleaseCaptureDevice(capEngine, capNum); We're going to continue the loop with the mutex relocked after this - can the mCallbacks list change while we had this unlocked? @@ +329,5 @@ > + return false; > + } > + > + nsRefPtr<nsIRunnable> webrtc_runnable = > + media::NewRunnableFrom([=]() -> nsresult { a few lines of comments here (or before this function) about what this following block does would be good, since this is the first of a bunch of similar functions using nested NewRunnableFrom()s. Also comment on how duplicated code could be reduced by using function templates or other things, and perhaps file a low-priority enhancement bug request @@ +417,5 @@ > + webrtcCaps.rawType, > + webrtcCaps.codecType, > + webrtcCaps.interlaced); > + LOG(("Capability: %d %d %d %d %d %d", > + webrtcCaps.width, are all of these ints? @@ +701,5 @@ > + mWebRTCThread(nullptr), > + mChildIsAlive(true) > +{ > + if (!gCamerasParentLog) > + gCamerasParentLog = PR_NewLogModule("CamerasParent"); braces ::: dom/media/systemservices/CamerasParent.h @@ +116,5 @@ > + bool mShmemInitialized; > + mozilla::ipc::Shmem mShmem; > + > + // PBackground parent thread > + nsIThread* mPBackgroundThread; Does this need to be a raw ptr? Otherwise nsCOMPtr<nsIThread> @@ +119,5 @@ > + // PBackground parent thread > + nsIThread* mPBackgroundThread; > + > + // webrtc processing thread > + base::Thread* mWebRTCThread; webrtc isn't really correct. MediaManagerThread, or VideoCaptureThread @@ +122,5 @@ > + // webrtc processing thread > + base::Thread* mWebRTCThread; > + > + // Shutdown handling > + bool mChildIsAlive; Maybe DISABLE_COPY_AND_ASSIGN(CamerasParent)? ::: dom/media/systemservices/CamerasUtils.cpp @@ +31,5 @@ > + NS_DECL_ISUPPORTS > + > +private: > + ~WorkerBackgroundChildCallback() > + { } ~WorkerBackgroundChildCallback() {} ::: dom/media/systemservices/CamerasUtils.h @@ +32,5 @@ > + } > + > +private: > + ~ThreadDestructor() {} > + nsCOMPtr<nsIThread> mThread; For classes that we have no belief should be copy/assigned we should DISABLE_COPY_AND_ASSIGN. Minor point for general style @@ +47,5 @@ > + } > + > +private: > + ~RunnableTask() {} > + nsRefPtr<nsIRunnable> mRunnable; nsCOMPtr ::: dom/media/webrtc/MediaEngineRemoteVideoSource.cpp @@ +39,5 @@ > + if (mozilla::camera::GetCaptureDevice(mCapEngine, > + mCaptureIndex, > + deviceName, kMaxDeviceNameLength, > + uniqueId, kMaxUniqueIdLength)) { > + return; LOG? ::: dom/media/webrtc/MediaEngineRemoteVideoSource.h @@ +28,5 @@ > +#include "MediaStreamGraph.h" > + > +#include "MediaEngineWrapper.h" > +#include "mozilla/dom/MediaStreamTrackBinding.h" > +// WebRTC library includes follow add a blank line before this (super-nit) @@ +59,5 @@ > + void *handle) override; > + virtual bool IsTextureSupported() override { return false; }; > + > + // > + MediaEngineRemoteVideoSource(int aIndex, mozilla::camera::CaptureEngine aCapEngine, Perhaps fill in the comment? ::: dom/media/webrtc/MediaEngineWebRTC.h @@ +182,4 @@ > > nsCOMPtr<nsIThread> mThread; > > + //XXX: what is this really locking? Need to resolve this. perhaps it's not locking anything anymore after your changes ::: dom/media/webrtc/moz.build @@ -21,2 @@ > > if CONFIG['MOZ_WEBRTC']: moz.build needs build peer review
Attachment #8623655 - Flags: review?(rjesup) → review-
Comment on attachment 8623655 [details] [diff] [review] Patch 1. v2 Proxy all camera/screen access to the main process Review of attachment 8623655 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/media/systemservices/PCameras.ipdl @@ +18,5 @@ > + int codecType; > + bool interlaced; > +}; > + > +async protocol PCameras I think this protocol would be very well suited to use the "request pattern" that we have been using in lots of other components. Basically any request/response pair get packaged as an actor. This way you guarantee 1 request gets one response, order is maintained, and you don't have to pass extra "ids" around manually. So, as an example: struct NumberOfCaptureDevicesRequestParams { }; struct NumberOfCapabilitiesRequestParams { }; struct CaptureCapabilityRequestParams { nsCString unique_idUTF8; int capability_number; }; union RequestParams { NumberOfCaptureDevicesRequestParams; NumberOfCapabilitiesRequestParams; CaptureCapabilityRequestParams; }; protocol PCameras { manages PCamerasRequest; // other stuff parent: PCamerasRequest(int engine, RequestParams params); }; union ResponseParams { // For NumberOfCaptureDevices, NumberOfCapabilities, etc. int; // For CaptureCapability. CaptureCapability; }; protocol PCamerasRequest { manager PCameras; child: __delete__(ResponseParams params); }; @@ +37,5 @@ > + async ReplyFailure(); > + async ReplySuccess(); > + > +parent: > + async NumberOfCaptureDevices(int engine); Most of these requests include this "engine" argument... I wonder if we shouldn't make PBackground manage multiple PCameraEngine subprotocols rather than one PCameras subprotocol? Then you wouldn't have to pass these ids all over the place. Does that make sense? @@ +50,5 @@ > + async StopCapture(int engine, int numdev); > + // transfers frame back > + async ReleaseFrame(Shmem s); > + > + async __delete__(); One thing that's tough to get right with these protocols is when to send the "__delete__" message. We can't tolerate this message racing with another message in the opposite direction (e.g. parent sends "DeliverFrame" at the same time child sends "__delete__"), so we normally have a two-step destruction sequence. This usually looks like: protocol PCameras { // ... parent: AllDone(); child: __delete__(); }; Then instead of the child deleting itself it waits for the parent to do it. Any messages that were already sent by the parent get processed or dropped, as appropriate.
Comment on attachment 8623655 [details] [diff] [review] Patch 1. v2 Proxy all camera/screen access to the main process Sorry, I won't be able to help further with this review, but the main protocol-level comments I have are already recorded here. I'd suggest following up with mrbkap or billm for any other IPDL questions.
Attachment #8623655 - Flags: review?(bent.mozilla)
Addressed review comments from jesup. Added comments where appropriate. I extracted some code that was generic into functions, but in general it's pretty hard to do as every call has small differences that must be dealth with. This patch adds a ShmemBufferPool that deals with having multiple frames in flight while still avoiding extra copies. It also deals with frame size changes by (if necessary) temporarily doing the extra copy once and then making sure the pool has the right size on the next frame. Try found a subtle race condition with multiple streams, which is fixed, but necessitated an additional mutex. I incorporated bent's suggestion for race-free delete. The PCamerasEngine idea doesn't really make sense to me as the calls/protocols are all 100% identical except for the engine enum. I feel the same about the ""request pattern" suggestion. Is this a suggestion for the next time we write something like this, or for a potential future improvement, or a hard requirement for r+? It looks like it requires rewriting most code here in which case I'd loved to have heard this comment on the initial review 4 months ago...
Attachment #8638064 - Flags: review?(rjesup)
Attachment #8638064 - Flags: review?(mrbkap)
Attachment #8623655 - Attachment is obsolete: true
Comment on attachment 8638064 [details] [diff] [review] Sandboxing support for Video camera access Review of attachment 8638064 [details] [diff] [review]: ----------------------------------------------------------------- Almost r+ (the ShmemPool(MaxEngine) is wrong I think), *but* since this moves ALL video capture to via-Shmem & IPC (not just when sandboxed), before it lands I'd like to see the deltas (for windows, mac, linux, most-recent-Android) in: a) CPU use (master and content processes) b) Latency (i.e. capture time until inserted into content graph with AppendToTrack (measured to right before that call). c) Memory use (while capturing, and after capturing ends to ensure we've cleaned up everything). If possible also direct measurement via os tools of Shmem usage. d) (bonus points) Shmem pool fullness levels during use (avg/max). I don't expect no added CPU/delay/etc; inherently this puts more layers/process-switches between capture and usage and uses more buffers; but we need to know these numbers to commit to this. We may want/need to tweak these (for example maybe the number of Shmem buffers), if not for initial landing then before it uplifts. ::: dom/media/systemservices/CamerasParent.cpp @@ +206,5 @@ > + new DeliverFrameRunnable(mParent, mCapEngine, mCapturerId, > + Move(shMemBuffer), buffer, size, time_stamp, > + ntp_time, render_time); > + MOZ_ASSERT(mParent); > + nsIThread * thread = mParent->GetBackgroundThread(); whitespace nit @@ +731,5 @@ > +} > + > +CamerasParent::CamerasParent() > + : mCallbackMutex("CamerasParent.mCallbackMutex"), > + mShmemPool(CaptureEngine::MaxEngine), One shmem buffer per engine??? Why is this the value chosen? ::: dom/media/systemservices/CamerasParent.h @@ +124,5 @@ > + // PBackground parent thread > + nsCOMPtr<nsIThread> mPBackgroundThread; > + > + // webrtc processing thread > + base::Thread* mVideoCaptureThread; Update comment? ::: dom/media/systemservices/ShmemPool.cpp @@ +89,5 @@ > +void ShmemPool::Put(ShmemBuffer&& aShmem) > +{ > + MutexAutoLock lock(mMutex); > + MOZ_ASSERT(mPoolFree < mShmemPool.Length()); > + mShmemPool[mPoolFree] = Move(aShmem); it's very tempting to use a model where we're elastic and will allocate above the limit, but will always free back down to the limit when frames come back (to here). This means that temporary glitches won't drop frames, though using a too-small Shmem setting will result in constant reallocation of Shmems, which is expensive. However, this is not mandatory. ::: dom/media/webrtc/MediaEngineRemoteVideoSource.cpp @@ +322,5 @@ > + switch(mMediaSource) { > + case dom::MediaSourceEnum::Camera: > +#ifdef XP_MACOSX > + // Mac doesn't support capabilities. > + // Please check this code against the current MACOSX ifdefed code to ensure the settings haven't changed. a "blink test" doesn't show any differences at a practical level.
Comment on attachment 8638064 [details] [diff] [review] Sandboxing support for Video camera access clearing review pending answers
Attachment #8638064 - Flags: review?(rjesup)
(In reply to Randell Jesup [:jesup] from comment #37) > c) Memory use (while capturing, and after capturing ends to ensure we've > cleaned up everything). If possible also direct measurement via os tools of > Shmem usage. We won't clean up much of anything until MediaManager gets cleaned up. > > +CamerasParent::CamerasParent() > > + : mCallbackMutex("CamerasParent.mCallbackMutex"), > > + mShmemPool(CaptureEngine::MaxEngine), > > One shmem buffer per engine??? Why is this the value chosen? The value here is really arbitrary, but having at least one Shmem buffer per engine allows us to capture from all available sources at a given fps without running out of buffers if all the callbacks arrive near-simultaneously. > > +void ShmemPool::Put(ShmemBuffer&& aShmem) > > +{ > > + MutexAutoLock lock(mMutex); > > + MOZ_ASSERT(mPoolFree < mShmemPool.Length()); > > + mShmemPool[mPoolFree] = Move(aShmem); > > it's very tempting to use a model where we're elastic and will allocate > above the limit, but will always free back down to the limit when frames > come back (to here). This means that temporary glitches won't drop frames, > though using a too-small Shmem setting will result in constant reallocation > of Shmems, which is expensive. However, this is not mandatory. We can do a follow-up to detect if there are (for example) 100 gets without going over the current available buffers - 1, and then free the last one up. That should give us the best of both worlds. > Please check this code against the current MACOSX ifdefed code to ensure the > settings haven't changed. a "blink test" doesn't show any differences at a > practical level. IIRC I rebased this quite intensively but I'll double check to be sure.
Whoops, looks like I overwrote part of my reply. >The value here is really arbitrary, but having at least one Shmem buffer per engine allows us to >capture from all available sources at a given fps without running out of buffers if all the callbacks >arrive near-simultaneously. Note that these buffers are just *holders* for Shmem and won't actually contain or allocate any memory until they are actually given out (i.e. we run out of them) and we know the buffer size we should allocate.
...and note it works like a stack (ShmemStack sounds kinda weird tho) so if you only ever use 1 or 2 buffers simultaneously that's also the max amount to get allocated.
(In reply to Gian-Carlo Pascutto [:gcp] from comment #39) > (In reply to Randell Jesup [:jesup] from comment #37) > > > c) Memory use (while capturing, and after capturing ends to ensure we've > > cleaned up everything). If possible also direct measurement via os tools of > > Shmem usage. > > We won't clean up much of anything until MediaManager gets cleaned up. We could drop Shmem buffers when there are no active captures. (See below) > > > +CamerasParent::CamerasParent() > > > + : mCallbackMutex("CamerasParent.mCallbackMutex"), > > > + mShmemPool(CaptureEngine::MaxEngine), > > > > One shmem buffer per engine??? Why is this the value chosen? > > The value here is really arbitrary, but having at least one Shmem buffer per > engine allows us to capture from all available sources at a given fps > without running out of buffers if all the callbacks arrive > near-simultaneously. (And as you note these are only holders for Shmems, and not filled until they first time N are needed at once, so it's high-water-mark in terms of memory use.) So it appears that we could have this be larger (or allow it to just grow as needed), and then contract if we're staying lower. One big thing I noticed going over this again: If we capture some 3K screen capture frames, all frames in the list may be 3K frames (likely all will end up 3K frames even if there are other streams being captured at say 320x240). If we then close that capture, and go call someone in Hello at 640x480, we're wasting a LOT of shmem. So, perhaps we need to insert-sort the freelist, and have Get() search from the small end of the list. a small amount of overhead (since these lists will be short, or should be) to save memory. Also, release buffers when all captures close. Question: do streams ever need two buffers per stream, perhaps if things are 'busy'? Is there any reason not to size the array larger (2x engines, or larger arbitrary number (100?)? The array objects themselves should be fairly small, so that part of the overhead should be small. > > > > +void ShmemPool::Put(ShmemBuffer&& aShmem) > > > +{ > > > + MutexAutoLock lock(mMutex); > > > + MOZ_ASSERT(mPoolFree < mShmemPool.Length()); > > > + mShmemPool[mPoolFree] = Move(aShmem); > > > > it's very tempting to use a model where we're elastic and will allocate > > above the limit, but will always free back down to the limit when frames > > come back (to here). This means that temporary glitches won't drop frames, > > though using a too-small Shmem setting will result in constant reallocation > > of Shmems, which is expensive. However, this is not mandatory. > > We can do a follow-up to detect if there are (for example) 100 gets without > going over the current available buffers - 1, and then free the last one up. > That should give us the best of both worlds. Yes, that might be good or in combination with the above.
>Question: do streams ever need two buffers per stream, perhaps if things are 'busy'? In theory I guess this is possible if some part of the browser starts lagging. I practice I see <= 1 buffers per stream (you can do 2 streams with one buffer if you're not unlucky with callback timing). You can make it bigger but I'm not sure there's really a point. I mean if we're not getting the frames out in real-time and they start buffering up, I'm not sure buffering much more is actually going to make for a better experience. Tweaking the pool size or making it dynamic is IMHO 100% premature optimization. >If we then close that capture, and go call someone in Hello at 640x480, we're wasting a LOT of >shmem. So, perhaps we need to insert-sort the freelist, and have Get() search from the small end of >the list. Sounds like more complexity for purely hypothetical gains. We should clean up the buffers if all capturers close though (But does this need explicit code or can we have MediaManager shut down?), and perhaps try a freeing pass when a single capturer closes. But IMHO that's all work for nice-to-have follow-up bugs.
Update: I'm reading through this patch today and tomorrow. I hope to have answers for gcp by tomorrow afternoon.
Comment on attachment 8638064 [details] [diff] [review] Sandboxing support for Video camera access Review of attachment 8638064 [details] [diff] [review]: ----------------------------------------------------------------- After reading through this patch, I have a few comments that definitely need to be addressed before checkin. On IRC, gcp was asking if bent's suggestion about the request pattern was a necessary requirement, and I don't believe that it should make any substantive difference, as we can only ever have one request in flight at a time, meaning that we don't have to worry about keeping track of what reply belongs to which request. It might be a nice followup, however, as this stuff is already very complicated and addressing that comment would relieve at least one source of confusion. Another general comment I have is that I'm a little lost in threads here. It could be because I was reading the patch too quickly, but a comment or assertion about which functions go on which threads (or even just a general comment in the .h file about what functions run on which threads) would go a long way towards clearing up my confusion. I'm sorry for taking so long to get to this review. I promise to be more responsive in the future (it'll help that I've now actually read through both this bug and the patch, so I won't have to re-learn as much the next time). ::: dom/media/systemservices/CamerasChild.cpp @@ +125,5 @@ > +private: > + CamerasChild* mCamerasChild; > +}; > + > +static CamerasChild* Cameras() { Nit: The function name usually goes on the next line. @@ +195,5 @@ > + return true; > +} > + > +bool > +CamerasChild::DispatchToParent(nsCOMPtr<nsIRunnable> aRunnable, This isn't right: there's no reason to take an nsCOMPtr<>, especially by value. This should take a raw pointer, which in XPCOM means that the caller must already have a strong reference to the pointed-to object. This happens in a few other places that should be fixed as well. @@ +200,5 @@ > + MonitorAutoLock& aMonitor) > +{ > + { > + OffTheBooksMutexAutoLock lock(CamerasSingleton::Mutex()); > + CamerasSingleton::Thread()->Dispatch(aRunnable, NS_DISPATCH_NORMAL); I think I saw jesup asking about this but I didn't see a response: holding a lock while dispatching risks deadlocks. Have you verified that this is safe? @@ +208,5 @@ > + if (!mIPCIsAlive) { > + return false; > + } > + // Wait for a reply > + aMonitor.Wait(); In all of these functions, we need to protect against spurious wakeups (which are rare, but can happen if we get a signal at the wrong time). This means that you need an explicit "received reply" boolean and do something like: do { aMonitor.Wait(); } while (!mReceivedResponse && mIPCIsAlive); return mReplySuccess; @@ +232,5 @@ > + } > + return NS_ERROR_FAILURE; > + }); > + // Prevent concurrent use of the reply variables. Note > + // that his is unlocked while waiting for the reply to be that *t*his... @@ +496,5 @@ > + webrtcCaps, > + cb); > +} > + > +int Why do these functions return int and not bool? @@ +601,5 @@ > + if (CamerasSingleton::Thread()) { > + LOG(("Dispatching actor deletion")); > + // Delete the parent actor > + nsCOMPtr<nsIRunnable> deleteRunnable = > + media::NewRunnableFrom([=]() -> nsresult { There are a few instances of this in both this file and CamerasParent.cpp. We had a long thread on m.d.platform about capturing (especially refcounted) objects in lambdas and one of the outcomes was to ban both [=] and [&] in favor of explicitly listing the closed-over variables. I would also appreciate a comment that we know that the CamerasChild pointer is safe from being destroyed even without a strong reference around the lambda because the IPC layer holds a strong reference to it until after the SendAllDone function returns. ::: dom/media/systemservices/CamerasChild.h @@ +75,5 @@ > +{ > + friend class mozilla::ipc::BackgroundChildImpl; > + > +public: > + NS_INLINE_DECL_REFCOUNTING(CamerasChild) What threads use CamerasChild? I was expecting there to be threadsafe refcounting, but maybe we only ever refcount it on the background thread? Please add comments about that, if so. @@ +141,5 @@ > + // The monitor below isn't sufficient for this, as it will drop > + // the lock when Wait-ing for a response, allowing us to send a new > + // request. The Notify on receiving the response will then unblock > + // both waiters and one will be guaranteed to get the wrong result. > + Mutex mRequestMutex; Please also comment that mRequestMutext must always be taken before mReplyMonitor in order to avoid deadlocking. ::: dom/media/webrtc/MediaEngineRemoteVideoSource.cpp @@ +53,5 @@ > + return; > +} > + > +void > +MediaEngineRemoteVideoSource::Shutdown() { Nit (here and a few other places): Gecko style places this brace on the next line.
Attachment #8638064 - Flags: review?(mrbkap)
(In reply to Blake Kaplan (:mrbkap) (please use needinfo!) from comment #45) > @@ +200,5 @@ > > + MonitorAutoLock& aMonitor) > > +{ > > + { > > + OffTheBooksMutexAutoLock lock(CamerasSingleton::Mutex()); > > + CamerasSingleton::Thread()->Dispatch(aRunnable, NS_DISPATCH_NORMAL); > > I think I saw jesup asking about this but I didn't see a response: holding a > lock while dispatching risks deadlocks. Have you verified that this is safe? We discussed this in person in Whistler. Basically the thread that's being dispatched to exists for no other task than to fire these runnables on to do IPC. Note the dispatch here is async, not sync, so the thread firing it can't deadlock by ending up reentrant. IIRC jesup's concern was about the SyncRunnable used earlier in the setup process. That one blocks the MediaManager thread (fine) and runs the runnable on the freshly-created-and-doing-nothing-else IPC thread (also fine). The lock here does nothing else than to ensure the Thread() member is valid when we try to Dispatch to it. > In all of these functions, we need to protect against spurious wakeups > (which are rare, but can happen if we get a signal at the wrong time). This > means that you need an explicit "received reply" boolean and do something > like: We should probably put a reminder in the Monitor class header. > Why do these functions return int and not bool? They match the upstream (webrtc.org) API. https://dxr.mozilla.org/mozilla-central/source/media/webrtc/trunk/webrtc/video_engine/include/vie_capture.h?from=vie_capture.h
Addressed review comments. +glandium for moz.build changes.
Attachment #8648000 - Flags: review?(rjesup)
Attachment #8648000 - Flags: review?(mrbkap)
Attachment #8648000 - Flags: review?(mh+mozilla)
Attachment #8638064 - Attachment is obsolete: true
Attached file memory-reports.tar.gz
Memory reports before and after running the capture, per OS (those with OS not specified are Linux).
(In reply to Randell Jesup [:jesup] from comment #37) > a) CPU use (master and content processes) > b) Latency (i.e. capture time until inserted into content graph with > AppendToTrack (measured to right before that call). > c) Memory use (while capturing, and after capturing ends to ensure we've > cleaned up everything). If possible also direct measurement via os tools of > Shmem usage. > d) (bonus points) Shmem pool fullness levels during use (avg/max). As for (d), on every system tested, the average Shmem pool fullness is 1 (buffer) and the maximum is equal to the number of capture sources active. This is the expected result as more than 1 buffer is only needed when a DeliverFrame arrives while a previous frame (typically from another engine) is still in flight. Linux (4.1, GTK3) results, measured with top, averaging 30 second intervals. CPU usage with IPC 1st run firefox 25.6% - 30.5% - 31.2% 2nd run firefox 12.2% - 15.0% - 16.3% (I get pretty high variance also for normal runs. When comparing to the other benchmarks and taking into account the Linux system is actually the fastest of the bunch, I suspect this is dominated by an inefficient display stack. Note the expected shift from web content to firefox process doesn't happen as the load from the capturer shifts, which reinforces it's just completely a negligible factor here.) Web Content 3-5% CPU usage Normal firefox 18.5% - 24.7% - 25.0% Web Content 3-5% CPU usage with IPC, 1080p video firefox 34.2% - 32.2% - 39.3% Web Content 3-5% (Compare to below. Now you can see the influence of moving the capturer!) CPU usage Normal, 1080p video firefox 19.0% 19.4% 21.5% Web Content 21.1% 17.9% 19.1% IPC latency on 1080p: 2015-08-12 19:38:24.033608 UTC - -525342976[7f06e7fd0c40]: Deliverframe 2015-08-12 19:38:24.034974 UTC - 1966470912[7ff2769f96e0]: AppendToTrack 2015-08-12 19:38:24.097659 UTC - -525342976[7f06e7fd0c40]: Deliverframe 2015-08-12 19:38:24.098978 UTC - 1966470912[7ff2769f96e0]: AppendToTrack 2015-08-12 19:38:24.165497 UTC - -525342976[7f06e7fd0c40]: Deliverframe 2015-08-12 19:38:24.166377 UTC - 1966470912[7ff2769f96e0]: AppendToTrack IPC latency on 1080p + full 100% CPU load on all cores: 2015-08-13 12:29:24.369627 UTC - -438311168[7efddc9ec8e0]: Deliverframe 2015-08-13 12:29:24.370713 UTC - 106956544[7fe40ae71fa0]: AppendToTrack 2015-08-13 12:29:24.401447 UTC - -438311168[7efddc9ec8e0]: Deliverframe 2015-08-13 12:29:24.402611 UTC - 106956544[7fe40ae71fa0]: AppendToTrack 2015-08-13 12:29:24.433327 UTC - -438311168[7efddc9ec8e0]: Deliverframe 2015-08-13 12:29:24.434666 UTC - 106956544[7fe40ae71fa0]: AppendToTrack Mac (10.10.4) Normal firefox: 4.8% 3.9% 4.2% web content: 9.7% 9.6% 9.9% IPC firefox: 11.9% 11.4% 9.3% web content: 4.9% 5.0% 5.1% IPC latency 2015-08-13 14:17:22.541883 UTC - 473436160[121573080]: AppendToTrack 2015-08-13 14:17:22.607663 UTC - 854605824[12b77f9f0]: DeliverFrame 2015-08-13 14:17:22.607932 UTC - 473436160[121573080]: AppendToTrack 2015-08-13 14:17:22.675128 UTC - 854605824[12b77f9f0]: DeliverFrame 2015-08-13 14:17:22.675470 UTC - 473436160[121573080]: AppendToTrack 2015-08-13 14:17:22.745731 UTC - 854605824[12b77f9f0]: DeliverFrame 2015-08-13 14:17:22.745951 UTC - 473436160[121573080]: AppendToTrack 2015-08-13 14:17:22.822058 UTC - 854605824[12b77f9f0]: DeliverFrame 2015-08-13 14:17:22.822289 UTC - 473436160[121573080]: AppendToTrack IPC latency 100% with CPU load on all cores 2015-08-13 14:22:35.466783 UTC - 952918016[12ae48680]: DeliverFrame 2015-08-13 14:22:35.467093 UTC - 473436160[121573080]: AppendToTrack 2015-08-13 14:22:35.531751 UTC - 952918016[12ae48680]: DeliverFrame 2015-08-13 14:22:35.531974 UTC - 473436160[121573080]: AppendToTrack 2015-08-13 14:22:35.599106 UTC - 952918016[12ae48680]: DeliverFrame 2015-08-13 14:22:35.599361 UTC - 473436160[121573080]: AppendToTrack 2015-08-13 14:22:35.664352 UTC - 952918016[12ae48680]: DeliverFrame 2015-08-13 14:22:35.664673 UTC - 473436160[121573080]: AppendToTrack 2015-08-13 14:22:35.730292 UTC - 952918016[12ae48680]: DeliverFrame 2015-08-13 14:22:35.730569 UTC - 473436160[121573080]: AppendToTrack Windows 7: Normal 1080p firefox: 1% web content: 2% (yay windows graphics drivers!) IPC 1080p: firefox: 2-3% web content: 1% IPC latency: 2015-08-14 09:35:13.852000 UTC - 0[cb83ef0]: DeliverFrame 2015-08-14 09:35:13.857000 UTC - 219644[967d3a0]: AppendToTrack 2015-08-14 09:35:13.857000 UTC - 0[cb83ef0]: DeliverFrame 2015-08-14 09:35:13.859000 UTC - 219644[967d3a0]: AppendToTrack 2015-08-14 09:35:13.885000 UTC - 0[cb83ef0]: DeliverFrame 2015-08-14 09:35:13.886000 UTC - 219644[967d3a0]: AppendToTrack 2015-08-14 09:35:13.919000 UTC - 0[cb83ef0]: DeliverFrame 2015-08-14 09:35:13.921000 UTC - 219644[967d3a0]: AppendToTrack 2015-08-14 09:35:13.949000 UTC - 0[cb83ef0]: DeliverFrame 2015-08-14 09:35:13.950000 UTC - 219644[967d3a0]: AppendToTrack IPC latency, 100% load on all cores 2015-08-14 09:42:26.605000 UTC - 0[615190]: DeliverFrame 2015-08-14 09:42:26.608000 UTC - 221336[967f3a0]: AppendToTrack 2015-08-14 09:42:26.625000 UTC - 0[615190]: DeliverFrame 2015-08-14 09:42:26.627000 UTC - 221336[967f3a0]: AppendToTrack 2015-08-14 09:42:26.658000 UTC - 0[615190]: DeliverFrame 2015-08-14 09:42:26.659000 UTC - 221336[967f3a0]: AppendToTrack 2015-08-14 09:42:26.689000 UTC - 0[615190]: DeliverFrame 2015-08-14 09:42:26.690000 UTC - 221336[967f3a0]: AppendToTrack 2015-08-14 09:42:26.737000 UTC - 0[615190]: DeliverFrame 2015-08-14 09:42:26.737000 UTC - 221336[967f3a0]: AppendToTrack I'm not sure if there's much use running Android given the Linux results. Overall conclusions: CPU usage mostly seems negligible compared to the noise. Latency is typically 1-2 milliseconds even in heavy load (which is *much* better than I'd have expected). Still need to look at the memory results.
(In reply to Gian-Carlo Pascutto [:gcp] from comment #53) > Linux (4.1, GTK3) results, measured with top, averaging 30 second intervals. > > CPU usage with IPC > 1st run firefox 25.6% - 30.5% - 31.2% > 2nd run firefox 12.2% - 15.0% - 16.3% What do 1st and 2nd run here mean exactly? > CPU usage with IPC, 1080p video > firefox 34.2% - 32.2% - 39.3% > Web Content 3-5% > (Compare to below. Now you can see the influence of moving the > capturer!) > > CPU usage Normal, 1080p video > firefox 19.0% 19.4% 21.5% > Web Content 21.1% 17.9% 19.1% That's interesting ... and surprising. It'd be interesting to see what's eating the CPU time there. Note that it's possible Gecko profiler won't capture it depending on what thread it's on. I'm guessing the 1080p is MJPEG? Could be decoding that is the main hit. > Mac (10.10.4) > > Normal > firefox: 4.8% 3.9% 4.2% > web content: 9.7% 9.6% 9.9% > > IPC > firefox: 11.9% 11.4% 9.3% > web content: 4.9% 5.0% 5.1% So a small increase in CPU (from 13.5-14.7 to 14.4-16.4 total) > > Windows 7: > > Normal 1080p > firefox: 1% > web content: 2% > (yay windows graphics drivers!) > > IPC 1080p: > firefox: 2-3% > web content: 1% A very small increase. > > IPC latency: > 2015-08-14 09:35:13.852000 UTC - 0[cb83ef0]: DeliverFrame > 2015-08-14 09:35:13.857000 UTC - 219644[967d3a0]: AppendToTrack > 2015-08-14 09:35:13.857000 UTC - 0[cb83ef0]: DeliverFrame > 2015-08-14 09:35:13.859000 UTC - 219644[967d3a0]: AppendToTrack > 2015-08-14 09:35:13.885000 UTC - 0[cb83ef0]: DeliverFrame > 2015-08-14 09:35:13.886000 UTC - 219644[967d3a0]: AppendToTrack > 2015-08-14 09:35:13.919000 UTC - 0[cb83ef0]: DeliverFrame > 2015-08-14 09:35:13.921000 UTC - 219644[967d3a0]: AppendToTrack > 2015-08-14 09:35:13.949000 UTC - 0[cb83ef0]: DeliverFrame > 2015-08-14 09:35:13.950000 UTC - 219644[967d3a0]: AppendToTrack > > IPC latency, 100% load on all cores > 2015-08-14 09:42:26.605000 UTC - 0[615190]: DeliverFrame > 2015-08-14 09:42:26.608000 UTC - 221336[967f3a0]: AppendToTrack > 2015-08-14 09:42:26.625000 UTC - 0[615190]: DeliverFrame > 2015-08-14 09:42:26.627000 UTC - 221336[967f3a0]: AppendToTrack > 2015-08-14 09:42:26.658000 UTC - 0[615190]: DeliverFrame > 2015-08-14 09:42:26.659000 UTC - 221336[967f3a0]: AppendToTrack > 2015-08-14 09:42:26.689000 UTC - 0[615190]: DeliverFrame > 2015-08-14 09:42:26.690000 UTC - 221336[967f3a0]: AppendToTrack > 2015-08-14 09:42:26.737000 UTC - 0[615190]: DeliverFrame > 2015-08-14 09:42:26.737000 UTC - 221336[967f3a0]: AppendToTrack A little more variance than Mac or linux (up to 5ms), and maybe a slight increase in latency with 1080p, but overall not a problem. > I'm not sure if there's much use running Android given the Linux results. Yeah, not sure there. A very different gfx pipeline than desktop I think. > Overall conclusions: CPU usage mostly seems negligible compared to the > noise. Latency is typically 1-2 milliseconds even in heavy load (which is > *much* better than I'd have expected). Agreed!
Comment on attachment 8648000 [details] [diff] [review] Proxy video capture device access to the main thread Review of attachment 8648000 [details] [diff] [review]: ----------------------------------------------------------------- Please file the followup bug(s) for releasing memory when not in use, or when the resolution drops a lot (less critical)
Attachment #8648000 - Flags: review?(rjesup) → review+
(In reply to Randell Jesup [:jesup] from comment #54) > > CPU usage Normal, 1080p video > > firefox 19.0% 19.4% 21.5% > > Web Content 21.1% 17.9% 19.1% > > That's interesting ... and surprising. It'd be interesting to see what's > eating the CPU time there. Note that it's possible Gecko profiler won't > capture it depending on what thread it's on. I'm guessing the 1080p is > MJPEG? Could be decoding that is the main hit. System wide: 11.82% [kernel] [k] system_call 9.38% libxul.so [.] decode_mcu 5.96% [kernel] [k] copy_user_enhanced_fast_string 5.14% libc-2.19.so [.] __memcpy_sse2_unaligned 5.11% libc-2.19.so [.] syscall 3.30% libxul.so [.] jsimd_idct_ifast_sse2.column_end 2.32% libxul.so [.] jpeg_fill_bit_buffer 2.20% [kernel] [k] ktime_get_ts64 2.05% libxul.so [.] jsimd_idct_ifast_sse2.columnDCT 1.71% [kernel] [k] native_read_tsc 1.37% [kernel] [k] sys_clock_gettime 1.36% libxul.so [.] decompress_onepass 1.33% libc-2.19.so [.] memset Firefox: 9.40% [.] decode_mcu 3.29% [.] jsimd_idct_ifast_sse2.column_end 2.31% [.] jpeg_fill_bit_buffer 2.03% [.] jsimd_idct_ifast_sse2.columnDCT 1.36% [.] decompress_onepass 0.68% [.] jsimd_idct_ifast_sse2 0.30% [.] CopyRow_ERMS 0.28% [.] jpeg_huff_decode 0.25% [.] InterpolateRow_SSSE3 0.14% [.] jsimd_idct_ifast 0.03% [.] jzero_far So yes, MJPEG decoding. On Windows the camera (Logitech C920) might be using H264 or the MJPEG decoding might be done in the kernel and not be credited to the Firefox process.
Windows could easily be hiding it, and likely is. And I don't think it does (or can do) uncompressed HD over the USB connection; so it's MJPEG or some other compressed format. decode_mcu() is MJPEG code and looks to be the worst offender. Without knowing the tool, it's hard to make inferences about the above stats in detail (like which threads - compositor/gfx vs stuff inside webrtc, etc) Sounds like we have some frame copies in there too eating a noticable amount; hard to say which ones without stacks. None of this is a blocker -- but interesting. It may well be worth filing & attaching profiles for the parts that are in compositor/gfx.
Comment on attachment 8648000 [details] [diff] [review] Proxy video capture device access to the main thread Review of attachment 8648000 [details] [diff] [review]: ----------------------------------------------------------------- I have one more big question that needs to be answered (and it applies both to CamerasParent as well as the cases CamerasChild) about lifetimes and holding objects alive as well as a few more nits to pick. It sounds like we might have to implement bent's suggestion about the requests and ordering due to performance requirements and the ability to send multiple requests at the same time. As I've said though, that's OK to do in a followup bug. ::: dom/media/systemservices/CamerasChild.cpp @@ +538,5 @@ > + webrtcCaps.rawType, > + webrtcCaps.codecType, > + webrtcCaps.interlaced); > + nsCOMPtr<nsIRunnable> runnable = > + media::NewRunnableFrom([this, aCapEngine, capture_id, capCap]() -> nsresult { I think this is my last big question: do we have anything that guarantees that |this| will live until the runnable runs? It might be necessary to do something like: nsRefPtr<CamerasChild> self(this); and capture self instead. ::: dom/media/systemservices/CamerasUtils.h @@ +21,5 @@ > +{ > + DISALLOW_COPY_AND_ASSIGN(ThreadDestructor); > + > +public: > + explicit ThreadDestructor(nsCOMPtr<nsIThread> aThread) This should take |nsIThread* aThread|. @@ +40,5 @@ > + > +class RunnableTask : public Task > +{ > +public: > + explicit RunnableTask(nsRefPtr<nsRunnable> aRunnable) And this should take |nsRunnable* aRunnable|. ::: dom/media/webrtc/MediaEngineRemoteVideoSource.cpp @@ +32,5 @@ > + Init(); > +} > + > +void > +MediaEngineRemoteVideoSource::Init() { Nit: these two functions' opening braces need to go on their own lines.
Attachment #8648000 - Flags: review?(mrbkap)
Attachment #8648000 - Flags: review?(mh+mozilla) → review+
(In reply to Blake Kaplan (:mrbkap) (please use needinfo!) from comment #58) > It sounds like we might have to implement bent's suggestion about the > requests and ordering due to performance requirements and the ability to > send multiple requests at the same time. No? At the least it's not clear to me why you believe this seems likely at this point. Note that it's possible for multiple DeliverFrame's to be in-flight at the same time (this is evident by the ShmemPool usage ever going >1). The request-response style messages are only needed for device enumeration/start/stop and those are in the other direction so they won't even stop the DeliverFrame stream. > > + nsCOMPtr<nsIRunnable> runnable = > > + media::NewRunnableFrom([this, aCapEngine, capture_id, capCap]() -> nsresult { > > I think this is my last big question: do we have anything that guarantees > that |this| will live until the runnable runs? It might be necessary to do > something like: > > nsRefPtr<CamerasChild> self(this); > > and capture self instead. Bah. This sucks because then suddenly CamerasChild has multiple owners in different threads. Also, even if "this" is being kept alive it's being used to try to do an IPC call on. If the other owner (the IPC code) has deleted it's reference then that function call isn't going to work anyhow. However, I don't think this scenario is possible. CamerasChild is owned by the IPC layer. It gets destructed by its parent when its own Shutdown() function sends out the SendAllDone message. Before this happens, the very first thing it does is set the flag to disable further IPC, and then take the same lock that anyone issuing IPC needs too, before Dispatching the deletion of the IPC thread. So here it's not possible for any of those Runnables to end up sitting in the queue AFTER the Runnable that will actually cause CamerasChild to be deleted. It's not possible for the IPC layer itself to spontaneously decide it wants to kill off some protocols, right?
Flags: needinfo?(mrbkap)
(In reply to Gian-Carlo Pascutto [:gcp] from comment #59) > No? At the least it's not clear to me why you believe this seems likely at > this point. That comment was based on a conversation I had with blassey a little bit ago. I guess that's not the case. > It's not possible for the IPC layer itself to spontaneously decide it wants > to kill off some protocols, right? That's actually possible on the parent side instead of the child. If the child crashes, then we deref all of the actors. My comment still holds on the parent side, unfortunately.
Flags: needinfo?(mrbkap) → needinfo?(gpascutto)
See Also: → 1195303
Flags: needinfo?(gpascutto)
Addressed outstanding review comments.
Attachment #8652941 - Flags: review?(mrbkap)
Attachment #8648000 - Attachment is obsolete: true
Just the last changes split out for ease of reviewing.
Attachment #8652943 - Flags: review?(mrbkap)
Attachment #8652941 - Attachment is obsolete: true
Attachment #8652941 - Flags: review?(mrbkap)
Attachment #8652943 - Attachment is obsolete: true
Attachment #8652943 - Flags: review?(mrbkap)
Addressing review comments. Part1/2
Attachment #8653164 - Flags: review?(mrbkap)
Addressing review comments, part 2/2.
Attachment #8653165 - Flags: review?(mrbkap)
For reference. Review whatever is easiest for you.
Attachment #8653168 - Flags: review?(mrbkap)
Attachment #8653164 - Flags: review?(mrbkap) → review+
Attachment #8653165 - Flags: review?(mrbkap) → review+
Attachment #8653168 - Flags: review?(mrbkap) → review+
Follow up bug for reducing capture buffers automatically: https://bugzilla.mozilla.org/show_bug.cgi?id=1200193
(In reply to Randell Jesup [:jesup] from comment #57) > Without > knowing the tool, it's hard to make inferences about the above stats in > detail (like which threads - compositor/gfx vs stuff inside webrtc, etc) > > Sounds like we have some frame copies in there too eating a noticable > amount; hard to say which ones without stacks. FWIW this was gathered with Linux's kernel perf tool so it's whole-system. The copies don't appear to be inside a Firefox process but on the kernel/libc side.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
Depends on: 1200477
Blocks: 1107971
Why did you make your own copy of SynchronouslyCreatePBackground instead of factoring the code out into a common location?
Flags: needinfo?(gpascutto)
I don't remember. Might simply have forgotten to clean up that part.
Flags: needinfo?(gpascutto)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: