Closed Bug 708484 Opened 13 years ago Closed 12 years ago

Add camera capture stream provider for gonk (temporary solution)

Categories

(Core :: Audio/Video, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla13

People

(Reporter: cjones, Assigned: fabrice)

References

Details

Attachments

(4 files, 13 obsolete files)

468.37 KB, image/png
Details
14.21 KB, patch
Details | Diff | Splinter Review
20.88 KB, patch
cjones
: review+
Details | Diff | Splinter Review
16.13 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
B2G needs camera capture.  Instead of blocking on all the WIP webrtc APIs and patches, it seems most prudent to add a backend similar to what we have for android currently.  That's what this bug covers.  This code would be a temporary solution, to be deprecated in favor of a richer client API in webrtc.

We want something approximately like like http://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/device/AndroidCaptureProvider.h, but GonkCaptureProvider.  (In the future, we might be able to share more code in there, if we target the C++ APIs instead of Java ones.  But that's work for a followup.)
Note that, the window brought up by the file picker in current b2g-gonk apparently can't be dismissed, and steals focus from the main window (not unexpectedly).  So the only way I found to dismiss it was to restart the b2g process.  I heard that fabrice was hacking on this so maybe he already found a solution :).
Indeed the capture picker is one case where we need some chrome UI. I just used some xul in the main xul file, not opening a new window so we don't lose the focus.

Other cases where we may need full screen chrome UI are the file picker and the JPAKE setup dialog for sync (for trust/security reasons). This means that we also need a way to dismiss these panels in some circumstances (eg. incoming call). I think the other prompts should be "app modal" so treated differently (alert(), prompt(), confirm(), http authentication dialogs, and content permission prompts)
I wrote a GonkCaptureProvider using the API from CameraHardwareInterface.h

Opening a camera (front or back) an setting callbacks works, I get frames and they are piped correctly in a <video> element. Unfortunately calls to getParameters() hang so we can't retrieve or modify the preview size. I'm not sure yet why this fails, using directly the CameraService code leads to the same issue.

Since the galaxy s2 has v4l2 support I will just implement a capture provider on top of v4l2.
Is that going to block on landing the webrtc repo?  That concerns me a bit.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #4)
> Is that going to block on landing the webrtc repo?  That concerns me a bit.

No, I started a gonk-only v4l2 backend. When webrtc will land we should be able to switch, but we're not blocked.
Assignee: nobody → fabrice
This sounds great, looking forward to a patch so I can get the camera app working (even if it's just a file picker) and make the gallery app actually useful.

Do we have bugs to track "alert(), prompt(), confirm(), http authentication dialogs, and content permission prompts" for B2G on Gonk?

Also off topic for this bug but where would a file picker pick files from for B2G? Picking files from anywhere in the file system isn't something you usually get on a mobile OS.
(In reply to Ben Francis from comment #7)
> Do we have bugs to track "alert(), prompt(), confirm(), http authentication
> dialogs, and content permission prompts" for B2G on Gonk?
> 

They work, but they look like crap.  Good point though, we haven't filed this yet, will do.

> Also off topic for this bug but where would a file picker pick files from
> for B2G? Picking files from anywhere in the file system isn't something you
> usually get on a mobile OS.

Since we're not offering native file system access to web apps, and there aren't other native apps to populate the filesystem, I'm not sure what the use cases are, or how a file picker would work.  If apps store data in Blobs in IDB, we put the blob on the file system, but I don't think the Blob necessarily has a meaningful name (or a name at all).  If we implement the proposed media storage API, that would be easy to expose I guess.  Other than that, intents feel like the right tool for most of these jobs.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #8)
> Since we're not offering native file system access to web apps, and there
> aren't other native apps to populate the filesystem, I'm not sure what the
> use cases are, or how a file picker would work.
<snip>
>  Other
> than that, intents feel like the right tool for most of these jobs.

Yes, discussed this with fabrice and justindarc in IRC earlier and the conclusion was that Web Intents seems like the right thing to use here, similar to how Android works. Do you think we'll need a temporary solution in the meantime or should we jump straight to using Web Intents? I can file a bug for this in either case...
I wouldn't worry much about the file picker.  It's just functionality to disable.  Let's jump straight to intents, and start prototyping in gaia itself (https://github.com/andreasgal/gaia/issues/237).  fabrice is interested in doing the gecko impl so he should stay in the loop.
Being able to get files from the camera as well as from the media library puts us at parity with "file picker"-like functionality in Android/iOS. Exposing it in
<input type=file> puts us ahead of everyone else there.
The problem is, as things stand we don't have a way for media apps to put content on the native file system in a way that our file picker can recognize.  But if we implement a dedicated media-library API, we would.
Attached patch wip: UI part (obsolete) — Splinter Review
This is the xuFront-end part
Attached patch wip : backend (obsolete) — Splinter Review
backend, using the android HAL layer.
Kanru,

to test, apply both patches and then use the camera app from gaia.

The interesting pieces are in GonkCaptureProvider.h/.cpp (in netwerk/protocol/device)

The call to GetParameters() hangs gecko on my phone.
Attached patch wip : backend (obsolete) — Splinter Review
removing a useless library.
Attachment #588215 - Attachment is obsolete: true
Something I find so far: The libcameraservice.so blob does not use libcamera.so but uses libseccameraadaptor.so instead. And the HAL_ api prefix was replaced with SEC_ in the adapter. When I use it to open the camera and call getParameters, it crashed in setParameters. I suspect there might be header mismatch.
It turns out they added a virtual method in between startPreview and stopPreview so all virtual methods after stopPreview are shifted by one.

--- a/include/camera/CameraHardwareInterface.h
+++ b/include/camera/CameraHardwareInterface.h
@@ -130,10 +130,12 @@ public:
      * Only used if overlays are used for camera preview.
      */
     virtual bool         useOverlay() {return false;}
     virtual status_t     setOverlay(const sp<Overlay> &overlay) {return BAD_VALUE;}
 
+    virtual void something() {}
+
     /**
      * Stop a previously started preview.
      */
     virtual void        stopPreview() = 0;
Attached patch wip: UI part (obsolete) — Splinter Review
The CSS is really hackish, and it needs to be l10n.
Attachment #588214 - Attachment is obsolete: true
Attached patch wip: backend (obsolete) — Splinter Review
With KanRu modification to the CameraHardwareInterface header and this patch, I can set up and stream frames properly, and close the camera!
Attachment #588439 - Attachment is obsolete: true
Attached image screenshot
because we all love cats
Nice teamwork Fabrice and Kan-Ru. Now why on earth would they add a virtual method there? Does Maguro have that extra method?
(In reply to Andreas Gal :gal from comment #22)
> Nice teamwork Fabrice and Kan-Ru. Now why on earth would they add a virtual
> method there? Does Maguro have that extra method?

Maguro has 2 extra methods, and also a different signature for HAL_openCameraHardware()...
What a mess. Fabrice, you should email the maguro guys and ask them why this is the case and how we are supposed to support multiple devices. Maybe we are not using the right interface/abstraction.
maguro added 3d related stuff for some devices.

Multiple devices support is still possible using #ifdefs. That's easy on the glue/gonk side but I'm not sure yet how to do this in gecko. Note that philikon also needs that in gecko for some samsung specific stuff in RIL...

I think we are the right abstraction level here. The camera HAL is changing post android 4.0 and should be better. For now, the alternative would be to pretend being a client for the CameraService, but that means another IPC layer between the hardware and us.
Ok let's ifdef it. Not great but such is life. Let's land this ASAP. Very cool.
Fabrice, can we create wrap the camera view into an XBL?

I know it's not standard (but like the current solution) but it will let Gaia add an app that contains:
<html>
 <body>
  <camera></camera>
  <input type="button" value="switch">
  ...
 </body>
</html>

and do all the styling/interaction logic from HTML (and possibly get rid of the <input type="file"> button)

What do you think?
As much as I hate the <input type="file"> button for Gaia's camera app from a UX point of view, unless you're proposing <camera> as a new HTML tag for standardisation I think we should put our effort into WebRTC's getUserMedia() rather than hacking something together with non-standard XBL. http://dev.w3.org/2011/webrtc/editor/getusermedia.html

Is it even possible to use XBL in content with XUL disabled?
(In reply to Ben Francis from comment #28)
> As much as I hate the <input type="file"> button for Gaia's camera app from
> a UX point of view, unless you're proposing <camera> as a new HTML tag for
> standardisation I think we should put our effort into WebRTC's
> getUserMedia() rather than hacking something together with non-standard XBL.
> http://dev.w3.org/2011/webrtc/editor/getusermedia.html
> 

I don't want to standardize <camera> at all and I guess XBL sounds a bad solution.

I just want a temporary solution that offer flexibility and that will be closer to the API offered by the platform at the end. Basically I want to be able to  write all the app's chrome from the content side.

Hacking a temporary API mimicking webRTC that will let us use navigator.getUserMedia({video: true}) with the MediaStreamRecorder.getRecordedData method to draw on a canvas sounds good to me.  

> Is it even possible to use XBL in content with XUL disabled?

video/audio are XBL. XBL does not necessarly wrap XUL content.
It's also possible to not show the <input type='file'> but call .click() on it to have the file picker being shown. Would that work? (see bug 36619)
That's a nice idea Mounir, I just tried it and it doesn't seem to work. Is the popup blocker enabled on the device? When I run it on the desktop the popup blocker gets in the way of the file picker opening.

This wouldn't solve Vivien's concerns about the use of XUL for the UI though.
Attached patch wip: UI partSplinter Review
Attachment #590348 - Attachment is obsolete: true
Attached patch wip : backend (obsolete) — Splinter Review
Attachment #590349 - Attachment is obsolete: true
We almost certainly prevent calling .click() except off of "user-input events" (i.e. popup blocker, yes).  So you'd want to make a button or something that popped up the picker.
Attached patch wip : poor man's getUserMedia (obsolete) — Splinter Review
XX : this patch is a temporary hack

This patch build on top of the backend patch to provide a poor man's substitute to webRTC getUserMedia()

It allows a webpage that has the privilege for "content-camera" to get camera stream into a video element on a webpage:

var node = getElementById("myVideo");
node.src = navigator.mozB2G.getCameraURI({ camera: 0, width: 640, height: 480});

camera #0 is the back camera, camera #1 is the front facing one. width and height may not be matched exactly, depending on the hardware support.
Attachment #591890 - Flags: review?(21)
Comment on attachment 591890 [details] [diff] [review]
wip : poor man's getUserMedia

>diff --git a/b2g/components/B2GComponents.manifest b/b2g/components/B2GComponents.manifest
>+# b2g.js
>+component {eff4231b-abce-4f7f-a40a-d646e8fde3ce} b2g.js
>+contract @mozilla.org/b2g-content;1 {eff4231b-abce-4f7f-a40a-d646e8fde3ce}
>+category JavaScript-navigator-property mozB2G @mozilla.org/b2g-content;1

I understand why you have choose this name but it really sounds too generic to me.

b2g.js -> CameraContent.js
mozB2G -> mozCamera

>diff --git a/b2g/components/Makefile.in b/b2g/components/Makefile.in 
>+XPIDLSRCS = \
>+        b2g.idl \
>+        $(NULL)
>+
> EXTRA_PP_COMPONENTS = \
>         B2GComponents.manifest \
>         CapturePicker.js \
>+        b2g.js \
>         $(NULL)
>

s/b2g/CameraContent/
 
>diff --git a/b2g/components/b2g.js b/b2g/components/b2g.js
>+const Cc = Components.classes;
>+const Ci = Components.interfaces;
>+const Cu = Components.utils;
>+const Cr = Components.results;

nit: Cr is unused

>+
>+Cu.import("resource://gre/modules/XPCOMUtils.jsm");
>+Cu.import("resource://gre/modules/Services.jsm");
>+
>+const kProtocolName = "b2g-camera:";
>+const kTimeout = 3000;
>+
>+B2GContent = function() {

B2GContent -> var CameraContent

>+  this.messages = ["b2g-content:getCameraURI"];
>+
>+  this.mm = Cc["@mozilla.org/childprocessmessagemanager;1"].getService(Ci.nsIFrameMessageManager);
>+
>+  this.messages.forEach((function(msgName) {
>+    this.mm.addMessageListener(msgName, this);
>+  }).bind(this));

let mm = Cc["@mozilla.org/childprocessmessagemanager;1"].getService(Ci.nsIFrameMessageManager);
mm.addMessageListener(b2g-content:getCameraURI", this);

No need to use 'this'

>+  this.hasPrivileges = false;
>+  this.mapping = { };
>+  this.timer = Cc["@mozilla.org/timer;1"].createInstance(Ci.nsITimer);
>+  this.timer.init(this, kTimeout, Ci.nsITimer.TYPE_REPEATING_SLACK);
>+}
>+ 
>+B2GContent.prototype = {
>+
>+  getCameraURI: function(aOptions) {
>+    dump("GContent::getCameraURI\n");

nit: Remove dump

>+    if (!this.hasPrivileges)
>+      return null;

nit: add a line break

>+    let options = aOptions || { };
>+    if (!options.camera)
>+      options.camera = 0;
>+    if (!options.width)
>+      options.width = 320;
>+    if (!options.height)
>+      options.height = 240;

nit: add a line break

>+    let id = Cc["@mozilla.org/uuid-generator;1"].getService(Ci.nsIUUIDGenerator).generateUUID().toString();
>+    id = id.substring(1, id.length - 2); // remove the brackets

id -> uuid?

>+    this.mapping[id] = (new Date()).getTime() + kTimeout;
>+    let uri = kProtocolName + "?camera=" + options.camera + 
>+                              "&width=" + options.width +
>+                              "&height=" + options.height +
>+                              "&type=video/x-raw-yuv";

nit: add a line space

>+    // XXX that's no e10s ready, but the camera inputstream itself is not...
>+    Services.prefs.setCharPref("b2g.camera." + kProtocolName + "?" + id, uri);
>+    return kProtocolName + "?" + id;
>+  },
>+  
>+  observe: function(aSubject, aTopic, aData) {
>+    dump("B2GContent observer (" + aTopic + ")\n");

nit: remove the dump

>+    if (aTopic == "inner-window-destroyed") {
>+      let wId = aSubject.QueryInterface(Ci.nsISupportsPRUint64).data;
>+      if (wId == this.innerWindowID) {
>+        Services.obs.removeObserver(this, "inner-window-destroyed");

Can you use a weak ref so you don't have to call removeObserver yourself?

>+        this.timer.cancel();
>+        this.timer = null;
>+        this.mapping = null;
>+      }
>+    } else if (aTopic == "timer-callback") {
>+      let now = (new Date()).getTime();

Date.now() ?

>+      dump("now is " + now + "\n");

nit: remove the dump

>+      for (aId in this.mapping) {
>+        dump(aId + " " + this.mapping[aId] + "\n");

nit: remove the dump

>+        if (this.mapping[aId] > now) {
>+          dump("\tremoved\n");

nit: removed the dump

>+          Services.prefs.clearUserPref("b2g.camera." + kProtocolName + "?" + aId);

Do you also have to clean up the pref in the inner-window-destroyed case?

>+          delete this.mapping[aId];
>+        }
>+      }
>+    }
>+  },


I don't really understand the role of your timer.
Does it mean the requester has 3 seconds to use the camera stream?

>+
>+  receiveMessage: function(aMessage) {
>+    switch(aMessage.name) {
>+      case "b2g-content:getCameraURI":
>+        if (this.mapping(aMessage.json.id)) {
>+          let options = this.mapping(aMessage.json.id);
>+          return kProtocolName + "?camera=" + options.camera + 
>+                                 "&width=" + options.width +
>+                                 "&height=" + options.height +
>+                                 "&type=video/x-raw-yuv";
>+        } else {
>+          return "";
>+        }

if (!this.mapping[aMessage.json.id])
  return "";

I think you want to use [] instead of () for this.mapping.

Also it seems like the e10s support is unused. I won't be opposed to a removal :)


>+        break;
>+    }
>+  },
>+
>+  init: function(aWindow) {
>+    dump("B2GContent::init() " + aWindow.location.href + "\n");

nit: remove the dump


I can't review the netwerk/protocol part.
Attachment #591890 - Flags: review?(21) → review-
Attached patch backend (obsolete) — Splinter Review
rebased backend
Attachment #591307 - Attachment is obsolete: true
Attachment #591968 - Flags: review?(jones.chris.g)
Attached patch poor man's getUserMedia (obsolete) — Splinter Review
updated the JS part, addressing Vivien's comments

Chris, can you review the protocol handler? It's mostly a copy of nsDeviceProtocolHandler.cpp
Attachment #591890 - Attachment is obsolete: true
Attachment #591989 - Flags: review?(jones.chris.g)
Attachment #591989 - Flags: review?(21)
(In reply to Fabrice Desré [:fabrice] from comment #39)
> Created attachment 591989 [details] [diff] [review]
> poor man's getUserMedia
> 
> updated the JS part, addressing Vivien's comments

The code seems fine now but I still don't really understand the role of your timer.
Does it mean the requester has 3 seconds to use the camera stream?
> The code seems fine now but I still don't really understand the role of your
> timer.
> Does it mean the requester has 3 seconds to use the camera stream?

yep. Maybe it's useless and I should just cleanup in the end.
(In reply to Fabrice Desré [:fabrice] from comment #41)
> > The code seems fine now but I still don't really understand the role of your
> > timer.
> > Does it mean the requester has 3 seconds to use the camera stream?
> 
> yep. Maybe it's useless and I should just cleanup in the end.

I guess you also use that as a simple way to specify parameters to the output stream?
Comment on attachment 591968 [details] [diff] [review]
backend

Patch is going to be updated.
Attachment #591968 - Flags: review?(jones.chris.g)
Attachment #591989 - Flags: review?(jones.chris.g)
Attached patch backend (obsolete) — Splinter Review
This patch implements the vtable hack and also dlopen()s libcamera to get aroudn the different signatures for HAL_openCameraHarware.

I can only test on the SGS2 for now, and it works fine here.
Attachment #591968 - Attachment is obsolete: true
Attachment #592317 - Flags: review?(jones.chris.g)
Attached patch poor man's getUserMedia (obsolete) — Splinter Review
Chris, can you review the protocol handler parts? Vivien already r+ the js part.
Attachment #591989 - Attachment is obsolete: true
Attachment #591989 - Flags: review?(21)
Attachment #592319 - Flags: review?(jones.chris.g)
Comment on attachment 592317 [details] [diff] [review]
backend

>diff --git a/netwerk/protocol/device/GonkCaptureProvider.cpp b/netwerk/protocol/device/GonkCaptureProvider.cpp

>+/* -*- Mode: C++; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
>+/* ***** BEGIN LICENSE BLOCK *****

Use the new license header.  Keep the modeline here.

>+CameraHardwareInterface* CameraHardwareInterface::openCamera(PRUint32 aCamera) {
>+  CameraHardwareInterface* instance;

Use nsAutoPtr and .forget() the return value.

>+  if (!instance->ok()) {
>+    delete instance;
>+    instance = NULL;

Nit: I don't like it, but usual gecko style is |nsnull|.  Also
throughout here.  If network/ has different local style, let me know.

Make sure to remove the |delete| here with the nsAutoPtr change.  C++
compilers will happily coerce the nsAutoPtr<T> to T* and then delete
the T* out from under the nsAutoPtr.  Whups!

>+// The maximum number of frames we keep in our queue. Don't live in the past.
>+#define MAX_FRAMES_QUEUED 10
>+

That seems way too high, but we can tune this later.

Does this big buffer size exist to work around stutter in the camera
hardware or stutter in the gecko client?  I seriously doubt the HW is
all that variable, so I assume gecko client.  If the client is
stuttering, then it's either (i) not keeping up with the HW, in the
steady state; or (ii) clearing the queue in bursts.  I can't think of
a reason for (ii) (bad GCs?), but even if (ii) were happening, I don't
know why we would want to preserve frames for >10/fps seconds in the
past.  For (i), the client has no hope of catching up, so we should
just keep a smaller buffer.

>+NS_IMPL_THREADSAFE_ISUPPORTS2(GonkCameraInputStream, nsIInputStream, nsIAsyncInputStream)
>+

I don't understand the threading model of this code.  Where should I
look for docs?

>+GonkCameraInputStream::GonkCameraInputStream() :
>+  mWidth(0), mHeight(0), mFps(30), mCamera(0), mHeaderSent(false), mClosed(true), mFrameSize(0),
>+  mMonitor("AndroidCamera.Monitor")

Really? ;)

>+{
>+  mAvailable = sizeof(nsRawVideoHeader);

This should be in the initializer list too.

>+  mFrameQueue = new nsDeque();

See below re: changes to mFrameQueue.

>+GonkCameraInputStream::~GonkCameraInputStream() {

This destructor is extremely scary.  I don't understand the threading
model of *CameraInputStream, so we'll need to discuss that before I
can r+.

>+  // clear the frame queue
>+  while (mFrameQueue->GetSize() > 0) {
>+    nsMemory::Free(mFrameQueue->PopFront());

free()

>+NS_IMETHODIMP
>+GonkCameraInputStream::Init(nsACString& aContentType, nsCaptureParams* aParams)
>+{
>+  PRUint32 sizeDelta = 10000000;

wtf is this?  Looks like a sentinel max value.  If so, PR_UINT32_MAX
is what you want.

>+  PRUint32 bestWidth = mWidth;
>+  PRUint32 bestHeight = mHeight;
>+  for (PRUint32 i = 0; i < previewSizes.size(); i++) {
>+    Size size = previewSizes[i];
>+    PRUint32 delta = abs(size.width * size.height - mWidth * mHeight); 
>+    if (delta < sizeDelta) {
>+      sizeDelta = delta;
>+      bestWidth = size.width;
>+      bestHeight = size.height;
>+    }
>+  }

It looks like this algorithm is trying to find a previewSize as close
as possible to mWidth/mHeight as possible, as measured by difference
in number of pixels.  That wasn't clear to me on first or second read,
please add a comment.  Also s/sizeDelta/minSizeDelta/.

This algorithm implicitly assumes that the requested size and preview
size have pretty close to the same aspect ratio, but that seems
reasonable.  We probably don't care about other cases yet.

>+void 
>+GonkCameraInputStream::ReceiveFrame(char* frame, PRUint32 length) {
>+  {
>+    mozilla::ReentrantMonitorAutoEnter autoMonitor(mMonitor);

|using namespace mozilla;|, and please search-replace all other
instances.  s/autoMonitor/enter/.

>+    if (mFrameQueue->GetSize() > MAX_FRAMES_QUEUED) {
>+      nsMemory::Free(mFrameQueue->PopFront());

free().  But see below.

>+  mFrameSize = sizeof(nsRawPacketHeader) + length;
>+
>+  char* fullFrame = (char*)nsMemory::Alloc(mFrameSize);
>+

moz_malloc().  nsMemory::Alloc() is infallible now.  This allocation
should be fallible.  But see below.

>+  nsRawPacketHeader* header = (nsRawPacketHeader*) fullFrame;

reinterpret_cast

>+  header->packetID = 0xFF;
>+  header->codecID = 0x595556; // "YUV"

I have no idea what these mean, and they're not documented in
nsRawStructs.h.

>+  // we copy the Y plane, and de-interlace the CrCb
>+  
>+  PRUint32 yFrameSize = mWidth * mHeight;
>+  PRUint32 uvFrameSize = yFrameSize / 4;
>+

This is hard-coding an input format of 4:2:0 (or 4:1:1?) and an output
format of 4:2:0 (or 4:1:1?), right?  Why?

>+  memcpy(fullFrame + sizeof(nsRawPacketHeader), frame, yFrameSize);
>+
>+  char* uFrame = fullFrame + yFrameSize;
>+  char* vFrame = fullFrame + yFrameSize + uvFrameSize;

Doesn't |fullFrame| point at the header?  Why is this pointer
arithmetic valid?

>+  char* yFrame = frame + yFrameSize;

|const char* const yFrame|.  Does |frame| have a header too?

I'm getting really lost following the hard-coded binary formats here.
I know it's not your fault, but if you know where all this stuff comes
from, please add comments.  If we're hard-coding input/output, I
wonder if we can't get away with an easier-to-understand memcpy()
here.

Let's discuss this regardless.

>+NS_IMETHODIMP GonkCameraInputStream::Read(char *aBuffer, PRUint32 aCount, PRUint32 *aRead NS_OUTPARAM) {
>+  return ReadSegments(NS_CopySegmentToBuffer, aBuffer, aCount, aRead);
>+}
>+
>+NS_IMETHODIMP GonkCameraInputStream::ReadSegments(nsWriteSegmentFun aWriter, void *aClosure, PRUint32 aCount, PRUint32 *aRead NS_OUTPARAM) {

>+    header.options = 1 | 1 << 1; // color, 4:2:2

I don't believe that we're outputing 4:2:2!

>+      rv = aWriter(this, aClosure, (const char*)frame, *aRead, mFrameSize, &readThisTime);
>+

AFAICT, the only reason ReentrantMonitor is used in this class is
because of this |aWriter| callback and NotifyListeners() below.  Does
some spec say those have to happen while the monitor is held?  If not,
drop the monitor before calling out, and make this a non-reentrant
Monitor.  Otherwise, sigh, OK.

>+      nsMemory::Free(frame);

free()

>+NS_IMPL_THREADSAFE_ISUPPORTS0(GonkCaptureProvider)

Why does this need to be threadsafe?

>+nsresult GonkCaptureProvider::Init(nsACString& aContentType,

>+    if (stream) {
>+      nsresult rv = stream->Init(aContentType, aParams);
>+      if (NS_FAILED(rv))
>+        return rv;
>+    }
>+    else
>+      return NS_ERROR_OUT_OF_MEMORY;

Nit: braces here please.

>diff --git a/netwerk/protocol/device/GonkCaptureProvider.h b/netwerk/protocol/device/GonkCaptureProvider.h

>+/* -*- Mode: C++; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
>+/* ***** BEGIN LICENSE BLOCK *****

New license header, keep the modeline.

>+#define USE_GS2_LIBCAMERA

Move all this gunk to GonkCaptureProvider.cpp.

>+// XXX move this to Hal/hal.cpp ?

No!! :) Move this to GonkCaptureProvider.cpp.  We're hacking around
different camera interfaces, which aren't necessarily device specific.
This is the feature-test vs. UA-sniff issue we discussed in MV.  The
maguro interface is from a newer stock android.  The sgs2 one ... who
knows.

At any rate, this code should die "soon", so let's just keep it in
GonkCaptureProvider.cpp.

>+class DeviceInformation {

I'd prefer to re-orient this around the camera interface we need to
use, not the device itself.

>+  public:
>+    typedef enum {
>+      DEVICE_SGS2,
>+      DEVICE_MAGURO
>+    } Type;
>+    
>+    static Type getType() {
>+      char propValue[PROPERTY_VALUE_MAX];
>+      property_get("ro.product.board", propValue, NULL);
>+      if (!strcmp(propValue, "GT-I9100"))
>+        return DEVICE_SGS2;
>+      
>+      // XXX don't have a maguro to check the value
>+      return DEVICE_MAGURO;

I wrote this on IRC but you were probably gone ... this value is in
device/toro/maguro/BoardConfig.mk, the TARGET_BOOTLOADER_BOARD_NAME
variable.

What happens if you try to use camera capture in the emulator?  Does
it return a null camera HAL or something?

We need a default vtable that's just whatever is in
camera/CameraHardwareInterface without the workaround ifdef'ery.  So
we also need a default camera interface value here.

>+    }
>+};
>+
>+class CameraHardwareInterface {

This and pretty much all the other classes declared here should be in
GonkCaptureProvider.cpp.  The hackery in here shouldn't pollute other
files through header includes.  Forward-declare your way to victory
for GonkCameraInputStream.

>+      void *hal = dlsym(mHandle, "HAL_openCameraHardware");
>+      HAL_openCameraHardware_SGS2 funct1 = (HAL_openCameraHardware_SGS2)hal;
>+      HAL_openCameraHardware_MAGURO funct2 = (HAL_openCameraHardware_MAGURO)hal;

reinterpet_cast, and do this after you know the type of interface
(i.e. in the |case|s of the switch below).

>+      if (!hal) {
>+        dlclose(mHandle);
>+        mHandle = NULL;

This code wants a dlclose() RAII helper.  You don't /need/ to write it
for this, but I would be happier if you did.

>+    ~CameraImpl() {

>+        mCamera.clear();

Isn't this taken care of by sp<>?  Not sure what this call is supposed
to achieve.

>+class GonkCameraInputStream : public nsIAsyncInputStream {

(This class should stay in this header, obviously.)

>+    nsDeque *mFrameQueue;

Why isn't this an inline member?  Make it so if you don't have a
reason otherwise.

More generally, a queue (inline or not) of dynamically allocated
frames, deleted after each use, is highly suboptimal.  In the steady
state (when gecko/client can keep up), we should have one or two
frames enqueued at all times.  But the general pattern will be,
malloc/free/malloc/free/malloc/malloc/free/malloc/free/free/etc.
Recycling allocated frames will be way friendlier to cache and help
avoid memory fragmentation.  See also question above re: number of
buffered frames.

Since I assume gecko-android uses this impl, we can try to fix this
for both in one fell swoop.  So -->followup work.

>+    CameraHardwareInterface* mHardware;

Can this be nsAutoPtr?

>diff --git a/netwerk/protocol/device/nsDeviceChannel.cpp b/netwerk/protocol/device/nsDeviceChannel.cpp

>+#ifdef MOZ_WIDGET_GONK
>+    if (mozilla::Preferences::GetBool("device.camera.enabled", false) == true)

|using namespace mozilla;|.  Factoring out the preference check seems
nice, but may not help much here.  The next person gets to do that.

>diff --git a/toolkit/library/Makefile.in b/toolkit/library/Makefile.in

>+  -lcutils \

Are you sure we're not linking to this already?  Doesn't matter too
much though.

>+  -lcamera_client \
>+  -lbinder \

I assume these are to avoid prevent link errors for the IPC server
code built into lcamera?  If so, Andreas hacked on a no-op binder stub
for stagefright that we could generalize and use here.  We could also
make a camera_client stub.  This work should be in a followup though.
(If client-less camera HAL needs binder to function correctly, we're
in deep doodoo :/.)

Whew, lots of questions.  Let's chat about this.
Attachment #592317 - Flags: review?(jones.chris.g)
Comment on attachment 592319 [details] [diff] [review]
poor man's getUserMedia

Sorry, I don't feel comfortable reviewing the protocol handler stuff either.  Ping me if you can't find another (responsive) reviewer.
Attachment #592319 - Flags: review?(jones.chris.g)
Attached patch backend v2Splinter Review
Addressed most comments (I think), just answering on specific points :

(In reply to Chris Jones [:cjones] [:warhammer] from comment #46)

> >+// The maximum number of frames we keep in our queue. Don't live in the past.
> >+#define MAX_FRAMES_QUEUED 10
> >+
> 
> That seems way too high, but we can tune this later.

I lowered it to 5 without noticeable difference.
 
> Does this big buffer size exist to work around stutter in the camera
> hardware or stutter in the gecko client?  I seriously doubt the HW is
> all that variable, so I assume gecko client.  If the client is
> stuttering, then it's either (i) not keeping up with the HW, in the
> steady state; or (ii) clearing the queue in bursts.  I can't think of
> a reason for (ii) (bad GCs?), but even if (ii) were happening, I don't
> know why we would want to preserve frames for >10/fps seconds in the
> past.  For (i), the client has no hope of catching up, so we should
> just keep a smaller buffer.

We have some (ii) issue sometimes, because the current video rendering pipeline 
is not designed for "real time" streams and instead does its best to buffer. We
added a special "real time" mode, that still have some issues that we're tracking
with cpearce.
 
> >+NS_IMPL_THREADSAFE_ISUPPORTS2(GonkCameraInputStream, nsIInputStream, nsIAsyncInputStream)
> >+
> 
> I don't understand the threading model of this code.  Where should I
> look for docs?

This inputstream gets frames pushed in by the HAL thread, and pumped out by the media decoder thread.

> 
> This is hard-coding an input format of 4:2:0 (or 4:1:1?) and an output
> format of 4:2:0 (or 4:1:1?), right?  Why?

We are hardcoding 4:2:0 because this is what the rawReader supports (see http://mxr.mozilla.org/mozilla-central/source/content/media/raw/nsRawReader.cpp#124)
 
> >+  memcpy(fullFrame + sizeof(nsRawPacketHeader), frame, yFrameSize);
> >+
> >+  char* uFrame = fullFrame + yFrameSize;
> >+  char* vFrame = fullFrame + yFrameSize + uvFrameSize;
> 
> Doesn't |fullFrame| point at the header?  Why is this pointer
> arithmetic valid?

Good catch for fullFrame! It's |fullFrame + sizeof(nsRawPacketHeader)|
 
> >+  char* yFrame = frame + yFrameSize;
> 
> |const char* const yFrame|.  Does |frame| have a header too?

each frame has a small header (nsRawPacketHeader), but |frame| itself doesn't.

> I'm getting really lost following the hard-coded binary formats here.
> I know it's not your fault, but if you know where all this stuff comes
> from, please add comments.  If we're hard-coding input/output, I
> wonder if we can't get away with an easier-to-understand memcpy()
> here.

This code converts yuv420sp to yuv420p (interlaced to deinterlaced).  I now check 
if the hardware supports yuv420p and just do a memcpy() in this case. That's the 
case for sgs2.
 
> AFAICT, the only reason ReentrantMonitor is used in this class is
> because of this |aWriter| callback and NotifyListeners() below.  Does
> some spec say those have to happen while the monitor is held?  If not,
> drop the monitor before calling out, and make this a non-reentrant
> Monitor.  Otherwise, sigh, OK.

My understanding is that AsyncWait (that calls NotifyListeners()) can do reentrant
calls.

> >diff --git a/toolkit/library/Makefile.in b/toolkit/library/Makefile.in
> 
> >+  -lcutils \
> 
> Are you sure we're not linking to this already?  Doesn't matter too
> much though.

Yep, we don't link to it anywhere else.
 
> >+  -lcamera_client \
> >+  -lbinder \
> 
> I assume these are to avoid prevent link errors for the IPC server
> code built into lcamera?  If so, Andreas hacked on a no-op binder stub
> for stagefright that we could generalize and use here.  We could also
> make a camera_client stub.  This work should be in a followup though.
> (If client-less camera HAL needs binder to function correctly, we're
> in deep doodoo :/.)

camera_client is needed for the CameraParameters and binder for the IMemory that
is used to get the frames... (https://github.com/cgjones/android-frameworks-base/blob/gingerbread-b2g/include/binder/IMemory.h)
We don't use the binder itself afaict.
Attachment #592317 - Attachment is obsolete: true
Attachment #593210 - Flags: review?(jones.chris.g)
I forgot to add that we probably want a whitelist of supported vtables, and fail early when not in the whitelist.
Comment on attachment 593210 [details] [diff] [review]
backend v2

>+class DlopenWrapper {

>+    void* dlsym(const char* aFunction) {
>+      return ::dlsym(mHandle, "HAL_openCameraHardware");

Whups.

On IRC, you said "closeWithStatus() is called before destroying."
Please add a comment that links to where that's guaranteed as part of
some necko spec.

I went through this a little more quickly than I probably should have,
but looks like you covered my comments.

r=me with fixes above
Attachment #593210 - Flags: review?(jones.chris.g) → review+
Comment on attachment 592319 [details] [diff] [review]
poor man's getUserMedia

Brad, can you review the protocol handler part? 
It's mostly a copy&paste of the device protocol handler with the following changes:
- access is granted to web content
- we retrieve the real URI from a pref that is set by chrome JS so the content JS can't use it directly successfully.
Attachment #592319 - Flags: review?(blassey.bugs)
(In reply to Fabrice Desré [:fabrice] from comment #51)
> backend pushed:
> https://hg.mozilla.org/integration/mozilla-inbound/rev/7f7668eead68

https://hg.mozilla.org/mozilla-central/rev/7f7668eead68
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla13
Comment on attachment 592319 [details] [diff] [review]
poor man's getUserMedia

Review of attachment 592319 [details] [diff] [review]:
-----------------------------------------------------------------

this is pretty rubber-stampy. Nothing jumps out at me as wrong
Attachment #592319 - Flags: review?(blassey.bugs) → review+
Comment on attachment 592319 [details] [diff] [review]
poor man's getUserMedia

Thanks Brad!

Just asking a second review to bz to be extra sure. Boris, the part to be reviewed is the protocol handler.
Attachment #592319 - Flags: review+ → review?(bzbarsky)
Comment on attachment 592319 [details] [diff] [review]
poor man's getUserMedia

A few questions/comments:

1)  In init(), you're using the window.location URI instead of the document's URI.  Is that purposeful?  They don't quite match up because the former is exposed to scripts and has to be sanitized... do you want the sanitized version?

2)  from.schemeIs("about") will be true for every single iframe out there as long as the page doesn't load anything in it.  I really doubt that's what you really want here.  What _do_ you want?  Why is the test for "about" there?  It looks like you're doing a security check, but if so shouldn't you be using principals, not URIs?

4)  It looks like the camera URIs rely on unguessability of uuids, basically, right?  How confident are you in the UUID generator?  Is it ok that site A could postMessage the URI string to site B that could then read camera info from that URI?  (I'm not saying it's not OK; just trying to understand what the security model is.)

5)  In NewURI, please use .get() instead of BeginReading() for callsites that are supposed to be passing a null-terminated const char*.  BeginReading() doesn't guarantee null-termination.

6)  What's the expected use of these URIs?  The fact that the first newURI call wipes the pref could lead to very surprising behavior, depending on how the URI strings are used (e.g. putting two <a href="camera:whatever"> on the page would make only one of them work).

7)  What does actually opening this device channel do?  Does it trigger sny sort of user interaction?

8)  Should these URIs be URI_IS_LOCAL_RESOURCE?

r- for now because the "about" thing just looks wrong no matter what.
Attachment #592319 - Flags: review?(bzbarsky) → review-
Attached patch poor man's getUserMedia v2 (obsolete) — Splinter Review
Hi Boris, thanks for the review!

> 1)  In init(), you're using the window.location URI instead of the document's URI.  Is that purposeful?  They don't quite match up because the former is exposed to scripts and has to be sanitized... do you want the sanitized version?

Good catch, I moved to document.documentURI

> 2)  from.schemeIs("about") will be true for every single iframe out there as long as the page doesn't load anything in it.  I really doubt that's what you really want here.  What _do_ you want?  Why is the test for "about" there?  It looks like you're doing a security check, but if so shouldn't you be using principals, not URIs?

I removed the "about" and "file" whitelist since we don't need these here, but I don't think that from.schemeIs("about") can be true when init() is called if it's not from an about: page since init() is called when the page first uses the navigator.mozCamera property. At this point this mean that content has been loaded from somewhere.

Using principals is not what we do usually for web apis - he permission manager is easier to use if we want to hook up a settings UI from non c++ code.

> 4)  It looks like the camera URIs rely on unguessability of uuids, basically, right?  How confident are you in the UUID generator?  Is it ok that site A could postMessage the URI string to site B that could then read camera info from that URI?  (I'm not saying it's not OK; just trying to understand what the security model is.)

Yes, we rely on uuid being hard to guess and the uuid generator being good. It's also true that a privileged page could transmit the uuid to another page, but this is the same kind of trust we give for other data (eg geolocation). Since we only give the privilege to preinstalled gaia apps for now, I'm not worried.

> 6)  What's the expected use of these URIs?  The fact that the first newURI call wipes the pref could lead to very surprising behavior, depending on how the URI strings are used (e.g. putting two <a href="camera:whatever"> on the page would make only one of them work).

The intended usage is an a src in a <video> element. Typically, code like this be used: <video src="navigator.mozCamera.getCameraURI()">
If the result from |navigator.mozCamera.getCameraURI()| is used in two video elements, the second one will just fail and display nothing.

> 7)  What does actually opening this device channel do?  Does it trigger sny sort of user interaction?

Opening the channel creates an input stream that grab frames from the camera in raw format. There's no explicit user interaction involved.
The media decoder uses this input stream to feed the video element.

> 8)  Should these URIs be URI_IS_LOCAL_RESOURCE?

yep, added!
Attachment #592319 - Attachment is obsolete: true
Attachment #594243 - Flags: review?(bzbarsky)
> Good catch, I moved to document.documentURI

If the document URI works for you, you could use document.documentURIObject and that would just hand back an nsIURI; no need to round-trip through a string.  But see more below.

> At this point this mean that content has been loaded from somewhere.

You can have script running in a frame with URI about:blank.  Simple example:

  <iframe id="x></iframe>
  <script>
    var d = document.getElementById("x").contentDocument;
    var s = d.createElement("script");
    s.textContent = "/* use navigator.mozCamera here */"
    d.documentElement.appendChild(s);
  </script>

presto, navigator.mozCamera access when the URI is about:blank.

> Using principals is not what we do usually for web apis 

Uh... If you are NOT using principals for security checks you will have security holes, period.  There's no reason you can't use the permission manager when using principals; you just have to derive your origin from the principal, not the document URI.  If nsIPrincipal.origin doesn't do what you want, please file bugs to expose functionality that _will_ (and explain what you actually want).  But as far as I can tell, this patch will just break if a whitelisted site tries to use camera API from a subframe as above.  That can't possibly be desirable.

> The intended usage is an a src in a <video> element.

Hmm.  That's relying on the media selection algorithm only running once (or URI objects being cached across algorithm runs) and even then not all such algorithm implementations would work ok.  This may be OK for a prototype, but it's really no good for shipping code.  We need a better solution there.

Have you looked into how blob URIs are implemented?  Seems like what you want has a lot of the same design characteristics, right?
Comment on attachment 594243 [details] [diff] [review]
poor man's getUserMedia v2

r- because using URIs for security checks is broken.
Attachment #594243 - Flags: review?(bzbarsky) → review-
Attached patch poor man's getUserMedia v3 (obsolete) — Splinter Review
Now using the document principal to perform the permission check.

Regarding your last comment, I'm don't understand where the flaw is in the current implementation.

The protocol newChannel() with call nsDeviceChanel::init() that will return succeed only if the set up of the underlying hardware also succeeds. Do we really need more?
Attachment #594243 - Attachment is obsolete: true
Attachment #594311 - Flags: review?(bzbarsky)
> Regarding your last comment, I'm don't understand where the flaw is in the current
> implementation.

Your current implementation assumes that for a given string newURI will only be called once, right?  This is not really a safe assumption.

Let me give you a simple example.  Say you have:

  <div id="x">
    <video id="y">
  </div>
  <script>
    document.getElementById("y").src = navigator.mozCamera.getCameraURI();
    document.getElementById("x").innerHTML = document.getElementById("x").innerHTML;
  </script>

In that situation, the video won't work right.  A similar problem would happen if instead of the innerHTML set the div containing the video were just moved from one place in the document to another.

Again, this is probably OK for a prototype or a very controlled environment where you can expect people to work around whatever idiosyncrasies the implementation has, so it might be good enough for purposes of this bug, but it's not OK for shipping web-facing code.  So it sort of depends on when the planned rewrite, if any, of the URI parts of this is....
> +    let perm = Services.perms.testExactPermission(aWindow.document.nodePrincipal.URI,
> "content-camera");

This will throw an exception if aWindow.document has the system principal.  You probably want to check for that up front.
Now testing if the document's principal is the system principal.

To answer to comment #61:
We don't plan to expose this to any other app but gaia's default camera app. We do it this way as an interim solution, waiting for proper MediaStreams and webrtc support to be available. So my first goal is to get this landed, and then I hope we'll soon be able to get rid of this.
Attachment #594311 - Attachment is obsolete: true
Attachment #594311 - Flags: review?(bzbarsky)
Attachment #594821 - Flags: review?(bzbarsky)
> We don't plan to expose this to any other app but gaia's default camera app.

OK.  That makes me much happier about the current approach...
Comment on attachment 594821 [details] [diff] [review]
poor man's getUserMedia v4

r=me on condition that this never gets exposed to the web in its current state...  Some XXX comments to that effect next to the prefs might be a good idea.
Attachment #594821 - Flags: review?(bzbarsky) → review+
https://hg.mozilla.org/mozilla-central/rev/6234bb0a5f42

should this be closed? I'm unsure which patches did land
yes, it's ok to close now.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.