Closed Bug 451674 Opened 16 years ago Closed 13 years ago

Expose camera functionality to web content


(Core :: DOM: Core & HTML, defect)

Windows XP
Not set





(Reporter: pavlov, Unassigned)



(Keywords: mobile)


(4 files, 10 obsolete files)

2.79 KB, patch
: review+
Details | Diff | Splinter Review
321 bytes, patch
: review+
Details | Diff | Splinter Review
29.34 KB, patch
: review-
Details | Diff | Splinter Review
1.62 KB, patch
: review+
Details | Diff | Splinter Review
It would be great if I could have a web page that could take photos using the camera on my laptop or the camera on my phone.
Flags: wanted1.9.1+
So is the goal to only allow this to happen on pages that specifically opt in? Or should we try to allow it on any page with a file upload?
I think this needs to be explicit functionality.  I'm thinking of using the w3c device upload "spec," so you'd have <input type="file" device="camera"/>
(In reply to comment #1)
> So is the goal to only allow this to happen on pages that specifically opt in?
> Or should we try to allow it on any page with a file upload?

What kind of opt-in is needed? If the <input type="file" device="camera"/> appears in a page is that good enough? Should the control have a button to actually take the picture (akin to the button to select a file)? Since the user needs to press the button, that is kind of opt-in, right?
So there needs to be "opt-in" for both the web developer and for the user.  From the web developer's point of view, they need to explicitly use device="camera" to get this new UI element.  From the user's point of view, they'll have to explicitly click a "take photo" button to take the photo.  The page will display with a live video preview of the photo, but the at data won't be accessible to the web developer until the "take photo" button is pressed.
So there is actually already a feature in HTML4 (and expanded in HTML5) that we can use. Basically saying

<input type=file accept="image/*">

should allow the user to use the camera. Though we should probably also allow the user to attach any previously taken pictures.

Full spec here:

Which also allows things like:
<input type=file accept="application/word-document, application/odf-document">

but I think for our purposes just looking for "image/*" is fine.

Ideal would be if we also allowed things like "image/jpeg" or "image/gif" and encoded into that format when sending. But that seems secondary.
For a single photo, an extension to <input type="file"> could work, but what about continuous video from the user's camera like Flash allows?
Well, for a short video sequence

<input type=file accept="video/*">

works fine. We simply upload it as a ogg or mpeg or whatever we support for encoding. For something like streaming video to implement video chatting we need something different.
This is a very rough patch that depends on the gstreamer decoder for the video tag.  It looks for the protocol "camera://" in the video tag's src attribute.  Currently video from a video4linux2 source is supported. The idea would be to store the camera's settings in prefs.

The "tee" element in the pipeline allows you to stream to multiple sinks, one of which being the content area of the page.  Popping up a second window for debugging is implemented (surrounded by #ifdef DEBUG_VIDEO).  Producing image files or video streams from this is reletively d
Attached patch WIP v2, works on device (obsolete) — Splinter Review
there's something wrong with my xbl binding (camera.xml). The video doesn't show up when I use the camera tag, even though the same code in xul works fine.  Also, when using the camera tag, SetRGBData is being called, but still nothing shows up on screen.
Attachment #335366 - Attachment is obsolete: true
Attached patch WIP v3, camera tag works (obsolete) — Splinter Review
the issue I was seeing before with the video not showing up when part of an xbl binding was caused by the nsHTMLMediaElement being cloned.  As a result, there were two video decoders trying to access /dev/video0.  The fist one could, which is why I was seeing SetRGBData being called... but the second one couldn't, which was the one being displayed.

The solution I'm using in this patch is to share a reference to the decoder when being cloned.  This probably isn't the ideal solution, and even if it is there are things not being handled by this patch, such as what happens when one media element tries to shut down the decoder that the other is still using.  Another solution would be to shutdown the decoder (freeing the camera device) before cloning.  Right now, Shutdown is a protected function, this would have to be made public. (Also, it seems that the media (and video) element is calling Stop() when it should be calling Shutdown, such as in the deconstructor or in PickMediaElement.  This might be a separate bug) 

To test this patch, just put a <camera> tag into a xul document.  It works on the n800/n810 or a linux box with a video4linux2 camera attached (for example an mbp).
Attachment #335909 - Attachment is obsolete: true
We don't want to share the decoder, that's going to be incorrect when one <video> element changes state (e.g. does a seek, or changes the playback rate, or just does pause/play/stop).

The source element for the clone is not actually playing, right? Maybe you can avoid opening the device for elements that aren't playing. That would get your demos working and probably be a good idea in general.

However, if there are two pages or even two applications both trying to read from the camera at the same time, there's really no reason why they shouldn't be able to. That sounds like a bug at the GStreamer level or even lower.
Attached patch WIP v4, decoders share pipeline (obsolete) — Splinter Review
this wraps the pipeline, camera source and a tee element in a nsCameraSource, then creates a queue for each decoder to link to.
Attachment #336145 - Attachment is obsolete: true
Minor change to allow us to turn on the gstreamer backend for the camera source only with the ac flag --enable-gstreamer-camera
Attachment #336703 - Attachment is obsolete: true
roc, does this approach work for you?
I think it's OK, but you've diffed against something other than trunk so I'm not 100% sure. Chris D, what do you think?

+      NS_NAMED_LITERAL_STRING(ogg_ext, "camera:");
+      if (FindInReadable(ogg_ext, start, end,  
+                         nsCaseInsensitiveStringComparator())) {

ogg_ext is the wrong name here.
QA Contact: content
Taking photos is extremely privacy sensitive and can, when used without the user noticing, represent a surveillance device inside my house. I personally don't feel comfortable seeing this feature at all in Firefox. But, assuming this is accepted:
Given the tricks we've seen with <input type="file"> (ask Jesse Ruderman), please be sure to have explicit consent from the user, in form of a dialog box that cannot possibly be accidentally clicked nor the user tricked into clicking.
1. Accidental clicks will always happen.
2. Some users will always be tricked.

But, the additional functionality is worth the risks, no?
Component: Content → DOM
QA Contact: content → general
Attached patch WIP new approach (obsolete) — Splinter Review
This patch implements a protocol handler for camera:// as well as a camera channel and camera input stream. The channel/input stream can output raw rgb or ogg theora video, which can be specified by the url (camera://encoding=ogg). This also implements a raw video "decoder."

With this patch <video src="camera">, <xul:camera> and <input type="camera"> all work in firefox and fennec on the n810 and my mac book pro.
Attachment #337115 - Attachment is obsolete: true
Attachment #337446 - Attachment is obsolete: true
this no longer depends on the gstreamer backend for the video tag
No longer depends on: 422540
(In reply to comment #19)
> which can be specified by the url (camera://encoding=ogg).

Don't mean to bikeshed here, but scheme+"://" usually indicates a hierarchical URL to many apps which expect to find scheme+"://"+host+path  

In particular, nsStandardURL which you're using interprets things that way, and parsing "encoding=ogg" as a hostname may give you odd results here and there.

I don't know your scheme's syntax, but if you don't need the path parsing functionality nsSimpleURL is lighter weight. (that's what data: and javascript: use).

Is this a scheme you're going to expose to third party apps or expect other apps to implement? If not we probably shouldn't squat on the prime "camera" name and use moz-camera or something. If you're hoping this becomes standard/common functionality then camera: sounds great.
(In reply to comment #21)
> Is this a scheme you're going to expose to third party apps or expect other
> apps to implement? If not we probably shouldn't squat on the prime "camera"
> name and use moz-camera or something. If you're hoping this becomes
> standard/common functionality then camera: sounds great.

But before it becomes a standard, using moz-camera sounds better.

(Or perhaps something like moz-device://camera?image/jpg)
Attached patch WIP patch v.2, moz-device:// (obsolete) — Splinter Review
I like the suggestion of moz-device (which I do hope we can eventually standardize), because its forward looking and would allow for capture from other devices like a mic.

Dan, with this new scheme "camera" would be the host, so nsStandardURL seems newly appropriate.  Please correct me if I'm wrong.
Attachment #390662 - Attachment is obsolete: true
We could even do a "&authorization=police" parameter, which skips the user confirmation.</sarcasm>

I don't think "camera" should be a host (the host is the user's machine). Dan is right, neither :// nor StandardURL fit here.
No longer blocks: 507749
Depends on: 507749
(In reply to comment #24)
> I don't think "camera" should be a host (the host is the user's machine).

Then moz-device:camera?type=image/jpg&more_parameters=some_value may be better.
That will facilitate provide future options like
* moz-device:camera?type=image/ogg&frames_per_second=15&size=360x400
* moz-device:microphone?type=image/ogg&rate=10kHz
> Taking photos is extremely privacy sensitive and can, when used without the
user noticing, represent a surveillance device inside my house.

There are implications of this feature far beyond home surveillance and user's consent.

Imagine some crowd IRL where 10% of people are sharing their video input through Fennec on Maemo on Nokia N900, and anyone can see (online) the street view throught their eyes (like Google Street View, but live), which looks like an extremely great online experience, like nothing ever before.

Now imagine yourself walking in that croud, being constantly monitored (and broadcasted online) by several anonymous volunteers without your prior consent. Suddenly the application doesn't look so great anymore, does it?
By the way, this bug should block bug 479046, quiaff?
Blocks: 528886
And while you're at it: some applications (like geographically augmented reality, see bug 528886) does not require any uploading of camera shots or a video stream, they just need a rectangular area on the web page to display the camera input _as_it_is_ (like <img> for Web images, or <video> for Web video streams), and to be able to place objects over the camera frame (with higher z-index, for example), and to change (move, replace) these objects when the camera's geolocation (navigator.geolocation) or tilt (window.onmozorientation) or azimuth changes.
The Capture API (W3C):
See bug 507749 for implementation.
Attached patch Configure goop for gstreamer (obsolete) — Splinter Review
Assignee: blassey.bugs → me
Attachment #390870 - Attachment is obsolete: true
Attachment #463286 - Flags: review?
Attachment #463286 - Flags: review? → review?(mitchell.field)
Note that bug 422540 also contains configure goop for gstreamer and has requested review. How different are they?
Very similar, I wasn't aware that that bug was still alive.
I'd prefer to get this in and not make it depend on that bug, which could take a while to get in.
I agree it shouldn't depend on the gstreamer bug. Has been contacted about the use of gstreamer and the LGPL license?
Webkit has just gotten a bug for this:
Attachment #463286 - Attachment is obsolete: true
Attachment #463286 - Flags: review?(mitchell.field)
Attached patch Build glueSplinter Review
Lets do this instead.  We add MOZ_CAPTURE that can be set in the app's confvars and then we select the appropriate packages (if any) based on the platform.  When the other bug lands with a gstreamer <video> backend this can just be changed to force MOZ_GSTREAMER to 1 and let that deal with finding the gstreamer CFLAGS and LIBS.
Attachment #472428 - Flags: review?(mitchell.field)
Flags: in-testsuite?
Forgot to qref
Attachment #472442 - Attachment is obsolete: true
Attachment #472443 - Flags: review?(chris.double)
Attachment #472442 - Flags: review?(chris.double)
Comment on attachment 472428 [details] [diff] [review]
Build glue

r=Mitch, when you sort those added headers alphabetically. Thanks.
Attachment #472428 - Flags: review?(mitchell.field) → review+
Comment on attachment 472429 [details] [diff] [review]
Turn on capture in Fennec on linux

>+if test "$OS_ARCH" == "Linux"; then

seems like the rest of the code base uses a single '=' for this test
Attachment #472429 - Flags: review?(blassey.bugs) → review+
Attachment #472494 - Flags: review?(josh) → review+
No video for me.

As an image, it works:
<img src="moz-device:camera?type=image/png>

But as a video, it's doesn't:
<video src="moz-device:camera?type=video/x-raw-yuv"></video>

With these warnings:


(firefox-bin:5085): GLib-GObject-WARNING **: invalid unclassed pointer in cast to `GObject'

(firefox-bin:5085): GLib-GObject-WARNING **: instance with invalid (NULL) class pointer

(firefox-bin:5085): GLib-GObject-CRITICAL **: g_signal_handler_disconnect: assertion `G_TYPE_CHECK_INSTANCE (instance)' failed
In debug mode:

###!!! ASSERTION: Cannot set both a frame limit and a time limit!: 'aParams->frameLimit == 0 || aParams->timeLimit == 0', file /home/paul/dev/buildbox/central/src/netwerk/protocol/device/GStreamerCaptureProvider.cpp, line 522
Comment on attachment 472443 [details] [diff] [review]
GStreamer Capture

>+struct nsRawVideo_PRUint24 {

This, and other nsRawVideo stuff, appears to be the same as code in the content/media/raw decoder. It should be moved out into a common file and shared.

>+  operator PRUint32() const
>+  { return value[2] << 16 | value[1] << 8 | value[1]; }

That last value[1] should be value[0]. I've not checked any of the other Raw based stuff, assuming you'll move to sharing the content/media/raw code.
Comment on attachment 472443 [details] [diff] [review]
GStreamer Capture

>+                   (const char*)&header,

Use C++ style casts. static_cast, reinterpret_cast, etc, for this and all other C style casts in the patch.

>+  if (!(mConv && mScale && mSink)) {
>+    g_printerr("Failed to create GStreamer pipeline elements");
>+    return NS_ERROR_FAILURE;
>+  }

Use Gecko logging facilities rather than g_printerr.

>+  if (!gst_element_link_filtered(mConv, mScale, NULL))
>+  if (!gst_element_link_filtered(mScale, mSink, sinkcaps))

Do you need to unref the previously created sinkcaps before returning here?

>+  // Create a pad on the bin to receive data.
>+  GstPad* firstPad = gst_element_get_static_pad(mConv, "sink");
>+  gst_element_add_pad(mBin, gst_ghost_pad_new("sink", firstPad));
>+  gst_object_unref(GST_OBJECT(firstPad));

Check return value of get_element_get_static_pad.

>+: mPipelineCount(0)
>+  gst_init(nsnull, nsnull);

The docs for gst_init say:

  "This function should be called before calling any other GLib functions."

How are you ensuring this?

>+  mPipeline = gst_pipeline_new("moz-camera-stream");
>+  mSource = gst_element_factory_make("v4l2src", NULL);
>+  mTee = gst_element_factory_make("tee", NULL);

Check return values of above calls, and all gstreamer calls following that need it.

More comments coming.
Comment on attachment 472443 [details] [diff] [review]
GStreamer Capture

>+  // Get a bin containing the stream's pipeline pieces
>+  GstBin* streambin;
>+  rv = stream->GetPipeline(&streambin);
>+  NS_ENSURE_SUCCESS(rv, rv);
>+  GstPad* padToBlock;
>+  if (mPipelineCount > 0) {
>+    // We need to block the source
>+    padToBlock = gst_element_get_static_pad(mSource, "src");
>+    if (padToBlock == NULL)
>+      return NS_ERROR_FAILURE;
>+    gst_pad_set_blocked(padToBlock, TRUE);
>+    // Now that everything is holding we can proceed
>+  }

Does streambin need to be released on these early returns and the ones following?

>+  // Ask the tee to make a pad available for us
>+  GstPad* teepad;
>+  teepad = gst_element_get_request_pad(mTee, "src%d");
>+  if (!teepad) {
>+    return NS_ERROR_FAILURE;
>+  }
>+  // Link the tee to the stream's bin
>+  if (!gst_element_link(mTee, GST_ELEMENT(streambin))) {
>+    return NS_ERROR_FAILURE;
>+  }

Does the teepad need unlinking on this early return?

>+  // Pause the pipeline, rip out the stream that's disconnecting
>+  // Then resume the pipeline or shut down as appropriate
>+  gst_element_set_state(GST_ELEMENT(mPipeline), GST_STATE_PAUSED);
>+  gst_element_set_state(GST_ELEMENT(streambin), GST_STATE_NULL);
>+  gst_element_send_event(GST_ELEMENT(streambin),
>+                         gst_event_new_eos());
>+  gst_element_unlink(mTee, GST_ELEMENT(streambin));
>+  gst_bin_remove(GST_BIN(mPipeline), GST_ELEMENT(streambin));

Check return values of those calls that return a value. In particular the set_state ones.

>+    GError* err;
>+    gchar* errmsg;
>+    gst_message_parse_error(aMsg, &err, &errmsg);
>+    g_printerr("ERROR from element %s: %s\n", GST_OBJECT_NAME(aMsg->src),
>+               err->message);
>+    g_printerr("Debugging info: %s\n", (errmsg) ? errmsg : "none");
>+    g_free(errmsg);
>+    g_error_free(err);
>+    gst_element_set_state(mPipeline, GST_STATE_NULL);
>+    break;
>+  }

Use Gecko logging.

>+  GStreamerInputStream(GStreamerCaptureProvider *aCaptureProvider,
>+                       nsACString &aContentType,
>+  		       nsCaptureParams *aParams)

Fix indentation of last parameter.

>+  {
>+    printf("GStreamerInputStream::GStreamerInputStream\n");
>+    mReadQueue = new nsDeque();
>+    mWriteQueue = new nsDeque();
>+  }

Don't use printf.

>+  nsresult GetPipeline(GstBin **aBin);
>+  nsresult Disconnect();

Add comments explaining lifetime requirements of aBin, when these are supposed to be called, are they main thread only, etc.

>+  nsCOMPtr<nsIInputStreamCallback> mCallback;
>+  nsCOMPtr<nsIEventTarget> mCallbackTarget;
>+  nsRefPtr<GStreamerCaptureProvider> mCaptureProvider;
>+  PRUint32 mNotifyThreshold;
>+  PRUint32 mAvailable;
>+  nsCString mContentType;
>+  PRPackedBool mClosed;
>+  PRPackedBool mFinished;
>+  PRPackedBool mHeaderSent;
>+  nsDeque *mReadQueue, *mWriteQueue;
>+  GstElement *mBin;
>+  GstElement *mEnc, *mMuxer, *mConv, *mScale;
>+  GstElement *mSink;
>+  gulong mSignal;
>+  PRUint32 mWidth, mHeight, mFrameLimit;
>+  mozilla::Monitor mMonitor;

What members are protected by the monitor? All of them? Add comment to this effect.
Attachment #472443 - Flags: review?(chris.double) → review-
Work going on in bug #692955.
Closed: 13 years ago
Resolution: --- → DUPLICATE
removing sec-review-needed flag
Alias: camera
No longer blocks: 528886, webapi
No longer depends on: 508063, 621915, 507749, 508082, 565843, 567323
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.