Last Comment Bug 508082 - Implement raw decoder
: Implement raw decoder
Status: RESOLVED FIXED
: dev-doc-complete, mobile
Product: Core
Classification: Components
Component: Audio/Video (show other bugs)
: unspecified
: x86 Linux
: -- normal (vote)
: mozilla2.0b3
Assigned To: Kyle Huey [:khuey] (khuey@mozilla.com)
:
Mentors:
Depends on: 582328 587479 589606
Blocks: camera 573050
  Show dependency treegraph
 
Reported: 2009-08-03 12:17 PDT by Brad Lassey [:blassey] (use needinfo?)
Modified: 2012-02-16 14:40 PST (History)
12 users (show)
khuey: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
patch v.1 (12.20 KB, patch)
2009-08-03 12:17 PDT, Brad Lassey [:blassey] (use needinfo?)
cajbir.bugzilla: review-
Details | Diff | Splinter Review
patch (13.90 KB, patch)
2010-05-14 12:18 PDT, Brad Lassey [:blassey] (use needinfo?)
no flags Details | Diff | Splinter Review
patch (14.20 KB, patch)
2010-05-14 15:34 PDT, Brad Lassey [:blassey] (use needinfo?)
cajbir.bugzilla: review-
Details | Diff | Splinter Review
Patch (1.06 MB, patch)
2010-06-11 13:19 PDT, Kyle Huey [:khuey] (khuey@mozilla.com)
cajbir.bugzilla: review-
Details | Diff | Splinter Review
Patch updated to review comments (1.06 MB, patch)
2010-06-16 14:12 PDT, Kyle Huey [:khuey] (khuey@mozilla.com)
no flags Details | Diff | Splinter Review
Right patch (1.07 MB, patch)
2010-06-16 19:11 PDT, Kyle Huey [:khuey] (khuey@mozilla.com)
cajbir.bugzilla: review+
mbeltzner: approval2.0+
Details | Diff | Splinter Review

Description Brad Lassey [:blassey] (use needinfo?) 2009-08-03 12:17:14 PDT
Created attachment 392304 [details] [diff] [review]
patch v.1

For the implementation of a camera input stream (bug 507749), encoding to ogg+theora only to decode is wasting a lot of cpu. The solution I see is to ship raw rgb internally and to support this I have implemented a raw "decoder." 

This is minimal implementation and there are several functions that are no-ops.  I am not sure if any of these are required.  Another problem is there is no way that I know of to get the height, width, aspect ratio and frame rate from the raw stream, so I added that functionality to the camera input stream and use that in the raw decoder.
Comment 1 cajbir (:cajbir) 2009-08-03 17:06:13 PDT
What happens if a web page sends data with this mime type? What does the decoder do given that there is no frame rate, aspect ratio, width, height or means to identify frames in the file?
Comment 2 Brad Lassey [:blassey] (use needinfo?) 2009-08-04 00:03:54 PDT
I suspect that with the current patch we'd fail miserably. I'd be looking to you for a suggestion on how to allow web content to utilize this mimetype.  Perhaps allowing the width, height, framerate and aspect ratio to be set as attributes of the video tag would work.
Comment 3 cajbir (:cajbir) 2010-01-25 18:24:15 PST
Comment on attachment 392304 [details] [diff] [review]
patch v.1

I don't have any suggestions in the area of providing this data from web content. I don't think we should add custom attributes to the video element. This could cause problems when it clashes with data provided by other codecs. People might just copy/paste a raw example and expect the declared frame rate, etc to work.

Is there some other, already defined, 'raw' format that can be used for this purpose, which has this data as part of the stream. Perhaps something like:

http://en.wikipedia.org/wiki/OggUVS

Do you need to deal with Audio from the camera? If so, how does this fit into the stream? If using a container like Ogg (in OggUVS or similar) at least audio and other data could be transmitted.
Comment 4 cajbir (:cajbir) 2010-01-25 18:56:38 PST
If you aren't interested in audio, YUV4MPEG could be an option:

http://wiki.multimedia.cx/index.php?title=YUV4MPEG2
Comment 5 Brad Lassey [:blassey] (use needinfo?) 2010-05-14 12:18:19 PDT
Created attachment 445416 [details] [diff] [review]
patch

updated to work with current layers implementation.  Doesn't address review comments.
Comment 6 Brad Lassey [:blassey] (use needinfo?) 2010-05-14 15:34:35 PDT
Created attachment 445471 [details] [diff] [review]
patch

this implements reading in an oggyuv stream (as defined here: http://wiki.xiph.org/OggYUV)
Comment 7 cajbir (:cajbir) 2010-06-01 19:35:07 PDT
Comment on attachment 445471 [details] [diff] [review]
patch

+#ifdef MOZ_RAW
+static const char gRawTypes[][16] = {
+  "video/x-raw",
+  "video/x-raw-rgb"
+};

This (and other canPlayType handling in the patch) will need minor
updates for the changes made to how canPlayType works in bug 566245
attachment 448322 [details] [diff] [review].

--- a/content/media/Makefile.in	Fri May 07 16:48:33 2010 -0400
+ifdef MOZ_RAW
+DIRS += raw
+endif

The other backends use PARALLEL_DIRS, why does raw use DIRS?

+++ b/content/media/raw/nsRawDecoder.h	Mon Aug 03 14:59:42 2009 -0400
+class nsRawStreamListener : public nsIStreamListener {
...
+class nsRawDecoder : public nsMediaDecoder

Instead of handling reading from the stream using a listener I suggest
using nsMediaStream. This gives a file like abstraction to the stream
and allows seeking in parts of the stream that have been cached
amongst other benefits.

Deriving from nsMediaDecoder requires implementing a whole pile of
stuff to get the correct HTML5 spect events right. I suggest deriving
from nsBuiltinDecoder and friends. For a smallish example see bug
566245 attachment 448326 [details] [diff] [review]. This implements the WebM decoder and uses
much of the shared functionality in the nsBuiltin base classes.

This will give you the correct events, buffering, and other features
for free.

+  PRBool mPositionChangeQueued;

Write a comment on what this is for.

+  PRBool mPositionChangeQueued;
+  PRInt32 mWidth;
+  PRInt32 mHeight;
+  double mFramerate;
+  double mAspectRatio;

Reorder these so the PRBool is the last one declared and make it a PRPackedBool.

+  PRUint32 mPlaybackPosition;
+  PRUint32 mDownloadPosition;
+  clock_t mInitialTime;
+  nsRefPtr<nsIStreamListener> mListener;
+  nsMediaStream* mStream;

Document what the threading requirements are for these (main thread
only?). Instead of using clock_t use mozilla::TimeStamp or mozilla::TimeDuration.

+++ b/content/media/raw/nsRawDecoder.cpp	Mon Aug 03 14:59:42 2009 -0400
+nsRawStreamListener::nsRawStreamListener(nsRawDecoder* aDecoder) 
+{
+  mDecoder = aDecoder;
+  mPositionChangeQueued = PR_FALSE;
+  mFramerate =  mAspectRatio = 0.0;
+  mWidth = mHeight = 0;
+}

Move these outside of the constructor body. eg:

nsRawStreamListener::nsRawStreamListener(nsRawDecoder* aDecoder) :
  mDecoder(aDecoder),
  mPositionChangeQueued(PR_FALSE),
  ...
{
}

+  aStream->ReadSegments(&NS_CopySegmentToBuffer, buf, count, &read);

handle return value of ReadSegments.

+  if (buf[0] == 0) { // that's our header packet

Need to check that at least 1 byte was read before indexing into buf.

+    memcpy(&mWidth, buf + 12, 3);
+    memcpy(&mHeight, buf + 15, 3);

Use of 'magic numbers' should be moved into constants or #define's at
the top of the file. Same with the rest of the file. Also document
where the numbers come.

This copies only 3 bytes into a a PRInt32? What is the 4th byte set
to? How do you deal with endian issues (PPC). Same goes with the other
memcpy code later.

+    mAspectRatio = aspect_n / aspect_d;
+    mFramerate = framerate_n / framerate_d;

Check for aspect_d and framerate_d being zero to avoid divide by zero
issues.

+      int offset = mWidth * mHeight / 4;

Check for overflow during mWidth * mHeight. Offset should be NSPR
type, PRUint32.

+      data.mCbChannel = buf + offset * 4 + data_header_size;
+      data.mCrChannel = buf + offset * 5 + data_header_size; 

offset is calculated by multiplying two values (width and height) that
are read from the bitstream. These values are not sanitized and are
added to a memory pointer. Need to check the values for 'valid' ranges
and check the arithmetic for overflows.

+      data.mCbCrSize = data.mYSize / 2;
+      data.mCbCrStride = data.mYStride / 2;

Is the YCbCr format from the file always 4:2:0? Do you need to deal
with 4:2:2 and 4:4:4? 

+    clock_t ct = clock();  UpdatePlaybackPosition((ct - mDecoder->mInitialTime) / 1000);

TimeStamp::Now() and other Mozilla time clode instead of clock().

+void nsRawDecoder::GetCurrentURI(nsIURI**)
+{
+}
+
+already_AddRefed<nsIPrincipal> nsRawDecoder::GetCurrentPrincipal()
+{
+  return nsnull;
+}

This and other unimplemented functions will need to be implemented. If
you base code on the nsBuiltin* classes you get much of it for free.

+float nsRawDecoder::GetCurrentTime()
+{
+  clock_t ct = clock();  
+  float t = ((float)(ct - mInitialTime)) / (float)CLOCKS_PER_SEC;
+  return t;
+  
+}

Use mozilla time functions (and in other places as well).
Comment 8 cajbir (:cajbir) 2010-06-01 22:37:26 PDT
I forgot to mention that it needs tests. At least a simple playback test. bug 566245 attachment 448321 [details] [diff] [review] show's how a simple playback test can be added.
Comment 9 Kyle Huey [:khuey] (khuey@mozilla.com) 2010-06-02 10:57:21 PDT
I'm going to take it from here and address Chris's review comments.
Comment 10 Kyle Huey [:khuey] (khuey@mozilla.com) 2010-06-11 13:19:13 PDT
Created attachment 450734 [details] [diff] [review]
Patch

Rebuilt on nsBuiltinDecoder* as suggested, with tests.
Comment 11 cajbir (:cajbir) 2010-06-15 20:23:33 PDT
Comment on attachment 450734 [details] [diff] [review]
Patch

+++ b/content/html/content/src/nsHTMLMediaElement.cpp
+static PRBool IsRawEnabled()
+{
+  return PR_TRUE;
+}

This should be based on a pref like the webm and ogg backends.

The following files need the MPL license header:

content/media/raw/Makefile.in
content/media/raw/nsRawDecoder.h
content/media/raw/nsRawDecoder.cpp

+++ b/content/media/raw/nsRawReader.h
+  PRBool ReadFromStream(nsMediaStream *stream, PRUint8 *buf, PRUint32 l);

Make argument names match standard (aStream, aBuf, etc).

diff --git a/content/media/raw/nsRawReader.cpp b/content/media/raw/nsRawReader.cpp
+
+static PRBool MulOverflow32(PRUint32 a, PRUint32 b, PRUint32 &aResult)
+{
  ...
+}

This appears to be used by the Ogg and Raw backends. Can it be moved
into VideoUtils.h and shared?

+nsresult nsRawReader::ReadMetadata()
+ [...]
+  if (!ReadFromStream(stream, (PRUint8*)&mMetadata, sizeof(mMetadata)))

Use C++ style casts (static_cast, reinterpret_cast, etc).

+  mInfo.mPixelAspectRatio = mMetadata.aspectNumerator / mMetadata.aspectDenominator;

Need a cast to float at least on of aspectNumerator or
aspectDenominator here otherwise the result of the division is an
integer division.

+  mFramerate = mMetadata.framerateNumerator / mMetadata.framerateDenominator;

Need a cast to float at least on of framerateNumerator or
framerateDenominator here otherwise the result of the division is an
integer division.

In general we'll need to validate all the metadata values read from
the stream to be within sane ranges so calculations used by them don't
cause problems.

+    mDecoder->GetStateMachine()->SetDuration(1000 *
+                                             (length - sizeof(nsRawVideoHeader)) /
+                                             (mMetadata.frameWidth * mMetadata.frameHeight * 
+                                              (mMetadata.lumaChannelBpp / 8.0 + mMetadata.chromaChannelBpp / 8.0) +
+                                              sizeof(nsRawPacketHeader)) /
+                                             mFramerate);

Some lines longer than 80 characters. The arithmetic can overflow
(frameWidth * frameHeight). Division by zero is possible if mFramerate
is 0 or if the 2nd division return is 0.

+// Helper method that either reads until it gets l bytes or returns PR_FALSE
+PRBool nsRawReader::ReadFromStream(nsMediaStream *stream, PRUint8* buf, PRUint32 l)

Should be aStream, aBuf, etc.

+{
+  nsresult rv = NS_OK;

'rv' isn't used outside of the loop. What it entered to return 'rv'?
If not, put it in the loop as part of the 'Read' call.

+
+  while (NS_SUCCEEDED(rv) && l > 0) {
+    PRUint32 bytesRead = 0;
+    rv = stream->Read((char*)buf, l, &bytesRead);

Use C++ style cast. Need to check 'rv' return value.

+PRBool nsRawReader::DecodeVideoFrame(PRBool &aKeyframeSkip,
+                                     PRInt64 aTimeThreshold)
+{
+ [...]
+  PRInt64 currentFrameTime = 1000 * mCurrentFrame / mFramerate;

Division by zero if mFramerate is zero. This goes back to the need to
check the validity of the metadata that was read previously.

+  PRUint32 length = mMetadata.frameWidth * mMetadata.frameHeight *
+                    (mMetadata.alphaChannelBpp + mMetadata.lumaChannelBpp + 
+                     mMetadata.chromaChannelBpp) / 8;

Overflow possibilities here. 

+  nsAutoPtr<PRUint8> buffer(new PRUint8[length]);

'length' is calculated using values read from the stream that haven't
been checked for valid values. The values should perhaps be
constrained to be within sane ranges.

+  // We're always decoding one frame when called
+  while(true) {
+    //    printf("Reading packet!\n");
+    nsRawPacketHeader header;
+
+    // Read in a packet header and validate
+    if (!(ReadFromStream(stream, (PRUint8*)&header, sizeof(header))) ||

C++ style casts please.

+        !(header.packetID == 0xFF && header.codecID == 0x595556 /* "YUV" */)) {
+      return PR_FALSE;
+    }

The 0x595556 is used in a couple of places. Move it out into a #define
at the toplevel of the .cpp file.

+  b.mPlanes[1].mHeight = mMetadata.frameHeight / 2;
+  b.mPlanes[1].mWidth = mMetadata.frameWidth / 2;

The code assumes 4:2:0? There should probably be a check/assertion
somewhere about this.

+nsresult nsRawReader::Seek(PRInt64 aTime, PRInt64 aStartTime, PRInt64 aEndTime)
+ [...]
+  PRUint32 frame = mCurrentFrame;
+  mCurrentFrame = aTime * mFramerate / 1000;

mCurrentFrame is a PRUint32 while aTime is a PRInt64. Overflow
possibility here in the aTime*mFramerate calculation.

+
+  PRUint32 offset = sizeof(nsRawVideoHeader) + mCurrentFrame *
+                    (sizeof(nsRawPacketHeader) +
+                     (mMetadata.alphaChannelBpp + mMetadata.lumaChannelBpp +
+                      mMetadata.chromaChannelBpp) / 8.0 *
+                     mMetadata.frameWidth * mMetadata.frameHeight);

Overflow checking needed.

+  rv = stream->Seek(nsISeekableStream::NS_SEEK_SET, offset);

Check return value of seek.

+    {
+    mozilla::MonitorAutoExit autoMonitorExit(mMonitor);
+    mozilla::MonitorAutoEnter autoMonitor(mDecoder->GetMonitor());
+    if (mDecoder->GetDecodeState() == nsBuiltinDecoderStateMachine::DECODER_STATE_SHUTDOWN) {
+      mCurrentFrame = frame;
+      return NS_ERROR_FAILURE;
+    }
+    }

Nested incorrectly.
Comment 12 Kyle Huey [:khuey] (khuey@mozilla.com) 2010-06-16 14:12:41 PDT
Created attachment 451710 [details] [diff] [review]
Patch updated to review comments

Updated to comments.  In particular, this clamps the frame size to reasonable values which ensures that most calculations cannot overflow.
Comment 13 cajbir (:cajbir) 2010-06-16 18:52:45 PDT
Are you sure you uploaded the right patch? A cursory look shows review comments not addressed (for example, still no MPL header on nsRawReader.h, etc).
Comment 14 Kyle Huey [:khuey] (khuey@mozilla.com) 2010-06-16 19:06:08 PDT
Comment on attachment 451710 [details] [diff] [review]
Patch updated to review comments

Yeah this isn't the right version.
Comment 15 Kyle Huey [:khuey] (khuey@mozilla.com) 2010-06-16 19:11:36 PDT
Created attachment 451794 [details] [diff] [review]
Right patch
Comment 16 cajbir (:cajbir) 2010-06-16 20:30:43 PDT
Comment on attachment 451794 [details] [diff] [review]
Right patch

In nsRawDecoder.h and nsRawDecoder.cpp:
+ * Contributor(s):
+ 

Might want to add yourself as a contributor.
Comment 17 Kyle Huey [:khuey] (khuey@mozilla.com) 2010-06-17 18:12:26 PDT
http://hg.mozilla.org/mozilla-central/rev/f54e6386c113
Comment 18 Kyle Huey [:khuey] (khuey@mozilla.com) 2010-06-19 13:34:53 PDT
Backed out for Bug 573161

http://hg.mozilla.org/mozilla-central/pushloghtml?changeset=f3a6e20989fc
Comment 19 Kyle Huey [:khuey] (khuey@mozilla.com) 2010-07-20 19:56:46 PDT
Comment on attachment 451794 [details] [diff] [review]
Right patch

We need this for the camera work.
Comment 20 Mike Beltzner [:beltzner, not reading bugmail] 2010-07-20 21:49:18 PDT
Comment on attachment 451794 [details] [diff] [review]
Right patch

a=beltzner
Comment 21 Kyle Huey [:khuey] (khuey@mozilla.com) 2010-07-26 12:42:58 PDT
http://hg.mozilla.org/mozilla-central/rev/070072f39303

Relanded.  Will track orange if it reappears in Bug 573161.
Comment 22 Daniel Holbert [:dholbert] 2010-08-03 18:46:24 PDT
This patch added this variable:
1214 static const char* gRawMaybeCodecs[] = {
1215   nsnull
1216 };

which isn't used anywhere, as shown by this MXR search:
http://mxr.mozilla.org/mozilla-central/search?string=gRawMaybeCodecs

That line should probably be removed, right?
(or is it used in a not-yet-landed followup bug?)
Comment 23 Daniel Holbert [:dholbert] 2010-08-03 18:47:07 PDT
(In reply to comment #22)
> That line should probably be removed, right?

er, s/that line/that variable/
Comment 24 Kyle Huey [:khuey] (khuey@mozilla.com) 2010-08-03 18:48:06 PDT
It looks like the maybeCodecs stuff went away between when I wrote the patch and when I landed it.
Comment 25 Eric Shepherd [:sheppy] 2010-08-24 09:25:09 PDT
Added OggYUV to the list of supported codecs here:

https://developer.mozilla.org/En/Media_formats_supported_by_the_audio_and_video_elements

I've seen notes that say this isn't used in an Ogg container, but the OggYUV page seems to say it is, so that's how I've documented it. If that's not correct, let me know.

Note You need to log in before you can comment on or make changes to this bug.