Closed Bug 616469 Opened 9 years ago Closed 9 years ago

Video sync is slow because of slow yuv2rgb conversion

Categories

(Core :: Graphics, defect)

ARM
Linux
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla2.0b9
Tracking Status
fennec 2.0+ ---

People

(Reporter: romaxa, Assigned: romaxa)

References

()

Details

(Keywords: mobile, perf)

Attachments

(5 files, 16 obsolete files)

12.69 KB, patch
cajbir
: review+
Details | Diff | Splinter Review
8.11 KB, patch
roc
: review+
Details | Diff | Splinter Review
5.99 KB, patch
roc
: review+
Details | Diff | Splinter Review
4.61 KB, patch
roc
: review+
Details | Diff | Splinter Review
13.97 KB, patch
cajbir
: review+
Details | Diff | Splinter Review
Open URL in android or maemo fennec nightly.

Sound is ok, but video is lagging a lot.

Oprofile short output
87        0.7135  libxul.so                plugin-container         th_decode_packetin
96        0.7873  libxul.so                plugin-container         mdct_backward
105       0.8611  libpixman-1.so.0.18.2    fennec                   pixman_composite_src_n_0565_asm_neon
125       1.0252  libxul.so                plugin-container         ofcl_neon_lp
298       2.4440  libpixman-1.so.0.18.2    fennec                   pixman_composite_src_0565_0565_asm_neon
326       2.6737  libc-2.10.1.so           Xorg                     memcpy
329       2.6983  libc-2.10.1.so           fennec                   memcpy
436       3.5758  libpixman-1.so.0.18.2    plugin-container         pixman_composite_src_x888_8888_asm_neon
468       3.8383  libpixman-1.so.0.18.2    fennec                   pixman_composite_over_8888_0565_asm_neon
1014      8.3162  vmlinux-2.6.32.26        vmlinux-2.6.32.26        omap3_enter_idle
2400     19.6834  libxul.so                plugin-container         FastConvertYUVToRGB32Row_C


I think we should add neon optimized yuv2rgb conversion
Attached patch This is a bit hacky, WIP1 (obsolete) — Splinter Review
Bit hacky version, but works mostly good, I have ~24FPS on N900 for URL. no lags smooth and fast!
Attachment #495004 - Flags: feedback?(tterribe)
Send it to try: http://hg.mozilla.org/try/rev/884ec6be1d2a
hoping to get fennec with fast video on galaxy Tab.
Attached patch Should make us fast, WIP1 (obsolete) — Splinter Review
Oh previous was wrong patch.
Attachment #495004 - Attachment is obsolete: true
Attachment #495006 - Flags: feedback?(tterribe)
Attachment #495004 - Flags: feedback?(tterribe)
Attached patch Should make us fast, WIP1 (obsolete) — Splinter Review
oh, something wrong with my FF gtk filepicker...
Attachment #495006 - Attachment is obsolete: true
Attachment #495006 - Flags: feedback?(tterribe)
Ok, tested try build with attached patch:
http://stage.mozilla.org/pub/mozilla.org/mobile/tryserver-builds/romaxa@gmail.com-2c49ea3d67ef/tryserver-mobile-browser-android-r7-build/fennec-4.0b3pre.en-US.eabi-arm.apk

URL video play ~25FPS without any problems in portrait mode on galaxy Tab... in landscape mode, layer scaled and FPS again dropped... so we need ScaleYUVRGB565 function in order to prepare scaled RGB layer and avoid software layer scaling
works really well on my nexus one.  Has cdouble looked at this?
Attachment #495007 - Flags: review?(chris.double)
tracking-fennec: --- → ?
Keywords: mobile, perf
Hardware: x86 → ARM
Hi, I have originally developed this ARM NEON assembly part for gstreamer (producing 24bpp rgb output) and then modified it to use with Mozilla to generate 16bpp rgb output. I think it's ok to use it under Mozilla tri-license.

That said, this code is simply initially clamping the pixels to 16-235/16-240 range and then doing the conversion according to http://www.poynton.com/notes/colour_and_gamma/ColorFAQ.html#RTFToC29
The subsampled chroma is simply duplicated with no interpolation.

More comments are here: http://cgit.freedesktop.org/~siamashka/pixman/commit/?h=yuv-arm-neon-wip&id=d96bea4f6e4ad248693dcf0bd6a45d51d6f6cf8a
"YUV->RGB conversion, optimized for the use of fast 8-bit NEON multiplications,
but aiming to minimize precision loss as much as possible.

C pseudocode:

y = clamp_range(y, 16, 235) - 16;
u = clamp_range(u, 16, 240) - 16;
v = clamp_range(v, 16, 240) - 16;

r = clamp((int16_t)(((int16_t)(149 * y) - (int16_t)(22840 - 204 * v - (v >> 1))) >> 1) >> 6);
g = clamp((int16_t)(((int16_t)(149 * y) + (int16_t)(17312 - 50 * u - 104 * v)) >> 1) >> 6);
b = clamp((int16_t)(((int16_t)(149 * y) - (int16_t)(28832 - 258 * u)) >> 1) >> 6);

The use of unsigned multiplications gains one extra bit of precision (sign
is hidden in the ADD or SUB operations which are done with the result). Also
the use of VHADD/VHSUB instructions allows to preserve one more bit in
intermediate calculations (two 16-bit values are added together to get
intermediate 17-bit result and shifted right by one bit)."

This is probably not entirely correct with the respect to color matrix, clamping and subsampling, but is fast (maybe even excessively fast so that memory bandwidth is fully saturated and there is still a lot of headroom).

There were some plans to get extensive YCbCr formats in pixman:
http://www.gstreamer.net/wiki/VideoHackfest
http://www.gstreamer.net/wiki/VideoHackfest/pixman

But unfortunately this work does not seem to be progressing anywhere. Otherwise, I think everything would have been much easier for Mozilla by now.
tracking-fennec: ? → 2.0+
The sub-sampling is probably fine. I don't know of a single open-source software conversion routine that does anything different (even when the chroma siting is supposed to be MPEG-style, and co-sited horizontally with even luma pixels, rather than halfway in-between like JPEG/Theora).

The color matrix is not _too_ far off. The matrix Theora uses for all of its color spaces is (to 6 digits of precision)

{{1.16438,  0.0,       1.59603, -222.922},
 {1.16438, -0.391762, -0.812968, 135.575},
 {1.16438,  2.01723,   0.0,     -276.836},
 {0.0,      0.0,       0.0,        1.0  }}

VP8 does not specify a specific matrix.

Pretending the shifts in the above are linear operations, the resulting matrix is

{{1.16406,  0.0,       1.59766, -222.625},
 {1.16406, -0.390625, -0.8125,   135.875},
 {1.16406,  2.01563,   0.0,     -276.125},
 {0.0       0.0        0.0         1.0  }}

This is likely as accurate as you're going to get with coefficients of that size. I haven't worked out the rounding to ensure all of the shifts are biased correctly.

The real thing that bothers me is the clamping in Y'CbCr space (as opposed to the final clamping in RGB space). This is not part of a normal Y'CbCr->RGB conversion, and can cause noticeable differences, especially on severely out-of-gamut values caused by chroma sub-sampling (e.g., thin colored text on a black background). There are RGB->Y'CbCr conversion routines that rely on such out-of-gamut values to attempt to preserve the quality of the colors in such regions.

However, simply removing the clamping altogether would allow some of the intermediate values to overflow. I'm not sure if something more clever could be done, and in fewer operations than are required by the clamping.

The code also assumes that NEON is present, which is not true, even on ARMv7 (e.g., some of the Tegra devices). Some kind of CPU detection will need to be done. I know pixman, libtheora, and libvpx all do so. I don't know if Mozilla has generic routines for this yet (I suspect not). Also, despite no less than 6 copies of the inner conversion routine, no attempt is made to use aligned loads or stores. Finally, the comments on register usage in the routine itself are out of date, and wrong. It erroneously claims q4-q7 are reserved, which they should be, as callee-saved registers, but in actuality it uses d8-d11 anyway.
(In reply to comment #8)
> The real thing that bothers me is the clamping in Y'CbCr space (as opposed to
> the final clamping in RGB space). This is not part of a normal Y'CbCr->RGB
> conversion, and can cause noticeable differences, especially on severely
> out-of-gamut values caused by chroma sub-sampling (e.g., thin colored text on a
> black background). There are RGB->Y'CbCr conversion routines that rely on such
> out-of-gamut values to attempt to preserve the quality of the colors in such
> regions.

Thanks. That's a good point.
 
> However, simply removing the clamping altogether would allow some of the
> intermediate values to overflow. I'm not sure if something more clever could be
> done, and in fewer operations than are required by the clamping.

I think we just need to use just straightforward 16-bit multiplications here. Especially considering that the possibility of using custom redefinable color matrices would be a nice bonus.
 
> The code also assumes that NEON is present, which is not true, even on ARMv7
> (e.g., some of the Tegra devices). Some kind of CPU detection will need to be
> done. I know pixman, libtheora, and libvpx all do so. I don't know if Mozilla
> has generic routines for this yet (I suspect not).

Yeah, this also looks like a quick hack.

> Also, despite no less than 6 copies of the inner conversion routine,

Actually pixman handles leading and trailing pixels in a scanline a bit better, avoiding the redundant uses of the inner conversion code (so it's done just once for the trailing leftover pixels). In the case of video, the scanlines are typically rather long, so that this extra overhead when handling the trailing pixels was not so big.

Moreover, there is 'oddflag' parameter (which is not used and this looks like a bug in the patch), it should be set when we are converting the the pixel data starting from the middle of 2x1 macroblock (for example, when we don't need full video, but want to crop some part of it and are starting from odd 'pic_x' position).

> no attempt is made to use aligned loads or stores.

Aligned stores are more beneficial than loads, as figured out by experiments with memcpy and pixman optimizations. Lower penalty for unaligned loads than stores is also confirmed by 
http://www.arm.com/files/pdf/A8_Paper.pdf
"The Level-1 Data-side memory system supports reading or writing 64-bits per cycle for integer load and store instructions and 128-bits per cycle for NEON load and store instructions. Non-word-aligned reads are supported without additional latency in cases where the entire access falls completely within a naturally address-aligned 128-bit region. Non-word-aligned writes are supported without additional latency in cases where the access falls completely within a naturally address-aligned 64-bit region."
Except of course the case when all 3 source pointers are perfectly aligned vs. just having a single aligned destination, then it may probably make a difference (admittedly I have not benchmarked it yet).

There is still one minor problem related to alignment, caused by lower chroma resolution: source data is effectively treated as 2x1 macroblocks, and perfectly realigning destination may force us to jump in the middle of this macroblock for the Y data, which would require a bit different processing (surely doable, but adds a lot of code). Another alternative is just to make one more variant of this function with all the alignment modifiers set which would be only used when all the source and destination buffer pointers are 16-byte aligned. But this relies on the video decoder allocating aligned buffers to get this performance boost (and gstreamer was unable to provide aligned buffers the last time when I asked about this, and it was the reason why I decided not to touch the alignment at all).

> Finally, the comments on register usage in the routine itself are
> out of date, and wrong. It erroneously claims q4-q7 are reserved, which they
> should be, as callee-saved registers, but in actuality it uses d8-d11 anyway.

Right, this comment is outdated. Fortunately it's easy to fix such problems :)

--

So what would be the best plan? I guess the clamping problem should be fixed first. But what should be done after that?

Actually I myself would prefer to add this NEON optimized colorspace conversion code into pixman, just because it already has all the supplementary neon macro templates and it is easier to benefit for example from using aligned stores there. Another benefit is that it is not so difficult to add at least nearest scaling to it. The real problem is that pixman does not have a usable API for planar YCbCr formats yet. There are several possible solutions:
a) Try to kick all the needed activity to finally add proper YCbCr support into pixman, but that has a high risk of not being done in a reasonable time (after all, a year has already passed).
b) I could add just the needed arm neon functions into pixman without wiring them to anything (but so that they could be called directly from some Mozilla code), but that would be effectively a dead code for pixman itself until it gets proper API and I don't know if the primary pixman maintainer would like it.
c) Another alternative is a custom very basic pixman API extension to expose just the functionality needed for Mozilla's html5 code, but I'm afraid Jeff might be not be very happy about this.

And surely, a totally standalone neon optimized function done specifically for Mozilla is always possible, just like this proposed patch (after all its problems are resolved).
One last thing regarding video quality. Maybe it makes sense to also add some kind of dithering when doing couversion to 16bpp rgb?
(In reply to comment #8)
> The real thing that bothers me is the clamping in Y'CbCr space (as opposed to
> the final clamping in RGB space). This is not part of a normal Y'CbCr->RGB
> conversion, and can cause noticeable differences, especially on severely
> out-of-gamut values caused by chroma sub-sampling (e.g., thin colored text on a
> black background). There are RGB->Y'CbCr conversion routines that rely on such
> out-of-gamut values to attempt to preserve the quality of the colors in such
> regions.
> 
> However, simply removing the clamping altogether would allow some of the
> intermediate values to overflow. I'm not sure if something more clever could be
> done, and in fewer operations than are required by the clamping.

Or just reducing conversion quality a bit more is possible, like it is already partially done for "204 * v - (v >> 1)". It could be once again divided by 2, resulting in something like "102 * v - (v >> 2)" or this shifted part could be tuned to get better rounding.

Now that I actually remembered and looked at my another old commit message, for the nonclamped range of values, this conversion was equivalent to a commonly used approximation mentioned in many places, including wikipedia, and this "(v >> 1)" subtraction was needed to match it exactly:

C = Y' − 16
D = U − 128
E = V − 128

R = ((298 * C + 409 * E + 128) >> 8)
G = ((298 * C - 100 * D - 208 * E + 128) >> 8)
B = ((298 * C + 516 * D + 128) >> 8)
> Another alternative is just to make
> one more variant of this function with all the alignment modifiers set which
> would be only used when all the source and destination buffer pointers are
> 16-byte aligned. But this relies on the video decoder allocating aligned
> buffers to get this performance boost (and gstreamer was unable to provide
> aligned buffers the last time when I asked about this, and it was the reason
> why I decided not to touch the alignment at all)

This is the preferred approach for us, I think. libtheora now guarantees 16-alignment, and libvpx always has, so this should be the normal case for us.

> a) Try to kick all the needed activity to finally add proper YCbCr support
> into pixman, but that has a high risk of not being done in a reasonable time
> (after all, a year has already passed).

If the goal is to get this into Fennec for the 4.0 branch, this sounds like a bad idea. It could always be moved back up into pixman post-4.0.

> And surely, a totally standalone neon optimized function done specifically for
> Mozilla is always possible, just like this proposed patch (after all its
> problems are resolved).

This is probably the best approach. The code is merely meant as a fallback to GPU color conversion, which will hopefully be faster once OpenGL ES layers are working.
Also, I think some kind of simple ordered dither would be a fantastic idea, especially if memory bandwidth is already the limiting factor.
(In reply to comment #12)
> If the goal is to get this into Fennec for the 4.0 branch, this sounds like a
> bad idea. It could always be moved back up into pixman post-4.0.

I also can't afford to waste too much time on this, so this needs to be done fast.

Regarding the other things, I just noticed that actually we are converting to r5g6b5 format (a nice discovery :) ). So 8-bit multiplications should provide more than enough precision with no need for pre-clamping if the conversion is done directly to r5g6b5 instead of Y'CbCr->r8g8b8->r5g6b5 in separate steps.

Will try to provide some update shortly.
Assignee: nobody → romaxa
Comment on attachment 495007 [details] [diff] [review]
Should make us fast, WIP1

>diff --git a/gfx/layers/basic/BasicImages.cpp b/gfx/layers/basic/BasicImages.cpp
> void
> BasicPlanarYCbCrImage::SetData(const Data& aData)
> {
> ...
>+#if defined(__arm__)
>+#ifndef HAS_SCALE_565_YUV_RGB
>+  prescale = PR_FALSE;
>+#endif
>+#endif

Make this one #if block:

>+#if defined(__arm__) && !defined(HAS_SCALE_565_YUV_RGB)
>+  prescale = PR_FALSE;
>+#endi

>+  if (gfxASurface::ImageFormatRGB16_565 == format) {

All other code in this file seems to use the opposite ordering style for comparisons:

>  if (format == gfxASurface::ImageFormatRGB16_565)

It's jarring to switch between the two different types. Please change to the style used by the file currently. Same for all other 'if' usage.

>+      printf("Ups need ScaleYCbCrToRGB16!!\n");

Remove printf usage.

>+      gfx::ConvertYCbCrToRGB565(aData.mYChannel,

This function will not exist on some platforms the way it is implemented now. This results in a compile and/or link error on Windows for example.

>+  gfxASurface::gfxImageFormat format = GetOptimizedFormat();

Please get Robert O'Callahan to review the GetOptimizedFormat addition/changes.

>diff --git a/gfx/ycbcr/yuv_convert.cpp b/gfx/ycbcr/yuv_convert.cpp
> 
>+#ifdef __arm__
>+void __attribute((noinline)) yv12_to_rgb565_neon(uint16 *dst, const uint8 *y, const uint8 *u, const uint8 *v, int n, int oddflag)

Split this assembly code into a seperate file called yuv_convert_arm.cpp. See bug 616778 attachment 496521 [details] [diff] [review] for an example of how this is being done on Windows for a yuv_convert_mmx.cpp and yuv_convert_sse2.cpp.

>+// Convert a frame of YUV to 16 bit RGB565.
>+NS_GFX_(void) ConvertYCbCrToRGB565(const uint8* y_buf,

This is defined in an __arm__ specific block but is called in the Layers code in a non-arm specific block. This results in build/compile errors on non-arm platforms.

With those fixes I'll r+.
Attachment #495007 - Flags: review?(chris.double) → review-
Attached patch Updated comments (obsolete) — Splinter Review
Attachment #495007 - Attachment is obsolete: true
Attachment #496815 - Flags: review?(chris.double)
Attached patch WIP2. Should be better (obsolete) — Splinter Review
Attachment #496815 - Attachment is obsolete: true
Attachment #496817 - Flags: review?(chris.double)
Attachment #496815 - Flags: review?(chris.double)
Attached patch Probably also make sense (obsolete) — Splinter Review
Not sure does this patch make sense or not.. but found it in our tree
Some crash fix for:
http://people.xiph.org/~maikmerten/demos/arctic_giant.html
probably  Kishore can explain it better..
Explanation from internal bugzilla:
....
Crash is due to width & height change after conversion AppUnitsToGfxUnits.
There is no scaled conversion for YCbCrtoRGB565 like (ScaleYCbCrToRGB32)
....
Not sure but probably it make sense to grab yuv2rgb32 function from gstreamer, and use it in case when we have 32bpp transparent video layer
(In reply to comment #18)

> Not sure does this patch make sense or not.. but found it in our tree

Can you find out if it's needed? Does it fix problems introduced by your new patch? Or does it fix existing problems?
(In reply to comment #20)
> Not sure but probably it make sense to grab yuv2rgb32 function from gstreamer,
> and use it in case when we have 32bpp transparent video layer

What is the license of the gstreamer code?
Attached patch More cleanup (obsolete) — Splinter Review
ok, another patch seems not needed... just wroks fine with that URL ~20FPS on N900, no crashes.
Attachment #496817 - Attachment is obsolete: true
Attachment #496819 - Attachment is obsolete: true
Attachment #497058 - Flags: review?
Attachment #496817 - Flags: review?(chris.double)
Attachment #497058 - Flags: review? → review?(chris.double)
Attachment #497058 - Flags: review?(roc)
+  gfxASurface::gfxImageFormat GetOptimizedFormat();

This needs to be documented. And doesn't it belong in gfxPlatform?

+      NS_WARNING("Fail, need ScaleYCbCrToRGB16\n");

Er, aren't we going to hit this warning and then completely fail to display any video?

Can you split out the changes to gfx/ycbcr into their own patch?
Attached patch ycbcr part (obsolete) — Splinter Review
Attachment #497058 - Attachment is obsolete: true
Attachment #497119 - Flags: review?(chris.double)
Attachment #497058 - Flags: review?(roc)
Attachment #497058 - Flags: review?(chris.double)
Attached patch GetOptimizedFormat WIP (obsolete) — Splinter Review
Should I make GetSystemDepth virtual with default Win/Mac values = 24?
Should I keep default GetOptimizedFormat return value for non-mobile platform = ARGB32?
Attachment #497122 - Flags: review?(roc)
Attached patch Separate BasicImage changes (obsolete) — Splinter Review
> Er, aren't we going to hit this warning and then completely fail to display any
> video?

Only if format 16_565 and prescale == TRUE, and that should not happen because of 
+  if (format == gfxASurface::ImageFormatRGB16_565) {
+#ifdef __arm__
+    // yuv2rgb16 with scale function not yet available
+    prescale = PR_FALSE;
+#else
+    // yuv2rgb16 function not yet available for non-arm
+    format = gfxASurface::ImageFormatRGB24;
+#endif
+  }
Attachment #497123 - Flags: review?(roc)
GetOptimizedFormat should be virtual and overridden on different platforms. I think it should only return either RGB24 or RGB16_565.

But hmm. Is there a reason you can't just create the surface using gfxPlatform::CreateOffscreenSurface(size, CONTENT_COLOR) and not bother getting the format?
Attached patch GetOptimizedFormat WIP2 (obsolete) — Splinter Review
> gfxPlatform::CreateOffscreenSurface(size, CONTENT_COLOR) and not bother getting
that does not guarantee us ImageSurface
Attachment #497122 - Attachment is obsolete: true
Attachment #497214 - Flags: review?(roc)
Attachment #497122 - Flags: review?(roc)
OK, I think GetOptimizedFormat should be called GetOffscreenFormat and it should be guaranteed to return the same format as CreateOffscreenFormat(CONTENT_COLOR) uses.

+    static PRInt32 GetSystemDepth() { return 16; }

You don't need this.

+    virtual gfxASurface::gfxImageFormat GetOptimizedFormat() { return gfxASurface::ImageFormatRGB16_565; }

Please add "typedef gfxASurface::gfxImageFormat gfxImageFormat;" to gfxPlatform to avoid these prefixes.
+  if (format == gfxASurface::ImageFormatRGB16_565) {
+#ifdef __arm__
+    // yuv2rgb16 with scale function not yet available
+    prescale = PR_FALSE;
+#else
+    // yuv2rgb16 function not yet available for non-arm
+    format = gfxASurface::ImageFormatRGB24;
+#endif
+  }

I'd like to have this not test __arm__ but rather some more meaningful symbol defined in the ycbcr library, e.g. HAVE_SCALE_YCBCR_TO_RGB565.

+    if (format != gfxASurface::ImageFormatRGB16_565)

format == gfxASurface::ImageFormatRGB24
Attached patch GetOffscreenFormat WIP3 (obsolete) — Splinter Review
Attachment #497214 - Attachment is obsolete: true
Attachment #497214 - Flags: review?(roc)
Attachment #497252 - Attachment is obsolete: true
Attached patch ycbcr part v3Splinter Review
Attachment #497119 - Attachment is obsolete: true
Attachment #497256 - Flags: review?(chris.double)
Attachment #497119 - Flags: review?(chris.double)
Attached patch BasicImage changes, v3 (obsolete) — Splinter Review
Attachment #497123 - Attachment is obsolete: true
Attachment #497257 - Flags: review?(roc)
Attachment #497123 - Flags: review?(roc)
Attachment #497255 - Flags: review?(roc)
Comment on attachment 497255 [details] [diff] [review]
GetOffscreenFormat WIP4, fixed compile

 +    virtual gfxImageFormat GetOptimizedFormat();

Should be GetOffscreenFormat. I assume this wouldn't compile on Qt.

r+ with that.
Attachment #497255 - Flags: review?(roc) → review+
Comment on attachment 497257 [details] [diff] [review]
BasicImage changes, v3

Oh, there is one problem here, which is that gfxPlatform is main-thread only and you're calling it from SetData which can run off the main thread.

One way to fix that would be to choose the format when the ImageContainer is created and store it in the ImageContainer.
Attachment #497257 - Flags: review+ → review-
Just qt part rename
Attachment #497255 - Attachment is obsolete: true
Attachment #497423 - Flags: review?(roc)
Attachment #497427 - Flags: review?(roc)
Attachment #497257 - Attachment is obsolete: true
Attachment #497428 - Flags: review?(roc)
Comment on attachment 497427 [details] [diff] [review]
Image/Container offscreen format API

+typedef gfxASurface::gfxImageFormat gfxImageFormat;
+

Put inside the ImageContainer class.

I don't see why you need Image::mFormat. Only BasicPlanarYCbCrImage needs it.
Moved to YCbCr image.
Attachment #497427 - Attachment is obsolete: true
Attachment #497463 - Flags: review?(roc)
Attachment #497427 - Flags: review?(roc)
Blocks: 619056
We onlyt need mOffscreenFormat in BasicPlanarYCbCrImage. Right?
Attachment #497463 - Attachment is obsolete: true
Attachment #497545 - Flags: review?(roc)
Attachment #497463 - Flags: review?(roc)
Comment on attachment 497256 [details] [diff] [review]
ycbcr part v3

this was r+ by 616469#c15
and update by roc request...
Attachment #497256 - Flags: review?(chris.double) → review?(roc)
Attachment #497256 - Flags: review?(roc) → review?(chris.double)
Comment on attachment 497256 [details] [diff] [review]
ycbcr part v3

Looks good, thanks for making the changes I asked for. r+ with the following changes:

- Add this patch as gfx/ycbcr/arm.patch
- Add to README about arm.patch and reference this bug
- Add to convert.sh a line to apply arm.patch

Thanks!
Attachment #497256 - Flags: review?(chris.double) → review+
Do I need review for this patch?
Comment on attachment 497733 [details] [diff] [review]
Comment 49, patching part

Patch is good.
Attachment #497733 - Flags: review+
Keywords: checkin-needed
Target Milestone: --- → mozilla2.0b9
Blocks: 620526
Depends on: 637961
Depends on: 640588
Depends on: 641014
Depends on: 641019
You need to log in before you can comment on or make changes to this bug.