Closed Bug 1434861 Opened 2 years ago Closed 2 years ago

Unnecessary frame copying in MediaEngineRemoteVideoSource::DeliverFrame

Categories

(Core :: WebRTC: Audio/Video, defect, P2)

58 Branch
defect

Tracking

()

VERIFIED FIXED
mozilla60
Tracking Status
firefox58 --- unaffected
firefox59 --- verified
firefox60 --- verified

People

(Reporter: pehrsons, Assigned: pehrsons)

References

Details

Attachments

(2 files)

From bug 1423582 comment 5


Currently when we have to rescale, we:
- Create a webrtc::I420Buffer at the original resolution, and copy the original rawptr image buffer into that. (I'm not clear on why. Seemingly to convert to I420, but we don't do that if we don't have to rescale)
- Create a webrtc::I420Buffer at the scaled resolution and scale the originally-sized I420Buffer into that with util methods of I420Buffer.
- Create a raw buffer (the one we risk leaking here) at the scaled resolution and copy the scaled I420Buffer into that with our own VideoFrameUtils library.
- Then we create a buffer by creating a layers::PlanarYCbCrImage instance, which can be fed to our MediaStream pipeline. We copy into that with our graphics APIs.

Let's cut these four copies to one.
I suggest we do what CropAndScaleFrom does and let libyuv crop and scale straight into the buffer of our PlanarYCbCrImage, see [1].

This also has the benefit of using a buffer recycler, which the raw buffer allocations don't use. That alone might help perf more than the actual copying (though currently there's a lot of copying so maybe not).


[1] https://searchfox.org/mozilla-central/rev/b0098afaeaad24a6752aa418ca3e86eade5fab17/media/webrtc/trunk/webrtc/api/video/i420_buffer.cc#231
Rank: 14
Priority: -- → P2
Comment on attachment 8947739 [details]
Bug 1434861 - Simplify rescaling code in MediaEngineRemoteVideoSource::DeliverFrame.

https://reviewboard.mozilla.org/r/217448/#review223956

Lgtm!
Attachment #8947739 - Flags: review?(jib) → review+
Comment on attachment 8947739 [details]
Bug 1434861 - Simplify rescaling code in MediaEngineRemoteVideoSource::DeliverFrame.

https://reviewboard.mozilla.org/r/217448/#review223962

::: dom/media/webrtc/MediaEngineRemoteVideoSource.cpp:507
(Diff revision 1)
>    layers::PlanarYCbCrData data;
> -
> -  // Take lots of care to round up!
> -  data.mYChannel = frame;
> -  data.mYSize = IntSize(dst_width, dst_height);
> -  data.mYStride = (dst_width * lumaBpp + 7) / 8;
> -  data.mCbCrStride = (dst_width * chromaBpp + 7) / 8;
> -  data.mCbChannel = frame + dst_height * data.mYStride;
> -  data.mCrChannel = data.mCbChannel + ((dst_height + 1) / 2) * data.mCbCrStride;
> -  data.mCbCrSize = IntSize((dst_width + 1) / 2, (dst_height + 1) / 2);
> +  data.mYChannel = const_cast<uint8_t*>(buffer->DataY());
> +  data.mYSize = IntSize(buffer->width(), buffer->height());
> +  data.mYStride = buffer->StrideY();
> +  MOZ_ASSERT(buffer->StrideU() == buffer->StrideV());
> +  data.mCbCrStride = buffer->StrideU();
> +  data.mCbChannel = const_cast<uint8_t*>(buffer->DataU());
> +  data.mCrChannel = const_cast<uint8_t*>(buffer->DataV());
> +  data.mCbCrSize = IntSize((buffer->width() + 1) / 2,
> +                           (buffer->height() + 1) / 2);
>    data.mPicX = 0;
>    data.mPicY = 0;
> -  data.mPicSize = IntSize(dst_width, dst_height);
> +  data.mPicSize = IntSize(buffer->width(), buffer->height());

Nit: Would it be a win to use c++11 list initialization here to preserve const and avoid having to cast it away?
Comment on attachment 8947739 [details]
Bug 1434861 - Simplify rescaling code in MediaEngineRemoteVideoSource::DeliverFrame.

https://reviewboard.mozilla.org/r/217448/#review223964

::: dom/media/webrtc/MediaEngineRemoteVideoSource.cpp:507
(Diff revision 1)
>    layers::PlanarYCbCrData data;
> -
> -  // Take lots of care to round up!
> -  data.mYChannel = frame;
> -  data.mYSize = IntSize(dst_width, dst_height);
> -  data.mYStride = (dst_width * lumaBpp + 7) / 8;
> -  data.mCbCrStride = (dst_width * chromaBpp + 7) / 8;
> -  data.mCbChannel = frame + dst_height * data.mYStride;
> -  data.mCrChannel = data.mCbChannel + ((dst_height + 1) / 2) * data.mCbCrStride;
> -  data.mCbCrSize = IntSize((dst_width + 1) / 2, (dst_height + 1) / 2);
> +  data.mYChannel = const_cast<uint8_t*>(buffer->DataY());
> +  data.mYSize = IntSize(buffer->width(), buffer->height());
> +  data.mYStride = buffer->StrideY();
> +  MOZ_ASSERT(buffer->StrideU() == buffer->StrideV());
> +  data.mCbCrStride = buffer->StrideU();
> +  data.mCbChannel = const_cast<uint8_t*>(buffer->DataU());
> +  data.mCrChannel = const_cast<uint8_t*>(buffer->DataV());
> +  data.mCbCrSize = IntSize((buffer->width() + 1) / 2,
> +                           (buffer->height() + 1) / 2);
>    data.mPicX = 0;
>    data.mPicY = 0;
> -  data.mPicSize = IntSize(dst_width, dst_height);
> +  data.mPicSize = IntSize(buffer->width(), buffer->height());

Maybe no, since we'd rely on arg order and a lot of them appear to have the same type. nm.
Pushed by pehrsons@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/8932202ed0b9
Simplify rescaling code in MediaEngineRemoteVideoSource::DeliverFrame. r=jib
https://hg.mozilla.org/mozilla-central/rev/8932202ed0b9
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
Can this ride the trains to 60, or do you want it in 59?
Flags: needinfo?(apehrson)
I'd love to have this on 59. I'll just check how tricky the rebase is first.
Rebase to 59. Carries forward r=jib.
Tested locally with https://jsfiddle.net/pehrsons/m9gbqqfz/5/embedded/, and
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9e9eca37b9535dbe16027b99aaae9fd89999fb5e

Approval Request Comment
[Feature/Bug causing the regression]: bug 1388219
[User impact if declined]: Unnecessary cpu usage and memory churn.
[Is this code covered by automated tests?]: Most of the code is covered by all tests involving getUserMedia on linux. There's a narrow case of using one device at two different resolutions in the same child process that is not covered.
[Has the fix been verified in Nightly?]: Yes, I have verified this with mozregression.
[Needs manual test from QE? If yes, steps to reproduce]: Yes, would be good.
STR:
- Go to https://jsfiddle.net/pehrsons/m9gbqqfz/7/embedded/, the Result tab
- Click Start, grant the camera permission
Expected: All videos should be smooth. Cpu usage in the child process should be reasonable (for me, ~50% on linux - equivalent to half a core).
Actual: Bottom video choppy. Cpu usage very high in the child process (for me, ~110% on linux)
Note that I used a 4k Logitech Brio. Cameras with lower resolution might give less significant improvements with these STR.
[List of other uplifts needed for the feature/fix]: bug 1434538
[Is the change risky?]: Not really
[Why is the change risky/not risky?]: There's significant change to code, but it is fairly short and self-contained; and with automation and manual tests we have covered all paths of the code touched.
[String changes made/needed]: None
Flags: needinfo?(apehrson)
Attachment #8950180 - Flags: review+
Attachment #8950180 - Flags: approval-mozilla-beta?
Comment on attachment 8950180 [details] [diff] [review]
[beta 59] Simplify rescaling code in MediaEngineRemoteVideoSource::DeliverFrame

Fixes a perf regression in 59, stabilized on Nightly60 for a week, Beta59+
Attachment #8950180 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
I tried to reproduce the bug on an older version of Nightly (2018-02-01) using the steps from comment 10. I used three different platforms (Ubuntu 16.04 x64, Windows 10 x 64 and macOS 10.13) and I got various results as well. 

I do have a question though. When you said that the video is choppy, did you refer to the fact that the image is wavy? Or can you please make a gif with the exact problem.

Because if you meant that the video is wavy, then the bug is still reproducing on beta 59.0b10 on all the platforms and on latest Nightly on Windows 10 and macOS. On the latest Nightly on Ubuntu, the video is not wavy anymore and the CPU is under 50%. 

Also, I couldn't manage to make the link for the test page on an older version of Nightly using Windows 10 work. The videos just weren't displayed.
Flags: needinfo?(apehrson)
(In reply to Oana Botisan from comment #13)
> I tried to reproduce the bug on an older version of Nightly (2018-02-01)
> using the steps from comment 10. I used three different platforms (Ubuntu
> 16.04 x64, Windows 10 x 64 and macOS 10.13) and I got various results as
> well. 
> 
> I do have a question though. When you said that the video is choppy, did you
> refer to the fact that the image is wavy? Or can you please make a gif with
> the exact problem.

Not wavy, that sounds like a graphics or camera issue. I mean choppy as in some frames are missing so playback is not smooth. Imagine 30 frames per second but every 4th to 5th frame is missing. This is very dependant on the power of your machine and the resolution of your camera though. I had a 4k camera on a quadcore Macbook Pro (with Ubuntu) when I saw this.


> Because if you meant that the video is wavy, then the bug is still
> reproducing on beta 59.0b10 on all the platforms and on latest Nightly on
> Windows 10 and macOS. On the latest Nightly on Ubuntu, the video is not wavy
> anymore and the CPU is under 50%. 
> 
> Also, I couldn't manage to make the link for the test page on an older
> version of Nightly using Windows 10 work. The videos just weren't displayed.

How old? Any errors in the console?
Flags: needinfo?(apehrson) → needinfo?(oana.botisan)
I managed to reproduce the problem using this link https://jsfiddle.net/pehrsons/m9gbqqfz/13/ on Ubuntu 16.04 using the Nightly build from 2018-02-01. The playback was not smooth. 

I retested on beta 59.0b10 using the same link on Ubuntu 16.04 x64 and macOS 10.13 and the playback was smooth and all the images were synchronized. 
On the other hand when I tested on Latest Nightly (2018-02-15) the bug reproduced. Some of the videos were not smooth and they weren't synchronized. 

Also, for some reason, this link doesn't work on Windows 10 x64. There is no error displayed in Browser Console or any network request displayed in the . 

Can you please verify the fix when you have a bit of time?
Flags: needinfo?(oana.botisan) → needinfo?(apehrson)
(In reply to Oana Botisan from comment #15)
> I managed to reproduce the problem using this link
> https://jsfiddle.net/pehrsons/m9gbqqfz/13/ on Ubuntu 16.04 using the Nightly
> build from 2018-02-01. The playback was not smooth. 
> 
> I retested on beta 59.0b10 using the same link on Ubuntu 16.04 x64 and macOS
> 10.13 and the playback was smooth and all the images were synchronized. 
> On the other hand when I tested on Latest Nightly (2018-02-15) the bug
> reproduced. Some of the videos were not smooth and they weren't
> synchronized. 

The difference is big enough for me to say that this is verified. We still rescale once per video so it's not expected to be perfect.

However on beta 10 it seems like the MediaManager thread gets stuck somewhere after I get to see ~10 videos. Investigating.


> Also, for some reason, this link doesn't work on Windows 10 x64. There is no
> error displayed in Browser Console or any network request displayed in the .

I'll try to figure out what is going on here too.
(In reply to Andreas Pehrson [:pehrsons] from comment #16)
> However on beta 10 it seems like the MediaManager thread gets stuck
> somewhere after I get to see ~10 videos. Investigating.

It's not stuck really, but it looks like bug 1430856 is making this a lot faster on Nightly.

I have a bit too many cameras attached, and they have quite a bit of capabilities each. With all the active captures and frames flowing when this starts becoming slow, getting to know one capability takes roughly a second. And on beta we iterate over all capabilities of all devices quite a few times.

As an example I now have 4 devices with 0, 2, 24 and 39 capabilities respectively. We ask for every capability from the parent something like 8 times (enumeration 1 + selectsettings (1 + 1 * (num advanced constraints 0)) + allocation (2 * choosecapability 3 + 1 * (num advanced constraints 0))) in the worst case in beta. That becomes 65 * 7 = 520 ipc calls.

With bug 1430856 it's instead:
enumeration 1 + selectsettings (1 + 1*0) + allocation (1 * choosecapability (1 + 1*0)) = 3
meaning 195 ipc calls.

Still, brrrrr.
I've verified this on beta and nightly.

(In reply to Oana Botisan from comment #15)
> I managed to reproduce the problem using this link
> https://jsfiddle.net/pehrsons/m9gbqqfz/13/ on Ubuntu 16.04 using the Nightly
> build from 2018-02-01. The playback was not smooth. 
> 
> I retested on beta 59.0b10 using the same link on Ubuntu 16.04 x64 and macOS
> 10.13 and the playback was smooth and all the images were synchronized. 
> On the other hand when I tested on Latest Nightly (2018-02-15) the bug
> reproduced. Some of the videos were not smooth and they weren't
> synchronized.

It looks to me like nightly is more choppy on this fiddle than beta as you say, however the latency from you doing something until it shows up on screen is much lower on Nightly (my guesstimate: <100ms vs ~300ms). With that in mind some choppiness is fine and it is a significant improvement with the patch vs without. The latency difference is likely not due to this or any of my other patches, but to changes in the gfx pipe.


> Also, for some reason, this link doesn't work on Windows 10 x64. There is no
> error displayed in Browser Console or any network request displayed in the.

It worked fine for me.
However, I noticed it doesn't work when another app is using the camera; with the symptoms you described. Could that be what happened?
There could also be an audio issue at play, but nothing known off the top of my head. If you can reproduce this perhaps we can narrow it down in another bug.
Status: RESOLVED → VERIFIED
Flags: needinfo?(apehrson)
I managed to verify the bug on Windows 10 x64, too, using the latest Nightly 60.0a1 and beta 59.0b11. There doesn't seem to be any  more issues except the one that Andreas explained in comment 18. 

Taking in account all these new information, I will take out the qe‑verify + flag.
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.