WebRTC LoadManager resolution reduction under load causing horizontal distortions

RESOLVED FIXED in mozilla33
(NeedInfo from)

Status

()

Core
WebRTC: Audio/Video
P1
major
RESOLVED FIXED
4 years ago
3 years ago

People

(Reporter: jesup, Assigned: pkerr, NeedInfo)

Tracking

(Blocks: 1 bug)

Trunk
mozilla33
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: p=5)

Attachments

(8 attachments, 3 obsolete attachments)

(Reporter)

Description

4 years ago
See the attached image.  Whenever the load manager kicks in resolution reductions, it causes horizontal artifacts.  These seem tied to the data, and in the case of fake video (basically constant values for the entire image except the top couple of lines) the artifacts are very few, and appear in a regular pattern that implies a 'stride' issue perhaps.

Things tried an ruled out:
a) disabled ASM code in libyuv
b) disabled dedicated 3/4 scaler in libyuv
c) disabled 'box' filter in scaler (switch to no interpolation)

When examining the output of the decoder for a fake video instance (in pc_test.html):
a) I couldn't find anything the 'y' buffer that wasn't 0x80 other the the timecode at the start, b) the size output was 640x480(?), though I think I was getting FrameSizeChanged callbacks cutting the resolution. (I was getting them; i didn't verify for which side, but the fake video was showing artifacts)

Options are: libyuv (seems pretty unlikely now); chain between scale and encoder (unlikely), vp8 encoder when encoding 3/4*640x480 input (unlikely, possible), decoder (unlikely, possible per above), or chain between decoder and DeliverFrame()

Dumping frames in a readable state to disk at various points in the chain would help a lot.  An ugly uncompressed I420 file dumper that something (gimp?) can read would help a lot; there may be one somewhere in the tree esp. in the test code
(Reporter)

Comment 1

4 years ago
Created attachment 8442139 [details]
loadmanager.png
(Reporter)

Comment 2

4 years ago
Created attachment 8442266 [details] [diff] [review]
turn off variuous things to help isolate NOT FOR CHECK
(Assignee)

Comment 3

4 years ago
I traced the frame size set calls through both the encode and decode paths, and the sizes appear to always be correct at each point in time through a stress cycle.

I ran several tests between Firefox and Chrome. If the Firefox side is stressed and goes into load adaptation the Chrome side will decode with artifacts. If the Chrome side of the call is stressed there are no decode artifacts produced by the Firefox side.

Working on dumping frames from the VideoConduit.
>If the Chrome side of the call is stressed there are no decode artifacts produced by the Firefox side.

Does Chrome do resolution adaption? (I don't think they have it enabled)
(Assignee)

Comment 5

4 years ago
(In reply to Gian-Carlo Pascutto [:gcp] from comment #4)
> >If the Chrome side of the call is stressed there are no decode artifacts produced by the Firefox side.
> 
> Does Chrome do resolution adaption? (I don't think they have it enabled)

It doesn't appear to change resolution. The test was run to complete the exercise.
(Assignee)

Comment 6

4 years ago
Created attachment 8443124 [details] [diff] [review]
periodically dump received yuv frames to files NOT FOR CHECKIN

Periodically dump received yuv frames to files using new PeriodicWriter class (look for FIXME(PRK)). Can handle multiple sample points.
(Assignee)

Updated

4 years ago
Assignee: nobody → pkerr
Status: NEW → ASSIGNED
(Assignee)

Comment 7

4 years ago
I am loading the files created by the PeriodicWriter into GIMP as: raw image data (file type), Planar RGB, manually set the image size based on the expected size: 640x480, 480x360, 320x240 as indicated by the file size. This works best for the fake video stream.

At my first test point, at the DeliverFrame() call in VideoConduit.cpp, the data is already corrupted with the pattern seen on the receive side once the frames are reduced from 640x480 to 480x360.
(Reporter)

Comment 8

4 years ago
(In reply to Paul Kerr [:pkerr] from comment #7)
> I am loading the files created by the PeriodicWriter into GIMP as: raw image
> data (file type), Planar RGB, manually set the image size based on the
> expected size: 640x480, 480x360, 320x240 as indicated by the file size. This
> works best for the fake video stream.
> 
> At my first test point, at the DeliverFrame() call in VideoConduit.cpp, the
> data is already corrupted with the pattern seen on the receive side once the
> frames are reduced from 640x480 to 480x360.

DeliverFrame() should be the output of the decoder...  I'm 90% certain the error is on the encode side (maybe 95%).  Note that a new I420VideoFrame is created to hold the scaled image; this may have different strides than the non-scaled data had.  It's possible something is caching the stride somewhere (perhaps the u or v stride) - or that the stride or some other frame var going *into* the scaler is wrong, causing the scaler to mis-scale.  

DUmp the frame periodically at the output of the scaler; at the input to the encoder, at the output of the decoder and DeliverFrame, and rapidly we'll know where the corruption is
Whiteboard: p=3
Target Milestone: --- → mozilla33
Priority: -- → P1
(Assignee)

Comment 9

4 years ago
I have trace both the image data and I420VideoFrame meta-data through to VideoSender::AddVideoFrame. Everything seems to be good to that point. I will continue to follow the encoding to output path to ferrit out where the distortion is occuring.
(Assignee)

Comment 10

4 years ago
The first frame after a framesize reduction is always good, as far as I can tell. This is normally a key frame triggered by the framesize change. Distortion appears after this lead frame.
(Assignee)

Comment 11

4 years ago
To further expand on my previous comment. Periodically forcing a key frame to be generated will cause that frame to be displayed without artifacts on the receiving side. But, the distortion is only cleared for that one frame. It does not get steadily worse after the forced key frame as is the case if the encoder and decoder are off in their image references. Key frames are clear, all frames at the starting resolution are clear (not just 640x480), and non-key frames of reduced resolution show the stripped pattern. This appears to be an issue with VP8.
(Reporter)

Comment 12

4 years ago
derf: see comment 11 (and comment 10); preliminary info may point to a libvpx issue when the input size changes with a stream.

The corruption looks like some sort of stride issue.
Flags: needinfo?(tterribe)
Given that we are now encouraging people to use Nightly for Loop and that this is a fairly noticeable regression, I suggest we back out or at least disable the Load Management code until this is fixed.
Flags: needinfo?(rjesup)
Flags: needinfo?(adam)
(Assignee)

Comment 14

4 years ago
An alternate temporary solution would be to disable resolution reduction as part of the load management scheme and leave in the rate reduction, which works without a problem. This would give us some feedback on the LoadManager with Loop.
(In reply to Paul Kerr [:pkerr] from comment #14)
> An alternate temporary solution would be to disable resolution reduction as
> part of the load management scheme and leave in the rate reduction, which
> works without a problem. This would give us some feedback on the LoadManager
> with Loop.

That seems reasonable.

Comment 16

4 years ago
(In reply to Eric Rescorla (:ekr) from comment #13)
> Given that we are now encouraging people to use Nightly for Loop and that
> this is a fairly noticeable regression, I suggest we back out or at least
> disable the Load Management code until this is fixed.

This isn't just a Loop problem, though, right? I've seen these relics with relatively high frequency in regular WebRTC calls.

In any case, I agree that we should back out the resolution reduction (or, better yet, put it behind a pref that is off by default) until we have this issue chased to ground.
Flags: needinfo?(adam)
(In reply to Adam Roach [:abr] from comment #16)
> (In reply to Eric Rescorla (:ekr) from comment #13)
> > Given that we are now encouraging people to use Nightly for Loop and that
> > this is a fairly noticeable regression, I suggest we back out or at least
> > disable the Load Management code until this is fixed.
> 
> This isn't just a Loop problem, though, right? I've seen these relics with
> relatively high frequency in regular WebRTC calls.
>

Agreed. I was just thinking we might be able to tolerate this kind of artifact
on Nightly longer if we weren't actively encouraging people to use a feature
that is only on Nightly.

> In any case, I agree that we should back out the resolution reduction (or,
> better yet, put it behind a pref that is off by default) until we have this
> issue chased to ground.
(Assignee)

Updated

4 years ago
Whiteboard: p=3 → p=5
(In reply to Randell Jesup [:jesup] from comment #12)
> derf: see comment 11 (and comment 10); preliminary info may point to a
> libvpx issue when the input size changes with a stream.
> 
> The corruption looks like some sort of stride issue.

I'm not aware of any libvpx issues like this. We support resolution changes in the <video> tag, too, and have had people file bugs about correctly scaling them afterwards, so use of VP8's resolution-switching in the web is non-zero. See bug 626979, where there's a file encoded like this attached.
Flags: needinfo?(tterribe)
(Assignee)

Comment 19

4 years ago
Created attachment 8445883 [details] [diff] [review]
visual distortion work-around by re-initializing the vp8 encoder on frame size changes

Work around by re-inializing the vp8 encoder on frame size changes. This removes the video artifacts but does cause a freeze/glitch at each frame size change.
(Reporter)

Comment 20

4 years ago
Created attachment 8445905 [details] [diff] [review]
debug patch NOT FOR CHECKIN
(Reporter)

Comment 21

4 years ago
Created attachment 8446017 [details]
input_31.raw.png

Input image right before vpx_codec_encode() -- clean
Flags: needinfo?(rjesup)
(Reporter)

Comment 22

4 years ago
Created attachment 8446020 [details]
output_31.raw.png

output image right after decoding (in ReturnFrame()) - corrupt (view at 400% to make it clear).  Note that distortions do not move or change, though the chroma does as the source cycles.
(Reporter)

Comment 23

4 years ago
Created attachment 8446024 [details]
output_301.raw.png

example with live video.  In live video, the distortions aren't quite static; they are tied to the content and content motion.  Input live video was clean as with fake video.
(Assignee)

Comment 24

4 years ago
Created attachment 8446032 [details] [diff] [review]
visual distortion work-around by re-initializing the vp8 encoder on frame size changes

Updated patch. Uses minimal re-initialization by calling vpx_codec_enc_init() instead of vpx_codec_enc_config_set() in VP8EncoderImpl::UpdateCodecFrameSize when handling a frame size change. This results in minimal glitching on size change. This one change allows for an error free set of reduced frames.
(Reporter)

Comment 25

4 years ago
derf: as best I can determine, unless there's a fubar in the reconfig, the problem appears to be in the libvpx encoder when we change the input resolution - especially as pkerr's patch that does a full restart of the encoder on resolution changes makes the problem go away.

To convert the dumped frames from the patch in attachment 8445905 [details] [diff] [review], use
convert -size 480x360 -interlace plane -depth 8 -sampling-factor 4:2:0 yuv:$i $i.png
for input frames, and 
convert -size 544x360 -interlace plane -depth 8 -sampling-factor 4:2:0 yuv:$i $i.png
for output frames (ImageMagik's convert tool)
Flags: needinfo?(tterribe)
Flags: needinfo?(giles)
(Assignee)

Updated

4 years ago
Attachment #8445883 - Attachment is obsolete: true
(Reporter)

Comment 26

4 years ago
Comment on attachment 8446032 [details] [diff] [review]
visual distortion work-around by re-initializing the vp8 encoder on frame size changes

Review of attachment 8446032 [details] [diff] [review]:
-----------------------------------------------------------------

If this is working well with no other patches, file the followup codec bug, put the number in, and I think we can land this with minor nit-fixes.

::: media/webrtc/trunk/webrtc/modules/video_coding/codecs/vp8/vp8_impl.cc
@@ -136,5 @@
>    if (number_of_cores < 1) {
>      return WEBRTC_VIDEO_CODEC_ERR_PARAMETER;
>    }
>    feedback_mode_ = inst->codecSpecific.VP8.feedbackModeOn;
> -

don't remove blank lines unless there's a reason

@@ -157,5 @@
>    // TODO(andresp): assert(inst->extra_options) and cleanup.
>    Config default_options;
>    const Config& options =
>        inst->extra_options ? *inst->extra_options : default_options;
> -

ditto

@@ -331,5 @@
>    }
>    if (encoded_complete_callback_ == NULL) {
>      return WEBRTC_VIDEO_CODEC_UNINITIALIZED;
>    }
> -

ditto

@@ +342,5 @@
>      int ret = UpdateCodecFrameSize(input_image);
>      if (ret < 0) {
>        return ret;
>      }
> +#if 1 //FIXME(PRK) - work around for bug 1027100

#ifndef LIBVPX_ENCODER_RECONFIG_WORKS or some such with a comment with the bug # (and probably it would be a new bug on the config issue, and bug 1027100 will cover the workaround)

@@ +413,5 @@
>    // Update encoder context for new frame size.
>    // Change of frame size will automatically trigger a key frame.
>    config_->g_w = codec_.width;
>    config_->g_h = codec_.height;
> +#if 1 //FIXME(PRK) - bug1027100 work around

ditto
Attachment #8446032 - Flags: review+
(Assignee)

Updated

4 years ago
Blocks: 986513
(Assignee)

Updated

4 years ago
Depends on: 1030324
(Assignee)

Comment 27

4 years ago
Created attachment 8446110 [details] [diff] [review]
visual distortion work-around by re-initializing the vp8 encoder on frame size changes
(Assignee)

Updated

4 years ago
Attachment #8446032 - Attachment is obsolete: true
(Assignee)

Comment 28

4 years ago
Created attachment 8446220 [details] [diff] [review]
visual distortion work-around by re-initializing the vp8 encoder on frame size changes

ready for checkin
(Assignee)

Updated

4 years ago
Attachment #8446110 - Attachment is obsolete: true
(Assignee)

Comment 29

4 years ago
https://tbpl.mozilla.org/?tree=Try&rev=28bf46c408b1
r+ by Jesup carried forward.
(Reporter)

Comment 30

4 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/f3e706953551

We'll need to open a new bug for libvpx
Flags: needinfo?(giles)
(Assignee)

Comment 31

4 years ago
(In reply to Randell Jesup [:jesup] from comment #30)
> https://hg.mozilla.org/integration/mozilla-inbound/rev/f3e706953551
> 
> We'll need to open a new bug for libvpx

bug 1030324
(Reporter)

Updated

4 years ago
Blocks: 1030324
No longer depends on: 1030324
https://hg.mozilla.org/mozilla-central/rev/f3e706953551
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Depends on: 1198107
You need to log in before you can comment on or make changes to this bug.