Closed Bug 1921623 Opened 4 months ago Closed 2 months ago

Problem with Videoframe, when initialized using I420 Videoframe with scaling and cropping

Categories: Core :: Audio/Video: Web Codecs, defect, P3
Version: Firefox 130
Status: RESOLVED FIXED
Target Milestone: 134 Branch
Tracking Status: firefox134 --- fixed
People: Reporter: marten.richter, Assigned: marten.richter
Attachments: 9 files, 1 obsolete file

Steps to reproduce:

I have a rather complex web app. It gets a track from the webcam and attaches this track to a video element. A VideoFrame is then initialized from the video element (at about 20 to 25 fps). The VideoFrame is passed via a pipe to a worker, and inside the worker I use this code:

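      // Crop to visibleRect and request a scaled display size
      // (height rounded down to an even value, at least 1).
      resframe[out] = new VideoFrame(frame, {
        visibleRect,
        displayWidth: targetwidth,
        displayHeight: Math.max(((targetwidth * targetinvaspect) >> 1) << 1, 1)
      })
      frame.close()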

This scales the frame to a different size with different croppings.
The frame is then encoded and transmitted to another browser.
The received frame is distorted (see the screenshots for different scalings and croppings; they do not show the same image, but the camera is pointing at the same object). To me, it looks like a pixel format or memory access problem. It may also be concurrent writing. If I clone the VideoFrame instead, no distortion occurs; a screenshot of that is added for reference.
The pixel format appears to be I420. This is on Windows 11 with a recent Nvidia graphics card. It was also tested on Canary.
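
The display-size arithmetic from the snippet above can be isolated into a small pure function, which makes it easier to check which sizes are actually requested. This is only a sketch: the name evenDisplaySize and its parameters are hypothetical and do not come from the app.

```javascript
// Hypothetical helper mirroring the displayWidth/displayHeight computation above.
// invAspect is height divided by width of the desired output (e.g. 9 / 16).
function evenDisplaySize(targetWidth, invAspect) {
  // ">> 1" truncates to a 32-bit integer and halves it; "<< 1" doubles it again.
  // Together they round the height down to an even integer.
  const evenHeight = ((targetWidth * invAspect) >> 1) << 1;
  // Never return a height of 0, matching the Math.max(..., 1) in the report.
  return { displayWidth: targetWidth, displayHeight: Math.max(evenHeight, 1) };
}

console.log(evenDisplaySize(320, 9 / 16)); // { displayWidth: 320, displayHeight: 180 }
```

With a 640x480 webcam frame, this is how a 320x180 display size can end up attached to pixel data that is still 640x480.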

Actual results:

The picture is distorted.

Expected results:

It should not be distorted.

One example.

Attached image: How it should look like

I will try to attach a debugger to see more, but I am unfamiliar with the graphics and WebCodecs parts of Firefox.

I have to add that the VideoFrame is passed to the VideoEncoder. While looking into the VideoFrame code, it seems as if no scaling code is present there, and the scaling is expected to happen on the VideoEncoder side.

Component: Untriaged → Audio/Video: Playback
Product: Firefox → Core
  1. Please type "about:support" in your browser and copy-paste its contents here.
  2. Did your app work as expected in a previous version of Firefox? If yes, you can use "mozregression" to bisect to the exact change that caused this issue. You can take a look at how to use this tool at https://mozilla.github.io/mozregression/.
  3. Please either attach a testcase, or provide the devs with login credentials to a publicly accessible app. If the app cannot be made public, you can also share the credentials with the devs privately.
    Without a reproducible testcase, this issue would be infeasible to diagnose.
    Thanks!
Flags: needinfo?(marten.richter)
Attached file: OK (obsolete)

1.) It is now attached as attachment OK. (But it is localized..., I hope it is ok).
2.) Well, WebCodecs support started with 130, so it never worked before with Firefox. But it has worked for over a year on Chromium-based platforms. The processing path is a bit different, though, as Firefox currently lacks the ability to extract a VideoFrame directly from a media track, but I think there is already a tracking bug for this.
3.) The problem is that I cannot generate login credentials so easily. (The app is open source but rather difficult to set up. I may ask some people whether a private login on our installation can be shared with Firefox devs.) For now, more precisely for today, I will try to track it down in the debugger myself. I maintained a network audio/video client for VDR on Windows and Raspberry Pi for some years, so I have some audio/video coding experience. So far I have not been able to generate a simple test case.
However, you could help me if you can tell me where the cropping and scaling takes place. In VideoFrame or in the VideoEncoder?

Flags: needinfo?(marten.richter)

Ok, so your application uses WebCodecs. I have changed the component of this bug to reflect that.

I will cc the mozilla devs who work on Webcodecs - :padenot and :chunmin. They will be able to provide better guidance.

Thanks for testing with Firefox!

Component: Audio/Video: Playback → Audio/Video: Web Codecs
Flags: needinfo?(padenot)

(In reply to marten.richter from comment #8)

> 1.) It is now attached as attachment OK. (But it is localized..., I hope it is ok).

You can use the button "Copy raw data to clipboard" ("Rohdaten in die Zwischenablage kopieren") and attach the result here (it will copy a JSON file with only the structured data).

Attached file: aboutdata

1.) Ok, done.

I have debugged it, and I think I know the cause. However, fixing it needs a lot of work.

1.) The VideoFrames arrive from the webcam as 640x480 YUV420 and are stored as planar data.
2.) I constructed VideoFrames with a display size of 320x180. As far as I understand the spec, this is legal, and it works well with Chromium-based browsers. I followed the data in the video frames down to the encoder. VideoFrame does not seem to do any rescaling on this path, and I think this is intended.
3.) Finally, after some conversions of the holding structures (not of the pixel data), the frames arrive at WMFMediaDataEncoder and look correct there. Each frame correctly reports that it should be displayed at 320x180 and that the actual data is 640x480.
4.) The frames and strides are converted to NV12 at the resolution of the incoming samples, which is 640x480. The result is then copied to the Media Foundation sample object, so no rescaling occurs.
5.) However, the WMFMediaDataEncoder holds an mConfig object with the following values:
5.) However, the WMFMediaDataEncoder holds an mConfig object with the following values:

mCodec=H264 (1)
mSize  width=320, height=180
mBitrateMode Variable (1)
mBitrate 250000
mMinBitrate 0
mMaxBitrate 0
mUsage Realtime (0)
mHardwarePreference None (2)
mPixelFormat RGBA32 (0 '\0')
mSourcePixelFormat RGBA32 (0 '\0')
mScalabilityMode L1T3 (2)
mFramerate 25 '\x19'
mKeyframeInterval 0
mNumberOfChannels 0
mSamplerate 0
Some({rawData=0x0000022cb7ae88c0 "B" tag=0 '\0' })

Not all of these attributes are set on the MFT object. However, the width and height (320x180) are set on the encoder's output type, and no size is set on the input. So far, I think the RGBA32 setting is ignored, and everything is handled as NV12.
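
To get a feeling for the size mismatch described above, the NV12 buffer sizes for the two resolutions can be computed directly. This is plain arithmetic on the NV12 layout (a full-resolution Y plane followed by one interleaved UV plane at half resolution), assuming tightly packed rows; the function name nv12ByteLength is made up for this illustration.

```javascript
// Bytes needed for a tightly packed NV12 frame (no row padding assumed).
function nv12ByteLength(width, height) {
  const chromaWidth = (width + 1) >> 1;   // chroma is subsampled by 2 horizontally
  const chromaHeight = (height + 1) >> 1; // and by 2 vertically
  // Y plane + interleaved UV plane (2 bytes per chroma sample position).
  return width * height + 2 * chromaWidth * chromaHeight;
}

const inputBytes = nv12ByteLength(640, 480);  // what is copied into the sample
const outputBytes = nv12ByteLength(320, 180); // what the configured size implies
console.log(inputBytes, outputBytes); // 460800 86400
```

A sample holding 460800 bytes of 640x480 data is therefore more than five times larger than what a 320x180 configuration would imply, and its rows are twice as long.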

I see the following options to handle this:
a.) Identify this case early and throw an unsupported exception. But I would prefer support.
b.) One could try setting the input width and height on the encoder. The last time I worked with decoders on Windows was with DirectShow, and in my experience decoders and encoders seldom scale.
c.) Rewrite ConvertToNV12InputSample to include a rescaling branch using libyuv for the case where the input and output sizes differ.

I prefer option c). (Of course, other operating systems could have similar flaws.)
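
To illustrate what option c) has to do conceptually, here is a small JavaScript sketch that converts a planar I420 buffer to NV12 while rescaling. It is not the Firefox code and does not use libyuv: it uses simple nearest-neighbour sampling, assumes even dimensions and tightly packed planes, and the function name scaleI420ToNV12 is invented for this example.

```javascript
// Illustrative only: I420 (Y, U, V planes) -> NV12 (Y plane, interleaved UV plane)
// with nearest-neighbour rescaling. Assumes even sizes and stride == width.
function scaleI420ToNV12(src, srcW, srcH, dstW, dstH) {
  const srcCW = srcW >> 1, srcCH = srcH >> 1; // source chroma plane size
  const dstCW = dstW >> 1, dstCH = dstH >> 1; // destination chroma size
  const srcUOffset = srcW * srcH;                  // U plane follows Y
  const srcVOffset = srcUOffset + srcCW * srcCH;   // V plane follows U
  const dstUVOffset = dstW * dstH;                 // UV plane follows Y
  const dst = new Uint8Array(dstW * dstH + 2 * dstCW * dstCH);

  // Luma: pick the nearest source pixel for every destination pixel.
  for (let y = 0; y < dstH; y++) {
    const sy = Math.floor((y * srcH) / dstH);
    for (let x = 0; x < dstW; x++) {
      const sx = Math.floor((x * srcW) / dstW);
      dst[y * dstW + x] = src[sy * srcW + sx];
    }
  }
  // Chroma: same sampling at half resolution, written as U,V byte pairs.
  for (let y = 0; y < dstCH; y++) {
    const sy = Math.floor((y * srcCH) / dstCH);
    for (let x = 0; x < dstCW; x++) {
      const sx = Math.floor((x * srcCW) / dstCW);
      const out = dstUVOffset + y * dstW + 2 * x;
      dst[out] = src[srcUOffset + sy * srcCW + sx];
      dst[out + 1] = src[srcVOffset + sy * srcCW + sx];
    }
  }
  return dst;
}

// A 640x480 I420 frame scaled down to the 320x180 the encoder was configured for.
const i420 = new Uint8Array(640 * 480 * 3 / 2);
console.log(scaleI420ToNV12(i420, 640, 480, 320, 180).length); // 86400
```

A production implementation would use a proper filter and honour the real plane strides; the point here is only that the NV12 sample handed to the encoder should already have the configured output size.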

Next, I will try to make a short reproduction case.

This HTML file should show a rectangle and a triangle both above and below. In between, a VideoEncoder and a VideoDecoder are included to invoke the problematic functions. You may want to play around with the factor variable. For a value of 0.5, you see the problem on my Windows machine (compare it to Edge). For a factor of 1.0, everything is fine, and above 1.0, nothing is displayed (probably the encoder discarded the sample; otherwise, it could be a read of uninitialized data). The black background is caused by the alpha channel of the initial canvas and is a pure artifact.
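
For readers without the attachment, the encoder half of such a reproduction can be sketched roughly as follows. This is not the attached file: the function names, the canvas contents, and the codec string avc1.42001f are assumptions chosen for the sketch, and the decoding and display steps are left out. The browser-only part needs WebCodecs and OffscreenCanvas and is skipped elsewhere.

```javascript
// Pure helper: display size for a scale factor, rounded down to even values.
function scaledSize(width, height, factor) {
  const even = (v) => Math.max((Math.floor(v) >> 1) << 1, 2);
  return { width: even(width * factor), height: even(height * factor) };
}

// Browser-only sketch: draw a shape, wrap it in a VideoFrame, attach a smaller
// display size, and feed it to a VideoEncoder configured for that smaller size.
async function encodeScaledFrame(factor) {
  const source = new OffscreenCanvas(640, 480);
  const ctx = source.getContext('2d');
  ctx.fillStyle = 'red';
  ctx.fillRect(160, 120, 320, 240); // a rectangle makes distortion easy to see

  const { width, height } = scaledSize(source.width, source.height, factor);
  const chunks = [];
  const encoder = new VideoEncoder({
    output: (chunk) => chunks.push(chunk),
    error: (e) => console.error('encoder error', e),
  });
  encoder.configure({ codec: 'avc1.42001f', width, height, avc: { format: 'annexb' } });

  const full = new VideoFrame(source, { timestamp: 0 });
  // Same pixel data, smaller display size: any rescaling is left to the encoder.
  const scaled = new VideoFrame(full, { displayWidth: width, displayHeight: height });
  full.close();
  encoder.encode(scaled, { keyFrame: true });
  scaled.close();
  await encoder.flush();
  encoder.close();
  return chunks;
}

// Only run where WebCodecs is available (e.g. in a browser page or worker).
if (typeof VideoEncoder !== 'undefined' && typeof OffscreenCanvas !== 'undefined') {
  encodeScaledFrame(0.5).then((chunks) => console.log('encoded chunks:', chunks.length));
}
```

Feeding the resulting chunks to a VideoDecoder and drawing the decoded frames next to the source canvas would then show whether a factor of 0.5 distorts the picture.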

I will now investigate a second issue: the output is not decodable by Chromium, so I may file a second, related bug report soon.

Attachment #9427959 - Attachment is obsolete: true

I wrote that I would look into a second bug, but it is the same one. If the factor is set to 0.5 in this example, it also does not work. Here, the EncodedVideoChunks are written out into a JSON file, and you can re-upload them. (JSON is inefficient, but you can quickly see whether it is Annex B...) When I upload a file generated by Firefox to Chromium, it reports 'Failed to execute 'decode' on 'VideoDecoder': A key frame is required after configure() or flush(). If you're using AVC formatted H.264 you must fill out the description field in the VideoDecoderConfig. at reader.onload'. I guess this just means that with the broken scaling the bitstream is broken as well.
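
Checking whether dumped chunk data is Annex B can also be done programmatically: an Annex B stream begins with a 00 00 01 or 00 00 00 01 start code, whereas AVC-formatted data begins with a NAL unit length. The following is a heuristic sketch only; the function name looksLikeAnnexB is made up, and a length-prefixed chunk whose first NAL unit is exactly one byte long would be misclassified.

```javascript
// Heuristic: does the byte sequence begin with an H.264 Annex B start code?
function looksLikeAnnexB(bytes) {
  // Four-byte start code: 00 00 00 01
  if (bytes.length >= 4 && bytes[0] === 0 && bytes[1] === 0 &&
      bytes[2] === 0 && bytes[3] === 1) {
    return true;
  }
  // Three-byte start code: 00 00 01
  return bytes.length >= 3 && bytes[0] === 0 && bytes[1] === 0 && bytes[2] === 1;
}

// Example: a chunk starting with a start code followed by an SPS NAL header (0x67).
console.log(looksLikeAnnexB(new Uint8Array([0, 0, 0, 1, 0x67, 0x42]))); // true
// Example: a length-prefixed (AVC format) chunk with a 5-byte NAL unit.
console.log(looksLikeAnnexB(new Uint8Array([0, 0, 0, 5, 0x65, 1, 2, 3, 4]))); // false
```

In a browser, the bytes of an EncodedVideoChunk can be obtained with its copyTo() method before running such a check.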

Flags: needinfo?(padenot) → needinfo?(cchang)

I'll take a look.

Severity: -- → S4
Priority: -- → P3

Instead of creating a workaround, I decided to write a test. I hope it is okay that I submitted one. (The main downside is that it is also on the same thread.)

Assignee: nobody → marten.richter
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Attachment #9430561 - Attachment description: Bug 1921623 - Adding rescaling support to WMFMediaDataEncoder r?chunmin → WIP: Bug 1921623 - Adding rescaling support to WMFMediaDataEncoder
Attachment #9430561 - Attachment description: WIP: Bug 1921623 - Adding rescaling support to WMFMediaDataEncoder → Bug 1921623 - Adding rescaling support to WMFMediaDataEncoder
Attached file: encoder-rescale.html
Pushed by cchang@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/4cb52934d7d7
Adding rescaling support to WMFMediaDataEncoder r=chunmin
https://hg.mozilla.org/integration/autoland/rev/fd3bf72178b7
Adjust WPT expectations for various configurations. r=chunmin
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/49303 for changes under testing/web-platform/tests
Status: ASSIGNED → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → 134 Branch
Upstream PR merged by moz-wptsync-bot
Regressions: 1932745
Regressions: 1932754
See Also: → 1934976
Flags: needinfo?(cchang)