Open Bug 1927346 Opened 1 year ago Updated 25 days ago

Video "Take Snapshot" context-menu entry often produces fully black screenshots (and/because `canvas.drawImage(video)` doesn't draw anything)

Categories

(Core :: Graphics: Canvas2D, defect)

Desktop
Linux
defect

People

(Reporter: dholbert, Unassigned, NeedInfo)

Attachments

(8 files, 1 obsolete file)

STR:

  1. Load a video, e.g. this screencast https://bug1927344.bmoattachments.org/attachment.cgi?id=9433531
  2. Pause the video.
  3. Take a snapshot (right-click the video, and choose "Take Snapshot" [S]) (note, this is a video-specific context menu option -- different from the more general "Take Screenshot" context-menu option -- note Snap vs Screen in the name)
  4. Look at the snapshot image that was generated. If it's fine, try seeking forward a bit (e.g. press spacebar twice to unpause/pause), and go back to step 3.

ACTUAL RESULTS:
At some points in the video, Take Snapshot reliably produces a fully-black snapshot, despite the video showing a regular video frame.

EXPECTED RESULTS:
No such fully-black snapshots.

I've noticed this with several screencasts that I've captured in the past week (where I've tried to extract snapshots from the videos using this feature). When this fails, it reliably fails until I seek the video sufficiently forward to get past the "bad part".

Here's an example of a bad snapshot that I got when performing the STR.

This was when I was seeked to t=2.657151 (as reported by the <video> element's currentTime attribute).

I was hoping that maybe I'd be able to reliably repro the bug by seeking to that particular currentTime in another copy of the video in another tab, but unfortunately that's not reproducing the bug. My original tab (paused at that timestamp) continues to reproduce the issue, though.

So: this bug depends on being seeked to particular timestamps in a more-fine-grained way than this currentTime value captures, or it depends on some particular state of the tab or playback session, or something else.

Nothing shows up in the web console or browser console when this bug reproduces, so this isn't something failing in a console-reported way.

I'm using Nightly 133.0a1 (2024-10-27) (64-bit) on Ubuntu 24.04

Attachment #9433534 - Attachment description: sample screenshot that was generated when following the STR → sample snapshot that was generated when following the STR
OS: Unspecified → Linux
Hardware: Unspecified → Desktop
Flags: needinfo?(karlt)

Guessing this might be related to image snapshotting, which I would guess is a widget bug. Maybe Karl has some ideas. We have some similar issues on Windows.

I'm not too familiar with exactly where snapshots hook in, but nsDisplayVideo has a couple of different paths. I would guess that Paint() might be the path used for snapshots, but not for typical video presentation.

Flags: needinfo?(karlt)

Might be some weird race condition. I tried reproducing on Windows and macOS with no luck. Might be Linux-specific.

I tried to repro ~30 times with no luck running Debian Bookworm / Xorg. If you're running Wayland, is there any chance that might have something to do with it?

I was using Wayland, yeah. I don't know enough about our video pipeline or "take snapshot" internals to know if that might matter here or not.

Toggling ni=me to try reproducing some more on the machine where I had hit this to see if I can come up with any more info...

Flags: needinfo?(dholbert)

I was able to repro again just now, using Firefox Nightly 134.0a1 (2024-11-10).

I tried to recreate what I was doing when I originally hit this on this machine, which was this (in a bit more detail) -- the goal is to try to pause at the moment where it looks like this, where the blue-bordered rectangle only occupies part of the height. To do that:

  • Seek to a moment just before that blue rectangle appears -- roughly t=4s -- and then pause the video.
  • Right-click the video and choose "Inspect".
  • In DevTools console, type "$0.playbackRate = 0.1" to reduce the video speed. ($0 refers to the DOM element that's focused in the inspector, which in this case is the video.)
  • Now, click the video to play, and then pause (e.g. with spacebar) as soon as you see that blue rectangle appear (at its smaller height).
  • Right-click the video and choose "Take Snapshot", and then look at the resulting snapshot that was saved.
  • If that doesn't reproduce the bug, then seek the video ever-so-slightly forward, by e.g. pressing space twice (to play/pause), and try another snapshot.

Machine details, in case it's relevant:
Model: Microsoft Surface Pro 7+
OS: Ubuntu 24.04.1 LTS

From about:graphics:
Compositing WebRender
WebGL 2 Driver Renderer Intel -- Mesa Intel(R) UHD Graphics (TGL GT2)
I'm not using Wayland, as it turns out. ($XDG_SESSION_TYPE is x11)

The moment at which I repro'd the issue was when the video had currentTime == 4.147145 -- though from prior attempts, I'm pretty sure that simply seeking directly to that time won't trigger the issue.

Flags: needinfo?(dholbert)
Attached file diagnostic patch v1

I've been trying to capture this in rr for analysis, but so far haven't been able to. But I did manage to just reproduce it in a local opt build with a little bit of debug logging enabled (see attached patch).

The debug logging confirmed that the dataURL that JS is working with here...
https://searchfox.org/mozilla-central/rev/f1c3f57cfdddd17b0a249a9c511d1791d5edd444/browser/base/content/nsContextMenu.sys.mjs#1772-1773

this.actor.saveVideoFrameAsImage(this.targetIdentifier).then(dataURL => {
  // FIXME can we switch this to a blob URL?

...is already the solid-black busted image. (Perhaps not too surprising, but a small amount of narrowing-the-window-of-where-things-might-be-going-wrong.)

Here's the busted all-black data URI that was logged when I triggered the bug.

Moving the problem one step earlier in the pipeline: while in the bad state (I've still got the video paused at the point where it generates black snapshots, which it persistently does once I'm at such a point, per last sentence in comment 0), I tried running a slightly-modified-to-work version of the code that we actually use to serialize a data URI:
https://searchfox.org/mozilla-central/rev/f1c3f57cfdddd17b0a249a9c511d1791d5edd444/browser/actors/ContextMenuChild.sys.mjs#222,226-238

case "ContextMenu:SaveVideoFrameAsImage": {
...
  let canvas = this.document.createElementNS(
    "http://www.w3.org/1999/xhtml",
    "canvas"
  );
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;

  let ctxDraw = canvas.getContext("2d");
  ctxDraw.drawImage(video, 0, 0);

  // Note: if changing the content type, don't forget to update
  // consumers that also hardcode this content type.
  return Promise.resolve(canvas.toDataURL("image/jpeg", ""));

...and I confirmed that I do indeed get the same all-black-image data URI from that JS, too.

Here's the specific JS that I ran in the web console, for reference:

let video = document.querySelector("video");
let canvas = document.createElement("canvas");
canvas.width = video.videoWidth;
canvas.height = video.videoHeight;
let ctxDraw = canvas.getContext("2d");
ctxDraw.drawImage(video, 0, 0);
canvas.toDataURL("image/jpeg", "")

Also: if I run that instead with a image/png, I get a fully transparent PNG instead of a fully black image!
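The black-JPEG versus transparent-PNG difference is what you would expect if drawImage() leaves the canvas untouched: a fresh 2D canvas starts out as transparent black, and JPEG has no alpha channel, so the encoder composites those pixels onto opaque black. A minimal sketch for checking that state directly from the pixel data follows. The names isUntouchedCanvas and videoDrawsNothing are hypothetical helpers, not code from Firefox or from this bug.

```javascript
// Hypothetical helper: returns true when every byte of the flat
// [r, g, b, a, r, g, b, a, ...] array is 0, i.e. the canvas is still
// transparent black and nothing has been drawn to it.
function isUntouchedCanvas(rgba) {
  for (let i = 0; i < rgba.length; i++) {
    if (rgba[i] !== 0) {
      return false;
    }
  }
  return true;
}

// Browser-only usage. Assumptions: the video is same-origin (otherwise
// getImageData throws a SecurityError) and has nonzero videoWidth/videoHeight.
function videoDrawsNothing(video) {
  const canvas = document.createElement("canvas");
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  const ctx = canvas.getContext("2d");
  ctx.drawImage(video, 0, 0);
  const pixels = ctx.getImageData(0, 0, canvas.width, canvas.height).data;
  return isUntouchedCanvas(pixels);
}
```

In the web console, videoDrawsNothing(document.querySelector("video")) would be expected to return true while the tab is in the bad state, independent of which image type is used for serialization.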

This JS snippet reliably seeks me to a time that triggers the bug, at least on my Surface Pro 7+.

STR:

  1. Load https://bug1927344.bmoattachments.org/attachment.cgi?id=9433531
  2. Ctrl+Shift+K to open web console.
  3. Paste the contents of this attachment into web console and hit Enter.
  4. Wait a few seconds (might take a bit while the JS runs), until you see "FINISHED" in web console, and see what happens after that.

ACTUAL RESULTS:
I always see the logging "FOUND A TIMESTAMP THAT REPROS THE BUG" and my snapshots are reliably-bad.

EXPECTED RESULTS:
Should not have found a timestamp that repros the bug.

(Note: so far, my beefy Lenovo ThinkStation [running Ubuntu 24.10] does not repro the bug with the STR in comment 14. It sits for a minute or so running the JS (testing the data URI at every millisecond from t=0 to t=5s) and then it lets me know that it didn't hit the bug.

But my Surface Pro 7+ seems to reliably trigger the bug with comment 14.)
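The attachment's source is not reproduced in this thread. As a rough sketch of the kind of scan described above (step currentTime in 1 ms increments from t=0 to t=5s and test a snapshot at each step), something like the following could work. This is a hypothetical reconstruction: findBadTimestamp, makeBrowserProbes, and their parameters are illustrative names, and the real attachment may be structured differently.

```javascript
// Hypothetical reconstruction; the real attachment may differ.
// seekTo(seconds): returns a Promise that resolves once the seek is done.
// frameIsBlank(): returns true if a snapshot of the current frame is empty.
async function findBadTimestamp(seekTo, frameIsBlank, { endMs = 5000, stepMs = 1 } = {}) {
  for (let ms = 0; ms <= endMs; ms += stepMs) {
    const t = ms / 1000;
    await seekTo(t);
    if (frameIsBlank()) {
      return t; // stop here so the video stays parked at the bad timestamp
    }
  }
  return null; // no blank snapshot found in the scanned range
}

// Browser wiring. Assumptions: metadata is already loaded and the media is
// same-origin, so getImageData() is allowed.
function makeBrowserProbes(video) {
  const canvas = document.createElement("canvas");
  const ctx = canvas.getContext("2d");
  const seekTo = t =>
    new Promise(resolve => {
      video.addEventListener("seeked", resolve, { once: true });
      video.currentTime = t;
    });
  const frameIsBlank = () => {
    canvas.width = video.videoWidth; // assigning width also clears the canvas
    canvas.height = video.videoHeight;
    ctx.drawImage(video, 0, 0);
    const { data } = ctx.getImageData(0, 0, canvas.width, canvas.height);
    return data.every(v => v === 0);
  };
  return { seekTo, frameIsBlank };
}
```

In a page with a single video, the two pieces would be combined as: const { seekTo, frameIsBlank } = makeBrowserProbes(document.querySelector("video")); then await findBadTimestamp(seekTo, frameIsBlank). Keeping the loop separate from the DOM access makes it possible to exercise the scan logic with stand-in functions.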

I can reproduce on my Dell XPS 16 (using STR in comment 14).

Comparing to my Surface Pro 7+....

  • different: my Dell XPS 16 is using Wayland.
  • same: I'm running Ubuntu 24.04.1 LTS, with Mesa graphics driver:
WebGL 2 Driver Renderer	Intel -- Mesa Intel(R) Arc(tm) Graphics (MTL)

Also, I can reproduce in a debug and opt build on both machines (using STR from comment 14), but unfortunately the bug goes away if I try to capture in rr (including with chaos mode). So this is a bit of a heisen-bug that goes away if it's observed too closely. Not sure if that's due to timing vs. something else that works out differently when running under rr.

However, I can still poke around in gdb to see what happens after I've found a bad timestamp. I ran comment 14's STR, attached GDB to the content process, put a breakpoint in mozilla::dom::CanvasRenderingContext2D::DrawImage, and then picked "Take Snapshot" from the video's context menu. From stepping through, I saw that we're finding the drawable-image-thing using this backtrace:

#0  mozilla::layers::AutoLockImage::GetImage (this=0x7fffc99a40f8, aTimeStamp=...)
    at obj-debug/dist/include/ImageContainer.h:724
#1  0x000070b6312f86a7 in mozilla::dom::HTMLMediaElement::GetCurrentImage (this=0x70b61a309800)
    at dom/html/HTMLMediaElement.cpp:2320
#2  0x000070b634e3e667 in nsLayoutUtils::SurfaceFromElement
    (aElement=0x70b61a309800, aSurfaceFlags=265, aTarget=[(mozilla::gfx::DrawTargetRecording *)] = {...}, aOptimizeSourceSurface=false) at layout/base/nsLayoutUtils.cpp:7279
#3  0x000070b630200704 in mozilla::dom::CanvasRenderingContext2D::DrawImage

...and owningImage->mImage (the thing we return and draw to the canvas) is of type mozilla::layers::GPUVideoImage.

I don't know much about mozilla::layers::GPUVideoImage, but given "GPU" in the name, plus the fact that I'm seeing this on machines with Mesa graphics so far, I wonder if this is just a graphics driver bug...

Thanks for the detailed debugging notes! I gave your script a try on my machine three times. Each time it took some time to run but eventually output "didn't hit the bug. too bad". I'm running AMD, not Intel, so that might be some more evidence for this being a graphics driver bug. What GPU are you running on your ThinkStation where it wouldn't repro? On my side, I have:

WebGL 2 Driver Renderer	AMD -- AMD Radeon Pro WX 3200 Series (polaris12, LLVM 15.0.6, DRM 3.49, 6.1.0-26-amd64)
WebGL 2 Driver Version	4.6 (Core Profile) Mesa 22.3.6
Flags: needinfo?(dholbert)

Beefy ThinkStation (which doesn't repro) has this GPU:

WebGL 2 Driver Renderer	AMD -- AMD Radeon Pro WX 3200 Series (radeonsi, polaris12, LLVM 19.1.0, DRM 3.58, 6.11.0-9-generic)
WebGL 2 Driver Version	4.6 (Core Profile) Mesa 24.2.3-1ubuntu1

One other thing I realized/noticed, though: looking at diffs between about:support on repro'ing/not-repro'ing configs, I think the relevant differentiator is whether or not hardware decoding is available. I assume my webm video is VP8/VP9, and I noticed that my machines that reproduce the bug have this in their Codec Support Information section:

Codec Name   Software Decoding   Hardware Decoding
VP9          Supported           Supported
VP8          Supported           Supported

Whereas on those same machines when I run Firefox under rr and the bug becomes not-reproducible, "Hardware Decoding" changes to "Unsupported"; and that's how it is by default on my Thinkstation too.

Flags: needinfo?(dholbert)
Attached file reduced testcase 1 (standalone) (obsolete) —

Here's a reduced testcase that captures a series of screenshots in canvases (which you can see in the document).

On my machines/configs that repro the bug, all or most of these canvases end up red, indicating that the video failed to draw.
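The testcase source is not shown in this thread, so the following is an assumption about the general technique rather than a copy of the attachment: pre-fill the canvas with a sentinel color before calling drawImage(), so that a call which silently draws nothing leaves the sentinel visible. The names drawWithSentinel and isAllSentinel are hypothetical.

```javascript
// Paint a solid red sentinel, then draw the video frame on top of it.
// If drawImage() draws nothing, the canvas stays red.
function drawWithSentinel(ctx, video, width, height) {
  ctx.fillStyle = "red";
  ctx.fillRect(0, 0, width, height);
  ctx.drawImage(video, 0, 0, width, height);
}

// True when every pixel in the flat RGBA array is still opaque pure red.
// Caveat: a video frame that really is solid red would also match.
function isAllSentinel(rgba) {
  for (let i = 0; i < rgba.length; i += 4) {
    const isRed =
      rgba[i] === 255 && rgba[i + 1] === 0 && rgba[i + 2] === 0 && rgba[i + 3] === 255;
    if (!isRed) {
      return false;
    }
  }
  return true;
}
```

In a browser, the pixel data would come from ctx.getImageData(0, 0, width, height).data, which requires the video to be same-origin with the page.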

I did notice that MDN mentions this about Canvas.drawImage:

drawImage() only works correctly on an HTMLVideoElement when its HTMLMediaElement.readyState is greater than 1 (i.e., seek event fired after setting the currentTime property).
https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/drawImage#notes

(Typo there, I think MDN means "seeked" rather than seek.)

So in this testcase, I'm dutifully waiting for "seeked" (which indeed seems to be necessary for the test to work in Firefox and Chrome). But on my machines that repro the issue, this isn't sufficient to avert test-failure.
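For reference, the "wait for seeked, then draw" pattern can be sketched as below. This is a minimal illustration under my own assumptions, not the testcase's actual code; seekAndWait and drawFrameAt are hypothetical names. On affected configurations, the drawImage() call at the end is the step that draws nothing even though the seek has completed and readyState is above 1.

```javascript
// Hypothetical helper: set currentTime and resolve once "seeked" fires.
// Rejects if the element reports a media error first.
function seekAndWait(video, seconds) {
  return new Promise((resolve, reject) => {
    const onSeeked = () => {
      video.removeEventListener("error", onError);
      resolve(video.currentTime);
    };
    const onError = () => {
      video.removeEventListener("seeked", onSeeked);
      reject(new Error("media error while seeking"));
    };
    video.addEventListener("seeked", onSeeked, { once: true });
    video.addEventListener("error", onError, { once: true });
    video.currentTime = seconds;
  });
}

// Seek, confirm that frame data is available, then draw into the context.
async function drawFrameAt(video, ctx, seconds) {
  await seekAndWait(video, seconds);
  if (video.readyState < 2) { // 2 is HTMLMediaElement.HAVE_CURRENT_DATA
    throw new Error("no frame data available after seek");
  }
  ctx.drawImage(video, 0, 0);
}
```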

er, I guess the video isn't same-origin for this testcase because it's on another bmoattachments.org subdomain. Lemme upload the video and see if that makes it be considered same-origin...

Here's a screencast showing some good results in Chrome, vs. bad results (red) in Firefox, on a machine that reproduces the bug.

Note that Chrome fails to seek to the requested part of the video at all for some reason (and sometimes Firefox fails too, for the very first Canvas snapshot, as can sometimes be seen by the blue progress-bar being slightly longer than expected in Firefox's first Canvas snapshot).

But the important part here is that Firefox ends up with some red canvases, which are spots where our drawImage() call failed to do anything at all.

Attachment #9436940 - Attachment description: reduced testcase 1 (standalone) → reduced testcase 1 (standalone, must be loaded in foreground tab for results to be valid)

STR:

  1. Load reduced testcase 1 in a foreground tab.[1]

EXPECTED RESULTS:
None of the canvas snapshots should be red.
(Instead: each of the canvas snapshots should look the same way that the video looked, at the moment that the canvas snapshot was appended.)

ACTUAL RESULTS:
Many (or all) of the canvas snapshots are red.

In order to reproduce ACTUAL RESULTS, I think you need to be on Linux with both hardware WebRender and hardware VP8/VP9 support (as shown in about:support). The bug stops reproducing if I change any of those variables (e.g. if I test with software webrender; or if I test on my ThinkStation machine which lacks hardware VP8/VP9 support {regardless of software/hardware WR}; or test on Windows/Mac).

One other interesting aspect is that I get ACTUAL RESULTS on Android, even if I force-enable software webrender. So maybe there's something special on Android that makes this more prone to occurring there.

[1] (Note on why to load it in a foreground tab -- if you load it in a background tab, the 2nd and 3rd snapshot are consistently red and then white, on all desktop Firefox configs/platforms that I've tested, which seems odd but maybe-expected or at least a different issue; not sure if that's some sort of unintended race, vs. a privacy or performance mitigation for background tabs vs. something else.)
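If a test page needs to avoid the background-tab behavior described in the footnote, one option is to defer the capture until the document is visible. The following is only a sketch of that idea, using the standard visibilityState property and visibilitychange event; whenVisible is a hypothetical helper name and nothing here is taken from the attached testcase.

```javascript
// Hypothetical helper: resolves immediately if the document is already
// visible, otherwise waits for a visibilitychange to the "visible" state.
function whenVisible(doc) {
  return new Promise(resolve => {
    if (doc.visibilityState === "visible") {
      resolve();
      return;
    }
    const onChange = () => {
      if (doc.visibilityState === "visible") {
        doc.removeEventListener("visibilitychange", onChange);
        resolve();
      }
    };
    doc.addEventListener("visibilitychange", onChange);
  });
}
```

A page would call whenVisible(document).then(...) and start seeking and snapshotting inside the callback.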

Summary: Video "Take Snapshot" context-menu entry often produces fully black screenshots → Video "Take Snapshot" context-menu entry often produces fully black screenshots (and/because `canvas.drawImage(video)` doesn't draw anything)

(severity-wise, this is probably S3 unless we're thinking that this reproduces widely with e.g. video+canvas mashup sites, etc.)

See Also: → 1526207

The severity field is not set for this bug.
:jimm, could you have a look please?

For more information, please visit BugBot documentation.

Flags: needinfo?(jmathies)

From some printfs, CanvasRenderingContext2D::DrawImage() knows that it is getting no source surface but chooses to pretend all is OK.

nsLayoutUtils::SurfaceFromElement() gets an mLayersImage

This sounds likely to be a graphics issue.

Reproducing on similar "Mesa Intel(R) Arc(tm) Graphics (MTL)".

Component: Audio/Video → Graphics: Canvas2D
Flags: needinfo?(jmathies)

The severity field is not set for this bug.
:lsalzman, could you have a look please?

For more information, please visit BugBot documentation.

Flags: needinfo?(lsalzman)

Andrew, ideas?

Severity: -- → S3
Flags: needinfo?(lsalzman) → needinfo?(aosmond)
Duplicate of this bug: 1895022

On version 137.0.1, I am getting fully black screenshots 9 out of 10 times when I take snapshots from YouTube. I will try and see if I get black screens with vp09 videos.

some info:

Operating System: EndeavourOS
KDE Plasma Version: 6.3.4
KDE Frameworks Version: 6.12.0
Qt Version: 6.9.0
Kernel Version: 6.14.2-arch1-1 (64-bit)
Graphics Platform: Wayland
Processors: 12 × AMD Ryzen 5 9600X 6-Core Processor
Memory: 30.5 GiB of RAM
Graphics Processor: AMD Radeon Graphics
Manufacturer: ASUS

WebGL 1 Driver Renderer AMD -- AMD Radeon Graphics (radeonsi, raphael_mendocino, LLVM 19.1.7, DRM 3.61, 6.14.2-arch1-1)
WebGL 1 Driver Version 4.6 (Compatibility Profile) Mesa 25.0.3-arch1.1

WebGL 2 Driver Renderer AMD -- AMD Radeon Graphics (radeonsi, raphael_mendocino, LLVM 19.1.7, DRM 3.61, 6.14.2-arch1-1)
WebGL 2 Driver Version 4.6 (Core Profile) Mesa 25.0.3-arch1.1

Sotaro, any ideas what's going on here? See Karlt's explanation in comment 27

Flags: needinfo?(sotaro.ikeda.g)

I happened to hit this again today on my aforementioned "beefy ThinkStation" FWIW, when trying to capture a snapshot with the "Take Snapshot" context-menu entry, while viewing my screencast at https://bug1968090.bmoattachments.org/attachment.cgi?id=9490186 .

In case the driver info is useful, here's what I've got from about:support (still Mesa as in comment 8).

WebGL 2 Driver Renderer	AMD -- AMD Radeon Pro WX 3200 Series (radeonsi, polaris12, LLVM 19.1.1, DRM 3.59, 6.11.0-24-generic)
WebGL 2 Driver Version	4.6 (Core Profile) Mesa 24.2.8-1ubuntu1~24.10.1

and in "Codec Support Information", I see:

Codec Name   Software Decoding   Hardware Decoding
VP9          Supported           Unsupported
VP8          Supported           Unsupported

...with this in the "Decision Log":

VP8_HW_DECODE	
default	available		
env	blocklisted	#BLOCKLIST_FEATURE_FAILURE_VIDEO_DECODING_MISSING	Blocklisted; failure code FEATURE_FAILURE_VIDEO_DECODING_MISSING
VP9_HW_DECODE	
default	available		
env	blocklisted	#BLOCKLIST_FEATURE_FAILURE_VIDEO_DECODING_MISSING	Blocklisted; failure code FEATURE_FAILURE_VIDEO_DECODING_MISSING

In this case I am using wayland, and I'm on Ubuntu 24.10.

(Also: after hitting the bug with "Take Snapshot", I tried the steps from comment 14 in the same Firefox session where I had just tripped over the issue, but those still don't hit the bug on demand on this ThinkStation machine.)

Duplicate of this bug: 2000375