[M(2)] Win10 Permafail - test_getUserMedia_basicWindowshare.html -- Error: canplaythrough event never fired

RESOLVED FIXED in Firefox 46

Status

()

P1
normal
Rank:
10
RESOLVED FIXED
4 years ago
3 years ago

People

(Reporter: Callek, Assigned: ng)

Tracking

unspecified
mozilla46
x86_64
Windows 10
Points:
---

Firefox Tracking Flags

(firefox46 fixed)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Reporter)

Description

4 years ago
13:27:22 INFO - 4324 INFO TEST-START | dom/media/tests/mochitest/test_getUserMedia_basicWindowshare.html
13:27:22 INFO - TEST DEVICES: No test devices found (in media.{audio,video}_loopback_dev, using fake streams.
13:28:22 INFO - TEST-INFO | started process screenshot
13:28:22 INFO - TEST-INFO | screenshot: exit 0
13:28:22 INFO - 4325 INFO TEST-PASS | dom/media/tests/mochitest/test_getUserMedia_basicWindowshare.html | A valid string reason is expected
13:28:22 INFO - 4326 INFO TEST-PASS | dom/media/tests/mochitest/test_getUserMedia_basicWindowshare.html | Reason cannot be empty
13:28:22 INFO - 4327 INFO Call getUserMedia for {"video":{"mozMediaSource":"window","mediaSource":"window"},"fake":false}
13:28:22 INFO - 4328 INFO TEST-PASS | dom/media/tests/mochitest/test_getUserMedia_basicWindowshare.html | No audio tracks shall be present
13:28:22 INFO - 4329 INFO TEST-PASS | dom/media/tests/mochitest/test_getUserMedia_basicWindowshare.html | One video track shall be present
13:28:22 INFO - 4330 INFO TEST-PASS | dom/media/tests/mochitest/test_getUserMedia_basicWindowshare.html | Track kind should be video
13:28:22 INFO - 4331 INFO TEST-PASS | dom/media/tests/mochitest/test_getUserMedia_basicWindowshare.html | Track id should be defined
13:28:22 INFO - 4332 INFO TEST-PASS | dom/media/tests/mochitest/test_getUserMedia_basicWindowshare.html | Stream should be a LocalMediaStream
13:28:22 INFO - 4333 INFO TEST-PASS | dom/media/tests/mochitest/test_getUserMedia_basicWindowshare.html | Before starting the media element, currentTime = 0
13:28:22 INFO - 4334 INFO TEST-FAIL | dom/media/tests/mochitest/test_getUserMedia_basicWindowshare.html | The author of the test has indicated that flaky timeouts are expected. Reason: WebRTC inherently depends on timeouts
13:28:22 INFO - 4335 INFO TEST-UNEXPECTED-FAIL | dom/media/tests/mochitest/test_getUserMedia_basicWindowshare.html | Error executing test: Error: canplaythrough event never fired MediaStreamPlayback.prototype.startMedia/</<@http://mochi.test:8888/tests/dom/media/tests/mochitest/mediaStreamPlayback.js:132:16 ... setTimeout handler*SimpleTest_setTimeoutShim@http://mochi.test:8888/tests/SimpleTest/SimpleTest.js:622:12 ... MediaStreamPlayback.prototype.startMedia/<@http://mochi.test:8888/tests/dom/media/tests/mochitest/mediaStreamPlayback.js:129:1 ... MediaStreamPlayback.prototype.startMedia@http://mochi.test:8888/tests/dom/media/tests/mochitest/mediaStreamPlayback.js:57:1 ... promise callback*@http://mochi.test:8888/tests/dom/media/tests/mochitest/test_getUserMedia_basicWindowshare.html:32:12 ... promise callback*runTestWhenReady@http://mochi.test:8888/tests/dom/media/tests/mochitest/head.js:279:10 ... runTest/<@http://mochi.test:8888/tests/dom/media/tests/mochitest/mediaStreamPlayback.js:232:15 ... promise callback*runTest@http://mochi.test:8888/tests/dom/media/tests/mochitest/mediaStreamPlayback.js:231:31 ... @http://mochi.test:8888/tests/dom/media/tests/mochitest/test_getUserMedia_basicWindowshare.html:17:3 ...
13:28:22 INFO - runTestWhenReady/<@dom/media/tests/mochitest/head.js:280:1
13:28:22 INFO - promise callback*runTestWhenReady@dom/media/tests/mochitest/head.js:279:1
13:28:22 INFO - promise callback*runTest@dom/media/tests/mochitest/mediaStreamPlayback.js:231:31
13:28:22 INFO - @dom/media/tests/mochitest/test_getUserMedia_basicWindowshare.html:17:3
13:28:22 INFO - MEMORY STAT | vsize 831MB | vsizeMaxContiguous 130015428MB | residentFast 231MB | heapAllocated 99MB
13:28:22 INFO - 4336 INFO TEST-OK | dom/media/tests/mochitest/test_getUserMedia_basicWindowshare.html | took 60260ms 


This was from a try run based on m-i from today ( https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=cb6c34882e41 )

Log for this failure: https://treeherder.mozilla.org/logviewer.html#?job_id=10554328&repo=try

I don't discount there being an issue on the host we need to adjust for, but I could use some insight from people here.

Randell can you help identify an owner please?
(Reporter)

Updated

4 years ago
Flags: needinfo?(rjesup)
andreas, any thoughts?  Or suggestions who should look at this, or is this a known intermittent?
Flags: needinfo?(rjesup) → needinfo?(pehrsons)
We didn't get "canplaythrough", which for a stream means we didn't get to readyState HAVE_CURRENT_DATA.

This is most likely (I'd say 90%) because there was never a video frame in the stream.

Looks like the window sharing code is from GIPS, so, uhm, time to pull down their latest code? webrtc_trace logs might reveal something as well. Over to you Randell.
Flags: needinfo?(pehrsons) → needinfo?(rjesup)
window sharing may be broken by Win10 changes
backlog: --- → webrtc/webaudio+
Rank: 10
Priority: -- → P1
jimm, dmajor: if one of you has a win10 machine, can you try https://mozilla.github.io/webrtc-landing/gum_test.html and try window, screen and app sharing?  If it doesn't work, can you re-run with NSPR_LOG_MODULES=mediamanager:4,webrtc_trace:65535,timestamp NSPR_LOG_FILE=somefile WEBRTC_TRACE_FILE=someotherfile (or WEBRTC_TRACE_FILE=nspr, which puts in the NSPR logs, but can kill perf - likely ok here).

If those all work, can you try a local run of the mochitest file (and modify the NSPR_LOG_MODULES in build/automation.py.in to set those, and add the trace file var)?

Thanks!
Flags: needinfo?(rjesup)
Flags: needinfo?(jmathies)
Flags: needinfo?(dmajor)
(In reply to Randell Jesup [:jesup] from comment #4)
> jimm, dmajor: if one of you has a win10 machine, can you try
> https://mozilla.github.io/webrtc-landing/gum_test.html and try window,
> screen and app sharing?  If it doesn't work, can you re-run with
> NSPR_LOG_MODULES=mediamanager:4,webrtc_trace:65535,timestamp
> NSPR_LOG_FILE=somefile WEBRTC_TRACE_FILE=someotherfile (or
> WEBRTC_TRACE_FILE=nspr, which puts in the NSPR logs, but can kill perf -
> likely ok here).
> 
> If those all work, can you try a local run of the mochitest file (and modify
> the NSPR_LOG_MODULES in build/automation.py.in to set those, and add the
> trace file var)?
> 
> Thanks!

I haven't had a chance to log anything but it isn't working for me in win10 release. The list of windows or screens to share is disabled with no entries. So confirming.
(In reply to Jim Mathies [:jimm] from comment #5)
> (In reply to Randell Jesup [:jesup] from comment #4)
> > jimm, dmajor: if one of you has a win10 machine, can you try
> > https://mozilla.github.io/webrtc-landing/gum_test.html and try window,
> > screen and app sharing?  If it doesn't work, can you re-run with
> > NSPR_LOG_MODULES=mediamanager:4,webrtc_trace:65535,timestamp
> > NSPR_LOG_FILE=somefile WEBRTC_TRACE_FILE=someotherfile (or
> > WEBRTC_TRACE_FILE=nspr, which puts in the NSPR logs, but can kill perf -
> > likely ok here).
> > 
> > If those all work, can you try a local run of the mochitest file (and modify
> > the NSPR_LOG_MODULES in build/automation.py.in to set those, and add the
> > trace file var)?
> > 
> > Thanks!
> 
> I haven't had a chance to log anything but it isn't working for me in win10
> release. The list of windows or screens to share is disabled with no
> entries. So confirming.

Sorry, false report. The drop down looked disabled (grayed out) but I could have clicked it. It had the nightly window listed. Selecting that, I am able to share it in the test page.
Flags: needinfo?(jmathies)
Failure appears to be here:
WARNING; (20:15:48:372 |  343) 1792; (desktop_frame_win.cc:53): Failed to allocate new window frame 0

media/webrtc/trunk/webrtc/modules/desktop_capture/desktop_frame_win.cc:53

  HBITMAP bitmap = CreateDIBSection(hdc, &bmi, DIB_RGB_COLORS, &data,
                                    section_handle, 0);
  if (!bitmap) {
    LOG(LS_WARNING) << "Failed to allocate new window frame " << GetLastError();

http://mozilla-releng-blobs.s3.amazonaws.com/blobs/try/sha512/fc96c2782c7ac271b6d06409bd926367e025a1b53bd075c4dd0165a1469ecfbeac691845eec8e653ee78a306555d7f69c453caf4a4a5040561f8f18a7e96dcbe

I know nothing of Windows DIB bitmaps, and the error (0) isn't very useful.  Jimm?
Flags: needinfo?(jmathies)
If the dims you pass request a 0x0 bitmap, you get this kind of result.
Flags: needinfo?(jmathies)
(Reporter)

Comment 9

4 years ago
Jim, Mreavy,

So taking inspiration from a pastebin on IRC here is the current patchset pushed to try:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=d5b28569c07b

This is based on a green m-c rev:

https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=205427533c56&filter-searchStr=Windows

This should be finished "soon" (not long after the build itself finish) and we'll see results.

n-i'ing you to be sure theres a followup once done
Flags: needinfo?(mreavy)
Flags: needinfo?(jmathies)
Flags: needinfo?(dmajor)
(desktop_frame_win.cc:53): Failed to allocate new window frame 0 height=4 width=-8 bitcount=32 biSize=40 biSizeImage=128

0 is GetLastError()

Note that biHeight = -size.height(), so size.height() == -4(!) and size.width = -8(!) (and biSizeImage = size.width()*bytesperpixel*size.height() = 128).  biSizeImage makes (kinda) sense in that it's negative*negative...

in media/webrtc/trunk/webrtc/modules/desktop_capture/win/window_capture_utils.cc, if window_placement.showCmd & SW_SHOWMAXIMIZED is true, it adjusts the width and height to remove Border width/height; perhaps those values are "unexpected" in this case?
Just confirming, this code run in the browser process correct?

Looking at the utils routine, it returns false when apis fail. Maybe some callers fail to check this?

http://mxr.mozilla.org/mozilla-central/source/media/webrtc/trunk/webrtc/modules/desktop_capture/win/window_capture_utils.cc

What kind of hardware are we running on here?
Flags: needinfo?(jmathies)
(Reporter)

Comment 12

4 years ago
(In reply to Jim Mathies [:jimm] from comment #11)
> What kind of hardware are we running on here?

Server Model:
    iX Systems - iX21X4 2U Neutron 
Date Purchased:
    2013-03-18    (incase it matters)

And from about:support on the host:

Graphics
--------

Adapter Description: NVIDIA GeForce GT 610
Adapter Drivers: nvd3dumx,nvwgf2umx,nvwgf2umx nvd3dum,nvwgf2um,nvwgf2um
Adapter RAM: 1024
Asynchronous Pan/Zoom: wheel input enabled
Device ID: 0x104a
Direct2D Enabled: true
DirectWrite Enabled: true (10.0.10240.16384)
Driver Date: 3-4-2014
Driver Version: 9.18.13.3523
GPU #2 Active: false
GPU Accelerated Windows: 1/1 Direct3D 11 (OMTC)
Subsys ID: 26153842
Supports Hardware H264 Decoding: Yes
Vendor ID: 0x10de
WebGL Renderer: Google Inc. -- ANGLE (NVIDIA GeForce GT 610 Direct3D11 vs_5_0 ps_5_0)
windowLayerManagerRemote: true
AzureCanvasBackend: direct2d 1.1
AzureContentBackend: direct2d 1.1
AzureFallbackCanvasBackend: cairo
AzureSkiaAccelerated: 0
(In reply to Jim Mathies [:jimm] from comment #11)
> Just confirming, this code run in the browser process correct?
> 
> Looking at the utils routine, it returns false when apis fail. Maybe some
> callers fail to check this?
> 
> http://mxr.mozilla.org/mozilla-central/source/media/webrtc/trunk/webrtc/
> modules/desktop_capture/win/window_capture_utils.cc
> 
> What kind of hardware are we running on here?

Or maybe instead of a failure here (since the follow up call to GetLastError returns 0) maybe the window you pass in is 0x0? Then you subtract the system non-client area border and Voilà, negative width/height.
(In reply to Jim Mathies [:jimm] from comment #13)
> > What kind of hardware are we running on here?
> 
> Or maybe instead of a failure here (since the follow up call to GetLastError
> returns 0) maybe the window you pass in is 0x0? Then you subtract the system
> non-client area border and Voilà, negative width/height.

A 0x0 window was what I suspect (once I saw the negatives, and why I mentioned the border subtraction).

The tests IIRC select a random window (first one in the list); I suspect some sort of fake non-visible window is first on the test machine.  We can verify this, and in the longer run improve the test to select a specific window somehow (would require test-only interfaces I think), or have it block from offering windows with negative sizes (which would require a bunch of geometry-check calls when building the list - but that's not a huge problem as we're planning to offer thumbnails).  Or select the *last* window!


If we can confirm this, we may want to disable the test on win10 until we resolve it.  (if selecting last window doesn't work)

(chatted with mreavy, we're in sync)
Flags: needinfo?(mreavy)
> A 0x0 window was what I suspect (once I saw the negatives, and why I
> mentioned the border subtraction).
> 
> The tests IIRC select a random window (first one in the list)

That's really awful. :( I'm amazed it actually works on other platforms. You might consider disabling the test on all platforms until this gets fixed.
(In reply to Jim Mathies [:jimm] from comment #15)
> > A 0x0 window was what I suspect (once I saw the negatives, and why I
> > mentioned the border subtraction).
> > 
> > The tests IIRC select a random window (first one in the list)
> 
> That's really awful. :( I'm amazed it actually works on other platforms. You
> might consider disabling the test on all platforms until this gets fixed.

Well, it *should* be able to share any window in the list it provides, so it should work.  That said, a specific window would be *better*, but: this is really painful to do in a mochitest, since we explicitly don't give applications any info or way to select which window is captured; that's totally up to the user.  Of course, in this case there is no use, which means we'd have to make a side-interface to force it to share a specific window (and we'd have to create said window as well).

Giving the user an option to share a window that isn't actually shareable is a bug (albeit a relatively minor one).  It just happens to be biting us here.
Perhaps we should block negative sizes as a belt-and-suspenders (limit to 0); negative sizes don't make sense anyways.

Should be a simple patch.  Looking for someone to take it and push a try to verify if it works.
Flags: needinfo?(rjesup)
Nico - would you feel comfortable taking this as a first bug?  Randell has agreed to mentor getting you through the process of landing this.   We're really just looking at blocking negative sizes for sanity.  Thanks!
Assignee: nobody → na-g
Flags: needinfo?(rjesup)
Maire - sure thing.
Additionally, it looks like on Windows 10 the Store application is launched at startup and suspended until the user actually launches it. The Store window is available in the drop down list, and sharing it results in sharing a black screen, but no errors. Once the Store has been launched by the user and closed it will not appear in the list until it is opened again.
Attachment #8695689 - Flags: review?(pkerr)
Attachment #8695689 - Attachment is obsolete: true
Attachment #8695689 - Flags: review?(pkerr)
Attachment #8697604 - Flags: review?(pkerr)

Updated

3 years ago
Attachment #8697604 - Flags: review?(pkerr) → review+

Comment 25

3 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/9fe63e24d192
Status: NEW → RESOLVED
Last Resolved: 3 years ago
status-firefox46: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla46
You need to log in before you can comment on or make changes to this bug.