Closed Bug 1076833 Opened 10 years ago Closed 9 years ago

Intermittent test_videocontrols_audio_direction.html | Test timed out. - expected PASS

Categories

(Core :: Audio/Video, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla42
Tracking Status
firefox42 --- fixed
firefox43 --- fixed

People

(Reporter: cbook, Assigned: jwwang)

References

()

Details

(Keywords: intermittent-failure)

Attachments

(2 files, 2 obsolete files)

Rev4 MacOSX Snow Leopard 10.6 fx-team opt test mochitest-5

https://treeherder.mozilla.org/ui/logviewer.html#?job_id=804434&repo=fx-team

03:45:44 INFO - 943 INFO TEST-UNEXPECTED-FAIL | /tests/toolkit/content/tests/widgets/test_videocontrols_audio_direction.html | Test timed out. - expected PASS
This occurs more frequently with --run-by-dir enabled.
Here is an example on try server:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=351ba2066462&filter-searchStr=mochitest-5%2010.6%20opt
Flags: needinfo?(ehsan)
Blocks: 1162003
(In reply to Kaustabh Datta Choudhury from comment #503)
> This occurs more frequently with --run-by-dir enabled.

I can believe that.  :-)

> Here is an example on try server:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=351ba2066462&filter-
> searchStr=mochitest-5%2010.6%20opt

What information do you need from me?
Flags: needinfo?(ehsan) → needinfo?(kaustabh93)
:ehsan, it looks like you authored this test originally:
https://hg.mozilla.org/mozilla-central/filelog/e7e69cc8c07b/toolkit/content/tests/widgets/test_videocontrols_audio_direction.html

Do you have ideas of what could be done to prevent this timeout?
Flags: needinfo?(kaustabh93)
I think this is more related to Audio/Video. I guess the progress event is not fired or something:
http://mxr.mozilla.org/mozilla-central/source/toolkit/content/tests/widgets/videocontrols_direction_test.js?force=1#23

Or "if (b.length == 1 && b.end(0) == v.duration) {" doesn't work when this failure happens.

JW Wang, do you know what might cause the timeout here? Or perhaps some logging code should be added here?
Component: Widget → Video/Audio
Flags: needinfo?(jwwang)
(In reply to Joel Maher (:jmaher) from comment #508)
> :ehsan, it looks like you authored this test originally:
> https://hg.mozilla.org/mozilla-central/filelog/e7e69cc8c07b/toolkit/content/
> tests/widgets/test_videocontrols_audio_direction.html
> 
> Do you have ideas of what could be done to prevent this timeout?

Not without debugging it.  If you need me to do that please needinfo me, but it may take me weeks to get to it with my current workload.  :(  Someone on the media team can probably look sooner.
Hi Martijn,

Could you please have a look into this and help us fix this issue?
Flags: needinfo?(martijn.martijn)
Kaustabh, I can't reproduce this, I tried: ./mach mochitest -f plain  toolkit/content/tests/widgets/
But I don't get any failures at all.
What is the exact thing that you did running on the tryserver?
Flags: needinfo?(martijn.martijn) → needinfo?(kaustabh93)
This is a screenshot of when the timeout failure happens.
Similar failures are logged in bug 934853 for test_videocontrols_video_direction.html.
See Also: → 934853
(In reply to Martijn Wargers [:mwargers] (QA) from comment #527)
> Created attachment 8632977 [details]
> mozilla-test-fail-screenshot_qRx8IG.png
> 
> This is a screenshot of when the timeout failure happens.

That shows the progress bar has different colors.  So perhaps the UI code is waiting for an event of some sort to change the color, fade something in/out of whatever.  Someone who knows the UI code should look at this.
:mwargers, any ideas who would know the UI code for the progress bar/media stuff?
(In reply to Joel Maher (:jmaher) from comment #536)
> :mwargers, any ideas who would know the UI code for the progress bar/media
> stuff?

Sorry, I wouldn't know, but it seems to me that the load event for the next page is not firing for some reason.
(In reply to Martijn Wargers [:mwargers] (QA) from comment #526)
> Kaustabh, I can't reproduce this, I tried: ./mach mochitest -f plain 
> toolkit/content/tests/widgets/
> But I don't get any failures at all.
> What is the exact thing that you did running on the tryserver?

Hi,

Sorry for being late is responding. I had enabled --run-by-dir in runtests.py and pushed to try for some platforms. This was one of the failures that popped up.
Flags: needinfo?(kaustabh93)
(In reply to Joel Maher (:jmaher) from comment #536)
> :mwargers, any ideas who would know the UI code for the progress bar/media
> stuff?

git tells me jaws or dolske:

$ git log toolkit/content/widgets/videocontrols.xml
commit 973204b91e923571b38366228b0e90fcadd5f889
Author: Benjamin Chen <bechen@mozilla.com>
Date:   Tue Feb 17 16:13:16 2015 +0800

    Bug 1124605 - Execute dragStateChanged(false) before the element is hidden. r=jaws

commit b5f521fe1988ddef8222955b4e6bc75e86aad1a2
Author: Jared Wein <jwein@mozilla.com>
Date:   Fri Jan 9 14:29:36 2015 -0500

    Bug 1074744 - Apply the margin-end to the volume controls when fullscreen is unavailable. r=gijs r=dolske

    The adjusted margin-end needs to be applied to the volume controls when the fullscreen button is hidden due to fullscreen being unavailable, such as within an iframe that is lacking the allowfullscreen attribute. Previously the margin-end was only applied when the video was determined to be audio-only.

commit f939423743419343e7e3ebe44dbd5823a0ca73bd
Author: Ben Craddock <bcc@uga.edu>
Date:   Wed Oct 1 22:58:00 2014 +0200

    Bug 517114 - Keep video controls from being draggable. r=dolske

commit dcfe9f0776b83d9df4e458742da192021f1bb42e
Author: Ben Craddock <bcc@uga.edu>
Date:   Wed Sep 24 10:58:15 2014 -0700

    Bug 1072193 - Hide video controls when mouse leaves from the left or right.  r=dolske

commit 3a9d52f2998cde4249ed50a99553889249d678d3
Author: Tomasz Kołodziejski <tkolodziejski@mozilla.com>
Date:   Mon Aug 25 13:55:00 2014 +0200

    Bug 694696 - mute video if volume is set down to 0. r=jaws
...
Flags: needinfo?(jaws)
Flags: needinfo?(dolske)
(In reply to Ehsan Akhgari (not reviewing patches, not reading bugmail, needinfo? me!) from comment #535)
> (In reply to Martijn Wargers [:mwargers] (QA) from comment #527)
> > Created attachment 8632977 [details]
> > mozilla-test-fail-screenshot_qRx8IG.png
> > 
> > This is a screenshot of when the timeout failure happens.
> 
> That shows the progress bar has different colors.  So perhaps the UI code is
> waiting for an event of some sort to change the color, fade something in/out
> of whatever.  Someone who knows the UI code should look at this.

The bar doubles as a progress indicator, showing how much of the media has loaded (buffered). Not many pixels to go by, but it would seem to imply that when the test times out, the element on the left hasn't loaded any data.
url:        https://hg.mozilla.org/integration/fx-team/rev/5826ab3dc275962f3ed64a765051258ff3cacbb7
changeset:  5826ab3dc275962f3ed64a765051258ff3cacbb7
user:       Justin Dolske <dolske@mozilla.com>
date:       Fri Jul 17 12:15:14 2015 -0700
description:
bug 1076833 - extra logging to help diagose intermittent orange.
https://hg.mozilla.org/mozilla-central/rev/5826ab3dc275
Assignee: nobody → dolske
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla42
(In reply to Tim Taubert [:ttaubert] from comment #567)
> https://hg.mozilla.org/mozilla-central/rev/5826ab3dc275

Not fixed obviously, only to get some logging to see what's going on when this failure happens.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
From that last failure, the log:
05:09:34 INFO - 1574 INFO TEST-START | toolkit/content/tests/widgets/test_videocontrols_audio.html
05:09:34 INFO - MEMORY STAT | vsize 3974MB | residentFast 572MB | heapAllocated 132MB
05:09:34 INFO - 1575 INFO TEST-OK | toolkit/content/tests/widgets/test_videocontrols_audio.html | took 212ms
05:09:34 INFO - 1576 INFO TEST-START | toolkit/content/tests/widgets/test_videocontrols_audio_direction.html
05:09:34 INFO - 1577 INFO iframe loaded
05:09:34 INFO - 1578 INFO iframe loaded
05:09:34 INFO - 1579 INFO progress event: 1 buffered ranges, duration is 0.031247
05:09:34 INFO - 1580 INFO buffer 0 end: 0.031247
05:09:34 INFO - 1581 INFO done
05:09:34 INFO - 1582 INFO progress event: 1 buffered ranges, duration is 0.031247
05:09:34 INFO - 1583 INFO buffer 0 end: 0.031247
05:09:34 INFO - 1584 INFO done
05:09:34 INFO - 1585 INFO TEST-PASS | toolkit/content/tests/widgets/test_videocontrols_audio_direction.html | Rendering of reftest videocontrols_direction-2a.html should not be different to the reference
05:09:35 INFO - 1586 INFO iframe loaded
05:09:35 INFO - 1587 INFO progress event: 1 buffered ranges, duration is 0.031247
05:09:35 INFO - 1588 INFO buffer 0 end: 0.031247
05:09:35 INFO - 1589 INFO done
05:09:35 INFO - 1590 INFO iframe loaded
05:09:35 INFO - 1591 INFO progress event: 0 buffered ranges, duration is NaN
05:09:38 INFO - ###!!! [Parent][OnMaybeDequeueOne] Error: Channel closing: too late to send/recv, messages will be lost
05:09:38 INFO - ###!!! [Parent][OnMaybeDequeueOne] Error: Channel closing: too late to send/recv, messages will be lost
05:09:38 INFO - ###!!! [Parent][OnMaybeDequeueOne] Error: Channel closing: too late to send/recv, messages will be lost
05:15:02 INFO - TEST-INFO | screencapture: exit 0
05:15:02 INFO - 1592 INFO TEST-UNEXPECTED-FAIL | toolkit/content/tests/widgets/test_videocontrols_audio_direction.html | Test timed out.
Flags: needinfo?(jaws)
(In reply to Martijn Wargers [:mwargers] (QA) from comment #572)
> From that last failure, the log:
...
> 05:09:35 INFO - 1586 INFO iframe loaded
> 05:09:35 INFO - 1587 INFO progress event: 1 buffered ranges, duration is
> 0.031247
> 05:09:35 INFO - 1588 INFO buffer 0 end: 0.031247
> 05:09:35 INFO - 1589 INFO done

Ok, that's the normal first part of the test.

> 05:09:35 INFO - 1590 INFO iframe loaded
> 05:09:35 INFO - 1591 INFO progress event: 0 buffered ranges, duration is NaN

And... We die here.

This looks like a Core bug.

1) Why are we getting a progress event without any ranges? Is that even supposed to happen? (The .duration == NaN may be expected here, if this is actually happening before enough data has loaded to determine a duration).

2) Why are we not getting any further progress events? Even if this first empty event is ok, shouldn't there be at least one more when it finishes loading?

> 05:09:38 INFO - ###!!! [Parent][OnMaybeDequeueOne] Error: Channel closing:
> too late to send/recv, messages will be lost
> 05:09:38 INFO - ###!!! [Parent][OnMaybeDequeueOne] Error: Channel closing:
> too late to send/recv, messages will be lost
> 05:09:38 INFO - ###!!! [Parent][OnMaybeDequeueOne] Error: Channel closing:
> too late to send/recv, messages will be lost

Curiously, all the failures have these errors a few seconds after the previous output. I don't know if that's related, or just a side effect of the browser going idle and doing some GC.
Status: REOPENED → NEW
Flags: needinfo?(dolske) → needinfo?(cpearce)
ammendment- this is all green for --run-by-dir.
Attached patch 1076833_add_logs.patch (obsolete) — Splinter Review
Add logs for media events for debugging.
Flags: needinfo?(jwwang)
Attachment #8654697 - Flags: review?(martijn.martijn)
(In reply to Justin Dolske [:Dolske] from comment #573)
> 1) Why are we getting a progress event without any ranges? Is that even
> supposed to happen? (The .duration == NaN may be expected here, if this is
> actually happening before enough data has loaded to determine a duration).
Buffer ranges are not updated until the 1st frame is decoded. A 'progress' (in response to data download) can be fired before decoding the 1st frame. It is OK to have no buffer ranges when 'progress' is fired.
 
> 2) Why are we not getting any further progress events? Even if this first
> empty event is ok, shouldn't there be at least one more when it finishes
> loading?
It is possible for the whole file to be downloaded before we load metadata. Therefore there are no more 'progress' events.

The test is actually racy because it doesn't consider the whole file could be downloaded even before loading metadta or having a valid buffer range. I will re-write the test case to fix this race condition.

> > 05:09:38 INFO - ###!!! [Parent][OnMaybeDequeueOne] Error: Channel closing:
> > too late to send/recv, messages will be lost
> > 05:09:38 INFO - ###!!! [Parent][OnMaybeDequeueOne] Error: Channel closing:
> > too late to send/recv, messages will be lost
> > 05:09:38 INFO - ###!!! [Parent][OnMaybeDequeueOne] Error: Channel closing:
> > too late to send/recv, messages will be lost
> 
> Curiously, all the failures have these errors a few seconds after the
> previous output. I don't know if that's related, or just a side effect of
> the browser going idle and doing some GC.
It should be irrelevant to this bug. The error messages often happen during GMP shutdown.
Flags: needinfo?(cpearce)
Attachment #8654697 - Flags: review?(martijn.martijn)
Attached patch 1076833_fix_races-v1.patch (obsolete) — Splinter Review
Listen to 'ended' event to ensure all data is in the buffer before taking snapshots.

Try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b0dfebd11fde
Got oranges on OSX because the snapshots are different from the references.

Here is an example:
https://treeherder.mozilla.org/logviewer.html#?job_id=10952698&repo=try


The control is not drawn at all which I think is a problem of OSX widgets.

Hi Justin,
Can you check it?
Attachment #8654697 - Attachment is obsolete: true
Flags: needinfo?(dolske)
(In reply to JW Wang [:jwwang] from comment #758)
> Created attachment 8655265 [details] [diff] [review]
> 1076833_fix_races-v1.patch
> 
> Listen to 'ended' event to ensure all data is in the buffer before taking
> snapshots.
> 
> Try:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=b0dfebd11fde
> Got oranges on OSX because the snapshots are different from the references.
> 
> Here is an example:
> https://treeherder.mozilla.org/logviewer.html#?job_id=10952698&repo=try
[...]
> 
> The control is not drawn at all which I think is a problem of OSX widgets.

Which part? The green cutting off the thumb is deliberate, it's a <div> masking off the content to hide any changes from the control's status overlay (which fades in/out, which a timing pain for tests). The control bar is there, but looks faint. It's probably in the process of fading out too... I see you added a play() call, which is probably causing that.

I think these tests don't need to actually play the media, this progress event code is just there to ensure that loading has finished (because the controls indicate buffering progress, and we want to test a consistent state). Still feels like a spec bug that media can load without ever firing a progress event, especially since it's the kind of edge-case that people are not likely to explicitly check for.

Maybe checking for .networkState == NETWORK_IDLE would work? Mixing events and state can be wonky, but maybe it works out here. (Might need to make sure that doesn't trigger too early, _before_ the media starts loading?) For that matter, is it possible there _is_ a progress event firing, but we miss it? We don't add the event listener until the iframe fires |load|, maybe doing that earlier (with DOMContentLoaded) would help?

Oh, it also occurs to me that we support a magic mozNoDynamicControls attribute on the <video>, which should suppress the control fade-out. It's an ugly hack, but simple to try.
Flags: needinfo?(dolske)
There is a 'preload' attribute which controls how much media data should be fetched by a media element before play() is called. If it is set to 'none', no data will be fetched at all and we don't need to care about |networkState == NETWORK_IDLE| or the buffer ranges.

It appears to me the test is about testing if right-to-left text is rendered correctly. So it should make sense to set 'preload' to none. Does that make sense?
Flags: needinfo?(dolske)
(In reply to JW Wang [:jwwang] from comment #785)

> It appears to me the test is about testing if right-to-left text is rendered
> correctly. So it should make sense to set 'preload' to none. Does that make
> sense?

Hmm, makes sense to me. Even better!
Flags: needinfo?(dolske)
https://hg.mozilla.org/mozilla-central/file/9ed17db42e3e/toolkit/content/tests/widgets/videocontrols_direction_test.js#l24

The test listens to the 'progress' event and waits until the whole file is buffered to take a snapshot. However, HTMLMediaElement.buffered won't be valid until metadata is loaded according to our current implementation. If the whole file is downloaded before loading metadata, there won't be any progress events and the test waits forever to take a snapshot and times out.

Per discussion of comment 785 and comment 789, we can set 'preload' to 'none' so we don't need to care about buffer ranges.
See comment 793 for the root cause.
Assignee: dolske → jwwang
Attachment #8655265 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8661032 - Flags: review?(jyavenard)
Attachment #8661032 - Flags: review?(jyavenard) → review+
Thanks for the review!
https://hg.mozilla.org/mozilla-central/rev/1a912ffbe674
Status: ASSIGNED → RESOLVED
Closed: 9 years ago9 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: