Closed Bug 1298256 Opened 8 years ago Closed 7 years ago

Intermittent FAIL test_video_playback.py TestVideoPlayback.test_video_playback_partial

Categories

(Testing Graveyard :: external-media-tests, defect)

Version 3
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: bulk-close-intermittents, intermittent-failure)

I will check it.
Flags: needinfo?(fatseng)
I tried to reproduce this on my notebook and found that Firefox hangs easily while visiting https://www.youtube.com/embed/1visYpIREUM?autoplay=1, so I filed Bug 1303930.
It looks like a regression; I'm not sure whether this intermittent failure is related to it.
Either way, Bug 1303930 blocks my local testing, so I'd like to look into it first.
I could reproduce the failure on my Win7 notebook, so I printed stall_wait_time and lag:
stall_wait_time:10, lag:0.00819757336378
stall_wait_time is never less than lag, so why does the script raise VideoException?


 0:47.47 TEST_START: MainThread test_video_playback.py TestVideoPlayback.test_video_playback_partial
 0:58.87 LOG: MainThread INFO https://youtube.com/embed/AbAACm1IQE0?autoplay=1
 1:20.56 LOG: MainThread INFO https://youtube.com/embed/yOQQCoxs8-k?autoplay=1
 2:25.22 LOG: MainThread INFO https://youtube.com/embed/rjmuKV9BTkE?autoplay=1
<type 'int'>
<type 'float'>
stall_wait_time:10, lag:0.00819757336378
 2:58.59 TEST_END: MainThread FAIL, expected PASS
Traceback (most recent call last):
  File "c:\mozilla-source\firefox-0919\testing/marionette/harness\marionette\marionette_test\testcases.py", line 162, in run
    testMethod()
  File "c:\mozilla-source\firefox-0919\dom/media/test/external\external_media_harness\testcase.py", line 180, in test_video_playback_partial
    self.run_playback(video)
  File "c:\mozilla-source\firefox-0919\dom/media/test/external\external_media_harness\testcase.py", line 81, in run_playback
    raise self.failureException(e)
AssertionError: Video https://www.youtube.com/embed/rjmuKV9BTkE?autoplay=1 stalled.
VideoPuppeteer - test url: https://youtube.com/embed/rjmuKV9BTkE?autoplay=1:
{
        (video)
        current_time: 35.332375,
        duration: 88.441,
        expected_duration: 60,
        played: TimeRanges: length: 1, ranges: [(0, 35.50052)]
        interval: 1
        lag: -0.0200488468483,
        stall_wait_time: 10
        url: https://www.youtube.com/embed/rjmuKV9BTkE?autoplay=1
        src: blob:https://www.youtube.com/23fb2520-142d-4655-a1ff-5ddeb4b88bf7
        frames total: 870
         - dropped: 0
         - corrupted: 0
}
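For context, a minimal sketch of what a lag-based stall check could look like. This is a hypothetical simplification, not the actual harness code (the real logic lives in video_puppeteer.py and testcase.py); the names VideoException, stall_wait_time, and lag come from the log above, check_stall is made up for illustration:

```python
# Hypothetical sketch: a stall is reported when the measured lag
# exceeds the allowed stall_wait_time.
class VideoException(Exception):
    pass


def check_stall(lag, stall_wait_time=10):
    """Raise VideoException if playback lags behind wall time too far."""
    if lag > stall_wait_time:
        raise VideoException(
            'Video stalled: lag %.2fs > %ds' % (lag, stall_wait_time))


# The lag printed above (0.008s) is far below the 10s threshold,
# so a check like this should not fire for that sample.
check_stall(0.00819757336378)
```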
Flags: needinfo?(sydpolk)
I no longer work at Mozilla, and have not looked at this in a long time. Sorry; I wish I could help.
Flags: needinfo?(sydpolk)
(In reply to Farmer Tseng[:fatseng] from comment #27)
> It could be reproduced in my Win7 NB then I print stall_wait_time and lag.
> stall_wait_time:10, lag:0.00819757336378
> stall_wait_time never less than lag.
> Why does script raise VideoException?
Bryce,
I'm not sure if you're the right person to look at this.
Flags: needinfo?(bvandyk)
(In reply to Syd Polk :sydpolk from comment #28)
> I no longer work at Mozilla, and have not looked at this in a long time.
> Sorry; I wish I could help.

Thanks for replying.
I have a question you might be able to answer:
how can I get the gecko log when running "./mach external-media-tests"?
Flags: needinfo?(sydpolk)
You should be able to Google mach command documentation.
Flags: needinfo?(sydpolk)
(In reply to Farmer Tseng[:fatseng] from comment #27)
> It could be reproduced in my Win7 NB then I print stall_wait_time and lag.
> stall_wait_time:10, lag:0.00819757336378
> stall_wait_time never less than lag.
> Why does script raise VideoException?

I've recently made changes that should make this reporting more reliable. The time you were seeing this error is around when my changes will have landed, so I'm not sure whether your failures are from before or after the changes. If you're still seeing issues like this now, let me know, as it's a bug that requires my attention.

> How could I get gecko log while issuing "./mach external-media-tests"?

For logging you can use the techniques detailed here: https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Gecko_Logging

I have a specialised python setup to run these tests, but for running from mach with logs I'd use:
> MOZ_LOG_FILE="gecko.log" ./mach external-media-tests

Feel free to direct future questions around this to me, as I'm the new owner of these tests.
Flags: needinfo?(bvandyk)
I can easily reproduce it on my notebook (code cloned on 9/19).
I added a debug log to lag(self) in video_puppeteer.py, as shown below, and found something interesting:
if the interval between calls to the lag function exceeds about 15 seconds, the first read may return a stale current_time, which makes the computed lag exceed 10.
A second read of current_time then returns the correct value.

def lag(self):
    """
    :return: The difference in time between where the video is currently
        playing and where it should be playing given the time we started
        the video.
    """
    # Note that self.current_time could temporarily refer to a
    # spliced-in ad.
    temp_current_time = self.current_time
    elapsed_current_time = temp_current_time - self._first_seen_time
    elapsed_wall_time = clock() - self._first_seen_wall_time
    lag = elapsed_wall_time - elapsed_current_time
    # Debug: self.current_time is read a second time here, which is how
    # the stale first read shows up in the log below.
    print('Got lag {}, current_time {}, temp_current_time {}, '
          'elapsed_current_time {}, elapsed_wall_time {}'
          .format(lag, self.current_time, temp_current_time,
                  elapsed_current_time, elapsed_wall_time))
    return elapsed_wall_time - elapsed_current_time



 2:48.10 LOG: MainThread INFO https://youtube.com/embed/rjmuKV9BTkE?autoplay=1
Got lag -0.0152933686887, current_time 2.451333, temp_current_time 2.402604, elapsed_current_time 0.500959, elapsed_wall_time 0.485665631311
Got lag -0.0202313393737, current_time 3.187625, temp_current_time 3.146625, elapsed_current_time 1.24498, elapsed_wall_time 1.22474866063
Got lag 0.00467996624447, current_time 4.173875, temp_current_time 4.09075, elapsed_current_time 2.189105, elapsed_wall_time 2.19378496624
Got lag 0.0502386946451, current_time 5.738854, temp_current_time 5.616666, elapsed_current_time 3.715021, elapsed_wall_time 3.76525969465
Got lag -0.00564615712581, current_time 6.392833, temp_current_time 6.311291, elapsed_current_time 4.409646, elapsed_wall_time 4.40399984287
Got lag 0.0452700973939, current_time 7.304958, temp_current_time 7.183083, elapsed_current_time 5.281438, elapsed_wall_time 5.32670809739
Got lag 0.00916901423225, current_time 8.256979, temp_current_time 8.174979, elapsed_current_time 6.273334, elapsed_wall_time 6.28250301423
Got lag 0.014146941523, current_time 9.207125, temp_current_time 9.12602, elapsed_current_time 7.224375, elapsed_wall_time 7.23852194152
Got lag -0.0151306480976, current_time 10.154083, temp_current_time 10.113104, elapsed_current_time 8.211459, elapsed_wall_time 8.1963283519
Got lag -0.00963477012884, current_time 11.177145, temp_current_time 11.095479, elapsed_current_time 9.193834, elapsed_wall_time 9.18419922987
Got lag -0.0054553160809, current_time 12.158312, temp_current_time 12.077229, elapsed_current_time 10.175584, elapsed_wall_time 10.1701286839
Got lag -0.0212639304616, current_time 13.14327, temp_current_time 13.10225, elapsed_current_time 11.200605, elapsed_wall_time 11.1793410695
Got lag -0.00726807858133, current_time 14.355479, temp_current_time 14.274645, elapsed_current_time 12.373, elapsed_wall_time 12.3657319214
Got lag 0.00547135562594, current_time 15.255437, temp_current_time 15.1735, elapsed_current_time 13.271855, elapsed_wall_time 13.2773263556
Got lag -0.0201822302913, current_time 16.62152, temp_current_time 16.580895, elapsed_current_time 14.67925, elapsed_wall_time 14.6590677697
Got lag -0.0324611008769, current_time 17.195687, temp_current_time 17.15452, elapsed_current_time 15.252875, elapsed_wall_time 15.2204138991
Got lag -0.0148788847741, current_time 18.232604, temp_current_time 18.151083, elapsed_current_time 16.249438, elapsed_wall_time 16.2345591152
Got lag 13.3149434858, current_time 33.583625, temp_current_time 20.201291, elapsed_current_time 18.299646, elapsed_wall_time 31.6145894858
Got lag -0.0297675340484, current_time 34.074625, temp_current_time 34.033625, elapsed_current_time 32.13198, elapsed_wall_time 32.102212466
 3:20.56 TEST_END: MainThread FAIL, expected PASS
Traceback (most recent call last):
  File "c:\mozilla-source\firefox-0919\testing/marionette/harness\marionette\marionette_test\testcases.py", line 162, in run
    testMethod()
  File "c:\mozilla-source\firefox-0919\dom/media/test/external\external_media_harness\testcase.py", line 180, in test_video_playback_partial
    self.run_playback(video)
  File "c:\mozilla-source\firefox-0919\dom/media/test/external\external_media_harness\testcase.py", line 81, in run_playback
    raise self.failureException(e)
AssertionError: Video https://www.youtube.com/embed/rjmuKV9BTkE?autoplay=1 stalled.
VideoPuppeteer - test url: https://youtube.com/embed/rjmuKV9BTkE?autoplay=1:
{
        (video)
        current_time: 33.828604,
        duration: 88.441,
        expected_duration: 60,
        played: TimeRanges: length: 1, ranges: [(0, 33.951625)]
        interval: 1
        lag: -0.0297675340484,
        url: https://www.youtube.com/embed/rjmuKV9BTkE?autoplay=1
        src: blob:https://www.youtube.com/6a8ec635-6fbe-4ca9-b9c5-7e1bd3fd49f9
        frames total: 833
         - dropped: 0
         - corrupted: 0
}
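The stale-read effect is easy to reproduce in isolation using the same arithmetic as the lag function and the numbers from the log above (first_seen_time here is derived as temp_current_time minus elapsed_current_time, i.e. 20.201291 - 18.299646 = 1.901645; the function name compute_lag is made up for illustration):

```python
def compute_lag(current_time, first_seen_time, wall_time, first_seen_wall_time):
    """Wall-clock elapsed time minus media-clock elapsed time, as in
    VideoPuppeteer.lag."""
    elapsed_current_time = current_time - first_seen_time
    elapsed_wall_time = wall_time - first_seen_wall_time
    return elapsed_wall_time - elapsed_current_time


# Fresh reading: the media clock keeps pace with the wall clock,
# so the lag is a small jitter around zero (here about -0.03s).
fresh = compute_lag(34.033625, 1.901645, 32.102212466, 0.0)

# Stale reading: Marionette returned an old current_time (20.2s) while
# 31.6s of wall time had elapsed, so the computed lag (about 13.3s)
# blows past the 10s stall threshold even though playback never stalled.
stale = compute_lag(20.201291, 1.901645, 31.6145894858, 0.0)
```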
To expand on some of the changes I've made: the tests were previously making individual Marionette queries to retrieve data about wrapped elements. The problem is that each call takes time, leading to behaviour where sequential calls assumed the same underlying video state when in fact the state had changed between calls.

Now the tests attempt to snapshot the underlying state and base stateful decisions upon this snapshot until the next state refresh. This should ideally make the tests more consistent in their behaviour, by minimising the races introduced by lags. It will also have the benefit that state logged after a failure should be the same state that caused the failure, which was not certain previously.
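The snapshot approach could be sketched roughly like this. This is a hypothetical simplification, not the actual VideoPuppeteer code; VideoState, SnapshotPuppeteer, fetch_state, and refresh are invented names:

```python
class VideoState(object):
    """Immutable snapshot of the wrapped <video> element's state."""

    def __init__(self, current_time, duration, wall_time):
        self.current_time = current_time
        self.duration = duration
        self.wall_time = wall_time


class SnapshotPuppeteer(object):
    """Fetch the video state once per polling interval, then base every
    stateful decision (lag, stall, progress) on that one snapshot, so
    sequential reads cannot race against a still-playing video."""

    def __init__(self, fetch_state):
        # fetch_state would be a single Marionette script call that
        # returns all the values at once.
        self._fetch_state = fetch_state
        self._state = None

    def refresh(self):
        self._state = self._fetch_state()

    def lag(self, first_seen_time, first_seen_wall_time):
        # Both elapsed times are computed from the same snapshot.
        elapsed_media = self._state.current_time - first_seen_time
        elapsed_wall = self._state.wall_time - first_seen_wall_time
        return elapsed_wall - elapsed_media
```

The design point is that a slow Marionette round trip can delay when the snapshot is taken, but it can no longer make two reads within one decision disagree with each other.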

Because there is some time spent retrieving the values across marionette, we can end up having problems, which is what I think we're seeing above (I believe it's also why we get negative lags, but those are superficial and shouldn't fail tests). Part of this is that even a single call may take a large amount of time (I'm not sure why, or how to mitigate this in Marionette).

Anyway, one consequence of the above is that if you're looking at failure logs from builds from before 24/09, those logs may be inconsistent with the state of the tests at the time the failure was recorded. As shown above, these failures may be caused by what the test thinks is a lag but is actually a slow Marionette call. Raciness introduced by Marionette calls is an ongoing issue in these tests, and we don't yet have a concrete answer for it.
Thanks Bryce.
Bug 1300653 seems to resolve the remaining timing issue.
I pulled to changeset 315232 and ran the test over 20 times; the symptom no longer appears.
Let's monitor it for a few days.
Cool, glad to hear that. There are some other issues with these tests that I'm currently investigating. In particular, I'm looking at an issue that appears to be prominent on the Windows test machines: stalls in the media tests. Based on new logging and testing with a loaner, I'm seeing Firefox jank during test runs, which stalls the video even if it's fully buffered, at which point the test detects the stall and fails. I've got some ideas on how this can be addressed in the tests, but I also want to investigate the test machines themselves to make sure there's nothing odd there.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INCOMPLETE
Product: Testing → Testing Graveyard