The default bug view has changed. See this FAQ.

Intermittent test_play_events.html | Test timed out

VERIFIED FIXED

Status

()

Core
Audio/Video
VERIFIED FIXED
7 years ago
4 years ago

People

(Reporter: philor, Assigned: cpearce)

Tracking

({intermittent-failure})

Trunk
x86
Linux
intermittent-failure
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Reporter)

Description

7 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1282416759.1282417736.11129.gz
Rev3 Fedora 12 mozilla-central opt test mochitests-1/5 on 2010/08/21 11:52:39
s: talos-r3-fed-025

51795 INFO TEST-START | /tests/content/media/test/test_play_events.html
51796 INFO TEST-PASS | /tests/content/media/test/test_play_events.html | Check expected event got play at 0 for http://mochi.test:8888/tests/content/media/test/r11025_s16_c1.wav
51797 INFO TEST-PASS | /tests/content/media/test/test_play_events.html | Check expected event got play at 0 for http://mochi.test:8888/tests/content/media/test/320x240.ogv
51798 INFO TEST-PASS | /tests/content/media/test/test_play_events.html | Check expected event got play at 0 for http://mochi.test:8888/tests/content/media/test/small-shot.ogg
51799 INFO TEST-PASS | /tests/content/media/test/test_play_events.html | Check expected event got play at 0 for http://mochi.test:8888/tests/content/media/test/seek.webm
mmap() failed: Cannot allocate memory
mmap() failed: Cannot allocate memory
mmap() failed: Cannot allocate memory
mmap() failed: Cannot allocate memory
mmap() failed: Cannot allocate memory
mmap() failed: Cannot allocate memory
mmap() failed: Cannot allocate memory
mmap() failed: Cannot allocate memory
mmap() failed: Cannot allocate memory
mmap() failed: Cannot allocate memory
mmap() failed: Cannot allocate memory
mmap() failed: Cannot allocate memory
mmap() failed: Cannot allocate memory
mmap() failed: Cannot allocate memory
mmap() failed: Cannot allocate memory
mmap() failed: Cannot allocate memory
mmap() failed: Cannot allocate memory
mmap() failed: Cannot allocate memory
51800 INFO TEST-PASS | /tests/content/media/test/test_play_events.html | Check expected event got canplay at play for http://mochi.test:8888/tests/content/media/test/r11025_s16_c1.wav
51801 INFO TEST-PASS | /tests/content/media/test/test_play_events.html | Check expected event got playing at canplay for http://mochi.test:8888/tests/content/media/test/r11025_s16_c1.wav
51802 INFO TEST-PASS | /tests/content/media/test/test_play_events.html | Check expected event got canplaythrough at playing for http://mochi.test:8888/tests/content/media/test/r11025_s16_c1.wav
51803 INFO TEST-PASS | /tests/content/media/test/test_play_events.html | Check expected event got canplay at play for http://mochi.test:8888/tests/content/media/test/320x240.ogv
51804 INFO TEST-PASS | /tests/content/media/test/test_play_events.html | Check expected event got playing at canplay for http://mochi.test:8888/tests/content/media/test/320x240.ogv
51805 INFO TEST-PASS | /tests/content/media/test/test_play_events.html | Check expected event got canplaythrough at playing for http://mochi.test:8888/tests/content/media/test/320x240.ogv
51806 INFO TEST-PASS | /tests/content/media/test/test_play_events.html | Check expected event got canplay at canplaythrough for http://mochi.test:8888/tests/content/media/test/320x240.ogv
51807 INFO TEST-PASS | /tests/content/media/test/test_play_events.html | Check expected event got playing at canplay for http://mochi.test:8888/tests/content/media/test/320x240.ogv
51808 INFO TEST-PASS | /tests/content/media/test/test_play_events.html | Check expected event got canplaythrough at playing for http://mochi.test:8888/tests/content/media/test/320x240.ogv
51809 INFO TEST-PASS | /tests/content/media/test/test_play_events.html | Check expected event got canplay at play for http://mochi.test:8888/tests/content/media/test/small-shot.ogg
51810 INFO TEST-PASS | /tests/content/media/test/test_play_events.html | Check expected event got playing at canplay for http://mochi.test:8888/tests/content/media/test/small-shot.ogg
51811 INFO TEST-PASS | /tests/content/media/test/test_play_events.html | Check expected event got canplaythrough at playing for http://mochi.test:8888/tests/content/media/test/small-shot.ogg
51812 INFO TEST-PASS | /tests/content/media/test/test_play_events.html | Last event should be canplaythrough for http://mochi.test:8888/tests/content/media/test/small-shot.ogg - "canplaythrough" should equal "canplaythrough"
51813 INFO TEST-PASS | /tests/content/media/test/test_play_events.html | Last event should be canplaythrough for http://mochi.test:8888/tests/content/media/test/320x240.ogv - "canplaythrough" should equal "canplaythrough"
51814 INFO TEST-PASS | /tests/content/media/test/test_play_events.html | Last event should be canplaythrough for http://mochi.test:8888/tests/content/media/test/r11025_s16_c1.wav - "canplaythrough" should equal "canplaythrough"
51815 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_play_events.html | Test timed out.
51816 INFO SimpleTest finished /tests/content/media/test/test_play_events.html in 323516ms
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Reporter)

Comment 3

7 years ago
Certainly appears to be a regression from http://hg.mozilla.org/mozilla-central/rev/f31eb14e3014 where I disabled test_mixed_principals.html, which is pretty awful.
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)

Updated

7 years ago
Depends on: 590028
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1282617852.1282619503.1237.gz&f
ulltext=1#err1

15:54 < sayrer> cjones: there are 127 threads running during that crash

Default thread stacks on Linux are 10MB, so that's ~1.3GB of virtual address space accounted for at the time of the crash.  That explains why we're not seeing this problem on OS X (512kB stacks) or Windows (1MB stacks).  There were probably more threads active before the crash, too.

It's not clear what the underlying problem here is, but it seems that what's causing this test to timeout and fail (at least in the cases where the mmap failure is logged) is thread creation failure due to exhausted virtual address space.

I don't think we're leaking threads, but it would be good to investigate that.  Perhaps we're just churning through too many media decoders (which have 3-4 threads per decoder depending on OS) before earlier ones have completely shut down.

I can reproduce this easily with 6-8 copies of test_playback in my F12 32-bit VM.
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Comment 42

7 years ago
(In reply to comment #28)
> 15:54 < sayrer> cjones: there are 127 threads running during that crash
> 
> Default thread stacks on Linux are 10MB, so that's ~1.3GB of virtual address
> space accounted for at the time of the crash.  That explains why we're not
> seeing this problem on OS X (512kB stacks) or Windows (1MB stacks).  There were
> probably more threads active before the crash, too.

Perhaps we should shutdown the decoder state machine thread once the state machine reaches COMPLETED state. Then we'd be using no threads once the media finished playback. We could even go as far as shutting down the state machine thread when we're paused too, as we only really need the state machine thread to paint and advance the frame during playback, and to perform seeks off the main thread, and to run the buffering logic (while we're playing).
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Comment 46

7 years ago
Created attachment 468891 [details] [diff] [review]
Patch: Shutdown state machine threads upon playback end

Shutdown the nsBuiltinDecoderStateMachine and nsWaveDecoderStateMachine threads when playback reaches COMPLETED state.

It's probably worth refactoring the media tests to be less parallel too.
Attachment #468891 - Flags: review?(kinetik)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Comment 54

7 years ago
Created attachment 468960 [details] [diff] [review]
Patch v2: Shutdown state machine threads upon playback end

With race condition in nsWaveDecoder fixed as discussed. I also couldn't use NS_NewRunnableMethod(), it wouldn't compile with msvc because nsIThread::Shutdown() returns nsresult, not void. Tests pass on Linux64 and Win32.
Assignee: nobody → chris
Attachment #468891 - Attachment is obsolete: true
Attachment #468960 - Flags: review?(kinetik)
Attachment #468891 - Flags: review?(kinetik)
Attachment #468960 - Flags: review?(kinetik) → review+
(Assignee)

Updated

7 years ago
Attachment #468960 - Flags: approval2.0?
Comment hidden (Treeherder Robot)
Attachment #468960 - Flags: approval2.0? → approval2.0+
(Assignee)

Comment 56

7 years ago
Landed "Patch v2: Shutdown state machine threads upon playback end"
http://hg.mozilla.org/mozilla-central/rev/973351678dee

Lets see if this reduces the frequency of the failures, and then we should reduce the parallelism of the media tests. That would reduce the peak memory usage.
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(In reply to comment #58)
> dao%mozilla.com
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1282736928.1282738097.1436.gz
> Rev3 Fedora 12 mozilla-central opt test mochitests-1/5 on 2010/08/25 04:48:48
> 
> s: talos-r3-fed-045
> 53250 ERROR TEST-UNEXPECTED-FAIL |
> /tests/content/media/test/test_play_events.html | Test timed out.

This one is the first failure since cpearce's checkin.
Comment hidden (Treeherder Robot)
(In reply to comment #60)
> dao%mozilla.com
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1282739005.1282740088.13835.gz
> Rev3 Fedora 12 mozilla-central opt test mochitests-1/5 on 2010/08/25 05:23:25
> 
> s: talos-r3-fed-015
> 53258 ERROR TEST-UNEXPECTED-FAIL |
> /tests/content/media/test/test_play_events.html | Test timed out.

And this failure still has mmap() failures.  :-(
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Comment 68

7 years ago
(In reply to comment #59)
> > 53250 ERROR TEST-UNEXPECTED-FAIL |
> > /tests/content/media/test/test_play_events.html | Test timed out.
> 
> This one is the first failure since cpearce's checkin.

I'll de-parallelize this test, and probably others need to be too.
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Comment 83

6 years ago
This hasn't failed for almost 3 months, so I think it's safe to say this was fixed by the JS array/PIC fixes which went in recently, or by the recent video orange purge.
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Status: RESOLVED → VERIFIED
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Keywords: intermittent-failure
Whiteboard: [orange]
You need to log in before you can comment on or make changes to this bug.