Closed Bug 634564 Opened 10 years ago Closed 9 years ago

intermittent /tests/content/media/test/test_closing_connections.html | Test timed out.

Categories

(Core :: Audio/Video, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla13

People

(Reporter: mak, Assigned: kinetik)

References

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297854616.1297855700.26199.gz

Rev3 MacOSX Snow Leopard 10.6.2 mozilla-central opt test mochitests-1/5 on 2011/02/16 03:10:16

56987 INFO TEST-START | /tests/content/media/test/test_closing_connections.html
56988 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_closing_connections.html | Test timed out.
56989 INFO TEST-END | /tests/content/media/test/test_closing_connections.html | finished in 325848ms
I wonder if this was caused by bug 633051 landing? We didn't have this failure before that landed (at 2011-02-15 12:00:48 PST), and that bug's patch messed with our load group code, which is what this test is testing.
Blocks: 633051
I can't reproduce this locally, and there haven't been any reports today, so I'm going to assume this was fixed by the backout of bug 631058.  Please reopen this bug if it recurs.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME
Status: RESOLVED → VERIFIED
This still exists, it just seems to be very rare on m-c.  It started happening on the incremental GC branch very frequently, which I'm assuming is caused by the timing changing just enough to reveal the bug regularly.
Status: VERIFIED → REOPENED
Resolution: WORKSFORME → ---
Assignee: nobody → kinetik
Status: REOPENED → ASSIGNED
Status: ASSIGNED → UNCONFIRMED
Ever confirmed: false
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
The change in bug 633051 is implicated here.  I'm still tracking down all of the details, but if you look at https://tbpl.mozilla.org/php/getParsedLog.php?id=9372854&tree=Try&full=1#error0

...you see seek.webm used during test_bug686942 (which completes successfully), and then is added to the loadgroup during test_closing_connections because the media cache seeks the stream back near the start (for reasons that need to be investigated).  With the change from bug 633051, this causes recreated streams to be created in the foreground if they hit that code, which is why the seek ends up blocking the document load for test_closing_connections.
Attached patch patch v0Splinter Review
Don't clear mLoadInBackground when a load completes--it doesn't make sense that future loads should happen in the foreground.  To simplify the code, I've removed the mLoadInBackground test completely and rely directly on the presence of LOAD_BACKGROUND when moving the completed load back into the foreground.  Patch also includes two comment fixes, and adds a warning to nsHTMLMediaElement::GetDocumentLoadGroup.  It's possible that it makes sense to return nsnull from GetDocumentLoadGroup if the document is inactive, but I'm not sure--at least the warning will make it easier to spot bugs like this in the future.

The sequence of events that was causing this test failure (as observed on larch with billm's incremental GC patches) was:

0. tests use a very small media cache by default (automation.py sets it to 100kB).

1. test_bug686942 runs, including a set of tests against seek.webm.  The initial load of seek.webm passes through the code altered in this patch, setting mLoadInBackground to true.  seek.webm's element happens to stay alive long enough for the following events to occur.

2. test_bug686942 completes and NotifyOwnerDocumentActivityChanged suspends the decoder/stream.

3. Other media cache activity causes some of seek.webm's blocks to be evicted.  The stream moves from "ended" to "throttling" state, waiting for cache space to allow the blocks to be replaced from the network.

4. test_closing_connections starts loading 20 copies of seek.ogv, then waits for page load event to fire.  This test uses use_large_cache.js.

5. seek.webm's element from test_bug686942 is still alive (in an inactive document).  The cache notices that there's now enough space to fetch more data, so it creates a new channel, and since mLoadInBackground is false, the LOAD_BACKGROUND flag is not set.  The new request is added to the element's load group, which is the same load group used by test_closing_connections because both tests were loaded in the same docshell.

6. seek.webm's foreground load is now blocking the document load for test_closing_connections.

7. seek.webm's OnStartRequest fires, and since the suspend count is 1 (from the suspend in step 2), the channel is immediately suspended.

Since seek.webm will never be unsuspended (e.g. by another call to NotifyOwnerDocumentActivityChanged as a result of being added to a docshell), the load never completes, and the document load for test_closing_connections is never unblocked, restuling in a test timeout.
Attachment #600251 - Flags: review?(cpearce)
Attachment #600251 - Flags: review?(cpearce) → review+
http://hg.mozilla.org/integration/mozilla-inbound/rev/f75b898fcef7
Status: ASSIGNED → RESOLVED
Closed: 10 years ago9 years ago
Resolution: --- → FIXED
Oops, bug shouldn't be closed until the fix merges to m-c.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: --- → mozilla13
https://hg.mozilla.org/mozilla-central/rev/f75b898fcef7
https://hg.mozilla.org/mozilla-central/rev/7af360ee95fe
Status: REOPENED → RESOLVED
Closed: 9 years ago9 years ago
Resolution: --- → FIXED
Whiteboard: [orange]
You need to log in before you can comment on or make changes to this bug.