Closed Bug 1041594 Opened 10 years ago Closed 10 years ago

browser_mozAudioChannel_muted.js crashes when run as a directory instead of a full suite

Categories

(Core :: Web Audio, defect)

x86_64
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla36

People

(Reporter: vaibhav1994, Assigned: baku)

References

Details

Attachments

(1 file, 3 obsolete files)

In bug 992911, we are working on running tests per directory, to make the tests less dependent. When we do that, we see that browser_mozAudioChannel_muted.js crashes as can be seen in https://tbpl.mozilla.org/?tree=Try&rev=71ba0eb19adf.

Here's the full log: https://tbpl.mozilla.org/php/getParsedLog.php?id=44149620&tree=Try&full=1

We are planning on switching to this mode of running tests in 2 weeks and would like to do what we can to fix the tests.
Andrea, could you look at the logs and see what may be causing this problem when we are trying to run tests per directory, i.e., content/media/webaudio/test ?
Flags: needinfo?(amarchesini)
Blocks: run-by-dir
Thread 0 (crashed)
 0  libsystem_kernel.dylib + 0x12122
    rbx = 0x0000000000065f03   r12 = 0x0000000000065d00
    r13 = 0x0000000000065f03   r14 = 0x00007fff74349200
    r15 = 0x0000000000000000   rip = 0x00007fff87955122
    rsp = 0x00007fff5fbfced8   rbp = 0x00007fff5fbfcf20
    Found by: given as instruction pointer in context
 1  libsystem_c.dylib + 0x19ddc
    rip = 0x00007fff88a3eddd   rsp = 0x00007fff5fbfcee0
    rbp = 0x00007fff5fbfcf20
    Found by: stack scanning
 2  libdyld.dylib + 0x807
    rip = 0x00007fff8a220808   rsp = 0x00007fff5fbfcf00
    rbp = 0x00007fff5fbfcf20
    Found by: stack scanning
 3  libdyld.dylib + 0x817
    rip = 0x00007fff8a220818   rsp = 0x00007fff5fbfcf30
    rbp = 0x00007fff5fbfcf30
    Found by: stack scanning
 4  libdyld.dylib + 0x85a
    rip = 0x00007fff8a22085b   rsp = 0x00007fff5fbfcf60
    Found by: stack scanning
 5  libdyld.dylib + 0x8ed
    rip = 0x00007fff8a2208ee   rsp = 0x00007fff5fbfcff0
    Found by: stack scanning
 6  libmozglue.dylib!arena_purge [jemalloc.c:71ba0eb19adf : 3623 + 0x1a]
    rip = 0x0000000100019eb9   rsp = 0x00007fff5fbfd0d0
    Found by: stack scanning
 7  libmozglue.dylib!arena_run_dalloc [jemalloc.c:71ba0eb19adf : 3759 + 0x9]
    rip = 0x000000010001c135   rsp = 0x00007fff5fbfd130
    Found by: stack scanning
 8  libnss3.dylib!pt_PostNotifies [ptsynch.c:71ba0eb19adf : 106 + 0x8]
    rip = 0x0000000101115d6f   rsp = 0x00007fff5fbfd180
    Found by: stack scanning
 9  libmozglue.dylib!arena_dalloc [jemalloc.c:71ba0eb19adf : 4563 + 0xf]
    rip = 0x0000000100018986   rsp = 0x00007fff5fbfd1d0
    Found by: stack scanning
10  libsystem_c.dylib + 0x2d8c7
    rip = 0x00007fff88a528c8   rsp = 0x00007fff5fbfd220
    Found by: stack scanning
11  XUL!js::LifoAlloc::freeAll() [Utility.h:71ba0eb19adf : 122 + 0x4]
    rip = 0x00000001039fe117   rsp = 0x00007fff5fbfd270
    Found by: stack scanning
12  XUL!js::gc::StoreBuffer::enable() [StoreBuffer.h:71ba0eb19adf : 191 + 0x7]
    rip = 0x0000000103a43231   rsp = 0x00007fff5fbfd290
    Found by: stack scanning
13  XUL!js::gc::GCRuntime::gcCycle(bool, long long, js::JSGCInvocationKind, JS::gcreason::Reason) [jsgc.cpp:71ba0eb19adf : 5030 + 0x8]
    rip = 0x0000000103cacd9f   rsp = 0x00007fff5fbfd2c0
    Found by: stack scanning
14  XUL!js::gc::GCRuntime::collect(bool, long long, js::JSGCInvocationKind, JS::gcreason::Reason) [jsgc.cpp:71ba0eb19adf : 5234 + 0x14]
    rip = 0x0000000103cad063   rsp = 0x00007fff5fbfd320
    Found by: stack scanning
15  XUL!PreciseGCRunnable::Run() [XPCComponents.cpp:71ba0eb19adf : 2889 + 0xc]
    rip = 0x000000010249f824   rsp = 0x00007fff5fbfd380
    Found by: stack scanning
16  XUL!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:71ba0eb19adf : 766 + 0x5]
    rip = 0x000000010142d024   rsp = 0x00007fff5fbfd3b0
    Found by: stack scanning
OS: Linux → Mac OS X
Sorry for the delay. Is this issue still reproducible?
The try result is not valid anymore, if you have something up-to-date I can take a look.
Flags: needinfo?(amarchesini)
Flags: needinfo?(vaibhavmagarwal)
I pushed this last night- the error is on osx opt only which I thought over night I would see results, here is a try push:
https://tbpl.mozilla.org/?tree=Try&rev=8194c614db4b

I am still awaiting osx opt results as of writing this comment, but we will see.
Flags: needinfo?(vaibhavmagarwal)
this happens on 10.8, here is a link to the log:
https://tbpl.mozilla.org/php/getParsedLog.php?id=46450747&tree=Try

:baku, thanks for looking into this!
Flags: needinfo?(amarchesini)
It's hard to say what is going on here. I need more information:

1. not all the messages printed by the test appear. browser_mozAudioChannel.js and browser_mozAudioChannel_muted.js had several tests, but nothing is printed. Do you know why and how to show them?

2. how can I reproduce this new configuration locally? Usually I run mochitests individually or per directory using |moch mochitest-something path|.
Flags: needinfo?(amarchesini) → needinfo?(jmaher)
I am not sure why the full output is not being printed :(  I have only modified a few manifests, not tests or the browser.

I assume you could run this locally on osx opt builds like so:
mach mochitest-browser content/media/webaudio/test

I have found that running locally it is hard to reproduce problems seen on try server.
Flags: needinfo?(jmaher)
(In reply to Andrea Marchesini (:baku) from comment #6)
> 1. not all the messages printed by the test appear.
> browser_mozAudioChannel.js and browser_mozAudioChannel_muted.js had several
> tests, but nothing is printed. Do you know why and how to show them?

Sounds like bug 974073, perhaps?
:baku, any luck with this bug?
Attached patch test.patch (obsolete) — Splinter Review
I don't have mac, and I cannot reproduce the issue. I would like to know if you can apply this patch and push it to try. Or tell me how to have a push to try that enables the feature you are working on. Thanks.

This patch adds debugging messages.
I ran this on try with 3 jobs:
https://tbpl.mozilla.org/?tree=Try&rev=82f183077a7e

the oranges are for another failure and I investigated the logs to ensure that mozAudioChannel_muted.js was run successfully!

I say lets move forward with this patch to get it landed!
oh, I lied, the 4th run finished:
https://tbpl.mozilla.org/php/getParsedLog.php?id=46567953&tree=Try&full=1#error4

we get a timed out error, maybe the logs have more useful information?
Hi :baku, it is a new week!  Do you have other thoughts?  I would be more than happy to push stuff to try or help out.
Flags: needinfo?(amarchesini)
https://tbpl.mozilla.org/php/getParsedLog.php?id=46567953&tree=Try&full=1#error4
in this there are not my debug messages. are you sure you pushed my patch? They should be visible. It least I see them locally.
Unfortunately I cannot use karlt suggested in bug 8 because it seems that forceCompleteLog()/requireCompleteLog() do not exist for browser mochitests.
Flags: needinfo?(amarchesini)
I am looking into why the patch isn't printing out lines- stay tuned!
I have the data printed out (the scripts we run in automation add a --quiet flag, I have removed that in my patch queue):
https://tbpl.mozilla.org/?tree=Try&rev=3a9bb1220a13

of course with all these retriggers I have no failures :(

Could this be a timing issue where adding the logging has prevented us from seeing this crash due to increased IO or other timing/cleanup?
Flags: needinfo?(amarchesini)
> Could this be a timing issue where adding the logging has prevented us from
> seeing this crash due to increased IO or other timing/cleanup?

Actually yes. The timing is an important factor in these tests. Let me work on it.
Assignee: nobody → amarchesini
Flags: needinfo?(amarchesini)
awesome, thanks!
I am having a hard time reproducing this now:
https://tbpl.mozilla.org/?tree=Try&rev=9d52c5041087

At first I thought my hacks to print out all the information were causing this to pass, but the linked in try push above shows all green.

:baku, did you find out anything?  Should we consider closing this?
Flags: needinfo?(amarchesini)
> :baku, did you find out anything?  Should we consider closing this?

I don't believe in bugs able to fix themselves :) But I didn't fine anything actually. I tried to rotate the test, but in the end the test seems correct: I didn't find any obvious timing issues.

Maybe we can close this bug and open it in case this issue reappears.
Flags: needinfo?(amarchesini)
My concern with doing that is that bugs like these tend to come back at random times, which only leads to tree closures, backouts, etc down the road. Do I need to remind you about the pain we were going through with random devtools test additions causing perma-orange because they happened to shift a test into a different chunk than before? Those are the worst kind of issues to sheriff and they cause massive developer frustration.

Closing this bug and keeping our fingers crossed seems like taking the easy way out here and I strongly suspect it'll come back to bite us in the butt later if we do.
I think the best thing we can do right now is try to bisect down to what changeset made this go away. Who knows, it could be a legitimate code fix :). But before we wash our hands of this, I think we need to understand whether it was random perturbation that made it go or something legitimate.
I'm actually wondering if this got fixed by the MediaStreamGraph refactor (bug 848954). Might be a good before/after for bisecting.
in 8 data points I have 2 failures:
https://tbpl.mozilla.org/?tree=Try&rev=eb6f0b691200&jobname=.*MacOSX.*%20test%20mochitest-browser-chrome-2

There is one other failure in e10s tests, so it might take a week or so to get this going again.  Can you take a look at the above try push and see if anything stands out as a problem?
I suspected that patch for bug 1075345 fixes this issue.
Attached patch test.patch (obsolete) — Splinter Review
Ehsan, can you review this patch? The reason why we have this failure is that StartPlaying() in AudioChannelNode is called asynchronously. In the meantime it could be that another AudioChannel object starts playing, creating a race condition.
Attachment #8477393 - Attachment is obsolete: true
Attachment #8504699 - Flags: review?(ehsan.akhgari)
Attached patch test.patch (obsolete) — Splinter Review
Attachment #8504699 - Attachment is obsolete: true
Attachment #8504699 - Flags: review?(ehsan.akhgari)
Attachment #8504733 - Flags: review?(ehsan.akhgari)
Comment on attachment 8504733 [details] [diff] [review]
test.patch

Review of attachment 8504733 [details] [diff] [review]:
-----------------------------------------------------------------

The code changes look fine, but I don't think this is the right way to fix the tests, as described below.

::: content/media/webaudio/test/browser_mozAudioChannel.html
@@ +37,5 @@
>    source.connect(ac.destination);
>    source.loop = true;
>    source.start(0);
> +
> +  document.getElementById("mozAudioChannelTest").innerHTML = "READY";

I'm not sure how this helps.  The load event must be dispatched after this script runs before your patch anyway.

::: content/media/webaudio/test/browser_mozAudioChannel.js
@@ +20,5 @@
>  
> +function waitForReady(doc, aCallback) {
> +  var e = doc.getElementById("mozAudioChannelTest");
> +  if (!e || e.innerHTML != 'READY') {
> +    setTimeout(function() { waitForReady(doc, aCallback); }, 500);

Can't you wait for the load event and avoid the polling here?

@@ -78,5 @@
> -
> -            info("Restoring the tab...");
> -            whenTabRestored(tab2, function() { info("Tab restored."); });
> -          }
> -        );

I'm assuming this is just indentation changes.

::: content/media/webaudio/test/browser_mozAudioChannel_muted.html
@@ +21,5 @@
>      document.getElementById("mozAudioChannelTest").innerHTML = "mozinterruptend";
>    }, false);
>  
>    document.getElementById("mozAudioChannelTest").innerHTML = "READY";
> +}

This change is also unneeded, I think.

::: content/media/webaudio/test/browser_mozAudioChannel_muted.js
@@ +29,5 @@
>  
> +function waitForReady(doc, aCallback) {
> +  var e = doc.getElementById("mozAudioChannelTest");
> +  if (!e || e.innerHTML != 'READY') {
> +    setTimeout(function() { waitForReady(doc, aCallback); }, 500);

Ditto.

@@ -63,5 @@
> -              gBrowser.removeTab(tab2);
> -              gBrowser.selectedTab = tab1;
> -            });
> -          }
> -        );

Again I'm assuming this is an indentation change.
Attachment #8504733 - Flags: review?(ehsan.akhgari) → review-
Attached patch test.patchSplinter Review
So the main issue here is when do we consider the AudioContext playing audio.
To me the audiocontext starts playing audio immediately, before the first block and this patch implements this concept.
Attachment #8504733 - Attachment is obsolete: true
Attachment #8505475 - Flags: review?(ehsan.akhgari)
Attachment #8505475 - Flags: review?(ehsan.akhgari) → review+
Blocks: 1085294
https://hg.mozilla.org/mozilla-central/rev/09e514e23ef1
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
Depends on: 1195051
You need to log in before you can comment on or make changes to this bug.