Closed Bug 1138604 Opened 9 years ago Closed 9 years ago

Crash in test_peerConnection_twoAudioVideoStreams.html when applying the patch from bug 1137963

Categories

(Core :: Audio/Video, defect)

All
macOS
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: mstange, Assigned: mstange)

References

Details

Applying the patch in bug 1137963 apparently causes us to crash during mochitest-3 debug runs, *only* on 10.6:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=362310d7cb6b
Log from a passing run (without the patch): http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mstange@themasta.com-430cdd99c7f0/try-macosx64-debug/try_mountainlion-debug_test-mochitest-3-bm108-tests1-macosx-build6.txt.gz

Log from a failing run (with the patch): http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mstange@themasta.com-362310d7cb6b/try-macosx64-debug/try_snowleopard-debug_test-mochitest-3-bm107-tests1-macosx-build285.txt.gz

One difference I found was that most of the failed runs contain a few lines that say "SendRTCPPacket RTCP Packet Send Failed":
> 23:44:15     INFO -  1895746752[10032e1a0]: [main|WebrtcVideoSessionConduit] VideoConduit.cpp:445: Init Initialization Done
> 23:44:15     INFO -  1599082496[10c34fb40]: [|WebrtcVideoSessionConduit] VideoConduit.cpp:1275: SendRTCPPacket RTCP Packet Send Failed

and none of the successful runs I checked contain that message.
None of the logs contain a crash stack. The minidump stack walker complains about a "Minidump header signature mismatch":

> 23:45:50     INFO -  2051 ERROR TEST-UNEXPECTED-FAIL | dom/media/tests/mochitest/test_peerConnection_twoAudioVideoStreamsCombined.html | application terminated with exit code -6
> 23:45:50     INFO -  runtests.py | Application ran for: 0:31:51.226409
> 23:45:50     INFO -  zombiecheck | Reading PID log: /var/folders/Q0/Q02Z86cFH5GSXLCvQRw78k+++-k/-Tmp-/tmpJzJRXipidlog
> 23:45:51     INFO -  mozcrash Saved minidump as /builds/slave/talos-slave/test/build/blobber_upload_dir/BD92D510-DC11-4F8F-A467-6C61E8879147.dmp
> 23:45:51  WARNING -  PROCESS-CRASH | dom/media/tests/mochitest/test_peerConnection_twoAudioVideoStreamsCombined.html | application crashed [None]
> 23:45:51     INFO -  Crash dump filename: /var/folders/Q0/Q02Z86cFH5GSXLCvQRw78k+++-k/-Tmp-/tmpPZr2r3.mozrunner/minidumps/BD92D510-DC11-4F8F-A467-6C61E8879147.dmp
> 23:45:51     INFO -  stderr from minidump_stackwalk:
> 23:45:51     INFO -  2015-02-28 23:45:50: minidump_processor.cc:264: INFO: Processing minidump in file /var/folders/Q0/Q02Z86cFH5GSXLCvQRw78k+++-k/-Tmp-/tmpPZr2r3.mozrunner/minidumps/BD92D510-DC11-4F8F-A467-6C61E8879147.dmp
> 23:45:51     INFO -  2015-02-28 23:45:50: minidump.cc:3815: INFO: Minidump opened minidump /var/folders/Q0/Q02Z86cFH5GSXLCvQRw78k+++-k/-Tmp-/tmpPZr2r3.mozrunner/minidumps/BD92D510-DC11-4F8F-A467-6C61E8879147.dmp
> 23:45:51     INFO -  2015-02-28 23:45:50: minidump.cc:3847: ERROR: Minidump header signature mismatch: (0x0, 0x0) != 0x504d444d
> 23:45:51     INFO -  2015-02-28 23:45:50: minidump_processor.cc:268: ERROR: Minidump /var/folders/Q0/Q02Z86cFH5GSXLCvQRw78k+++-k/-Tmp-/tmpPZr2r3.mozrunner/minidumps/BD92D510-DC11-4F8F-A467-6C61E8879147.dmp could not be read
> 23:45:51     INFO -  2015-02-28 23:45:50: minidump.cc:3787: INFO: Minidump closing minidump
> 23:45:51     INFO -  2015-02-28 23:45:50: minidump_stackwalk.cc:529: ERROR: MinidumpProcessor::Process failed

Is this a known problem? Can I get stacks in some other way?
Flags: needinfo?(ted)
This is generally indicative of us failing to properly write a minidump. It's checking a 4-byte magic number at the start of the file and getting zeros which means we didn't actually write a useful minidump.

This can happen if the heap is getting trashed really badly or something like that. Breakpad doesn't use new or malloc when writing a dump, but it does use its own allocator that calls mmap directly. If that fails or some other aspect of dump writing fails we can get into this situation.
Flags: needinfo?(ted)
Hmm, ok. I'm a little stuck now.

I can make the spin lock print some information about how long it's spinning, to see whether we're getting stuck in it, or something like that.

The other option would be to request a loaner machine and try to debug the problem directly on it.
Depends on: 1138640
Here's a push that prints long spin times: https://treeherder.mozilla.org/#/jobs?repo=try&revision=daf35ff24e21

In the log at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mstange@themasta.com-daf35ff24e21/try-macosx64-debug/try_snowleopard-debug_test-mochitest-3-bm107-tests1-macosx-build305.txt.gz there are a few instances where we're spinning multiple seconds while waiting for the lock to be released, with a maximum of 7.7 seconds.

The only reason I can think of that we would wait for so long is that the thread that's currently owning the lock doesn't get a chance to finish its work, because it's not scheduled long enough / often enough.
I've fired off a new push with an added PR_Sleep(PR_INTERVAL_NO_WAIT) in the spin loop: https://treeherder.mozilla.org/#/jobs?repo=try&revision=b5e788107c36
And it's green!

It's still strange that a difference in timing causes us to crash, but since the crash seems to be 10.6-only, I'd rather not spend more time tracking that down.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.