Closed
Bug 977944
Opened 9 years ago
Closed 8 years ago
Intermittent test_ipc.html | test_peerConnection_offerRequiresReceiveVideo.html | pc_local: ICE failed to switch to connected | Test timed out.
Categories
(Core :: WebRTC: Networking, defect, P2)
Tracking
()
RESOLVED
INCOMPLETE
mozilla35
backlog | webrtc/webaudio+ |
People
(Reporter: KWierso, Assigned: drno)
References
Details
(Keywords: intermittent-failure)
Attachments
(2 files, 2 obsolete files)
3.52 KB,
patch
|
bwc
:
review+
|
Details | Diff | Splinter Review |
2.35 KB,
patch
|
drno
:
review+
|
Details | Diff | Splinter Review |
https://tbpl.mozilla.org/php/getParsedLog.php?id=35383521&tree=Mozilla-Inbound
slave: tst-linux64-spot-359
15:58:56 INFO - 831 INFO TEST-PASS | /tests/dom/media/tests/ipc/test_ipc.html | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideo.html | signalingState after local setRemoteDescription is 'stable'
15:58:56 INFO - 832 INFO TEST-PASS | /tests/dom/media/tests/ipc/test_ipc.html | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideo.html | iceConnectionState should not be undefined
15:58:56 INFO - 833 INFO TEST-PASS | /tests/dom/media/tests/ipc/test_ipc.html | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideo.html | signalingState after remote setLocalDescription is 'stable'
15:58:56 INFO - 834 INFO TEST-PASS | /tests/dom/media/tests/ipc/test_ipc.html | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideo.html | iceConnectionState should not be undefined
15:58:56 INFO - 835 INFO TEST-PASS | /tests/dom/media/tests/ipc/test_ipc.html | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideo.html | iceConnectionState should not be undefined
15:58:56 INFO - 836 INFO TEST-UNEXPECTED-FAIL | /tests/dom/media/tests/ipc/test_ipc.html | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideo.html | pc_local: ICE failed to switch to connected
16:03:52 INFO - Xlib: extension "RANDR" missing on display ":0".
16:03:53 INFO - 837 INFO TEST-UNEXPECTED-FAIL | /tests/dom/media/tests/ipc/test_ipc.html | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideo.html | Test timed out.
16:03:53 INFO - 138680768[7fe8fb115140]: [main|PeerConnectionImpl] PeerConnectionImpl.cpp:1638: CloseInt: Closing PeerConnectionImpl 7e05bd119948ba82; ending call
16:03:53 INFO - -131930368[7fe8fb119dc0]: [CCAPP Task|def] ccapi.c:1445: SIPCC-CC_API: 1/39, cc_int_onhook: UI -> GSM: ONHOOK
16:03:53 INFO - -131930368[7fe8fb119dc0]: [CCAPP Task|def] ccapi.c:1447: (1/39) On-hook called from /builds/slave/m-in-l64-000000000000000000000/build/media/webrtc/signaling/src/sipcc/core/ccapp/ccprovider.c:657
16:03:53 INFO - -195864832[7fe8fb119700]: [GSM Task|def] dcsm.c:532: SIPCC-DCSM: dcsm_process_event: DCSM 11 :(DCSM_READY:ONHOOK )
16:03:53 INFO - -195864832[7fe8fb119700]: [GSM Task|fsm_sm] sm.c:46: SIPCC-FSM: sm_process_event: DEF 39 : 0x7fe9049b79edx: sm entry: (STABLE:ONHOOK)
16:03:53 INFO - -195864832[7fe8fb119700]: [GSM Task|fsm_sm] fsmdef.c:6711: SIPCC-FSM: fsmdef_ev_onhook: Entered.
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → drno
Assignee | ||
Comment 5•9 years ago
|
||
Test timeouts are probably caused by Bug 976182
Assignee | ||
Comment 8•9 years ago
|
||
Progress: as expected, the fix from Bug 976182 now at least results in connection failures from both ends and no more test timeouts. Let's hope for now that Bug 977933 will give us more clues in the logging once this hits again.
Comment 9•9 years ago
|
||
From logging:
11:54:16 INFO - (ice/WARNING) ICE-PEER(PC:1393962854814631 (id=58 url=http://mochi.test:8888/tests/dom/media/tests/mochitest/test_peerConnection_basicAudioVideoCombined.:default): no pairs for 1393962854814631 (id=58 url=http://mochi.test:8888/tests/dom/media/tests/mochitest/test_peerConnection_basicAudioVideoCombined.: stream2/video
11:54:16 INFO - (ice/WARNING) ICE-PEER(PC:1393962855025412 (id=58 url=http://mochi.test:8888/tests/dom/media/tests/mochitest/test_peerConnection_basicAudioVideoCombined.:default): no pairs for 1393962855025412 (id=58 url=http://mochi.test:8888/tests/dom/media/tests/mochitest/test_peerConnection_basicAudioVideoCombined.: stream2/video
That looks pretty much fatal.
Assignee | ||
Comment 10•9 years ago
|
||
(In reply to Byron Campen [:bwc] from comment #9)
> That looks pretty much fatal.
Indeed. But the questions are:
- why
- why now
- why only for one of the streams
As this always seems to happen inside test_ipc.html, I'm starting to wonder if we are running into resource problems (e.g. running out of sockets or something like that).
Comment 11•9 years ago
|
||
It would be extremely useful if the test cases dumped the local and remote description when ICE fails.
Comment 12•9 years ago
|
||
I suspect we've hit some sort of problem in the interaction between NrSocketIpc and candidate creation.
Assignee | ||
Comment 13•9 years ago
|
||
(In reply to Byron Campen [:bwc] from comment #11)
> It would be extremely useful if the test cases dumped the local and remote
> description when ICE fails.
Is there a way to dump that information from JS land?
I'm thinking about increasing the ICE log levels so that we see what's going on. But very likely that would blow the logs beyond their limits on the build servers.
Comment 14•9 years ago
|
||
(In reply to Nils Ohlmeier [:drno] from comment #13)
> (In reply to Byron Campen [:bwc] from comment #11)
> > It would be extremely useful if the test cases dumped the local and remote
> > description when ICE fails.
>
> Is there a way to dump that information from JS land?
>
> I'm thinking about increasing the ICE log levels so that we see what's going
> on. But very likely that would blow the logs beyond their limits on the
> build servers.
The test cases should be able to save a copy of both the local and remote description, but I'm not sure there is a way to ask for them again.
Assignee | ||
Comment 16•9 years ago
|
||
Thanks for the idea. This patch writes out the SDPs from both sides if the ICE connection fails. Obviously we are still missing trickle ICE and timing information. But hopefully this helps to debug the issues.
Attachment #8390799 -
Flags: review?(docfaraday)
Comment 17•9 years ago
|
||
Comment on attachment 8390799 [details] [diff] [review]
dump_sdp_after_ice_failure.patch

Review of attachment 8390799 [details] [diff] [review]:
-----------------------------------------------------------------
This looks like it will do the right thing, but it seems likely that we'll need to add more post-mortem logging, so putting it all in a function call is probably a good idea.
Attachment #8390799 -
Flags: review?(docfaraday) → review+
Assignee | ||
Comment 18•9 years ago
|
||
Try run (although it is doubtful it will hit the code) https://tbpl.mozilla.org/?tree=Try&rev=82cd9660c237 I would request landing of this for now, and re-factor when I put in more post-mortem debugging info.
Comment 19•9 years ago
|
||
(In reply to Nils Ohlmeier [:drno] from comment #18) > I would request landing of this for now, and re-factor when I put in more > post-mortem debugging info. I'm cool with that.
Assignee | ||
Updated•9 years ago
|
Keywords: checkin-needed
Comment 23•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/add3f8ae5443
Flags: in-testsuite+
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/add3f8ae5443
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
Assignee | ||
Comment 27•9 years ago
|
||
Sorry, this was probably confusing, but the check-in was just instrumentation for future debugging of this intermittent test failure. Is there a better way for me to mark this properly in here in the future? We are waiting for another test failure with the instrumentation in place...
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 39•9 years ago
|
||
Attachment #8396045 -
Flags: review?(docfaraday)
Comment 42•9 years ago
|
||
Comment on attachment 8396045 [details] [diff] [review]
dump_log_on_ice_error.patch

Review of attachment 8396045 [details] [diff] [review]:
-----------------------------------------------------------------
Looking at the implementation of requestCompleteLog(), all it seems to do is flip a bit that causes all subsequent logging to be output instead of accumulated in the circular buffer. This circular buffer _should_ be getting dumped as soon as a failure is observed, but I'm guessing it is not? Maybe the following chunk of code is causing some confusion somehow in the ipc case?
dxr.mozilla.org/mozilla-central/source/testing/mochitest/tests/SimpleTest/SimpleTest.js#412
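For readers unfamiliar with the buffering behaviour described in this review, here is a minimal sketch of the idea in Python. It is a toy model only, not the SimpleTest.js implementation: the class name `BufferedTestLog`, its methods, and the `output` list standing in for the real log sink are all assumptions made for illustration.

```python
from collections import deque


class BufferedTestLog:
    """Toy model: keep recent messages in a ring buffer, flush them on failure."""

    def __init__(self, capacity=100):
        self.buffer = deque(maxlen=capacity)  # oldest entries fall off the front
        self.complete_log = False             # the "bit" that gets flipped
        self.output = []                      # hypothetical stand-in for the log sink

    def request_complete_log(self):
        # From now on, messages bypass the buffer and are emitted directly.
        self.complete_log = True

    def log(self, message, failed=False):
        if self.complete_log:
            self.output.append(message)
            return
        self.buffer.append(message)
        if failed:
            # Dump the buffered context, which includes the failure itself.
            self.output.extend(self.buffer)
            self.buffer.clear()


log = BufferedTestLog(capacity=2)
log.log("pass 1")
log.log("pass 2")
log.log("ICE failed", failed=True)
print(log.output)
```

In this model, flipping the bit only affects messages logged afterwards; anything that already fell out of the ring buffer is gone, which is one reason a large SDP dump might not survive if it is logged before the failure is reported.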
Assignee | ||
Comment 43•9 years ago
|
||
Good point Byron. That whole logging seems pretty useless to me.
Assignee | ||
Updated•9 years ago
|
Attachment #8396045 -
Attachment is obsolete: true
Attachment #8396045 -
Flags: review?(docfaraday)
Assignee | ||
Comment 44•9 years ago
|
||
This should hopefully be able to bypass the mochitest logging for dumping the big SDPs.
Attachment #8396818 -
Flags: review?(docfaraday)
Comment 45•9 years ago
|
||
Comment on attachment 8396818 [details] [diff] [review]
dump_log_on_ice_error.patch

Review of attachment 8396818 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good to me, provided |dump| actually does what it says.
Attachment #8396818 -
Flags: review?(docfaraday) → review+
Assignee | ||
Comment 46•9 years ago
|
||
(In reply to Byron Campen [:bwc] from comment #45)
> Looks good to me, provided |dump| actually does what it says.
I verified that it works locally, and it bypasses the mochitest logging logic...
Assignee | ||
Updated•9 years ago
|
Attachment #8390799 -
Attachment is obsolete: true
Assignee | ||
Updated•9 years ago
|
Keywords: checkin-needed,
leave-open
Reporter | ||
Comment 48•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/7ebf21a33220
Keywords: checkin-needed
Assignee | ||
Comment 57•9 years ago
|
||
Now we finally have the SDP in the logs. But I don't see anything bad or wrong in there. Byron: anything in the SDP which looks suspicious to you?
Flags: needinfo?(docfaraday)
Comment 59•9 years ago
|
||
The port that seems to never receive any traffic seems to show up twice in the same SDP, once for each component. That's highly suspicious...
19:24:23 INFO - a=candidate:0 1 UDP 2122252543 10.134.57.120 42238 typ host
19:24:23 INFO - a=candidate:0 2 UDP 2122252542 10.134.57.120 42238 typ host
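A check like the one done by eye above can be automated. The following is a small sketch, assuming the SDP appears in the log as `a=candidate:` lines in the usual "foundation component transport priority address port typ type" layout; the function name `duplicate_candidate_ports` is made up for this example and is not part of any Mozilla tooling.

```python
from collections import defaultdict

PREFIX = "a=candidate:"


def duplicate_candidate_ports(text):
    """Return {(ip, port): [component ids]} for addresses listed more than once."""
    seen = defaultdict(list)
    for line in text.splitlines():
        start = line.find(PREFIX)  # tolerate a log prefix such as "19:24:23 INFO - "
        if start < 0:
            continue
        fields = line[start + len(PREFIX):].split()
        # Expected: foundation component transport priority address port "typ" type
        if len(fields) < 8 or fields[6] != "typ":
            continue
        component, ip, port = fields[1], fields[4], int(fields[5])
        seen[(ip, port)].append(component)
    return {addr: comps for addr, comps in seen.items() if len(comps) > 1}


log_excerpt = """\
19:24:23 INFO - a=candidate:0 1 UDP 2122252543 10.134.57.120 42238 typ host
19:24:23 INFO - a=candidate:0 2 UDP 2122252542 10.134.57.120 42238 typ host
"""
print(duplicate_candidate_ports(log_excerpt))
```

Note that this scans the whole text at once, so it should be fed one SDP at a time; candidates from different offers or answers mixed into one input could produce false positives.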
Updated•9 years ago
|
Flags: needinfo?(docfaraday)
Assignee | ||
Comment 61•9 years ago
|
||
Adding jduell for visibility. Jason: as per my separate email, can you please help us debug why the network library over IPC hands out the same port twice (or point us to the best person)?
Flags: needinfo?(jduell.mcbugs)
Comment 62•9 years ago
|
||
I might be able to help with debugging the UDP IPC part. Can you point out in which log file you found the duplicate port?
Comment 64•9 years ago
|
||
(In reply to Shih-Chiang Chien [:schien] (UTC+8) from comment #62)
> I might be able to help on debugging the UDP IPC part. Can you point out
> which log file you guys found the duplicate port?
I found it here: https://tbpl.mozilla.org/php/getParsedLog.php?id=37039733&full=1&branch=mozilla-inbound
However, this does not always seem to happen. There are probably other errors lurking.
Comment 65•9 years ago
|
||
Seeing logging like this a lot:
02:43:15 INFO - (ice/WARNING) Peer offered candidates for disabled local component
I would not expect this to happen in this test. Maybe we're hitting a race where the offer comes in before we've enabled the component somehow? I'm going to try improving the logging here, and look at the code in the meantime.
Comment 66•9 years ago
|
||
It looks like we get that kind of logging in normal operation. I'm trying to get this to reproduce on try with slightly better logging.
Assignee | ||
Comment 67•9 years ago
|
||
FYI: as we have limited the number of tests executed within test_ipc.html, the signature of this problem has changed a little. New bug 992902 got filed, which still shows the same ICE connection problem.
Comment 70•9 years ago
|
||
Sorry about my latency--was on vacation last week. Randell is a better person to ask about anything ICE-related.
Flags: needinfo?(jduell.mcbugs) → needinfo?(rjesup)
Comment 71•9 years ago
|
||
Sigh. ekr, can you look at this? Or Byron, if you think you have the info you need to look at the duplicate port, please cancel the needinfo. If you think it's related to IPC specifically, please needinfo jduell again.
Flags: needinfo?(rjesup) → needinfo?(ekr)
Comment 72•9 years ago
|
||
(In reply to Jason Duell (:jduell) from comment #70)
> Sorry about my latency--was on vacation last week. Randell is a better
> person to ask about anything ICE-related.
This isn't actually ICE-related; we caught this code binding to the same port twice when passed 0, and were wondering whether we were somehow misusing the service or whether there was a bug:
http://dxr.mozilla.org/mozilla-central/source/media/mtransport/nr_socket_prsock.cpp#1005
Updated•9 years ago
|
Flags: needinfo?(jduell.mcbugs)
Comment 74•9 years ago
|
||
Byron looked at it.
Comment 79•9 years ago
|
||
Byron, I am looking through the logs, trying to piece together how you came to the conclusion that the binding was happening on the same port. Could you provide a breakdown of what you are seeing?
Comment 80•9 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=6bbc6e6fde84
This is an approach to fixing the problem by staggering the calls to getUserMedia. The current scenario is unlikely to happen in real life, so it seems reasonable to fix the test and then revisit the IPDL pieces if we get reports from the field of this happening.
Attachment #8411282 -
Flags: review?(drno)
Assignee | ||
Comment 81•9 years ago
|
||
Comment on attachment 8411282 [details] [diff] [review]
Bug977944.patch

Review of attachment 8411282 [details] [diff] [review]:
-----------------------------------------------------------------
I'm not convinced that this is the root cause. But on the other hand it is a general cleanup of our test cases which does no harm. So worth a try. Looks good to me from the coding and functional side.
Attachment #8411282 -
Flags: review?(drno) → review+
Updated•9 years ago
|
Keywords: checkin-needed
Comment 82•9 years ago
|
||
Yes agreed, kind of a shot in the dark.
Updated•9 years ago
|
Keywords: checkin-needed
Comment 83•9 years ago
|
||
An update on my efforts: the ICE candidates are determined when you initialize a mozRTCPeerConnection. At 19:24:17 in the logs you can see the initialization occur, with the error. So I created a test case which initializes many mozRTCPeerConnections, and then ran the following over the standard output from the mochitest:
grep "trickling candidate" | cut -d' ' -f14 | sort | uniq -c
But I wasn't able to reproduce duplicate ports being generated. I did notice that netwerk/base/src/nsUDPSocket.cpp has reuse_addr = true, which I believe could allow for UDP port reuse. But since I couldn't reproduce the failure case, I have no idea whether a fix there would work. Possibly a platform issue?
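For anyone who wants to repeat this kind of count without the shell, the pipeline can be approximated in Python as below. This is a rough sketch: the helper name `count_field` is made up, and the sample lines are synthetic placeholders rather than real mochitest output, since the exact log layout (and therefore which field holds the port) is an assumption carried over from the `-f14` in the command.

```python
from collections import Counter


def count_field(lines, needle="trickling candidate", field=14):
    """Count values of a 1-based, single-space-separated field on matching lines."""
    counts = Counter()
    for line in lines:
        if needle not in line:            # grep "trickling candidate"
            continue
        parts = line.rstrip("\n").split(" ")  # like cut -d' ': split on every space
        if len(parts) >= field:
            counts[parts[field - 1]] += 1     # like sort | uniq -c
    return counts


# Synthetic lines (not real logs): the port sits in field 14.
sample = [
    "a b c d e f g h i j k trickling candidate 42238",
    "a b c d e f g h i j k trickling candidate 42238",
    "a b c d e f g h i j k trickling candidate 50000",
    "unrelated line",
]
for value, n in count_field(sample).most_common():
    print(n, value)
```

Any value with a count above 1 would be a port that was trickled more than once, which is the condition the test case was trying to provoke.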
Comment 108•9 years ago
|
||
Current intermittent failures all seem to happen on Linux and MacOSX. I haven't seen any failures on Windows.
Comment 111•9 years ago
|
||
The most recent log looks like bug 963524 (specifically, the bug 1006809 part of it).
Comment 112•9 years ago
|
||
I should also note that in the IPC case, there is some packet filtering logic that exacerbates the problem detailed here:
https://bugzilla.mozilla.org/show_bug.cgi?id=963524#c52
Here's the code that makes this mess worse, and makes this more frequent in the IPC case, AFAICT:
http://dxr.mozilla.org/mozilla-central/source/media/mtransport/stun_udp_socket_filter.cpp#130
This drops incoming check requests until we've sent our own check request in the other direction, which causes the first sender to have their first check request dropped, increasing the likelihood that the first sender will be pushed to the max retransmit count by triggered checks.
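To make the effect of that filtering rule concrete, here is a toy model in Python. It only encodes the behaviour as described in this comment (incoming requests are dropped until we have sent a request to that peer); it is not a port of stun_udp_socket_filter.cpp, and the class and method names are invented for the example.

```python
class StunRequestFilter:
    """Toy model: accept inbound check requests only from peers we already sent to."""

    def __init__(self):
        self.sent_to = set()

    def on_outgoing_request(self, remote_addr):
        # Remember that we have sent a check request towards this address.
        self.sent_to.add(remote_addr)

    def allow_incoming_request(self, remote_addr):
        return remote_addr in self.sent_to


addr_a = ("10.0.0.1", 5000)  # hypothetical addresses
addr_b = ("10.0.0.2", 6000)
filter_a, filter_b = StunRequestFilter(), StunRequestFilter()

# A sends first. B has not sent anything to A yet, so B's filter drops it.
filter_a.on_outgoing_request(addr_b)
first_check_delivered = filter_b.allow_incoming_request(addr_a)

# B now sends its own check. A already sent to B, so A accepts it.
filter_b.on_outgoing_request(addr_a)
reverse_check_delivered = filter_a.allow_incoming_request(addr_b)

# Only A's retransmission gets through B's filter.
retransmit_delivered = filter_b.allow_incoming_request(addr_a)
print(first_check_delivered, reverse_check_delivered, retransmit_delivered)
```

The first sender therefore always loses at least one request and has to rely on a retransmit, which is the extra pressure on the retransmit count mentioned above.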
Comment 117•9 years ago
|
||
So it sounds like we're not actually sure whether we saw bind(port=0) hand out the same port twice?
> I did notice netwerk/base/src/nsUDPSocket.cpp has reuse_addr = true,
Not sure why you'd want to have reuse_addr = true. Generally I'd avoid that?
Flags: needinfo?(jduell.mcbugs)
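As a point of reference for the bind(port=0) question, the sketch below shows the behaviour one would normally expect from plain UDP sockets without SO_REUSEADDR, using Python's standard `socket` module on the loopback interface. It says nothing about what nsUDPSocket or the IPC layer actually do; the helper name `bind_udp_any_port` is made up.

```python
import socket


def bind_udp_any_port(host="127.0.0.1"):
    """Bind a UDP socket to port 0 so the OS picks a free port (no SO_REUSEADDR)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, 0))
    return sock


first = bind_udp_any_port()
second = bind_udp_any_port()
port_a = first.getsockname()[1]
port_b = second.getsockname()[1]

# Explicitly binding a third socket to a port that is still in use should fail
# when address reuse has not been requested.
third = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
try:
    third.bind(("127.0.0.1", port_a))
    rebind_failed = False
except OSError:
    rebind_failed = True
finally:
    third.close()

first.close()
second.close()
print(port_a != port_b, rebind_failed)
```

With address reuse enabled the picture can change, which is why the reuse_addr = true setting is worth a second look.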
Comment 119•9 years ago
|
||
(In reply to Jason Duell (:jduell) from comment #117)
> So it sounds like we're not actually sure whether we saw bind(port=0) hand
> out the same port twice?
>
Oh, I did see it in the logs. I'm just not convinced it is the cause of the bulk of these failures. Once we land a fix or two, we should get a better idea of what effect it actually has.
Comment 134•9 years ago
|
||
Targeting finishing this by Fx 33.
Priority: -- → P2
Target Milestone: mozilla30 → mozilla33
Comment 154•9 years ago
|
||
Nils -- Where does this stand? How close are we to solving this?
Flags: needinfo?(drno)
Assignee | ||
Comment 156•9 years ago
|
||
Not much new information here.
I analyzed the log from comment #153. The ICE connections for audio and video are successful. Then the GC kicks in. And when the GC is finished, the third ICE connection for the data channel has timed out.
Makes me wish we could better control the GC from within a test. But it also makes you wonder how the GC affects real-world usage of timing-sensitive ICE operations.
In general, the majority of the test failures in here fall into the ICE connection problem bucket, which raises my wish for better/more ICE connection debug possibilities from JS land.
Flags: needinfo?(drno)
Comment 157•9 years ago
|
||
(In reply to Nils Ohlmeier [:drno] from comment #156)
> Not much new information here.
> I analyzed the log from comment #153. The ICE connections for audio and
> video are successful. Then the GC kicks in. And when the GC is finished the
> third ICE connection for the data channel is timed out.
> Makes me wish we could better control the GC from within a test. But also
> makes you wonder how the GC affect real world usage of timing sensitive ICE
> operations.
How long did the GC take? Our tests tend to accumulate garbage more than normal use, and so the GCs, when they happen, can take a while (especially on the b2g emulator!). Running GC in between each test might be a good (though not new) idea. Or better yet, try our darnedest to remove main-thread touches, though for trickle ICE it's unavoidable. (Too bad we can't run in a worker.....)
> In general the majority of the test failures in here fall into the ICE
> connection problem bucket. Which raises my wish for better/more ICE
> connection debug possibilities from JS land.
hear hear
Assignee | ||
Comment 158•9 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #157)
> (In reply to Nils Ohlmeier [:drno] from comment #156)
> > I analyzed the log from comment #153. The ICE connections for audio and
> > video are successful. Then the GC kicks in. And when the GC is finished the
> > third ICE connection for the data channel is timed out.
> > Makes me wish we could better control the GC from within a test. But also
> > makes you wonder how the GC affect real world usage of timing sensitive ICE
> > operations.
>
> How long did the GC take? Our tests tend to accumulate garbage more than
> normal use, and so the GC's when they happen can take a while (especially on
> b2g emulator!) Running gc inbetween each test might be a good (though not
> new) idea. Or better yet, try our darnedest to remove mainthread touches,
> though for trickle ice it's unavoidable. (To bad we can't run in a
> worker.....)
In this case it looks like it took 1-2 seconds. BTW this is not the B2G emulator, but e10s on Linux. It shows similar issues to B2G, just not as bad or as often.
Byron: from looking at the logs, it looks like the three ICE connections get tried one after another. Does that sound right? If that is the case, does each of them have independent timeouts which start on the first check, or is the last ICE connection left with whatever time remains after establishing all previous connections?
Flags: needinfo?(docfaraday)
Comment 159•9 years ago
|
||
From what I could tell, every candidate that should have been tried was, but it is hard to discern whether there were any network IO problems, or the exact timing. Each candidate runs its own timers, and there is also a timer per ICE that prevents ICE from failing until it pops to allow more candidates to be gathered, and new pairs to be formed (this is that "grace period" timer we added recently).
Flags: needinfo?(docfaraday)
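The interaction between per-candidate timers and the grace period timer described above can be summarised in a few lines of Python. This is a toy rule written from that description only; the function name, the state strings, and the use of plain numbers for time are all assumptions, and the real ICE stack tracks far more state.

```python
def ice_has_failed(pair_states, now, grace_deadline):
    """Toy rule: fail only when every pair has failed AND the grace timer has popped."""
    all_failed = all(state == "failed" for state in pair_states)
    return all_failed and now >= grace_deadline


# All pairs have failed, but the grace period is still running: keep waiting,
# because more candidates may still be gathered and new pairs formed.
print(ice_has_failed(["failed", "failed"], now=3.0, grace_deadline=5.0))

# Grace period over and nothing succeeded: now ICE is allowed to fail.
print(ice_has_failed(["failed", "failed"], now=5.0, grace_deadline=5.0))

# One pair is still in progress, so there is no failure even after the deadline.
print(ice_has_failed(["failed", "in-progress"], now=9.0, grace_deadline=5.0))
```

In this model an empty pair list counts as "all failed", so a stream with no pairs at all would fail as soon as the grace timer pops.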
Comment 160•9 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=41980868&tree=B2g-Inbound
Updated•9 years ago
|
Target Milestone: mozilla33 → mozilla35
Updated•8 years ago
|
backlog: --- → webRTC+
Rank: 25
Comment 193•8 years ago
|
||
Closing as not seen in 7 months
Status: REOPENED → RESOLVED
Closed: 9 years ago → 8 years ago
Resolution: --- → INCOMPLETE
Comment 194•5 years ago
|
||
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open