Closed
Bug 872013
Opened 12 years ago
Closed 12 years ago
SIPCC fails to start if it can't mkdir the directory it wants in /tmp
Categories
(Core :: WebRTC: Signaling, defect)
Tracking
()
RESOLVED
FIXED
mozilla24
People
(Reporter: abr, Assigned: abr)
Details
(Keywords: intermittent-failure, Whiteboard: [WebRTC], [blocking-webrtc-] [qa-])
Attachments
(2 files, 1 obsolete file)
398.11 KB,
text/plain
|
Details | |
10.66 KB,
patch
|
jesup
:
review+
|
Details | Diff | Splinter Review |
When running repeated gUM/PC mochi tests under ASAN, jesup encountered an as-yet-unseen failure that ultimately resulted in a timeout in the "throwInCallbacks" test case (most likely because it was the last test run).
The telltale marker in this log is:
> [GSM Task|sdp_config] sdp_config.c:26: SDP: Invalid Config pointer.
This message can arise when the config object is NULL or uninitialized (which should cause emission of a message "sdp_init_config() failure") or when it has been corrupted by being overwritten (which would be worrisome).
This log message is also curious:
> [SIP MsgQueueWait task|ccsip] sip_platform_task.c:275: SIP : sip_platform_task_msgqwait : timeout waiting for listening IPC socket ready, exiting
It arises as a consequence of the following:
> [SIPStack task|ccsip] sip_platform_task.c:508: SIP : sip_platform_task_loop : failed to create temp dir
After such a failure, the sip_platform_task_loop returns, which prevents the IPC that the SIP MsgQueueWait expects to be set up from being set up.
Note that this message itself is an indication that mkdir("/tmp/SIP-%d",0700) failed (where %d is replaced with the PID). This is not a system call I would expect to fail under any but the most dire circumstances. I suspect the issue we're seeing here is collateral damage from corruption inflicted elsewhere in the code.
In any case, it's probably worth improving the output of these specific failure points to help shed light on this set of circumstances, should it arise again.
Updated•12 years ago
|
Keywords: intermittent-failure
Assignee | ||
Comment 1•12 years ago
|
||
Assignee | ||
Updated•12 years ago
|
Assignee: nobody → adam
Assignee | ||
Updated•12 years ago
|
Attachment #749315 -
Flags: review?(ethanhugg)
Comment 2•12 years ago
|
||
Note: IIRC this was not ASAN and not part of a long sequence of mochitests; it was part of a non-ASAN last-minute check of mochitests before putting something up for review, IIRC. The problem did not repeat.
The comment about mkdir is likely the cause - I've noticed failed/crashed/^c'd (I think) sessions leaving /tmp/SIP-* dirs, which can cause mkdir to fail. If we're not using a mktmp/etc type thing, we need to handle collisions....
Comment 3•12 years ago
|
||
Comment on attachment 749315 [details] [diff] [review]
Enhance failure logging
Review of attachment 749315 [details] [diff] [review]:
-----------------------------------------------------------------
::: media/webrtc/signaling/src/sipcc/core/sipstack/sip_platform_task.c
@@ +506,5 @@
> PR_snprintf(stmpdir, sizeof(stmpdir), template, getpid());
>
> if (mkdir(stmpdir, 0700) != 0) {
> + CCSIP_DEBUG_ERROR(SIP_F_PREFIX"failed to create temp dir: %s (%s)",
> + fname, stmpdir, strerror(errno));
You may want to run a Try for Windows because this is the first instance of strerror in the signaling code that will be compiled on Windows. I know Windows compilers try to get you to use strerror_s, but not sure if it's just a warning or error they throw.
Attachment #749315 -
Flags: review?(ethanhugg) → review+
Assignee | ||
Comment 4•12 years ago
|
||
(In reply to Ethan Hugg [:ehugg] from comment #3)
> You may want to run a Try for Windows because this is the first instance of
> strerror in the signaling code that will be compiled on Windows.
Good catch. Here's the try push:
https://tbpl.mozilla.org/?tree=Try&rev=108061b1a44e
Assignee | ||
Comment 5•12 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #2)
> Note: IIRC this was not ASAN and not part of a long sequence of mochitests;
> it was part of a non-ASAN last-minute check of mochitests before putting
> something up for review, IIRC. The problem did not repeat.
>
> The comment about mkdir is likely the cause - I've noticed
> failed/crashed/^c'd (I think) sessions leaving /tmp/SIP-* dirs, which can
> cause mkdir to fail. If we're not using a mktmp/etc type thing, we need to
> handle collisions....
Yep, that's almost certainly the root cause here. The item in question is supposed to be initialized in the sip_platform_task_loop, but it happens after the return-due-to-failed-mkdir happens.
I'd like to land the diagnostic patch anyway, but I'll put together something that deals more gracefully with this situation.
Updated•12 years ago
|
Summary: Intermittent test_peerConnection_throwInCallbacks.html | Test timed out after "SDP: Invalid Config pointer." → SIPCC fails to start if it can't mkdir the directory it wants in /tmp
Assignee | ||
Comment 6•12 years ago
|
||
Assignee | ||
Updated•12 years ago
|
Attachment #749315 -
Attachment is obsolete: true
Assignee | ||
Comment 7•12 years ago
|
||
Comment on attachment 749515 [details] [diff] [review]
Use more robust mkdtemp() rather than relying on PID
I have an r+ from ehugg on the logging (and the push try shows that things work just fine under Windows) -- I just need a review on the actual fix, which migrates from using the PID for creating a unique tmpdir name to using the POSIX-standard mkdtemp().
Attachment #749515 -
Flags: review?(rjesup)
Updated•12 years ago
|
Attachment #749515 -
Flags: review?(rjesup) → review+
Assignee | ||
Comment 8•12 years ago
|
||
I'm pushing this one to try also (cross-platform this time) just to make sure we don't get any unexpected behavior from mkdtemp(). It might be POSIX, but that doesn't mean everyone got it right...
https://tbpl.mozilla.org/?tree=Try&rev=80d6726a15d8
Assignee | ||
Comment 9•12 years ago
|
||
Nice and green all across the board. Landing.
https://hg.mozilla.org/integration/mozilla-inbound/rev/f5e13f8aa8cf
Comment 10•12 years ago
|
||
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla24
Updated•12 years ago
|
Whiteboard: [WebRTC], [blocking-webrtc-] → [WebRTC], [blocking-webrtc-] [qa-]
You need to log in
before you can comment on or make changes to this bug.
Description
•