Closed
Bug 1112499
Opened 9 years ago
Closed 8 years ago
data written to pipe/socketpair get lost in b2g emulator
Categories
(Firefox OS Graveyard :: Emulator, defect)
Tracking
(firefox48 fixed)
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
firefox48 | --- | fixed |
People
(Reporter: bagder, Assigned: bagder)
References
Details
Attachments
(1 file, 1 obsolete file)
4.40 KB,
patch
|
mcmanus
:
review+
|
Details | Diff | Splinter Review |
In bug 1008091 I've been working on code that starts a new thread on FxOS (and Linux) and when shutting it down, it indicates death to the child thread using a pipe - the parent thread writes to it and the child thread poll()s it. When running this code in the b2g emulator (in the try-runs), at seemingly random intervals (but still frequent enough to be a problem) data that is written to the pipe doesn't arrive in the other end and thus it doesn't wake up poll() in the waiting thread. The result is a thread that doesn't terminate when it is supposed to. I've also tried to replace the pipe with a socketpair, but with seemlingly the same end result. This problem gets visible best when running xpcshell tests in the emulator. It is very rare that all tests manage to run without at least one of them getting hung on this problem and the test gets killed due to timeout (after 300 seconds). In bug 1008091 I intend to proceed and implement a work-around that detects when it runs in the emulator and avoids this situation only then, but it is awkward and ugly. The work-around writes to a shared variable and it makes the poll() timeout every NN milliseconds to check that variable. To repeat this problem, remove the variable-trick in nsNotifyAddrListener_Linux.cpp and run the xpcshell tests in the emulator, or before that patch has landed an earlier version of the patch that doesn't use the variable trick can be used.
Comment 1•9 years ago
|
||
I'm wondering if this is related to the other emulator flakiness we've been seeing for ages.
I had a look into this and I think I have an idea of what's going on. Here's part of nsNotifyAddrListener::Init(): ... 348: rv = NS_NewNamedThread("Link Monitor", getter_AddRefs(mThread), this); NS_ENSURE_SUCCESS(rv, rv); #ifdef MOZ_NUWA_PROCESS nsCOMPtr<nsIRunnable> runner = new NuwaMarkLinkMonitorThreadRunner(); mThread->Dispatch(runner, NS_DISPATCH_NORMAL); #endif 356: if (-1 == pipe(mShutdownPipe)) { return NS_ERROR_FAILURE; } ... And here's part of nsNotifyAddrListener::Run(): ... struct pollfd fds[2]; 260: fds[0].fd = mShutdownPipe[0]; ... while (!shutdown) { 279: int rc = EINTR_RETRY(poll(fds, 2, pollTimeout)); ... Note that mShutdownPipe isn't actually set up (on line 356) until after the thread is created (on line 348). If the thread runs before that call to pipe, line 260 will store -1 in fds[0].fd. Then, the call to poll on line 279 will be waiting on some nonexistent fd -1. If pollTimeout is -1, that call will never return, leading to hangs.
Assignee | ||
Comment 4•8 years ago
|
||
I can only but agree and say that it seems obvious now that you point it out. And probably the timing was so that this only hit us on the b2g emulator.
Assignee | ||
Comment 5•8 years ago
|
||
Just moving the setting up the pipe should do it then. You agree? I'll take it on a try-run adventure too and see how it plays.
Attachment #8667862 -
Flags: feedback?(dkeeler)
Comment on attachment 8667862 [details] [diff] [review] 0001-bug-1112499-set-up-shutdown-pipe-before-new-thread-s.patch Review of attachment 8667862 [details] [diff] [review]: ----------------------------------------------------------------- This should do it, yes. Do you think it's still necessary to have the shutdown indicator variable/B2G workaround?
Attachment #8667862 -
Flags: feedback?(dkeeler) → feedback+
Assignee | ||
Comment 7•8 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=e478d38c9368
Assignee | ||
Comment 8•8 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=34d2c0d7703e
Assignee | ||
Comment 9•8 years ago
|
||
This second try-run was a test that didn't use the shutdown variable and then we're back to getting lots of mochitest timeouts on the B2G ICS emulator. It would indicate that it doesn't receive the shutdown signal appropriately. Clearly there's something still lurking in there.
Those all look like bug 1186440.
Daniel, can you move forward with landing this? Moving the call to pipe is at least more correct than what's going on now (without this fix, on my linux box xpcshell tests will periodically hang for 5 minutes until they're killed and then (successfully) re-run, which hurts productivity). If there's more to investigate, we can do that in a follow-up bug.
Flags: needinfo?(daniel)
Assignee | ||
Comment 12•8 years ago
|
||
Okidoki, thanks for bringing this back to my attention - it had fallen through some cracks and ended up in a dusty corner of mine. Here's a squashed and rebased version of the patch. A try-run is in progress as I write this: https://treeherder.mozilla.org/#/jobs?repo=try&revision=dd9136e8c152 I'll PTO the next week so if you feel confident, feel free to move for commit. Otherwise I'll make sure it happens when I get back. I also realized I didn't get an r+ from anyone on this. Are you okay with it?
Attachment #8667862 -
Attachment is obsolete: true
Flags: needinfo?(daniel)
Attachment #8724257 -
Flags: review?(dkeeler)
Comment on attachment 8724257 [details] [diff] [review] v2-0001-bug-1112499-set-up-shutdown-pipe-before-new-threa.patch Review of attachment 8724257 [details] [diff] [review]: ----------------------------------------------------------------- LGTM, but I suppose a necko peer should sign off.
Attachment #8724257 -
Flags: review?(dkeeler) → review?(mcmanus)
(Also, thanks for the quick response here.)
Updated•8 years ago
|
Attachment #8724257 -
Flags: review?(mcmanus) → review+
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → daniel
Keywords: checkin-needed
Comment 15•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/c12df2e6a862
Keywords: checkin-needed
Comment 16•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/c12df2e6a862
You need to log in
before you can comment on or make changes to this bug.
Description
•