Closed
Bug 560630
Opened 13 years ago
Closed 13 years ago
NotificationService for e10s fennec initialized and used in different threads.
Categories
(Core :: IPC, defect)
Tracking
()
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
fennec | 2.0a1+ | --- |
People
(Reporter: romaxa, Assigned: romaxa)
References
Details
Attachments
(1 file, 2 obsolete files)
1.39 KB,
patch
|
bent.mozilla
:
review+
|
Details | Diff | Splinter Review |
On attempt to make full fix for bug 553904, we tried to add notification service initialization into MozillaChildThread, but this is causing some strange race condition problem. We should try to fix it.
Assignee | ||
Comment 1•13 years ago
|
||
Ok ,there is no race with MozillaChildThread, because it is not available... But bug 553904 is reproducible again... seems after fixing bug 528146 #0 NotificationService::HasKey (map=..., source=...) at mozilla/ipc/chromium/src/chrome/common/notification_service.cc:21 #1 0xb734dd3c in NotificationService::Notify (this=0x0, type=..., source=..., details=...) at mozilla/ipc/chromium/src/chrome/common/notification_service.cc:88 #2 0xb7341cfb in Run (this=0xae90d5e0) at mozilla/ipc/chromium/src/chrome/common/child_process_host.cc:44 #3 0xb7321112 in MessageLoop::RunTask (this=0xb2cff2b8, task=0xae90d5e0) at mozilla/ipc/chromium/src/base/message_loop.cc:336 #4 0xb7321942 in MessageLoop::DeferOrRunPendingTask (this=0xb2cff2b8, pending_task=...) at mozilla/ipc/chromium/src/base/message_loop.cc:344 #5 0xb7323797 in MessageLoop::DoWork (this=0xb2cff2b8) ---Type <return> to continue, or q <return> to quit--- at mozilla/ipc/chromium/src/base/message_loop.cc:444 #6 0xb7355e9c in base::MessagePumpLibevent::Run (this=0xb2d2e130, delegate=0xb2cff2b8) at mozilla/ipc/chromium/src/base/message_pump_libevent.cc:309 #7 0xb73217e9 in MessageLoop::RunInternal (this=0xb2cff2b8) at mozilla/ipc/chromium/src/base/message_loop.cc:216 #8 0xb732181c in MessageLoop::RunHandler (this=0xb2cff2b8) at mozilla/ipc/chromium/src/base/message_loop.cc:199 #9 0xb73218aa in MessageLoop::Run (this=0xb2cff2b8) at mozilla/ipc/chromium/src/base/message_loop.cc:173 #10 0xb7336811 in base::Thread::ThreadMain (this=0xb2d2d348) at mozilla/ipc/chromium/src/base/thread.cc:156 #11 0xb735699f in ThreadFunc (closure=0xb2d2d348) at mozilla/ipc/chromium/src/base/platform_thread_posix.cc:26 #12 0xb77281bf in start_thread () from /lib/libpthread.so.0 ---Type <return> to continue, or q <return> to quit--- #13 0xb4b4411e in clone () from /lib/libc.so.6
Blocks: 528146
Assignee | ||
Comment 2•13 years ago
|
||
And NotificationService created in different thread... #0 NotificationService::current () at mozilla/ipc/chromium/src/chrome/common/notification_service.cc:15 #1 0xb741cced in NotificationService (this=0xbfe5c6ec) at mozilla/ipc/chromium/src/chrome/common/notification_service.cc:32 #2 0xb67981d5 in XRE_InitChildProcess (aArgc=2, aArgv=0xbfe5d124, aProcess=GeckoProcessType_Content) at mozilla/toolkit/xre/nsEmbedFunctions.cpp:340 #3 0x08048fcb in main (argc=0, argv=0x0) at mozilla/ipc/app/MozillaRuntimeMain.cpp:87
Assignee | ||
Updated•13 years ago
|
Summary: Race condition happens when notification service added to MozillaChildThread → NotificationService for e10s fennec initialized and used in different threads.
Assignee | ||
Comment 3•13 years ago
|
||
Attachment #445543 -
Flags: review?(bent.mozilla)
Assignee | ||
Comment 4•13 years ago
|
||
Here is the backtrace of notification task creation. Breakpoint 1, ChildNotificationTask (this=0xb0f5a8b0, type=...) at ipc/chromium/src/chrome/common/child_process_host.cc:41 41 printf(">>>>>>Func:%s::%d pid:%i, thre:'%i':'%s'\n", __FUNCTION__, __LINE__, getpid(), PlatformThread::CurrentId(), IntToString(PlatformThread::CurrentId()).c_str()); (gdb) bt #0 ChildNotificationTask (this=0xb0f5a8b0, type=...) at ipc/chromium/src/chrome/common/child_process_host.cc:41 #1 ChildProcessHost::Notify (this=0xb0f5a8b0, type=...) at ipc/chromium/src/chrome/common/child_process_host.cc:134 #2 0xb7be0568 in ChildProcessHost::ListenerHook::OnChannelConnected (this=0xb0f5a8d4, peer_pid=26991) at ipc/chromium/src/chrome/common/child_process_host.cc:215 #3 0xb7bfff42 in IPC::Channel::ChannelImpl::ProcessIncomingMessages (this=0xb0f5b000) at ipc/chromium/src/chrome/common/ipc_channel_posix.cc:546 #4 0xb7c0043d in IPC::Channel::ChannelImpl::OnFileCanReadWithoutBlocking (this=0xb0f5b000, fd=19) at ipc/chromium/src/chrome/common/ipc_channel_posix.cc:---Type <return> to continue, or q <return> to quit--- 722 #5 0xb7bf4266 in base::MessagePumpLibevent::OnLibeventNotification (fd=19, flags=2, context=0xb0f5b000) at ipc/chromium/src/base/message_pump_libevent.cc:210 #6 0xb7ba57d7 in event_process_active (base=0xb360e400, flags=<value optimized out>) at ipc/chromium/src/third_party/libevent/event.c:385 #7 event_base_loop (base=0xb360e400, flags=<value optimized out>) at ipc/chromium/src/third_party/libevent/event.c:522 #8 0xb7bf44b0 in base::MessagePumpLibevent::Run (this=0xb362e130, delegate=0xb35ff2b8) at ipc/chromium/src/base/message_pump_libevent.cc:330 #9 0xb7bbfbf9 in MessageLoop::RunInternal (this=0xb35ff2b8) at ipc/chromium/src/base/message_loop.cc:218 #10 0xb7bbfc2c in MessageLoop::RunHandler (this=0xb35ff2b8) at ipc/chromium/src/base/message_loop.cc:200 #11 0xb7bbfcba in MessageLoop::Run (this=0xb35ff2b8) ---Type <return> to continue, or q <return> to quit---
Comment on attachment 445543 [details] [diff] [review] No ideas where is better to initialize notification service As I commented on IRC, the stack you've got here shows a task running on the wrong thread. I'm assuming the fix will be something unrelated to the notification service change here.
Attachment #445543 -
Flags: review?(bent.mozilla) → review-
Assignee | ||
Comment 6•13 years ago
|
||
Here is the backtrace with GTK build (latest e10s + latest e10s fennec) with enabled ipc plugins. Content process crashes 0xb70062e4 in std::_Rb_tree<unsigned int, std::pair<unsigned int const, ObserverList<NotificationObserver, false>*>, std::_Select1st<std::pair<unsigned int const, ObserverList<NotificationObserver, false>*> >, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, ObserverList<NotificationObserver, false>*> > >::_M_begin (this=0x0) at /usr/include/c++/4.4/bits/stl_tree.h:488 488 (this->_M_impl._M_header._M_parent); (gdb) bt #0 0xb70062e4 in std::_Rb_tree<unsigned int, std::pair<unsigned int const, ObserverList<NotificationObserver, false>*>, std::_Select1st<std::pair<unsigned int const, ObserverList<NotificationObserver, false>*> >, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, ObserverList<NotificationObserver, false>*> > >::_M_begin (this=0x0) at /usr/include/c++/4.4/bits/stl_tree.h:488 #1 0xb7005add in std::_Rb_tree<unsigned int, std::pair<unsigned int const, ObserverList<NotificationObserver, false>*>, std::_Select1st<std::pair<unsigned int const, ObserverList<NotificationObserver, false>*> >, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, ObserverList<NotificationObserver, false>*> > >::find (this=0x0, __k=@0xb42fef4c) at /usr/include/c++/4.4/bits/stl_tree.h:1434 #2 0xb7005414 in std::map<unsigned int, ObserverList<NotificationObserver, false>*, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, ObserverList<NotificationObserver, false>*> > >::find (this=0x0, __x=@0xb42fef4c) at /usr/include/c++/4.4/bits/stl_map.h:674 #3 0xb700456c in NotificationService::HasKey (map=@0x0, source=@0xb42fefd8) at ipc/chromium/src/chrome/common/notification_service.cc:21 #4 0xb7004a4d in NotificationService::Notify (this=0x0, type={value = NotificationType::CHILD_PROCESS_HOST_CONNECTED}, source=@0xb42ff04c, details=@0xb42ff048) at ipc/chromium/src/chrome/common/notification_service.cc:88 #5 0xb6ff189a in Run (this=0xad961be0) at ipc/chromium/src/chrome/common/child_process_host.cc:44 #6 0xb6fbc33e in MessageLoop::RunTask (this=0xb42ff1ec, task=0xad961be0) at ipc/chromium/src/base/message_loop.cc:336 #7 0xb6fbc3a7 in MessageLoop::DeferOrRunPendingTask (this=0xb42ff1ec, pending_task=@0xb42ff0cc) at ipc/chromium/src/base/message_loop.cc:344 #8 0xb6fbc77d in MessageLoop::DoWork (this=0xb42ff1ec) ---Type <return> to continue, or q <return> to quit--- at ipc/chromium/src/base/message_loop.cc:444 #9 0xb7012300 in base::MessagePumpLibevent::Run (this=0xb43111f0, delegate=0xb42ff1ec) at ipc/chromium/src/base/message_pump_libevent.cc:309 #10 0xb6fbbe9b in MessageLoop::RunInternal (this=0xb42ff1ec) at ipc/chromium/src/base/message_loop.cc:216 #11 0xb6fbbe1b in MessageLoop::RunHandler (this=0xb42ff1ec) at ipc/chromium/src/base/message_loop.cc:199 #12 0xb6fbbdbf in MessageLoop::Run (this=0xb42ff1ec) at ipc/chromium/src/base/message_loop.cc:173 #13 0xb6fe0f47 in base::Thread::ThreadMain (this=0xb430d708) at ipc/chromium/src/base/thread.cc:156 #14 0xb701293e in ThreadFunc (closure=0xb430d708) at ipc/chromium/src/base/platform_thread_posix.cc:26 #15 0xb775c585 in start_thread (arg=0xb42ffb70) at pthread_create.c:300 #16 0xb57cd29e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130 (gdb)
Assignee | ||
Comment 7•13 years ago
|
||
Here is the MOZ_IPC_MESSAGE_LOG=1 output before crash: [time:1275204729564457][PContentProcessParent] Sending Msg_NotifyRemotePrefObserver([TODO]) [time:1275204729565702][PContentProcessParent] Sending Msg_NotifyRemotePrefObserver([TODO]) pldhash: for the table at address 0xb09c9498, the given entrySize of 52 probably favors chaining over double hashing. [time:1275204729600517][PContentProcessParent] Sending Msg_NotifyRemotePrefObserver([TODO]) [time:1275204729606244][PContentProcessParent] Sending Msg_NotifyRemotePrefObserver([TODO]) Xlib: extension "RANDR" missing on display ":0.0". LoadPlugin() /usr/lib/mozilla/plugins/libflashplayer.so returned b44bf7e0 [time:1275204733696276][PPluginModuleParent] Sending Msg_NP_Initialize([TODO])
Assignee | ||
Comment 8•13 years ago
|
||
Message posted from here: http://mxr.mozilla.org/projects-central/source/electrolysis/ipc/chromium/src/chrome/common/child_process_host.cc#130 And io_loop taken from here in this case: http://mxr.mozilla.org/projects-central/source/electrolysis/ipc/chromium/src/chrome/common/child_process_host.cc#125
Assignee | ||
Comment 9•13 years ago
|
||
Attachment #445543 -
Attachment is obsolete: true
Attachment #448267 -
Flags: review?(bent.mozilla)
Comment on attachment 448267 [details] [diff] [review] This seems to work fine. We should never be asking for the IO loop in the first place. Please just do: #ifdef CHROMIUM_MOZILLA_BUILD MessageLoop* loop = mozilla::ipc::ProcessChild::message_loop(); loop->PostTask( #else
Attachment #448267 -
Flags: review?(bent.mozilla) → review-
Assignee | ||
Comment 11•13 years ago
|
||
Assignee: nobody → romaxa
Attachment #448267 -
Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #448276 -
Flags: review?(bent.mozilla)
Comment on attachment 448276 [details] [diff] [review] comment + include. Looks good!
Attachment #448276 -
Flags: review?(bent.mozilla) → review+
Comment on attachment 448276 [details] [diff] [review] comment + include. Er. Wait... I take that back. Does this work on a clobber build? I think we can't include ipc/glue headers in ipc/chromium...
Attachment #448276 -
Flags: review+ → review-
Assignee | ||
Comment 14•13 years ago
|
||
? +#include "mozilla/ipc/ProcessChild.h" #include "mozilla/ipc/BrowserProcessSubThread.h" BrowserProcessSubThread.h - this is also part of ipc/glue...
Comment on attachment 448276 [details] [diff] [review] comment + include. Ah, true... Maybe I'm just misremembering the way this works. Ok, let's give this a shot!
Attachment #448276 -
Flags: review- → review+
Assignee | ||
Updated•13 years ago
|
Keywords: checkin-needed
Assignee | ||
Comment 16•13 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/704e77dcb303
Comment 17•13 years ago
|
||
Lots of orange so I backed out: http://hg.mozilla.org/mozilla-central/rev/b412ab506b1e
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 18•13 years ago
|
||
Windows failure: http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1275458817.1275459675.1988.gz Another failure http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1275458819.1275459305.32562.gz
Assignee | ||
Comment 19•13 years ago
|
||
bent any ideas why it is working on linux, and does not work on win... is logic different there?
Assignee | ||
Comment 20•13 years ago
|
||
Ok, I sent patch to try server again... also need to check if ti still needed after latest changes
Assignee | ||
Comment 21•13 years ago
|
||
Ok, seems try server is happy with this patch now
Assignee | ||
Comment 22•13 years ago
|
||
And patch is still needed, this patch fixing next case: take fennec with this patch applied (try build http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1281130098.1281134004.5041.gz): enable plugins, enable remote plugins. run ./fennec -url "http://www.communitymx.com/content/source/E5141/wmodeopaque.htm" without this patch you will get just crash, with this patch page works fine
tracking-fennec: --- → ?
Updated•13 years ago
|
tracking-fennec: ? → 2.0a1+
Assignee | ||
Comment 23•13 years ago
|
||
http://hg.mozilla.org/projects/electrolysis/rev/769df85cde9a
Status: REOPENED → RESOLVED
Closed: 13 years ago → 13 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•