Closed Bug 110226 Opened 23 years ago Closed 23 years ago

nsObserverService corruption

Categories

(Core :: XPCOM, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla0.9.7

People

(Reporter: KaiE, Assigned: alecf)

Details

Attachments

(2 files, 1 obsolete file)

I wrote code that used AddObserver("xpcom-shutdown").
But the notification was never received.

During debugging I got the impression the internal list of nsObserverService
might become corrupted, as for the same class, the notification was sent
multiple times.

For fun, I changed my code to call AddObserver("xpcom-shutdown") multiple times.
Even twice were enough, and the notification was received.

Hunting this down, I saw that some implementations of Observe do use
RemoveObserver(this) while being notified.

For test purposes, I changed the implementation of nsObserverService. I set a
flag while notification is active. If RemoveObserver is reached while this flag
is set, I simply return without trying to update the list.

With this change, my code works. A simple registration is enough and my code
gets notified.

I don't know whether it is forbidden to call RemoveObserver during notification,
or whether nsObserverService::RemoveObserver is buggy.

David Baron suggested we could add some assertion checks that tell us that this
happens. I will attach a patch that implements this assertion. (Note that this
patch does not fix my problem.)
In my environment, three "remove during notification" were encountered:

1) nsExternalHelperAppService::Observe

2)
nsObserverService::RemoveObserver(nsIObserver *, char const *)+0x00000044
[/home/kai/new-trunk/mozilla-debug/dist/bin/libxpcom.so +0x0008F1C4]
nsEventStateManager::~nsEventStateManager(void)+0x00000432
[/home/kai/new-trunk/mozilla-debug/dist/bin/components/libgkcontent.so +0x0027224E]
nsEventStateManager::Release(void)+0x00000099
[/home/kai/new-trunk/mozilla-debug/dist/bin/components/libgkcontent.so +0x00272575]
nsCOMPtr<nsIEventStateManager>::~nsCOMPtr(void)+0x0000004A
[/home/kai/new-trunk/mozilla-debug/dist/bin/components/libgklayout.so +0x00338F3E]
nsPresContext::~nsPresContext(void)+0x00000487
[/home/kai/new-trunk/mozilla-debug/dist/bin/components/libgklayout.so +0x002E62BF]
GalleyContext::~GalleyContext(void)+0x00000036
[/home/kai/new-trunk/mozilla-debug/dist/bin/components/libgklayout.so +0x002E37E2]
nsPresContext::Release(void)+0x000000A1
[/home/kai/new-trunk/mozilla-debug/dist/bin/components/libgklayout.so +0x002E6545]
nsDOMEvent::~nsDOMEvent(void)+0x000000F1
[/home/kai/new-trunk/mozilla-debug/dist/bin/components/libgkcontent.so +0x00287369]
nsDOMEvent::Release(void)+0x0000009F
[/home/kai/new-trunk/mozilla-debug/dist/bin/components/libgkcontent.so +0x00287603]
XPCJSRuntime::GCCallback(JSContext *, JSGCStatus)+0x000007E7
[/home/kai/new-trunk/mozilla-debug/dist/bin/components/libxpconnect.so +0x0005E3F3]
nsJSEnvironment::GetScriptingEnvironment(void)+0x000000CC
[/home/kai/new-trunk/mozilla-debug/dist/bin/components/libjsdom.so +0x0006162C]
js_GC+0x00000D9F [/home/kai/new-trunk/mozilla-debug/dist/bin/libmozjs.so
+0x0004857B]
js_ForceGC+0x0000005C [/home/kai/new-trunk/mozilla-debug/dist/bin/libmozjs.so
+0x000477C8]
JS_GC+0x00000070 [/home/kai/new-trunk/mozilla-debug/dist/bin/libmozjs.so
+0x00014584]
nsDOMSOFactory::Observe(nsISupports *, char const *, unsigned short const
*)+0x000000D8 [/home/kai/new-trunk/mozilla-debug/dist/bin/components/libjsdom.so
+0x00058A60]

3)

nsObserverService::RemoveObserver(nsIObserver *, char const *)+0x00000044
[/home/kai/new-trunk/mozilla-debug/dist/bin/libxpcom.so +0x0008F1C4]
nsCacheProfilePrefObserver::Remove(void)+0x0000018B
[/home/kai/new-trunk/mozilla-debug/dist/bin/components/libnkcache.so +0x0001BF63]
nsCacheService::Shutdown(void)+0x00000074
[/home/kai/new-trunk/mozilla-debug/dist/bin/components/libnkcache.so +0x0001D59C]
nsCacheProfilePrefObserver::Observe(nsISupports *, char const *, unsigned short
const *)+0x000000B4
[/home/kai/new-trunk/mozilla-debug/dist/bin/components/libnkcache.so +0x0001C3DC]
oh! Yeah, this is a known bug.. (sorry I didn't understand what you meant on
IRC, it's more clearly explained here)
I like what you're attempting here, but I think we're going to hit a lot of
false positives... because its possible to addObserver() on one list when you're
notifying on another list. if you can make this mNotifying somehow know which
list we're notifying (i.e. store it in the observerlist or something) then I'll
review that..
kaie, could you try the patch in bug 105719
dougt: I tested the patch, it does not seem to help. I still do not get notified.

But this might be, because with the patch from bug 105719 Mozilla hangs on
shutdown. Right after the console says "WebShell 0", the application stops
without any CPU cycles being used.
alecf: Good idea. The attached patch remembers the pointer to the notification
topic while notification is active, and only triggers the assertion if
adding/removing for the same topic is tried. I assume topic to list is a 1:1
relationship.
Attachment #57897 - Attachment is obsolete: true
alecf: Do you want to review the new patch?
I guess this seems reasonable, but what happens if notifying on one topic causes
notification to happen on another topic? lots of possibilities here. That's why
I was hoping for something that was based/stored in the observer list.
That said, if there's no way to do that (I haven't seen any indication that
we've tried) I guess we can go with this temporary solution.
> I was hoping for something that was based/stored in the observer list.

Me too. Since we have an object that is specialized for this purpose,
nsObserverList, it could keep track of what ObserverListEnumerators it had made.
It adds to its list on GetObserverList. We would need to add another method,
ReleaseIterator() to nsObserverList. Between those two calls, any call to
AddObserver/ReleaseObserver goes through the list of currently attached
iterators and tells each at what index an item is being added or removed. The
iterator can then adjust its index accordingly.
The simplest way to fix this is to notice when the observer list has changed out
from under the enumeration of the list and adjust the enumeration index
accordingly. I don't see why we need to get too fancy here, but I haven't tried
to fix it (I fixed it before for the 99% case of the current item getting
removed during notification just by noticing if the current item still matches
the item that got notified, but doug removed that code when he reworked it.)
david, didn't intent to.. sorry.  do you want to fix my mistake or should I fix
this?
 
Target Milestone: --- → mozilla0.9.8
Doug, if you have the time, I'd appreciate it if you could fix it.
will do.  I am upset that I broke this in the first place.  :-/
ok, I got sick of dealing with this so I fixed this the "right" way - notify
observers in reverse order, so that if they remove themselves, we're ok because
we've already passed them in the list.
Comment on attachment 59158 [details] [diff] [review]
notify observers in reverse order

+    mValueArray->GetElementAt(mIndex-1, aResult);
+    mIndex--;



should be 

mValueArray->GetElementAt(--mIndex, aResult);
Attachment #59158 - Flags: review+
Comment on attachment 59158 [details] [diff] [review]
notify observers in reverse order

sr=bienvenu, with doug's comment; I've been running with this for a while w/o
any problems
Attachment #59158 - Flags: superreview+
cool thanks - checked in (with the minor tweak)
Assignee: dougt → alecf
Target Milestone: mozilla0.9.8 → mozilla0.9.7
fix is in
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Are there users of this that were banking on the "other" firing order?
They better not have :)
If so it wouldn't have been intentional...so if there are any such consumers,
then this will help us flush them out...

 bienvenu and I were both running with this, and since this also fixed 40 bytes
of leaks, I think its worth finding any regressions with still two more weeks
left in the milestone... this might even fix a few crash-on-shutdown bugs
Alec, shouldn't you remove

 PRUint32 cnt;
 nsresult rv = mValueArray->Count(&cnt);

too (from both ::HasMoreElements and ::GetNext)?  


You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: