Closed Bug 845190 Opened 7 years ago Closed 6 years ago

Intermittent test_645970.js,test_logging.js,test_startup_session_sync.js,test_startup_nosession_sync.js | application crashed [@ mozalloc_abort(char const*)] from ASSERTION: thread pool wasn't shutdown

Categories

(Core :: DOM: Workers, defect, critical)

x86
All
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla27
Tracking Status
firefox25 --- wontfix
firefox26 --- wontfix
firefox27 --- fixed
firefox-esr24 --- wontfix

People

(Reporter: philor, Assigned: Yoric)

References

(Depends on 1 open bug)

Details

(Keywords: assertion, crash, intermittent-failure)

Attachments

(1 file, 16 obsolete files)

2.06 KB, patch
Yoric
: review+
Details | Diff | Splinter Review
https://tbpl.mozilla.org/php/getParsedLog.php?id=20090551&tree=Mozilla-Inbound
Ubuntu 12.04 mozilla-inbound debug test xpcshell on 2013-02-25 20:36:43 PST for push 70814e7dbcb8
slave: tst-linux32-ec2-056

21:46:56     INFO -  TEST-PASS | /builds/slave/test/build/tests/xpcshell/tests/toolkit/components/search/tests/xpcshell/test_645970.js | [run_test : 23] [xpconnect wrapped nsISearchEngine @ 0x9e367e8 (native @ 0x9e365b8)] != null
21:46:56     INFO -  *** Search: SRCH_SVC_saveSortedEngineList: starting
21:46:56     INFO -  TEST-INFO | (xpcshell/head.js) | test 1 finished
21:46:56     INFO -  TEST-INFO | (xpcshell/head.js) | exiting test
21:46:56     INFO -  TEST-PASS | (xpcshell/head.js) | 1 (+ 1) check(s) passed
21:46:56     INFO -  TEST-INFO | (xpcshell/head.js) | 0 check(s) todo
21:46:56     INFO -  WARNING: nsExceptionService ignoring thread destruction after shutdown: file ../../../xpcom/base/nsExceptionService.cpp, line 167
21:46:56     INFO -  ###!!! ASSERTION: thread pool wasn't shutdown: '!mPool', file ../../../../netwerk/base/src/nsStreamTransportService.cpp, line 435
21:46:56     INFO -  UNKNOWN [/builds/slave/test/build/application/firefox/libxul.so +0x003425C9]
21:46:56     INFO -  UNKNOWN [/builds/slave/test/build/application/firefox/libxul.so +0x00342674]
21:46:56     INFO -  UNKNOWN [/builds/slave/test/build/application/firefox/libxul.so +0x003F948F]
21:46:56     INFO -  UNKNOWN [/builds/slave/test/build/application/firefox/libxul.so +0x015C46E6]
21:46:56     INFO -  UNKNOWN [/builds/slave/test/build/application/firefox/libxul.so +0x01608089]
21:46:56     INFO -  UNKNOWN [/builds/slave/test/build/application/firefox/libxul.so +0x016050B1]
21:46:56     INFO -  UNKNOWN [/builds/slave/test/build/application/firefox/libxul.so +0x015CA3AF]
21:46:56     INFO -  UNKNOWN [/builds/slave/test/build/application/firefox/libxul.so +0x01605B47]
21:46:56     INFO -  UNKNOWN [/builds/slave/test/build/application/firefox/libxul.so +0x01605BB4]
21:46:56     INFO -  UNKNOWN [/builds/slave/test/build/application/firefox/libxul.so +0x015CFB67]
21:46:56     INFO -  NS_ShutdownXPCOM+0x0000001A [/builds/slave/test/build/application/firefox/libxpcom.so +0x00001DD6]
21:46:56     INFO -  UNKNOWN [/builds/slave/test/build/application/firefox/xpcshell +0x00009388]
21:46:56     INFO -  ###!!! ASSERTION: thread pool wasn't shutdown: '!mPool', file ../../../../netwerk/base/src/nsStreamTransportService.cpp, line 435
21:46:56     INFO -  <<<<<<<
21:47:16     INFO -  PROCESS-CRASH | /builds/slave/test/build/tests/xpcshell/tests/toolkit/components/search/tests/xpcshell/test_645970.js | application crashed [@ mozalloc_abort(char const*)]
21:47:16     INFO -  Crash dump filename: /builds/slave/test/build/tests/xpcshell/tests/toolkit/components/search/tests/xpcshell/24d47b23-07e3-f49d-4b785bfa-18f81c59.dmp
21:47:16     INFO -  Operating system: Linux
21:47:16     INFO -                    0.0.0 Linux 3.2.0-23-generic-pae #36-Ubuntu SMP Tue Apr 10 22:19:09 UTC 2012 i686
21:47:16     INFO -  CPU: x86
21:47:16     INFO -       GenuineIntel family 6 model 26 stepping 5
21:47:16     INFO -       1 CPU
21:47:16     INFO -  Crash reason:  SIGSEGV
21:47:16     INFO -  Crash address: 0x0
21:47:16     INFO -  Thread 0 (crashed)
21:47:16     INFO -   0  libmozalloc.so!mozalloc_abort(char const*) [mozalloc_abort.cpp : 30 + 0x0]
21:47:16     INFO -      eip = 0xb7722f0d   esp = 0xbfa13ec0   ebp = 0xbfa13ed8   ebx = 0xb7724120
21:47:16     INFO -      esi = 0xb4868d9c   edi = 0xbfa13f08   eax = 0x0000000a   ecx = 0xffffffff
21:47:16     INFO -      edx = 0xb48698ac   efl = 0x00210282
21:47:16     INFO -      Found by: given as instruction pointer in context
21:47:16     INFO -   1  libxul.so!NS_DebugBreak_P [nsDebugImpl.cpp:70814e7dbcb8 : 422 + 0x5]
21:47:16     INFO -      eip = 0xb5ffd692   esp = 0xbfa13ee0   ebp = 0xbfa14318
21:47:16     INFO -      Found by: previous frame's frame pointer
21:47:16     INFO -   2  libxul.so!nsStreamTransportService::~nsStreamTransportService [nsStreamTransportService.cpp:70814e7dbcb8 : 435 + 0x23]
21:47:16     INFO -      eip = 0xb4d265c9   esp = 0xbfa14320   ebp = 0xbfa14348
21:47:16     INFO -      Found by: previous frame's frame pointer
21:47:16     INFO -   3  libxul.so!nsStreamTransportService::Release() [nsStreamTransportService.cpp:70814e7dbcb8 : 457 + 0x5]
21:47:16     INFO -      eip = 0xb4d26674   esp = 0xbfa14350   ebp = 0xbfa14378
21:47:16     INFO -      Found by: previous frame's frame pointer
21:47:16     INFO -   4  libxul.so!nsCOMPtr_base::assign_assuming_AddRef(nsISupports*) [nsCOMPtr.h:70814e7dbcb8 : 442 + 0x7]
21:47:16     INFO -      eip = 0xb4ddd48f   esp = 0xbfa14380   ebp = 0xbfa143a8
21:47:16     INFO -      Found by: previous frame's frame pointer
21:47:16     INFO -   5  libxul.so!nsCOMPtr_base::assign_with_AddRef(nsISupports*) [nsCOMPtr.cpp:70814e7dbcb8 : 49 + 0x8]
21:47:16     INFO -      eip = 0xb5fa86e6   esp = 0xbfa143b0   ebp = 0xbfa143c8
21:47:16     INFO -      Found by: previous frame's frame pointer
21:47:16     INFO -   6  libxul.so!FreeFactoryEntries [nsCOMPtr.h:70814e7dbcb8 : 947 + 0x9]
21:47:16     INFO -      eip = 0xb5fec089   esp = 0xbfa143d0   ebp = 0xbfa143e8
21:47:16     INFO -      Found by: previous frame's frame pointer
21:47:16     INFO -   7  libxul.so!nsBaseHashtable<nsIDHashKey, nsFactoryEntry*, nsFactoryEntry*>::s_EnumReadStub(PLDHashTable*, PLDHashEntryHdr*, unsigned int, void*) [nsBaseHashtable.h:70814e7dbcb8 : 400 + 0x2]
21:47:16     INFO -      eip = 0xb5fe90b1   esp = 0xbfa143f0   ebp = 0xbfa14408
21:47:16     INFO -      Found by: previous frame's frame pointer
21:47:16     INFO -   8  libxul.so!PL_DHashTableEnumerate [pldhash.cpp:70814e7dbcb8 : 717 + 0x11]
21:47:16     INFO -      eip = 0xb5fae3af   esp = 0xbfa14410   ebp = 0xbfa14458
21:47:16     INFO -      Found by: previous frame's frame pointer
josh, jduell: this is one of our most frequent intermittent orange bugs. We're tripping an assertion in nsStreamTransportService:
###!!! ASSERTION: thread pool wasn't shutdown: '!mPool', file ../../../../netwerk/base/src/nsStreamTransportService.cpp, line 435

in this xpcshell test:
http://mxr.mozilla.org/mozilla-central/source/toolkit/components/search/tests/xpcshell/test_645970.js

That test is manually sending out a "quit-application" observer service notification, so I wonder if it's screwing up nsStreamTransportService's expectations around shutdown ordering?

Can you take a look at this (or find someone else who can)?
Josh: can you get someone to take a look at this? See comment 103 for my brief synopsis.
Flags: needinfo?(joshmoz)
Test disabled for Linux debug for now, until we get get responses to comment 103 / comment 115):
https://hg.mozilla.org/integration/mozilla-inbound/rev/509b300bda68
Whiteboard: [test disabled on Linux debug][leave open]
Looking into an owner now.
Flags: needinfo?(joshmoz)
Flags: needinfo?(joshmoz)
Flags: needinfo?(joshmoz)
nsStreamTransportService gets first instantiated after NS_XPCOM_SHUTDOWN_THREADS_OBSERVER_ID.


When doing check-interactive of the failing test, after test has run and I type quit() I'm getting the following stack (windows):

 	xul.dll!nsStreamTransportService::Init()  Line 441	C++
 	xul.dll!nsStreamTransportServiceConstructor(nsISupports * aOuter=0x00000000, const nsID & aIID={...}, void * * aResult=0x0034f568)  Line 67 + 0x79 bytes	C++
 	xul.dll!mozilla::GenericFactory::CreateInstance(nsISupports * aOuter=0x00000000, const nsID & aIID={...}, void * * aResult=0x0034f568)  Line 16 + 0x14 bytes	C++
 	xul.dll!nsComponentManagerImpl::CreateInstance(const nsID & aClass={...}, nsISupports * aDelegate=0x00000000, const nsID & aIID={...}, void * * aResult=0x0034f568)  Line 1009 + 0x25 bytes	C++
 	xul.dll!nsComponentManagerImpl::GetService(const nsID & aClass={...}, const nsID & aIID={...}, void * * result=0x0034f5c0)  Line 1257 + 0x34 bytes	C++
 	xul.dll!CallGetService(const nsID & aCID={...}, const nsID & aIID={...}, void * * aResult=0x0034f5c0)  Line 52	C++
 	xul.dll!nsGetServiceByCIDWithError::operator()(const nsID & aIID={...}, void * * aInstancePtr=0x0034f5c0)  Line 234 + 0x13 bytes	C++
 	xul.dll!nsCOMPtr<nsIStreamTransportService>::assign_from_gs_cid_with_error(const nsGetServiceByCIDWithError & gs={...}, const nsID & aIID={...})  Line 1193 + 0xf bytes	C++
 	xul.dll!nsCOMPtr<nsIStreamTransportService>::nsCOMPtr<nsIStreamTransportService>(const nsGetServiceByCIDWithError & gs={...})  Line 581	C++
 	xul.dll!nsInputStreamPump::AsyncRead(nsIStreamListener * listener=0x0427aa98, nsISupports * ctxt=0x00000000)  Line 307	C++
 	xul.dll!nsBaseChannel::BeginPumpingData()  Line 223 + 0x3a bytes	C++
 	xul.dll!nsBaseChannel::AsyncOpen(nsIStreamListener * listener=0x04a6c988, nsISupports * ctxt=0x02b61480)  Line 600 + 0xb bytes	C++
>	xul.dll!`anonymous namespace'::ScriptLoaderRunnable::RunInternal()  Line 324 + 0x34 bytes	C++
 	xul.dll!`anonymous namespace'::ScriptLoaderRunnable::Run()  Line 159 + 0xb bytes	C++
 	xul.dll!nsThread::ProcessNextEvent(bool mayWait=false, bool * result=0x0034f7ff)  Line 627 + 0x19 bytes	C++
 	xul.dll!NS_ProcessPendingEvents_P(nsIThread * thread=0x01d10750, unsigned int timeout=4294967295)  Line 188 + 0x14 bytes	C++
 	xul.dll!mozilla::ShutdownXPCOM(nsIServiceManager * servMgr=0x00000000)  Line 564	C++
 	xul.dll!NS_ShutdownXPCOM_P(nsIServiceManager * servMgr=0x00000000)  Line 519 + 0x9 bytes	C++
 	xpcom.dll!NS_ShutdownXPCOM(nsIServiceManager * svcMgr=0x00000000)  Line 134 + 0xa bytes	C++
 	xpcshell.exe!NS_internal_main(int argc=24, char * * argv=0x01c4f6bc, char * * envp=0x01c431b8)  Line 1973 + 0x8 bytes	C++
 	xpcshell.exe!wmain(int argc=31, wchar_t * * argv=0x01cfbc68)  Line 109 + 0x16 bytes	C++


This is just before NS_XPCOM_SHUTDOWN_THREADS_OBSERVER_ID is fired.  I can also see the following stack (events fired from inside call to NS_XPCOM_SHUTDOWN_THREADS_OBSERVER_ID notification:



>	xul.dll!`anonymous namespace'::ScriptLoaderRunnable::RunInternal()  Line 243	C++
 	xul.dll!`anonymous namespace'::ScriptLoaderRunnable::Run()  Line 159 + 0xb bytes	C++
 	xul.dll!nsThread::ProcessNextEvent(bool mayWait=true, bool * result=0x0034f53b)  Line 627 + 0x19 bytes	C++
 	xul.dll!NS_ProcessNextEvent_P(nsIThread * thread=0x01d10750, bool mayWait=true)  Line 238 + 0x17 bytes	C++
 	xul.dll!mozilla::dom::workers::RuntimeService::Cleanup()  Line 1192 + 0xe bytes	C++
 	xul.dll!mozilla::dom::workers::RuntimeService::Observe(nsISupports * aSubject=0x00000000, const char * aTopic=0x556e7cdc, const wchar_t * aData=0x00000000)  Line 1417	C++
 	xul.dll!nsObserverList::NotifyObservers(nsISupports * aSubject=0x00000000, const char * aTopic=0x556e7cdc, const wchar_t * someData=0x00000000)  Line 100	C++
 	xul.dll!nsObserverService::NotifyObservers(nsISupports * aSubject=0x00000000, const char * aTopic=0x556e7cdc, const wchar_t * someData=0x00000000)  Line 164	C++
 	xul.dll!mozilla::ShutdownXPCOM(nsIServiceManager * servMgr=0x00000000)  Line 570	C++
 	xul.dll!NS_ShutdownXPCOM_P(nsIServiceManager * servMgr=0x00000000)  Line 519 + 0x9 bytes	C++
 	xpcom.dll!NS_ShutdownXPCOM(nsIServiceManager * svcMgr=0x00000000)  Line 134 + 0xa bytes	C++
 	xpcshell.exe!NS_internal_main(int argc=24, char * * argv=0x01c4f6bc, char * * envp=0x01c431b8)  Line 1973 + 0x8 bytes	C++
 	xpcshell.exe!wmain(int argc=31, wchar_t * * argv=0x01cfbc68)  Line 109 + 0x16 bytes	C++


If there is some specific change in order of observers to NS_XPCOM_SHUTDOWN_THREADS_OBSERVER_ID on Ubuntu or it's just a timing issue of some kind, I don't know.

I don't see any bug that would be landed around the time this started to be so intensive, maybe except Bug 822399 (cc Olli).  That is a wild theory only.


This is definitely not caused by some network change.
(In reply to Honza Bambas (:mayhemer) from comment #180)
> When doing check-interactive of the failing test, after test has run and I
> type quit() I'm getting the following stack (windows):

I've seen an issue which looked imilar to this one while working on a different bug. That issue was fixed in bug 835730 and boiled down to this: nsBrowserGlue was tearing down some stuff when receiving an xpcom-shutdown event; those components were initialized during profile loading and in some tests this wasn't happening. In the end various shutdown functions were called on components that had never been initialized leading to an error very much like this one when manually calling _quit() interactively in the affected xpcshell test.

Since the problem might be something similar I'll have a look later today and see if can track it down.
Thanks, Gabriele.
I've gathered a few more clues about this failure and they point in a direction similar my previous experience:

- The assertion is triggered by executing a script (osfile_async_worker.js) as part of an xpcom-shutdown observer after ShutdownXPCOM() has been called

- This is causing a ScriptLoaderRunnable class to be instanced three times, it is from here that the nsStreamTransportService code is being called. The first time it runs, the |mPool| field is non-null and everything goes fine, this is during ShutdownXPCOM(). The second and third times it's in the "xpcom-shutdown" observer and that's where the assertion is being triggered

- In every case this is happening in a DOM worker thread, this is created here as part of the test:

https://mxr.mozilla.org/mozilla-central/source/toolkit/components/search/nsSearchService.js#2708

      let promise = OS.File.writeAtomic(path, data, { tmpPath: path + ".tmp"});

So my diagnose is that some code is hooking the "xpcom-shutdown" event and running some code that has already been run once cleaning up after itself so the following runs are causing the issue. The culprit seems to lie somewhere in the OS.File module so that's where I'll be looking next.
https://tbpl.mozilla.org/php/getParsedLog.php?id=21505860&tree=Mozilla-Inbound
Summary: Intermittent test_645970.js | application crashed [@ mozalloc_abort(char const*)] from ASSERTION: thread pool wasn't shutdown → Intermittent test_645970.js,test_logging.js | application crashed [@ mozalloc_abort(char const*)] from ASSERTION: thread pool wasn't shutdown
It looks like enabling the ECMAScript Internationalization API (part of SpiderMonkey, uses the big third-party ICU library) may make this failure permanent instead of intermittent on Windows - I see two complete sets of failures on two try runs:
April 9: https://tbpl.mozilla.org/?tree=Try&rev=4cc524a3f8ae
April 8: https://tbpl.mozilla.org/?tree=Try&rev=1fc6411a138d

I don't see these failures on earlier try runs:
April 4: https://tbpl.mozilla.org/?tree=Try&rev=7aa110e2dc60
so something else must have happened between April 4 and April 8 (the only differences on my side are restructured configure/make files).

To reproduce the failures, apply patches b2470ed032d1 and edfeec4def28 of the first-listed try run to current m-c.
(In reply to Gabriele Svelto [:gsvelto] from comment #183)
> I've gathered a few more clues about this failure and they point in a
> direction similar my previous experience:
> 
> - The assertion is triggered by executing a script (osfile_async_worker.js)
> as part of an xpcom-shutdown observer after ShutdownXPCOM() has been called
> 
> - This is causing a ScriptLoaderRunnable class to be instanced three times,
> it is from here that the nsStreamTransportService code is being called. The
> first time it runs, the |mPool| field is non-null and everything goes fine,
> this is during ShutdownXPCOM(). The second and third times it's in the
> "xpcom-shutdown" observer and that's where the assertion is being triggered
> 
> - In every case this is happening in a DOM worker thread, this is created
> here as part of the test:
> 
> https://mxr.mozilla.org/mozilla-central/source/toolkit/components/search/
> nsSearchService.js#2708
> 
>       let promise = OS.File.writeAtomic(path, data, { tmpPath: path +
> ".tmp"});
> 
> So my diagnose is that some code is hooking the "xpcom-shutdown" event and
> running some code that has already been run once cleaning up after itself so
> the following runs are causing the issue. The culprit seems to lie somewhere
> in the OS.File module so that's where I'll be looking next.

There is some code hooking web-workers-shutdown in OS.File: 
http://dxr.mozilla.org/mozilla-central/toolkit/components/osfile/osfile_async_front.jsm#l152

Could this be the cause?
Blocks: 853301
(In reply to Norbert Lindenberg from comment #192)
> It looks like enabling the ECMAScript Internationalization API (part of
> SpiderMonkey, uses the big third-party ICU library) may make this failure
> permanent instead of intermittent on Windows [...]

Interesting, I hadn't tried on Windows yet but I could reproduce it fairly consistently on Linux already.

(In reply to David Rajchenbach Teller [:Yoric] from comment #193)
> There is some code hooking web-workers-shutdown in OS.File: 
> http://dxr.mozilla.org/mozilla-central/toolkit/components/osfile/
> osfile_async_front.jsm#l152
> 
> Could this be the cause?

It might, thanks for the heads up.
It seems to me that there are two separate bugs - the failure logs above are always either Linux/test_645970.js or Windows/test_logging.js, never Linux/test_logging.js or Windows/test_645970.js. The ones I've seen in my try runs were always of the Windows/test_logging.js variant. Note that test_logging.js is a new test which reached mozilla-central only on April 5.
Partial fix.
Attachment #736684 - Flags: review?(jonathan.laver)
Comment on attachment 736684 [details] [diff] [review]
Made test_logging.js correctly ignore messages it wants to ignore

Review of attachment 736684 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me.
Comment on attachment 736684 [details] [diff] [review]
Made test_logging.js correctly ignore messages it wants to ignore

Chiming in: rabbidrabbit doesn't have review rights on this part of the code, although he authored it.
Attachment #736684 - Flags: review?(jonathan.laver) → feedback?(jonathan.laver)
Comment on attachment 736684 [details] [diff] [review]
Made test_logging.js correctly ignore messages it wants to ignore

Review of attachment 736684 [details] [diff] [review]:
-----------------------------------------------------------------

Patch looks like it will fix something. I do not understand the cause of the crash well enough to know whether this will be sufficient, though.
Attachment #736684 - Flags: review+
(In reply to Gabriele Svelto [:gsvelto] from comment #183)
> - In every case this is happening in a DOM worker thread, this is created
> here as part of the test:
> 
> https://mxr.mozilla.org/mozilla-central/source/toolkit/components/search/
> nsSearchService.js#2708
> 
>       let promise = OS.File.writeAtomic(path, data, { tmpPath: path +
> ".tmp"});

Do I understand correctly that this code is executed upon shutdown?
If so, we have a weird race condition, because this is code that is meant to be executed during initialization.
Flags: needinfo?(jwalden+bmo)
(In reply to David Rajchenbach Teller [:Yoric] from comment #202)
> Do I understand correctly that this code is executed upon shutdown?
> If so, we have a weird race condition, because this is code that is meant to
> be executed during initialization.

No, it's more complicated and re-reading my comment I didn't describe the mechanism very well :-) This code only triggers the creation of a DOM Worker thread; then during shutdown this thread goes on to execute another script which is causing the problem. I was able to analyze the problem better and I found where the whole chain starts, it's here in RuntimeService::Cleanup():

https://mxr.mozilla.org/mozilla-central/source/dom/workers/RuntimeService.cpp#1127

Which is being called by an "xpcom-shutdown" observer (which is what I first saw when testing):

https://mxr.mozilla.org/mozilla-central/source/dom/workers/RuntimeService.cpp#1419

So for some reason the following is happening:

- OS.File starts up, creates a DOM worker and registers the "web-workers-shutdown" event
- During shutdown the "xpcom-shutdown" event is dispatched
- RuntimeService picks it up, calls Cleanup() which sends "web-workers-shutdown"
- osfile_async_front.jsm:webWorkersShutdownObserver() is invoked in response to the event
- This sends a "System_shutdown" message to the worker
- The shutdown code in osfile_async_worker.js is called, this is what's triggering the execution of ScriptLoaderRunnable

Now, the first time this runs it's fine; for some reason however it seems that the last four steps are repeated more than once causing the web worker to be shutdown more than once and that trips the assertion.
Gabriele, if your analysis is correct (and it looks correct), this should fix the bug.
Attachment #737865 - Flags: review?(nfroyd)
Attachment #737865 - Flags: feedback?(yura.zenevich)
Flags: needinfo?(jwalden+bmo)
Attachment #737865 - Flags: feedback?(yura.zenevich) → feedback+
Comment on attachment 737865 [details] [diff] [review]
Unregistering the shutdown observer after the first "web-workers-shutdown"

Review of attachment 737865 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/osfile/osfile_async_front.jsm
@@ +183,5 @@
> +  if (aTopic == WEB_WORKERS_SHUTDOWN_TOPIC) {
> +    // Failing to unregister can cause crashes in some tests
> +    // (probably not in production), see bug 845190
> +    Services.obs.removeObserver(webWorkersShutdownObserver, WEB_WORKERS_SHUTDOWN_TOPIC);
> +  }

I don't think you really need the commit; unregistering observers is pretty standard.

But why not unregister TEST_WEB_WORKERS_SHUTDOWN_TOPIC?  Actually, why is *registering* the observer on the test topic even done unconditionally in this file?

@@ +203,5 @@
>    });
>  };
>  
>  // Attaching an observer listening to the "web-workers-shutdown".
> +// The observer is removed once "web-workers-shutdown" is received.

No comment necessary.  The previous one could go too.
Attachment #737865 - Flags: review?(nfroyd) → feedback+
(In reply to Nathan Froyd (:froydnj) from comment #208)
> Comment on attachment 737865 [details] [diff] [review]
> Unregistering the shutdown observer after the first "web-workers-shutdown"
> 
> Review of attachment 737865 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/components/osfile/osfile_async_front.jsm
> @@ +183,5 @@
> > +  if (aTopic == WEB_WORKERS_SHUTDOWN_TOPIC) {
> > +    // Failing to unregister can cause crashes in some tests
> > +    // (probably not in production), see bug 845190
> > +    Services.obs.removeObserver(webWorkersShutdownObserver, WEB_WORKERS_SHUTDOWN_TOPIC);
> > +  }
> 
> I don't think you really need the commit; unregistering observers is pretty
> standard.

Ok with s/commit/comment/.

> But why not unregister TEST_WEB_WORKERS_SHUTDOWN_TOPIC?

Well, because I do not want to enforce the fact that we are only ever going to run a single test.

> Actually, why is
> *registering* the observer on the test topic even done unconditionally in
> this file?

Because, at this stage, we don't really have a way to determine whether we are being executed in a testing setting. I guess we could wrap this in a preference, but I would rather do this as a followup, if that's ok with you.

> 
> @@ +203,5 @@
> >    });
> >  };
> >  
> >  // Attaching an observer listening to the "web-workers-shutdown".
> > +// The observer is removed once "web-workers-shutdown" is received.
> 
> No comment necessary.  The previous one could go too.

Ok.
Comment on attachment 737865 [details] [diff] [review]
Unregistering the shutdown observer after the first "web-workers-shutdown"

(In reply to David Rajchenbach Teller [:Yoric] from comment #209)
> (In reply to Nathan Froyd (:froydnj) from comment #208)
> > I don't think you really need the commit; unregistering observers is pretty
> > standard.
> 
> Ok with s/commit/comment/.

Right, thank you!

> > But why not unregister TEST_WEB_WORKERS_SHUTDOWN_TOPIC?
> 
> Well, because I do not want to enforce the fact that we are only ever going
> to run a single test.
> 
> > Actually, why is
> > *registering* the observer on the test topic even done unconditionally in
> > this file?
> 
> Because, at this stage, we don't really have a way to determine whether we
> are being executed in a testing setting. I guess we could wrap this in a
> preference, but I would rather do this as a followup, if that's ok with you.

You know how much I like followup bugs. :)  That's fine.

r=me with the changes.
Attachment #737865 - Flags: feedback+ → review+
Blocks: 862570
The combination of the two patches available so far still does not fix the perma-orange on Windows with the internationalization API enabled:
https://tbpl.mozilla.org/?tree=Try&rev=e9a5f999214f
OS: Linux → All
I've tried applying the patch and the test passes correctly however now I see this when running on Linux:

 0:00.40 file:///home/gsvelto/projects/build/firefox-debug/dist/bin/components/nsHandlerService.js:891: NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]
 0:00.40 JS Component Loader: ERROR /home/gsvelto/projects/mozilla-central/testing/xpcshell/head.js:941
 0:00.40                      TypeError: Components is undefined
 0:00.40 file:///home/gsvelto/projects/build/firefox-debug/dist/bin/components/nsHandlerService.js:891: NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]
 0:00.40 JS Component Loader: ERROR /home/gsvelto/projects/mozilla-central/testing/xpcshell/head.js:944
 0:00.40                      TypeError: Components is undefined
 0:00.40 LoadPlugin() /home/gsvelto/projects/build/firefox-debug/dom/plugins/test/testplugin/libnptest.so returned 7fb2a3bde8b0
 0:00.40 LoadPlugin() /home/gsvelto/projects/build/firefox-debug/dom/plugins/test/testplugin/secondplugin/libnpsecondtest.so returned 7fb2a7faad30
 0:00.40 WARNING: NS_ENSURE_TRUE(mPool) failed: file /home/gsvelto/projects/mozilla-central/netwerk/base/src/nsStreamTransportService.cpp, line 465
 0:00.40 WARNING: unable to post continuation event: file /home/gsvelto/projects/mozilla-central/xpcom/io/nsStreamUtils.cpp, line 438
 0:00.40 WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0xC1F30001: file /home/gsvelto/projects/mozilla-central/dom/workers/ScriptLoader.cpp, line 323
 0:00.40 WORKER ERROR DURING SETUP Error: Failed to load script: resource://gre/modules/osfile.jsm (nsresult = 0xc1f30001)
 0:00.40 WORKER ERROR DETAIL @resource://gre/modules/osfile/osfile_async_worker.js:16
 0:00.40 @resource://gre/modules/osfile/osfile_async_worker.js:392

Breaking-in with GDB shows that the issue is different, yet a I can still see a worker thread been spun up during shutdown which is then torn down and triggers the warning.
No longer blocks: 862570
Depends on: 862570
The patches are being collapsed into bug 862570.
Attachment #736684 - Flags: checkin?(jwalden+bmo)
Comment on attachment 736684 [details] [diff] [review]
Made test_logging.js correctly ignore messages it wants to ignore

https://hg.mozilla.org/integration/mozilla-inbound/rev/8a2536988191
Attachment #736684 - Flags: checkin?(jwalden+bmo)
Cc'ing Taras and Dougt for any insights, also for point of view on disabling this test and opening a bug to diagnose and re-enable. That would let Waldo land the JS Intl code, so we can make fwd progress there.

/be
We expected that bug 862570 would fix the issue. It seems that this is not the case.
And this explosion of instances is from yet another JS patch, https://hg.mozilla.org/integration/mozilla-inbound/rev/75699d465ba0 from bug 864085, making it permaorange on Windows.

I might be making the wrong call, but I think it's time to disable it.
Disabled on Windows debug in https://hg.mozilla.org/integration/mozilla-inbound/rev/65463f2f7ba9
Whiteboard: [test disabled on Linux debug][leave open] → [test disabled on Linux debug][test_logging.js disabled on Windows debug][leave open]
I'm currently neck deep in bug 838577, but as soon as I have some respite (hopefully this week), I'll resume investigating this bug.
I'm also stuck fixing hard-to-reproduce B2G crashes but as soon as I'm done I'll be trying to further the investigation I was doing in comment 213. I'm still convinced that the cause is some code that is spawning a worker thread while shutdown is already undergoing. Last time I hit an issue like this it was because of a non-initialized subsystem being shut down but this time it looks different.
I've had some time to investigate this further and I can confirm two things:

- The issue is the same on Windows and Linux so this is not platform-specific

- The assertion is triggered by the following chain: ShutdownXPCOM() sends an xpcom-shutdown event to all observers and then spins the event loop with NS_ProcessPendingEvents() to let them run. During this process a new worker thread is spawned; afterwards the xpcom-shutdown-threads event is sent and at this stage RuntimeService::Cleanup() spins the event loop again causing the worker to run and fail with the assertion.