Closed Bug 1314818 Opened 3 years ago Closed 3 years ago

Intermittent netwerk/test/unit_ipc/test_getHost_wrap.js | application crashed [@ mozalloc_abort(char const * const)]

Categories

(Toolkit :: Telemetry, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla52
Tracking Status
firefox50 --- unaffected
firefox51 --- unaffected
firefox52 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: chutten)

Details

(Keywords: intermittent-failure, regression, Whiteboard: [measurement:client])

Attachments

(3 files)

Component: Networking → Telemetry
Product: Core → Toolkit
Windows 7 debug xpcshell
 08:28:36     INFO -   0  mozglue.dll!mozalloc_abort(char const * const) [mozalloc_abort.cpp:3bfde35a0d18 : 33 + 0x2c]
 08:28:36     INFO -      eip = 0x73722eec   esp = 0x0028f4cc   ebp = 0x0028f4d4   ebx = 0x00000000
 08:28:36     INFO -      esi = 0x00000021   edi = 0x6e0ccf70   eax = 0x00000000   ecx = 0x6e0c06ef
 08:28:36     INFO -      edx = 0x00000060   efl = 0x00000206
 08:28:36     INFO -      Found by: given as instruction pointer in context
 08:28:36     INFO -   1  xul.dll!NS_DebugBreak [nsDebugImpl.cpp:3bfde35a0d18 : 436 + 0xd]
 08:28:36     INFO -      eip = 0x5e59a004   esp = 0x0028f4dc   ebp = 0x0028f90c
 08:28:36     INFO -      Found by: call frame info
 08:28:36     INFO -   2  xul.dll!NS_DispatchToMainThread(already_AddRefed<nsIRunnable> &&,unsigned int) [nsThreadUtils.cpp:3bfde35a0d18 : 204 + 0x17]
 08:28:36     INFO -      eip = 0x5e62dadb   esp = 0x0028f914   ebp = 0x0028f94c
 08:28:36     INFO -      Found by: call frame info
 08:28:36     INFO -   3  xul.dll!`anonymous namespace'::internal_RemoteAccumulate [TelemetryHistogram.cpp:3bfde35a0d18 : 1411 + 0x20]
 08:28:36     INFO -      eip = 0x609be1c0   esp = 0x0028f954   ebp = 0x0028f984
 08:28:36     INFO -      Found by: call frame info 

Not sure what of the method body [1] calls into NS_DebugBreak... but if it's the "can't do it, shutting down" code then this sounds like something the fix for bug 1304660 ought to have taken care of... (+froydnj)

[1]: https://dxr.mozilla.org/mozilla-central/rev/4f09d9469e73adf32c7db6720504fcbe580516b3/xpcom/glue/nsThreadUtils.cpp#198
Flags: needinfo?(nfroyd)
(In reply to Chris H-C :chutten from comment #1)
> Windows 7 debug xpcshell
>  08:28:36     INFO -   0  mozglue.dll!mozalloc_abort(char const * const)
> [mozalloc_abort.cpp:3bfde35a0d18 : 33 + 0x2c]
>  08:28:36     INFO -      eip = 0x73722eec   esp = 0x0028f4cc   ebp =
> 0x0028f4d4   ebx = 0x00000000
>  08:28:36     INFO -      esi = 0x00000021   edi = 0x6e0ccf70   eax =
> 0x00000000   ecx = 0x6e0c06ef
>  08:28:36     INFO -      edx = 0x00000060   efl = 0x00000206
>  08:28:36     INFO -      Found by: given as instruction pointer in context
>  08:28:36     INFO -   1  xul.dll!NS_DebugBreak
> [nsDebugImpl.cpp:3bfde35a0d18 : 436 + 0xd]
>  08:28:36     INFO -      eip = 0x5e59a004   esp = 0x0028f4dc   ebp =
> 0x0028f90c
>  08:28:36     INFO -      Found by: call frame info
>  08:28:36     INFO -   2 
> xul.dll!NS_DispatchToMainThread(already_AddRefed<nsIRunnable> &&,unsigned
> int) [nsThreadUtils.cpp:3bfde35a0d18 : 204 + 0x17]
>  08:28:36     INFO -      eip = 0x5e62dadb   esp = 0x0028f914   ebp =
> 0x0028f94c
>  08:28:36     INFO -      Found by: call frame info
>  08:28:36     INFO -   3  xul.dll!`anonymous
> namespace'::internal_RemoteAccumulate [TelemetryHistogram.cpp:3bfde35a0d18 :
> 1411 + 0x20]
>  08:28:36     INFO -      eip = 0x609be1c0   esp = 0x0028f954   ebp =
> 0x0028f984
>  08:28:36     INFO -      Found by: call frame info 
> 
> Not sure what of the method body [1] calls into NS_DebugBreak... but if it's
> the "can't do it, shutting down" code then this sounds like something the
> fix for bug 1304660 ought to have taken care of... (+froydnj)

Ugh, so we have to be a little more clever, because we're still trying to accumulate Telemetry super-late in the cycle collector/GC:

08:28:36     INFO -   0  mozglue.dll!mozalloc_abort(char const * const) [mozalloc_abort.cpp:3bfde35a0d18 : 33 + 0x2c]
08:28:36     INFO -      eip = 0x73722eec   esp = 0x0028f4cc   ebp = 0x0028f4d4   ebx = 0x00000000
08:28:36     INFO -      esi = 0x00000021   edi = 0x6e0ccf70   eax = 0x00000000   ecx = 0x6e0c06ef
08:28:36     INFO -      edx = 0x00000060   efl = 0x00000206
08:28:36     INFO -      Found by: given as instruction pointer in context
08:28:36     INFO -   1  xul.dll!NS_DebugBreak [nsDebugImpl.cpp:3bfde35a0d18 : 436 + 0xd]
08:28:36     INFO -      eip = 0x5e59a004   esp = 0x0028f4dc   ebp = 0x0028f90c
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -   2  xul.dll!NS_DispatchToMainThread(already_AddRefed<nsIRunnable> &&,unsigned int) [nsThreadUtils.cpp:3bfde35a0d18 : 204 + 0x17]
08:28:36     INFO -      eip = 0x5e62dadb   esp = 0x0028f914   ebp = 0x0028f94c
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -   3  xul.dll!`anonymous namespace'::internal_RemoteAccumulate [TelemetryHistogram.cpp:3bfde35a0d18 : 1411 + 0x20]
08:28:36     INFO -      eip = 0x609be1c0   esp = 0x0028f954   ebp = 0x0028f984
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -   4  xul.dll!`anonymous namespace'::internal_Accumulate [TelemetryHistogram.cpp:3bfde35a0d18 : 1451 + 0x12]
08:28:36     INFO -      eip = 0x609bc1ea   esp = 0x0028f98c   ebp = 0x0028f998
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -   5  xul.dll!TelemetryHistogram::Accumulate(mozilla::Telemetry::ID,unsigned int) [TelemetryHistogram.cpp:3bfde35a0d18 : 2244 + 0xb]
08:28:36     INFO -      eip = 0x609b9c18   esp = 0x0028f9a0   ebp = 0x0028f9b4
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -   6  xul.dll!AccumulateTelemetryCallback [XPCJSContext.cpp:3bfde35a0d18 : 3000 + 0xa]
08:28:36     INFO -      eip = 0x5ed3b036   esp = 0x0028f9bc   ebp = 0x0028f9d4
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -   7  xul.dll!JSRuntime::addTelemetry(int,unsigned int,char const *) [Runtime.cpp:3bfde35a0d18 : 460 + 0xb]
08:28:36     INFO -      eip = 0x611a1bb8   esp = 0x0028f9dc   ebp = 0x0028f9e8
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -   8  xul.dll!js::gcstats::Statistics::endGC() [Statistics.cpp:3bfde35a0d18 : 989 + 0x16]
08:28:36     INFO -      eip = 0x61064352   esp = 0x0028f9f0   ebp = 0x0028fa28
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -   9  xul.dll!js::gcstats::Statistics::endSlice() [Statistics.cpp:3bfde35a0d18 : 1107 + 0x7]
08:28:36     INFO -      eip = 0x61064938   esp = 0x0028fa30   ebp = 0x0028fa5c
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -  10  xul.dll!js::gc::GCRuntime::gcCycle(bool,js::SliceBudget &,JS::gcreason::Reason) [jsgc.cpp:3bfde35a0d18 : 6205 + 0x14]
08:28:36     INFO -      eip = 0x60fe5dd3   esp = 0x0028fa64   ebp = 0x0028fadc
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -  11  xul.dll!js::gc::GCRuntime::collect(bool,js::SliceBudget,JS::gcreason::Reason) [jsgc.cpp:3bfde35a0d18 : 6326 + 0x16]
08:28:36     INFO -      eip = 0x60fd84d6   esp = 0x0028fae4   ebp = 0x0028fb78
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -  12  xul.dll!js::gc::GCRuntime::gc(JSGCInvocationKind,JS::gcreason::Reason) [jsgc.cpp:3bfde35a0d18 : 6394 + 0x4f]
08:28:36     INFO -      eip = 0x60fe5a32   esp = 0x0028fb80   ebp = 0x0028fba8
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -  13  xul.dll!JS::GCForReason(JSContext *,JSGCInvocationKind,JS::gcreason::Reason) [jsgc.cpp:3bfde35a0d18 : 7232 + 0x12]
08:28:36     INFO -      eip = 0x60fbae42   esp = 0x0028fbb0   ebp = 0x0028fbb8
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -  14  xul.dll!mozilla::CycleCollectedJSContext::GarbageCollect(unsigned int) [CycleCollectedJSContext.cpp:3bfde35a0d18 : 1253 + 0x10]
08:28:36     INFO -      eip = 0x5e5a4539   esp = 0x0028fbc0   ebp = 0x0028fbd4
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -  15  xul.dll!nsXPConnect::~nsXPConnect() [nsXPConnect.cpp:3bfde35a0d18 : 82 + 0xa]
08:28:36     INFO -      eip = 0x5ed5e7e8   esp = 0x0028fbdc   ebp = 0x0028fbf0
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -  16  xul.dll!nsXPConnect::Release() [nsXPConnect.cpp:3bfde35a0d18 : 43 + 0x92]
08:28:36     INFO -      eip = 0x5ed6f660   esp = 0x0028fbe8   ebp = 0x0028fbf0
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -  17  xul.dll!nsXPConnect::ReleaseXPConnectSingleton() [nsXPConnect.cpp:3bfde35a0d18 : 149 + 0x6]
08:28:36     INFO -      eip = 0x5ed6f725   esp = 0x0028fbf8   ebp = 0x0028fc1c
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -  18  xul.dll!xpcModuleDtor() [XPCModule.cpp:3bfde35a0d18 : 22 + 0x5]
08:28:36     INFO -      eip = 0x5ed5862b   esp = 0x0028fc00   ebp = 0x0028fc1c
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -  19  xul.dll!nsComponentManagerImpl::KnownModule::~KnownModule() [nsComponentManager.h:3bfde35a0d18 : 232 + 0x2]
08:28:36     INFO -      eip = 0x5e5f1d46   esp = 0x0028fc04   ebp = 0x0028fc1c
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -  20  xul.dll!nsAutoPtr<nsComponentManagerImpl::KnownModule>::~nsAutoPtr<nsComponentManagerImpl::KnownModule>() [nsAutoPtr.h:3bfde35a0d18 : 78 + 0xd]
08:28:36     INFO -      eip = 0x5e5f1b6b   esp = 0x0028fc0c   ebp = 0x0028fc1c
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -  21  xul.dll!nsTArray_Impl<nsAutoPtr<nsComponentManagerImpl::KnownModule>,nsTArrayInfallibleAllocator>::DestructRange(unsigned int,unsigned int) [nsTArray.h:3bfde35a0d18 : 1897 + 0xf]
08:28:36     INFO -      eip = 0x5e5f2bb7   esp = 0x0028fc14   ebp = 0x0028fc1c
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -  22  xul.dll!nsTArray_Impl<nsAutoPtr<nsComponentManagerImpl::KnownModule>,nsTArrayInfallibleAllocator>::RemoveElementsAt(unsigned int,unsigned int) [nsTArray.h:3bfde35a0d18 : 1943 + 0x9]
08:28:36     INFO -      eip = 0x5e5f5ad8   esp = 0x0028fc24   ebp = 0x0028fc38
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -  23  xul.dll!nsComponentManagerImpl::Shutdown() [nsComponentManager.cpp:3bfde35a0d18 : 881 + 0x11]
08:28:36     INFO -      eip = 0x5e5f5ce7   esp = 0x0028fc40   ebp = 0x0028fc78
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -  24  xul.dll!mozilla::ShutdownXPCOM(nsIServiceManager *) [XPCOMInit.cpp:3bfde35a0d18 : 1018 + 0x5]
08:28:36     INFO -      eip = 0x5e6271a4   esp = 0x0028fc54   ebp = 0x0028fc78
08:28:36     INFO -      Found by: call frame info
08:28:36     INFO -  25  xul.dll!XRE_TermEmbedding [nsEmbedFunctions.cpp:3bfde35a0d18 : 208 + 0x7]
08:28:36     INFO -      eip = 0x60a0e7f9   esp = 0x0028fc80   ebp = 0x0028fe04
08:28:36     INFO -      Found by: call frame info

The best way to do this is probably to manually implement NS_DispatchToMainThread by getting the main thread ourselves in telemetry, and then bailing if we can't get the main thread.
Flags: needinfo?(nfroyd)
(In reply to Nathan Froyd [:froydnj] from comment #2)
...
> 08:28:36     INFO -      Found by: call frame info
> 08:28:36     INFO -  23  xul.dll!nsComponentManagerImpl::Shutdown()
> [nsComponentManager.cpp:3bfde35a0d18 : 881 + 0x11]
> 08:28:36     INFO -      eip = 0x5e5f5ce7   esp = 0x0028fc40   ebp =
> 0x0028fc78
> 08:28:36     INFO -      Found by: call frame info
> 08:28:36     INFO -  24  xul.dll!mozilla::ShutdownXPCOM(nsIServiceManager *)
> [XPCOMInit.cpp:3bfde35a0d18 : 1018 + 0x5]
> 08:28:36     INFO -      eip = 0x5e6271a4   esp = 0x0028fc54   ebp =
> 0x0028fc78
> 08:28:36     INFO -      Found by: call frame info
> 08:28:36     INFO -  25  xul.dll!XRE_TermEmbedding
> [nsEmbedFunctions.cpp:3bfde35a0d18 : 208 + 0x7]
> 08:28:36     INFO -      eip = 0x60a0e7f9   esp = 0x0028fc80   ebp =
> 0x0028fe04
> 08:28:36     INFO -      Found by: call frame info
> 
> The best way to do this is probably to manually implement
> NS_DispatchToMainThread by getting the main thread ourselves in telemetry,
> and then bailing if we can't get the main thread.

This is inside XPCOM shutdown though. AFAICT we could record there, but we would never get this saved to disk or sent out anymore (that would happen from the Telemetry JS modules).

We should probably just drop recording calls at this point?
Is this maybe just racing with TelemetryImpl::ShutdownTelemetry [1][2]?
Do we need a proper "disable Telemetry recording" call before ShutdownXPCOM here?

[1] https://dxr.mozilla.org/mozilla-central/rev/86f702229e32c6119d092e86431afee576f033a1/toolkit/components/telemetry/Telemetry.cpp#1940
[2] https://dxr.mozilla.org/mozilla-central/rev/86f702229e32c6119d092e86431afee576f033a1/toolkit/components/telemetry/Telemetry.cpp#2288
Priority: -- → P2
Whiteboard: [measurement:client]
Priority: P2 → P1
Assignee: nobody → chutten
Comment on attachment 8808771 [details]
bug 1314818 - Get and dispatch telemetry accumulation to the main thread directly.

https://reviewboard.mozilla.org/r/91514/#review91584

r=me with the easy fix below, or the harder fix if you're feeling sufficiently motivated.

::: toolkit/components/telemetry/TelemetryHistogram.cpp:357
(Diff revision 1)
> +  if (NS_FAILED(rv)) {
> +    NS_WARNING("NS_FAILED DispatchToMainThread. Maybe we're shutting down?");
> +    return;
> +  }

This is just going to trade one set of problems for another, because if we do fail to get the main thread, we're going to run `~already_AddRefed` on an object that still holds a valid pointer, which is going to assert.

There are two ways to fix this:

1. The easy way: add `nsCOMPtr<nsIRunnable> event(aEvent);` somewhere near the top of the function, and dispatch `event` or `Move(event)` instead of `Move(aEvent)`.
2. The harder, but more efficient way: note that we always allocate the runnable even if dispatch might not succeed; it would be abstractly better if we held off allocating the runnable until we knew we had a thread to dispatch it to.  I think that means making this function take a `void (*)(void)` or similar and moving the `NS_NewRunnableFunction` call into the function.  But that's somewhat more complicated for a small win, so I can understand not futzing with that.
Attachment #8808771 - Flags: review?(nfroyd) → review+
Comment on attachment 8808770 [details]
bug 1314818 - Revert "bug 1304660 - Don't dispatch to main once xpcom starts shutting down."

https://reviewboard.mozilla.org/r/91512/#review91588
Attachment #8808770 - Flags: review?(nfroyd) → review+
Went for the easy way, as expected :)
Pushed by chutten@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0e9e97d11bbb
Revert "bug 1304660 - Don't dispatch to main once xpcom starts shutting down." r=froydnj
https://hg.mozilla.org/integration/autoland/rev/fcd1172e075a
Get and dispatch telemetry accumulation to the main thread directly. r=froydnj
Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/autoland/rev/a66aa3e7dd91
Build fix: remove function erroneously included from 9f373cd98bab8e92ef5a0238edc9d50f8f2d5587 during rebase r=bustage-fix on a CLOSED TREE
https://hg.mozilla.org/mozilla-central/rev/0e9e97d11bbb
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Version: unspecified → Trunk
You need to log in before you can comment on or make changes to this bug.