Closed Bug 1507175 Opened Last year Closed Last year

Invalid nsTArray index access in internal_RegisterScalars, crash

Categories

(Toolkit :: Telemetry, defect, P1, critical)

defect
Points:
1

Tracking

()

RESOLVED FIXED
mozilla65
Tracking Status
firefox65 --- fixed

People

(Reporter: mayhemer, Assigned: chutten)

References

Details

Attachments

(1 file)

m-c@073045259e75, win64 debug build


>	mozglue.dll!MOZ_CrashPrintf(0x00007ffac249e258, 28, 0x00007ffac249e22f, ...) Line 68	C++
 	xul.dll!InvalidArrayIndex_CRASH(1, 1) Line 26	C++
 	xul.dll!nsTArray_Impl<(anonymous namespace)::DynamicScalarInfo,nsTArrayInfallibleAllocator>::ElementAt(1) Line 1101	C++
 	xul.dll!nsTArray_Impl<(anonymous namespace)::DynamicScalarInfo,nsTArrayInfallibleAllocator>::operator[](1) Line 1139	C++
 	xul.dll!`anonymous namespace'::internal_RegisterScalars({...}, {...}) Line 1770	C++
 	xul.dll!TelemetryScalar::AddDynamicScalarDefinitions({...}) Line 3519	C++
 	xul.dll!mozilla::TelemetryIPC::AddDynamicScalarDefinitions({...}) Line 52	C++
 	xul.dll!mozilla::dom::ContentChild::RecvAddDynamicScalars({...}) Line 3742	C++
 	xul.dll!mozilla::dom::PContentChild::OnMessageReceived({...}) Line 9151	C++
 	xul.dll!mozilla::dom::ContentChild::OnMessageReceived({...}) Line 3799	C++
 	xul.dll!mozilla::ipc::MessageChannel::DispatchAsyncMessage({...}) Line 2244	C++
 	xul.dll!mozilla::ipc::MessageChannel::DispatchMessageW({...}) Line 2168	C++
 	xul.dll!mozilla::ipc::MessageChannel::RunMessage({...}) Line 2009	C++
 	xul.dll!mozilla::ipc::MessageChannel::MessageTask::Run() Line 2042	C++
 	xul.dll!nsThread::ProcessNextEvent(true, 0x000000083c1ff0cf) Line 1246	C++
 	xul.dll!NS_ProcessNextEvent(0x000001f333917800, true) Line 530	C++
 	xul.dll!mozilla::ipc::MessagePump::Run(0x000000083c1ff948) Line 94	C++
 	xul.dll!mozilla::ipc::MessagePumpForChildProcess::Run(0x000000083c1ff948) Line 302	C++
 	xul.dll!MessageLoop::RunInternal() Line 325	C++
 	xul.dll!MessageLoop::RunHandler() Line 319	C++
 	xul.dll!MessageLoop::Run() Line 299	C++
 	xul.dll!nsBaseAppShell::Run() Line 160	C++
 	xul.dll!nsAppShell::Run() Line 420	C++
 	xul.dll!XRE_RunAppShell() Line 954	C++
 	xul.dll!mozilla::ipc::MessagePumpForChildProcess::Run(0x000000083c1ff948) Line 269	C++
 	xul.dll!MessageLoop::RunInternal() Line 325	C++
 	xul.dll!MessageLoop::RunHandler() Line 319	C++
 	xul.dll!MessageLoop::Run() Line 299	C++
 	xul.dll!XRE_InitChildProcess(20, 0x000001f333904040, 0x000000083c1ffb60) Line 784	C++
 	xul.dll!mozilla::BootstrapImpl::XRE_InitChildProcess(22, 0x000001f333904040, 0x000000083c1ffb60) Line 69	C++
 	firefox.exe!content_process_main(0x000001f333908120, 22, 0x000001f333904040) Line 50	C++
 	firefox.exe!NS_internal_main(23, 0x000001f333904040, 0x000001f3335e7950) Line 287	C++
 	firefox.exe!wmain(23, 0x000001f3335ebd40) Line 143	C++
 	firefox.exe!__scrt_common_main_seh() Line 288	C++
 	kernel32.dll!BaseThreadInitThunk()	Unknown
 	ntdll.dll!RtlUserThreadStart()	Unknown


Note that TelemetryScalar::DeInitializeGlobalState HAS NOT been hit before this crash on the respective process (keeping a breakpoint there...)

scalarInfo.name() == "webrtc.nicer.turn_438s"
existingKey->mData.id == 1
existingKey->mData.dynamic = false <=== this is probably the issue
*gDynamicScalarInfo length == 1

Looks like the ScalarKey.id is mixed between indexes to either gScalars [2] or gDynamicScalarInfo [1].

[1] https://searchfox.org/mozilla-central/rev/007b66c1f5f7a1b9a900a2038652a16a020f010c/toolkit/components/telemetry/core/TelemetryScalar.cpp#1770
[2] https://searchfox.org/mozilla-central/rev/007b66c1f5f7a1b9a900a2038652a16a020f010c/toolkit/components/telemetry/core/TelemetryScalar.cpp#2253
Yup. When registering the dynamic scalars in the child process we're assuming the presence of an `existingKey` means that the dynamic scalar we're trying to register is overwriting a dynamic scalar we've previously registered. It is sometimes the case that the scalar whose behaviour we're overwriting will not be dynamic itself (for Build Faster support).

It is strange that this doesn't explode on the parent process first in TelemetryScalar::RegisterScalars. It may be broken there as well, but something (maybe the initial dynamic scalar for event counts?) is preventing it from actually crashing.

Is this a reliable crash or only an occasional one? Do you have STR?
Flags: needinfo?(honzab.moz)
It has never crashed in parent, I think.  And this is reliable to repro since some time, just start the browser (from a debugger).  I don't have STR, but I have the profile that I run with and can offer (just a testing one).
Flags: needinfo?(honzab.moz)
I changed code near here in bug 1501659. Does the crash recur with b0d14569871f reverted?
Flags: needinfo?(honzab.moz)
See Also: → 1501659
This reverts "bug 1501659 - Expire expired dynamic builtin scalars and events r=janerik"
(In reply to Chris H-C :chutten from comment #3)
> I changed code near here in bug 1501659. Does the crash recur with
> b0d14569871f reverted?

Doesn't crash any more.
Flags: needinfo?(honzab.moz)
Alrighty, then.
Assignee: nobody → chutten
Status: NEW → ASSIGNED
Points: --- → 1
Priority: -- → P1
Pushed by chutten@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ca67f7683dfd
Don't expire dynamic builtin scalars this way. r=janerik
https://hg.mozilla.org/mozilla-central/rev/ca67f7683dfd
Status: ASSIGNED → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Target Milestone: --- → mozilla65
You need to log in before you can comment on or make changes to this bug.