Crash in `anonymous namespace''::internal_GetHistogramByEnumId

VERIFIED FIXED in Firefox 52

Status

()

defect
P1
critical
VERIFIED FIXED
3 years ago
2 years ago

People

(Reporter: philipp, Assigned: gfritzsche)

Tracking

({crash})

51 Branch
mozilla54
x86
Windows
Points:
---

Firefox Tracking Flags

(firefox51 wontfix, firefox52+ verified, firefox-esr52 fixed, firefox53+ verified, firefox54 verified)

Details

(Whiteboard: [measurement:client], crash signature)

Attachments

(1 attachment)

This bug was filed from the Socorro interface and is 
report bp-57ab43b4-f014-4bfb-952e-325df2170212.
=============================================================
Crashing Thread (65)
Frame 	Module 	Signature 	Source
0 	xul.dll 	`anonymous namespace'::internal_GetHistogramByEnumId 	toolkit/components/telemetry/TelemetryHistogram.cpp:418
1 	xul.dll 	TelemetryHistogram::Accumulate(mozilla::Telemetry::ID, unsigned int) 	toolkit/components/telemetry/TelemetryHistogram.cpp:1901
2 	xul.dll 	mozilla::psm::`anonymous namespace'::AuthCertificate 	security/manager/ssl/SSLServerCertVerification.cpp:1288
3 	xul.dll 	mozilla::psm::`anonymous namespace'::SSLServerCertVerificationJob::Run 	security/manager/ssl/SSLServerCertVerification.cpp:1439
4 	xul.dll 	nsThreadPool::Run() 	xpcom/threads/nsThreadPool.cpp:225
5 	xul.dll 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp:1067
6 	xul.dll 	NS_ProcessNextEvent(nsIThread*, bool) 	xpcom/glue/nsThreadUtils.cpp:311
7 	xul.dll 	mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:338
8 	xul.dll 	MessageLoop::RunHandler() 	ipc/chromium/src/base/message_loop.cc:225
9 	xul.dll 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc:205
10 	xul.dll 	nsThread::ThreadFunc(void*) 	xpcom/threads/nsThread.cpp:465
11 	nss3.dll 	_PR_NativeRunThread 	nsprpub/pr/src/threads/combined/pruthr.c:397
12 	nss3.dll 	pr_root 	nsprpub/pr/src/md/windows/w95thred.c:95
13 	ucrtbase.dll 	thread_start<unsigned int (__stdcall*)(void*)> 	
14 	kernel32.dll 	BaseThreadInitThunk 	
15 	ntdll.dll 	__RtlUserThreadStart 	
16 	ntdll.dll 	_RtlUserThreadStart

this crash was spiking up on 2017-02-12 for a day on 32bit versions of firefox 51.0.1 on windows. many of the user comments mentioned a site called 9anime.to in relation to this kind of crash.

https://mozilla.github.io/stab-crashes/supergraph.html?s=`anonymous+namespace%27%27%3A%3Ainternal_GetHistogramByEnumId
The triggering bug here is probably bug 1339010, where in SSLServerCertVerification.cpp uninitialized variables are being used for histogram ids.

However, we should probably catch this path to protect from errors in other components.
The crash occurs here, so a simple (and cheap) bounds check on the histogram id should be sufficient:
https://hg.mozilla.org/releases/mozilla-release/annotate/327e081221b0/toolkit/components/telemetry/TelemetryHistogram.cpp#l418
Priority: -- → P1
Whiteboard: [measurement:client] [qa+]
Assignee: nobody → gfritzsche
See Also: → 1339010
[Tracking Requested - why for this release]:
Crash bug. Triggered from security/manager/ssl/ code, but best protected against from Telemetry code.
This moves the histogram id checks consistently to the public Histogram API. Chris, can you review? Nathan, do you have any concerns with this?
Attachment #8837268 - Flags: review?(chutten)
Attachment #8837268 - Flags: feedback?(nfroyd)
Comment on attachment 8837268 [details] [diff] [review]
Move Histogram ID validity checks to outer API

Review of attachment 8837268 [details] [diff] [review]:
-----------------------------------------------------------------

I like the concept and the execution. The ipc refactor hasn't landed, has it? Because, if so, this won't apply.
Attachment #8837268 - Flags: review?(chutten) → review+
Comment on attachment 8837268 [details] [diff] [review]
Move Histogram ID validity checks to outer API

Actually this seems clear & simple enough, cancelling the additional feedback.
Attachment #8837268 - Flags: feedback?(nfroyd)
Keywords: checkin-needed
Whiteboard: [measurement:client] [qa+] → [measurement:client] [measurement:client:uplift] [qa+]
(In reply to Chris H-C :chutten from comment #5)
> The ipc refactor hasn't landed, has
> it? Because, if so, this won't apply.

Right - the refactor will land with bug 1313326, which will land later.
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d8e9eea586ed
Move Histogram ID validity checks to outer API. r=chutten
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/d8e9eea586ed
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Comment on attachment 8837268 [details] [diff] [review]
Move Histogram ID validity checks to outer API

Approval Request Comment
[Feature/Bug causing the regression]: Telemetry, recently triggered by security code bug.
[User impact if declined]: Risk of crash from out-of-bounds data access.
[Is this code covered by automated tests?]: Yes.
[Has the fix been verified in Nightly?]: Currently baking.
[Needs manual test from QE? If yes, steps to reproduce]: Per bug 1339010, comment 0, which should be confirmation steps for beta/aurora if bug 1339010 is not fixed there yet.
[List of other uplifts needed for the feature/fix]: n/a
[Is the change risky?]: Low-risk.
[Why is the change risky/not risky?]: This is a limited moving of code - an identifier validity check - to an outer API and skipping code if the check fails.
[String changes made/needed]: None.
Attachment #8837268 - Flags: approval-mozilla-beta?
Attachment #8837268 - Flags: approval-mozilla-aurora?
Comment on attachment 8837268 [details] [diff] [review]
Move Histogram ID validity checks to outer API

Fix a crash. Aurora53+.
Attachment #8837268 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8837268 [details] [diff] [review]
Move Histogram ID validity checks to outer API

fix crash in telemetry code, beta52+
Attachment #8837268 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Flagging this for verification, instructions in Comment 10.
Flags: qe-verify+
Whiteboard: [measurement:client] [measurement:client:uplift] [qa+] → [measurement:client] [qa+]
Was unable to reproduce the initial crash using steps from bug 1339010, comment 0 using old Nightly debug build from 2017-02-12, normal Nightly from the same date and Fx 51.0.1 on Windows 7 and 10 both x86. If we look at Socorro we can't see any crashes in the last few days in 52.

https://crash-stats.mozilla.com/signature/?product=Firefox&signature=%60anonymous%20namespace%27%27%3A%3Ainternal_GetHistogramByEnumId&date=%3E%3D2017-02-14T14%3A05%3A00.000Z&date=%3C2017-02-21T14%3A05%3A00.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_sort=-date&page=1#reports

Will this be enough to call this Fixed on 52?
Flags: needinfo?(gfritzsche)
Thanks, i think crash-stats showing this as gone on 52 is sufficient.
Flags: needinfo?(gfritzsche)
(In reply to Georg Fritzsche [:gfritzsche] from comment #18)
> Thanks, i think crash-stats showing this as gone on 52 is sufficient.

Thank you! Marking accordingly.
Flags: qe-verify+
Verified fixed based on comment 21.
Status: RESOLVED → VERIFIED
Whiteboard: [measurement:client] [qa+] → [measurement:client]
You need to log in before you can comment on or make changes to this bug.