Closed Bug 1510345 Opened 7 years ago Closed 6 years ago

Crash in _platform_memset_pattern16$VARIANT$Haswell

Categories

(Toolkit :: Safe Browsing, defect, P1)

Unspecified
macOS
defect

Tracking

RESOLVED FIXED
mozilla68
Tracking Status
firefox-esr60 --- wontfix
firefox63 --- wontfix
firefox64 --- wontfix
firefox65 --- wontfix
firefox66 --- wontfix
firefox67 --- wontfix
firefox68 --- fixed

People

(Reporter: marcia, Assigned: dimi)

References

Details

(4 keywords, Whiteboard: [fixed by bug 1542744?][post-critsmash-triage][adv-main68+])

Crash Data

This bug was filed from the Socorro interface and is report bp-707fa780-789b-4847-a56d-1d4500181127.
=============================================================

Seen while looking at release crash stats: https://bit.ly/2TVWSGK. Mac only crash that affects 10.12-10.14. No extension correlations come up for any of the branches, but I did notice Version 2.8 of fxmonitor is present in quite a few of the crash reports I looked at. Volume is not super huge, but some interesting comments regarding Firefox crashing while in the background.

Top 10 frames of crashing thread:

0 libsystem_platform.dylib _platform_memset_pattern16$VARIANT$Haswell
1 XUL nsUrlClassifierPrefixSet::MakePrefixSet xpcom/ds/nsTArray-inl.h:13
2 XUL nsUrlClassifierPrefixSet::SetPrefixes toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp:104
3 XUL mozilla::safebrowsing::VariableLengthPrefixSet::SetPrefixes toolkit/components/url-classifier/VariableLengthPrefixSet.cpp:106
4 XUL mozilla::sZeroToNineMask
5 XUL mozilla::safebrowsing::Classifier::UpdateTableV4 toolkit/components/url-classifier/LookupCacheV4.cpp:151
6 XUL mozilla::sZeroToNineMask
7 XUL mozilla::safebrowsing::Classifier::ApplyUpdatesBackground toolkit/components/url-classifier/Classifier.cpp:831
8 libmozglue.dylib arena_t::GetNonFullBinRun memory/build/rb.h:144
9 @0x37560f0f8a23

=============================================================
Crash Signature: [@ _platform_memset_pattern16$VARIANT$Haswell] → [@ _platform_memset_pattern16$VARIANT$Haswell] [@ _platform_memset_pattern16$VARIANT$Ivybridge] [@ _platform_memset_pattern16$VARIANT$Base] [@ _platform_memset_pattern16$VARIANT$Merom]
Priority: -- → P1
Assignee: nobody → dlee
Status: NEW → ASSIGNED

This is probably the same bug as Bug 1362761 with a different signature.

  1. Both happen when applying an update and setting prefixes
  2. Around 95% of the crashes in Bug 1362761 happened on Mac, while for this one it is 100%
Priority: P1 → P2
See Also: → 1362761
Assignee: dlee → nobody
Status: ASSIGNED → NEW

Virtually every crash here is a wild-pointer crash (possibly an access to freed/unmapped memory, or more likely a bounds violation). Might be sec-crit if so.

Goes back to at least 53. NI dveditz for check on priority

Group: firefox-core-security
Flags: needinfo?(dveditz)
Flags: needinfo?(dlee)

Crash pattern is different than Bug 1362761

(In reply to Randell Jesup [:jesup] (needinfo me) from comment #2)

Goes back to at least 53.

Maybe, but something changed significantly with the 63 release. Crashes went from a couple a day before release to 40 a day after.

All the crashes I looked at had "rcx": "0x00000000e5e5e5e5" so it looks like something got freed.

The crashing addresses come from rdi, which might point at a bogus move destination. How suspicious is it that they all end in "f30"?

sec-high seems reasonable.

Flags: needinfo?(dveditz)

The V57 crash I looked at has rcx = e5e5e5e5 also
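For readers checking other reports for the same symptom, here is a minimal sketch of the test being applied by eye to the rcx values. It assumes the allocator fills freed memory with the byte 0xe5, as mozjemalloc does; `LooksLikeFreedPoison` and `kFreedPoisonByte` are hypothetical names used only for illustration and are not part of Firefox or Socorro.

```cpp
#include <cstdint>

// Assumption: freed memory is overwritten with 0xe5 (the mozjemalloc
// convention). Other allocators use different fill patterns.
constexpr uint8_t kFreedPoisonByte = 0xe5;

// Hypothetical helper: returns true when the low `numBytes` bytes of a
// register value all equal the poison byte.
bool LooksLikeFreedPoison(uint64_t value, unsigned numBytes) {
  if (numBytes == 0 || numBytes > 8) {
    return false;
  }
  for (unsigned i = 0; i < numBytes; ++i) {
    if (((value >> (8 * i)) & 0xff) != kFreedPoisonByte) {
      return false;
    }
  }
  return true;
}
```

With the value quoted in this bug, `LooksLikeFreedPoison(0x00000000e5e5e5e5, 4)` matches on the low four bytes, which is consistent with a 32-bit value having been read out of freed memory. It does not by itself prove a use-after-free.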

Keep the NI, I'll see if I can find anything.

I suspect this may be related to the fact that we allocate a large array (around 1.5M 4-byte prefixes) and even allocate an additional one on little-endian platforms.
I will work on a patch to see if we can reduce that.
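To put rough numbers on that suspicion, the sketch below works out the footprint of such an array and of a second, byte-swapped copy. It assumes the additional little-endian array is a full-size copy with each prefix's byte order reversed; the function names are hypothetical and this is not the url-classifier code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Size in bytes of `count` 4-byte prefixes held in one contiguous array.
constexpr std::size_t PrefixArrayBytes(std::size_t count) {
  return count * sizeof(uint32_t);
}

// Peak size when the original array and a full-size copy are alive together.
constexpr std::size_t PeakBytesWithCopy(std::size_t count) {
  return 2 * PrefixArrayBytes(count);
}

// Reverses the byte order of one 32-bit prefix.
constexpr uint32_t SwapBytes(uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
         ((v << 8) & 0x00ff0000u) | (v << 24);
}

// Assumed shape of the extra allocation: a second array of the same length
// holding the byte-swapped prefixes.
std::vector<uint32_t> MakeSwappedCopy(const std::vector<uint32_t>& prefixes) {
  std::vector<uint32_t> swapped;
  swapped.reserve(prefixes.size());
  for (uint32_t p : prefixes) {
    swapped.push_back(SwapBytes(p));
  }
  return swapped;
}
```

For 1,500,000 prefixes, `PrefixArrayBytes` gives 6,000,000 bytes (about 6 MB) and `PeakBytesWithCopy` gives 12,000,000 bytes while both arrays exist.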

Flags: needinfo?(dlee)
Assignee: nobody → dlee
Status: NEW → ASSIGNED
Priority: P2 → P1

memshrink for the 6MB-ish array alloc
(Is this Main Process only, or Content Processes also?)

Flags: needinfo?(dlee)
Whiteboard: [memshrink]

(In reply to Randell Jesup [:jesup] (needinfo me) from comment #8)

memshrink for the 6MB-ish array alloc
(Is this Main Process only, or Content Processes also?)

Main Process only

Flags: needinfo?(dlee)

Looks like the array allocations are already fallible and we're not looking at OOMs. Clearing memshrink for now.
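For context on what "fallible" means here: a fallible allocation reports failure to the caller instead of aborting the process, so an out-of-memory condition shows up as an error return rather than a crash. The sketch below illustrates the idea with standard C++ `new (std::nothrow)` as a stand-in; it is an analogy, not the nsTArray API used in the tree, and `TryAllocPrefixes` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

// Tries to allocate `count` zero-initialized 4-byte prefixes.
// Returns a null pointer on failure instead of throwing or aborting,
// so the caller has to check the result before using it.
std::unique_ptr<uint32_t[]> TryAllocPrefixes(std::size_t count) {
  return std::unique_ptr<uint32_t[]>(new (std::nothrow) uint32_t[count]());
}
```

A caller would test the returned pointer and bail out of the update on null, which is why a failed allocation along such a path would not be expected to produce the wild-pointer writes seen in these reports.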

Whiteboard: [memshrink]

Just an update that I am working on this now; I will probably submit a patch for review next week.

I implemented a patch in Bug 1542744 that refines memory allocation for the prefix generation algorithm.
I'll monitor the results to see if that helps fix this problem.

I haven't found any crashes since Bug 1542744 landed two weeks ago.
This bug should be fixed.

Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Whiteboard: [fixed by bug 1542744?]
Target Milestone: --- → mozilla68
Group: firefox-core-security → core-security-release

Do we have any options for ESR60 here?

Flags: needinfo?(dlee)

(In reply to Ryan VanderMeulen [:RyanVM] from comment #14)

Do we have any options for ESR60 here?

I think it would be too risky.
The patch itself is not a trivial one. It not only changes the algorithm in safebrowsing, but also affects the data we write to disk.
I'll play it safe here.

Flags: needinfo?(dlee)
Depends on: 1542744

Lowering the severity to sec-moderate because this crash happens while processing background updates of the classifier data, not something malicious web content can trigger or influence.

Keywords: sec-high → sec-moderate
Flags: qe-verify-
Whiteboard: [fixed by bug 1542744?] → [fixed by bug 1542744?][post-critsmash-triage]
Whiteboard: [fixed by bug 1542744?][post-critsmash-triage] → [fixed by bug 1542744?][post-critsmash-triage][adv-main68+]
Group: core-security-release