Closed
Bug 1046038
Opened 10 years ago
Closed 10 years ago
crash in nsUrlClassifierPrefixSet::MakePrefixSet(unsigned int const*, unsigned int)
Categories
(Toolkit :: Safe Browsing, defect)
Tracking
()
VERIFIED
FIXED
mozilla34
Tracking | Status | |
---|---|---|
firefox34 | --- | verified |
People
(Reporter: cbook, Assigned: gcp)
References
Details
(Keywords: crash)
Crash Data
Attachments
(2 files)
1.21 KB,
patch
|
mmc
:
review+
|
Details | Diff | Splinter Review |
11.33 KB,
patch
|
mmc
:
review+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-2f54f011-7dfc-4342-8fbf-1116c2140730.
=============================================================
nightly was running in the background and crashed
Crashing Thread
Frame Module Signature Source
0 XUL nsUrlClassifierPrefixSet::MakePrefixSet(unsigned int const*, unsigned int) obj-firefox/x86_64/dist/include/nsTArray.h
1 XUL nsUrlClassifierPrefixSet::SetPrefixes(unsigned int const*, unsigned int) toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp
2 XUL mozilla::safebrowsing::LookupCache::ConstructPrefixSet(FallibleTArray<mozilla::safebrowsing::AddPrefix>&) toolkit/components/url-classifier/LookupCache.cpp
3 XUL mozilla::safebrowsing::LookupCache::Build(FallibleTArray<mozilla::safebrowsing::AddPrefix>&, FallibleTArray<mozilla::safebrowsing::AddComplete>&) toolkit/components/url-classifier/LookupCache.cpp
4 XUL mozilla::safebrowsing::Classifier::ApplyTableUpdates(nsTArray<mozilla::safebrowsing::TableUpdate*>*, nsACString_internal const&) toolkit/components/url-classifier/Classifier.cpp
5 XUL mozilla::safebrowsing::Classifier::ApplyUpdates(nsTArray<mozilla::safebrowsing::TableUpdate*>*) toolkit/components/url-classifier/Classifier.cpp
6 XUL nsUrlClassifierDBServiceWorker::FinishUpdate() toolkit/components/url-classifier/nsUrlClassifierDBService.cpp
7 XUL nsRunnableMethodImpl<tag_nsresult (nsIUrlClassifierDBService::*)(), void, true>::Run() obj-firefox/x86_64/dist/include/nsThreadUtils.h
8 XUL nsThread::ProcessNextEvent(bool, bool*) xpcom/threads/nsThread.cpp
9 XUL NS_ProcessNextEvent(nsIThread*, bool) xpcom/glue/nsThreadUtils.cpp
10 XUL mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) ipc/glue/MessagePump.cpp
11 XUL MessageLoop::Run() ipc/chromium/src/base/message_loop.cc
12 XUL nsThread::ThreadFunc(void*) xpcom/threads/nsThread.cpp
13 libnss3.dylib _pt_root nsprpub/pr/src/pthreads/ptthread.c
Ø 14 libsystem_pthread.dylib libsystem_pthread.dylib@0x1898
Ø 15 libsystem_pthread.dylib libsystem_pthread.dylib@0x1729
Ø 16 libsystem_pthread.dylib libsystem_pthread.dylib@0x5fc8
17 libnss3.dylib libnss3.dylib@0x10668f
Reporter | ||
Updated•10 years ago
|
Component: Phishing Protection → DOM: Security
Product: Toolkit → Core
Assignee | ||
Updated•10 years ago
|
Component: DOM: Security → Phishing Protection
Product: Core → Toolkit
Assignee | ||
Comment 1•10 years ago
|
||
This crash trace is unusable, but the Windows ones are more helpful:
https://crash-stats.mozilla.com/report/index/de43cbdf-d8ea-4add-9c75-703d32140725
http://hg.mozilla.org/releases/mozilla-release/annotate/529a45c94e5a/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp#l118
http://hg.mozilla.org/releases/mozilla-release/annotate/529a45c94e5a/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.h#l64
mDeltas is fallible but the code doesn't deal with that anywhere.
Assignee | ||
Comment 2•10 years ago
|
||
If we don't have the memory to store the core SafeBrowsing data, we can't really do anything except silently disable the feature. I don't think we want that (because then malware can just try to OOM us), so these have to be infallible.
They should always have been, but the big rewrite to the new storage engine added reading them directly instead of into a temporary (fallible) copy and the code using these arrays wasn't updated.
Comment 3•10 years ago
|
||
Comment on attachment 8465466 [details] [diff] [review]
Patch 1. Make core arrays unfallible
Review of attachment 8465466 [details] [diff] [review]:
-----------------------------------------------------------------
What's the big rewrite to the new storage engine?
Also, we should figure out what to do with the phishing components sometime in both toolkit and firefox. Technically the bug description for Core::DOM::Security module reads "Native content-based security features including: Content Security Policy (CSP), Mixed Content Blocker (MCB), and Safe Browsing."
Attachment #8465466 -
Flags: review?(mmc) → review+
Assignee | ||
Comment 4•10 years ago
|
||
>What's the big rewrite to the new storage engine?
Bug 673470. We used to rely exclusively on SQLite, which lead to a 60M DB and having to cache it entirely in memory for decent performance. I originally added UrlClassifierPrefixSet to reduce the in-memory part to 1.2M, then eventually got rid of SQLite entirely.
Assignee | ||
Comment 5•10 years ago
|
||
Comments such as https://bugzilla.mozilla.org/show_bug.cgi?id=995782#c5 and the continuous appearance of SafeBrowsing bugs that involve OOMs with 1-2M allocations even with free virtual/physical memory suggest that making a >1M infallible allocation in Firefox is sufficiently problematic that we shouldn't do it. This is, to me, a sad and strange state of affairs.
This patch reworks the ~1.2M deltas array into an array of arrays, matching the actual datastructure, which means it's about 6000 arrays of 100 2-byte elements each. That's obviously not the most memory optimal new layout, but it's what I could reasonably do without making this code significantly more complex, introducing a huge risk of new bugs and likely writing backing segmented array classes - for which I quite frankly don't have time.
So my question now is, does this actually improve the situation over a singular contiguous array of about 1.2M? The actual memory usage jumps from about 1.2M to 1.7M, but it will now work if no more than ~200 byte chunks of continuous memory are available. I worry that this may just make the heap fragmentation even worse.
mmc r? for checking the algorithmic changes, dmajor/bsmedberg/anyone appropriate f? for the issue of >1M allocations being such a problem and some advice if this patch is a good idea or not.
Attachment #8465623 -
Flags: review?(mmc)
Attachment #8465623 -
Flags: feedback?(dmajor)
Attachment #8465623 -
Flags: feedback?(benjamin)
Comment 6•10 years ago
|
||
jemalloc allocates in aligned chunks of 1 MiB, then hands out smaller arenas from those chunks, so I wouldn't expect this to make fragmentation worse. With any luck those 6000 arrays will be handed out from existing chunks. The reason you're ssing it use more memory in practice is probably because jemalloc rounds up to the next available size class, which for a 200-byte array is probably 256 bytes (and there's more administrative overhead in jemalloc). IIUC the 1.2MiB allocation probably used a huge allocation, which actually rounds up to the next MiB (though on Windows, we decommit the unused parts).
Comment on attachment 8465623 [details] [diff] [review]
Patch 2. Replace big array by array of arrays
> dmajor/bsmedberg/anyone appropriate f? for the issue of >1M allocations
> being such a problem and some advice if this patch is a good idea or not.
In general, yes, >1M allocations are a problem on 32-bit builds. At high memory usage, most of the "holes" of free space are 1MB-sized regions that jemalloc has deallocated.
According to the diagram in jemalloc.c, 200 bytes isn't a horrible size from a slop perspective, but Nick should have a look for other gotchas.
Attachment #8465623 -
Flags: feedback?(n.nethercote)
Attachment #8465623 -
Flags: feedback?(dmajor)
Attachment #8465623 -
Flags: feedback?(benjamin)
Assignee | ||
Comment 8•10 years ago
|
||
I can tune the size of the subarrays, within reason, say between 50 and 500 elements, if that improves things or moves us to an optimum. As you can see in the code it's nsTArray<nsTArray<uint16_t> >.
>The reason you're ssing it use more memory in practice..
I was expecting that if only for the reason that every 100 entries now require an extra pointer.
>IIUC the 1.2MiB allocation probably used a huge allocation, which actually rounds up to the next MiB (though on Windows, we decommit the unused parts).
That' surprising, because:
http://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/LookupCache.cpp?from=LookupCache.cpp&case=true#635
http://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp?from=nsUrlClassifierPrefixSet.cpp&case=true#263
report non-rounded up values. Our memory reporting doesn't take slack into account?
Comment 9•10 years ago
|
||
> Our memory reporting doesn't take slack into account?
It does. The 1 MiB+ cases are a bit weird. The amount of address space allocated is a multiple of 1 MiB, but only the requested size gets committed. So the memory reporters only report that committed size (which is the requested size rounded up to the nearest page size).
Assignee | ||
Comment 10•10 years ago
|
||
So this patch should actually reduce the VM memory usage but make the physical memory situation slightly worse. Funny. I guess I can tune the subarrays then by just looking at what results in the lowest memory usage, rather than trying to determine it by counting and then hitting the worst case when I'm off-by-one.
Comment 11•10 years ago
|
||
Comment on attachment 8465623 [details] [diff] [review]
Patch 2. Replace big array by array of arrays
Review of attachment 8465623 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp
@@ +317,5 @@
> + uint32_t numInDelta = i == indexSize - 1 ? deltaSize - indexStarts[i]
> + : indexStarts[i + 1] - indexStarts[i];
> + mIndexDeltas[i].SetLength(numInDelta);
> + mTotalPrefixes += numInDelta;
> + toRead = numInDelta*sizeof(uint16_t);
spaces around *
::: toolkit/components/url-classifier/nsUrlClassifierPrefixSet.h
@@ +61,1 @@
> // array containing deltas from indices.
This comment is out of date.
Attachment #8465623 -
Flags: review?(mmc) → review+
Comment 12•10 years ago
|
||
Comment on attachment 8465623 [details] [diff] [review]
Patch 2. Replace big array by array of arrays
Review of attachment 8465623 [details] [diff] [review]:
-----------------------------------------------------------------
My feedback: measure the new memory usage via about:memory. Try not to make it much bigger than it was previously. There's obviously a trade-off here between total memory usage and maximum allocation size. I don't have a good sense what kind of reduction in virtual OOMs this is likely to cause.
Attachment #8465623 -
Flags: feedback?(n.nethercote)
Assignee | ||
Comment 13•10 years ago
|
||
Assignee | ||
Comment 14•10 years ago
|
||
I made the arrays 120 elements (240 bytes), which seems a good compromise between minimal slop and not accidentally hitting the worst case when we need to store an extra pointer, length field and/or go 64-bit. Memory usage (for the arrays) increased by ~6%, which seems very acceptable for a big win in dealing with VM fragmentation.
Comment 15•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/25c173040e1f
https://hg.mozilla.org/mozilla-central/rev/9d24ecc84a50
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Comment 16•10 years ago
|
||
I just crashed with this stack, using Firefox32beta :(
https://crash-stats.mozilla.com/report/index/d5534af1-1a30-4300-ad7a-60cc92140820
Assignee | ||
Comment 17•10 years ago
|
||
This appears to be quite low volume and a Mac OS X specific signature, so I'm not too keen in uplifting aggressively.
Comment 18•10 years ago
|
||
Gian-Carlo, is there any particular way we can verify this bug or add tests for it, other than having a look at crash-stats? Thanks!
Flags: needinfo?(gpascutto)
Updated•10 years ago
|
Flags: qe-verify?
Assignee | ||
Comment 19•10 years ago
|
||
I think crash stats are your most reasonable bet.
Flags: needinfo?(gpascutto)
Updated•10 years ago
|
Flags: qe-verify? → qe-verify+
Comment 20•10 years ago
|
||
Looking in Socorro [1] for crashes with this signature over the past 4 weeks, I see that there are 6 crashes on Firefox 34 Beta, and 2 crashes on Firefox 35 Aurora (0 crashes on 36 Nightly).
Given the very low number of crashes I think we can call this fixed. Please reopen if you think otherwise.
[1] https://crash-stats.mozilla.com/report/list?product=Firefox&range_unit=days&range_value=28&signature=nsUrlClassifierPrefixSet%3A%3AMakePrefixSet%28unsigned+int+const%2A%2C+unsigned+int%29#tab-sigsummary
Status: RESOLVED → VERIFIED
status-firefox34:
--- → verified
You need to log in
before you can comment on or make changes to this bug.
Description
•