Closed Bug 1205358 Opened 4 years ago Closed 4 years ago

url classifier memory reporting doesn't work correctly with DMD

Categories

(Toolkit :: Safe Browsing, defect)


Tracking


RESOLVED FIXED
mozilla47
Tracking Status
firefox47 --- fixed

People

(Reporter: froydnj, Assigned: njn)

References

Details

Attachments

(1 file, 1 obsolete file)

I was trying to use DMD to look at a few things in a just-started browser.  The first unreported records are these:

Unreported {
  31,677 blocks in heap block record 1 of 22,670
  2,846,752 bytes (2,621,434 requested / 225,318 slop)
  Individual block sizes: 256 x 6,873; 224 x 223; 192 x 267; 160 x 425; 128 x 298; 112 x 422; 96 x 681; 80 x 1,083; 64 x 1,892; 48 x 3,695; 32 x 8,068; 16 x 7,750
  4.31% of the heap (4.31% cumulative)
  11.37% of unreported (11.37% cumulative)
  Allocated at {
    #01: replace_malloc (/home/froydnj/src/gecko-dev.git/memory/replace/dmd/DMD.cpp:1233)
    #02: moz_xmalloc (/home/froydnj/src/gecko-dev.git/memory/mozalloc/mozalloc.cpp:84)
    #03: nsTArrayInfallibleAllocator::ResultTypeProxy nsTArray_base<nsTArrayInfallibleAllocator, nsTArray_CopyWithMemutils>::EnsureCapacity<nsTArrayInfallibleAllocator>(unsigned long, unsigned long) (/opt/build/froydnj/build-dmd/media/libstagefright/../../dist/include/nsTArray-inl.h:137)
    #04: nsTArray_base<nsTArrayInfallibleAllocator, nsTArray_CopyWithMemutils>::Capacity() const (/opt/build/froydnj/build-dmd/media/webrtc/signaling/signaling_ecc/../../../../dist/include/nsTArray.h:368)
    #05: InsertElementsAt<nsTArrayInfallibleAllocator> (/opt/build/froydnj/build-dmd/gfx/vr/../../dist/include/nsTArray.h:1814)
    #06: nsTArray_base<nsTArrayInfallibleAllocator, nsTArray_CopyWithMemutils>::Hdr() const (/opt/build/froydnj/build-dmd/toolkit/components/url-classifier/../../../dist/include/nsTArray.h:487)
    #07: nsUrlClassifierPrefixSet::LoadFromFile(nsIFile*) (/home/froydnj/src/gecko-dev.git/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp:360)
    #08: mozilla::safebrowsing::LookupCache::LoadPrefixSet() (/home/froydnj/src/gecko-dev.git/toolkit/components/url-classifier/LookupCache.cpp:663)
    #09: mozilla::safebrowsing::LookupCache::Open() (/home/froydnj/src/gecko-dev.git/toolkit/components/url-classifier/LookupCache.cpp:108)
    #10: mozilla::safebrowsing::Classifier::GetLookupCache(nsACString_internal const&) (/home/froydnj/src/gecko-dev.git/toolkit/components/url-classifier/Classifier.cpp:705)
    #11: mozilla::safebrowsing::Classifier::RegenActiveTables() (/home/froydnj/src/gecko-dev.git/toolkit/components/url-classifier/Classifier.cpp:406)
    #12: mozilla::safebrowsing::Classifier::Open(nsIFile&) (/home/froydnj/src/gecko-dev.git/toolkit/components/url-classifier/Classifier.cpp:153)
    #13: nsUrlClassifierDBServiceWorker::OpenDb() (/home/froydnj/src/gecko-dev.git/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:720)
    #14: nsRunnableMethodImpl<nsresult (nsIUrlClassifierDBServiceWorker::*)(), true>::Run() (/opt/build/froydnj/build-dmd/toolkit/components/url-classifier/../../../dist/include/nsThreadUtils.h:885)
    #15: nsThread::ProcessNextEvent(bool, bool*) (/home/froydnj/src/gecko-dev.git/xpcom/threads/nsThread.cpp:874)
    #16: NS_ProcessNextEvent(nsIThread*, bool) (/home/froydnj/src/gecko-dev.git/xpcom/glue/nsThreadUtils.cpp:277)
    #17: mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) (/home/froydnj/src/gecko-dev.git/ipc/glue/MessagePump.cpp:355)
    #18: ~AutoRunState (/home/froydnj/src/gecko-dev.git/ipc/chromium/src/base/message_loop.cc:520)
    #19: nsThread::ThreadFunc(void*) (/home/froydnj/src/gecko-dev.git/xpcom/threads/nsThread.cpp:365)
    #20: _pt_root (/home/froydnj/src/gecko-dev.git/nsprpub/pr/src/pthreads/ptthread.c:215)
    #21: start_thread (/build/glibc-Ir_s5K/glibc-2.19/nptl/pthread_create.c:309 (discriminator 2))
    #22: __clone (/build/glibc-Ir_s5K/glibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:113)
    #23: ??? (???:???)
  }
}

which strikes me as odd: AFAICT, we record the memory used for every nsUrlClassifierPrefixSet when we construct it, and about:memory certainly shows numbers for several prefix sets.

Several others with roughly the same call stacks follow, in the same vein.

I should note that the above output was generated with --sample-below=1, so DMD was counting blocks precisely rather than sampling.

Nick, do you know what's going on here?
Flags: needinfo?(n.nethercote)
My first instinct is to trust DMD. Are you sure we record every nsUrlClassifierPrefixSet?

A good thing to check would be if any URL classifier stacks are appearing in the "Once-reported" section of the output. If so, it suggests that we're reporting some but not all memory. If not, it suggests that something is wrong with DMD.
Flags: needinfo?(n.nethercote)
(In reply to Nicholas Nethercote [:njn] from comment #1)
> My first instinct is to trust DMD. Are you sure we record every
> nsUrlClassifierPrefixSet?

I'm pretty sure.  The memory-consuming part of nsUrlClassifierPrefixSet starts here in SetPrefixes:

http://mxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp#64

which calls MakePrefixSet:

http://mxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp#85

which builds a bunch of nsTArrays.  The SetPrefixes function then computes the amount of memory used by those arrays:

http://mxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp#79
http://mxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp#261

The CollectReports function then reports the computed number:

http://mxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp#250

Unless we should be calling SizeOfIncludingThis from the CollectReports function directly?
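
To make the flow being described concrete, here is a minimal sketch of the measure-in-advance pattern (an editor's illustration, not the actual source; mMemoryInUse, SetPrefixes, SizeOfIncludingThis, and CollectReports are the real names from nsUrlClassifierPrefixSet.cpp, while UrlClassifierMallocSizeOf is an assumed name for a measurer generated by the real MOZ_DEFINE_MALLOC_SIZE_OF macro from nsIMemoryReporter.h):

  #include "nsIMemoryReporter.h"

  // Assumed name; MOZ_DEFINE_MALLOC_SIZE_OF generates a MallocSizeOf function.
  MOZ_DEFINE_MALLOC_SIZE_OF(UrlClassifierMallocSizeOf)

  NS_IMETHODIMP
  nsUrlClassifierPrefixSet::SetPrefixes(const uint32_t* aPrefixes,
                                        uint32_t aLength)
  {
    // ... MakePrefixSet() builds the nsTArrays ...

    // Measure immediately, on the url-classifier thread, and cache the result.
    mMemoryInUse = SizeOfIncludingThis(UrlClassifierMallocSizeOf);
    return NS_OK;
  }

  NS_IMETHODIMP
  nsUrlClassifierPrefixSet::CollectReports(/* callback args elided */)
  {
    // Reports the cached mMemoryInUse; SizeOfIncludingThis() is *not* called
    // here, so nothing is measured while the memory reporters are running.
    return NS_OK;
  }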

> A good thing to check would be if any URL classifier stacks are appearing in
> the "Once-reported" section of the output. If so, it suggests that we're
> reporting some but not all memory. If not, it suggests that something is
> wrong with DMD.

None of them are appearing in the Once-reported section.  All of them are in the Unreported section.
Flags: needinfo?(n.nethercote)
> Unless we should be calling SizeOfIncludingThis from the CollectReports
> function directly?

I'm 99% sure this is the problem. I can't remember the exact sequence off the top of my head but I think the "this block has been reported" markings get cleared before the memory reporters run, and then DMD looks at the markings after reporting has occurred.

So things are fine for about:memory, but not fine for DMD.
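
To illustrate the sequence just described, here is a standalone toy model (editor's sketch, not DMD's real code) of the bookkeeping: marks are cleared, reporters run and mark whatever they measure, and the analysis lists whatever is still unmarked.

  #include <cstdio>
  #include <set>

  static std::set<const void*> gLiveBlocks;  // all live heap blocks
  static std::set<const void*> gReported;    // blocks measured this cycle

  void ClearReports() { gReported.clear(); }

  // Stand-in for a MallocSizeOf function: measuring a block marks it.
  size_t MallocSizeOf(const void* aPtr) {
    gReported.insert(aPtr);
    return 0;  // the size itself is irrelevant for this illustration
  }

  void Analyze() {
    for (const void* b : gLiveBlocks) {
      if (gReported.count(b) == 0) {
        std::printf("Unreported block %p\n", (void*)b);
      }
    }
  }

  int main() {
    int* block = new int[4];
    gLiveBlocks.insert(block);

    MallocSizeOf(block);  // url-classifier measures ahead of time
    ClearReports();       // DMD clears the marks before the reporters run
    // The reporters run: url-classifier reports its cached number without
    // calling MallocSizeOf again...
    Analyze();            // ...so DMD lists the block as unreported.

    delete[] block;
    return 0;
  }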

I see now that this was caused by bug 1050108, which changed things so the size was pre-computed to avoid a race... and I was the one who r+'d the patch :/
Blocks: 1050108
Flags: needinfo?(n.nethercote)
I'll take a look at this, but maybe not until next week.
Assignee: nobody → n.nethercote
I finally took a look at this.

The problem is that DMD relies on heap blocks being measured while memory reporters are running. But the url-classifier measures its heap blocks in advance, in order to avoid races, and reports that pre-measured size at reporting time. As a result, DMD claims these blocks are unreported.

DMD does have a mechanism that allows heap blocks to be manually marked as reported. We can use that to fix this problem. I have a patch that does this, but there's some subtlety involved in ensuring it is correct that worries me.
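
A rough shape of that manual-marking mechanism (a hedged sketch; whether the actual patch looks exactly like this is not shown here, and the function name is assumed):

  #include "nsIMemoryReporter.h"

  // MOZ_DEFINE_MALLOC_SIZE_OF_ON_ALLOC generates a measurer that also tells
  // DMD "consider this block reported from now on", which is what a
  // measure-ahead-of-reporting scheme needs.
  MOZ_DEFINE_MALLOC_SIZE_OF_ON_ALLOC(UrlClassifierMallocSizeOfOnAlloc)

  // Then, in SetPrefixes():
  //   mMemoryInUse = SizeOfIncludingThis(UrlClassifierMallocSizeOfOnAlloc);
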
This is the easy fix. It informs DMD that url-classifier's measurement takes
place ahead of reporting.

But there's potential for errors. Consider this sequence of events:

- We build up some relevant data structures from scratch.

- We call SizeOfIncludingThis() to measure them in advance.

- We modify the relevant data structures in such a way that at least some of
  the heap blocks stay alive; e.g. by appending elements to an array without
  resizing it -- the array elements' heap block would be unchanged in this
  case.

- We call SizeOfIncludingThis() to measure the modified data structures. At
  least some of the heap blocks will be measured for a second time.

At this point, the mMemoryInUse value will be correct. However, DMD will think
that the heap blocks that were measured twice were also reported twice, and
(erroneously) complain.
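
A standalone illustration of the hazard (editor's sketch using std::vector rather than nsTArray; the toy counter stands in for DMD's per-block reported marks):

  #include <cstdio>
  #include <map>
  #include <vector>

  static std::map<const void*, int> gTimesMeasured;

  void MeasureBuffer(const std::vector<int>& aVec) {
    if (aVec.capacity() > 0) {
      gTimesMeasured[aVec.data()]++;  // same pointer => same heap block
    }
  }

  int main() {
    std::vector<int> prefixes;
    prefixes.reserve(16);      // one heap block, fixed for the whole run
    prefixes.push_back(1);

    MeasureBuffer(prefixes);   // first SizeOfIncludingThis()
    prefixes.push_back(2);     // append without triggering a reallocation
    MeasureBuffer(prefixes);   // second measurement hits the same block

    for (const auto& entry : gTimesMeasured) {
      if (entry.second > 1) {
        std::printf("block %p measured %d times -- DMD would flag a "
                    "double report\n", (void*)entry.first, entry.second);
      }
    }
    return 0;
  }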

I don't know this code well enough to understand if that is possible, but even
if it's not possible now, future changes to the code might make it possible.
The whole approach feels error-prone and it makes me nervous.

gcp, in bug 1050108 you said "I don't want to add locking just to get memory
reporting". How strong is that desire? If we did add locking we could then
revert to the previous measure-while-reporting approach and these subtle
complexities would go away.
Attachment #8716842 - Flags: feedback?(gpascutto)
Comment on attachment 8716842 [details] [diff] [review]
Fix the interaction between DMD and url-classifier memory reporting

Review of attachment 8716842 [details] [diff] [review]:
-----------------------------------------------------------------

I think the double-report cases are "safe" in that the nsTArray usage always goes through explicit Clear/Resize/Compact operations after the length is modified.

But yes, it's easy to break this.

I have no hard objection to locking things; I just thought it was sad that what is essentially "reporting" code ends up introducing races and requiring locking. But the additional complications here push back towards locking again.

So either way is fine by me.

::: toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp
@@ +48,5 @@
> +//
> +// Some care is required with this measure-in-advance approach. Specifically,
> +// we must avoid measuring any heap blocks twice (i.e. via separate
> +// SizeOfIncludingThis() calls) otherwise DMD will (erroneously) think they
> +// have been reported twice.

You need to explain here how to avoid the double-report, or at least link to this bug.
Attachment #8716842 - Flags: feedback?(gpascutto) → feedback+
This patch reverts the "measure-in-advance" approach added in part 1 of bug
1050108 -- because that doesn't interact well with DMD -- and adds locking to
avoid races between the url-classifier thread and the main thread.
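
A minimal sketch of the shape of this approach (editor's illustration, not the patch itself; it assumes a mozilla::Mutex member named mLock, a name which appears in the review hunk below):

  #include "mozilla/Mutex.h"

  // Mutation path (url-classifier thread): no pre-measurement any more.
  NS_IMETHODIMP
  nsUrlClassifierPrefixSet::SetPrefixes(const uint32_t* aPrefixes,
                                        uint32_t aLength)
  {
    MutexAutoLock lock(mLock);  // serialize against the reporting path
    // ... rebuild the nsTArrays ...
    return NS_OK;
  }

  // Reporting path (main thread): measure while holding the same lock, so
  // the blocks are measured during reporting, which is what DMD expects.
  size_t
  nsUrlClassifierPrefixSet::SizeOfIncludingThis(
      mozilla::MallocSizeOf aMallocSizeOf)
  {
    MutexAutoLock lock(mLock);
    size_t n = aMallocSizeOf(this);
    // ... n += each array's ShallowSizeOfExcludingThis(aMallocSizeOf) ...
    return n;
  }
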
Attachment #8717306 - Flags: review?(gpascutto)
Comment on attachment 8717306 [details] [diff] [review]
Fix up nsUrlClassifierPrefixSet memory reporting

Review of attachment 8717306 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp
@@ +171,5 @@
>  NS_IMETHODIMP
>  nsUrlClassifierPrefixSet::GetPrefixes(uint32_t* aCount,
>                                        uint32_t** aPrefixes)
>  {
> +  MutexAutoLock lock(mLock);

If you look, the only variables accessed here before the GetPrefixesNative(Inner) call are local ones, so you don't need to take the lock here; you can let GetPrefixesNative take it, and you can avoid the whole GetPrefixesNativeInner construction.

All this function does is wrap GetPrefixesNative in a way that JS code can use it, so it's also fairly safe to assume it won't suddenly start to access instance vars.
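
A sketch of the suggested simplification (editor's illustration with signatures elided; GetPrefixesNative and GetPrefixesNativeInner are the names from the review comment above):

  NS_IMETHODIMP
  nsUrlClassifierPrefixSet::GetPrefixes(uint32_t* aCount, uint32_t** aPrefixes)
  {
    // Only locals are touched here; no lock needed before delegating.
    // ... marshal the out-params ...
    return GetPrefixesNative(/* ... */);
  }

  nsresult
  nsUrlClassifierPrefixSet::GetPrefixesNative(/* ... */)
  {
    MutexAutoLock lock(mLock);  // take the lock here instead, so the separate
                                // GetPrefixesNativeInner() helper can go away
    // ... copy the prefixes out under the lock ...
    return NS_OK;
  }
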
Attachment #8717306 - Flags: review?(gpascutto) → review+
Attachment #8716842 - Attachment is obsolete: true
> If you look, the only variables being accessed here before the
> GetPrefixesNative(Inner) call are local ones, so you don't need to take the
> lock here, can let GetPrefixesNative do it, and can avoid the whole
> GetPrefixesNativeInner construction.

Good point. I can do much the same thing to get rid of SizeOfIncludingThisInner(), too, so I'll do that as well.
https://hg.mozilla.org/mozilla-central/rev/f3173fd2f161
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
I looked at the actual numbers this produces and they're slightly concerning (but it's not a flaw introduced here). Originally, the overhead was measured at about 6%: https://bugzilla.mozilla.org/show_bug.cgi?id=1046038#c14

After an update:

-rw-r--r-- 1 morbo morbo  551200 feb 11 15:33 goog-malware-shavar.pset
-1778178304[7f7c9cecf780]: SB tree done, size = 792016 bytes

-rw-r--r-- 1 morbo morbo  562180 feb 11 15:33 goog-phish-shavar.pset
-1778178304[7f7c9cecf780]: SB tree done, size = 798800 bytes

-rw-r--r-- 1 morbo morbo  131616 feb 11 15:33 goog-badbinurl-shavar.pset
-1778178304[7f7c9cecf780]: SB tree done, size = 249792 bytes

That's 44%-90% overhead (792,016 / 551,200 ≈ 1.44; 249,792 / 131,616 ≈ 1.90). The large overhead for the smaller table is not unexpected, but for the larger tables that's a lot. The strange thing is that loading the same tables on startup gives different numbers:

-129816832[7fabff965780]: SB tree done, size = 249792 bytes
-129816832[7fabff965780]: Active table: goog-badbinurl-shavar

-129816832[7fabff965780]: SB tree done, size = 658384 bytes
-129816832[7fabff965780]: Active table: goog-malware-shavar

-129816832[7fabff965780]: SB tree done, size = 668432 bytes
-129816832[7fabff965780]: Active table: goog-phish-shavar

The code that constructs the tables from in-memory values is much less efficient (in terms of allocator overhead) than the code that constructs them from disk.
Filed bug 1247615 for this discrepancy.