Closed Bug 819817 Opened 7 years ago Closed 7 years ago

DMD: cache calls to NS_DescribeCodeAddress

Categories

(Core :: DMD, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla20
Tracking Status
firefox19 --- fixed
firefox20 --- fixed
b2g18 --- fixed

People

(Reporter: njn, Assigned: njn)

References

Details

(Whiteboard: [MemShrink:P2])

Attachments

(1 file)

DMD calls NS_DescribeCodeAddress a *lot*, and a lot of the calls are repeated.  This could reduce dumping times significantly.
Whiteboard: [MemShrink] → [MemShrink:P2]
I just tried removing the NS_DescribeCodeAddress call on Mac.  Dump times for the browser with --sample-below=1 dropped from over 2 minutes to almost instantaneous.  So this cache is likely to have a big effect.

I'm leaning towards a fixed-size, lossy cache for this, but I'd be happy to hear arguments against that.
> I'm leaning towards a fixed-size, lossy cache for this, but I'd be happy to hear 
> arguments against that.

Like an LRU cache, or like a one-way direct-mapped cache?

I doubt it matters too much; you should do whatever seems easy, I think.
I was thinking direct-mapped because it's easier.
This patch implements the LocationService class, which basically wraps
NS_DescribeCodeAddress and caches requests.

With 2^12 entries, the cache takes ~150 KB (on 64-bit) or ~85 KB (on 32-bit),
and I get a hit rate of ~84%.  With 2^13 it takes 280 KB (150 KB on 32-bit) I
get ~87%.  So I went with 2^12 because it's a rounder number.

On Mac, this reduces dump time from 162s to 29s.  On Linux, it reduces dump
time from a second or two to well under a second.  

The whole thing feels a little over-engineered... but it sure works.
Attachment #692129 - Flags: review?(justin.lebar+bug)
Comment on attachment 692129 [details] [diff] [review]
DMD: cache calls to NS_DescribeCodeAddress for faster dumping.

>+// This class is used to print details about code locations.
>+class LocationService

[...]

>+    void Clear()
>+    {
>+      mPc = nullptr;
>+      InfallibleAllocPolicy::free_(mFunction);
>+      mFunction = nullptr;
>+      // Don't free mLibrary because it's externally owned.
>+      mLibrary = nullptr;
>+      mLOffset = 0;
>+    }

Nit: I don't think it matters in our case, but for cleanliness, can we Clear()
from ~Entry()?

>+  };
>+
>+  // A direct-mapped cache.  This number of entries gives a hit rate of ~84%
>+  // (compared to ~91% for an unbounded cache) at a moderate space cost.

Isn't the point of the direct-mapped cache to use /less/ space (as an
approximation to a fixed-size LRU cache)?  If the direct-mapped cache uses more
space than an unbounded cache, shouldn't we use the latter?

(Maybe you're concerned about a different test-case, in which the unbounded
cache actually uses more memory?  But if so, perhaps you should say that?  In
any case, since we limit the amount of data DMD outputs, I'm not sure we'd see
a lot of variance in the number of times we query the cache when calling
Dump().)

Also, nit, if you're going to have experimental results in the comment, could
you provide some information on the experimental procedure?

>+  static const size_t kNumBits = 12;

Please use a more descriptive name (e.g. kNumEntriesLog2), or just don't use
this variable.

>+static LocationService* gLocationService = nullptr;

This data should only live during one call to Dump(), right?  So can we
allocate this guy on the stack of the Dump call so we don't waste the space for
the direct-mapped cache at all times?

I guess it's annoying to pass the location service in to every call to Print().
But maybe you could move it into the Writer class somehow.
Attachment #692129 - Flags: review?(justin.lebar+bug) → review+
> >+  // A direct-mapped cache.  This number of entries gives a hit rate of ~84%
> >+  // (compared to ~91% for an unbounded cache) at a moderate space cost.
> 
> Isn't the point of the direct-mapped cache to use /less/ space (as an
> approximation to a fixed-size LRU cache)?  If the direct-mapped cache uses
> more
> space than an unbounded cache, shouldn't we use the latter?

I clearly worded that badly!  Here's what I meant:

- With 2^12 entries, we get a hit rate of 84%.
- With 2^24 entries (which closely approximates an infinite direct-mapped cache) we get a hit rate of 91%.

The "moderate space cost" refers to the 4096 entries.  I'll adjust the wording.
Ah, I understand better now.

I still feel dirty about keeping a global data structure for something that's only needed during Dump()'ing.
> I still feel dirty about keeping a global data structure for something
> that's only needed during Dump()'ing.

Sure, I'll see what I can do.
Comment on attachment 692129 [details] [diff] [review]
DMD: cache calls to NS_DescribeCodeAddress for faster dumping.

[Triage Comment]
npotb
Attachment #692129 - Flags: approval-mozilla-b2g18+
Attachment #692129 - Flags: approval-mozilla-aurora+
https://hg.mozilla.org/mozilla-central/rev/e5fddab2f19e
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20
Component: General → DMD
You need to log in before you can comment on or make changes to this bug.