Bug 819817 (Closed) - DMD: cache calls to NS_DescribeCodeAddress
Opened 12 years ago • Closed 12 years ago
Categories: Core :: DMD, defect
Tracking: RESOLVED FIXED • Target Milestone: mozilla20
People: Reporter: n.nethercote, Assigned: n.nethercote
Whiteboard: [MemShrink:P2]
Attachments (1 file)
Patch (14.68 KB) — justin.lebar+bug: review+, approval-mozilla-aurora+, approval-mozilla-b2g18+
Description
DMD calls NS_DescribeCodeAddress a *lot*, and many of the calls are repeated. Caching the results could reduce dumping times significantly.
Updated • 12 years ago (Assignee)
Whiteboard: [MemShrink] → [MemShrink:P2]
Comment 1 • 12 years ago (Assignee)
I just tried removing the NS_DescribeCodeAddress call on Mac. Dump times for the browser with --sample-below=1 dropped from over 2 minutes to almost instantaneous. So this cache is likely to have a big effect.
I'm leaning towards a fixed-size, lossy cache for this, but I'd be happy to hear arguments against that.
Comment 2 • 12 years ago
> I'm leaning towards a fixed-size, lossy cache for this, but I'd be happy to hear
> arguments against that.
Like an LRU cache, or like a one-way direct-mapped cache?
I doubt it matters too much; you should do whatever seems easy, I think.
Comment 3 • 12 years ago (Assignee)
I was thinking direct-mapped because it's easier.
Comment 4 • 12 years ago (Assignee)
This patch implements the LocationService class, which basically wraps
NS_DescribeCodeAddress and caches requests.
With 2^12 entries, the cache takes ~150 KB (on 64-bit) or ~85 KB (on 32-bit),
and I get a hit rate of ~84%. With 2^13 entries it takes ~280 KB (~150 KB on
32-bit) and I get ~87%. So I went with 2^12 because it's a rounder number.
On Mac, this reduces dump time from 162s to 29s. On Linux, it reduces dump
time from a second or two to well under a second.
The whole thing feels a little over-engineered... but it sure works.
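[For readers skimming the thread, here is a minimal, self-contained sketch of the idea being described: a fixed-size, direct-mapped, lossy cache keyed on the program counter, falling back to the expensive symbolication call on a miss. The names (LocationServiceSketch, DescribeCodeAddress, GetLocation) and the entry layout are illustrative stand-ins, not the actual patch.]

#include <cstddef>
#include <cstdint>
#include <string>

// Stand-in for NS_DescribeCodeAddress: fills in a human-readable description
// of a code address.  It is a stub here so the sketch compiles on its own;
// the real call does the expensive symbol lookup.
struct CodeAddressDetails {
  std::string function;
  std::string library;
  size_t      lOffset = 0;
};

void DescribeCodeAddress(const void* aPc, CodeAddressDetails* aDetails) {
  aDetails->function = "?";
  aDetails->library  = "?";
  aDetails->lOffset  = reinterpret_cast<uintptr_t>(aPc);
}

// A fixed-size, direct-mapped, lossy cache keyed on the program counter.
// On a collision the old entry is simply overwritten, so the memory cost
// stays bounded at kNumEntries slots.
class LocationServiceSketch {
  static const size_t kNumEntriesLog2 = 12;             // 2^12 = 4096 entries
  static const size_t kNumEntries     = size_t(1) << kNumEntriesLog2;
  static const size_t kMask           = kNumEntries - 1;

  struct Entry {
    const void*        mPc = nullptr;
    CodeAddressDetails mDetails;
  };
  Entry mEntries[kNumEntries];

  static size_t Index(const void* aPc) {
    // Ignore the low (often-aligned) bits before masking into the table.
    return (reinterpret_cast<uintptr_t>(aPc) >> 2) & kMask;
  }

public:
  const CodeAddressDetails& GetLocation(const void* aPc) {
    Entry& e = mEntries[Index(aPc)];
    if (e.mPc != aPc) {                 // miss: do the expensive lookup once
      DescribeCodeAddress(aPc, &e.mDetails);
      e.mPc = aPc;
    }
    return e.mDetails;                  // hit: reuse the cached description
  }
};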
Attachment #692129 - Flags: review?(justin.lebar+bug)
Comment 5 • 12 years ago
Comment on attachment 692129 [details] [diff] [review]
DMD: cache calls to NS_DescribeCodeAddress for faster dumping.
>+// This class is used to print details about code locations.
>+class LocationService
[...]
>+  void Clear()
>+  {
>+    mPc = nullptr;
>+    InfallibleAllocPolicy::free_(mFunction);
>+    mFunction = nullptr;
>+    // Don't free mLibrary because it's externally owned.
>+    mLibrary = nullptr;
>+    mLOffset = 0;
>+  }
Nit: I don't think it matters in our case, but for cleanliness, can we Clear()
from ~Entry()?
>+ };
>+
>+ // A direct-mapped cache. This number of entries gives a hit rate of ~84%
>+ // (compared to ~91% for an unbounded cache) at a moderate space cost.
Isn't the point of the direct-mapped cache to use /less/ space (as an
approximation to a fixed-size LRU cache)? If the direct-mapped cache uses more
space than an unbounded cache, shouldn't we use the latter?
(Maybe you're concerned about a different test-case, in which the unbounded
cache actually uses more memory? But if so, perhaps you should say that? In
any case, since we limit the amount of data DMD outputs, I'm not sure we'd see
a lot of variance in the number of times we query the cache when calling
Dump().)
Also, nit, if you're going to have experimental results in the comment, could
you provide some information on the experimental procedure?
>+ static const size_t kNumBits = 12;
Please use a more descriptive name (e.g. kNumEntriesLog2), or just don't use
this variable.
>+static LocationService* gLocationService = nullptr;
This data should only live during one call to Dump(), right? So can we
allocate this guy on the stack of the Dump call so we don't waste the space for
the direct-mapped cache at all times?
I guess it's annoying to pass the location service in to every call to Print().
But maybe you could move it into the Writer class somehow.
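[To make that suggestion concrete, here is one hypothetical shape it could take, reusing the LocationServiceSketch class from the sketch after comment 4; Dump and Writer are stand-ins, not the actual DMD functions.]

// The cache lives only for the duration of one dump: it is created in Dump()
// (here on the stack; it could equally be heap-allocated and freed before
// returning) and handed to the printing code through the Writer, so its
// memory is not held at all times.
struct Writer {
  LocationServiceSketch* mLocService;   // borrowed; owned by Dump()'s frame
};

void Dump() {
  LocationServiceSketch locService;
  Writer writer{ &locService };
  // ... print each record, resolving stack frames with
  //     writer.mLocService->GetLocation(pc) ...
}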
Attachment #692129 - Flags: review?(justin.lebar+bug) → review+
Comment 6 • 12 years ago (Assignee)
> >+ // A direct-mapped cache. This number of entries gives a hit rate of ~84%
> >+ // (compared to ~91% for an unbounded cache) at a moderate space cost.
>
> Isn't the point of the direct-mapped cache to use /less/ space (as an
> approximation to a fixed-size LRU cache)? If the direct-mapped cache uses
> more space than an unbounded cache, shouldn't we use the latter?
I clearly worded that badly! Here's what I meant:
- With 2^12 entries, we get a hit rate of 84%.
- With 2^24 entries (which closely approximates an infinite direct-mapped cache) we get a hit rate of 91%.
The "moderate space cost" refers to the 4096 entries. I'll adjust the wording.
Comment 7 • 12 years ago
Ah, I understand better now.
I still feel dirty about keeping a global data structure for something that's only needed during Dump()'ing.
Comment 8 • 12 years ago (Assignee)
> I still feel dirty about keeping a global data structure for something
> that's only needed during Dump()'ing.
Sure, I'll see what I can do.
Comment 9 • 12 years ago (Assignee)
Comment 10 • 12 years ago
Comment on attachment 692129 [details] [diff] [review]
DMD: cache calls to NS_DescribeCodeAddress for faster dumping.
[Triage Comment]
npotb
Attachment #692129 - Flags: approval-mozilla-b2g18+
Attachment #692129 - Flags: approval-mozilla-aurora+
Comment 11 • 12 years ago
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20
Comment 12 • 12 years ago

Updated • 12 years ago (Assignee)
Component: General → DMD