Closed Bug 551477
Opened 15 years ago, Closed 11 years ago
Add a built-in space profiler
Categories: Core :: General, defect
Status: RESOLVED WONTFIX
People: Reporter: jseward; Unassigned
References: Depends on 1 open bug
Attachments (4 files, 6 obsolete files):
  142.37 KB, text/plain
  134.35 KB, patch
  204.39 KB, image/png
  246.08 KB, patch
The following is a placeholder bug following a discussion on irc with
jorendorff, bz, ted, w/ apologies in advance if I paraphrase it wrong.
It would be nice to have some kind of space profiler facility built
into Fx. It appears there is concern about the difficulty in
determining what Fx is using memory for, especially in more complex
use scenarios (many open tabs, many extensions, many different web
domains displayed).
Initial discussion assumed that the traversal problem was somehow
solvable. Specifically, we would be able to traverse the entire JS
graph (easy) and also the entire C++ object graph (conservative scan?).
FTR, I am inclined to think of the traversal problem in a pretty
abstract sense: we have a combined C++/JS object graph, plus some set
of roots into the graph. We will have a traversal mechanism which
allows us to visit all nodes in said graph, plus a query mechanism
that can ask each node something about itself (what kind are you? who
put you here?), although it might be that from C++ objects we can get
no such data.
The question of what kind of information should or could be collected
proved less clear cut. I think Jason was after this:
* for each (web) domain, total amount of storage attributed solely to
that domain
* for each extension, total amount of storage attributed solely to
that extension
I think I have a more general proposal for this, which I'll put in a
following comment, so as to avoid cluttering this initial statement.
There may be other metrics which are interesting. Two of the most
basic are "what are you" (what kind of object) and "who put you here"
(what call stack allocated you?)
One other question is: how much can we do without having to add
profiling metadata word(s) to each object?
Reporter | Comment 1•15 years ago
(In reply to comment #0)
> * for each (web) domain, total amount of storage attributed solely to
> that domain
>
> * for each extension, total amount of storage attributed solely to
> that extension
Here's a strawman generalisation, since for sure we don't want to be
hardwiring concepts like "domain X" or "extension Y" into the
profiler.
We have a graph G and a set of roots R, so that each 'r' in R is a
pointer to some node 'nd' in G.
Imagine we also have a completely arbitrary set T of tags. Each
member of T characterises a root in some (arbitrary) way. Each root
then has some subset of T associated with it.
For example, T could be
{"domain is bbc.co.uk",
"extension is FlashBlock",
"tab number is 57"}
So then we can ask the profiler: how much of the graph is reachable
from roots which have the tag "domain is bbc.co.uk"? Or from roots which
have the tag "extension is FlashBlock"? Or from roots which have both
of those tags?
Is that a useful generalisation? I suspect so. Is it implementable?
I don't know.
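As a rough illustration of that query, here is a minimal sketch in C++, assuming toy Node/Root types and a simple reachability walk; none of these names correspond to real Gecko structures:

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical shadow of the object graph: each node has a size and
// outgoing edges; each root carries an arbitrary subset of the tag set T.
struct Node {
    size_t size;
    std::vector<Node*> edges;
};

struct Root {
    Node* target;
    std::set<std::string> tags;   // e.g. {"domain is bbc.co.uk", "tab number is 57"}
};

// "How much graph is reachable from roots which have `tag`?"
size_t BytesReachableFrom(const std::vector<Root>& roots, const std::string& tag) {
    std::set<Node*> seen;
    std::vector<Node*> work;
    for (const Root& r : roots)
        if (r.tags.count(tag) && seen.insert(r.target).second)
            work.push_back(r.target);
    size_t total = 0;
    while (!work.empty()) {
        Node* n = work.back();
        work.pop_back();
        total += n->size;
        for (Node* m : n->edges)
            if (seen.insert(m).second)
                work.push_back(m);
    }
    return total;
}
```

For the bbc.co.uk example, the query just sums the sizes of everything transitively reachable from roots carrying that tag; querying an intersection of tags would filter the roots differently but walk the graph the same way.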
Comment 2•15 years ago
FWIW, I think it would be fine to do the traversal only on request, as long as we keep enough information around to perform that at any time. I suspect we'd want to be able to gather this information and display it when a user visits about:memory.
Maybe it makes sense to just have the underlying API provide the data you just proposed, and then the front-end UI can figure out a way to display it that makes sense?
Comment 3•15 years ago
(In reply to comment #2)
> Maybe it makes sense to just have the underlying API provide the data you just
> proposed, and then the front-end UI can figure out a way to display it that
> makes sense?
Yep, that'd be the right approach; I suspect a lot of tools would want to use this data in different ways. Note that exposing a simple query interface would also be perfectly ok, though, since getting the data could well be expensive. But a query could also just be 'give me all the data'.
Comment 4•15 years ago
It seems ok to have an expensive query, as long as we document that the call is expensive. If a user opens about:memory because their Firefox is already screwed, it's probably not a big deal if it takes a little bit to run.
Reporter | Comment 5•15 years ago
Thinking a bit about the top level implementation choices and their
space requirements and intrusiveness. Seems like there's at least
three different things we can ask of an object, at heap-census time:
(1) What are you? (== what kind of object?)
Presumably we can find this out by examining the object itself.
Does not require any extra storage.
(2) Who put you here? (== what point in the program allocated you?)
This requires having extra storage associated with the object,
and this is needed for the entire lifetime of the object.
Usually done by having an extra word in each object, containing
some kind of allocation-point tag.
(3) Why are you still alive? (== who's pointing at you, and/or
from which root(s) are you reachable?)
Requires constant extra storage per object (perhaps a root-set
tag?) but that storage is only required transiently at
census-time, not for the entire lifetime of the object.
(1) and (3) might be acceptable. (2) is massively intrusive because
we'd have to change the layout of every object (or do some nasty hack
like having a shadow heap containing the allocation-point data), and
we'd also have to mess with every allocation point, to insert a tag
into the object/shadow-heap.
Note that (3) is a generalisation of the tags idea in Comment #1.
Comment 6•15 years ago
Gregor has been working on (2).
Reporter | Comment 7•15 years ago
(In reply to comment #6)
Is there a tracking bug, or some other place I can learn about
what he's doing?
Comment 8•15 years ago
(In reply to comment #7)
> (In reply to comment #6)
> Is there a tracking bug, or some other place I can learn about
> what he's doing?
There is bug 547327.
My approach is to monitor objects that are created within a loop and try to optimize the allocation size for further objects that are created at the same program location.
Comment 9•15 years ago
Julian, is there anything I can do to help move this along? Do you need your car washed? Meals cooked? Heaven and earth moved? Is there anybody you want rubbed out? Andreas maybe? Andreas, we need this feature, you're going to have to take one from the team.
Comment 10•15 years ago
I meant "for the team", but either way.
Comment 11•15 years ago
Ted pointed me to this bug from here:
http://groups.google.com/group/mozilla.dev.platform/browse_thread/thread/f2008793142b9abb/49e00f7d41b70ce1?hl=en&lnk=gst&q=memory+profiling+firebug#49e00f7d41b70ce1
One of the most requested features for Firebug is support for web page memory profiling. Recently we tried to implement a prototype (based on the memory API included in the Jetpack prototype). You can see more here:
http://getfirebug.com/wiki/index.php/Firebug_Memory_Profiler
The first question: what is the relation between the existing Jetpack APIs and this bug? Will Jetpack be used as a base for the solution?
---
So far, there is no way to get precise size (raw bytes) information
for e.g. strings, which would be a nice addition to the size info
already provided. Some other more structured size info (related to
images, HTML elements, etc.) would also be useful.
The prototype we did for Firebug showed us that one of the most useful things (besides the actual amount of memory consumed) is the list of referents (comment #5, point #3). We managed to get such a list (using the Jetpack APIs), since they provide a way to access the entire graph of JS objects in the current Firefox runtime.
Note that we needed to filter the graph in order to get JS objects only for the current page (which is how Firebug works, always focusing on the current page), so having some kind of query interface (as mentioned in comment #3) would help.
I also agree with Vladimir that various tools will want to present the data to the user in different ways. For example, in the case of Firebug we also want to integrate the memory info with existing panels.
The second question: as I mentioned, using the Jetpack APIs it's relatively easy to get the entire graph of metadata, where each node has a unique ID. The problem is that there is no way to get a JS object's ID and/or to get an existing JS object from a provided ID (I know that JSD provides an API to get these IDs for script objects: jsdIScript.tag), which complicates the "JS object <-> meta-data" relation.
In Firebug it would be extremely helpful to show the metadata in the context of real objects on the page (e.g. in the Watch panel, DOM panel, etc.).
Could such methods (like: getJSObject(id), getJSObjectID(obj) ) also be part of the memory API?
Honza
Reporter | Comment 12•15 years ago
> The first question: what is the relation between the existing Jetpack APIs and
> this bug? Will the Jetpack be used as a base for the solution?
I had a look at the Firebug profiler links. It looks like a nice
presentation and query environment. I think we're approaching the
same problem from opposite ends, though, and I don't think I can
answer your questions at present. Here's what I'm looking into.
My goal atm is low level enumeration of the graph of objects, with the
short-term aim of maximising coverage. I use "object" in the loosest
sense here, to mean any lump of memory to which anyone has a pointer
or some kind of reference. So I am aiming to cover the JS heap, the
C++ heap, and not just stuff under control of the cycle collector, but
ad-hoc malloc'd memory too.
Given the great diversity of objects that covers, I am thinking about
the idea of building a "shadow graph" -- a throwaway structure which
is (mostly) constructed when we want a memory census, analysed, and
discarded.
The shadow graph is composed of shadow nodes, each of which is:
start-address length arbitrary-description set-of-out-pointers
plus a set of pointers which serve as roots. At this stage the
arbitrary-description can be minimal or nothing.
The shadow graph can be incomplete or inaccurate. Basically we forage
around for information to build it with, then hand it off to an
analyser. The analyser is tolerant of missing pointers, pointers to
the middle of nodes, pointers to static objects, apparent
inconsistencies like nested or overlapping blocks, etc.
The graph would be constructed, for JS objects, by a hacked version of
JS_DumpHeap. For C++ objects I am thinking to add hooks to
memory/mozalloc/mozalloc.h which observe all mallocs/news and
frees/deletes. At analysis time any such blocks would be scanned to
find pointers to other blocks.
This has various advantages:
* it gives coverage of C++ and JS objects
* without having to get involved with the Cycle Collector
* all the analysis code can be in one place, rather than scattered
around
* the analysis code does not need to be aware of the layout of any of
the objects
* the only code that does need to be aware of object layouts (and be
scattered around) are "abstractor methods", you might say, which
take (eg) a JSSquirrellyThing* and give back one or more shadow
nodes that represent it
* we can incrementally increase the level of detail in the
arbitrary-description field as required to support more detailed
queries.
* it extends naturally to anything else we write an "abstractor
method" for, eg, mmap'd blocks
* initially the description field can be empty. Even like that we get
basic stats on the numbers and sizes of allocated objects, plus
reachability information.
* the shadow graph doesn't need to be constructed all at the same
time, iow we don't necessarily need to stop the entire world to do
it. We can do analyses of incomplete or inconsistent graphs if need
be.
* as soon as the shadow graph is built, the system can continue. We
don't have to wait around for expensive reachability or whatever
analyses to complete. That can be done by a separate thread. Or
even by a different process, offline.
* when profiling is inactive it's free in space terms and almost free
in time terms. "Almost" because it would still require an extra
conditional branch on the mozalloc malloc/new functions to decide
not to collect the allocated address/lengths.
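In its simplest form, the malloc/free observation described above might look like this sketch (the function and table names are invented; the real hooks would live in memory/mozalloc/mozalloc.h and would additionally need to be thread-safe and avoid skewing the results with their own allocations):

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>

// Running table of live C++ heap blocks: start-address -> length.
// This is the "2 words/block" bookkeeping the census reads later.
static std::map<void*, size_t> gLiveBlocks;

// Hypothetical wrappers standing in for hooks on moz_xmalloc/moz_free.
void* ProfiledMalloc(size_t size) {
    void* p = malloc(size);
    if (p)
        gLiveBlocks[p] = size;   // note the returned pointer and length
    return p;
}

void ProfiledFree(void* p) {
    gLiveBlocks.erase(p);        // the block is no longer part of the heap
    free(p);
}
```

At census time, each recorded block would be scanned for values that point into other recorded blocks, yielding the CH->CH edges of the shadow graph.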
Anyway. If anyone thinks this is the wrong thing to build, or simply
won't work, please yell asap.
Comment 13•15 years ago
(In reply to comment #12)
> Anyway. If anyone thinks this is the wrong thing to build, or simply
> won't work, please yell asap.
If it weren't you saying it, I would say, "you're vastly underestimating the complexity of what you're proposing." But since it *is* you, I'll say, "It sounds too good to be true."
Comment 14•15 years ago
I don't understand what you mean by "abstractor methods", and what they would do.
Reporter | Comment 15•15 years ago
(In reply to comment #14)
> I don't understand what you mean by "abstractor methods",
I mean a function which takes an object and returns its essential
details for inclusion into the shadow graph. For example, given a
pointer "p" to one of these
struct Foo { int a; Blah* b; char c; D* d; }
the abstractor function must return at a minimum these 3 items:
p sizeof(*p) {p->b, p->d}
that is, the address range of the object and the set of pointers found
within it. That is the minimum info needed to build the shadow graph.
The point about intercepting moz_xmalloc et al is that we can extract
the above info automatically, for malloc'd blocks. This is just as
well since it would obviously be totally infeasible to manually
annotate every C++ constructor in the tree.
When a malloc-style allocation is made, we note the returned pointer
and the requested size. That means for a 2 words/block overhead we
get a list of all the blocks in the C++ heap. Then at analysis time
we can scan each block to find any pointers to other blocks it might
hold. This is really just conservative C/C++ garbage collection
stuff.
I just verified that I can stick a hook in moz_xmalloc et al (thanks
cjones!) and see all mallocs and frees going by. So this part of the
plan at least looks like it might work.
For JS objects and other stuff in the JS heap I will need to write
custom abstractor functions. But that's not such a big deal because
there's only a limited set of object formats to consider.
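Putting the Foo example above into code, a minimal abstractor might look like this (ShadowNode is a stand-in for whatever the shadow-graph node type ends up being; Blah and D are the opaque types from the example):

```cpp
#include <cstddef>
#include <vector>

// Assumed shadow-graph node: address range plus out-pointers.
struct ShadowNode {
    const void* start;
    size_t length;
    std::vector<const void*> outPointers;
};

struct Blah;
struct D;
struct Foo { int a; Blah* b; char c; D* d; };

// Abstractor for Foo: returns the minimum info needed to build the
// shadow graph, i.e. (p, sizeof(*p), {p->b, p->d}).
ShadowNode AbstractFoo(const Foo* p) {
    return ShadowNode{ p, sizeof(*p), { p->b, p->d } };
}
```

For malloc'd C++ blocks the same information falls out automatically from the allocation hooks plus a conservative pointer scan; hand-written abstractors like this are only needed where the layout is known and worth describing precisely, such as the JS heap.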
Comment 16•15 years ago
Julian, how much of a problem would tagged pointers be for this scheme? We use them in some places in the DOM, and I've been trying to decide whether we should stop doing so... Does it matter for this setup one way or another?
Reporter | Comment 17•15 years ago
(In reply to comment #16)
I don't think that would be a problem -- those bits can just be masked
off. It weakens a bit the heuristic used to decide whether something
could be a pointer to a block, because we can no longer reject
misaligned values as can't-possibly-be-a-pointer. But I don't think
that's a particularly safe heuristic anyway, since it would assume
that nobody in the entire system uses tagged pointers. Which I don't
believe.
So the short answer is: it makes no difference.
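Concretely, the masking is trivial. Here is a sketch assuming a 3-bit tag in the low pointer bits (the actual tag width the DOM uses may differ):

```cpp
#include <cstddef>
#include <cstdint>

// Assumed tag width: low 3 bits. A wider tag only changes the mask.
const uintptr_t kTagMask = 0x7;

uintptr_t StripTag(uintptr_t maybePtr) {
    return maybePtr & ~kTagMask;
}

// Conservative test used by the scanner: after stripping tag bits, does
// the value fall inside a known block [start, start + len)?
bool PointsInto(uintptr_t maybePtr, uintptr_t start, size_t len) {
    uintptr_t p = StripTag(maybePtr);
    return p >= start && p < start + len;
}
```

As noted above, the cost is that a misaligned value can no longer be rejected outright as not-a-pointer; it may now be a tagged pointer to an aligned block.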
Comment 18•15 years ago
As a devil's advocate type question, what is the advantage of this approach versus forking an inert copy of the program or performing a non-fatal core-dump and then probing that static memory snapshot using a python-enabled gdb?
Assuming debug symbols are available, gdb can introspect objects and tell when a pointer in memory amounts to an nsCOMPtr/nsRefPtr versus a presumed invariant-controlled backlink and otherwise get a lot of structure knowledge for free.
The 'self-introspection' goal could be satisfied by taking a page from roc's chronicle playbook and having the python/gdb driver expose an HTTP loop that consumes and returns JSON. One might even argue that it's cleaner because the patient is never performing analytical surgery on itself.
In any event, this is very exciting work you are undertaking and I look forward to the results! You may be interested in trace-malloc's leak-soup logic that does some of the conservative pointer analysis stuff already, apparently:
http://mxr.mozilla.org/mozilla-central/source/tools/trace-malloc/
Comment 19•15 years ago
gdb, what's that? I'm a Windows-based JS developer who wants to know where all my memory is going!
/be
Comment 20•15 years ago
The use case I proposed when I badgered Julian into filing this bug was "I'm a normal user, and my Firefox is using a lot of memory. How can I find out what's using all that memory?" Ideally we will provide an API that things like about:memory can use which could give a rough idea of how much memory is entrained by each extension and web site. We need to give users tools so that instead of saying "Firefox uses too much memory" they can say "AdBlock uses too much memory when I visit cnn.com". (Then, ideally, the parties responsible can fix their code!)
Reporter | Comment 21•15 years ago
(In reply to comment #20)
> [...] they can say "AdBlock uses too much memory when I visit cnn.com".
Yes, I completely agree that's where we want to get to eventually and
yes it will no doubt require an API to query the collected data. But
first we need the low-level info-collecting machinery in place, since
for sure I don't see how we can answer any interesting question
without having at a minimum a graph, roots and reachability analysis.
Also .. I'm reluctant to build a profiler which only has partial
coverage (eg, JS heap only), hence the thinking about the C++ heap at
this early stage.
Reporter | Comment 22•15 years ago
One thing I'd like to ask the assembled wisdom here is, when should
a heap census be performed? My current plan is:
* immediately after the cycle collector has run, and
* immediately after JSGC has run
From sticking printfs in various places it appears that a run of CC
necessarily causes a run of JSGC (yes?) but not vice versa. So I'm
inclined to hack up some logic which does a census after every CC run
and after every JSGC run that isn't bracketed by a CC run.
Better suggestions / your-analysis-is-bogus style comments welcomed.
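The trigger policy just proposed can be sketched as a couple of flags (the names are invented; real integration would hang off the CC and JSGC completion callbacks):

```cpp
// Census after every CC run, and after every JSGC run that is not
// bracketed by a CC run (since, per the printf experiments, a CC run
// necessarily causes a JSGC run but not vice versa).
static bool gGCInsideCC = false;
static int gCensusCount = 0;

void OnCCStart() { gGCInsideCC = true; }   // the CC will drive a GC

void OnJSGCEnd() {
    if (!gGCInsideCC)
        gCensusCount++;                    // standalone GC: take a census
}

void OnCCEnd() {
    gCensusCount++;                        // census after every CC run
    gGCInsideCC = false;
}
```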
Comment 23•15 years ago
Right, I'm not suggesting you need to implement all that, just the underlying bits! It was more in response to Andrew's "why not just use gdb + python" question.
Comment 24•15 years ago
I raised the gdb solution because it seemed like a way to avoid a slippery slope on the C++ analysis front and because forking seems like the fastest way to get the system running again. Specifically, implementing the abstractor methods seems like manually reproducing the work that debug information already provides, and using gdb is much less risk-prone than rolling your own DWARF parser. (Of course, gdb-python or dehydra/treehydra could be used to automatically generate those if the need arose.)
It obviously is not the ideal long-term solution for a web developer focused tool. It might be reasonable for a mozilla developer tool, especially as it could be viable for crash-dump analysis. Mainly, I was thinking it could be reasonable as a simpler-to-develop prototype given that I believe Jim Blandy already has the JS side of the house covered for gdb-python ( http://hg.mozilla.org/users/jblandy_mozilla.com/archer-mozilla/ ). The fundamental assumption that the python-gdb way is simpler might be very wrong on my part...
fwiw, I believe gdb can run on windows, but I am unclear as to whether it understands the MS compiler's debugging symbols format, which would be the most important thing.
In any event, an in-tree solution is even more exciting and useful than an external mechanism with dependencies; I eagerly await the fruits!
Reporter | Comment 25•15 years ago
I have working, a first cut at machinery that creates a combined
C++/JS shadow graph, as per proposal in comment #12. Will clean it up
and post a wip patch later this week. This is still far from being
useful.
For a Fx idling and displaying news.bbc.co.uk, there are about 88k
nodes in the graph, of which 84k are C++ and 4k are JS.
Reporter | Comment 26•15 years ago
What this patch demonstrates:
* creation of a combined C++/JS shadow graph
* done periodically, at the end of some JS garbage collections
* thread-safe-ly, more or less, and without deadlocking the system
* reasonable performance (resulting browser is quite usable)
* without the profiler's own dynamic memory allocation skewing the
results
Example output: all C/C++ heap allocations via mozalloc are tracked,
and running stats of live blocks maintained. Periodically a 1-liner
summary is printed:
HEAPPROF 20747: CH: 1605639/1444361/170M total adds/dels/alloc,
161278/14595k live blocks/bytes
Distribution of ages at death is also computed but not displayed.
After some but not all JS GCs, a full profile is initiated. A
shadow graph is built, cleaned up, and some simple stats on nodes
and edges computed:
HEAPPROF 20747: PROFILE #8: BEGIN creating shadow graph
HEAPPROF 20747: profile #8: analysis begin
HEAPPROF 20747: raw: 161363 CH blocks, 4200 JS blocks, 207 dups
HEAPPROF 20747: initial sanity checks ok
HEAPPROF 20747: refinement done, 434094 mapped, 807262 unmapped
HEAPPROF 20747: objects/bytes: 4003/46752 JS, 161353/16426453 CH
HEAPPROF 20747: edges: 279 JS->JS, 225 JS->CH, 5647 CH->JS, 315904 CH->CH
HEAPPROF 20747: outdegree: 504 from JS, 321551 from CH, 322055 total
HEAPPROF 20747: indegree: 5926 to JS, 316129 to CH, 322055 total
HEAPPROF 20747 18183 objects of 165356 total unreferenced from within the SG
HEAPPROF 20747: PROFILE #8: END
What next:
As it stands, this is more or less useless. It mechanically extracts
low level reachability, but knows nothing about each node.
One direction is to try to answer "what kind of objects are in the
heap?" questions. This could be provided by the JS world without
difficulty. For the C++ heap we might be reduced to guessing object
types by looking for pointers to vtables within the objects.
Another direction is to answer "who put these objects in the heap?"
questions. This would require tagging each object with its allocation
point. For the C++ heap this could possibly be done by getting
stack traces from Breakpad.
Comment 27•15 years ago
(In reply to comment #26)
> Another direction is to answer "who put these objects in the heap?"
> questions. This would require tagging each object with its allocation
> point. For the C++ heap this could possibly by done by getting
> stack traces from Breakpad.
How about causality tracking? For example, when a DOM click event gets generated it would generate a 'frame' that says "DOM click calling foo.js/func bar()/line 53" and uses that as the tag. When that event-triggered JS initiates a setTimeout it creates a new frame to use when the setTimeout gets dispatched and links to the currently active (event-origin) frame. In order to avoid creating the world's longest linked list, a setTimeout from inside a setTimeout callback would reference the original progenitor instead and boost a counter to reflect that it's the result of a chained setTimeout. It seems like they could just be JS objects and the normal garbage collector would take care of it?
The same mechanism could potentially be exposed to C++ consumers with default logic being provided for the event loop/native nsITimers/other low-hanging fruit using the breakpad (or external address->symbol resolution). The startup timeline does some of this and I did a bit more for my recently started and regrettably currently triaged/abandoned Thunderbird memory work. I even went so far as to push state on all XPConnect C++->JS crossings, although I also was interested in the execution and its timing as well as the memory costs.
Comment 28•15 years ago
FYI, Breakpad can only produce stack traces after-the-fact. (We don't ship the debug symbols with the app, which it needs to walk the stack.)
Comment 29•15 years ago
We could expose the symbols via a web service (I don't think the symbol server protocol is fine-grained enough for this) so that the client could request a bunch of symbol/frame factoids and do the walking, if we wanted to. We have the information, can make all sorts of tradeoffs about how to get it to the user, including just pulling them all down the first time the user wants to walk some stack.
Comment 30•15 years ago
Sure, we can do anything (it's just software, right?) I'm just pointing out that as currently implemented, Breakpad isn't really a good fit for the task.
Reporter | Comment 31•15 years ago
(In reply to comment #29)
For sure we can't do anything much interesting with the C++ heap
without having symbols or unwind data. Given those two I think we
could answer what's-in-the-heap and who-put-it-there questions, which
would be a good start. For x86-{linux,darwin} with
-fno-omit-frame-pointer (the default) we might be able to get away
without unwind data, but pretty much all other platforms require it.
I like the pre-canned factoid idea for a couple of reasons:
* With careful design of the factoids, they can be platform
independent. Almost all of the hassle about unwinding the stack and
producing traces is in reading the debuginfo. So this scheme
minimises the extra complexity compiled in to the browser, putting
it instead in the factoid server.
* At least for ELF/Dwarf3, the frame unwind info is stored in a form
which is compact but not suitable for fast unwinding, so the
Dwarf3->Factoid server could reformat it to be more suitable for
fast unwinding.(*)
There's lots of stuff on the web about Microsoft symbol servers, but I
don't see anything for other platforms. Do cross-platform versions of
the technology exist?
(*) partially evaluate the Dwarf3 CFI for each address range,
producing, for each range, a canned summary of how to compute the
integer registers of the calling frame from the values in this frame.
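A partially evaluated rule of that kind could be as small as the following sketch (field names and register numbering are assumptions, loosely modeled on DWARF CFA conventions, not any real unwinder's format):

```cpp
#include <cstdint>

// One canned unwind recipe per address range: how to compute the
// caller's frame from the register values in this frame, with no DWARF
// interpretation needed at unwind time.
struct UnwindRule {
    uintptr_t startAddr, endAddr;  // code range this rule covers
    int cfaReg;                    // CFA = registers[cfaReg] + cfaOffset
    long cfaOffset;
    long raOffsetFromCfa;          // return address stored at CFA + offset
};

// Fast-path step of an unwind: recover the canonical frame address.
uintptr_t ComputeCFA(const UnwindRule& r, const uintptr_t regs[]) {
    return regs[r.cfaReg] + r.cfaOffset;
}
```

The factoid server would emit a table of such rules, sorted by address range, so the client only does table lookups and additions per frame.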
Comment 32•15 years ago
Didn't Vlad do some factoid-like work already?
/be
Comment 34•15 years ago
I accidentally opened a dupe on this topic in order to point out http://www.eclipse.org/mat/ , which from my perspective offers a sort of high-point of the UI you might want.
(Note in particular the pleasantly informative choice of showing dominator trees)
Comment 35•15 years ago
(In reply to comment #26)
> For the C++ heap we might be reduced to guessing object
> types by looking for pointers to vtables within the objects.
I suspect you can get pretty far with this sort of thing. There are limits of course, but I think you can get a fair bit out of the existing XPCOM infrastructure. Could dig in an allocation looking for a pointer to nsIClassInfo (https://developer.mozilla.org/en/Using_nsIClassInfo), and/or could tailor a small non-vtbl mixin class for decorating other non-XPCOM / non-virtual types with a pointer to a tiny amount of metadata (particularly: "your module and type name, an offset to a member pointer that's your owner, and/or offset to a member pointer to an object from which you inherit a sense of site-origin").
If you can infer an ownership-tree out of the graph dominators, a few unidentified intermediate nodes along a chain leading to, say, self-identifying image data, DOM nodes or JS objects probably isn't a big hazard to comprehension. I'd focus on this before worrying about the "who allocated you" side of the equation, and building a stack-crawling system. Just getting a characterization of the contents of a really bad heap would be a good start.
The nice thing about bundling this in the standard build is we'll get a whole lot more sampling from the field, cases where sore spots are really standing out, not just "developer sandbox testcases".
Reporter | Comment 36•15 years ago
(In reply to comment #35)
I tried for a while to do a quick hack to get ELF symbol information
into the process, so as to see if any info can be scraped out of heap
blocks. But got nowhere; dlsym on Linux won't say anything about
private data symbols, which most vtables are. Plus it's unportable
and no use for field runs. I'm sure there's a benefit to be had from
reading debug info, but it seems like it'll take serious integration
with breakpad (in some way) to make it workable and cross-platform.
> but I think you can get a fair bit out of the existing XPCOM
> infrastructure. Could dig in an allocation looking for a pointer to
> nsIClassInfo (https://developer.mozilla.org/en/Using_nsIClassInfo),
At profile time I am merely faced with a collection of blocks that
malloc/new have noted. I don't know which of those I can safely run
QueryInterface on.
What would be nice is for the compiler to automatically pass a
uniquely identifying text tag to each allocation. I can't find a way
to do that in the general case. But for classes that use XPCOM, we're
in luck. Every participating class has NS_DECL_ISUPPORTS,
NS_DECL_CYCLE_COLLECTING_ISUPPORTS or NS_DECL_ISUPPORTS_INHERITED in
its declaration. By adding to those macros the following
public: \
static void* operator new (size_t size) { \
return moz_malloc_tagged(size, (char*)__PRETTY_FUNCTION__); \
} \
static void operator delete (void *p) { moz_free(p); } \
it is possible to make a class-specific override of new/delete, which
passes __PRETTY_FUNCTION__ onwards, and that string contains the class
name. Wiring all that together makes it possible to generate stats
like this (edited to make it less wide):
objects/bytes: 118343/12905257 in CH, 29901/2965424 in CH-tagged
[ 0] objects/bytes: 2000 / 480000 sStandardURL
[ 1] objects/bytes: 4251 / 408096 CSSStyleRuleImpl
[ 2] objects/bytes: 3165 / 278520 nsTextNode
[ 3] objects/bytes: 2369 / 265328 XPCWrappedNative
[ 4] objects/bytes: 876 / 147168 nsEventListenerManager
[ 5] objects/bytes: 1533 / 147168 nsXULElement
[ 6] objects/bytes: 2962 / 118480 AtomImpl
[ 7] objects/bytes: 673 / 91528 nsHTMLAnchorElement
[ 8] objects/bytes: 237 / 45504 nsLocalFile
[ 9] objects/bytes: 156 / 41184 imgRequest
(many lines deleted)
This says: in the C++ heap there are currently 118343 blocks
comprising 12905257 bytes. Of these, 29901 blocks (2965424 bytes)
have a tag. Followed by a per-class breakdown for the tagged blocks,
most voluminous first. The full profile has 447 lines, so at least
447 classes have been covered.
What this also shows is that only about 1/4 of the blocks are
identified. Since it covers (almost) all XPCOM classes, this gives a
quick upper bound on what coverage we can get from XPCOM.
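Behind the macro shown above, a moz_malloc_tagged could maintain a per-tag table along these lines (a sketch: the table layout and names are assumptions, and the real version would also record the block itself for the shadow graph):

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>

// Per-tag running totals, keyed by the __PRETTY_FUNCTION__-derived tag.
struct TagStats {
    size_t objects = 0;
    size_t bytes = 0;
};
static std::map<std::string, TagStats> gPerTag;

// Hypothetical stand-in for moz_malloc_tagged(size, tag).
void* moz_malloc_tagged_sketch(size_t size, const char* tag) {
    TagStats& s = gPerTag[tag];
    s.objects += 1;
    s.bytes += size;               // feeds the per-class breakdown above
    return malloc(size);
}
```

Dumping gPerTag sorted by bytes descending would produce exactly the kind of per-class breakdown shown in the profile output.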
It would be nice to expand the coverage. I am wondering if adding a
single line to other allocation points would not be a bad tradeoff.
It could be done incrementally.
Comment 37•15 years ago
Julian, I hate to suggest this, but I think the biggest benefit here would be to get this working under Win32 initially; there at least I know that the pdb files have all the info we need to resolve vtable pointers, and the dbghelp library makes doing that straightforward.
I went off on a bit of a tangent yesterday and wrote a quick tool to walk the entire Win32 heap (win32, not jemalloc), dumping the first 4 bytes of each allocation for later vtable analysis. It returns all that to js, and then dbghelp or other tools (map files also come to mind -- we could make this completely cross platform if we standardized on one map file format) can be used via ctypes/etc. to translate to symbol names. I'll get that patch up and cc you.
Comment 38•15 years ago
(In reply to comment #37)
> Julian, I hate to suggest this, but I think the biggest benefit here would be
> to get this working under Win32 initially; there at least I know that the pdb
> files have all the info we need to resolve vtable pointers, and the dbghelp
> library makes doing that straightforward.
I think this is a really bad idea. This should be a tool in *every build* of the program. Not "some developer builds, sometimes, when they happen to have set things up right". You should always be able to take a browser -- when it's up at 1gb resident after a week of browsing on a normal desktop and you don't know why -- and just ask it "why?"
The problem we face here is that in "our testing", when we do it on talos or under controlled conditions, we have these lovely graphs that say we don't use much memory and we always release it and everything is lovely. And then users in the field keep posting back screenshots from week-long sessions with a few dozen tabs and a modest set of extensions saying "no, you don't, it's really out of control and my system is paging again and it's firefox's fault". Then we turn around and point at our pretty graphs and tell them they must be imagining things because it's well-behaved when we tested it.
We need to close that gap, and that means gathering better *field data*.
I think we're saying the same thing -- I'm pointing out that 90% of our users are on win32, so getting that data would be a lot more useful than gathering linux data, and that there's a standardized way to obtain all the symbol data that we'd need via the symbol server, plus a simple library that we can use to query that data. All stuff that makes it possible to actually gather real field data :-)
We could also look into teaching breakpad how to spit out text map files so that we have a standard "symbol data" format without having to write platform specific code to get it as well. But right now, as far as I know, win32 is the only platform where we have infrastructure to say "give me symbols for release/nightly build with signature abcd".
Reporter
Comment 40•15 years ago
(In reply to comment #39)
> I think we're saying the same thing -- I'm pointing out that 90% of our users
> are on win32, so getting that data would be a lot more useful than gathering
> linux data,
I understand what you're saying, and I'm not disagreeing. However,
I'm not trying to gather linux data. I'm trying to see how far it's
possible to get without any external assistance, à la Graydon's
comments.
I'm inclining more towards a source-oriented approach -- what can we
do to get the compiler to add this info automatically? For one thing,
vtable scraping is going to fail to tell us anything for straight
malloc'd blocks, or for classes in which no vtable is required.
Reporter
Comment 41•15 years ago
(In reply to comment #36)
> What this also shows is that only about 1/4 of the blocks are
> identified. Since it covers (almost) all XPCOM classes, this gives a
> quick upper bound on what coverage we can get from XPCOM.
25% coverage of the heap is pretty useless. I wondered how many
other allocation points would have to be hand annotated to
improve that significantly. Problem is there are thousands of
allocation points (constructors, mostly). On the assumption that
allocation probably follows a 90/10 rule, I bootstrapped the
process by hacking the profiler to record the top 5 frames of
each allocation, commoned those up, and looked at the sources
corresponding to the 5-frame groups that allocated the most.
This tells me where to put hand annotations so as to maximise
coverage of heap contents.
An afternoon of doing this increased coverage to 80% of heap
bytes, sometimes more, via 35 one-liner annotations, mostly
like so
MOZALLOC_PROF_TAG(this);
in constructors, since
MOZALLOC_EXPORT void moz_prof_tag(void* ptr, const char* tag);
#define MOZALLOC_PROF_TAG(__ptr) \
moz_prof_tag((__ptr), __PRETTY_FUNCTION__)
A sample profile is attached (see next comment), for Fx with 7 tabs
open. It shows coverage of 74% of bytes and 87% of objects for a set
of 177,885 live blocks that have been allocated through mozalloc.
The lines marked "static void* C::operator new..." are
allocations tagged automatically by the trick in Comment #36.
All others are from hand-applied annotations.
Reporter
Comment 42•15 years ago
Comment 43•15 years ago
I support the standalone-from-all-side-files, works-in-product-builds, source-based approach. When in doubt, use brute force. If we have to mangle our source a bit, it could be worthwhile compared to the old manglings we put up with all over for dubious benefit.
/be
Comment 44•15 years ago
Julian, how does your addition of operator new stuff work for classes that already have NS_DECL_AND_IMPL_ZEROING_OPERATOR_NEW?
Comment 45•15 years ago
This is cool, though I suspect people will immediately ask "what's the stack trace for static void* nsCSSCompressedDataBlock::operator new(size_t, size_t)?", and you'll end up reinventing Massif's call tree structure. But that's OK: if it can be done within the browser so that anyone can use it, that would be a big win.
Minor comments about the profile output format:
- Commas in big numbers (eg. 123,456) make them much easier to read;
- Please mark each entry with the percentage of the total that it represents.
Reporter
Comment 46•15 years ago
(In reply to comment #44)
> Julian, how does your addition of operator new stuff work for classes that
> already have NS_DECL_AND_IMPL_ZEROING_OPERATOR_NEW?
Yes, an excellent question. Basically it doesn't. If a class
overrides new and delete itself then adding a second override in
NS_DECL_ISUPPORTS causes the compiler to complain.
What I _really_ did was to rename NS_DECL_ISUPPORTS to
NS_DECL_ISUPPORTS_NO_NEW_OVERRIDE (possibly a bad name :-) and to then
do this
#define NEW_OVERRIDES \
public: \
static void* operator new (size_t size) { \
return moz_malloc_tagged(size, (char*)__PRETTY_FUNCTION__); \
} \
static void operator delete (void *p) { moz_free(p); }

#define NS_DECL_ISUPPORTS \
NS_DECL_ISUPPORTS_NO_NEW_OVERRIDE \
NEW_OVERRIDES
(ditto with NS_DECL_CYCLE_COLLECTING_ISUPPORTS and
NS_DECL_ISUPPORTS_INHERITED)
Then, everywhere where g++ complained about duplicate overrides,
changed NS_DECL_ISUPPORTS to NS_DECL_ISUPPORTS_NO_NEW_OVERRIDE (viz,
back to what it was originally).
From a bit of greppery, there are about 1700 uses of
NS_DECL_ISUPPORTS_*, of which 23 had to be _NO_NEW_OVERRIDE'd. So not
a big hit.
That said, there is some complexity I haven't figured out yet. Even
after the compiler accepted everything, in a few cases possibly to do
with multiple inheritance, I was seeing uses of uninitialised values
somehow related to the automatic overriding, leading to segfaults, and
I had to back off to the _NO_NEW_OVERRIDE versions for those classes.
Possibly about 6 cases.
Reporter
Comment 47•15 years ago
(Stuff I forgot to say in Comment #46)
... consequently, classes that are marked _NO_NEW_OVERRIDE have to
be hand-annotated using MOZALLOC_PROF_TAG as per Comment #41.
MOZALLOC_PROF_TAG is a second way to get tag information into the
profiler. It facilitates setting the tag on a block after the block
has been allocated. The ideal of course is to set the tag at
allocation time, a la moz_malloc_tagged in Comment #46.
Reporter
Comment 48•15 years ago
(In reply to comment #42)
> Created an attachment (id=449124) [details]
> heap profile as per Comment #41
As a light diversion from the "what should we build?" question: in
this profile, no less than 37% of the objects in the heap (66.6k out
of a total of 177k objects) have to do with CSS. That seems a lot.
Is that expected?
HEAPPROF 16701: objects/bytes: 177885/18446274 in CH,
155018/13662436 in CH-tagged
RANK OBJECTS/BYTES ALLOCATED BY
0 9131 / 1827388 static void* nsCSSCompressedDataBlock::operator
new(size_t, size_t)
1 19084 / 1374048 nsCSSSelector::nsCSSSelector()
4 8525 / 818400 static void* CSSStyleRuleImpl::operator new(size_t)
9 8491 / 407568 nsCSSDeclaration::nsCSSDeclaration()
12 10306 / 247344 nsCSSSelectorList::nsCSSSelectorList()
17 6484 / 155616 nsCSSValueList::nsCSSValueList()
27 1747 / 69880 nsCSSValuePairList::nsCSSValuePairList()
29 1017 / 50568 nsCSSValue::URL::URL(nsIURI*, nsStringBuffer*, nsIURI*,
nsIPrincipal*)
35 1156 / 36992 nsPseudoClassList::nsPseudoClassList(nsIAtom*,
nsCSSPseudoClasses::Type)
59 354 / 16992 static void* nsDOMCSSAttributeDeclaration::operator
new(size_t)
61 114 / 15504 static void* nsCSSStyleSheet::operator new(size_t)
103 57 / 4560 static void* CSSNameSpaceRuleImpl::operator new(size_t)
104 81 / 4536 static void* nsCSSRuleProcessor::operator new(size_t)
126 90 / 2880 static void* CSSImportantRule::operator new(size_t)
Comment 49•15 years ago
That seems like a large proportion, true; but you're also looking at a profile that (if I read correctly) only relates to ~18mb resident? So like .. a very small snapshot. Possibly quite un-representative of the program later on in its life. I assume this is just from a prefix of startup?
Can you get a profile once it's been browsing for a while and, say, passes the 150mb resident mark? Might need to up the numbers in your static array.
Comment 50•15 years ago
> Is that expected?
It could be, depending on what's being measured. You mentioned 7 tabs. What were they? Was gmail involved, say?
Reporter
Comment 51•15 years ago
(In reply to comment #49)
> That seems like a large proportion, true; but you're also looking at a profile
> that (if I read correctly) only relates to ~18mb resident?
Yes .. I was also suspicious about that. By running a profiled build
on Massif, which intercepts all mallocs and frees, it's possible to
see what proportion of heap allocations are routed through mozalloc
and hence are intercepted. The news isn't good.
For a startup with 4 tabs, intercepting mozalloc shows the heap volume
peaking at around 29.4M. Massif puts that figure at a more credible
65.0M, meaning more than half the heap allocation bypasses mozalloc and
so is invisible to the built-in profiler. I'm not sure what to do
about this. It would be possible to force (eg) jsengine to route
allocations through mozalloc, but that would create a new intermodule
dependence. And there's a significant amount of heap allocated by
libX11.so.6.3.0 and libxcb.so.1.1.0 which we can't reroute at the
source level.
One ungood kludge would be to write a shared object redefining malloc,
free, etc and LD_PRELOAD it into the process. That of course only
works on Linux and perhaps MacOSX.
Anyway, some stats from the above run.
I should add, all these figures exclude the overhead of the built-in
profiler itself (about 18M), to make the comparison fair.
============
Built-in intercepts in mozalloc give a peak heap use of 29.4M.
Massif says 65.0M.
where did the rest go? Well, approximately:
12,737,468 0x704B744: moz_calloc (mozalloc.cpp:137)
8,484,176 0x704B81B: moz_malloc_tagged (mozalloc.cpp:105)
6,564,989 0x704B87D: moz_xmalloc (mozalloc.cpp:91)
p 12737468 + 8484176 + 6564989
$1 = 27786633
So 27.8 MB was noted by Massif as going via mozalloc; that ties
in at least approximately with the 29.4M figure above.
For the rest:
11,023,595 in 735 places, all below massif's threshold (01.00%)
PR_Malloc calls malloc directly:
4,754,242 PR_Malloc (prmem.c:467)
js_malloc calls malloc directly:
3,832,065 js_NewScript (jsutil.h:193)
3,478,750 JS_ArenaAllocate (jsutil.h:193)
2,269,696 JS_DHashAllocTable (jsutil.h:193)
1,401,504 JSScope::create(JSContext*, JSObjectOps const*, JSClass*,
JSObject*, unsigned int) (jsutil.h:193)
904,564 js_NewStringCopyN (jsutil.h:193)
system libraries presumably call malloc directly:
3,142,400 XGetImage (in /usr/lib/libX11.so.6.3.0)
1,571,264 ??? (in /usr/lib/libxcb.so.1.1.0)
extensions/spellcheck/hunspell/src/hashmgr.cpp calls malloc directly:
2,685,946 HashMgr::add_word(char const*, int, int, unsigned short*,
int, char const*, bool) (hashmgr.cpp:188)
## Hmm, does this mean we're allocating 2.7MB merely to initialise
## a spell checker that didn't get used in this run?
XPT_ArenaMalloc calls calloc directly:
1,126,648 XPT_ArenaMalloc (xpt_arena.c:221)
sqlite3 calls malloc directly:
1,038,456 sqlite3MemMalloc (sqlite3.c:12975)
Reporter
Comment 52•15 years ago
(In reply to comment #50)
> > Is that expected?
>
> It could be, depending on what's being measured. You mentioned 7 tabs. What
> were they? Was gmail involved, say?
No gmail. I don't recall exactly, but I think the tabs were
http://lwn.net/Articles
http://www.youtube.com
http://en.wikipedia.org/wiki/Main_Page
and the other 4 were articles accessible from
http://www.spiegel.de/international
Comment 53•15 years ago
bug 559263 has some changes to make us wrap malloc more thoroughly with mozalloc. I'm not sure if those apply on all platforms or just Android. cjones would be the one to talk to about mozalloc hijinks, in any case.
Comment 54•15 years ago
The other major one is the JS GC chunks, though that may already go through the right place on Linux. That's better now that Vlad has landed the hooks, though.
Comment 55•15 years ago
> No gmail. I don't recall exactly, but I think the tabs were
OK. We'd have to take a look at their source to see whether the numbers make sense, but they're at least plausible...
(In reply to comment #51)
> It would be possible to force (eg) jsengine to route
> allocations through mozalloc, but that would create a new intermodule
> dependence. And there's a significant amount of heap allocated by
> libX11.so.6.3.0 and libxcb.so.1.1.0 which we can't reroute at the
> source level.
>
> One ungood kludge would be to write a shared object redefining malloc,
> free, etc and LD_PRELOAD it into the process. That of course only
> works on Linux and perhaps MacOSX.
>
So, two notes --- first, mozalloc has no dependencies by design, so using it in js/src would be a relatively simple matter of getting jseng-folk consent and shuffling around sources. This would work on all platforms. Second, we already use linker tricks on linux and windows (!!) to replace system malloc with jemalloc, so we could re-use those hacks to point malloc at a version with your instrumentation. This would take a bit of work. Mac would be left out.
(In reply to comment #53)
> bug 559263 has some changes to make us wrap malloc more thoroughly with
> mozalloc. I'm not sure if those apply on all platforms or just Android. cjones
> would be the one to talk to about mozalloc hijinks, in any case.
The android code uses GNU ld's --wrap, so linux only (for all intents and purposes).
Reporter
Comment 57•15 years ago
(In reply to comment #56)
> shuffling around sources. This would work on all platforms. Second, we
> already use linker tricks on linux and windows (!!) to replace system malloc
> with jemalloc, so we could re-use those hacks to point malloc at a version with
> your instrumentation.
I tried with --enable-wrap-malloc, but got into various build system/linkage
difficulties, and gave up. Sticking with the source route for now.
Reporter
Comment 58•15 years ago
(In reply to comment #51)
> Built-in intercepts in mozalloc give a peak heap use of 29.4M.
> Massif says 65.0M.
I routed js_malloc, XPT_ArenaMalloc and the allocator in db/sqlite3
through mozalloc. This significantly improves coverage. For a
startup of Fx with about 100ish tabs, Massif gives a peak C++ heap
allocation size of 342.9MB, and the mozalloc interceptor sees 252.0MB
of it (perhaps a bit more). That's 73.5%. Most of the rest escapes
via PR_Malloc, and it might be possible to reroute a couple of its
high-roller callers to mozalloc, so a figure of 80+% is possible, I'd
say.
That's sufficiently encouraging that I'd like to turn my collection
of experimental hacks into a proper patch. However, I have one
structural problem unsolved. The interceptor (which is in mozalloc)
really needs a lock/unlock facility. Right now I'm kludging it by
calling pthread_mutex_{lock,unlock} directly; obviously that won't
work on Windows.
I need to use the locking facilities in NSPR instead, but mozalloc
doesn't depend on NSPR. I could I guess copy JSThinLock into
mozalloc, so as to keep mozalloc standalone, but that's not ideal.
Any suggestions?
Comment 59•15 years ago
jslock.{cpp,h} depend on NSPR -- a thin lock can fatten into something requiring a PRLock.
If you just need a mutex, but also must stand alone, best to use raw Windows API and ifdef. Shouldn't be too bad.
/be
Reporter
Comment 60•15 years ago
Standalone C++ heap profiler, patch against m-c 46349:4aadb903f941
(Mon Jun 28 09:46:49 2010 -0700).
Provides who-allocated-what-in-the-C++-heap analysis without external
assistance. As per comment #51, covers about 80% of the C++ heap
volume. Of that 80%, about 95% of the volume is tagged by allocation
point, using the schemes in comment #41 and comment #36.
In a quick experiment with 6 tabs, it is tracking about 42MB of
blocks, whereas "about:memory" lists about 12.5MB at that point.
Linux only for the time being. Working on extending it to MacOS and
Windows. Should be jemalloc-agnostic, although that is untested.
HOW TO USE:
Profiling is enabled/disabled at Fx startup, by default disabled.
To enable, start with env var MOZALLOC_HEAPPROF set to anything.
When running, it monitors mallocs/frees (etc). Every 20000 such
calls, it prints a summary line showing the current total volumes:
1456357/1183643/945M total allocs/frees/totalloc, 272714/40802k live blocks/bytes
Periodically the (tracked part of the) heap is sampled and a census
printed, one line per allocation point ("tag"), showing the number of
blocks and bytes billed to each tag. A census is performed when the
maximum volume has grown by more than 4% (when the heap is above 50MB)
or by 2MB (when below 50MB). This guarantees that a census is taken
within 4% of peak residency, and that it will be the most recently
displayed one.
There is some space overhead (probably less than 10% of the allocated
heap) when profiling is enabled, and some extra processing per
allocation. Browser is still usable though. The extra space overhead
is not shown in the generated profiles. When disabled, the overheads
should be very close to zero (1 extra conditional branch per malloc
and per free).
Attachment #445563 - Attachment is obsolete: true
Reporter
Comment 61•15 years ago
For a startup/quit of the browser, with 6 tabs.
Attachment #449124 - Attachment is obsolete: true
Reporter
Comment 62•15 years ago
Version that works on both Linux and MacOSX, at least for 10.5.8,
32-bit build with gcc-4.2. Also, rebase to m-c 46969:d9253109fd88
(Thu Jul 01 14:40:49 2010 +0200).
Attachment #454639 - Attachment is obsolete: true
Reporter
Comment 63•15 years ago
Now with win32 support (at least, works on WinXP, MSVC9,
debug and release builds). Patch relative to m-c 46969:d9253109fd88
Attachment #455470 - Attachment is obsolete: true
Reporter
Comment 64•15 years ago
This now works on the biggest 3 targets. I'd like to move it towards
land-ability. At this point, some level of feedback/review seems to
me necessary.
Patch provides functionality as described in comment #60. However,
there are various kinds of rough edges:
(from the user's aspect)
* how should profiling be enabled? atm env var MOZALLOC_HEAPPROF
is tested at startup; if set, profiling is enabled. This doesn't
strike me as good. I thought about using a pref, but really we
want the profiler to be enabled or disabled right from process
start.
* where should output go? currently it's sent to stderr
(sending it to stdout breaks jsengine building).
(from the Fx code base aspects)
* Patch routes much more malloc/free traffic through mozalloc
than previously. This means these modules acquire a new
dependency on mozalloc:
* js
* db/sqlite3
I'm pretty sure this breaks standalone builds of js (I haven't tried).
Even with those routed through mozalloc, we only capture about
80% of heap allocation, so potentially more modules should be
routed through mozalloc. [IMO, it no longer makes sense to have
mozalloc separate from NSPR; they should be combined.]
* When not profiling, there is an extra overhead of 1 memory
reference (read) and 1 conditional branch per alloc and free.
IMO this is insignificant in that (according to my own measurements)
the behaviour of Fx in big runs is on the order of 1 malloc and
1 free per 50000 instructions.
* As per 3rd para in comment #60, supposing we wanted to add more
allocation tags to cover the 5% of the heap that is currently
tagged as UNKNOWN (we know an allocation is happening, but we
don't know who is doing it). Finding the points in the sources to
annotate by hand requires some auxiliary mechanism to get a
stack trace. I have been using a hacked valgrind.
Given that the profiler is supposed to be standalone, I can't see
how the dependence on some external tool to "bootstrap" it can be
avoided. Just mentioning for completeness.
Reporter
Comment 65•15 years ago
(from the Fx code base aspects in comment #64) +=
* the scheme described in comment #36 provides automatic tagging of
allocations made by all nsCOMPtr classes. It does this by stuffing
a per-class override of new and delete into the NS_DECL_*ISUPPORTS
macros. Since there are a couple thousand of such classes, there
will be some cost in terms of object code size. I'll measure.
Reporter
Comment 66•14 years ago
* shows profiling results in "about:heap"
* rebased to m-c 48706:a4d86c3a3494 (Mon Aug 02 01:28:10 2010 -0700)
WIP. Not re-tested on OSX or Win. Appears to have stability problems
(occasional heap block overwrites) that weren't there before.
Reporter
Comment 67•14 years ago
Attachment #464408 - Attachment is obsolete: true
Reporter
Comment 68•14 years ago
Not pretty, but possibly useful. Output is in two parts:
* the most recent census (detailed breakdown of contents). Since
a census is only made when the heap high-water mark rises, this
is always a picture of the heap at or close to the max for the run
so far.
* periodic one-line summaries, to give some feel for changes over time.
Reporter
Comment 69•14 years ago
Revised version of comment 67 patch, with about 180 allocation sites
in js/src/*.{cpp,h} (that is, calls to js_malloc/_calloc/_realloc)
hand-annotated with allocation tags. This gives much better
resolution into jsengine's allocations. Prior to this change the
profiler would say (eg) that lots of stuff was allocated by
js_realloc, which isn't helpful -- see screenshot in comment #68. Now
it can say who is making those calls.
Still requires refinement to handle allocated-on-behalf-of scenarios.
Eg, saying that 4.5% of the heap is allocated by PL_DHashAllocTable is
pretty useless. Am thinking about tagging each block with 2 strings
rather than one -- primary and secondary tags. So then it would be
able to say, effectively: "X% of the heap was allocated by
PL_DHashAllocTable, on behalf of nsFoo::bar" (where nsFoo::bar is
using PL_DHashAllocTable to create/maintain a hash table).
Still has stability problems. I believe this is the same problem
as documented in bug 587610, and is unrelated to the profiler per se.
Attachment #464410 - Attachment is obsolete: true
Comment 70•11 years ago
jseward: given the presence of about:memory and DMD, I think we can safely close this bug. Please reopen if you disagree.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WONTFIX