Closed Bug 708159 Opened 13 years ago Closed 13 years ago

Avoid unnecessary work done by multi-reporters nsMemoryReporterManager::GetExplicit

Categories

(Core :: General, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla12

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

Details

(Keywords: dev-doc-complete, Whiteboard: [Snappy])

Attachments

(1 file, 4 obsolete files)

nsMemoryReporterManager::GetExplicit() calculates the "explicit" memory measurement. This consists of "heap-allocated" (which is cheap to get) plus the sum of all the explicit NONHEAP memory reports. Most of the work done by GetExplicit is doing the NONHEAP measurements. For nsIMemoryReporters, no unnecessary work is done -- GetExplicit checks if the kind is NONHEAP and the path starts with "explicit", and if not, the "amount" field isn't accessed. However, for nsIMemoryMultiReporters, lots of unnecessary work is done. We call CollectReports, which does all the measurements and passes them back via a callback. GetExplicit's callback ignores any non-NONHEAP, non-explicit measurements. To put this more concretely, the JS multi-reporter touches every live thing on the JS heap every time it runs. With telemetry enabled, this is once per minute. This is bad. On Linux, it's even worse because we also read /proc/self/smaps on every telemetry ping. So, how to fix? I thought about passing an extra arg to CollectReports that is a mask of the kinds we need to know about. But that's still going to do more work than we need. A more efficient approach is one that targets GetExplicit's specific use case. We'd add nsIMemoryMultiReporter::SizeOfExplicitNonHeap(), which would just return the sum of the explicit, NONHEAP measurements. For most multi-reporters (e.g. StorageSQLiteMultiReporter and MapsReporter) this would just return 0. For XPConnectJSCompartmentsMultiReporter it would just need to do some much quicker, higher-level traversals of the GC heap and a few other things.
:sigh. Adding nsIMemoryMultiReporter::SizeOfExplicitNonHeap is an unfortunate hack, but I'm not sure how to do this better. I wish we could at least check that SizeOfExplicitNonHeap returns close to the right thing, but even that is hard, because the world can change under you.
Blocks: 708285
Another possibility is to not do this and remove "explicit" from the telemetry stats. We'd still get "heap-allocated" and "resident" in telemetry, so it wouldn't be the end of the world, and we'd avoid the whole possibility of the two different measurement paths giving different results. Having said that, there are already two paths for computing "explicit" -- about:memory basically just sums all the expclit HEAP and NONHEAP values, whereas GetExplicit sums "heap-allocated" and all the explicit NONHEAP values.
No longer blocks: 708285
So a multi-reporter could, when registering, indicate the shared path prefix of all its reporters. We could say that if you're giving any explicit/ reporters, you must register with a prefix starting with "explicit/". I'd like that API much better than "give me all your explicit nonheap numbers." We'd have to iterate over all of explicit in order to use this, though. :-/
Attached patch patch (obsolete) — Splinter Review
This patch implements nsIMemoryReporter::explicitNonHeap. - bent, can you review the WorkerPrivate.cpp changes? I had to genericize the existing code in order to do two different GC traversals within the framework you created. I did this by rearranging things, passing in a function that does the traversal, and changing |IterateData* data| to |void* data|. - billm, can you review the JS and xpconnect changes? I had to create IterateCompartments. And the xpconnect changes are required for the WorkerPrivate.cpp changes. - jlebar, can you review everything else? I added GetExplicitNonHeap implementations where necessary. I also added a new "explicit" reporter that gets the result from nsMemoryReporterManager::GetExplicit(), i.e. uses the new code. The good thing about this is that it shows up in the "Other Measurements" list in about:memory, so you can compared it to the one at the top of about:memory. I've found that they're usually within 100KB of each other, and often within a few KB. I've seen them get as much as 1MB or so apart when the browser is really busy, e.g. restoring an old session. So if the two methods of measuring "explicit" ever get out of sync we'll be able to notice. (In fact, this is how I found bug 709653 -- the new method was giving the right answer, the old method was not.) This creation of the stand-alone "explicit" reporter also simplified TelemetryPing.js. Oh, I had to make a minor change for the swap tree in about:memory. If it's zero it now looks like this: 0 B (100.0%) -- swap └──0 B (100.0%) -- total Because we now have a reporter named "explicit" as well as a tree rooted at "explicit", I had to change the |treeNameMatches| function to check for "explicit/" at the start of the path instead of "explicit", so that the "explicit" reporter isn't mistakenly put in the "explicit" tree. And for an emmpty swap tree the name of the tree was "map/swap" and the only entry in it was also called "map/swap", so the version of |treeNameMatches| failed. With "map/swap/total" it doesn't fail. (I hope that made sense.) I also did some other testing that isn't visible -- while writing the patch I first implemented the new version of GetExplicit() in parallel with the old one to make sure I got it right. It took a few goes (the DOM worker's GetExplicitNonHeap was a bit tricky), but eventually I got them to always give the same result, which gives me confidence. The old method is gone in the patch, of course.
Attachment #581175 - Flags: review?(wmccloskey)
Attachment #581175 - Flags: review?(justin.lebar+bug)
Attachment #581175 - Flags: review?(bent.mozilla)
Comment on attachment 581175 [details] [diff] [review] patch >@@ -218,17 +225,17 @@ MapsReporter::CollectReports(nsIMemoryMu > // 'map/vsize'; otherwise we're probably not reading smaps correctly. If we > // didn't create a node under 'map/swap', create one here so about:memory > // knows to create an empty 'map/swap' tree. See also bug 682735. > > NS_ASSERTION(categoriesSeen.mSeenVsize, "Didn't create a vsize node?"); > NS_ASSERTION(categoriesSeen.mSeenVsize, "Didn't create a resident node?"); > if (!categoriesSeen.mSeenSwap) { > aCallback->Callback(NS_LITERAL_CSTRING(""), >- NS_LITERAL_CSTRING("map/swap"), >+ NS_LITERAL_CSTRING("map/swap/total"), OOC, why did you have to do this? > NS_IMETHODIMP > nsMemoryReporterManager::GetExplicit(PRInt64 *aExplicit) > { >- // Get KIND_NONHEAP measurements from multi-reporters, too. >+ PRInt64 explicitNonHeapNormal = 0; >+ for (PRUint32 i = 0; i < explicitNonheapNormal.Length(); i++) { >+ explicitNonHeapNormal += explicitNonheapNormal[i].amount; >+ } >+ >+ // For each multi-reporter we could call CollectReports and filter out the >+ // non-explicit, non-NONHEAP measurements. But that's lots of wasted work, >+ // so we instead use SizeOfExplicitNonHeap() which exists purely for this >+ // purpose. Can you indicate (until bug 700508 lands) that we're assuming the multi-reporters don't overlap? Also, how would you feel about doing a sanity check here in debug builds to ensure that SizeOfExplicitNonHeap is close to what you'd get if you iterated over the multi-reporter and filtered its results? We can't make it a fatal assertion, but this would catch bugs where someone, say, had an explicit, non-heap reporter but always returned 0 in SizeOfExplicitNonHeap.
Attachment #581175 - Flags: review?(justin.lebar+bug) → review+
Comment on attachment 581175 [details] [diff] [review] patch Review of attachment 581175 [details] [diff] [review]: ----------------------------------------------------------------- The JS stuff looks okay. Is there someone who would be more qualified to review the xpconnect changes? I haven't looked at that code in a while and it's changed a lot.
> > aCallback->Callback(NS_LITERAL_CSTRING(""), > >- NS_LITERAL_CSTRING("map/swap"), > >+ NS_LITERAL_CSTRING("map/swap/total"), > > OOC, why did you have to do this? See the paragraph in comment 4 that begins "Oh, I had to make a minor change..." > Also, how would you feel about doing a sanity check here in debug builds to > ensure that SizeOfExplicitNonHeap is close to what you'd get if you iterated > over the multi-reporter and filtered its results? We can't make it a fatal > assertion, but this would catch bugs where someone, say, had an explicit, > non-heap reporter but always returned 0 in SizeOfExplicitNonHeap. I don't want to keep the iterate-over-multi-reporters code, because it's moderately complicated. But it did occur to me last night to add a test that loads about:memory and checks the two explicit values are within 1% (or something like that).
> See the paragraph in comment 4 that begins "Oh, I had to make a minor change..." Oh, sorry; I completely skipped over the explanatory paragraph. > But it did occur to me last night to add a test that loads about:memory and checks the > two explicit values are within 1% (or something like that). Okay, cool.
> > But it did occur to me last night to add a test that loads about:memory and checks the > > two explicit values are within 1% (or something like that). > > Okay, cool. Thinking some more, measuring twice is a much stronger guarantee, so I'll think about adding it back in for debug builds.
about:memory does basically the same computation as was removed in this patch, right? So you're still measuring twice if you add the test?
(In reply to Justin Lebar [:jlebar] from comment #10) > about:memory does basically the same computation as was removed in this > patch, right? So you're still measuring twice if you add the test? That's right. But the comparison would have to be a "within 1% or so", and would only be done for whatever set of pages we load in the test. Whereas if I keep the old code the comparison would be exact and would run all the time in debug builds, on whatever set of pages people are viewing. So as a checking mechanism it's both more precise and much more wide-ranging. (And the old code will become a bit less obnoxious if bug 700508 ever re-lands.)
> Whereas if I keep the old code the comparison would be exact Is that true? My assumption was that it wouldn't be exact, since we have at least a tiny bit of concurrency in Gecko. It's easy enough to check, I guess.
The results were exactly the same for the testing I did, but that testing wasn't extensive.
If you made it just a warning, that'd probably be fine. NS_ASSERTION and NS_WARNING are the same level, anyway. But I suspect a worker, for example, could cause the numbers to be different.
I did some measurements. With 10 tabs open (gmail + 9 news sites), with the old code, computing "explicit" was taking 60--70ms. I thought a lot of it might be due to reading /proc/<pid>/smaps, so I disabled that and it still took 50--60ms. With the new code, it takes about 0.3ms, so it's ~200x faster. Furthermore, I represent the least worst case because I have a very fast machine with 16GB of RAM so I'm not seeing any slow-down from paging. The improvement could be much bigger for people on machines with less RAM.
One 50ms jank down, some unknown number to go!
Keywords: dev-doc-needed
Attached patch patch, v2 (obsolete) — Splinter Review
bent, can you review the xpconnect changes? They're very closely related to the DOM worker changes. jlebar, I put the double-checking we talked about in GetExplicit(). After browsing for a while, I'm yet to see a case where the two measurements don't match exactly. I also fixed test_aboutmemory.xul.
Attachment #581175 - Attachment is obsolete: true
Attachment #581175 - Flags: review?(wmccloskey)
Attachment #581175 - Flags: review?(bent.mozilla)
Attachment #581509 - Flags: review?(justin.lebar+bug)
Attachment #581509 - Flags: review?(bent.mozilla)
Blocks: 703143
Can you make the #ifdef DEBUG code more self-contained? Right now, I have to read through much of it while I'm trying to understand the release code, which is a bummer...
Also, a comment explaining what class of bugs you're trying to catch by computing |explicit| twice would be appreciated.
Attached patch patch v3 (obsolete) — Splinter Review
(In reply to Justin Lebar [:jlebar] from comment #18) > Can you make the #ifdef DEBUG code more self-contained? Right now, I have > to read through much of it while I'm trying to understand the release code, > which is a bummer... Done. (In reply to Justin Lebar [:jlebar] from comment #19) > Also, a comment explaining what class of bugs you're trying to catch by > computing |explicit| twice would be appreciated. Is this existing comment unclear? // (Actually, in debug builds we also do it the slow way and compare the // result to the result obtained from GetExplicitNonHeap(). This // guarantees the two measurement paths are equivalent. This is wise // because it's easy for memory reporters to have bugs.)
Attachment #581509 - Attachment is obsolete: true
Attachment #581509 - Flags: review?(justin.lebar+bug)
Attachment #581509 - Flags: review?(bent.mozilla)
Attachment #581828 - Flags: review?(justin.lebar+bug)
Attachment #581828 - Flags: review?(bent.mozilla)
Comment on attachment 581828 [details] [diff] [review] patch v3 Sorry I missed the comment earlier (I should use interdiff, rather than scanning) -- it's fine. Thanks! > + fprintf(stderr, "my-explicit: %d, %d\n", > + (int)explicitNonHeapMultiSize, (int)explicitNonHeapMultiSize2); Maybe we should only print something when the assertion fires?
Attachment #581828 - Flags: review?(justin.lebar+bug) → review+
> > + fprintf(stderr, "my-explicit: %d, %d\n", > > + (int)explicitNonHeapMultiSize, (int)explicitNonHeapMultiSize2); > > Maybe we should only print something when the assertion fires? Oh, right, that's my debugging code :)
Attached patch patch v4 (obsolete) — Splinter Review
patch v3 was only half the story; I had the patch in two parts in my queue and only attached the 2nd one. jlebar, you've now reviewed both parts so no need to look at it again. bent, you're up!
Attachment #581828 - Attachment is obsolete: true
Attachment #581828 - Flags: review?(bent.mozilla)
Attachment #582097 - Flags: review?(bent.mozilla)
Comment on attachment 582097 [details] [diff] [review] patch v4 Review of attachment 582097 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/workers/WorkerPrivate.cpp @@ +183,5 @@ > NS_LITERAL_CSTRING(")/"); > } > > + nsresult > + CollectForRuntime(CollectFun aFun, void* aData) Please add an 'AssertIsOnMainThread' here too. @@ +195,5 @@ > // Don't ever try to talk to the worker again. > if (disabled) { > #ifdef DEBUG > { > + nsCAutoString message("Unable to get report memory for "); Nit: please revert this. @@ +221,5 @@ > + IterateData data; > + nsresult rv = CollectForRuntime(mozilla::xpconnect::memory::CollectCompartmentStatsForRuntime, > + &data); > + if (NS_FAILED(rv)) > + return rv; Nit: Please brace single line if blocks. ::: dom/workers/WorkerPrivate.h @@ +473,5 @@ > { } > #endif > }; > > +typedef JSBool (*CollectFun)(JSRuntime* aRt, void* aData); Hm, instead of doing this here I think you can just pass a boolean flag to the BlockAndCollectRuntimeStats function indicating if this is a full report or just the quick one. Then you can pass that all the way through and call the appropriate function from the BlockAndCollectRuntimeStats function.
> > +typedef JSBool (*CollectFun)(JSRuntime* aRt, void* aData); > > Hm, instead of doing this here I think you can just pass a boolean flag to > the BlockAndCollectRuntimeStats function indicating if this is a full report > or just the quick one. Then you can pass that all the way through and call > the appropriate function from the BlockAndCollectRuntimeStats function. That was my initial plan, but the more general approach wasn't any harder and seemed much less hacky.
BTW, I don't know what the outcome of your review is. You haven't given me r+ or r-, but I guess it's really an r- because you want to see another version of the patch with your suggested changes? Is that right? (Don't be afraid to r- me if you don't like the patch, that's much clearer than leaving the r? in place.)
Attached patch patch v5Splinter Review
I really think the function pointer version is superior to the boolean flag version. Oh well, I've made the change.
Attachment #582097 - Attachment is obsolete: true
Attachment #582097 - Flags: review?(bent.mozilla)
Attachment #584712 - Flags: review?(bent.mozilla)
Comment on attachment 584712 [details] [diff] [review] patch v5 Review of attachment 584712 [details] [diff] [review]: ----------------------------------------------------------------- Looks good, thanks! ::: dom/workers/WorkerPrivate.cpp @@ +1451,5 @@ > JSAutoSuspendRequest asr(aCx); > > + *mSucceeded = mIsQuick > + ? mozilla::xpconnect::memory::GetExplicitNonHeapForRuntime(JS_GetRuntime(aCx), mData) > + : mozilla::xpconnect::memory::CollectCompartmentStatsForRuntime(JS_GetRuntime(aCx), mData); Nit: ? and : on preceding lines.
Attachment #584712 - Flags: review?(bent.mozilla) → review+
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla12
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: