Closed Bug 798510 Opened 12 years ago Closed 12 years ago

js::StackSpace::sizeOfCommitted() overestimates on non-Windows platforms

Categories

(Core :: JavaScript Engine, defect)

All
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla19
Tracking Status
firefox18 --- fixed

People

(Reporter: justin.lebar+bug, Assigned: justin.lebar+bug)

References

Details

Attachments

(2 files, 2 obsolete files)

js::StackSpace::sizeOfCommitted() is supposed to tell us how much of the stack is committed. But on Linux, it assumes the whole stack is committed: return (trustedEnd_ - base_) * sizeof(Value); Even though we don't explicitly decommit the stack on Linux (bug 798500), mmap does not immediately commit memory -- when the stack is mmaped, zero bytes are committed! Maybe we can use mincore to keep ourselves honest here.
Blocks: 798500
Attached patch Patch, v1 (obsolete) — Splinter Review
r?njn for the changes to the memory reporter comment. Luke, I'm new at hacking in js/src, so please let me know if I'm doing something wrong!
Attachment #668604 - Flags: review?(n.nethercote)
Attachment #668604 - Flags: review?(luke)
poisining
poisoning, but yes, thanks! :)
Attached patch Patch, v1.1 (obsolete) — Splinter Review
Attachment #668608 - Flags: review?(n.nethercote)
Attachment #668608 - Flags: review?(luke)
Comment on attachment 668604 [details] [diff] [review] Patch, v1 ># HG changeset patch ># User Justin Lebar <justin.lebar@gmail.com> > >Bug 798510 - js::StackSpace::sizeOfCommitted() overestimates space used on non-Windows platforms. > >diff --git a/js/src/vm/Stack.cpp b/js/src/vm/Stack.cpp >index c1cb4d7..c38e345 100644 >--- a/js/src/vm/Stack.cpp >+++ b/js/src/vm/Stack.cpp >@@ -809,17 +809,40 @@ StackSpace::tryBumpLimit(JSContext *cx, Value *from, unsigned nvals, Value **lim > } > > size_t > StackSpace::sizeOfCommitted() > { > #ifdef XP_WIN > return (commitEnd_ - base_) * sizeof(Value); > #else >- return (trustedEnd_ - base_) * sizeof(Value); >+ /* >+ * Measure how many of our pages are actually committed, using mincore. >+ * This is slow, but hopefully nobody expects this method to be fast. >+ */ >+ >+ const int pageSize = getpagesize(); >+ size_t numBytes = (trustedEnd_ - base_) * sizeof(Value); >+ size_t numPages = (numBytes + pageSize - 1) / pageSize; >+ >+ unsigned char *vec = (unsigned char *) js_malloc(numPages); >+ int result = mincore(base_, numBytes, vec); >+ if (result) { >+ js_free(vec); >+ return -1; >+ } >+ >+ size_t committedBytes = 0; >+ for (size_t i = 0; i < numPages; i++) { >+ if (vec[i] & 0x1) >+ committedBytes += pageSize; >+ } >+ js_free(vec); >+ >+ return committedBytes; > #endif > } > > #ifdef DEBUG > bool > StackSpace::containsSlow(StackFrame *fp) > { > for (AllFramesIter i(*this); !i.done(); ++i) { >diff --git a/js/xpconnect/src/XPCJSRuntime.cpp b/js/xpconnect/src/XPCJSRuntime.cpp >index 66cdec6..e378bc8 100644 >--- a/js/xpconnect/src/XPCJSRuntime.cpp >+++ b/js/xpconnect/src/XPCJSRuntime.cpp >@@ -1688,16 +1688,17 @@ ReportJSRuntimeExplicitTreeStats(const JS::RuntimeStats &rtStats, > // total for later. > > size_t rtTotal = 0; > > RREPORT_BYTES(rtPath + NS_LITERAL_CSTRING("runtime/runtime-object"), > nsIMemoryReporter::KIND_HEAP, rtStats.runtime.object, > "Memory used by the JSRuntime object."); > >+ > RREPORT_BYTES(rtPath + NS_LITERAL_CSTRING("runtime/atoms-table"), > nsIMemoryReporter::KIND_HEAP, rtStats.runtime.atomsTable, > "Memory used by the atoms table."); > > RREPORT_BYTES(rtPath + NS_LITERAL_CSTRING("runtime/contexts"), > nsIMemoryReporter::KIND_HEAP, rtStats.runtime.contexts, > "Memory used by JSContext objects and certain structures " > "hanging off them."); >@@ -1730,17 +1731,23 @@ ReportJSRuntimeExplicitTreeStats(const JS::RuntimeStats &rtStats, > nsIMemoryReporter::KIND_NONHEAP, rtStats.runtime.unusedCode, > "Memory allocated by one of the JITs to hold the " > "runtime's code, but which is currently unused."); > > RREPORT_BYTES(rtPath + NS_LITERAL_CSTRING("runtime/stack-committed"), > nsIMemoryReporter::KIND_NONHEAP, rtStats.runtime.stackCommitted, > "Memory used for the JS call stack. This is the committed " > "portion of the stack; the uncommitted portion is not " >- "measured because it hardly costs anything."); >+ "measured because it hardly costs anything." >+#ifdef DEBUG >+ " Note: You are running a debug build, which may have stack " >+ "poisining enabled, which causes the whole stack to be " >+ "committed." >+#endif >+ ); > > RREPORT_BYTES(rtPath + NS_LITERAL_CSTRING("runtime/gc-marker"), > nsIMemoryReporter::KIND_HEAP, rtStats.runtime.gcMarker, > "Memory used for the GC mark stack and gray roots."); > > RREPORT_BYTES(rtPath + NS_LITERAL_CSTRING("runtime/math-cache"), > nsIMemoryReporter::KIND_HEAP, rtStats.runtime.mathCache, > "Memory used for the math cache.");
Attachment #668604 - Attachment is obsolete: true
Attachment #668604 - Flags: review?(n.nethercote)
Attachment #668604 - Flags: review?(luke)
># HG changeset patch ># User Justin Lebar <justin.lebar@gmail.com> >... I have no idea how this happened, but sorry!
Comment on attachment 668608 [details] [diff] [review] Patch, v1.1 Great patch; that 4MB runtime commit has been bugging me when I look at about:memory on my phone. One nit: 4 space indent in js/src, not 2. To wit, we looked into madvise'ing the stack in bug 656941; same conclusion.
Attachment #668608 - Flags: review?(luke) → review+
Thanks for the review! > One nit: 4 space indent in js/src, not 2. Fixed locally, and also fixed the extraneous newline I added to XPCJSRuntime.cpp.
Assignee: general → justin.lebar+bug
Comment on attachment 668608 [details] [diff] [review] Patch, v1.1 Review of attachment 668608 [details] [diff] [review]: ----------------------------------------------------------------- The basic idea is fine and the patch mostly lookds good. But "stack-committed" is no longer an accurate name -- it should be "stack-resident". Likewise, pretty much every occurrence of "committed" in this patch should be changed to "resident". I'd like to check the patch again after you've done that, if you don't mind. (BTW, I've had the idea of using mincore() everywhere to fully distinguish between virtual and physical memory consumption in all memory reporters, but I suspect it would be too slow.) ::: js/src/vm/Stack.cpp @@ +826,5 @@ > + unsigned char *vec = (unsigned char *) js_malloc(numPages); > + int result = mincore(base_, numBytes, vec); > + if (result) { > + js_free(vec); > + return -1; Return -1 when the return type is size_t? I understand the intent, but it doesn't really work with this interface. Just give up and return 0? @@ +831,5 @@ > + } > + > + size_t committedBytes = 0; > + for (size_t i = 0; i < numPages; i++) { > + if (vec[i] & 0x1) Any reason not to use MINCORE_INCORE instead of 0x1? ::: js/xpconnect/src/XPCJSRuntime.cpp @@ +1692,5 @@ > RREPORT_BYTES(rtPath + NS_LITERAL_CSTRING("runtime/runtime-object"), > nsIMemoryReporter::KIND_HEAP, rtStats.runtime.object, > "Memory used by the JSRuntime object."); > > + Why the extra blank line?
Attachment #668608 - Flags: review?(n.nethercote)
Attachment #668608 - Flags: review?(luke)
Attachment #668608 - Flags: review-
Attachment #668608 - Flags: review+
> Why the extra blank line? Ah, I see you fixed that while I was reviewing.
> Any reason not to use MINCORE_INCORE instead of 0x1? That exists only on Mac, afaict. > But "stack-committed" is no longer an accurate name -- it should be "stack-resident". Hm, you're right! And that's the kind of thing we've tried not to do in about:memory. I wonder if there's a way to get the data I actually want. Probably not...
(In reply to Justin Lebar [:jlebar] from comment #11) > > Any reason not to use MINCORE_INCORE instead of 0x1? > > That exists only on Mac, afaict. Oh. Can you define a local constant with that name or a similar name, just to make it clearer what the magic number means? > Hm, you're right! And that's the kind of thing we've tried not to do in > about:memory. Yeah, we've never resolved the vsize vs. resident battle, but for this case stack-resident seems more useful, so I'm happy with that.
> Yeah, we've never resolved the vsize vs. resident battle, but for this case stack-resident seems > more useful, so I'm happy with that. Well, it's a vsize vs. committed battle. But I guess the battle is that sometimes we can't measure committed and have to approximate it by resident. In this case, we /could/ measure committed merely by keeping track of the stack's high-water mark. Maybe that's what we should do, although might that impose an unacceptable cost on push operations?
I don't understand what this high-water mark would measure: all the memory is mmapped, I can touch any of it, so, in that sense, it's all "committed" (in windows MEM_COMMIT parlance). OTOH, some subset of these committed pages are resident, which is what mincore is telling you. What is there in-between these two figures? One in-between was if we did mprotect(PROT_NONE) to pages so that they were "reserved" (again in windows MEM_RESERVE parlance) but you couldn't touch them, but we don't do that because bug 656941 couldn't find any benefit. If you can, we can do that; we already do on windows.
By "committed" I think njn and I mean "in core or swap". If you have a less-overloaded term for that, I'd certainly appreciate it. :)
The words seem fine, my main point was to ask what tracking the high-water mark would tell us: iiuc, mincore tells us "in core or swap" and (trustedEnd_ - base_) tells us "how much virtual address space is claimed"; what else is there (modulo the second paragraph of comment 14)?
> mincore tells us "in core or swap" I thought it just meant "in core", i.e. "resident"...
Yes; I should have said "*whether* the memory is in core or swap" or just "what memory is in core".
I guess I'll keep the mincore business and rename things. The alternative of maintaining a high water mark seems kind of complicated, to me, given that I don't understand the code at all. And there are also performance implications that we don't understand. Given that we basically never see swapping, and that this is anyway a small amount of memory, I guess it's OK. :-/
Attached patch Patch, v2Splinter Review
I called it "size" instead of "resident size" because it's not resident size on Windows.
Attachment #668869 - Flags: review?(n.nethercote)
Attachment #668608 - Attachment is obsolete: true
Attachment #668608 - Flags: review?(luke)
Comment on attachment 668869 [details] [diff] [review] Patch, v2 Review of attachment 668869 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/vm/Stack.cpp @@ +828,5 @@ > + unsigned char *vec = (unsigned char *) js_malloc(numPages); > + int result = mincore(base_, numBytes, vec); > + if (result) { > + js_free(vec); > + return -1; As mentioned previously: -1 isn't a good return value for size_t. @@ +831,5 @@ > + js_free(vec); > + return -1; > + } > + > + size_t committedBytes = 0; |committedBytes| is the wrong name. |residentBytes|? @@ +845,3 @@ > return (commitEnd_ - base_) * sizeof(Value); > #else > return (trustedEnd_ - base_) * sizeof(Value); What is this case measuring? The vsize of the stack, I guess? Probably worth a comment.
Attachment #668869 - Flags: review?(n.nethercote) → review+
> As mentioned previously: -1 isn't a good return value for size_t. Gah, sorry. Is there a way I can print a warning in JS-land? Maybe I could print a warning and then return the vsize? I'd certainly prefer to return an upper bound on the size than a lower bound on the size, otherwise we could hit this error case and we'd never notice.
Oh, I just found a bug in JSRuntime::sizeOfExplicitNonHeap() -- if |execAlloc_| is false, it fails to measure stackSpace.sizeOf(). Can you change it to do that? > Is there a way I can print a warning in JS-land? What kind of warning? Error console? Almost certainly you can, but I don't think it's worth it. I suggest either: - Ignore it and return 0; it seems unlikely. - Change the return value to int64_t, and propagate the possibility of failure upwards as necessary. AFAICT, you'd need to do change the following functions to return a bool indicating success/failure, and check it at their callsites: * JSRuntime::sizeOfExplicitNonHeap() * GetExplicitNonHeapForRuntime() * JSRuntime::sizeOfIncludingThis() If you do the latter, I should probably re-review :)
(In reply to Justin Lebar [:jlebar] from comment #19) > I guess I'll keep the mincore business and rename things. The alternative > of maintaining a high water mark... Again, just to be clear: I don't think a high-water mark is an alternative; it doesn't reflect any particular resource allocation on !Windows.
> Again, just to be clear: I don't think a high-water mark is an alternative; it doesn't reflect any > particular resource allocation on !Windows. Do you mean, "on Windows" instead of "on !Windows"? Either way, I don't see why that's true. (I don't mean to belabor this point, because we're all basically in agreement here, but I'd like to know if I'm wrong here, since it definitely matters elsewhere.) * On *nix, mmap() does not allocate a page in core until the page is touched (I'd imagine, until the page is written to). * On Windows, the VirtualAlloc(MEM_COMMIT) docs say: "Actual physical pages are not allocated unless/until the virtual addresses are actually accessed." That seems to be basically the same as *nix, with the exception that Windows won't overcommit. (That is, on Windows, VirtualAlloc(MEM_COMMIT) will fail if there isn't enough space in swap+core for the page, whereas on Linux, mmap will succeed, but the oomkiller will shoot some process when we run out of swap+core.) In both cases, it seems to me that the high water mark tells us how much memory is used in swap+core. On devices without swap, the high water mark tells us how much core is used. Or am I missing something important here?
(In reply to Justin Lebar [:jlebar] from comment #25) > Do you mean, "on Windows" instead of "on !Windows"? I meant "on all platforms other than Windows" since, on Windows, we do maintain a high-water mark which means "reserved but not MEM_COMMIT'd". > In both cases, it seems to me that the high water mark tells us how much > memory is used in swap+core. Ahhh, this is what you're getting at: the difference between "mmaped but never touched" and "touched, but then paged out" is whether there is associated swap memory. Gotcha. The about:memory 'explicit' comment suggests that 'explicit' measures the eager mmap() size, not high-water-mark usage. Is that right? > On devices without swap, the high water mark > tells us how much core is used. When there isn't swap, isn't mincore == high-water-mark? 99.99% of JS execution stays in the first few pages. Really, it's only pathological JS that uses up the whole stack. (Supporting, e.g., f.apply(x, huge-array) for a huge-array length comparable to other browsers is why we have such a large stack.) I have wondered whether we should do madvise(DONT_NEED) after these high-water spikes, but in the context of the desktop browsers, this didn't seem significant. For mobile, though, perhaps DONT_NEED would actually save a non-trivial amount of memory after one of these errant (usually infinite recursion) scripts pushes the high-water mark to the max.
> The about:memory 'explicit' comment suggests that 'explicit' measures the eager mmap() size, not > high-water-mark usage. Is that right? njn and I have had a longstanding disagreement about 'explicit', and the question is precisely whether the current comment describes what we want 'explicit' to measure. :) Not to speak for him, but I think we currently basically agree that we should measure swap+incore instead of the eager mmap() size. We haven't gotten around to changing things, though. I guess this bug is part of that change. > When there isn't swap, isn't mincore == high-water-mark? I think so! > For mobile, though, perhaps DONT_NEED would actually save a non-trivial amount of memory after one > of these errant (usually infinite recursion) scripts pushes the high-water mark to the max. Indeed, perhaps! Or perhaps such recursion, when it appears in the real world, would likely crash the content process, in which case we don't care.
(In reply to Justin Lebar [:jlebar] from comment #27) > Or perhaps such recursion, when it appears in the real > world, would likely crash the content process, in which case we don't care. JS stack exhaustion definitely shouldn't kill anything: content should get a clean exception that unwinds the stack. Furthermore, Chrome can use more stack than content (that's the "trusted" in trustedEnd), so we can even successfully run chrome JS synchronously when content has exhausted the stack. If you are browsing around the right (wrong) sites, you are silently hitting these things all the time. The two questions I have are: - does madvise(DONT_NEED) do what we think it does on mobile? - how often we we see the stack getting spiked on mobile? (we could use telemetry)
> - does madvise(DONT_NEED) do what we think it does on mobile? jemalloc relies on it heavily, so I really hope it works as we expect. > - how often we we see the stack getting spiked on mobile? (we could use telemetry) Sure, that sounds reasonable. I suspect we'll have other significant memory wins before this one matters too much, though.
(In reply to Justin Lebar [:jlebar] from comment #29) Righto. Filed bug 799250 for potential follow-up on this.
I'd like to land this for B2G so we can properly measure memory usage, which means at this point that it must be blocking basecamp, otherwise I can't land on Aurora. This is a ridiculous state of affairs (we should be allowed to land changes which are not blocking), and I'll write an e-mail to the b2g list about it.
Comment on attachment 668869 [details] [diff] [review] Patch, v2 [Approval Request Comment] Low-risk patch which is necessary for correct memory measurements on b2g. (This patch only affects memory measurement code which I don't think is even run unless you open about:memory.)
Attachment #668869 - Flags: approval-mozilla-aurora?
> As mentioned previously: -1 isn't a good return value for size_t. Do you mind if I return the vsize on error? At least then it might be obvious that something is going on.
> Do you mind if I return the vsize on error? At least then it might be > obvious that something is going on. Sure. BTW, isn't the usual protocol for aurora to wait until a patch has landed on m-i or m-c and then ask for approval on the final version, rather than a r+'d-but-needs-minor-changes version?
> BTW, isn't the usual protocol for aurora to wait until a patch has landed on m-i or m-c and then ask > for approval on the final version, rather than a r+'d-but-needs-minor-changes version? Again, is this hurting anyone? If so, I won't do it. The benefit to me is that I don't have to wait days for aurora approval. Once we switch over our b2g repository (which may have already happened), nobody on B2G can get this patch until I land on Aurora. At the speed we're trying to move at, an extra two days waiting for triage really matters. It's also not usual protocol to plan to land hundreds of patches on Aurora, but that's exactly what we're doing in B2G this cycle. (Every basecamp-blocking bug must land on m-c and m-a, because B2G v1 will be built off Gecko 18.)
Pushed to try, for good measure. https://tbpl.mozilla.org/?tree=Try&rev=a0efd643ab50 (Mostly, I'm concerned we won't have the right headers included on some platform or another, so mincore will fail to compile.)
My "string" #ifdef DEBUG "more string" #endif hack apparently angers MSVC (?!), so I got rid of that and instead build the description string explicitly as nsString desc = NS_LITERAL_STRING(...); #ifdef DEBUG desc += NS_LITERAL_STRING(...); #endif
Attachment #668869 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Ah, awesome. macos's mincore takes char* for the third argument, while Linux's takes unsigned char*. For some reason this doesn't cause build errors on my mac locally, but it does on try.
Attached patch Patch, v3Splinter Review
Nick, would you mind taking a quick look at these changes? The only substantial changes from v2 are to how we declare the description string for the runtime/stack reporter and how we declare the type for mincore's third argument.
Attachment #670540 - Flags: review?(n.nethercote)
Comment on attachment 670540 [details] [diff] [review] Patch, v3 Looks fine, but can you fix the tiny JSRuntime::sizeOfExplicitNonHeap() bug mentioned at the top of comment 23? Thanks!
Attachment #670540 - Flags: review?(n.nethercote) → review+
> Looks fine, but can you fix the tiny JSRuntime::sizeOfExplicitNonHeap() bug mentioned at the top of > comment 23? Gah, sorry again for missing a review comment. Fixed and pushed. https://hg.mozilla.org/integration/mozilla-inbound/rev/05567a9159c5 https://hg.mozilla.org/integration/mozilla-inbound/rev/cea04d0eb7c9
(In reply to Justin Lebar [:jlebar] from comment #44) > > Looks fine, but can you fix the tiny JSRuntime::sizeOfExplicitNonHeap() bug mentioned at the top of > > comment 23? > > Gah, sorry again for missing a review comment. > > Fixed and pushed. > > https://hg.mozilla.org/integration/mozilla-inbound/rev/05567a9159c5 > https://hg.mozilla.org/integration/mozilla-inbound/rev/cea04d0eb7c9 Sorry, backed out because of the xpcshell orange: https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=cea04d0eb7c9 https://hg.mozilla.org/integration/mozilla-inbound/rev/a315b2dfd1b2
FWIW this was an OSX failure, so even if I had run xpcshell tests on linux (which I probably should have), it wouldn't have caught this.
Oh, I was doing NS_LITERAL_CSTRING() on a proper char*. :-/ (I wonder if there's any way to make that a compile error.) That also probably explains why MSVC didn't like the hack from comment 37 -- we're sending the "str" #ifdef DEBUG "more str" #endif into a macro. We were hitting it in the telemetry test because that counts the number of ghost windows, which runs the JS multi-reporter, which runs this code. Anyway, I'll get rid of the ifdef DEBUG cleverness, I guess. Nick, I recall us being concerned about ghost windows + telemetry because we didn't want to walk the whole JS heap counting compartments. Could you double-check that xpc::JSMemoryMultiReporter::CollectReports() does not, in fact, do this expensive operation? I tried to see myself, but it's all looking like goop to me this morning.
> Nick, I recall us being concerned about ghost windows + telemetry because we > didn't want to walk the whole JS heap counting compartments. Could you > double-check that xpc::JSMemoryMultiReporter::CollectReports() does not, in > fact, do this expensive operation? I tried to see myself, but it's all > looking like goop to me this morning. xpc::JSMemoryMultiReporter::CollectReports *is* the expensive one. xpc::JSMemoryMultiReporter::GetExplicitNonHeap is the cheap one, but it's only for computing "explicit". However, telemetry doesn't look at any multi-reporters, only single reporters. And the "ghost-windows"/GHOST_WINDOWS telemetry number comes from nsWindowMemoryReporter::NumGhostsReporter, which looks reasonably cheap. So I think we're ok, but I'm slightly worried that you are thinking that xpc::JSMemoryMultiReporter::CollectReports is cheap -- should I be? :) ---- BTW, the ghost windows reporting does this: nsCOMPtr<nsIEffectiveTLDService> tldService = do_GetService( NS_EFFECTIVETLDSERVICE_CONTRACTID); and I've seen the EffectiveTLDService taking up ~150KB in DMD reports. Is this a singleton class/service? And is it something that would be created even if the ghost window reporter didn't create it? The DMD reports said that it was created for this ghost windows reporter's purpose.
> So I think we're ok, but I'm slightly worried that you are thinking that > xpc::JSMemoryMultiReporter::CollectReports is cheap -- should I be? :) Honestly, I didn't remember which was the cheap and which was the expensive one. :) > Is this a singleton class/service? That's what do_GetService, as opposed to do_GetInstance, so yes. > And is it something that would be created even if the ghost window reporter didn't > create it? I expect so, but that would be easy to test.
Status: NEW → RESOLVED
Closed: 12 years ago
Flags: in-testsuite-
Resolution: --- → FIXED
Target Milestone: --- → mozilla19
I pushed both patches even though I only had a=akeybl on the one, because the second one is a trivial fix, and my risk assessment applies there. If this was the wrong thing to do, let me know and I won't do it again. https://hg.mozilla.org/releases/mozilla-aurora/rev/674b97ffda67 https://hg.mozilla.org/releases/mozilla-aurora/rev/97b1bdb2046d
Blocks: 801295
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: