Open Bug 855669 Opened 7 years ago Updated 4 years ago

Figure out how emscripten &co can decommit memory

Categories

(Core :: JavaScript Engine, defect)

People

(Reporter: justin.lebar+bug, Assigned: dougc)

Details

(Whiteboard: [MemShrink:P2])

Attachments

(2 files, 9 obsolete files)

23.87 KB, patch
15.90 KB, patch
(Sorry to file this bug if you guys have already thought about this.)

Decommitting parts of the heap is an important part of memory management in native apps.

But as far as I'm aware, we don't expose this functionality to JS.  Native apps ported to JS can only "unmap" memory, by releasing an array buffer.

As far as I'm aware, emscripten uses one big array buffer to simulate "memory".  In this case, simulating munmap by releasing the array buffer is not an option, and exposing decommit becomes even more important.

Similarly, as far as I can tell, asm.js limits programs to a single array buffer under at least some circumstances.  Again this makes decommit even more important.

In bug 845685 comment 3, we found that the peak virtual memory usage of a library we'd like to run as JS to power the B2G keyboard is 9.4mb, while the peak RSS of that library is 1.2mb.  I think we probably can't use this library if it costs 9mb, which is ~9% of our available memory for apps.
Whiteboard: [MemShrink]
Whiteboard: [MemShrink] → [MemShrink:P2]
Whiteboard: [MemShrink:P2] → [MemShrink]
Whiteboard: [MemShrink] → [MemShrink:P2]
Emscripten has an option to grow the heap automatically. We could also have an option to shrink it, but that would only work if memory were not fragmented, which seems like the normal case. But native apps also can't decommit when fragmented.
> But native apps also can't decommit when fragmented.

IIUC, native heaps can have a non-contiguous heap, but Emscripten'd apps can't.  So there's a big difference.
> We could also have an option to shrink it, but that would only work if memory were not 
> fragmented, which seems like the normal case. But native apps also can't decommit when 
> fragmented

I'm not sure if you're saying fragmentation is or isn't the normal case.

Anyway good heap allocators are designed to minimize this fragmentation.
I would imagine that the difference is that native programs can decommit at the granularity of 4K pages, while I'm guessing that for efficiency emscripten uses much bigger typed array allocations.
And asm.js restricts you to just one typed array, right?  (Per invocation of an asm.js routine or something?)

OT but I wonder how emscripten handles pointers which cross array boundaries.
Yes, one typed array. And yes, we have no way to deallocate segments inside the heap, just shrink the entire thing, hence fragmentation is problematic.
Draft patches that probably don't compile outside MacOS.  I'd push to try, but it's down.

I don't know what I'm doing in the JS engine, so feedback is welcome!
Is ArrayBuffer.clear spec'ed somewhere?
The most complete spec atm is in bug 855669 comment 11.  :)
Then it had better be behind an ifdef nightly flag.
(In reply to :Ms2ger from comment #13)
> Then it had better be behind an ifdef nightly flag.

I thought we enabled experimental features in Aurora too?

In any case, I was thinking we might use Math.imul as a precedent to follow here, but I don't know the details of how that went down.
dherman proposed Math.imul for standardization before any implementation work, I am pretty sure. Reactions were ok, so we moved forward.

I'm not sure what the policy for Aurora is. But definitely anything non-standardized should be off by Beta, I assume, if not earlier.
Amazingly this was green on the first try.  We still need to do a bit of testing to ensure that this actually drops pages, but if the test is passing, it's zeroing the pages, so we should actually be dropping them.

I should note that Windows doesn't overcommit, so this doesn't let us allocate more memory on Windows than we would be able to otherwise.  However, by setting a chunk of pages to CoW the zero page, we're effectively swapping those pages out; this makes room in main memory for stuff which otherwise would be in the page file.

In other words, clear() on Windows does not make it less likely that we will crash due to OOM, but it does make it less likely that we'll page.  At least, that's my understanding of the black box that is Windows VMM.

I'm not sure who owns this code, but if I can get f+ on the general idea here, I'd be happy to write an e-mail to the typed array spec mailing list.

The main API question here, aside from whether we want this at all, is: should the various typed arrays have a clear() function (instead of, or in addition to, ArrayBuffer::clear())?  How about DataView?  I modeled clear() on slice(), which exists only on ArrayBuffer, but I don't feel strongly about whether clear() should exist on the other classes or not.
Attachment #757002 - Flags: feedback?(sphink)
I am indeed the one who has been working on this code most recently, but that's about the extent of it. I'm happy to give my personal opinions, but I think the spec questions are the important ones.

I totally agree that we should be decommitting, somehow. And I think your implementation looks right for Linux and Windows; I don't know enough about OSX to say for sure.

In particular, MADV_DONTNEED on an anonymous mapping seems to be exactly what you want. I don't understand what you said about "by setting a chunk of pages to CoW the zero page, we're effectively swapping those pages out" on Windows -- where in the patch is that being done? Regardless, I think your Windows implementation is correct, in that you decommit and then re-commit the page range, because of this line from http://msdn.microsoft.com/en-us/library/windows/desktop/aa366803%28v=vs.85%29.aspx : "committed pages do not consume any physical storage until they are first accessed".

I'm not so sure about the API. I have prototyped ArrayBuffer-based methods intended as targets for memcpy/memmove and memset. ArrayBuffer#move, in fact, is in Nightly and Aurora, so you should model this addition after it (specifically, I think you want #ifndef RELEASE_BUILD). I also implemented ArrayBuffer#fill (a memset equivalent) in the past, and may have given azakai a build with that for perf testing; I can't remember.

At any rate, I'm wondering if clear() is the right API, or if it should be fill() with special handling when filling with zeroes, or if this should go all the way the other way and do ArrayBuffer#advise or something.

fill() is a little weird because you pass in a typed value to a method on an untyped container (ArrayBuffer), so from that standpoint clear() makes a little more sense. Except that if we end up wanting a close equivalent to memset(), then clear() seems redundant (cf bzero). Then again, perhaps we should be self-hosting things like fill(), in which case we might want an intrinsic that would look a lot like clear() to handle the decommitting.

I'm going to needinfo? luke, since he's been thinking about these things for asm.js and may have a broader vision of what this stuff should be moving towards.

The spec situation is a little messy. AIUI, Typed arrays started out being specced by Khronos, exclusively intended for OpenGL, but then migrated to WebIDL. But now it looks like ES6 will be taking them over, probably with some modifications. So it looks like current discussion is on es-discuss, though dherman is working on this aspect of the spec and it may be best to go through him.
Flags: needinfo?(luke)
Attachment #757002 - Flags: feedback?(sphink)
> "by setting a chunk of pages to CoW the zero page, we're effectively swapping those pages 
> out" on Windows -- where in the patch is that being done?

We're referring to the same thing:

> "committed pages do not consume any physical storage until they are first accessed".

The way I assume this is implemented in the VMM is that committed but unwritten-to pages are CoW to the zero page.  (I really hope that by "accessed" they mean "written to".)

FWIW I am not at all wed to any particular API, so long as we give pages some way to decommit memory.
This functionality makes sense.  In theory, we could port a malloc() implementation that decommits unused pages somewhat directly using clear(), right?  Anyone tried jemalloc ;) ?

On the subject of landing: definitely behind a pref that is default off on Release/Beta.  Before landing at all, though, I think we should work with dherman to get a strawman proposal up somewhere public.
Flags: needinfo?(luke)
> In theory, we could port a malloc() implementation that decommits unused pages somewhat 
> directly using clear(), right?

Yes, exactly.

Dave, do you have any thoughts on what the API should be here?  It seems like Luke doesn't have a strong opinion among the options discussed (?), and I certainly don't care.
Flags: needinfo?(dherman)
Are the API options summarized somewhere?
> Are the API options summarized somewhere?

1) ArrayBuffer.clear(start, end).  start and end are interpreted as in slice().  Sets the range [start, end) to 0, possibly decommitting memory.

1.a) Also add a clear() method on typed arrays.

1.b) Also add a clear() method on DataView.

2) ArrayBuffer.fill(), with special handling so that ArrayBuffer.fill(0) decommits if possible.  See the caveat in the third-to-last paragraph in comment 17.

3) ArrayBuffer.madvise().

Thinking about this more, I guess implicitly decommitting with fill(0) is somewhat problematic, because fill(0) really is a different operation than decommitting, from a performance point of view.  Then again, having both clear() and fill() is confusing, because it's not obvious how clear() would be different than fill(0).  Maybe that argues in favor of madvise().
Oh, and

3.a) We could have madvise() which doesn't necessarily zero out the memory.  This would enable us to make it faster on mac, where we'd like to call MADV_FREE.

I personally think that exposing these "might have been zeroed" semantics is a can of worms, but it's an option.  I don't have numbers on the perf difference, but anyway it depends on your benchmark.
Filling with zeros is overhead that worries me. Something like madvise MADV_DONTNEED sounds good to me, that is, to give a hint the memory can be paged out if RAM is needed, but I don't know if that is supported outside of Linux. Also, I worry about performance if we repeatedly malloc/free the same memory, and end up giving a lot of hints to the OS that are misleading.

Another possibility is to allow growing and shrinking of ArrayBuffers. This could be used to implement sbrk efficiently. This is safer and perhaps simpler, but would not work for fragmented C heaps of course.
> Filling with zeros is overhead that worries me.

It's not necessary to fill pages with zeroes on Windows, Mac, or Linux.  On Linux you can use MADV_DONTNEED.  On Windows and Mac, you can decommit and re-commit the pages.

But decommitting + recommitting isn't free, since it involves messing with the pagetable (and probably flushing the TLB).
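
For reference, here is a minimal sketch (not taken from any attached patch) of the platform primitives being discussed, assuming the caller has already page-aligned the range:

  #include <cstddef>
  #if defined(_WIN32)
  # include <windows.h>
  #else
  # include <sys/mman.h>
  #endif

  // Drop the physical pages backing a page-aligned range while keeping the
  // virtual range mapped and logically zero.
  static bool DiscardPages(void* addr, size_t length)
  {
  #if defined(_WIN32)
      // Assumes the range lives in a region reserved with VirtualAlloc.
      // Decommit releases the physical storage; recommitting keeps the range
      // usable and guarantees it reads back as zeroes when touched again.
      if (!VirtualFree(addr, length, MEM_DECOMMIT))
          return false;
      return VirtualAlloc(addr, length, MEM_COMMIT, PAGE_READWRITE) != nullptr;
  #else
      // On Linux, MADV_DONTNEED drops the pages of an anonymous mapping;
      // later reads fault in fresh zero pages.
      return madvise(addr, length, MADV_DONTNEED) == 0;
  #endif
  }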

> Also, I worry about performance if we repeatedly malloc/free the same memory, and end up 
> giving a lot of hints to the OS that are misleading.

That would be a bug in the allocator; I'm not sure we can or should fix that in this API.

> This is safer and perhaps simpler, but would not work for fragmented C heaps of course.

Right; I'm specifically interested in solving this problem, which is really important, particularly for B2G where we have so little memory to begin with.
(In reply to Justin Lebar [:jlebar] from comment #25)
> > Also, I worry about performance if we repeatedly malloc/free the same memory, and end up 
> > giving a lot of hints to the OS that are misleading.
> 
> That would be a bug in the allocator; I'm not sure we can or should fix that
> in this API.

To expand on that: iirc, allocators like jemalloc will use time measurements to avoid decommitting too soon (for the same reasons you mentioned).
dherman, ni ping?
madvise sounds likely to be controversial, since it's just a hint with no observable behavior (right?).

But I'm not sure I'm understanding this. I'd like to try to discuss face-to-face/video with jlebar and others next week.

Dave
Flags: needinfo?(dherman)
> madvise sounds likely to be controversial, since it's just a hint with no observable behavior 
> (right?).

Not quite.

madvise in (3) would explicitly say, "set these bits to zero and also decommit as much as you can" (you can only decommit at page granularity, so the parts of the array that share pages with other stuff can't be decommitted).

This would not be observably different from setting the bits to zero, except for its effect on speed (both of the madvise() operation itself and of future accesses to the array) and OOMs.

So in this sense, madvise() is just another name for option (1), clear() with the decommit semantics.  The main difference is that madvise() makes explicit what we're doing.

Option (3.a) is different from option (3) in that it doesn't necessarily set the bits to zero; the OS gets to set them to zero whenever it feels like it.  Again, this is still quite observable to the script.
Here are what I see as some of the design challenges here:

- Regardless of incidental observable behavior, the key behavior you want this operation for is not directly observable (other than timing). That's a challenge for standardization; hints are hard to enforce (e.g., there's no criterion clearly defining conformance), and if a feature gets used for unintended purposes it can create pressure for implementations to work against the intended performance model.

- We don't know if madvise() is available on all relevant platforms, especially top-tier architectures that web browsers run on, but also keeping in mind that the web platform is being deployed on new architectures like the PowerPC-based Nintendo Wii U.

- If madvise zeroes memory, it's in danger of being abused as a memzero for memory that *is* actively being used, creating a directly competing counter-pressure against swapping out the region. Even if we simultaneously offer a memzero operation, that doesn't mean web sites will make the right choice between the two.

- If madvise doesn't zero memory, then we have to actively copy the memory to disk, which would make it slow.

- If we allow non-page-aligned regions, might it potentially be expensive (or at least more expensive than a noop, which is what Emscripten does today) for the implementation to do the page boundary arithmetic?

- If we don't, we have to figure out what a sufficiently universal page alignment is (probably 4K but I don't know just how universal that is).

- We'd need to figure out whether the operations live on ArrayBuffer, ArrayBufferView, or just Uint8Array. The typed arrays API was designed to avoid having many operations on ArrayBuffer, and I believe Ken Russell feels it's important to keep the ArrayBuffer API size small. If it lives on all ArrayBufferViews, the address scaling is annoying and weird. If it lives only on Uint8Array, it requires you to allocate a Uint8Array view just to release memory, which is annoying and a bit counterproductive.

- It seems to me that a resizing operation is desirable for other purposes, and if it could be used for this use case (of dealing with internal fragmentation of a typed array) that might be a better place to focus our attention. But if jlebar's use cases are unable to actually shrink the array size, then that's not a solution.

So those are the challenges I see. I think the best course of action is to get Ken involved in the discussion. I'll reach out.

Dave
I can answer some of these questions.

> - If we allow non-page-aligned regions, might it potentially be expensive (or at least 
> more expensive than a noop, which is what Emscripten does today) for the implementation 
> to do the page boundary arithmetic?

Calling madvise is relatively expensive -- it's a syscall and involves mucking with page tables.  It probably also involves a TLB flush.

In comparison, doing page-alignment arithmetic is very cheap.  memset'ing up to two pages of memory (at the edges of the array buffer) should also be very cheap in comparison.

We definitely don't want to expose the system's page size (or the alignment of an array buffer's allocation!), so IMO we should definitely not restrict to page-sized chunks.
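
To make the boundary handling concrete, a minimal sketch (again, not the patch itself) of the arithmetic: zero the unaligned head and tail with memset, and hand only the whole pages in between to a platform discard primitive.  DiscardPages below is the hypothetical helper sketched earlier in this bug, and pageSize is assumed to be a power of two:

  #include <cstring>
  #include <cstdint>

  // Hypothetical platform primitive: release the physical pages backing a
  // page-aligned range (e.g. madvise(MADV_DONTNEED) or decommit + recommit).
  bool DiscardPages(void* addr, size_t length);

  static void DiscardRange(uint8_t* begin, uint8_t* end, size_t pageSize)
  {
      uintptr_t b = reinterpret_cast<uintptr_t>(begin);
      uintptr_t e = reinterpret_cast<uintptr_t>(end);
      uintptr_t firstPage = (b + pageSize - 1) & ~(uintptr_t(pageSize) - 1); // round up
      uintptr_t lastPage = e & ~(uintptr_t(pageSize) - 1);                   // round down

      if (firstPage >= lastPage) {
          memset(begin, 0, e - b);  // the range contains no whole page; just zero it
          return;
      }
      memset(begin, 0, firstPage - b);                                        // unaligned head
      DiscardPages(reinterpret_cast<void*>(firstPage), lastPage - firstPage); // whole pages
      memset(reinterpret_cast<void*>(lastPage), 0, e - lastPage);             // unaligned tail
  }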

> - It seems to me that a resizing operation is desirable for other purposes, and if it 
> could be used for this use case (of dealing with internal fragmentation of a typed array) 
> that might be a better place to focus our attention. But if jlebar's use cases are unable 
> to actually shrink the array size, then that's not a solution.

Resizing the array is orthogonal to this proposal.

The question isn't whether the use cases I'm concerned about are able to shrink the array.  The question is whether or not shrinking the array would be a useful operation for those use-cases.  The answer is, it would not be.  :)

This is because a heap allocator needs to deal with the situation when it has a live block at the beginning and end of an OS-allocated chunk.  That is, jemalloc allocates 1mb from the system, uses it, and at some point gets into the following state:

  chunk = A---------------------B

where A and B are live heap objects, and "-" indicates unused space.

Shrinking the chunk is not an option, because that would erase A or B.  The only option I'm aware of is to decommit the dashes.

We have the same problem (actually, worse) if we're in asm.js and restricted to using only one typed array for all of our memory.  Now shrinking that typed array is even less feasible, since the array is bounded on either end by the lowest-address and highest-address live heap blocks in the whole program.

When we spoke on IRC, you asked for me to schedule a meeting with you.  Do you still want to do that before you involve Ken?
> When we spoke on IRC, you asked for me to schedule a meeting with you.  Do
> you still want to do that before you involve Ken?

No need, I'm caught up, thanks.

Dave
What's the status of this bug?
Sorry for my absence on this. Here's the current state of things.

## API

We want two operations, with the following definitions:

- buffer.zero([startOffset = 0[, endOffset = buffer.byteLength]])

Zeroes out the buffer's memory from startOffset (inclusive) to endOffset (exclusive).

- buffer.discard([startOffset = 0[, endOffset = buffer.byteLength]])

Zeroes out the buffer's memory from startOffset (inclusive) to endOffset (exclusive). Performance expectation: decommits the largest number of OS pages possible that fit within the specified range, to be lazily recommitted and zeroed on demand if those parts of the buffer are accessed again in the future.

We want both so that .zero() will be the more attractive and natural choice for programmers who simply want the zeroing behavior. The name .discard() is preferable to .clear() because it makes it clear that its primary intention is memory management functionality, not the zeroing.

## Implementation Strategies

Basically everyone everywhere can use memzero for .zero().

For .discard(), POSIX systems can use madvise() with MADV_DONTNEED. We know this works on Linux and at least compiles on OS X, but we should test it. On Windows, we *can't* use VirtualAlloc(MEM_RESET), which jemalloc and the GC use, because it doesn't guarantee zeroed memory. We should test whether MEM_DECOMMIT followed by MEM_COMMIT behaves like madvise(MADV_DONTNEED).
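
For concreteness, a minimal standalone sketch of that Windows test (an illustration only, not part of any patch): dirty some committed pages, decommit and recommit them in place, and check that they read back as zeroes.

  #include <windows.h>
  #include <cstdio>
  #include <cstring>

  int main()
  {
      const size_t size = 1 << 20;  // 1 MiB
      char* p = static_cast<char*>(
          VirtualAlloc(nullptr, size, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE));
      if (!p)
          return 1;
      memset(p, 0xab, size);                              // dirty the pages
      VirtualFree(p, size, MEM_DECOMMIT);                 // release physical storage
      VirtualAlloc(p, size, MEM_COMMIT, PAGE_READWRITE);  // recommit in place
      bool zeroed = true;
      for (size_t i = 0; i < size; i++)
          zeroed = zeroed && p[i] == 0;
      printf("recommitted pages %s zeroed\n", zeroed ? "read back" : "do NOT read back");
      VirtualFree(p, 0, MEM_RELEASE);
      return 0;
  }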

## Standardization

It would be good to test OS X and Windows to be confident we can propose these two operations. We can chat with Ken meanwhile and see what he thinks of these. Once we're at a point he's happy with, I can bring it to TC39.

## No Resizing

Luke says we should avoid resizing operations and see how far we can get with apps requesting enormous buffers. Resizing opens up development challenges we'd like to avoid if possible.

Dave
Is there any chance of getting 'buffer.discard()' into FF26 / FxOS 1.2?

This could make a difference to memory usage and performance and is
quite an important function.

buffer.zero() is probably not so important because a JS or asm.js
function could be substituted with only a potential loss of
performance.

Is it blocked in the standardization process, or do we just need
someone to finish it off?
(In reply to Douglas Crosher [:dougc] from comment #35)
> Is there any chance of getting 'buffer.discard()' into FF26 / FxOS 1.2?

FF26? No. FxOS 1.2? I don't know what the policy is for that. Can we expose unspecced things there?

> Is it blocked in the standardization process, or do we just need
> someone to finish it off?

We can't ship it in Firefox until it's standardized, but that doesn't hold up implementing it. (And in fact, having an implementation would help with the spec process.) We'd just have to turn it off in beta/release. Whether that means FxOS can compile with it on, I don't know -- seems like it could be chrome-only without a problem, assuming that's helpful.

But I personally don't have the time to work on it at the moment.
This includes support for Linux and Solaris.  The main need for this is FxOS and Android, so I request landing this even without support for all the other supported platforms, and addressing the others in follow-up bugs if necessary.  The code falls back to zero filling.

I'm familiar with Solaris, and it has MADV_FREE, which does exactly what is needed, so support for it has been added.

There is some MacOS code that is expected to work but is not tested.  It was not clear why the original code mmap'ed twice on MacOS, so here the region is just mmap'ed once.

There is some example code for Windows, but it is disabled because it is untested and I am not confident about it.  Something along these lines is expected to help on Windows too.

Help testing and developing the MacOS and Windows code would be appreciated.

The new discard() function is only enabled on nightly.
Attachment #757001 - Attachment is obsolete: true
Attachment #757002 - Attachment is obsolete: true
Attachment #757003 - Attachment is obsolete: true
Attachment #807777 - Flags: review?(sphink)
Comment on attachment 807777 [details] [diff] [review]
Rename to discard(), fixes, rebase, tested linux support

Review of attachment 807777 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jstypes.h
@@ +153,5 @@
>  **      Commonly used macros for operations on compatible types.
>  ***********************************************************************/
>  #define JS_HOWMANY(x,y) (((x)+(y)-1)/(y))
>  #define JS_ROUNDUP(x,y) (JS_HOWMANY(x,y)*(y))
> +#define JS_ROUNDDOWN(x,y) (((x)/(y))*(y))

Ugh. These are all really nasty. If you accidentally pass in a float, these do not at all do what you'd want. But that's not your problem; yours fits in fine here.

On the other hand, would it be better to do this as (x - x % y)?

::: js/src/vm/TypedArrayObject.cpp
@@ +118,5 @@
> + * Get |begin| and |end| out of args, assuming |args| corresponds to a
> + * call to slice() or discard().
> + */
> +static bool
> +GetSliceOrDiscardBeginEnd(JSContext *cx, CallArgs args, uint32_t *begin, uint32_t *end)

Can you call this ParseBeginEndArgs, and pass in the index of the begin arg? (As in, zero for both of these.) I know it's a little silly, but the "SliceOrDiscard" part of the name bothers me. Or pass in both Value args as HandleValues if you prefer and don't mind having even more params.

@@ +122,5 @@
> +GetSliceOrDiscardBeginEnd(JSContext *cx, CallArgs args, uint32_t *begin, uint32_t *end)
> +{
> +    Rooted<JSObject*> thisObj(cx, &args.thisv().toObject());
> +
> +    // Tese are the default values

// Default to using the whole buffer

@@ +207,5 @@
> +    if (!GetSliceOrDiscardBeginEnd(cx, args, &begin, &end))
> +        return false;
> +
> +    Rooted<JSObject*> thisObj(cx, &args.thisv().toObject());
> +    discardArrayBuffer(cx, thisObj->as<ArrayBufferObject>(), begin, end);

Remove the cx parameter to discardArrayBuffer and don't root here.

@@ +613,5 @@
> +#endif
> +
> +#if defined(XP_WIN)
> +#include "jswin.h"
> +#include <psapi.h>

Don't add platform-dependent stuff here unless you really need it. Put it in gc/Memory.cpp instead, following the model of gc::MarkPagesUnused. I know, this doesn't have anything to do with GC, but I think it's the closest we have. (There's some related functionality in the JIT allocator, but that dependency would be worse.)

Hm, in looking at this file again, it seems like the AsmJS does this too. It really shouldn't either.

@@ +674,5 @@
> +}
> +
> +void
> +ArrayBufferObject::discardArrayBuffer(JSContext *cx, ArrayBufferObject &arrayBuffer,
> +                                      uint32_t begin, uint32_t end)

Again, remove the cx parameter (because it makes it look this can gc, so I immediately thought this was a rooting hazard since you're passing in a object ref.)

@@ +687,5 @@
> +    uintptr_t dp = (uintptr_t) arrayBuffer.dataPointer();
> +    uintptr_t beginPtr = dp + begin;
> +    uintptr_t endPtr = dp + end;
> +
> +#if defined(__linux__) || defined(SOLARIS) || defined(XP_MACOSX) || defined(XP_WIN)

I think you can get rid of all platform ifdefs here. Use rt->gcSystemPageSize for the page size. Isn't the logic something like:

  beginZero = min(endPtr - beginPtr, uintptr_t(-beginPtr) % pageSize);
  endZero = min(endPtr - (beginPtr + beginZero), endPtr - lastPage);
  memset(beginPtr, 0, beginZero);
  if (firstPage < lastPage)
    discardPages(firstPage, lastPage);
  memset(lastPage, 0, endZero);

That's probably not quite right; I always get these things wrong. You could also put a gc::discardRegion in gc/Memory.cpp, I suppose, if this is likely to be used from other places.

@@ +3550,5 @@
>  };
>  
>  const JSFunctionSpec ArrayBufferObject::jsfuncs[] = {
>      JS_FN("slice", ArrayBufferObject::fun_slice, 2, JSFUN_GENERIC_NATIVE),
> +#ifdef NIGHTLY_BUILD

I'd prefer #ifndef RELEASE_BUILD, as with TypedArray#move. That'll put it everywhere but release and beta. It's ok to have it on Aurora, and NIGHTLY_BUILD will disable it for people's normal development builds. (Well, *really* I'd prefer having these sorts of things controllable by a magic pref or something. But that's tricky.)
Attachment #807777 - Flags: review?(sphink)
(In reply to Dave Herman [:dherman] from comment #34)
...
> ## API
...
> - buffer.discard([startOffset = 0 [, endOffset = buffer.byteLength])
> 
> Zeroes out the buffer's memory from startOffset (inclusive) to endOffset
> (exclusive). Performance expectation: decommits the largest number of OS
> pages possible that fit within the specified range, to be lazily recommitted
> and zeroed on demand if those parts of the buffer are accessed again in the
> future.

After some limited experience trying to take advantage of this, could I suggest adding:

It is recommended that the start and end be aligned to a boundary of at least 4096 bytes, a common page size; otherwise partial pages at the start and end are not expected to be discarded by the operating system, only zero filled.  If the start and/or end cannot be aligned in this manner, then it is recommended that the start-to-end span be as large as possible, to maximize the number of intermediate pages that the operating system can discard.

Implementations are encouraged to align the start of the buffer elements with a system page, so that calls to discard() with aligned start and end indexes discard the start and end pages, and so that calls to discard small regions are effective.
Attached patch Address reviewer feedback. (obsolete) — Splinter Review
Add a preference, javascript.options.buffer_discard, which defaults to off for a RELEASE_BUILD.  Is this what you have in mind?

Disabled the untested MacOS support.

Tried to address the feedback, thank you for the quick review.
Attachment #808245 - Flags: feedback?(sphink)
(In reply to Steve Fink [:sfink] from comment #38)
> Comment on attachment 807777 [details] [diff] [review]
> Rename to discard(), fixes, rebase, tested linux support
> 
> Review of attachment 807777 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/jstypes.h
> @@ +153,5 @@
> >  **      Commonly used macros for operations on compatible types.
> >  ***********************************************************************/
> >  #define JS_HOWMANY(x,y) (((x)+(y)-1)/(y))
> >  #define JS_ROUNDUP(x,y) (JS_HOWMANY(x,y)*(y))
> > +#define JS_ROUNDDOWN(x,y) (((x)/(y))*(y))
> 
> Ugh. These are all really nasty. If you accidentally pass in a float, these
> do not at all do what you'd want. But that's not your problem; yours fits in
> fine here.
> 
> On the other hand, would it be better to do this as (x - x % y)?

Removed this macro; the low bits are just masked off inline, since the page size is a power of two.
 
> ::: js/src/vm/TypedArrayObject.cpp
> @@ +118,5 @@
> > + * Get |begin| and |end| out of args, assuming |args| corresponds to a
> > + * call to slice() or discard().
> > + */
> > +static bool
> > +GetSliceOrDiscardBeginEnd(JSContext *cx, CallArgs args, uint32_t *begin, uint32_t *end)
> 
> Can you call this ParseBeginEndArgs, and pass in the index of the begin arg?
> (As in, zero for both of these.) I know it's a little silly, but the
> "SliceOrDiscard" part of the name bothers me. Or pass in both Value args as
> HandleValues if you prefer and don't mind having even more params.

Done.
 
> @@ +122,5 @@
> > +GetSliceOrDiscardBeginEnd(JSContext *cx, CallArgs args, uint32_t *begin, uint32_t *end)
> > +{
> > +    Rooted<JSObject*> thisObj(cx, &args.thisv().toObject());
> > +
> > +    // Tese are the default values
> 
> // Default to using the whole buffer

Done.
 
> @@ +207,5 @@
> > +    if (!GetSliceOrDiscardBeginEnd(cx, args, &begin, &end))
> > +        return false;
> > +
> > +    Rooted<JSObject*> thisObj(cx, &args.thisv().toObject());
> > +    discardArrayBuffer(cx, thisObj->as<ArrayBufferObject>(), begin, end);
> 
> Remove the cx parameter to discardArrayBuffer and don't root here.

Done, but could you please check it.
 
> @@ +613,5 @@
> > +#endif
> > +
> > +#if defined(XP_WIN)
> > +#include "jswin.h"
> > +#include <psapi.h>
> 
> Don't add platform-dependent stuff here unless you really need it. Put it in
> gc/Memory.cpp instead, following the model of gc::MarkPagesUnused. I know,
> this doesn't have anything to do with GC, but I think it's the closest we
> have. (There's some related functionality in the JIT allocator, but that
> dependency would be worse.)

Done.  Moved into gc/Memory.cpp

> @@ +674,5 @@
> > +}
> > +
> > +void
> > +ArrayBufferObject::discardArrayBuffer(JSContext *cx, ArrayBufferObject &arrayBuffer,
> > +                                      uint32_t begin, uint32_t end)
> 
> Again, remove the cx parameter (because it makes it look this can gc, so I
> immediately thought this was a rooting hazard since you're passing in a
> object ref.)

Done.  But I needed to pass the JSRuntime in order to access the page size stored there.  Might it be better to be able to access the page size without needing the JSRuntime?
 
> @@ +687,5 @@
> > +    uintptr_t dp = (uintptr_t) arrayBuffer.dataPointer();
> > +    uintptr_t beginPtr = dp + begin;
> > +    uintptr_t endPtr = dp + end;
> > +
> > +#if defined(__linux__) || defined(SOLARIS) || defined(XP_MACOSX) || defined(XP_WIN)
> 
> I think you can get rid of all platform ifdefs here. Use
> rt->gcSystemPageSize for the page size. Isn't the logic something like:
> 
>   beginZero = min(endPtr - beginPtr, uintptr_t(-beginPtr) % pageSize);
>   endZero = min(endPtr - (beginPtr + beginZero), endPtr - lastPage);
>   memset(beginPtr, 0, beginZero);
>   if (firstPage < lastPage)
>     discardPages(firstPage, lastPage);
>   memset(lastPage, 0, endZero);
> 
> That's probably not quite right; I always get these things wrong. You could
> also put a gc::discardRegion in gc/Memory.cpp, I suppose, if this is likely
> to be used from other places.

Cleaned it up a bit.  I was trying to avoid calling memset with a zero count, but I guess it does not matter.
 
> @@ +3550,5 @@
> >  };
> >  
> >  const JSFunctionSpec ArrayBufferObject::jsfuncs[] = {
> >      JS_FN("slice", ArrayBufferObject::fun_slice, 2, JSFUN_GENERIC_NATIVE),
> > +#ifdef NIGHTLY_BUILD
> 
> I'd prefer #ifndef RELEASE_BUILD, as with TypedArray#move. That'll put it
> everywhere but release and beta. It's ok to have it on Aurora, and
> NIGHTLY_BUILD will disable it for people's normal development builds. (Well,
> *really* I'd prefer having these sorts of things controllable by a magic
> pref or something. But that's tricky.)

Have made an attempt, feedback welcomed.
For discard() to be most useful it will be necessary to
align the start of the buffer elements to a page boundary.

Otherwise it will not be possible to reliably discard a
single page.

This patch allocates an extra page and uses this to align
the start of the elements to a page boundary.  The offset
is recorded in the flags to be used when freeing the buffer.

Note x64 asm.js buffers are already aligned appropriately.

Might something along these lines be acceptable?
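
For illustration, the over-allocation trick looks roughly like this (a sketch with placeholder names, not the patch itself): allocate one extra page, round the data pointer up to the next page boundary, and record the offset so the original allocation can be freed later.

  #include <cstdlib>
  #include <cstdint>

  struct AlignedBuffer {
      void*  base;    // pointer returned by malloc, needed for free(base)
      void*  data;    // page-aligned start of the element data
      size_t offset;  // data - base, recorded so the buffer can be freed later
  };

  static AlignedBuffer AllocatePageAligned(size_t nbytes, size_t pageSize)
  {
      AlignedBuffer buf = {};
      buf.base = malloc(nbytes + pageSize);  // one extra page of slack
      if (!buf.base)
          return buf;
      uintptr_t p = reinterpret_cast<uintptr_t>(buf.base);
      uintptr_t aligned = (p + pageSize - 1) & ~(uintptr_t(pageSize) - 1);
      buf.offset = aligned - p;
      buf.data = reinterpret_cast<void*>(aligned);
      return buf;
  }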
Attachment #807777 - Attachment is obsolete: true
Attachment #808246 - Flags: feedback?(sphink)
The code in AllocateArrayBufferContents needed a good
deal of re-working.  It's a little more complex to handle
the buffer re-allocation cases and for now the alignment
offset is just preserved when re-allocating, which might
leave the re-allocated buffer un-aligned.  In practice,
it's not clear if that code path is even used.

Is anyone able to help with this patch?  It could really use some feedback on whether the buffer_discard preference is used appropriately here.
Attachment #808245 - Attachment is obsolete: true
Attachment #808245 - Flags: feedback?(sphink)
Sorry, getting the patches mixed up.

This one adds the buffer.discard() support, enabled
by the 'javascript.options.buffer_discard' pref.

I could use some feedback on whether the use and implementation of the preference are appropriate.
Attachment #808246 - Attachment is obsolete: true
Attachment #812501 - Attachment is obsolete: true
Attachment #808246 - Flags: feedback?(sphink)
The code in AllocateArrayBufferContents needed a good
deal of re-working.  It's a little more complex to handle
the buffer re-allocation cases and for now the alignment
offset is just preserved when re-allocating, which might
leave the re-allocated buffer un-aligned.  In practice,
it's not clear if that code path is even used.
Luke, Brendan, and I discussed this again today. It seems like it's unfortunate to have two operations, .zero and .discard; programmers could get confused which to use. Alternative API's are

    .zero({ discard: true })

or

    .zero({ hint: "discard" })

(See http://ariya.ofilabs.com/2011/08/hall-of-api-shame-boolean-trap.html for more about the "Boolean trap")

But another possibility is just to have a .zero() operation and use heuristics like the size of the requested range to determine whether to use memzero vs madvise(DONTNEED). Luke says we'll want to do measurements before we pitch something to other browser vendors.

Dave
Trying to rebase after the changes in bug 939562, and could really use some help on the appropriate way for a buffer.discard() preference to be implemented.

The buffer.discard() feature had been coded as conditional on a 'javascript.options.buffer_discard' preference based on feedback in comment 38.  I understood the intention might be to ship it as an experimental feature that defaults off in release builds.

After the changes in bug 939562, should this preference be moved from the ContextOptions to RuntimeOptions too?

I may well have misunderstood the intention?  Perhaps it should not even be a 'javascript.options' preference, and might be more appropriately grouped with DOM preferences?
Flags: needinfo?(jdemooij)
(In reply to Douglas Crosher [:dougc] from comment #47)
> After the changes in bug 939562, should this preference be moved from the
> ContextOptions to RuntimeOptions too?

That should work, although I think RuntimeOptions is for more "permanent" features like the JITs. Also, DOM workers have their own runtime so you have to make sure the flag is set there too (see the patch for bug 939562). It seems simpler to add a new jsfriendapi function like EnableArrayBufferDiscard(bool) and have it set a global bool, then call this function from ReloadPrefsCallback.
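
A minimal sketch of that suggestion (the names are placeholders, not existing APIs): a process-wide flag with a friend-API-style setter, which the embedding's pref callback calls whenever javascript.options.buffer_discard changes.

  // Global flag consulted by the discard() implementation.
  static bool sArrayBufferDiscardEnabled = false;

  // Friend-API-style setter, called from the embedding's pref callback.
  void EnableArrayBufferDiscard(bool enable)
  {
      sArrayBufferDiscardEnabled = enable;
  }

  bool IsArrayBufferDiscardEnabled()
  {
      return sArrayBufferDiscardEnabled;
  }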

> I may well have misunderstood the intention?  Perhaps it should not even be
> a 'javascript.options' preference, and might be more appropriately grouped
> with DOM preferences?

Having javascript.options in the name is a good idea I think.
Flags: needinfo?(jdemooij)
Rebased.  This might need a significant rewrite after bug 979480, so I'm attaching WIP patches beforehand.  The current code passes all jit-tests.
Attachment #812504 - Attachment is obsolete: true
Rebased.
Assignee: general → dtc-moz
Attachment #812506 - Attachment is obsolete: true