Closed Bug 563742 (OS.File) Opened 11 years ago Closed 8 years ago

Efficient JS File API

Categories

(Toolkit :: OS.File, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: taras.mozilla, Assigned: Yoric)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: perf)

Attachments

(26 obsolete files)

Currently Mozilla file-handling api is designed in such a way that stat() calls are maximized. An efficient ctypes js file-handling api should avoid that pitfall. For example, there should be a stat()-like call so one can do info = file.stat(); ... = info.size; ... = info.lastModified, etc.

Similarly current directory-listing code is currently inefficient. Listing a directory should produce a bunch of stat()-like objects that already have name/size ready(ie mirror what FindFirstFile(http://msdn.microsoft.com/en-us/library/aa364418%28VS.85%29.aspx) does)

Linux view of efficient directory listing: http://udrepper.livejournal.com/18555.html
Note that a full set of stat() data on Windows is more expensive than just checking whether a file exists, or its size (it would internally call GetFileAttributesEx). By default I think we should expose a minimal amount of data, and have flags that let you request more properties, such as a file's mtime or size, if it matters to the client.
And I'm all for an efficient JS API, but I'm not sure that it necessary should be implemented in terms of ctypes, so let's design the API first, and then figure out the best implementation strategy?
If we want it available to C++ code as well, then it shouldn't.

One problem implementing it with ctypes is that you won't get auto-cleanup: if we need to destroy allocated stat structures and such, it either has to be done manually by the JS consumer or in a JSAPI finalizer. If this doesn't apply, then ctypes would be OK...
(In reply to comment #2)
> And I'm all for an efficient JS API, but I'm not sure that it necessary should
> be implemented in terms of ctypes, so let's design the API first, and then
> figure out the best implementation strategy?

Here is my proposal https://wiki.mozilla.org/JSFileApi
Assignee: nobody → dteller
I have just updated the wiki with a second generation proposal.

I have attempted to comment each function/method with the Unix/Windows/Android stdlib functions we need to implement it. Nothing prevents from writing essentially the same API in C++, of course.
Keywords: perf
OS: Linux → All
Hardware: x86 → All
Cc'ing Neil, who did a bunch of work a while back on a new+improved file API.
Note that I have written some (untested) code for (a tiny subset of) this API. I'll continue this as soon as a have a little time (or as bugs start to actually require it).

I have the feeling that, since bug 649537 has landed, a file API that can work in a background thread might become suddenly quite useful.
The api I wrote was just to make nsIFile scriptable and add some extra methods to make opening, reading and writing from streams easier. If nsIFile has some performance issues, I think it would be better to just fix those or add additional methods which don't have the performance issues, rather than inventing yet another file api which will just have a different set of issues.
(In reply to David Rajchenbach Teller [:Yoric] from comment #7)
> I have the feeling that, since bug 649537 has landed, a file API that can
> work in a background thread might become suddenly quite useful.

Absolutely. Required yesterday, actually :)

Neil, we need the ability to read/write off the main thread. We can't do that with XPCOM, so need something that can do so. What do you suggest?
(In reply to Neil Deakin from comment #8)
> If nsIFile
> has some performance issues, I think it would be better to just fix those or
> add additional methods which don't have the performance issues, rather than
> inventing yet another file api which will just have a different set of
> issues.

Your point is quite valid.

However, to sum things up, we are currently faced with several issues:
- using nsIFile from JavaScript is really quite un-JavaScript (can be solved with a nice JS module on top of nsIFile, something like a beefed up NetUtils);
- using nsIFile from a JS background thread is now purely impossible (can probably be solved with a big large chunk of js-ctypes + C);
- the above mentioned performance issues (some of which, I fear, might require heavy refactoring, if we intend to do it in-place with nsIFile – and break stuff along the way).
> - using nsIFile from JavaScript is really quite un-JavaScript

Not sure what you mean here but the api I wrote is more-or-less described at https://wiki.mozilla.org/ScriptableIO

Issues with it revolved mainly around character encodings and so on.

- using nsIFile from a JS background thread is now purely impossible (can probably be solved with a big large chunk of js-ctypes + C);

Out of interest, can you point to where these issues are described?
(In reply to Neil Deakin from comment #11)
> > - using nsIFile from JavaScript is really quite un-JavaScript
> 
> Not sure what you mean here but the api I wrote is more-or-less described at
> https://wiki.mozilla.org/ScriptableIO

Indeed, this looks much better than raw nsIFile.

> - using nsIFile from a JS background thread is now purely impossible (can
> probably be solved with a big large chunk of js-ctypes + C);

It's not specific to nsIFile - XPConnect is now impossible on background threads:
see bug 649537.
Actually nsIFile is not salvageable. It does implicit .stat() calls by the nature of it. It forces one to use streams for io, it is not extensible to implement things like readahead sanely, etc. Instead of having crappy knockoff of java we need a performant extensible(ie fd-based) low level api. It's more work to "improve/fix" nsIFile than to replace it with a brand new thing.
Attached patch Early preview (obsolete) — Splinter Review
I am doing some progress. The current design is the following:
- a few C++ classes provide a thin layer on top of Posix/Windows functions - along with destructors;
- a JSAPI adapter exposes said classes as two constructors "OS.File" and "OS.Dir", along with prototypes and static methods.

Upsides of the design:
- JSAPI is both fast and predictable (faster than either ctypes or xpconnect);
- JSAPI works on both main thread and chrome workers (by opposition to xpconnect);
- JSAPI gives us the ability to attach destructors/finalizers (by opposition to ctypes), which is something we definitely want, as JS scripts/threads can throw exceptions, can be killed for a reason or another, etc.

Downsides of the design:
- adding things through JSAPI is laborious.

I attach an early preview.
Note we only need to have jsapi wrap the file descriptor(assuming ctypes cant be taught to do finalizers). Ctypes has a big advantage of being extensible by addon authors/etc.
As far as I understand, there are currently no plans to teach ctypes about finalizers. I will, however, make sure that any ctypes library (or other) can access the underlying file descriptor / file handle for extensibility purposes.
A first taste of the front-end JavaScript module.
Attachment #576500 - Attachment is obsolete: true
Attached patch Prev 1. JSAPI binding (obsolete) — Splinter Review
Attachment #577890 - Attachment is obsolete: true
Attached patch Prev 1. Unix back-end (obsolete) — Splinter Review
Work-in-progress implementation of the Unix back-end.
Attachment #577892 - Flags: feedback?(mh+mozilla)
Attachment #577899 - Flags: feedback?(mh+mozilla)
Attachment #577891 - Flags: feedback?(dmandelin)
David, what kind of feedback are you looking for?
(In reply to David Mandelin from comment #22)
> David, what kind of feedback are you looking for?

This is my first foray on JSAPI, so I would like to be sure that I am understanding its use correctly, in particular memory practices.
Comment on attachment 577891 [details] [diff] [review]
Prev 1. JSAPI binding

Review of attachment 577891 [details] [diff] [review]:
-----------------------------------------------------------------

Just commenting on the JSAPI parts.

::: dom/rawfile/RawFileService.cpp
@@ +128,5 @@
> +  args[1] = INT_TO_JSVAL(aCode);
> +  
> +  if (JS_CallFunctionName(aCx, objFile, "Error", 2, args, &exc)) {
> +    JS_SetPendingException(aCx, exc);
> +  }

I'm not sure about a few things here. It looks like you are calling the builtin |Error| constructor with this=[the file object] and arguments (operation string, result code). According to https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Error, the arguments are (message, fileName, lineNumber), so I'm not sure this will do what you want. The phrasebook shows passing the global object |this|, but 'undefined' is the better thing to do now.

@@ +150,5 @@
> +{
> +  //Extract |this|
> +  JSObject *self = JS_THIS_OBJECT(aCx, aVp);
> +  RawFile *p = (RawFile*)JS_GetInstancePrivate(aCx, self, &gClass, NULL);
> +  CHECK_NULL(p);

Can the private actually be null in your implementation? It seems better to require it to be non-null and just assert here if you can.

@@ +192,5 @@
> +  jsval     valResult;
> +  switch(aFinalType) {
> +  case ARRAY_BUFFER:
> +    {//Array buffer: simple memory copy
> +      JSObject *objResult = js_CreateArrayBuffer(aCx, szRead);

Another case to ask for a new JSAPI function.

@@ +203,5 @@
> +      valResult = OBJECT_TO_JSVAL(objResult);
> +
> +      break;
> +    }
> +  case BINARY_STRING:

I think this is not portable. A JS string is a sequence of 16-bit values. It looks like you are creating a string where the internal bytes of the string are directly taken from the bytes read from the file, so the result will depend on endianness.

I'm also not clear why you want this zero-terminated.

I can review the character logic in detail if you let me know exactly what you intend here.

@@ +233,5 @@
> +
> +      bufResult = (jschar*)JS_malloc(aCx, (szRead+1)*sizeof(jschar) );
> +      CHECK_NULL(bufResult);
> +
> +      bufResult[szRead] = jschar(0);

Strings are immutable, so it's a nicer to fill the buffer before creating the JS string.

@@ +285,5 @@
> + * |_OSFile.prototype._readAsString|
> + *
> + * JavaScript signature:
> + * function(): ArrayBuffer
> + *             The buffer

I noticed a comment typo here.

@@ +314,5 @@
> +  jsval *argv = JS_ARGV(aCx, aVp);
> +          
> +  //Extract arg 0 (TypedArray => char*)
> +  JSObject* objTypeArray;
> +  if(!JS_ValueToObject(aCx, argv[0], &objTypeArray)) {

I think JS_ValueToObject will set a pending exception for you if it fails.

@@ +319,5 @@
> +    JS_ReportErrorNumber(aCx, js_GetErrorMessage, NULL, JSMSG_NOT_NONNULL_OBJECT);
> +    return JS_FALSE;
> +  }
> +
> +  if(!js_IsTypedArray(objTypeArray)) {

Things that start with js_ aren't part of the public API. For this one, you can file a bug on us to get a JS_IsTypedArray API added (and feel free to write the patch for it, although we will certainly do it for you if needed).

@@ +342,5 @@
> +  }
> +
> +  //Proceed with call
> +  size_t intResult;
> +  nsresult rv = p->Read((char*)buf, (off_t)offset, (size_t)size, &intResult);

At first I was worried about reading arbitrary data into a typed array of arbitrary type, but I think that's normal, because views can be created on a buffer of various types.

@@ +350,5 @@
> +  }
> +
> +  //Return result
> +  jsval valResult;
> +  if (!JS_NewNumberValue(aCx, (double)intResult, &valResult)) {

You can just use infallible DOUBLE_TO_JSVAL (or more likely INT_TO_JSVAL, I guess).

@@ +524,5 @@
> +    ReportError(aCx, "OS.File: _OSFile error", rv);
> +    return JS_FALSE;
> +  }
> +
> +  //Wrap and return

You can replace much of this with the new API JS_NewObjectForConstructor.

By the way, I would recommend not using the name 'Wrap' for that function--Wrap has a meaning in JSAPI so that could be a bit confusing.
Attachment #577891 - Flags: feedback?(dmandelin)
(In reply to David Mandelin from comment #24)
> Comment on attachment 577891 [details] [diff] [review] [diff] [details] [review]
> Prev 1. JSAPI binding
> 
> Review of attachment 577891 [details] [diff] [review] [diff] [details] [review]:
> -----------------------------------------------------------------
> 
> Just commenting on the JSAPI parts.
> 
> ::: dom/rawfile/RawFileService.cpp
> @@ +128,5 @@
> > +  args[1] = INT_TO_JSVAL(aCode);
> > +  
> > +  if (JS_CallFunctionName(aCx, objFile, "Error", 2, args, &exc)) {
> > +    JS_SetPendingException(aCx, exc);
> > +  }
> 
> I'm not sure about a few things here. It looks like you are calling the
> builtin |Error| constructor with this=[the file object] and arguments
> (operation string, result code). According to
> https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Error,
> the arguments are (message, fileName, lineNumber), so I'm not sure this will
> do what you want. The phrasebook shows passing the global object |this|, but
> 'undefined' is the better thing to do now.

The idea is that global object "_OSFile" is meant to be used only through a js module providing a few classes, including "OS.File.Error". My code was intended to implement "new OS.File.Error(..., ...)" and let JavaScript code do the rest.

> @@ +150,5 @@
> > +{
> > +  //Extract |this|
> > +  JSObject *self = JS_THIS_OBJECT(aCx, aVp);
> > +  RawFile *p = (RawFile*)JS_GetInstancePrivate(aCx, self, &gClass, NULL);
> > +  CHECK_NULL(p);
> 
> Can the private actually be null in your implementation? It seems better to
> require it to be non-null and just assert here if you can.

No, it cannot. However, it is my understanding that |JS_GetInstancePrivate| returns |NULL| if the object has the wrong class and raises an exception, and that so I still need to call |CHECK_NULL| to propagate that exception.

> 
> @@ +192,5 @@
> > +  jsval     valResult;
> > +  switch(aFinalType) {
> > +  case ARRAY_BUFFER:
> > +    {//Array buffer: simple memory copy
> > +      JSObject *objResult = js_CreateArrayBuffer(aCx, szRead);
> 
> Another case to ask for a new JSAPI function.

I will try submitting a patch for both.

> 
> @@ +203,5 @@
> > +      valResult = OBJECT_TO_JSVAL(objResult);
> > +
> > +      break;
> > +    }
> > +  case BINARY_STRING:
> 
> I think this is not portable. A JS string is a sequence of 16-bit values. It
> looks like you are creating a string where the internal bytes of the string
> are directly taken from the bytes read from the file, so the result will
> depend on endianness.
> 
> I'm also not clear why you want this zero-terminated.
> 
> I can review the character logic in detail if you let me know exactly what
> you intend here.

I guess I will simply look deeper into the behavior of |FileReader|.

> 
> @@ +233,5 @@
> > +
> > +      bufResult = (jschar*)JS_malloc(aCx, (szRead+1)*sizeof(jschar) );
> > +      CHECK_NULL(bufResult);
> > +
> > +      bufResult[szRead] = jschar(0);
> 
> Strings are immutable, so it's a nicer to fill the buffer before creating
> the JS string.

Ok.

> @@ +285,5 @@
> > + * |_OSFile.prototype._readAsString|
> > + *
> > + * JavaScript signature:
> > + * function(): ArrayBuffer
> > + *             The buffer
> 
> I noticed a comment typo here.

Thanks.

> @@ +314,5 @@
> > +  jsval *argv = JS_ARGV(aCx, aVp);
> > +          
> > +  //Extract arg 0 (TypedArray => char*)
> > +  JSObject* objTypeArray;
> > +  if(!JS_ValueToObject(aCx, argv[0], &objTypeArray)) {
> 
> I think JS_ValueToObject will set a pending exception for you if it fails.

Are you sure? I could not find anything suggesting this in the documentation and my brief foray into the implementation of JS_ValueToObject did not convince me of this either.

> 
> @@ +319,5 @@
> > +    JS_ReportErrorNumber(aCx, js_GetErrorMessage, NULL, JSMSG_NOT_NONNULL_OBJECT);
> > +    return JS_FALSE;
> > +  }
> > +
> > +  if(!js_IsTypedArray(objTypeArray)) {
> 
> Things that start with js_ aren't part of the public API. For this one, you
> can file a bug on us to get a JS_IsTypedArray API added (and feel free to
> write the patch for it, although we will certainly do it for you if needed).

Ok, as above.


> @@ +342,5 @@
> > +  }
> > +
> > +  //Proceed with call
> > +  size_t intResult;
> > +  nsresult rv = p->Read((char*)buf, (off_t)offset, (size_t)size, &intResult);
> 
> At first I was worried about reading arbitrary data into a typed array of
> arbitrary type, but I think that's normal, because views can be created on a
> buffer of various types.

From my understanding, this is what ArrayBuffers are all about, isn't it?

> 
> @@ +350,5 @@
> > +  }
> > +
> > +  //Return result
> > +  jsval valResult;
> > +  if (!JS_NewNumberValue(aCx, (double)intResult, &valResult)) {
> 
> You can just use infallible DOUBLE_TO_JSVAL (or more likely INT_TO_JSVAL, I
> guess).

Ok.

> @@ +524,5 @@
> > +    ReportError(aCx, "OS.File: _OSFile error", rv);
> > +    return JS_FALSE;
> > +  }
> > +
> > +  //Wrap and return
> 
> You can replace much of this with the new API JS_NewObjectForConstructor.

Ok. Note that this API is not documented on mdn.

> By the way, I would recommend not using the name 'Wrap' for that
> function--Wrap has a meaning in JSAPI so that could be a bit confusing.

I suppose that I do not need that function anymore now that I have JS_NewObjectForConstructor. If I do, do you have any suggestion on naming conventions?


Thanks for the feedback, by the way, this is very helpful.
(In reply to David Rajchenbach Teller [:Yoric] from comment #25)
> (In reply to David Mandelin from comment #24)
> > Comment on attachment 577891 [details] [diff] [review] [diff] [details] [review] [diff] [details] [review]
> > Prev 1. JSAPI binding
> > 
> > Review of attachment 577891 [details] [diff] [review] [diff] [details] [review] [diff] [details] [review]:
> > -----------------------------------------------------------------
> > 
> > Just commenting on the JSAPI parts.
> > 
> > ::: dom/rawfile/RawFileService.cpp
> > @@ +128,5 @@
> > > +  args[1] = INT_TO_JSVAL(aCode);
> > > +  
> > > +  if (JS_CallFunctionName(aCx, objFile, "Error", 2, args, &exc)) {
> > > +    JS_SetPendingException(aCx, exc);
> > > +  }
> > 
> > I'm not sure about a few things here. It looks like you are calling the
> > builtin |Error| constructor with this=[the file object] and arguments
> > (operation string, result code). According to
> > https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Error,
> > the arguments are (message, fileName, lineNumber), so I'm not sure this will
> > do what you want. The phrasebook shows passing the global object |this|, but
> > 'undefined' is the better thing to do now.
> 
> The idea is that global object "_OSFile" is meant to be used only through a
> js module providing a few classes, including "OS.File.Error". My code was
> intended to implement "new OS.File.Error(..., ...)" and let JavaScript code
> do the rest.

Oh, you're right, I missed that you were looking up |OS.File.Error|. I don't see that getting defined here, so I assume it's in another patch.

> > @@ +150,5 @@
> > > +{
> > > +  //Extract |this|
> > > +  JSObject *self = JS_THIS_OBJECT(aCx, aVp);
> > > +  RawFile *p = (RawFile*)JS_GetInstancePrivate(aCx, self, &gClass, NULL);
> > > +  CHECK_NULL(p);
> > 
> > Can the private actually be null in your implementation? It seems better to
> > require it to be non-null and just assert here if you can.
> 
> No, it cannot. However, it is my understanding that |JS_GetInstancePrivate|
> returns |NULL| if the object has the wrong class and raises an exception,
> and that so I still need to call |CHECK_NULL| to propagate that exception.

Ah, yes, you are right, someone could always call the method on an object of a different class.

> > @@ +314,5 @@
> > > +  jsval *argv = JS_ARGV(aCx, aVp);
> > > +          
> > > +  //Extract arg 0 (TypedArray => char*)
> > > +  JSObject* objTypeArray;
> > > +  if(!JS_ValueToObject(aCx, argv[0], &objTypeArray)) {
> > 
> > I think JS_ValueToObject will set a pending exception for you if it fails.
> 
> Are you sure? I could not find anything suggesting this in the documentation
> and my brief foray into the implementation of JS_ValueToObject did not
> convince me of this either.

Yes, the documentation doesn't say and it's not obvious in the implementation either. JS_ValueToObject eventually calls PrimitiveToObject, which calls one of (Boolean|String|Number)Object::create. Looking at BooleanObject::create (in vm/BooleanObject-inl.h), I saw that it calls newBuiltinClassInstance, which is a general object allocator, so I was pretty sure that reports errors. Since you asked, I dug deeper and found that bottoms out in ArenaLists::refillFreeList, which ends with js_ReportOutOfMemory(cx).

But you don't need to know any of that. As I mentioned in the other review, if it takes a cx and returns a bool, if it returns false it sets an exception on the cx. If it didn't, that a bug.

> > @@ +342,5 @@
> > > +  }
> > > +
> > > +  //Proceed with call
> > > +  size_t intResult;
> > > +  nsresult rv = p->Read((char*)buf, (off_t)offset, (size_t)size, &intResult);
> > 
> > At first I was worried about reading arbitrary data into a typed array of
> > arbitrary type, but I think that's normal, because views can be created on a
> > buffer of various types.
> 
> From my understanding, this is what ArrayBuffers are all about, isn't it?

I think so.

> > @@ +524,5 @@
> > > +    ReportError(aCx, "OS.File: _OSFile error", rv);
> > > +    return JS_FALSE;
> > > +  }
> > > +
> > > +  //Wrap and return
> > 
> > You can replace much of this with the new API JS_NewObjectForConstructor.
> 
> Ok. Note that this API is not documented on mdn.

I know. I added adding it to my list.

> > By the way, I would recommend not using the name 'Wrap' for that
> > function--Wrap has a meaning in JSAPI so that could be a bit confusing.
> 
> I suppose that I do not need that function anymore now that I have
> JS_NewObjectForConstructor. If I do, do you have any suggestion on naming
> conventions?

Not really. I'd probably call it "CreateFileObject" or something like that, but you probably want something that fits with Gecko names. Incidentally, "CreateWrapper" would be OK, I think: Wrap happens to be very near to JS_Wrap, which is why I noted it.

> Thanks for the feedback, by the way, this is very helpful.

Glad I could help.
By the way, I should mention I like this project: a proper file API that works with pure JS is a great thing. (The bug should probably be renamed, btw, and possibly moved to a new component.)
Summary: Efficient ctypes API for file handling → Efficient JS File API
Assignee: dteller → general
Component: js-ctypes → JavaScript Engine
QA Contact: js-ctypes → general
Turning this into a tracking bug.
Alias: JSFileAPI
Assignee: general → dteller
No longer depends on: 707096
Comment on attachment 577899 [details] [diff] [review]
Prev 1. Unix back-end

Review of attachment 577899 [details] [diff] [review]:
-----------------------------------------------------------------

Haven't looked at the implementation, but here are some comments on the API. One thing that seems awkward is that the directory class has a mixture of operations on files and to itself.

::: xpcom/build/rawfile_unix.h
@@ +218,5 @@
> +
> +  /**
> +   * A structure caching the last call to |fstat|, to avoid unnecessary
> +   * calls when requesting information that does not change (e.g.
> +   * creation time).

There is no such thing as creation time in what fstat returns.

@@ +276,5 @@
> +   * @param path A path, either relative to |this| directory or absolute.
> +   * @param flags File-opening flags
> +   */
> +  nsresult OpenFile(const nsAString &path,  OpenFlags &flags, RawFile** aOutFile);
> +  nsresult OpenFile(const nsACString &path, OpenFlags &flags, RawFile** aOutFile);

Do we really need nsAString/nsACString alternate versions of (almost) everything?

@@ +311,5 @@
> +   *
> +   * @param aOldPath A path, either relative to |this| directory or absolute.
> +   * @param aNewPath A path, either relative to |this| directory or absolute.
> +   */
> +  nsresult RenameItem(const nsACString &aOldPath, const nsACString &aNewPath);

Rename ? Move ?

@@ +323,5 @@
> +   *
> +   * @param aOldPath A path, either relative to |this| directory or absolute.
> +   * @param aNewPath A path, either relative to |this| directory or absolute.
> +   */
> +  nsresult RenameItemInto(const nsACString &aOldPath, RawDirectory *aNewDirectory, const nsACString &aNewPath);

Why do we need two separate functions?

@@ +331,5 @@
> +   *
> +   * Note: Operation is OS-accelerated under Linux/Android.
> +   * Note: As a side effect, this method calls |Resolve|.
> +   */
> +  nsresult Rmcontents();

What is a directory contents ? Are files emptied ? Are files removed ? Is the directory itself removed ?

@@ +339,5 @@
> +   *
> +   * Note: Operation is OS-accelerated under Linux/Android.
> +   * Note: As a side effect, this method calls |Resolve|.
> +   */
> +  nsresult Rmdir();

If Rmcontents does rm -rf, them Rmdir is redundant. Just have one function that takes a bool to do it recursively or not.
(In reply to Mike Hommey [:glandium] from comment #29)
> Comment on attachment 577899 [details] [diff] [review] [diff] [details] [review]
> Prev 1. Unix back-end
> 
> Review of attachment 577899 [details] [diff] [review] [diff] [details] [review]:
> -----------------------------------------------------------------
> 
> Haven't looked at the implementation, but here are some comments on the API.

Thanks.

> One thing that seems awkward is that the directory class has a mixture of
> operations on files and to itself.
> 
> ::: xpcom/build/rawfile_unix.h
> @@ +218,5 @@
> > +
> > +  /**
> > +   * A structure caching the last call to |fstat|, to avoid unnecessary
> > +   * calls when requesting information that does not change (e.g.
> > +   * creation time).
> 
> There is no such thing as creation time in what fstat returns.

Under MacOS X, it does.

> 
> @@ +276,5 @@
> > +   * @param path A path, either relative to |this| directory or absolute.
> > +   * @param flags File-opening flags
> > +   */
> > +  nsresult OpenFile(const nsAString &path,  OpenFlags &flags, RawFile** aOutFile);
> > +  nsresult OpenFile(const nsACString &path, OpenFlags &flags, RawFile** aOutFile);
> 
> Do we really need nsAString/nsACString alternate versions of (almost)
> everything?

As underlying Unix APIs take char* and Windows APIs take wchar*, I expect the Unix version to be mostly nsACString and the Windows version to be mostly nsAString. I am making (some) effort to ensure that both back-ends have the same API.

Do you think it hurts?

> @@ +311,5 @@
> > +   *
> > +   * @param aOldPath A path, either relative to |this| directory or absolute.
> > +   * @param aNewPath A path, either relative to |this| directory or absolute.
> > +   */
> > +  nsresult RenameItem(const nsACString &aOldPath, const nsACString &aNewPath);
> 
> Rename ? Move ?

Ok.

> 
> @@ +323,5 @@
> > +   *
> > +   * @param aOldPath A path, either relative to |this| directory or absolute.
> > +   * @param aNewPath A path, either relative to |this| directory or absolute.
> > +   */
> > +  nsresult RenameItemInto(const nsACString &aOldPath, RawDirectory *aNewDirectory, const nsACString &aNewPath);
> 
> Why do we need two separate functions?

You are right, we probably do not need both as the second one is more generic.

> 
> @@ +331,5 @@
> > +   *
> > +   * Note: Operation is OS-accelerated under Linux/Android.
> > +   * Note: As a side effect, this method calls |Resolve|.
> > +   */
> > +  nsresult Rmcontents();
> 
> What is a directory contents ? Are files emptied ? Are files removed ? Is
> the directory itself removed ?

Everything inside the directory but not the directory itself. I mostly published that method because I needed to implement it anyway for |Rmdir|.

> @@ +339,5 @@
> > +   *
> > +   * Note: Operation is OS-accelerated under Linux/Android.
> > +   * Note: As a side effect, this method calls |Resolve|.
> > +   */
> > +  nsresult Rmdir();
> 
> If Rmcontents does rm -rf, them Rmdir is redundant. Just have one function
> that takes a bool to do it recursively or not.

I actually had not thought about doing a non-recursive version. Would it be useful?
Component: JavaScript Engine → Networking: File
QA Contact: general → networking.file
(In reply to David Rajchenbach Teller [:Yoric] from comment #30)
> > > +  nsresult OpenFile(const nsAString &path,  OpenFlags &flags, RawFile** aOutFile);
> > > +  nsresult OpenFile(const nsACString &path, OpenFlags &flags, RawFile** aOutFile);
> > 
> > Do we really need nsAString/nsACString alternate versions of (almost)
> > everything?
> 
> As underlying Unix APIs take char* and Windows APIs take wchar*, I expect
> the Unix version to be mostly nsACString and the Windows version to be
> mostly nsAString. I am making (some) effort to ensure that both back-ends
> have the same API.
> 
> Do you think it hurts?

It multiplies the number of symbols, functions, etc. I think it's time we have an nsNativeString that is either nsACString on unix or nsAString on windows, and has conversions constructors from the other type. There are plenty of other places in the code base where this would be useful.

> > > +   *
> > > +   * Note: Operation is OS-accelerated under Linux/Android.
> > > +   * Note: As a side effect, this method calls |Resolve|.
> > > +   */
> > > +  nsresult Rmcontents();
> > 
> > What is a directory contents ? Are files emptied ? Are files removed ? Is
> > the directory itself removed ?
> 
> Everything inside the directory but not the directory itself. I mostly
> published that method because I needed to implement it anyway for |Rmdir|.

So if we expose it, it could be "Clear" or something.

> > @@ +339,5 @@
> > > +   *
> > > +   * Note: Operation is OS-accelerated under Linux/Android.
> > > +   * Note: As a side effect, this method calls |Resolve|.
> > > +   */
> > > +  nsresult Rmdir();

By the way, this could be Remove, since it's a member of the directory class.

> > If Rmcontents does rm -rf, them Rmdir is redundant. Just have one function
> > that takes a bool to do it recursively or not.
> 
> I actually had not thought about doing a non-recursive version. Would it be
> useful?

I don't know if it's actually useful. I was just wondering because that's how the nsILocalFile::Remove is declared, with a recursive argument. It would be good to see if there are places where it is used with false.
(In reply to Mike Hommey [:glandium] from comment #31)
> It multiplies the number of symbols, functions, etc. I think it's time we
> have an nsNativeString that is either nsACString on unix or nsAString on
> windows, and has conversions constructors from the other type. There are
> plenty of other places in the code base where this would be useful.

I have tried typedef-ing nsNativeString et al to nsCString et al in my .h, with limited success: the API is simply more annoying to use. Is there any real advantage to complicating the life of API clients thus?

> > Everything inside the directory but not the directory itself. I mostly
> > published that method because I needed to implement it anyway for |Rmdir|.
> 
> So if we expose it, it could be "Clear" or something.

Good name, I like it.

> > > @@ +339,5 @@
> > > > +   *
> > > > +   * Note: Operation is OS-accelerated under Linux/Android.
> > > > +   * Note: As a side effect, this method calls |Resolve|.
> > > > +   */
> > > > +  nsresult Rmdir();
> 
> By the way, this could be Remove, since it's a member of the directory class.

Ok.


> > > If Rmcontents does rm -rf, them Rmdir is redundant. Just have one function
> > > that takes a bool to do it recursively or not.
> > 
> > I actually had not thought about doing a non-recursive version. Would it be
> > useful?
> 
> I don't know if it's actually useful. I was just wondering because that's
> how the nsILocalFile::Remove is declared, with a recursive argument. It
> would be good to see if there are places where it is used with false.

It seems to be used with |false| quite often. A quick survey seems to indicate two scenarios:
- removing a file and not a directory;
- emptying a directory, just like Rmcontents (now renamed Clear).
(In reply to David Rajchenbach Teller [:Yoric] from comment #32)
> (In reply to Mike Hommey [:glandium] from comment #31)
> > It multiplies the number of symbols, functions, etc. I think it's time we
> > have an nsNativeString that is either nsACString on unix or nsAString on
> > windows, and has conversions constructors from the other type. There are
> > plenty of other places in the code base where this would be useful.
> 
> I have tried typedef-ing nsNativeString et al to nsCString et al in my .h,
> with limited success: the API is simply more annoying to use. Is there any
> real advantage to complicating the life of API clients thus?

Sorry for the formulation, I did not intend to troll. I meant "is there any real advantage that balances the complication to API clients?"
(In reply to David Rajchenbach Teller [:Yoric] from comment #33)
> (In reply to David Rajchenbach Teller [:Yoric] from comment #32)
> > (In reply to Mike Hommey [:glandium] from comment #31)
> > > It multiplies the number of symbols, functions, etc. I think it's time we
> > > have an nsNativeString that is either nsACString on unix or nsAString on
> > > windows, and has conversions constructors from the other type. There are
> > > plenty of other places in the code base where this would be useful.
> > 
> > I have tried typedef-ing nsNativeString et al to nsCString et al in my .h,
> > with limited success: the API is simply more annoying to use. Is there any
> > real advantage to complicating the life of API clients thus?

My point was not to typedef, but to also have conversion constructors.

> Sorry for the formulation, I did not intend to troll. I meant "is there any
> real advantage that balances the complication to API clients?"

My gut feeling is that it would simplify the API clients. I've seen numerous cases with #ifdef XP_WIN #else with one end using a nsAString and a conversion and the other end using an nsACString.

I could be wrong, though.
Mike, I have (mostly) applied your suggestions. New patches for the Unix back-end are now in bug 707679.

Marking patches as obsolete.
Attachment #577874 - Attachment is obsolete: true
Attachment #577891 - Attachment is obsolete: true
Attachment #577892 - Attachment is obsolete: true
Attachment #577899 - Attachment is obsolete: true
Attachment #577892 - Flags: feedback?(mh+mozilla)
Attachment #577899 - Flags: feedback?(mh+mozilla)
Duplicate of this bug: 691545
Attached patch JSAPI: Shared between back-ends. (obsolete) — Splinter Review
Attachment #580865 - Attachment is obsolete: true
Attachment #589436 - Flags: review?
Attachment #589436 - Flags: review? → review?(taras.mozilla)
Attached patch JSAPI: Unix back-end (obsolete) — Splinter Review
Attachment #589437 - Flags: review?(taras.mozilla)
Attached patch Injecting the API into JS (obsolete) — Splinter Review
Attachment #589439 - Flags: review?(taras.mozilla)
Attached patch Testsuite (obsolete) — Splinter Review
Attachment #589443 - Flags: review?(taras.mozilla)
Comment on attachment 589439 [details] [diff] [review]
Injecting the API into JS

Ah, sorry, that patch is not ready for review. I'll clean it up and resubmit later.
Attachment #589439 - Flags: review?(taras.mozilla)
Attached patch JSAPI: Unix back-end (obsolete) — Splinter Review
Attachment #589566 - Flags: review?(taras.mozilla)
Attachment #589437 - Attachment is obsolete: true
Attachment #589437 - Flags: review?(taras.mozilla)
Attachment #589440 - Attachment is obsolete: true
Two comments, one nit-picky:

1) Whenever you have, at global scope:

const char *X = ...

could you please use:

const char X[] = ...

It'll use slightly less space.

1a) Since those are often names for a DecodeArg, why not just make them static const and declare them in the DecodeFoo function?

2) Why not have Util::DecodeArgs just take a count indicating how long the DecodeArg[] is?  Use template magic to automagically provide the count if you like.  Just seems like that would be better than computing the count in DecodeArgs anyway and avoiding gArgEnd and whatnot.
Attached patch Injecting the API into JS (obsolete) — Splinter Review
Attachment #589577 - Flags: review?(taras.mozilla)
(In reply to Nathan Froyd (:froydnj) from comment #45)
> Two comments, one nit-picky:
> 
> 1) Whenever you have, at global scope:
> 
> const char *X = ...
> 
> could you please use:
> 
> const char X[] = ...
> 
> It'll use slightly less space.

Sure.

> 1a) Since those are often names for a DecodeArg, why not just make them
> static const and declare them in the DecodeFoo function?

Well, they are used outside of the function, to report error messages. I do not see any nice way to put them inside the function.

> 2) Why not have Util::DecodeArgs just take a count indicating how long the
> DecodeArg[] is?  Use template magic to automagically provide the count if
> you like.  Just seems like that would be better than computing the count in
> DecodeArgs anyway and avoiding gArgEnd and whatnot.

Ok, I will investigate that.
Attached patch JSAPI: Shared between back-ends. (obsolete) — Splinter Review
Attachment #589579 - Flags: review?(taras.mozilla)
Attachment #589436 - Attachment is obsolete: true
Attachment #589436 - Flags: review?(taras.mozilla)
Attached patch JSAPI: Shared between back-ends. (obsolete) — Splinter Review
Attachment #589809 - Flags: review?(taras.mozilla)
Attachment #589579 - Attachment is obsolete: true
Attachment #589579 - Flags: review?(taras.mozilla)
Attached patch JSAPI: Utilities. (obsolete) — Splinter Review
Attachment #589810 - Flags: review?(taras.mozilla)
Attachment #589809 - Attachment is obsolete: true
Attachment #589809 - Flags: review?(taras.mozilla)
Attached patch JSAPI: Shared between back-ends. (obsolete) — Splinter Review
Attachment #589811 - Flags: review?(taras.mozilla)
Attached patch JSAPI: Unix back-end (obsolete) — Splinter Review
Attachment #589812 - Flags: review?(taras.mozilla)
Attachment #589566 - Attachment is obsolete: true
Attachment #589566 - Flags: review?(taras.mozilla)
Attached patch Testsuite (obsolete) — Splinter Review
Attachment #589814 - Flags: review?(taras.mozilla)
Attachment #589443 - Attachment is obsolete: true
Attachment #589443 - Flags: review?(taras.mozilla)
Attachment #589574 - Attachment is obsolete: true
Attachment #589439 - Attachment is obsolete: true
(In reply to David Rajchenbach Teller [:Yoric] from comment #47)
> > 1a) Since those are often names for a DecodeArg, why not just make them
> > static const and declare them in the DecodeFoo function?
> 
> Well, they are used outside of the function, to report error messages. I do
> not see any nice way to put them inside the function.

Actually, I was wrong. I have just applied all your suggestions, all three were quite good.

Thanks!
I will do another review round tomorrow, sorry for lag, bugzilla has been DoSing me. I asked Nathan to help review this too.
Comment on attachment 589811 [details] [diff] [review]
JSAPI: Shared between back-ends.

Review of attachment 589811 [details] [diff] [review]:
-----------------------------------------------------------------

This looks sane, nothing but a few comments.

::: toolkit/components/osfile/OSFile_all.cpp
@@ +55,5 @@
> + * function(String): String
> + *          key      path
> + */
> +JSBool GetPath(JSContext *aCx, uintN aArgc, jsval *aVp)
> +{

Needs some comment beyond the JS signature, I think.

Why do we have this?  Is this because we can't use the directory service on a worker thread?

@@ +91,5 @@
> +}
> +
> +static JSFunctionSpec gFunctions[] =
> +{
> +  JS_FN("getPath",            GetPath,       1, JSPROP_ENUMERATE),

This seems a little thin; I'm assuming all the path manipulation bits are a part of bug 707696?  (Sorry, I haven't been following all of the discussion here.)
Comment on attachment 589810 [details] [diff] [review]
JSAPI: Utilities.

Review of attachment 589810 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/osfile/OSFile_util.cpp
@@ +56,5 @@
> +                    size_t aArgsCount)
> +{
> +  // 1. Check number of arguments
> +
> +  if (aArgsCount > aArgc) {

Nit: I think this is more idiomatic as aArgc < aArgsCount.

@@ +66,5 @@
> +  // 2. Actually decode
> +
> +  for (size_t i = 0; i < aArgsCount; ++i) {
> +    const DecodeArg *arg = &(aDecoders[i]);
> +    if (!arg -> decode(aCx, argv[i], arg -> buf)) {

No spaces around ->.  Here and many other places.

@@ +94,5 @@
> +
> +//// Decode integers
> +
> +
> +bool  DecodeIntImpl(JSContext* aContext, jsval aValue,

Nit: single-space between return type and function name.  Here and elsewhere.

@@ +310,5 @@
> +    return false;
> +  }
> +
> +  int lo = aInteger & 0xffff;
> +  int hi = (aInteger - lo) % 0xffff;

This is weird; you're splitting this into two 16-bit integers, but join expects two 32-bit integers.  Don't you want something like:

int lo = aInteger;
int hi = aInteger >> 31;

(Only correct for 32-bit 'int', but you get the idea.)

::: toolkit/components/osfile/OSFile_util.h
@@ +13,5 @@
> + * Utility functions
> + */
> +namespace Util {
> +
> +  /**

Nit: no indent on the comments inside a namespace.

@@ +24,5 @@
> +  * The error message will read "$aDescription is $aProblem".
> +  *
> +  * @param aDescription The object that should have another type.
> +  * @param aProblem The type that the object has.
> +  * @return false

For these error functions, you should be including @param aCx ...

@@ +37,5 @@
> + */
> +bool ReportArrayBoundsError(JSContext *aCx);
> +
> +/**
> + * Report the fact that a JS function is expecting more args.

Nit: just "Report that ...".

@@ +43,5 @@
> + * The error message will read
> + * "$aOperation requires more than $aRequiredArgs arguments"
> + *
> + * @param aRequiredArgs The number of arguments
> + * @return false

Need to document aOperation.

@@ +83,5 @@
> + * In case of error, raise a JS exception with details regarding the error.
> + *
> + * @param aName The name of the function/method/operation for which this
> + * decoding is undertaken. Used for error-reporting.
> + * @param aDecoders An array of decoders terminated with |gArgEnd|.

Need to document all arguments.

@@ +147,5 @@
> + * @return true If |aObject| has a property named |aProperty| and which
> + * is a positive integer.
> + */
> +bool MayGetUInt32Property(JSContext *aCx, JSObject *aObject,
> +                       const char *aProperty, uint32 *aOutResult);

uint32 -> uint32_t, please.  Here and elsewhere.

@@ +174,5 @@
> +
> +  if (JSVAL_IS_OBJECT(aValue)) {
> +    JSObject *objValue = JSVAL_TO_OBJECT(aValue);
> +    uint32 lo/*least significant bits*/,
> +      hi/*most significant bits*/;

uint32 -> uint32_t.  You can also ditch the comments on the variables, I think.

@@ +176,5 @@
> +    JSObject *objValue = JSVAL_TO_OBJECT(aValue);
> +    uint32 lo/*least significant bits*/,
> +      hi/*most significant bits*/;
> +    if (!MayGetUInt32Property(aCx, objValue, "lo", &lo)
> +        || !MayGetUInt32Property(aCx, objValue, "hi", &hi)) {

According to https://developer.mozilla.org/en/js-ctypes/js-ctypes_reference/Int64 lo and hi are not properties of the object itself, but methods on Int64 that you need to call.  Making this fix may mean that MayGetUInt32Property is now dead.

@@ +180,5 @@
> +        || !MayGetUInt32Property(aCx, objValue, "hi", &hi)) {
> +      return false;
> +    }
> +    if (sizeof(T) <= sizeof(int)) {
> +      if (hi != 0) {

This is not quite right, because you could have a negative Int64 passed in to something expecting a signed 32-bit integer.  I think you may have to write some template magic to ask about the signedness of T.

@@ +181,5 @@
> +      return false;
> +    }
> +    if (sizeof(T) <= sizeof(int)) {
> +      if (hi != 0) {
> +        ReportArrayBoundsError(aCx);

An array bounds error is not the right error to throw here.  Something more like ReportNotThirtyTwoBitIntegerError, though less verbose.

@@ +199,5 @@
> +    out[1] = lo;
> +#endif //IS_LITTLE_ENDIAN
> +    return true;
> +  }
> +  return false;

Better to move this up like so:

if (!JSVAL_IS_OBJECT(aValue)) {
  return false;
}

// Value must be an Int64/UInt64 now.
JSObject *objValue = ...

less indentation, less surprises.

@@ +245,5 @@
> +  uintN properties;
> +} SimplePropertySpec;
> +
> +#define PROP(name, value, properties) { name, value, properties }
> +#define PROP_INT(name) PROP(#name, INT_TO_JSVAL(name), JSPROP_ENUMERATE)

I think I'd make this JSPROP_READONLY; I don't think we particularly care about making them enumerable.
Comment on attachment 589810 [details] [diff] [review]
JSAPI: Utilities.

Review of attachment 589810 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/osfile/OSFile_util.h
@@ +227,5 @@
> + * Precondition: Module |ctypes| must be open in the context.
> + *
> + * @return true in case of success, false otherwise.
> + */
> +bool ToInt64(JSContext *aCx, int aInteger, JSObject **aOutResult);

You pass various non-int things to this elsewhere (ssize_t, off_t, etc.).  You should make this take an int64_t or similar so that unexpected truncation doesn't happen.
Comment on attachment 589812 [details] [diff] [review]
JSAPI: Unix back-end

Review of attachment 589812 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/osfile/OSFile_unix.cpp
@@ +123,5 @@
> +/**
> + * Get global JS object |_OS|.
> + *
> + * @return NULL If the global object could not be found. This should
> + * never happen.

The comments for these various getters are incomplete.  The functions also return null if they can't get various properties, or if OS isn't found, or OS.Unix, and so forth.

@@ +200,5 @@
> +  if (!JS_GetProperty(aCx, objUnix, NAME_ERROR, &valConstructor)) {
> +    return JS_FALSE;
> +  }
> +
> +  JSObject *objResult = JS_NewObjectForConstructor(aCx, &valConstructor);

Asking purely for my own edification: how do you define Error with a bogus constructor (see above), yet still manage to construct one here?  Does the C++ construction process go through different channels than the JS construction process?

@@ +255,5 @@
> + * in case of success and a negative integer in case of error. In case
> + * of success, this function produces a Int64. In case of
> + * error, it produces a value of type |_OS.Unix.Error|.
> + */
> +JSBool ReturnInt64ResultOrError(JSContext *aCx, jsval *aVp, int aResult)

This needs to be fixed, probably by renaming to ReturnSSize_tResultOrError, fixing the argument type, and tweaking the comment (since you already have an off_t function).

@@ +296,5 @@
> + * Use function |DecodeOff_t| to perform decoding.
> + */
> +bool DecodeOff_tImplem(JSContext *aCx, jsval aValue, void *aBuf)
> +{
> +  return MayGetSizedInt<off_t>(aCx, aValue, (off_t*)aBuf);

Let's use static_cast here and elsewhere.

@@ +360,5 @@
> +  /**
> +   * As long as the file descriptor is open, its number. Once it is closed, -1.
> +   */
> +  int fd;
> +} PrivateData;

You could opt to store the integer in the private data directly, by playing games with casts to void* and intptr_t.  If you want to keep the struct PrivateData, it might be better if you provided a constructor and just called new(), rather than using malloc and separate initialization in Make.

@@ +394,5 @@
> +    return NULL;
> +  }
> +
> +  // 3. Store private data
> +  PrivateData *privateData = (PrivateData*)malloc(sizeof(PrivateData));

Use nsAutoPtr (which I guess mandates using new instead of malloc).

@@ +399,5 @@
> +  if (!privateData) {
> +    return NULL;
> +  }
> +
> +  privateData -> fd = aDescriptor;

No spaces around ->.  Here and many other places.

@@ +610,5 @@
> + * Exposing function |dup| to JavaScript.
> + *
> + * C signature:
> + * int dup2(int fildes, int fildes2);
> + *

Please make sure that your comment about what's being exposed matches the function signature.

@@ +643,5 @@
> + * @this FD
> + * function(FD): FD | Error
> + */
> +JSBool Dup2JS(JSContext *aCx, uintN aArgc, jsval *aVp)
> +{

Is there a reason we don't just roll this all into one dup2 function and have it sort out the arguments on its own?  You might lose some style points in the actual implementation, since DecodeMethod might not be as directly usable, but certainly that'd be better than "dup2js"?

@@ +879,5 @@
> +/**
> + * Exposing function |read| to JavaScript.
> + *
> + * C signature:
> + * char*   ttyname( int fd );

Please make sure that your comment about what's being exposed matches the function signature.

@@ +912,5 @@
> +/**
> + * Exposing function |pwrite| to JavaScript.
> + *
> + * C signature:
> + * ssize_t   write(int fildes, const void *buf, size_t nbyte);

Please make sure that your comment about what's being exposed matches the function signature.

@@ +940,5 @@
> +    return Util::ReportArrayBoundsError(aCx);
> +  }
> +
> +  // 4. Do it
> +  int result = ::write(priv -> fd, array.buf, nbyte);

ssize_t result, please.

@@ +1100,5 @@
> +
> +
> +#if defined(HAVE_FACCESSAT)
> +/**
> + * Exposing function |faccessat| to JavaScript.

The defined check here does not match the check later:

#if defined(HAVE_ACCESSAT)
JS_FN("accessat", Methods::FAccessAt, 3, JSPROP_ENUMERATE),
#endif //defined(HAVE_ACCESSAT)

So I'm confused as to which is what and when.

@@ +1194,5 @@
> +/**
> + * Exposing function |open| to JavaScript.
> + *
> + * C signature:
> + * int open(const char *pathname, int mode, ... int flags)

The mode and flags arguments should be reversed here.  Should we ensure that mode is only passed if O_CREAT is passed in flags?

@@ +1275,5 @@
> +    return JS_FALSE;
> +  }
> +
> +  JS_SET_RVAL(aCx, aVp, OBJECT_TO_JSVAL(objResult));
> +  return JS_TRUE;

Should probably catch negative integers, either by using ReturnFDOrError, or explicitly checking.

@@ +1303,5 @@
> +} //namespace Functions
> +
> +static JSFunctionSpec gMethods[] =
> +{
> +  JS_FN("close",             Methods::Close,             0, JSPROP_ENUMERATE),

I really don't think we care about making all of these (and the other functions/methods in this file) enumerable.  Probably JSPROP_READONLY?

@@ +1309,5 @@
> +  JS_FN("dup2",              Methods::Dup2,              1, JSPROP_ENUMERATE),
> +  JS_FN("dup2js",            Methods::Dup2JS,            1, JSPROP_ENUMERATE),
> +#if defined(HAVE_ACCESSAT)
> +  JS_FN("accessat",          Methods::FAccessAt,         3, JSPROP_ENUMERATE),
> +#endif //defined(HAVE_ACCESSAT)

As mentioned, we have mismatched preprocessor checks.  But why does this get named accessat when it's wrapping faccessat, contra to all the other functions here?

@@ +1672,5 @@
> +  PROP_INT(S_IRUSR),
> +#if defined(S_IEXEC)
> +  PROP_INT(S_IEXEC),
> +#else
> +  PROP("S_IEXEC", INT_TO_JSVAL(S_IXUSR), JSPROP_ENUMERATE),

JSPROP_READONLY, not JSPROP_ENUMERATE.

@@ +1740,5 @@
> +  JS_FN("truncate", Functions::Truncate, 2, JSPROP_ENUMERATE),
> +  JS_FN("unlink", Functions::Unlink, 1, JSPROP_ENUMERATE),
> +
> +
> +  JS_FS_END

Nit: extra whitespace lines.

@@ +1764,5 @@
> +    JS_InitClass(aCx, objUnix,
> +                 /*parent_proto*/NULL, /*class*/&FD::gClass,
> +                 /*constructor*/ FD::Constructor, /*args*/0,
> +                 /*properties*/  NULL, /*methods*/FD::gMethods,
> +                 /*constr properties*/NULL, /*constr methods*/FD::gFunctions);

Nit: whitespace post-comments is a bit uneven here.  If you really want to include the comments, I'd suggest:

JS_InitClass(aCx, objUnix,
             NULL, /* parent_proto */
             &FD::gClass, /* class */
             ...)

and so forth, maybe using M-; in emacs to make sure the comments line up vertically.

If that seems unnecessarily verbose, just drop the comments.

@@ +1793,5 @@
> +  if (!Util::DefineSimpleProperties(aCx, objConstants, gProperties)) {
> +    return false;
> +  }
> +
> +  // 3. Attach functions

Nit: you have two step 3s. :)
Comment on attachment 589577 [details] [diff] [review]
Injecting the API into JS

i don't know about this, jst can suggest a better reviewer.
Attachment #589577 - Flags: review?(taras.mozilla) → review?(jst)
Comment on attachment 589810 [details] [diff] [review]
JSAPI: Utilities.

+extern const char NAME_FUNCTIONS[] = "Functions";
+extern const char NAME_ERROR[] = "Error";
+extern const char NAME_DIR[] = "Dir";

this is awkward use namespace Name {} around these consts

Please make sure you follow mozilla style for function declarations in your patches. returns type is on a separate line, ie
+bool ReportArrayBoundsError(JSContext *aCx)
+{
should be 

bool
ReportArray...

more review in a bit
Comment on attachment 589810 [details] [diff] [review]
JSAPI: Utilities.

+  char buf[3];
+  snprintf(buf, 3, "%d", aRequiredArgs);

Don't hardcode constants, use sizeof. Also checkout nsCPrintfString for a more C++-ey way http://dxr.lanedo.com/mozilla-central/xpcom/string/public/nsPrintfCString.h.html#l70
Comment on attachment 589810 [details] [diff] [review]
JSAPI: Utilities.


+typedef struct {
+  uint32_t byteLen;
+  uint8_t  *buf;
+} ArrayBufferParams;
+

nit: *buf should go first, otherwise struct might pack funny(ie waste more space due to padding) on 64bit

you have a lot of patterns like
+  if (!JSVAL_IS_OBJECT(aValue)) {
+    return false;
+  }
+  JSObject *obj = JSVAL_TO_OBJECT(aValue);

+  if (!JSVAL_IS_STRING(aValue)) {
+    return false;
+  }
+  size_t lenString;
+  const jschar *jsString =
+    JS_GetStringCharsAndLength(aCx,
+                               JSVAL_TO_STRING(aValue),
+                               &lenString);

etc

Instead of doing basic argument conversion/validation by hand use JS_ConvertArguments 

Perhaps you can structure your Decode* routines better around JS_ConvertArguments.


This looks like it's getting close to r+.
Attachment #589810 - Flags: review?(taras.mozilla) → review-
Comment on attachment 589811 [details] [diff] [review]
JSAPI: Shared between back-ends.

DefinePortableChromeWorkerFunctions and DefinePortableChromeMainThreadFunctions are 2 functions with awkward names and no useful descriptions.
Ie is one of them for testing only? Please comment, use more concise names(that are easier to tell apart)

GetPath should be GetSpecialDirectory. Some consistency is important.
Attachment #589811 - Flags: review?(taras.mozilla) → review-
Comment on attachment 589812 [details] [diff] [review]
JSAPI: Unix back-end

This is still doing too much in unextensible C++ goop. Good news is that your testsuite shouldn't have to change much.
Attachment #589812 - Flags: review?(taras.mozilla) → review-
As far as I can tell you did so much C++ goop because we do not have finalizers in js. Finalizers are bad and the js team won't let us have java-style ones.

However, we can do a native finalizer that will work for windows and unix file handles.

With that, the file api would be implemented as:
let lib = ctypes.open("/lib/x86_64-linux-gnu/libc.so.6"); 
let open = lib.declare("open",...)
let raw_close = lib.declare("close",...)
let read = lib.declare("read",...)
function close(fd) {
  let raw_fd = fd.forget();
  raw_close(raw_fd);
}
let raw_fd = open(...)
let fd = safely_finalize_word(raw_fd, raw_close);
read(fd,...);
// throw exception ..in which case we'd have to rely on gc for cleanup
close(fd);

safely_finalize_word would create ResourceFinalizer object which would SetPrivate 2 struct members:
1) a callback that takes one word-sized argument
2) word sized argument

The the finalizer would apply the native callback to the argument.
The object would also support .forget(), which returns the word sized argument and does SetPrivate(NULL).
ResourceFinalizer would have to transparently convert into the underlying word. This should go into ctypes.
Comment on attachment 589577 [details] [diff] [review]
Injecting the API into JS

Ben owns this stuff, he should review here.
Attachment #589577 - Flags: review?(jst) → review?(bent.mozilla)
(In reply to Nathan Froyd (:froydnj) from comment #57)
> Comment on attachment 589811 [details] [diff] [review]
> JSAPI: Shared between back-ends.
> 
> Review of attachment 589811 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This looks sane, nothing but a few comments.
> 
> ::: toolkit/components/osfile/OSFile_all.cpp
> @@ +55,5 @@
> > + * function(String): String
> > + *          key      path
> > + */
> > +JSBool GetPath(JSContext *aCx, uintN aArgc, jsval *aVp)
> > +{
> 
> Needs some comment beyond the JS signature, I think.
> 
> Why do we have this?  Is this because we can't use the directory service on
> a worker thread?

Exactly.

> @@ +91,5 @@
> > +}
> > +
> > +static JSFunctionSpec gFunctions[] =
> > +{
> > +  JS_FN("getPath",            GetPath,       1, JSPROP_ENUMERATE),
> 
> This seems a little thin; I'm assuming all the path manipulation bits are a
> part of bug 707696?  (Sorry, I haven't been following all of the discussion
> here.)

Indeed.
(In reply to Taras Glek (:taras) from comment #66)
> Comment on attachment 589812 [details] [diff] [review]
> JSAPI: Unix back-end
> 
> This is still doing too much in unextensible C++ goop. Good news is that
> your testsuite shouldn't have to change much.

(In reply to Taras Glek (:taras) from comment #67)
> As far as I can tell you did so much C++ goop because we do not have
> finalizers in js. Finalizers are bad and the js team won't let us have
> java-style ones.
> 
> However, we can do a native finalizer that will work for windows and unix
> file handles.
> 
> With that, the file api would be implemented as:
> let lib = ctypes.open("/lib/x86_64-linux-gnu/libc.so.6"); 
> let open = lib.declare("open",...)
> let raw_close = lib.declare("close",...)
> let read = lib.declare("read",...)
> function close(fd) {
>   let raw_fd = fd.forget();
>   raw_close(raw_fd);
> }
> let raw_fd = open(...)
> let fd = safely_finalize_word(raw_fd, raw_close);
> read(fd,...);
> // throw exception ..in which case we'd have to rely on gc for cleanup
> close(fd);
> 
> safely_finalize_word would create ResourceFinalizer object which would
> SetPrivate 2 struct members:
> 1) a callback that takes one word-sized argument
> 2) word sized argument
> 
> The the finalizer would apply the native callback to the argument.
> The object would also support .forget(), which returns the word sized
> argument and does SetPrivate(NULL).


I am not sure exactly how you consider this unextensible. We can import/export file descriptors between C and JS, so that should not be an issue.

Also, we cannot export constants directly from js-ctypes as they are macros, so moving to js-ctypes would not change anything on that front. Also, we basically cannot handle 64 bit integers in JavaScript, even with ctypes.Int64/ctypes.UInt64, which are designed only as place-holders, so everything 64-bits-ish needs to be done in C++ in the first place, and I strongly suspect that I will be left to guess which implementations of |off_t|/|size_t|/|ssize_t|/|HANDLE|/... are 32-bits or 64-bits.

Also note that extracting the contents of a finalizable |fd| is not possible in the current state of js-ctypes, whether automatically or manually. See bug 592035 and bug 717886 for a discussion on related topics. A workaround is possible, although easy to abuse into dangling descriptors.
Depends on: ctypes.finalize
Attachment #589814 - Flags: review?(taras.mozilla)
No longer depends on: 732262
Attached patch Companion testsuite (obsolete) — Splinter Review
Attaching the new WIP testsuite. This is now a mochitest-based sutie, as we are using chrome workers.
Attached patch Main thread driver (obsolete) — Splinter Review
Attaching a preview of the main thread API.
Attaching a preview of the mechanism used to (de)serialized js-ctypes data between threads
Attached patch Worker thread (obsolete) — Splinter Review
Attaching a preview of the worker thread.
Attachment #589812 - Attachment is obsolete: true
Attachment #589811 - Attachment is obsolete: true
Attachment #589810 - Attachment is obsolete: true
Attachment #589577 - Attachment is obsolete: true
Attachment #589577 - Flags: review?(bent.mozilla)
Attachment #589815 - Attachment is obsolete: true
Attachment #589814 - Attachment is obsolete: true
In case you're interested: asyncMoveFile using ChromeWorker + js-ctypes for win, mac, *nix.
https://bugs.downthemall.net/browser/trunk/modules/manager/asyncmovefile
MPL-Tri

- Uses MoveFileExW on Win.
- *nix will try rename, or if that fails (different file systems)
  - splice + pipe (sendfile) if available, to avoid pushing the data back and forth between kernel and user land
  - read/write otherwise
- Uses rename, read/write on mac

One could easily refactor that to provide asyncCopyFile as well.


Also, from personal experience, you will want to avoid anything that allocates memory (buffers, js shapes, cross-thread marshaling of data). While occasional small reads and writes will work fine, writing large amounts of data - like a file downloads - will thrash the js GC/CC, cause massive memory fragmentation and by this page-in/page-out traffic / disk thrashing due to swap I/O and if you're really unlucky cause even occasional OOM aborts.
Using nsIPipe + nsIAsyncStreamCopier came out far superior in my tests than using ChromeWorker + low-level js-ctypes when working with arbitrary data.
And for file-to-file copies using ctypes, keep in mind that you only need one temp buffer per thread, not one buffer per operation (e.g. your pump() right now unnecessarily allocates a ctypes.ArrayType(ctypes.char)(bufsize) per call). E.g. using a per-thread buffer instead of a per-call one in my asyncmovefile/unix_copy.js had a measurable improvement.
(In reply to Nils Maier [:nmaier] from comment #76)
> In case you're interested: asyncMoveFile using ChromeWorker + js-ctypes for
> win, mac, *nix.

Thanks, I will definitely look at it.

> (e.g. your pump()
> right now unnecessarily allocates a ctypes.ArrayType(ctypes.char)(bufsize)
> per call). E.g. using a per-thread buffer instead of a per-call one in my
> asyncmovefile/unix_copy.js had a measurable improvement.

Good point, thanks.
Sorry guys if I sound to crazy, but have you considered embedding
https://github.com/joyent/libuv

Rust already does that as far as I know.

Also there is some relevant effort here:
https://github.com/creationix/luvmonkey
(In reply to Irakli Gozilalishvili [:irakli] [:gozala] from comment #78)
> Sorry guys if I sound to crazy, but have you considered embedding
> https://github.com/joyent/libuv
> 
> Rust already does that as far as I know.
> 
> Also there is some relevant effort here:
> https://github.com/creationix/luvmonkey

Interesting idea. I suspect that this is not exactly what we need, but I will take another look.
Quick summary of what I have found out about libuv:
- it seems to have its own event loop, which sounds like a very bad idea for us;
- it seems heavily-oriented towards network i/o (epoll and kqueue are useless/unusable on regular files);
- also, big bunch of C code, which is rather opposed to our core design of making the code easy to hack by contributors.
Comment on attachment 603729 [details] [diff] [review]
Main thread driver

+  /**
+   * Declare an asynchronous entry point for a FFI function.
+   *
+   * @param {string} symbol
+   * @param {ABI} abi
+   * @param {} returnType
+   * @param {...*} argTypes The type of arguments
+   * @return {function():Schedule} An asynchronous function implementing
+   * the entry point.
+   */
+  declareFFI: function Agent_declare(symbol, abi, returnType /* ... argTypes*/) {
+    dumpn("Agent: declareFFI");
+    dump_arguments(arguments);
+    let encode_to_worker = [];// Encode arguments to send them to the worker
+    let arg_type_names   = [];// Used by the worker to encode arguments back
+
+    for (let i = 3; i < arguments.length; ++i) {
+      let current = arguments[i];
+      dumpn("Dealing with argument "+(i-3)+": "+current.name);
+      encode_to_worker.push(current.encode);
+      arg_type_names.push(current.name);
+    };
+    let ctype_signature  = [symbol, abi.name]; // Used by the worker to define the FFI
+    for (i = 2; i < arguments.length; ++i) {
+      let current = arguments[i];
+      ctype_signature.push(current.parent.toSource());
+    }
+    let return_type_name = returnType.name;
+    let result_converter = returnType.decode;
+    Agent.postMessage({declareFFI: {symbol: symbol,
+                                    signature: ctype_signature,
+                                    result: return_type_name,
+                                    args: arg_type_names}
+                      });
+    return Agent._remoteCaller(symbol, result_converter, encode_to_worker);
+  },

As I mentioned in another bug...why the heck is this doing doing ipc?
Alias: JSFileAPI → OS.File
Severity: normal → enhancement
Attachment #603728 - Attachment is obsolete: true
Attachment #603729 - Attachment is obsolete: true
Attachment #603730 - Attachment is obsolete: true
Attachment #603731 - Attachment is obsolete: true
No longer depends on: 707909
We should also consider files stored on remote location specified by URI and not to expect that file operations with remote files are very fast operation. With all these cloud services this scenario might be pretty common soon. 
I'm referring to https://bugzilla.mozilla.org/show_bug.cgi?id=682838
Just to keep it in mind during work on new design.
For the moment, the design of OS.File is at a much lower level. For remote files, I suspect that we will prefer a higher-level API.
Component: Networking: File → OS.File
Product: Core → Toolkit
This has been merged into OS.File. Marking the bug as fixed.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.