Closed Bug 682631 Opened 8 years ago Closed 7 years ago

Implement asynchronous reading in file.js

Categories

(Add-on SDK Graveyard :: General, enhancement, P3)

enhancement

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 781486

People

(Reporter: KWierso, Assigned: KWierso)

References

Details

Attachments

(1 file, 2 obsolete files)

+++ This bug was initially created as a clone of Bug #649889 +++

Would be nice to at least have the option to do async IO in the SDK. Deprecation and warning the developer about sync IO could come later, once async IO is implemented.
So I tried to add an asyncRead method to the file module. I have no idea if it's acceptable (or even an improvement against the original read()'s performance problems).

It works, though, with the following test code in an addon:

const file = require("file");
console.log(file.read("/home/kwierso/somefile.txt"));
console.log(file.readAsync("/home/kwierso/somefile.txt"));

Both read() and readAsync() return the content of the file.

Thoughts?
Attachment #556332 - Flags: feedback?(sdwilsh)
Comment on attachment 556332 [details] [diff] [review]
(Probably) horrible first attempt.

We have a number of tools for async IO that don't involve event loop spinning ... netwerk/ level stuff, FileReader, etc.   We should prefer those.
Attachment #556332 - Flags: feedback-
Indeed we must not spin the event loop. Sync does it and we're paying lots of horrible engineering debt because of it. The biggest one being an entire rewrite of the core Sync APIs to be asynchronous.

There's not going to be any way around a callback-based API. Maybe some day, when our iterators/generators are actually first class citizen in the JS engine (see bug 664784 for instance), we might be able to drape an async API with a sync-looking one using generators. That still requires an actual async APIs underneath.

I suggest something like

  file.read(path, function (error, data) {
    // do stuff here
  });

This is how we've been doing async APIs in Sync now. It follows the node.js convention of having callbacks first take an 'error' argument and than any additional arguments. 'error' could be an exception object or it could be an XPCOM result code.
In that case, here's a version that uses a callback to handle async reads.

Using this other patch ( http://pastebin.mozilla.org/1313721 ) against the simple-storage module, I got simple-storage to read its storage file asynchronously.
Attachment #556362 - Flags: feedback?
Severity: normal → enhancement
Priority: -- → P3
Target Milestone: --- → 1.3
Assignee: nobody → kwierso
Comment on attachment 556362 [details] [diff] [review]
Add readAsync() method that uses a callback

I know this is just a quick demonstration, but I thought I'd give my feedback anyway:

* In my experience, ignoring failures silently doesn't make for a nice-to-use async API.

* Why is callback an object? Are you thinking of having more methods on it? If so, which ones? If we only care about completion (with either success or error cases), what's wrong with just a function that takes error (which may be null) and result? (cf. the node.js convention I mentioned previously)

* <bikeshed> If we're going to deprecate file.read(), then it'd be nice to make the thing that's supposed to replace it at least as concise. readAsync() implies that there's still some other way to read files and there really shouldn't be. I suggest something like readAll() or readContents() </bikeshed>

* What's the difference between Cu and components.utils?
(In reply to Philipp von Weitershausen [:philikon] from comment #5)
> Comment on attachment 556362 [details] [diff] [review]
> Add readAsync() method that uses a callback
> 
> I know this is just a quick demonstration, but I thought I'd give my
> feedback anyway:
> 
> * In my experience, ignoring failures silently doesn't make for a
> nice-to-use async API.

Yeah, this whole bit was just blatantly stolen from the netutil example: https://developer.mozilla.org/en/JavaScript_code_modules/NetUtil.jsm#asyncFetch%28%29


> * Why is callback an object? 

Because I tried once to pass it as a function and failed, and then tried this and it seemed to work. Now that I'm looking at it when it's not almost midnight, I see how it should be done.

> Are you thinking of having more methods on it?
> If so, which ones? If we only care about completion (with either success or
> error cases), what's wrong with just a function that takes error (which may
> be null) and result? (cf. the node.js convention I mentioned previously)

I toyed with having an onError method, but if the callback function can cope with it, we should go with that.


> * <bikeshed> If we're going to deprecate file.read(), then it'd be nice to
> make the thing that's supposed to replace it at least as concise.
> readAsync() implies that there's still some other way to read files and
> there really shouldn't be. I suggest something like readAll() or
> readContents() </bikeshed>

Bikeshed accepted. :)

> * What's the difference between Cu and components.utils?

In theory, nothing. But when I tried to do | Cu.import("resource://gre/modules/NetUtil.jsm"); | it didn't seem to actually import it. (bug 683217)
So then, something like this?

It reports back errors encountered in the asyncFetch() process, uses a function directly as the callback, and recolored the bikeshed to readAll().

It doesn't change any SDK modules to actually use the new async readAll(), though.

I tested it out with a sample extension:

----
const file = require("file");

file.readAll("/path/to/some/localfile.txt", function(error, data) {
  if(error) {
    console.log(error);
  }
  console.log(data);
});
----

Which successfully logged the contents of localfile.txt to the console.


Incidentally, I discovered bug 684134 when I tried to test it with a text file bundled in the extension's data folder and tried to pass it as file.readAll(data.url("localfile.txt"), ...);


Before this lands/ships, though, we should probably make sure everything in the SDK has been switched to use the new async method, so we don't start spewing the deprecation warning to people's consoles when their addons get repacked with whatever SDK version this ships in.
Attachment #556332 - Attachment is obsolete: true
Attachment #556362 - Attachment is obsolete: true
Attachment #556332 - Flags: feedback?(sdwilsh)
Attachment #556362 - Flags: feedback?
Attachment #557741 - Flags: feedback?(philipp)
Attachment #557741 - Flags: feedback?(myk)
Comment on attachment 557741 [details] [diff] [review]
Callback as a function

This looks better to me, though Myk should have final say on how Jetpack-y the API is at this point. Couple of notes:

* Nits:

  - The deprecation warning still mentions readAsync().

  - You define 'file = MozFile(filename)' early but then recreate it in the first argument to NetUtil.asyncFetch.

  - No need to define 'str' and 'error' at that outermost level. (Also, I have no idea what JetPack's coding style is like, but 'let' is widely preferred to 'var'.)

* Don't try to read from inputStream if fetching it wasn't successful. Basically, if status isn't a success code, you need to bail out.

* inputStream.available() and NetUtil.readInputStreamToString() can both throw. One thing to be careful about in asynchronous programming: If something throws in your callbacks, execution stops at that point, the exception bubbles up into nirvana (because the callback was dispatched by the event loop) and the callback chain is broken. So you really want to wrap that line in a try block and invoke the callback with the exception's error code as the first argument in the catch block, so that the caller has a chance of knowing if something went wrong.

Basically:

exports.readAll = function readAll(filename, callback) {
  let file = MozFile(filename);
  ensureFile(file);
  NetUtil.asyncFetch(file, function(inputStream, status) {
    console.log(status);
    if (!components.isSuccessCode(status)) {
      return callback(error);
    }
    let str;
    try {
      str = NetUtil.readInputStreamToString(inputStream, inputStream.available());
    } catch (ex) {
      return callback(ex.result);
    }
    return callback(null, str);
  });
}
Attachment #557741 - Flags: feedback?(philipp) → feedback+
Even more fun: the callback can fail, so you should wrap that in a try-catch and report it to the error console if it does.
(In reply to Philipp von Weitershausen [:philikon] from comment #8)

(In reply to Shawn Wilsher :sdwilsh from comment #9)

Noted and noted.

Waiting for the current patch to work its way through Myk's feedback queue before posting a new one.
(Pushing all open bugs to the --- milestone for the new triage system)
Target Milestone: 1.3 → ---
Comment on attachment 557741 [details] [diff] [review]
Callback as a function

(In reply to Philipp von Weitershausen [:philikon] from comment #3)
> Indeed we must not spin the event loop. Sync does it and we're paying lots
> of horrible engineering debt because of it. The biggest one being an entire
> rewrite of the core Sync APIs to be asynchronous.

Indeed; let's not go there.


> I suggest something like
> 
>   file.read(path, function (error, data) {
>     // do stuff here
>   });
> 
> This is how we've been doing async APIs in Sync now. It follows the node.js
> convention of having callbacks first take an 'error' argument and than any
> additional arguments. 'error' could be an exception object or it could be an
> XPCOM result code.

That works for me.


(In reply to Wes Kocher (:KWierso) (Jetpack Bugmaster) from comment #7)

> Before this lands/ships, though, we should probably make sure everything in
> the SDK has been switched to use the new async method, so we don't start
> spewing the deprecation warning to people's consoles when their addons get
> repacked with whatever SDK version this ships in.

The perfect is the enemy of the good.  Let's land this API enhancement first, then modify other core APIs to use it, then deprecate the sync API.


(In reply to Philipp von Weitershausen [:philikon] from comment #8)

> (Also, I
> have no idea what JetPack's coding style is like, but 'let' is widely
> preferred to 'var'.)

`let` is the new `var` for the SDK, too, and much of the SDK's code already uses it.  Some older code still uses `var`, as do some modules intended to be reusable in Node.js, and high-level API documentation targeted at addon developers uses `var` in example code to make it easier to grok for new addon developers with a web development background.

But in general, SDK core implementation code should use `let`.


As for the name of the function, I would prefer we make `read` polymorphic, behaving asynchronously if its argument list includes a callback function, so `read(filename, mode)` remains synchronous, while `read(filename, mode, callback)` behaves asynchronously.

This also implies that the async implementation of `read` should support reading binary files, as does the sync implementation currently (and which appears to be the sole reason for the `mode` parameter).


Aside: I'm not (yet?) sold on the value of making all SDK core APIs asynchronous, especially the high-level ones targeted at potential addon developers, many of whom are novice and casual programmers.

But I do think it's worth introducing more asynchronicity into the core API set, at least as an option, especially in the low-level APIs targeted at API developers, most of whom are experienced and professional programmers.

So I'm definitely +1 on adding this API and modifying other core APIs to use it.
Attachment #557741 - Flags: feedback?(myk) → feedback+
(In reply to Myk Melez [:myk] from comment #12)
> Aside: I'm not (yet?) sold on the value of making all SDK core APIs
> asynchronous, especially the high-level ones targeted at potential addon
> developers, many of whom are novice and casual programmers.

As I pointed out in bug 649889 comment 3, I think this is a false dichotomy. Also, pretty much all of client-side web programming is async (event-based), so it should be familiar to a large portion of the Jetpack audience, right?

> But I do think it's worth introducing more asynchronicity into the core API
> set, at least as an option, especially in the low-level APIs targeted at API
> developers, most of whom are experienced and professional programmers.

I would very much like to see this "option" be at least the *strongly* recommended one, but ideally we would deprecate and eventually remove blocking calls altogether. Users vote with their feet, and they largely do it because performance matters to them. It can no longer be an afterthought.
(In reply to Philipp von Weitershausen [:philikon] from comment #13)
> As I pointed out in bug 649889 comment 3, I think this is a false dichotomy.

Yup, I'm familiar with that comment. :-)


> Also, pretty much all of client-side web programming is async (event-based),
> so it should be familiar to a large portion of the Jetpack audience, right?

Familiar, perhaps, but not thereby sufficiently usable.


> Users vote with their feet, and they largely do
> it because performance matters to them. It can no longer be an afterthought.

No one has suggested that, and performance is important to the Jetpack project, but asynchronous APIs != performance, nor is performance our only priority.
Irakli, you said you had some async file methods. Anything you could contribute here?
Duplicate of this bug: 746193
Do we still need this now we're probably getting bug 781486?
Unsurprisingly, I would vote for bug 781486, given that OS.File is:
- asynchronous;
- compatible with workers;
- designed to replace most uses of nsLocalFile*;
- designed for faster IO than nsLocalFile*.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 781486
You need to log in before you can comment on or make changes to this bug.