Closed Bug 865389 Opened 7 years ago Closed 7 years ago

[OS.File] Read-ahead flag for MacOS X

Categories

(Toolkit :: OS.File, defect)

defect
Not set

Tracking

()

RESOLVED WONTFIX

People

(Reporter: Yoric, Assigned: yzen)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 6 obsolete files)

No description provided.
Assignee: nobody → yura.zenevich
Hi David, just to confirm, the c++ code is using fcntl F_RDADVISE to read ahead on MacOSX, and that's what I should be using to do that (I am assuming it's available in libc - http://www.delorie.com/djgpp/doc/libc/libc_309.html)
Flags: needinfo?(dteller)
Attached patch Patch for 865389 v1 (obsolete) — Splinter Review
Hi David, I am attaching this patch to get some feedback whether I am moving in the right direction (there are a number of rough edges). Any comments are extremely welcome :)!
Attachment #751290 - Flags: feedback?(dteller)
(In reply to Yura Zenevich [:yzen] from comment #1)
> Hi David, just to confirm, the c++ code is using fcntl F_RDADVISE to read
> ahead on MacOSX, and that's what I should be using to do that (I am assuming
> it's available in libc - http://www.delorie.com/djgpp/doc/libc/libc_309.html)

Yes, that's the idea.
Flags: needinfo?(dteller)
Comment on attachment 751290 [details] [diff] [review]
Patch for 865389 v1

Review of attachment 751290 [details] [diff] [review]:
-----------------------------------------------------------------

Good start.

::: dom/system/OSFileConstants.cpp
@@ +320,5 @@
>    INT_CONSTANT(AT_SYMLINK_NOFOLLOW),
>  #endif //defined(AT_SYMLINK_NOFOLLOW)
>  
> +  // cmd for fcntl
> +  INT_CONSTANT(F_RDADVISE),

I'm almost sure that F_RDADVISE doesn't exist under Linux. So you should do a |#if defined(F_RDADVISE)|

::: toolkit/components/osfile/osfile_async_front.jsm
@@ +378,4 @@
>     * @param {number=} bytes If unspecified, read all the remaining bytes from
>     * this file. If specified, read |bytes| bytes, or less if the file does not
>     * contain that many bytes.
> +   * @param {JSON} options Optionally additional options.

Well, eventually, you will need something more here :)

::: toolkit/components/osfile/osfile_shared_front.jsm
@@ +54,2 @@
>      if (bytes == null) {
> +      options.readahead = true;

If you plan to have side-effects on argument |options|, you should copy it.

::: toolkit/components/osfile/osfile_unix_front.jsm
@@ +90,3 @@
>        * @throws {OS.File.Error} In case of I/O error.
>        */
>       File.prototype._read = function _read(buffer, nbytes, options) {

I would keep |_read| simple, if possible.
Could you create a method |File.prototype._readahead|?

@@ +100,5 @@
> +         throw_on_negative("readahead",
> +           UnixFile.readahead(this.fd, options.offset, nbytes)
> +         );
> +       } else if ("fcntl" in UnixFile) {
> +         fd = throw_on_negative("fcntl",

I'm pretty sure |fcntl| doesn't return a file descriptor in this case.
Attachment #751290 - Flags: feedback?(dteller) → feedback+
Attached patch Patch for 865389 v2 (obsolete) — Splinter Review
Hi David,
A couple of more questions:
* Not sure how to go about the off64_t. For now I just put in place a definition that defers to off_t which is probably wrong (I am guessing I need to expose it on the C side).
* Let me know if I am using radvisory correctly now in the implementation of _readahead.
* I still seem to be getting an error while running existing tests for test_osfile_front: Unix error 14 during operation fcntl (Bad address). My suspicion is that radvisory might still be incorrect and there's something wrong with the offset or more likely the count.
Your suggestions are extremely welcome :)
Thanks
Attachment #751290 - Attachment is obsolete: true
Attachment #754664 - Flags: feedback?(dteller)
Comment on attachment 754664 [details] [diff] [review]
Patch for 865389 v2

Review of attachment 754664 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/osfile/osfile_async_front.jsm
@@ +288,5 @@
>     */
>    readTo: function readTo(buffer, options = noOptions) {
> +    // Preserve the reference to |outExecutionDuration| option if it is passed.
> +    options = clone(options, ["outExecutionDuration"]);
> +    if (!("bytes" in options)) {

&& !("readahead" in options)

@@ +366,5 @@
>     */
> +  read: function read(nbytes, options = noOptions) {
> +    // Copy |options| to avoid modifying the original object but preserve the
> +    // reference to |outExecutionDuration| option if it is passed.
> +    options = clone(options, ["outExecutionDuration"]);

Why are you cloning |options| here?

@@ +588,5 @@
>   */
> +File.read = function read(path, bytes, options = noOptions) {
> +  // Copy |options| to avoid modifying the original object but preserve the
> +  // reference to |outExecutionDuration| option if it is passed.
> +  options = clone(options, ["outExecutionDuration"]);

Why are you cloning |options| here?

::: toolkit/components/osfile/osfile_shared_front.jsm
@@ +56,5 @@
>     */
> +  read: function read(bytes, options = noOptions) {
> +    // Preserve the reference to |outExecutionDuration| option if it is
> +    // passed.
> +    options = clone(options, ["outExecutionDuration"]);

What's the point of keeping a reference to outExecutionDuration in this file?

@@ +58,5 @@
> +    // Preserve the reference to |outExecutionDuration| option if it is
> +    // passed.
> +    options = clone(options, ["outExecutionDuration"]);
> +    if (!bytes) {
> +      options.readahead = true;

Here, too, don't overwrite |readahead| if it's already present in |options|.

@@ +98,5 @@
> +      if ("readahead" in options) {
> +        // Preserve the reference to |outExecutionDuration| option if it is
> +        // passed.
> +        options = clone(options, ["outExecutionDuration"]);
> +        options.offset = pos;

That looks fishy. Are you changing the semantics of _read?

::: toolkit/components/osfile/osfile_unix_front.jsm
@@ +111,5 @@
>        * the end of the file has been reached.
>        * @throws {OS.File.Error} In case of I/O error.
>        */
>       File.prototype._read = function _read(buffer, nbytes, options) {
> +       if ("readahead" in options) {

We could have options.readahead == false.
Attachment #754664 - Flags: feedback?(dteller) → feedback+
(In reply to Yura Zenevich [:yzen] from comment #5)
> Created attachment 754664 [details] [diff] [review]
> Patch for 865389 v2
> 
> Hi David,
> A couple of more questions:
> * Not sure how to go about the off64_t. For now I just put in place a
> definition that defers to off_t which is probably wrong (I am guessing I
> need to expose it on the C side).

Just define Types.off64_t as Types.uint64_t.withName("off64_t")

> * Let me know if I am using radvisory correctly now in the implementation of
> _readahead.
> * I still seem to be getting an error while running existing tests for
> test_osfile_front: Unix error 14 during operation fcntl (Bad address). My
> suspicion is that radvisory might still be incorrect and there's something
> wrong with the offset or more likely the count.

Ah, I believe I found the error. It seems that the function doesn't expect a struct but a pointer to it.

> Your suggestions are extremely welcome :)
> Thanks
Attached patch Patch for 865389 v3 (obsolete) — Splinter Review
Tests to follow.
Note that I got rid of noOptions constant. I realized that const does not actually prevent from modifying the fields within the object (if they are overridden), unlike Object.freeze. This resulted in some tests failing when noOptions would be modified in one operation from osfile_shared_front.jsm and then used in another one.
Attachment #754664 - Attachment is obsolete: true
Attachment #756377 - Flags: review?(dteller)
Comment on attachment 756377 [details] [diff] [review]
Patch for 865389 v3

Review of attachment 756377 [details] [diff] [review]:
-----------------------------------------------------------------

While |options = {}| is indeed clearer than |options = noOptions|, I am not sure that this change is necessary.
Let's keep it for the clarify of code, though.

Once you have fixed the stuff below, I'll ask vladan to take a second look at the API.

::: toolkit/components/osfile/osfile_async_front.jsm
@@ +97,5 @@
> +        return object[key];
> +      },
> +      set: function(value) {
> +        object[key] = value;
> +      }

What did you change here? Just whitespaces?

@@ +315,5 @@
>     * @resolves {number} The number of bytes effectively read.
>     * @rejects {OS.File.Error}
>     */
> +  readTo: function readTo(buffer, options = {}) {
> +    // Preserve the reference to |outExecutionDuration| option if it is passed.

Nit: "Preserve reference to option |outExecutionDuration|, if it is passed."

@@ +319,5 @@
> +    // Preserve the reference to |outExecutionDuration| option if it is passed.
> +    options = clone(options, ["outExecutionDuration"]);
> +    if (!("bytes" in options) && !("readahead" in options)) {
> +      options.readahead = true;
> +    }

For v1, let's not activate readahead for partial reads. So no readahead for |readTo|.

@@ +396,5 @@
> +  read: function read(nbytes, options = {}) {
> +    if (!nbytes && !("readahead" in options)) {
> +      // Preserve the reference to |outExecutionDuration| option if it is passed.
> +      options = clone(options, ["outExecutionDuration"]);
> +      options.readahead = true;

Same here, no readahead for |File.prototype.read| for v1.

@@ +621,5 @@
> +  if (!bytes && !("readahead" in options)) {
> +    // Copy |options| to avoid modifying the original object but preserve the
> +    // reference to |outExecutionDuration| option if it is passed.
> +    options = clone(options, ["outExecutionDuration"]);
> +    options.readahead = true;

Here, yes.
(pending check with someone who knows more than me about readahead)

::: toolkit/components/osfile/osfile_shared_front.jsm
@@ +59,5 @@
> +      if (!("readahead" in options)) {
> +        options.readahead = true;
> +      }
> +    } else {
> +      options.bytes = bytes;

Here, you should clone |options|.
However, let's not support readahead here for v1.

::: toolkit/components/osfile/osfile_unix_back.jsm
@@ +100,5 @@
>         Types.gid_t =
>           Types.intn_t(OS.Constants.libc.OSFILE_SIZEOF_GID_T).withName("gid_t");
>  
>         /**
> +        * An offset (positive or negative, 64-bits).

I'm almost sure that off_t is always positive. Am I wrong?

@@ +455,5 @@
> +           declareFFI("fcntl", ctypes.default_abi,
> +                      /*return*/ Types.negativeone_or_nothing,
> +                      /*fd*/     Types.fd,
> +                      /*cmd*/    Types.int,
> +                      /*arg*/    Types.radvisory.in_ptr);

Let's replace this with a |void*|.

@@ +456,5 @@
> +                      /*return*/ Types.negativeone_or_nothing,
> +                      /*fd*/     Types.fd,
> +                      /*cmd*/    Types.int,
> +                      /*arg*/    Types.radvisory.in_ptr);
> +       } else {

That probably won't work on BSD. You will need to be able to handle the case in which we do not support any read ahead.

::: toolkit/components/osfile/osfile_unix_front.jsm
@@ +81,5 @@
> +      * only performed if either |UnixFile.readahead| or |UnixFile.fcntl| is
> +      * available.
> +      * @throws {OS.File.Error} In case of I/O error.
> +      */
> +     File.prototype._readahead = function _readahead() {

As this is a platform-specific feature, let's call this |File.prototype._unixReadahead| and ensure that it is defined only if we know how to perform readahead for the platform. I have the intuition that this can be useful for v2.

@@ +85,5 @@
> +     File.prototype._readahead = function _readahead() {
> +       // Pointer to the beginning of the file.
> +       let offset = 0;
> +       // A total number of bytes in the file to be read ahead.
> +       let count = this.stat().size;

This call requires one disk access that we can avoid.

I would prefer if |_readahead| accepted as arguments the size (and, for consistency, the offset). This will let the caller provide this information, that it already has anyway.

@@ +92,5 @@
> +         throw_on_negative("readahead",
> +           UnixFile.readahead(this.fd, offset, count)
> +         );
> +       } else if (UnixFile.fcntl) {
> +         // Special case for MacOS X.

Let's rather check whether |radvisory| is defined. We might decide to add support for |fcntl| for operations other than read-ahead on platforms that do not offer radvisory (i.e. FreeBSD).

@@ +121,5 @@
>        * @throws {OS.File.Error} In case of I/O error.
>        */
>       File.prototype._read = function _read(buffer, nbytes, options) {
> +       if (options.readahead) {
> +         this._readahead();

If we use my suggestion above, you'll need to check whether |_readahead| is defined.
Attachment #756377 - Flags: review?(dteller) → feedback+
(In reply to David Rajchenbach Teller [:Yoric] from comment #9)
> Comment on attachment 756377 [details] [diff] [review]
> Patch for 865389 v3 
> ::: toolkit/components/osfile/osfile_async_front.jsm
> @@ +97,5 @@
> > +        return object[key];
> > +      },
> > +      set: function(value) {
> > +        object[key] = value;
> > +      }
> 
> What did you change here? Just whitespaces?

Yes, I realized, last time I added 4 spaces instead of 2.
Attached patch Patch for 865389 v4 (obsolete) — Splinter Review
Attachment #756377 - Attachment is obsolete: true
Attachment #758960 - Flags: review?(dteller)
Comment on attachment 758960 [details] [diff] [review]
Patch for 865389 v4

Review of attachment 758960 [details] [diff] [review]:
-----------------------------------------------------------------

A few more changes, but we're getting there :)

::: toolkit/components/osfile/osfile_shared_front.jsm
@@ +50,5 @@
>     * this file. If specified, read |bytes| bytes, or less if the file does not
>     * contain that many bytes.
> +   * @param {JSON} options Optionally, an object that may contain the
> +   * following fields:
> +   * - {number} bytes A number of bytes to be read.

That looks strange. We already have an option |bytes|.

@@ +298,1 @@
>   * to read.

Looks like you have inverted two lines.
Also, if you have an option "offset", well, you should document it.
However, I would rather do without that option for the time being.

::: toolkit/components/osfile/osfile_unix_back.jsm
@@ +455,5 @@
> +                    /*fd*/     Types.fd,
> +                    /*cmd*/    Types.int,
> +                    /*arg*/    Types.void_t.in_ptr);
> +
> +       if (OS.Constants.Sys.Name !== "Darwin") {

You probably don't need that check. As you mention, |readahead| will be null on platforms for which it is not implemented, including Darwin.

::: toolkit/components/osfile/osfile_unix_front.jsm
@@ +89,5 @@
> +         // UnixFile.readahead is OS specific and might be null.
> +         if (UnixFile.readahead) {
> +           throw_on_negative("readahead",
> +             UnixFile.readahead(this.fd, offset, count)
> +           );

Nit: Could you rather do the following?

if (Unix.readahead) {
  File.prototype._unixReadahead = ...
} else if (OS.Shared.Type.radvisory) {
  File.prototype._unixReadahead = ...
}

@@ +121,5 @@
>        * @throws {OS.File.Error} In case of I/O error.
>        */
>       File.prototype._read = function _read(buffer, nbytes, options) {
> +       if (options.readahead && this._unixReadahead) {
> +         this._unixReadahead(options.offset, nbytes);

In the spirit of getting read of |options.offset| for v1, I would just hardcode 0 here. We will think about how to introduce the offset once we have a precise use case.
Attachment #758960 - Flags: review?(dteller) → feedback+
vladan, now that we're converging on the implementation, I'd be interested on your feedback on the API:

 OS.File.read(path, {readahead: true})

would call the platform-specific readahead implementation (if any), read the file, close it. Does this match our use case?
Flags: needinfo?(vdjeric)
Yes, this would work well
Flags: needinfo?(vdjeric)
{readahead:true} is a terrible api. if you are reading a whole file at once, you can assume you should read it with various flavors of SEQUENTIAL flags. No need to add a flag to do that.
Attached patch Patch for 865389 v5 (obsolete) — Splinter Review
I updated the patch with the latest comments, while waiting for the consensus regarding the readahead.
Attachment #758960 - Attachment is obsolete: true
Attachment #760111 - Flags: review?(dteller)
Ok, Yzen, after brainstorming, here are the updates:
- this bug family is a misnomer, it should be something like "linear flag for MacOS X" (and other platforms);
- |readahead| should be replaced by |sequential|;
- |sequential| should default to |true|;
- under Linux, the call is actually not |readahead| but |fadvise()|;
- under MacOS X, it seems that the |fcntl| flag is actually |F_RDAHEAD| with an argument of |1|;
- benchmarking will be useful to determine whether all that code is actually useful.

I hope I mentioned that this bug was somewhat exploratory, before you started working on it :)

Also, please synchronize with sankha93 who may or may not be working on bug 865387.
Blocks: 865387
Depends on: 881483
Comment on attachment 760111 [details] [diff] [review]
Patch for 865389 v5

Review of attachment 760111 [details] [diff] [review]:
-----------------------------------------------------------------

Canceling review atm.
Attachment #760111 - Flags: review?(dteller)
This is a WIP patch that should be more useful to sankha93, who's working on bug 865387 (linux implementation).
Attachment #760111 - Attachment is obsolete: true
Attached file Benchmark file (obsolete) —
This is what I used to benchmark. I got the following results:
*** Reading took: 47.0966 cpu sec ***
*** Reading ahead 1 (F_RDADVISE) took: 67.8173 cpu sec ***
*** Reading ahead 2 (F_RDAHEAD) took: 49.5564 cpu sec ***
Flags: needinfo?(dteller)
Attachment #762344 - Attachment is obsolete: true
(In reply to Yura Zenevich [:yzen] from comment #21)
> This is what I used to benchmark. I got the following results:
> *** Reading took: 47.0966 cpu sec ***
> *** Reading ahead 1 (F_RDADVISE) took: 67.8173 cpu sec ***
> *** Reading ahead 2 (F_RDAHEAD) took: 49.5564 cpu sec ***

So this is a benchmark measuring the benefit of read-ahead on a whole-file read performed with a single read() call?
Yes, there are 3 measurements, first one with just read. Second one with fcntl and the F_RDADVISE flag, the last one fcntl and the F_RDAHEAD flag.

Yes, whole files are read, and in case of the F_RDADVISE are readahead as a whole as well.

It looks like for a 1000 x 100mb files read is still better than any version of readahead and read so I am worried I might've made a mistake somewhere.
it sounds like macos does the right thing(tm) when you issue a big read. So it makes sense that it's not any faster...Not sure that being slower makes sense. Sounds like we are better off not using those readahead flags on osx
yzen, why are you closing the file after the fcntl() and before the read()?
Flags: needinfo?(dteller)
Here are the results for files with content:
*** Reading took: 49.304 cpu sec ***
*** Reading ahead 1 took: 68.9389 cpu sec ***
*** Reading ahead 2 took: 50.0032 cpu sec ***
Based on the above benchmarks it looks like sequential reads with neither fcntl and the F_RDAHEAD flag nor fcntl and the F_RDADVISE yielded any performance improvements (unlike fadvisory in linux, bug 865387).

Closing as WONTFIX since there is no incentive to implement read-ahead on OSX.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.