Closed Bug 835552 Opened 13 years ago Closed 12 years ago

Make load() be script-relative and read()/snarf() be cwd-relative

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla22

People

(Reporter: sfink, Assigned: sfink)

References

Details

Attachments

(2 files, 2 obsolete files)

Currently, the JS shell's load() function resolves paths relative to the current working directory, while read() aka snarf() resolves relative to the running script. That seems backwards. If you have a script loading a script, I would expect it to search relative to the script so that you can place a directory full of scripts anywhere you want. (Well, perhaps it should also have a set of search paths it knows to scan, but this is just the JS shell we're talking about here.) However, when loading a data file, I don't see why the path of the script being executed should matter. For automating a static analysis, it would be helpful to change snarf's behavior, since it is running a JS script that calls system("some_command infile.txt > outfile.txt") followed by snarf("outfile.txt"), and having different search paths for the shell and the snarf forces me to place the script in the same directory as the in/out files. Which is inconvenient.
This changes behavior to what I would expect, though it would probably be best to expose the current directory and the script pathname, along with path manipulation commands, to allow the script author to do whatever is desired. Maybe someday.
Comment on attachment 707282 [details] [diff] [review] Make load() be script-relative and read()/snarf() be cwd-relative langfuzz uses both load() and snarf(), so I want to check with decoder whether this would break his scripts. The load() change is more likely to be problematic, and only the snarf() behavior affects what I want to do, so I can just do that part if it causes trouble.
Attachment #707282 - Flags: feedback?(choller)
Comment on attachment 707282 [details] [diff] [review] Make load() be script-relative and read()/snarf() be cwd-relative I think this is a safe change for LangFuzz, as long as loading absolute paths isn't broken by this. Note that there is some bug in the path resolving function that prevents loading relative filenames if the file loading the file is specified with an absolute path on the command line. Does this change also fix this behavior? That'd be great. Also thanks for checking with me =) I appreciate that.
Attachment #707282 - Flags: feedback?(choller) → feedback+
Blocks: 838029
(In reply to Christian Holler (:decoder) from comment #3) > I think this is a safe change for LangFuzz, as long as loading absolute > paths isn't broken by this. Yes, absolute paths are unaffected, and are fine. > Note that there is some bug in the path > resolving function that prevents loading relative filenames if the file > loading the file is specified with an absolute path on the command line. > Does this change also fix this behavior? That'd be great. "Fix"? It changes it to something that makes a lot more sense to me, and given your comment above, to you as well. Currently, if you have load("relative.js") in a script that you run as /absolute/path/to/script.js, and relative.js is in the same directory as script.js, then it will look in your current working directory for relative.js and probably fail (because if you were in the same dir as the script, you probably wouldn't have used an absolute path for it!) After this change, it will find it no matter how you pass script.js to the shell. On the other hand, if you're using snarf() or read() to load a "relative.txt" that is in the same directory as script.js, then currently that would succeed and my patch will make it fail if you don't happen to be in the same directory as relative.txt. I've sat on this patch long enough. Time to update all the jit-tests and land it.
This patch also makes Windows work the same way as other platforms. At least, as far as I can tell. Previously, the relative path resolution was XP_UNIX-only.
Attachment #721313 - Flags: review?(n.nethercote)
Attachment #707282 - Attachment is obsolete: true
Comment on attachment 721313 [details] [diff] [review] Make load() be script-relative and read()/snarf() be cwd-relative Review of attachment 721313 [details] [diff] [review]: ----------------------------------------------------------------- Looking good, but MakeAbsolutePathName is a disaster. I realize it predates you, but I still want to see it cleaned up. Sorry. BTW, can you remove the "XXX" comment in toolkit/components/aboutmemory/tools/diff-memory-reports.js? It's no longer needed now that this is fixed. ::: js/src/shell/js.cpp @@ +681,5 @@ > > +/* > + * Returns a JS_malloc'd string (that the caller needs to JS_free) containing > + * dirname(|from|) prepended to |leaf|. If |from| is empty or a leaf, > + * MakeAbsolutePathname returns a copy of leaf. This is a very confusing function. For example, if |from| ends with a '/', then it isn't modified before being concatenated. This is not consistent with Unix |dirname|: [bayou:~/moz/mi3/js/src] dirname a/b/c/ a/b Furthermore, the comment doesn't describe all the corner cases. I think we should just remove this function... more below. @@ +742,5 @@ > + const char *pathname = filename.ptr(); > + if (pathname[0] == '/') > + return filenameStr; > + > + const char *basename; The unix 'basename' takes a path like '/a/b/c' and returns 'c'. So calling this variable 'basename' confused me greatly. Please choose something else! @@ +749,5 @@ > + if (scriptRelative) { > + /* Get the currently executing script's name. */ > + script = GetTopScript(cx); > + JS_ASSERT(script->filename); > + basename = script->filename; In this case, we really want: dirname(script->filename) + '/' + pathname (including the Unix behaviour where dirname() of a name that doesn't contain '/' returns '.'). @@ +754,5 @@ > + } else { > + basename = getcwd(buffer, PATH_MAX); > + if (!basename) > + return NULL; > + strcat(buffer, "/"); And in this case, we really want: cwd() + '/' + pathname So you should just do those things more directly, and get rid of MakeAbsolutePathName, which is really confusing and being shoehorned to fit into two quite different cases. It'll end up clearer and probably shorter, too, since you'll be able to factor out the |'/' + pathname| part. @@ +3760,5 @@ > #endif > JS_FN_HELP("snarf", Snarf, 1, 0, > +"snarf(filename, [\"binary\"])", > +" Read filename into returned string. Filename is relative to the directory\n" > + " containing the current script."), Oh, so you're documenting a pre-existing feature? Cool.
Attachment #721313 - Flags: review?(n.nethercote) → review-
Hmm, I just realized that this will break my copy of parsemark in bug 839450. It consists of pairs of files like this: # foo.js compile(read("foo.data")) # foo.data <lots of JS code> You can invoke foo.js from another directory and it'll find foo.data because of the relative-loading behaviour. But your patch will break that. I wonder if we should have a readRelative function? (I also wonder if it's time to retire the "snarf" name.)
(In reply to Nicholas Nethercote [:njn] from comment #7) > Hmm, I just realized that this will break my copy of parsemark in bug > 839450. It consists of pairs of files like this: > > # foo.js > compile(read("foo.data")) > > # foo.data > <lots of JS code> > > You can invoke foo.js from another directory and it'll find foo.data because > of the relative-loading behaviour. But your patch will break that. In my previous version, I special-cased a leading "./" as in read("./foo.data") to be script-relative, just to allow access to the previous behavior. But I deemed it too ugly to live when I put the patch up for review. I initially tried adding an optional parameter to specify what it should be relative to, but it turns out there's already an optional parameter, and the API just seemed too hacky that way. > I wonder if we should have a readRelative function? Perhaps that's the best resolution. The other option would be to add a `dirname $0` equivalent. getscriptdir() or something. I was a little leery of that because path concatenation technically isn't portable, though it may not matter since we go through msys on Windows. I also didn't want to scope creep into providing a full path handling API. > (I also wonder if it's time to retire the "snarf" name.) I'm not going to now, since I know of a number of existing scripts (not in the tree) that use snarf() rather than read().
(In reply to Nicholas Nethercote [:njn] from comment #6) > Comment on attachment 721313 [details] [diff] [review] > Make load() be script-relative and read()/snarf() be cwd-relative > > Review of attachment 721313 [details] [diff] [review]: > ----------------------------------------------------------------- > > Looking good, but MakeAbsolutePathName is a disaster. I realize it predates > you, but I still want to see it cleaned up. Sorry. > > BTW, can you remove the "XXX" comment in > toolkit/components/aboutmemory/tools/diff-memory-reports.js? It's no longer > needed now that this is fixed. Ok > ::: js/src/shell/js.cpp > @@ +681,5 @@ > > > > +/* > > + * Returns a JS_malloc'd string (that the caller needs to JS_free) containing > > + * dirname(|from|) prepended to |leaf|. If |from| is empty or a leaf, > > + * MakeAbsolutePathname returns a copy of leaf. > > This is a very confusing function. For example, if |from| ends with a '/', > then it isn't modified before being concatenated. This is not consistent > with Unix |dirname|: > > [bayou:~/moz/mi3/js/src] dirname a/b/c/ > a/b True, though that was intentional. Callers are responsible for ensuring that the last directory component of the path ends in a slash. Which is admittedly obscure. > Furthermore, the comment doesn't describe all the corner cases. I think we > should just remove this function... more below. > > @@ +742,5 @@ > > + const char *pathname = filename.ptr(); > > + if (pathname[0] == '/') > > + return filenameStr; > > + > > + const char *basename; > > The unix 'basename' takes a path like '/a/b/c' and returns 'c'. So calling > this variable 'basename' confused me greatly. Please choose something else! Oh, sorry. I was trying to hard to avoid calling it "context". :-) > @@ +749,5 @@ > > + if (scriptRelative) { > > + /* Get the currently executing script's name. */ > > + script = GetTopScript(cx); > > + JS_ASSERT(script->filename); > > + basename = script->filename; > > In this case, we really want: > > dirname(script->filename) + '/' + pathname > > (including the Unix behaviour where dirname() of a name that doesn't contain > '/' returns '.'). > > @@ +754,5 @@ > > + } else { > > + basename = getcwd(buffer, PATH_MAX); > > + if (!basename) > > + return NULL; > > + strcat(buffer, "/"); > > And in this case, we really want: > > cwd() + '/' + pathname > > So you should just do those things more directly, and get rid of > MakeAbsolutePathName, which is really confusing and being shoehorned to fit > into two quite different cases. It'll end up clearer and probably shorter, > too, since you'll be able to factor out the |'/' + pathname| part. Ok. Hopefully msys provides dirname(). I don't want to reimplement it in an I'm-pretending-to-be-portable way.
Ok, this removes MakeAbsolutePath. You're right, this is much simpler. dirname() is from libgen.h on Unix, apparently io.h or direct.h on Windows. I also added a readRelative, and fixed the documentation. (I didn't realize you were telling me I was documenting it the wrong way.)
Attachment #722957 - Flags: review?(n.nethercote)
Attachment #721313 - Attachment is obsolete: true
Comment on attachment 722957 [details] [diff] [review] Make load() be script-relative and read()/snarf() be cwd-relative Review of attachment 722957 [details] [diff] [review]: ----------------------------------------------------------------- So much better! Thank you. ::: js/src/shell/js.cpp @@ +723,5 @@ > + if (!cwd) > + return NULL; > + strcpy(buffer, cwd); > + } > + Nit: whitespace alert. @@ +3747,4 @@ > " Synonym for snarf."), > > + JS_FN_HELP("readRelative", ReadRelative, 1, 0, > +"snarf(filename, [\"binary\"])", s/snarf/readRelative/ Having suggested |readRelative|, I now wonder if |readRelativeToScript| would be better? Not sure. You decide.
Attachment #722957 - Flags: review?(n.nethercote) → review+
> I also added a readRelative, and fixed the documentation. (I didn't realize > you were telling me I was documenting it the wrong way.) I didn't realize it either! I was commenting about the documentation of the second argument :) Still, a good outcome.
Attachment #722957 - Flags: checkin+
Attachment #722957 - Flags: checkin+ → checkin-
Hmm, now I'm wondering if load() should be kept as cwd-relative. I just typed this: [surf:~/moz/mi0/js/src] d64/js js> load("a.js") hello and I realized that with this bug's changes it wouldn't work. In general, in any application, loading files relative to cwd is the most intuitive thing, right? Maybe load() should be cwd-relative and we should add a loadRelativeToScript() function?
...except I forgot to qref with the "-e" thing.
Attachment #725223 - Flags: review?(n.nethercote)
Comment on attachment 725223 [details] [diff] [review] Make load() be script-relative and read()/snarf() be cwd-relative. Review of attachment 725223 [details] [diff] [review]: ----------------------------------------------------------------- I still think load() should be cwd-relative, but at least you can go either way now.
Attachment #725223 - Flags: review?(n.nethercote) → review+
Sorry, I typed up this comment, forgot to submit it, then closed the tab. So, here it is: (In reply to Phil Ringnalda (:philor) from comment #14) > Comment on attachment 722957 [details] [diff] [review] > Make load() be script-relative and read()/snarf() be cwd-relative > > Backed out in > https://hg.mozilla.org/integration/mozilla-inbound/rev/4657492d3aaf because > Windows said > https://tbpl.mozilla.org/php/getParsedLog.php?id=20489005&tree=Mozilla-Inbound ...which I totally don't understand, because I tested this on a Windows 7 box. There must be some difference in MozillaBuild or SDK or mingw versions. Or something. I've changed it to use _splitpath on Windows, which seems to have passed try -- see https://tbpl.mozilla.org/?tree=Try&rev=7d8e34cdd1a1 > In general, > in any application, loading files relative to cwd is the most intuitive > thing, right? IMO, no. Regular file loads should be cwd-relative. But script loads ("imports") make much more sense as script-relative, particularly in the absence of a JS_LIB_PATH mechanism. If they aren't script-relative, then you can't use a collection of scripts in a directory other than the working directory. Making load() default to using the cwd is a lot like adding '.' to your C/Python/Perl/whatever include path, which is a footgun (since if you run from someplace that happens to have a script by the requested name, it may end up getting the wrong script.) (In reply to Nicholas Nethercote [:njn] from comment #15) > Hmm, now I'm wondering if load() should be kept as cwd-relative. I just > typed this: > > [surf:~/moz/mi0/js/src] d64/js > js> load("a.js") > hello > > and I realized that with this bug's changes it wouldn't work. That seems like a valid use case. But I would assert that it's different from a script file containing a load() command; your example is special because you're using the interactive shell. So I've updated the patch to special-case the script named "-e". > Maybe load() should be cwd-relative and we should add a > loadRelativeToScript() function? I've added a loadRelativeToCwd() to cover anybody who still needs the old behavior.
Attachment #725223 - Flags: checkin+
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
This breaks the sunspider benchmark. Now the benchmark doesn't work without patching. Not sure if I should report upstream to get a patch in to use "loadRelativeToCwd"? Also I'm not sure they will want to patch it??
Also this change makes no sense. It has always been that way + it is the same for JSC and D8. Now we are the only ones that have a different behavior for "load()" in the shell :(
(In reply to Hannes Verschore [:h4writer] from comment #22) > Also this change makes no sense. It has always been that way That's an unconvincing argument; there are many silly things about our shell that have been that way for ages, probably because it's really just a test runner. > + it is the > same for JSC and D8. Now we are the only ones that have a different behavior > for "load()" in the shell :( I didn't realize that. That *is* a big deal. (In reply to Hannes Verschore [:h4writer] from comment #21) > This breaks the sunspider benchmark. Now the benchmark doesn't work without > patching. Not sure if I should report upstream to get a patch in to use > "loadRelativeToCwd"? Also I'm not sure they will want to patch it?? That's a dealbreaker in my mind. I'll put it back. I only depend on the snarf() behavior anyway, I was just trying to do some related cleanup at the same time. I was surprised that this was controversial, but maybe I'm just confused.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: