Closed Bug 104152 Opened 23 years ago Closed 23 years ago

Engine doesn't pass parameters through to executables correctly

Categories

(Core Graveyard :: Installer: XPInstall Engine, defect)

x86
Windows NT
defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED
mozilla0.9.7

People

(Reporter: curt, Assigned: curt)

Details

Attachments

(1 file, 3 obsolete files)

In an installer script the following line should cause the executable to be launched with three command-line parameters: err = execute("jre131i.exe", "-s -a -s", true); but the three parameters get incorrectly quoted such they are perceived as a single parameter by the exe.
So, the nsIProcess api requires us to chop up the arguments into an arguments array, which on windows is then reconstituted back into a command line string before calling the WinAPI CreateProcess(). Only being lazy we didn't chop, we stuck the passed arguments into the first slot. At some point the nsProcess component got smart and started quoting arguments when it detects internal spaces, so instead of our arguments passing through unharmed they are now quoted into a single argument. To fix this we'll have to do the chopping in nsInstallExecute. You'll need to either make a separate pass to count the args, or just declare the cArgs array ridiculously large (256 ought to do) and then lump everything remaining into the last slot if we somehow use 'em all up. We already dup the string so we can just drop a null for every space and point the next slot at the remainder Somthing like #define IN_SINGLE_QUOTE 1 #define IN_DOUBLE_QUOTE 2 #define ARG_SLOTS 256 char *cArgs[ARG_SLOTS]; int quoted = 0; int argc = 0; char *c; if (!mArgs.Empty()) { cArgs[argc++] = ToNewCString(mArgs); if (!cArgs[0]) return nsInstall::OUT_OF_MEMORY; c = cArgs[0]; for (c = cArgs[0]; *c && argc < ARG_SLOTS; ++c) { switch(*c) { case '\'': if (quoted & IN_SINGLE_QUOTE) quoted &= ~IN_SINGLE_QUOTE; // turn off single quote else quoted &= IN_SINGLE_QUOTE; // turn on single quote break; { similar for double quote } case ' ': if (!quoted) { *c = 0; // terminate current arg char *p = c+1; // look ahead while (*p == ' ') ++p; // eat spaces if (*p) cArgs[argc++] = p; //not at end, set next arg c = p-1; } break; default: break; // nothing to do } } That was off the top of my head, watch for off-by-one errors and the like.
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9.6
Target Milestone: mozilla0.9.6 → mozilla0.9.7
Attached patch 0.1 for mozilla (obsolete) — Splinter Review
Comment on attachment 56730 [details] [diff] [review] 0.1 for mozilla >+ cArgs[argc++] = ToNewCString(mArgs); mArgs is unicode, isn't it? I realize the code already converted this way before, but we should at least file a bug that execute() can't handle args with i18n chars that aren't Latin-1. Please reference that bug number in this bug report and we'll skip the fix for it as part of this patch (since several other places need to be fixed to make it really work anyway). >+ switch(*c) { >+ case '\'': >+ quoted ^= IN_SINGLE_QUOTE; // toggle the bit >+ break; >+ case '\"': >+ quoted ^= IN_DOUBLE_QUOTE; // toggle the bit >+ break; Can't use XOR here -- not only will you toggle the quote in question, in the usual case you will also be turning ON the other quote (that is, when it isn't already on). If you're using a bitfield you need to do it the way my example did. Maybe its clearer to just go ahead and use two variables, squote and dquote. In that case forget the #defines and the bitwise operators: case '\'': squote = !squote; break; and then your condition instead of "if (!quoted)" will be "if (!squote && !dquote)" This is not performance critical spot in the code and readability wins over my initial attempt to eliminate the extra comparison. But wait, you aren't keeping track of or deleting the outer quote, you're passing it in. In the unix case, where our array goes straight into PR_CreateProcess(), the program will get the quotes as part of the arg. And on windows assembleCmdLine() will put quotes around the quoted arg which is not at all what we want (single arg "a b" becomes 4-arg "" a b ""). We need to remove the outer quote. On further reflection we don't really care about the inner quote, and the squote/dquote mechanism (and my original one) allows for overlapping quotes: "blah 'foo" blah' would incorrectly be one arg. try then char quote = 0; ... case '\'': if (!quote) quote = '\''; // turn on quote cArgs[argc++] = c; start else if (quote = '\'') quote = 0; // turn off quote c = '\0'; // else ignore other kind of quote I've played with the command line a bit, and only quotes at the start or after whitespace start a new command, but quotes in the middle do keep whitespace together. That is, DOS seems to take blah"blah blah" as blahblah blah and not two arguments blah blah blah Doing that would require moving chars over, and we can probably ignore it and just decree that a quote anywhere starts/stops an arg. What about people who use single quotes as an appostrophe? Clearly we have to respect escaped quotes \" and \' as the regular char (and that implies \\, too), and to do that you'll have to shift chars over scripts could get realy ugly -- they'll have escaping for javascript, escaping characters for us that they want escaped by the shell. Consider trying to execute the command: cvs commit -m"don't crash" somefile That'd turn out to be execute('cvs','commit -m "don\\\'t crash" somefile'); and we'll get the args string commit -m "don\'t crash" somefile which we need to turn into commit -m don't crash somefile This all raises the possibility of mis-matched quotes. If we're going to treat that as an error (and I lean that way) we need to parse the args during the ::Prepare() steps so we can return the error code at an appropriate time. The other option is just to treat the rest of the string as whatever arg got started and let the developer debug it -- I could go for that too in a pinch. Ok, this handles execute() -- where's the equivalent patch for File.execute()? You probably want to put this arg munching stuff into its own routine and call it from both places
Attachment #56730 - Flags: needs-work+
Regarding above issues which Dan raises, after further discussion we've come up with the following (some of which needs to be documented for users): I have created bug #108782 to note the problem with execute() not handling international characters. To handle quotes we've come up with the following rules (Dan, who should take care of documenting this for users?): - We only treat double quotes as quotes, so we just ignore single quotes. - We strip off outer double quotes. - We strip the backslash from "backslash doublequote" to allow a mechanism for passing a quote through to the executable. All other backslash cases are ignored. I'll turn this all into a function which can be called by either install.Execute() or file.Execute() as Dan suggests.
Can you show me a couple examples of good "passes" using the rules you describe here, Curt? I'm updating the docs right now, but I'm having a hard time ferreting out what the rules mean, recommendation-wise. Do we want people to pass a string full of escape-quoted chunks (e.g., err = execute("jre131i.exe", "\"-s\" \"-a\" \"-s\"", true)? thanks
Attachment #56730 - Attachment is obsolete: true
There is one thing about the 0.2 patch I wasn't certain about: I'm not assuming that \"'s are wrapped around a single parameter, i.e., if there are two \"'s with spaces in between the spaces are still assumed to delimit new parameters. If you want the \"'s to wrap around a single parameter they must themselves be wrapped in double quotes. No other way makes sense to me.
Regarding Ian's documentation question: Spaces are assumed to delimit seperate command-line paramters. So "-a -b -c" is assumed to be 3 parameters: 1) -a 2) -b 3) -c If you need to put spaces into a single parameter you must use double quotes NOT single quotes. So "-s \"-b -c\"" is assumed to be 2 parameters: 1) -a 2) -b -c We are treating escaped double quotes as normal characters, so "-a -b\\\" -c" is three parameters: 1) -a 2) -b\" 3) -c and "\"\\\"-a -b -c\\\"\"" is a single paramter: 1) \"-a -b -c\" but "\\\"-a -b -c\\\"" ends up (not very usefully) being treated as three parameters: 1) \"-a 2) -b 3) -c\" It is the last two examples which I'm a bit uncertain of and have asked for feedback on.
One important thing to note if you're updating the documentation is that this bug is NOT fixed. The current behavior is that whatever arguments you think you are passing in end up as a single argument passed to the program, that is, the entire arg string will appear in a single argv[] item in the program. Some programs will parse that OK, and as we've found many won't. This fix will allow arguments like "-s -a -s" to be correctly interpreted as three separate arguments. The bit we're quibbling over is how to allow quoting back in for those cases where space characters are, in fact, part of the argument. But the docs must make clear that this is a fix that only works in 0.9.7 and later. Note that when Curt talks about "outer quotes" and escaped quotes he's talking about the string as XPInstall sees it. As the arguments appear in the xpinstall script the true outer quotes are parsed by the javascript interpreter. In script: execute("a.exe","\"arg one\" d\\\"quote") install engine: "arg one" d\"quote passed as: 1-->arg one 2-->d"quote We need to say that unlike some languages the \ char is NOT an escape except in the specific case of being followed by a double quote or another slash, and that the double slash use is optional. If we didn't, the common case of passing a filename argument would require quadruple slashes broken example: js script: execute('a.exe','c:\\\\windows\\\\win.ini') XPI engine: c:\\windows\\win.ini passed as c:\windows\win.ini So, double slash is optional (the above will work, just not recommended), and the only case you need it is if you want slash-dquote to appear in the argument, quite a rare case. install script: execute("a.exe",'c:\\tmp\\blah.txt slash\\\\\\"quote') XPI engine: c:\tmp\blah.txt slash\\\"quotepassed as 1-->c:\tmp\blah.txt 2-->slash\"quote This is a pretty contrived example but will make a good test case.
> and "\"\\\"-a -b -c\\\"\"" is a single paramter: > 1) \"-a -b -c\" Assuming the outer quotes are those in the script, after the JS parser gets done we end up with -->"\"-a -b -c\"" which I think means we pass the single arg (including the quotes) -->"-a -b -c" Of course it'd be clearer in a script to use single quotes to avoid some of the extra escaping, so if that's what you wanted you could write '"\\"-a -b -c\\""' (I think) But these are all pretty contrived. By using single quotes in the script you can avoid escaping double quotes, and people probably mean double quotes to group args rather than wanting literal double quotes passed in.
I realize that I jumped the gun with this patch in that the fix is still only being applied to install.Execute(), not yet to file.Execute(). None-the-less, I can use feedback as it is presently. Thanks.
Ccing Grace.
Comment on attachment 56949 [details] [diff] [review] 0.2 for mozilla - This incorporates Dan's suggestions >+ if (!mArgs.IsEmpty()) >+ { >+ cArgs[argc++] = ToNewCString(mArgs); >+ if (!cArgs[0]) >+ return nsInstall::OUT_OF_MEMORY; >+ for (c = cArgs[0]; *c && argc < ARG_SLOTS; ++c) { >+ switch(*c) { You switch styles back and forth between conditional { } and conditional { } The rest of the file falls squarely in the latter camp so please "when in Rome" to match. In other words in the section above the "if" fits in, the for and switch don't, nor do subsequent ifs and whiles. Chopping up the string in place needs a comment I think. >+ // Only handle escaped double quote, no other escaped characters Need to handle double-slash too otherwise people won't be able to pass in a literal \" if they needed to --which might be the case if they're executing a shell script with its own escaping needs. >+ case '\\': >+ p = c; //look ahead >+ if (*(p+1) == '\"') { >+ while (*p != 0) { >+ *p = *(p+1); >+ p++; >+ } >+ } >+ break; You need to increment c and extra one in this case, otherwise the very next time you'll find the " you just moved over and do the next case incorrectly. You should also comment that you're shifting the string over one to squeeze out the escape. case '\\': char ch = *(c+1) // look ahead if ( ch == '\\' || ch == '\"' ) { // escaped slash or dquote, shift string to eat slash for (p=c; *p != 0; ++p) *p = *(p+1); ++c; // step over escaped character } break; >+ case '\"': >+ *c = 0; // terminate current arg >+ if (quoted) { I'd move the *c=0 inside the if (quoted), the comment makes more sense and you don't waste time blowing away leading quotes you're just going to step over >+ quoted = PR_FALSE; >+ } >+ else { >+ quoted = PR_TRUE; >+ cArgs[argc-1] = c+1; >+ } >+ break; you can move the two quoted assignments outside the if/else right before the break and combine them: quoted = !quoted; // toggle quote state please add a comment to the cArgs[argc-1] = c+1;, I had to think about it too much. "move current argument pointer past the initial quote" or something Before checking in we need to get this working in File.execute() as well. Please do that before adding any more patches to this bug.
Attachment #56949 - Attachment is obsolete: true
Comment on attachment 58125 [details] [diff] [review] 0.3 - Addresses file.Execute and install.Execute both now. >+PRInt32 ChopParams(char **cArgs, nsString instring, PRInt32 slots) Mozilla coding style pushes an initial 'a' for arguments, like aArgs, aInString, aSlots. See http://www.mozilla.org/hacking/mozilla-style-guide.html Another useful reference (and the only place I could find the above linked) is http://www.mozilla.org/hacking/nutshell.html Since there's a push toward the "all-in-one" static builds we have to start worrying about potential global namespace clashes so for non-static non-class functions we should probably fallback on the older 4.x guideline of using MODULE_foo for exported functions and module_foo for functions private to the module. In this case, then, maybe xpi_ChopParams() I'd prefer a less cryptic name if you can come up with one, xpi_PrepareProcessArguments() maybe? Avoid nsString arguments like the plague, or any object for that matter. Since C/C++ is pass-by-value this forces a new object to be created, and in the nsString case this includes an allocation and copy to create a copy of the string data. allocation is currently 15% or so of everything we do so we have to get out of this kind of habit. An object reference would be acceptable and has some advantages over a pointer while being just as efficient. >+ cArgs[argc] = ToNewCString(instring); In this case, though, the first thing you do is make your own copy of the nsString data (another allocation) so it's looks like a plain char * argument might be best. Up to you whether you want the allocation to be done inside this routine (in which case your in string should be const) or use the ToNewCString() in the calling routines. oh, but that's an evil lossy conversion (you created a bug on that, right? we should reference that in a comment here) so it's probably best to hide the conversion and allocation inside this function and just use "const nsString& aArgumentString" for the arg. I think you should document what the inputs and expected outputs of this function, it's not really clear what "slots" is or what people have to do with the args array (e.g. free it later). aArgs and aSlots really should be next to each other since one describes the limits of the other. Aesthetically I'd put your inString first since that's the main input to this function. >+ // See if next character is backslash or dquote >+ if ( *(c+1) == '\\' || *(c+1) == '\"' ) >+ { >+ // Eat escape character (i.e., backslash) by >+ // shifting all characters to the left one. >+ for (p=c; *p != 0; ++p) >+ *p = *(p+1); >+ } You need a ++c; statement after the for(p) loop, otherwise c is left pointing at the escaped dquote char which is then treated as a real quote by the next case statement >+ case '\"': >+ *c = 0; // terminate current arg >+ if (quoted) >+ { >+ p = c+1; // look ahead >+ while (*p == ' ') >+ ++p; // eat spaces >+ if (*p) >+ cArgs[argc++] = p; //not at end, set next arg >+ c = p-1; >+ >+ quoted = PR_FALSE; >+ } >+ else >+ { >+ quoted = PR_TRUE; >+ cArgs[argc-1] = c+1; >+ } We've got a problem with quoted args that aren't also space delimited. For example: foo"bar" You can make an argument for "foo" and "bar" or for "foobar" (which is what DOS does), but the above code results in "bar" because the else clause moves up the current cArgs slot. If you want "foo" and "bar" you'll have to check to see if cArgs[argc-1] is pointing at c. If so set to c+1, otherwise set cArgs[argc++] to c+1. If you want "foobar" you'll have to shift things over as in the escape case and don't touch the cArgs slot. If you go for "foobar" you'll have to change the way you end quotes and only use spaces to terminate args. Otherwise foo"bar baz"bat would wrongly end up as "foobar baz" and "bat", where the initial quote concatenates but the final quote does not (consistancy would demand one arg "foobar bazbat"). I'd go for "foo" and "bar" -- easier to doc. >+ case ' ': >+ if (!quoted) >+ { >+ *c = 0; // terminate current arg >+ p = c+1; // look ahead >+ while (*p == ' ') >+ ++p; // eat spaces >+ if (*p) >+ cArgs[argc++] = p; //not at end, set next arg >+ c = p-1; What if some idiot passes in " arg1 arg2"? you'll create three args, the first one being a null string. You need to "pre-eat" spaces outside the main for loop. >+PRInt32 ChopParams(char **, nsString, PRInt32); The prototype should go in a header like nsInstallExecute.h Otherwise this will compile just fine if you change the function signature and forget to update this case, and the error may only be detected when this starts crashing (which could be down the road a ways since this method is not used all that much). >+ cParams[0] = 0; > >- cParams[0] = nsnull; nsnull and 0 are the same thing. >+ if (argcount >= 0) > { >+ rv = process->Init(mTarget); >+ if (NS_FAILED(rv)) >+ { >+ if(cParams[0]) >+ Recycle(cParams[0]); >+ return rv; >+ } > >- rv = process->Run(mBlocking, (const char **)&cParams[0], 1, nsnull); >+ rv = process->Run(mBlocking, (const char **)&cParams, argcount, nsnull); >+ } >+ else >+ rv = nsInstall::UNEXPECTED_ERROR; > > if(cParams[0]) > Recycle(cParams[0]); Save space by collapsing the two places you Recycle() cparams. In addition if you move it inside the if(argcount) you can dispense with the cParam[0] initialization at the top. if (argcount >= 0) { rv = process->Init(mTarget); if (NS_SUCCEEDED(rv)) rv = process->Run ... if (cParams[0]) Recycle(... } Most of our performance and bloat problem is small stuff scattered around that's really hard to make progress on. Get in the habit of watching for these kinds of inefficiencies.
Attachment #58125 - Flags: needs-work+
I've addressed most of Dan's (and David mentioned a couple offline) issues (the easiest ones!) but want to clarify some things before I generate another patch: - I'm amenable on naming conventions and such. - I'm not following you on the "evil lossy thing" so it is doubtful that I have created a bug on it. But maybe we can take this offline for a tutorial session. In the meantime, I understand, and have implemented, the reference suggestion and it works. - I believe you're mistaken about the c++ recommendation. I fooled with that when you gave me the example code earlier. What actually happens is that c comes out of the whole if statement pointing to the # or \ following the escape character (as you point out) but is immediately ++'d by the if statement wrapped around the case statements so it is pointing where it should be by the time it gets into the case statements again. - I propose that we check this fix in with the parsing logic that it currently has and open another bug for the edge cases (like embeded quotes and leading spaces). My reasoning is that this patch works for all the "sane" cases and is relatively straightforward, whereas what is checked in currently doesn't work at all. Then we can take a little time to consider the best way to fix these oddball cases and have a nice stable codebase to fall back too if we introduce new bugs in our efforts to deal with these obscure possiblities. It may be that QA will think of some more possibilites for us to address while they are developing test cases for this fix, too. - Is there any trick to getting the new header file picked up by "make /f client.mak" after I add it to cvs? ..or any other administrative responsibilities I sign up for when I add a new file? Let me know what you think about all this, and I'll whip up another patch.
Oh, wait a minute. About the header file, are you suggesting adding the prototype into a header file that already exists (nsInstallExecute.h already exists) or creating a new header. I assumed the latter in my remarks above but hadn't read carefully enough. nsInstallExecute.h has a bunch of stuff, and includes a bunch of other headers, which (I assume) nsInstallFileOpItem.cpp doesn't need to know. I guess there is no big penalty for including all of the extra baggage to get the one prototype we care about? ...and, I think I changed nsnull to 0 because I was getting a compiler warning and that made it go away. I'll look into what that was about.
> I'm not following you on the "evil lossy thing" [...] you already filed bug 108782 on it, I'm fine with that. > I believe you're mistaken about the ++c recommendation. You're right, sorry about that. > I propose that we check this fix in with the parsing logic that it > currently has and open another bug for the edge cases (like embeded > quotes and leading spaces). If the problems were big (like the pervasive i18n issue) I'd agree, but in this case fixing them is simple and not fixing them means they're likely to remain in the code for years and years. stick something like while (*c == ' ') ++c; before the main loop to eat initial spaces, and in the found quote else do something like quoted = PR_TRUE; if (cArgs[argc-1] == c) // pointing at quote, point to real argument cArgs[argc-1] = c+1; else // start a new argument cArgs[argc++] = c+1; > Is there any trick to getting the new header file picked up by "make /f > client.mak" after I add it to cvs? ..or any other administrative > responsibilities I sign up for when I add a new file? - don't add a new file, the existing nsInstallExecute.h appears fine - make sure you use depend builds so header changes in your tree are picked up ("nmake /f client.mak depend" after a pull) - adding a new file may entail a bit of work, especially on mac, depending on what type of file it is. Fortunately in this case you don't have to worry. There should be docs about this (for future reference): http://www.mozilla.org/build/making-additions.html http://www.mozilla.org/build/mac-build-system.html
Attachment #58125 - Attachment is obsolete: true
Comment on attachment 58645 [details] [diff] [review] 0.4 - I think this covers everything, including the edge cases I was trying to postpone. >+PRInt32 xpiPrepareProcessArguments(const nsString& aArgsString, char **aArgs, PRInt32 aArgsAvailable) underbar after the module: xpi_Prepare... >+ else >+ { >+ // The quote is embedded so start a new argument. >+ *c = 0; // Terminate current arg. You already did this above, right? >+PRInt32 xpiPrepareProcessArguments(const nsString&, char**, PRInt32); variable names in the prototype would help document what this does. With those nits, sr=dveditz
Attachment #58645 - Flags: superreview+
r=dprice
I addressed the last few nits which Dan identified and checked the patch into the mozilla trunk.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Build: 2002-06-17-08-1.0 A few combinations are tested and behave as expected. Multiple parameters is supported. a_fileop_mparam_win_blk_t.xpi and a_exe_multiparam_win_blk_t.xpi Marking Verified!
Status: RESOLVED → VERIFIED
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: