Closed Bug 67503 Opened 25 years ago Closed 25 years ago

Crash with JS callback and File obcall from second initInstall/performInstallject

Categories

(Core Graveyard :: Installer: XPInstall Engine, defect, P1)

x86
Windows NT
defect

Tracking

(Not tracked)

VERIFIED FIXED
mozilla0.9.1

People

(Reporter: jimmykenlee, Assigned: dbragg)

References

Details

(Keywords: crash, Whiteboard: [xpibug])

Attachments

(7 files)

Build: 2001-02-02-10-Mtrunk(WIN) 1. Open http://jimbob/jars/a_fileop_exe_2_win_block_t_cb.html or http://jimbob/jars/a_fileop_exe_3_win_block_t_cb.html (Both tests use blocking as true) 2. Click OK button from confirmation dialog RESULT: Crash EXPECTED RESULT: Notepad.exe is launched. Exiting notepad results in a confirm dialog appearing and text displays from the browser window. Incident ID: 25702613 Call Stack: (Signature = 0x10101010 df117123) 0x10101010 js3250.dll + 0x1b6ec (0x60b1b6ec) js3250.dll + 0x17a26 (0x60b17a26) js3250.dll + 0x44ad (0x60b044ad) js3250.dll + 0x446d (0x60b0446d) js3250.dll + 0x43ef (0x60b043ef) xpinstal.dll + 0xbd1f (0x60a3bd1f) Install script for xpi called by a_fileop_exe_2_win_block_t_cb.html: var xpiSrc = "notepad.exe"; initInstall("Acceptance: addFile for a_fileop_exe_2_win_block_t", "acceptance", "1.1.1.1", 0); g = getFolder("Program"); err1 = addFile("fileop_exe_2_win_block_t", "3.3.3.3", xpiSrc, g, xpiSrc, true); logComment("addFile " + xpiSrc + " returns = " + err1); performOrCancel(); logComment(""); //line space initInstall("Acceptance: a_fileop_exe_2_win_block_t", "acceptance", "1.1.1.1", 0); fileSource = getFolder(g, xpiSrc); // append file to path err2 = File.execute(fileSource, true); logComment("Path to notepad.exe = " + fileSource); logComment("File.execute returns = " + err2); performOrCancel(); err3 = confirm("This confirm dialog should appear after executable is dismissed."); logComment("confirm returns = " + err3); function performOrCancel() { if (0 == getLastError()) performInstall(); else cancelInstall(); }
Further investigation reveals that the problem is not related to blocking. It appears that the problem is related to File.execute(). The tests http://jimbob/jars/a_fileop_exe_2_win_block_f_cb.html and http://jimbob/jars/a_fileop_exe_3_win_block_f_cb.html have blocking set to false. Removing "blocking" from Summary. Either way, the crash still occurs. :-(
Summary: Crash occurs with callback and File.execute blocking → Crash occurs with callback and File.execute
Jimmy, does Install.execute() also crash? Could be something in the new nsIProcess stuff. Is this bug XP? nominating nsbeta1 for investigation
Keywords: nsbeta1
This is not a problem with Install.execute(). The crash only appears for File.execute(), and I have no idea why. There appears to be no problem on Linux, too.
Also, doesn't this crash only exist with the callback? Running File.Execute by itself doesn't crash anything as I understand it.
Correct. The crash only occurs with the callback. Further investigation reveals that the crash occurs with other methods of the File object (ie File.copy, File.isDirectory, etc.)
Well. It looks like this problem is unrelated to the Execute (blocking) changes. I suspected that there might be a problem with the (small) modification I had to make to one of the fileOp constructors but File.Copy and File.IsDirectory don't use the same constructor that File.Execute uses so this I don't think this is the problem. Running the test case in the first comment of this bug ended up crashing in the inline funtion in the struct Link in xpcom/ds/nsVoidBTree.h. The line: PRBool operator!=(const Link& aOther) const { return !aOther.operator==(*this); } The problem seems to be that *this empty. Crasho grosso.
But wait, there's more! Looking back, I am able to reproduce the problem with 6.0 RTM and 6.01. Well, I actually reproduced the crash with File.copy, so that sort of shows us that it's definitely not specific to File.execute. Now what?
Changing Summary to be more accurate. A more easier test to use may be http://jimbob/jars/f_fileop_copy_cb.html since we don't have to deal with some of the differences with File.execute().
Summary: Crash occurs with callback and File.execute → Crash occurs with callback and File object
We probably need to fix this one, seems like the File object is doing something nasty that could come up in other cases. I Don't understand the use of the word "callback". To my knowledge the only thing we use that term for in XPInstall is the InstallTrigger.install() 2nd argument, and I don't see that involved here (though there are other crashes with that callback function).
Whiteboard: [xpibug][smartupdate]
Keywords: nsbeta1nsbeta1+
Whiteboard: [xpibug][smartupdate] → [xpibug]
After some investigation I don't think this is related to File Ops but is instead caused by a double call to performInstall. In the test case presented an init/perform pair is called to add a file to the system for later copying using File.copy. Then another init/perform pair is used to test File.copy. When I modified the test to have only one init/perform pair the "callback" occured correctly. QA: Could you modify your test so that there is only one init/perform pair. You'll have to copy a file that already exists of course but it's really just for this verification. Just FYI, when the crash does occur it's in the javascript interpreter code, not in any xpinstall code.
One would think... I cannot reproduce the crash with the test described below. After talking with Don, I prepared a test where no File objects were called. A new test can be found at http://jimbob/jars/f_double_initinstall_cb.html This page calls f_double_initinstall.xpi whose install script is as follows: var xpiSrc = "xpinstall.txt"; initInstall("Functional: f_double_initinstall Part 1", "f_double_initinstall_1", "1.1.1.1", 0); f = getFolder("Program", "f_double_initinstall_1"); err1 = addFile("", xpiSrc, f, "f_double_initinstall_1.txt"); logComment("addFile returns = " + err1); performOrCancel(); initInstall("Functional: f_double_initinstall Part 2", "f_double_initinstall_2", "1.1.1.2", 0); g = getFolder("Program", "f_double_initinstall_2"); err2 = addFile("", xpiSrc, g, "f_double_initinstall_2.txt"); logComment("addFile returns = " + err2); performOrCancel(); function performOrCancel() { if (0 == getLastError()) performInstall(); // do the install here. else cancelInstall(); }
Calling one pair of initInstall() and performInstall() does not crash if a fileop is used. Another new test at http://jimbob/jars/f_fileop_dircreate_cb.html supports this claim. Calling two pairs of initInstall() and performInstall() does not crash if a fileop is used in the first part and addFile (no fileop) is used in the second pairing. This test is http://jimbob/jars/f_fileop_install_cb.html.
I'm confused by "below". I assume you meant "below" you when entering the comment, i.e. now "above" your comment, and not the test case you then entered that is, in fact, below your comment. Or did you gratuitously type in a non-crashing example? :-) So is it safe to change the summary to "crash with more than one init-perform block in same install"? We fixed one like that once, drat.
Very strange, requires two init/perform blocks AND a File object method?
No wait! It's even weirder than that. The File op has to be in the second init/perform block to cause the crash.
Yes, Don clarifies the situation. It appears that the file operation has to called from the second pair of initInstall/performInstall to reproduce the crash. Somewhat reluctantly :-), I did provide a couple of non-crashing examples. This seemed necessary in order to prove that this crash is specific to what Don and I concluded. Adding keyword crash. Updated Summary to reflect the new details of the crash. Hopefully, it is more clear.
Keywords: crash
Summary: Crash occurs with callback and File object → Crash with JS callback and File obcall from second initInstall/performInstallject
Target Milestone: --- → mozilla0.9.1
Priority: -- → P1
Adding Brendan and Phil to the CC list because this is a REALLY WEIRD crash down deep in jsinterp.c I've traced the crash to a line in jsinterp.c in the js_Interpret function. It's the case JSOP_GETPROP: case It seems to be that in the call to OBJ_GET_PROPERTY, the obj->map->ops->getProperty value is bogus. I'll create an attachment that has the stack trace (although that's not much help) and my debugger information that shows the value of obj->map->ops->getProperty
There's nothing weird about this -- you're probably passing a dead (garbage collected, maybe even recycled) object reference at 0x00ef2408 to the JS API. Don't do that. Where did that second argument from JS_EvaluateScript come from? Perhaps you need to use a GC root (JS_AddNamedRoot, JS_RemoveRoot)? Cc'ing jband, because I dimly recall he (and I) have seen this sort of thing from xpinstall before. /be
I'd make that property JSPROP_PERMANENT too. It's still scary to have a global C variable weakly referring to a JS GC thing that's not rooted, even if that thing is held by a property. What if the object in which the property is defined (however readonly and permanent that property may be) is itself cut off from the live object graph, and collected? /be
When you say "that property" you're referring to the line : JS_DefineProperty (jscontext, installObject, "File", OBJECT_TO_JSVAL(gFileOpObject), JS_PropertyStub, JS_PropertyStub, JS_PROP_READONLY); right? If so, I just need to have JSPROP_READONLY | JSPROP_PERMANENT for the last argument. Is that correct? Thanks Brendan.
Yes, that's the property I meant -- but what keeps the object in which that property is permanently read-only around and alive? /be
The install object is our global object -- it shouldn't be able to go away, should it? Are there any bad effects of leaving INSTALL_FILEOP in the Install_slots enum? The explicit slot numbers are part of the JSPropertySpec so it's probably OK, but I don't know if there's anything magic about sequential numbers. Also doesn't seem like there's much point in just removing the one line from the GetInstallProperty() case statement, why not remove the entire case? I don't know that you want my r= on this, I'm part of how we got in this mess in the first place; I demonstrably don't know what I'm doing here.
Nit: the JS_SetGlobalObject call added in the patch is not needed, because JS_InitStandardClasses(cx, obj) sets cx->globalObject to obj if cx->globalObject is null. It looks like you're ok once the cx->globalObject root refers to installObject, in which the permanent, readonly File property is bound to gFileOpObject. But, what if the GC could race with all the code in InitXPInstallObjects? I don't see any calls to JS_BeginRequest and JS_EndRequest in xpinstall/src/*.cpp. Does this code run only on the main thread, or does it run on its own thread? /be
dveditz: nothing magic about tinyids, except that they're negative and scarce, so I agree: the patch should be revised to remove INSTALL_FILEOP and its uses. I hope the JS API docs on JS_BeginRequest and JS_EndRequest are better now (Rob and I haven't hacked lately, but we made some passes to improve these lie-filled leftovers perpetrated by a techwriter back in the netscape.com JSRef days, and we may have fixed up the Request API docs). /be
- INSTALL_FILEOP = -10, INSTALL_INSTALLED_FILES = -11 I would make the tinyid domain here dense, just to help compilers optimize switch statements and avoid creeping sparseness. Any hope for JS_BeginRequest and JS_EndRequest around maximal non-blocking API and native code sequences? /be
I experimented with adding JS_BeginRequest(cx) in nsSoftwareUpdateRun.cpp right before the call to JS_EvaluateScript (~line 430) and JS_EndRequest(cx) right before the call to JS_DestroyContextMaybeGC(cx) (~line 473 also in nsSoftwareUpdateRun.cpp) Strange results. Mozilla would hang with no stack trace available somewhere after the call to JS_BeginRequest. Should the calls go somewhere else?
No stack backtrace? What debugger, and what are other threads doing? /be
Sorry. Not stack trace available at the time of the "crash" but I generated one using the Windows debugger. Attaching.
So the key phrase in my last comment about JS_BeginRequest was "non-blocking". If you find the other threads' stacks, I bet there's a deadlock. You need to call JS_SuspendRequest before calling out to a blocking or lock-acquiring method from your native JS methods, and call JS_ResumeRequest after -- if you have any blocking/lock-taking callouts. Do you? /be
Blocking, schmockling. Ok,I think I found where we're blocking. I put the Suspend/Resume in InstallStartInstall in nsJSInstall.cpp. That seemed to do it. Posting new diff.
Ok, one more patch. I also added the Suspend/Resume code around Finalize because a lot is done in that phase. Adding patch. Brendan, could you please review and if satisfactory give it your sr=?
r=sgehani
Fix checked in.
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Depends on: 75458
I'm pretty sure this was broken when danm checked in his change to nsIPrompt from nsICommonDialogs. We were using the nsICommonDialog service for both our alert and confirm dialogs. This was replaced with the nsIPrompt service on April 8, 2001. I'll ping danm about it. FYI, it's cvs rev 1.161 of nsInstall.cpp.
Never mind my last comment. I got to this bug via another bug and it's the wrong one. That comment should have gone in 62167.
Build: 2001-05-01-06-trunk(WIN) Looks good now. Marking Verified.
Status: RESOLVED → VERIFIED
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: