<a class="header-button" href="https://bugzilla.mozilla.org/home" title="Go to home page"> Bugzilla

Comment 5

•

24 years ago

Attached file The other two Talkback reports — Details

Comment 6

•

24 years ago

Here are extracts from all three Talkback reports, from the lower halves of the stack: SharedStub nsDocLoaderImpl::FireOnStateChange nsDocLoaderImpl::doStopURLLoad nsDocLoaderImpl::OnStopRequest nsLoadGroup::RemoveRequest nsHttpChannel::OnStopRequest nsOnStopRequestEvent::HandleEvent PL_HandleEvent SharedStub nsDocLoaderImpl::FireOnStateChange nsDocLoaderImpl::doStopURLLoad nsDocLoaderImpl::OnStopRequest nsLoadGroup::RemoveRequest nsHttpChannel::OnStopRequest nsOnStopRequestEvent::HandleEvent SharedStub nsStreamXferOp::OnStopRequest nsStreamListenerTee::OnStopRequest nsHttpChannel::OnStopRequest nsOnStopRequestEvent::HandleEvent PL_HandleEvent

Keywords: crash

Comment 7

•

24 years ago

The reporter says the crash does not occur if the browser windows are open; only if they are all closed. I would guess that has to be checked for by the Networking and DocLoader code at the bottom of the stack traces. I don't think this has anything to do with JS Engine. Let me reassign this to the Installer component to see if they know whether this bug should belong to DocShell, Networking, or themselves -

Assignee: rogerl → ssu

Component: Javascript Engine → Installer

QA Contact: pschwartau → gemal

Comment 8

•

24 years ago

FWIW I've been able to reproduce this in a debug build, although with a stack trace different from any of the previous. All three cases, and in my case, it appears the stack has been corrupted. It appears that what ever code is getting triggered by this event is trashing the stack and then as the program returns up the stack it's crashing at rand spots, probably depending on what part of the stack gets trashed. I'll attach my stack trace. It has a bit more information than what TalkBack has.

Comment 9

•

24 years ago

Attached file Stack trace from my debug build — Details

Henrik Gemal

Comment 10

•

24 years ago

since this happens with downloading all files (I just tried .exe files) this is *not* installer. btw also crash on Win2k perhaps xp apps? Correct me if I'm wrong...

Assignee: ssu → pchen

Component: Installer → XP Apps

QA Contact: gemal → sairuh

sairuh (rarely reading bugmail)

Comment 11

•

24 years ago

This seems to be a general download issue, not an installation issue. I'm able to consistently on any file I ftp. Also I just got another crash, with a slightly different stack. This occured while another thread was in the middle of GlobalWindowImpl::Cleanup call. This last crash occured when I clicked Save Link As, instead of just clicking on the file to download it. NTDLL! 77f827e8() KERNEL32! 77e86a3d() _PR_WaitCondVar(PRThread * 0x03930508, PRCondVar * 0x00cfaa58, PRLock * 0x00cfc5a0, unsigned int 0xffffffff) line 185 + 23 bytes PR_WaitCondVar(PRCondVar * 0x00cfaa58, unsigned int 0xffffffff) line 532 + 23 bytes nsThreadPool::GetRequest(nsIThread * 0x0391d800) line 633 + 15 bytes nsThreadPoolRunnable::Run(nsThreadPoolRunnable * const 0x0391d7b8) line 865 + 27 bytes nsThread::Main(void * 0x0391d800) line 105 + 26 bytes _PR_NativeRunThread(void * 0x03930508) line 399 + 13 bytes _threadstartex(void * 0x039caab8) line 212 + 13 bytes KERNEL32! 77e8758a()

Comment 12

•

24 years ago

benc/mscott, should this belong in networking or xp apps?

Comment 13

•

24 years ago

A couple more observations. I've seen two distinct behaviors from similar operations. I think both are a manifistation of the same problem, though. Specifically I seem one type of crash when I just click on a link to download a file and another type of crash when I right click and choose "Save Link As". The Save Link As crash seems to be more telling. What I've found is that the Global Window is getting destroyed twice and/or things accessing the global window during it's destruction. Also I noticed that if I clicked on a file, it displays a dialog asking me what to do, run it, save, etc. If I close all the windows while this dialog remained opened and then chose an option, everything just shutdown. In my opinion it's realistic to expect a user to kick off a long download and might just close the windows before setting this process off. It appears apparently that the application thinks that since this last window is closing it's time to shutdown. So it would appear that the problem may rest with this shutdown logic. I suspect in both cases the closing of the last visible window is trigging some of the shutdown code, at least triggering the global window to be destroyed. There should be a way for these windows to disappear without actually shutting the process down until everything is finished.

Brendan Eich [:brendan]

Comment 14

•

24 years ago

Are we nesting an event loop at a bad spot? Cc'ing jst and danm, who love these kinds of bugs! /be

Dan M

Comment 15

•

24 years ago

Better than a thorough beating, anyway. It's always good to think of event queues when things go wrong, but in this case it can't be: none are pushed during ftp download. I can't reproduce the problem on Win2k. I can't even Save Link As... (I get a JS error ("contextMenu.saveLink is not a function")). But doing it the old fashioned way, just clicking on an ftp file, on those occasions when the download doesn't stall, I have no problems. (But man does UI responsiveness take a stiff dive.) Tough for me to help debug it, then.

Comment 16

•

24 years ago

Something the original reporter didn't mention, and I meant to mention in my subsequent investigation, is the fact that you have to click that little check box that says to leave window open after operation. If it's checked the window stays open and there isn't a problem. It has to be cleared so the window closes after the operation.

Assignee

Comment 17

•

24 years ago

OK. I'm going to claim that this is an XPConnect (and DOM) bug. I get the same crash stack as dbradley. I think the talkbacks reports are the same problem, but the debug builds simply catch the problem closer to the source. The problem is that the DOM system has called JS_DestroyContext while xpconnect still has pending calls using the JSContext. I can show this by snapshoting the stack of the JS_DestroyContext which happens while we are in XPCWrappedNative::CallMethod. That bit of code wraps the call to XPTC_InvokeByIndex with an auto object that does a JS_SuspendRequest before and JS_ResumeRequest after the call out to XPTC_InvokeByIndex. In this case the JS_DestoryContext call happens before XPTC_InvokeByIndex returns and our call to JS_ResumeRequest gets a garbage JSContext. Dealing with this locally won't help because we still have a nsXPCWrappedJS::CallMethod call pending on the stack - i.e. we are still running code in the JS engine on the JSContext that has been destroyed. The XPConnect and DOM systems have worked together at cross purposes here to break the expectations of the JS engine. I think that what we have to do is bite the bullet to bring still more of the DOM interaction with the JS engine into xpconnect itself. In this case we should have the DOM not call JS_DestroyContext itself, but instead it should inform xpconnect that it is done using the JSContext. And then xpconnect can be responsible for actually destoying the JSContext (either immediately or at its leisure). XPConnect has the info to manage this. The XPCCallContext chain for the current thread tracks the uses (or lack thereof) of the JSContext. So, we could add XPConnect::DoneWithJSContext(in JSContextPtr cx) (or whatever we want to call it) that will check the current thread's XPCCallContext chain for the given JSContext. If not found then the the JSContext will be immediately destroyed. If it is found then the *deepest* user of the JSContext in that chain will be marked to destroy the JSContext upon its own destruction. This just costs us one flag per XPCCallContext and a quick check in the XPCCallContext dtor. That dtor code should also ASSERT that the JSContext is not present in the threadjscontextstack. I believe that it would be an error for the JSContext to appear in that stack any deeper than the frame (optionally) pushed by the XPCCallContext. So a debug ASSERT should be sufficient. Does this seem like the right solution to others?

Assignee: pchen → dbradley

Component: XP Apps → XPConnect

QA Contact: sairuh → pschwartau

Assignee

Comment 18

•

24 years ago

Attached file stack trace showing call to JS_DestroyContext while context in use — Details

Assignee

Comment 19

•

24 years ago

Note that the stack I just posted shows that in this case the JSContext in use by the running JS code is not the same one being destroyed (of course). So, my arguments about the insuffientcy of a local fix might not be fully supported. Still, a more local fix (say to the use of AutoJSSuspendRequest to detect if the JSContext had been whacked while we were suspended) would require us to patch in a different JSContext into the current XPCCallContext (and probably the top frame of the threadjscontextstack). This would be *way* ugly. Letting XPConnect decide on the right time for the call to JS_DestroyContext has got to be the way to go.

Brendan Eich [:brendan]

Comment 20

•

24 years ago

jband: sounds right on the non-local solution. Even the classic codebase had to be careful not to destroy a JSContext (one per window) until it was sure no one was using it (http://lxr.mozilla.org/classic/source/lib/libmocha/lm_win.c#3310) /be

Johnny Stenback (:jst)

Comment 21

•

24 years ago

Sounds reasonable to me, calling into XPConnect in stead of calling JS_DestroyContext() should be straight forward in the DOM code, but we might wanto add a flag to the call that lets the DOM code indirectly destroy the context w/o forcing a GC (i.e. XPConnect would call JS_DestroyContextNoGC() if the flag is true). This is probably not an issue if the context destruction is delayed, but being able to destroy context w/o causing a GC is something that should be done (the current code doesn't do this) to speed up teardown of [i]frame-intesive pages.

Assignee

Comment 22

•

24 years ago

Attached patch lightly tested fix — Details — Splinter Review

Assignee

Comment 23

•

24 years ago

The attached patch makes it not crash for me. It does not address jst's 'no-gc' request. I had to add the kungFuDeathGrip to nsStreamXferOp.cpp because that object was releasing its reference to the observer before the call to the observer returned. This is, of course, a no-no. But I'm concerned that we might have other stuff getting tripped up in this case where the window is going away while we are still making JS calls on it. Note that I took out the bit where we were clearing the global object of the JSContext in nsJSEnvironment because doing this there would allow the window object to get gc'd and die before this all unwound. I'm out of my depth here as far as understanding the wider implications. This certainly needs more testing.

Brendan Eich [:brendan]

Comment 24

•

24 years ago

s/Destory/Destroy/g :-)

Assignee

Comment 25

•

24 years ago

Thanks Brendan. I'd already fixed the typos you may have noticed in the printfs. I have yet to see my fix cause a problem during my browsing. But that is hardly an exhautive test. I can't help thinking that there might be a better DOM fix to keep us from needing to defer the JSContext destruction in this case. We should also figure out if there is any code left that will barf on a context global with null private data. We're introducing a new (though rare) state here.

Comment 26

•

24 years ago

I applied the patch and ran with it under Windows and Linux. I haven't been able to reproduce the problem, nor have I seen any new problems crop up. I'll continue running with it. I've played with composer, chat, mail, the browser buster site, etc. Still not an exhaustive test, but looking pretty good. There is still another issue that I think needs to be addressed, but in another bug. That's the fact that if all windows are closed and the only thing left is the dialog that appears when you single click on a ftp file link, the browser process terminates instead of allow you to continue with the download process.

Status: NEW → ASSIGNED

benc

Comment 27

•

24 years ago

I saw an RFE that asked for a warning box to appear when you cut off a download in progress. (Can't find the bug though). Is one possible solution hooking up a new window to stave off the shutdown event?

Assignee

Comment 28

•

24 years ago

Attached patch patch that fixes typos and address's jst's NoGC request. — Details — Splinter Review

Assignee

Comment 29

•

24 years ago

Reassigning to me. Asking for r= and sr=. The above patch is ready to go as far as I'm concerned. Johnny says that he is going to look into all the places in the code that might be unhappy with a null context private. We decided that that effort does not block this bug since we only will only be creating that state in cases that would have likely crashed without this patch anyway.

Assignee: dbradley → jband

Status: ASSIGNED → NEW

Comment 30

•

24 years ago

r=dbradley

Johnny Stenback (:jst)

Comment 31

•

24 years ago

sr=jst

Assignee

Comment 32

•

24 years ago

fix checked in

Status: NEW → RESOLVED

Closed: 24 years ago

Resolution: --- → FIXED