Closed Bug 305452 Opened 20 years ago Closed 20 years ago

Clicking on link crashed web browser [@ nsGlobalWindow::RunTimeout]

Categories

(Core :: DOM: Core & HTML, defect)

x86
Windows XP
defect
Not set
critical

Tracking

()

RESOLVED FIXED

People

(Reporter: compwiz312, Assigned: mrbkap)

References

()

Details

(4 keywords)

Crash Data

Attachments

(3 files, 4 obsolete files)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20050820 Firefox/1.6a1 Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20050820 Firefox/1.6a1 When visiting www.snapstream.com if you click on the products link the browser crashes every time. Reproducible: Always Steps to Reproduce: 1. Visit http://snapstream.com 2. Click on the Products link 3. Actual Results: Browser crashed
Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8b4) Gecko/20050821 Firefox/1.0+ Crashes on Mac/branch too when I follow the steps in comment 0 and then scroll using spacebar. My stack (TB8588585K) sucks because of bug 304842.
Keywords: crash
Branch: TB8596158Z @nsGlobalWindow::RunTimeout Trunk: TB8596209E @nsGlobalWindow::RunTimeout
This build WFM: 1.8b4_2005081608 And this one fails: 1.8b4_2005081615 In trunk it failed a little bit earlier: 1.9a1_2005081607
Stack: nsGlobalWindow::RunTimeout [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/dom/src/base/nsGlobalWindow.cpp, line 6043] nsGlobalWindow::ClearAllTimeouts [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/dom/src/base/nsGlobalWindow.cpp, line 6363] nsAppStartup::DestroyExitEvent [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/toolkit/components/startup/src/nsAppStartup.cpp, line 387] main [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/browser/app/nsBrowserApp.cpp, line 61] kernel32.dll + 0x16d4f (0x7c816d4f)
Assignee: nobody → general
Status: UNCONFIRMED → NEW
Component: General → DOM
Ever confirmed: true
Keywords: regression
Product: Firefox → Core
QA Contact: general → ian
Summary: Clicking on link crashed web browser → Clicking on link crashed web browser [@ nsGlobalWindow::RunTimeout]
Version: unspecified → Trunk
This regressed on 8/17 according to the Firefox15 branch Talkback topcrash table: http://talkback-public.mozilla.org/reports/firefox/FF15x/FF15x-topcrashers.html Here is a query to those crashes: http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=1&searchby=stacksig&match=contains&searchfor=nsGlobalWindow::RunTimeout&vendor=MozillaOrg&product=Firefox15&platform=All&buildid=&sdate=&stime=&edate=&etime=&sortby=bbid It looks like a recent checkin by mrbkap or jst might have caused this somewhere around the crash point: http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/dom/src/base/nsGlobalWindow.cpp&mark=848&rev=#848 It is currently the #1 topcrash on the branch (since the other 2-3 higher ranked crashes have been fixed). We should get this into beta.
Flags: blocking1.8b4?
Keywords: topcrash+
Assigning to jst and cc'ing mrbkap
Assignee: general → jst
Flags: blocking1.8b4? → blocking1.8b4+
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Flags: blocking1.8b5+
Assignee: jst → mrbkap
Status: REOPENED → NEW
After opening this testcase (based on the site) and then visiting another site, I crash with the latest nightly trunk build. If a minimised testcase is really needed, let me know, and I'll try to minimise further.
Attached patch easy fix (obsolete) — Splinter Review
This is the easy "null-check" version of this fix. The current question is whether we should do this or stop event handlers from running altogether (see below). The issue here is that we null out each of our timeouts' mTimer member in nsGlobalWindow::SuspendTimeouts. The page then has a mousemove event listener that creates a timeout. When you click on the link, we store the page in the bfcache (calling SuspendTimeouts). Then, before we show the new document, if the mouse moves, the event handler is called (even though the document is in the bfcache!) and creates the timeout, which (surprise, surprise) runs. We then end up in nsGlobalWindow::RunTimeout, which iterates through the timeout list, assuming non-null mTimer members and crash.
*** Bug 305958 has been marked as a duplicate of this bug. ***
I see the same bug on Linux build 20050821 when visiting: http://www.rft.com/appcommercial/pinpoint.htm TalkBack ID TB8714783G
This also fixes the bug, and it doesn't put the page script in a potentially weird state.
Attachment #193859 - Attachment is obsolete: true
Attachment #193882 - Flags: review?(bryner)
Status: NEW → ASSIGNED
Whiteboard: [needs review bryner]
Comment on attachment 193882 [details] [diff] [review] don't dispatch events if we're in the bfcache This seems to be missing the implementation of nsGlobalWindow::IsFrozen...
Attachment #193882 - Flags: review?(bryner) → review-
Attached patch updated to comments (obsolete) — Splinter Review
IsFrozen was already implemented in nsGlobalWindow.h as an inline function. jst suggested that I pull mIsFrozen (and thus the IsFrozen implementation) into nsPIDOMWindow, so that's what I've done.
Attachment #193882 - Attachment is obsolete: true
Attachment #194243 - Flags: review?(bryner)
Comment on attachment 194243 [details] [diff] [review] updated to comments Oh, oops. In that case, r=me if you put mIsFrozen with the other PRPackedBools. Otherwise they don't actually pack.
Attachment #194243 - Flags: review?(bryner) → review+
Attachment #194243 - Flags: superreview?(jst)
Again, I think this patch is wrong. We should NOT be blocking all events in the document. If we can detect events happening (which we can) then we should remove the document from bfcache when events happen. Otherwise we're introducing a pretty significant incompatibility in the DOM behavior depending on whether the page the user left was or was not stored in bfcache. I don't believe we want to be introducing such an incompatibility.
(In reply to comment #16) > Again, I think this patch is wrong. We should NOT be blocking all events in the > document. If we can detect events happening (which we can) then we should > remove the document from bfcache when events happen. I think that would have the side-effect of removing documents from the cache just due to unprocessed native events when the swap the content viewers. That's the sort of thing that was causing us to his PresShell::RetargetEventToParent problems.
> I think that would have the side-effect of removing documents from the cache > just due to unprocessed native events when the swap the content viewers. You mean we dispatch mouse events to a DOM after we've removed that content viewer from the window? That sounds like a bug in itself. As for thise bug, jst and I talked about it on #content a few days ago and were basically agreed that the right fix here would be to add the new timeout but not to fire it if the window is frozen (and then start up the timer if the window gets unfrozen).
(In reply to comment #18) > You mean we dispatch mouse events to a DOM after we've removed that content > viewer from the window? That sounds like a bug in itself. We get the events coming into PresShell::HandleEvent, at least. I'd have to check what happens to them after that.
I think it's reasonable to block events on the presentation level that are coming in through the presentation if we're frozen. Events being dispatched via DOM should not be blocked, however.
Comment on attachment 194243 [details] [diff] [review] updated to comments bz and I talked this over on IRC and he's convinced me to go the "hard" route and respect the setTimeout call (i.e., stick it in the queue of timeouts, but not run it until the page comes out of the bfcache).
Attachment #194243 - Flags: superreview?(jst)
Whiteboard: [needs review bryner]
This patch adds the new interval to the queue so that when we're pulled out of the bfcache, we'll run them as well. I think the math here should work out (both the timeout refcounting and the interval juggling). I'll attach a diff -w for review in a minute.
Attachment #194243 - Attachment is obsolete: true
Attached patch diff -w (obsolete) — Splinter Review
Attachment #194350 - Flags: review?(bryner)
Attachment #194350 - Flags: review?(bryner) → review+
Attachment #194350 - Flags: superreview?(jst)
This fixes a small refcounting botch (when one of the calls in nsGlobalWindow::SetTimeoutOrInterval fails, calling timeout->Release(), timeout->mRefCnt should be at least 1) that jst noticed. I'm carrying forward bryner's r.
Attachment #194350 - Attachment is obsolete: true
Attachment #194387 - Flags: superreview?(jst)
Attachment #194387 - Flags: review+
Attachment #194350 - Flags: superreview?(jst)
Comment on attachment 194387 [details] [diff] [review] fix refcounting problem + if (IsFrozen()) { + // If we're frozen, then we didn't really put this timeout into a timer's + // closure. Therefore our AddRef above was bogus. We release it now so that + // our refcount is correct, but timeout doesn't get deleted. + timeout->Release(scx); + } I think this now does the right thing, but it's more fragile than it could be (as manual refcounting commonly is). I think I'd rather see us release here unconditionally and add an additional AddRef() call in the case where we *do* create the timer, right after the creation of the timer happened and we know it succeeded. - In nsGlobalWindow::ResumeTimeouts(): for (nsTimeout *t = mTimeouts; t; t = t->mNext) { PRInt32 interval = (PRInt32)PR_MAX(t->mWhen, DOM_MIN_TIMEOUT_VALUE); t->mWhen = now + nsInt64(t->mWhen); + // Add a reference for the new timer's closure. + t->AddRef(); + t->mTimer = do_CreateInstance("@mozilla.org/timer;1"); NS_ENSURE_TRUE(t->mTimer, NS_ERROR_OUT_OF_MEMORY); This will leak t if we fail to create the timer, only AddRef() when you know you'll be releasing later on. The rv = t->mTimer->InitWithFuncCallback(TimerCallback, t, interval, nsITimer::TYPE_ONE_SHOT); NS_ENSURE_SUCCESS(rv, rv); Seems like all other code that initializes a timer cancels and nulls out the timer on failure to initialize it, and releases the timeout as well (or you could simply move the above AddRef() call to after all this succeeded. This code really needs nsRefPtr<>'s, wanna file a followup bug on that? sr=jst with that fixed.
Attachment #194387 - Flags: superreview?(jst) → superreview+
Fix checked into trunk. I will check it into the branch tomorrow.
Status: ASSIGNED → RESOLVED
Closed: 20 years ago20 years ago
Resolution: --- → FIXED
Whiteboard: [needs branch checkin]
I filed bug 306598 on using nsRefPtr<>.
Comment on attachment 194387 [details] [diff] [review] fix refcounting problem I think we want this fix on the branch.
Attachment #194387 - Flags: approval1.8b4?
This is the #1 topcrash on the branch, so we need the a= to get this checked in there.
Attachment #194387 - Flags: approval1.8b4? → approval1.8b4+
*** Bug 306749 has been marked as a duplicate of this bug. ***
I just checked this into MOZILLA_1_8_BRANCH.
Keywords: fixed1.8
Whiteboard: [needs branch checkin]
verified Firefox 1.4 (Win, Lin, Mac) builds from 2005-09-08
Keywords: fixed1.8verified1.8
Crash Signature: [@ nsGlobalWindow::RunTimeout]
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: