Closed Bug 306959 Opened 19 years ago Closed 19 years ago

Crash on first startup after running downrev Firefox/Thunderbird version [@ nsXBLBinding::AllowScripts][@ PresShell::Destroy]

Categories

(Core :: General, defect)

x86
All
defect
Not set
critical

Tracking

()

VERIFIED FIXED

People

(Reporter: mcow, Assigned: mark)

References

Details

(Keywords: crash)

Crash Data

Attachments

(4 files)

TB 1.0+0830, Win2K

Every time I start my recent trunk build after having run 1.0.6, the program 
crashes before any window is shown.

On restarting the program, it appears to proceed normally.

Talkback is not started.  Talkback 1.6a1 is the only installed extension.
Defined accounts include four POP and one IMAP (only two POP are checked for 
mail on startup), three news accounts, RSS account with one feed that gets 
checked on startup.

Steps to reproduce:
1) Install 1.6a1 (ZIP distro); run, set as default mail client if necessary.
   Exit.
2) Install 1.0.6 (Windows Installer distro); run, do *not* set as default.
   Exit.
3) Start 1.6a1 -- get error.
4) Restart 1.6a1 -- runs normally.

Actual results:
3) I get this error:
  The instruction at "0x005759b4" referenced memory at "0x00000000".
  The memory could not be "read".   [OK to terminate, Cancel to debug]

Expected results:
  NO ERROR, no need to restart.
Same symptom seen running 1.6a1-0830 after 1.5b1-0904.
Symptom not seen running 1.5b1 after 1.0.6.
Scott, any chance you could look at this?  It's really making testing branch vs. 
trunk more tedious than ever.
Could this be related to the change for Bug #300800? We started to notice this
crash shortly after that went in. 

stack trace:

nsXBLBinding::AllowScripts() line 1232 + 15 bytes
nsXBLBinding::ExecuteDetachedHandler() line 794 + 8 bytes
ExecuteDetachedHandler(void * 0x0310dd28, void * 0x00000000) line 816
nsVoidArray::EnumerateForwards(int (void *, void *)* 0x0160bb90
ExecuteDetachedHandler(void *, void *), void * 0x00000000) line 648 + 21 bytes
nsBindingManager::ExecuteDetachedHandlers(nsBindingManager * const 0x030b6e80)
line 828
nsGlobalWindow::HandleDOMEvent(nsPresContext * 0x030d1610, nsEvent * 0x0012faa0,
nsIDOMEvent * * 0x0012fa68, unsigned int 7, nsEventStatus * 0x0012fac0) line 1492
DocumentViewerImpl::PageHide(DocumentViewerImpl * const 0x030cb850, int 1) line
1205 + 35 bytes
nsDocShell::FirePageHideNotification(nsDocShell * const 0x00b97528, int 1) line 881
nsDocShell::Destroy(nsDocShell * const 0x00b9753c) line 3400
nsXULWindow::Destroy(nsXULWindow * const 0x03007df8) line 511
nsWebShellWindow::Destroy(nsWebShellWindow * const 0x03007df8) line 848 + 9 bytes
nsAppShellService::Observe(nsAppShellService * const 0x03007c24, nsISupports *
0x003ec94c, const char * 0x00363b80, const unsigned short * 0x00000000) line 550
nsObserverService::NotifyObservers(nsObserverService * const 0x00c3a130,
nsISupports * 0x003ec94c, const char * 0x00363b80, const unsigned short *
0x00000000) line 235
NS_ShutdownXPCOM_P(nsIServiceManager * 0x003ec94c) line 776
ScopedXPCOMStartup::~ScopedXPCOMStartup() line 552 + 12 bytes
XRE_main(int 1, char * * 0x003e7da8, const nsXREAppData * 0x0042101c kAppData)
line 2349
main(int 1, char * * 0x003e7da8) line 62 + 18 bytes
Summary: Crash on first startup after running downrev version → Crash on first startup after running downrev version [crash @ nsXBLBinding::AllowScripts]
I backed out 300800 and the crash is still there. it must be something else.
I noticed something: after I've run a downrev version, if I then run the trunk 
version in one profile, it crashes; if I restart and select a different profile, 
it crashes again.
Severity: major → critical
Keywords: crash
Summary: Crash on first startup after running downrev version [crash @ nsXBLBinding::AllowScripts] → Crash on first startup after running downrev version crash [@ nsXBLBinding::AllowScripts]
I get this on linux going from thunderbird-MOZILLA_1_8_2005-09-11-11Z (branch)
to thunderbird-xft_2005-09-22-18Z (trunk). Crashes right after clicking 'Finish'
in the extension compat check dialog that appears on first run.
OS: Windows 2000 → All
cc'ing Benjamin. this crash is happening when toolkit restart's XPCOM in a trunk
build if you previously ran a 1.8 build.
Bug 311332 has more detailed info.

*** This bug has been marked as a duplicate of 311332 ***
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → DUPLICATE
Status: RESOLVED → VERIFIED
Depends on: 311332
So bug 311332 is fixed, but this is still crashing because we're getting
nsContentUtils::EventQueueService() while tearing down a presshell after layout
shutdown... and that's null, of course.
The basic problem is that nsWebShellWindow::Destroy happens off the
xpcom-shutdown notification, so we can be tearing down all the layout stuff for
the XUL in the window after the layout module has shut down.
Status: VERIFIED → REOPENED
Resolution: DUPLICATE → ---
Assignee: mscott → nobody
Status: REOPENED → NEW
Component: General → XP Miscellany
Product: Thunderbird → Core
QA Contact: general → brendan
We need to sort out our shutdown story.  Benjamin?  Thoughts?
Flags: blocking1.9a1+
Summary: Crash on first startup after running downrev version crash [@ nsXBLBinding::AllowScripts] → Crash on first startup after running downrev Thunderbird version crash [@ nsXBLBinding::AllowScripts][@ PresShell::Destroy]
To be precise, it's the hidden window destruction that dies here.
What seems to be the same thing also crashes Firefox when clicking 'cancel' in the automatic extension update dialog which comes up when starting it with some profile that has extensions that need updating. Again, no talkback.
Attached patch 306959-1a.diffSplinter Review
dunno if this is the correct fix, but I was annoyed by the crash so I did what looks like the obvious thing, and it works.
Perhaps what we should do is to have two stages of notification:

1)  Notify that xpcom shutdown is starting -- this should make people release things, break cycles, etc as needed.  All non-service objects from other modules have to be released at this stage.
2)  Notify that xpcom shutdown is continuing (or that service manager shutdown is imminent?) -- this should make people actually release any pointers to services that they hold.  The only things that should be happening at this point are release of services and possibly cleanup of module-internal data structures that do not reach out of the module.  Objects from other modules may not be accessed at this stage.

Or will that really just push the problem off a bit?  It seems to me that with clear definitions of what should be released at which stage we can eliminate the cyclical dependency here...
Attached file jwatt's ff trunk stack
I'm crashing intermitently with the attached stack in a home spun current trunk FF. If I should open another bug let me know.
No, that's this bug.
Summary: Crash on first startup after running downrev Thunderbird version crash [@ nsXBLBinding::AllowScripts][@ PresShell::Destroy] → Crash on first startup after running downrev Firefox/Thunderbird version crash [@ nsXBLBinding::AllowScripts][@ PresShell::Destroy]
*** Bug 321123 has been marked as a duplicate of this bug. ***
The event queue service is gone when destroying the presshell.  Kaboom.  This introduces a null check.
Attachment #207086 - Flags: superreview?(bzbarsky)
Attachment #207086 - Flags: review?(benjamin)
Comment on attachment 207086 [details] [diff] [review]
Check for null event queue service

I'm OK with this on one condition -- that we back this out once bsmedberg fixes shutdown to suck less.  Please add a nice comment here explaining why we're null-checking the return value of a function that's claiming to guarantee a non-null return, and file a followup bug on yourself to undo this change, dependent on bsmedberg's shutdown bug.
Attachment #207086 - Flags: superreview?(bzbarsky) → superreview+
Will do on checkin, pending bsmedberg's review.
Comment on attachment 207086 [details] [diff] [review]
Check for null event queue service

Is this really faster/better than NS_GetMainEventQ?
Attachment #207086 - Flags: review?(benjamin) → review+
Probably not; when I wrote this code I sorta used the actual documented (well, kinda) event queue APIs.  ;)
bz/bs, is this preferable, then?
Attachment #207429 - Flags: review?(bzbarsky)
Comment on attachment 207429 [details] [diff] [review]
With NS_GetMainEventQ

This is IMO preferable. Please use our standard name "nsresult rv" instead of "err".
Attachment #207429 - Flags: review+
I purposely used err instead of rv so as not to imply that it's nsPresShell::Destroy's return value, knowing all the while that someone would mention it during review.  I don't care either way, I can check in with rv.
We can also switch to NS_GetMainEventQ in PresShell::PostReflowEvent().
Comment on attachment 207429 [details] [diff] [review]
With NS_GetMainEventQ

Fine by me as long as you keep passing nsContentUtils::EventQueueService() as the second arg to NS_GetMainEventQ.  Oh, and while you're touching this code, this should be NS_GetThreadEventQ anyway.

UI wouldn't touch PostReflowEvent() for now; I'm hoping to make some major changes to how that works, and I'll fix it up then.
Attachment #207429 - Flags: review?(bzbarsky) → review+
Assignee: nobody → mark
Checked in on trunk, using NS_GetCurrentEventQ and passing the static service hint.
Status: NEW → RESOLVED
Closed: 19 years ago19 years ago
Resolution: --- → FIXED
Summary: Crash on first startup after running downrev Firefox/Thunderbird version crash [@ nsXBLBinding::AllowScripts][@ PresShell::Destroy] → Crash on first startup after running downrev Firefox/Thunderbird version [@ nsXBLBinding::AllowScripts][@ PresShell::Destroy]
V with TB 1.6a1-0105.  Thanks, Mark!
Status: RESOLVED → VERIFIED
Component: XP Miscellany → General
QA Contact: brendan → general
Crash Signature: [@ nsXBLBinding::AllowScripts] [@ PresShell::Destroy]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: