Closed Bug 738778 Opened 12 years ago Closed 4 months ago

Uncaught exceptions during Thunderbird shutdown

Categories

(Thunderbird :: Mail Window Front End, defect)

x86
macOS
defect

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: Irving, Unassigned)

References

Details

(Whiteboard: Regression?)

Attachments

(3 files)

Every time I close my debug/trunk Thunderbird, I get a couple of Javascript exception messages followed by a pile of warnings, leak reports, and other signs of a non-clean shutdown. This smells to me like the sort of thing that could cause settings to not be saved at close time, so I think it deserves further investigation.

The first two messages are:

JavaScript error: , line 0: uncaught exception: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIControllers.removeController]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: chrome://chat/content/convbrowser.xml :: destroy :: line 243"  data: no]
JavaScript error: , line 0: uncaught exception: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIObserverService.removeObserver]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: chrome://messenger/content/search.xml ::  :: line 156"  data: no]


The full log (starting from when I hit "Quit" is attached.
I can see this too, on Linux and Win XP.
Can be observed nicely by leaving Error console open and closing the main TB window.
The two exceptions mentioned in the summary for this bug seem to be red herrings - after cleaning them up, TB is still not shutting down smoothly. The variation between my and mconley's logs make me suspect race conditions in the order our components shut down.
Assignee: nobody → irving
Status: NEW → ASSIGNED
Whiteboard: Regression?
I don't know if it is related but for several days now TB does not exit properly (it did before, just spewing those errors into the error console). It seems to crash and waits some time for the debugger (it is a debug build) and then exits. It happens 100% at each close (which I do 100x per day on this test build).

This is 'bt' from gdb on linux:
#0  0xb7553a8c in nanosleep () from /lib/libc.so.6
#1  0xb75538c4 in sleep () from /lib/libc.so.6
#2  0xb4aa0207 in ah_crap_handler (signum=6) at /var/SSD/TB-hg/mozilla/toolkit/xre/nsSigHandlers.cpp:87
#3  0xb4aa4ccc in nsProfileLock::FatalSignalHandler (signo=6, info=0xbfc4a86c, context=0xbfc4a8ec) at /var/SSD/TB-hg/tbird-bin/mozilla/toolkit/profile/nsProfileLock.cpp:190
#4  <signal handler called>
#5  0xb7776749 in raise () from /lib/libpthread.so.0
#6  0xb563299d in Observe (this=0xaf9884c0, aTopic=<optimized out>) at /var/SSD/TB-hg/mozilla/storage/src/mozStorageService.cpp:852
#7  mozilla::storage::Service::Observe (this=0xaf9884c0, aTopic=0xb5e5ec6e "xpcom-shutdown-threads") at /var/SSD/TB-hg/mozilla/storage/src/mozStorageService.cpp:819
#8  0xb5b89bac in nsObserverList::NotifyObservers (this=0xa5a2287c, aSubject=0x0, aTopic=0xb5e5ec6e "xpcom-shutdown-threads", someData=0x0)
    at /var/SSD/TB-hg/mozilla/xpcom/ds/nsObserverList.cpp:99
#9  0xb5b8a493 in NotifyObservers (someData=0x0, aTopic=0xb5e5ec6e "xpcom-shutdown-threads", aSubject=0x0, this=0xb2d1bb50) at /var/SSD/TB-hg/mozilla/xpcom/ds/nsObserverService.cpp:149
#10 nsObserverService::NotifyObservers (this=0xb2d1bb50, aSubject=0x0, aTopic=0xb5e5ec6e "xpcom-shutdown-threads", someData=0x0)
    at /var/SSD/TB-hg/mozilla/xpcom/ds/nsObserverService.cpp:138
#11 0xb5b7a60a in mozilla::ShutdownXPCOM (servMgr=0xb7230704) at /var/SSD/TB-hg/mozilla/xpcom/build/nsXPComInit.cpp:585
#12 0xb4a93bcf in ScopedXPCOMStartup::~ScopedXPCOMStartup (this=0xb725f10c, __in_chrg=<optimized out>) at /var/SSD/TB-hg/mozilla/toolkit/xre/nsAppRunner.cpp:1104
#13 0xb4a9b1eb in XREMain::XRE_main (this=0xbfc4ade4, argc=4, argv=0xbfc4c074, aAppData=0xb7201180) at /var/SSD/TB-hg/mozilla/toolkit/xre/nsAppRunner.cpp:3885
#14 0xb4a9b3d1 in XRE_main (argc=4, argv=0xbfc4c074, aAppData=0xb7201180, aFlags=0) at /var/SSD/TB-hg/mozilla/toolkit/xre/nsAppRunner.cpp:3939
#15 0x08049e32 in do_main (argv=0xbfc4c074, argc=4, exePath=0xbfc4afb0 "/var/SSD/TB-hg/tbird-bin/mozilla/dist/bin/") at /var/SSD/TB-hg/mail/app/nsMailApp.cpp:111
#16 main (argc=4, argv=0xbfc4c074) at /var/SSD/TB-hg/mail/app/nsMailApp.cpp:200
xref bug 758826 bug 758089
Both started about 2012-05-20
Comment 30 has the following regression range:

http://hg.mozilla.org/comm-central/pushloghtml?fromchange=d8c08657b365&tochange=b2e89794570a

http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=95437bcc43dc&tochange=0e2cc686b07b

It's possible it's all part of the same bug but just now manifesting with the indicated regression range, considering that I can consistently reproduce this with my regular profile but not with a brand new (unmodified) profile.
(In reply to IU from comment #5)
> Comment 30 has the following regression range:

Oops.  I meant comment 3.
(In reply to Irving Reid (:irving) from comment #0)

> JavaScript error: , line 0: uncaught exception: [Exception... "Component
> returned failure code: 0x80004005 (NS_ERROR_FAILURE)
> [nsIControllers.removeController]"  nsresult: "0x80004005
> (NS_ERROR_FAILURE)"  location: "JS frame ::
> chrome://chat/content/convbrowser.xml :: destroy :: line 243"  data: no]

I filed bug 761688 and attached a patch to fix this error.
Depends on: 761688
Depends on: 761983
I fixed the 2 JS error mentioned in comment 0 and the JS error:
[Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIObserverService.removeObserver]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: file:///Users/ireid/tbird/objdir-comm-central-default/mozilla/dist/DailyDebug.app/Contents/MacOS/components/steelApplication.js :: app_observe :: line 686"  data: no]
that is in both attachment 608849 [details] and attachment 609832 [details] is gone too (I think it was fixed by bug 750454).

I don't see any JS error any more during the shutdown of a trunk debug Thunderbird build, but the shutdown still isn't clean (still lots of warnings, errors, and a large shutdown leak).

Not sure what the next step is in cleaning up this mess.
The shutdown error I see in bug 758826 goes away if I uncheck the telemetry pref "submit performance data"
I also do not see any errors any more (I had 3).
Depends on: 764742
Assignee: irving → nobody
Status: ASSIGNED → NEW

Is this still seen?

Flags: needinfo?(ishikawa)

I don't see the particular errors mentioned by the original poster.

However, I do still see some issues.

The attached log is from heavily PATCHED debug version of TB.

Some messages make me really uncomfortable.

Actually, depending on the timing (race), the debug version hits an assert and fails. (thus we may not obtain the memory report, etc. cleanly in ASAN build, etc.

The case in point.
In the attached log, you see

[Parent 736666, Main Thread] WARNING: some msg dbs left open: '!m_dbCache.Length() || m_dbCache.Length() <= unremoved_db_from_cache_due_to_unavailable_XPCOM', file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/db/msgdb/src/nsMsgDatabase.cpp:84
m_dbCache.Length() = 1, unremoved_db_from_cache_due_to_unavailable_XPCOM=0

My local version is patched to avoid the assert (in debug version of TB) mentioned in
bug 1342858.
The cause of the bug is found in Bug 1677202.
XPCOM shutdown is not quire well ordered and thus file removal from the so called cache is not
honored when the filename, etc. cannot be resolved due to the earlier shutdown of XPCOM subsystem
responsible for that. And we hit assert in TB version.
The patch in bug 1677202 has not been entered in the tree.
See bug 1677202 comment 11 and a few comments before and after it.

This assert is hit randomly on try-comm-central, too. It is just that not many people test debug version and so do not realize the issue.

So there are some strange issues at shutdown time.

Flags: needinfo?(ishikawa)
Severity: normal → S3

WFM per comment 12.

But do we need bugs for anything mentioned in comment 12? (that don't have a bug already)

Status: NEW → RESOLVED
Closed: 4 months ago
Flags: needinfo?(ishikawa)
Flags: needinfo?(benc)
Resolution: --- → WORKSFORME

(In reply to Wayne Mery (:wsmwk) from comment #13)

WFM per comment 12.

But do we need bugs for anything mentioned in comment 12? (that don't have a bug already)

The bug I mention in comment 12 is
addressed, not fixed completely in bug 1342858, and the cause of the bug is found in Bug 1677202.
Fixing it completely would require some type of proper ordering of XPCOM subsystem shutdown at FF and TB shutdown.
To shutdown an XPCOM subsystem and destruct all the object handled by it properly may depend on
the functioning of other subsystems.
Such depended-upon subsystems cannot be destructed in advance.
Thus there is some kind of forced ordering.
However, the current code does not follow it. That is the problem.
And I suspect there is even a circular dependency which needs to be split. A long tedious work.

Flags: needinfo?(ishikawa)

I think we should close this one.
Bug 1677202 is nice and specific and we can get our teeth stuck into fixing it.
And if any concrete crash-upon-shutdown bugs pop up, they should each have their own report anyway.

One thing I'd really like to do in the new year is start having a concerted effort to fix the deluge of useless warnings and noise in the output, especially in debug builds. I feel crash-on-shutdown bugs (especially if they only happen in debug builds) kind of get diminished and ignored because of all the crap that gets dumped out every time you run.

Flags: needinfo?(benc)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: