Closed Bug 1265242 Opened 8 years ago Closed 6 years ago

Thunderbird 45.0 hangs on start if user_pref("extensions.lastAppVersion", "45.0"). Thunderbird.exe task running but no UI appears. starts if user_pref("extensions.lastAppVersion", "45.0") 38.7.2 still starts

Categories

(Thunderbird :: Untriaged, defect)

45 Branch
x86_64
Windows 7
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: mike, Unassigned)

References

()

Details

(Keywords: hang)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.110 Safari/537.36 Vivaldi/1.0.435.42

Steps to reproduce:

(I was asked to fill a bug report from this discussion https://support.mozilla.org/en-US/questions/1118670)

I was running Thunderbird 38, then yesterday it automatically updated to version 45.0.
After installing the version worked fine, but this morning when I started my computer the mail client refused to start, I had a "thunderbird.exe" in the task manager but it staid there using 45 megs of memory - staying stable.

After asking on the support group it was suggested to kill the process and restart with SHIFT to enable safe mode, that did not change anything.

The suggestion to reinstall TB on top, did work, but after quitting TB and restarting it the same thing happened, and this time reinstalling did not work.

I found the link to the 38.7.2 version installer, and it's currently running nicely, (except that I lost all my plugins in the downgrade process)

Regarding the setup, the machine is a Windows 7 Ultimate, Core i7, 32 gigs of RAM, SSD, nvidia graphics card, ClassiShell installed, nothing else specific I can think of.


Actual results:

TB 45 only starts once and then is broken.


Expected results:

TB 45 should have been running fine as many time as quitting/restarting as possible, as the previous versions have done in the last many many years on the same machine.
OS: Unspecified → Windows 7
Hardware: Unspecified → x86_64
I'm very surprised you were asked to file a bug. This does look like a support problem and we have no way to reproduce or fix what is going on on your particular machine.

Wayne, what do you plan to do here?
Flags: needinfo?(vseerror)
> Wayne, what do you plan to do here?

I didn't (and still don't) have a strong feeling for the value of this issue. But I had no time to spend on it this weekend and I had experienced an update issue myself, so I felt it was worth elevating to potentially get more eyes. 

Mickael, which AV software are you running?  Do you has have starting problems with 45 if you have AV disabled or have Windows started in safe mode?
Flags: needinfo?(vseerror) → needinfo?(mike)
I'm using Avast, but that does not seem to be the issue there, disabling it does not change anything.

But... restarting Windows in Safe Mode actually has an impact. (Not sure what exactly, perhaps the display driver reverting to standard VGA, or using one screen instead of the three I normally use.)
Flags: needinfo?(mike)
Do you have hardware acceleration turned on? Tools > Options, Advanced, General, last item. Try turning it off if not already turned off.
It was already turned off.
Is there some log files stored somewhere that I could consult to see where it gets stuck? Like some kind of "ultra verbose" flag?
You could download a daily debug build if you want to experiment. There are of course also Windows tools that let you track down problems. Anything in the Windows event log? Since we have users on Windows 7 and this works for you in safe mode, it seems to be a driver/hardware related problem and this is not the right place to discuss these issues (as I said in comment #1). However, we're always happy to help, if we can, since this can also be educational ;-)
And surely you have looked in the task manager that not two processes of Thunderbird are running (just guessing).
The event log does not seem to say anything, and of course the normal build does not have symbols so except seing calls to xul.dll and nss3.dll without any name does not help.

Where can I find one of these debug builds?

I do have VS2013 and VS2015 installed, so with a debug build I should be able to give more information.

It may be a driver/hardware related problem, but that would be specific to the version 45.0, all the previous versions released in the 5+ last years have been working without any problem.
Sorry, I was wrong. Debug builds used to be available for Firefox, but that has stopped now. I can't see any debug builds for Thunderbird. Compare:
http://ftp.mozilla.org/pub/firefox/nightly/2015/08/ to
http://ftp.mozilla.org/pub/thunderbird/nightly/2015/08/
To continue my thoughts from the support topic.

Reinstall Thunderbird.
Take a copy of the prefs.js file from the profile
Open Thunderbird.

Is there any difference in the prefs file once you can not open Thunderbird?
Matt: You are onto something.

I copied the prefs.js file from version 38.7.2, then the version while version 45.0 was running, then the version when it stopped.

I found out that it seems to be this one single line that makes it stop running:

--> user_pref("extensions.lastAppVersion", "45.0");
if I replace it with
--> user_pref("extensions.lastAppVersion", "38.7.2");

the program starts just fine.

So I experimented... if I put "45.1", "45,0", "44", "Whatever", "45.", "45", "", ... it all works just fine.
If I put the exact "45.0" string, it gets stuck.
Not sure if that can help, but I attached to process when TB was stuck, added 'http://symbols.mozilla.org' in the symbol server lists, and here is the callstack for the main thread:

------------------------
ntdll.dll!_NtWaitForSingleObject@12()
ntdll.dll!_RtlpWaitOnCriticalSection@8()
ntdll.dll!_RtlEnterCriticalSection@4()
ntdll.dll!_LdrpLoadDll@28()
ntdll.dll!_LdrLoadDll@16()
6f91d8ee()
[Frames below may be incorrect and/or missing]	
mozglue.dll!`anonymous namespace'::patched_LdrLoadDll(wchar_t * filePath, unsigned long * flags, _UNICODE_STRING * moduleFileName, void * * handle) Line 713
KernelBase.dll!_LoadLibraryExW@12()
kernel32.dll!_LoadLibraryW@4()
gdi32.dll!_IcmInitialize@0()
gdi32.dll!_IcmInitIcmInfo@8()
gdi32.dll!_GetICMProfileW@12()
xul.dll!gfxWindowsPlatform::GetPlatformCMSOutputProfile(void * & mem, unsigned int & mem_size) Line 1220
xul.dll!gfxPlatform::GetCMSOutputProfileData(void * & mem, unsigned int & size) Line 1613
xul.dll!gfxPlatform::CreateCMSOutputProfile() Line 1634
xul.dll!gfxPlatform::Init() Line 620
xul.dll!gfxPlatform::GetPlatform() Line 460
xul.dll!mozilla::widget::GfxInfo::GetD2DEnabled(bool * aEnabled) Line 49
xul.dll!NS_InvokeByIndex(nsISupports * that, unsigned int methodIndex, unsigned int paramCount, nsXPTCVariant * params) Line 71
xul.dll!XPCWrappedNative::CallMethod(XPCCallContext & ccx, XPCWrappedNative::CallMode mode) Line 1381
xul.dll!XPC_WN_GetterSetter(JSContext * cx, unsigned int argc, JS::Value * vp) Line 1151
xul.dll!js::Invoke(JSContext * cx, const JS::CallArgs & args, js::MaybeConstruct construct) Line 444
xul.dll!js::Invoke(JSContext * cx, const JS::Value & thisv, const JS::Value & fval, unsigned int argc, const JS::Value * argv, JS::MutableHandle<JS::Value> rval) Line 496
xul.dll!CallGetter(JSContext * cx, JS::Handle<JSObject *> obj, JS::Handle<JS::Value> receiver, JS::Handle<js::Shape *> shape, JS::MutableHandle<JS::Value> vp) Line 1667
xul.dll!js::NativeGetProperty(JSContext * cx, JS::Handle<js::NativeObject *> obj, JS::Handle<JS::Value> receiver, JS::Handle<jsid> id, JS::MutableHandle<JS::Value> vp) Line 1968
xul.dll!js::GetProperty(JSContext * cx, JS::Handle<JSObject *> obj, JS::Handle<JSObject *> receiver, JS::Handle<jsid> id, JS::MutableHandle<JS::Value> vp) Line 831
xul.dll!Interpret(JSContext * cx, js::RunState & state) Line 2608
xul.dll!js::RunScript(JSContext * cx, js::RunState & state) Line 391
xul.dll!js::Invoke(JSContext * cx, const JS::CallArgs & args, js::MaybeConstruct construct) Line 465
xul.dll!js::Invoke(JSContext * cx, const JS::Value & thisv, const JS::Value & fval, unsigned int argc, const JS::Value * argv, JS::MutableHandle<JS::Value> rval) Line 496
xul.dll!CallGetter(JSContext * cx, JS::Handle<JSObject *> obj, JS::Handle<JS::Value> receiver, JS::Handle<js::Shape *> shape, JS::MutableHandle<JS::Value> vp) Line 1667
xul.dll!js::NativeGetProperty(JSContext * cx, JS::Handle<js::NativeObject *> obj, JS::Handle<JS::Value> receiver, JS::Handle<jsid> id, JS::MutableHandle<JS::Value> vp) Line 1968
xul.dll!js::DirectProxyHandler::get(JSContext * cx, JS::Handle<JSObject *> proxy, JS::Handle<JS::Value> receiver, JS::Handle<jsid> id, JS::MutableHandle<JS::Value> vp) Line 232
xul.dll!js::CrossCompartmentWrapper::get(JSContext * cx, JS::Handle<JSObject *> wrapper, JS::Handle<JS::Value> receiver, JS::Handle<jsid> id, JS::MutableHandle<JS::Value> vp) Line 165
xul.dll!js::Proxy::get(JSContext * cx, JS::Handle<JSObject *> proxy, JS::Handle<JS::Value> receiver_, JS::Handle<jsid> id, JS::MutableHandle<JS::Value> vp) Line 300
xul.dll!js::proxy_GetProperty(JSContext * cx, JS::Handle<JSObject *> obj, JS::Handle<JS::Value> receiver, JS::Handle<jsid> id, JS::MutableHandle<JS::Value> vp) Line 572
xul.dll!js::GetProperty(JSContext * cx, JS::Handle<JSObject *> obj, JS::Handle<JS::Value> receiver, js::PropertyName * name, JS::MutableHandle<JS::Value> vp) Line 823
xul.dll!js::GetProperty(JSContext * cx, JS::Handle<JS::Value> v, JS::Handle<js::PropertyName *> name, JS::MutableHandle<JS::Value> vp) Line 4015
xul.dll!GetPropertyOperation(JSContext * cx, js::InterpreterFrame * fp, JS::Handle<JSScript *> script, unsigned char * pc, JS::MutableHandle<JS::Value> lval, JS::MutableHandle<JS::Value> vp) Line 203
xul.dll!Interpret(JSContext * cx, js::RunState & state) Line 2487
xul.dll!js::RunScript(JSContext * cx, js::RunState & state) Line 391
xul.dll!js::Invoke(JSContext * cx, const JS::CallArgs & args, js::MaybeConstruct construct) Line 465
xul.dll!js::Invoke(JSContext * cx, const JS::Value & thisv, const JS::Value & fval, unsigned int argc, const JS::Value * argv, JS::MutableHandle<JS::Value> rval) Line 496
xul.dll!JS_CallFunctionValue(JSContext * cx, JS::Handle<JSObject *> obj, JS::Handle<JS::Value> fval, const JS::HandleValueArray & args, JS::MutableHandle<JS::Value> rval) Line 2790
xul.dll!nsXPCWrappedJSClass::CallMethod(nsXPCWrappedJS * wrapper, unsigned short methodIndex, const XPTMethodDescriptor * info_, nsXPTCMiniVariant * nativeParams) Line 1222
xul.dll!nsXPCWrappedJS::CallMethod(unsigned short methodIndex, const XPTMethodDescriptor * info, nsXPTCMiniVariant * params) Line 603
xul.dll!PrepareAndDispatch(nsXPTCStubBase * self, unsigned int methodIndex, unsigned int * args, unsigned int * stackBytesToPop) Line 85
xul.dll!SharedStub() Line 113
xul.dll!NS_CreateServicesFromCategory(const char * aCategory, nsISupports * aOrigin, const char * aObserverTopic) Line 820
xul.dll!nsXREDirProvider::DoStartup() Line 896
xul.dll!XREMain::XRE_mainRun() Line 4154
xul.dll!XREMain::XRE_main(int argc, char * * argv, const nsXREAppData * aAppData) Line 4382
xul.dll!XRE_main(int argc, char * * argv, const nsXREAppData * aAppData, unsigned int aFlags) Line 4484
thunderbird.exe!do_main(int argc, char * * argv, nsIFile * xreDirectory) Line 195
thunderbird.exe!NS_internal_main(int argc, char * * argv) Line 332
thunderbird.exe!wmain(int argc, wchar_t * * argv) Line 138
thunderbird.exe!__tmainCRTStartup() Line 255
kernel32.dll!@BaseThreadInitThunk@12()
ntdll.dll!___RtlUserThreadStart@8()
ntdll.dll!__RtlUserThreadStart@8()
------------------------

The list of threads give that:

------------------------
11912	ntdll.dll!_NtWaitForSingleObject@12()
7044	xul.dll!google_breakpad::ExceptionHandler::ExceptionHandlerThreadMain	xul.dll!google_breakpad::ExceptionHandler::ExceptionHandlerThreadMain	
1860	ntdll.dll!_TppWorkerThread@4()	ntdll.dll!_ZwWaitForWorkViaWorkerFactory@8	
11172	ntdll.dll!_TppWorkerThread@4()	ntdll.dll!_NtWaitForSingleObject@12	
10152	xul.dll!ThreadEntry	xul.dll!base::MessagePumpForIO::GetIOItem	
10392	msvcr120.dll!_threadstartex	xul.dll!nsNotifyAddrListener::Run	
9152	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
7704	ntdll.dll!_TppWorkerThread@4()	ntdll.dll!_NtWaitForSingleObject@12	
1236	ntdll.dll!_TppWaiterpThread@4()	ntdll.dll!_NtWaitForMultipleObjects@20	
11104	mswsock.dll!_SockAsyncThread@4()	6f91cd62	Above 
11812	msvcr120.dll!_threadstartex	mswsock.dll!_SockWaitForSingleObject@16	
4604	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
6384	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
4768	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
4832	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
8004	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
11436	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
6840	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
3860	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
8844	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
10344	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
11260	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
10800	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
5724	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
6244	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
10920	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	Below 
3556	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
7000	ole32.dll!CROIDTable::WorkerThreadLoop	
3332	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
4740	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
4684	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
12116	msvcr120.dll!_threadstartex	nss3.dll!_PR_MD_WAIT_CV	
4804	xul.dll!ThreadEntry	gdi32.dll!_CreateFontIndirectExW@4	
11660	xul.dll!ThreadEntry	nss3.dll!_PR_MD_WAIT_CV	
------------------------
Mickael, 
Thanks for all the detail. 
Before we go further, does everything behave today as you describe in comment 11?
That is, you can still reproduce?
Severity: normal → critical
Flags: needinfo?(mike)
Keywords: hang
Summary: Thunderbird 45.0 failing to start after first install → Thunderbird 45.0 hangs on start after first install. 38.7.2 still starts
Wayne: I confirm the previous test.
I upgraded again my 38.7 to 45.0, after the first launch it worked, after quitting it would not restart, patching the line 

   user_pref("extensions.lastAppVersion", "45.0");

to

   user_pref("extensions.lastAppVersion", "45.00");

makes it work again :)

Since it almost look like the client was waiting for some network query that never times out, I did a netstat, not sure if that will help, but here it is:

 [thunderbird.exe]
  TCP    127.0.0.1:53783        mike-corei7:53782      ESTABLISHED
 [thunderbird.exe]
  TCP    127.0.0.1:65001        mike-corei7:49227      ESTABLISHED

It looks like all it does is to open some local ports on the machine with itself?
to continue that line of discovery, what happens if you disable both app and addon updates 
- in tools | options | advanced | update 
- and set preference  extensions.update.enabled to false
and use user_pref("extensions.lastAppVersion", "45.0");
Nope, still happening.
My upgrade history dialog is starting to look funny :D

I noticed there was options for Telemetry and RemoteDebugging, is that something that could help?
rstrong, is anything about comment 14 familiar?
Flags: needinfo?(robert.strong.bugs)
(In reply to Wayne Mery (:wsmwk, use Needinfo for questions) from comment #17)
> rstrong, is anything about comment 14 familiar?
Could be an add-on or the add-ons mgr. That pref is only used by the add-ons mgr... app update never touches it.
http://mxr.mozilla.org/mozilla-central/search?string=extensions.lastAppVersion
Flags: needinfo?(robert.strong.bugs)
Rainer, can you reproduce what is described here - comment 0 and comment 14?
Prior eleases are found at http://archive.mozilla.org/pub/mozilla.org/thunderbird/releases/
Flags: needinfo?(RainerBielefeldNG)
Flags: needinfo?(mike)
Mickael, did the problem go away for you?

You mentioned changing to user_pref("extensions.lastAppVersion", "45.00") ?
Do you mean "45.0.0" with the additional decimal point?
Does it work if you set any other value?
Flags: needinfo?(mike)
Summary: Thunderbird 45.0 hangs on start after first install. 38.7.2 still starts → Thunderbird 45.0 hangs on start if user_pref("extensions.lastAppVersion", "45.0"). Thunderbird.exe task running but no UI appears. starts if user_pref("extensions.lastAppVersion", "45.0") 38.7.2 still starts
Whiteboard: [closeme 2016-09-15]
yaaay! thankyouthankyouthankyou Mickael! & Matt! So glad to see I am not the only one with this. Finally just before throwing TB to the bin I came across this bug report.
Yeah the same issue, same fix, i.e 45.2.0 to 45.2.00 in prefs.js works:
user_pref("extensions.lastAppVersion", "45.2.00");
allows me to start x1, then user_pref gets flipped again to "default" and "no-starting" resumes.

i will try to give some more info later.
(In reply to Bogdan Lipski from comment #21)
> ...
> i will try to give some more info later.

Thanks Bogdan. What did you find?
Flags: needinfo?(RainerBielefeldNG) → needinfo?(lipski.bogdan)
Whiteboard: [closeme 2016-09-15]
Thunderbird has been driving me nuts again. A couple of weeks ago, I started having same issue as reported here (TB hangs early on startup, only safe-mode helps) several times, also on a Windows 7 64-bit system, with TB 45.0 to 45.4.0.

Fiddling with the extensions.lastAppVersion value did not help in my case, neither reverting to 38.8.
Then I started systematically moving away the various files in my TB user profile folder, and found that the issue disappears when deleting/renaming secmod.db

So my suspicion is that loading a security module (I have a DLL for a PKCS#11 smart card) hangs, maybe because some antivirus blocks it. Would be good if TB timed out within a few seconds on loading modules.

I suggest marking this bug as confirmed - including me, three people have been reporting this problem here, and bug 1308881 seems related.
I just found that in my case the issue is definitely due to an attached PKCS#11 module.
Sometimes, it seems to hang, not only during TB startup, but also, e.g., when sending out a message.
In both these situations, it is sufficient to simply remove the attached smart card for short time, and TB is back to life.

TB *must not* freeze when trying to access an unresponsive PKCS#11 module, so please add a timeout there.
nice detective work. keep it going :)

consider this list of bug reports https://mzl.la/2dEZKlx  (unfortunately we are not well triaged in this area - and we don't have a lot of people with expertise here)  This doesn't include bugs in the core mozilla components

https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSS/tools/NSS_Tools_modutil may also be of value
No need to go into any details w.r.t. PKCS#11 modules or cryptography here.
Further, the problem I wrote about above is not confined to TB - it happens with FF, too.

This issue is a very basic one: interactions with dynamically loaded modules must not block. If these do not offer a timeout themselves, the calling Mozilla (core?) component should add it (for instance by spawning a thread protecting the rest of the application from freezing).
Blocks: 1308055
I gave a shot at updating from 38.7.2 to 45.6.0, this time it seems to have worked. I've quit and restarted, and it did not get stuck.

Will see if nothing bad happens next time as reboot, but things seem ok now.
The possibly related issue with secmod.db I reported in comment 23 was already with TB 46.4.0.
Unfortunately I do not have a deterministic way of reproducing it (with any version).

Anyway, it is very likely that the underlying issue, namely that dynamically loaded modules/extensions/addons may block TB indefinitely (rather than timing out with an error) has not been tackled/solved.
Developers should have a pretty direct way of both reproducing such behavior and checking in the code whether such a timeout mechanism is present and used.
Flags: needinfo?(mueller8)
(Bogdan seems to be gone)
Flags: needinfo?(mike)
Flags: needinfo?(lipski.bogdan)
WFM per comment 30 and other testers
Status: UNCONFIRMED → RESOLVED
Closed: 6 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.