Closed Bug 1199957 Opened 9 years ago Closed 9 years ago

Thunderbird on Mac hangs when loading content from the Web

Categories

(MailNews Core :: Backend, defect)

Unspecified
macOS
defect
Not set
critical

Tracking

(thunderbird42+ fixed, thunderbird43+ fixed, thunderbird44+ fixed, thunderbird_esr38 unaffected)

RESOLVED FIXED

People

(Reporter: mxn, Assigned: ishikawa)

Details

(Keywords: hang, regression, Whiteboard: [regression:TB42][blocking TB42.0b1])

Attachments

(2 files)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:42.0) Gecko/20100101 Firefox/42.0
Build ID: 20150828004008

Steps to reproduce:

Earlybird hangs indefinitely after a few minutes of usage, typically after opening a Web feed folder, loading an image from the Web inside a Web feed folder, or even just allowing all the Web feed site icons to load. Sometimes, as in the attached sample, the hang occurs even before my IMAP accounts have finished fetching new messages. Sometimes, I can work around the hang by scrolling quickly past the Web feeds in my folder list, but I would like to be able to read my feeds again.

I’ve been seeing this in Earlybird 42.0a2 build 20150828004004 (and in about a week’s worth of builds prior to that) on OS X 10.10.5.
You mean an RSS feed?
Flags: needinfo?(mxn)
Yes. :-) The issue doesn’t seem to be limited to RSS feeds; they just happen to trigger the issue faster, because I have lots of RSS feed folders.
Flags: needinfo?(mxn)
One feed that consistently triggers the hang is <http://blog.wikimedia.org/feed/>.

I usually see a spike in CPU usage leading up to this hang. At that point, attempting to upgrade Earlybird via the About box leads to this crash:

<https://crash-stats.mozilla.com/report/index/3a073d0b-3c94-4cb3-be50-604682150907>

I did recently re-enable the “Allow Spotlight to index messages” preference in Advanced preferences. After disabling that preference, I’m once again able to view items from the Wikimedia Blog RSS feed without hanging. So it may have something to do with that setting.
See Also: → 1192778, 1197214
The CPU spike, hang, and subsequent crash on quit is much less common with Spotlight indexing turned off. I noticed that there’s always a stalled indexing activity in the Activity Manager, under the account “null”. Judging from the number of total messages to index, the indexing activity is from a different account each time. I do have an IMAP account with an absurdly large number of messages in one folder (on the order of tens of thousands). However, this folder wasn’t causing problems until recently.
Severity: normal → critical
Whiteboard: [regression:TB??]
channel related?

irc: [Fallen] I remember we had to do some changes to how channels are created, I think it was aceman who patched it
Status: UNCONFIRMED → NEW
Component: Untriaged → Backend
Ever confirmed: true
Flags: needinfo?(acelists)
Product: Thunderbird → MailNews Core
Yeah, I tried something with the channels, but I think I backed off (and some are still open) as I do not yet understand how they work.
Flags: needinfo?(acelists)
Whiteboard: [regression:TB??] → [regression:TB??][blocking TB42.0b1]
(In reply to Minh Nguyen from comment #5)
> The CPU spike, hang, and subsequent crash on quit is much less common with
> Spotlight indexing turned off. I noticed that there’s always a stalled
> indexing activity in the Activity Manager, under the account “null”. Judging
> from the number of total messages to index, the indexing activity is from a
> different account each time. I do have an IMAP account with an absurdly
> large number of messages in one folder (on the order of tens of thousands).
> However, this folder wasn’t causing problems until recently.

Confirming the "hang" and CPU spike problem on Mac OS X 10.10.5 for TB 42.0b1 and Earlybird 42.0a2 in new profiles when reading RSS feeds.
I don't notice any relation to Spotlight or to the number of messages in the IMAP account, since Spotlight indexing of TB is turned off by default and my IMAP accounts contain only a few messages.
I had the feeling this was happening when reading lots of emails in succession (doesn't only happen in newsgroups for me). When it starts to hang, the email is not yet displayed and I see "Connecting to xxx..." in the status bar.
Phillips pastebin

      + 105 thread_start  (in libsystem_pthread.dylib) + 13  [0x7fff8d6a63ed]
        + ! 105 _pthread_start  (in libsystem_pthread.dylib) + 176  [0x7fff8d6a8fd7]
        + !   105 _pthread_body  (in libsystem_pthread.dylib) + 131  [0x7fff8d6a905a]
        + !     105 PR_GetThreadName  (in libnss3.dylib) + 440  [0x100701dc8]
        + !       105 XRE_AddJarManifestLocation  (in XUL) + 42963  [0x101c05003]
        + !         105 mozilla::LoadInfo::RedirectChain()  (in XUL) + 2153052  [0x101e69c7c]
        + !           105 mozilla::LoadInfo::RedirectChain()  (in XUL) + 2301929  [0x101e8e209]
        + !             105 NS_UTF16ToCString  (in XUL) + 46677  [0x101c29ea5]
        + !               105 XRE_AddJarManifestLocation  (in XUL) + 47152  [0x101c06060]
        + !                 105 mozilla::LoadInfo::RedirectChain()  (in XUL) + 904244  [0x101d38e54]
        + !                   105 mozilla::LoadInfo::RedirectChain()  (in XUL) + 643303  [0x101cf9307]
        + !                     105 mozilla::LoadInfo::RedirectChain()  (in XUL) + 635878  [0x101cf7606]
        + !                       105 mozilla::LoadInfo::RedirectChain()  (in XUL) + 631800  [0x101cf6618]
        + !                         105 mozilla::LoadInfo::RedirectChain()  (in XUL) + 632914  [0x101cf6a72]
        + !                           105 mozilla::LoadInfo::RedirectChain()  (in XUL) + 636296  [0x101cf77a8]
        + !                             105 mozilla::LoadInfo::RedirectChain()  (in XUL) + 637195  [0x101cf7b2b]
        + !                               105 mozilla::LoadInfo::RedirectChain()  (in XUL) + 650030  [0x101cfad4e]
        + !                                 87 mozilla::LoadInfo::RedirectChain()  (in XUL) + 649421  [0x101cfaaed]
        + !                                 : 31 nsXPTCStubBase::Stub249()  (in XUL) + 37426  [0x101c19622]
        + !                                 : | 31 NS_UnregisterXPCOMExitRoutine  (in XUL) + 68,30,...  [0x101c1a044,0x101c1a01e,...]
        + !                                 : 27 nsXPTCStubBase::Stub249()  (in XUL) + 37452  [0x101c1963c]
        + !                                 : | 19 PR_Select  (in libnss3.dylib) + 8123,8112,...  [0x1006ff32b,0x1006ff320,...]
        + !                                 : | 4 PR_Select  (in libnss3.dylib) + 8128  [0x1006ff330]
        + !                                 : | + 4 __error  (in libsystem_kernel.dylib) + 9,12  [0x7fff8c89fc46,0x7fff8c89fc49]
        + !                                 : | 2 PR_Select  (in libnss3.dylib) + 8057  [0x1006ff2e9]
        + !                                 : | + 2 PR_GetCurrentThread  (in libnss3.dylib) + 213,0  [0x100700ed5,0x100700e00]
        + !                                 : | 2 PR_Select  (in libnss3.dylib) + 8120  [0x1006ff328]
        + !                                 : |   2 read  (in libsystem_kernel.dylib) + 20  [0x7fff8c8a568c]
        + !                                 : 18 nsXPTCStubBase::Stub249()  (in XUL) + 37454,37458,...  [0x101c1963e,0x101c19642,...]
        + !                                 : 9 PR_Read  (in libnss3.dylib) + 0,3,...  [0x1006e4b50,0x1006e4b53,...]
        + !                                 : 1 PR_Select  (in libnss3.dylib) + 8271  [0x1006ff3bf]
        + !                                 : 1 nsXPTCStubBase::Stub249()  (in XUL) + 37466  [0x101c1964a]
        + !                                 :   1 NS_UnregisterXPCOMExitRoutine  (in XUL) + 384  [0x101c1a180]
        + !                                 18 mozilla::LoadInfo::RedirectChain()  (in XUL) + 649423,649425,...  [0x101cfaaef,0x101cfaaf1,...]
        + 6 DYLD-STUB$$__error  (in libnss3.dylib) + 0  [0x100702a5a]
        + 3 DYLD-STUB$$PR_Read  (in XUL) + 0  [0x1048587fc]
        + 1 DYLD-STUB$$pthread_getspecific  (in libnss3.dylib) + 0  [0x100702d36]
        + 1 DYLD-STUB$$read  (in libnss3.dylib) + 0  [0x100702db4]
        + 1 PR_GetCurrentThread  (in libnss3.dylib) + 221  [0x100700edd]
        + 1 PR_Select  (in libnss3.dylib) + 8275  [0x1006ff3c3]
regression test request to those of you who can reproduce an issue ...

Does problem NOT happen with https://archive.mozilla.org/pub/thunderbird/nightly/2015/07/2015-07-13-03-02-07-comm-central/
and DOES happen with https://archive.mozilla.org/pub/thunderbird/nightly/2015/07/2015-07-22-03-02-08-comm-central/

If so, is problem related to these which landed in version 42?
* Bug 1185583 - Thunderbird broken by mozilla-central Bug 1143922 (landed on comm-central 2015-07-21)
* Bug 1143922 - Add AsyncOpen2 to nsIChannel and perform security checks when opening a channel (landed and stuck on 2015-07-20)

And if so, why are we not seeing more users having the problem?
Severity: critical → enhancement
Severity: enhancement → critical
(In reply to Wayne Mery (:wsmwk, use Needinfo for questions) from comment #11)
> regression test request to those of you who can reproduce an issue ...
> 
> Does problem NOT happen with
> https://archive.mozilla.org/pub/thunderbird/nightly/2015/07/2015-07-13-03-02-
> 07-comm-central/
> and DOES happen with
> https://archive.mozilla.org/pub/thunderbird/nightly/2015/07/2015-07-22-03-02-
> 08-comm-central/
> 
> If so, is problem related to these which landed in version 42?
> * Bug 1185583 - Thunderbird broken by mozilla-central Bug 1143922 (landed on
> comm-central 2015-07-21)
> * Bug 1143922 - Add AsyncOpen2 to nsIChannel and perform security checks
> when opening a channel (landed and stuck on 2015-07-20)
> 
> And if so, why are we not seeing more users having the problem?

It does NOT happen with
> https://archive.mozilla.org/pub/thunderbird/nightly/2015/07/2015-07-13-03-02-
> 07-comm-central/

It DOES happen with
> https://archive.mozilla.org/pub/thunderbird/nightly/2015/07/2015-07-22-03-02-
> 08-comm-central/

Maybe only few Mac users are testing beta versions AND reading RSS feeds?
I never read RSS feeds in TB before getting involved in beta testing...
(In reply to Eckard Berberich from comment #12)
> It does NOT happen with with
> > https://archive.mozilla.org/pub/thunderbird/nightly/2015/07/2015-07-13-03-02-
> > 07-comm-central/
> 
> It DOES happen with
> > https://archive.mozilla.org/pub/thunderbird/nightly/2015/07/2015-07-22-03-02-
> > 08-comm-central/
I can confirm this, same for me on OS X 10.11.1. The first build works normally. The second build first froze after I tried to install an add-on (and now it freezes every time I open the Add-ons Manager).
(In reply to Nomis101 from comment #13)

> I can confirm this, same for me on OS X 10.11.1. First build works normal.
> Second build freezes first after I tried to install an Add-On (and now it
> freezes everytime I open the Add-On manager).
For the problem of freezing when opening the Add-ons manager I created a separate bug report:
https://bugzilla.mozilla.org/show_bug.cgi?id=1208090
https://bugzilla.mozilla.org/show_bug.cgi?id=1170646#c20 fits the regression range
and some technical input here
https://bugzilla.mozilla.org/show_bug.cgi?id=1204381#c13

Maybe someone could do a local build with that patch backed out to confirm
Perhaps there is something inconsistent with the Mac file system that needs to be adjusted.
(In reply to Joe Sabash [:JoeS1] from comment #15)
> Maybe someone could do a local build with that patch backed out to confirm


I'm currently trying this, build finishes in a minute...
(In reply to Nomis101 from comment #16)
> (In reply to Joe Sabash [:JoeS1] from comment #15)
> > Maybe someone could do a local build with that patch backed out to confirm
> 
> 
> I'm currently trying this, build finishes in a minute...

A TB 42b build without this patch works better for me with respect to the freeze issue.
(In reply to Nomis101 from comment #17)
> (In reply to Nomis101 from comment #16)
> > (In reply to Joe Sabash [:JoeS1] from comment #15)
> > > Maybe someone could do a local build with that patch backed out to confirm
> > 
> > 
> > I'm currently trying this, build finishes in a minute...
> 
> A TB 42b build without this patch works better for me, related to the freeze
> issue.

On the other hand, the build also works without any freeze if I do not back out this patch. I'm confused!
I also had to include this patch to make it build without any error: http://hg.mozilla.org/mozilla-central/rev/d63692ee5330
But I don't think this is relevant here.
(In reply to Nomis101 from comment #18)
> (In reply to Nomis101 from comment #17)
> > (In reply to Nomis101 from comment #16)
> > > (In reply to Joe Sabash [:JoeS1] from comment #15)
> > > > Maybe someone could do a local build with that patch backed out to confirm
> > > 
> > > 
> > > I'm currently trying this, build finishes in a minute...
> > 
> > A TB 42b build without this patch works better for me, related to the freeze
> > issue.
> 
> On the other hand, the build does also work without any freeze if I not back
> out this patch. I'm confused!
> I also had to include this patch to make it build without any error:
> http://hg.mozilla.org/mozilla-central/rev/d63692ee5330
> But I don't think this is relevant here.

Because TB 42 (and FF 42 too) is built with the 10.7 SDK, I also switched back to the 10.7 SDK, and now I can reproduce the freeze with my local build. :-) Now I will back out this patch to see if this fixes it.
I can confirm that backing out the patch Joe suggested in comment #15 reproducibly fixes the freeze in a local build.
Blocks: 1170646
(In reply to Nomis101 from comment #20)
> I can confirm, that backing out the patch Joe suggested in #15 will
> reproducible fix the freeze in a local build.

Thanks very much, Nomis, for all your hard work.
Now we need an action plan.
Chiaka, thoughts?
This is blocking our release of beta 42
Assignee: nobody → ishikawa
Flags: needinfo?(ishikawa)
Whiteboard: [regression:TB??][blocking TB42.0b1] → [regression:TB42][blocking TB42.0b1]
This also affects the new account provisioner: after selecting an email address, a web content tab opens in which the provider offer is to be displayed. This often hangs now.
I have r+ to back out the suspect patch from bug 1170646, and I am trying to get that backout landed now. Then we'll go for aurora and beta approvals.
See Also: → mail-cache2
(In reply to aleth [:aleth] from comment #23)
> This also affects the new account provisioner: after selecting an email
> address, a web content tab opens in which the provider offer is to be
> displayed. This often hangs now.

Now that the problematic patch has been backed out on m-c, this issue seems to be fixed.

Furthermore, if I run the corresponding mozmill test locally with a fresh c-c build, it now passes locally too.
The patch for 1170646 has now been backed out on mozilla-beta, mozilla-aurora, and mozilla-central. I'm going to mark this bug as fixed as well, but it would be good to verify when the next beta comes out.
This bug was fixed by the backout of the patches for bug 1170646.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Version: 42 → 44
OS: Unspecified → Mac OS X
Summary: Thunderbird hangs when loading content from the Web → Thunderbird on Mac hangs when loading content from the Web
(In reply to Wayne Mery (:wsmwk, use Needinfo for questions) from comment #22)
> Chiaka, thoughts?
> This is blocking our release of beta 42

Sorry I could not respond. I had a serious hardware problem, caused initially by a
UPS failure, which may in turn have caused other hardware parts to fail. (I did not notice the latter problems until the external enclosure that hosts my disk drives
started to misbehave.) I am still recovering from these issues and so could not monitor the
web pages easily (not to mention follow all the e-mails).

Sorry about this period of silence.

Yet, I am surprised that the problem existed with the Mac library.
(It seems that the Mac library overloads the read function in an unexpected manner.
I need to figure out how to make the original patch more robust in the face of
such environmental differences: the original patch was meant to protect us when reading profile data, etc. from a remote drive (remote file system). Let me investigate this issue
in more detail once my hardware is fully back online.)
Flags: needinfo?(ishikawa)
Just a quick note:


-    while (remaining > 0) {
-        n = PR_Read(fd, start, remaining);
-        if (n < 0) {
-            if( (len - remaining) == 0 ) // no octet is ever read
-                return -1;
-            break;
-        } else {

It seems that I need to change if (n < 0) to if (n <= 0) in the affected patch to
correct the problem on the Mac. However, I have a feeling that there may be a subtle
semantic issue that was missed in the implementation of the PR library (especially in
the PR_Read implementation on the Mac). Otherwise, I can't explain why there is no issue on the Windows and Linux builds.
But again, I am sorry that this has to wait until my hardware issues are completely resolved so that I can run a local build :-(
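For illustration, here is a minimal sketch of a read loop that treats a zero return as end-of-file, which is the n <= 0 change suggested above. It uses plain POSIX read() instead of PR_Read() so that it is self-contained. The function name read_fully and its return convention are assumptions made for this example, not the code from the backed-out patch. PR_Read is documented to return 0 at end of file, so a loop that stops only on n < 0 would presumably keep spinning once no more data is available.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the patch from bug 1170646): read up to len
 * bytes from fd, retrying after short reads.
 * Returns the number of bytes read (possibly fewer than len at EOF),
 * or -1 if an error occurred before any byte was read.
 */
ssize_t read_fully(int fd, void *buf, size_t len)
{
    char *start = buf;
    size_t remaining = len;

    while (remaining > 0) {
        ssize_t n = read(fd, start, remaining);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            if (remaining == len)
                return -1;          /* no octet was ever read */
            break;                  /* report the partial read */
        }
        if (n == 0)
            break;                  /* EOF: without this check the loop never ends */
        start += n;
        remaining -= (size_t)n;
    }
    return (ssize_t)(len - remaining);
}
```

Calling read_fully(fd, buf, sizeof buf) on a descriptor that holds fewer bytes than requested returns the short count instead of hanging. Whether the Mac hang was really caused by a zero return from PR_Read was not verified in this bug.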
The bug I am having has identical symptoms but on current 58.8.0, i.e. connections hanging with "Connecting ...". The circumstances are different: I only do email, but I have large folders with tens of thousands of emails. It seems that Thunderbird keeps opening new connections all the time, and the connections hang for whatever reason.

If a connection goes through an ssh tunnel, I see the connection being opened in ssh (in verbose mode), but Thunderbird just sits in "Connecting".

This makes Thunderbird pretty unusable on Mac. This does not happen on FreeBSD, nor is there any of the slowness and waiting on FreeBSD. The accounts are the same, so this is likely Mac-related, not related to the number of emails or folder sizes.

There is a patch for an older version of Thunderbird; could this bug have crept back in? The problem started around 5x.

Note that all connections hang, including "checking for updates" in About, it just loops endlessly.
Mac SDK version changed?
I did read recently that MACOSX_DEPLOYMENT_TARGET now defaults to 10.9
See Also: → 1440716