Hangs sending mail while copying message to Sent folder on Mac-only while displaying the progress bar. Deadlock in graphics on CGLClearDrawable. Workaround comment 349
Categories
(Thunderbird :: Message Compose Window, defect)
Tracking
Tracking | Status | |
---|---|---|
thunderbird_esr60 | --- | wontfix |
thunderbird_esr68 | ? | affected |
thunderbird53 | --- | unaffected |
thunderbird54 | --- | wontfix |
thunderbird56 | --- | wontfix |
thunderbird57 | --- | wontfix |
thunderbird58 | --- | wontfix |
thunderbird59 | --- | wontfix |
thunderbird60 | --- | wontfix |
thunderbird67 | --- | wontfix |
thunderbird71 | --- | wontfix |
thunderbird72 | --- | wontfix |
thunderbird73 | --- | wontfix |
thunderbird74 | --- | wontfix |
thunderbird75 | --- | wontfix |
thunderbird76 | --- | wontfix |
thunderbird77 | --- | wontfix |
thunderbird78 | --- | affected |
thunderbird79 | --- | affected |
People
(Reporter: jamesrome, Unassigned)
References
()
Details
(Keywords: hang, regression, regressionwindow-wanted, Whiteboard: [regression:TB54?][duptome][workaround: comment 349])
User Story
Workaround: Comment 349

History:
* 2017-02-22 TB45 cayenne INCOMPLETE Bug 1341784 - hangs on sending mail (on one machine only, but has come and gone on others) (Unclear if this is the same issue, so unknown if this is the first report of this bug)
>> 2016-06 core graphics Bug 1207332 - skia content on OS X - landed. Should be in TB48 (beta), and subsequently in TB52 ESR
>> 2017-02-14 core graphics Bug 1325227 - Use read locks instead of synchronous transactions for ContentClientRemoteBuffer - landed. Should be in TB54 (beta), and subsequently in TB60.
* 2017-03 TB45 simone WFM Bug 1343480 - Rare hang sending email with MacOSx with spotlight search enabled (so perhaps unrelated)
>> 2017-05 FF?? OPEN Bug 1369207 - Firefox hang on CGLClearDrawable after quickly closing window after FEATURE_FAILURE_OPENGL_CREATE_CONTEXT
* 2017-07 TB54(beta) Rome OPEN Bug 1381485 - Hangs sending mail while copying message to Sent folder on Mac-only while displaying the progress bar. Deadlock in graphics on CGLClearDrawable.
** 2018-03 m_kato is first developer to comment, and then we link to bug 1369207, which reads "On macOS installations where opening a window logs: |[GFX1-]: [OPENGL] Failed to init compositor with reason: FEATURE_FAILURE_OPENGL_CREATE_CONTEXT|, quickly closing the window can deadlock in CGLClearDrawable."
** 2018-10 Gene developer comments, and Glenn comments about mutex
* 2017-09 TB52(ESR) Reid DUPE Bug 1400568 - Thunderbird Mac 52.3.0 frequently hangs after Send
* 2017-12 TB52 cinymini DUPE Bug 1422251 - TB52 Freeze after mail has been sent, with zero cpu (sierra). two imap accounts
* 2018-02 TB52 or earlier Heikki INCOMPLETE (but more general networking issue caused by Bug 1440716 - Sits forever in "Connecting ". Hanging connections (imap and smtp) and hang on mac OS X)
* more reports follow for Thunderbird 60

----------------------------------------

** We need a one-day regression range using daily builds. Please pick a build date that might fail and test it. ***
* initially thought to be: http://archive.mozilla.org/pub/thunderbird/nightly/2017/03/2017-03-08-03-02-29-comm-central/ to http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-12-03-02-06-comm-central/
* We have no idea on what date the flaw starts, we only know some that work and some that don't. Working backwards:
* fail: 2017-06-07 comment 193, comment 216 Scott, Christopher
* fail: 2017-06-06 Richard
* fail: 2017-06-05 Richard
* ?? : 2017-06-04 Chris
* fail: 2017-06-03 Scott
* fail: 2017-06-02 Scott
* fail: 2017-06-01 Scott
* fail: 2017-05-31 Scott
* fail: 2017-05-24 Christopher
* ?? : 2017-05-17 ?
* ?? : 2017-05-14 ?
* ?? : 2017-05-01 Christopher
* fail: 2017-04-01 Scott
* fail: 2017-03-22 Scott
* ??: 2017-03-15 Scott
* ??: 2017-03-08 ??
* works for two weeks: 2017-03-01 Scott

User configs (name, computer, OS version, graphics, monitor(s)):
* robert.p MacBookPro14,2 10.13.6 0x5927 Built-In Retina LCD
* joduinn MacBookPro15-2018 10.14.6 Radeon Pro 560X 4 GB Intel UHD Graphics 630
* Bob Shimizu 10.14.16 <unknown graphics> Apple Thunderbolt Display(3)
* Aaron Mac Mini (Late 2014) 10.14.6 Intel Iris 1536 MB 27" Thunderbolt display
* Robert Shimizu Mac Pro 10.14.6 Unknown graphics Thunderbolt 27" (3)
* Scott James iMac Mid 2017 10.13.6 Radeon Pro 575 Integrated 5120 x 2880 Apple Display
* kimlove iMac Retina 5K 27-inch 2017 10.13.6 Radeon Pro 580 imac desktop
* Marc De Graef Mac Pro (Late 2013) 10.13.6 3.5 GHz 6-core Intel Xeon E5 LG Ultrawide Display
* Marc De Graef Macbook Pro (Retina, 15" Mid 2015) 10.13.6 2.8GHz Intel Core i7 Built-in Display
* Ludovic Rousseau ? "it takes a week"
* Christopher Schultz ?
* Richard Leger ?

- yahoo and gmail imap
- spotlight doesn't matter
- bumping file handles doesn't help
> 54.0B3 works - no crashing.
> 55.0b2 dies

**beta feedback**
- James: still fails with 67 beta (this bug)
- Scott: unknown (this bug)
- degraef: works with 66? beta (bug 1525001)
- cinymini: unknown (bug 1422251)

**Reports:**
https://support.mozilla.org/en-US/questions/1273927
https://discourse.mozilla.org/t/thunderbird-freezes-and-i-have-to-force-quit/46256
https://support.mozilla.org/en-US/questions/1265983
https://support.mozilla.org/en-US/questions/1256159 (claims safe mode helped)
https://support.mozilla.org/en-US/questions/1254261
https://support.mozilla.org/en-US/questions/1247137 (reverted to TB52 - good tester)
https://support.mozilla.org/en-US/questions/1246950
https://support.mozilla.org/en-US/questions/1246643 1/13/2019
https://support.mozilla.org/en-US/questions/1242885 **disabling send progress helps**
https://support.mozilla.org/en-US/questions/1241167 (when adding attachment)
https://support.mozilla.org/en-US/questions/1237929 (likely unrelated because this is on win10 and caused by signature file)
https://support.mozilla.org/en-US/questions/1234828 9/21/2018
https://support.mozilla.org/en-US/questions/1241380
https://support.mozilla.org/en-US/questions/1234041 - UCD reverted to TB52.0
https://support.mozilla.org/en-US/questions/1172364 - three users left Thunderbird
https://support.mozilla.org/en-US/questions/1170401 - 52.2.1 Arthur frequent hangs 8/7/2017

Similar or Mac hang issue:
- https://support.mozilla.org/en-US/questions/1246950
- https://support.mozilla.org/en-US/questions/1246494
Attachments
(24 files)
Size | Type |
---|---|
1.70 MB | text/plain |
136.43 KB | application/zip |
61.32 KB | text/plain |
5.74 MB | text/plain |
126.58 KB | application/zip |
100.80 KB | text/plain |
1.35 MB | text/plain |
107.08 KB | text/plain |
115.67 KB | application/zip |
117.55 KB | application/zip |
120.33 KB | application/zip |
146.26 KB | application/zip |
4.55 KB | image/png |
6.28 KB | image/png |
1.10 MB | text/plain |
31.13 KB | image/png |
18.76 KB | image/png |
51.05 KB | text/plain |
1.74 MB | text/plain |
1.72 MB | text/plain |
95.15 KB | image/jpeg |
1.66 MB | text/plain |
154.83 KB | application/x-gzip |
90.08 KB | text/plain |
see attached
Comment 1•7 years ago
|
||
Does "hangs frequently" mean it hangs and you must kill the process? Or does it mean it hangs for x minutes and then recovers? Please try 55.
Comment 3•7 years ago
|
||
> Please try 55 ... started in safe mode
Reporter | ||
Comment 4•7 years ago
|
||
It upgraded to 55. I'll see if it hangs still
Reporter | ||
Comment 5•7 years ago
|
||
It still hangs frequently in normal mode. And I have the latest MacOS 10.12.6. New apple report attached.
Reporter | ||
Comment 6•7 years ago
|
||
Reporter | ||
Comment 7•7 years ago
|
||
It hung again in safe mode. One issue may be the number of open files (see attached). TB seems to load every one of my fonts, and I have a huge number since I do desktop publishing. There is no reason for this, and certainly zonks the system, which (I think) has some limit on the number of open files. TB 55.0b2
Reporter | ||
Comment 8•7 years ago
|
||
It hung right away in normal mode. I attach a spindump.
Reporter | ||
Comment 9•7 years ago
|
||
It hung again just after sending mail.
Reporter | ||
Comment 10•7 years ago
|
||
It seems to be happening when I send google mail. It did it again.
Reporter | ||
Comment 11•7 years ago
|
||
It is now hanging every time I send mail. It's a lot worse since the upgrade to macOS 10.12.6
Comment hidden (off-topic) |
Comment hidden (off-topic) |
Comment 14•7 years ago
|
||
(In reply to James Rome from comment #11)
> It is now hanging every time I send mail. It's a lot worse since the upgrade
> to macOS 10.12.6
Does that mean non-Google mail? With or without add-ons?
(In reply to James Rome from comment #7)
> Created attachment 8889048 [details]
> TBfiles.txt
>
> It hung again in safe mode. One issue may be the number of open files (see
> attached). TB seems to load every one of my fonts, and I have a huge number
> since I do desktop publishing. There is no reason for this, and certainly
> zonks the system,
Your fonts situation is a good piece of info. That's the way Gecko works (it's not Thunderbird code) and is unavoidable, so the same thing will happen in Firefox. It does mean many file descriptors will be open, and perhaps cause high memory usage.
> which (I think) has some limit on the number of open files. TB 55.0b2
You can make a modest increase to ulimit; see bug 800279 comment 1. But it is unclear whether this is related to your hanging situation.
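As a rough illustration of the open-file limit discussed above, the following Python sketch reads the per-process descriptor limit that `ulimit -n` reports. It inspects only the Python process that runs it, not Thunderbird, and the `headroom` helper is a hypothetical name introduced here for illustration.

```python
import resource


def fd_limits():
    """Return the (soft, hard) open-file-descriptor limits of this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def headroom(open_count, soft_limit):
    """Fraction of the soft limit still unused (hypothetical helper).

    RLIM_INFINITY means no cap is configured, so headroom is reported as 1.0.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return 1.0
    return max(0.0, 1.0 - open_count / soft_limit)


if __name__ == "__main__":
    soft, hard = fd_limits()
    print(f"soft limit: {soft}, hard limit: {hard}")
```

A process that opens several hundred font files against a small soft limit would show little headroom, which is the concern raised in comment 7. Whether that is related to the hang is not established in this thread.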
Reporter | ||
Comment 15•7 years ago
|
||
I reverted to the release channel after the last kerfuffle with Lightning, and this problem no longer occurs.
Comment 16•7 years ago
|
||
Does this Mac have 13 imap accounts, or is that the Windows 10 system? Did the hang occur only when sending?
Reporter | ||
Comment 17•7 years ago
|
||
My Mac has 11 IMAP accounts. I usually do not activate them all on Windows.
Comment 18•7 years ago
|
||
James, please try an increased ulimit; see bug 800279 comment 1. Is it more prone to happen with gmail, or does it happen only with gmail? Did it also happen with 53 beta? Or had you not used beta prior to 54? (Marking regression, because it does not happen for the user with the release build.)
Reporter | ||
Comment 19•7 years ago
|
||
Sorry, I reverted to the release build because I could not do anything...
Comment 20•7 years ago
|
||
Had you been using TB53 beta prior to comment 0 and not had problems?
Reporter | ||
Comment 21•7 years ago
|
||
Don't remember. I gave it up when Lightning died.
Comment 22•7 years ago
|
||
Similar report in bug 1343480 but it is version 45. And bug 1400568 version 52. But no one has ponied up with a regression range. If you could retest - it would be useful to know whether it is more prone to happen with gmail, or happens only with gmail? (bp-a454d2eb-107b-45cf-88de-d00c50170518 indicates you were at one time using 53 beta)
Reporter | ||
Comment 23•7 years ago
|
||
Well, after struggling to find provider for Google calendar (try Googling for the beta builds), I have 56.0 b4 running, and so far I can send mail. Why is Provider for Google Calendar called gdata provider? How is one supposed to find it? Change one name or the other.
Reporter | ||
Comment 24•7 years ago
|
||
56.0b4 just hung again sending gmail.
Reporter | ||
Comment 25•7 years ago
|
||
Reporter | ||
Comment 26•7 years ago
|
||
The latest hang was on 57.90b1. TB opens all of my hundreds of font files. Why???
Reporter | ||
Comment 27•7 years ago
|
||
57.0b1
Comment 28•7 years ago
|
||
(In reply to James Rome from comment #26)
> TB opens all of my hundreds of font files.
> Why???
Nothing to do with Thunderbird. That's the way Mozilla Gecko handles Mac
Reporter | ||
Comment 29•7 years ago
|
||
That's bad. It might be running out of file handles.
Reporter | ||
Comment 30•7 years ago
|
||
TB57.0b2 hung again after sending mail via gmail.
Reporter | ||
Comment 31•7 years ago
|
||
The sample from today
Comment 32•7 years ago
|
||
Can you remove any font files from your system? Do you have virtual folders that iterate over many other folders?
Reporter | ||
Comment 33•7 years ago
|
||
No virtual folders. If I remove fonts, I can't get them back easily. It hung today without sending an e-mail. But surely opening all the font files is a bug. It must slow things down and use more system resources.
Comment 34•7 years ago
|
||
(In reply to James Rome from comment #33)
> No virtual folders. If I remove fonts, I can't get them back easily.
I suggest only removing fonts that you do not use.
> But surely opening all the font files is a bug.
> It must slow things down
Actually, no.
> and use more system resources.
No, not memory, and not CPU, afaik. Only open file handles. You should do comment 18.
Comment 35•7 years ago
|
||
>
> You should do comment 18
... and does disabling spotlight help?
Reporter | ||
Comment 37•7 years ago
|
||
So far, adding file handles fixes things. There is no need to open all my font files, and IMHO that is a bug.
Comment 38•7 years ago
|
||
(In reply to James Rome from comment #36)
> How do I disable spotlight?
bug 1343480 comment 1
Reporter | ||
Comment 39•7 years ago
|
||
Sorry, it hung again after sending a Gmail, and with Spotlight disabled, and with the file numbers boosted.
Reporter | ||
Comment 40•7 years ago
|
||
This time it hung after sending Yahoo IMAP mail
Reporter | ||
Comment 41•7 years ago
|
||
The Apple report
Reporter | ||
Comment 42•7 years ago
|
||
This still happens daily on 58.0b3
Reporter | ||
Comment 43•7 years ago
|
||
Every day it hangs many times. This is getting annoying.
Comment 44•7 years ago
|
||
Probably the best chance of finding the cause is for you to find the regression range using nightly builds. For example starting with version 54 dmg nightly found at http://archive.mozilla.org/pub/thunderbird/nightly/2017/03/2017-03-01-03-02-08-comm-central/
Reporter | ||
Comment 45•7 years ago
|
||
54.0B3 seems to work without crashing.
Reporter | ||
Comment 46•7 years ago
|
||
55.0b2 dies
Comment 47•7 years ago
|
||
> 54.0B3 seems to work without crashing.
> 55.0b2 dies
That's a great start. So we need your help to determine the one day range in the 55.0a1 series.
http://archive.mozilla.org/pub/thunderbird/nightly/2017/03/2017-03-08-03-02-29-comm-central/
http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-12-03-02-06-comm-central/
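Narrowing a range like this to one day is a binary search over build dates. The Python sketch below shows the idea under two assumptions that the thread does not guarantee: that every build before the regression is good and every build after it is bad, and that a tester can answer "does this build hang?" reliably. The `hangs` callback is hypothetical and stands for a manual test of that day's nightly.

```python
from datetime import date, timedelta


def first_bad_build(good, bad, hangs):
    """Bisect build dates between a known-good and a known-bad nightly.

    `hangs(day)` is a hypothetical callback returning True when the nightly
    built on `day` shows the hang. Returns the adjacent (last good, first bad)
    pair of dates.
    """
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        if hangs(mid):
            bad = mid   # regression is at or before `mid`
        else:
            good = mid  # regression is after `mid`
    return good, bad
```

For the 96-day window between the two nightlies above, this needs at most seven tested builds. Because the hang is intermittent, each "good" verdict may need a long observation period before it can be trusted.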
Reporter | ||
Comment 48•7 years ago
|
||
I need a non-daily build. When I install Lightning and reboot, it updates to 5.8. The whole way that TB and Lightning are stored needs fixing. The correct Lightning builds and calendar provider should be in the same directory as TB. It is really hard to find what goes with what.
Reporter | ||
Comment 49•7 years ago
|
||
There is no 55.0b1 to test that is not a nightly that self-updates. So the problem appeared between 54 and 55; I believe that is when I reported it. But looking at the above, it seems that 54 also crashed. I have sent in many hang reports. Don't they point to the issue?
Comment 50•7 years ago
|
||
> I have sent in many hang reports. Don't they point to the issue?
If we had someone to read them. Right now we don't. But you have started to narrow a regression range, which could ultimately be more useful. Unfortunately one cannot get a regression range without testing daily builds.
> The whole way that TB and Lightning are stored needs fixing. The correct Lightning builds and calendar provider should be in the same directory as TB. It is really hard to find what goes with what.
Have you used dailies on a regular basis? I never have a problem with Lightning in daily. You just install the daily and if Lightning doesn't behave install the correct nightly. All Thunderbird daily installs WITHIN THE SAME VERSION from then on should just work.
Comment 51•7 years ago
|
||
(I failed to finish the thought) You just install the Thunderbird daily and if Lightning doesn't behave install the correct version of Lightning. (Or remove the currently installed Lightning and install Thunderbird daily a second time.)
Reporter | ||
Comment 52•7 years ago
|
||
I have been using the candidate builds because the dailies self-update instantly. The candidate builds always say Lightning is incompatible, and I must find the correct one. The lightning-TB version page does not list gdata provider, so I must figure that out too. Yes, they are in the lightning directory, but I have downloaded every version, and there is no way to tell from the files which are correct. Also gdata provider throws all sorts of errors if it is the wrong version, but this is not detected in the install process. Unlike Lightning, gdata provider is not disabled. It may be time for me to switch to Outlook.
Comment 53•7 years ago
|
||
> I have been using the candidate builds because the dailies self-update instantly
This is hardly a show stopper. One simply disables updates.
(In reply to James Rome from comment #45)
> 54.0B3 seems to work without crashing.
Just backtracking a bit, you reported this issue on July 17. It seems to me by then you would have run 54.0b1 and b2 without problems and first saw the problem using 54.0b3. Is that correct?
Reporter | ||
Comment 54•7 years ago
|
||
I would assume so, but it was a while ago.
Comment 55•7 years ago
|
||
(In reply to James Rome from comment #54)
> I would assume so [that 54.0b3 is the first that failed], but it was a while ago.
But that [if you were running 54.0b3 in comment 0] does not square with comment 45 where "54.0B3 seems to work without crashing." Hopefully you are using https://releases.mozilla.org/pub/thunderbird/releases/
In any event, I suggest your next step be setting the Yahoo and Gmail accounts to save sent messages to a local folder, so that we can determine whether this situation involves the IMAP Sent folder.
Reporter | ||
Comment 56•7 years ago
|
||
Good suggestion. I changed the sent folder to local now on 58.0b3. It always manages to send the mail, so you might be correct in your hunch.
Reporter | ||
Comment 57•7 years ago
|
||
Also remember that there has been a long-standing bug about copying IMAP mail to sent folder. Maybe when that was fixed, it caused this issue.
Reporter | ||
Comment 58•7 years ago
|
||
I do believe you have pinned down the problem. I have not had a hang since I made the sent folders local. I am running 58.0b3.
Comment 59•7 years ago
|
||
According to what I have read, gmail automatically saves outgoing mail to gmail Sent folder. So there should be no need for Thunderbird to be set to save to Sent (even though it is default). I don't know about yahoo.
Reporter | ||
Comment 60•7 years ago
|
||
But go back to that bug about copying sent mail to the IMAP folder. It always sends the mail successfully, but hangs after or during the next step. Still have not had a hang using local folders.
Comment 61•7 years ago
|
||
Sure, there is still a bug. Which is why it is so important to us for you to determine a regression range.
Reporter | ||
Comment 62•7 years ago
|
||
It has started to hang again on 58.0b3. Twice today after sending Google mail
Comment 63•7 years ago
|
||
Can you try a nightly build from http://archive.mozilla.org/pub/thunderbird/nightly/latest-comm-central/thunderbird-60.0a1.en-US.mac.dmg If so, what are the results?
Reporter | ||
Comment 64•7 years ago
|
||
It just hung again with 60 nightly. I did move the sent messages back to gmail from local.
Reporter | ||
Comment 65•7 years ago
|
||
Something else is happening with TB 60 also. When it hangs, and I Force quit it, the app in my /Applications/Thunderbird Daily.app gets trashed. I cannot reopen it until I replace it with the version I downloaded. The same thing happens when the daily tries to update itself. TB never restarts, and I must update it manually. And moving sent mail to local did not help the hangs.
Comment 66•7 years ago
|
||
(In reply to James Rome from comment #65)
> The same thing
> happens when the daily tries to update itself. TB never restarts, and I must
> update it manually.
I confirm this issue. For the past week it has happened every time I update Daily via About Daily > Check for updates.
Comment 67•7 years ago
|
||
Perhaps the two of you can conspire to find the regression range.
(In reply to James Rome from comment #64)
> It just hung again with 60 nightly. I did move the sent messages back to
> gmail from local.
(previously sent in PM) To get the gdata and Lightning add-ons to match the nightly being tested: https://ftp.mozilla.org/pub/calendar/lightning/nightly/latest-comm-central/
Comment 68•7 years ago
|
||
m_kato, can you make anything of the stacks here, or in bug 1422251 and bug 1400568 (which are both version 52) I don't know that it is related, but for completeness, ref Bug 1170646 - A few M-C fixes to handle short read in Cache code ( from [META] Failure to deal with short read There is also a newly reported Bug 1440716 - Hanging (imap and smtp) connections and hang on mac OS X
Comment hidden (obsolete) |
Reporter | ||
Comment 70•7 years ago
|
||
Comment 71•7 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #68)
> m_kato, can you make anything of the stacks here, or in bug 1422251 and bug
> 1400568 (which are both version 52)
Maybe these are the same deadlock in CGLClearDrawable. I don't know why this occurs.
Comment 72•6 years ago
|
||
FWIW bug 1369207 reads "On macOS installations where opening a window logs: |[GFX1-]: [OPENGL] Failed to init compositor with reason: FEATURE_FAILURE_OPENGL_CREATE_CONTEXT|, quickly closing the window can deadlock in CGLClearDrawable."
Comment 73•6 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #47)
> > 54.0B3 seems to work without crashing.
> > 55.0b2 dies
>
> That's a great start. So we need your help to determine the one day range in
> the 55.0a1 series.
> http://archive.mozilla.org/pub/thunderbird/nightly/2017/03/2017-03-08-03-02-29-comm-central/
> http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-12-03-02-06-comm-central/
James, we still need a better regression range than
http://archive.mozilla.org/pub/thunderbird/nightly/2017/03/2017-03-08-03-02-29-comm-central/
http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-12-03-02-06-comm-central/
It seems I forgot to mention the regression tool https://mozilla.github.io/mozregression/quickstart.html
Reporter | ||
Comment 74•6 years ago
|
||
Alas, the tool is for Windows only. Because it is so difficult to know and get the correct version of Lightning and the Google provider, I have given up on nightlies. They should be packaged together in the same download directory IMHO.
Comment 75•6 years ago
|
||
I've been seeing this occasionally, in Daily, for quite some time, including today's build. I can't easily give a regression window: sometimes it happens a few times a day, sometimes once a week.
abridged hang txt:
OS Version: Mac OS X 10.13.6 (Build 17G65)
Architecture: x86_64h
Path: /Applications/Thunderbird Daily.app/Contents/MacOS/thunderbird
Identifier: org.mozilla.thunderbird daily
Version: 63.0a1 (63.0a1)
Duration: 4.30s (process was unresponsive for 10 seconds before sampling)
Hardware model: MacBookPro14,1
Active cpus: 4
Heaviest stack for the main thread of the target process:
43 start + 1 (libdyld.dylib + 4117) [0x7fff791f6015]
43 main + 890 (thunderbird + 4474) [0x10681617a]
43 ??? (<A5CA6BA1-E112-3B19-B3AB-EB1683128CAE> + 61943681) [0x10a9c8f81]
43 ??? (<A5CA6BA1-E112-3B19-B3AB-EB1683128CAE> + 61941996) [0x10a9c88ec]
43 ??? (<A5CA6BA1-E112-3B19-B3AB-EB1683128CAE> + 61939604) [0x10a9c7f94]
43 ??? (<A5CA6BA1-E112-3B19-B3AB-EB1683128CAE> + 61180073) [0x10a90e8a9]
43 ??? (<A5CA6BA1-E112-3B19-B3AB-EB1683128CAE> + 41233963) [0x109608e2b]
43 -[NSApplication run] + 764 (AppKit + 223365) [0x7fff4e949885]
43 ??? (<A5CA6BA1-E112-3B19-B3AB-EB1683128CAE> + 41229596) [0x109607d1c]
43 -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 3044 (AppKit + 8224308) [0x7fff4f0eae34]
43 _DPSNextEvent + 2085 (AppKit + 268915) [0x7fff4e954a73]
43 _BlockUntilNextEventMatchingListInModeWithFilter + 64 (HIToolbox + 194692) [0x7fff506a3884]
43 ReceiveNextEventCommon + 613 (HIToolbox + 195334) [0x7fff506a3b06]
43 RunCurrentEventLoopInMode + 286 (HIToolbox + 195990) [0x7fff506a3d96]
43 CFRunLoopRunSpecific + 483 (CoreFoundation + 545107) [0x7fff513b9153]
43 __CFRunLoopRun + 1293 (CoreFoundation + 547053) [0x7fff513b98ed]
43 __CFRunLoopDoSources0 + 300 (CoreFoundation + 550092) [0x7fff513ba4cc]
43 __CFRunLoopDoSource0 + 108 (CoreFoundation + 1430572) [0x7fff5149142c]
43 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 17 (CoreFoundation + 670225) [0x7fff513d7a11]
43 __NSThreadPerformPerform + 334 (Foundation + 426677) [0x7fff534fd2b5]
43 ??? (<A5CA6BA1-E112-3B19-B3AB-EB1683128CAE> + 41001366) [0x1095d0196]
43 -[NSView removeFromSuperview] + 270 (AppKit + 157615) [0x7fff4e9397af]
43 -[NSView _setWindow:] + 2356 (AppKit + 147783) [0x7fff4e937147]
43 -[NSSurface setWindow:] + 53 (AppKit + 2268392) [0x7fff4eb3cce8]
43 -[NSSurface _disposeSurface] + 152 (AppKit + 2269311) [0x7fff4eb3d07f]
43 -[NSNotificationCenter postNotificationName:object:userInfo:] + 66 (Foundation + 26823) [0x7fff5349b8c7]
43 _CFXNotificationPost + 599 (CoreFoundation + 358839) [0x7fff5138b9b7]
43 -[_CFXNotificationRegistrar find:object:observer:enumerator:] + 1664 (CoreFoundation + 362624) [0x7fff5138c880]
43 ___CFXNotificationPost_block_invoke + 225 (CoreFoundation + 633569) [0x7fff513ceae1]
43 _CFXRegistrationPost + 458 (CoreFoundation + 634282) [0x7fff513cedaa]
43 __CFNOTIFICATIONCENTER_IS_CALLING_OUT_TO_AN_OBSERVER__ + 12 (CoreFoundation + 634588) [0x7fff513ceedc]
43 CGLClearDrawable + 41 (OpenGL + 28651) [0x7fff5b8c1feb]
43 _pthread_mutex_lock_slow + 253 (libsystem_pthread.dylib + 5320) [0x7fff7950c4c8]
43 __psynch_mutexwait + 10 (libsystem_kernel.dylib + 117318) [0x7fff79346a46]
*43 psynch_mtxcontinue + 0 (pthread + 31325) [0xffffff7f81967a5d] (blocked by pthread mutex owned by thunderbird (Thunderbird Daily) [3467] thread 0x20330)
Process: thunderbird (Thunderbird Daily) [3467]
Path: /Applications/Thunderbird Daily.app/Contents/MacOS/thunderbird
Architecture: x86_64
Parent: launchd [1]
UID: 501
Task size: 1273.60 MB
CPU Time: 0.055s (124.5M cycles, 25.3M instructions, 4.92c/i)
Note: Unresponsive for 10 seconds before sampling
Note: 1 idle work queue thread omitted
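The key line in a sample like this is the "blocked by pthread mutex owned by ..." annotation, which names the thread holding the lock that the main thread is waiting on. A minimal Python sketch for pulling that out of a hang report follows; the regular expression is written against the single line shown in this comment and is an assumption about the format, not a documented grammar, and `find_mutex_owner` is a hypothetical helper name.

```python
import re

# Pattern inferred from the one sample line in this comment (an assumption).
BLOCKED_RE = re.compile(
    r"blocked by pthread mutex owned by (?P<process>\S+)"
    r".*?\[(?P<pid>\d+)\] thread (?P<thread>0x[0-9a-fA-F]+)"
)


def find_mutex_owner(report_text):
    """Return the process, pid and thread that own the contended mutex.

    Returns None when the text has no "blocked by pthread mutex" annotation.
    """
    match = BLOCKED_RE.search(report_text)
    if match is None:
        return None
    return {
        "process": match.group("process"),
        "pid": int(match.group("pid")),
        "thread": match.group("thread"),
    }
```

The stack of the owning thread (0x20330 in this sample) is the next thing to look at in the full report, since it shows what that thread was doing while holding the mutex.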
Comment 76•6 years ago
|
||
> They should be packaged together in the same download directory IMHO
Should not be difficult - it should only change every 10 weeks when version numbers change. And if you have them bookmarked, then you are good for when the version does change ...
Everyone in the other bugs seems to be gone for the summer :( so I don't think we are going to make progress for some months.
Comment 78•6 years ago
|
||
Glenn from bug 1400568 writes "It’s a lot less frequent than it was, although I’m not sure if I’ve just learned to work around it. It does still happen once a week or so, when it used to happen three or four times a day." So for most of you it has stopped, or it is better in version 60?
Updated•6 years ago
|
Reporter | ||
Comment 79•6 years ago
|
||
It is less frequent, but it still does it.
Comment 80•6 years ago
|
||
Not directly linked to this bug, but possibly related, and this is the closest recently updated bug I found.

Today on Win 10 Pro I got crash report bp-8017d419-a7d9-47ba-ba11-f9a3a0181002, Thunderbird 60.0b11 Crash Report [@ shutdownhang | ntdll.dll@0x6a28c ]. The crash followed closing Thunderbird, which hung on shutdown after it got stuck trying to save a copy of a sent message to the Sent folder on the server via IMAP. The progress bar reached 100% and then hung, never finishing.

Before closing TB, I tried moving between folders (which in some cases has helped by re-connecting/re-authenticating to the server, as indicated in the status bar), flushing DNS, switching to offline mode and back online (does that re-initialise the socket/connection to the server?), etc., but nothing worked. TB was stuck saving the copy of the sent message to the Sent folder. I could still send another message, but the same issue arose with the second message. The messages were sent and received by the recipients, but no copy could be kept in Sent or in a local folder (it would be nice to have such an option when the copy to the Sent folder on the server is incomplete, to avoid losing messages). The only thing that could be done was to close Thunderbird, which then hung and crashed on closing.

As I mentioned in earlier posts elsewhere, I am a VPN user, which may cause the IP of the mail server to change. I wonder whether TB fails to handle that situation by not updating its DNS/socket/cached server connection information and so hangs in some sort of loop, where the only way out is to close and re-open the application, which in this case means losing data (the copy of the message in the Sent folder). Stopping or restarting the VPN had no effect, though. I would expect the disconnect/reconnect button in Thunderbird to suffice to get out of such a hang, but it does not, unfortunately for the user, and data is lost :-)

I hope this information helps narrow the issue down so it can be fixed once and for all. Saving a copy of a sent message in the Sent folder is a basic feature that should not fail; if it does, Thunderbird should clearly indicate why with a clear error message, and allow the user to retry or regain access to the message so it can be saved later, since it has already been sent.
Comment 81•6 years ago
|
||
Another possibility I thought about is that the computer goes to sleep, and when it wakes up something in Thunderbird does not wake up or is no longer valid, which breaks saving the copy to the Sent folder when sending a message. But the issue is not systematic and happens randomly, so it is hard to track down.
Comment 82•6 years ago
|
||
A similar bug has been reported: Bug 413240 "Save to Sent was successful, but "Copying to sent folder" doesn't finish, or zombie "Copy complete""
Reporter | ||
Comment 83•6 years ago
|
||
My computer never goes to sleep, so it is not that.
Comment 84•6 years ago
|
||
I haven't looked at all the attachments to this bug but most seem to be Apple logs. Maybe a TB IMAP log would also be helpful since the problem seems to be saving to the IMAP Sent mailbox. If an error is reported while saving to Sent, the patch in Bug 1366591 should allow the user to choose to save to Local Folders. This also applies to saving Drafts and Templates. But I am not sure this fix is in the versions being tested by the reporter(s) of this bug.
Comment 85•6 years ago
|
||
As the original author of this bug report, I wanted to chime in with a bit of information in response to recent comments. First, I'm glad it's getting some continued attention. It's the kind of bug where you lose work -- it's effectively a crash bug, since you have to force quit.
1. It has repeated consistently with POP as well as IMAP.
2. Possibly related to Sent folder, but upon restart, the copy is always safely in the Sent folder (seems more like it deadlocks on cleanup, closing the progress window, rather than the task itself)
3. Earlier in this bug are extensive stack traces that point to the exact pthread mutexes which deadlock -- that should help to know where to look.
4. It seems like a race condition around mutexes -- always hard to find/fix.
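To make the "race condition around mutexes" idea concrete, here is a generic Python sketch of one common form of mutex deadlock, lock-order inversion. It is not Thunderbird or macOS code, and nothing in this thread establishes that the CGLClearDrawable hang is this exact pattern. All names are hypothetical, and a timeout is used so that the demonstration ends instead of hanging.

```python
import threading


def opposite_order_attempt(timeout=0.2):
    """Two threads take two locks in opposite order; neither gets its second lock."""
    lock_a, lock_b = threading.Lock(), threading.Lock()
    barrier = threading.Barrier(2)
    got_second = {}

    def worker(name, first, second):
        with first:
            barrier.wait()  # both threads now hold their first lock
            acquired = second.acquire(timeout=timeout)
            got_second[name] = acquired
            if acquired:
                second.release()
            barrier.wait()  # keep holding `first` until both attempts finished

    threads = [
        threading.Thread(target=worker, args=("t1", lock_a, lock_b)),
        threading.Thread(target=worker, args=("t2", lock_b, lock_a)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return got_second


def same_order_attempt():
    """Both threads take the locks in the same order, so both complete."""
    lock_a, lock_b = threading.Lock(), threading.Lock()
    finished = []

    def worker(name):
        with lock_a:
            with lock_b:
                finished.append(name)

    threads = [threading.Thread(target=worker, args=(n,)) for n in ("t1", "t2")]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(finished)
```

Without the timeout, the first function would block forever with each thread waiting on the other, which is what a force-quit-only hang looks like from the outside. Such bugs tend to be intermittent because they depend on the exact interleaving of the two threads.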
Comment 86•6 years ago
|
||
https://support.mozilla.org/en-US/questions/1237929 states the cause was a signature file. I wonder if that is the case with some other reports
Comment hidden (obsolete) |
Reporter | ||
Comment 88•6 years ago
|
||
It is still doing this on 65.0b4
Comment hidden (obsolete) |
Comment 91•6 years ago
|
||
This bug report is of course reported long before TB60, but bug 1525001 comment 4 states using beta resolves the issue he reported against TB60.
Can you try the newer beta 66 from https://www.thunderbird.net/en-US/channel/ and report your results?
(fair warning, most addons won't work - calendar will)
Comment hidden (obsolete) |
Comment 93•6 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #92)
Eckard, feel free to try as well.
Note: You'll need to get it from https://www.thunderbird.net/en-US/channel/ AND it might start a new profile
Your link leads to the actual 67.0b1 beta-version.
Should I try the 66.0b3 beta-version from http://ftp.mozilla.org/pub/thunderbird/releases/66.0b3/mac/ or 67.0b1 in a new profile?
Comment 94•6 years ago
|
||
Eckard, I suspect both 66 and 67 should be tested. Ultimately, for whoever can reproduce the failure, we need to know the range in which it was fixed (if it was) so we can know what needs to be uplifted to esr.
Comment hidden (obsolete) |
Comment 96•6 years ago
|
||
There is a link for the Google Calendar on the Release Notes pages:
https://www.thunderbird.net/en-US/thunderbird/67.0beta/releasenotes/
https://www.thunderbird.net/en-US/thunderbird/66.0beta/releasenotes/
Second last item. Right-click to save the XPI file locally, then install add-on manually from file.
Reporter | ||
Comment 97•6 years ago
|
||
67b1 is working with gdata provider. But the crashes have grown more infrequent, so it will take time to know if the problem is fixed.
Reporter | ||
Comment 98•6 years ago
|
||
It is not fixed. 67b1 just hung after I sent gmail.
Comment 99•6 years ago
•
|
||
(In reply to Wayne Mery (:wsmwk) from comment #94)
I cannot reproduce the issue in new profiles with either TB 66.0b3 or TB 67.0b1 (two different IMAP accounts with the French ISP Free, no Gmail or Yahoo accounts).
Comment 100•5 years ago
|
||
Are any more crash reports needed? I have several from my machine (with truncated traces), and one non-truncated from a coworker's machine. Both are modern iMacs: one is almost brand new, with an old Thunderbird profile, while the other is an older machine with a very new Thunderbird profile.
Both report the same crash on CGLClearDrawable + 44. Most of the users in my office (all Mac based) have been suffering from this bug for the last 2-4+ months and it seems to be going up in frequency. We're eager to see a fix.
I can offer up some of my time for testing purposes too.
Thanks,
Comment 101•5 years ago
|
||
(In reply to James Rome from comment #98)
It is not fixed. 67b1 just hung after I sent gmail.
What happens if you point Sent folder to a local folder in account settings?
Comment 102•5 years ago
|
||
(In reply to Scott from comment #100)
...
Both report the same crash on CGLClearDrawable + 44. Most of the users in my office (all mac based) have been suffering ...
I can offer up some of my time for testing purposes too.
Great! Let's start at the same point as others:
- What happens if you point Sent folder to a local folder in account settings? (using current release)
- What happens if you use the beta from https://www.thunderbird.net/en-US/channel/ ?
Reporter | ||
Comment 103•5 years ago
|
||
I pointed to local now. We shall see.
Comment 104•5 years ago
•
|
||
workaround |
https://support.mozilla.org/en-US/questions/1242885#answer-1179483 suggests disabling send progress helps, which might implicate graphics
Comment 105•5 years ago
|
||
(In reply to Wayne from comment #102)
Hi Wayne,
#1 Most of my users do not save Sent mail at all, so I think this can be ruled out. We use O365, which rolled out automatic filing of sent emails to "Sent Items" folder. We had to disable Thunderbird saving an extra copy - to avoid duplicates - in Fall 2018 (around Sept 17th is when it started)
#2 I installed the beta version yesterday, I haven't had any hangs yet, but I haven't had to send many emails. In wait and see mode currently.
#3 I will look into this setting and report back.
Comment 106•5 years ago
|
||
(In reply to Wayne from comment #102)
Further to #3:
My own preferences are set somewhat correctly: sendInBackground is set to False and show_send_progress is set to True, but offline.send.unsent_messages is set to 0, not 1 (as recommended in the other bug thread). However, the reporter in that thread indicated he used the opposite settings, which made his highly reproducible crash go away.
I have checked my own machine (both original v60 and beta) and both have matching settings. I also checked with one of my colleagues who is one of the only people in my office who does not experience the problem and he has the same settings as myself.
I will try toggling the settings once I experience another crash.
Thanks!
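For anyone comparing these settings across machines, here is a minimal Python sketch that pulls the three prefs out of the text of a profile's prefs.js. It assumes the usual `user_pref("name", value);` line format. The comment above gives only short names, so the sketch matches by suffix rather than guessing full pref names, and the function names are made up for illustration. Prefs still at their default value are normally not written to prefs.js, so a missing entry means "default", not "unset".

```python
import re

# Matches lines such as: user_pref("some.pref.name", false);
PREF_LINE = re.compile(r'^user_pref\("([^"]+)",\s*(.+?)\);\s*$')

# Short names taken from the comment above. Full pref names (with any
# branch prefix) are not given there, so matching is by suffix only.
WATCHED_SUFFIXES = (
    "sendInBackground",
    "show_send_progress",
    "offline.send.unsent_messages",
)


def parse_pref_value(raw):
    """Convert the raw JS literal of a pref into a Python value."""
    if raw in ("true", "false"):
        return raw == "true"
    if re.fullmatch(r"-?\d+", raw):
        return int(raw)
    return raw.strip('"')


def watched_prefs(prefs_text):
    """Return {pref_name: value} for prefs ending in a watched suffix."""
    found = {}
    for line in prefs_text.splitlines():
        match = PREF_LINE.match(line.strip())
        if match and match.group(1).endswith(WATCHED_SUFFIXES):
            found[match.group(1)] = parse_pref_value(match.group(2))
    return found
```

Running `watched_prefs(open(path_to_prefs_js).read())` on two machines and comparing the resulting dictionaries shows at a glance whether the send-related settings differ.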
Comment 107•5 years ago
|
||
(In reply to Wayne from comment #102)
Beta Build 67.0b3 (64 Bit) just crashed for me.
It was running a fresh profile, IMAP with no local folders, and it was saving to an IMAP Sent folder (I hadn't thought to disable that; it is now disabled).
I have an un-truncated crash log for it too, but it reports the same CGLClearDrawable error (+41 instead of +44).
Excerpt:
55 CFNOTIFICATIONCENTER_IS_CALLING_OUT_TO_AN_OBSERVER + 12 (CoreFoundation + 634300) [0x7fff3f561dbc]
55 CGLClearDrawable + 41 (OpenGL + 28651) [0x7fff49a56feb]
55 _pthread_mutex_lock_slow + 253 (libsystem_pthread.dylib + 5320) [0x7fff677924c8]
55 __psynch_mutexwait + 10 (libsystem_kernel.dylib + 117318) [0x7fff675cca46]
*55 psynch_mtxcontinue + 0 (pthread + 31325) [0xffffff7f82a10a5d] (blocked by pthread mutex owned by thunderbird (Thunderbird) [416] thread 0x189b)
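To check quickly whether a log shows the same signature as the excerpt above, a small Python sketch like the following can parse the frame lines. The regular expressions are written against the five lines quoted here only; other sample or crash report layouts may need adjusting, and the function names are hypothetical.

```python
import re

# One frame line: optional "*", sample count, "symbol + offset",
# "(module + offset)", then the address in square brackets.
FRAME = re.compile(
    r"^\*?(?P<count>\d+)\s+(?P<symbol>\S+) \+ (?P<offset>\d+) "
    r"\((?P<module>[^)]+?) \+ \d+\) \[(?P<address>0x[0-9a-f]+)\]"
)

# Trailing note naming the thread that owns the contended mutex.
OWNER = re.compile(
    r"blocked by pthread mutex owned by (?P<process>\S+) "
    r".*thread (?P<thread>0x[0-9a-f]+)"
)


def parse_frames(excerpt):
    """Return one dict per recognised frame line, in stack order."""
    frames = []
    for line in excerpt.splitlines():
        match = FRAME.match(line.strip())
        if not match:
            continue
        owner = OWNER.search(line)
        frames.append({
            "symbol": match.group("symbol"),
            "offset": int(match.group("offset")),
            "module": match.group("module"),
            "blocked_by_thread": owner.group("thread") if owner else None,
        })
    return frames


def shows_cgl_mutex_wait(frames):
    """True if CGLClearDrawable is followed by a mutex-related frame."""
    symbols = [frame["symbol"] for frame in frames]
    if "CGLClearDrawable" not in symbols:
        return False
    below = symbols[symbols.index("CGLClearDrawable") + 1:]
    return any("mutex" in symbol for symbol in below)
```

On the excerpt above, the last frame reports the owning thread (0x189b), which is the thread whose stack is worth reading next in the full log.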
Reporter | ||
Comment 108•5 years ago
|
||
It hung again sending IMAP mail, and Sent mail is a local folder.
Reporter | ||
Comment 109•5 years ago
|
||
It has hung twice today. But I have put sent mail back to Google.
Comment 110•5 years ago
|
||
Just happened to me in TB 67.0b2 (64-bit) while sending message, the save to Sent folder on IMAP server while online ok, progressed up to 89% and then got stuck somehow... see attached... I left it as is but after few hours no progress, TB still in processing status :-)
Those are errors appearing in console fyi... if that can be of any help...
NS_ERROR_ILLEGAL_VALUE: Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIAbDirectoryQuery.doQuery]
nsAbLDAPAutoCompleteSearch.js:261
TypeError: complistItem.occurrence is undefined
agenda-listbox.js:1041:13
TypeError: this.mItemInfoCache[aNewItem.id] is undefined
calDavCalendar.js:758:28
NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS: [JavaScript Error: "this.mItemInfoCache[aNewItem.id] is undefined" {file: "jar:file:///C:/Users/richard/AppData/Roaming/Thunderbird/Profiles/cnant748.default/extensions/%7Be2fda1a4-762b-4020-b5ad-a41df1933103%7D.xpi!/components/calDavCalendar.js" line: 758}]'[JavaScript Error: "this.mItemInfoCache[aNewItem.id] is undefined" {file: "jar:file:///C:/Users/richard/AppData/Roaming/Thunderbird/Profiles/cnant748.default/extensions/%7Be2fda1a4-762b-4020-b5ad-a41df1933103%7D.xpi!/components/calDavCalendar.js" line: 758}]' when calling method: [calICalendar::modifyItem] calAlarmService.js:174
Assert failed: aElement
calUtils.jsm:147
ASSERT resource://calendar/modules/calUtils.jsm:147
setElementValue chrome://calendar/content/calendar-ui-utils.js:32
setBooleanAttribute chrome://calendar/content/calendar-ui-utils.js:91
enableTimeIndicator chrome://calendar/content/calendar-multiday-view.xml:2573
view_XBL_Constructor chrome://calendar/content/calendar-multiday-view.xml:2534
Could not start Browser Toolbox, you need to enable it. ToolboxProcess.jsm:77:13
init resource://devtools/client/framework/ToolboxProcess.jsm:77
oncommand chrome://messenger/content/messenger.xul:1
TypeError: this.gViewSourceUtils is undefined
webconsole.js:168:5
Comment 111•5 years ago
|
||
Clicking on the red cross icon to close the previous prompt raises a new prompt: Save message > Your message was sent but a copy...etc... as per attached...
Comment 112•5 years ago
|
||
(In reply to Richard in comment #110)
That console report references an error occurring in your calendar extension (calDavCalendar.js). Does the crash log also report a "CGLClearDrawable" and/or mutex error? If not, this may be a distinct issue.
Wayne perhaps can provide more guidance.
Comment 113•5 years ago
|
||
TB does not crash in my case, it just hangs in processing mode (as shown by the processing icon, a blue circle, on the main tab)... Also worth mentioning: I am using TB on Windows... sorry, I only just recalled that this bug refers to macOS...
Comment 114•5 years ago
|
||
(In reply to Richard in comment #113)
The bug is affecting both, but seems more prevalent in MacOS. Hang is perhaps the correct term, the app gets stuck, and doesn't continue during (?) or after the moment the sent email progress bar closes. In my experience Thunderbird stays open and does not close, but has hung/is no longer responding to user input. You have to do the equivalent of a force quit (or control alt delete) and manually force it to close, and then generate a bug report.
Comment 115•5 years ago
|
||
I switched back to using the release version this morning (as it has my local email folders and message filters). I will keep an eye on it, but I feel like it's crashing more frequently than the beta does. The beta doesn't remove the problem, though (as I and others have pointed out).
It could also just be the transient nature of the bug, some days it happens more than others.
Comment 118•5 years ago
|
||
In Reply to Wayne in comment #116
Upgraded today and got a crash a little while ago. No log was generated, but I would assume it is the same crash. Will confirm as soon as I see a log.
Comment 119•5 years ago
|
||
Are we discussing more than one issue here? When I see the problem I have, which happens about once per day (for months, and still, using current Daily), on MacOS, I get the spinning beachball, and I need to Force Quit. There’s no possibility to get a log here, is there?
Are people talking about getting logs, and seeing CGLClearDrawable, seeing a different issue than me? I am using IMAP Sent folder, and I only see it immediately after a successful mail send. But I send many emails per day, yet it only hangs once or twice per day.
Comment 120•5 years ago
|
||
(In reply to Calum Mackay from comment #119)
It could be, probably is?
I am getting logs intermittently now.
The crash always occurs during the sending process. Usually - if not always, when I have moved on to do something else whilst the email completes. Thunderbird will hang (with the beach ball) and needs to be force quit. Every log I have examined (this includes multiple workstations in our office) includes the CGLClearDrawable handle.
In my experience, your usage of a sent folder doesn't affect the crash as Thunderbird in our environment (O365) does not even save its own sent messages and still crashes. Crashes seem less frequent in the Beta releases, but still happen.
Some of my users have started (temporarily) using Apple Mail again to avoid incessant crashes.
Comment 121•5 years ago
|
||
(In reply to Scott from comment #120)
thanks Scott; how are you getting logs following the force quit?
Comment 122•5 years ago
|
||
(In reply to Calum Mackay from comment #121)
(In reply to Scott from comment #120)
thanks Scott; how are you getting logs following the force quit?
Good question. That explains why it doesn't give me a log every time, but it does intermittently. I assumed it was normal behavior.
Comment 123•5 years ago
|
||
(In reply to Scott from comment #122)
thanks Scott; how are you getting logs following the force quit?
Good question. That explains why it doesn't give me a log every time, but it does intermittently. I assumed it was normal behavior.
thanks. I never get logs when mine does this. Or perhaps I'm just not leaving it long enough before doing the Force Quit.
Comment 124•5 years ago
|
||
(In reply to Calum Mackay from comment #123)
thanks. I never get logs when mine does this. Or perhaps I'm just not leaving it long enough before doing the Force Quit.
I have gotten them in both situations. If it's been too long since the crash, the log is truncated and not very useful. If it's more recent, it will have more information on the thread that crashed.
Comment 125•5 years ago
|
||
I am regularly experiencing the same issue: Thunderbird hangs after sending an email with the spinning beach ball mouse cursor. I'm on MacOS 10.13.6. Sometimes, but not always, I receive an Apple bug report dialog along with a crash log after Force-Quitting. I'll attach my crash log to the ticket.
Comment 126•5 years ago
|
||
Comment 127•5 years ago
|
||
(In reply to Edmond from comment #125)
I am regularly experiencing the same issue: Thunderbird hangs after sending an email with the spinning beach ball mouse cursor. I'm on MacOS 10.13.6. Sometimes, but not always, I receive an Apple bug report dialog along with a crash log after Force-Quitting. I'll attach my crash log to the ticket.
Looks like the same bug to me - based on your crash report. Welcome :)
Comment 128•5 years ago
|
||
We may need a Mac-expert for this bug, and bug 1398807, and bug 1422251.
Also, anyone with a 4K or 5K monitor?
Comment 129•5 years ago
|
||
(In reply to Calum Mackay from comment #119)
Are we discussing more than one issue here?
Yes, very possible (even probable) people are seeing more than one issue - even more so because the problem comments span two years, and multiple versions.
(In reply to James Rome from comment #74)
Alas, the [mozregression] tool is for windows only.
No, because there is a command line tool for Mac. So if this is a regression in the version 54 time frame and someone can reliably reproduce this, then the regression range should be easy to get.
Comment 130•5 years ago
|
||
See also Bug 1334549... which looks like a similar issue... if that can help...
Comment 131•5 years ago
|
||
And also FYI Bug 1257235...
Comment 132•5 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #128)
We may need a Mac-expert for this bug, and bug 1398807, and bug 1422251.
Also, anyone with a 4K or 5K monitor?
What constitutes a Mac-expert?
I am running a Mac, w/ a built-in 5K display, and experiencing the bug.
Comment 133•5 years ago
|
||
Someone who can debug it and write a patch to fix it, I suppose.
Comment 134•5 years ago
|
||
Would recording Performance via DevTools (Ctrl+i on Windows to open it) help to identify the issue?
Any end-user encountering the issue on Mac may be able to do it:
- prepare msg ready to send
- open DevTools (you can keep it open in parallel with TB)
- go to Performance tab (if missing, make sure the Performance option is ticked in DevTools settings)
- start recording Performance
- send msg
- if msg send ok, stop recording Performance, and delete profile created. Repeat process above till a msg fails to send.
OR - if msg not sending properly, wait a bit, then stop recording Performance. Then save the performance profile recorded and post it here.
Maybe that can help identify what TB is doing when sending but not completing process somehow...
Would that help?
Comment 135•5 years ago
|
||
(In reply to Scott from comment #132)
(In reply to Wayne Mery (:wsmwk) from comment #128)
We may need a Mac-expert for this bug, and bug 1398807, and bug 1422251.
Also, anyone with a 4K or 5K monitor?
What constitutes a Mac-expert?
The starting point would be someone who can specifically identify the steps to reproduce, or even better get the regression range.
https://mozilla.github.io/mozregression/quickstart.html and for Mac use command line.
I am running a Mac, w/ a built-in 5K display, and experiencing the bug.
If it were an external display I would ask whether changing the display to 4k or non-4k eliminates the problem. Which brings us back to regression range.
Comment 136•5 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #135)
The starting point would be someone who can specifically identify the steps to reproduce, or even better get the regression range.
https://mozilla.github.io/mozregression/quickstart.html and for Mac use command line.
What range do you want to be looked at?
Comment 137•5 years ago
|
||
Note that I originally filed bug 1400568 two years ago (https://bugzilla.mozilla.org/show_bug.cgi?id=1400568) which, in the followup comments, showed a mutex lock in CGLClearDrawable. It was not on a 5k monitor or even a 4k monitor. Recent comments show that it's still exactly the same bug.
What is needed is not so much a Mac expert, but a thread/mutex expert. These are hard bugs to find/fix, but not impossible. Usually it involves reading the code carefully around the mutexes, looking for race conditions and false assumptions.
I suspect that there is an assumption lurking in the mutex code, some intermediate operation (displaying the progress bar, most likely) that takes a variable amount of time, and sometimes the race condition causes the mutex never to unlock. I would try commenting out the progress window completely (who needs 'em anyway?).
After two years of flailing around in Bugzilla, I have lost interest in this bug, unfortunately, but I hope somebody fixes it.
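To make the kind of bug described above concrete, here is a generic Python sketch of a lock-order inversion: two threads take the same two locks in opposite order. This is an illustration only, not Thunderbird or macOS code. The thread names "ui" and "send" and the function name are hypothetical, and nothing in this report confirms that the real deadlock has this exact shape. Timeouts are used so the demo reports the stuck state instead of hanging forever.

```python
import threading


def run_opposite_order(timeout=0.2):
    """Two threads take two locks in opposite order; report who got both."""
    lock_a = threading.Lock()
    lock_b = threading.Lock()
    both_hold_first = threading.Barrier(2)
    both_tried_second = threading.Barrier(2)
    got_second = {}

    def worker(name, first, second):
        with first:
            # Wait until each thread owns its first lock.
            both_hold_first.wait()
            # With a plain acquire() this call would block forever.
            acquired = second.acquire(timeout=timeout)
            got_second[name] = acquired
            # Keep `first` held until both threads have made their attempt.
            both_tried_second.wait()
            if acquired:
                second.release()

    threads = [
        threading.Thread(target=worker, args=("ui", lock_a, lock_b)),
        threading.Thread(target=worker, args=("send", lock_b, lock_a)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return got_second
```

Neither thread obtains its second lock. In real code the two acquisitions are usually far apart and only overlap under particular timing, which would be consistent with a hang that shows up once or twice a day rather than on every send. The usual fix is to make every code path take the locks in one agreed order.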
Comment 138•5 years ago
|
||
(In reply to Richard Leger from comment #136)
...
What range do you want to be looked at?
Try July 1 2017 to August 10 2017. Hard to be more exact, and it depends on whether this is reproducible in nightlies. If it doesn't fail on July 1, try a month or two earlier.
Comment 139•5 years ago
|
||
(In reply to Glenn Reid from comment #137)
originally filed bug 1400568 two years ago [against 52.3.0 2017-09-16] which ... showed a mutex lock in CGLClearDrawable. It was not on a 5k monitor or even a 4k monitor.
Good point. It may not be a trigger, or requirement, for most.
Comment 141•5 years ago
|
||
There is an interesting development in bug 1422251 comment 18 - the reporter stopped having trouble after moving to beta 67. If that holds for others then version 68 should work better.
Comment 142•5 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #138)
(In reply to Richard Leger from comment #136)
...
What range do you want to be looked at?
Try July 1 2017 to August 10 2017. Hard to be more exacting - and depends on whether this is reproducible in nightlies. If doesn't fail in July 1, try a month or two earlier.
Actually, I forgot that we had earlier determined a probable range of
http://archive.mozilla.org/pub/thunderbird/nightly/2017/03/2017-03-08-03-02-29-comm-central/
http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-12-03-02-06-comm-central/
based, iirc, on James' report of seeing this in beta 54
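As a rough guide to the effort involved, the Python sketch below estimates how many nightlies a bisection of that range needs and builds a matching mozregression command line. The `--app`, `--good` and `--bad` options are written as I understand them from the mozregression documentation, so please check `mozregression --help` before relying on them. The helper names are made up for illustration.

```python
from datetime import date


def bisection_steps(good, bad):
    """Upper bound on builds to test when halving a range of nightlies."""
    days = (bad - good).days
    if days <= 1:
        return 0  # adjacent builds: nothing left to test in between
    # Integer form of ceil(log2(days)).
    return (days - 1).bit_length()


def mozregression_command(good, bad, app="thunderbird"):
    """Build the argument list only; nothing is executed here."""
    return [
        "mozregression",
        "--app", app,
        "--good", good.isoformat(),
        "--bad", bad.isoformat(),
    ]


good = date(2017, 3, 8)
bad = date(2017, 6, 12)
print(bisection_steps(good, bad))
print(" ".join(mozregression_command(good, bad)))
```

For the range above (96 days) this comes to at most 7 builds. Since the hang is intermittent, each "good" verdict needs enough real use to be trusted, and that, not the number of builds, is the expensive part.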
Comment 143•5 years ago
|
||
I don't see any Thunderbird debug symbols in the various apple log dumps. Seeing TB functions show up in those logs might give some more clues? I'm not sure which builds (if any) have debug symbols in them - nightly, maybe?
I've no idea if the apple log will show up TB symbols even if they're there, but it should do.
Other random thoughts:
A GL mutex lock kind of implies that GUI code is being called from multiple threads, but I was under the impression that all the GUI code in TB was driven via main-thread-only javascript. And that has warning lights all over it which go off if it's called from the wrong thread.
But yeah, ultimately this kind of bug really needs someone running a debug build on a mac, catching it a debugger, and poking about to see what's locked up.
Comment 144•5 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #139)
(In reply to Glenn Reid from comment #137)
originally filed bug 1400568 two years ago [against 52.3.0 2017-09-16] which ... showed a mutex lock in CGLClearDrawable. It was not on a 5k monitor or even a 4k monitor.
Good point. It may not be a trigger, or requirement, for most.
The display resolution was speculation from me, trying to explain why it was affecting more Macs than Windows machines. They commonly have higher resolution screens and less powerful graphics cards, leading (presumably) to lower refresh rates, which is what I was trying to get at with the suggestion :)
Someone who better understands the mutex function may be able to simply rule this out though!
Comment 145•5 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #141)
There is an interesting development in bug 1422251 comment 18 - the reporter stopped having trouble after moving to beta 67. If that holds for others then version 68 should work better.
Still occurs in 69.0b3.
Comment 146•5 years ago
|
||
As a test I manually sent about 55 emails at random times over a period of several hours and haven't seen a problem. This is with a MacBook Air and TB version 60.8.0. I was sending from a non-gmail account to a gmail account. Save to Sent on non-gmail worked fine with no lock-ups. The messages were not huge and mostly just old emails archived from a mailing list. The display on the MacBook Air (running Mavericks 10.9.5) is not very high res, 1280x800.
Does anyone cc'd on this bug see the problem on mbAir?
Reporter | ||
Comment 148•5 years ago
|
||
Another clue. I just sent an e-mail, and it hung while downloading a message--presumably the copy of the message I sent. See attached image.
Reporter | ||
Comment 149•5 years ago
|
||
Comment 150•5 years ago
|
||
(In reply to James Rome from comment #148)
Another clue. I just sent an e-mail, and it hung while downloading a message--presumably the copy of the message I sent. See attached image.
Any evidence this is the same crash (crash log showing same types of mutex locks/clear drawable issues)... I have personally experienced the crash probably close to 50 times now and not seen an issue downloading messages.
fyi, in bug#1547339 before it was closed-as-DUP, I reported hitting this problem when running Thunderbird 60.8.0 on Mac OSX 10.14.6 (and list of other previous versions). If it helps, I had attached crash-dumps in bug#1547339 over the last few months.
(aside: Thanks to :wsmwk for connecting these two tickets)
Updated•5 years ago
|
Comment 158•5 years ago
|
||
Thanks to everyone who contributed their system configs, noted in user story
Comment 159•5 years ago
|
||
For what it's worth, the problem I was seeing seems recently (within the last month or so) to have stopped.
For a long time (a year or more), a few times a week, TB would stop responding, immediately after sending a message (with IMAP Sent folder). I had to Force Quit, and never once got a stack trace or report. It belatedly occurs to me that I've not seen this happen for several weeks now. I'd been away for a couple of weeks holiday; having now been back a few weeks, I don't think it's happened since before my holiday, so that's at least a month.
I'm running Daily builds, updated daily (or when they change), on MacOS 10.14.6 (currently).
I appreciate this doesn't help much; just as a data point.
Updated•5 years ago
|
Comment 160•5 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #142)
(In reply to Wayne Mery (:wsmwk) from comment #138)
(In reply to Richard Leger from comment #136)
...
What range do you want to be looked at?
Try July 1 2017 to August 10 2017. Hard to be more exacting - and depends on whether this is reproducible in nightlies. If doesn't fail in July 1, try a month or two earlier.
Actually, I forgot that we had earlier determined a probable range of
http://archive.mozilla.org/pub/thunderbird/nightly/2017/03/2017-03-08-03-02-29-comm-central/
http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-12-03-02-06-comm-central/
based, iirc, on James' report of seeing this in beta 54
As suggested, I ran a bisection from 2017-03-08 to 2017-06-13 with mozregression on Windows 10 Pro x64 with 32-bit Thunderbird...
Bisection setup
- New profile (re-used)
- IMAP/SMTP mailbox setup - keep message - sync most recent 30 days
- Use each TB version for a full day before moving on to the next...
Results (see also attached):
Start 55.0a1 (2017-04-26) - good
55.0a1 (2017-05-20) - good
55.0a1 (2017-06-01) - good
55.0a1 (2017-06-07) - good
55.0a1 (2017-06-10) - bad (one time) - TB could not save a copy of the sent email to the Sent folder on the server and was stuck in processing. Disconnecting the network on the computer and reconnecting caused a TB prompt to appear asking to Retry, after which it worked.
55.0a1 (2017-06-09) - good for sending items, but bad (one time) for "Copying message to Draft folder...". When trying to access mail folders while that happened, the error popup "Could not connect to mail server xxx. Connection was refused." appeared... this one could have been an issue on the server side (temporarily disconnected due to a kernel update)
At the end there was a message saying it did not have enough info to establish a bisection of the code, or something like that... not sure what it means...
Bisection Information for the bad one:
app_name: thunderbird
build_date: 2017-06-10
build_file: C:\Users\richard.mozilla\mozregression\persist\2017-06-10--comm-central--thunderbird-55.0a1.en-US.win32.zip
build_type: nightly
build_url: https://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-10-03-02-05-comm-central/thunderbird-55.0a1.en-US.win32.zip
changeset: 3e3745b52dc53eb74efd73d3107a81e2e13f94be
pushlog_url: https://hg.mozilla.org/comm-central/pushloghtml?fromchange=e915a8b2f1f505d7c4fa1820f35ce99d1164b293&tochange=3e3745b52dc53eb74efd73d3107a81e2e13f94be
repo_name: comm-central
repo_url: https://hg.mozilla.org/comm-central
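For anyone who wants to work with this block programmatically, a minimal Python sketch follows. It assumes the "key: value" layout shown above and uses only the standard library. The function names are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def parse_bisection_info(text):
    """Turn 'key: value' lines into a dict; other lines are ignored."""
    info = {}
    for line in text.splitlines():
        # Split on the first ": " only, so URLs and Windows paths survive.
        key, sep, value = line.partition(": ")
        if sep:
            info[key.strip()] = value.strip()
    return info


def pushlog_range(info):
    """Return (fromchange, tochange) from the pushlog_url entry."""
    query = parse_qs(urlsplit(info["pushlog_url"]).query)
    return query["fromchange"][0], query["tochange"][0]
```

Applied to the block above, `tochange` comes out equal to the `changeset` value, so the pushlog covers everything from the last build marked good up to this bad build.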
Could it be that Thunderbird has an issue connecting to the IMAP server to file the copy of the message, and it does not retry properly or identify the issue to prompt the user to retry?
We use a dedicated CA cert for SSL validation, which is added to the Thunderbird certificate store prior to using the account...
Comment 161•5 years ago
|
||
I have just had another hang. The email has been copied to the (IMAP) Sent folder, so the hang must have appeared afterwards.
Comment 162•5 years ago
|
||
I’m on the beta update channel and running 70.0b1.
Comment 163•5 years ago
|
||
All: I've been following this thread with interest. TB is one of the main tools I use in my business, and these frequent hangs get in the way. So to all of you: THANK YOU for thinking about this. A few minutes ago, TB hung again. After issuing a Force Quit, I captured the data that normally goes to Apple. It may be huge, but it may also shed some light. Here we go with an edit/paste. Sincerely, Bob
Reporter | ||
Comment 164•5 years ago
|
||
I hate to throw a wrench into this effort, but I moved sent mail to a local folder, and TB still hangs. Maybe not so often though...
Updated•5 years ago
|
Comment 165•5 years ago
|
||
This specific signature seems to be missing in the report so far.
This is Thunderbird 60.9.0 (64-bit) on macOS High Sierra (10.13.6). I'm using IMAP4 to gmail, and a gmail SMTP server. This also applies to some recent Thunderbird versions.
After sending an email, Thunderbird sometimes stalls, requiring ForceQuit and restart. In all cases, the message gets sent, and gets copied into my "Sent Mail" folder. After the restart all is well (until the next time).
This is 100% reproducible: The stall happens ONLY if I click on the main window before the message window disappears, or within a few seconds. If I remember to wait at least 5 seconds after the send window disappears, it does not stall and all is well.
This has been happening for several months, over several Thunderbird updates. IIRC it was worse on earlier versions, but I have modified my behavior to minimize it.
I believe that this also applies to an NNTP send to giganews and IMAP4 save to gmail.
I also believe that it is OK to click on any other application's window without stalling Thunderbird.
I don't have good statistics on these, however.
Comment 166•5 years ago
|
||
Richard, thank you for your scientific study. And it is significant that it correlates with James's earlier findings. If accurate, others should find that https://ftp.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-10-03-02-05-comm-central/thunderbird-55.0a1.en-US.mac.dmg shows the problem.
Can we determine the cause from https://hg.mozilla.org/comm-central/pushloghtml?startdate=2017-06-07&enddate=2017-06-10 (blissfully small list) ?
Bug 1364977 ?
Comment 167•5 years ago
|
||
Hmm, we've got two slightly different ranges here:
pushlog_url: https://hg.mozilla.org/comm-central/pushloghtml?fromchange=e915a8b2f1f505d7c4fa1820f35ce99d1164b293&tochange=3e3745b52dc53eb74efd73d3107a81e2e13f94be
from comment #160 and
https://hg.mozilla.org/comm-central/pushloghtml?startdate=2017-06-07&enddate=2017-06-10
from comment #166.
If I read the bug summary, it talks about an IMAP issue when copying the sent message to the Sent folder. We also read:
No problem if Sent is set to local folder. Deadlock on CGLClearDrawable.
So it's possible that the regression range is correct, but if really CGLClearDrawable is a problem, then we shouldn't look for the issue in C-C. The equivalent date range based on dates is:
https://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2017-06-07&enddate=2017-06-10
Looking at bug 1364977 we see that for TB it added the removal of a command observer for housekeeping purposes. We note that this observer is also used in editor.js
https://searchfox.org/comm-central/search?q=obs_documentCreated&case=false&regexp=false&path=editor.js
where it's not removed.
I doubt that this change caused any problems, but I'm happy to provide a try build with that change removed. I guess you want a build for Mac, so would you like this to be based on TB 68 ESR, TB 70 beta or trunk? Note that the code added in bug 1364977 runs when the compose window closes, so it's not entirely impossible that that has some bad effect onto other things happening at the same time.
Comment 168•5 years ago
|
||
(In reply to Jorg K (GMT+2) from comment #167)
Hmm, we've got two slightly different ranges here:
pushlog_url: https://hg.mozilla.org/comm-central/pushloghtml?fromchange=e915a8b2f1f505d7c4fa1820f35ce99d1164b293&tochange=3e3745b52dc53eb74efd73d3107a81e2e13f94be
from comment #160 and
https://hg.mozilla.org/comm-central/pushloghtml?startdate=2017-06-07&enddate=2017-06-10
from comment #166.
The range in comment #166 is approximated from comment 160 so it could be applied to mozilla-central.
If I read the bug summary, it talks about an IMAP issue when copying the sent message to the Sent folder. We also read:
No problem if Sent is set to local folder. Deadlock on CGLClearDrawable.
So it's possible that the regression range is correct, but if really CGLClearDrawable is a problem, then we shouldn't look for the issue in C-C. The equivalent date range based on dates is:
https://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2017-06-07&enddate=2017-06-10
I too suspect the issue will be in M-C. But a regression range of three days may be challenging to get a hit. Even a one day range is challenging.
Richard (or anyone else??), can you run regression tests to see if you can narrow 2017-06-07 thru 2017-06-10 to a one day range?
... I'm happy to provide a try build with that change removed.
Let's see first if Richard has positive results
Attached is crash while sending email today using TB 60.9.0 on OSX 10.14.6.
I just noticed that I am on TB 60.9.0 (64-bit). While the about panel claims I am on "the release update channel", I see on https://www.thunderbird.net/en-US/thunderbird/releases/ that there are newer 68.1.x releases available for manual download, even though my TB installation does not see them. Will investigate here, but flagging in case that helps with narrowing the debugging range.
Comment 171•5 years ago
|
||
(In reply to John O'Duinn [:joduinn] (please use "needinfo?" flag) from comment #169)
...
I just noticed that I am on TB 60.9.0 (64-bit). While the about panel claims I am on "the release update channel", I see on https://www.thunderbird.net/en-US/thunderbird/releases/ that there are newer 68.1.x releases available for manual download, even though my TB installation does not see them.
That is because updates from version 60 to version 68 are not currently enabled. But that wouldn't help you with this issue anyway.
Comment 172•5 years ago
Still no reply re. comment #167: Do you want a try build? For Mac? Based on which version of TB?
Comment 173•5 years ago
I'm using 68.1.2 and mine crashes multiple times daily; the thread dump includes the CGLClearDrawable call (and a few calls beneath that).
My messages ALWAYS send successfully, and the "Sent" message copy is also saved to the server correctly. So no data is lost for me. Seems to be related to mouse activity connected with the main window while sending a message -- not just clicking. I do a lot of scrolling with the touchpad, and I often don't wait for a message window to close before moving on with my life, so I'm surely generating mouse events on the main window during and after the message window closes.
I have only Lightning and Enigmail extensions installed. No funny business with custom x509 certificates.
I'm happy to try previous versions to see what happens. I'm also happy to try a custom build to test out a fix. I'll even run it in a debugger if that will help in any way. I'm desperate to get this resolved, but I'm not sure how best to help.
Comment 174•5 years ago
(In reply to Jorg K (GMT+2) from comment #167)
....
I doubt that this change caused any problems, but I'm happy to provide a try build with that change removed. I guess you want a build for Mac, so would you like this to be based on TB 68 ESR, TB 70 beta or trunk? Note that the code added in bug 1364977 runs when the compose window closes, so it's not entirely impossible that it has some bad effect on other things happening at the same time.
Most reporters are using esr, so 68 ESR please. And if that doesn't work out then Beta 70.
Comment 176•5 years ago
Mac try build started:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=189dafbb360eab163171568640014c3829d1f4dd
Since I've been doing merges and uplifts yesterday, this will be a TB 68.2.0 ESR pre-release. I'll paste the path to the binary here later.
Comment 177•5 years ago
Reporter | ||
Comment 178•5 years ago
What is the google provider that goes with this, and where do we get it?
Comment 179•5 years ago
The one from ATN for TB 68, it's just a regular add-on.
Reporter | ||
Comment 180•5 years ago
Sorry, this test build hung already.
Comment 181•5 years ago
(In reply to James Rome from comment #180)
Sorry, this test build hung already.
I had one too. Looked eerily similar but no crash report unfortunately.
Comment 182•5 years ago
(In reply to Wayne Mery (:wsmwk) from comment #168)
(In reply to Jorg K (GMT+2) from comment #167)
So it's possible that the regression range is correct, but if really CGLClearDrawable is a problem, then we shouldn't look for the issue in C-C. The equivalent date range based on dates is:
https://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2017-06-07&enddate=2017-06-10
I too suspect the issue will be in M-C. But a regression range of three days may be challenging to get a hit. Even a one day range is challenging.
Richard (or anyone else??), can you [re]run regression tests to see if you can narrow 2017-06-07 thru 2017-06-10 to a one day range?
The builds that need to be tested by those who have been able to reliably reproduce are:
- http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-07-03-02-05-comm-central/thunderbird-55.0a1.en-US.mac.dmg
- http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-08-03-02-07-comm-central/thunderbird-55.0a1.en-US.mac.dmg
- http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-09-03-02-05-comm-central/thunderbird-55.0a1.en-US.mac.dmg
- http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-10-03-02-05-comm-central/thunderbird-55.0a1.en-US.mac.dmg
#1 presumably works and #4 presumably fails. So #2 and #3 should be tested first.
Back up your profile before testing.
Comment 183•5 years ago
Sorry, this test build hung already.
Sorry, I suggested that this wouldn't help, see comment #167. The issue that only Mac users are experiencing is clearly in some low-level Mac graphics library, see "Deadlock on CGLClearDrawable" in the summary of this bug.
Comment 184•5 years ago
So, Wayne and Jorg, what's the best way to make progress, here? Try the builds referenced by Wayne in comment #182, or try Jorg's build referenced in comment #177? I can probably try them all. While I can't reliably make it crash, it happens so often that if I went a whole day without crashing, I'd consider the bug "not present in that build".
Will running any of these builds damage my tb user profile? I'm currently running 68.1.2. I use IMAP for everything so I'm not worried about the messages themselves; mostly just the settings and all that. I can re-build if necessary but would prefer to avoid it if possible.
Comment 185•5 years ago
As we heard (and expected), the build from comment #177 isn't any good.
Apparently the problem started to occur between 2017-06-07 and 2017-06-10. So in comment #182 Wayne suggests trying the builds of the 7th, 8th, 9th and 10th of June 2017.
Running "old" builds, in this case TB 55, on a fresh profile can lead to malfunctions if the profile has already been upgraded to a newer version. I don't think it will cause damage or changes to the profile, but Wayne suggested doing a backup just in case.
Personally I think it's important to also try the 7th and 10th to double/triple check that the issue really started there.
Comment 186•5 years ago
(In reply to Jorg K (GMT+2) from comment #167)
If I read the bug summary, it talks about an IMAP issue when copying the sent message to the Sent folder. We also read:
No problem if Sent is set to local folder. Deadlock on CGLClearDrawable.
To clarify: the bug definitely occurs regardless of Sent folder location: local folder, no Sent folder at all, or online folder. I do not even have TB configured to save sent messages (O365 creates duplicates on the back end if I do), and I experience the crash daily.
Comment 187•5 years ago
Thanks for the additional info.
I look forward to everyone's test results of comment 182 in the next few days so we can finally nail this bugger.
Comment 188•5 years ago
I backed-up my profile and launched this version with no other modifications to my profile. I'm having trouble writing my first email... I'm getting the color-wheel about once a second for two seconds. Mouse clicks are ignored. Keyboard is ignored.
Comment 189•5 years ago
(In reply to Christopher Schultz from comment #188)
I backed-up my profile and launched this version with no other modifications to my profile. I'm having trouble writing my first email... I'm getting the color-wheel about once a second for two seconds. Mouse clicks are ignored. Keyboard is ignored.
This finally cleared-up. I've had other bouts of the color-wheel appearing, but they all eventually do clear-up and I'm able to continue my work. Still testing...
Comment 190•5 years ago
FYI, I'm running a bisection from 2017-06-06 to 2017-06-11 on Windows 10 with 32-bit TB; I'll keep you posted... I'll remove the needinfo flag when I publish the result... as the period and number of versions are reduced, I plan to use each version for a few days in a row... to maximise the chances of hitting the issue... as said in my previous test result, the issue was linked to a temporary loss of connection to the IMAP server during sending (the server was rebooting)... with TB unable to resume the task once the server was back up and running... I don't know if that helps... just as info...
Comment 192•5 years ago
(Moving back to Thunderbird where it's more likely for users to find it.)
Comment 193•5 years ago
(In reply to Wayne Mery (:wsmwk) from comment #182)
The builds that need to be tested by those who have been able to reliably reproduce are:
I believe I have gotten this one to lock-up. ActivityMonitor.app says "Not Responding" and there's no CPU usage. I'm going to wait a good long time before killing it and hopefully I'll be able to get a thread dump.
But this was the "suspected good build," so maybe we have to cast a wider net.
Comment 194•5 years ago
(In reply to Christopher Schultz from comment #193)
(In reply to Wayne Mery (:wsmwk) from comment #182)
The builds that need to be tested by those who have been able to reliably reproduce are:
I believe I have gotten this one to lock-up. ActivityMonitor.app says "Not Responding" and there's no CPU usage. I'm going to wait a good long time before killing it and hopefully I'll be able to get a thread dump.
But this was the "suspected good build," so maybe we have to cast a wider net.
Yep, deadlocked at the same place:
[...]
11 __CFNOTIFICATIONCENTER_IS_CALLING_OUT_TO_AN_OBSERVER__ + 12 (CoreFoundation + 646038) [0x7fff5133eb96] 1-11
11 CGLClearDrawable + 44 (OpenGL + 28513) [0x7fff5af37f61] 1-11
11 _pthread_mutex_firstfit_lock_slow + 222 (libsystem_pthread.dylib + 5325) [0x7fff7d4234cd] 1-11
11 __psynch_mutexwait + 10 (libsystem_kernel.dylib + 16134) [0x7fff7d368f06] 1-11
*11 psynch_mtxcontinue + 0 (pthread + 10172) [0xffffff7f827fa7bc] (blocked by pthread mutex owned by thunderbird [42207] thread 0x86053b) 1-11
Comment 195•5 years ago
(In reply to Richard Leger from comment #190)
FYI, I'm running a bisection from 2017-06-06 to 2017-06-11 on Windows 10 with 32-bit TB; I'll keep you posted... I'll remove the needinfo flag when I publish the result... as the period and number of versions are reduced, I plan to use each version for a few days in a row... to maximise the chances of hitting the issue.
Suggest we use Richard's idea of running multiple days with one build, and do it with multiple people with each person taking one or more builds in a coordinated manner:
Richard http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-06-03-02-05-comm-central/ ?
test#2 http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-05-03-02-06-comm-central/
test#3 http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-04-03-02-08-comm-central/
test#4 http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-03-03-02-05-comm-central/
test#5 http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-02-03-02-06-comm-central/
test#6 http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-01-03-02-08-comm-central/
Although we do not know whether the regression is in fact in this range. What do you think?
Reporter | ||
Comment 196•5 years ago
Is it possible to create a debug version of TB that would print out the necessary information to diagnose this in real time? Logs do not work when it hangs.
Comment 197•5 years ago
(In reply to James Rome from comment #196)
Is it possible to create a debug version of TB that would print out the necessary information to diagnose this in real time? Logs do not work when it hangs.
Perhaps. But we would need a developer familiar with this area of code to define the process, or even say whether such a thing is possible. Right now there isn't such a person.
What we CAN do now - without any special tools or knowledge - is FIND the regression range, which is the path suggested two years ago. This can ONLY be done by those of you who can reproduce this - the rest of us are powerless to help except encourage you on the path.
To elaborate on "we do not know whether the regression is in fact in this range" of June, it was originally thought maybe this began in nightly 54, so the range possibly includes many months before June.
Comment 198•5 years ago
(In reply to Christopher Schultz from comment #194)
(In reply to Christopher Schultz from comment #193)
(In reply to Wayne Mery (:wsmwk) from comment #182)
The builds that need to be tested by those who have been able to reliably reproduce are:
I believe I have gotten this one to lock-up. ActivityMonitor.app says "Not Responding" and there's no CPU usage. I'm going to wait a good long time before killing it and hopefully I'll be able to get a thread dump.
But this was the "suspected good build," so maybe we have to cast a wider net.
Yep, deadlocked at the same place:
[...]
11 __CFNOTIFICATIONCENTER_IS_CALLING_OUT_TO_AN_OBSERVER__ + 12 (CoreFoundation + 646038) [0x7fff5133eb96] 1-11
11 CGLClearDrawable + 44 (OpenGL + 28513) [0x7fff5af37f61] 1-11
11 _pthread_mutex_firstfit_lock_slow + 222 (libsystem_pthread.dylib + 5325) [0x7fff7d4234cd] 1-11
11 __psynch_mutexwait + 10 (libsystem_kernel.dylib + 16134) [0x7fff7d368f06] 1-11
*11 psynch_mtxcontinue + 0 (pthread + 10172) [0xffffff7f827fa7bc] (blocked by pthread mutex owned by thunderbird [42207] thread 0x86053b) 1-11
Locked-up again. I also confirmed that (a) I had re-installed the Daily.app before launch and (b) disabled auto-update so it wouldn't keep upgrading itself.
For anyone who can't get a thread dump, try this: while the color-wheel is spinning and before you Force-Quit, run the "Activity Monitor" application, choose "Thunderbird" (or "Daily") in the list of processes, click the gear-icon in the upper-left-hand corner of the window and choose "Sample Process". After a few seconds, it will give you a thread dump as text with a bunch of header information. Every time this has happened to me, the offending thread was the first one listed. It shows every call starting with the thread-start at the top and the current work being done at the bottom, on the most-indented line. A few lines from the bottom, you'll see what I have posted above, including the call to CGLClearDrawable.
So even the assumed-good build is locking-up for me.
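The Activity Monitor steps can probably also be done from a terminal, since the report that "Sample Process" produces lists /usr/bin/sample as its analysis tool. A minimal Python sketch, assuming the usual `sample <pid-or-name> <seconds> -file <path>` invocation; the helper names and the output path are hypothetical, and `capture_sample` only works on macOS:

```python
import subprocess
from typing import List


def sample_command(target: str, seconds: int = 3,
                   out_path: str = "/tmp/tb-sample.txt") -> List[str]:
    """Build the argument list for macOS's /usr/bin/sample.

    target is a pid or process name such as "thunderbird";
    out_path is a hypothetical location for the text report.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return ["/usr/bin/sample", target, str(seconds), "-file", out_path]


def capture_sample(target: str = "thunderbird") -> str:
    """Run the sampler while the app is hung and return the report text."""
    cmd = sample_command(target)
    subprocess.run(cmd, check=True)
    with open(cmd[-1], encoding="utf-8", errors="replace") as fh:
        return fh.read()
```

Run it while the color-wheel is spinning and before you Force-Quit, the same as with the Activity Monitor route.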
Comment 199•5 years ago
(In reply to Richard Leger from comment #190)
FYI, I'm running a bisection from 2017-06-06 to 2017-06-11 on Windows 10 with 32-bit TB; I'll keep you posted... I'll remove the needinfo flag when I publish the result... as the period and number of versions are reduced, I plan to use each version for a few days in a row... to maximise the chances of hitting the issue... as said in my previous test result, the issue was linked to a temporary loss of connection to the IMAP server during sending (the server was rebooting)... with TB unable to resume the task once the server was back up and running... I don't know if that helps... just as info...
It seems today at 9:36am, I had one issue with saving copy of message to Sent with TB 2017-06-09 while sending a simple text message...
pushlog_url https://hg.mozilla.org/comm-central/pushloghtml?fromchange=cc0700686608ad42e5847abcfc10f1c25b644352&tochange=b8876205fa8dbf22f34ffffadce627327ad51f24
Activity Manager shows one entry, "connection refused at 9:23am", but the server was available the whole time, as I checked, and especially at 9:36am... indeed the message was sent... as I checked by other means. I could also access messages in the Inbox from any date...
But I also noticed that when the issue arose, with TB repeatedly trying to save a copy without the process completing, it was also trying to bring the Sent folder up to date and download some messages in it... but that process appeared "Paused"... then I decided to browse between multiple folders to check whether there was an issue with the server, but I could still access messages... then browsing back to the Sent folder... seemed to trigger it to be updated again... and the "bring folder up to date" activity resumed its process... until it completed.
Then the message was copied to the Sent folder fine by itself (the previously stalled process completed successfully at that point); all I had to do was browse back and forth from/to the Sent folder in the main TB UI, so that may have triggered/retriggered the connection to the server and the update of the folder...
I have slightly changed my IMAP settings to sync only 1 day's worth of emails and not download emails larger than 50k. Just so you know!
I was also connected via VPN to the office at the time but that should not make any difference to TB...
The 2017-06-10 build is impossible to use for more than a few hours because the UI keeps slowing down over time to the point that it is unbearable/unusable...
Hope this info can help...
Comment 200•5 years ago
(In reply to Richard Leger from comment #199)
(In reply to Richard Leger from comment #190)
FYI, I'm running a bisection from 2017-06-06 to 2017-06-11 on Windows 10 with 32-bit TB; I'll keep you posted... I'll remove the needinfo flag when I publish the result... as the period and number of versions are reduced, I plan to use each version for a few days in a row... to maximise the chances of hitting the issue... as said in my previous test result, the issue was linked to a temporary loss of connection to the IMAP server during sending (the server was rebooting)... with TB unable to resume the task once the server was back up and running... I don't know if that helps... just as info...
It seems today at 9:36am, I had one issue with saving copy of message to Sent with TB 2017-06-09 while sending a simple text message...
pushlog_url https://hg.mozilla.org/comm-central/pushloghtml?fromchange=cc0700686608ad42e5847abcfc10f1c25b644352&tochange=b8876205fa8dbf22f34ffffadce627327ad51f24
Forgot to mention that I was running a bisection to find related fixes and not regressions (by default)... to get the link above...
Comment 201•5 years ago
(In reply to Richard Leger from comment #199)
with TB 2017-06-09
FYI, about the TB version I referred to...
app_name: thunderbird
build_date: 2017-06-09
build_file: C:\Users\richard.mozilla\mozregression\persist\2017-06-09--comm-central--thunderbird-55.0a1.en-US.win32.zip
build_type: nightly
build_url: https://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-09-03-02-05-comm-central/thunderbird-55.0a1.en-US.win32.zip
changeset: 998749e6ed4e8c8a70b406fa421cf64e98f0977a
pushlog_url: https://hg.mozilla.org/comm-central/pushloghtml?fromchange=998749e6ed4e8c8a70b406fa421cf64e98f0977a&tochange=b8876205fa8dbf22f34ffffadce627327ad51f24
repo_name: comm-central
repo_url: https://hg.mozilla.org/comm-central
Comment 202•5 years ago
(In reply to Wayne Mery (:wsmwk) from comment #195)
Richard http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-06-03-02-05-comm-central/ ?
I'll try that next and will attempt to get a thread dump as per the advice in comment 198, to see if that can help somehow... if the issue occurs...
Comment 203•5 years ago
(In reply to Christopher Schultz from comment #198)
For anyone who can't get a thread dump, try this: while the color-wheel is spinning and before you Force-Quit, run the "Activity Monitor" application, choose "Thunderbird" (or "Daily") in the list of processes, click the gear-icon in the upper-left-hand corner of the window and choose "Sample Process".
Worth mentioning that this advice only applies to Mac OS X systems, not Windows ;-)
I first wrongly thought you were referring to the TB Activity Manager ;-)
For Windows, the closest I could find is to open Task Manager, select the Thunderbird process, right click, and create a dump file (.DMP)... would that be of any use?
Comment 204•5 years ago
(In reply to Wayne Mery (:wsmwk) from comment #195)
(In reply to Richard Leger from comment #190)
Suggest we use Richard's idea of running multiple days with one build, and do it with multiple people with each person taking one or more builds in a coordinated manner:
Richard http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-06-03-02-05-comm-central/ ?
test#2 http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-05-03-02-06-comm-central/
test#3 http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-04-03-02-08-comm-central/
test#4 http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-03-03-02-05-comm-central/
test#4 http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-02-03-02-06-comm-central/
test#5 http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-01-03-02-08-comm-central/
When testing as per the above suggestion, one version at a time for three days, would it be worth activating IMAP logging at startup, as per
https://wiki.mozilla.org/MailNews:Logging at the maximum level of verbosity?
Would that be of any use to the dev team in identifying the issue?
Comment 205•5 years ago
The only imperative is to find the one day regression range where MACOS build N works and build N+1 fails. Nothing else matters.
Thanks for helping. I do hope the others jump in or this could take another two years :)
Comment 206•5 years ago
(In reply to Richard Leger from comment #203)
(In reply to Christopher Schultz from comment #198)
For anyone who can't get a thread dump, try this: while the color-wheel is spinning and before you Force-Quit, run the "Activity Monitor" application, choose "Thunderbird" (or "Daily") in the list of processes, click the gear-icon in the upper-left-hand corner of the window and choose "Sample Process".
Worth mentioning that this advice only applies to Mac OS X systems, not Windows ;-)
I was under the impression that this whole issue was 100% MacOS. I don't think the Windows build of tb is using OpenGL, etc. Are Windows folks having deadlocks in OpenGL like the bug-title suggests? Or are Windows folks having otherwise unexplained lock-ups and just guessing that it's the same issue. GUI deadlocks across OSs almost never have the same root cause because the OSs are usually so different.
Comment 207•5 years ago
This is 100% Mac only
Comment 208•5 years ago
(In reply to Wayne Mery (:wsmwk) from comment #205)
The only imperative is to find the one day regression range where MACOS build N works and build N+1 fails. Nothing else matters.
Thanks for helping. I do hope the others jump in or this could take another two years :)
People may want to use the binary search.
Step 0:
Set Start Date (the version from this date is known to work).
Set End Date (the version from this date is known to be broken).
Step 1:
Duration = End Date - Start Date (in days).
If Duration is 1, we are done (!). The software was broken by a patch set that landed on the start date.
Otherwise, choose a test date of Start Date + Duration / 2. If Duration is odd, round either down or up.
Step 2: Check the version from the test date.
If the version is OK, set Start Date to the test date.
If the version is NOT OK, set End Date to the test date.
Go to Step 1.
This will take O(log N) as opposed to O(N) days, as Wayne pointed out.
I think this is the strategy the mozilla utility for bisection uses.
(Yes "bi-" section.)
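For anyone who prefers code to prose, the steps above can be sketched in Python. This is only an illustration, not the mozilla bisection utility itself: `bisect_builds` and its `is_good` callback are hypothetical names, and the sketch assumes each nightly can be given a reliable good/bad verdict, which for an intermittent hang may mean running a build for days first.

```python
from datetime import date, timedelta
from typing import Callable, Tuple


def bisect_builds(good: date, bad: date,
                  is_good: Callable[[date], bool]) -> Tuple[date, date]:
    """Narrow a known-good/known-bad date pair down to consecutive days.

    is_good(day) is a hypothetical callback: the tester's verdict on the
    nightly built on that day (True = no hang observed).
    """
    if good >= bad:
        raise ValueError("the good date must be earlier than the bad date")
    while (bad - good).days > 1:
        # Step 1: pick the midpoint (rounding down when the span is odd).
        mid = good + timedelta(days=(bad - good).days // 2)
        # Step 2: move whichever end of the range the verdict allows.
        if is_good(mid):
            good = mid
        else:
            bad = mid
    return good, bad
```

The returned pair is the one-day regression range: the last build judged good and the first build judged bad.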
Also, I am sorry that I don't own or use a Mac and so can't offer any insight on this bug.
Comment 209•5 years ago
I do not get the issue so often. Maybe 1 time per week or less.
With Thunderbird 68.2.0 I just got the problem and used Activity Monitor as documented in https://bugzilla.mozilla.org/show_bug.cgi?id=1381485#c198
Sampling process 424 for 3 seconds with 1 millisecond of run time between samples
Sampling completed, processing symbols...
Analysis of sampling thunderbird (pid 424) every 1 millisecond
Process: thunderbird [424]
Path: /Applications/Thunderbird.app/Contents/MacOS/thunderbird
Load Address: 0x10df81000
Identifier: org.mozilla.thunderbird
Version: 68.2.0 (68.2.0)
Code Type: X86-64
Parent Process: ??? [1]
Date/Time: 2019-10-26 17:32:40.722 +0200
Launch Time: 2019-10-26 16:11:02.254 +0200
OS Version: Mac OS X 10.14.6 (18G103)
Report Version: 7
Analysis Tool: /usr/bin/sample
Physical footprint: 340.2M
Physical footprint (peak): 351.6M
----
Call graph:
2267 Thread_3211 DispatchQueue_1: com.apple.main-thread (serial)
+ 2267 ??? (in XUL) load address 0x10e504000 + 0x2a1d5d0 [0x110f215d0]
+ 2267 -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] (in AppKit) + 1361 [0x7fff4304b46b]
+ 2267 _DPSNextEvent (in AppKit) + 1135 [0x7fff4304c77d]
+ 2267 _BlockUntilNextEventMatchingListInModeWithFilter (in HIToolbox) + 64 [0x7fff44cb3c76]
+ 2267 ReceiveNextEventCommon (in HIToolbox) + 603 [0x7fff44cb3ee5]
+ 2267 RunCurrentEventLoopInMode (in HIToolbox) + 292 [0x7fff44cb41ab]
+ 2267 CFRunLoopRunSpecific (in CoreFoundation) + 455 [0x7fff45a5561e]
+ 2267 __CFRunLoopRun (in CoreFoundation) + 1189 [0x7fff45a55d15]
+ 2267 __CFRunLoopDoSources0 (in CoreFoundation) + 283 [0x7fff45a567a3]
+ 2267 __CFRunLoopDoSource0 (in CoreFoundation) + 108 [0x7fff45a72d89]
+ 2267 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ (in CoreFoundation) + 17 [0x7fff45a72de3]
+ 2267 ??? (in XUL) load address 0x10e504000 + 0x29e0a01 [0x110ee4a01]
+ 2267 -[NSView removeFromSuperview] (in AppKit) + 164 [0x7fff430c5ee5]
+ 2267 -[NSView _setWindow:] (in AppKit) + 2621 [0x7fff430c533a]
+ 2267 __21-[NSView _setWindow:]_block_invoke_2 (in AppKit) + 136 [0x7fff430dbd69]
+ 2267 -[__NSArrayM enumerateObjectsWithOptions:usingBlock:] (in CoreFoundation) + 219 [0x7fff45aa476b]
+ 2267 -[NSView _setWindow:] (in AppKit) + 2309 [0x7fff430c5202]
+ 2267 -[NSSurface setWindow:] (in AppKit) + 50 [0x7fff4337eb78]
+ 2267 -[NSSurface _disposeSurface] (in AppKit) + 132 [0x7fff4337eefb]
+ 2267 -[NSNotificationCenter postNotificationName:object:userInfo:] (in Foundation) + 66 [0x7fff47cafaab]
+ 2267 _CFXNotificationPost (in CoreFoundation) + 732 [0x7fff45a293c7]
+ 2267 -[_CFXNotificationRegistrar find:object:observer:enumerator:] (in CoreFoundation) + 1642 [0x7fff45a2a014]
+ 2267 ___CFXNotificationPost_block_invoke (in CoreFoundation) + 87 [0x7fff45ac1688]
+ 2267 _CFXRegistrationPost (in CoreFoundation) + 404 [0x7fff45ab91da]
+ 2267 ___CFXRegistrationPost_block_invoke (in CoreFoundation) + 63 [0x7fff45ab9270]
+ 2267 __CFNOTIFICATIONCENTER_IS_CALLING_OUT_TO_AN_OBSERVER__ (in CoreFoundation) + 12 [0x7fff45ab92f6]
+ 2267 CGLClearDrawable (in OpenGL) + 44 [0x7fff4f6b2f61]
+ 2267 _pthread_mutex_firstfit_lock_slow (in libsystem_pthread.dylib) + 222 [0x7fff71bc34cd]
+ 2267 _pthread_mutex_firstfit_lock_wait (in libsystem_pthread.dylib) + 96 [0x7fff71bc5d52]
+ 2267 __psynch_mutexwait (in libsystem_kernel.dylib) + 10 [0x7fff71b08f06]
I don't know what ??? (in XUL) load address 0x10e504000 + 0x29e0a01 [0x110ee4a01]
is doing. It looks like the last Thunderbird code on the stack that is involved in the deadlock.
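In case it helps others check whether their own sample matches this one, here is a rough Python sketch that scans a saved report for the same signature: a mutex wait below a CGLClearDrawable frame within one thread. The function name is hypothetical, and treating any line containing "Thread_" as the start of a new thread is an assumption based on the report format shown above.

```python
from typing import Iterable


def shows_cgl_mutex_wait(report_lines: Iterable[str]) -> bool:
    """Return True if a thread waits on a mutex beneath CGLClearDrawable."""
    seen_cgl = False
    for line in report_lines:
        if "Thread_" in line:
            # Assumed thread header: start over for the next thread.
            seen_cgl = False
        elif "CGLClearDrawable" in line:
            seen_cgl = True
        elif seen_cgl and "__psynch_mutexwait" in line:
            return True
    return False


# Lines copied from the call graph in this comment.
excerpt = [
    "2267 Thread_3211 DispatchQueue_1: com.apple.main-thread (serial)",
    "+ 2267 CGLClearDrawable (in OpenGL) + 44 [0x7fff4f6b2f61]",
    "+ 2267 _pthread_mutex_firstfit_lock_slow (in libsystem_pthread.dylib) + 222 [0x7fff71bc34cd]",
    "+ 2267 __psynch_mutexwait (in libsystem_kernel.dylib) + 10 [0x7fff71b08f06]",
]
matches = shows_cgl_mutex_wait(excerpt)
```

A match only says the hang looks like this bug; it does not identify which thread holds the mutex.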
Comment 210•5 years ago
Please see my comment #194. I believe I got the "known good" build to fail. Do I misunderstand something? Perhaps we need to rewind a bit for our "known good build"?
Comment 211•5 years ago
(In reply to Christopher Schultz from comment #210)
Please see my comment #194. I believe I got the "known good" build 2017-06-07 to fail.
Thanks for reemphasizing this
Comment 212•5 years ago
People may want to use the binary search.
That would be great. Unfortunately I don't think anyone reports this behavior as being particularly deterministic, which means false positives happen unless a build is used for several days, and one individual might take several weeks to complete a binary search.
Does anyone reliably reproduce this within a few hours? If not, then we need these reporters to divide and conquer, working back now from 2017-06-06.
We don't need traces. We don't need logs. We only need to know the dates of the daily builds that fail, and there is a list in comment 195 (which I have just corrected).
Richard took test#1 http://archive.mozilla.org/pub/thunderbird/nightly/2017/06/2017-06-06-03-02-05-comm-central/
We need others to pick test#2 - test#6.
Or maybe take an older one - http://archive.mozilla.org/pub/thunderbird/nightly/2017/05/2017-05-25-03-02-23-comm-central/
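To get a feel for how long "several days" needs to be, here is a back-of-the-envelope sketch in Python. It rests on an assumption nobody in this bug has established: that each sent message triggers the hang independently with the same probability `p_hang`. Under that assumption, the chance a bad build survives n sends without hanging is (1 - p_hang)^n, so the function returns the smallest n that would expose a bad build with the requested confidence. The function name and the example numbers are hypothetical.

```python
import math


def sends_needed(p_hang: float, confidence: float = 0.95) -> int:
    """Smallest n with 1 - (1 - p_hang)**n >= confidence.

    p_hang is an assumed, constant per-send hang probability.
    """
    if not (0.0 < p_hang < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("p_hang and confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_hang))


# Example: if a bad build hangs on roughly 1 send in 10, how many
# hang-free sends are needed before calling a build good at 95%?
n = sends_needed(0.1, 0.95)
```

The rarer the hang is for a given tester, the more sends (and days) each build needs before it can be called good.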
Comment 213•5 years ago
(In reply to Christopher Schultz from comment #206)
(In reply to Richard Leger from comment #203)
(In reply to Christopher Schultz from comment #198)
For anyone who can't get a thread dump, try this: while the color-wheel is spinning and before you Force-Quit, run the "Activity Monitor" application, choose "Thunderbird" (or "Daily") in the list of processes, click the gear-icon in the upper-left-hand corner of the window and choose "Sample Process".
Worth mentioning that this advice only applies to Mac OS X systems, not Windows ;-)
I was under the impression that this whole issue was 100% MacOS. I don't think the Windows build of tb is using OpenGL, etc. Are Windows folks having deadlocks in OpenGL like the bug-title suggests? Or are Windows folks having otherwise unexplained lock-ups and just guessing that it's the same issue. GUI deadlocks across OSs almost never have the same root cause because the OSs are usually so different.
FYI, the deadlock in graphics on CGLClearDrawable may be 100% MacOS only (I am not really in a position to tell), but the hang sending mail while copying the message to the Sent folder while displaying the progress bar is not a MacOS-only issue... it has been experienced in various TB versions on Windows... that is why I am helping with testing on Windows... but if you think that is not necessary, let me know and I'll stop testing and reporting...
Comment 214•5 years ago
(In reply to Wayne Mery (:wsmwk) from comment #211)
(In reply to Christopher Schultz from comment #210)
Please see my comment #194. I believe I got the "known good" build 2017-06-07 to fail.
Thanks for reemphasizing this
Wayne. It's worth noting that I've been running that build for several days, now, and I was only able to get it to fail a single time. It's been running without quitting the whole time and I've been using my email as usual. With the up-to-date builds, I was getting hangs maybe 10 times per day. So perhaps this is a race-condition or something like that where the old builds were susceptible, but the later builds are just MORE susceptible due to some combination of factors. Not really helpful, I know. :(
When mine hangs with the (likely) MacOS-specific CGLClearDrawable call in the thread dump, the message is 100% sent and the sent-message is 100% saved to my IMAP server, and both windows (the composition window and the "sending message" window) close. It appears to only be a lock-up of the main window after all the other windows have closed.
Comment 216•5 years ago
(In reply to Christopher Schultz from comment #210)
Please see my comment #194. I believe I got the "known good" build to fail. Do I misunderstand something? Perhaps we need to rewind a bit for our "known good build"?
I have been able to get tb 55.0a1 daily (date: 2017-06-07) to lock-up a second time after sending a message. It took days to do it, but it finally happened. It is indeed the old familiar deadlock involving a call through CGLClearDrawable.
The two threads in deadlock are "Compositor" (running MessageLoop::Run()) and the unnamed thread making the call to CGLClearDrawable, which looks like the main event-dispatch thread for the application.
I guess I have to back-up to a previous build to try again. Shall I just grab the previous day? Or was this search based upon some educated guesses as to where the flaw may have been introduced?
Comment 217•5 years ago
Chris, thanks for volunteering to test another build.
Coordination is clearly difficult, so I've put more details and suggested date assignments in the user story - which everyone should consider to be the "diary" for this bug.
Updated•5 years ago
Comment 218•5 years ago
(Commenting on User Story)
User configs (name, computer, OS version, graphics, monitor(s)) :
- Christopher Schultz ?
MacBook Pro (15-inch, 2018), 10.14.6 Mojave, Intel UHD Graphics 630 w/1536MiB+Radeon Pro 560X w/4GiB; built-in display (15.4-inch 2880x1800)
Comment 219•5 years ago
(In reply to Wayne Mery (:wsmwk) from comment #217)
I thought I commented that the beta build crashed for me? I have found what I think is the 2017-06-03 you wanted me to test, will try it out.
Thanks.
Comment 220•5 years ago
(In reply to Scott from comment #219)
CGLClearDrawable Mutex crash on
"thunderbird-55.0a1.en-US.mac.dmg 54M 03-Jun-2017 11:08"
after about 2 hours.
I'll try the next one.
Comment 221•5 years ago
(In reply to Wayne Mery (:wsmwk) from comment #217)
I got crashes in June 3rd and June 2nd within an hour. I jumped to May 31st and it has so far lasted longer than the others. I will give it some more time, then try June 1st.
Using a blank profile w/ no sent folder saving.
Comment 222•5 years ago
(In reply to Scott from comment #221)
(In reply to Wayne Mery (:wsmwk) from comment #217)
I got crashes in June 3rd and June 2nd within an hour. I jumped to May 31st and it has so far lasted longer than the others. I will give it some more time, then try June 1st.
Using a blank profile w/ no sent folder saving.
OK, I switched back to June 1st this morning and got it to crash. So to recap:
The June 3rd, 2nd and 1st daily builds all crash in an hour or less (let's say fewer than half a dozen emails sent in each). I have been running May 31st for about a day and a half without crashes, so I will go back and continue to test that.
Anyone else want to try to confirm these findings by also testing the May 31st and June 1st daily builds?
Comment 223•5 years ago
Thanks for testing. Yes, we need to get the regression range down to one day. So confirming "31st May good, 1st June bad" would be very helpful.
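Narrowing a regression range to one day is a bisection over build dates. The sketch below is illustrative only: `is_bad` is a hypothetical stand-in for a manual test session on that day's nightly, and the "first bad" date in the demo is invented. Because this hang is intermittent, a "good" verdict is only provisional, so a real run would need to repeat or lengthen the test on any date that looks good.

```python
from datetime import date, timedelta
from typing import Callable, Tuple


def bisect_nightlies(good: date, bad: date,
                     is_bad: Callable[[date], bool]) -> Tuple[date, date]:
    """Narrow (good, bad) until the two dates are adjacent days.

    `is_bad` is a hypothetical callback: it stands in for installing
    that day's nightly and using it until it hangs (or does not).
    """
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_bad(mid):
            bad = mid    # the regression is at or before mid
        else:
            good = mid   # the regression is after mid (provisionally)
    return good, bad


# Illustration only: pretend the regression landed on 2017-03-05.
first_bad = date(2017, 3, 5)
result = bisect_nightlies(date(2017, 3, 1), date(2017, 6, 7),
                          lambda d: d >= first_bad)
print(result)
```

Each step halves the range, so a three-month window needs about seven test sessions rather than one per day.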
Comment 224•5 years ago
(In reply to Jorg K (GMT+2) from comment #223)
Thanks for testing. Yes, we need to get the regression range down to one day. So confirming "31st May good, 1st June bad" would be very helpful.
Got a crash in the May 31st build. I'm going to jump back a full week.
Comment 225•5 years ago
(In reply to Scott from comment #219)
(In reply to Wayne Mery (:wsmwk) from comment #217)
I thought I commented that the beta build crashed for me? I have found what I think is the 2017-06-03 you wanted me to test, will try it out.
To reclarify for others, we're at the stage where only testing of NIGHTLY builds is helpful. Thanks for actively working on this
Comment 226•5 years ago
I've gotten 2017-06-07 "daily" to crash a bunch of times, now. I know Scott has been getting this to happen more quickly than I have -- just reiterating that June 7th is definitely bad.
I'd like to try May 30th, but this directory only appears to contain log files and no actual builds:
http://archive.mozilla.org/pub/thunderbird/nightly/2017/05/2017-05-30-03-02-06-comm-central/
So I've backed-up to this build: http://archive.mozilla.org/pub/thunderbird/nightly/2017/05/2017-05-29-03-02-06-comm-central/thunderbird-55.0a1.en-US.mac.dmg
Comment 227•5 years ago
(In reply to Christopher Schultz from comment #226)
You can skip it. May 24th crashes too.
I'll go back another week.
Comment 230•5 years ago
How are results from versions older than May 17 or 24?
Comment 231•5 years ago
(In reply to Wayne Mery (:wsmwk) from comment #230)
How are results from versions older than May 17 or 24?
I was away on vacation for 10 days. I get failures all the way back to May 10th. I am currently testing the AM build (there was an AM and PM) of May 3rd.
Comment 232•5 years ago
(In reply to Scott from comment #227)
How are results from versions older than May 17 or 24?
Just another confirmation that 2017-05-17 is locking up. It took several days (weeks?) to start, but today it's been locking up a lot.
I'll go back to 2017-05-01.
Comment 233•5 years ago
(In reply to Christopher Schultz from comment #232)
I'll go back to 2017-05-01.
I get crashes in 2017-04-01, I am currently testing March 1st.
Comment 234•5 years ago
(In reply to Christopher Schultz from comment #232)
I'll go back to 2017-05-01.
2017-05-01 is unusable for me: every time I launch it, it re-downloads all email from all folders for all time from Gmail. I think I need to replace the CPU fan in my computer now.
Comment 235•5 years ago
(In reply to Richard Leger from comment #213)
FYI, the deadlock in graphics on CGLClearDrawable may be 100% MacOS only (I am not in position to tell really), but the Hangs sending mail while copying message to Sent folder while displaying the progress bar is not a MacOS only issue... it has been experienced in various TB versions on Windows... that is why I help testing on Windows... but if you think that is not necessary, let me know and I'll stop testing and reporting...
If it's easily reproduced, we still need to find the regression range, so please do keep on testing.
Comment 236•5 years ago
(In reply to Scott from comment #233)
(In reply to Christopher Schultz from comment #232)
I'll go back to 2017-05-01.
I get crashes in 2017-04-01, I am currently testing March 1st.
Have you confirmed it is the same crash? How goes it with March 1?
(In reply to Christopher Schultz from comment #234)
(In reply to Christopher Schultz from comment #232)
I'll go back to 2017-05-01.
2017-05-01 is unusable for me: every time I launch it, it re-downloads all email from all folders for all time from Gmail. I think I need to replace the CPU fan in my computer now.
Depending on Scott's results for March 1, can you coordinate the next calendar dates to test?
Comment 237•5 years ago
MIGHT be on to something with March 1st. Haven't had a crash yet and I have been running it for close to 2 weeks. I think I'll go to weekly builds between April 1st and March 1st and see what I can find this week. I will be away the following 2 weeks.
All of my previous crashes have been CGLClearDrawable ones.
Comment 238•5 years ago
Scott, which date are you going with first, so Christopher can pick a different date?
Comment 239•5 years ago
(In reply to Wayne Mery (:wsmwk) from comment #238)
Scott, which date are you going with first, so Christopher can pick a different date?
I'm starting with the 22nd... give me a day, I usually get crashes within an hour or two. So I should be able to narrow it down to a week of builds fairly quickly then we can break them up.
Comment 240•5 years ago
(Scott is making great progress. Hopefully we have a 1-2 day range by Wednesday or Thursday.)
Comment 242•5 years ago
Back at it, hopefully today. I seem to have lost track of whether I was testing the 15th or March 8th build last. Hopefully I'll have it down to a week soon.
Comment 243•5 years ago
I got 2017-05-01 to lock-up, finally.
It's always fun re-downloading your whole email history from Gmail. I will eventually get banned. :(
I'm going to re-try with http://archive.mozilla.org/pub/thunderbird/nightly/2017/04/2017-04-08-00-40-03-comm-aurora/thunderbird-54.0a2.en-US.mac.dmg
Comment 244•5 years ago
2017-04-08 locked-up this morning in (well, beneath) CGLClearDrawable.
I'm backing up to http://archive.mozilla.org/pub/thunderbird/nightly/2017/04/2017-04-04-00-40-03-comm-aurora/thunderbird-54.0a2.en-US.mac.dmg
Hmm. I've looked back at the comments and I've apparently switched from "comm-central" to "comm-aurora". What is the difference, and should I be consistent?
Comment 245•5 years ago
Yes, you need to be using nightly consistently. Aurora is what used to be alpha, and that has different code.
Comment 246•5 years ago
But "nightly" has a bunch of options for each day. Which of the e.g. 2017-04-04-* should I be using?
http://archive.mozilla.org/pub/thunderbird/nightly/2017/04/
There are lots of choices:
Dir 2017-04-04-00-40-03-comm-aurora-l10n/
Dir 2017-04-04-00-40-03-comm-aurora/
Dir 2017-04-04-03-02-02-comm-central-l10n/
Dir 2017-04-04-03-02-02-comm-central/
Dir 2017-04-04-03-02-03-comm-esr45/
Dir 2017-04-04-03-02-03-comm-esr52/
Dir 2017-04-04-03-03-28-comm-central/
Dir 2017-04-04-03-03-28-comm-esr45/
Dir 2017-04-04-03-03-28-comm-esr52/
Comment 247•5 years ago
comm-central is where the development happens, so "2017-04-04-03-02-02-comm-central"
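For anyone picking builds by hand, the selection rule above can be sketched as a small filter. This is a hedged illustration that works only on the directory names pasted in comment 246; it does not fetch anything from the archive, and the function name `comm_central_builds` is made up for this example.

```python
import re

# Directory names copied from the listing in comment 246.
LISTING = [
    "2017-04-04-00-40-03-comm-aurora-l10n/",
    "2017-04-04-00-40-03-comm-aurora/",
    "2017-04-04-03-02-02-comm-central-l10n/",
    "2017-04-04-03-02-02-comm-central/",
    "2017-04-04-03-02-03-comm-esr45/",
    "2017-04-04-03-02-03-comm-esr52/",
    "2017-04-04-03-03-28-comm-central/",
    "2017-04-04-03-03-28-comm-esr45/",
    "2017-04-04-03-03-28-comm-esr52/",
]

# A timestamped directory ending exactly in "comm-central/"
# (this excludes the -l10n, aurora and esr variants).
COMM_CENTRAL = re.compile(r"^(\d{4}-\d{2}-\d{2})-\d{2}-\d{2}-\d{2}-comm-central/$")


def comm_central_builds(names, day):
    """Return the comm-central directories for `day` (YYYY-MM-DD), oldest first."""
    matches = []
    for name in names:
        m = COMM_CENTRAL.match(name)
        if m and m.group(1) == day:
            matches.append(name)
    return sorted(matches)


candidates = comm_central_builds(LISTING, "2017-04-04")
print(candidates)
```

For 2017-04-04 this leaves two candidates, the first of which is the directory named above.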
Comment 248•5 years ago
(In reply to Wayne Mery (:wsmwk) from comment #241)
Good results Scott?
Ok - I get March 8th (comm-central/ - for clarity) to crash. And March 1st to maybe not crash (I usually get crashes in a couple of hours and I ran it cleanly for almost 2 weeks.)
I'm currently downloading the daily builds for the 2nd, 3rd, 4th, 5th, 6th and 7th.
Interestingly the night of the 7th is when it rolls over from v54 to v55 - this might prove to be significant.
I will start my testing on the 7th and work backwards, as it's easier/quicker for me to eliminate candidates that prove they work successfully. If anyone else wants to double/triple check the March 1st build and work forwards, that would be great.
Comment 249•5 years ago
Unsurprisingly, I got 2017-04-08 to lock-up in a similar way. I'll go back to 2017-03-02 to help bracket Wayne's researches.
Comment 250•5 years ago
Happens twice a day or more, including today. Running 68.4.1 (64-bit). Really frustrating.
Comment 251•5 years ago
I got 2017-03-02 to lock up this evening, beneath CGLClearDrawable. Maybe I should try Communicator? ;)
Comment 252•5 years ago
(In reply to Christopher Schultz from comment #251)
I got 2017-03-02 to lock up this evening, beneath CGLClearDrawable. Maybe I should try Communicator? ;)
Interesting. Try March 1st. My 7th is still running after several days. We might need to go back further...!
Comment 253•5 years ago
I'm starting to think that this was introduced by a change in macOS and not a change in Thunderbird.
Comment 254•5 years ago
(In reply to Christopher Schultz from comment #253)
I'm starting to think that this was introduced by a change in macOS and not a change in Thunderbird.
I have a similar sentiment. I recall first encountering crashes in the fall of, I think, 2018. But this could also be explained by not being up to date on Thunderbird releases. My office also has Macs with various OS versions. As I am not on site anymore, I can't do a detailed analysis of who is getting crashes with what version, but I can also see this impacting people's update schedules.
Comment 255•5 years ago
I'm starting to think that this was introduced by a change in macOS and not a change in Thunderbird.
An interesting idea. It could even be hardware related. But if it is our code, that would be consistent with this being first reported with 54 beta and not seeing the majority of reports until version 60, when newer version 5<something> code hit the larger user population.
To exhaust the code regression idea we'd need to test version 53 and 54.
http://archive.mozilla.org/pub/thunderbird/nightly/2017/01/2017-01-24-03-02-12-comm-central/ is roughly the earliest 54 nightly.
http://archive.mozilla.org/pub/thunderbird/nightly/2016/11/2016-11-15-03-02-11-comm-central/ is roughly earliest 53 nightly.
Comment 256•5 years ago
(In reply to Wayne Mery (:wsmwk) from comment #255)
I'm starting to think that this was introduced by a change in macOS and not a change in Thunderbird.
An interesting idea. It could even be hardware related. But if it is our code, that would be consistent with this being first reported with 54 beta and not seeing the majority of reports until version 60, when newer version 5<something> code hit the larger user population.
To exhaust the code regression idea we'd need to test version 53 and 54.
http://archive.mozilla.org/pub/thunderbird/nightly/2017/01/2017-01-24-03-02-12-comm-central/ is roughly the earliest 54 nightly.
http://archive.mozilla.org/pub/thunderbird/nightly/2016/11/2016-11-15-03-02-11-comm-central/ is roughly earliest 53 nightly.
No big updates from me, I am still running March 7 stable. But Christopher has reported a crash on March 2nd.
Another hang-while-sending. Eventually manually force-quit. On restart, confirmed that the email had been sent correctly (as usual). This has happened several times over this last week now, fyi, each with same pattern as before. Write email and click send without any problems. The "Sending" Dialog box finishes sending the email and disappears. Then Thunderbird immediately locks, with beach-ball, until I eventually give up waiting and force-quit.
TB: 68.4.1
MacOSX: 10.14.6
Comment 258•5 years ago
:wsmwk,
Hey there. I note that in all of my hangs, the "sending email" dialog box with progress bar successfully completes sending, and successfully disappears from screen. The hang for me happens immediately after that dialog box clears. However, I just noticed that the summary for this ticket describes having the dialog box still displayed when hanging!?!
Are these the same issue or two different issues?
Comment 259•5 years ago
Are these the same issue or two different issues?
Interesting observation. I have no technical expertise here. But I think your comment does further suggest this is a graphics issue and not a thunderbird issue.
Comment 260•5 years ago
important |
This bug is 3 years old. I filed essentially the same bug, but I don't see it in my Dashboard so it may have been closed as a duplicate or something. At the time I provided stack dumps that matched these.
In my experience (which is considerable -- I led the engineering teams for iMovie and iPhoto at Apple), this is a "race condition" bug, not a graphics bug or an OS bug or whatever else has been proposed. It is deadlocking in a mutex, probably around the progress bar, not the mail delivery, which always succeeds.
I don't think you're going to find/fix this bug by regression analysis, as though the bug were somehow introduced at some point. I think it has been there for a long time, and is a design bug. Race conditions are like that.
The only way to fix it, in my opinion, is to look carefully at the code, specifically where the mutexes are established and released. It takes some hard thinking and careful looking, but mutexes are inherently hard to debug.
Suggestions:
- Remove the mutexes altogether. Are they really necessary? I've seen very few UI/graphics/progress bar interactions that require mutexes. Presumably they are used because of multiple threads, but if (as should be) only one thread -- the main thread, typically -- is handling UI updates, then a mutex shouldn't be necessary.
- Deliberately introduce a delay into one or the other of the threads that is locking the mutex, to see if it can be consistently reproduced.
- Add trace/log statements around the lock/unlock of the mutexes so you can see any close timing / race conditions (important to flush stdout as blocked threads that write to log files don't always show up in the log files due to output buffering).
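The last suggestion can be sketched in a few lines. This is a generic Python illustration, not Thunderbird code: `TracedLock` is a hypothetical wrapper, and the real locks in question live in C++ and in the macOS graphics stack. The point it shows is the flush after every line, so that a thread which blocks on its next call still leaves its "waiting" entry in the log.

```python
import io
import sys
import threading
import time


class TracedLock:
    """Wrap threading.Lock and log every wait, acquire and release.

    Hypothetical helper for illustration only.
    """

    def __init__(self, name, out=sys.stderr):
        self._name = name
        self._lock = threading.Lock()
        self._out = out

    def _log(self, event):
        thread = threading.current_thread().name
        self._out.write(f"{time.monotonic():.6f} {event} {self._name} [{thread}]\n")
        # Flush at once: a thread that blocks next must still leave a trace.
        self._out.flush()

    def __enter__(self):
        self._log("waiting")
        self._lock.acquire()
        self._log("acquired")
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        self._log("released")
        return False


# Usage: capture the trace in memory and list the events in order.
buf = io.StringIO()
with TracedLock("progress-bar", out=buf):
    pass
events = [line.split()[1] for line in buf.getvalue().splitlines()]
print(events)
```

In a hang, the tell-tale pattern in such a log would be a "waiting" line with no matching "acquired" line after it.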
Comment 261•5 years ago
In your bug 1400568 you wrote "This is recent bug, as of 52.3.0, never happened before". So in that bug (which is duped to this one), and here, we have proceeded on the assumption it's a regression - regardless of whether it's a race condition or not.
Looking back, there is also bug 1422251 and bug 1440716. So you are not the only person to have reported this issue against version 52. Still, it was only 3-4 people to report the issue for all or most of version 52. So change(s) in newer versions make the situation worse. But certainly it's possible part of the underlying issue predates any of these bug reports, and the regression hunt is a waste.
Comment 262•5 years ago
It still may have been introduced in 52, but if that is the theory, then there's little point in testing 53, 54, etc.
Not sure what source code control system you guys use, but GitHub has pretty nice "diff" tools and inspecting code changes in/around the area in question in the timeframe of 52.3.0 might shed some clues.
Comment 263•5 years ago
(In reply to John O'Duinn [:joduinn] (please use "needinfo?" flag) from comment #258)
:wsmwk,
Hey there. I note that in all of my hangs, the "sending email" dialog box with progress bar successfully completes sending, and successfully disappears from screen. The hang for me happens immediately after that dialog box clears. However, I just noticed that the summary for this ticket describes having the dialog box still displayed when hanging!?!
I have never had a stray window hanging around when the lock-up occurs. The mail-send operation appears to have 100% completed. The lock-up occurs when trying to work with the main window after the completion of the send/copy-to-sent/etc. operation and all temporary windows have closed (for me).
Comment 264•5 years ago
Hello all (and John):
I'll add my two cents worth here. I also do not see any window hanging around after I send a message. It clears, and if I then leave TB alone for a period of time (which I've not been able to determine), operation continues normally. If, on the other hand, I'm foolish enough to go right back into TB and start another message, I'm entertained by the beach-ball until I get bored and perform a force quit.
Sincerely,
Bob
Comment 265•5 years ago
(In reply to Glenn Reid from comment #262)
It still may have been introduced in 52, but if that is the theory, then there's little point in testing 53, 54, etc.
We've been working back in time, not forward in time.
Not sure what source code control system you guys use
hg
but GitHub has pretty nice "diff" tools and inspecting code changes in/around the area in question in the timeframe of 52.3.0 might shed some clues.
Good idea. Which two versions should we run a diff against?
(Hint: that's what we are trying to determine, so we can actually look at some targeted code changes instead of just "hey, what changed during the decade around release 52?")
Comment 266•5 years ago
(In reply to Christopher Schultz from comment #265)
(In reply to Glenn Reid from comment #262)
It still may have been introduced in 52, but if that is the theory, then there's little point in testing 53, 54, etc.
We've been working back in time, not forward in time.
Looking back through this bug's history, I see regression testing comments for 53, 54, and 55.
but GitHub has pretty nice "diff" tools and inspecting code changes in/around the area in question in the timeframe of 52.3.0 might shed some clues.
Good idea. Which two versions should we run a diff against?
(Hint: that's what we are trying to determine, so we can actually look at some targeted code changes instead of just "hey, what changed during the decade around release 52?")
Based on the stack trace(s) the section of code should be narrow. Can't be that many changes to that code. Pick a version before it (52.2) and after it (52.3).
I know you feel like I am intruding and telling you how to find the bug. I am also noting that in three years you have not found the bug, so I'm just trying to offer suggestions. I have found bugs like this in my career. It is not easy. My previous suggestions are probably more useful than diff'ing, based on your comments.
Comment 267•5 years ago
(In reply to Glenn Reid from comment #266)
I know you feel like I am intruding and telling you how to find the bug. I am also noting that in three years you have not found the bug, so I'm just trying to offer suggestions. I have found bugs like this in my career. It is not easy. My previous suggestions are probably more useful than diff'ing, based on your comments.
For what it's worth, I'm not a tb developer, just a user. So I've been responding to requests from the tb developers to get them more information; specifically, trying to narrow-down a before/after time where the bug can and cannot be reproduced.
My (wild) assertion about an underlying change in macOS "causing" this issue wasn't suggesting that macOS actually has the bug. It's much more likely that some change in macOS has simply made this existing bug in tb more obvious and made it occur more frequently.
I agree with you that it's very likely to be an improper lock-management situation within tb.
Comment 269•5 years ago
March 2nd and 1st both crashed for me this morning within minutes. I will jump back to the first version of 53 and move back or forward from there.
Comment 270•5 years ago
(In reply to Wayne Mery (:wsmwk) from comment #261)
(In reply to Scott from comment #269)
Well that was quick... v53 and v52 from November 14-15th 2016 both give EXC_BAD_ACCESS crashes when opening any email. I would guess OS incompatibility.
Where should we go from here?
Comment 271•5 years ago
(In reply to Scott from comment #270)
(In reply to Wayne Mery (:wsmwk) from comment #261)
(In reply to Scott from comment #269)
Well that was quick... v53 and v52 from November 14-15th 2016 both give EXC_BAD_ACCESS crashes when opening any email. I would guess OS incompatibility.
Where should we go from here?
I think I have reached a dead end.
I get Jan 25th 2017 daily to crash on mutex.
Every build I have tried between December 2nd 2016 and Jan 21st 2017 either crashes when I open the app or crashes when I select any mail message to be read.
I have also tried builds from Nov 14th and 15th 2016 - the switch from v52 to v53... and both crash when opening an email.
Summary: Every build that I can get to run on my iMac crashes on Mutex.
Comment 272•5 years ago
Greetings;
I have been having this same bug for a couple of years, now. I would very much like it fixed.
I know, you're doing your best. Keep up the great work!
I'd like to suggest that maybe there is a race condition with the internal thunderbird search indexing service. Careful, this might be a red herring.
I tend to think the suggestion that something changed in OSX is a good thought.
Anyway, just wanted to voice my concern that, as a Mac user, this bug is very annoying.
I will try to poke around and provide additional testing data now that I know there is a fresh thread/bug devoted to this pesky bugger.
TB: 68.4.2 (64-bit)
Mac: 10.11.6 (15G22010), ATI Radeon HD 2600 Pro 256 MB
Thanks!
John
Comment 273•5 years ago
important |
Potentially relevant:
https://forum.juce.com/t/opengl-deadlocks-mac/8933/3
This forum thread suggests that a window component may be causing trouble if its enclosing window is destroyed without first removing that component (and, maybe, disposing of it). I have never had tb lock-up on me just navigating around: getting it to lock-up always requires me to have just recently sent an email message. All the composition/sending/etc. windows have all closed and the main window is focused usually for a short time (.5 - 5 sec) before the color-wheel appears. So perhaps this is the source of the deadlock (?). It's clear that something is causing it to occur more frequently than in the past, but it doesn't appear that tb code is directly responsible for the increased frequency.
The stuck thread has this in its backtrace:
- [ChildView delayedTearDown] [...]
Perhaps tearing-down a child view (window?) is tripping-over some resource that hasn't yet been cleaned-up. The "Compositor" thread clearly holds this lock and isn't giving it up.
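The shape of such a deadlock, where each of two threads holds one lock and waits for the other's, can be reproduced in miniature. The sketch below is a generic Python illustration under stated assumptions: the two locks and the thread names "compositor" and "main" are stand-ins, and which locks the real threads hold is not known from this bug. A barrier plays the role of a deliberately introduced delay so the bad interleaving happens every time, and a timeout replaces the plain acquire so the script reports the deadlock instead of hanging.

```python
import threading

lock_a = threading.Lock()  # stand-in for whatever the "Compositor" thread holds
lock_b = threading.Lock()  # stand-in for whatever the main thread holds
rendezvous = threading.Barrier(2)
results = {}


def worker(name, first, second):
    with first:
        # Forced interleaving: do not continue until both threads
        # hold their first lock.
        rendezvous.wait()
        # A plain second.acquire() would block forever at this point.
        got = second.acquire(timeout=0.2)
        if got:
            second.release()
        results[name] = got
        # Keep holding `first` until the other thread's attempt has ended,
        # so neither attempt can succeed by luck.
        rendezvous.wait()


t1 = threading.Thread(target=worker, args=("compositor", lock_a, lock_b))
t2 = threading.Thread(target=worker, args=("main", lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()
print(results)
```

Both attempts time out, which is the miniature equivalent of the beach-ball. Acquiring the two locks in the same order in both threads removes the cycle.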
Comment 274•5 years ago
This has been going on for years for me as well and seems to have gotten worse recently with latest Thunderbird 68.5.0 and MacOS 10.15.3 (19D76).
It seems to happen only after interacting with the main mail panel right after pressing send on a message.
I have tons of MacOS dumps I can share if it helps.
I love Thunderbird for being cross-platform, but this really impairs my work on my laptop and I would hate to have to move to another mail client.
Can this be assigned to an Engineer to investigate?
Comment 275•5 years ago
Not assigned a severity, yet? This is a critical issue. Can we please have someone look at this? :D
Comment 276•5 years ago
I've been hoping that every new version of Thunderbird might include a fix for this issue. I'm currently using 68.5.0 on a system running macOS 10.13.6. My first few sessions worked fine, but now I'm in a cycle where every single message send beachballs Thunderbird and requires a Force Quit. Very frustrating. I know former Thunderbird users who are no more because of this issue and on days like today I can fully understand why they threw in the towel.
Comment 277•5 years ago
Hello All: I've been watching this bug report passively since I first signed on to Bugzilla in September of 2019. I am not a hobby user of TB, and I consider it my primary means of communication in my support role for a small mainframe software company.
Outlook is, in my opinion, the most often hacked platform, so I've been able to avoid using it by running Eudora until I went completely Mac, and installed Thunderbird. I evaluated both TB and Outlook, and came down on the side of Thunderbird. Great product. I even contributed a time or two, which I hope helped.
All was well with this setup until in February of 2019, when I upgraded macOS to 10.4.3. TB began to crash so I upgraded it to 45.8.0. I still experienced problems so later in that month I upgraded to TB 60.5.0. At the time I found the huge leap in versions remarkable, but there it is.
My OS is now at 10.14.6, and TB has received no further upgrades. Frankly, I'm afraid to upgrade either the OS or TB now. I experience this problem many times per day. I'm as annoyed as anyone else, but changing to Outlook would be a huge, and one-way migration.
I'm not an engineer, so all I can do is be a good reporter and let everyone know what I see. The smartest comment I've seen on this is that the bug may be due to a "race condition" (which I infer to mean something that manifests under system load). That person also suggested that it was a hidden bug that was revealed by the upgrade to macOS. That fits with my experience.
I agree with John Dale above. This sort of bug would drive away most people. I work in a software house myself, and I have a great deal of understanding about what's involved here. That TB is freeware and that it is community supported is also a factor, and I am grateful to those smarter individuals who make it "go". That said, I really wish that this were fixed, and in the near term. Thanks for your time spent reading this.
Sincerely,
Bob
Comment 278•5 years ago
This week I was doing some video editing. When my editing software was rendering and converting for output (CPU was pegged), the bug happened three times in a row.
Comment 279•5 years ago
Okay, so no progress on this bug in 3 years. Time to do something different, rather than hoping for different results.
This is not a MacOS bug. MacOS is not in control of any of the threads in the app, and they are simply deadlocking on semaphores. It's a classic race condition bug, as I've stated many times, and which my own bug report (closed as a duplicate of this) detailed.
People have been "looking for" this bug, apparently, but not, apparently, making code changes. You don't fix race conditions by looking for reproducible scenarios: by definition, race conditions can't be reliably reproduced. You fix bugs like this by changing the source code and trying to eliminate the race condition.
My recommendation would be to REMOVE the semaphore locks completely. Are they really needed? There are few situations which genuinely require locks -- even shared memory overwrites are often benign, whereas a deadlock is effectively a crash bug. So would you intentionally put code into an app that is proven to CRASH/DEADLOCK it on a regular basis, to try to prevent a theoretical situation for which you think perhaps a semaphore is the solution? Of course not.
I know for a fact that no human being is smart enough to truly imagine all the conditions that arise between parallel threads, correctly anticipate all the situations, and apply locks perfectly where needed. I've seen a lot of code that had no semaphores that probably should have, that never had any actual bugs in it. And I've seen code that has locks to prevent hypothetical scenarios that locks up at random times due to race conditions.
So some developer at Mozilla, please, remove the offending semaphores. Comment them out. Release the app. And we will all watch to see if it's better or worse. I predict that the problem will simply go away and will not be replaced by whatever scenario the locks were originally imagined to prevent.
If I'm wrong, and something bad happens, well, then go fix whatever problems are observed.
Comment 280•5 years ago
I totally disagree to simply remove resource locks.
I am the poster of bug 1608733, which was marked as a duplicate of this one. I advocated leaving 1608733 active, but was told that the bug is easy to reproduce already without additional information.
If this bug is so easy to reproduce, I wonder why nobody has figured out the root cause yet. I was a senior software engineer for over 20 years. I solved some of the most intractable bugs in data communications system software – I know how tough bugs can be. I also have over 15 years of experience in web development.
I've been using TB for around 15 years. The only time I've ever seen TB hang is as described in 1608733, which is during sending an email, and then attempting to close the message I was replying to. I use a satellite internet connection, so sending takes some time.
Sending a message and closing another message are completely separate actions, so there may be some resource that the closing of the message is trying to grab, but can't because that resource is locked up – that's one explanation. Another explanation is that there is erroneous coding – i.e. someone wrote some code that simply doesn't do what it was intended to do, and hence the closing of the message causes the software to crash (yet the culprit code may have nothing to do with closing the message). Something of this nature can manifest as a sort of cascade of things going wrong, resulting in the SWOD. Sometimes code may just wind up in an endless loop, not related at all to a locked up resource.
If I was working on this bug, I would be hammering on reproducing it, and then strategically inserting debug output to check for the values of variables, to see if they make sense. This then ultimately can lead to discovery. This is the preferred method in a lot of cases, because running under a debugger can affect the nature of the bug too much, and possibly could prevent its appearance in the testing. If during the debugging a variable is found to have a crazy value, then the question becomes, what code put that value in the variable?
If indeed this is simply a resource lock issue, then the point in the code can be found where a thread gets stuck waiting for the resource. Then the question becomes, why is that resource locked up? Then you go looking for all the other places in the code that grab that resource, and then you can find the culprit that never releases the resource. Etc. Another cause for a thread getting stuck could be erroneous coding of the lock mechanism itself – the code that implements the mutex for grabbing and releasing the lock.
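Finding the point where a thread is stuck usually starts with a dump of every thread's stack, which is what the macOS samples attached to this bug provide. As a hedged, generic Python illustration (not Thunderbird code), the sketch below blocks one thread on a lock that the main thread never released and then prints where it is waiting. It relies on `sys._current_frames()`, which is a CPython-specific function; the names `dump_all_stacks` and `wait_for_resource` are made up for this example.

```python
import sys
import threading
import traceback


def dump_all_stacks():
    """Return {thread name: formatted stack} for every live thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    return {
        names.get(ident, str(ident)): "".join(traceback.format_stack(frame))
        for ident, frame in sys._current_frames().items()
    }


resource = threading.Lock()
about_to_wait = threading.Event()


def wait_for_resource():
    about_to_wait.set()
    with resource:  # blocks while another thread holds the lock
        pass


resource.acquire()  # the main thread plays the culprit that never releases
stuck = threading.Thread(target=wait_for_resource, name="stuck-waiter", daemon=True)
stuck.start()
about_to_wait.wait()        # the waiter is now inside wait_for_resource
stacks = dump_all_stacks()  # snapshot while the waiter cannot finish
resource.release()          # let the demo end cleanly
stuck.join()
print(stacks["stuck-waiter"])
```

The dump names the function in which the waiter is stuck. The next step described above, finding every other place that takes the same lock, is then a search of the source for that lock.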
I have found a total of 5 bugs in TB version 68, and have been using the same OS X for years, 10.11.6. I have only reported 4 other bugs since 2012, and they were all minor.
That's my 2 cents for today, and I appreciate all who are trying to solve this one. I love TB ... it has been very solid. I abandoned Mac Mail early on because it was unstable.
Comment 281•5 years ago
I also have a lot of experience ... 30+ years of building and shipping major software, including iMovie and iPhoto at Apple, and was Director of Applications Software at Apple a few years back.
It has been clearly established by countless stack traces that this bug is a deadlock on semaphores. It's even in the title of the bug. It is a race condition on the locks. Removing them will at least allow the race to continue. Keeping them will accomplish what, exactly? No one knows, because no one knows why they were put there in the first place.
You're right that it's a design/coding error. But it doesn't crash, it locks on the semaphores. One thread against another. I very much doubt that the semaphore is necessary, which is why I advocate removing it. It is well-established to be causing this hang. It is not well-established that it has any other value.
Your paragraph that starts with "If indeed this is simply a resource lock issue..." is right. But after three years, it seems that no one has actually gone looking for this bug. Someone should.
Comment 282•5 years ago
Instead of bragging about your experience, you should just download the source, build Thunderbird and fix the bug. See:
https://developer.thunderbird.net/the-basics/building-thunderbird
Comment 283•5 years ago
Good point. Or maybe one of the people who work at Mozilla should at least try to fix the bug, first. As far as I can see from following this bug for three years, no one has tried to fix it. Should it be me? Or maybe you?
Comment 284•5 years ago
(In reply to webmaster2 from comment #280)
If this bug is so easy to reproduce, I wonder why nobody has figured out the root cause yet.
I've been using TB for around 15 years. The only time I've ever seen TB hang is as described in 1608733, which is during sending an email, and then attempting to close the message I was replying to. I use a satellite internet connection, so sending takes some time.
I can reproduce this - typically, daily - but the only action it requires from me is to not "wait" in Thunderbird. Sometimes it will happen with every email message, sometimes it won't. On very rare occasions I've had it run for several days, even upwards of a week, before getting a crash.
The typical procedure is that I click "send" on an email, and I switch to a different window before the sending progress bar completes - usually another app, firefox, production software, sometimes even the main thunderbird 'inbox' window. Switching to a different window usually results in the progress bar being covered up behind whatever I am looking at. I am not manually attempting to close the message that is in the process of being sent, I am just attempting to get on with doing other things. Other users have reported this chain of events as well.
Has your experience been different?
Comment 285•5 years ago
(In reply to Glenn Reid from comment #283)
Good point. Or maybe one of the people who work at Mozilla should at least try to fix the bug, first. As far as I can see from following this bug for three years, no one has tried to fix it. Should it be me? Or maybe you?
Looks like you don't understand the governance structure. What you call "Mozilla" are in fact three legal entities: Mozilla Foundation (MoFo), Mozilla Corporation (MoCo) and MZLA, the administrative home of Thunderbird. Look these up on Wikipedia. Now, people at MoFo don't cut code, and the ones at MoCo produce Firefox and the so-called Mozilla platform which is the basis for both Firefox and Thunderbird.
MoCo have around 1000 employees and MZLA have about 12. Of those twelve, most do not develop on Mac, and the few who do can't reproduce the bug. So to answer your question: Should it be me? Or maybe you? Certainly not me, since I'm a Windows person. If you can reproduce the bug and you were "Director of Applications Software at Apple a few years back", you are most certainly the most qualified person for this job. Open source means: If it doesn't work for you or you don't like it, you can actually fix it yourself ;-)
Comment 286•5 years ago
Of course I don't understand the governance structure. Nor do I care. It's an app, and it has a bug. And you want me to fix it. Listen to yourself.
I don't know exactly why you're being combative here, nor your relationship to Thunderbird, but I was trying to be helpful, to offer some strategies to fix the bug. You're just being a troll.
Comment 287•5 years ago
On behalf of those of us who are only able to support Netscape's descendants through our advocacy and use at this time, we sure would like someone to take charge of this one. If it's true that the squeaky wheel gets the grease .. well .. squeak squeak. We need some help from someone at the project capable of assigning a resource, or a Maverick who is sick of seeing this thread grow and wants to save the day. Perhaps solving this could be rewarded with a cash prize, or with being hired to do more work with the Mac team on Thunderbird. I have an opening in late April. If nothing has been discovered by then, I will dig in. That said, I'm a Java dev with the luxury of robust and solid concurrent data structures .. I do top-to-bottom design/code/test/admin HTML5/J2E/RDB, so my learning curve will likely be extensive on a desktop app like TB. I don't have plans to switch to another client since I'm loyal to a fault, but I'm hopeful that this bug gets resolved and I can rely on a more bulletproof implementation for BSD .. er .. Mac. Sincerely, John
Comment 288•5 years ago
(In reply to Jorg K (GMT+1) from comment #282)
"Instead of bragging about your experience, you should just download the source, build Thunderbird and fix the bug."
Help me out here... who is responsible for the TB code base? How many people right now are in on debugging at this sort of level? How many people actually are authorized to modify the code base? Of these people, how many are actually getting paid by a Mozilla entity?
Look, you can't just go around removing semaphores and hoping for the best. That's a very bad idea. What you need to do is understand the code and understand why the original authors included those semaphores. If that's too much of a tall order, then someone who is very familiar with debugging this code base needs to get in there, reproduce the problem over-and-over, get some debug output going, and very meticulously zero in on what is happening. I have a feeling that the reason this bug is still around is that the effort required to debug it is stopping people.
Reward of a cash prize might be a good idea, if possible.
Comment 289•5 years ago
(In reply to Glenn Reid from comment #286)
Of course I don't understand the governance structure. Nor do I care. It's an app, and it has a bug. And you want me to fix it. Listen to yourself.
Well, I think it's valuable to understand the bigger picture. I'm sure there are difficult bugs in Firefox that 1000 MoCo employees haven't tackled in decades, so why would 12 Thunderbird staff be able to fix this in a timely fashion? Apart from 12 staff, Thunderbird has about 120 people doing all sorts of jobs on a voluntary basis. So you could be the 121st. Please remember that not all Mac users see this bug, so a prerequisite to fixing it, is to actually be able to reproduce it.
I don't know exactly why you're being combative here, nor your relationship to Thunderbird, but I was trying to be helpful, to offer some strategies to fix the bug. You're just being a troll.
I'm a Thunderbird and MailNews peer, I've been a volunteer since 2010, I started since there was something that didn't work for me, so I needed to fix it. I was also the overall maintainer and release manager from 2017-2019, now back to volunteer status.
This bug is dragging on; we're almost at comment 300 with no fix in sight. Neither your comments nor mine help in any way, since there are only two ways to fix the bug:
- Find the regression. Some people have tried, but somehow they all stopped. Once you pinpoint when the code change occurred that caused the problem, you may have a chance to spot the code that changed. Or even then you won't be able to tell.
- Debug it directly by adding, maybe, print statements, running with a self compiled debug version for a while, etc. I did that once to find and fix a bug that occurred every couple of weeks. It wasn't fun.
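The first approach, finding the regression, amounts to a binary search over builds. A minimal sketch follows, with the caveat that `first_bad_build` and the `hangs` callback are hypothetical names; in practice "one attempt" is a person sending a test mail and watching for the hang. Because the hang is intermittent, each build gets several attempts before it is called good.

```python
def first_bad_build(builds, hangs, trials=20):
    """Return the earliest build for which `hangs` ever reports True.

    builds: build identifiers ordered oldest to newest.
    hangs:  callable(build) -> bool, one reproduction attempt (hypothetical).
    trials: attempts per build before it is treated as good.

    Assumes every build after the regression is affected. Returns None if
    even the newest build never hung within `trials` attempts.
    """
    def is_bad(build):
        # any() stops at the first hang, so bad builds are cheap to confirm.
        return any(hangs(build) for _ in range(trials))

    lo, hi = 0, len(builds) - 1
    if not is_bad(builds[hi]):
        return None
    # Invariant: builds[hi] is known bad; everything before builds[lo] is good.
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid
        else:
            lo = mid + 1
    return builds[lo]
```

The weak point is the "good" verdict: a build that survives `trials` attempts may still be affected, which would send the search in the wrong direction. That is one plausible reason a regression hunt on an intermittent hang is so tiring to finish.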
I don't quite understand why giving you that information is being "combative" or a "troll", but if you read the etiquette, https://bugzilla.mozilla.org/page.cgi?id=etiquette.html, point 2, "No obligation", you'll see that there is no obligation to fix this bug, so if you want it fixed and have the right skills, it's a really good idea to fix it yourself, like I did years back.
Comment 290•5 years ago
I, for one, while indeed frustrated about this bug still being present and hitting me literally several times per day, am willing to be patient and try to help as best I can. I appreciate the efforts of Mozilla (by whatever name, covering all appropriate parties) to produce Thunderbird. I do not intend to switch to another email client. I will continue to suffer with this bug, and I will continue to try to help figure out what's going on.
The problem is that we have two separate groups with (apparently) zero overlap:
- People who can reproduce this bug (reliably!)
- People who are capable of debugging and, ultimately, fixing it
So how can we get some overlap, here? I can think of at least two possibilities:
- Get together over screen-sharing, reproduce, re-build, reproduce, etc. just like we were physically together
- Instrument the code (perhaps under a "find-that-damned-OSX-bug" flag) so the reproducers can produce some useful debugging information that the debuggers/fixers can actually use to locate what's going on
I will run daily/nightly builds. I will run in debug mode. I will upload huge trace logs. I will do it repeatedly, because that's literally all I can do to help. (A seasoned Java programmer is not useful when debugging native/cocoa/XUL/whatever.) But I will not harangue the only people who are able to effect any change, here. It's not helpful in the slightest.
So please let's all just take a breath and see how we can actually work together to solve this problem.
Comment 291•5 years ago
"I will run daily/nightly builds."
I'm down. Just let me know where that executable is .. I would prefer that there be a separate log file created so I don't have to fish through syslog.
Is it difficult getting the build up-and-running on Mac? Is it in git? I would be willing to pull hourly changes, build, and report back.
I will do my best to get this building in my environment .. who knows, maybe I'll be the dude who is able to fix it. :)
Sincerely,
John
Comment 292•5 years ago
(In reply to Christopher Schultz from comment #290)
(In reply to John Dale from comment #291)
Are either of you running an OS X version before High Sierra?
I have tested (I think) every version that I can get to run on my machine (Mid 2017, running High Sierra), and can reproduce the crash. Versions before roughly December 2016 to January 2017 won't run on my setup, so I can't test anything older than that.
Comment 293•5 years ago
I run Daily builds on OSX, updated (mostly) daily.
I used to see this problem (or something like it; spinning beachball after sending email) a few times a month. Then some time back (6 months or more), I belatedly realised I'd not seen it for a few months, and I've never seen it since. I think I recorded this in one of the many bugs that was closed as a duplicate.
Have those who can reproduce it tried the Daily builds?
https://ftp.mozilla.org/pub/thunderbird/nightly/latest-comm-central/
https://ftp.mozilla.org/pub/thunderbird/nightly/latest-comm-central-l10n/
If it still happens there, are they running the latest OSX? I'm on 10.14.16 and, with Daily builds, haven't seen the problem in many months.
NOTE: if you try a more recent build, then go back to an earlier build, you will have to start it manually from the cmdline/xterm, just once, as:
/Applications/Thunderbird\ Daily.app/Contents/MacOS/thunderbird --allow-downgrade
(or similar path depending on version you're trying to go back to)
Comment 294•5 years ago
(In reply to Calum Mackay from comment #293)
Have those who can reproduce it tried the Daily builds?
Short answer: yes. If you read back even ~10 comments you will see discussion about them.
I have personally tested around 30-40+ daily builds, and all of the ones that are compatible with my system crash (these go back to December 2016).
Comment 295•5 years ago
Yup, sorry Scott, I think I did know that; I've been reading this bug for a long time, just forgot and failed to check, sorry.
My other concern was that there was more than one bug here; there seemed a spate of "closing as duplicate", despite there being little evidence to connect some of the bugs.
e.g. in my case, I never saw a stack trace; I always had to force quit, with no debugging info shown. It's hard to see how anyone can be sure that bugs are the same cause, in that situation. So perhaps the issues we saw are different, and mine is fixed, either within or without Thunderbird.
Comment 296•5 years ago
forgot to add: I had the impression that all the reproduction attempts detailed above were directed at finding a window where the problem first appeared, testing builds from a while back.
but of course that would have started with recent builds, which is what I forgot :)
Comment 297•5 years ago
(In reply to Calum Mackay from comment #296)
forgot to add: I had the impression that all the reproduction attempts detailed above were directed at finding a window where the problem first appeared, testing builds from a while back.
but of course that would have started with recent builds, which is what I forgot :)
I did try beta builds first, but it has been a while. I have been pretty aggressively testing old daily builds since November. And have experienced the bug on my system since before 2018. I have about 15 users in my office, a mixture of which experience the issue. It has proven difficult to confirm who is and who isn't at any given time as I no longer work on site.
As of a few minutes ago I have moved back to the regular release channel, since none of the test builds really offered any improvements, and I don't have physical access to a machine that can run builds from December 2016 or earlier.
Comment 298•5 years ago
I'm running 10.11.6 (15G22010)
Comment 299•5 years ago
I just thought of something .. I'm not a fan of xcode .. I'm running an older imac 24" - can I get things working without xcode with relative ease?
Comment 300•5 years ago
(In reply to John Dale from comment #298)
I'm running 10.11.6 (15G22010)
Could you try this build? (it is the last nightly build of v52 as far as I can find)
http://archive.mozilla.org/pub/thunderbird/nightly/2016/11/2016-11-14-03-02-09-comm-central/
I took my existing Thunderbird folder and renamed it "Thunderbird Backup" ... then opening the daily build created a new folder and new profile. I set up my account, made sure to disable automatic updates, and didn't bother downloading all my old email.
If it stops working and doesn't produce a crash dump - before force quitting, open activity monitor, select thunderbird (daily) and sample the process to check for the mutex lock.
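Once a sample has been saved from Activity Monitor as a text file, checking it for the relevant frames can be scripted. A minimal sketch, assuming the report is plain text with one stack frame per line; the function name is hypothetical, and the only marker taken from this bug is `CGLClearDrawable` (from the bug title). Any other frame names passed in, such as mutex-wait functions, would be the caller's own guess.

```python
def frames_of_interest(sample_text, markers=("CGLClearDrawable",)):
    """Return the lines of a saved process sample that mention any marker."""
    hits = []
    for line in sample_text.splitlines():
        if any(marker in line for marker in markers):
            hits.append(line.strip())
    return hits
```

For example, `frames_of_interest(open(path).read())` on a saved sample returns an empty list when the hang is not in `CGLClearDrawable`, which would suggest a different bug from this one.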
Comment 301•5 years ago
(In reply to Calum Mackay from comment #295)
"My other concern was that there was more than one bug here; there seemed a spate of "closing as duplicate", despite there being little evidence to connect some of the bugs."
I motion to reopen bug 1608733. It may not be the same exact bug as 1381485, and it provides instructions for reproducing. Anyone following that line of attack can then make their contributions on that thread. The more approaches and the more information, the better.
Comment 302•5 years ago
(In reply to webmaster2 from comment #301)
(In reply to Calum Mackay from comment #295)
"My other concern was that there was more than one bug here; there seemed a spate of "closing as duplicate", despite there being little evidence to connect some of the bugs."
I motion to reopen bug 1608733. It may not be the same exact bug as 1381485, and it provides instructions for reproducing. Anyone following that line of attack can then make their contributions on that thread. The more approaches and the more information, the better.
I just tried this half a dozen times with no luck. Open email in new window. Reply to email. Close window of original message after clicking send on the reply. Am I missing something?
Comment 303•5 years ago
Seems to happen most frequently with me when my system is under load .. streaming video in one window, maybe doing a backup at the same time.
Comment 304•5 years ago
(In reply to Scott from comment #302)
"I just tried this half a dozen times with no luck. Open email in new window. Reply to email. Close window of original message after clicking send on the reply. Am I missing something?"
I think it hangs for me because I'm on a satellite internet connection, and the send operation takes a long time. In other words, when I close the window of the original message, the send operation has not completed yet. Another way you could possibly precipitate the failure is to send a very large attachment. Fyi, it doesn't always hang for me. In fact, lately I haven't noticed this failure during normal use.
Comment 305•5 years ago
Comment 260 is certainly worth pursuing. But if the root cause is semaphores/mutex then the challenge here isn't just a matter of reproducing (which isn't difficult for some people), nor just getting a developer. It's getting the right developer, one that can reproduce the issue, and is willing to dive into graphics code. Plus we ultimately must have a graphics developer, because that's where the core Firefox code is failing and it is they who will need to approve any patch that gets applied to the code base. A further significant factor is that we have yet to find a corresponding Firefox issue, which would then get the interest of Mozilla Firefox developers, who have no vested interest in pursuing a Thunderbird-only issue. We don't pay them, we don't manage them, and they have other priorities.
We have only a couple of Thunderbird C++ developers - one of them offered an opinion in comment 143. If they can identify some code suggestions, then we may have a shot at enlisting the core developers. We will check further to see whether any are able to address this.
THESE factors are why we have pursued the lower skill approach of looking for a regression range. That approach is often successful for us, and that it hasn't worked out so far in this case is unfortunate.
As for webmaster2's bug 1608733, there is now a possibility that it is not related to this bug.
Comment 307•5 years ago
I did considerable research this morning into the early history, which I am putting in the story section. It includes some major graphics code landings - perhaps someone can make something of it. It's worth noting that we've had far more reports of this in Thunderbird 60 than in version 52.
Comment 308•5 years ago
Comes to mind: people seeing this bug - does setting layers.acceleration.disabled to false make the bug go away? (== Check the "Use hardware acceleration when available" checkbox.)
Comment 309•5 years ago
wrong |
IIRC the default on Mac is enabled for Use hardware acceleration when available, no?
Comment 310•5 years ago
Comment 311•5 years ago
(In reply to Wayne Mery (:wsmwk) from comment #309)
IIRC the default on Mac is enabled for Use hardware acceleration when available, no?
On Mac it's disabled by default!
Comment 312•5 years ago
(In reply to Scott from comment #292)
Are either of you running an OS X version before High Sierra?
No. I primarily run on Mojave, but I have both Mojave and Catalina environments available for testing.
My Mojave environment is single-screen, and I can reproduce this with almost every email I send if I'm not careful. It seems that triggering mouse events over the main window as the composition window is closing triggers the issue. If I press CMD-ENTER to send, then touch nothing as the window closes, all is well. When moving the mouse pointer (which is common, since I'm usually "doing email" and not just sending a single message), I get the color-wheel.
My Catalina environment is dual-screen, and I keep my email main window on one display while the composition window is on another display. I can't remember the last time TB crashed in that environment. I can try to get it to fail, there.
This issue has been hitting me for years, and is becoming more frequent as time goes on. Perhaps something changed in Mojave which makes it either easier (or even just possible) to trigger while older versions are less prone (or impossible).
Comment 313•5 years ago
(In reply to Wayne Mery (:wsmwk) from comment #309)
IIRC the default on Mac is enabled for Use hardware acceleration when available, no?
I said as much in an email I just sent you - when you look at the bug reports data you shared, 7 of the 10 most reported GPUs are Intel integrated graphics, and reasonably dated. This means the crashes are most frequently happening on lower-end MacBooks/iMacs, or Thunderbird is not using the discrete GPU.
(In reply to Magnus Melin [:mkmelin] from comment #308)
Comes to mind: people seeing this bug - does setting layers.acceleration.disabled to false make the bug go away? (== Check the "Use hardware acceleration when available"
I am a frequent reproducer, and mine is currently set to "true". I will change it to false and test. (I crashed on the last email I sent.)
Comment 314•5 years ago
(In reply to Magnus Melin [:mkmelin] from comment #308)
Does setting layers.acceleration.disabled to false make the bug go away? (== Check the "Use hardware acceleration when available"
Since changing this value from "true" (which is the default: DO NOT use acceleration) to "false", I have had no lock-ups, running 68.5.0 which used to lock-up with nearly every message I sent. So I'm guessing that either a different code-path is being taken, or the window of opportunity to be hit by this bug is very very small when hardware acceleration is enabled (or, rather, NOT DISABLED).
John, what's your hw acceleration set to?
Comment 315•5 years ago
(In reply to Christopher Schultz from comment #314)
(In reply to Magnus Melin [:mkmelin] from comment #308)
Does setting layers.acceleration.disabled to false make the bug go away? (== Check the "Use hardware acceleration when available"
Since changing this value from "true" (which is the default: DO NOT use acceleration) to "false", I have had no lock-ups, running 68.5.0 which used to lock-up with nearly every message I sent. So I'm guessing that either a different code-path is being taken, or the window of opportunity to be hit by this bug is very very small when hardware acceleration is enabled (or, rather, NOT DISABLED).
John, what's your hw acceleration set to?
I am seeing the same behavior so far (no crashes yet!). But it's too early to say for certain. Will keep testing.
Comment 316•5 years ago
(In reply to Scott from comment #315)
(In reply to Christopher Schultz from comment #314)
I switched a dozen of my users over to the new setting over lunch break (2 hours ago).
Already one user has reported a crash on sending email. This person is on a Mid 2015 MacBook Pro with a Radeon R9 M370X GPU, and typically runs with an external BenQ 24" LCD as well. But they have reported that they get crashes both at home (not connected to the external display) and regardless of which display Thunderbird is on, or whether the inbox and the messages being sent are on the same display.
I will wait for more reports. It's possible it will help some users and not others. But as this person's machine does have a discrete GPU, I would not have expected them to still be in the problem camp - if indeed this change is helpful.
Comment 317•5 years ago
As many users reported, this bug seems to be correlated with the load on the host system, so I wouldn't expect to be able to reliably confirm that any workaround works with a single user in a small time window...
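To put a rough number on "a small time window": if a user's sends hung with probability p before the change, then n clean sends in a row happen by pure luck with probability (1 - p)^n. The sketch below solves for n. It assumes every send is independent with the same hang rate, which this thread suggests is not quite true (the hang seems to depend on system load), so treat the result as a lower bound; the function name is hypothetical.

```python
import math


def clean_sends_needed(hang_rate, confidence=0.95):
    """Consecutive hang-free sends needed before luck alone is an unlikely
    explanation, assuming each send used to hang independently with
    probability `hang_rate`."""
    if not 0 < hang_rate < 1:
        raise ValueError("hang_rate must be strictly between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Smallest n with (1 - hang_rate) ** n <= 1 - confidence.
    return math.ceil(math.log(1 - confidence) / math.log(1 - hang_rate))
```

Under these assumptions, someone who used to hang on half of their sends needs only 5 clean sends to be 95% sure, while someone who hung on one send in ten needs 29.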
Comment 318•5 years ago
RE: Wayne Mery, "As for webmaster2's bug 1608733, there is now a possibility that it is not related to this bug."
I don't think bug 1608733 is related to system load. It seems to be related to the timing of the closing of the original message in relation to a message send operation taking a long time, for example because of network delay.
Comment 319•5 years ago
You may be right, webmaster2, but the point was that it's not a simple test to verify a fix. There's some concurrency scenario here, and it is not clear how to reproduce it consistently.
Comment hidden (obsolete) |
Comment hidden (obsolete) |
Comment hidden (obsolete) |
Comment 324•5 years ago
I made no changes (this was my default setting).
Comment 326•5 years ago
(In reply to Scott from comment #316)
(In reply to Scott from comment #315)
(In reply to Christopher Schultz from comment #314)
So far so good. All 4 of my main sufferers have not reported any further crashes since the preference change.
Comment hidden (obsolete) |
Comment 328•5 years ago
Thunderbird 68.5.0 crash report on macOS 10.15.3.
Crash happened again, and yesterday.
I'm a qualified developer and tester. Please let me know if there is any testing I can do to help.
Comment 329•5 years ago
This is not something we could ship today in the product, but as a test, affected users could force enable webrender https://wiki.mozilla.org/Platform/GFX/Quantum_Render#Build_instructions (also read the notes there about help > Troubleshooting to ensure it is enabled). I've been running it a few days in Thunderbird with no ill effects.
Comment 331•5 years ago
I have time for a quick note between projects ..
I was good for three days, then restarted Thunderbird after it took a gig of memory. It froze on me this morning and I did not have that flag set for hardware acceleration.
Still no luck! Please help!
Comment 332•5 years ago
(In reply to John Dale from comment #331)
I was good for three days, then restarted Thunderbird after it took a gig of memory. It froze on me this morning and I did not have that flag set for hardware acceleration.
You changed the setting to False, and are still crashing with Clear Drawable errors?
What kind of mac are you using? Many models do not even have discrete graphics cards, so I can imagine changing the setting for those will not help.
Comment 333•5 years ago
iMac (24-inch, Mid 2007)
2.8 GHz Intel Core 2 Duo
4 GB 667 MHz DDR2 SDRAM
ATI Radeon HD 2600 Pro 256 MB
Comment 334•5 years ago
(In reply to John Dale from comment #333)
iMac (24-inch, Mid 2007)
2.8 GHz Intel Core 2 Duo
4 GB 667 MHz DDR2 SDRAM
ATI Radeon HD 2600 Pro 256 MB
Oooofff. That is certainly an aging computer; it hails from the era before on-board graphics. I can imagine that the preference change may not be helping and the machine simply lacks the power necessary.
Comment 335•5 years ago
Lack the power necessary to send and receive emails and paint in boxes?
;)
John
Comment hidden (metoo) |
Comment 337•5 years ago
Hello all:
I have lurked on this issue because I'm neither a developer nor a tester. I'm just a loyal user out in the field who is STILL experiencing multiple crashes per day. I use TB as my main application, and it's become crash-ware. Thankfully, no emails are lost during the crash or I'd be "gone" as a user by now.
Some have characterized this three-year-and-counting bug as a "race condition". Others have suggested this bug was revealed by a macOS upgrade. These are indeed useful suggestions, but I wonder when Mozilla will deploy an engineer or engineers to FIX this thing.
Perhaps someone can produce a diagnostic script that traces the action during a crash. If anyone wants me to install such a thing I'd do it. As it is, lots of information is generated during these crashes and is passed to Apple - who presumably has no interest in it.
I work in software myself, and I can't imagine a critical error that went on this long without having a priority set. Perhaps we have arrived at the lethal mutation of open-source software. You can only beg for assistance and depend upon the kindness of this community - which is certainly present, but at this time - apparently - ineffective.
This is the first time I've felt at a disadvantage for migrating from Wintel to macOS. Mozilla, would you please allocate some resources to this?
Sincerely (and with a plaintive little whine),
Bob
Comment 338•5 years ago
I now have layers.acceleration.disabled set to false i.e. "Use hardware acceleration when available" is selected.
I do not have the problem any more.
So at least for me, this solution is working fine.
Comment hidden (obsolete) |
Comment hidden (metoo) |
Comment 341•5 years ago
There looks to be a very strong correlation between TB locking up and the screen resolution or number of screens.
TB has been working great for the past few days while I've been working just from my MBP laptop without the additional display.
TB has just locked up on about the 6th email I sent since I attached the 43" Philips HD monitor (specs mentioned in previous reports on this ticket).
(In reply to Mike from comment #328)
Created attachment 9132131 [details]
Thunderbird 68.5.0 crash report on macOS 10.15.3
Thunderbird 68.5.0 crash report on macOS 10.15.3.
Crash happened again, and yesterday.
I'm a qualified developer and tester. Please let me know if there is any testing I can do to help.
Comment 342•5 years ago
(In reply to Mike from comment #341)
There looks to be a very strong correlation between TB locking up and the screen resolution or number of screens.
Agreed. That has been my theory from nearly the beginning. Or more specifically screen resolution to GPU power.
To Bob, Mike and Matt, please try the possible "fix" a number of us have been testing for a couple of weeks now and are having success with. It is in comment #308.
Comment hidden (obsolete) |
Comment 344•5 years ago
I think the problem here is simply this.
Among the active contributors to TB source code, nobody seems to be using Mac as the PC for daily mail exchange. PERIOD.
I began contributing to bug fixes because TB literally ate my e-mail messages close to 12 years ago.
(Sorry, I am using x86-based linux PC.)
Were I using Mac and experiencing the kind of crashes or failures mentioned in this today, I would have gone nuts and
- either found the real cause of the bug and sent a fix, OR
- tried to run TB inside a linux running in a virtual PC to see if that would help. OR
- ditched TB or Mac whichever is easier.
Given that I own an e-mail archive going back more than a dozen years in TB mail folders, I am afraid I would go with TB and ditch Mac.
I am curious if there is anyone inside Mozilla who uses a Mac and that person can spend a week or two to look into this
issue. But the problem is that person is probably an FF developer and clueless regarding a bug in TB.
(Maybe he/she is not and can point out the correct race/resource locking issue, etc.)
Last I remember, the notebook PC of choice for developers at Mozilla was an x86 PC from Dell or some such brand.
Just my two cents worth.
Comment 345•5 years ago
(In reply to ISHIKAWA, Chiaki from comment #344)
I think the problem here is simply this.
Among the active contributors to TB source code, nobody seems to be using Mac as the PC for daily mail exchange. PERIOD.
That is my understanding, as has been mentioned by others in this bug thread. There are not many Thunderbird software engineers, even fewer use macs, and none of the ones that do can reproduce the bug.
There are also not a massive number of bug reports about it (don't quote me on that, Wayne probably has a better idea how the hundreds for this compare to other bugs they experience).
Comment hidden (obsolete) |
Comment 347•5 years ago
All:
Thanks for the suggestion that I "set layers.acceleration.disabled to false". I see a cryptic reference to "(== Check the 'Use hardware acceleration when available'.
Now, what the HECK does that mean? That's nothing that I can find in TB Preferences, nor is it a choice under "Settings" in my Mac Pro.
How very opaque. See my postscript for an example of other jargon.
I am willing to do anything I can to help this thread along, but blithely mentioning resources that are meaningless to the uninformed (me) isn't going to get the job done.
With respect,
Bob
PS: A half-diminished chord can always be substituted with a dominant 9th chord one third down. Try it. It's a great substitution. -B
Comment 348•5 years ago
This is what I did:
Go to SETTING > ADVANCED. On the last one "Edit Configuration" you will see a list of commands (or whatever they are). Look for the one called: layers.acceleration.disabled and double click on it until it reads FALSE on the last column.
Hope that helps!
(In reply to Bob Shimizu from comment #347)
All:
Thanks for the suggestion that I "set layers.acceleration.disabled to false". I see a cryptic reference to "(== Check the 'Use hardware acceleration when available'.
Now, what the HECK does that mean? That's nothing that I can find in TB Preferences, nor is a choice under "Settings" in my Mac Pro.
How very opaque. See my postscript for an example of other jargon.
I am willing to do anything I can to help this thread along, but blithely mentioning resources that are meaningless to the uninformed (me) isn't going to get the job done.
With respect,
Bob
PS: A half-diminished chord can always be substituted with a dominant 9th chord one third down. Try it. It's a great substitution. -B
Comment 349•5 years ago
important workaround |
WORKAROUND CHOICES pick one (summarized for clarity):
- use Thunderbird version 78 where hardware acceleration is enabled by default (or beta 76 or newer https://thunderbird.net/#channel)
- enable "Use hardware acceleration" (HWA)
** version 68 - Thunderbird > Preferences > Advanced > General > mark the checkbox for "Use hardware acceleration"
** version 78 (or beta version 76 or newer): Thunderbird > Preferences > General > (go to the bottom) > mark the checkbox for "Use hardware acceleration"
- set layers.acceleration.disabled to false
** Thunderbird > Preferences > Advanced > General > Config editor > paste "layers.acceleration.disabled" > double click to toggle to false
** beta versions: Thunderbird > Preferences > General > (go to the bottom) > Config editor > paste "layers.acceleration.disabled" > double click to toggle to false
- force enable webrender. EXPERIMENTAL, USE AT YOUR OWN RISK; perhaps best only for users of Thunderbird beta builds. Webrender replaces the Gecko compositor (which is part of this problem). Read the Notes about double checking in Help > Troubleshooting to verify it is enabled. Ref. comment 329, open bug reports.
- after clicking the "Write" button on Thunderbird's main window to compose a new message, minimize the main window. Once done composing, clicking "Send" works and does not freeze Thunderbird. (from comment 375)
We need your feedback only in cases where one of the above does NOT help you, and include details about your hardware. In other words if the workaround helps please refrain from commenting so we can focus on delivering a solution. Also, see comment 350 below.
Any errors or additions to the above instructions, email me direct and I'll edit this comment.
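For anyone who would rather confirm the setting outside the Config editor: Thunderbird generally records user-changed preferences as `user_pref("name", value);` lines in the profile's prefs.js. The following is only a minimal sketch, assuming you paste that file's text in yourself. The function name `read_bool_prefs`, the sample text, and the `mail.server.default.example` pref are illustrative and not part of Thunderbird. A pref that does not appear is simply not user-set.

```python
import re

# Matches lines such as: user_pref("layers.acceleration.disabled", true);
_PREF_RE = re.compile(r'^\s*user_pref\("([^"]+)",\s*(true|false)\);', re.MULTILINE)


def read_bool_prefs(prefs_text):
    """Return {name: bool} for every boolean user_pref line in prefs_text."""
    return {name: value == "true" for name, value in _PREF_RE.findall(prefs_text)}


# Hypothetical sample content, for illustration only.
sample = '''
user_pref("layers.acceleration.disabled", true);
user_pref("mail.server.default.example", "string values are skipped");
user_pref("mailnews.show_send_progress", false);
'''

prefs = read_bool_prefs(sample)
for name in ("layers.acceleration.disabled", "layers.acceleration.force-enabled"):
    if name in prefs:
        print(f"{name} = {str(prefs[name]).lower()} (user-set)")
    else:
        print(f"{name} is not user-set")
```

The sketch only reads text; changing the value is still best done through the Config editor steps listed above.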
Comment 350•5 years ago
important |
**75.0b3 has a workaround (mentioned above in comment 349) by enabling HWA [1] via bug 1623265. The patch will soon appear in 68.7.0, 68.7.1, or 68.8.0 - if you don't see it mentioned in the release notes then it hasn't been done in that release. **
NOTE, even with a manual or default setting of enabled, after startup you may find HWA is NOT running due to startup checks which ensure your hardware works with HWA. [2] In other words, you enabling HWA is not a 100% guarantee it will function on your PC.
Also, this is a workaround, not a code fix. I don't foresee further investigation to develop a code fix unless the workaround proves ineffective. Fingers crossed.
[1] HWA - HardWare Acceleration for graphics
[2] Mac acceleration requirements:
- For WebGL, we require Mac OS version 10.6 or newer. See bug 636611
- For layers acceleration (HWA), we require Mac OS version 10.6.3 or newer. See bug 629016. One exception is <video> acceleration, which is enabled on all Mac OS versions.
- For layers acceleration (HWA), we also block all old graphics adapters that do not fully support OpenGL 2.1 in hardware (use slow software fallbacks), or that can't render to non-power-of-two texture-backed framebuffers. That includes the following generations of GPUs: ATI Radeon X1000 and older, NVIDIA Geforce FX and older, and Intel GMA 950 and older.
Updated•5 years ago
Comment 351•5 years ago
Wayne and Matts: Thanks for showing me WHERE to affect TB's settings. I was able to enable "Use hardware acceleration when available". I then navigated to Config Editor only to find that layers.acceleration.disabled was already set to a false state.
I will see how TB performs.
Thanks for clarifying the settings for me.
Bob
Comment 352•5 years ago
I seem to have the same problem and have to force-close Thunderbird after about every other email sent. I have had this problem for months, but it's getting worse and is now unbearable.
Comment 353•5 years ago
All:
I am pleased to report that after following Wayne Mery's advice in comment #349, all seems to be well.
Here's a snippet of Wayne's advice:
enable "Use hardware acceleration" - Thunderbird > Preferences > Advanced > General > mark the checkbox for "Use hardware acceleration"
** beta versions: Thunderbird > Preferences > General > (go to the bottom) > mark the checkbox for "Use hardware acceleration"
set layers.acceleration.disabled to false - Thunderbird > Preferences > Advanced > General > Config editor > paste "layers.acceleration.disabled" > double click to toggle to false
** beta versions: Thunderbird > Preferences > General > (go to the bottom) > Config editor > paste "layers.acceleration.disabled" > double click to toggle to false
In my case all I had to do was mark the checkbox for "Use hardware acceleration". The other choices had already been set in my otherwise un-altered copy of TB.
It's been over a week now and I've experienced no crashes. I'm not any part of an engineer, but "Use hardware acceleration" might indicate a timing problem. In the mainframe world, we might issue a POST to indicate the conclusion of some sub-task. I don't know the analog in the case of world-zilla, but perhaps a POST is or isn't being done.
In any case, thanks Wayne!
Bob
Comment 354•5 years ago
No luck .. tried variations of the suggested settings. Still crapping-out.
Comment hidden (obsolete) |
Comment 356•4 years ago
(In reply to webmaster2 from comment #355)
I am the poster of bug 1608733, which was marked as a duplicate of this one. I am still seeing the bug; however, have not had time to try any of the workarounds (or the one workaround). TB 68.7.0 on Mac.
The workaround on 68 is trivial. If it doesn't help, please post an update here.
Comment hidden (obsolete) |
Comment 358•4 years ago
Tried workaround #2 above (enable "Use hardware acceleration"), and this seems to have fixed the problem for me. TYVM.
System description: MacBook Pro (16-inch, 2019), Catalina: 10.15.3 (19D76)
Comment 359•4 years ago
UPDATE: I am using workaround #2 and it appears to have fixed the problem.
TB 68.9.0. OS X 10.11.6.
Comment 360•4 years ago
I haven't tried the "workarounds".
I always (and only) experience this crash if I click away from a "sending" window back onto the main window while the outbound send is still in progress. If I wait until message delivery is completed I get no hangs.
Comment 361•4 years ago
p.s. TB 68.9.0, macOS 10.15.3. I've been experiencing this bug for years.
Comment 362•4 years ago
(In reply to Ray Bellis from comment #360)
I haven't tried the "workarounds".
I always (and only) experience this crash if I click away from a "sending" window back onto the main window while the outbound send is still in progress. If I wait until message delivery is completed I get no hangs.
That is expected behavior of this bug.
The workarounds work well and are easy to implement - go for it!
Updated•4 years ago
Comment 364•4 years ago
I have an end user who runs TB 68.11 (64-bit) on Mac OS X 10.15.6 (new computer) and reports that his Thunderbird freezes/hangs 3-4 times per day, at random, when sending a message with or without an attachment (IMAP/SMTP setup). The compose window remains open, the progress bar seems to be missing, and when that happens he cannot regain access to TB without killing it. After TB restarts, the message appears correctly in his IMAP Sent folder, so the message was in fact sent.
Do you think that could be related to this bug?
Would the best option be to wait for him to be upgraded automatically to 78.2 once it is released, to fix his issue?
Comment 366•4 years ago
(In reply to Richard Leger from comment #364)
I have an end user who runs TB 68.11 (64-bit) on Mac OS X 10.15.6 (new computer) and reports that his Thunderbird freezes/hangs 3-4 times per day, at random, when sending a message with or without an attachment (IMAP/SMTP setup). The compose window remains open, the progress bar seems to be missing, and when that happens he cannot regain access to TB without killing it. After TB restarts, the message appears correctly in his IMAP Sent folder, so the message was in fact sent.
Do you think that could be related to this bug?
Yes, the symptoms match this bug: the message is sent but Thunderbird is hung.
Would the best option be to wait for him to be upgraded automatically to 78.2 once it is released, to fix his issue?
Yes, it is a good choice because one of the workarounds, hardware acceleration, is enabled by default in version 78.
Comment 367•4 years ago
(In reply to Scott from comment #345)
(In reply to ISHIKAWA, Chiaki from comment #344)
I think the problem here is simply this.
Among the active contributors to TB source code, nobody seems to be using Mac as the PC for daily mail exchange. PERIOD.
That is my understanding, as has been mentioned by others in this bug thread. There are not many Thunderbird software engineers, even fewer use macs, and none of the ones that do can reproduce the bug.
There are also not a massive number of bug reports about it (don't quote me on that, Wayne probably has a better idea how the hundreds for this compare to other bugs they experience).
I can contribute one every time it crashes if you need.
Thunderbird is my daily driver, on a Mac, OS 10.14.6.
Comment 368•4 years ago
Keep working the issue and if you can figure it out, bully for us! :D
In the meantime, I tread lightly and slowly when I'm sending and it seems to help.
I'm sorry I couldn't help more .. my Internetio broadcast is blowing-up and consuming me.
All of my correspondence with guests is done using Thunderbird. That will continue for as long as possible.
Thanks to everyone in the community for putting it forward.
Sincerely,
John Dale
DB2DOM.COM
PLAINSTRIBUNE.COM
Comment 369•4 years ago
For me, hardware acceleration fixed it.
OSX 10.15.6, daily driver.
TB is much more stable than the Mail.app that I finally gave up.
Comment 370•4 years ago
Thank you for the offer of crash reports. However, they would be a waste of time, because there is zero chance of this being fixed: a) it does not affect Firefox, b) the problem is expected to go away in version 78, where HWA is enabled by default (automatic updates coming soon), and c) the rendering engine is moving to webrender (in 2021?), which should resolve the issue for anyone not helped by version 78. Thus no effort will be invested in fixing the underlying cause, which thankfully we won't need.
Comment 371•4 years ago
Is there a way to turn on HWA in 68.12? On a Mac Mini that only has Intel HD Graphics 4000 (integrated)
Alternatively, when is version 78 expected to hit?
Because not expecting my mail app to randomly crash would be a nice thing.
Comment 372•4 years ago
(In reply to Anthony from comment #371)
Is there a way to turn on HWA in 68.12? On a Mac Mini that only has Intel HD Graphics 4000 (integrated)
comment 349 above
Alternatively, when is version 78 expected to hit?
you can get it today at https://www.thunderbird.net/
Comment 373•4 years ago
Sorry, didn't get the e-mail notification of your response, Thunderbird had crashed again. Here's a sample:
Comment 374•4 years ago
I enabled HWA, and it seems persistent with app restart. Fingers crossed.
Thank you all.
Comment 375•4 years ago
str |
I just discovered a simple workaround to this bug (these bugs?) that has worked for me nearly every time I've sent a message.
The workaround is, after I click the "Write" button on Thunderbird's main window to compose a new message, I minimize the main window. Once done composing, clicking "Send" works and does not freeze Thunderbird. Some more details...
I currently have Thunderbird 78 and first experienced these issues in version 68. Before that I had version 45.8.0 which did not have these issues.
In my case, because I've chosen not to keep this account's password permanently saved and don't want to use a Master Password, once I click the Send button, the main window automatically maximizes, now overlaid with a dialog asking for my email account's SMTP password. Normally this is when sending would freeze Thunderbird, but now I can enter my password and send without a problem. So it seems to me that this bug is triggered by a combination of which windows are open/visible, and when one, or perhaps multiple, popups get displayed.
This freezing bug only happens when interacting with my one normal password-authenticating account, which is my default account. I have two Oauth2 accounts which have sent with no problems so far, but I rarely use these so I can't be certain they wouldn't also have this problem.
I tried the workarounds suggested earlier in this thread, but I still had this problem whether setting layers.acceleration.disabled to true or false. Unfortunately I have an older Mac with an NVIDIA GeForce 9400M which doesn't appear to be supported by webrender. I did try force enabling webrender on both Thunderbird and Firefox and saw many graphical issues/artifacts. (Yes, I will upgrade my computer soon, likely not buying another Mac, but I feel it's still worth the underlying bug getting fixed in case it otherwise resurfaces despite the HWA workarounds!)
Though my workaround helps nearly every time, there are some instances after clicking Send, and despite entering my password correctly, that the compose window focuses again with yet another dialog box asking for my SMTP password. Sometimes Thunderbird freezes when this happens. When it doesn't freeze, I must minimize the main window again THEN type and submit my password on the compose window, otherwise I risk Thunderbird freezing at this point. There apparently is a separate problem (bug 1661337 ?) with Thunderbird not remembering a password throughout the session, which increases the likelihood of this Send bug resulting in a freeze. This also affects the automatic saving of drafts to the mail server, often with a message saying "Your draft message was not copied to your drafts folder (Drafts) due to network or file access errors. You can retry or save the draft locally to Local Folders/Drafts." To work around this I have either canceled saving drafts, or have sometimes been successful by maximizing the main window, clicking on the email account's "Drafts" folder which then prompts me for my account's password again, and then hopefully the draft gets saved successfully and I can minimize the main window again.
When I change my about:config setting for mailnews.show_send_progress to false, this bug doesn't happen as often, which reinforces the idea that this bug has to do with how Thunderbird manages its various windows and dialog popups. Just a big guess, but perhaps Thunderbird is creating multiple popups upon the click of "Send" which are competing for focus at the exact same time, triggering the crash?
Lastly, in case it helps to pinpoint the problem, my Mac's Console app shows many of these messages back-to-back while running Thunderbird:
thunderbird: NextSurface returning false because of invalid mSize (0, 0).
Comment 376•3 years ago
(In reply to kurtbarb from comment #375)
I just discovered a simple workaround to this bug (these bugs?) that has worked for me nearly every time I've sent a message.
Thanks for all the info.
I tried the workarounds suggested earlier in this thread, but I still had this problem whether setting layers.acceleration.disabled to true or false. Unfortunately I have an older Mac with an NVIDIA GeForce 9400M which doesn't appear to be supported by webrender. I did try force enabling webrender on both Thunderbird and Firefox and saw many graphical issues/artifacts. (Yes, I will upgrade my computer soon, likely not buying another Mac, but I feel it's still worth the underlying bug getting fixed in case it otherwise resurfaces despite the HWA workarounds!)
Can anyone on version 91 reproduce the crash? If yes, please post your crash ID(s) - it will be helpful.
Comment 377•3 years ago
(In reply to Wayne Mery (:wsmwk) from comment #376)
Can anyone on version 91 reproduce the crash? If yes, please post your crash ID(s) - it will be helpful.
I have set the value of layers.acceleration.disabled to TRUE, i.e. acceleration is DISABLED (TRUE is no longer the default in 91 and likely earlier), and restarted Thunderbird. I'll see if it locks up in the near future.
Comment 379•2 years ago
(In reply to Wayne Mery (:wsmwk) from comment #378)
Still reproduces?
I just checked, and my current (102.5.0, 64-bit macOS 11.7/Big Sur) setup has:
layers.acceleration.disabled=true (non-default value)
layers.acceleration.force-enabled=false (default value)
I haven't experienced a crash in a good long time, so I suspect that disabling acceleration does in fact fix this.
But I haven't re-disabled disable-acceleration (to, in effect, ENABLE acceleration :) and re-tested it. If that would be helpful, I can do so and see if it starts locking-up again. I seem to recall it would happen multiple times daily so it shouldn't take long to reproduce. :)
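Because the pref is phrased as a negative, the true/false values quoted in this thread are easy to misread. A deliberately simplified sketch of how to read it follows. The function `describe_hwa_pref` is hypothetical and is not Gecko code; it ignores layers.acceleration.force-enabled, and as comment 350 notes, startup hardware checks can still leave HWA off even when it is allowed.

```python
def describe_hwa_pref(disabled: bool) -> str:
    """Translate the value of layers.acceleration.disabled into plain words.

    Simplification (assumption): only this one pref is considered; hardware
    blocklist checks and layers.acceleration.force-enabled are not modeled.
    """
    if disabled:
        return "hardware acceleration turned OFF by the user"
    return "hardware acceleration allowed (subject to startup hardware checks)"


for value in (True, False):
    print(f"layers.acceleration.disabled={str(value).lower()}: {describe_hwa_pref(value)}")
```

Read this way, the setup reported just above (disabled=true) is running with acceleration off, whereas the workaround in comment 349 asks for disabled=false.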
Comment 380•2 years ago
With version 102 and webrender graphics, we should be seeing a lot less of this.
Can anyone still reproduce?
Comment 381•2 years ago
Wayne, I'm happy to re-configure and re-test. Can you tell me what settings you want for those two config values (and any other relevant ones)? I'll put my tb into that configuration and just run for a while to see if it still happens to me.
Comment 382•2 years ago
It has been a few years now since I have experienced this bug. I don't believe it is still affecting any of my office's machines.
Comment 383•1 year ago
Either enable HWA in settings or set layers.acceleration.force-enabled=true (non-default value). Although I'm not sure layers.acceleration.force-enabled is still hooked up.
Updated•1 year ago
Comment 384•9 months ago
I've been running with layers.acceleration.disabled=true (not the default) and layers.acceleration.force-enabled=false (the current default) for months, now, and I see no crashes or hangs. Are you asking me to restore the default layers.acceleration.disabled=false and run for a while?
Comment 385•9 months ago
(In reply to Christopher Schultz from comment #384)
Are you asking me to restore the default layers.acceleration.disabled=false and run for a while?
Yes
Comment 386•9 months ago
Okay. Here are my settings, now:
version: 115.6.0 (64-bit)
OS: MacOS Ventura 13.6.1 x86-64
layers.acceleration.disabled false
layers.acceleration.force-enabled false
We'll see how things go.