crash in CVCGDisplayLink::getDisplayTimes Mac coming out of sleep (waking) with external monitor
Categories
(Core :: Widget: Cocoa, defect, P2)
People
(Reporter: masayuki, Assigned: smichaud)
Details
(Keywords: crash, topcrash-mac, topcrash-thunderbird, Whiteboard: [gfx-noted][tbird topcrash][tpi:+])
Attachments
(8 files, 5 obsolete files)
- 4.30 KB, text/plain
- 4.30 KB, text/plain
- 47 bytes, text/x-phabricator-request
- 3.82 KB, text/plain
- 47 bytes, text/x-phabricator-request
- 14.37 KB, text/plain
- 1.86 KB, patch (RyanVM: approval-mozilla-beta+)
- 1.86 KB, patch (RyanVM: approval-mozilla-esr68+)
Comment 65•6 years ago
I'm having this issue with 64.0.2 on my work Mac. I have one external monitor (Dell U2414H) connected over DP <-> USB-C. Firefox crashes a few times a day after I resume the laptop from sleep (when back from lunch). The lid is always closed.
I'm going to swap the display (for a Dell U2718Q) and see if the problem keeps happening.
Comment 66•6 years ago
I have the same use-case as you and same symptoms. I’ve tried multiple displays and it doesn’t matter. Firefox is just completely unreliable on macOS. It’s a shame this bug has existed for over three years with no fix.
Comment 67•6 years ago
Can reproduce with 64.0.2 on a 2017 MacBook Pro with two external Dell P2715Q monitors connected via USB-C -> DisplayPort. Firefox crashes when I unlock the machine after the displays have been off (the computer itself doesn't need to have been asleep to reproduce this, just the displays.)
I love Firefox, but this one bug is making my daily experience with it extremely frustrating.
Comment 68•6 years ago
(In reply to Josh Dick from comment #67)
Can reproduce with 64.0.2 on a 2017 MacBook Pro with two external Dell P2715Q monitors connected via USB-C -> DisplayPort. Firefox crashes when I unlock the machine after the displays have been off (the computer itself doesn't need to have been asleep to reproduce this, just the displays.)
This is definitely new information. Could you provide the exact steps to reproduce? It would be great if your steps included starting Firefox, turning off displays, locking/unlocking the machine and anything else that is necessary to reproduce the issue reliably from start to finish. For example, does it matter what screen Firefox is on to reproduce? Thank you!
Comment 69•6 years ago
(In reply to Stephen A Pohl [:spohl] from comment #68)
(In reply to Josh Dick from comment #67)
Can reproduce with 64.0.2 on a 2017 MacBook Pro with two external Dell P2715Q monitors connected via USB-C -> DisplayPort. Firefox crashes when I unlock the machine after the displays have been off (the computer itself doesn't need to have been asleep to reproduce this, just the displays.)
This is definitely new information. Could you provide the exact steps to reproduce? It would be great if your steps included starting Firefox, turning off displays, locking/unlocking the machine and anything else that is necessary to reproduce the issue reliably from start to finish. For example, does it matter what screen Firefox is on to reproduce? Thank you!
Unfortunately I can't reproduce the issue 100% reliably, but I can reproduce it for roughly 2 out of every 3 screen wakes. I'll document some steps to the best of my ability. This is all on a 2017 15-inch MacBook Pro running macOS Mojave 10.14.3 and Firefox 64.0.2, but I previously had the same issue with a 2015 MacBook Pro. As I said before, I have two external Dell P2715Q monitors connected via USB-C -> DisplayPort, using the two USB-C ports on the left side of the computer (when looking at its built-in display.)
These steps assume the following:
a) The computer is running in clamshell mode/lid closed, using only the external displays, with external power connected. I normally keep Firefox on the display that is configured as secondary since only one display can be primary, but I doubt that which of the two displays Firefox is shown on makes any difference.
b) The computer is configured in System Preferences -> Energy Saver to "Prevent computer from sleeping automatically when the display is off" while connected to external power.
c) You have some way to sleep the displays without sleeping the computer. I normally do this via a Hot Corner: in System Preferences -> Desktop & Screen Saver -> Screen Saver tab -> Hot Corners..., I configure and use a "Put Display to Sleep" hot corner. Using pmset displaysleepnow in the terminal should be exactly the same, but I reproduce this daily using a hot corner.
Finally, here are the steps:
- Open Firefox on the secondary display (the one that doesn't have the Dock.) Use it for normal browsing for an hour or so.
- Sleep the displays as described above, without sleeping the computer. Right before lunch is a great time for this. :)
- Wait 30-60 minutes. The computer should remain awake and idle, and the displays should remain asleep.
- Wake up the computer by pressing a keyboard key and log in.
- Firefox will have crashed and be showing a crash report window.
I hope this information helps in further investigating this issue.
Comment 70•6 years ago
(In reply to Stephen A Pohl [:spohl] from comment #68)
(In reply to Josh Dick from comment #67)
Can reproduce with 64.0.2 on a 2017 MacBook Pro with two external Dell P2715Q monitors connected via USB-C -> DisplayPort. Firefox crashes when I unlock the machine after the displays have been off (the computer itself doesn't need to have been asleep to reproduce this, just the displays.)
This is definitely new information. Could you provide the exact steps to reproduce? It would be great if your steps included starting Firefox, turning off displays, locking/unlocking the machine and anything else that is necessary to reproduce the issue reliably from start to finish. For example, does it matter what screen Firefox is on to reproduce? Thank you!
More info that might help: The two monitors I'm using, Dell P2715Q monitors connected via USB-C -> DisplayPort, are both 4K monitors, and I run both at the same scaled resolution ("Looks like 2560 by 1440".)
Comment 71•6 years ago
(In reply to Michael S from comment #65)
I've been running this experiment for 2 weeks now. Not a single crash. I do notice that the current display sometimes fails to resume from sleep and I need to turn it off and on again, but Firefox itself has never crashed. I'm going to run with this display for another week, then switch back to my old display and see if the problem returns.
Comment 72•6 years ago
(In reply to Michael S from comment #71)
(In reply to Michael S from comment #65)
I've been running this experiment for 2 weeks now. Not a single crash. I do notice that the current display sometimes fails to resume from sleep and I need to turn it off and on again, but Firefox itself has never crashed. I'm going to run with this display for another week, then switch back to my old display and see if the problem returns.
Do you have the ability to use two displays at once? That seems to be a factor.
Comment 73•6 years ago
I do, but in my case, the issue happens with a single display.
Comment 74•6 years ago
For me, this bug happens at least every other day, or more frequently.
I am using the Mac in clamshell mode, plugged into a Thunderbolt dock. I have one display plugged into the thunderbolt dock, a 4K ultrawide display.
This bug seems to happen most often when I remove it from the dock and go back to using just the laptop screen or switching back to display plugged into the dock. It's almost like Firefox can't handle transitioning from one display to the other.
Thunderbird has this happen too.
I've also tried USB-C hubs, same problem.
Comment 75•6 years ago
No longer a topcrash for Thunderbird - ranks #46 for 60.4.0.
Comment 76•6 years ago
No longer a topcrash for Thunderbird - ranks #46 for 60.4.0.
But it does rank #3 for Mac Thunderbird.
And along the lines of comment 59, for Firefox 64.0.2 it is the #1 Mac crash.
Comment 77•6 years ago
Adding 67 as affected. This continues to be the top Mac crash in most releases.
Comment 78•6 years ago
An additional datapoint.
Using the same setup I described previously (two Dell P2715Q 4K monitors, both running at the same scaled resolution ("Looks like 2560 by 1440")), I switched from using USB-C -> DisplayPort cables that I believe contain no active electronics, to using USB-C -> HDMI cables that do seem to contain active electronics on the HDMI end, and the crashes have completely stopped for me. I don't think the previous DisplayPort cables were defective, since everything else on the Mac worked fine when using those cables.
Comment 79•6 years ago
I recently switched from USB-C -> DisplayPort cables to standard HDMI (single monitor), and unfortunately I still get the crashes. Both Firefox and Thunderbird crash every day for me and are completely unstable. On my Ubuntu machine, with the same dock, cables, setup, and monitors, I have no issue whatsoever. Basically, I have one dock that I share between my Ubuntu laptop and my MacBook: I unplug one laptop from the dock and switch to the other periodically throughout the day. Ubuntu never has a single issue with this.
Comment 81•6 years ago
Adding the stalled keyword to this bug. On nightly 68 this is the #23 overall top crash.
Comment 82•6 years ago
A person on the Thunderbird support forum posted about Thunderbird crashing upon waking the computer.
https://support.mozilla.org/en-US/questions/1259347
Submitted crash report:
https://crash-stats.mozilla.com/report/index/d4a7b177-0abd-45df-a4a5-8a9e90190518#tab-bugzilla
OS X 10.14
TB version 60.6.1
Crash Reason EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
Comment 83•6 years ago
https://support.mozilla.org/en-US/questions/1260299
Product Firefox
Release Channel release
Version 67.0
Build ID 20190516215225 (2019-05-16)
OS OS X 10.14
OS Version 10.14.5 18F132
bp-6ec4ea83-017d-4df4-88fe-c47c50190527
Signature: CVCGDisplayLink::getDisplayTimes
Crash Reason EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
Comment 84•6 years ago
Just to add some more info, I have this issue as well. I have my MBP (Mojave 10.14.5) open as a side monitor, using an external ASUS VE278 (27", 1920 x 1080) connected via a USB-C adapter and a 3 ft long HDMI cable as my main display. I notice that after I sleep the machine for an extended period of time (overnight, or leaving for a few hours) and then come back and wake it up, the only thing that has closed unexpectedly is Firefox (v 67.0.4). Safari, Chrome, and all other apps remain open and unaffected.
https://crash-stats.mozilla.org/report/index/4c586e72-4ad6-4b1c-a339-617c00190703
Comment 85•6 years ago
Happens quite often in FF dev edition (69.0b13) now, for me.
https://crash-stats.mozilla.org/report/index/281a8891-3f94-4d62-8119-d55740190816
https://crash-stats.mozilla.org/report/index/9f4847b8-6751-4a1b-903d-fa7a90190815
https://crash-stats.mozilla.org/report/index/3e609871-0bd4-4f48-bff1-90ca90190813
Comment hidden (me-too) |
Comment 87•6 years ago
(In reply to denis.kosovich from comment #86)
This is just unbelievable! This issue was opened four years ago!!!
:( :( :(
https://crash-stats.mozilla.org/report/index/b973c380-1642-48ca-9011-68fd60190819
Comment 88•6 years ago
This is now the #6 crash for Nightly (Firefox).
Stephen, is this something you might look into again? Since the problem is getting worse maybe something critical has changed. Following up in email.
Comment 89•6 years ago
Some information I haven't seen on this bug yet is if those who experience the crash actually use the integrated or discrete GPU. Maybe this is related to the discrete GPU, and only visible on MBP starting from 15". With the 13" MBP, which only has the integrated GPU I never had this crash, and I'm using an external monitor each day.
So if you experience this crash please have a look at the following page, and check which GPU is used for Firefox:
https://support.apple.com/en-us/HT202053
Comment 90•6 years ago
Based on crash-stats, 40% of all the crashes happen with the following graphics adapter:
Baffin [Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X] (0x67ef)
So if anyone notices that the discrete GPU is in use please try to disable it, and force the Mac to use the internal GPU. Try and check if that maybe fixes the crash after sleep.
Comment 91•6 years ago
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #89)
Some information I haven't seen on this bug yet is if those who experience the crash actually use the integrated or discrete GPU. Maybe this is related to the discrete GPU, and only visible on MBP starting from 15".
I have a 15" MBP with integrated Intel Iris Pro only, and I used to run into this crash quite regularly before switching to Chrome about a year ago (not because of this issue but for performance / GPU temp reasons).
Comment 92•6 years ago
Crash stats shows AMD Baffin [Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X] (0x67ef) is the most common Graphics Adapter affected, with over 40% of the crashes coming from that version. Various Intel adapters such as Crystal Well Integrated Graphics Controller (0x0d26) only account for about 8% of the crashes over a 6 month span.
Assignee | ||
Comment 93•6 years ago
Hi Marcia, Henrik and Liz. This has long bugged me, though I've never been able to reproduce it (I failed again just now). But I might be able to get somewhere by using my HookCase (https://github.com/steven-michaud/HookCase) to learn more about how CVDisplayLink::start() and CVCGDisplayLink::getDisplayTimes() are supposed to work.
This is a very complex problem, and I won't be working on it full time (since I'm now retired). So don't expect me to come up with a solution quickly. But this kind of problem is just the thing HookCase is best at, so it seems a shame not to try it out here.
Comment 94•6 years ago
I have this problem on a 13” MacBook Pro. It doesn’t happen as often but it does happen. It seems to happen the most often when connecting or disconnecting a thunderbolt dock. The thunderbolt dock doesn’t have a GPU. So it’s definitely not specific to the 15” model.
Comment 95•6 years ago
I have this problem (last happened 4 days ago) on a Mid 2015 MacBook Pro 15" which only has an integrated GPU.
Assignee | ||
Comment 96•6 years ago
By assigning this bug to myself, I don't mean to stop other people from working on it. I do mean to show that I'll be spending serious time on it over the next few weeks.
Comment 97•6 years ago
(In reply to Steven Michaud [:smichaud] (Retired) from comment #93)
Hi Marcia, Henrik and Liz. This has long bugged me, though I've never been able to reproduce it (I failed again just now). But I might be able to get somewhere by using my HookCase (https://github.com/steven-michaud/HookCase) to learn more about how CVDisplayLink::start() and CVCGDisplayLink::getDisplayTimes() are supposed to work.
This is a very complex problem, and I won't be working on it full time (since I'm now retired). So don't expect me to come up with a solution quickly. But this kind of problem is just the thing HookCase is best at, so it seems a shame not to try it out here.
Thanks so much Steven for taking time to look into this! We really appreciate your efforts.
Assignee | ||
Comment 98•6 years ago
I've discovered that the CVDisplayLink methods work a little strangely (and inefficiently) in Firefox as compared with Safari and Chrome. So it's possible that cleaning this up a bit may make this bug go away. In the next day or two I'll come up with a patch to do this, and (if I still have access to it) do a tryserver build. Once I get a tryserver build I'll ask people who can reproduce this bug to test it for a while, to see if my patch "fixes" this bug (i.e. works around it, since it's almost certainly an Apple bug).
But tryserver builds are made on the trunk (on the mozilla-central branch). So first I need to get people to test an unaltered mozilla-central nightly build. Here's a link to today's mozilla-central nightly:
Whoever can reproduce this bug at all reliably, please download this build and try it out for a day or two. Post your results here. If Josh Dick is still around (and still sees this bug), I'd particularly like to hear from him.
I hope and expect that you will still see this bug using the mozilla-central nightly I linked above. Otherwise there isn't much point to my doing a tryserver build.
You're most welcome, Marcia :-) Mozilla bugs are more fun now that I don't have to live with and breathe them 24-hours a day. It's also nice to get a chance to put my HookCase debugging tool through the paces.
Assignee | ||
Comment 99•6 years ago
I think I've figured out the proximate cause of this bug's crashes:
They happen when dereferencing a variable at offset 0x570 (on macOS 10.13.6) in a CVCGDisplayLink object. This variable is set in CVCGDisplayLink::setCurrentDisplay(CGDirectDisplayID displayID). So if CVCGDisplayLink::getDisplayTimes() is called on an object before CVCGDisplayLink::setCurrentDisplay() is called on it, its value is still NULL, and you get a crash trying to access data at offset 0x40.
Of course, knowing this doesn't (by itself) tell me how to work around Apple's bug. But at least it's an important clue.
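The failure mode described above can be illustrated with a minimal C++ sketch. FakeDisplayLink and FakeDisplayState are hypothetical stand-ins, not Apple's actual classes, and the field layout is invented for illustration. The sketch only shows the pattern: a pointer member that stays NULL until a setter runs, and a getter that would fault if it dereferenced the pointer without checking it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for whatever the pointer at offset 0x570 refers to.
struct FakeDisplayState {
    std::uint64_t refreshPeriod;  // imagine this field living at offset 0x40
};

// Hypothetical model of the display link object; not Apple's real class.
class FakeDisplayLink {
public:
    // Plays the role of setCurrentDisplay(): the only place the pointer is set.
    void setCurrentDisplay(FakeDisplayState* state) {
        assert(state != nullptr);
        state_ = state;
    }

    // Plays the role of getDisplayTimes(), but with the null check whose
    // absence would produce the crash. Returns false instead of dereferencing.
    bool getDisplayTimes(std::uint64_t* outPeriod) const {
        if (state_ == nullptr) {
            return false;  // an unchecked state_->refreshPeriod would fault here
        }
        *outPeriod = state_->refreshPeriod;
        return true;
    }

private:
    FakeDisplayState* state_ = nullptr;  // NULL until setCurrentDisplay() runs
};
```

In this model, calling getDisplayTimes() on a freshly constructed FakeDisplayLink reports failure, while the same call after setCurrentDisplay() succeeds. The real code has no such check, which is why the access ends up near address zero.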
Comment 100•6 years ago
Steven,
I'm indeed still around, but unfortunately can't reproduce this anymore, since I no longer have the 2017 15" MacBook Pro that I was previously able to somewhat-reliably reproduce this with.
FWIW: That computer was owned by my employer and has since been swapped out for a 2018 15" MacBook Pro, which I have yet to see this crash on, though it is now connected to lower-resolution displays (1440p 2560 × 1440 instead of 4K.) I also have a 2018 Mac Mini connected to the aforementioned 4K monitors via USB-C -> DisplayPort cables, and have seen this crash happen maybe once ever on it.
In any case, thanks for taking the time and effort to investigate this once again!
Assignee | ||
Comment 101•6 years ago
I've come up with some very limited changes that might make a difference here. But when I pushed them to try I got the following error message:
remote: smichaud@pobox.com@hg.mozilla.org: Permission denied (publickey).
Presumably that means I no longer have permission to use the tryserver. Anyone know who I should contact about this? I'm hoping that you'll know, Liz, or know who to ask.
Comment 102•6 years ago
Nice to see you around again. See https://www.mozilla.org/en-US/about/governance/policies/commit/#dormant-accounts
Comment 103•6 years ago
(In reply to Steven Michaud [:smichaud] (Retired) from comment #101)
I've come up with some very limited changes that might make a difference here. But when I pushed them to try I got the following error message:
remote: smichaud@pobox.com@hg.mozilla.org: Permission denied (publickey).
Presumably that means I no longer have permission to use the tryserver. Anyone know who I should contact about this? I'm hoping that you'll know, Liz, or know who to ask.
Steven: I filed Bug 1576632 to have Infra help with this.
Assignee | ||
Comment 104•6 years ago
Thanks, MattN, for pointing me in the right general direction. And thanks, Marcia for finding the right people to ask. I've been able to push my test patch to the tryserver, and the build has completed.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4b8b27e028fb521121d5a961e2eeb6c9ff8f6c58
But now I don't know where to look for the build to test with. It seems tryserver builds are no longer being copied to https://ftp.mozilla.org/pub/firefox/try-builds/, and I can't find any Mozilla documentation that tells me the new location. Nor can I find any link on the above page that points to it.
Comment 105•6 years ago
If you click on the B for the build you want, then Job Details, then look for target.dmg.
Assignee | ||
Comment 106•6 years ago
Thanks, Timothy!
So here's the optimized build made with my patch (optimized as opposed to debug):
https://queue.taskcluster.net/v1/task/WMRaE5GdS0yW_Ig6oTlW9A/runs/0/artifacts/public/build/target.dmg
Do you have any idea how long these builds stay available in this location? Is it a few hours, a few days, or a few weeks?
Comment 107•6 years ago
I'm not sure, it's at least a few days, probably a few weeks.
Assignee | ||
Comment 108•6 years ago
(In reply to Timothy Nikkel (:tnikkel) from comment #107)
I'm not sure, it's at least a few days, probably a few weeks.
Thanks! I just found an extant target.dmg from a tryserver build made on 2019-08-08, which seems to point in the direction of "a few weeks".
Assignee | ||
Comment 109•6 years ago
So ...
I have a request for whoever still sees this bug at all regularly (preferably several times a day). Please download the following build and test with it for at least a few days. Do with it whatever you normally do, and report back with your results. With luck it fixes this bug.
Above (in comment #98) I asked you to test first with a current mozilla-central nightly (the build to which my patch was added which may fix this bug). I'd assumed that this bug might not necessarily happen in (unaltered) mozilla-central nightly builds. But that turns out to be wrong. https://crash-stats.mozilla.com shows that a lot of this bug's crashes happen in mozilla-central nightlies. So please just test target.dmg.
Comment 110•6 years ago
ni on Michael and Jay since they said they could reproduce it. Please see Comment 109 for instructions. Thanks!
Comment 111•6 years ago
I downloaded the mentioned build and am running it now side-by-side with my normal instance for the next few days 👍
Comment 112•6 years ago
On my end it happens when I connect/disconnect a USB-C or Thunderbolt dock with a monitor attached, so I will try to see if there's a way to reproduce it.
Assignee | ||
Comment 113•6 years ago
I've been trying to use HookCase to trigger these crashes. The basic idea is to hook a system call and make it behave incorrectly. So far it hasn't worked, but it might help if I knew on which URLs these crashes were most likely to happen. That is if there actually is any pattern to these crashes' URLs. But I don't have permission to view the crash URLs in https://crash-stats.mozilla.com/, or even to view the comments.
Marcia, I believe you have these permissions. At least you used to. Could you look at the URLs and see if there's some pattern to them. And if so, could you list 10 or so of the most common? Thanks in advance!
Comment 114•6 years ago
There's no particular pattern in the URLs that get recorded with the crashes. They are just popular sites you'd expect to show up frequently:
1 https://mail.google.com/mail/u/0/#inbox 805 2.92 %
2 about:home 472 1.71 %
3 about:newtab 414 1.50 %
4 https://www.facebook.com/ 149 0.54 %
5 about:blank 104 0.38 %
6 about:sessionrestore 103 0.37 %
7 https://web.whatsapp.com/ 92 0.33 %
8 https://mail.google.com/mail/u/1/#inbox 85 0.31 %
9 https://inside.amazon.com/en/Pages/default.aspx 78 0.28 %
10 https://www.youtube.com/ 62 0.22 %
Some of the recent comments:
- Plugged in my mpb into a dock
- Waking my computer up from sleep and the error appeared as soon as the screen came on.
- I was away from my MacBook laptop Pro when this happened. Computer was asleep or in lock screen mode. This has happened before.
- Crashed after unlocking the screen
- The browser crashes as soon as I unplug my external screens.
- Turned external monitor off and on again
- MacBook went to sleep (?) and found Firefox crashed after unlocking the screen
- this is the second time this has happened in the same day where after my MacBook pro 2017 in clam shell mode using a 40” 4k monitor was
- From what I can tell, I disconnected my MacBook Pro running MacOS Mojave 10.14.6 from an external monitor and unplugged the power cable (it was 100% charged). After a couple of hours, I connected to power and a different monitor, and Firefox had crashed.
Assignee | ||
Comment 115•6 years ago
Thanks, phillipp. As you suspected, the information I asked for isn't very helpful.
But I'm slowly making progress. I've now discovered that making the following internal API return NULL on every tenth call is a thoroughly reliable way of triggering this bug's crashes -- and only this bug's crashes. Doing that also doesn't seem to bother Safari or Google Chrome (it doesn't cause crashes in either).
__ZL32get_current_display_system_statev
get_current_display_system_state()
This method is called often, and in many different contexts. It's also carefully written to minimize the possibility that it will return NULL. You'd think that messing with it like I've done would cause a much larger variety of crashes, in all applications. But very mysteriously it doesn't.
Unfortunately, my patch doesn't stop these crashes. My altered target.dmg build crashes just as much as do unaltered mozilla-central nightlies, at least in my tests. So I suspect that the people testing my patch will still see them. Do keep testing, though, and let us know your results.
I think I've demonstrated that problems with get_current_display_system_state() are very likely at the root of this bug's crashes. Now I need to figure out how to insulate Firefox from these problems.
The get_current_display_system_state() method is in the SkyLight private framework on Mojave and HighSierra. So far I've only tested on Mojave.
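The fault-injection idea above (make a call fail on every tenth invocation) can be sketched in plain C++ as follows. This is not HookCase code and does not touch SkyLight: FakeDisplaySystemState, real_get_state() and EveryNthFailure are hypothetical names, and the real function's signature and return type are not known from this bug. It only shows the counting wrapper.

```cpp
#include <cassert>

// Hypothetical stand-in for the state object the private function returns.
struct FakeDisplaySystemState {
    int displayCount;
};

// Hypothetical stand-in for the unhooked function: always succeeds.
inline FakeDisplaySystemState* real_get_state() {
    static FakeDisplaySystemState state{1};
    return &state;
}

// Wrapper that forwards to the real function but returns NULL on every
// n-th call, imitating the "fail every tenth call" experiment.
class EveryNthFailure {
public:
    explicit EveryNthFailure(unsigned n) : n_(n) { assert(n > 0); }

    FakeDisplaySystemState* call() {
        ++calls_;
        if (calls_ % n_ == 0) {
            return nullptr;  // injected failure
        }
        return real_get_state();
    }

private:
    unsigned n_;
    unsigned long calls_ = 0;
};
```

With EveryNthFailure hook(10), calls 10, 20, 30 and so on return NULL and all others succeed, so callers that never check the result can be flushed out deterministically.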
Assignee | ||
Comment 116•6 years ago
Same results on High Sierra. I did see one "problem with tab" message in Safari, but it didn't bring down the whole app. That's presumably because, in Safari, graphics rendering takes place in a separate process.
Assignee | ||
Comment 117•6 years ago
I think I've found a real fix for this bug. As best I can tell, I've found Apple's bug and have a well-targeted workaround for it. I need to do a bit more testing, though. I expect I'll post it sometime tomorrow, together with another tryserver build.
Assignee | ||
Comment 118•6 years ago
I've started a tryserver build. I expect it will run for at least several hours, and possibly overnight.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=8cc8ebc9191252cfeb5231570ab019fc3acab79f
Comment 119•6 years ago
(In reply to Steven Michaud [:smichaud] (Retired) from comment #118)
I've started a tryserver build. I expect it will run for at least several hours, and possibly overnight.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=8cc8ebc9191252cfeb5231570ab019fc3acab79f
Assignee | ||
Comment 120•6 years ago
Thanks, Timothy. But I'd prefer if people waited a bit to test with it. I'm having trouble devising a good test (via HookCase) of whether or not it works properly. With luck I'll have figured that out by tomorrow. Then I can explain what I've done.
Assignee | ||
Comment 121•6 years ago
I just discovered that Firefox Nightly.app from the tryserver builds' target.dmg won't run when you double-click on it. It hasn't been signed :-(
Up until now I've been running them from the command line, which gets around the problem. I wonder why our testers haven't mentioned that.
Comment 122•6 years ago
You can also open them by context clicking and choosing open from the context menu.
Assignee | ||
Comment 123•6 years ago
Here's my description of Apple's bug, and how my latest patch works around it:
As I mentioned above (in comment #99), these crashes happen when a CVCGDisplayLink object's pointer variable is dereferenced, in CVCGDisplayLink::getDisplayTimes(), before it's had a chance to be initialized (while it's still NULL). This variable is set in CVCGDisplayLink::setCurrentDisplay(CGDirectDisplayID displayID), and at first I thought that this method was being called out of order, or being skipped entirely. That's not the case. In fact, setCurrentDisplay() returns early when its call to CGDisplayIDToOpenGLDisplayMask() fails (returns 0), skipping over the code that sets the pointer that's later dereferenced in getDisplayTimes().
setCurrentDisplay() does return an error (kCVReturnInvalidDisplay) when its call to CGDisplayIDToOpenGLDisplayMask() fails. The bug is that its caller, CVCGDisplayLink::initWithCGDisplays(), doesn't itself return an error when this happens. initWithCGDisplays() succeeds, as does the method that calls it (CVDisplayLinkCreateWithCGDisplays(), in turn called by CVDisplayLinkCreateWithActiveCGDisplays()). Apple's code then blithely continues to use the CVCGDisplayLink object until getDisplayTimes() tries to dereference its uninitialized (NULL) pointer.
CGDisplayIDToOpenGLDisplayMask() fails (returns 0) when its call to get_current_display_system_state() returns NULL (an error condition). As I mentioned above in comment #115, get_current_display_system_state() is carefully written to minimize the possibility that it will return NULL. That it can still do so appears to be at the root of this bug's crashes. But I wasn't able to figure out why it can sometimes return NULL, and so I'm not able to work around this bug by somehow guaranteeing that this can't happen. Instead I've decided to call CGDisplayIDToOpenGLDisplayMask() directly in XUL's OSXVsyncSource::OSXDisplay::EnableVsync(), and do an early return whenever CGDisplayIDToOpenGLDisplayMask() returns 0. This avoids calling CVDisplayLinkCreateWithActiveCGDisplays() or CVDisplayLinkCreateWithCGDisplays() in conditions that will inevitably lead to crashes in CVCGDisplayLink::getDisplayTimes().
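The call chain and the workaround described above can be modelled with a small, self-contained C++ sketch. Everything here (FakeSystem, FakeLink, enableVsync, VsyncResult) is a hypothetical stand-in: it is neither Apple's CoreVideo code nor the actual Firefox patch, and which display ID the real patch probes is not stated in this comment. The sketch only reproduces the logic: a swallowed error leaves a pointer unset, and a guard that probes the display mask first avoids ever creating such a link.

```cpp
#include <cassert>
#include <cstdint>

using DisplayID = std::uint32_t;
using DisplayMask = std::uint32_t;

// Hypothetical model of the display system.
struct FakeSystem {
    // false models get_current_display_system_state() returning NULL
    bool stateAvailable = true;

    // Models CGDisplayIDToOpenGLDisplayMask(): 0 means failure.
    DisplayMask maskFor(DisplayID id) const {
        return stateAvailable ? (1u << (id % 32)) : 0u;
    }
};

// Hypothetical model of the display link object.
struct FakeLink {
    const int* timing = nullptr;  // pointer that getDisplayTimes() dereferences

    // Models setCurrentDisplay(): reports an error and leaves timing unset.
    bool setCurrentDisplay(const FakeSystem& sys, DisplayID id) {
        static const int kTiming = 60;
        if (sys.maskFor(id) == 0) return false;
        timing = &kTiming;
        return true;
    }

    // Models the buggy initWithCGDisplays(): the error above is dropped.
    bool init(const FakeSystem& sys, DisplayID id) {
        (void)setCurrentDisplay(sys, id);
        return true;
    }

    // Models getDisplayTimes(); the assert marks where the real crash occurs.
    int refreshRate() const {
        assert(timing != nullptr);
        return *timing;
    }
};

enum class VsyncResult { Enabled, SkippedInvalidDisplay };

// Models the workaround: probe the mask first and return early, so a link
// with an unset pointer is never created.
VsyncResult enableVsync(const FakeSystem& sys, DisplayID id, FakeLink& link) {
    if (sys.maskFor(id) == 0) return VsyncResult::SkippedInvalidDisplay;
    link.init(sys, id);
    return VsyncResult::Enabled;
}
```

In this model, FakeLink::init() reports success even when the system state is unavailable, which mirrors the swallowed kCVReturnInvalidDisplay. The guard in enableVsync() costs one extra mask lookup per call, and in exchange refreshRate() is only ever reached on a link whose pointer was set.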
Assignee | ||
Comment 124•6 years ago
|
||
For reference, here's my patch from the latest tryserver build:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=8cc8ebc9191252cfeb5231570ab019fc3acab79f
Assignee | ||
Comment 125•6 years ago
|
||
I couldn't have figured all this out without HookCase.
Here's the hook library that I used for general exploration of how the CVDisplayLink functions work, and of what causes this bug's crashes. To make it easier to read, it's formatted as a patch on the original HookLibraryTemplate/hook.mm in the HookCase distro.
Assignee | ||
Comment 126•6 years ago
|
||
Here's the hook library I used to test that my patch works as expected. Since I couldn't figure out this bug's underlying cause (why get_current_display_system_state() sometimes returns NULL), it's a bit preconceived -- it's closely tailored to the sequence of events that leads up to this bug's crashes, and to my workaround for them. But at least it shows that my patch works as it claims to. With luck it will also stop this bug's crashes :-)
Assignee | ||
Comment 127•6 years ago
|
||
When the time comes, I plan to submit two patches for review:
-
The patch in comment #124 (https://bugzilla.mozilla.org/attachment.cgi?id=9089552).
-
A patch that cleans up existing CVDisplayLink code, in a small way. For example, I don't think we need to register a callback if the call to CVDisplayLinkCreateWithCGDisplays()/CVDisplayLinkCreateWithActiveCGDisplays() in OSXVsyncSource::OSXDisplay::EnableVsync() fails. I suspect it's just fine to return early.
It should, I hope, be possible to land the first patch quickly. It almost certainly does no harm, and it might do a lot of good. The second patch may be more controversial, but it's also a lot less urgent. Before I request any reviews, though, I need to refamiliarize myself with the Bugzilla reviewing infrastructure (and make sure I still have permission to use it). I'll save that for next week, probably starting Tuesday (since Monday is Labor Day).
Assignee | ||
Comment 128•6 years ago
|
||
Something I forgot to mention earlier:
The CVDisplayLink functions (like CVDisplayLinkCreateWithActiveCGDisplays() and CVDisplayLinkSetOutputCallback()) are called a lot more often in Firefox than they are in Safari or Chrome. This isn't really a problem, and seems to be down to design differences with regard to vsync. But it does explain why this bug's crashes happen much more often in Firefox than they do in Safari or Chrome -- more frequent successful calls to CVDisplayLinkCreateWithActiveCGDisplays() mean more chances that get_current_display_system_state() will return NULL and trigger the crashes.
My first hook library, posted in comment #125 (https://bugzilla.mozilla.org/attachment.cgi?id=9089557), contains two tests of get_current_display_system_state(): One (mentioned in comment #115) that just makes it return NULL on one in every ten calls, and another that's tailored much more specifically to the conditions of this bug. The second test reliably crashes all three browsers in CVCGDisplayLink::getDisplayTimes().
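The first of those two tests amounts to simple fault injection. The following C++ sketch shows the general idea outside of HookCase. It assumes a deterministic "every tenth call" rule (the real hook may choose its calls differently), and DisplaySystemState and NullInjector are hypothetical names, not anything from the hook library or from Apple's frameworks.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the state object the real function returns.
struct DisplaySystemState {
  int displayCount;
};

// Wraps a "real" lookup result and substitutes nullptr on every Nth call.
class NullInjector {
 public:
  explicit NullInjector(std::size_t period) : mPeriod(period) {}

  const DisplaySystemState* Get(const DisplaySystemState* real) {
    ++mCalls;
    if (mPeriod != 0 && mCalls % mPeriod == 0) {
      return nullptr;  // simulated failure
    }
    return real;  // pass the genuine result through unchanged
  }

 private:
  std::size_t mPeriod;     // inject a failure on every mPeriod-th call
  std::size_t mCalls = 0;  // number of calls seen so far
};
```

With a period of 10, one hundred calls yield exactly ten injected failures, which is enough to exercise the error paths of callers that assume the lookup always succeeds.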
Updated•6 years ago
|
Assignee | ||
Comment 129•6 years ago
|
||
Assignee | ||
Comment 130•6 years ago
|
||
Depends on D44525
Assignee | ||
Comment 131•6 years ago
|
||
I appear to have messed up with Phabricator. I will try to figure out how to fix things. Any suggestions will be appreciated :-)
Comment 132•6 years ago
|
||
Try folding the two patches into one locally, and then using moz-phab submit . . to update the first patch. You can then mark the second patch as abandoned using the action dropdown at the very end of the phabricator page.
Updated•6 years ago
|
Assignee | ||
Comment 133•6 years ago
|
||
Thanks, Markus! I followed your suggestion and it seems to have worked.
Comment 134•6 years ago
|
||
Possibly related?
- Bug 1381485 - Hangs frequently while sending imap mail while copying message to imap Sent folder on Mac. - displaying the progress bar. No problem if Sent is set to local folder. Deadlock on CGLClearDrawable
- Bug 1398807 - Crash in nsMsgCompose::CloseWindow
Assignee | ||
Comment 135•6 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #134)
Possibly related?
- Bug 1381485 - Hangs frequently while sending imap mail while copying message to imap Sent folder on Mac. - displaying the progress bar. No problem if Sent is set to local folder. Deadlock on CGLClearDrawable
- Bug 1398807 - Crash in nsMsgCompose::CloseWindow
I think it's very unlikely.
I expect that bug 1201401 (this bug) always leads to a crash in CVCGDisplayLink::getDisplayTimes(). Hangs, or crashes in other locations, are almost certainly unrelated. Only crashes on macOS 10.13 (High Sierra) and 10.14 (Mojave) are symbolicated in crash reports, though. On earlier versions of macOS they'll show up as crashes at an address in the CoreVideo framework.
For example:
macOS 10.12: CoreVideo@0xba47
OS X 10.11: CoreVideo@0xc13d
OS X 10.10: CoreVideo@0x2955
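For anyone triaging reports by hand, the signatures above can be checked with a small helper like the C++ sketch below. MatchesKnownSignature is a hypothetical name, and the address list contains only the three examples given in this comment, so it should not be taken as exhaustive.

```cpp
#include <cassert>
#include <string>

// Returns true if a crash signature matches one of the forms listed in
// this comment. Illustrative only; the address list is not exhaustive.
bool MatchesKnownSignature(const std::string& signature) {
  // Symbolicated form (macOS 10.13 and 10.14).
  if (signature.find("CVCGDisplayLink::getDisplayTimes") != std::string::npos) {
    return true;
  }
  // Unsymbolicated forms quoted above for older systems.
  static const char* const kUnsymbolicated[] = {
      "CoreVideo@0xba47",  // macOS 10.12
      "CoreVideo@0xc13d",  // OS X 10.11
      "CoreVideo@0x2955",  // OS X 10.10
  };
  for (const char* known : kUnsymbolicated) {
    if (signature == known) {
      return true;
    }
  }
  return false;
}
```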
Assignee | ||
Updated•6 years ago
|
Assignee | ||
Comment 136•6 years ago
|
||
These crashes still happen on macOS 10.15, and crash reports containing them aren't symbolicated.
Assignee | ||
Comment 137•6 years ago
|
||
Markus, I'd like to respond to your comment in Phabricator, but I don't know how. Thanks in advance for your help!
Comment 138•6 years ago
|
||
If you're logged in on Phabricator (which you probably are, given that it let you mark a patch as abandoned earlier), you should see reply buttons in the top right corners of the boxes that contain my inline comments. Those buttons are a curved arrow between the text "Not Done" and the cross icon that collapses the comment. The reply buttons will let you enter a reply to an inline comment, but that comment will not be submitted even if you save it. To actually submit your comments, you need to click the "Submit" button at the very end of the page. The end of the page is also where the textbox for general comments is located.
Assignee | ||
Comment 139•6 years ago
|
||
Thanks. I'm definitely logged in. Your description of how to reply seems only to apply to inline comments. As best I can tell, I can only reply to your general comment in the window at the bottom of the page. I'll do my best, and hopefully not mess things up too badly.
Comment 140•6 years ago
|
||
As best I can tell, I can only reply to your general comment in the window at the bottom of the page.
Ah, yes. That's the correct way to do that.
Assignee | ||
Comment 141•6 years ago
•
|
||
Markus, I've responded to your general comment. But I made a serious mistake, so I revised my response. This is just to let you know to refresh your screen :-)
Updated•6 years ago
|
Assignee | ||
Comment 142•6 years ago
|
||
My workaround has made the task less urgent, but we need to get Apple to fix this bug. In the past I've reported similar bugs to Apple on my own authority, in the hope that Apple would pay attention and fix them. Sometimes they did. But, unlike this bug, they were all reproducible. And I'm no longer working for Mozilla.
Recently, in bug 1570451, Mozilla reported a serious Catalina bug to Apple and they fixed it. I wonder if we should use the same channels here as were used in that bug.
Assignee | ||
Updated•6 years ago
|
Comment 143•6 years ago
|
||
Amazing work, Steven! I've forwarded this issue to our Apple contact referencing this bug and your explanation in comment 123. I'll update the bug if there is any progress.
Updated•6 years ago
|
Comment 144•6 years ago
|
||
Thanks Steven, the patch is clear to land. I'll let you press the buttons on https://lando.services.mozilla.com/D44525/ :)
Comment 145•6 years ago
|
||
Updated•6 years ago
|
Comment 146•6 years ago
|
||
bugherder |
Assignee | ||
Comment 147•6 years ago
|
||
As the stats table (under Crash Data above) shows, the latest mozilla-central nightly (with build id 20190906094324) is the first one with this bug's patch. I'm going to be keeping an eye on the number of crashes recorded there. I'm quite confident that this bug is fixed, and that the number will continue to be '0'. But you never know for sure.
By the way, I love the stats table. BMO didn't have it back in my day.
Comment 148•6 years ago
|
||
Awesome seeing this longstanding bug fixed, welcome back Steven! That said, I think we should let this fix ship with Fx70/68.2esr so it gets some bake time on Beta rather than trying to uplift into an Fx69 dot release before that.
Assignee | ||
Comment 149•6 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #148)
Awesome seeing this longstanding bug fixed, welcome back Steven! That said, I think we should let this fix ship with Fx70/68.2esr so it gets some bake time on Beta rather than trying to uplift into an Fx69 dot release before that.
Are you thinking of uplifting this bug's patch to the 70 branch (the current beta branch)? Not a bad idea. But let's wait a few days to make sure the number of crashes stays at '0'. We are, after all, dealing with a non-reproducible bug.
Comment 150•6 years ago
|
||
(In reply to Steven Michaud [:smichaud] (Retired) from comment #149)
Are you thinking of uplifting this bug's patch to the 70 branch (the current beta branch)? Not a bad idea. But let's wait a few days to make sure the number of crashes stays at '0'. We are, after all, dealing with a non-reproducible bug.
Yeah, there's no rush here (we just started the new cycle). But yeah, I think this would be great to uplift to Beta & ESR68 once we're confident the fix is working on Nightly without new regressions.
Assignee | ||
Comment 151•6 years ago
|
||
I just noticed a very bad sign. There's been one CVCGDisplayLink::getDisplayTimes() crash in the 20190906094324 mozilla-central nightly:
https://crash-stats.mozilla.com/report/index/0ec9eda7-3df8-4c50-aca2-e179c0190906
I'll wait a few days to see if a pattern emerges -- for example if there now appear to be fewer crashes than previously. But clearly my patch doesn't work around all of these crashes. One possibility is that there's a timing problem -- that my patch's call(s) to CGDisplayIDToOpenGLDisplayMask() can fail to return '0' even when EnableVsync()'s subsequent call to CGDisplayIDToOpenGLDisplayMask() (via CVDisplayLinkCreateWithCGDisplays()) does return '0' (and triggers the crash).
Assignee | ||
Comment 152•6 years ago
|
||
It's possible that I won't be able to find a true fix/workaround for this bug until I figure out why get_current_display_system_state() sometimes returns NULL. That will be a very tall order.
Assignee | ||
Comment 153•6 years ago
|
||
Assignee | ||
Updated•6 years ago
|
Assignee | ||
Comment 154•6 years ago
|
||
I expect to spend at least several days trying to figure out why get_current_display_system_state() sometimes returns NULL, starting next week. Until I've learned more about that, I probably won't have much to say.
Comment 155•6 years ago
|
||
One or two of those crash reports must be from me. I installed a nightly build, and it crashes consistently when I unplug the external monitors with the laptop lid closed, then reconnect them.
Assignee | ||
Comment 156•6 years ago
|
||
(In reply to Haitao Li from comment #155)
One or two of those crash reports must be from me. I installed a nightly build, and it crashes consistently when I unplug the external monitors with the laptop lid closed, then reconnect them.
Could you write up detailed steps to reproduce? Given past experience, it's very unlikely they'll work for me. But they might, and it'd be good to have your STR on record. Be sure to include what kind of equipment you have -- the computer, the external display, and the cable used to connect them. Thanks in advance!
Assignee | ||
Comment 157•6 years ago
|
||
(Following up comment #154)
In the meantime I don't think we should back out my current patch. It's very unlikely to cause harm, and we might learn something from it -- for example whether or not it reduces the crashes' frequency.
Assignee | ||
Comment 158•6 years ago
•
|
||
I'm giving this another try, with another patch.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2daf38e8178fadd14981ac441a73570fc4025837
This time, rather than trying to anticipate when this bug is going to happen, I detect it after it has happened. Rather than trying to figure out when CVCGDisplayLink::setCurrentDisplay() is going to leave an internal pointer uninitialized (null), I check for it after the fact. I can't access the null pointer directly. But fortunately, when CVCGDisplayLink::setCurrentDisplay() triggers this bug (thanks to get_current_display_system_state() returning null and CGDisplayIDToOpenGLDisplayMask() returning 0), it also leaves the "current display" internal variable uninitialized (zeroed). This I can access, using the documented CVDisplayLinkGetCurrentCGDisplay() function.
As best I can tell, the reason my previous patch worked so poorly is that the condition(s) that cause get_current_display_system_state() to return null can change very quickly, from one call to the next. Given two consecutive calls to this function, there seems to be about a 50/50 chance that the first will return null and the second not, or vice versa (that is if the general conditions for this bug prevail, whatever they are).
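The difference between the two approaches can be shown with a small C++ model. This is a sketch under stated assumptions, not the actual patch: ModelLink, CreateLink, PreCheckThenCreate and CreateThenVerify are hypothetical names, the probeOk and createOk flags stand in for whether get_current_display_system_state() happened to succeed on a given call, and display ID 0 is treated as "unset".

```cpp
#include <cassert>
#include <cstdint>

using DisplayID = std::uint32_t;

// Hypothetical model of a display link; not Apple's real object.
struct ModelLink {
  DisplayID currentDisplay = 0;    // stays 0 when setup silently fails
  bool timingInitialized = false;  // stands in for the hidden pointer
};

// Models link creation: it always "succeeds", but leaves both fields unset
// when the system state lookup failed during this particular call.
ModelLink CreateLink(DisplayID id, bool stateAvailable) {
  ModelLink link;
  if (stateAvailable) {
    link.currentDisplay = id;
    link.timingInitialized = true;
  }
  return link;
}

// First approach: probe beforehand. The probe and the creation are separate
// calls, so they can see different answers.
bool PreCheckThenCreate(DisplayID id, bool probeOk, bool createOk, ModelLink& out) {
  if (!probeOk) {
    return false;
  }
  out = CreateLink(id, createOk);
  return true;  // may hand back a broken link
}

// Second approach: create first, then inspect the observable field.
bool CreateThenVerify(DisplayID id, bool createOk, ModelLink& out) {
  ModelLink link = CreateLink(id, createOk);
  if (link.currentDisplay == 0) {
    return false;  // discard the broken link instead of using it
  }
  out = link;
  return true;
}
```

When the probe succeeds but the creation call does not, PreCheckThenCreate() still returns a link whose timing data is unset, whereas CreateThenVerify() rejects it because the check runs against the very object that will be used.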
I'm going to wait for my try build and its tests to finish. Then tomorrow I'll ask Stephen Pohl for a review. I'd ask Markus Stange again, but I see from bug 1578075 comment #12 that he's going on PTO tomorrow.
Comment 159•6 years ago
|
||
The crash is not as consistent as I first thought. After I installed the nightly build, I got 2 crashes in a row just by unplugging the USB-C display cables and plugging them back in. But after that it didn't happen very often. I did get another crash today. I have a MacBook Pro 2019, and two external LG 4K monitors connected with USB-C cables. But I believe I had the same crash with just one external monitor. I don't use the laptop monitor and an external monitor at the same time. I always plug/unplug monitors when the lid is closed.
Assignee | ||
Comment 160•6 years ago
|
||
Here's my latest patch, for reference.
See comment #123 above for a detailed description of Apple's bug, and comment #158 for an explanation of the new approach I take in this patch (fixing the bug after it happens, rather than trying to avoid it before it happens).
There's yet one more thing that's new about this patch. Above, in comment #127, I said I wanted (at some point in the future) to get rid of the RetryEnableVsync() callback. As best I could tell, in my tests, it didn't serve any useful purpose. I've now changed my mind. While running an earlier version of my latest patch on top of a HookCase hook library that emulates this bug's crashes, I noticed one case of the display not getting updated until I manually refreshed the browser window. This happened when a CMD-TAB command coincided with an early return caused by my workaround for this bug's crashes. I now think it's best that we trigger the callback when my workaround causes an early return. Given that Apple's bug is so intermittent, I think this is unlikely to cause trouble. And it will presumably avoid the display failing to refresh when my workaround is exercised.
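The retry-on-early-return idea can be sketched as follows. This is a simplified, single-threaded C++ model, not the code in the patch: ModelVsyncSource and its task queue are hypothetical, and the linkHealthy callback is an assumption standing in for "link creation produced a usable link on this attempt".

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical model of a vsync source with a tiny task queue standing in
// for the event loop. Not Mozilla's actual OSXVsyncSource.
class ModelVsyncSource {
 public:
  // linkHealthy(attempt) reports whether the given attempt (1-based)
  // yields a usable display link.
  explicit ModelVsyncSource(std::function<bool(int)> linkHealthy)
      : mLinkHealthy(std::move(linkHealthy)) {}

  void EnableVsync() {
    ++mAttempts;
    if (!mLinkHealthy(mAttempts)) {
      // Workaround path: bail out, but queue a retry so that the display
      // is not left without vsync until the user forces a refresh.
      mPending.push([this] { EnableVsync(); });
      return;
    }
    mEnabled = true;
  }

  // Runs only the tasks queued before this call, so a permanently failing
  // link cannot spin forever inside one pass.
  void RunPending() {
    std::size_t count = mPending.size();
    for (std::size_t i = 0; i < count; ++i) {
      std::function<void()> task = std::move(mPending.front());
      mPending.pop();
      task();
    }
  }

  bool Enabled() const { return mEnabled; }
  int Attempts() const { return mAttempts; }

 private:
  std::function<bool(int)> mLinkHealthy;
  std::queue<std::function<void()>> mPending;
  bool mEnabled = false;
  int mAttempts = 0;
};
```

If the first attempt hits the bad state and the second does not, one pass of RunPending() is enough to get vsync enabled without any user action.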
Assignee | ||
Comment 161•6 years ago
|
||
Here's the latest version of my HookCase hook library for testing. It can be used both for general exploration of how the CVDisplayLink functions work in different browsers (when you undefine BUG_1201401_CRASHTEST), and for emulating this bug's crashes. It can also be used to test my latest patch.
Assignee | ||
Comment 162•6 years ago
|
||
Here's a tryserver build I've started with my latest patch:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d02df2907d9fa0529ac336746bc7cebd635cb93a
Assignee | ||
Comment 163•6 years ago
|
||
Assignee | ||
Comment 164•6 years ago
|
||
Haitao Li, you might want to try out this test build made with my latest patch:
Comment 165•6 years ago
|
||
Steven, I'm running your test build now. Will report back tomorrow.
Assignee | ||
Comment 166•6 years ago
|
||
This fixes a small bug in my testing hook library.
Updated•6 years ago
|
Comment 167•6 years ago
|
||
Comment 168•6 years ago
|
||
bugherder |
Assignee | ||
Comment 169•6 years ago
•
|
||
Here's the first mozilla-central nightly with my second patch (build id 20190911215306):
Once again I'm going to keep an eye on the number of crashes in this and subsequent builds. If they stay at zero for a few days, I think we can consider this bug truly fixed.
Updated•6 years ago
|
Comment 170•6 years ago
|
||
I haven't got a crash with the new fix so far. It's looking good.
Assignee | ||
Comment 171•6 years ago
|
||
(In reply to Haitao Li from comment #170)
I haven't got a crash with the new fix so far. It's looking good.
I'm glad to hear it. And the stats table (under Crash Data above) still shows zero crashes with my new patch. But I want to wait a few more days before requesting uplift to the Beta (70) and ESR68 branches, as per comment #150 above.
Assignee | ||
Comment 172•6 years ago
|
||
Assignee | ||
Comment 173•6 years ago
|
||
Assignee | ||
Comment 174•6 years ago
|
||
Comment on attachment 9093104 [details] [diff] [review]
bug1201401 patch for beta (70) branch
Beta/Release Uplift Approval Request
- User impact if declined: Mac topcrasher will remain unfixed
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): The crash workaround makes very reasonable assumptions
- String changes made/needed:
Assignee | ||
Comment 175•6 years ago
|
||
Comment on attachment 9093105 [details] [diff] [review]
bug1201401 patch for esr68 branch
ESR Uplift Approval Request
- If this is not a sec:{high,crit} bug, please state case for ESR consideration: This patch fixes a Mac topcrasher
- User impact if declined: A Mac topcrasher will remain unfixed
- Fix Landed on Version: 71
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): The crash workaround makes very reasonable assumptions
- String or UUID changes made by this patch:
Comment 176•6 years ago
|
||
Comment 177•6 years ago
|
||
bugherder uplift |
Updated•6 years ago
|
Updated•6 years ago
|
Comment 178•6 years ago
|
||
Comment 179•6 years ago
|
||
bugherder uplift |
Comment 180•6 years ago
|
||
Hello,
I tried to reproduce this issue and to verify it, but unfortunately due to technical limitations I was not able to do so. If anybody else can confirm that this issue is fixed, please feel free to do so.
Updated•6 years ago
|
Assignee | ||
Comment 181•6 years ago
|
||
Haitao Li, can you confirm that this bug is fixed on your system?
Daniel, in general this bug is not reproducible. Though a few people see (or have seen) this bug consistently and have reported it here, they've often found that replacing some part of their equipment (say their monitor or their computer) makes the problem go away. Haitao Li is the most recent reporter, and I hope he can confirm that he no longer sees this bug (in current mozilla-central nightlies or Firefox 70 Beta8). But those of us who can't reproduce the bug (including me) have been relying on the stats table above (under Crash Data) to confirm that the bug is fixed -- since no crashes have occurred in builds with my latest patch.
Comment 182•6 years ago
|
||
I have been running nightly builds since your fix was merged in and there hasn't been a crash so far. Before the fix I got it every day. So I'm quite confident it solved my issue at least.
Assignee | ||
Comment 183•6 years ago
|
||
On the strength of Haitao Li's comment, I'm marking this bug verified on the 70 and 71 branches.
Description
•