Crash in mozilla::gl::SurfaceFactory_IOSurface::CreateShared

VERIFIED FIXED in Firefox 65

Status

defect
P1
critical
VERIFIED FIXED
6 months ago
24 days ago

People

(Reporter: marcia, Assigned: haik)

Tracking

({crash, regression})

Trunk
mozilla66
Unspecified
macOS
Points:
---

Firefox Tracking Flags

(firefox-esr60 unaffected, firefox63 unaffected, firefox64 unaffected, firefox65+ verified, firefox66+ verified)

Details

(crash signature)

Attachments

(3 attachments)

This bug was filed from the Socorro interface and is
report bp-8d460584-db2d-4450-90a1-4ab3d0181110.
=============================================================

Seen while looking at nightly crash stats - small volume Mac crash which started in Build ID 20181110100119: https://bit.ly/2FwOPwO. It is also flagged as a potential startup crash. Crash reason for all crashes is EXC_SOFTWARE / SIGABRT.

Possible regression range based on build id: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=5e7636ec12c5c4543b64428e15165031cff32dc4&tochange=39dba5141dd90c70a861299459d418d230148d9f
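
For anyone narrowing the range further, the pushlog link above can be rebuilt from any pair of changesets. This is only a sketch: the helper name `pushlog_url` is hypothetical, and the only thing taken from this bug is the `fromchange`/`tochange` query shape visible in the URL above.

```python
from urllib.parse import urlencode

# Base taken from the regression-range link in this bug.
PUSHLOG_BASE = "https://hg.mozilla.org/mozilla-central/pushloghtml"


def pushlog_url(fromchange: str, tochange: str) -> str:
    """Build a pushlog URL listing the pushes between two changesets."""
    query = urlencode({"fromchange": fromchange, "tochange": tochange})
    return f"{PUSHLOG_BASE}?{query}"


# The two changesets quoted in the report above.
url = pushlog_url(
    "5e7636ec12c5c4543b64428e15165031cff32dc4",
    "39dba5141dd90c70a861299459d418d230148d9f",
)
print(url)
```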

Top 10 frames of crashing thread:

0  @0x7fff68db020a 
1 XUL google_breakpad::CrashGenerationClient::RequestDumpForException toolkit/crashreporter/google-breakpad/src/common/mac/MachIPC.mm:249
2 XUL google_breakpad::ExceptionHandler::WriteMinidumpWithException toolkit/crashreporter/breakpad-client/mac/handler/exception_handler.cc:382
3 XUL google_breakpad::ExceptionHandler::SignalHandler toolkit/crashreporter/breakpad-client/mac/handler/exception_handler.cc:628
4  @0x7fff68f77f59 
5  @0x1582ff81f 
6  @0x7fff68d151ad 
7 AppleIntelHD3000GraphicsGLDriver AppleIntelHD3000GraphicsGLDriver@0x1f4eb 
8 AppleIntelHD3000GraphicsGLDriver AppleIntelHD3000GraphicsGLDriver@0x2bd827 
9  @0x7fff4b28323c 
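
The first few named frames belong to the crash reporter itself, so the interesting frame is the first named one outside `google_breakpad`. A rough sketch of picking it out of the text above, assuming each line is "index, optional module, symbol or address" as shown here (the function names are hypothetical, not part of Socorro):

```python
# Frames copied from the report above.
FRAMES = """\
0  @0x7fff68db020a
1 XUL google_breakpad::CrashGenerationClient::RequestDumpForException toolkit/crashreporter/google-breakpad/src/common/mac/MachIPC.mm:249
2 XUL google_breakpad::ExceptionHandler::WriteMinidumpWithException toolkit/crashreporter/breakpad-client/mac/handler/exception_handler.cc:382
3 XUL google_breakpad::ExceptionHandler::SignalHandler toolkit/crashreporter/breakpad-client/mac/handler/exception_handler.cc:628
4  @0x7fff68f77f59
5  @0x1582ff81f
6  @0x7fff68d151ad
7 AppleIntelHD3000GraphicsGLDriver AppleIntelHD3000GraphicsGLDriver@0x1f4eb
8 AppleIntelHD3000GraphicsGLDriver AppleIntelHD3000GraphicsGLDriver@0x2bd827
9  @0x7fff4b28323c
"""


def parse_frame(line):
    """Split one frame line into (index, module or None, symbol/address)."""
    parts = line.split()
    index = int(parts[0])
    if parts[1].startswith("@0x"):
        # Unsymbolicated frame: no module name, only an address.
        return index, None, parts[1]
    return index, parts[1], parts[2]


def first_non_reporter_module(frames_text):
    """Return (index, module) of the first named frame outside the crash reporter."""
    for line in frames_text.strip().splitlines():
        index, module, symbol = parse_frame(line)
        if module is None or symbol.startswith("google_breakpad::"):
            continue
        return index, module
    return None


print(first_non_reporter_module(FRAMES))
```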

=============================================================
This one is hard to figure out with certainty. There is nothing obvious in the possible regression range or the stack, and it crashes in the driver.
I'll try to insert a call to GLContext::MakeCurrent to see if it is caused by a mismatched or lost context, but with this volume it'll be hard to know for sure whether it made a difference.
Assignee: nobody → nical.bugzilla
Priority: -- → P3
Actually, upon reading more of the WebGL presentation code, we have already made a number of GL calls in the same stack by the time we get there, so there's no way MakeCurrent will catch an issue that wouldn't have blown up before.
19 crashes/11 installs in the last 7 days, not super huge volume. Almost all of the urls are Google maps such as https://www.google.com/maps.
To add to Comment 3, all users are running 10.13.6.
Hi Nicolas, have you been able to make any progress on this?
Flags: needinfo?(nical.bugzilla)
I haven't been able to make meaningful progress. Given that I'll be away for a large part of December and there's WebRender-related work higher up my list, I'll unassign myself for now.
Assignee: nical.bugzilla → nobody
Flags: needinfo?(nical.bugzilla)
Adding 66 as affected. Not much else to go on other than what I noted in earlier comments.
Sotaro, would you be able to take a peek at this? Thank you!
Assignee: nobody → sotaro.ikeda.g
Ok, I am going to take a look :)
Hi Sotaro, is this something you can complete before your PTO?
Flags: needinfo?(sotaro.ikeda.g)
I checked the source code change logs around SurfaceFactory_IOSurface, GLScreenBuffer, GLContext, GLContextCGL and WebGLContext.
But I found no change there that could plausibly be responsible around Build ID 20181110100119.
(In reply to Marcia Knous [:marcia - needinfo? me] from comment #0)
> 
> Possible regression range based on build id:
> https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=5e7636ec12c5c4543b64428e15165031cff32dc4&tochange=39dba5141dd90c70a861299459d418d230148d9f

I checked the range, but there was no change related to WebGL.
(In reply to Marcia Knous [:marcia - needinfo? me] from comment #0)
> 
> Possible regression range based on build id:
> https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=5e7636ec12c5c4543b64428e15165031cff32dc4&tochange=39dba5141dd90c70a861299459d418d230148d9f

In the range, there were changes related to the Mac sandbox (Bug 1498750, Bug 1501126). Just before the range, there is another Mac sandbox change (Bug 1505445). I wonder if these could be related to the problem.
:haik, is it possible that the changes to the Mac sandbox cause the problem?
Flags: needinfo?(sotaro.ikeda.g) → needinfo?(haftandilian)
Assignee

Comment 15

5 months ago
(In reply to Sotaro Ikeda [:sotaro out of office 28/Dec - 3/Jan] from comment #14)
> :haik, is it possible that changes to mac sand box cause the problem?

I think it's probable. The dates of the crash reports align with when the sandboxing change was enabled in Nightly. The breakdown by OS version is

  141 reports on 10.13
    4 reports on 10.14

All the 10.13 reports I looked at crashed in module AppleIntelHD3000GraphicsGLDriver. The following machines (all 2011) are listed as including Intel HD 3000 Graphics. For now, I am assuming the AppleIntelHD3000GraphicsGLDriver module is only ever used on machines with actual Intel HD 3000 hardware.

  Mac mini (Mid 2011)
  Mac mini Server (Mid 2011)
  MacBook Air (11-inch, Mid 2011)
  MacBook Air (13-inch, Mid 2011)
  MacBook Pro (13-inch, Early 2011)
  MacBook Pro (15-inch, Early 2011)
  MacBook Pro (13-inch, Late 2011)
  MacBook Pro (15-inch, Late 2011)
  MacBook Pro (17-inch, Early 2011)
  MacBook Pro (17-inch, Late 2011)

I'll work on getting access to one of these systems to try to debug this. Taking this bug.
Assignee: sotaro.ikeda.g → haftandilian
Flags: needinfo?(haftandilian)
Assignee

Updated

5 months ago
Duplicate of this bug: 1508756
Haik: I have a MacBook Pro (13-inch, Early 2011) with this Graphics Driver if I can be of any help.
(In reply to Marcia Knous [:marcia - needinfo? me] from comment #17)
> Haik: I have a MacBook Pro (13-inch, Early 2011) with this Graphics Driver
> if I can be of any help.

I was able to reproduce the crash on my machine running 10.12.6, but I get a different signature: https://crash-stats.mozilla.com/report/index/c79cfa33-0089-450d-889d-31f830181226#allthreads. I was loading https://www.google.fr/maps/@25.1113955,54.4302496,7z, which is one of the crashing URLs.
Assignee

Comment 19

5 months ago
(In reply to Marcia Knous [:marcia - needinfo? me] from comment #17)
> Haik: I have a MacBook Pro (13-inch, Early 2011) with this Graphics Driver
> if I can be of any help.

That would be really helpful, thank you. The first step would be to see if you can reproduce a crash on Nightly with Google Maps. Is that machine running 10.13.6? If it is reproducible, we can turn on sandbox violation logging and check for anything that could be related.

To enable the logging, set security.sandbox.logging.enabled=true then quit the browser. Before restarting Nightly, open Console.app (/Applications/Utilities/Console) and enter "plugin-container" in the filter/search list and then click the "Clear" icon. Then start Nightly and reproduce the crash. You'll see a lot of output (crash or no crash) in Console. After the crash, you can use Command-A to select all the entries and paste them into a text editor and save the file and I can look through that.

To disable the change that we think caused this, you can set security.sandbox.content.mac.earlyinit=false (requires restart). That might be useful for comparison purposes.
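
The thread sets these prefs through about:config. As an alternative for a throwaway test profile, the same two prefs could be written as `user.js` lines. This is a minimal sketch under that assumption: the helper name `user_pref_line` is hypothetical, and only the pref names and values come from the comment above.

```python
import json


def user_pref_line(name, value):
    """Render one pref in the user_pref("name", value); form used by user.js."""
    # json.dumps yields true/false for booleans and quoted text for strings,
    # which matches the literal syntax user.js expects.
    return f"user_pref({json.dumps(name)}, {json.dumps(value)});"


# Turn on sandbox violation logging before reproducing the crash.
LOGGING_ON = user_pref_line("security.sandbox.logging.enabled", True)

# Disable the early sandbox init change for the comparison run.
EARLYINIT_OFF = user_pref_line("security.sandbox.content.mac.earlyinit", False)

print(LOGGING_ON)
print(EARLYINIT_OFF)
```

Both prefs still need a browser restart to take effect, as noted above.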
Assignee

Comment 20

5 months ago
(In reply to Marcia Knous [:marcia - needinfo? me] from comment #18)
> (In reply to Marcia Knous [:marcia - needinfo? me] from comment #17)
> > Haik: I have a MacBook Pro (13-inch, Early 2011) with this Graphics Driver
> > if I can be of any help.
> 
> I was able to reproduce the crash on my machine running 10.12.6, but a get a
> different signature:
> https://crash-stats.mozilla.com/report/index/c79cfa33-0089-450d-889d-31f830181226#allthreads. I was loading
> https://www.google.fr/maps/@25.1113955,54.4302496,7z which is one of the
> crashing URLs.

Thanks for testing that. I'll assume it is the same problem. The crash stack is missing lots of entries.

If you set security.sandbox.content.mac.earlyinit=false (requires restart), the problem shouldn't be reproducible.

If possible, could you collect the Console.app log (described in my above comment 19) both with/without the crash?
Flags: needinfo?(mozillamarcia.knous)
Here is the console log with the crash.
Flags: needinfo?(mozillamarcia.knous)
Here is the console log without the crash, after tweaking the pref in about:config. After the pref tweak I didn't get a crash.

This is again running on 10.12.6 and not 10.13
Adding another Mac signature seen in crash stats, also has a similar URL to Comment 18, and crashes as well.
Crash Signature: [@ mozilla::gl::SurfaceFactory_IOSurface::CreateShared] → [@ mozilla::gl::SurfaceFactory_IOSurface::CreateShared] [@ @0x7fff6690320a]
Assignee

Comment 24

5 months ago
Marcia, could you give this build a try? (Making sure to revert security.sandbox.content.mac.earlyinit to true.) It's a long shot. I saw some iokit-get-property related errors in the crash log, but not in the no-crash log. However, I think the no-crash log might have been incomplete. (Console.app can be unreliable.) This fix adds back access to iokit-get-properties.

  https://queue.taskcluster.net/v1/task/d_w4xBfoRVu9sToR60hm5w/runs/0/artifacts/public/build/target.dmg
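
The comparison described here (operations denied in the crash log but absent from the no-crash log) can be roughed out as below. Treat it as a sketch only: the sample lines are made up for illustration, the `deny(N) <operation>` layout is an assumption about how Console shows sandbox violations, and the function names are hypothetical. The attached logs may need a different pattern.

```python
import re
from collections import Counter

# Assumed layout: "... deny(1) <operation> <details>".
DENY_RE = re.compile(r"deny(?:\(\d+\))?\s+([a-z0-9*-]+)")


def denied_operations(log_text):
    """Count each sandbox operation that appears after a deny marker."""
    return Counter(DENY_RE.findall(log_text))


def only_in_crash(crash_log, no_crash_log):
    """List operations denied in the crash log but never in the baseline log."""
    crash = denied_operations(crash_log)
    baseline = denied_operations(no_crash_log)
    return sorted(op for op in crash if op not in baseline)


# Hypothetical sample lines, not copied from the attachments.
crash_log = (
    "Sandbox: plugin-container(1234) deny(1) iokit-get-properties device-id\n"
    "Sandbox: plugin-container(1234) deny(1) file-read-data /private/example\n"
)
no_crash_log = (
    "Sandbox: plugin-container(5678) deny(1) file-read-data /private/example\n"
)

print(only_in_crash(crash_log, no_crash_log))
```

Since Console.app can drop entries, an operation missing from the no-crash log is a hint rather than proof, which is why the test build was still needed.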
Flags: needinfo?(mozillamarcia.knous)
I can confirm that I no longer crash using the build in Comment 24. I tested using https://www.google.fr/maps/@25.1113955,54.4302496,7z, which was the site which was consistently crashing previously. I also used a new profile, making sure that the sandbox setting wouldn't be an issue.
Flags: needinfo?(mozillamarcia.knous)
Assignee

Comment 26

5 months ago
Marcia, thank you. Could you test one more binary? This is a more targeted fix and if this works I'll aim to land this fix in Nightly and verify the crashes have stopped.

  https://queue.taskcluster.net/v1/task/JWJmK1QuSauRi5b78ySvpw/runs/0/artifacts/public/build/target.dmg
Flags: needinfo?(mozillamarcia.knous)
(In reply to Haik Aftandilian [:haik] from comment #26)
> Marcia, thank you. Could you test one more binary? This is a more targeted
> fix and if this works I'll aim to land this fix in Nightly and verify the
> crashes have stopped.
> 
>  
> https://queue.taskcluster.net/v1/task/JWJmK1QuSauRi5b78ySvpw/runs/0/artifacts/public/build/target.dmg

I can verify that using this build and a new profile I also don't crash, using the same site as in Comment 25.
Flags: needinfo?(mozillamarcia.knous)
Changing the priority to p1 as the bug is tracked by a release manager for the current beta.
See https://github.com/mozilla/bug-handling/blob/master/policy/triage-bugzilla.md#how-do-you-triage for more information
Priority: P3 → P1
Assignee

Comment 29

5 months ago
Allow access to device-id and vendor-id IOKit properties needed for AppleIntelHD3000GraphicsGLDriver.

Comment 30

5 months ago
Pushed by haftandilian@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/51fa00bbe97e
Crash in mozilla::gl::SurfaceFactory_IOSurface::CreateShared r=Alex_Gaynor

Comment 31

5 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/51fa00bbe97e
Status: NEW → RESOLVED
Last Resolved: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla66
Assignee

Comment 32

5 months ago
Thanks again for the verification, Marcia.

I will uplift this to Beta after verifying we stop receiving crash reports on 66.
Initial Nightly data is looking good - any chance you'd be willing to nominate this for tomorrow's 65.0b8 build? Looks like we'll get stronger signal from that branch given the frequency.
Flags: needinfo?(haftandilian)
Assignee

Comment 34

5 months ago
Comment on attachment 9033780 [details]
Bug 1508277 - Crash in mozilla::gl::SurfaceFactory_IOSurface::CreateShared r?Alex_Gaynor

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: Bug 1505573

User impact if declined: Older 2011 Macs (see comment 15 for which models) may experience crashed tabs. The crash was easily reproducible on Google Maps.

Is this code covered by automated tests?: No

Has the fix been verified in Nightly?: Yes

Needs manual test from QE?: Yes

If yes, steps to reproduce: On an affected Mac, confirm that the Google Maps crash is no longer reproducible with the fix. Marcia was able to reproduce the crash on Nightly with <https://www.google.fr/maps/@25.1113955,54.4302496,7z>.

List of other uplifts needed: None

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): This is a low risk patch because the only change is to the Mac content sandbox policy, the change is small, and the change makes the sandbox slightly less restrictive by allowing content processes to read two properties that were previously blocked.

String changes made/needed: None
Flags: needinfo?(haftandilian)
Attachment #9033780 - Flags: approval-mozilla-beta?
Assignee

Comment 35

5 months ago
(In reply to Haik Aftandilian [:haik] from comment #34)
> Is this code covered by automated tests?: No

To correct that, the code that is being changed is executed by automated tests, but the impact of the changes is hardware dependent.
Comment on attachment 9033780 [details]
Bug 1508277 - Crash in mozilla::gl::SurfaceFactory_IOSurface::CreateShared r?Alex_Gaynor

[Triage Comment]
Avoids a crash caused by sandboxing with some video adapters. Approved for 65.0b8.
Attachment #9033780 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
We don't have any machine matching the hardware list from comment 15, and since this seems to be hardware specific, we can't verify this issue for 65.0b8. Marcia, would you please assist in verifying this fix for 65.0b8?
Flags: needinfo?(mozillamarcia.knous)
I can confirm I don't crash using 20190103150357, 65.0b8 running on 10.12.6 on the affected hardware. I tested with the reliably crashing site https://www.google.fr/maps/@25.1113955,54.4302496,7z.
Flags: needinfo?(mozillamarcia.knous)
Marcia, thank you for the verification. 
Based on comment 25 and comment 39, marking as verified for Nightly 66 and Beta 65.0b8.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
Depends on: 1539098
No longer depends on: 1539098