Closed Bug 1432531 Opened 6 years ago Closed 6 years ago

Enable OMTP on Linux

Categories

(Core :: Graphics: Layers, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
mozilla63
Tracking Status
firefox63 --- fixed

People

(Reporter: rhunt, Assigned: rhunt)

References

Details

(Whiteboard: [gfx-noted])

Attachments

(2 files)

Try runs have been mostly green with a cairo-ft crash.
Is there anything we could do here?
Flags: needinfo?(rhunt)
Mostly this just needs someone to look into the cairo-ft crash that was happening on try, and to follow up with any other regressions that happen. I can't find the try runs that I did with this, so I can't point to an example of it. If anyone is interested, that would be a good start.

I've been working on enabling parallel OMTP with tiling on skia windows so I've been unable to look into this further.
Flags: needinfo?(rhunt)
Okay, I did another try run to see if the crash was still reproducible and got it. [1]

It only happens in AWSY, the rest of the tree looks green. I did a bunch of retriggers for mochitests, reftests, and web-platform-tests to be sure.

This is probably the major blocker to enabling this in nightly.

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=4bfa8d0fe6c467eb776b00a28fe0f1c898baaf50&selectedJob=181274869
Here is one more crash from a retrigger that hadn't finished yet [1]

[1] https://treeherder.mozilla.org/logviewer.html#?job_id=181343295&repo=try&lineNumber=1523
Okay, I think I've figured out a cause of thread unsafety that we are running into with freetype.

The T1 font backend is using global state via the T1_Hints_Funcs which has a pointer to a PS_Hints/T1_Hints struct which is stored on the 'pshinter' module global. PS_Hints is then used when performing hinting which is done by either FT_Load_Glyph, FT_Get_Advances, and FT_Get_Advance. Our per FT_Face lock doesn't protect against this and this can cause crashes.

Oddly enough, from the try runs that Lee and I did that used the global freetype lock around all these functions, we're still seeing crashes. It's uncertain whether they are new crashes or I'm still missing something here.
I got a green run on AWSY here by changing the face lock to also lock the global lock, and making the global lock recursive to work around deadlock as we sometimes hold both on the same thread. [1]

This is a potential option if we can't figure out what's going on here.

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=2508fa7cd218d2f84e67b596119338d7c4bd7907
Copying over the try run from bug 1471261. Once that bug is fixed it looks like we can try to enable OMTP on linux.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a839483173552e4852624d6e6f86429657775d8e&selectedJob=184982246
It looks like the blockers for enabling OMTP on linux are resolved now. I'd like to enable it now early in 63 and see what crashers we get. If things are smooth we should be able to let it ride the trains.
Attachment #8988194 - Flags: review?(dbolter)
Attachment #8988194 - Flags: review?(dbolter) → review+
Any reason OMTP is enabled for Linux and not for all Gtk platforms? There's no Linux-specific widget code on Solaris, Darwin/X11, DragonFly, FreeBSD, NetBSD, OpenBSD. OTOH, XP_LINUX is also defined on Android.
Flags: needinfo?(rhunt)
Comment on attachment 8988194 [details] [diff] [review]
enable-omtp-linux.patch

Review of attachment 8988194 [details] [diff] [review]:
-----------------------------------------------------------------

::: modules/libpref/init/all.js
@@ +5797,4 @@
>  // Open noopener links in a new process
>  pref("dom.noopener.newprocess.enabled", true);
>  
> +#if defined(XP_WIN) || defined(XP_MACOSX) || defined(XP_LINUX)

Better use #if !defined(MOZ_WIDGET_ANDROID) to be explicit only still lacks OMTP and avoid accidentally subsuming bug 1467151.

https://wiki.mozilla.org/Platform/Platform-specific_build_defines
https://searchfox.org/mozilla-central/rev/d29662469051/build/moz.configure/init.configure#933-939
(In reply to Jan Beich from comment #12)
> Any reason OMTP is enabled for Linux and not for all Gtk platforms? There's
> no Linux-specific widget code on Solaris, Darwin/X11, DragonFly, FreeBSD,
> NetBSD, OpenBSD. OTOH, XP_LINUX is also defined on Android.

No particular reason except for ignorance.

I suppose what really needs to happen is turn on OMTP everywhere except
for Android.

(In reply to Jan Beich from comment #13)
> Comment on attachment 8988194 [details] [diff] [review]
> enable-omtp-linux.patch
> 
> Review of attachment 8988194 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: modules/libpref/init/all.js
> @@ +5797,4 @@
> >  // Open noopener links in a new process
> >  pref("dom.noopener.newprocess.enabled", true);
> >  
> > +#if defined(XP_WIN) || defined(XP_MACOSX) || defined(XP_LINUX)
> 
> Better use #if !defined(MOZ_WIDGET_ANDROID) to be explicit only still lacks
> OMTP and avoid accidentally subsuming bug 1467151.
> 
> https://wiki.mozilla.org/Platform/Platform-specific_build_defines
> https://searchfox.org/mozilla-central/rev/d29662469051/build/moz.configure/
> init.configure#933-939

Yes that would probably be better.

The patch that was committed is not subsuming bug 1467151 because we'll
disable OMTP at runtime on android here [1] (edge-padding is enabled only on
android). This was accidental though, I'll try and fix this so the prefs are
accurate.

[1] https://searchfox.org/mozilla-central/rev/d2966246905102b36ef5221b0e3cbccf7ea15a86/gfx/thebes/gfxPlatform.cpp#2726
Flags: needinfo?(rhunt)
Backed out for causing memory leaks

Push that caused the failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=d07e9d2b54945f8344e56ab1063419afc22864e9

Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=185246980&repo=mozilla-inbound&lineNumber=3593

[task 2018-06-27T21:32:51.283Z] 21:32:51     INFO - TEST-INFO | Main app process: exit 0
[task 2018-06-27T21:32:51.285Z] 21:32:51     INFO - TEST-INFO | LeakSanitizer | To show the addresses of leaked objects add report_objects=1 to LSAN_OPTIONS
[task 2018-06-27T21:32:51.287Z] 21:32:51     INFO - TEST-INFO | LeakSanitizer | This can be done in testing/mozbase/mozrunner/mozrunner/utils.py
[task 2018-06-27T21:32:51.288Z] 21:32:51    ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at MakeAndAddRef, mozilla::layers::TextureClient::CreateForRawBufferAccess, mozilla::layers::TextureClient::CreateForDrawing, mozilla::layers::TextureClient::CreateForDrawing
[task 2018-06-27T21:32:51.290Z] 21:32:51    ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at mozilla::layers::ShmemTextureData::Create, mozilla::layers::BufferTextureData::Create, mozilla::layers::TextureClient::CreateForRawBufferAccess, mozilla::layers::TextureClient::CreateForDrawing
[task 2018-06-27T21:32:51.292Z] 21:32:51    ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at allocate, allocate, _M_allocate, std::vector
[task 2018-06-27T21:32:51.294Z] 21:32:51    ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at allocate, _M_allocate_node, _M_push_back_aux, push_back
[task 2018-06-27T21:32:51.295Z] 21:32:51    ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at PLDHashTable::ChangeTable, ShrinkIfAppropriate, PLDHashTable::RemoveEntry, RemoveEntry
[task 2018-06-27T21:32:51.297Z] 21:32:51    ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at Malloc, nsTArray_base, AppendElement, operator
[task 2018-06-27T21:32:51.298Z] 21:32:51    ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at SkSurface_Raster::onNewCanvas, getCachedCanvas, SkSurface::getCanvas, mozilla::gfx::DrawTargetSkia::Init
[task 2018-06-27T21:32:51.300Z] 21:32:51    ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at SkMallocPixelRef::MakeWithProc, SkBitmap::installPixels, SkSurface_Raster::SkSurface_Raster, sk_make_sp
[task 2018-06-27T21:32:51.301Z] 21:32:51    ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at mozilla::gfx::Factory::CreateDrawTargetForData, mozilla::layers::BufferTextureData::BorrowDrawTarget, BorrowDrawTarget, mozilla::layers::TextureClient::Lock
[task 2018-06-27T21:32:51.303Z] 21:32:51    ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at NewSegment, CreateSegment, mozilla::ipc::Shmem::Alloc, mozilla::ipc::IToplevelProtocol::ToplevelState::CreateSharedMemory
[task 2018-06-27T21:32:51.304Z] 21:32:51    ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at EnableBlockingReadLock, mozilla::layers::TextureClient::TextureClient, MakeAndAddRef, mozilla::layers::TextureClient::CreateForRawBufferAccess
[task 2018-06-27T21:32:51.306Z] 21:32:51    ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at mozilla::SchedulerGroup::CreateEventTargetFor, mozilla::SchedulerGroup::CreateEventTargets, mozilla::dom::TabGroup::TabGroup, mozilla::dom::nsIContentChild::GetConstructedEventTarget
[task 2018-06-27T21:32:51.307Z] 21:32:51    ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at sk_make_sp, MakeRasterDirectReleaseProc, SkSurface::MakeRasterDirect, mozilla::gfx::DrawTargetSkia::Init
[task 2018-06-27T21:32:51.309Z] 21:32:51    ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at mozilla::CrossProcessSemaphore::Create, CrossProcessSemaphoreReadLock, EnableBlockingReadLock, mozilla::layers::TextureClient::TextureClient
[task 2018-06-27T21:32:51.310Z] 21:32:51    ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at mozilla::layers::TextureClient::CreateIPDLActor, AllocPTextureChild, CreateTexture, mozilla::layers::CompositorBridgeChild::CreateTexture
[task 2018-06-27T21:32:51.312Z] 21:32:51    ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at MakeUnique, IProtocol, mozilla::layers::PTextureChild::PTextureChild, TextureChild
[task 2018-06-27T21:32:51.314Z] 21:32:51    ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at SkCanvas::SkCanvas, SkSurface_Raster::onNewCanvas, getCachedCanvas, SkSurface::getCanvas
[task 2018-06-27T21:32:51.315Z] 21:32:51     INFO - runtests.py | Application ran for: 0:12:43.149978

Backout: https://hg.mozilla.org/integration/mozilla-inbound/rev/c50141d4c2cdf1e244c2718773580fe62bf53c5a
Flags: needinfo?(rhunt)
It looks like the culprit is in CompositorBridgeChild::Destroy. We can take an early exit if we cannot send over IPC anymore. If we do that we don't end up flushing any async paints. For some reason the destructor for the CompositorBridgeChild doesn't seem to run, but doesn't get counted as a leak?

I don't entirely understand what's going on here, but making us always flush async paints resolves this issue and seems harmless. [1]

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=e950e0b4779834655e233b67685debee0b06cd1d
Flags: needinfo?(rhunt)
Attachment #8989468 - Flags: review?(nical.bugzilla)
Attachment #8989468 - Flags: review?(nical.bugzilla) → review+
Pushed by rhunt@eqrion.net:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a5bb1dc293cb
Enable OMTP on linux. r=davidb
https://hg.mozilla.org/integration/mozilla-inbound/rev/ea49b4ea3828
Move FlushAsyncPaints so it is unconditionally run in CompositorBridgeChild::Destroy. r=nical
https://hg.mozilla.org/mozilla-central/rev/a5bb1dc293cb
https://hg.mozilla.org/mozilla-central/rev/ea49b4ea3828
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Push from comment 19 brought us big perf improvements!

== Change summary for alert #14207 (as of Wed, 04 Jul 2018 14:21:29 GMT) ==

Improvements:

 84%  rasterflood_gradient linux64 opt e10s stylo     716.17 -> 1,320.75
 79%  rasterflood_gradient linux64 pgo e10s stylo     722.58 -> 1,294.25
 36%  rasterflood_svg linux64 opt e10s stylo          18,674.88 -> 11,909.41
 35%  tsvgr_opacity linux64 opt e10s stylo            191.16 -> 123.39
 35%  rasterflood_svg linux64 pgo e10s stylo          19,423.27 -> 12,550.22
 34%  tsvgr_opacity linux64 pgo e10s stylo            178.00 -> 117.86
  3%  tsvgx linux64 opt e10s stylo                    197.32 -> 191.30

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=14207
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: