Closed Bug 999713 Opened 11 years ago Closed 7 years ago

Call MakeCurrent universally, but lazily

Tracking

()

Status:

RESOLVED DUPLICATE of bug 1390386

Milestone:

mozilla31

People

(Reporter: jgilbert, Assigned: jgilbert)

References

Details

Attachments

(8 files, 10 obsolete files)

patch 0: Kill whitespace in files we're going to touch 11 years ago Kelsey Gilbert [:jgilbert] 28.46 KB, patch	u480271 : review+ jgilbert : checkin+	Details \| Diff \| Splinter Review
patch 1: Cache MakeCurrent calls 11 years ago Kelsey Gilbert [:jgilbert] 10.24 KB, patch	bjacob : review+ BenWa : review-	Details \| Diff \| Splinter Review
patch 2: Always call MakeCurrent 11 years ago Kelsey Gilbert [:jgilbert] 2.46 KB, patch	bjacob : review+	Details \| Diff \| Splinter Review
patch 3: Split MakeCurrent(bool) into MaybeMakeCurrent and ForceMakeCurrent 11 years ago Kelsey Gilbert [:jgilbert] 66.63 KB, patch	bjacob : review+ mattwoodrow : review+	Details \| Diff \| Splinter Review
patch 4: Remove IsCurrent asserts, since we're lazily-implicitly current 11 years ago Kelsey Gilbert [:jgilbert] 8.63 KB, patch	bjacob : review+	Details \| Diff \| Splinter Review
patch 5: fGetError needs to MakeCurrent 11 years ago Kelsey Gilbert [:jgilbert] 1.54 KB, patch	bjacob : review+	Details \| Diff \| Splinter Review
patch 6: Hide the MakeCurrent "skipped N" printf behind DebugMode(). 11 years ago Kelsey Gilbert [:jgilbert] 955 bytes, patch	bjacob : review+	Details \| Diff \| Splinter Review
patch 7: Fixups 11 years ago Kelsey Gilbert [:jgilbert] 9.73 KB, text/plain		Details
patch 1: Cache MakeCurrent calls 11 years ago Kelsey Gilbert [:jgilbert] 12.62 KB, patch	bjacob : review-	Details \| Diff \| Splinter Review
patch 2: Split MakeCurrent(bool) into MaybeMakeCurrent and ForceMakeCurrent, and update callers 11 years ago Kelsey Gilbert [:jgilbert] 53.95 KB, patch		Details \| Diff \| Splinter Review
patch 3: Remove IsCurrent asserts, since we're lazily-implicitly current 11 years ago Kelsey Gilbert [:jgilbert] 13.42 KB, patch	jgilbert : review+	Details \| Diff \| Splinter Review
patch 4: fGetError needs to MakeCurrent 11 years ago Kelsey Gilbert [:jgilbert] 1.61 KB, patch	jgilbert : review+	Details \| Diff \| Splinter Review
patch 5: Only cache on B2G, or by flipping a pref. 11 years ago Kelsey Gilbert [:jgilbert] 5.71 KB, patch		Details \| Diff \| Splinter Review
patch 5: Only cache on B2G, or by flipping a pref. 11 years ago Kelsey Gilbert [:jgilbert] 6.13 KB, patch		Details \| Diff \| Splinter Review
patch 1: Cache MakeCurrent calls 10 years ago Kelsey Gilbert [:jgilbert] 16.96 KB, patch	bjacob : review-	Details \| Diff \| Splinter Review
patch 2.0: Update callers in gfx/gl 10 years ago Kelsey Gilbert [:jgilbert] 28.88 KB, patch	bjacob : review- mattwoodrow : review+	Details \| Diff \| Splinter Review
patch 2.1: Update callers in content/canvas/src 10 years ago Kelsey Gilbert [:jgilbert] 5.37 KB, patch	u480271 : review+	Details \| Diff \| Splinter Review
patch 2.2: Update callers in gfx/layers and misc 10 years ago Kelsey Gilbert [:jgilbert] 19.24 KB, patch	mattwoodrow : review+	Details \| Diff \| Splinter Review

Kelsey Gilbert [:jgilbert]

Assignee

Description

•

11 years ago

Currently, we need to make sure we call MakeCurrent before making GL calls. In DEBUG builds, we get an assert which makes sure we are current. One path we investigated in bug 972668 was to statically assure that we were calling MakeCurrent in our WebGL code. An alternative, investigated here, is to instead MakeCurrent in GLContext.h's fFoo() functions. This guarantees we call MakeCurrent before running GL calls, but would be quite slow, especially since the compositor doesn't have this same overhead, though WebGL nearly does this already. So, we should try to cache whether or not we're the current context on the thread, and only MakeCurrent if we aren't. One complication for this is plugin code. When we enter plugin code, the plugin can MakeCurrent something on our thread. To work with this, we need to invalidate our cache when we exit plugin execution. However, since plugins can re-enter into Gecko code, we also need to invalidate our cache on plugin *entry*, not just exit.

Kelsey Gilbert [:jgilbert]

Assignee

Comment 1

•

11 years ago

Attached patch patch 0: Kill whitespace in files we're going to touch — Details — Splinter Review

Attachment #8410545 - Flags: review?(dglastonbury)

Kelsey Gilbert [:jgilbert]

Assignee

Comment 2

•

11 years ago

Attached patch patch 1: Cache MakeCurrent calls (obsolete) — Details — Splinter Review

Attachment #8410547 - Flags: review?(bjacob)

Kelsey Gilbert [:jgilbert]

Assignee

Comment 3

•

11 years ago

Attached patch patch 2: Always call MakeCurrent (obsolete) — Details — Splinter Review

Attachment #8410548 - Flags: review?(bjacob)

Kelsey Gilbert [:jgilbert]

Assignee

Comment 4

•

11 years ago

Attached patch patch 3: Split MakeCurrent(bool) into MaybeMakeCurrent and ForceMakeCurrent (obsolete) — Details — Splinter Review

Attachment #8410550 - Flags: review?(matt.woodrow)

Attachment #8410550 - Flags: review?(bjacob)

Kelsey Gilbert [:jgilbert]

Assignee

Comment 5

•

11 years ago

Attached patch patch 4: Remove IsCurrent asserts, since we're lazily-implicitly current (obsolete) — Details — Splinter Review

Attachment #8410551 - Flags: review?(bjacob)

Kelsey Gilbert [:jgilbert]

Assignee

Comment 6

•

11 years ago

Attached patch patch 5: fGetError needs to MakeCurrent (obsolete) — Details — Splinter Review

Attachment #8410553 - Flags: review?(bjacob)

Kelsey Gilbert [:jgilbert]

Assignee

Comment 7

•

11 years ago

Attached patch patch 6: Hide the MakeCurrent "skipped N" printf behind DebugMode(). (obsolete) — Details — Splinter Review

Attachment #8410554 - Flags: review?(bjacob)

Kelsey Gilbert [:jgilbert]

Assignee

Updated

•

11 years ago

Comment 8

•

11 years ago

Without the last printf patch: https://tbpl.mozilla.org/?tree=Try&rev=b244fafa7280 Green except printf-during-shutdown? With: https://tbpl.mozilla.org/?tree=Try&rev=48dd95dbcb79

u480271

Comment 9

•

11 years ago

Comment on attachment 8410545 [details] [diff] [review] patch 0: Kill whitespace in files we're going to touch Review of attachment 8410545 [details] [diff] [review]: ----------------------------------------------------------------- Death to trailing whitespace.

Attachment #8410545 - Flags: review?(dglastonbury) → review+

Kelsey Gilbert [:jgilbert]

Assignee

Comment 10

•

11 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/b06254e402b2 Let's get that in, regardless.

Kelsey Gilbert [:jgilbert]

Assignee

Updated

•

11 years ago

Attachment #8410545 - Flags: checkin+

Carsten Book [:Tomcat]

Comment 11

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/b06254e402b2

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla31

Benoit Jacob [:bjacob] (mostly away)

Comment 12

•

11 years ago

It seems that a [leave open] was intended here.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Whiteboard: [leave open]

Benoit Jacob [:bjacob] (mostly away)

Comment 13

•

11 years ago

Comment on attachment 8410547 [details] [diff] [review] patch 1: Cache MakeCurrent calls Review of attachment 8410547 [details] [diff] [review]: ----------------------------------------------------------------- R+ with mandatory nits on gfx/ parts. You need another reviewer than me for NPAPI parts. Perhaps :BenWa? ::: gfx/gl/GLContext.h @@ +2483,5 @@ > typedef gfx::SharedSurfaceType SharedSurfaceType; > typedef gfx::SurfaceFormat SurfaceFormat; > > + // Current context tracking (by thread) > + static unsigned sTLSId_CurrentContext; I would recommend that you also port this to MFBT ThreadLocal, but it's good to save that for a separate patch. @@ +2491,5 @@ > + PR_NewThreadPrivateIndex(&sTLSId_CurrentContext, nullptr); > + } > + > + static const GLContext* GetCurrentContext() { > + return (const GLContext*)PR_GetThreadPrivate(sTLSId_CurrentContext); static_cast, not C cast. @@ +2505,4 @@ > virtual bool MakeCurrentImpl(bool aForce) = 0; > > +private: > + uint32_t mSkippedMakeCurrents; I don't suppose that you want to keep this for landing? @@ +2520,5 @@ > + ++mSkippedMakeCurrents; > + return true; > + } > + } > + printf_stderr("Couldn't skip MakeCurrent. (%u skips)\n", mSkippedMakeCurrents); Remove this printf or restrict it somehow so it's not so noisy?

Attachment #8410547 - Flags: review?(bjacob)

Attachment #8410547 - Flags: review?(bgirard)

Attachment #8410547 - Flags: review+

Benoit Jacob [:bjacob] (mostly away)

Comment 14

•

11 years ago

I would quite strongly recommend that you insert the port to MFBT ThreadLocal as part 1.5 here, so that you immediately limit the overhead of thread local storage. I seem to remember that (perhaps depending on the platform) MFBT ThreadLocal can be significantly lower overhead than NSPR.

Benoit Jacob [:bjacob] (mostly away)

Updated

•

11 years ago

Attachment #8410548 - Flags: review?(bjacob) → review+

Benoit Jacob [:bjacob] (mostly away)

Comment 15

•

11 years ago

Comment on attachment 8410550 [details] [diff] [review] patch 3: Split MakeCurrent(bool) into MaybeMakeCurrent and ForceMakeCurrent Review of attachment 8410550 [details] [diff] [review]: ----------------------------------------------------------------- MaybeMakeCurrent is a bad name because it doesn't reflect the fact that it actually is meant to honor a precise contract: The contract (IIUC) is that when MaybeMakeCurrent returns, the given GLContext is current on the current thread. How about renaming MaybeMakeCurrent to EnsureCurrent ? ForceMakeCurrent is fine. ::: gfx/gl/GLContext.cpp @@ -459,5 @@ > > }; > > mInitialized = LoadSymbols(&symbols[0], trygl, prefix); > - MakeCurrent(); This has me concerned. What happens if we create a context, destroy it, and create a new context, and that new context happens to have exactly the same address as the old one? We wouldn't want to skip the initial makecurrent then. I think it would be safer to keep this makecurrent and make it a ForceMakeCurrent. OK, I realize (now) that MarkDestroyed below does null the TLS pointer so that shouldn't actually ever happen. Still, this seems dangerous enough, and the benefit of skipping one makecurrent during init is so low, that I would rather have us ForceMakeCurrent here. ::: gfx/gl/GLContextProviderEGL.cpp @@ +370,5 @@ > Screen()->AssureBlitted(); > } > > mSurfaceOverride = surf ? (EGLSurface) surf : mSurface; > + ForceMakeCurrent(); good catch, @@ +432,5 @@ > mSurface = mozilla::gl::CreateSurfaceForWindow(nullptr, mConfig); // the nullptr here is where we assume Android. > if (mSurface == EGL_NO_SURFACE) { > return false; > } > + return ForceMakeCurrent(); and another good catch. I wonder if we could go full monty and also TLS the surface that we're current against. But that might be overkill, and totally unneeded outside of EGL. The concept itself may be mostly EGL specific. ::: gfx/layers/opengl/CompositorOGL.cpp @@ +661,5 @@ > // to make sure that GL sees the updated widget size. > if (mWidgetSize.width != width || > mWidgetSize.height != height) > { > + mGLContext->ForceMakeCurrent(); I'm surprised that we're not checking the return value here (to detect context loss etc).

Attachment #8410550 - Flags: review?(bjacob) → review+

Benoit Jacob [:bjacob] (mostly away)

Comment 16

•

11 years ago

(In reply to Jeff Gilbert [:jgilbert] from comment #0) > One complication for this is plugin code. When we enter plugin code, the > plugin can MakeCurrent something on our thread. To work with this, we need > to invalidate our cache when we exit plugin execution. However, since > plugins can re-enter into Gecko code, we also need to invalidate our cache > on plugin *entry*, not just exit. What about widget code, on platforms where we may have in-main-thread widget calls that may use OpenGL? Thinking about Mac here, and perhaps some flavors of Linux.

Benoit Jacob [:bjacob] (mostly away)

Comment 17

•

11 years ago

Need an reply to comment 16.

Flags: needinfo?(jgilbert)

Benoit Jacob [:bjacob] (mostly away)

Updated

•

11 years ago

Attachment #8410551 - Flags: review?(bjacob) → review+

Benoit Jacob [:bjacob] (mostly away)

Updated

•

11 years ago

Attachment #8410553 - Flags: review?(bjacob) → review+

Benoit Jacob [:bjacob] (mostly away)

Updated

•

11 years ago

Attachment #8410554 - Flags: review?(bjacob) → review+

Benoit Jacob [:bjacob] (mostly away)

Comment 18

•

11 years ago

I'm concerned that having the Skipped counter on in release builds is still like wasting 1 cache line for little good reason.

Benoit Girard (:BenWa)

Comment 19

•

11 years ago

Comment on attachment 8410547 [details] [diff] [review] patch 1: Cache MakeCurrent calls Review of attachment 8410547 [details] [diff] [review]: ----------------------------------------------------------------- r- because we're still missing possible calls to MakeCurrent. ::: dom/plugins/base/nsNPAPIPluginInstance.cpp @@ +1845,5 @@ > + if (aReentryState == NS_PLUGIN_CALL_UNSAFE_TO_REENTER_GECKO) { > + ++gInUnsafePluginCalls; > + } > + > + mozilla::gl::GLContext::SetCurrentContext(nullptr); You're assuming that all plugins execution will go through this Begin which is incorrect. A plugin is free to post a event to the native event loop to set a timer. It's probably very rare however for plugins to be loaded in the main thread since plugins are generally off (but not always) out of process. You might just want to have a way to disable your optimization if we have a gecko main thread plugin loaded. Don't forget that native widget (or more generally native code) may also post events that call MakeCurrent or even as a response/callback to the current gecko event invoke MakeCurrent. We have to be very careful with this kind of optimization.

Attachment #8410547 - Flags: review?(bgirard) → review-

Matt Woodrow (:mattwoodrow)

Updated

•

11 years ago

Attachment #8410550 - Flags: review?(matt.woodrow) → review+

Kelsey Gilbert [:jgilbert]

Assignee

Comment 20

•

11 years ago

https://tbpl.mozilla.org/?tree=Try&rev=48dd95dbcb79

Flags: needinfo?(jgilbert)

Kelsey Gilbert [:jgilbert]

Assignee

Comment 21

•

11 years ago

:bjacob and I talked about this. For now, we want to basically restrict it to B2G, which doesn't have dangerous plugins. We should also be able to enable it for platforms where we won't have widgets doing GL, so Windows. Based on the Try run, which includes an assert that our cached value is correct, I'm pretty sure we're being over-cautious about widget code clobbering GL state. If we're worried about this in production, we can force-make-current once in the compositor, so at least that won't ever be messed up. :bgirard's point about plugins posting directly to the main thread is well taken. In light of that concern, we should hook glMakeCurrent, and intercept calls to it. This should handle everything perfectly. (and I think we support overriding functions on all platforms?)

Kelsey Gilbert [:jgilbert]

Assignee

Updated

•

11 years ago

Status: REOPENED → ASSIGNED

Benoit Girard (:BenWa)

Comment 22

•

11 years ago

(In reply to Benoit Jacob [:bjacob] from comment #18) > I'm concerned that having the Skipped counter on in release builds is still > like wasting 1 cache line for little good reason. Sorry, Hijacking: Regarding wasting a cacheline. I think we should consider doing some manual PGO and having a 'hot' segment in our binary. This would be code that is hot in our vanilla (no options flipped) builds. The profiler's state tracking is another thing that is constantly adding cache pressure just to check the state. (In reply to Jeff Gilbert [:jgilbert] from comment #21) B2G is probably a safe place to start (and good bang for our buck). Note that hooking glMakeCurrent is dangerous because it might just be an external entry point. The driver and/or native platform may have change the current context through any mean via an internal symbol.

Kelsey Gilbert [:jgilbert]

Assignee

Comment 23

•

11 years ago

Green at last: https://tbpl.mozilla.org/?tree=Try&rev=e668546557b1

Kelsey Gilbert [:jgilbert]

Assignee

Updated

•

11 years ago

Whiteboard: [leave open]

Jeff Muizelaar [:jrmuizel]

Comment 24

•

11 years ago

Comment on attachment 8410548 [details] [diff] [review] patch 2: Always call MakeCurrent Review of attachment 8410548 [details] [diff] [review]: ----------------------------------------------------------------- Doesn't this add overhead to every compositor GL call when we know that nobody will ever clobber the current context?

Kelsey Gilbert [:jgilbert]

Assignee

Comment 25

•

11 years ago

(In reply to Jeff Muizelaar [:jrmuizel] from comment #24) > Comment on attachment 8410548 [details] [diff] [review] > patch 2: Always call MakeCurrent > > Review of attachment 8410548 [details] [diff] [review]: > ----------------------------------------------------------------- > > Doesn't this add overhead to every compositor GL call when we know that > nobody will ever clobber the current context? A very small amount, sure. We can add an override for this.

Jukka Jylänki

Updated

•

11 years ago

Blocks: 1014011

Jukka Jylänki

Updated

•

11 years ago

No longer blocks: 1014011

Kelsey Gilbert [:jgilbert]

Assignee

Comment 26

•

11 years ago

Attached file patch 7: Fixups (obsolete) — Details

Attachment #8431923 - Flags: review?(bjacob)

Kelsey Gilbert [:jgilbert]

Assignee

Comment 27

•

11 years ago

Comment on attachment 8431923 [details] patch 7: Fixups Nevermind, combined these into previous patches. After I get a building Try run, I'll post the new patch queue.

Attachment #8431923 - Attachment is obsolete: true

Attachment #8431923 - Attachment is patch: false

Attachment #8431923 - Flags: review?(bjacob)

Benoit Jacob [:bjacob] (mostly away)

Comment 28

•

11 years ago

I'm booked this week anyway, due to the 'bootcamp' week going on in Toronto, https://intranet.mozilla.org/Bootcamp1

Kelsey Gilbert [:jgilbert]

Assignee

Comment 29

•

11 years ago

Attached patch patch 1: Cache MakeCurrent calls (obsolete) — Details — Splinter Review

Should be mostly the same. Please check that I did the TLS correctly.

Attachment #8410547 - Attachment is obsolete: true

Attachment #8434482 - Flags: review?(bjacob)

Kelsey Gilbert [:jgilbert]

Assignee

Comment 30

•

11 years ago

Attached patch patch 2: Split MakeCurrent(bool) into MaybeMakeCurrent and ForceMakeCurrent, and update callers (obsolete) — Details — Splinter Review

Sorry this is big again/still.

Attachment #8410548 - Attachment is obsolete: true

Attachment #8410550 - Attachment is obsolete: true

Attachment #8434486 - Flags: review?(bjacob)

Kelsey Gilbert [:jgilbert]

Assignee

Comment 31

•

11 years ago

Attached patch patch 3: Remove IsCurrent asserts, since we're lazily-implicitly current — Details — Splinter Review

r=bjacob

Attachment #8410551 - Attachment is obsolete: true

Attachment #8434488 - Flags: review+

Kelsey Gilbert [:jgilbert]

Assignee

Updated

•

11 years ago

Attachment #8434488 - Attachment description: patch 4: Remove IsCurrent asserts, since we're lazily-implicitly current → patch 3: Remove IsCurrent asserts, since we're lazily-implicitly current

Kelsey Gilbert [:jgilbert]

Assignee

Comment 32

•

11 years ago

Attached patch patch 4: fGetError needs to MakeCurrent — Details — Splinter Review

r=bjacob

Attachment #8410553 - Attachment is obsolete: true

Attachment #8434490 - Flags: review+

Kelsey Gilbert [:jgilbert]

Assignee

Comment 33

•

11 years ago

Attached patch patch 5: Only cache on B2G, or by flipping a pref. (obsolete) — Details — Splinter Review

Attachment #8410554 - Attachment is obsolete: true

Attachment #8434491 - Flags: review?(bjacob)

Kelsey Gilbert [:jgilbert]

Assignee

Comment 34

•

11 years ago

https://tbpl.mozilla.org/?tree=Try&rev=482b1c030fcb

Kelsey Gilbert [:jgilbert]

Assignee

Comment 35

•

11 years ago

Attached patch patch 5: Only cache on B2G, or by flipping a pref. — Details — Splinter Review

Oops, that didn't build.

Attachment #8434491 - Attachment is obsolete: true

Attachment #8434491 - Flags: review?(bjacob)

Attachment #8434557 - Flags: review?(bjacob)

Benoit Jacob [:bjacob] (mostly away)

Updated

•

10 years ago

Attachment #8434482 - Flags: review?(bjacob)

Benoit Jacob [:bjacob] (mostly away)

Updated

•

10 years ago

Attachment #8434486 - Flags: review?(bjacob)

Benoit Jacob [:bjacob] (mostly away)

Updated

•

10 years ago

Attachment #8434557 - Flags: review?(bjacob)

Benoit Jacob [:bjacob] (mostly away)

Comment 36

•

10 years ago

Sorry about the long delay, and the decision that might sound disappointing: I shouldn't anymore be the one reviewing these patches. I believe that the people doing nontrivial/design reviews like this, contributing nontrivial decisions, should be the _same_ people who would suffer the consequences if the decision is wrong i.e the people on the hook to fix regressions and dealing with that code in their daily work. So as I'm moving out of gfx, I should not be doing such reviews anymore. I can still give feedback, but I've already given positive feedback here earlier. I really like the idea of these patches and I expect that they're going to be all right, but I'm not 100% certain. More eyes on these patches would be a good thing.

Benoit Jacob [:bjacob] (mostly away)

Comment 37

•

10 years ago

Comment on attachment 8434482 [details] [diff] [review] patch 1: Cache MakeCurrent calls Review of attachment 8434482 [details] [diff] [review]: ----------------------------------------------------------------- You will need a review from :BenWa for the plugin stuff. ::: gfx/gl/GLContext.h @@ +617,5 @@ > > void BeforeGLCall(const char* glFunction) > { > MOZ_ASSERT(IsCurrent()); > + Instead of asserting on IsCurrent, just call MakeCurrent here ... @@ +657,5 @@ > static void AssertNotPassingStackBufferToTheGL(const void* ptr); > > #define BEFORE_GL_CALL \ > do { \ > + MakeCurrent(); \ ...rather than here? It's always nice to avoid hiding code in macros. Plus, this removes the need for the assertion above. @@ +678,5 @@ > > +#define BEFORE_GL_CALL \ > + do { \ > + MakeCurrent(); \ > + } while (0) Since that BEFORE_GL_CALL macro is now nontrivial even in non-DEBUG, I think that you should just remove it and instead just call BeforeGLCall (which should be inlined by the compiler), and move the #ifdef DEBUG to there. @@ +2516,5 @@ > typedef gfx::SurfaceFormat SurfaceFormat; > > +public: > + static void StaticInit() { > + DebugOnly<bool> ok = sTLS_CurrentContext.init(); What is the point of that DebugOnly variable? Did you mean to assert on it? (Would be a good idea). @@ +2520,5 @@ > + DebugOnly<bool> ok = sTLS_CurrentContext.init(); > + } > + > + static const GLContext* GetCachedCurrentContext() { > + return (const GLContext*)sTLS_CurrentContext.get(); What is the point of this pointer cast? None I think! Also, C casts for pointer types are an unconditional r- :-) @@ +2532,4 @@ > virtual bool MakeCurrentImpl(bool aForce) = 0; > > +private: > + DebugOnly<uint32_t> mSkippedMakeCurrents; This is neat for experimentation to help prove that this optimization is useful. Is it worth having in mozilla-central?

Attachment #8434482 - Flags: review-

Jeff Muizelaar [:jrmuizel]

Comment 38

•

10 years ago

(In reply to Benoit Jacob [:bjacob] from comment #37) > Comment on attachment 8434482 [details] [diff] [review] We should also get performance numbers for this work. I'm worried about the unnecessary tls lookups.

Benoit Jacob [:bjacob] (mostly away)

Comment 39

•

10 years ago

Oh yes, great point, do get performance numbers before we go on looking at code.

Kelsey Gilbert [:jgilbert]

Assignee

Comment 40

•

10 years ago

(In reply to Jeff Muizelaar [:jrmuizel] from comment #38) > (In reply to Benoit Jacob [:bjacob] from comment #37) > > Comment on attachment 8434482 [details] [diff] [review] > > We should also get performance numbers for this work. I'm worried about the > unnecessary tls lookups. LayersBench(? BenWa's thing) showed no difference.

Jeff Muizelaar [:jrmuizel]

Comment 41

•

10 years ago

(In reply to Jeff Gilbert [:jgilbert] from comment #40) > (In reply to Jeff Muizelaar [:jrmuizel] from comment #38) > > (In reply to Benoit Jacob [:bjacob] from comment #37) > > > Comment on attachment 8434482 [details] [diff] [review] > > > > We should also get performance numbers for this work. I'm worried about the > > unnecessary tls lookups. > > LayersBench(? BenWa's thing) showed no difference. On which platform? It would also be good to know the actual cost of the tls lookup.

Kelsey Gilbert [:jgilbert]

Assignee

Comment 42

•

10 years ago

Attached patch patch 1: Cache MakeCurrent calls — Details — Splinter Review

Attachment #8434482 - Attachment is obsolete: true

Attachment #8443778 - Flags: review?(bjacob)

Kelsey Gilbert [:jgilbert]

Assignee

Comment 43

•

10 years ago

Attached patch patch 2.0: Update callers in gfx/gl — Details — Splinter Review

Attachment #8434486 - Attachment is obsolete: true

Attachment #8443779 - Flags: review?(matt.woodrow)

Attachment #8443779 - Flags: review?(bjacob)

Kelsey Gilbert [:jgilbert]

Assignee

Comment 44

•

10 years ago

Attached patch patch 2.1: Update callers in content/canvas/src — Details — Splinter Review

Attachment #8443780 - Flags: review?(dglastonbury)

Kelsey Gilbert [:jgilbert]

Assignee

Comment 45

•

10 years ago

Attached patch patch 2.2: Update callers in gfx/layers and misc — Details — Splinter Review

Attachment #8443782 - Flags: review?(matt.woodrow)

Kelsey Gilbert [:jgilbert]

Assignee

Comment 46

•

10 years ago

(In reply to Jeff Muizelaar [:jrmuizel] from comment #41) > (In reply to Jeff Gilbert [:jgilbert] from comment #40) > > (In reply to Jeff Muizelaar [:jrmuizel] from comment #38) > > > (In reply to Benoit Jacob [:bjacob] from comment #37) > > > > Comment on attachment 8434482 [details] [diff] [review] > > > > > > We should also get performance numbers for this work. I'm worried about the > > > unnecessary tls lookups. > > > > LayersBench(? BenWa's thing) showed no difference. > > On which platform? It would also be good to know the actual cost of the tls > lookup. Nexus 10. In these new patches, I add a manual mode for the compositor thread. Given this manual mode, I'm tempted to just s/EnsureCurrent/MakeCurrent/, so we don't need to do a busywork change to all callers.

Kelsey Gilbert [:jgilbert]

Assignee

Comment 47

•

10 years ago

https://tbpl.mozilla.org/?tree=Try&rev=2e643bc9bbc9

Matt Woodrow (:mattwoodrow)

Updated

•

10 years ago

Attachment #8443779 - Flags: review?(matt.woodrow) → review+

Matt Woodrow (:mattwoodrow)

Updated

•

10 years ago

Attachment #8443782 - Flags: review?(matt.woodrow) → review+

Benoit Jacob [:bjacob] (mostly away)

Comment 48

•

10 years ago

Comment on attachment 8443778 [details] [diff] [review] patch 1: Cache MakeCurrent calls Review of attachment 8443778 [details] [diff] [review]: ----------------------------------------------------------------- ::: gfx/gl/GLContext.h @@ +2518,5 @@ > + MOZ_ASSERT(ok); > + } > + > + static inline const GLContext* GetCachedCurrentContext() { > + return (const GLContext*)sTLS_CurrentContext.get(); I r-'d an earlier version of this patch because this is a C cast of pointer types. I still think that that's a sufficient reason to r- :-) There are other things in this patch that seem to ignore my earlier review comments, but I can't guess which ones are intentional (because my suggestions were bad?) For instance, does BEFORE_GL_CALL really need to be a macro still? If that's only needed to get the FUNCTION_NAME, then the nontrivial work could all be deferred to an inline BeforeGLCall function taking the function name literal string as parameter. Generally, I think that there is a lot of value in minimizing the amount of interesting work hidden behind macros. Finally (and most importantly) I thought we agreed that what's really needed here, before we get to discuss reviewing details, is performance analysis on some real-world workloads on real typical B2G hardware (which is less fast than a Nexus 10).

Attachment #8443778 - Flags: review?(bjacob) → review-

Benoit Jacob [:bjacob] (mostly away)

Comment 49

•

10 years ago

Comment on attachment 8443779 [details] [diff] [review] patch 2.0: Update callers in gfx/gl Review of attachment 8443779 [details] [diff] [review]: ----------------------------------------------------------------- ::: gfx/gl/GLContextProviderEGL.cpp @@ +371,5 @@ > Screen()->AssureBlitted(); > } > > mSurfaceOverride = surf ? (EGLSurface) surf : mSurface; > + MOZ_ALWAYS_TRUE(ForceMakeCurrent()); Defaulting to r- as there is something basic that I'm not understanding here. This patch makes us crash debug builds when we fail to ensure that a GLContext is current. Do I correctly understand that if for whatever reason eglMakeCurrent fails (say, context loss), this now makes us crash debug builds? We definitely want to (try to) gracefully handle context loss... The rest of this patch looks fine and I would r+ if this concern goes away.

Attachment #8443779 - Flags: review?(bjacob) → review-

Kelsey Gilbert [:jgilbert]

Assignee

Comment 50

•

10 years ago

(In reply to Benoit Jacob [:bjacob] from comment #48) > Comment on attachment 8443778 [details] [diff] [review] > patch 1: Cache MakeCurrent calls > > Review of attachment 8443778 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: gfx/gl/GLContext.h > @@ +2518,5 @@ > > + MOZ_ASSERT(ok); > > + } > > + > > + static inline const GLContext* GetCachedCurrentContext() { > > + return (const GLContext*)sTLS_CurrentContext.get(); > > I r-'d an earlier version of this patch because this is a C cast of pointer > types. I still think that that's a sufficient reason to r- :-) Fixed. It's a duplicate cast anyways. > > There are other things in this patch that seem to ignore my earlier review > comments, but I can't guess which ones are intentional (because my > suggestions were bad?) For instance, does BEFORE_GL_CALL really need to be a > macro still? If that's only needed to get the FUNCTION_NAME, then the > nontrivial work could all be deferred to an inline BeforeGLCall function > taking the function name literal string as parameter. Generally, I think > that there is a lot of value in minimizing the amount of interesting work > hidden behind macros. I don't actually think this makes things cleaner. It also requires a s/Foo/Bar/ patch. Also, we know 100% that macros are inlined, whereas we've definitely had trouble force-inlining things in the past. > > Finally (and most importantly) I thought we agreed that what's really needed > here, before we get to discuss reviewing details, is performance analysis on > some real-world workloads on real typical B2G hardware (which is less fast > than a Nexus 10). I don't know what numbers would be useful to have here. I would like to avoid the game of "I make up numbers and get told why they aren't useful". Do we have a list of good performance metric methods? FWIW, with these new patches, we do not eat a TLS lookup in the compositor path.

u480271

Comment 51

•

10 years ago

Comment on attachment 8443780 [details] [diff] [review] patch 2.1: Update callers in content/canvas/src Review of attachment 8443780 [details] [diff] [review]: ----------------------------------------------------------------- LGTM. ::: content/canvas/src/WebGLContext.cpp @@ +1336,5 @@ > > void > +WebGLContext::MakeContextCurrent() const > +{ > + // It's automatic now! Awesome. I assume the next patch will remove this function and all calls to it?

Attachment #8443780 - Flags: review?(dglastonbury) → review+

Benoit Jacob [:bjacob] (mostly away)

Comment 52

•

10 years ago

(In reply to Jeff Gilbert [:jgilbert] from comment #50) > (In reply to Benoit Jacob [:bjacob] from comment #48) > > Comment on attachment 8443778 [details] [diff] [review] > > patch 1: Cache MakeCurrent calls > > > > Review of attachment 8443778 [details] [diff] [review]: > > ----------------------------------------------------------------- > > > > ::: gfx/gl/GLContext.h > > @@ +2518,5 @@ > > > + MOZ_ASSERT(ok); > > > + } > > > + > > > + static inline const GLContext* GetCachedCurrentContext() { > > > + return (const GLContext*)sTLS_CurrentContext.get(); > > > > I r-'d an earlier version of this patch because this is a C cast of pointer > > types. I still think that that's a sufficient reason to r- :-) > Fixed. It's a duplicate cast anyways. > > > > > There are other things in this patch that seem to ignore my earlier review > > comments, but I can't guess which ones are intentional (because my > > suggestions were bad?) For instance, does BEFORE_GL_CALL really need to be a > > macro still? If that's only needed to get the FUNCTION_NAME, then the > > nontrivial work could all be deferred to an inline BeforeGLCall function > > taking the function name literal string as parameter. Generally, I think > > that there is a lot of value in minimizing the amount of interesting work > > hidden behind macros. > I don't actually think this makes things cleaner. It also requires a > s/Foo/Bar/ patch. > Also, we know 100% that macros are inlined, whereas we've definitely had > trouble force-inlining > things in the past. As someone who's maintained an expression-templates matrix libraries relying entirely on inlining on 20+-deep call chains, I know what problems you're talking about, but I also think that in the present case, on current compilers that we have to support (the oldest MSVC that matters is 2010) the likeliness of such problems is easily low enough that the benefits of inline functions (if only for debugging ease) are worth it. But, it doesn't matter hugely, I guess. > > > > Finally (and most importantly) I thought we agreed that what's really needed > > here, before we get to discuss reviewing details, is performance analysis on > > some real-world workloads on real typical B2G hardware (which is less fast > > than a Nexus 10). > > I don't know what numbers would be useful to have here. I would like to > avoid the game of "I make up numbers and get told why they aren't useful". > Do we have a list of good performance metric methods? Since you're the one proposing this approach and these patches, it's for you to come up with a proposal of what would be good enough performance measurements and convince the rest of us that they are sufficient :-) > > FWIW, with these new patches, we do not eat a TLS lookup in the compositor > path. (Limited colloquial english skills here) what is meant by "eat" here?

Benoit Jacob [:bjacob] (mostly away)

Comment 53

•

10 years ago

(In reply to Benoit Jacob [:bjacob] from comment #52) > (In reply to Jeff Gilbert [:jgilbert] from comment #50) > > I don't know what numbers would be useful to have here. I would like to > > avoid the game of "I make up numbers and get told why they aren't useful". > > Do we have a list of good performance metric methods? > > Since you're the one proposing this approach and these patches, it's for you > to come up with a proposal of what would be good enough performance > measurements and convince the rest of us that they are sufficient :-) But to give you a hint in a way that shouldn't bind me: AFAICS, it would be good enough to measure that there is no significant negative impact on typical games (preferably real commercial games that we've been tracking the performance of) on B2G on typical B2G hardware (hamachi etc, anything MSM7225 based). That leaves somewhat open the definition of "no significant negative impact". Ideally, you'd look at a workload where we're not quite at 60FPS, so that any significant impact would be seen in the FPS counter. IIRC, on the WMW game, we are not quite at 60FPS (hope this is still the case) so that would make it a good testcase. I think that you need 3 testcases, one in each of the following categories: - WebGL - Canvas2D game using Skia/GL (default on B2G) - no canvas, just DOM compositing.

Benoit Jacob [:bjacob] (mostly away)

Comment 54

•

10 years ago

So to wrap it up: - WebGL: try WMW - Skia/GL: try Cut The Rope - DOM: try some CSS transforms demo such as one of the bigger molecules on http://mrdoob.github.io/three.js/examples/css3d_molecules.html

Kelsey Gilbert [:jgilbert]

Assignee

Comment 55

•

10 years ago

(In reply to Benoit Jacob [:bjacob] from comment #52) > > > > > > Finally (and most importantly) I thought we agreed that what's really needed > > > here, before we get to discuss reviewing details, is performance analysis on > > > some real-world workloads on real typical B2G hardware (which is less fast > > > than a Nexus 10). > > > > I don't know what numbers would be useful to have here. I would like to > > avoid the game of "I make up numbers and get told why they aren't useful". > > Do we have a list of good performance metric methods? > > Since you're the one proposing this approach and these patches, it's for you > to come up with a proposal of what would be good enough performance > measurements and convince the rest of us that they are sufficient :-) I actually disagree with this being the general rule. I don't have good expertise in layout/layerization/compositor to propose good benchmarks. This sort of lack of expertise is why we often see people using poorly designed demos as benchmarks, or the ever-common "we spend a lot of time in glFinish/glClientWaitSync, and it seems to work fine when I comment them out". I don't want to throw together benchmarks and have them be rejected when I imagine that I could just ask 'what makes a good benchmark here' and get a useful response. (Indeed, you have kindly given me some paths forward!) We should 'pull' and not 'push' in this regard. > > > > > FWIW, with these new patches, we do not eat a TLS lookup in the compositor > > path. > > (Limited colloquial english skills here) what is meant by "eat" here? "eat" is synonymous with "incur a penalty of" here, sorry.

Kelsey Gilbert [:jgilbert]

Assignee

Comment 56

•

10 years ago

(In reply to Dan Glastonbury :djg :kamidphish from comment #51) > Comment on attachment 8443780 [details] [diff] [review] > patch 2.1: Update callers in content/canvas/src > > Review of attachment 8443780 [details] [diff] [review]: > ----------------------------------------------------------------- > > LGTM. > > ::: content/canvas/src/WebGLContext.cpp > @@ +1336,5 @@ > > > > void > > +WebGLContext::MakeContextCurrent() const > > +{ > > + // It's automatic now! > > Awesome. I assume the next patch will remove this function and all calls to > it? I want to do this in a follow-up patch, so I don't keep needing to rebase it.

Jeff Muizelaar [:jrmuizel]

Comment 57

•

10 years ago

(In reply to Jeff Gilbert [:jgilbert] from comment #55) > (In reply to Benoit Jacob [:bjacob] from comment #52) > > > > > > > > Finally (and most importantly) I thought we agreed that what's really needed > > > > here, before we get to discuss reviewing details, is performance analysis on > > > > some real-world workloads on real typical B2G hardware (which is less fast > > > > than a Nexus 10). > > > > > > I don't know what numbers would be useful to have here. I would like to > > > avoid the game of "I make up numbers and get told why they aren't useful". > > > Do we have a list of good performance metric methods? > > > > Since you're the one proposing this approach and these patches, it's for you > > to come up with a proposal of what would be good enough performance > > measurements and convince the rest of us that they are sufficient :-) > > I actually disagree with this being the general rule. > > I don't have good expertise in layout/layerization/compositor to propose > good benchmarks. > > This sort of lack of expertise is why we often see people using poorly > designed demos as benchmarks, or the ever-common "we spend a lot of time in > glFinish/glClientWaitSync, and it seems to work fine when I comment them > out". > > I don't want to throw together benchmarks and have them be rejected when I > imagine that I could just ask 'what makes a good benchmark here' and get a > useful response. (Indeed, you have kindly given me some paths forward!) We > should 'pull' and not 'push' in this regard. Why not just measure the time that the tls lookup takes and then count the number of typical gl calls in some scenarios. We can then extrapolate the performance impact from that.

Kelsey Gilbert [:jgilbert]

Assignee

Comment 58

•

10 years ago

(In reply to Jeff Muizelaar [:jrmuizel] from comment #57) > (In reply to Jeff Gilbert [:jgilbert] from comment #55) > > (In reply to Benoit Jacob [:bjacob] from comment #52) > > > > > > > > > > Finally (and most importantly) I thought we agreed that what's really needed > > > > > here, before we get to discuss reviewing details, is performance analysis on > > > > > some real-world workloads on real typical B2G hardware (which is less fast > > > > > than a Nexus 10). > > > > > > > > I don't know what numbers would be useful to have here. I would like to > > > > avoid the game of "I make up numbers and get told why they aren't useful". > > > > Do we have a list of good performance metric methods? > > > > > > Since you're the one proposing this approach and these patches, it's for you > > > to come up with a proposal of what would be good enough performance > > > measurements and convince the rest of us that they are sufficient :-) > > > > I actually disagree with this being the general rule. > > > > I don't have good expertise in layout/layerization/compositor to propose > > good benchmarks. > > > > This sort of lack of expertise is why we often see people using poorly > > designed demos as benchmarks, or the ever-common "we spend a lot of time in > > glFinish/glClientWaitSync, and it seems to work fine when I comment them > > out". > > > > I don't want to throw together benchmarks and have them be rejected when I > > imagine that I could just ask 'what makes a good benchmark here' and get a > > useful response. (Indeed, you have kindly given me some paths forward!) We > > should 'pull' and not 'push' in this regard. > > Why not just measure the time that the tls lookup takes and then count the > number of typical gl calls in some scenarios. We can then extrapolate the > performance impact from that. The compositor does not use TLS in the new patch.

Jeff Muizelaar [:jrmuizel]

Comment 59

•

10 years ago

(In reply to Jeff Gilbert [:jgilbert] from comment #58) > The compositor does not use TLS in the new patch. But a webgl application would.

u480271

Comment 60

•

10 years ago

Something curious I saw. Looking at the WGL context handling there is a comment that says that we don't have our own TLS version of the current context pointer because that's what WGL does. http://dxr.mozilla.org/mozilla-central/source/gfx/gl/GLContextProviderWGL.cpp#308 When I was using CodeXL from AMD to debug issues on Win7, MakeCurrent was reported as being a "slow function" because it could cause the same effects as glFinish. I couldn't find any more information in the help, but it is curious.

Kelsey Gilbert [:jgilbert]

Assignee

Comment 61

•

7 years ago

Let's dupe this forward to the new iteration of the same idea.

Status: ASSIGNED → RESOLVED

Closed: 11 years ago → 7 years ago

Resolution: --- → DUPLICATE

You need to log in before you can comment on or make changes to this bug.