Open Bug 1242969 Opened 4 years ago Updated 1 year ago

https://greensock.com/js/speed.html is a lot slower with e10s enabled

Categories

(Core :: DOM: Content Processes, defect, P2)

36 Branch
defect

Tracking

()

Tracking Status
e10s + ---

People

(Reporter: smaug, Assigned: gw280)

References

(Blocks 2 open bugs)

Details

(Keywords: regression)

See bug 1241165
Component: IPC → DOM: Content Processes
This happens at least on Linux.

Two guesses, either something wrong with vsync handling, or child is blocked on parent (trying to send some sync message). But just guesses, haven't quite interpreted the performance profile yet.
Flags: needinfo?(twalker)
Here is what I observed. Keep in mind that, without logs and tools to munge the data, these are all visual observations of the fps meter and guesstimates of the values. Average is the value I estimate the test run was hovering around. Peak is the highest value I saw while observing the test run. In each case I started a run, let it settle for 10-15 seconds, then began paying attention to the fps meter. Then I watched the test run for up to a minute.

As such the results aren't very precise. But I believe they show a notable slowdown in e10s vs non-e10s on Windows and Linux. However, the results show notable improvements in e10s vs non-e10s on Mac.
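
For anyone who wants something less subjective than eyeballing the meter, here is a minimal sketch of how the same procedure (skip a warm-up period, then report an average and a peak) could be computed from logged frame timestamps. The function name `fps_summary`, the millisecond input format, and the 10 second warm-up default are assumptions for illustration; nothing in this bug provides such a log.

```python
from collections import Counter


def fps_summary(frame_times_ms, warmup_ms=10_000):
    """Return (average_fps, peak_fps) for frames painted after the warm-up.

    frame_times_ms: increasing integer timestamps in milliseconds, one per
    painted frame (hypothetical input; no such log exists in this bug).
    """
    if not frame_times_ms:
        raise ValueError("no frame timestamps")
    start = frame_times_ms[0] + warmup_ms
    # Count frames in each whole second after the warm-up period.
    per_second = Counter(
        (t - start) // 1000 for t in frame_times_ms if t >= start
    )
    if not per_second:
        raise ValueError("run is shorter than the warm-up period")
    # Seconds with no frames count as 0 fps. A trailing partial second
    # drags the average down slightly.
    seconds = max(per_second) + 1
    average = sum(per_second.values()) / seconds
    peak = max(per_second.values())
    return average, peak
```

The per-second bucketing mirrors what an on-screen fps meter shows, so the average and peak are comparable to the hand-recorded values.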

e10s enabled, Dots: 300, Engine: JQuery
Win 7 (VM) - Nightly 47.0a1, Build ID 20160127030236 - average: ~48fps, peak: 60fps
Ubuntu (VM) - Nightly 47.0a1, Build ID 20160128030208 - average: ~14fps, peak: 17fps
Mac 10.9.5 - Nightly 47.0a1, Build ID 20160128030208 - average: ~40fps, peak: 47fps

new non-e10s window, Dots: 300, Engine: JQuery
Win 7 (VM) - Nightly 47.0a1, Build ID 20160127030236 - average: ~60fps, peak: 67fps
Ubuntu (VM) - Nightly 47.0a1, Build ID 20160128030208 - average: ~25fps, peak: 31fps
Mac 10.9.5 - Nightly 47.0a1, Build ID 20160128030208 - average: ~34fps, peak: 39fps

===================================================================

e10s enabled, Dots: 300, Engine: TweenJS
Win 7 (VM) - Nightly 47.0a1, Build ID 20160127030236 - average: ~68fps, peak: 72fps
Ubuntu (VM) - Nightly 47.0a1, Build ID 20160128030208 - average: ~29fps, peak: 32fps
Mac 10.9.5 - Nightly 47.0a1, Build ID 20160128030208 - average: ~74fps, peak: 83fps

new non-e10s window, Dots: 300, Engine: TweenJS
Win 7 (VM) - Nightly 47.0a1, Build ID 20160127030236 - average: ~100fps, peak: 108fps
Ubuntu (VM) - Nightly 47.0a1, Build ID 20160128030208 - average: ~40fps, peak: 46fps
Mac 10.9.5 - Nightly 47.0a1, Build ID 20160128030208 - average: ~45fps, peak: 49fps
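
To make the comparison easier to read, the following sketch computes the e10s to non-e10s ratio for each platform from the average values listed above. The averages are copied from this comment; the names `REPORTED_AVERAGES`, `e10s_ratio` and `summarize` are only illustrative, and the inputs are visual estimates, so the ratios are rough at best.

```python
# Average fps values copied from the listings above (Dots: 300).
# Each entry maps platform -> (e10s average, non-e10s average).
REPORTED_AVERAGES = {
    "JQuery": {
        "Win 7 (VM)": (48, 60),
        "Ubuntu (VM)": (14, 25),
        "Mac 10.9.5": (40, 34),
    },
    "TweenJS": {
        "Win 7 (VM)": (68, 100),
        "Ubuntu (VM)": (29, 40),
        "Mac 10.9.5": (74, 45),
    },
}


def e10s_ratio(e10s_fps, non_e10s_fps):
    """Ratio of e10s fps to non-e10s fps."""
    return e10s_fps / non_e10s_fps


def summarize(averages):
    """Return a list of (engine, platform, ratio) rows."""
    rows = []
    for engine, platforms in averages.items():
        for platform, (e10s, non_e10s) in platforms.items():
            rows.append((engine, platform, e10s_ratio(e10s, non_e10s)))
    return rows


if __name__ == "__main__":
    for engine, platform, ratio in summarize(REPORTED_AVERAGES):
        print(f"{engine:8} {platform:12} e10s/non-e10s = {ratio:.2f}")
```

A ratio below 1.0 means the e10s run was slower than the non-e10s run on that platform.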
Flags: needinfo?(twalker)
Milan, can you assign someone to dig into this?
Blocks: e10s-perf
tracking-e10s: --- → +
Flags: needinfo?(milan)
Priority: -- → P2
(In reply to Olli Pettay [:smaug] from comment #1)
> This happens at least on Linux.
> 
> Two guesses, either something wrong with vsync handling, or child is blocked
> on parent (trying to send some sync message). But just guesses, haven't
> quite interpreted the performance profile yet.

Do you have the performance profile captured?  I see you commented on the non-E10S case, didn't know if you had also already recorded the E10S version.
Flags: needinfo?(milan) → needinfo?(bugs)
Hmm, I'm getting very different behavior here now. e10s is like before (30fps), but non-e10s gets only 1/4 of the fps I used to get (around 15fps now, not 60 like before). Has something regressed on this machine?
(Chrome is also ~30fps)

When profiling e10s using Zoom, on the parent side I see 38% in apic_timer_interrupt, and on the child side 15%.

Gecko profiler:
e10s: https://cleopatra.io/#report=f9a0ee82d4affc9a1e71255fd19ca7dd8ad2b8fd
non-e10s: https://cleopatra.io/#report=472866d01d3a6650c750f41dc8c2f36de31ec404



Btw, e10s vs. non-e10s must be tested using different profiles, or at least by changing the pref.
Opening a non-e10s window in an e10s session doesn't give quite the right behavior.
Flags: needinfo?(bugs)
Basically I'm getting near hangs and a lot of time spent in Shmem dealloc. We might be doing something unreasonable like having too many layers here.
I'm seeing around 6 fps in both e10s and non-e10s on win7. IE gets about 45 fps. :(

Dots: 300, Engine: JQuery
Yea, we're getting destroyed because we're layerizing everything and trying to ship ~300 layers across processes. A lot of them are 1x1 pixel sized layers.
I get good performance after disabling tiled layers in about:config. Part of the problem is we're not good at dealing with several tiled layers. This needs more investigation before I can suggest a fix.
We don't allow image layers when the dot is first created because the scale factor is too low (32px -> 1px). We do that because of the low quality downscale.

If I allow it the demo runs smoothly (60 FPS).

Demo seems to run fine as long as we don't hit the tiling code.
Looks like bug 1195400 made it worse:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0a34ebc90b12

If I break out of this loop early (i.e. don't check the ancestor) then the 'star' image layers are correctly marked as 'NONE' and get efficiently sized tiles instead of getting scrollable tile layers and 1024x1024 tiles for 1x1 surfaces.
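
As a rough back-of-the-envelope illustration of why this hurts, the sketch below compares the buffer size of 1024x1024 tiles against 1x1 surfaces for the ~300 layers mentioned earlier in this bug. It assumes 4 bytes per pixel, one tile per layer, and single buffering, none of which is confirmed here (and only "a lot of" the layers are 1x1), so treat the result as an upper-bound estimate rather than a measurement.

```python
BYTES_PER_PIXEL = 4  # assumption: 32-bit pixels; not stated in this bug


def buffer_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size in bytes of one uncompressed width x height buffer."""
    return width * height * bytes_per_pixel


def tiled_overhead(layer_count, tile_size=1024, surface_size=1):
    """Return (bytes with one full tile per layer, bytes with exact-size buffers).

    Assumes one tile per layer and no double buffering (hypothetical
    simplification for illustration).
    """
    tiled = layer_count * buffer_bytes(tile_size, tile_size)
    exact = layer_count * buffer_bytes(surface_size, surface_size)
    return tiled, exact


if __name__ == "__main__":
    tiled, exact = tiled_overhead(300)
    print(f"tiled: {tiled / 2**20:.0f} MiB, exact-size: {exact} bytes")
```

Under these assumptions, 300 layers would need about 1200 MiB of tile buffers versus 1200 bytes of actual pixel data.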

:tn says that it's relevant here that the parent is overflow:hidden and that we shouldn't trigger this patch.

:mstange can you advise here since you know AGR and this code better than I do?
Flags: needinfo?(mstange)
I'm not sure why the WantAsyncScroll part of ContainerState::GetLayerCreationHint is ifdefd for b2g, seems like it would make sense everywhere?

The other thing is, wouldn't we get this same problem on a scrollable page? Seems like creating a 1024x1024 tile for every tiny thebes layer on a page is something we want to avoid. As it stands right now, only thebes layers in fixed pos content, or in pages that are overflow: hidden, are not going to get the scrollable hint and hence a 1024x1024 tile.
There is a patch on bug 1243589 that might help.
See Also: → 1243589
It would fix this problem, but not correctly. IMO this content should never be flagged as scrollable in the first place.
(In reply to Timothy Nikkel (:tnikkel) (mostly away until 2/20) from comment #15)
> I'm not sure why the WantAsyncScroll part of
> ContainerState::GetLayerCreationHint is ifdefd for b2g, seems like it would
> make sense everywhere?

Absolutely! The ifdef was added by part 6 in bug 1180326. I don't know why I didn't notice it during review. Matt, do you remember why you added it?

> The other thing is, wouldn't we get this same problem on a scrollable page?
> Seems like creating a 1024x1024 tile for every tiny thebes layer on a page
> is something we want to avoid.

True, and bug 1243589 avoids it now. For bigger layers (especially those whose visible region can change as the page is scrolled), we do want to use tiles.
Flags: needinfo?(mstange) → needinfo?(matt.woodrow)
(In reply to Markus Stange [:mstange] from comment #18)
> (In reply to Timothy Nikkel (:tnikkel) (mostly away until 2/20) from comment
> #15)
> > I'm not sure why the WantAsyncScroll part of
> > ContainerState::GetLayerCreationHint is ifdefd for b2g, seems like it would
> > make sense everywhere?
> 
> Absolutely! The ifdef was added by part 6 in bug 1180326. I don't know why I
> didn't notice it during review. Matt, do you remember why you added it?

It was just to preserve existing behaviour.

Before my patches, only b2g had the ability to switch between tiled and non-tiled, all other platforms used one or the other exclusively.

My patches added the switching for OSX, but since the comment below the WantAsyncScroll call suggested that it was only added for b2g, I left it as only taking effect on b2g.

It seems fine to drop this ifdef and check WantAsyncScroll everywhere.
Flags: needinfo?(matt.woodrow)
Renominating for blocking, this is unfortunate.
Assignee: nobody → gwright
Markus, (or perhaps Matt, Timothy, Kats?) any ideas how we should proceed here? This page still janks like crazy on 58 whereas Chrome is very smooth. In fact, it just completely hung up one content process for me forcing me to restart Firefox.

This is for the "CSS transitions" case. (There is another bug I need to work on to fix the "Web Animations" cases.)
Flags: needinfo?(mstange)
FWIW, when the CSS transitions test runs horribly slowly, I see a lot of the messages below in the console.

Crash Annotation GraphicsCriticalError: |[0][GFX1-]: Failed to lock new back buffer. (t=32.8787) |[631][GFX1-]: Failed to lock new back buffer. (t=71.397) |[632][GFX1-]: Failed to lock new back buffer. (t=71.3971) |[633][GFX1-]: Failed to lock new back buffer. (t=71.3973) |[634][GFX1-]: Failed to lock new back buffer. (t=71.3975) |[635][GFX1-]: Failed to lock new back buffer. (t=71.3975) |[636][GFX1-]: Failed to lock new back buffer. (t=71.3976) |[637][GFX1-]: Failed to lock new back buffer. (t=71.3977) |[638][GFX1-]: Failed to lock new back buffer. (t=71.3978) |[639][GFX1-]: Failed to lock new back buffer. (t=71.3981) |[640][GFX1-]: Failed to lock new back buffer. (t=71.899) |[641][GFX1-]: Failed to lock new back buffer. (t=71.8992) |[627][GFX1-]: Failed to lock new back buffer. (t=71.3956) |[628][GFX1-]: Failed to lock new back buffer. (t=71.396) |[629][GFX1-]: Failed to lock new back buffer. (t=71.3962) |[630][GFX1-]: Failed to lock new back buffer. (t=71.3967) [GFX1-]: Failed to lock new back buffer.
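
In case it helps with triage, here is a small sketch for splitting an annotation like the one above into individual entries and counting them. The entry format (`|[index][GFX1-]: message (t=seconds)`) is inferred only from the pasted text, not from any documentation, and the names `ENTRY_RE` and `parse_critical_errors` are made up for this example.

```python
import re
from collections import Counter

# Pattern inferred from the pasted annotation: |[index][GFX1-]: message (t=seconds)
ENTRY_RE = re.compile(r"\|\[(\d+)\]\[GFX1-\]: (.+?) \(t=([0-9.]+)\)")


def parse_critical_errors(annotation):
    """Return a list of (index, message, timestamp_seconds) tuples."""
    return [
        (int(index), message, float(timestamp))
        for index, message, timestamp in ENTRY_RE.findall(annotation)
    ]


# Short excerpt copied from the annotation above.
SAMPLE = (
    "|[0][GFX1-]: Failed to lock new back buffer. (t=32.8787) "
    "|[631][GFX1-]: Failed to lock new back buffer. (t=71.397) "
    "|[632][GFX1-]: Failed to lock new back buffer. (t=71.3971) "
)

if __name__ == "__main__":
    entries = parse_critical_errors(SAMPLE)
    counts = Counter(message for _, message, _ in entries)
    for message, count in counts.items():
        print(f"{count} x {message}")
```

Sorting the parsed entries by timestamp would also show how tightly the failures cluster.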
(In reply to Brian Birtles (:birtles) from comment #21)
> This is for the "CSS transitions" case. (There is another bug I need to work
> on to fix the "Web Animations" cases.)

With the latest Nightly on Windows I captured some of the janks with the profiler: https://perfht.ml/2zlBLqn

What looks suspicious to me is the large amount of NtGdiDdDDIWaitForSynchronizationObjectFromCpu in the compositor, called from mozilla::layers::DataTextureSourceD3D11::Update, which is called from mozilla::layers::BufferTextureHost::Upload. But maybe someone who is more familiar with graphics can understand the profile better.
(In reply to Brian Birtles (:birtles) from comment #21)
> Markus, (or perhaps Matt, Timothy, Kats?) any ideas how we should proceed
> here?

Thanks for putting this bug back on my radar. I think this is still the main reason we do badly on this test:

(In reply to Benoit Girard (:BenWa) from comment #12)
> We don't allow image layers when the dot is first created because the scale
> factor is too low (32px -> 1px). We do that because of the low quality
> downscale.

We should wait for bug 1368776 to land and then see if that improves things.

The conversations about tiling in this bug are no longer relevant; bug 1243589 pretty much eliminated that problem.
Depends on: 1368776
Flags: needinfo?(mstange)
See Also: → 1524155