Open Bug 1498485 Opened 3 years ago Updated 1 year ago

hubs.mozilla.com frame rate with an empty scene is too slow in FxR on Oculus Go

Categories

(Core :: JavaScript Engine, defect, P3)

Unspecified
Android
defect

Tracking

()

Tracking Status
geckoview62 --- wontfix
geckoview63 --- wontfix
firefox-esr60 --- wontfix
firefox63 --- wontfix
firefox64 --- wontfix
firefox65 --- affected

People

(Reporter: cpeterson, Unassigned)

References

(Depends on 3 open bugs, )

Details

(Whiteboard: [geckoview:fxr:p1][webvr][qf:p2:responsiveness])

Mozilla Hubs is a key scenario for FxR, but today we can't hit frame rate even when rendering an empty scene.

https://hubs.mozilla.com/
Lars says: "The Google Pixel 1 is an equivalent device to the Oculus Go and would make a good baseline. There are some platform differences around CPU throttling (e.g., affecting the Gecko media stack), but from what the team has been telling me, the perf is roughly the same."
63=wontfix because FxR 1.1 will ship GV 64.
Jeff and Bas, here is the FxR bug about poor Hubs performance on the Oculus Go.
Flags: needinfo?(jgilbert)
Flags: needinfo?(bas)
Here's the empty plane:
http://bit.ly/2zljQxi
Flags: needinfo?(jgilbert)
Priority: -- → P3
Duplicate of this bug: 1494710
(As in bug 1498484, it looks like the Gecko profile is missing symbols here, too.  Might be handy to recapture & be sure they get resolved/uploaded/whatever.)
Flags: needinfo?(jgilbert)
(In reply to Jeff Gilbert [:jgilbert] from comment #5)
> Here's the empty plane:
> http://bit.ly/2zljQxi

It does look like a fair amount of work here is coming from WebGL, in particular some texturing business shows up quite clearly. I don't see any reason why we'd be a lot slower than chromium here though :s.
Flags: needinfo?(bas)
(In reply to Bas Schouten (:bas.schouten) from comment #8)
> It does look like a fair amount of work here is coming from WebGL, in
> particular some texturing business shows up quite clearly. I don't see any
> reason why we'd be a lot slower than chromium here though :s.

Does Gecko on a phone (like the Google Pixel 1, whose hardware specs are comparable to the Oculus Go's) have the same WebGL hot spots? Or does this problem appear to be unique to the Oculus Go?
(In reply to Bas Schouten (:bas.schouten) from comment #8)
> (In reply to Jeff Gilbert [:jgilbert] from comment #5)
> > Here's the empty plane:
> > http://bit.ly/2zljQxi
> 
> It does look like a fair amount of work here is coming from WebGL, in
> particular some texturing business shows up quite clearly. I don't see any
> reason why we'd be a lot slower than chromium here though :s.

You're right, thanks for checking me. I was reading the profile wrong.
Assignee: nobody → jgilbert
Flags: needinfo?(jgilbert)
Is the performance difference reproducible in a regular Fennec or GeckoView-example build? From those, we can get profiles with symbols, which should make the analysis here easier. Or is the difference only visible in VR mode?
What's the performance gap between Chrome and Firefox here?
Flags: needinfo?(jgilbert)
Sounds very much like bug 1463904; there's not much we could do there, as it's related to task priorities.
See Also: → 1463904
(In reply to Chris Peterson [:cpeterson] from comment #1)
> Lars says: "The Google Pixel 1 is an equivalent device to the Oculus Go and
> would make a good baseline. There are some platform differences around CPU
> throttling (e.g., affecting the Gecko media stack), but from what the team
> has been telling me, the perf is roughly the same."

hubs.mozilla.com is pretty smooth on my Pixel 1, pretty much consistently showing 60fps. When I move a lot, it occasionally drops to 50fps, but it's still very usable.
Ok, so I hadn't noticed that if you narrow it to "WebGL", it also renormalizes the percentages. As you can see from the profile color stack, it's mostly yellow (JS), not green (graphics) or blue (DOM). Of 30,908ms total, filtering by "WebGL" yields 3,849ms (12.4%).

Of that 3,849ms, drawElements() and clear() are 1,441ms (37%) and 496ms (12%) respectively, but only 340ms and 57ms of that is outside the driver.

There just doesn't seem like a lot of optimization opportunity here.

I took a profile of Nightly Fennec on the spec-equivalent Pixel 1 XL running in mono fullscreen, and there did seem to be more Graphics load by proportion:
https://perfht.ml/2zRFhX5

Also interesting: eglCreateImage and eglCreateSync are taking 400ms out of 9,400ms total, or about 4%, on Fennec, but that's not relevant to this bug or FxR, since VR uses SwapBuffers.
Flags: needinfo?(jgilbert)
64=wontfix because FxR 1.1 is using GV 65 and this issue doesn't block Focus 8.0 from using GV 64.
Should we move this to js?
Flags: needinfo?(jgilbert)
Component: Graphics → JavaScript Engine
Whiteboard: [geckoview:fxr:p1][webvr][qf] → [geckoview:fxr:p1][webvr][qf:p2:responsiveness]
I was going to comment about this being an ARM64-not-having-ion issue.  But the arch in question is ARM32 and Ion definitely shows up in the profile.

Looking at the stack map in inverted mode, I am struck that most of the leaf nodes in the profile stacks, as I scan through, look like they lead into libxul.so and platform code. I really need to see what's going on in the leaves of these profiles. Is it calling into JS impl code, or WebGL code, or other stuff?

Can we get a profile with symbols enabled?
Flags: needinfo?(jgilbert) → needinfo?(cpeterson)
See Also: → arm64-ion
See Also: arm64-ion
(In reply to Kannan Vijayan [:djvj] from comment #17)
> Looking at the stack map in inverted mode, I am struck that most of the leaf
> nodes in profile stacks, as I scan through.. look like they lead into
> libxul.so and platform.  I really need to see what's going on in the leaves
> of these profiles.  Is it calling into JS impl code, or webgl code, or other
> stuff?
> 
> Can we get a profile with symbols enabled?

Lars, is there an FxR engineer that would know about profiling FxR on the Oculus Go headset? Can we get FxR's debug symbols so the Gecko Profiler can symbolicate libxul.so code?
Flags: needinfo?(cpeterson) → needinfo?(larsberg)
All that's needed is to run public FxR and pull the profile. I can do this real quick.
Flags: needinfo?(larsberg)
Empty plane:
http://bit.ly/2AyqFfw

Atrium:
http://bit.ly/2Ax92gq

Atrium looks overwhelmingly JS to me, with just a sliver of green GFX.
Kannan, please see new profiles above.
Flags: needinfo?(kvijayan)
Ok I spent a couple of hours looking at the latest two profiles.

The first one is interesting.  There's no clear _single_ story here, but a few things stand out.  The first is that Ion and Baseline compilation account for 5.6% of the TOTAL execution time across the entire profile.  There's another 1% spent in IonCacheIR compilation, bringing it up to 6.6% of total time.  Another 1.6% in ReprotectRegion (bad stacks leave it unmoored, but it almost certainly comes from compiler code) brings us to 8.2% of compile-related time.

The rest of the profile is a grab bag of stuff, mostly related to interpreter + slowpath execution (property lookups, calls, etc.), gc, and painting / layout/etc.


The second one is more interesting.  A full 9% of the time across the entire profile is in ReprotectRegion.  This is remarkable, and can be attributed mostly to compiles and partly to GCs.  GC doesn't account for the bulk of the calls, however, and is dwarfed by the respective compilers' use of ReprotectRegion.

There are a couple of very clear signals coming out of this:

1. ReprotectRegion, specifically W^X memory protection, needs to be dealt with.  I don't see any worker threads in the profile, so I'm not sure if we didn't capture them or GeckoView simply is not using them in this case.

2. BaselineCompiles need to be dealt with.  Baseline compiles are showing up heavily in profiles and we can improve greatly if we are more fine-grained about the hot code we compile (i.e. cold blocks in hot functions should not be compiled).

3. Once again, there should be general improvements from a faster interpreter - lower overhead of Interp <=> JIT transitions, and simply faster overall performance.
Flags: needinfo?(kvijayan)
Sharing this profile as well, as asked in https://github.com/MozillaReality/FirefoxReality/issues/878

This one is specifically about hitching while spawning media (ducks) in Hubs. See the above github issue for repro steps.

https://perfht.ml/2BbdqSx
(In reply to Kannan Vijayan [:djvj] from comment #22)
> The rest of the profile is a grab bag of stuff, mostly related to
> interpreter + slowpath execution (property lookups, calls, etc.), gc, and
> painting / layout/etc.

Could you clarify this point a bit more? Are there any resources / best practices to avoid slowpath execution?
> Could you clarify this point a bit more? Are there any resources / best practices to avoid slowpath execution?

Nothing obvious I could suggest.  The early-phase performance issue is something that simply requires faster execution of cold JS before it warms up.  Jan's compiled interpreter (https://bugzilla.mozilla.org/show_bug.cgi?id=1499324) should help here quite a bit.


I took a look at the latest profile just now.  The key thing that stands out to me is in the Markers tab.  At ~6s, ~12s, ~18s - roughly six-second increments, there seems to be a GC that wipes out all of our compiled scripts, which we subsequently recompile.

It strongly appears that around every 6 seconds we are throwing away all of our compiled code, then hitting the interpreter, then recompiling with Baseline, then recompiling with Ion, and so on.

This is apparent when we do an inverse view of the samples and notice that ReprotectRegion, associated with recompilation of Baseline scripts, Ion scripts, and Ion ICs, is the single most prominent item.

Here is a general list of things that probably relate to improving this:

1. Stop throwing away code on GC, or at least be smarter about it and keep recently executed code.  I'm needinfoing jonco about this.

2. When we do throw code away, we can recover faster if our interpreter is faster.  That's primarily going to be Jan's interpreter, which is likely to take a while, probably landing sometime in 2019.

3. A longstanding issue is that our codegen on ARM is very poor.  CraneLift is a long-term thing that should improve things here, but I don't know what the timeframe for it is.

4. We _need_ to get ReprotectRegion and mprotect off the main thread.  We are spending 4% of our total time in this call alone.  

5. I suspect that our blind scheduling of heavyweight compiles on background threads is hurting us here - as the Oculus Go would have limited resources, and that's precisely where background compiles end up bottlenecking and taking a long time.  The scheduling issue needs to be investigated - I suspect but cannot confirm that we are losing a lot of potential performance because of this.

On the plus side, it turns out that event queue scheduling is likely to become a priority for 2019, as it's responsible for a significant chunk of our page-load performance issues.

I will remember to raise this general issue with scheduling of JS background tasks when we discuss how to staff and implement the scheduler work we will need to do in 2019.
Depends on: 1514113
(In reply to Kannan Vijayan [:djvj] from comment #25)
> 4. We _need_ to get ReprotectRegion and mprotect off the main thread.  We
> are spending 4% of our total time in this call alone.  

Since that sounds like an important and discrete task, I filed new bug 1514113.
(In reply to Kannan Vijayan [:djvj] from comment #25)
> Here are the general list of things that probably relate to improving this:
> 
> 1. Stop throwing away code on GC, or at least be smarter about it and keep
> recently executed code.  I'm needinfoing jonco about this.

                               ^^^^^^^^^^^^^^^^^
Flags: needinfo?(jcoppeard)
Depends on: 1514281
I filed bug 1514281 for this.
Flags: needinfo?(jcoppeard)

Jeff took this bug for investigation when we thought the problem was in the gfx area. It appears to be a JS issue.

Assignee: jgilbert → nobody
Depends on: 1537879
Depends on: 1537951
Depends on: 1537957
Depends on: 1537961
Depends on: 1537967
Depends on: 1538260
Depends on: 1537550
Depends on: 1536672
Depends on: 1412202