Closed
Bug 1344216
Opened 8 years ago
Closed 8 years ago
[meta] Stop doing sync IPC for WebVR
Categories
(Core :: WebVR, enhancement)
Core
WebVR
Tracking
()
RESOLVED
FIXED
Performance Impact | ? |
People
(Reporter: ehsan.akhgari, Assigned: kip)
References
(Depends on 1 open bug, Blocks 1 open bug)
Details
(Keywords: meta, Whiteboard: [gfx-noted])
Our WebVR implementation relies on a number of sync IPC messages on PVRManager. At least three of these (GetSensorState, SetHaveEventListener and GetDisplays) can be pretty slow, and as far as I can tell, these sync IPC messages are directly called from APIs that we expose to the web page, so the resulting experience is that the page calls the APIs, and we block the entire content process before the parent process gets back to us. If the page ends up calling one of these APIs several times per frame, the result can be disastrous.
Here is the telemetry data we have so far for these three messages that are showing up as being particularly slow:
<https://telemetry.mozilla.org/new-pipeline/dist.html#!cumulative=0&end_date=2017-02-28&keys=PVRManager%253A%253AMsg_GetSensorState!PVRManager%253A%253AMsg_SetHaveEventListener!PVRManager%253A%253AMsg_GetDisplays!PAPZCTreeManager%253A%253AMsg_ReceiveMouseInputEvent&max_channel_version=nightly%252F54&measure=IPC_SYNC_LATENCY_MS&min_channel_version=null&processType=*&product=Firefox&sanitize=1&sort_keys=submissions&start_date=2017-02-02&table=0&trim=1&use_submission_date=0>
GetSensorState:
Start End IPC_SYNC_LATENCY_MS Count
0 1 0 (0%)
1 3 69.82M (20.41%)
3 8 76.63M (22.4%)
8 20 180.95M (52.89%)
20 50 13.93M (4.07%)
50 126 778.13k (0.23%)
126 317 31.32k (0.01%)
317 796 1.41k (0%)
796 2k 250 (0%)
2k Infinity 48 (0%)
SetHaveEventListener:
Start End IPC_SYNC_LATENCY_MS Count
0 1 0 (0%)
1 3 118.09k (55.09%)
3 8 43.04k (20.08%)
8 20 29.82k (13.91%)
20 50 15.93k (7.43%)
50 126 5.62k (2.62%)
126 317 1.36k (0.63%)
317 796 325 (0.15%)
796 2k 101 (0.05%)
2k Infinity 68 (0.03%)
GetDisplays:
Start End IPC_SYNC_LATENCY_MS Count
0 1 0 (0%)
1 3 7.68k (53.52%)
3 8 2.64k (18.41%)
8 20 2.57k (17.95%)
20 50 1.01k (7.03%)
50 126 330 (2.3%)
126 317 78 (0.54%)
317 796 23 (0.16%)
796 2k 4 (0.03%)
2k Infinity 8 (0.06%)
Note that due to the fact that this API isn't very popular yet, I'm extremely worried that we may not have a good amount of data here. But it seems like we should probably study our performance story very carefully before considering shipping.
Assignee | ||
Comment 1•8 years ago
|
||
I am working on a patch now that will ensure that all WebVR IPC that happens outside of WebVR presentation will be async..
I'd like to explore alternative non-blocking methods of sharing sensor state from the GPU process to all of the content processes to replace the current broadcast of IPC messages. One technique could be inspired by the non-locking memory mapped data structures that are used by Chromium's gamepad api.
Updated•8 years ago
|
Whiteboard: [gfx-noted]
Assignee | ||
Comment 2•8 years ago
|
||
(In reply to :Ehsan Akhgari from comment #0)
>
> GetSensorState:
> Start End IPC_SYNC_LATENCY_MS Count
> 0 1 0 (0%)
> 1 3 69.82M (20.41%)
> 3 8 76.63M (22.4%)
> 8 20 180.95M (52.89%)
> 20 50 13.93M (4.07%)
> 50 126 778.13k (0.23%)
> 126 317 31.32k (0.01%)
> 317 796 1.41k (0%)
> 796 2k 250 (0%)
> 2k Infinity 48 (0%)
>
These metrics do not match up with the hardware supported latency measurements that I am seeing in tests here. The vast majority of the calls to GetSensorState should be returning in a fraction of a millisecond. I noticed that our telemetry code is not reporting anything under 1ms due to kMinTelemetrySyncIPCLatencyMs:
https://dxr.mozilla.org/mozilla-central/source/ipc/glue/MessageChannel.cpp#1282
I suspect that these figures are representative of the IPC queue length for frames that endured latency due to other non-WebVR IPC messages, rather than following the impact of the GetSensorState call.
Comment 3•8 years ago
|
||
Please feel free to remove "sync GetControllers()" at PVRManager.ipdl. I have checked there is no one using it.
Reporter | ||
Comment 4•8 years ago
|
||
(In reply to :kip (Kearwood Gilbert) from comment #2)
> (In reply to :Ehsan Akhgari from comment #0)
> >
> > GetSensorState:
> > Start End IPC_SYNC_LATENCY_MS Count
> > 0 1 0 (0%)
> > 1 3 69.82M (20.41%)
> > 3 8 76.63M (22.4%)
> > 8 20 180.95M (52.89%)
> > 20 50 13.93M (4.07%)
> > 50 126 778.13k (0.23%)
> > 126 317 31.32k (0.01%)
> > 317 796 1.41k (0%)
> > 796 2k 250 (0%)
> > 2k Infinity 48 (0%)
> >
>
> These metrics do not match up with the hardware supported latency
> measurements that I am seeing in tests here. The vast majority of the calls
> to GetSensorState should be returning in a fraction of a millisecond. I
> noticed that our telemetry code is not reporting anything under 1ms due to
> kMinTelemetrySyncIPCLatencyMs:
>
> https://dxr.mozilla.org/mozilla-central/source/ipc/glue/MessageChannel.
> cpp#1282
>
> I suspect that these figures are representative of the IPC queue length for
> frames that endured latency due to other non-WebVR IPC messages, rather than
> following the impact of the GetSensorState call.
Yes, that is what you sign up for to deal with when you use sync IPC, and precisely why you shouldn't be doing that. ;-)
Reporter | ||
Comment 5•8 years ago
|
||
(To be more clear, the reason we are ignoring sync IPCs less than 1ms is that like all performance issues, when debugging perf problems we don't measure the good performance scenarios, but the bad ones. Very similarly to sync IO, for example, even though under tight testing conditions sync IPC may finish very fast, on users machines in the wild that may not end up being the case, which is why local measurements and best case performance numbers for things like this shouldn't be trusted.)
Assignee | ||
Comment 6•8 years ago
|
||
Once the dependencies tracked in this bug have landed, WebVR will no longer be doing any Sync IPC.
Summary: Stop doing sync IPC for WebVR → [meta] Stop doing sync IPC for WebVR
Updated•8 years ago
|
Whiteboard: [gfx-noted] → [gfx-noted][qf:meta]
Assignee | ||
Comment 7•8 years ago
|
||
The final sync IPC message, which only occurs during VR presentation, will be replaced with the lock free shared memory implementation in Bug 1346927.
I'll be doing this work, so have taken this bug.
Assignee | ||
Comment 8•8 years ago
|
||
This has been fixed with the refactoring in Bug 1362213
Assignee | ||
Updated•8 years ago
|
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Updated•3 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•