Closed Bug 1537550 Opened 1 year ago Closed 1 year ago

1ms per frame is spent deserializing the VRDisplayInfo parameter

Categories

(Core :: WebVR, defect)

All
Android
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla68
Tracking Status
firefox66 --- wontfix
firefox67 --- wontfix
firefox68 --- fixed

People

(Reporter: mstange, Assigned: kip)

References

(Blocks 1 open bug)

Details

(Whiteboard: [fxr-ww])

Attachments

(1 file)

See these two profiles: https://perfht.ml/2YdZX7h (sampled with a 10ms interval) and https://perfht.ml/2Y9ndTA (sampled with a 2ms interval)

In these profiles it's taking a surprisingly long time to deserialize the VRDisplayInfo parameter of PVRManager::UpdateDisplayInfo.

That is indeed a lot of frame time! Thanks for identifying it.

Although the VR process uses simple SHMEM and locking to send this structure to the GPU process, we are still serializing it for IPC with the content process.

The structure contains only POD types, has no position-dependent members (eg pointers), and has well defined "read only" and "write" sections for each process/thread.

I would like to propose that we map / shmem the structure directly (with read-only access) to the content process and eliminate the serialization. This would lose some bounds checking with enums, but would not allow any pprivilege escalation as this struct is read-only by the content process.

@mstange Would this approach seem feasible? Would you have any concerns with this route?

Alternately, we could memcpy this struct as a quicker fix in VRMessageUtils.h but not actually map the address space.

Flags: needinfo?(mstange)
Assignee: nobody → kgilbert

Hi Kip,

We found this bug today in the Firefox Reality work week in Toronto, while profiling bug 1498485. We were not sure why it's necessary to send mLastSensorState and mLastFrameStart on every frame, perhaps that is one way to reduce the cost here. Also, it seems like the way that the serialization/deserialization is done here could really be improved in terms of efficiency. For example the fields in VRHMDSensorState::pose are read/written as individual floats one by one whereas memcpying the entire array of 19 floats in VRPose in one call should be much more faster since it avoids the overhead of maintaining the iterator, buffer overrun checks etc per each float.

Is this something you could work on or find an owner for? Thanks!

Hey Kip, I mid-aired with your comment, so as you can probably guess comment 2 was written before I saw comment 1. :-)

I think memcpying is probably a very simple solution that we should try first before going for something more complex. What do you think?

(But perhaps we shouldn't be IPCing some of this data in the first place?)

(In reply to :Ehsan Akhgari from comment #3)

Hey Kip, I mid-aired with your comment, so as you can probably guess comment 2 was written before I saw comment 1. :-)

I think memcpying is probably a very simple solution that we should try first before going for something more complex. What do you think?

(But perhaps we shouldn't be IPCing some of this data in the first place?)

This sounds great to me. I will take this one and can implement this as a quick fix. I suspect that switching to a memcpy should remove most of the overhead until we can explore mapping the struct itself. The intent with this struct is to eventually avoid copies all-together.

Any members that would still have value after a frame is passed would be kept in a ring buffer, while the rest are only useful within the same frame they are rendered on. Any members that should not be known by the content process will be implemented outside of this struct (eg, in structs containing this one).

Doing a memcpy sounds good to me.

Oh, and looking at the compositor thread, it looks like a very similar problem exists on the sending side: Serialization of the same struct is slow as well.

Flags: needinfo?(mstange)
Blocks: 1498485

Use PlainOldDataSerializer for POD types in VRMessageUtils
Moved non-POD member of VRDisplayInfo to VRDisplayHost
VRDisplayInfo is now also a POD type (And asserted so)

Whiteboard: [fxr-ww]

Great patch! It seems to cut down the deserialization time by a factor of 7.

I've gathered new profiles on a different benchmark.
Before: https://perfht.ml/2JyG8UU
After: https://perfht.ml/2JJ5MWK

The patch seems to reduce the per-frame time of VRDisplayInfo deserialization from 0.46ms to 0.066ms.

(I'm not entirely sure why it's taking only 0.46ms now and not the 1.1ms I measured the last time. Could be due to less load on other parts of the system.)

Pushed by kgilbert@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c6b4e466e158
Use PlainOldDataSerializer for POD types in VRMessageUtils r=daoshengmu

(In reply to :kip (Kearwood Gilbert) from comment #1)

I would like to propose that we map / shmem the structure directly (with read-only access) to the content process and eliminate the serialization. This would lose some bounds checking with enums, but would not allow any pprivilege escalation as this struct is read-only by the content process.

It's more complicated than that. Bug 1479960 has the details, but to summarize the current state of our shared memory support:

  1. If a process is already compromised when it receives the shared memory handle, it can map it read/write and there's no way to prevent that.

  2. If the process is compromised after setting up a read-only mapping, it may be able to restore write permissions, depending on OS.

Freezing shared memory so that nobody can write it again is almost finished; allowing one process to write while another only reads — which it sounds like this use case is asking for — isn't written yet and can't be implemented for pre-Oreo Android, and it's not yet clear what we want to do about that.

Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68

67=wontfix because, AFAIK, we don't need to uplift to GV 67 Beta.

https://hg.mozilla.org/projects/ash/rev/c6b4e466e15887eed24f7017f8cc25d1b887d5d0
Bug 1537550 - Use PlainOldDataSerializer for POD types in VRMessageUtils r=daoshengmu
Depends on: 1540206, 1540433
You need to log in before you can comment on or make changes to this bug.