Closed Bug 1394561 Opened 2 years ago Closed 2 years ago

WebVR latency is high, even when Framerate is sustained

Categories

(Core :: WebVR, defect)

x86_64
Windows
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox56 --- fixed
firefox57 --- fixed

People

(Reporter: kip, Assigned: kip)

Attachments

(2 files)

In recent testing, it appears that intermittently the latency of WebVR is very high (feels like ~30-100 frames of latency).

This is affecting Nightly (ff57), Beta (ff56), and Release (ff55).

The effect is visible even when frames are not being dropped, possibly indicating an IPC latency issue or other similar logic problem related to frame timing.

The issue intermittently disappears, resulting in expected low-latency operation.

It appears to be occurring only for our OpenVR backend, with HTC Vive.  On the same machine, Oculus does not appear to be impacted.  (Perhaps there is a change of behavior in OpenVR updates, causing the bug to appear?)

When the latency is high, the room scale boundaries drawn by the SteamVR compositor still appear to have low-latency, with a noticeable drift in the WebVR rendering when looking at the ground.

SteamVR and SteamVR beta both appear to be affected at this time.

SteamVR home continues to run smoothly when the issue occurs.
I am continuing to investigate and will take this bug.  I have been able to reproduce this consistently on my development environment.

I'll be instrumenting Firefox with some performance markers that will allow me to visualize the frame timing while reproducing the failure.
Although there appears to be something happening at the platform level in this case, it is also possible to cause these effects by making errors in the WebVR render loop in JavaScript content.

If you see this happening on Oculus or after this bug is closed please check your content for common problems that are also known to cause it:

- Make sure you are calling getFrameData only within the VRDisplay.requestAnimationFrame callback
- Make sure you are disabling the normal 2d display's requestAnimationFrame callback when the VR presentation has started
- Make sure that VRDisplay.submitFrame is only called once per VRDisplay.requestAnimationFrame callback
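For content authors, the three checks above could be arranged roughly as follows. This is only a sketch against the WebVR 1.1 API: `createRenderLoop`, `drawScene`, and the injected `win` object are hypothetical names chosen for illustration, and real content would pass `window` and a `VRFrameData` instance.

```javascript
// Sketch of a render loop that follows the three checks above.
// `createRenderLoop`, `win` and `drawScene` are hypothetical names;
// `vrDisplay` is a WebVR 1.1 VRDisplay and `frameData` a VRFrameData.
function createRenderLoop(vrDisplay, win, frameData, drawScene) {
  let windowRafHandle = null;

  function onWindowFrame() {
    // Do not re-arm the 2D callback once VR presentation has started.
    if (vrDisplay.isPresenting) {
      windowRafHandle = null;
      return;
    }
    drawScene(null);
    windowRafHandle = win.requestAnimationFrame(onWindowFrame);
  }

  function onVRFrame() {
    if (!vrDisplay.isPresenting) {
      // Presentation ended: hand control back to the 2D loop.
      windowRafHandle = win.requestAnimationFrame(onWindowFrame);
      return;
    }
    vrDisplay.requestAnimationFrame(onVRFrame);
    // Poses are sampled only inside the VRDisplay callback.
    vrDisplay.getFrameData(frameData);
    drawScene(frameData);
    // Exactly one submitFrame per VRDisplay callback.
    vrDisplay.submitFrame();
  }

  return {
    startWindowLoop() {
      windowRafHandle = win.requestAnimationFrame(onWindowFrame);
    },
    // Intended to be called once vrDisplay.requestPresent() has resolved.
    startVRLoop() {
      if (windowRafHandle !== null) {
        win.cancelAnimationFrame(windowRafHandle);
        windowRafHandle = null;
      }
      vrDisplay.requestAnimationFrame(onVRFrame);
    },
  };
}
```

In a page this would be used as `createRenderLoop(vrDisplay, window, new VRFrameData(), drawScene)`, calling `startVRLoop()` from the `requestPresent` promise.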
Usually simple sites such as the https://webvr.info/samples are not affected; however, after visiting a heavier site such as SketchFab.com or aframe.io for a while, you can see the latency persist on simpler sites such as webvr.info.
See Also: → 1320616
I believe with A-Frame, the problem could be `Make sure you are disabling the normal 2d display's requestAnimationFrame callback when the VR presentation has started`. We don't do that at the moment. For curiosity and to better understand the solution, can you provide more detail on what that would achieve?
I have found the cause of the intermittent high latency.

You can reproduce it by running a busy loop in JavaScript for a few seconds after you have requested VR presentation.

The duration of the loop has a direct and linear correlation with the amount of latency
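A minimal sketch of such a reproduction, assuming the stall is started as soon as the `requestPresent` promise resolves (the exact timing I used is not spelled out here). `busyWait` and `presentThenStall` are hypothetical helper names, and `requestPresent` normally has to be called from a user gesture.

```javascript
// Blocks the calling thread until roughly `durationMs` milliseconds have
// passed according to `now` (defaults to Date.now). Returns the spin count.
function busyWait(durationMs, now = Date.now) {
  const start = now();
  let spins = 0;
  while (now() - start < durationMs) {
    spins++;
  }
  return spins;
}

// Hypothetical helper: request presentation, then stall the main thread so
// that no requestAnimationFrame callbacks can run for `stallMs` milliseconds.
function presentThenStall(vrDisplay, layers, stallMs) {
  return vrDisplay.requestPresent(layers).then(() => busyWait(stallMs));
}
```

Varying `stallMs` (for example 1000 versus 5000) should make the change in perceived latency easy to compare.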

It is caused by Firefox's internal "watchdog" that tries to re-kick the RAF cycle when it stops receiving frames from the content

As the JavaScript wasn't able to receive the requestAnimationFrame callbacks for the few seconds after the requestPresent call, the watchdog starts filling IPC queues within Firefox with the frames that it wanted

When it is unblocked and the IPC queues start flushing, the frames will start to be rendered by content with stale pose information stored with the IPC messages

These frames are submitted to the VRDevice as fast as they can be rendered, which may be at a rate lower than the framerate of the headset

If two frames are submitted within one raf callback, it would normally ignore all but the first


So content that isn't dropping frames would "catch up"

But if content never hits the 90hz, it will continue to be lagged behind by an amount that linearly corresponds to the total number of watchdog injected raf callbacks.
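The catch-up behaviour described above can be illustrated with a toy model. This is not Firefox code: it only assumes that one frame is shown per vsync interval, that any extra frames finished in the same interval are ignored, and that each ignored frame retires one stale callback. `remainingBacklog` and its parameters are hypothetical.

```javascript
// Toy model: how many watchdog-injected callbacks are still queued after
// `vsyncIntervals` headset refreshes, given a fixed content frame time.
function remainingBacklog(injectedCallbacks, contentFrameMs, vsyncIntervals,
                          vsyncMs = 1000 / 90) {
  let backlog = injectedCallbacks;
  let availableMs = 0; // render time accumulated but not yet spent
  for (let i = 0; i < vsyncIntervals && backlog > 0; i++) {
    availableMs += vsyncMs;
    const framesThisInterval = Math.floor(availableMs / contentFrameMs);
    availableMs -= framesThisInterval * contentFrameMs;
    // The first frame is displayed; each extra frame is ignored and
    // retires one stale callback from the backlog.
    const ignored = Math.max(0, framesThisInterval - 1);
    backlog = Math.max(0, backlog - ignored);
  }
  return backlog;
}
```

In this model, content taking 5 ms per frame drains a backlog of 90 callbacks well within 900 intervals, while content taking 12 ms per frame never finishes two frames in one interval and keeps the full backlog.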

My fix:

I will close the loop with the frame IDs so the GPU process receiving the submitted frames can appropriately skip frames to allow the input sampling and frames to be re-synced

I'll need to land my current big patch first (it's going through reviews now), which conveniently sets up some of the information needed to do this

Also, instead of just skipping frames that have been rendered, it will throttle the RAF callback to accommodate this special case where content never reaches 90hz
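The idea can be sketched as follows. This is an illustration only, not the patch itself: `FrameGate`, its methods, and the `force` flag standing in for the watchdog are hypothetical, and plain JavaScript numbers are used in place of 64-bit frame IDs.

```javascript
// Hypothetical sketch of closing the loop with frame IDs.
class FrameGate {
  constructor() {
    this.latestIssuedId = 0;  // ID of the newest callback sent to content
    this.lastSubmittedId = 0; // newest ID that content has submitted
  }

  // Called when a new RAF callback would be sent to content. Returns the
  // new frame ID, or null if throttled because a frame is still outstanding.
  // `force` models a watchdog re-kick that issues a callback regardless.
  startFrame(force = false) {
    if (!force && this.lastSubmittedId < this.latestIssuedId) {
      return null;
    }
    this.latestIssuedId += 1;
    return this.latestIssuedId;
  }

  // Called when content submits a frame tagged with the ID it was rendered
  // for. Returns true if the frame should be presented, false if it is
  // stale (rendered for an older pose) and should be skipped.
  submitFrame(frameId) {
    this.lastSubmittedId = Math.max(this.lastSubmittedId, frameId);
    return frameId === this.latestIssuedId;
  }
}
```

With this arrangement a slow page has at most one callback outstanding in normal operation, and anything rendered for an older ID after a forced re-kick is dropped instead of being shown with a stale pose.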

I tested this hypothesis manually, by attaching a debugger to Firefox while the effect was reproduced.  I paused the GPU process to allow content to catch up to the RAF callback events in the IPC queue, then released the GPU process, which culled out all but one latent frame.

With the Sketchfab Lily and Snout scene, the latency was visible before I paused the GPU process, but immediately disappeared after stopping and resuming the GPU process.

For Oculus Rift, this effect was not as apparent, but still present for a couple of reasons:

- The Oculus SDK receives matching poses from the input to the output

- When you remove your Oculus HMD from your head, the Firefox VR code throttles the VRDisplay.requestAnimationFrame callback, allowing the content to catch up.

Usually on heavy content loads like SketchFab, you put on the headset after the content has finished loading, resulting in the backlog of IPC messages not building up until the loading has been completed.

If you keep the Oculus on your head from the start to end of the Sketchfab load (you must clear your browser cache to reproduce it!), you will see the same latency as on the HTC Vive.  Taking the Oculus headset off and putting it on again results in the latency disappearing (except for 1 frame of latency!)
one factoid:  a student in my lab is trying to understand what’s going on in the vive with argon.js, and he says that when rendering without presenting (PC w/Vive attached) frames take ~2-4ms and are a solid 60fps.  When presenting, things alternate between ~7-8ms (and show 90fps) or 13-15ms (and show 60fps)  … with the bulk of the time being in submitFrame.  Not sure if this is “useful”.
Some heroes don't wear capes. Thanks, Kip!
This updated patch no longer depends on the bigger refactoring and can be landed on its own.
Attachment #8906189 - Flags: review?(kchen)
Attachment #8906189 - Flags: review?(dmu)
Kanru: Would you mind reviewing the small ipdl change?  We are simply adding a new integer parameter.

Daosheng: Could you please review the rest?
Comment on attachment 8906189 [details]
Bug 1394561 - Ensure WebVR content can catch up when IPC messages are delayed

https://reviewboard.mozilla.org/r/177954/#review183598

r=me. Please consider replacing const uint64_t with a plain uint64_t when passing them to functions. I am OK with keeping them as const uint64_t, though.

::: gfx/vr/VRDisplayHost.h:51
(Diff revision 1)
>    virtual void NotifyVSync();
>  
>    void StartFrame();
>    void SubmitFrame(VRLayerParent* aLayer,
>                     mozilla::layers::PTextureParent* aTexture,
> +                   const uint64_t aFrameId,

It looks like you don't need a const for uint64_t

::: gfx/vr/VRManager.h:48
(Diff revision 1)
>    template<class T> void NotifyGamepadChange(uint32_t aIndex, const T& aInfo);
>    RefPtr<gfx::VRDisplayHost> GetDisplay(const uint32_t& aDisplayID);
>    void GetVRDisplayInfo(nsTArray<VRDisplayInfo>& aDisplayInfo);
>  
>    void SubmitFrame(VRLayerParent* aLayer, layers::PTextureParent* aTexture,
> +                   const uint64_t& aFrameId,

It seems that you should replace the const uint64_t& with a uint64_t
Attachment #8906189 - Flags: review?(dmu) → review+
Comment on attachment 8906189 [details]
Bug 1394561 - Ensure WebVR content can catch up when IPC messages are delayed

https://reviewboard.mozilla.org/r/177954/#review183612

This is fine. You don't need my review for IPDL changes unless you add/remove sync messages :)
Attachment #8906189 - Flags: review?(kchen) → review+
Pushed by kgilbert@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7d20f9d48e02
Ensure WebVR content can catch up when IPC messages are delayed r=daoshengmu,kanru
https://hg.mozilla.org/mozilla-central/rev/7d20f9d48e02
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Duplicate of this bug: 1376490
Hi,
I have just retested my model https://sketchfab.com/models/b4b378f7ee714996923c4db2ceaa19cb and head tracking is working correctly.

Many thanks for your great work.
Comment on attachment 8909987 [details] [diff] [review]
Bug 1394561 - Ensure WebVR content can catch up when IPC messages are delayed (Release Uplift)

Approval Request Comment
[Feature/Bug causing the regression]:
N/A - Content and external runtimes changed, magnifying effect of latency
[User impact if declined]:
Users of WebVR will intermittently notice extreme latency (> 1s) on popular sites such as Sketchfab
[Is this code covered by automated tests?]:
Yes, this code is executed by our automated mochitests and reftests.
[Has the fix been verified in Nightly?]:
This fix has landed in Nightly with end-users confirming the fix.
[Needs manual test from QE? If yes, steps to reproduce]: 
Yes
STR:
- Clear the browser's cache.
- Visit Sketchfab.com while wearing an HTC Vive headset.
- View multiple models without physically removing the headset between each viewing (there is a proximity sensor in the headset).
- If failed, multiple seconds of latency can be perceived in the headset after viewing several models.
- If passed, there will be no noticeable cumulative degradation in performance as each model is viewed.
[List of other uplifts needed for the feature/fix]:
None needed
[Is the change risky?]:
Low risk.
[Why is the change risky/not risky?]:
This is a self-contained change, touching code only executed when viewing a WebVR site.  The fix has been confirmed in Nightly.
[String changes made/needed]:
N/A

Additional notes...

This is a high priority fix as the issue renders WebVR unusable on popular sites, prompting me to request uplift to beta even when so close to the uplift date.  Interaction between the web content and the VR runtime updates resulted in this issue being identified late in the cycle, but the fix is low risk.  If not accepted to Beta before the beta->release uplift, we would expect WebVR users to either switch browsers or downgrade to Nightly/Beta releases.
Attachment #8909987 - Flags: approval-mozilla-beta?
Comment on attachment 8909987 [details] [diff] [review]
Bug 1394561 - Ensure WebVR content can catch up when IPC messages are delayed (Release Uplift)

Approval Request Comment
[Feature/Bug causing the regression]:
N/A - Content and external runtimes changed, magnifying effect of latency
[User impact if declined]:
Users of WebVR will intermittently notice extreme latency (> 1s) on popular sites such as Sketchfab
[Is this code covered by automated tests?]:
Yes, this code is executed by our automated mochitests and reftests.
[Has the fix been verified in Nightly?]:
This fix has landed in Nightly with end-users confirming the fix.
[Needs manual test from QE? If yes, steps to reproduce]: 
Yes
STR:
- Clear the browser's cache.
- Visit Sketchfab.com while wearing an HTC Vive headset.
- View multiple models without physically removing the headset between each viewing (there is a proximity sensor in the headset).
- If failed, multiple seconds of latency can be perceived in the headset after viewing several models.
- If passed, there will be no noticeable cumulative degradation in performance as each model is viewed.
[List of other uplifts needed for the feature/fix]:
None needed
[Is the change risky?]:
Low risk.
[Why is the change risky/not risky?]:
This is a self-contained change, touching code only executed when viewing a WebVR site.  The fix has been confirmed in Nightly.
[String changes made/needed]:
N/A

Additional notes...

This is a high priority fix as the issue renders WebVR unusable on popular sites, prompting me to request uplift so close to the release date.  Interaction between the web content and the VR runtime updates resulted in this issue being identified late in the cycle, but the fix is low risk.  If not accepted before final release of FF56, we would expect WebVR users to either switch browsers or downgrade to Nightly/Beta releases.
Attachment #8909987 - Attachment description: Bug 1394561 - Ensure WebVR content can catch up when IPC messages are delayed (Beta Uplift) → Bug 1394561 - Ensure WebVR content can catch up when IPC messages are delayed (Release Uplift)
Attachment #8909987 - Attachment filename: bug1394561_beta.patch → bug1394561_release.patch
Attachment #8909987 - Flags: approval-mozilla-beta? → approval-mozilla-release?
I have verified that the patch applies correctly on release and have done some manual testing with both Oculus CV1 and the HTC Vive to ensure that the fix is working appropriately on its own.  The fix is working great on all sites I tested it on, including Aframe.io, Sketchfab.com, and webvr.info.

Please advise if there are any other steps I need to take to get this into release FF56 (RC2).

- Kip
Flags: needinfo?(lhenry)
Flags: needinfo?(lhenry)
Comment on attachment 8909987 [details] [diff] [review]
Bug 1394561 - Ensure WebVR content can catch up when IPC messages are delayed (Release Uplift)

Looks like the effects of this patch should be limited to WebVR. 
Let's uplift for the 56 RC2 build.
Attachment #8909987 - Flags: approval-mozilla-release? → approval-mozilla-release+
Please see the Oculus testing results here: https://bugzilla.mozilla.org/show_bug.cgi?id=1381165#c11 , as for the HTC Vive, I will comment here as soon as the testing is done.
Hey Kip!

There's still noticeable latency on FF/WebVR. When the boundaries are up, you can see a swim/drift effect, not synced up solidly with the content. Any ideas?