Closed Bug 1626734 Opened 7 months ago Closed 6 months ago

Pinchevents assert when using touchscreen pinch-zoom on Windows Debug

Categories

(Core :: Panning and Zooming, defect, P3)

Desktop
Windows 10
defect

Tracking

()

RESOLVED FIXED
mozilla78
Tracking Status
firefox75 --- disabled
firefox76 --- disabled
firefox77 --- disabled
firefox78 --- fixed

People

(Reporter: ktaeleman, Assigned: kats)

References

(Blocks 1 open bug)

Details

(Whiteboard: apz-planning [apz:dtz:3:S])

Attachments

(3 files)

Assert:
MOZ_ASSERT(mBuffer.empty() || mBuffer.back().mTimeStamp <= event.mTimeStamp);

Objects:

-		(*((mozilla::InputData*)&(mBuffer)[3])).mTimeStamp	{mValue={mGTC=2633903100000000 mQPC=2633943701684000 mUsedCanonicalNow=false ...} }	mozilla::TimeStamp
-		mValue	{mGTC=2633903100000000 mQPC=2633943701684000 mUsedCanonicalNow=false ...}	mozilla::TimeStampValue
		mGTC	2633903100000000	unsigned __int64
		mQPC	2633943701684000	unsigned __int64
		mUsedCanonicalNow	false	bool
		mIsNull	false	bool
		mHasQPC	true	bool
-		event.mTimeStamp	{mValue={mGTC=2633903010000000 mQPC=2633943698229000 mUsedCanonicalNow=false ...} }	mozilla::TimeStamp
-		mValue	{mGTC=2633903010000000 mQPC=2633943698229000 mUsedCanonicalNow=false ...}	mozilla::TimeStampValue
		mGTC	2633903010000000	unsigned __int64
		mQPC	2633943698229000	unsigned __int64
		mUsedCanonicalNow	false	bool
		mIsNull	false	bool
		mHasQPC	true	bool

Callstack:

 	xul.dll!mozilla::layers::AsyncPanZoomController::OnScale(const mozilla::PinchGestureInput & aEvent) Line 1595	C++
 	xul.dll!mozilla::layers::AsyncPanZoomController::HandleGestureEvent(const mozilla::InputData & aEvent) Line 1267	C++
 	xul.dll!mozilla::layers::GestureEventListener::HandleInputTouchMove() Line 387	C++
 	xul.dll!mozilla::layers::GestureEventListener::HandleInputEvent(const mozilla::MultiTouchInput & aEvent) Line 119	C++
 	xul.dll!mozilla::layers::AsyncPanZoomController::HandleInputEvent(const mozilla::InputData & aEvent, const mozilla::gfx::Matrix4x4Typed<mozilla::ScreenPixel,mozilla::ParentLayerPixel,float> & aTransformToApzc) Line 1133	C++
 	xul.dll!mozilla::layers::InputBlockState::DispatchEvent(const mozilla::InputData & aEvent) Line 176	C++
 	xul.dll!mozilla::layers::TouchBlockState::DispatchEvent(const mozilla::InputData & aEvent) Line 679	C++
 	xul.dll!mozilla::layers::InputQueue::ProcessQueue() Line 767	C++
 	xul.dll!mozilla::layers::InputQueue::ReceiveTouchInput(const RefPtr<mozilla::layers::AsyncPanZoomController> & aTarget, mozilla::layers::TargetConfirmationFlags aFlags, const mozilla::MultiTouchInput & aEvent, unsigned __int64 * aOutInputBlockId, const mozilla::Maybe<nsTArray<unsigned int>> & aTouchBehaviors) Line 179	C++
 	xul.dll!mozilla::layers::InputQueue::ReceiveInputEvent(const RefPtr<mozilla::layers::AsyncPanZoomController> & aTarget, mozilla::layers::TargetConfirmationFlags aFlags, const mozilla::InputData & aEvent, unsigned __int64 * aOutInputBlockId, const mozilla::Maybe<nsTArray<unsigned int>> & aTouchBehaviors) Line 41	C++
 	xul.dll!mozilla::layers::APZCTreeManager::ProcessTouchInput(mozilla::MultiTouchInput & aInput) Line 2075	C++
 	xul.dll!mozilla::layers::APZCTreeManager::ReceiveInputEvent(mozilla::InputData & aEvent) Line 1513	C++
 	xul.dll!mozilla::layers::APZInputBridgeParent::RecvReceiveMultiTouchInputEvent(const mozilla::MultiTouchInput & aEvent, mozilla::layers::APZEventResult * aOutResult, mozilla::MultiTouchInput * aOutEvent) Line 31	C++
 	xul.dll!mozilla::layers::PAPZInputBridgeParent::OnMessageReceived(const IPC::Message & msg__, IPC::Message * & reply__) Line 193	C++
 	xul.dll!mozilla::gfx::PGPUParent::OnMessageReceived(const IPC::Message & msg__, IPC::Message * & reply__) Line 1561	C++
 	xul.dll!mozilla::ipc::MessageChannel::DispatchSyncMessage(mozilla::ipc::ActorLifecycleProxy * aProxy, const IPC::Message & aMsg, IPC::Message * & aReply) Line 2156	C++
 	xul.dll!mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message && aMsg) Line 2107	C++
 	xul.dll!mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::MessageChannel::MessageTask & aTask) Line 1960	C++
 	xul.dll!mozilla::ipc::MessageChannel::MessageTask::Run() Line 1991	C++
 	xul.dll!nsThread::ProcessNextEvent(bool aMayWait, bool * aResult) Line 1220	C++
 	xul.dll!NS_ProcessNextEvent(nsIThread * aThread, bool aMayWait) Line 481	C++
 	xul.dll!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate * aDelegate) Line 84	C++
 	xul.dll!mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate * aDelegate) Line 271	C++
 	xul.dll!MessageLoop::RunInternal() Line 315	C++
 	xul.dll!MessageLoop::RunHandler() Line 309	C++
 	xul.dll!MessageLoop::Run() Line 291	C++
 	xul.dll!nsBaseAppShell::Run() Line 139	C++
 	xul.dll!nsAppShell::Run() Line 406	C++
 	xul.dll!XRE_RunAppShell() Line 909	C++
 	xul.dll!mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate * aDelegate) Line 237	C++
 	xul.dll!MessageLoop::RunInternal() Line 315	C++
 	xul.dll!MessageLoop::RunHandler() Line 309	C++
 	xul.dll!MessageLoop::Run() Line 291	C++
 	xul.dll!XRE_InitChildProcess(int aArgc, char * * aArgv, const XREChildData * aChildData) Line 744	C++
 	xul.dll!mozilla::BootstrapImpl::XRE_InitChildProcess(int argc, char * * argv, const XREChildData * aChildData) Line 67	C++
 	firefox.exe!content_process_main(mozilla::Bootstrap * bootstrap, int argc, char * * argv) Line 56	C++
 	firefox.exe!NS_internal_main(int argc, char * * argv, char * * envp) Line 303	C++
 	firefox.exe!wmain(int argc, wchar_t * * argv) Line 131	C++
 	[Inline Frame] firefox.exe!invoke_main() Line 90	C++
 	firefox.exe!__scrt_common_main_seh() Line 288	C++
 	kernel32.dll!00007ffed7196fd4()	Unknown
 	ntdll.dll!00007ffed8e9cf31()	Unknown```
See Also: → 1451461
Blocks: desktop-zoom
No longer blocks: desktop-zoom-xp
Whiteboard: desktop-zoom-nightly → apz-planning

I'll investigate.

Assignee: nobody → kats

For "reasons" we have both mTime and mTimeStamp on these input events, and mTime seems to be more reliable one as it comes from the OS. On Windows mTimeStamp is computed through some hairy code that tries to account for a clock skew and other things. So one pretty simple fix would be to just use mTime instead of mTimeStamp in the assertion. But I'll dig a bit more to see if I can narrow down why we're getting jitter in mTimeStamp since it seems like that shouldn't happen.

So this is indeed a bug in the SystemTimeConverter code, specifically with the fact that it truncates fractional milliseconds here. Here's an example sequence I captured with printf logging:

  • Call GetMessageTimeStamp(aEventTime=662853921)
  • This trickles down into this call to IsNewerThanTimeStamp
  • Inside IsNewerThanTimestamp, the timeDelta is 31 and the timestampDelta is 32.146563 from the reference time and timestamps, respectively. This causes the function to set deltaFromNow=1 (because it truncates the 32.146563 and then subtracts 31) and return newer=false.
  • Back at the call site, we subtract 1ms from roughlyNow and return it. This effectively returns a value that is 31.146563 ms after the reference timestamp.
  • Later, we again call GetMessageTimeStamp with the same aEventTime=662853921
  • Again this trickles down into IsNewerThanTimeStamp, but this time roughlyNow is at a value 43.139479ms after the reference timestamp.
  • Inside IsNewerThanTimestamp, we get timeDelta as 31 again, and timestampDelta as 43.139479. So again the truncation/subtraction happens, and this function produces deltaFromNow=12 and returns newer=false.
  • Back at the call site, we subtract 12ms from roughlyNow, which returns a value that is 31.139479 ms after the reference timestamp.

Note that the value returned in the second call to GetMessageTimeStamp is 0.007083ms earlier the value returned from the first call to GetMessageTimeStamp. We expect time to always go forward, but this code can result it in going backwards by a fractional millisecond.

Brian, it looks like you were the last one to significantly touch this code. Do you know if it's expected/allowed behaviour for GetMessageTimeStamp to return non-monotonic timestamp objects when called with the same input multiple times? (see comment 3 for detailed walkthrough). It looks like the TimeDuration objects we have really only have millisecond resolution so I'm having trouble thinking of a simple way to fix this problem without storing additional state.

Flags: needinfo?(brian)

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #4)

It looks like the TimeDuration objects we have really only have millisecond resolution so I'm having trouble thinking of a simple way to fix this problem without storing additional state.

Actually that's not true. TimeDuration objects have more resolution. I can probably fix this without too much trouble then.

We should only use mTimestamp. mTime was supposed to be removed in bug 1026809. Someone from Taipei was going to take that bug but never got around to it.

Regarding the time conversion, Karl probably remembers it as well as I do. (I'm no longer with Mozilla so I don't have so much time to dig into it I'm afraid but I could well believe that conversion code is buggy. Karl and I spent a long time discussing it but it's quite complicated.)

Flags: needinfo?(brian)

Ah, thanks for the info! I'll flag Karl for feedback on patches.

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #3)

Thank you for that analysis. That is very helpful.

Essentially timeDelta is being calculated based on aEventTime with millisecond precision, but being subtracted from roughlyNow, which keeps changing independently of aEventTime with a different precision.

Brian did at one stage have mReferenceTimeStamp + TimeDuration::FromMilliseconds(timeDelta). That avoids the problem because the reference is stable, but that would have worked well only until timeDelta overflowed its 32-bit range, which is why I suggested roughlyNow - TimeDuration::FromMilliseconds(deltaFromNow). I see now the problem with using roughlyNow.

Brian's approach would work well if timeDelta were replaced with something with more range. That could be constructed with the most-significant bits from timestampDelta but least significant (more precise) bits from timeDelta. There would need to be some care around the 32-bit overflow points.

Just a quick update: what you said makes sense, and I think would be a better approach than what I have in the current patch, if I can make it work. Given how tricky this code is though I decided to spend some time making it gtest-able and writing some tests. That way we can proceed with more confidence that I'm not regressing tricky edge cases with overflow and whatnot.

Using gtest sounds great thanks.

I think one way to adapt the patch to take the approach in comment 9 for determining the timestamp would be

int64_t wholeMillis = static_cast<int64_t>(timeStampDelta.ToMilliseconds());
Time wrappedTimeStampDelta = wholeMillis; // truncate to unsigned
Time timeToTimeStamp = wrappedTimeStampDelta - timeDelta;
bool isNewer = false;
if (timeToTimeStamp == 0) {
  // wholeMillis needs no adjustment
} else if (timeToTimeStamp < kTimeHalfRange) {
  wholeMillis -= timeToTimeStamp;
} else {
  isNewer = true;
  wholeMillis += (-timeToTimeStamp)
}
*aTimeAsTimeStamp = mReferenceTimeStamp + TimeDuration::FromMilliseconds(wholeMillis);

Tests would be useful to verify of course.

The choice between the existing truncation to wholeMillis and other kinds of rounding doesn't matter for the calculation of timeAsTimeStamp because the difference gets cancelled out anyway. There remains a question for isNewer. Comparing aTimeStamp with aTimeAsTimeStamp is what we really want for isNewer, I expect, and that is different from the existing code, but leaving that as is for now also seems fine to focus on a safe improvement for the issue reported here.

I find I'm having some trouble wrapping my head around this code well enough to write good tests. I wrote a few that exercise some of the overflow cases but I don't think I did a particularly great job. I did write a test for this bug, and verified that the modification you posted above does fix it. For now I'll post the patches I have so far in the interest of getting feedback before I spend more time on this. I tried to make the test-writing experience fairly ergonomic so hopefully if you want to write more tests you can do that directly without too much trouble, or let me know what kind of cases need coverage and I can do it.

Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2babcf81169d
Add some gtests for SystemTimeConverter. r=karlt
https://hg.mozilla.org/integration/autoland/rev/98c02385b0fd
Add gtests for SystemTimeConverter that exercise wrapping timestamps. r=karlt
https://hg.mozilla.org/integration/autoland/rev/1bc958f2959b
Fix edge case that can cause TimeStamps to go backwards by fractional milliseconds. r=karlt
Whiteboard: apz-planning → apz-planning [apz:dtz:3:S]
You need to log in before you can comment on or make changes to this bug.