914847 - IPDL::PBrowser::RecvRealTouchEvent causes 50ms sync reflow

The suspicious code is http://mxr.mozilla.org/mozilla-central/source/layout/base/nsPresShell.cpp?rev=bc427f5ec61b&mark=6063-6063#6061 which was added in Bug 780692. But if flush takes lots of time, there is obviously something to flush. Should figure out what, since there will be a flush at some point anyway. And perhaps try to remove the explicit flush for touch events.

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Comment 5

•

12 years ago

Here's my attempt at repro (on a keon, because hamachi flashing is broken): https://people.mozilla.org/~bgirard/cleopatra/#report=d990e751f239197029652a3b728f85c712a0b554&search=realtouch

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Comment 6

•

12 years ago

I notice a call to mozilla::dom::HTMLImageElementBinding::set_src immediately before the ~50ms reflow.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 7

•

12 years ago

(In reply to Olli Pettay [:smaug] from comment #4) > And perhaps try to remove the explicit flush for touch events. Or do it conditionally only if we have animations to flush.

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Comment 8

•

12 years ago

So the user scrolls, the app adds more contacts to the DOM, and at some point we have to figure out where things are. What part of that do we think we can improve on? Or do we think that's not what's happening? Keep in mind that (1) I currently know very little about layout and how we do it, and (2) reflows caused by reading scrollTop/scrollLeft are covered by bug 914854, which is not this.

David Zbarsky (:dzbarsky)

Comment 9

•

12 years ago

(In reply to Olli Pettay [:smaug] from comment #7) > (In reply to Olli Pettay [:smaug] from comment #4) > > And perhaps try to remove the explicit flush for touch events. > Or do it conditionally only if we have animations to flush. That would solve the particular case for that bug, but I'm worried that there are other ways for content to not be at the correct location if we don't flush.

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Comment 10

•

12 years ago

This seems to require deep knowledge of the layout code (which I do not have) to even determine if this is something that can meaningfully be "fixed", let alone do so. Help?

Flags: needinfo?(roc)

Flags: needinfo?(dbaron)

Flags: needinfo?(bzbarsky)

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 11

•

12 years ago

Just sort of thinking out loud here: why do we want to flush as part of processing a touch event? In theory, if the user has touched something, we'd rather send the event to the thing they actually saw when they touched than what would be there if everything were up-to-date. The exception to that is animations on the compositor (which I think is why this flush got put in in the first place), where the layout data we have is behind what the user is actually seeing. So I wonder if, instead of flushing, we could use current data from transitions running on the compositor (maybe we even using a timestamp on the event) when determining where to dispatch the event, but otherwise *not* update layout data?

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 12

•

12 years ago

Sounds good to me.

Flags: needinfo?(roc)

Boris Zbarsky [:bzbarsky]

Comment 13

•

12 years ago

Likewise.

Flags: needinfo?(bzbarsky)

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 14

•

12 years ago

OK, so I guess that leaves a few questions. One, it occurs to me that there are two different ways to do what I said: (1) [which I was originally assuming] change the display list construction that we do fr event handling to look at "current" compositor animation state for things that have transform being animated on the compositor. (Maybe opacity too, but that's only barely relevant.) (2) re-use the mini-flush code and actually reflect the data in style, and do the necessary frame tree updates, without flushing anything else. It occurs to me that (2) *might* actually be simpler. Though mini-flush code is pretty dangerous to begin with, so perhaps not. Two: is this something Jed is comfortable taking a shot at? Or should we find somebody with more layout experience?

Mike Lee [:mlee]

Comment 15

•

12 years ago

(In reply to David Baron [:dbaron] (needinfo? me) from comment #14) > ... > ... > Two: is this something Jed is comfortable taking a shot at? Or should we > find somebody with more layout experience? Jed has already expressed some discomfort tackling this so a developer with more Layout experience is preferred. Who would you recommend?

Whiteboard: [c= p= s= u=1.2] → [c=handeye p= s= u=1.2]

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 16

•

12 years ago

(In reply to Mike Lee [:mlee] from comment #15) > Jed has already expressed some discomfort tackling this so a developer with > more Layout experience is preferred. Who would you recommend? I think that actually depends on the answer to "One", so I'll ask bz what he thinks about "One" from comment 14.

Flags: needinfo?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 17

•

12 years ago

I don't have a good feel for how easy it is to ask the compositor for info like that. If it's not too hard, it certainly sounds safer than messing with miniflushes.

Flags: needinfo?(bzbarsky)

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 18

•

12 years ago

I think option 1 is more complex in the end since it adds a new way to get style/geometry data. The miniflush is an existing concept.

Benoit Girard (:BenWa)

Reporter

Comment 19

•

12 years ago

If we're trying to provide scrolling physics then we should not do anything fancy but wait on sub-APZC which runs all the scrolling code in the compositor. Ideally in the mean time we can have a work around. Of course this doesn't work if we want to do something more then APZC does. Asking the compositor for information is bad because the compositor is designed to block on vsync so the compositor can reasonably be expected to take ~15ms to handle an IPC message.

Nick Cameron [:nrc]

Assignee

Comment 20

•

12 years ago

roc has asked me to look at this. I am not sure how option one would work - would we send an async message to the compositor thread querying the current values of the layer for a display list item? That sounds slow and complex. In the current code that Smaug pointed to (http://mxr.mozilla.org/mozilla-central/source/layout/base/nsPresShell.cpp?rev=bc427f5ec61b&mark=6063-6063#6061) why do we only flush for touch events and not (say) for clicks? Are assuming that we only need to flush on mobile? Or is there something else at play? Do we need to take account of the APZC transform as well as any OMTA here? Or has AZPC been factored in before we get to this stage?

Assignee: jld → ncameron

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 21

•

12 years ago

(In reply to Nick Cameron [:nrc] from comment #20) > I am not sure how option one would work - would we send an async message to > the compositor thread querying the current values of the layer for a display > list item? That sounds slow and complex. What I was thinking, anyway, was that we wouldn't ask the compositor anything -- we'd just, for anything being animated on the compositor, do the computation on the main thread, given the time (from the event) and the animation manager's animation data. > In the current code that Smaug pointed to > (http://mxr.mozilla.org/mozilla-central/source/layout/base/nsPresShell. > cpp?rev=bc427f5ec61b&mark=6063-6063#6061) why do we only flush for touch > events and not (say) for clicks? Are assuming that we only need to flush on > mobile? Or is there something else at play? Maybe because OMTA is only enabled on mobile, and we'd have needed it for clicks if we enabled OMTA on desktop? Anyway, the idea here is to remove that and replace it with a better solution.

Flags: needinfo?(dbaron)

Nick Cameron [:nrc]

Assignee

Comment 22

•

12 years ago

Does anyone have a test case for this? I have a test case for desktop and (changing |if (aEvent->eventStructType == NS_TOUCH_EVENT)| to |nsLayoutUtils::AreAsyncAnimationsEnabled()| so that I flush for clicks as well as touches) the behaviour is buggy - I often get catch the click event with the wrong element. Removing the flush makes absolutely no difference to the behaviour (so one fix here is just to delete it, we should get a perf improvement and no worse correctness). Disabling OMTA gives the correct behaviour, so I'm pretty sure OMTA is what is causing the issues. I had a bit of an experiment with the full flush and using a mini-flush, but I could not cause correct behaviour, nor determine why we get incorrect behaviour. I didn't measure the time difference between a full flush and a mini-flush either. I will continue to investigate on Monday.

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 23

•

12 years ago

(In reply to Benoit Girard (:BenWa) from comment #19) > If we're trying to provide scrolling physics then we should not do anything > fancy but wait on sub-APZC which runs all the scrolling code in the > compositor. Ideally in the mean time we can have a work around. Of course > this doesn't work if we want to do something more then APZC does. > > Asking the compositor for information is bad because the compositor is > designed to block on vsync so the compositor can reasonably be expected to > take ~15ms to handle an IPC message. Perhaps a distraction from this bug, but -- how is doing scrolling physics on the compositor thread going to interact with event handling?

Benoit Girard (:BenWa)

Reporter

Comment 24

•

12 years ago

(In reply to David Baron [:dbaron] (needinfo? me) from comment #23) > Perhaps a distraction from this bug, but -- how is doing scrolling physics > on the compositor thread going to interact with event handling? If we have an event listeners then we dispatch the event to the main thread for handling, otherwise it's safe to handle the scrolling from the compositor. But yes its been tricky to get right and if there's a mismatch typically the screen position jumps around if the compositor and the main thread are mediating the numbers correctly between the two threads.

Nick Cameron [:nrc]

Assignee

Comment 25

•

12 years ago

I have more questions than answers, I'm afraid. I get the same results with my testcase whether I do a full flush, a mini flush, or no flush at all. I have a div transitioning I click on the div to start the transition and then click in the same place without moving the mouse once the div is no longer there. I listen for the event on the moving div and a div behind. Expected: the background div receives the click event. This happens with OMTA disabled. With OMTA enabled, neither div receives the event. I have been trying to figure out why that happens. The first thing I checked was the current event frame and content in the pres shell, surprisingly these were always correct. I also checked using the layout debugger (visual event debugging/event dumping) and this highlighted the expected elements. I then spent a while debugging through the event dispatch system but at this point I stopped getting anything useful - I'm just not familiar with the code, in particular I couldn't work out how to get the element associated with the event listener. Its possible also there is something wrong with my test case (I'll upload that in a sec), but it works fine with OMTA disabled and if I move the mouse before the second click to cause a mouse move events. SO, can someone give me some tips on debugging event dispatch please? Anything/anywhere I should check to find out why my listener is not getting the event?

Nick Cameron [:nrc]

Assignee

Comment 26

•

12 years ago

Attached file testcase for desktop (requires enabling OMTC and OMTA) — Details

Andreas Gal :gal

Comment 27

•

12 years ago

Paging :smaug to the rescue

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 28

•

12 years ago

Do you know if click event is dispatched at all? It is dispatched in http://mxr.mozilla.org/mozilla-central/source/content/events/src/nsEventStateManager.cpp?rev=5dbc32740161#4402 Note, aEvent->clickCount needs to be > 0. So if mousedown and mouseup happen on different elements, click isn't dispatched.

Nick Cameron [:nrc]

Assignee

Comment 29

•

12 years ago

(In reply to Olli Pettay [:smaug] from comment #28) > Do you know if click event is dispatched at all? > It is dispatched in > http://mxr.mozilla.org/mozilla-central/source/content/events/src/ > nsEventStateManager.cpp?rev=5dbc32740161#4402 > Note, aEvent->clickCount needs to be > 0. So if mousedown and mouseup happen > on different elements, > click isn't dispatched. Yes, I see click events dispatched here and, I think, to the correct elements. I see them bouncing around a bunch of HandleEvent type methods in nsPresShell, nsEventListenerManager and nsEventDispatcher, but that's where I couldn't match events to elements any more, so I'm not really sure of what is going on.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 30

•

12 years ago

nsEventListenerManager::HandleEventSubType should get called with aDOMEvent->GetInternalNSEvent()->message==NS_MOUSE_CLICK and aCurrentTarget pointing to the EventTarget which has at least one listener for click. (Elements, Documents and Windows for example are EventTargets)

Nick Cameron [:nrc]

Assignee

Comment 31

•

12 years ago

(In reply to Olli Pettay [:smaug] from comment #30) > nsEventListenerManager::HandleEventSubType should get called with > aDOMEvent->GetInternalNSEvent()->message==NS_MOUSE_CLICK and aCurrentTarget > pointing to the > EventTarget which has at least one listener for click. > (Elements, Documents and Windows for example are EventTargets) Ah! The crucial bit of info I was missing is that elements _are_ EventTargets. I was looking for a field. That should make it easier, thanks!

Nick Cameron [:nrc]

Assignee

Comment 32

•

12 years ago

I have kind of figured out what is going on here. The key is that for a successful click we go through the event handling machinery twice. As Smaug guessed, the problem is that click count gets reset so we don't dispatch the click at all. I didn't notice this because we dispatch in the first pass and not the second. That is because we are only doing a successful flush before the second pass and so when we check the click we have a different element. The flush doesn't work because the first time around because we have different pres contexts for the two passes. The two passes have the same root pres context and the first time around, the root pres context is the pres context associated with the pres shell. Only the transition manager on the non-root pres context knows about the transitioning element. Things I don't understand: - why we have two passes and what they are for - why the mouse down and mouse up events are handled by different pres shells - how we have a pres shell handling an event on an element which is transitioning without it knowing about the transition - how we fix this! I either need to change which pres shell gets handles the events, change which transition manager knows about the transition (I think this is not the right approach), or get access to the correct pres context from the pres shell in the first pass. I don't know how to do any of things.

Flags: needinfo?(dbaron)

Flags: needinfo?(bugs)

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 33

•

12 years ago

(In reply to Nick Cameron [:nrc] from comment #32) > - why we have two passes and what they are for if with two passes you mean click dispatch, it is just how click is defined. mousedown and mouseup happening on the same element causes a click to be dispatched. > - why the mouse down and mouse up events are handled by different pres shells They shouldn't be. Does hittesting give different shell? What is the testcase you're using? The one in this bug? (How does one enable OMTC and OMTA on Linux? ) > - how we have a pres shell handling an event on an element which is > transitioning without it knowing about the transition what you mean? On desktop at least we don't do transitions omt, so hittesting should know about the state, no? > - how we fix this! I either need to change which pres shell gets handles the > events, change which transition manager knows about the transition (I think > this is not the right approach), or get access to the correct pres context > from the pres shell in the first pass. I don't know how to do any of things. I'm missing something here. mousedown/up should happen on the same shell/prescontext

Flags: needinfo?(bugs)

Nick Cameron [:nrc]

Assignee

Comment 34

•

12 years ago

(In reply to Olli Pettay [:smaug] from comment #33) > (In reply to Nick Cameron [:nrc] from comment #32) > > - why we have two passes and what they are for > if with two passes you mean click dispatch, it is just how click is defined. > mousedown and mouseup happening on the same element causes a click to be > dispatched. > So I assumed that because the pres shells were different they were deliberately more different than just mouse down/up. > > - why the mouse down and mouse up events are handled by different pres shells > They shouldn't be. Does hittesting give different shell? What is the > testcase you're using? Yes, different shells. Yes, the test case on the bug - click on the yellow box to start the transition, then (without moving the mouse) click once the box has passed - expected: background goes red; actual: nothing happens). > The one in this bug? (How does one enable OMTC and OMTA on Linux? ) You will need a Linux system that is OK with HWA. Then set the following prefs to true: layers.acceleration.force-enabled layers.offmainthreadcomposition.enabled layers.offmainthreadcomposition.async-animations > > > - how we have a pres shell handling an event on an element which is > > transitioning without it knowing about the transition > what you mean? On desktop at least we don't do transitions omt, so > hittesting should know about the state, no? > Main thread vs OMTA doesn't make much difference here. The transitions are managed by a transition manager and there is one transition manager per pres context. Transitions known to one transition manager are not know by transition managers on other pres contexts. > > - how we fix this! I either need to change which pres shell gets handles the > > events, change which transition manager knows about the transition (I think > > this is not the right approach), or get access to the correct pres context > > from the pres shell in the first pass. I don't know how to do any of things. > I'm missing something here. mousedown/up should happen on the same > shell/prescontext OK, so this is the big mystery to me two. I will do some digging tomorrow.

Nick Cameron [:nrc]

Assignee

Comment 35

•

12 years ago

So, I solved my mystery and have patches to hopefully solve the problem - I haven't profiled, it would be great if someone could do that to see if we actually save time. First, the mystery is not so mysterious - the sequence of events is - mouse down is triggered, that is dealt with by the root presshell, we hit test find the frame, dispatch the event, then set mouse capture for a scrollframe in content. There is an animation in progress but mini-flushing on the root prescontext does nothing so the now moved div gets the mouse down. Then mouse up happens, this time the event is handled by the content presshell due to the capture, again we mini-flush, but this time on the content prescontext which is managing the animation and so the flush has an effect and so the mouse up is registered on the background div. That means the down and up happen on different elements so the click event is discarded and nothing happens. My solution is to mini-flush all the prescontexts when we need to, then everything goes right and we get away without doing a full flush. (BTW, touch is different from clicks, but this work was necessary because we want to have OMTA for desktop in the near future where we need to be able to handle clicks in this situation. I'm building not to double check it works with touches.) Patches coming up once I've tested with touch as well as click.

Flags: needinfo?(dbaron)

Nick Cameron [:nrc]

Assignee

Comment 36

•

12 years ago

My enthusiasm was premature, it does not look like my solution works for touches. However, I am relatively optimistic it will not be too hard to fix. I'll be on it tomorrow morning.

Nick Cameron [:nrc]

Assignee

Comment 37

•

12 years ago

Attached patch Use a mini-flush in PresShell::HandleEvent — Details — Splinter Review

There are a couple of questions in comments tagged with 'REVIEW'

Attachment #810245 - Flags: review?(dbaron)

Nick Cameron [:nrc]

Assignee

Comment 38

•

12 years ago

Attached patch Add mini-flush for animations — Details — Splinter Review

We need to mini-flush animations as well as transitions. I'm not very happy about the big macro, but it was that or lots of copypasta or a virtual method call in the middle of an inner loop, and these things are meant to be fast so I wasn't sure if that is a cost we could tolerate.

Attachment #810247 - Flags: review?(dbaron)

Mike Habicher [:mikeh] (high bugzilla latency)

Updated

•

12 years ago

Comment 39

•

12 years ago

Mike Lee - can we get somebody to profile and measure the performance improvement with this patch? It is not a trivial change, and if we are going to push for landing it on Aurora, it would be good to know how much we're gaining.

Flags: needinfo?(mlee)

Mike Lee [:mlee]

Comment 40

•

12 years ago

Mason, need you to profile these changes and report your findings per comment #39 here.

Flags: needinfo?(mlee) → needinfo?(mchang)

Mason Chang [Inactive] [:mchang]

Comment 41

•

12 years ago

On it! I'm still new here though, so it might take me a while to get the profile up and running. I'll post the profiles on this bug once they are up.

Flags: needinfo?(mchang)

Mason Chang [Inactive] [:mchang]

Comment 42

•

12 years ago

I think I got a few profiles. Here's the initial profile showing the 50ms layout::DoReflow delay on Hamachi Device. https://people.mozilla.org/~bgirard/cleopatra/#report=43d980b6ebadca18674a335fd5417c9be71b34b2&search=RecvRealTouchEvent This was based on mozilla-central 149475:df452c8738ef. After applying the patch, here are the results: https://people.mozilla.org/~bgirard/cleopatra/#report=4d13ce9d5dcb3e496f2de9e5768ea88a03547184&search=RecvRealTouchEvent Essentially we no longer see a layout::Flush associated with RecvRealTouchEvent, assuming I'm reading Cleopatra right. This was on the contacts app with the gaia-heavy reference workload, running b2g V1.2. Profiling was done with 1ms sampling rate and profiling the Gecko thread of b2g. Is this the profile you needed Milan and Nick?

Benoit Girard (:BenWa)

Reporter

Comment 43

•

12 years ago

(In reply to Mason Chang from comment #42) > Essentially we no longer see a layout::Flush associated with > RecvRealTouchEvent, assuming I'm reading Cleopatra right. You are, it looks much better.

Mason Chang [Inactive] [:mchang]

Comment 44

•

12 years ago

Hey Nick, I realize you're probably coming back from the summit. I'm new to bugzilla and should've need infoed you with comment 42. Let me know if there's anything else I can do to help!

Flags: needinfo?(ncameron)

Nick Cameron [:nrc]

Assignee

Comment 45

•

12 years ago

(In reply to Mason Chang from comment #44) > Hey Nick, I realize you're probably coming back from the summit. I'm new to > bugzilla and should've need infoed you with comment 42. Let me know if > there's anything else I can do to help! Hi, thanks for the profile. I'm not great at interpreting these things but BenWa is, so if he says it looks good, then I am happy. I don't think there is anything else we need right now, thanks. Just waiting for review and (after that) a decision on whether the gain is worth the risk of landing the patches.

Flags: needinfo?(ncameron)

Dietrich Ayala (:dietrich)

Updated

•

12 years ago

Target Milestone: --- → 1.2 C3(Oct25)

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 46

•

12 years ago

Comment on attachment 810245 [details] [diff] [review] Use a mini-flush in PresShell::HandleEvent I like the refactoring to go inside dispatchUsingCoordinates and do this for all events for which dispatchUsingCoordinates istrue. I'm not sure why the UnlockPointer call was there either, but it goes back to bug 737100, and it might be that that bit (unlike the rest of it) should be specific to NS_TOUCH_EVENT. smaug would know. Per hg blame it looks like the reason for using script blockers goes back to bug 436965, and given that it matches the things that restyle in FlushPendingNotifications it's probably a good idea -- but it seems better to use an nsAutoScriptBlocker rather than using AddScriptBlocker and RemoveScriptBlocker directly. (Make sure its destructor runs *before* the frame = GetNearestFrameContainingPresShell(this) call, though.) The bigger question I'd like to understand is which documents you need to flush styles for. You're currently doing mDocument and all its subdocuments (for subframes, I presume), but you're *not* doing ancestor documents. That seems a little bit peculiar to me. The reason to want to flush is presumably that the async animations have moved something in a way that's going to change the position -> event target mapping. The reason to want to flush styles in multiple documents is that the touch target might be moving due to movement i multiple documents -- its own document, and ancestor documents. But if there's movement in ancestor documents, it better have already been flushed before we get to this point, or else we might be the wrong pres shell entirely. The way I'd like to think the event dispatch works is that if the event lands within a subframe, we'll go through this code that you're modifying for the outer frame, and then go through it again for the inner frame if the coordinates in the outer frame mean that the event is inside an inner frame. If that's the case, then you don't need to enumerate sub documents at all; you just need to flush mDocument (which is what the code that you're removing did, too). If it's not the case, then I'd think you need to flush ancestor documents -- and probably somewhere earlier even before you get to this point (but if that were needed, it seems like the old code would have needed to do it as well, so it seems unlikely). I'm having trouble seeing the justification for flushing all subdocuments. So I'm inclined to think the right thing is dropping the EnumerateSubDocuments call, and doing the flushing only on mDocument. Marking review- to at least prompt answers to the last question, but if you think this is right, feel free to ask again.

Attachment #810245 - Flags: review?(dbaron)

Attachment #810245 - Flags: review-

Attachment #810245 - Flags: feedback?(bugs)

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 47

•

12 years ago

Comment on attachment 810247 [details] [diff] [review] Add mini-flush for animations AnimationCommon.h: >+ // Reparent aContent's style. >+ already_AddRefed<nsStyleContext> ReparentContent(nsIContent* aContent, >+ nsStyleContext* aParentStyle); Mention that it does :before and :after too, in the comment. >+#define UPDATE_THROTTLED_INTERNAL(aClass,aAnimationsGetter) void \ space after comma And perhaps also "IMPL" at the beginning or end of the macro name, and the full name of the function, e.g., IMPL_UPDATE_ALL_THROTTLED_STYLES_INTERNAL. I think that would be a bit clearer. also, make the macro arguments all-lowercase (with underscores if needed), and with a trailing _ to signal that they're macro arguments. There are multiple 80th column violations in the patch. Please fix them. r=dbaron with that

Attachment #810247 - Flags: review?(dbaron) → review+

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 48

•

12 years ago

(In reply to David Baron [:dbaron] (needinfo? me) from comment #46) > I'm not sure why the UnlockPointer call was there either, but it goes back > to bug 737100, and it might be that that bit (unlike the rest of it) should > be specific to NS_TOUCH_EVENT. smaug would know. Pointer lock is for mouse events, for now at least, and since it is not defined for touch events (not talking about pointer events here), we explicitly unlock if someone uses touch. for now.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 49

•

12 years ago

(In reply to David Baron [:dbaron] (needinfo? me) (vacation until October 21) from comment #46) > The way I'd like to think the event dispatch > works is that if the event lands within a subframe, we'll go through this > code that you're modifying for the outer frame, and then go through it again > for the inner frame if the coordinates in the outer frame mean that the > event is inside an inner frame. That's not how it works :-). Basically events are always dispatched by calling PresShell::HandleEvent containing the widget which received the event. If the event needs to go to the focused document or the document currently holding mouse capture, we recursively call PresShell::HandleEvent for that document, but there's only ever one level of recursion. Targeting of events at subdocuments is achieved by building a displaylist for the presshell which crosses subdocument boundaries. > If that's the case, then you don't need to > enumerate sub documents at all; you just need to flush mDocument (which is > what the code that you're removing did, too). If it's not the case, then > I'd think you need to flush ancestor documents -- and probably somewhere > earlier even before you get to this point (but if that were needed, it seems > like the old code would have needed to do it as well, so it seems unlikely). I think Nick's approach is right. We have to flush subdocuments to ensure that the display list we build is accurate. (Maybe we don't have to flush all of them, but that could get complicated.) We don't have to flush ancestors since we guarantee that when PresShell::HandleEvent is called, the event must go to a document in the subdocument tree rooted at this presshell.

Nick Cameron [:nrc]

Assignee

Comment 50

•

12 years ago

dbaron: does roc's comments persuade you that the patch is correct? Or does it still need changes?

Target Milestone: 1.2 C3(Oct25) → ---

Nick Cameron [:nrc]

Assignee

Updated

•

12 years ago

Target Milestone: --- → 1.2 C3(Oct25)

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 51

•

12 years ago

Comment on attachment 810245 [details] [diff] [review] Use a mini-flush in PresShell::HandleEvent sounds good, so r=dbaron with the third paragraph of comment 46 addressed (ignore the long fourth paragraph), plus the change implied by comment 48 (keeping the UnlockPointer call specific to touch events)

Attachment #810245 - Flags: review- → review+

Nick Cameron [:nrc]

Assignee

Comment 52

•

12 years ago

(In reply to David Baron [:dbaron] (needinfo? me) from comment #46) > Comment on attachment 810245 [details] [diff] [review] > Use a mini-flush in PresShell::HandleEvent > > I like the refactoring to go inside dispatchUsingCoordinates and do this for > all events for which dispatchUsingCoordinates istrue. > Do you mean inside WidgetEvent::IsUsingCoordinates? And by 'refactoring' do you mean the code for doing flushes? That seems odd to me - IsUsingCoordinates is just a tester method at the moment and the code uses a member function of nsPresShell.

Flags: needinfo?(dbaron)

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 53

•

12 years ago

(In reply to Nick Cameron [:nrc] from comment #52) > (In reply to David Baron [:dbaron] (needinfo? me) from comment #46) > > Comment on attachment 810245 [details] [diff] [review] > > Use a mini-flush in PresShell::HandleEvent > > > > I like the refactoring to go inside dispatchUsingCoordinates and do this for > > all events for which dispatchUsingCoordinates istrue. > > > > Do you mean inside WidgetEvent::IsUsingCoordinates? And by 'refactoring' do > you mean the code for doing flushes? That seems odd to me - > IsUsingCoordinates is just a tester method at the moment and the code uses a > member function of nsPresShell. I mean I like the change that you made that's already in the patch; no need to do anything else.

Flags: needinfo?(dbaron)

Nick Cameron [:nrc]

Assignee

Comment 54

•

12 years ago

(In reply to David Baron [:dbaron] (needinfo? me) from comment #53) > (In reply to Nick Cameron [:nrc] from comment #52) > > (In reply to David Baron [:dbaron] (needinfo? me) from comment #46) > > > Comment on attachment 810245 [details] [diff] [review] > > > Use a mini-flush in PresShell::HandleEvent > > > > > > I like the refactoring to go inside dispatchUsingCoordinates and do this for > > > all events for which dispatchUsingCoordinates istrue. > > > > > > > Do you mean inside WidgetEvent::IsUsingCoordinates? And by 'refactoring' do > > you mean the code for doing flushes? That seems odd to me - > > IsUsingCoordinates is just a tester method at the moment and the code uses a > > member function of nsPresShell. > > I mean I like the change that you made that's already in the patch; no need > to do anything else. Ah, great, nice and easy :-)

Nick Cameron [:nrc]

Assignee

Comment 55

•

12 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/82b31924a8bb https://hg.mozilla.org/integration/mozilla-inbound/rev/87ee0a1865e4

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 56

•

12 years ago

Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/7540c762bb3c and https://hg.mozilla.org/integration/mozilla-inbound/rev/9b2cfe924652 under suspicion of causing the mochitest-metro bustage in https://tbpl.mozilla.org/php/getParsedLog.php?id=29506096&tree=Mozilla-Inbound

Matt Brubeck (:mbrubeck)

Comment 57

•

12 years ago

The mochitest-metro test failure is probably caused by Metro test code that depends on animation timing in incorrect or fragile ways. Let us know if you need any help debugging/fixing the test(s).

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 58

•

12 years ago

Comment on attachment 810245 [details] [diff] [review] Use a mini-flush in PresShell::HandleEvent Hmm, I gave feedback already.

Attachment #810245 - Flags: feedback?(bugs)

Mike Lee [:mlee]

Comment 59

•

12 years ago

Updating Target Milestone for FxOS Perf koi+'s.

Target Milestone: 1.2 C3(Oct25) → 1.2 C4(Nov8)

Nick Cameron [:nrc]

Assignee

Comment 60

•

12 years ago

(In reply to Matt Brubeck (:mbrubeck) from comment #57) > The mochitest-metro test failure is probably caused by Metro test code that > depends on animation timing in incorrect or fragile ways. Let us know if > you need any help debugging/fixing the test(s). Sorry, was distracted by other bugs. So, I pushed a few things to Try, and if we do a full flush rather than a mini-flush then we pass the Metro tests. Obviously, that defeats the purpose of this bug. Nothing else I experimented with solved the problem. I don't see anything obviously timing related in the tests which this should affect, but I don't really understand what is going on. I'd really appreciate some help debugging this, especially as I don't have a Metro device. Any suggestions for things I can throw at Try?

Flags: needinfo?(mbrubeck)

Matt Brubeck (:mbrubeck)

Comment 61

•

12 years ago

Adding jimm who wrote the Metro test in question. The first failure happens here: http://hg.mozilla.org/mozilla-central/file/48dbd532a004/browser/metro/base/tests/mochitest/browser_selection_basic.js#l176 This test uses the TouchDragAndHold helper defined in browser/metro/base/tests/mochitest/head.js, which repeatedly calls EventUtils.synthesizeTouchAtPoint to simulate a "drag" (touchstart, sequence of touchmove, touchend). The drag *should* be targeted at a floating selection "marker" that is part of our browser chrome. The test then does some stuff to wait until our browser chrome has processed the resulting events and messages, and then checks to see that the selection state has been updated correctly. Dragging the selection marker works by setting .left and .top on the marker element, which is an element in a XUL stack. However, the code that controls the marker also caches this position, and the test reads from that cached position (.xPos) rather than from element.left. Is it possible that avoiding reflows means the element hasn't actually been moved to its new position yet, and so it doesn't receive the touch events sent by the test? Should we force a reflow by reading from element.left directly, or maybe through a hack like offsetLeft?

Flags: needinfo?(mbrubeck)

Nick Cameron [:nrc]

Assignee

Comment 62

•

12 years ago

(In reply to Matt Brubeck (:mbrubeck) from comment #61) > Adding jimm who wrote the Metro test in question. > > The first failure happens here: > http://hg.mozilla.org/mozilla-central/file/48dbd532a004/browser/metro/base/ > tests/mochitest/browser_selection_basic.js#l176 > > This test uses the TouchDragAndHold helper defined in > browser/metro/base/tests/mochitest/head.js, which repeatedly calls > EventUtils.synthesizeTouchAtPoint to simulate a "drag" (touchstart, sequence > of touchmove, touchend). The drag *should* be targeted at a floating > selection "marker" that is part of our browser chrome. The test then does > some stuff to wait until our browser chrome has processed the resulting > events and messages, and then checks to see that the selection state has > been updated correctly. > > Dragging the selection marker works by setting .left and .top on the marker > element, which is an element in a XUL stack. However, the code that > controls the marker also caches this position, and the test reads from that > cached position (.xPos) rather than from element.left. Is it possible that > avoiding reflows means the element hasn't actually been moved to its new > position yet, and so it doesn't receive the touch events sent by the test? > Should we force a reflow by reading from element.left directly, or maybe > through a hack like offsetLeft? I tried replacing xPos with _element.left and likewise for yPos, but that did not change the outcome. Which is surprising, I would have thought that would trigger a full flush in its own right.

Nick Cameron [:nrc]

Assignee

Comment 63

•

12 years ago

Try push with that failure: https://tbpl.mozilla.org/?tree=Try&rev=c623465bd870

Jim Mathies [:jimm]

Comment 64

•

12 years ago

(In reply to Nick Cameron [:nrc] from comment #63) > Try push with that failure: > https://tbpl.mozilla.org/?tree=Try&rev=c623465bd870 The first failure is a selection exists test (looking for active selection in the page) which fails. Is there any way your changes could have triggered a reset of selected content? The test checks to be sure there is initial selection before the drag, so something goes wrong after we try to manipulate the selection bounds using the selection monocles. The other thing that seems clear is that the drag of the monocle takes place and triggers code your patch - https://tbpl.mozilla.org/php/getParsedLog.php?id=30164774&tree=Try&full=1#error0 Not sure what 14:13:21 INFO - Would have flushed 14:13:21 INFO - full flush means but we get a slew of these while the drag takes place. I can try running this locally to see. We can also dump a bunch of screen shots if the test failure is slave specific to see what's going on.

Nick Cameron [:nrc]

Assignee

Comment 65

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #64) > (In reply to Nick Cameron [:nrc] from comment #63) > > Try push with that failure: > > https://tbpl.mozilla.org/?tree=Try&rev=c623465bd870 > > The first failure is a selection exists test (looking for active selection > in the page) which fails. Is there any way your changes could have triggered > a reset of selected content? The test checks to be sure there is initial > selection before the drag, so something goes wrong after we try to > manipulate the selection bounds using the selection monocles. > > The other thing that seems clear is that the drag of the monocle takes place > and triggers code your patch - > > https://tbpl.mozilla.org/php/getParsedLog. > php?id=30164774&tree=Try&full=1#error0 > > Not sure what > > 14:13:21 INFO - Would have flushed > 14:13:21 INFO - full flush > Hmm, this is interesting - it means that I posted the wrong patch, well an extra patch actually which reverts part of my change from full to mini flush. So, this patch queue is the one that succeeds, but changing xPos to element.left causes us to fail (seemingly without the changes which caused the same failure before). I'm not exactly sure what this means. Possibly we are (in some way) doing more flushing than before and that upsets the assumptions in the test infrastructure. I'm not sure if accessing the element rather than the cached xPos will trigger more flushing than the touch event would. > means but we get a slew of these while the drag takes place. > > I can try running this locally to see. We can also dump a bunch of screen > shots if the test failure is slave specific to see what's going on. It might be useful to see what is happening on the screen during the test (or just in normal use too, to see if my patches cause real problems or just test problems). I'm pretty sure it is not just one slave, I've seen these errors on multiple try runs now.

Jim Mathies [:jimm]

Comment 66

•

12 years ago

Can you post with a rollup merged to mc tip? The two patches posted here don't apply cleanly.

Jim Mathies [:jimm]

Comment 67

•

12 years ago

This might interfere even more but you can snap some screen shots during the particular test that fails: let snap = SpecialPowers.snapshotWindow(window, false); info(snap.toDataURL()); which will show up as data uris in the logs. Probably best to start by running them locally and see if they fail.

Nick Cameron [:nrc]

Assignee

Comment 68

•

12 years ago

Attached patch rebased rollup patch (obsolete) — Details — Splinter Review

Jim Mathies [:jimm]

Comment 69

•

12 years ago

I'm seeing out of sync touch event delivery here. For example, in the 'expand / collapse selection' test, I see: TEST-PASS | chrome://mochitests/content/metro/browser/metro/base/tests/mochitest/browser_selection_basic.js | selection active TEST-PASS | chrome://mochitests/content/metro/browser/metro/base/tests/mochitest/browser_selection_basic.js | short selection test SelectionHelperUI: touchstart SelectionHandler: Browser:SelectionMoveStart SelectionHelperUI: Content:SelectionRange .. (lots of ranges, expected) SelectionHelperUI: Content:SelectionRange SelectionHelperUI: touchend SelectionHelperUI: Content:SelectionRange SelectionHandler: Browser:SelectionMoveEnd (the ping below post the drag to clear the message manager queue, expected) SelectionHandler: Browser:SelectionHandlerPing SelectionHelperUI: Content:SelectionHandlerPong TEST-PASS | chrome://mochitests/content/metro/browser/metro/base/tests/mochitest/browser_selection_basic.js | long selection test TEST-PASS | chrome://mochitests/content/metro/browser/metro/base/tests/mochitest/browser_selection_basic.js | selection active SelectionHelperUI: touchstart ^^ unexpected, odd touchstart delivered SelectionHandler: Browser:SelectionClose SelectionHelperUI: Content:HandlerShutdown ^^ selection handler shutting down due touchstart TEST-UNEXPECTED-FAIL | chrome://mochitests/content/metro/browser/metro/base/tests/mochitest/browser_selection_basic.js | selection active - Got false, expected true

Jim Mathies [:jimm]

Comment 70

•

12 years ago

The coords are interesting as well. In this particular test, we have a sentence in the page. We select the first work of the sentence, and then using the monocles, drag the selection over to the end of the sentence. At this point we check the selected text of the page to be sure everything worked. Then we drag the monocle back to the first word and check again. We releat that process five times. On the first drag I see a touch start that fits what the test is doing: SelectionHelperUI: touchstart touchstart: 74,37 and at the end I see: SelectionHelperUI: touchend SelectionHelperUI: touchstart touchstart: 449,37

Nick Cameron [:nrc]

Assignee

Comment 71

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #69) > I'm seeing out of sync touch event delivery here. For example, in the > 'expand / collapse selection' test, I see: ... > SelectionHelperUI: touchstart > ^^ unexpected, odd touchstart delivered So if we were setting off a new touchstart where we shouldn't be then we would see this behaviour and then fail?

Nick Cameron [:nrc]

Assignee

Comment 72

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #70) > The coords are interesting as well. In this particular test, we have a > sentence in the page. We select the first work of the sentence, and then > using the monocles, drag the selection over to the end of the sentence. At > this point we check the selected text of the page to be sure everything > worked. Then we drag the monocle back to the first word and check again. We > releat that process five times. > > On the first drag I see a touch start that fits what the test is doing: > SelectionHelperUI: touchstart > touchstart: 74,37 > > and at the end I see: > > SelectionHelperUI: touchend > SelectionHelperUI: touchstart > touchstart: 449,37 Presumably the second touchstart is either at the end of the text or somewhere in between? So the movement five times, that happens all in one touch event? Or each selection is a separate touch? What would you expect to see for this value - the same as at the start? And this time around you didn't see any more touchstarts?

Jim Mathies [:jimm]

Comment 73

•

12 years ago

(In reply to Nick Cameron [:nrc] from comment #72) > (In reply to Jim Mathies [:jimm] from comment #70) > > The coords are interesting as well. In this particular test, we have a > > sentence in the page. We select the first work of the sentence, and then > > using the monocles, drag the selection over to the end of the sentence. At > > this point we check the selected text of the page to be sure everything > > worked. Then we drag the monocle back to the first word and check again. We > > releat that process five times. > > > > On the first drag I see a touch start that fits what the test is doing: > > SelectionHelperUI: touchstart > > touchstart: 74,37 > > > > and at the end I see: > > > > SelectionHelperUI: touchend > > SelectionHelperUI: touchstart > > touchstart: 449,37 > > Presumably the second touchstart is either at the end of the text or > somewhere in between? So the movement five times, that happens all in one > touch event? Or each selection is a separate touch? What would you expect to > see for this value - the same as at the start? And this time around you > didn't see any more touchstarts? I wouldn't expect to see the touchstart after the touchend, it doesn't correlate to what the test harness is doing. Looks like some sort of erroneously generated event. The test harness here sends a touchstart for a single touch point, a bunch of touchmoves with coords that move across the screen, then a touch end. It's simulating a simple touch down, drag, touch up on a monocle. As the monocle is dragged the selection library manipulates the selection underneath using various selection apis we have for setting and manipulating selection in the page. harness: http://mxr.mozilla.org/mozilla-central/source/browser/metro/base/tests/mochitest/head.js#760 test: http://mxr.mozilla.org/mozilla-central/source/browser/metro/base/tests/mochitest/browser_selection_basic.js#135

Jim Mathies [:jimm]

Comment 74

•

12 years ago

Attached file test debug output — Details

So I went in and added a ton of logging to this test and slowed things down a bit. My original assessment that there was a errant touchstart was wrong. What we're seeing here is a failure to drag a monocle. I need to add some more output here to figure out what's going on. Will post back in a bit.

Jim Mathies [:jimm]

Comment 75

•

12 years ago

0:16.77 TEST-INFO | chrome://mochitests/content/metro/browser/metro/base/tests/mochitest/browser_selection_basic.js | ** end drag 0:16.77 SelectionHelperUI: Content:SelectionRange 0:16.77 SelectionHelperUI: Content:SelectionRange 0:16.77 SelectionHandler: Browser:SelectionMoveEnd 0:16.77 SelectionHelperUI: Content:SelectionRange 0:16.77 SelectionHelperUI: Content:SelectionRange 0:16.77 SelectionHandler: Browser:SelectionHandlerPing 0:16.77 SelectionHelperUI: Content:SelectionHandlerPong 0:16.77 TEST-PASS | chrome://mochitests/content/metro/browser/metro/base/tests/mochitest/browser_selection_basic.js | selection active 0:16.77 TEST-PASS | chrome://mochitests/content/metro/browser/metro/base/tests/mochitest/browser_selection_basic.js | drag back selection check 0:16.77 TEST-INFO | chrome://mochitests/content/metro/browser/metro/base/tests/mochitest/browser_selection_basic.js | ** starting new drag 0:16.77 TEST-INFO | chrome://mochitests/content/metro/browser/metro/base/tests/mochitest/browser_selection_basic.js | [0] touchstart 74.69999694824219 x 37.55000305175781 0:16.77 SelectionHelperUI: MozAppbarDismissing 0:16.77 SelectionHelperUI: touchstart 0:16.77 touchstart: 74 37 0:16.77 TEST-INFO | chrome://mochitests/content/metro/browser/metro/base/tests/mochitest/browser_selection_basic.js | [1] touchmove 84.20599700927734 x 37.55000305175781 0:16.77 SelectionHelperUI: touchmove received, checking for content scroll. 0:16.77 TEST-INFO | chrome://mochitests/content/metro/browser/metro/base/tests/mochitest/browser_selection_basic.js | [2] touchmove 93.71199707031249 x 37.55000305175781 0:16.77 SelectionHelperUI: touchmove received, checking for content scroll. 0:16.77 TEST-INFO | chrome://mochitests/content/metro/browser/metro/base/tests/mochitest/browser_selection_basic.js | [3] touchmove 103.21799713134764 x 37.55000305175781 0:16.77 SelectionHelperUI: touchmove received, checking for content scroll. 0:16.77 SelectionHelperUI: content is being scrolled, shutdown selection handling. 0:16.77 closeEditSession 0:16.77 SelectionHandler: Browser:SelectionClose 0:16.77 SelectionHelperUI: Content:HandlerShutdown Looks like when we go to start a new drag, the drag coords don't match the position of the monocle.

Jim Mathies [:jimm]

Comment 76

•

12 years ago

Unfortunately snapping screen shots after every drag fixes the bug.

Jim Mathies [:jimm]

Comment 77

•

12 years ago

Not sure how to proceed. Basically, a round 18x18 image event target (the monocle) is not where it's supposed to be, or it's where it's supposed to be and our tests have been compensating for bad position.

Mike Lee [:mlee]

Updated

•

12 years ago

Target Milestone: 1.2 C4(Nov8) → 1.2 C5(Nov22)

Ben Kelly [:bkelly, not reviewing]

Comment 78

•

12 years ago

Is there anything else we can try here? This patch is very helpful on the b2g FPS scrolling bugs I am working. Is a win8 machine adequate to build and run these tests?

Nick Cameron [:nrc]

Assignee

Comment 79

•

12 years ago

(In reply to Ben Kelly [:bkelly] from comment #78) > Is there anything else we can try here? This patch is very helpful on the > b2g FPS scrolling bugs I am working. Is a win8 machine adequate to build > and run these tests? There is certainly more to do here, just been stuck with some other high priority stuff lately. Will get back on this soon.

Mike Lee [:mlee]

Comment 80

•

12 years ago

Looking to fix this but will need to push this to 1.3 if not fixed by December 1st. 1.2 Triage

Nick Cameron [:nrc]

Assignee

Comment 81

•

12 years ago

We could land this with a b2g ifdef if we want it to make 1.2 for sure. We really shouldn't land this last minute because I think it carries some risk. So (although it makes me pretty sad) I think we should land now with an ifdef, rather than try to fix this and then land. I will try and find a proper solution asap too, but that might take a while. mlee - does this sound sensible? dbaron - does this fly from a technical point of view?

Flags: needinfo?(mlee)

Flags: needinfo?(dbaron)

Target Milestone: 1.2 C5(Nov22) → 1.2 C4(Nov8)

Mike Lee [:mlee]

Comment 82

•

12 years ago

Thanks Nick. I'll wait on dbaron's response. Please provide specifics about the risk involved in with your ifdef proposal. Ben, How dependent are we on this patch for resolving our outstanding FPS regressions?

Flags: needinfo?(ncameron)

Flags: needinfo?(mlee)

Flags: needinfo?(bkelly)

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 83

•

12 years ago

You're proposing making it a B2G ifdef because if you land it everywhere you end up with some metro tests failing?

Ben Kelly [:bkelly, not reviewing]

Comment 84

•

12 years ago

(In reply to Mike Lee [:mlee] from comment #82) > Ben, > How dependent are we on this patch for resolving our outstanding FPS > regressions? Mike, I'm not sure. We are still trying to identify the cause of the regression. In theory this bug is not new, so the regression is unlikely directly related to this bug. It just helps us compensate by minimizing reflows. Nick, I appreciate you trying to accommodate the b2g deadlines and my desire to have this patch. After thinking about it, though, it seems like it might be better to track down the test issue before landing and, if necessary, let it ride the trains in v1.3. Just my opinion. Thanks again!

Flags: needinfo?(bkelly)

Jim Mathies [:jimm]

Comment 85

•

12 years ago

What's the deadline on the 1.2 release/merge/whatever? I'd prefer we not add to our b2g technical debt. I'll find some time to poke at this more today. Maybe nrc can give me a hand via irc.

Ben Kelly [:bkelly, not reviewing]

Comment 86

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #85) > What's the deadline on the 1.2 release/merge/whatever? I'd prefer we not add > to our b2g technical debt. I'll find some time to poke at this more today. > Maybe nrc can give me a hand via irc. Per comment 80, we need to land before Dec 1 in order to get it into v1.2. Thanks!

Jim Mathies [:jimm]

Comment 87

•

12 years ago

Nick, could your work impact how quickly a xul element's visibility is reflected in the dom? Our selection code hides the xul:toolbarbuttons [1] associated with the markers when they are being dragged, then sets them back once the drag completes. We do this via the hidden property on the element [2]. When we test we repeat each drag for a set of iterations. So for example: 1) select a word 2) drag right hand marker to the right expanding selection across a sentence 3) check what text is selected 4) drag right hand marker to the left contracting selection back to a word 5) check what text is selected 6) goto 2 for 4 more iterations In steps 2 & 4, the drag marker gets hidden by our selection controller during the drag. Both steps also assume that when they start the marker is visible and will receive touch events. If I add a delay after 3 & 5, the test succeeds. If I don't sometimes the follow up drag doesn't pick up a marker implying that maybe the marker isn't visible in the dom yet. These are automated tests, so they execute really quickly. [1] http://mxr.mozilla.org/mozilla-central/source/browser/metro/base/content/bindings/selectionoverlay.xml#13 [2] http://mxr.mozilla.org/mozilla-central/source/browser/metro/base/content/helperui/SelectionHelperUI.js#154

Flags: needinfo?(dbaron)

Nick Cameron [:nrc]

Assignee

Comment 88

•

12 years ago

(In reply to David Baron [:dbaron] (needinfo? me) from comment #83) > You're proposing making it a B2G ifdef because if you land it everywhere you > end up with some metro tests failing? Yes (I know, that is kind of evil, but possibly better than a last minute landing of a high-ish risk patch).(In reply to Ben Kelly [:bkelly] from comment #84) > (In reply to Mike Lee [:mlee] from comment #82) > > Ben, > > How dependent are we on this patch for resolving our outstanding FPS > > regressions? > > Nick, I appreciate you trying to accommodate the b2g deadlines and my desire > to have this patch. After thinking about it, though, it seems like it might > be better to track down the test issue before landing and, if necessary, let > it ride the trains in v1.3. > > Just my opinion. Thanks again! OK, fine by me. In fact, better :-) (In reply to Jim Mathies [:jimm] from comment #85) > What's the deadline on the 1.2 release/merge/whatever? I'd prefer we not add > to our b2g technical debt. I'll find some time to poke at this more today. > Maybe nrc can give me a hand via irc. I would much prefer to fix the tests. I'm technically on holiday today, but keen to get stuck in next week.

Flags: needinfo?(ncameron)

Mike Lee [:mlee]

Updated

•

12 years ago

blocking-b2g: koi+ → 1.3+

Whiteboard: [c=handeye p= s= u=1.2] → [c=handeye p= s= u=1.3]

Target Milestone: 1.2 C4(Nov8) → 1.3 Sprint 6 - 12/6

Nick Cameron [:nrc]

Assignee

Comment 89

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #87) > Nick, could your work impact how quickly a xul element's visibility is > reflected in the dom? Our selection code hides the xul:toolbarbuttons [1] > associated with the markers when they are being dragged, then sets them back > once the drag completes. We do this via the hidden property on the element > [2]. When we test we repeat each drag for a set of iterations. So for > example: > > 1) select a word > 2) drag right hand marker to the right expanding selection across a sentence > 3) check what text is selected > 4) drag right hand marker to the left contracting selection back to a word > 5) check what text is selected > 6) goto 2 for 4 more iterations > > In steps 2 & 4, the drag marker gets hidden by our selection controller > during the drag. Both steps also assume that when they start the marker is > visible and will receive touch events. > > If I add a delay after 3 & 5, the test succeeds. If I don't sometimes the > follow up drag doesn't pick up a marker implying that maybe the marker isn't > visible in the dom yet. These are automated tests, so they execute really > quickly. > > [1] > http://mxr.mozilla.org/mozilla-central/source/browser/metro/base/content/ > bindings/selectionoverlay.xml#13 > [2] > http://mxr.mozilla.org/mozilla-central/source/browser/metro/base/content/ > helperui/SelectionHelperUI.js#154 This seems like a reasonable hypothesis. I replace |FlushPendingNotifications| by a flush which only flushes animations and transitions. If the change to the hidden property was previously flushed by |FlushPendingNotifications| then it no longer would be, which would have the effects we observe/you describe. We can confirm this by logging what gets flushed in that FlushPendingNotifications call. I'm not sure how easy that is, but I'll have a go. If it is the case, I'm not sure how to fix it. This doesn't seem like something that should be touch specific like it is currently. I would expect a similar test, but using mouse actions rather than touch, would currently fail. I'm not sure under what circumstances we need to do a full flush and when we can get away with just flushing animations.

Nick Cameron [:nrc]

Assignee

Comment 90

•

12 years ago

Attached patch some pretty basic logging — Details — Splinter Review

jimm or anyone else with a metro setup: could you run the failing tests using a build with this patch please and post the logs? (It needs to build with DEBUG, but on Try the mc testsuite only runs for opt builds. I don't have a local Metro build). Thanks!

Flags: needinfo?(jmathies)

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 91

•

12 years ago

(In reply to Nick Cameron [:nrc] from comment #90) > jimm or anyone else with a metro setup: could you run the failing tests > using a build with this patch please and post the logs? (It needs to build > with DEBUG, but on Try the mc testsuite only runs for opt builds. I don't > have a local Metro build). Thanks! You might be able to use Try despite this, by having the patch stack that you push to try include a change to the build configuration. (You can modify the appropriate {browser,mobile/android,b2g,xulrunner}/config/mozconfigs/... file if you can figure out which is the right one, or just hack configure.in if you can't.)

Jim Mathies [:jimm]

Comment 92

•

12 years ago

(In reply to Nick Cameron [:nrc] from comment #90) > Created attachment 8336631 [details] [diff] [review] > some pretty basic logging > > jimm or anyone else with a metro setup: could you run the failing tests > using a build with this patch please and post the logs? (It needs to build > with DEBUG, but on Try the mc testsuite only runs for opt builds. I don't > have a local Metro build). Thanks! I'll take it for a spin. FWIW, you can include "WinUtil.h" and use the logging functions there to get output on try runs in release builds.

Flags: needinfo?(jmathies)

Jim Mathies [:jimm]

Comment 93

•

12 years ago

Is this logging patch to be applied with the rollup? It doesn't look the same around line 6168 in nsPresShell. Also it doesn't apply cleanly.

Nick Cameron [:nrc]

Assignee

Comment 94

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #93) > Is this logging patch to be applied with the rollup? It doesn't look the > same around line 6168 in nsPresShell. Also it doesn't apply cleanly. No, just on top of trunk. (Its not the logging functions I need from debug builds, its the |List| functions for elements - and they call off into a whole bunch of other functions which are also ifdef'ed). Thanks for giving it a go!

Flags: needinfo?(jmathies)

Mike Lee [:mlee]

Comment 95

•

12 years ago

Jim, have you been able to take a look at this since Nick's reply?

Jim Mathies [:jimm]

Comment 96

•

12 years ago

(In reply to Mike Lee [:mlee] from comment #95) > Jim, have you been able to take a look at this since Nick's reply? No I haven't, sorry. Been busy on fx28 merge related work. I'll get back to this next week.

Flags: needinfo?(jmathies)

Jim Mathies [:jimm]

Comment 97

•

12 years ago

Attached file log 1 — Details

Here you go.

Attachment #8346658 - Flags: feedback?(ncameron)

Nick Cameron [:nrc]

Assignee

Comment 98

•

12 years ago

Comment on attachment 8346658 [details] log 1 Great, thanks!

Attachment #8346658 - Flags: feedback?(ncameron)

Nick Cameron [:nrc]

Assignee

Comment 99

•

12 years ago

This is the interesting bit from the log, we get it seven times over the duration of the test: Before restyling: XUL* xul:toolbarbutton@066035B8 anonid="selectionhandle-mark3" class="selectionhandle" label="^" left="8px" top="25.550003051757812px" state=[10000] flags=[00000a08] primaryframe=00000000 refcount=8<> anonymous-children< XUL* xul:image@0C416B58 class="toolbarbutton-icon" xbl:inherits="validate,src=image,label" label="^" state=[10000] flags=[00000a20] primaryframe=00000000 refcount=5<> XUL* xul:label@0C416BD8 class="toolbarbutton-text" crop="right" flex="1" xbl:inherits="value=label,accesskey,crop" value="^" state=[10000] flags=[00000a20] primaryframe=00000000 refcount=5<> > After restyling: XUL* xul:toolbarbutton@066035B8 anonid="selectionhandle-mark3" class="selectionhandle" label="^" left="8px" top="25.550003051757812px" state=[10000] flags=[00000a08] primaryframe=0689ECB8 refcount=9<> anonymous-children< XUL* xul:image@0C416B58 class="toolbarbutton-icon" xbl:inherits="validate,src=image,label" label="^" state=[10000] flags=[00000a20] primaryframe=0BD9E628 refcount=6<> XUL* xul:label@0C416BD8 class="toolbarbutton-text" crop="right" flex="1" xbl:inherits="value=label,accesskey,crop" value="^" state=[10000] flags=[00000a20] primaryframe=00000000 refcount=6<> > I don't think that it helps confirm or disprove the hypothesis from comment 89 though :-(

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 100

•

12 years ago

(In reply to Jim Mathies [:jimm] from comment #87) > If I add a delay after 3 & 5, the test succeeds. If I don't sometimes the > follow up drag doesn't pick up a marker implying that maybe the marker isn't > visible in the dom yet. These are automated tests, so they execute really > quickly. It seems clear to me that Nick's patch means we don't flush layout or style while handling your synthesized DOM events, so the marker element is still treated as invisible. Can you try inserting a forced layout+style flush instead of a delay? E.g. by calling getBoundingClientRect on the marker element? That should fix it. If that fixes it, I think we have two choices here: actually add those flushes to the test, or modify the event synthesis code (called via nsDOMWindowUtils I guess?) to do the flushes automatically. I slightly prefer the former.

Jason Smith [:jsmith]

Comment 101

•

12 years ago

This was punted from 1.2. Why is this necessary for 1.3?

blocking-b2g: 1.3+ → 1.3?

Ben Kelly [:bkelly, not reviewing]

Comment 102

•

12 years ago

Personally I think it can probably ride the trains, but how do we keep it high priority enough to finally get it landed? Its a nice to have, but my tests with scrolling showed it was very good nice to have in terms of user experience. (Maybe this has changed in impact with APZC, not sure.)

Jim Mathies [:jimm]

Comment 103

•

12 years ago

Nick, can you update 'rebased rollup patch' when you get a chance? I'll take rocs suggestion for a spin.

Flags: needinfo?(ncameron)

Nick Cameron [:nrc]

Assignee

Comment 104

•

12 years ago

Attached patch rebased rollup patch — Details — Splinter Review

Attachment #827685 - Attachment is obsolete: true

Flags: needinfo?(ncameron)

Jim Mathies [:jimm]

Comment 105

•

12 years ago

Attached patch mochitest-metro fix — Details — Splinter Review

This fixed mochitest-metro tests locally. I'd suggest some try runs to confirm. Also, maybe we should add an explicit flushLayout() call to dom window utils for this?

Preeti Raghunath(:Preeti)

Comment 106

•

12 years ago

1.3+ for performance and APZC work

blocking-b2g: 1.3? → 1.3+

Nick Cameron [:nrc]

Assignee

Comment 107

•

12 years ago

Try push: https://tbpl.mozilla.org/?tree=Try&rev=3f58a1fc6ca1 Looks good :-) jimm, could you ask someone appropriate for review on your patch please?

Flags: needinfo?(jmathies)

Matt Brubeck (:mbrubeck)

Updated

•

12 years ago

Attachment #8359201 - Flags: review+

Flags: needinfo?(jmathies)

Nick Cameron [:nrc]

Assignee

Comment 108

•

12 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/1c63f67bbaf1 https://hg.mozilla.org/integration/mozilla-inbound/rev/8aacc8a67472 https://hg.mozilla.org/integration/mozilla-inbound/rev/b07b7fca7f6c

Phil Ringnalda (:philor)

Comment 109

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/1c63f67bbaf1 https://hg.mozilla.org/mozilla-central/rev/8aacc8a67472 https://hg.mozilla.org/mozilla-central/rev/b07b7fca7f6c

Status: ASSIGNED → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Ryan VanderMeulen [:RyanVM]

Comment 110

•

12 years ago

This is going to need a branch-specific patch for Aurora (v1.3) uplift.

status-b2g-v1.3: --- → affected

status-b2g-v1.4: --- → fixed

status-firefox27: --- → wontfix

status-firefox28: --- → affected

status-firefox29: --- → fixed

Flags: needinfo?(ncameron)

Keywords: branch-patch-needed

Target Milestone: 1.3 Sprint 6 - 12/6 → 1.3 C3/1.4 S3(31jan)

Nick Cameron [:nrc]

Assignee

Comment 111

•

12 years ago

https://hg.mozilla.org/releases/mozilla-aurora/rev/306c55f88554 https://hg.mozilla.org/releases/mozilla-aurora/rev/b7a571142fa5 https://hg.mozilla.org/releases/mozilla-aurora/rev/c6e3ddc8d831

status-b2g-v1.3: affected → fixed

status-firefox28: affected → fixed

Flags: needinfo?(ncameron)

Ryan VanderMeulen [:RyanVM]

Updated

•

12 years ago

Keywords: branch-patch-needed

Alice0775 White

Updated

•

12 years ago

Depends on: 969222

Ryan VanderMeulen [:RyanVM]

Updated

•

11 years ago

status-b2g-v1.3T: --- → fixed

testcase for desktop (requires enabling OMTC and OMTA) 12 years ago Nick Cameron [:nrc] 1.03 KB, text/html		Details
Use a mini-flush in PresShell::HandleEvent 12 years ago Nick Cameron [:nrc] 2.45 KB, patch	dbaron : review+	Details \| Diff \| Splinter Review
Add mini-flush for animations 12 years ago Nick Cameron [:nrc] 41.76 KB, patch	dbaron : review+	Details \| Diff \| Splinter Review
rebased rollup patch 12 years ago Nick Cameron [:nrc] 43.42 KB, patch		Details \| Diff \| Splinter Review
test debug output 12 years ago Jim Mathies [:jimm] 61.83 KB, text/plain		Details
some pretty basic logging 12 years ago Nick Cameron [:nrc] 4.13 KB, patch		Details \| Diff \| Splinter Review
log 1 12 years ago Jim Mathies [:jimm] 26.19 KB, text/plain		Details
rebased rollup patch 12 years ago Nick Cameron [:nrc] 43.32 KB, patch		Details \| Diff \| Splinter Review
mochitest-metro fix 12 years ago Jim Mathies [:jimm] 1.95 KB, patch	mbrubeck : review+	Details \| Diff \| Splinter Review