Closed Bug 1075670 Opened 5 years ago Closed 5 years ago

[e10s] event.screenX and event.screenY is wrong

Categories

(Core :: DOM: Core & HTML, defect)

x86_64
Windows 7
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla40
Tracking Status
e10s m4+ ---
firefox39 --- fixed
firefox40 --- fixed

People

(Reporter: alice0775, Assigned: handyman)

References

Details

Attachments

(14 files, 41 obsolete files)

158 bytes, text/html
Details
151.90 KB, image/jpeg
Details
402 bytes, text/html
Details
256.11 KB, text/plain
Details
5.27 KB, patch
Details | Diff | Splinter Review
9.57 KB, patch
Details | Diff | Splinter Review
1.68 KB, patch
Details | Diff | Splinter Review
6.99 KB, patch
Details | Diff | Splinter Review
7.39 KB, patch
Details | Diff | Splinter Review
4.95 KB, patch
Details | Diff | Splinter Review
1.22 KB, patch
Details | Diff | Splinter Review
5.04 KB, patch
smaug
: review+
Details | Diff | Splinter Review
1.59 KB, patch
Details | Diff | Splinter Review
32.26 KB, patch
billm
: review+
jlorenzo
: qa-approval+
Details | Diff | Splinter Review
Steps To Reproduce:
1. Start e10s
2. Open ScratchPad and make sure chrome privilege.
3. Pase the following code to ScratchPad

var script = 'data:application/javascript,' + encodeURIComponent('addEventListener("mouseup", function(event) {sendSyncMessage("test_mouseup", {}, {event: event}); }, false);');

window.messageManager.loadFrameScript(script, true);

window.messageManager.addMessageListener("test_mouseup",
    function(msg){handleEvent(msg.objects.event)}
);

function handleEvent(event) {
  alert(event.screenX + ", " + event.screenY);
}

4. Click on anywhere of contents area

Actual Results:
event.screenX and event.screenY is not screen coordinates.

Expected Results:
event.screenX and event.screenY should be screen coordinates.
Err
> 3. Pase the following code to ScratchPad

  3. Paste the following code to ScratchPad
Not really events related. Events just use the coordinates from widget + offset to layout stuff.
Are we not updating WidgetToScreenOffset() in the child process correctly?
Component: DOM: Events → DOM
Attached file event.screenXY.html
Simple STR
1. Start e10s
2. Open attached html
3. Click contents area
Assignee: nobody → davidp99
Attached image windowoffset.jpg
I assume this is what's been trashing my mouse gestures.
I drew out some data I was getting back from mousemove events screenX and screenY - red line is the raw points, green line is an approximation of the actual movement, got from throwing out large deltas.

The offset of the bad co-ordinates seems linked to the content area's coordinates on the screen - ie, the closer the browser window is to the top left of the screen, the smaller the margin of error.

Hope that's new/useful information. If you already knew this, please do enjoy the pretty colours.
Thanks all.  Alice's test and the smeared image make it pretty clear what was happening.  This patch uses a different widget function to generate screen coordinates.  The patch uses some of my work in bug 1044182, which is part of bug 1092630, so the patch would need to wait for that to be checked in to land.

The only interesting issue here is WidgetToScreenOffset, which doesn't report actual screen coordinates in remote processes.  This is typically correct as remote tabs consider themselves to be located at the origin and the main process 'repairs' them when it gets messages.  However, when directly talking to page documents and plugins, the content process has to do this itself (pages and plugins should and do assume no knowledge of main-process/content-process differences).  So we need to handle that specially.  RealWidgetToScreenOffset is created for that purpose.  Its use should be contained to these two cases (ie page content and plugins).
(Posted the wrong bug's patch.)
Attachment #8520320 - Attachment is obsolete: true
Blocks: 1096030
Blocks: 1103458
I don't see how the patch would fix the issue
(nor it does locally). Better to use the testcase in non-maximized window to see the difference to
non-e10s behavior.
The issue is that if the window is just moved, not resized, TabChild doesn't get any messages about the move.
(In reply to David Parks [:handyman] from comment #5)
> The only interesting issue here is WidgetToScreenOffset, which doesn't
> report actual screen coordinates in remote processes.  This is typically
> correct
I don't actually see this ever being correct.


> as remote tabs consider themselves to be located at the origin and
> the main process 'repairs' them when it gets messages.
Sounds like we have some cases to fix.
WidgetToScreenOffset should really be about screen offset, not about something else.
It is used in many cases and currently e10s gets lots of them wrong.
Update: smaug's comment 10 has proven a bit tricky.  It'll need a bit more work.  First, some of the wiring is missing in the current code -- resize and move behavior are handled quite differently.  nsWebShellWindow::WindowMoved should notify the docShell, similar to how nsWebShellWindow::WindowResized works.

From there, window-resize updates the TabParents indirectly through reflow callbacks.  We obviously dont reflow on a window move... the TabParents need to be fished out of the DOM by finding the nsSubdocumentFrames.  At least, that's my plan.

Additionally, nsCocoaWindow::ReportMoveEvent (and ReportSizeEvent) are operating in the wrong coordinates.  With move events, this leads to HiDPI coordinates in Gecko (bad).  (mBounds should be scaled by 1/BackingScaleFactor().)  This would not be a simple fix -- its a pretty central thread.  Just fixing 'move' by itself, however, is probably reasonable (since it isn't quite right now anyway).
Updating patch for bitrot.
Attachment #8520762 - Attachment is obsolete: true
Created the "MozUpdateWindowPos" chrome event to notify tabs of a window move.  Tabs register for this event on the windowroot of their content.
These two patches seem to fix the bug but I'm still looking into test failures:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=e714830824e1
Quick update: The Linux e10s mochitest-1 failure appears to be Linux-only (although TBPL doesn't run the test elsewhere, they pass for me on mac and win).
Turns out window.mozInnerScreenX/Y had the same math issue.  When they were both out of sync, the tests (incorrectly) pass.  With both fixed, the tests are green again.

TBPL: https://treeherder.mozilla.org/#/jobs?repo=try&revision=c73ab5655a28
Attachment #8544708 - Attachment is obsolete: true
Attachment #8544713 - Flags: review?(bugs)
Attachment #8545345 - Flags: review?(bugs)
Comment on attachment 8545345 [details] [diff] [review]
1. Properly report screen coords to Javascript, even with a PuppetWidget

We should fix puppet widget to not have fake screen coordinates.
Otherwise this will fix just a small subset of the issues and we'll
need to go through all the other cases.

I have no idea why PuppetWidget has fake screen coordinates.
Attachment #8545345 - Flags: review?(bugs) → review-
Comment on attachment 8544713 [details] [diff] [review]
2. Update window position in TabParent when window widget is moved

> TabParent::SetOwnerElement(Element* aElement)
> {
>+  // If we held previous content then unregister for its events.
>+  nsCOMPtr<nsIContent> frame = do_QueryInterface(mFrameElement);
This is useless. Element inherits nsIContent. So just use mFrameElement everywhere.

>+  if (frame && frame->OwnerDoc() && frame->OwnerDoc()->GetWindow()) {
OwnerDoc() never ever returns null. In general in DOM code only Get* methods may return null.


>+    nsCOMPtr<nsPIDOMWindow> window = do_QueryInterface(frame->OwnerDoc()->GetWindow());
No need for the do_QueryInterface.

>+  if (frame && frame->OwnerDoc() && frame->OwnerDoc()->GetWindow()) {
ditto about OwnerDoc()

>+    nsCOMPtr<nsPIDOMWindow> window = do_QueryInterface(frame->OwnerDoc()->GetWindow());
useless QI
Attachment #8544713 - Flags: review?(bugs) → review+
Fixes the calculations by changing the PuppetWidget behavior, as suggested.  Also repairs mac plugin calculations.

To see the effect of the plugin coordinate change, use the test plugin in bug 1044182 (save the plugin to a file and then open it locally... I dealt with this by comparing values in an e10s window vs. a non-e10s window).

Try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=bf83020a099d
Attachment #8545345 - Attachment is obsolete: true
Attachment #8554684 - Flags: review?(bugs)
with smaug's changes in comment 18.
Attachment #8544713 - Attachment is obsolete: true
You don't need to change TabParent::MapEventCoordinatesForChildProcess?
You mean because it does the adaptation that shifts event coordinates to be widget-relative?  I initially thought I would... but no.  The deal is that the content process event coordinate stuff circumvents the usual coordinate adaptation, using refPoints to establish a screen-relative position.  The refPoints are based on 'real' screen coordinates (not content-process screen coordinates) so they produce the correct results -- they are the position of the tab's origin, in actual screen coordinates.  You can see what it does with, for example, mouse position, by running the testcase in this bug.
Ah. This code has been so nicely inconsistent :/
Waiting for tryserver results before reviewing. Especially b2g results will be interesting.
(In reply to Olli Pettay [:smaug] from comment #23)
> Ah. This code has been so nicely inconsistent :/

Indeed.  I imagine the event-refPoint-for-content-proc stuff could be ripped out with this change (the normal math should work now... I'm pretty sure) but that can be a job for another day.
Comment on attachment 8554684 [details] [diff] [review]
1. Make PuppetWidget::WidgetToScreenOffset report proper screen coords

"// This is actually relative to window-chrome."
Could you fix that comment in plugins code


>   switch (sourceSpace) {
>     case NPCoordinateSpacePlugin:
>-      screenPoint = sourcePoint + pluginFrame->GetContentRectRelativeToSelf().TopLeft() +
>-        chromeSize + pluginPosition + windowPosition;
>+      screenPoint = sourcePoint + pluginPosition;
I can understand removal of chromeSize and windowPosition, but why also
pluginFrame->GetContentRectRelativeToSelf().TopLeft()


>   switch (destSpace) {
>     case NPCoordinateSpacePlugin:
>-      destPoint = screenPoint - pluginFrame->GetContentRectRelativeToSelf().TopLeft() -
>-        chromeSize - pluginPosition - windowPosition;
>+      destPoint = screenPoint - pluginPosition;
ditto
Note, I'm not familiar with that code.


>diff --git a/widget/PuppetWidget.h b/widget/PuppetWidget.h
This seems to be ok.
smaug had asked about the contentRect calculation in plugin coord code.  This test case shows the failure if it is missing -- the plugin's 100px border isn't counted properly without it.  This is just the testcase from bug 1044182 with a 100px border added.
NIing smaug with no question -- since he was reviewing this but requested a plugin reviewer for the nsPluginInstanceOwner code.

The plugin screen values can be tested with the test case in the previous comment.  The big picture is that the change to PuppetWidget::WidgetToScreenOffset in this patch changes the calculations in ConvertPointPuppet since the pluginFrame coordinate system is different (i.e. it's 'top-level' coordinate system is no longer tab-relative but is genuinely screen relative).
Attachment #8554684 - Attachment is obsolete: true
Attachment #8554684 - Flags: review?(bugs)
Flags: needinfo?(bugs)
Attachment #8554861 - Flags: review?(joshmoz)
Attachment #8554861 - Flags: review?(joshmoz) → review+
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/94a2a4a7c84c
https://hg.mozilla.org/mozilla-central/rev/47de36ef3ab4
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
Looks like this is still a bit wrong. We're getting too small X coordinates.
Flags: needinfo?(bugs)
Depends on: 1126902
This is breaking the context menu and tooltips in nightly. We should back it out if it won't be fixed by tonight. Do you think you can fix it by then?
Flags: needinfo?(davidp99)
I think context menu is broken (also) because its implementation relies on wrong coordinates.
Tooltips and context menus are indeed broken.  Tooltips were using non-HiDPI-capable math... but the WidgetToContent displacement was always (0,0) before, so it didn't show up.
Removed mapScreenCoordinatesFromContent, since content and chrome now use same screen coords.
Comment on attachment 8556205 [details] [diff] [review]
4. Remove mapScreenCoordinatesFromContent as it is no longer needed

Review of attachment 8556205 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, this looks great.

::: toolkit/content/widgets/browser.xml
@@ +872,1 @@
>                this.startScroll(data.scrolldir, pos.x, pos.y);

Please just pass data.screenX/Y directly into startScroll.
Attachment #8556205 - Flags: review+
These patches basically fix the problems I was seeing. However, the Y coordinates are still off by about 20 or 30 pixels. This might be a Linux-specific issue. It sounds like this is bug 1126902.

I'd recommend asking smaug for review on the tooltip patch. Maybe he can also help with bug 1126902. Then I think we can re-land.

I'm not sure how to test this stuff. Somehow we'd need to be able to find the coordinates that the context menu or the autocomplete or tooltip popups are shown.
https://hg.mozilla.org/mozilla-central/rev/1b4d06590778
https://hg.mozilla.org/mozilla-central/rev/a0e5f5eee758
Status: REOPENED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 1127134
Duplicate of this bug: 1127262
This doesn't actually change any behavior -- its just removal of code that is made even more redundant by the patches in this bug.
The previous patches exposed some holes in the math that affects Linux behavior.  This patch does two things: 

1) fixes popup calcs to be client-relative (this only really matters on Linux but could have an effect on Windows -- its been tested on Windows as well -- other platforms do not need to distinguish between widget and widget-client relative).  

2) fixes GTKs nsWindow::WidgetToScreenOffset to use the correct (ie screen-relative, not parent-widget relative) GTK function.  This makes a difference to tooltips.
de-bitrot
Attachment #8556187 - Attachment is obsolete: true
Attachment #8559453 - Flags: review?(bugs)
de-bitrot
Attachment #8557387 - Attachment is obsolete: true
Attachment #8559455 - Flags: review?(bugs)
PuppetWidget didn't compensate for the case where the OS had a titlebar or other chrome that made nsWindow::GetClientOffset something other than the origin.  This didn't matter when the PuppetWidget content was defined to be at the screen origin (since content displacement was always 0,0).

The PuppetWidget position now "translates past" the client offset.  The PuppetWidget itself therefore has no client offset.

Tests:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=332655649907
Attachment #8557393 - Attachment is obsolete: true
Attachment #8559497 - Flags: review?(bugs)
Comment on attachment 8559453 [details] [diff] [review]
3. Repair tooltip math for HiDPI displays

Oh, somewhat old code from bug 171229
Attachment #8559453 - Flags: review?(bugs) → review+
(In reply to Olli Pettay [:smaug] (review overload. No more review requests before Feb 8, please) from comment #49)
> Comment on attachment 8559453 [details] [diff] [review]
Please test cases when content page is zoomed-in/out
Comment on attachment 8559455 [details] [diff] [review]
5. Clean up redundant code in chrome proc related to content offset

Ah, yes, I was wondering why GetChromeDisplacement was added in bug 1092630.
Looked like a hack.
Attachment #8559455 - Flags: review?(bugs) → review+
Comment on attachment 8559497 [details] [diff] [review]
6. Make PuppetWidget respect ClientOffset

(I wouldn't be surprised if some coordinate handling tweaks will be still need in some cases when dealing with some ancient code like embedding/* but such cases should be handled in followup bugs.)
Attachment #8559497 - Flags: review?(bugs) → review+
Looks good.  Tests are in comment 38.  All 6 patches are there.
Keywords: checkin-needed
Part 1 failed to apply cleanly: 

renamed 1075670 -> final.1.patch
applying final.1.patch
patching file widget/PuppetWidget.h
Hunk #2 FAILED at 119
1 out of 2 hunks FAILED -- saving rejects to file widget/PuppetWidget.h.rej
patch failed, unable to continue (try -v)
patch failed, rejects left in working dir
errors during apply, please fix and refresh final.1.patch

David, could you take a look, thanks!
Flags: needinfo?(davidp99)
Keywords: checkin-needed
Fixing bitrot
Attachment #8559450 - Attachment is obsolete: true
Flags: needinfo?(davidp99)
I'd suggest a new set of patches merged to mc tip and pushed to try. :) Also, the commit info in the current set was missing reviewer info.
de-bitrot
Attachment #8562367 - Attachment is obsolete: true
Flags: needinfo?(davidp99)
de-bitrot
Attachment #8559453 - Attachment is obsolete: true
de-bitrot
Attachment #8559455 - Attachment is obsolete: true
de-bitrot
Attachment #8559497 - Attachment is obsolete: true
Depends on: 1133518
Hi David,
With these patches, we can't see the bluetooth setting page.
Please check https://bugzilla.mozilla.org/show_bug.cgi?id=1133518
Flags: needinfo?(davidp99)
If it helps, I have these patches reverted on my github branch (and verified it fixes the issues this bug caused on b2g): https://github.com/Cwiiis/gecko-dev/tree/patches-i-want
Backed out directly on m-c using the backout patch gsvelto created.

https://hg.mozilla.org/mozilla-central/rev/4bb425001d8a
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
...on it.
Flags: needinfo?(davidp99)
On Flame, the chrome process offset is (0,45).  It was being calculated as (-480, 45).  This undoes the change that switched to the bad math.
Attachment #8567320 - Flags: review?(bugs)
de-bitrot.  r+ed in comment 52.
Attachment #8563530 - Attachment is obsolete: true
(In reply to David Parks [:handyman] from comment #72)
> Created attachment 8567320 [details] [diff] [review]
> 7. Change function used to calculate chrome process offset
> 
> On Flame, the chrome process offset is (0,45).  It was being calculated as
> (-480, 45).  This undoes the change that switched to the bad math.

Why do we get 480? GetChildProcessOffset() shouldn't return that (well, 480). Or is the iframe
actually placed at (480, 45)?
Comment on attachment 8567320 [details] [diff] [review]
7. Change function used to calculate chrome process offset

>+static nsIntPoint
>+GetChromeDisplacement(nsFrameLoader *aFrameLoader)
>+{
>+  if (!aFrameLoader)
>+    return nsIntPoint();
always {} with 'if'


But I guess we can take this for now, and figure out the issue in a followup...however, I'm worried that we have then wrong screenX/Y on b2g.
Attachment #8567320 - Flags: review?(bugs) → review+
I don't know much about how B2G defines 'browser' (in the PBrowser sense) but the situation with the Bluetooth display was that the home screen and the bluetooth app are the same PBrowser instance but with different primary frames.  Somehow, the primary frame in the Bluetooth case is translated... but the resulting touch positions (which use this math) are properly handled.  I took that to mean that GetChildProcessOffset defines something slightly different in B2G.  But maybe its just a number of bugs.  This would be good to know.
Attachment #8567320 - Attachment is obsolete: true
Lets try this again...
Keywords: checkin-needed
(In reply to David Parks [:handyman] from comment #80)
> but the situation with the Bluetooth display was that the home screen and
> the bluetooth app are the same PBrowser instance but with different primary
> frames.
That makes no sense. You have PBrowser (well, TabChild) for top level child process Window objects.
Fabrice, could you perhaps comment on comment 80.
Flags: needinfo?(fabrice)
The homescreen and the bluetooth app are not in the same process, so I'm not sure how comment 80 is relevant. Some of the bluetooth functionality is actually handled by the system app, so not even OOP.
Flags: needinfo?(fabrice)
de-bitrot
Attachment #8567353 - Attachment is obsolete: true
Added NULL check for PrimaryFrameOfOwningContent.
Attachment #8568167 - Attachment is obsolete: true
Comment on attachment 8568681 [details] [diff] [review]
7. Change function used to calculate chrome process offset

Tests are looking good.

smaug, the only change to the patch is the null check of contentFrame, which caused the earlier Linux bc1 failures.
Attachment #8568681 - Flags: review?(bugs)
Attachment #8568681 - Flags: review?(bugs) → review+
Keywords: checkin-needed
Backed out for causing bug 1139010. That's twice this has been backed out for on-device regressions now. Please don't request checkin again until this has been verified to work correctly on a real device. Regression tests would sweeten the deal.

https://hg.mozilla.org/mozilla-central/rev/44bcd21e59fe
Status: RESOLVED → REOPENED
Depends on: 1139010
Resolution: FIXED → ---
Target Milestone: mozilla39 → ---
I would like to request that we have a QA run of the patches before landing next time around please.  I'm willing to help facilitate that to occur.  Please needinfo me before landing.
This resolves the B2G issues from before.  It does not fix screenX/screenY on B2G -- there is a serious problem with -moz-transform-origin when used with -moz-transform: scale(2) which makes this hard and a bigger problem with gonk touch coordinate calculations that I'm still trying to track down.  I've dropped the old patch #7 (Change function used to calculate chrome process offset) since this now works with the bad values that come from that function.  But the bugs in that function are the direct result of the -moz-transform-origin issue and are half of the reason the B2G screen coords are still screwed up.
Attachment #8568681 - Attachment is obsolete: true
Depends on: 936092
Blocks: 936092
No longer depends on: 936092
Comment on attachment 8576356 [details] [diff] [review]
7. Make composition bounds calculation widget-relative

We are going to try to go with this temporarily, despite event.screenX/Y still being incorrect on Flame.  This patch should fix any B2G regressions from the previous patches.

kats, you worked on related code in bug 1043644 so you might have some insight here.  The gist of the patches in this bug is that PuppetWidget is no longer defined to be positioned at the origin and some math that assumed it was needed to be fixed.  You might also have an idea whether or not this change should also be applied to nsLayoutUtils::CalculateCompositionSizeForFrame and nsLayoutUtils::CalculateRootCompositionSize (which I speculate based solely on your patches in bug 1043644).
Attachment #8576356 - Flags: review?(bugmail.mozilla)
Comment on attachment 8576356 [details] [diff] [review]
7. Make composition bounds calculation widget-relative

Review of attachment 8576356 [details] [diff] [review]:
-----------------------------------------------------------------

tn and/or botond would probably be better reviewers for this, they should be able to answer your questions.
Attachment #8576356 - Flags: review?(tnikkel)
Attachment #8576356 - Flags: review?(bugmail.mozilla)
Attachment #8576356 - Flags: review?(botond)
Comment on attachment 8576356 [details] [diff] [review]
7. Make composition bounds calculation widget-relative

Review of attachment 8576356 [details] [diff] [review]:
-----------------------------------------------------------------

I'll defer to Timothy for the review, but to answer your question about whether or not this change should be applied to nsLayoutUtils::CalculateCompositionSizeForFrame and nsLayoutUtils::CalculateRootCompositionSize: I don't believe so, because those functions return a size rather than a rect, while this change only affects the origin of the rect.
Attachment #8576356 - Flags: review?(botond)
Attachment #8576356 - Flags: review?(tnikkel) → review+
Great, thanks kats, botond and tn.  nhirata, you wanted to play with this before it was checked in.  Note that the STR in comment 3 (no actual test for this) still will not give the right screen values for events on B2G but there should not be regressions.
Flags: needinfo?(nhirata.bugzilla)
Hi David, do I just need to apply the one patch?  Or do I need to apply all the attached patches?
Flags: needinfo?(davidp99)
You'll need all seven -- they build on one another so they need to be applied in numerical order.
Flags: needinfo?(davidp99)
de-bitrot all
Attachment #8563525 - Attachment is obsolete: true
de-bitrot all
Attachment #8568680 - Attachment is obsolete: true
de-bitrot all.  r+ed at comment 98
Attachment #8576356 - Attachment is obsolete: true
Kats, can you help david figure this out? There seems to be something wrong with the b2g coordinate system and APZC
Flags: needinfo?(bugmail.mozilla)
Attached file B2G crashstack
Naoki's crash stack
Needinfo Blake to ask Naoki what's going on here. There doesn't seem to be any evidence that this crash (which appears to be related to a bad event somehow) is related to David's patch. We also don't know how common it is. David did a lot of manual testing and didn't reproduce anything.

Without further information, I'd like to land this patch tomorrow.
Flags: needinfo?(mrbkap)
Clearing ni as it seems things are under control here and the patches will be landing. If there are still issues that you are having trouble with I'd be happy to help but please provide specific scenarios of bad behavior that i can reproduce with these patches applied.
Flags: needinfo?(bugmail.mozilla)
de-bitrot
Attachment #8578900 - Attachment is obsolete: true
de-bitrot.

Tests: https://treeherder.mozilla.org/#/jobs?repo=try&revision=c6f316bcd64c

NI to billm for potential submission.
Attachment #8578907 - Attachment is obsolete: true
Flags: needinfo?(wmccloskey)
Attached patch patchSplinter Review
The try push was not looking good and one of the failures was in a test I recently added so I took a look. Some of this stuff probably got broken by bug 1126089.

1. Because of the changes from bug 1126089, we have to be careful not to null out mFrameElement too soon because then the message manager won't work. That's why the unload.js test was failing.
2. We were also crashing b2g reftests in some cases because ActorDestroy assumes mFrameElement is non-null.
3. I think we also were not always unregistering the event listener in SetOwnerElement correctly. We have to make sure to do it while mFrameElement is still part of the document. Otherwise we don't have a way to get to the window.

Olli, this patch applies on top of part 2 (which doesn't seem to be titled correctly).

Here's a new try push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9f0026c74d05

Please ignore if the crashes aren't fixed.
Flags: needinfo?(wmccloskey)
Attachment #8583595 - Flags: review?(bugs)
Attachment #8583595 - Flags: review?(bugs) → review+
The try push looks good. There are some consistent b2g failures, but they're failing consistently in the m-c push this was based on. I'll land when the tree re-opens (so, like, next year).

I suspect the crash here may have been fixed by the patch I posted, so I don't think we need to look into that any more.
Flags: needinfo?(nhirata.bugzilla)
Flags: needinfo?(mrbkap)
I haven't crashed on the patch that Bill landed ( tested on the mozilla-inbound build 20150326204640
for flame-kk-eng

Now that I think about it, I should probably test Nexus 5 L as well.
https://hg.mozilla.org/mozilla-central/rev/02fabf032cbe
Status: REOPENED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
I think we need to back this patch out, since it seems to break the FirefoxOS Gallery app. (See bug 1149183)

Comment #109 says there seems to be something wrong with the B2G coordinate system. And that's what I'm seeing. With this patch landed, I'm getting touch events that contain Touch objects that have negative values for screenY or sometimes for screenX on my Flame device running nightly.

The Gallery app is the only one in Gaia that uses screenX and screenY, so it is broken by this patch.  I should probably modify Gallery to use clientX and clientY instead, but, negative screen coordinates for touch events seem pretty clearly wrong, so let's revert this for now.

Here are some things I notice that might help to debug this:

When I get a touch event with a negative screenY, I find that clientY - screenY is always equal to 569 or 570.  This is on a Flame, where the screen height in CSS pixels is 569.5.  

Once, when I tested, I was seeing positive screenY values that were the same as clientY, but in that case, the screenX values were negative and clientX - screenX was always equal to 320, which is the width of the screen in CSS pixels.  During this test, when I was getting negative screenX values, I tried rotating the device into landscape orientation. When I did that clientX - screenX became 569 or 570, matching the new screen width.
I'm attaching the small gaia patch I used to diagnose the negative screenX/screenY coordinate issue.

This adds some console.log() statements to shared/js/gesture_detector.js and to the Gallery app.

The logging output for the gallery output is specific to bug 1149183. It prints touch event coordinates (via the gesture detector) when the user tries to drag the crop handles in the gallery app editing mode.

The logging output in the gesture detector prints stuff out any time it sees negative screen coordinates, especially when panning. This output demonstrates that the bug is not specific to the gallery app's cropping code.  You can see the same thing when you drag the exposure slider in the image editor. And you can see the same thing if you take a photo with the camera, preview it in the camera, and then zoom in (with a pinch) and pan around with a single finger drag. All of those other things still work because they don't use the screenX and screenY coordinates, but this logging patch demonstrates that the negative coordinates are occurring in those cases too.
David, can the b2g team give the e10s team a hand in diagnosing and fixing this? This bug blocks a series of patches for e10s that have been waiting to land. The e10s team has little experience with debugging b2g, so it would be great if the b2g team could lend us a hand in figuring out where we're going wrong in this patch.
Flags: needinfo?(dflanagan)
Maybe kats can help now that we have something more concrete that's broken.
Flags: needinfo?(bugmail.mozilla)
The test that is attached to this bug also shows this fairly well.  By clicking around the screen and looking at the coordinates, you can see that it establishes a pretty non-linear coordinate system.  I believe I'd traced this to geometry that establishes coordinate system axes for widget touch input but I'm not 100% on that.
Depends on: 1149202
(In reply to Jim Mathies [:jimm] from comment #128)
> David, can the b2g team give the e10s team a hand in diagnosing and fixing
> this? This bug blocks a series of patches for e10s that have been waiting to
> land. The e10s team has little experience with debugging b2g, so it would be
> great if the b2g team could lend us a hand in figuring out where we're going
> wrong in this patch.

Jim: that's a totally reasonable request. I work on media apps, however, so noone on my team has any expertise in the APZ or events code. Last time I was involved in bug like this, Kats fixed it in no time, so let's ask him :-)  Kats: can you work on this or can you suggest someone who can?
Flags: needinfo?(dflanagan)
(In reply to Bill McCloskey (:billm) from comment #129)
> Maybe kats can help now that we have something more concrete that's broken.

Yeah I'll take a look. I can repro bug 1149183 on my local build (latest m-c code which doesn't have the backout) and will debug from there.
Flags: needinfo?(bugmail.mozilla)
My initial investigation for the gallery app case seems to indicate that when the TabParent::SendUpdateDimensions call is issued, the chromeOffset [1] is y=-854 (on Flame). This chromeOffset gets stored in the TabChild and used when computing the WidgetToScreen offset, resulting in the negative screen coordinates. The chromeoffset is y=-854 initially I believe because of the CSS transform at [2]. This transform is removed once the app is open and stable, but the chromeOffset is never updated and retains the stale value from app initialization.

Continuing to dig...

[1] http://mxr.mozilla.org/mozilla-central/source/dom/ipc/TabParent.cpp?rev=8bc7561d7557#933
[2] https://github.com/mozilla-b2g/gaia/blob/ee1fba22db03606b8c01fdf947e3746a25a2da46/apps/system/style/window.css#L552
If I remove the guard at http://mxr.mozilla.org/mozilla-central/source/dom/ipc/TabParent.cpp?rev=8bc7561d7557#919 it makes things work correctly. The guard doesn't account for the chromeOffset value changing
I threw up a patch for review on bug 1149183. I suggest waiting until that lands, and then re-landing these patches.
In the same way Naoki requested in comment 94, I would like to that we have an on-device automation run before the next landing. I can help with that. Please needinfo me before landing.
This is the final patch including the fix from kats in bug 1149183. There's a try run here:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=22e742e9fab8

I can understand the concern about this patch, but this is a very high priority for us, so please try to get the automation run done soon.
Flags: needinfo?(jlorenzo)
Attachment #8586346 - Flags: review+
Gaia UI smoketests are currently running against 3e6e4cfe95801ef207c5d8287bdf115f65f72949 + attachment 8586346 [details] [diff] [review]. Leaving the NI to announce the results.
Depends on: 1149167
Comment on attachment 8586346 [details] [diff] [review]
patch for checkin

This patch looks good from the smoketests standpoint even though, I encountered a couple of issues while them. I made sure this wasn't related to the patch. Here are the details below:

=====================
Smoketest results

passed: 40
failed: 6

FAILED TESTS
-------
test_browser_play_video.py => The browser keeps loading. I repro'd the issue on 20150401010204 which passed in the lab[1]. There is likely something wrong in my set up.

test_homescreen_delete_app.py => I built a userdebug build due to the issue described in bug 1150009. The test fails to install the app during the setUp() phase. This is not related to this patch, I got the issue on a userdebug build which doesn't contain the patch.
test_homescreen_delete_app_packaged.py => Idem
test_homescreen_launch_app.py => Idem
test_homescreen_launch_app_packaged.py => Idem

test_music_artist_mp3.py => Due to bug 1149886

[1] http://jenkins1.qa.scl3.mozilla.com/view/UI/job/flame-kk-319.mozilla-central.nightly.ui.functional.smoke/134/HTML_Report/
Flags: needinfo?(jlorenzo)
Attachment #8586346 - Flags: qa-approval+
https://hg.mozilla.org/mozilla-central/rev/0160303649ae
Status: REOPENED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → FIXED
Target Milestone: mozilla39 → mozilla40
Depends on: 1178591
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.