Closed Bug 892994 Opened 11 years ago Closed 11 years ago

Australis menu panel runs away when hovering subview items on hidpi display

Categories

(Core :: Widget: Cocoa, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla25

People

(Reporter: mconley, Assigned: tnikkel)

References

(Blocks 1 open bug)

Details

(Whiteboard: [Australis:M?][Australis:P1])

Attachments

(1 file, 3 obsolete files)

Demonstration of bug: http://www.youtube.com/watch?v=26qq-ZbzyAw&feature=youtube_gdata Not sure what is happening here, but at a glance Mark Hammond suggested it might be a rounding error in some panel code. Marking P1 because I really don't think we'd ship this.
Pretty intimidated by this bug, since it'll likely run the gamut of Panel + Layout with that special Retina flavour, but I'll give it a shot.
Tentatively assigning myself, but also Cc'ing Mark Hammond who might have an idea of where I should start...
Assignee: nobody → mconley
So, some interesting facts: 1) Setting noautohide="true" on the panel causes the bug to disappear (but also stops the panel from growing to match the subview height). 2) Not everybody sees this. Using a fresh profile (empty history), this bug did not appear. So I don't have reliable STR here yet, except locally.
Also, if I skip hovering and of the subview items and let the panel close, the next open seems to work alright (even though for some reason, the subview doesn't cause the container to grow to its height). http://www.youtube.com/watch?v=lJjPrcvQakY
More data: This bug seems to be somewhat related to bug 852775 - they sound like similar problems, and I've tracked the problem down to the same piece of code - namely, nsXULPopupManager::PopupResized (http://mxr.mozilla.org/mozilla-central/source/layout/xul/base/src/nsXULPopupManager.cpp#384) When this function gets called, it appears that we fail the aSize and curDevSize comparison, and that causes us to set those width and height attributes on the panel. And those attributes are what's causing the problem. We only fail the comparison by, at most, 1px each time (and sometimes we don't fail - but all it takes is one fail to set the attributes). I thought it might have been a rounding error somewhere, but I can't see where it might be. For the rects being compared, curDevSize is generated via nsView::CalcWidgetBounds, which runs the dimensions set on a view via nsView::SetDimensions through a bit of processing. This is the value that is always 1px less than aSize. aSize is generated in Cocoa code - we get the content rect for the resized window, and multiply it by a scale factor (2.0 for Retina). So my questions: 1) Why are those attributes ever set on the panel? When is that useful? 2) Is what I'm seeing likely the result of a rounding error? If so, where should I look to correct the error? If not...what's causing the failed comparison?
More data - here's a conversation that Enn and I were having over IRC: mconley: Enn: ok, got the numbers. So in the instance where it first failed the comparison, mRect.height was 26985, CalcWidgetBounds returned a height of 900, and DoResize was called with a height of 450.00 mconley: and mRect.height / 60 = 449.75 Enn: mconley: you can get nsMenuPopupFrame::mRect at the end of SetPopupPosition. At the cocoa level, you can check within nsCocoaWindow::DoResize to see what the window is being set to Enn: they should be the same, once you converted css pixels to device pixels mconley: alright, I'll compare those. Thanks. mconley: Enn: so, it seems that normally, nsMenuPopupFrame::mRect is getting a value that's exactly 60x the value of what's being set via DoResize in nsCocoaWindow Enn: mconley: that sounds right mconley: Enn: on the occasions that the value doesn't match, it looks like mRect isn't *exactly* 60x the value - for example, mRect.height can be set to 20993, and DoResize is called with 350.00 mconley: 20993 / 60 = 349.88333... Enn: hmmm mconley: Enn: markh suggested this might have to do with subpixels - but I'm already a little out of my depth with this stuff. Enn: mconley: I don't really know but it looks as if the real window size is rounded to the nearest pixel but when retrieving the size, it rounds differently Enn: mconley: what is CalcWidgetBounds returning? mconley: Enn: sorry, was in a meeting - let me get that for you mconley: Enn: ok, got the numbers. So in the instance where it first failed the comparison, mRect.height was 26985, CalcWidgetBounds returned a height of 900, and DoResize was called with a height of 450.00 mconley: and mRect.height / 60 = 449.75 mconley: Enn: so that last bit, where I give you what CalcWidgetBounds is returning... does that tell you anything about where I should look next? mconley: or any idea where this difference is coming from, why it's there, and what I can do to mitigate it? Enn: mconley: well either the mRect.height is being calculated wrong is some cases and it should always be a mutliple of 60, or... Enn: it's correct and the value is being rounded in one way when being sent to cocoa widget layer and rounded differently when being retrieved Enn: does mRect.height differ each time the popup is opened, despite being the same content? mconley: well, note that the popup is animating itself taller mconley: so the height will differ, yes mconley: Enn: --^ Enn: intentionally animating? mconley: Enn: yes mconley: Enn: the panel is supposed to grow in height to accommodate the height of the "subview" - here's the behaviour spec: http://people.mozilla.com/~shorlander/panel-experiment-03/panel-experiment.html (click on History) Enn: I meant is the panel supposed to be visibly animating larger? mconley: Enn: yes. mconley: like the panel in the demo mconley: Enn: could the CSS transition / animation be causing this rounding error? Enn: mconley: possibly. but probably something not quite right with the changes in bug 852775 firebot: Bug https://bugzilla.mozilla.org/show_bug.cgi?id=852775 nor, --, mozilla22, mhammond, RESO FIXED, social panels sized incorrectly on hidpi displays mconley: hm, ok mconley: Enn: well, I guess I'll try to see where in core these different numbers are being sourced from mconley: but any suggestions are welcome, because I'll basically be poking the dark with a stick Enn: mconley: I'd ask the three people looking into that bug for more info. one of them may have some insight mconley: cool, thanks. :)
More data (and this is likely going to expose my ignorance of both our layout and graphics code and how it relates to hidpi displays): As mentioned earlier, we're failing a comparison in nsXULPopupManager::PopupResized[1] where we compare two nsRect's - one called curDevSize, and one called aSize. Their heights, widths, or both do not match. When setting the width of a window, I believe nsView gets SetDimensions called on it with an nsRect to reflect the new dimensions of the window (the panel, in this case). Let's suppose that we call SetDimensions with an nsRect with the width set to 13950. curDevSize: curDevSize is generated by asking the view of the panel to calculate its bounds. This is done by returning the value set via SetDimensions, and running it through ToNearestPixels (which on my Retina MacBook does a divide of the value by 30). So, curDevSize's width is determined to be (13950 / 30) = 465. aSize: At some point, nsCocoaWindow::DoResize is called with width 232.5 (which is 13950 / 60). This gets *rounded up* to 233.0 via [1], and is set as the content rect's width, I believe(?). aSize is actually passed in to nsXULPopupManager::PopupResized, which is called from nsView::WindowResized, which is called by nsCocoaWindow::ReportSizeEvent. ReportSizeEvent takes the width stored by DoResize (so 233.0), and passes it to nsCocoaUtils::CocoaRectToGeckoRectDevPix(r, scaleFactor);. CocoaRectToGeckoRectDevPix takes that 233.0 value, multiplies it by the scale factor (2), and returns 466. So that's how curDevSize got to be 465 and aSize got to be 466. [1]: http://mxr.mozilla.org/mozilla-central/source/layout/xul/base/src/nsXULPopupManager.cpp#384 [2]: https://mxr.mozilla.org/mozilla-central/source/widget/cocoa/nsCocoaWindow.mm#1355
> This gets *rounded up* to 233.0 via [1], and is set as the content rect's width, I believe(?). Ugh, that should have been [2]. Me and my fancy footnotes.
Attached patch Patch v1 (obsolete) — Splinter Review
So this looks like a float -> int conversion problem. The solution that :markh and I were thinking of was to simply allow a 1px margin of error when doing the comparison. This fixes the running-away panel for me.
Comment on attachment 780085 [details] [diff] [review] Patch v1 Is this an acceptable solution?
Attachment #780085 - Flags: review?(tnikkel)
Attachment #780085 - Flags: feedback?(enndeakin)
(In reply to Mike Conley (:mconley) from comment #9) > Created attachment 780085 [details] [diff] [review] > Patch v1 > > So this looks like a float -> int conversion problem. Specifically, I think it is as described in bug 852775 comment 10 and 11 - the tl;dr version of which is that we make multiple float->int transitions which introduces these kinds of issues. > The solution that :markh and I were thinking of was to simply > allow a 1px margin of error when doing the comparison. The "correct" fix is probably as described in bug 852775 comment 10 and 11 - use float pixels for everything other than device pixels - but that sounds like a much larger project. If we really did want to take that on, we could consider Mike's attachment a work-around in the meantime.
What if someone resizes the popup by 1 pixel?
(In reply to Neil Deakin from comment #12) > What if someone resizes the popup by 1 pixel? They get a badge for "most dexterous use of a mouse"? ;) Seriously though, I'd suggest that this is a reasonable tradeoff in the short term - but I guess you are suggesting we look into bug 852775 comment 4 (ie, that we detect a user resize and only add those width and height attributes in that case?)
Or, some code could resize the popup by one pixel, say during a transition. Apparently, a transition resize is happening in this case.
(In reply to Neil Deakin from comment #14) > Or, some code could resize the popup by one pixel, say during a transition. > Apparently, a transition resize is happening in this case. But IIUC, that shouldn't be a problem (ie, we wouldn't want the width or height attributes in that case either, and as far as I know, we don't get them)
(In reply to Mike Conley (:mconley) from comment #7) > At some point, nsCocoaWindow::DoResize is called with width 232.5 (which is > 13950 / 60). This gets *rounded up* to 233.0 via [1], and is set as the > content rect's width, I believe(?). > > [2]: > https://mxr.mozilla.org/mozilla-central/source/widget/cocoa/nsCocoaWindow. > mm#1355 If this is the only problem then can we solve it by making ConstrainSize take floats/doubles so that we don't have to do width = round(width * scale) ConstrainSize(width, ...) width = round(width / scale) in nsCocoaWindow::DoResize and instead keep everything as floats here without unnecessary rounding of intermediate values? Does that solve the problem?
(In reply to Timothy Nikkel (:tn) from comment #16) > (In reply to Mike Conley (:mconley) from comment #7) > > At some point, nsCocoaWindow::DoResize is called with width 232.5 (which is > > 13950 / 60). This gets *rounded up* to 233.0 via [1], and is set as the > > content rect's width, I believe(?). > > > > [2]: > > https://mxr.mozilla.org/mozilla-central/source/widget/cocoa/nsCocoaWindow. > > mm#1355 > > If this is the only problem then can we solve it by making ConstrainSize > take floats/doubles so that we don't have to do > > width = round(width * scale) > ConstrainSize(width, ...) > width = round(width / scale) > > in nsCocoaWindow::DoResize and instead keep everything as floats here > without unnecessary rounding of intermediate values? Does that solve the > problem? Yes, I can overload ConstrainSize to take doubles, and that prevents us from rounding there. However, we end up rounding again anyways, since the "newBounds" nsIntRect that we pass to FitRectToVisibleAreaForScreen we create needs, well, ints. 1) Could / should we modify FitRectToVisibleAreaForScreen to take doubles? 2) If not (1), can we pass FitRectToVisibleAreaForScreen the rounded nsIntRect, but store a double-y rect in mWindow? Not sure if it's advisible to have FitRectToVisibleAreaForScreen receive a different set of values than what is stored in mWindow... 3) Forgive my ignorance, but what can I use to create double-y rects? Is that even possible? nsDoubleRect doesn't seem to exist. nsRect uses nscoords, which, after searching around, seems to be int32_t (though there seems to be plans to switch to floats using NS_COORD_IS_FLOAT). It's been a while since I've been down in C-land, but I'm pretty sure int32_t is also an integer.
Flags: needinfo?(tnikkel)
(In reply to Mike Conley (:mconley) from comment #17) > Yes, I can overload ConstrainSize to take doubles, and that prevents us from > rounding there. > > However, we end up rounding again anyways, since the "newBounds" nsIntRect > that we pass to FitRectToVisibleAreaForScreen we create needs, well, ints. Rounding will probably have to happen at some point, my point was that we are rounding on an intermediate value in that specific spot and maybe if we avoid that it fixes the problem? If not we can see how much more code would need to be changed to avoid this rounding error, but it might not be a huge amount, not sure until I try.
Flags: needinfo?(tnikkel)
(In reply to Timothy Nikkel (:tn) from comment #18) > (In reply to Mike Conley (:mconley) from comment #17) > > Yes, I can overload ConstrainSize to take doubles, and that prevents us from > > rounding there. > > > > However, we end up rounding again anyways, since the "newBounds" nsIntRect > > that we pass to FitRectToVisibleAreaForScreen we create needs, well, ints. > > Rounding will probably have to happen at some point, my point was that we > are rounding on an intermediate value in that specific spot and maybe if we > avoid that it fixes the problem? Unfortunately no. :/ It doesn't matter where along the chain we do any rounding - but as soon as it happens, the bug rears its head. If the dimensions are rounded before or during the call to PopupResized, we're in the buggy state. And it seems that we do that rounding in quite a few places. We do it in DoResize, as you've pointed out. We do it again in nsCocoaWindow::ReportSizeEvent via GetClientBounds by converting the values from DoResize into an nsIntRect. And we do it again in nsView::WindowResized, which converts the values from GetClientBounds to an nsIntSize. It's ints and nsIntRects all over the place, it seems. :/
(In reply to Mike Conley (:mconley) from comment #19) > Unfortunately no. :/ It doesn't matter where along the chain we do any > rounding - but as soon as it happens, the bug rears its head. If the > dimensions are rounded before or during the call to PopupResized, we're in > the buggy state. And it seems that we do that rounding in quite a few > places. We do it in DoResize, as you've pointed out. We do it again in > nsCocoaWindow::ReportSizeEvent via GetClientBounds by converting the values > from DoResize into an nsIntRect. And we do it again in > nsView::WindowResized, which converts the values from GetClientBounds to an > nsIntSize. > > It's ints and nsIntRects all over the place, it seems. :/ Ok, thanks for trying. I don't think it's quite that bad though because PopupResized and ReportSizeEvent are dealing in device pixels (actual real pixels on the screen). The value in DoResize is in "display pixels", which on a retina screen are a 2x2 pixel block. From comment 7, 466 and 465 are device pixels, 232.5 is a display pixel value. Since the values that PopupResized and ReportSizeEvent deal with are actual device pixels they are always actually integers so that should be fine. Looking at DoResize I don't see us storing any rounded ints in that function. The only real side effect of that function seems to be to call setFrame, and setFrame gets passed an NSRect, and if you check Apple's docs NSRect's are made of floats. So hopefully if we can make DoResize deal all in floats that will fix this bug. That looks like we have to change FitRectToVisibleAreaForScreen, GeckoRectToCocoaRect, and ConstrainSize. Which seems not too bad. Before layout/base/Units.h we would have probably just used a gfxRect if we wanted a rect with floats in it. But after Units.h the proper way is to probably add a new type for "display pixels" (if that is what we want to call them), and use typed rects of that type. But you can use gfxRect's first while testing to make sure this idea works out.
Hrm, so I've successfully made DoResize only use and store floats via NSRects. Unfortunately, and I should have seen this coming - when ReportSizeEvent gets fired, it converts the stored floats to ints by way of GetClientBounds. So it takes our floats, and then turns them into an nsIntRect to do our comparison with. And then we're back to square one. :/ Or am I missing something?
Flags: needinfo?(tnikkel)
At that point we should we dealing with actual device pixels, so it should be just integers there. No rounding should be taking place at that point.
Flags: needinfo?(tnikkel)
Attachment #780085 - Flags: feedback?(enndeakin)
mconley was debugging this with suggestions from myself today. We seem to have discovered that if one passes an NSRect with a width or height of say 232.5 (or even 232.25) then in the resulting windowDidResize notification the value of [mWindow frame] will have that width or height value at 233 (yes 232.25 rounds up to 233). These values are in display pixels (right?) so on a retina screen 232.5 should be 232.5*2 = 465 actual physical pixels and this should be possible. Anyone come across this before? Is there an api we can use that allows us to actually address every physical pixel on the screen? Or do we just have to work around this and place windows at even pixel values?
Flags: needinfo?(smichaud)
Flags: needinfo?(mstange)
Flags: needinfo?(jfkthame)
It's definitely possible that Cocoa rounds things like these up to screen points. I suppose we have to work around it :(
Flags: needinfo?(mstange)
When I was working on the OS X hidpi support, it seemed to me that although Cocoa takes floating-point coordinates for things like window frames, it always snaps the window frame to integer points (i.e. "display pixels"), even on a retina screen, and so it is -not- possible to place a window frame at odd device-pixel coordinates (which would equate to half-display-pixel values). I don't recall seeing this behavior documented anywhere, but it's what I remember observing. If we assume that for a non-retina screen, window frames are required to be aligned to whole points (pixels) to ensure crisp drawing, then it makes some sense to retain the same restriction on a retina screen - otherwise, the frame might have to be resized if it is moved from a retina to non-retina display. Snapping the frame to whole (display) pixels in all cases ensures that it can be moved to any screen without problems and without triggering an unexpected resize. (This is not the only case where coordinates seem to be rounded to whole points even on a retina display: AFAICT, the mouse pointer location is always snapped to whole points, too. So it moves in steps of two device pixels at a time.)
Flags: needinfo?(jfkthame)
Flags: needinfo?(smichaud)
So I guess the best fix is to make CalcWidgetBound aware of this restriction when it calculates a widgets position/size based on the view's position/size. Kinda yucky, but seems forced upon us by OS X.
A relevant cocoa-dev post that seems to confirm our suspicions: http://lists.apple.com/archives/cocoa-dev/2011/Aug/msg01078.html
Attached patch Patch v2 (obsolete) — Splinter Review
Instead of allowing the 1px of error, this patch modifies nsView for OSX such that we attempt to do the same rounding to keep our numbers in sync with Cocoa.
Attachment #780085 - Attachment is obsolete: true
Attachment #780085 - Flags: review?(tnikkel)
Comment on attachment 781111 [details] [diff] [review] Patch v2 This fixes my bug for me - but I'm not sure what other impacts this change might have. Does it look acceptable?
Attachment #781111 - Flags: review?(tnikkel)
I thought Cocoa rounded *up* to whole points, in which case wouldn't using ceil() rather than floor() result in a better match to its behavior?
We want to avoid the Cocoa rounding behaviour altogether by choosing a pixel value that won't be altered by Cocoa. I think floor is the best choice because if we round up I'm not sure what will be painted in the extra pixels. I think it's better to cut off a single row/column of pixels.
Comment on attachment 781111 [details] [diff] [review] Patch v2 Thanks, this looks good but I have a few comments. I'm not sure if we should restrict this hack to only be used if the scale factor is 2. If Apple introduces other scale factors their rounding behaviour may change. For example, if the scale factors goes up to 3 is cocoa going to still disallow not evenly divisible by 3 pixels coordinates? Or if they introduce 1.5 or another non-integral value then does it even make sense anymore to restrict pixel coordinates? It's not clear. We can't rely on mWindow because CalcWidgetBounds is called during widget creation. Probably we can just look at the parent widget. We should do this extra rounding instead of the viewBounds.ToNearestPixels, not after it. It would be good to avoid rounding and then rounding again and just round to what we want in one step. CalcWidgetBounds is used for popup widgets and plugin widgets. I think we should use this only for popup widgets. Plugin widgets are child views on mac I believe, so they are a little bit different and we don't know if we need this for plugin widgets yet, we can add that later if we find we need it. I don't think GetDefaultScale is the right thing to be looking at here, that can be modified by prefs, and the problem we are working around here is only based on what the OS does for certain screens. I think what we really want is BackingScaleFactor but that is not on nsIWidget (GetDefaultScaleInternal might suffice since that is what it returns but I think the implementation of GetDefaultScaleInternal could change while it's meaning stays the same making it no longer work for us).
Attached patch WIP Patch 3 (obsolete) — Splinter Review
Trying to implement tn's suggestions when I suddenly came to the realization that I had no idea what I was doing. Instead of flailing and wasting tn's time in IRC, I'm posting what I've done (not a whole lot), and tn said he'd drive it the rest of the way. Thanks tn!
Attachment #781111 - Attachment is obsolete: true
Attachment #781111 - Flags: review?(tnikkel)
Status: NEW → ASSIGNED
Depends on: 874792
Assignee: mconley → tnikkel
Attachment #781878 - Attachment is obsolete: true
Attachment #782948 - Flags: review?(mstange)
Attachment #782948 - Flags: review?(mstange) → review+
I can confirm that this fixes the bug from comment 0! Thanks tn, mstange!
Status: ASSIGNED → NEW
Thanks for testing, and all the rest of your help, would have been impossible without that (since I don't have any retina hardware).
Component: Menus → Widget: Cocoa
Product: Firefox → Core
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla25
Blocks: 879500
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: