Closed Bug 1865002 Opened 10 months ago Closed 9 months ago

Explore the feasibility of opening the Select Translations Panel near to selected content

Categories

(Firefox :: Translations, task, P3)

RESOLVED FIXED

People

(Reporter: nordzilla, Assigned: nordzilla)

Attachments

(3 files)

Summary

As part of the Select Translations Design, we are considering the feasibility of allowing the Select Translations Panel to open near to highlighted text on the page, instead of always opening up in the browser UI.

The above proposal would provide an ideal experience for many users, especially accessibility customers using . However, this proves to be a non-trivial technical problem for a number of reasons, based on my initial investigation:

  1. The Translations Panel is a XUL element that exists in the browser UI.
  2. The selected text exists in a content process within the page.
  3. Determining the location at which to open the panel will require some sort of message-passing communication between the privileged and content processes.
  4. Beyond determining a location at which to open the panel, there exists a rabbit hole of difficulty with regard to making a content-process node act as an anchor for a XUL element (if it's even possible to do so).

While the alternative of always opening the Select Translations Panel at the top of the UI, where the current Full-Page Translations Panel opens, would certainly be the easiest technical implementation, it is clear that having the panel open near to the web content itself is the strongly desired approach by both the UX and A11y teams.

I want to do my due diligence to explore the technical feasibility of such a design.

I'd like to have a technical discussion with Firefox front-end experts regarding the desire to anchor a panel to the location of highlighted text on the screen, including plausible alternatives to my mock-up implementation described below, implementation difficulties, the existence of prior art that attempts anything similar, etc.


A mock-up approach

I made an extremely rough mock-up exploration patch, in which I attempt to open the existing Translations Panel near to highlighted text on the page (see attached video and Phabricator revision).

The approach I took was to retrieve the selection bounding box in the child actor and pass the top, left, bottom, and right values back to the parent actor, at which point I move an invisible XUL <box> element to the bottom-right of the bounding box to act as an anchor for the panel.
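The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patch: the class names `SelectionAnchorChild` and `SelectionAnchorParent`, the message name `"SelectTranslations:SelectionRect"`, and the `moveAnchorTo` callback are all hypothetical. The classes are plain stand-ins that mimic the `sendAsyncMessage` / `receiveMessage` shape of window actors, so the example runs on its own.

```javascript
// Hypothetical stand-in for the child side (runs in the content process).
class SelectionAnchorChild {
  // `sendAsyncMessage` is injected here; a real child actor would inherit it.
  constructor(sendAsyncMessage) {
    this.sendAsyncMessage = sendAsyncMessage;
  }

  // `selection` is a Selection-like object, e.g. window.getSelection() in content.
  reportSelectionRect(selection) {
    if (selection.rangeCount === 0) {
      return false; // Nothing is selected, so there is nothing to report.
    }
    const rect = selection.getRangeAt(0).getBoundingClientRect();
    // Send only plain numbers: DOM objects cannot cross the process boundary.
    const { top, left, bottom, right } = rect;
    this.sendAsyncMessage("SelectTranslations:SelectionRect", {
      top,
      left,
      bottom,
      right,
    });
    return true;
  }
}

// Hypothetical stand-in for the parent side (runs in the privileged process).
class SelectionAnchorParent {
  // `moveAnchorTo(x, y)` is an assumed callback that repositions the anchor.
  constructor(moveAnchorTo) {
    this.moveAnchorTo = moveAnchorTo;
  }

  receiveMessage({ name, data }) {
    if (name !== "SelectTranslations:SelectionRect") {
      return;
    }
    // Place the anchor at the bottom-right corner of the bounding box.
    this.moveAnchorTo(data.right, data.bottom);
  }
}
```

Only the four numbers travel between the two sides, which is why the parent can position something near the selection but can never hold a reference to the selected node itself.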

I realize this approach may be unacceptable for production code, but it was the first thing I was able to come up with.


Concerns

Scrolling

One concern I have is scrolling. The panel element is clearly a part of the Firefox UI and does not appear to be movable via the mouse, nor does it stay next to the highlighted text when we scroll on the page.

We could disable scrolling while the panel is open, which the right-click menu already seems to do. That would be fine. But I'm curious to know if it would ever be possible to make a content node be a legitimate anchor for a XUL panel.

Magnification

This isn't shown in the video, but increasing the zoom level in the browser breaks my mock-up implementation. The panel only seems to reliably open near to the highlighted text when the zoom level is at 100%.
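A possible explanation, not confirmed in this bug, is that the rectangle reported by content is measured in the zoomed page's CSS pixels, while the anchor is positioned in the browser window's coordinate space. A minimal sketch of such a conversion, assuming a single uniform `zoom` factor and a known offset of the content area within the window (the function name and both inputs are hypothetical):

```javascript
// Convert a point measured in content CSS pixels into the coordinate space of
// the browser window. `zoom` is assumed to be the page zoom factor (1 = 100%),
// and `contentAreaOffset` the top-left corner of the content area within the
// window. Both are illustrative inputs, not values read from a real API.
function contentPointToWindowPoint(point, { zoom, contentAreaOffset }) {
  return {
    x: contentAreaOffset.x + point.x * zoom,
    y: contentAreaOffset.y + point.y * zoom,
  };
}
```

At `zoom: 1` the conversion reduces to adding the offset, which would be consistent with the mock-up behaving well only at 100%.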

Disjoint selections

Firefox users can select multiple disjoint ranges of text at different locations on the page. In such a case, it could be difficult to determine where to open the panel.
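One candidate policy, offered only as an illustration, is to anchor to the union of all the selected ranges' bounding boxes. The sketch below assumes a Selection-like object exposing `rangeCount` and `getRangeAt()`; the helper names `selectionRects` and `unionOfRects` are hypothetical.

```javascript
// Collect one bounding box per selected range.
function selectionRects(selection) {
  const rects = [];
  for (let i = 0; i < selection.rangeCount; i++) {
    rects.push(selection.getRangeAt(i).getBoundingClientRect());
  }
  return rects;
}

// Return the smallest rectangle that contains every given rectangle,
// or null when there is nothing selected.
function unionOfRects(rects) {
  if (rects.length === 0) {
    return null;
  }
  return rects.reduce(
    (acc, r) => ({
      top: Math.min(acc.top, r.top),
      left: Math.min(acc.left, r.left),
      bottom: Math.max(acc.bottom, r.bottom),
      right: Math.max(acc.right, r.right),
    }),
    { top: Infinity, left: Infinity, bottom: -Infinity, right: -Infinity }
  );
}
```

The union can cover a large part of the page when the ranges are far apart, so anchoring to the most recently added range may be a better fit; that is a UX decision rather than a technical one.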

Click and drag / resize via mouse

One of the UX desires for a floating panel is the ability to drag, and possibly resize, the panel with the mouse. I don't know whether this will be possible using XUL elements for the panel.

Call-to-action icon

One of the design explorations expresses the desire to have a floating call-to-action icon, similar to the Translations icon in the awesome bar, which would appear next to highlighted text and serve as an anchor for the floating Select Translations Panel when clicked. However, placing the icon itself near to the selected text presents the same issues.


Plausible Alternatives

Floating panel for right-click access only

One of the primary ways users will be able to access the Select Translations Panel is from the right-click menu, when clicking on highlighted text. I'm sure that we can anchor the panel at the same location where the menu exists. In the worst case, we could have this be the only time where the panel floats.

Call-to-action icon appears on mouse-up after selecting text

Similar to the right click menu, we could have the CTA icon appear after the user has selected text (if the relevant translation conditions are met, of course). However, this might add a lot of complexity and performance regressions.
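One common way to limit the cost of reacting to selection changes is to debounce the check, so that the translation conditions are evaluated only once the selection has settled. The sketch below is a generic debounce helper, not code from the Translations feature; the `timers` parameter is a hypothetical injection point that makes the helper testable without real delays.

```javascript
// Return a wrapper that postpones calling `fn` until `delayMs` milliseconds
// have passed without another call. Only the arguments of the last call are used.
function debounce(
  fn,
  delayMs,
  timers = {
    set: (callback, ms) => setTimeout(callback, ms),
    clear: (handle) => clearTimeout(handle),
  }
) {
  let handle = null;
  return (...args) => {
    if (handle !== null) {
      timers.clear(handle); // Drop the previously scheduled call.
    }
    handle = timers.set(() => {
      handle = null;
      fn(...args);
    }, delayMs);
  };
}
```

A handler wrapped this way could be attached to a selection or mouse-up event, so that a drag-select which produces many intermediate events results in a single check.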


Outcomes

If we do think that there is a path forward here, I wouldn't mind putting in the work to try to make this a reusable pattern. I imagine that the ability to open XUL components at content locations on the page could be useful in multiple contexts. I think this seems like an interesting problem to solve and a worthwhile investment for the Translations project as well as other projects. But I would like to understand the full scope of the difficulties that may arise from people who understand the front end better than I do.

[DO NOT LAND]
This is a mock-up exploration of an idea, not a submission for review.

Hi Gijs, Jamie, and Mike,

I'm requesting need-info from you to get your technical perspectives regarding the bug description above.

Please feel free to add someone else if you think they would have a good perspective here.

Flags: needinfo?(mconley)
Flags: needinfo?(jteh)
Flags: needinfo?(gijskruitbosch+bugs)
Priority: -- → P3

One thing that immediately comes to mind is that even if you manage to anchor the panel so that it visually appears in the right place, there's no way to do this semantically for a XUL panel. In practice, what this means is that screen reader users will not perceive the panel as being near the selected content.

If the panel is somewhat like a dialog anyway (I'm not sure), this is probably okay. By necessity, screen reader users tend to perceive dialogs as separate from the content because it's somewhat difficult for a screen reader user to interact with two different areas at once.

Beyond the translated text, what else is in this panel? I'm guessing it allows you to choose languages, etc.

Another solution to consider (albeit problematic for various reasons and maybe completely infeasible) is to inject the translations panel into the DOM of the content page. That way, you're running in the same scope and in the same document. That might make visual positioning easier and allows you to position it semantically or at least establish semantic relationships. It also has the propensity to break the page and may not actually work. :)

Flags: needinfo?(jteh)

Beyond the translated text, what else is in this panel? I'm guessing it allows you to choose languages, etc.

The reason I asked this question is that I'm wondering whether it's possible to replace the selection with the translated text, so the translation would appear to happen "inline". We could still have a panel elsewhere to change the settings or whatever, but the inline translation means that the context of the actual translation is preserved for all users.

I'm not really sure I understand what's being asked here. Here are a few mostly-technical observations:

  • We anchor the context menu, autofill menus, select dropdowns, the date picker (ie <input type=date/time> implementation), colorpicker, and probably other things I'm not thinking of, to web content already. It would probably make sense to look at what those do, and not diverge substantially in terms of UX + implementation patterns. These also work correctly for zoom/magnification, and yes they stop scrolling - this new thing should do the same.
  • if you wanted to not disable scrolling, you'd need to continuously pass scroll information across processes. A naive fix would be to send messages using the JSWindowActor mechanics, but those are likely to be too slow to not look awful/janky, especially on slower hardware. A better fix would be co-opting the actual scrolling GPU/APZ/layout bits which I think already have to talk to the parent process. But it's not really clear what you'd do in cases where e.g. the scrolled content disappears out of the viewport or a subframe's boundary box. So honestly I think solving this is both technically hard and has inescapable UX issues.
  • You don't need the invisible box, you can pass positioning arguments to the relevant XUL panel opening methods (you don't need an element to anchor to, you can open a popup using just coordinates). Here's the context menu doing this: https://searchfox.org/mozilla-central/rev/f78093864e287014db7ac9383bb76c45edbf8559/browser/base/content/nsContextMenu.js#106 .
  • elements from the content process simply do not exist in the parent process. So fundamentally, as you've found out, you cannot anchor/relate the thing you open in the parent process to the thing in the child process directly. You can do it indirectly (by manually passing coordinates, and perhaps even metadata so you can at least describe to a11y users what the thing is that just opened), but never directly, because you cannot directly reference DOM elements across process/thread boundaries, for obvious reasons.
  • disjoint selections are a problem that's orthogonal to the cross-process issue you originally cited. I understand it's potentially a problem for this specific feature but any solution to it is needed orthogonal to the cross-process issue, so let's leave that out of the equation for now. :-)
  • if there is supposed to be a way of opening the panel that isn't the URL bar that is closer to the text, the context menu seems the natural place to do it. Then you can just have a (sub)menu item for the UI that you currently have in the panel - you don't actually need the panel, the UI could just live inside the menu itself. Then you also don't need to reinvent passing text selection or coordinates across processes, and it solves the positioning woes because the menu implementation takes care of this.
  • You're right, I'm not aware of resizable/draggable panels in our UI at the moment and making XUL panels do that may be an uphill battle. But also, the panel allows saying "yes translate this" and choosing languages / deciding never to translate sites/languages. The size of its content does not vary and so I don't really understand why UX want it to be resizable. That doesn't seem like it helps users do anything.
  • There are ways of showing browser UI inside the content process, as Jamie alluded to. The devtools and screenshots component code use this to provide overlays on web content that interact with the web content. There are security implications though, because it means that we'd need to allow content processes to tell the parent process "translate this site from X to Y" and/or "never translate this site" and/or "never translate this language". I don't know the security properties of the translation process and if this would be considered problematic (ie if it'd help a potentially compromised content process escalate its privileges). Also, any such UI would potentially scroll and either way it cannot overlap the chrome, which means there are some line-of-death issues (though the usefulness of that concept is disputed, so ymmv) in terms of how users know they can trust the content that is more or less indistinguishable from (untrusted) web content.
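The third point above, opening a popup from coordinates rather than from an anchor element, might look roughly like this. `openPopupAtScreen(x, y, isContextMenu)` is the XUL popup method that the context menu code uses; the helper name `openPanelAtSelection`, the `browserScreenOrigin` input, and the `gap` parameter are hypothetical. The sketch also assumes the rectangle has already been converted into the parent's coordinate units, and any object exposing that method can stand in for the panel.

```javascript
// Open `panel` just below the selection rectangle, using screen coordinates
// instead of an anchor element.
//   rect:                selection bounds relative to the content area
//   browserScreenOrigin: assumed screen position of the content area's top-left
//   gap:                 illustrative spacing between the text and the panel
function openPanelAtSelection(panel, rect, browserScreenOrigin, gap = 4) {
  const x = Math.round(browserScreenOrigin.x + rect.left);
  const y = Math.round(browserScreenOrigin.y + rect.bottom + gap);
  panel.openPopupAtScreen(x, y, /* isContextMenu */ false);
  return { x, y };
}
```

This removes the need for the invisible `<box>`, but as noted later in this bug it does not apply to `PanelMultiView.openPopup()`, which expects an anchor.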

Some more subjective observations:

  • in terms of where to position things... to me, it's confusing that the argument is that (a) the panel should be next to the selection/text and (b) the way to open the panel should still be in the URL bar. This is more mouse movement, not less, so it seems strictly worse than the status quo.
  • from a personal perspective, having extra UI (the described "Call-to-action icon" appearing on mouseup) appear floating next to text selection for a feature I don't use seems really annoying. This is feedback we've had before for e.g. the PiP control on videos.
  • If we were going to offer this type of UX, it also seems like other functionality (search for selection, copy selection, take a screenshot) would be more high value than translation.
  • although it seems "cool" to offer in-context translation, I'm not sure I understand the situation where a user both doesn't speak a language and so needs a translation, yet understands a page in another language well enough to select a specific bit that they want translating (apparently not caring about anything else). Is the text selection really critical here, or do users "just" want a page context menu integration because that's where they expect this type of functionality (even when there's no selection on the page) ?

ISTM that the simplest solution here would be to provide a context menu entry that translates from the auto-detected language to the user's UI language, potentially with a second option that allows opening the panel in either its usual place or replacing the context menu at the same position, if users want to customize that choice (which seems like an unlikely/infrequent occurrence, but perhaps you have data that shows otherwise?).

In terms of a generically reusable thing here... any reusable thing would need code in both processes and need a pluggable way of finding the right anchor in the child process's content area (be it selection or a containing / "from point" node of some kind) and the right panel/popup/whatever in the parent. I guess it's possible to add an abstraction here, but I don't think the abstraction will help super lots - the only thing it'd abstract is going from the content point to the screen position for the panel in the parent. Effectively I think any abstraction you come up with would probably be more code than cargo-culting "compute coordinates, send to parent" from any of the existing solutions. But perhaps my imagination is too narrow here. :-)

Flags: needinfo?(gijskruitbosch+bugs)

I think Gijs has covered most of the bases here, but I wanted to add one thing:

from a personal perspective, having extra UI (the described "Call-to-action icon" appearing on mouseup) appear floating next to text selection for a feature I don't use seems really annoying. This is feedback we've had before for e.g. the PiP control on videos.

I noticed recently that Opera seems to have this kind of functionality, in case people want to try it out in a real browser product and get a feel for the idea. See the attached video.

Flags: needinfo?(mconley)

Jamie,

Regarding: "One thing that immediately comes to mind is that even if you manage to anchor the panel so that it visually appears in the right place, there's no way to do this semantically for a XUL panel. In practice, what this means is that screen reader users will not perceive the panel as being near the selected content."

This is a really interesting point that I'd like to bring up with UX. I think that so far this design has been influenced by the idea that having the panel exist near to the content would make it more accessible for screen readers, and also for users who are zoomed in at the OS level.


Regarding: "Beyond the translated text, what else is in this panel? I'm guessing it allows you to choose languages, etc."

The panel would primarily display only the translated text as the main content. It would communicate that the text has been translated from one language to the other, and probably have dropdowns to change the languages.


Regarding: "Another solution to consider (albeit problematic for various reasons and maybe completely infeasible) is to inject the translations panel into the DOM of the content page. That way, you're running in the same scope and in the same document. That might make visual positioning easier and allows you to position it semantically or at least establish semantic relationships. It also has the propensity to break the page and may not actually work. :)"

This does seem like an interesting idea, but it definitely sounds a bit scary. Perhaps it's worth a brief exploration.


Regarding: "The reason I asked this question is that I'm wondering whether it's possible to replace the selection with the translated text, so the translation would appear to happen "inline". We could still have a panel elsewhere to change the settings or whatever, but the inline translation means that the context of the actual translation is preserved for all users."

This option was explored by UX, but a decision was made to open the translated text in a panel instead. If I recall correctly, Microsoft Edge implements the in-line translation style.

Gijs,

Regarding: "We anchor the context menu, autofill menus, select dropdowns, the date picker (ie <input type=date/time> implementation), colorpicker, and probably other things I'm not thinking of, to web content already."

Thank you! I'll definitely have a look at some of those implementations in terms of prior art.


Regarding: "if you wanted to not disable scrolling, you'd need to continuously pass scroll information across processes... So honestly I think solving this is both technically hard and has inescapable UX issues."

I agree. I think that would be a very difficult problem to solve both technically and with regard to UX.


Regarding: "You don't need the invisible box, you can pass positioning arguments to the relevant XUL panel opening methods (you don't need an element to anchor to, you can open a popup using just coordinates)."

I tried this approach first and I couldn't get it to work. In this particular case, I was trying to get it to work with PanelMultiView.openPopup().

I tried invoking it like this:

PanelMultiView.openPopup(panel, /* anchor */ null, {}, /* x */ 50, /* y */ 50);

But it seems to always call the _calculateMaxHeight() function, which seems to assume that the panel has a non-null anchor. In any case, I couldn't get it to work while I was trying to make my quick prototype, so I used an anchor instead.

Perhaps you could confirm if I've misunderstood something or not, and if not, I'll file a bug that this behavior should be fixed.


Regarding: "if there is supposed to be a way of opening the panel that isn't the URL bar that is closer to the text, the context menu seems the natural place to do it. Then you can just have a (sub)menu item for the UI that you currently have in the panel - you don't actually need the panel, the UI could just live inside the menu itself. Then you also don't need to reinvent passing text selection or coordinates across processes, and it solves the positioning woes because the menu implementation takes care of this."

You're right, the UX designs intend to have potentially three ways to open this menu.

  1. Right clicking the selected text.
  2. Clicking the Translations button while text is selected.
  3. The "call-to-action icon" that I mentioned in the original description, which would be a clickable icon that floats next to selected text if the selection meets translation requirements.

Option 3) is one of the primary reasons I'm doing this exploration in the first place. Please do not take the literal example of my mock-up solution as intended behavior. I was only attempting to use existing elements to test the "idea" of opening them next to a text selection as a path of least resistance. Even with Option 2), the panel would not be floating; it would just open at the button that was clicked.


Regarding: "There are ways of showing browser UI inside the content process, as Jamie alluded to. The devtools and screenshots component code use this to provide overlays on web content that interact with the web content. There are security implications though, because it means that we'd need to allow content processes to tell the parent process "translate this site from X to Y" and/or "never translate this site" and/or "never translate this language". I don't know the security properties of the translation process and if this would be considered problematic (ie if it'd help a potentially compromised content process escalate its privileges). Also, any such UI would potentially scroll and either way it cannot overlap the chrome, which means there are some line-of-death issues (though the usefulness of that concept is disputed, so ymmv) in terms of how users know they can trust the content that is more or less indistinguishable from (untrusted) web content."

Definitely, this seems like a security risk to me.


Regarding: "in terms of where to position things... to me, it's confusing that the argument is that (a) the panel should be next to the selection/text and (b) the way to open the panel should still be in the URL bar. This is more mouse movement, not less, so it seems strictly worse than the status quo."

As I mention in one of the previous responses, the behavior in my mock-up solution is not actual behavior, but rather an exploration of opening a UI panel near to selected text. In the event that the user invokes the Select Translations Panel via the button, it will open at the button. But we may still need this capability as a whole for the feature, so I used the existing panel in my exploration.


Regarding: "from a personal perspective, having extra UI (the described "Call-to-action icon" appearing on mouseup) appear floating next to text selection for a feature I don't use seems really annoying. This is feedback we've had before for e.g. the PiP control on videos."

Do you have any links to specific negative feedback about this pattern? I think those would be worthwhile to read. I'm happy to go digging myself, but a link is welcome if you know exactly where it is or how to find it easily.


Regarding: "If we were going to offer this type of UX, it also seems like other functionality (search for selection, copy selection, take a screenshot) would be more high value than translation."

Perhaps, but that's not really my decision to make. I'm working on Translations right now.


Regarding: "although it seems 'cool' to offer in-context translation, I'm not sure I understand the situation where a user both doesn't speak a language and so needs a translation, yet understands a page in another language well enough to select a specific bit that they want translating (apparently not caring about anything else). Is the text selection really critical here, or do users "just" want a page context menu integration because that's where they expect this type of functionality (even when there's no selection on the page) ?"

I think this is a question more for the UX team and their user research. My understanding is that select translations is one of the most desired translation features after full-page translation.

Though, as someone who is learning a second language myself as an adult, I certainly see value in this capability. It is not uncommon for me to be reading a page in Spanish, at which point I come across a particular sentence or paragraph that has new vocabulary or unusual-to-me grammar structures. In that case, I'd likely want to translate that selection. Language learning is not binary, where learners have either 0% comprehension or 100% comprehension. If it were, I would agree with you that full-page translation is wholly sufficient. This is just one example from my personal experience where I think the feature has value, but I'm sure that there are others, even unrelated to learning languages.


Regarding: "In terms of a generically reusable thing here... any reusable thing would need code in both processes and need a pluggable way of finding the right anchor in the child process's content area (be it selection or a containing / "from point" node of some kind) and the right panel/popup/whatever in the parent. I guess it's possible to add an abstraction here, but I don't think the abstraction will help super lots - the only thing it'd abstract is going from the content point to the screen position for the panel in the parent. Effectively I think any abstraction you come up with would probably be more code than cargo-culting "compute coordinates, send to parent" from any of the existing solutions. But perhaps my imagination is too narrow here. :-)"

Sure, there may not be a reason to make a general abstraction. I was only offering to do such a thing if it seemed both feasible and useful to do.


Thank you so much for all of the detailed feedback. This is incredibly helpful to me and to the Translations team.

Mike,

Thanks, that video is a helpful demonstration of the kind of thing we're looking for.

Jamie,

One more follow-up about what you said here:

"One thing that immediately comes to mind is that even if you manage to anchor the panel so that it visually appears in the right place, there's no way to do this semantically for a XUL panel. In practice, what this means is that screen reader users will not perceive the panel as being near the selected content."

Even if the screen reader wouldn't perceive the panel as being near to the selected content, which makes sense given the situation, would we still be able to change the focus so that, when a user invokes the panel from the screen reader, the reader's context is now within the panel in a way that flows naturally?

I apologize, I'm not very well versed in screen-reader tech at this time, but I'm planning to dive into it in preparation for work on this feature.

Flags: needinfo?(jteh)

(In reply to Erik Nordin [:nordzilla] from comment #10)

Even if the screen reader wouldn't perceive the panel as being near to the selected content, which makes sense given the situation, would we still be able to change the focus so that, when a user invokes the panel from the screen reader, the reader's context is now within the panel in a way that flows naturally?

You can and should move the focus inside the panel. Because it's in the parent process and the user begins focused in content, this might be a bit fiddly to implement, but there is prior art in the date picker, for example.

A screen reader user won't be able to perceive the visual context around the panel, but moving focus will mean that their attention is directed inside the panel.

I apologize, I'm not very well versed in screen-reader tech at this time, but I'm planning to dive into it in preparation for work on this feature.

Always happy to help.

Flags: needinfo?(jteh)
Status: NEW → RESOLVED
Closed: 9 months ago
Resolution: --- → FIXED

(In reply to Erik Nordin [:nordzilla] from comment #8)

Apologies for not replying here, this got kinda lost in my mail folders.

Regarding: "You don't need the invisible box, you can pass positioning arguments to the relevant XUL panel opening methods (you don't need an element to anchor to, you can open a popup using just coordinates)."

I tried this approach first and I couldn't get it to work. In this particular case, I was trying to get it to work with PanelMultiView.openPopup().

I tried invoking it like this:

PanelMultiView.openPopup(panel, /* anchor */ null, {}, /* x */ 50, /* y */ 50);

Note that you're calling PanelMultiView.openPopup and that is very different from calling openPopupAtScreen as the context menu code is doing.

PanelMultiView has really only been used and tested anchored to buttons, but I don't think that a priori there is any reason it could not be made to work without (probably on an opt-in basis given none of the other consumers want this).

Perhaps you could confirm if I've misunderstood something or not, and if not, I'll file a bug that this behavior should be fixed.

If you're still interested in this, then yes filing a bug would be helpful.

Regarding: "from a personal perspective, having extra UI (the described "Call-to-action icon" appearing on mouseup) appear floating next to text selection for a feature I don't use seems really annoying. This is feedback we've had before for e.g. the PiP control on videos."

Do you have any links to specific negative feedback about this pattern? I think those would be worthwhile to read. I'm happy to go digging myself, but a link is welcome if you know exactly where it is or how to find it easily.

e.g. bug 1567136
https://support.mozilla.org/bm/questions/1332399
https://www.reddit.com/r/firefox/comments/zlw0ey/is_there_a_way_to_get_rid_popout_this_video/
https://www.reddit.com/r/firefox/comments/bfeh3w/pictureinpicture_unwanted_help/

To be clear, this is probably a minority of users, and we've made the icon less conspicuous since some of the reports (it's not bright blue anymore). But worth bearing in mind for UI that would (for many users) be encountered more often.


You closed this out, can you add a small comment about what your conclusion / next steps are, so that future finders of this bug understand some of the context? :-)

Flags: needinfo?(enordin)

(In reply to :Gijs (he/him) from comment #12)

You closed this out, can you add a small comment about what your conclusion / next steps are, so that future finders of this bug understand some of the context? :-)

Sure! Thanks for following up here.

Ultimately we've decided that the idea of implementing a shortcut that appears near to selected text (when relevant conditions are met for a supported translation) is possible and worth exploring.

However, we are implementing the Select Translations feature to be accessible via the right-click context menu first and foremost, likely with an initial release consisting of only that functionality, and a follow-up release that would enable the shortcut.


(In reply to Erik Nordin [:nordzilla] from comment #8)

Note that you're calling PanelMultiView.openPopup and that is very different from calling openPopupAtScreen as the context menu code is doing.

PanelMultiView has really only been used and tested anchored to buttons, but I don't think that a priori there is any reason it could not be made to work without (probably on an opt-in basis given none of the other consumers want this).

If you're still interested in this, then yes filing a bug would be helpful.

I'd like to file a bug for this. I think, at the very least, that the PanelMultiView.openPopup() documentation is a bit misleading because it implies to me that it should be able to forward the same set of ...args as other openPopup functions, which do allow for x and y to be specified.

However, I don't think that fixing this will be relevant to the implementation of Select Translations. It was relevant during my exploration, because as a path of least resistance, I was trying to open up the Translations Panel at a specific location using PanelMultiView, but in the actual implementation, the Select Translations panel will either be anchored to the context menu, or to the floating shortcut button. So there shouldn't be any need to open PanelMultiView at an x-y location for this feature.

Flags: needinfo?(enordin)