Open Bug 1403295 Opened 7 years ago Updated 2 years ago

Allow commands to call window actions such as "navigation reload"

Categories

(WebExtensions :: General, enhancement, P3)

enhancement

Tracking

(Not tracked)

UNCONFIRMED

People

(Reporter: koushien, Unassigned)

References

(Depends on 3 open bugs, Blocks 2 open bugs)

Details

As suggested in the September 26 triage meeting linked below, we split bug 1215061 into two separate patches. This bug is for the keyboard shortcut API. The experiment is currently located at https://github.com/Koushien/keyboard-shortcut-api. This API sets off actions possible with Firefox's native shortcuts.

From the meeting outcome, this patch is in need of a mentor as we write tests and land in Mozilla Central.

Triage: https://docs.google.com/document/d/1pw5y-GHwDLPV9bYK4HWCiZtslqFtAeL3G9bC4ZDbdjs/edit
Flags: needinfo?(amckay)
"Implementing a keyboard shortcut API" is kind of a vague bug title; we've already got a keyboard shortcut API, called commands. I think this bug and experiment is to allow command shortcuts to trigger actions in the browser, such as "navigation forward" or "navigation reload override cache". I tried to change the title on this to be a bit clearer.

Leaving the needinfo on me to find a mentor.
Blocks: 1215061
Summary: Implementing a keyboard shortcut API → Allow commands to call window actions such as "navigation reload"
> I think this bug and experiment is to allow command shortcuts to trigger actions in the browser

The intention is that these functions are usable outside the commands API in general webextension code.

I might characterise it as "Allow webextensions to trigger actions in the window, such as navigation reload, select URL bar". "Actions" is terribly vague, but I can't think of anything better right now.

The vagueness is probably because this API is designed to match the existing firefox shortcuts, rather than some more, uh, semantically-related collection of functionality.

> Leaving the needinfo on me to find a mentor.

Thanks :)
(In reply to Colin Caine from comment #2)
> The intention is that these functions are usable outside the commands API in
> general webextension code.

Well that's interesting because we might not want to do that. Commands are a user interaction and we try to ensure that things that change the UI are part of user interaction. As an example (probably a bad one), we've said no to the ability to open a context menu programmatically.

I haven't read through all the events you intend to trigger, but I imagine we'd want to think about that.
Flags: needinfo?(amckay)
It'd certainly have to be a case-by-case thing. I wouldn't want my extensions being able to focus the address bar outside of a user-triggered event, but things like "navigation reload" are already available without user input via a more awkward content script-based solution.
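
For reference, the content-script route looks roughly like the sketch below. `tabs.executeScript` is the existing WebExtension call; the `tabs` parameter (standing in for `browser.tabs`) and the helper name `reloadViaContentScript` are assumptions made so the snippet can be run with a stub outside the browser.

```javascript
// Sketch: reload a tab by injecting a one-line content script.
// `tabs` stands in for `browser.tabs`; passing it in is an assumption of
// this sketch so the function can be exercised with a stub.
async function reloadViaContentScript(tabs, tabId) {
  try {
    await tabs.executeScript(tabId, { code: "location.reload();" });
    return true;
  } catch (e) {
    // Injection is refused on restricted pages, where content scripts
    // are not allowed to run.
    return false;
  }
}
```

In an extension this would be called as `reloadViaContentScript(browser.tabs, tab.id)`; it returns `false` wherever the script can't be injected.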
Well, shall we try and work out what you're happy to have called programmatically?

To help me participate in the discussion, I'd appreciate it if you would clarify what your threat model is: what are you aiming to defend against? My guess is nuisance or badly-written add-ons giving firefox a bad name, but I'd rather not guess :)

Here's a list of functions in the experiment broken down into logical sections for our review:

# Functions needed because content scripts can't be on restricted pages

 - History navigation (forward/back/reload/home/stop)
 - Scrolling
 - Print page
 - Focus frame
 - Edit commands (copy/cut/delete/paste/redo/undo/selectall)

# Functions that provide new capabilities

 - Tab undelete
 - Window undelete
 - Zoom (Maybe? Depends if this is nicer than setting css zoom)
 - toggle full screen
 - toggle caret browsing
 - select location bar
 - focus tab content (not implemented, maybe not a firefox shortcut either, but very useful)

# Functions that replicate functionality from other webext APIs

 - Most (all?) of the other tab and window functions

# Functions that trigger UI elements

 - searchbar/quickfind/quickfind links only (missing from find API for some reason?)
 - history sidebar/library window/clear recent history dialog
 - bookmark this page
 - bookmark sidebar/library
 - devtools panes
 - open file/save page dialogs

# Functions that navigate to particular URLs

 - addons, download
 - toggle reader mode

If the commands can't be called programmatically then the functions should be registerable against a key-sequence and state that the keyboard API will then listen for, as well as to the commands API.

I'm not enthusiastic about the registered functions pattern because:

 1. It breaks the powerful and very useful idea common to vim, vimperator, emacs, etc. that anything you do in one mode you can do programmatically
 	- e.g. `:autocmd LocationChange google.com :normal 10j`
 2. It complicates the implementation of the keyboard API
 3. It complicates and constrains the design decisions of UI addons that want to use the API
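
To make the comparison concrete, here is a rough sketch of what the registered functions pattern could look like. Nothing here is an existing or proposed WebExtension API: `ShortcutRegistry`, `register`, `handleKeys` and the action names are all hypothetical.

```javascript
// Hypothetical registered-functions pattern: extension code may only bind a
// key sequence to a named action; the browser side performs the action itself.
class ShortcutRegistry {
  constructor(actions) {
    this.actions = actions;     // Map of action name -> browser-side function
    this.bindings = new Map();  // key sequence -> action name
  }

  // Called by the extension. It never receives a handle to the action.
  register(keys, actionName) {
    if (!this.actions.has(actionName)) {
      throw new Error(`unknown action: ${actionName}`);
    }
    this.bindings.set(keys, actionName);
  }

  // Called by the browser when the user really typed `keys`.
  handleKeys(keys) {
    const actionName = this.bindings.get(keys);
    if (actionName === undefined) return false;
    this.actions.get(actionName)();
    return true;
  }
}

// Usage with stand-in actions that just record what ran.
const ran = [];
const registry = new ShortcutRegistry(new Map([
  ["historyBack", () => ran.push("historyBack")],
  ["focusLocationBar", () => ran.push("focusLocationBar")],
]));
registry.register("H", "historyBack");
registry.handleKeys("H");
```

In this shape extension code has no way to call `historyBack` directly, repeat it, or branch on its result; anything like that has to be built into the registry itself.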
Severity: normal → enhancement
Priority: -- → P3
(In reply to Colin Caine from comment #2)
> The intention is that these functions are usable outside the commands API in
> general webextension code.

If that's the case, we just need to go through these one-by-one.  I think in many cases we'll want to limit actions only to be invokable while handling user input (ie, not let an extension just take these actions from a background page when the user isn't explicitly interacting with the extension).  But before we get into that, I'm not sold on that approach...

(In reply to Colin Caine from comment #5)
> If the commands can't be called programmatically then the functions should
> be registerable against a key-sequence and state that the keyboard API
> will then listen for, as well as to the commands API.
> 
> I'm not enthusiastic about the registered functions pattern because:
> 
>  1. It breaks the powerful and very useful idea common to vim, vimperator,
> emacs, etc. that anything you do in one mode you can do programmatically
>  	- e.g. `:autocmd LocationChange google.com :normal 10j`
>  2. It complicates the implementation of the keyboard API
>  3. It complicates and constrains the design decisions of UI addons that
> want to use the API

I don't really understand these last 3 points.  In fact I think using registered functions actually simplifies extensions that are going to use this API -- if these are general purpose functions each will have its own associated permissions, calling signature, etc.  With registered functions, the permissions model can be much simpler.  To be more specific, giving an extension the ability to associate some action (eg focus the location bar) with a keyboard shortcut, is much simpler, less prone to abuse, etc. than giving an extension the ability to actually take that action programmatically.

Can you explain your issues with the registered functions approach in a little more detail, perhaps with specific examples?

> To help me participate in the discussion, I'd appreciate it if you would
> clarify what your threat model is: what are you aiming to defend against? My
> guess is nuisance or badly-written add-ons giving firefox a bad name, but
> I'd rather not guess :)

Getting this thoroughly documented is a work in progress but see:
https://webextensions-experiments.readthedocs.io/en/latest/new.html#new
Thank you for your response, Andrew. Hopefully we'll chat about this sometime in your morning tomorrow.

(In reply to Andrew Swan from comment #6)
> I don't really understand these last 3 points.  In fact I think using registered functions actually simplifies extensions that are going to use this API -- if these are general purpose functions each will have its own associated permissions, calling signature, etc.  With registered functions, the permissions model can be much simpler.  To be more specific, giving an extension the ability to associate some action (eg focus the location bar) with a keyboard shortcut, is much simpler, less prone to abuse, etc. than giving an extension the ability to actually take that action programmatically.

The problems come when you want to do something a little more complicated. Here are some examples that are relevant in a vim-like interface.

Let's say I want to map `H` to history back. Given that `H` goes back one page, as a vim user I expect the following to work:

 1. `5H`: goes back 5 pages
 2. `.`: repeats the last command
 3. `:map x H'af`
 	- a quick map to help navigate a book; imagine navigating to and from a table of contents
 	- This says "When I press `x` in the future, do the actions I expect if I were to press `H'af`"
 	- `'` is go-to-mark, `a` is the name of the mark to go to, `f` is start hint mode[1]
 	- So that means, when I press `x`, go back one page, jump to a predefined scroll point, and let me quickly choose the next link
 4. `:autocmd LocationChange http://book.com/badpage normal x`
 	- A contrived autocmd, when I navigate to some URL, do what I'd expect to happen if I pressed `x` in normal mode (in this case, our map above)
 5. `:command GoBackAndHint historyback | open #toc | zoom +10 | hint --anchorsOnly`
 	- define a custom command along the same lines as the map
 	- commands are called by typing `:` to bring up the command line, then typing them
 6. `:command MyHistoryBack -js { try { await historyback() } catch (e) { urlup() } }`
 	- a custom command that tries to go back one page, and if it can't, it goes up one level in the url path (so example.com/foo/bar -> example.com/foo)

These examples are invented, but vimperator and vim users really do use all of these features. I would argue that one of the main reasons for using vim is that you can program it quickly like this. Likewise with the interfaces that I want to support for firefox.

1-3 could all be done with sufficient flexibility in the registered function system, but that would mean WebExt developers have to delegate the implementation of their key-sequence parser to the API, rather than being free to implement whatever they like.
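
As a rough sketch of how examples 1 and 2 fall out when the action is an ordinary callable, the parser below handles a count prefix and `.` repeat in a few lines of extension code. `makeParser` and the stand-in `H` action are hypothetical; a real extension would call whatever history-back function this API ends up exposing.

```javascript
// Minimal vim-style key-sequence parser: an optional count prefix, and `.`
// to repeat the last command with its original count.
// `maps` is a Map from a key to a function taking the count.
function makeParser(maps) {
  let countBuf = "";
  let last = null; // { key, count } of the last executed command

  return function feed(key) {
    // Digits build up the count; a leading "0" is not treated as a count.
    if (/^[0-9]$/.test(key) && !(key === "0" && countBuf === "")) {
      countBuf += key;
      return;
    }
    const count = countBuf === "" ? 1 : parseInt(countBuf, 10);
    countBuf = "";

    if (key === ".") {
      if (last) maps.get(last.key)(last.count);
      return;
    }
    const action = maps.get(key);
    if (action) {
      action(count);
      last = { key, count };
    }
  };
}

// Stand-in for history navigation: a position in a list of pages.
let position = 10;
const feed = makeParser(new Map([["H", (n) => { position -= n; }]]));

for (const k of "5H") feed(k); // back five pages
feed(".");                     // repeat: back five more
// position is now 0
```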

A minor point is that this also makes it quite fiddly if you want to act on the return value of one of these registered functions, or if you want to add other extra conditions or logic around the call (see 6).
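
Example 6 written out as plain extension code might look like the sketch below. `historyBack` and `navigate` are hypothetical callables passed in by the caller (this bug has not settled on real names), and `urlUp` is a helper invented for the sketch.

```javascript
// Go up one level in the URL path, dropping any query string and fragment:
// https://example.com/foo/bar -> https://example.com/foo
function urlUp(href) {
  const url = new URL(href);
  const parts = url.pathname.split("/").filter(Boolean);
  parts.pop();
  url.pathname = "/" + parts.join("/");
  url.search = "";
  url.hash = "";
  return url.href;
}

// Try to go back one page; if that fails, go up one level instead.
// `historyBack` is assumed to reject when there is nothing to go back to.
async function myHistoryBack(historyBack, navigate, currentHref) {
  try {
    await historyBack();
  } catch (e) {
    await navigate(urlUp(currentHref));
  }
}
```

With registered functions there is no return value or rejection for the extension to act on, so the fallback branch has nowhere to live.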

4 could be done if the historyback function can be called in the appropriate handlers (but see caveat below)

I don't see how to realistically do 5 without being able to call historyback and zoom from arbitrary contexts.
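
A sketch of why example 5 wants plain callables: the custom command is just a list of steps run in order. `parseCommandLine`, `runCommandLine` and the command table are hypothetical; in a real extension the table entries would be the history, zoom and hint functions.

```javascript
// Split "a | b x | c y z" into [{ name, args }, ...].
function parseCommandLine(line) {
  return line.split("|").map((part) => {
    const [name, ...args] = part.trim().split(/\s+/);
    return { name, args };
  });
}

// Run each step in order, waiting for one to finish before starting the next.
// `commands` is a Map from command name to a (possibly async) function.
async function runCommandLine(line, commands) {
  for (const { name, args } of parseCommandLine(line)) {
    const fn = commands.get(name);
    if (!fn) throw new Error(`unknown command: ${name}`);
    await fn(...args);
  }
}
```

`runCommandLine("historyback | open #toc | zoom +10 | hint --anchorsOnly", commands)` then calls the four functions in sequence, which only works if each of them can be invoked from ordinary extension code.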


> I think in many cases we'll want to limit actions only to be invokable while handling user input

The webextension environment is full of async. If one can only invoke these functions in a serial context then they'll be really difficult to use, just like keyevent.preventDefault() is now.
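
To illustrate the concern, here is a deliberately simplified model of a "only while handling user input" check. It is not Firefox's actual implementation; `handlingUserInput`, `restrictedAction` and `dispatchUserInput` are invented for the sketch.

```javascript
// Simplified model, not Firefox's actual implementation: a restricted action
// is only allowed while a user-input handler is running synchronously.
let handlingUserInput = false;

function restrictedAction(log) {
  if (!handlingUserInput) throw new Error("requires user input");
  log.push("action ran");
}

// The "browser" calls handlers through this wrapper.
function dispatchUserInput(handler) {
  handlingUserInput = true;
  try {
    return handler();
  } finally {
    handlingUserInput = false; // cleared as soon as the handler returns
  }
}

const log = [];

// A synchronous handler is fine.
dispatchUserInput(() => restrictedAction(log));

// An async handler returns at its first await, so the flag is already
// cleared by the time the action is attempted.
const pending = dispatchUserInput(async () => {
  await Promise.resolve(); // stands in for e.g. a storage or tabs lookup
  try {
    restrictedAction(log);
  } catch (e) {
    log.push(e.message);
  }
});
```

Once `pending` settles, `log` holds `"action ran"` followed by `"requires user input"`: the same action succeeds synchronously and fails after a single `await`.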

[1]: https://youtu.be/t67Sn0RGK54?t=25
Updated list of functions from comment #5 (better categories):

# Functions needed because content scripts can't be on restricted pages

 - History navigation (forward/back/reload/home/stop)
 - Scrolling
 - Print page
 - Focus frame
 - Edit commands (copy/cut/delete/paste/redo/undo/selectall)
 - Zoom

# Functions to help interact with web content

 - Zoom (Maybe? Depends if this is nicer than setting css zoom)
 - toggle caret browsing
 - toggle reader mode

# Functions to help navigate the web

 - select location bar
 - Focus tab content (not implemented, maybe not a firefox shortcut either, but really useful for reversing the option above - otherwise you can't easily leave the location bar without clicking or loading a page)

# Functions that open new UI elements + fullscreen

 - toggle full screen
 - searchbar/quickfind/quickfind links only (last one missing from find API for some reason?)
 - history sidebar/library window/clear recent history dialog
 - bookmark this page
 - bookmark sidebar/library
 - devtools panes
 - open file/save page dialogs

# Functions that navigate to particular URLs

 - addons, download
 - toggle reader mode

# Functions that replicate functionality from other webext APIs and should probably be dropped

 - Most (all?) of the tab and window functions
Oh, missed a few more duplicates:

tabs API has setZoom, print, and printPreview; so this API doesn't need them.
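
For example, zooming from a background script with the existing tabs API might look like the sketch below. `tabs.getZoom` and `tabs.setZoom` are existing calls; the `tabs` parameter (standing in for `browser.tabs`) and the helper name `zoomBy` are assumptions for illustration.

```javascript
// Sketch: change a tab's zoom factor relative to its current value.
// `tabs` stands in for `browser.tabs` and is passed in so the function can be
// exercised with a stub; no range checking is done here.
async function zoomBy(tabs, tabId, step) {
  const current = await tabs.getZoom(tabId);
  const next = current + step;
  await tabs.setZoom(tabId, next);
  return next;
}
```

In an extension this would be called as `zoomBy(browser.tabs, tab.id, 0.1)`.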

tabs API also has saveAsPDF, but no open or save HTML functions

Interestingly, three of those functions open UI elements, but they're all callable from a background script.
(In reply to Colin Caine from comment #8)
> Updated list of functions from comment #5 (better categories):
> 
> # Functions to help interact with web content
> 
>  - toggle reader mode
>

This is already possible: tabs.toggleReaderMode
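
A minimal sketch of calling it: `tabs.get`, `tabs.toggleReaderMode(tabId)` and the `isArticle` tab property are part of the tabs API, while the `tabs` parameter (standing in for `browser.tabs`) and the helper name `toggleReaderIfArticle` are assumptions for illustration.

```javascript
// Sketch: toggle reader mode only when the tab can be rendered as an article.
// `tabs` stands in for `browser.tabs` and is passed in so the function can be
// exercised with a stub.
async function toggleReaderIfArticle(tabs, tabId) {
  const tab = await tabs.get(tabId);
  if (!tab.isArticle) return false;
  await tabs.toggleReaderMode(tabId);
  return true;
}
```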
 
> # Functions that navigate to particular URLs
> 
>  - addons, download
>  - toggle reader mode
>

See above.
Ah, that's undocumented and not available in beta or developer yet. Thanks for mentioning it!

Is there somewhere I can subscribe to to hear about changes to the webext APIs?
(In reply to Colin Caine from comment #11)
> Ah, that's undocumented and not available in beta or developer yet. Thanks
> for mentioning it!
> 
> Is there somewhere I can subscribe to to hear about changes to the webext
> APIs?

New APIs don't tend to get documented on MDN until close to release. I'm not sure there's anywhere to subscribe to hear about changes, other than following some of our Bugzilla components, but that can get very noisy.

I would suggest you check our most recently landed code at http://searchfox.org/mozilla-central/source/browser/components/extensions and http://searchfox.org/mozilla-central/source/toolkit/components/extensions when you are looking to see which API methods and events are currently implemented.
Depends on: 1411724
Blocks: 1411729
Depends on: 1411756
The new list of functions with attached bugs looks like this:

# Functions needed because content scripts can't be on restricted pages

 - Bug 1411724 - scrolling and history navigation
     - History navigation (forward/back/reload/home/stop)
     - Scrolling
     - Focus frame
     - Edit commands (copy/cut/delete/paste/redo/undo/selectall)

# Functions to help interact with web content

 - Bug 1411729 - toggleCaretBrowsing

# Functions to help navigate the web

 - Bug 1295400 - focusLocationBar
     - select location bar
     - Focus tab content (not implemented, maybe not a firefox shortcut either, but really useful for reversing the option above - otherwise you can't easily leave the location bar without clicking or loading a page)

# Functions that open new UI elements + fullscreen

 - Bug 1388479 - fullScreen

 - Bug 1411756 - open UI elements
     - searchbar/quickfind/quickfind links only (last one missing from find API for some reason?)
     - history sidebar/library window/clear recent history dialog
     - bookmark this page
     - bookmark sidebar/library
     - devtools panes
     - open file/save page dialogs

# Functions that navigate to particular URLs

 - Bug 1371793 - open about:* pages
     - addons, download
Depends on: 1269456, 1388479
Depends on: 1411227
No longer depends on: 1388479
(In reply to Andy McKay [:andym] from comment #3)
> (In reply to Colin Caine from comment #2)
> > The intention is that these functions are usable outside the commands API in
> > general webextension code.
> 
> Well that's interesting because we might not want to do that. Commands are a
> user interaction and we try to ensure that things that change the UI are
> part of user interaction. As an example (probably a bad one), we've said no
> to the ability to open a context menu programmatically.
> 
> I haven't read through all the events you intend to trigger, but I imagine
> we'd want to think about that.

Why?
What is the fix for my Full Screen Firefox extension? Add a permission? What permission?
https://addons.mozilla.org/en-US/firefox/addon/full-screen-for-firefox/
In Google Chrome the code works just fine with no extra permission (clicking the Full Screen button is a user gesture)!
Product: Toolkit → WebExtensions
Severity: normal → S3