Should stop exposing raw URL from `IMEStateManager` and `nsAccUtils::DocumentURL` to native apps
Categories
(Core :: General, enhancement, P2)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox107 | --- | fixed |
People
(Reporter: masayuki, Assigned: masayuki)
Details
Attachments
(2 files)
Document URI may contain:
- user/password the user-pass section before
@followed by domain - user name in the path (e.g, jsfiddle)
- some private data in the query (e.g., search part) and/or the reference part (e.g,
/#/hack)
For the main purpose of exposing URL from IMEStateManager (for native IME) and nsAccUtils::DocumentURL (not entirely sure for what used in the wild), I think that at least, the user-password section, the query section and the reference section shouldn't be exposed. I'm not sure about the path section because it may split authors in the old web sites.
BasePrincipal::GetExposableSpec does this what I explained above:
https://searchfox.org/mozilla-central/rev/560b7b1b174ed36912b969eee0c1920f3c59bc56/caps/BasePrincipal.cpp#878-893
nsIOService::CreateExposableURI removes only the user-pass section:
https://searchfox.org/mozilla-central/rev/560b7b1b174ed36912b969eee0c1920f3c59bc56/netwerk/base/nsIOService.cpp#1063-1084
I believe that they should expose same URI because leaking something from either once makes native apps may try to use the one even if it's not a proper solution. Currently, IMEStateManager exposes only http and https URI for avoiding copying to big data (e.g., data URI and blob URI) and avoiding unnecessary data for IME (chrome URI, file URI, etc). If it's reasonable for a11y module, I think that a11y module should stop exposing such URIs.
Even if a11y requires to expose non-http and non-https URIs, the sanitizer should be in a common place.
Jamie: WDYT about this issue?
Comment 1•3 years ago
|
||
There are three primary use cases I'm aware of for a11y exposure of URL:
- To enable a screen reader to implement a command to quickly report the current URL to the user, rather than having to focus the address bar and then return focus to the page.
- To enable a screen reader to maintain the document position for a page so that navigating back and forth will return the user to their previous position. The scroll position and focus aren't accurate enough for this, since the screen reader's cursor needs to be restored to the exact spot in the text where the user was previously located.
- This requires the most information, since a change in path or hash reference effectively means a different document or position in that document. If we didn't include the hash reference, the screen reader would override a new anchor navigation with the user's last position, which would not be desirable. Consider opening a file in Searchfox, closing it, then opening the same file with a link pointing to a different line number.
- To enable a screen reader user to customise behaviour for a specific website.
- This probably requires the least specific URL.
None of these require the username and password to be included. We should almost certainly strip that.
Strictly speaking, all of these only require the URL to be exposed for the top level document.
As noted above, I don't think we could reasonably strip any paths or the hash reference without breaking the second use case.
It's reasonable that a user might have a large document (e.g. software user guide, book) stored locally, so use case 2) ideally needs us to expose file URIs.
I guess it's far less likely that a large top level document would be provided in a data or blob URI, so we probably don't need to expose those.
| Assignee | ||
Comment 2•3 years ago
|
||
Thank you for the quick reply!
since a change in path or hash reference effectively means a different document or position in that document. If we didn't include the hash reference, the screen reader would override a new anchor navigation with the user's last position, which would not be desirable.
Sounds reasonable to expose the path section and the reference section for screen readers.
None of these require the username and password to be included. We should almost certainly strip that.
Although it's rarely used in these days, but obviously it shouldn't be exposed.
I guess it's far less likely that a large top level document would be provided in a data or blob URI, so we probably don't need to expose those.
Yeah, I think so. When I try to write an addon to collect images which save them with using data URI, sending data URI to parent process made the parent process slower (probably caused by memory fragmentation). So, if it's possible, it's better to stop sending such long URI from content process to the main process.
Anyway, sounds like that I should give it up to use exactly same behavior between a11y module and IME handlers.
Comment 3•3 years ago
|
||
(In reply to Masayuki Nakano [:masayuki] (he/him)(JST, +0900) from comment #2)
When I try to write an addon to collect images which save them with using data URI, sending data URI to parent process made the parent process slower (probably caused by memory fragmentation). So, if it's possible, it's better to stop sending such long URI from content process to the main process.
Note that with the new caching architecture enabled, a11y uses CanonicalBrowsingContext::GetCurrentURI for remote documents. I'm not sure whether that includes long data/blob URIs or not, but if it does, that's a change that would need to be made in DOM.
| Assignee | ||
Comment 4•3 years ago
|
||
Thanks, according to the purpose of the API design of CanonicalBrowsingContext::GetCurrentURI, I guess that it needs to handle such big URIs. However, once accessible stops handling it when the scheme is data or blob, we can avoid multiple big heap allocations. E.g., it calls nsIURI::GetSpec may allocate same size string as nsStandardURL::mSpec (could be just managed with refcount, though), then, converting it to nsString (at this point, allocation occurs if over 64bytes).
| Assignee | ||
Comment 5•3 years ago
|
||
| Assignee | ||
Comment 6•3 years ago
|
||
Non-malicious IME (and text services) must not require private things in URL,
such as username and password, query data and reference in the document.
Therefore, we should omit them from URI exposed from our native IME handlers in
any platforms.
| Assignee | ||
Comment 7•3 years ago
|
||
We shouldn't expose the user-pass section of document URI to another apps since
it's sensitive data, but non-malicious a11y API users must not need it.
Additionally, data and blob URI may be too big (e.g., can be some mega
bytes) and they may need to allocate too big string multiple times in the heap
per URI. Therefore, the accessibility module should stop handling URI as-is if
the document URI is one of them.
Depends on D158734
Comment 9•3 years ago
|
||
| bugherder | ||
https://hg.mozilla.org/mozilla-central/rev/6040ed946f26
https://hg.mozilla.org/mozilla-central/rev/119144758463
Description
•