Declarative onLoadRequest
Categories
(GeckoView :: General, enhancement, P2)
Tracking
(firefox-esr91 wontfix, firefox99 wontfix, firefox100 wontfix, firefox101 wontfix)
People
(Reporter: agi, Unassigned)
References
(Blocks 1 open bug)
Details
(Whiteboard: [geckoview:2022h2?])
Attachments
(6 files, 3 obsolete files)
GeckoView uses onLoadRequest
to give Apps a chance to intercept network navigations and stop them when needed.
Common use cases include:
- handle network requests in the app directly
- offload special URLs to apps
By it's very nature, onLoadRequest
sits in the critical page load path, and needs to run on the UI thread. This means that for every page load the network thread is blocked until the UI thread can process the onLoadRequest
delegate call.
This can cause unnecessary performance issues when the UI thread is busy, e.g. because we are trying to paint the app for the first time.
The untested assumption of this bug is that the onLoadRequest
"allow" rules are simple enough that we could apply a static filter to onLoadRequest
that can be executed within the network thread directly to avoid the round trip unnecessarily.
For example, we could do have three possible values: DENY
(reject onLoadRequest
decoratively), ALLOW
and PROMPT
(call onLoadRequest
in the UI thread) and then a list of rules like:
{
host: ["abc.com", ... ], // this rule matches all hosts in the list
protocol: ["http", "https", "ftp"],
isRedirect: true, // matches redirects
path: ["/signin"], // matches path
value: ALLOW,
}
Presumably we would want the app to be able to add rules dynamically.
I'm opening this bug so we can discuss the feasibility of such a system and investigate what common use cases are.
Reporter | ||
Comment 1•3 years ago
|
||
Sebastian, could you or somebody in AC give us a little summary of what onLoadRequest
is used for in AC/Fenix? Do you think it would be feasible to build a declarative list of rules for most use cases (as in, that allows most page loads to go through without prompting the UI thread).
Reporter | ||
Comment 2•3 years ago
|
||
See also this performance bug that mcomella found: https://bugzilla.mozilla.org/show_bug.cgi?id=1734916
Comment 3•3 years ago
|
||
By it's very nature,
onLoadRequest
sits in the critical page load path, and needs to run on the UI thread. This means that for every page load the network thread is blocked until the UI thread can process theonLoadRequest
delegate call.
Where's the need for the UI thread is coming from? On our side none of the invoked code requires the UI thread. Is it somehow required by the internal messaging of GeckoView?
Sebastian, could you or somebody in AC give us a little summary of what
onLoadRequest
is used for in AC/Fenix?
In general there are two categories of code hooking into onLoadRequest
:
- (1) Code actually intercepting requests
- (2) Code only observing
LoadRequest
s and theisRedirect
/hasUserGesture
properties.
In group (1) there's for example:
- App Links: A lookup whether the request can/should be handled by an app
- Web Apps: An interceptor that launches the PWA when navigating to a URL that we have a web app installed for (But it looks like this is not active in Fenix)
- AMO: Fenix has an interceptor for AMO install URLs and redirects those to the internal addon manager UI
In group (2) there's for example:
- Search: Code that clears the "search terms" from the URL bar when certain properties are set
- Telemetry: search ad telemetry is based on looking at load requests / redirects
- Toolbar: Based on specific properties set in the
LoadRequest
we decide to show a hidden toolbar
Do you think it would be feasible to build a declarative list of rules for most use cases (as in, that allows most page loads to go through without prompting the UI thread).
For group (1):
- App Links: For every request we perform a (cached) lookup via the package manager. In order to create a list of rules we would need to query all packages, parse all intent filters and translate them into rules. That doesn't seem feasible. Instead, if the lookup would happen inside GV and the result exposed via an API, could that be faster?
- Web Apps: Doesn't seem to be used -> Ignore.
- AMO: Seems like a good candidate for a rule. It's already a (regex) rule that we implemented.
For group (2): We can't really transform them into a rule. But they also never intercept anything. So maybe instead it would be enough to notify us, but not wait for us?
Reporter | ||
Comment 4•3 years ago
•
|
||
(In reply to Sebastian Kaspari (:sebastian; :pocmo) from comment #3)
By it's very nature,
onLoadRequest
sits in the critical page load path, and needs to run on the UI thread. This means that for every page load the network thread is blocked until the UI thread can process theonLoadRequest
delegate call.Where's the need for the UI thread is coming from? On our side none of the invoked code requires the UI thread. Is it somehow required by the internal messaging of GeckoView?
It's not required by our messaging per se, it's mostly so that apps don't have to deal with multi-threaded code. Things can get complicated real fast if we start making delegates call on a random thread instead of the UI thread. We could make an exception for onLoadRequest
but I think we can do better.
Sebastian, could you or somebody in AC give us a little summary of what
onLoadRequest
is used for in AC/Fenix?In general there are two categories of code hooking into
onLoadRequest
:
For group (1):
- App Links: For every request we perform a (cached) lookup via the package manager. In order to create a list of rules we would need to query all packages, parse all intent filters and translate them into rules. That doesn't seem feasible. Instead, if the lookup would happen inside GV and the result exposed via an API, could that be faster?
Perhaps. We would still have to do the lookup which is the problem I think. So if I'm understanding correctly, these rules are random https
addresses? e.g. facebook says "I want to intercept any fb.com navigations"? And do you send an intent to see if a navigation would be intercepted, is that how it works?
- Web Apps: Doesn't seem to be used -> Ignore.
- AMO: Seems like a good candidate for a rule. It's already a (regex) rule that we implemented.
For group (2): We can't really transform them into a rule. But they also never intercept anything. So maybe instead it would be enough to notify us, but not wait for us?
I think group (2) should use onLocationChange
instead of onLoadRequest
. Until you get onLocationChange
we haven't really committed the navigation. We can likely add any missing information from onLoadRequest
(if any). But if that's not possible we could definitely have a non-blocking onLoadRequest
for this.
Updated•3 years ago
|
Comment 5•3 years ago
|
||
(In reply to Agi Sferro | :agi | ni? for questions | ⏰ PST | he/him from comment #4)
It's not required by our messaging per se, it's mostly so that apps don't have to deal with multi-threaded code. Things can get complicated real fast if we start making delegates call on a random thread instead of the UI thread.
Ah, okay, yeah, I can understand that. The code in A-C should be thread-safe. But that may not be true for every other consumer, I guess.
Perhaps. We would still have to do the lookup which is the problem I think. So if I'm understanding correctly, these rules are random
https
addresses? e.g. facebook says "I want to intercept any fb.com navigations"? And do you send an intent to see if a navigation would be intercepted, is that how it works?
Yeah, the app can define parts of a URI in its intent filters: <scheme>://<host>:<port>[<path>|<pathPrefix>|<pathPattern>]
, see: https://developer.android.com/guide/topics/manifest/data-element
I think group (2) should use
onLocationChange
instead ofonLoadRequest
. Until you getonLocationChange
we haven't really committed the navigation. We can likely add any missing information fromonLoadRequest
(if any). But if that's not possible we could definitely have a non-blockingonLoadRequest
for this.
I have to take a look at the code again. But I think some of it requires some of the additional attributes from LoadRequest
(isRedirect
, hasUserGesture
), that are not available from onLocationChange()
.
Reporter | ||
Comment 6•3 years ago
|
||
(In reply to Agi Sferro | :agi | ni? for questions | ⏰ PST | he/him from comment #4)
(In reply to Sebastian Kaspari (:sebastian; :pocmo) from comment #3)
Sebastian, could you or somebody in AC give us a little summary of what
onLoadRequest
is used for in AC/Fenix?In general there are two categories of code hooking into
onLoadRequest
:For group (1):
- App Links: For every request we perform a (cached) lookup via the package manager. In order to create a list of rules we would need to query all packages, parse all intent filters and translate them into rules. That doesn't seem feasible. Instead, if the lookup would happen inside GV and the result exposed via an API, could that be faster?
Perhaps. We would still have to do the lookup which is the problem I think.
One thing I'm realizing reading this again is that the lookup might be thread-safe, in which case we could do it in GeckoView and avoid the thread round-trip.
Updated•3 years ago
|
Depends on D133383
Depends on D133384
I pushed a prototype for a declarative onLoadRequest that covers the two cases where we actually need to ALLOW or DENY requests (from comment 3): AMO & app links. My primary goals with this prototype were:
- see if it was possible to cover these in GeckoView
- see how practical the suggested API from comment 0 was
A non-goal was making the code readable/robust enough to land and make it work for every edge case. I found 1) yes, it's possible (though I'm not sure what parts of an app link interception should be returned to the application) and 2) it's usable but clunky for app link's complex rules (it'd be less clunky with kotlin default arguments).
I'm going to look into a few additional things before I request feedback.
Side note: in the prototype, I ended up handling app link matchers as a separate case. However, I don't think that was strictly necessary for the logic in fenix.
For my prototype, I implemented the design from comment 0 and found some issues. With this bug, we're designing an API to handle boolean logic without letting the application run code. However, it's unfamiliar to write complex boolean logic (example: AppLinksInterceptor) using ALLOW/DENY blocks. Specifically, dependent clauses (A && B
) and inversions of matchers (!A
) are possible to write but unfamiliar and not as readable because they require multiple matchers where you may not see the dependency between them. For example:
// not(A), i.e. inversions. Block all requests to anything but apple.com
ALLOW requests to apple.com
DENY all requests
// A && B, i.e. dependent matchers. Allow only users to navigate in subframes AND
// only pages to change the main frame; block all other requests. e.g. for a
// multi-page game that doesn't want users to be able to navigate away themselves
ALLOW requests with user gestures && in sub frames
ALLOW requests without user gestures && not in sub frames
DENY all requests
When we introduce PROMPT
and TRY_APP_LINK
results, however, I think there are some things we can't express, like:
if (!A && !B && !C) {
tryAppLink()
} else if (D) {
allowRequest()
}
where we want to execute statements only if prior dependent matchers failed. I think it works okay with just ALLOW and DENY because they are opposites from each other (i.e. no overlap) but that changes with new results. I'm having difficult wrapping my head around this though.
Alternative
I saw our issues as being unable to elegantly express certain forms of boolean logic so it seemed reasonable to add familiar boolean operators to solve the problem. Here's an example API:
data class Matcher.Conditions( // same as Agi's suggestion
host,
protocol,
path,
requestProperties: Map<Property, Boolean> // replacement for isRedirect
)
// Example...
return Arrays.asList(
// Each top-level matcher in the list is evaluated independently.
// Example: block requests to apple.com
new Matcher(Result.DENY, new Matcher.Conditions(
".*apple\\.com", ".*", ".*", emptyMap
),
// Example: not(A). Block requests to everything but AMO
new Matcher(Result.DENY, not(new Matcher.Conditions(
".*addons\\.mozilla\\.org", ".*", ".*", emptyMap
),
// Example: A && B. Open another app only if it looks safe. Kind of from fenix
new Matcher(Result.TRY_APP_LINK, /* these varargs Conditions act as && rules */
not(new Matcher.Conditions(".*", ".*", ".*", mapOf(
// If a subframe request comes without a user gesture, we don't want to try
// to open an app link.
HAS_USER_GESTURE to false,
IS_SUBFRAME_REQUEST to true))),
not(new Matcher.Conditions(".*", geckoSupportedProtocols, ".*", mapOf(
// tbh, not sure I understand high level reason why this rule exists but it's
// in fenix. Maybe we can invert it to simplify it
HAS_USER_GESTURE to false,
IS_ALLOWED_REDIRECT to false,
IS_DIRECT_NAVIGATION to false)))
)
);
(or maybe varargs Conditions
should be changed to just one Condition
and we add an and(varargs Condition)
for clarity)
I think this lets us express that last case we weren't able to before and introduces more familiar boolean semantics to matching. However, it's also duplicating existing functionality and possibly introducing complexity. Will application's ever really need to express that last case?
:agi, what do you think? Which API do you think we should go with?
Reporter | ||
Comment 12•3 years ago
•
|
||
We chatted a little bit on matrix and I'm not sure if your comment was before or after so I might be repeating some points I made in chat.
I think we don't need TRY_APP_LINK
, the fact that we have an app registered for a given link can be a property of the matcher and then apps can decide whether to PROMPT
for those.
I also starting to think that having DENY
doesn't make a lot of sense. The majority of the DENY
use cases are for the app to do something else instead, so really the only options that we need are ALLOW
(if the app doesn't have to do anything about this load) or PROMPT
if the app needs to deny the load and do something else instead (or if it needs more information to check if the load should go through)
would that be enough to express our use cases? I would really hope we can avoid implementing a fully fledged boolean logic.
I wrote a proposal taking into account :agi's suggestions (it's not full boolean logic anymore 😌).
:agi, can you provide feedback on my proposal?
I confirmed my prototype improves performance. Here are profiles demonstrating the fix (look at the timeline to see how the android UI blocks the request in Before but not in After):
I recorded two side-by-side videos: the left side is mc without my prototype and the right side is that commit with my prototype. We see a surprisingly large improvement with the prototype for time until visual completeness on both apple.com (800ms) and amazon.com (1833ms to progress bar disappears, 1167ms to (mostly) visual completeness which happens before the progress bar disappears).
Reporter | ||
Comment 17•3 years ago
|
||
Discussed it offline, the doc looks pretty good.
The LoadRequestFilterAppLink.getIntentsIfAppLinkMatch method is a port of the
app links logic from ac -> GV of AppLinksUseCases.interceptedAppLinkRedirect
from commit 1931ec7d85d3a084d87db15da0707b51e4dd2a94. If some variables were
unchanging in our code path, they were hardcoded and any applicable logic was
removed (e.g. includeHttpAppLinks == false)
The LoadRequestFilterAppLinkTest class features most of the tests for
AppLinkUseCases verbatim: a shim layer was introduced to convert the GV return
types to ac types so the tests could be used exactly except for a handful of
cases where "change for GV" comments were added. Tests were removed under the
following conditions:
- they test variables we removed (e.g. launchInApp)
- they test other input values for GetAppLinkRedirect we removed
- they test functions we don't support (e.g. openAppLink)
- they test the cache we didn't implement
Depends on D137219
Depends on D137220
Depends on D137221
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
I'm handing off my WIP to the GeckoView team. The proposal document was reviewed so my approach seems okay from a high level.
I pushed my WIP code. The first commit adds the LoadRequestFilter class and can probably be reused. The second commit adds an API in GeckoSessionSettings for the app to set its LoadRequestFilters: it can probably be reused. The third commit integrates the declarative API with page load requests: this commit might be worth rewriting because it's sloppy. The fourth commit is a demonstration of this API in geckoview_example and probably shouldn't be landed. I wrote tests though they're not totally thorough so they should be revisited.
The LoadRequestFilterAppLink code was adapted from ac's AppLinksInterceptor logic, which I found to be really challenging to do. To minimize possibly added bugs, I used the ac tests verbatim with a shim layer: however, this runs on robolectric (to allow us to run verbatim) so I'm not sure we want to keep it. We should also see if there are more tests to write.
In addition to the "Unsolved challenges" in the proposal doc and the TODOs in the code, here is one other TODO item. In the current implementation, the app link code, i.e. the slow queryIntentActivities, is run for every LoadRequestFilter that declares MATCH_APP_LINK to true: this doesn't affect fenix because it only declares one but we may want to cache the results.
Keep in mind, to get the full benefits for this patch, we also need to make changes in fenix/ac that prevent us from running into the same root cause: posting to the UI thread.
I can continue to do the performance measurements and I'm around if you have questions. Thank you!
Updated•3 years ago
|
Comment 23•3 years ago
|
||
This bug blocks bug 1734916. Fixing that bug will require some A-C and Fenix work to use this new API.
This bug is related to, but not dependent, upon AppLinksInterceptor bug 1750890
Comment 24•3 years ago
|
||
Agi estimates fixing this bug would take a month, so we don't have time this release.
mcomella says this could be a big perf improvement, so we should consider this bug for GV's H2 planning.
Comment 25•3 years ago
|
||
Tentatively assigning this bug to Irene because she said she'd like to implement it in Q2 (so GV m102 or m103).
Comment 26•3 years ago
|
||
Not in GV 102
Comment 28•3 years ago
|
||
Irene will start work in 103, but it might not land in 103.
Comment 29•3 years ago
|
||
Nominated for 105 according to the planning we have done
Updated•3 years ago
|
Updated•3 years ago
|
Description
•