Closed
Bug 1330454
Opened 9 years ago
Closed 8 years ago
Implement baseDomain extraction in Fennec
Categories
(Firefox for Android Graveyard :: General, defect, P3)
Firefox for Android Graveyard
General
Tracking
(firefox53 affected)
RESOLVED
WORKSFORME
Tracking | Status | |
---|---|---|
firefox53 | --- | affected |
People
(Reporter: ahunt, Unassigned)
References
Details
(Whiteboard: [MobileAS])
Attachments
(5 files, 1 obsolete file)
59 bytes,
text/x-review-board-request
|
sebastian
:
review+
|
Details |
59 bytes,
text/x-review-board-request
|
sebastian
:
review+
sebastian
:
feedback+
|
Details |
59 bytes,
text/x-review-board-request
|
sebastian
:
review+
|
Details |
59 bytes,
text/x-review-board-request
|
sebastian
:
review+
|
Details |
59 bytes,
text/x-review-board-request
|
sebastian
:
review+
|
Details |
We currently don't have a full-featured domain extraction implementation in Fennec. We want to be able to get the actual site domain, i.e. things like:
mozilla.org -> mozilla.org
www.mozilla.org -> mozilla.org
www.mobile.mozilla.org -> mozilla.org
etc.
We have a basic implementation that strips www/mobile, but that's incomplete, and ideally we want to always get the base domain:
https://dxr.mozilla.org/mozilla-central/source/mobile/android/geckoview/src/main/java/org/mozilla/gecko/util/StringUtils.java#129
Gecko does have an implementation, but that's not too useful for UI and DB code:
https://developer.mozilla.org/en-US/docs/Mozilla/Tech/XPCOM/Reference/Interface/nsIEffectiveTLDService#getBaseDomain()
It seems simplest to port the iOS implementation of host parsing:
https://github.com/mozilla-mobile/firefox-ios/blob/f65be4372e20d284aa1fc340159597db97194f8f/Utils/Extensions/NSURLExtensions.swift#L297
We already ship the required TLD lists, which are parsed in PublicSuffixPatterns.
Reporter | ||
Comment 1•9 years ago
|
||
I initially want this for the new domains table, see e.g.:
https://bugzilla.mozilla.org/show_bug.cgi?id=1319485#c17
It would be good to replace existing uses of StringUtils.stripCommonSubdomains() with this new implementation, after some careful testing, but I'll leave that for another bug. I wonder if this would need overrides though, e.g. users might expect mail.google.com and google.com to be distinct.
Reporter | ||
Comment 2•9 years ago
|
||
On second thought, we might not need this:
https://bugzilla.mozilla.org/show_bug.cgi?id=1319485#c22
I'll reopen this bug if needed, but its superfluous for now.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
Comment 3•9 years ago
|
||
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Reporter | ||
Comment 6•9 years ago
|
||
I've pushed my WIP patches just in case we need them, we might need this code in Bug 1319485, depending on how we want the domains table to look.
Reporter | ||
Updated•9 years ago
|
Assignee: nobody → ahunt
Status: RESOLVED → REOPENED
Iteration: --- → 1.14
Priority: -- → P1
Resolution: WONTFIX → ---
Whiteboard: [MobileAS]
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Reporter | ||
Updated•9 years ago
|
Attachment #8826697 -
Flags: review?(s.kaspari)
Attachment #8826699 -
Flags: review?(s.kaspari)
Attachment #8832246 -
Flags: review?(s.kaspari)
Comment 13•9 years ago
|
||
mozreview-review |
Comment on attachment 8826697 [details]
Bug 1330454 - Introduce DomainProcessor to split domains
https://reviewboard.mozilla.org/r/104596/#review110232
::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/util/publicsuffix/DomainProcessor.java:125
(Diff revision 3)
> + public static DomainProcessor fromURL(final Context context, final String url) {
> + String host;
> + try {
> + host = new URL(url).getHost();
> + } catch (MalformedURLException e) {
> + host = url;
> + }
> +
> + return new DomainProcessor(context, host);
> + }
> +
> + public static DomainProcessor fromHost(final Context context, final String host) {
> + return new DomainProcessor(context, host);
> + }
Using those methods will later access PublicSuffixPatterns.EXACT and we will read a file from disk (once). This can take some time and potentially block the UI thread. So either transform this into callback based methods or make sure to add annotations and an assetion using ThreadUtils.
Attachment #8826697 -
Flags: review?(s.kaspari) → review+
Comment 14•9 years ago
|
||
mozreview-review |
Comment on attachment 8826699 [details]
Bug 1330454 - Migrate domain prefix/suffix stripping to use DomainProcessor
https://reviewboard.mozilla.org/r/104598/#review110234
Attachment #8826699 -
Flags: review?(s.kaspari) → review+
Comment 15•9 years ago
|
||
mozreview-review |
Comment on attachment 8832246 [details]
Bug 1330454 - Remove deprecated subdomain/suffix stripping code
https://reviewboard.mozilla.org/r/108582/#review110236
Attachment #8832246 -
Flags: review?(s.kaspari) → review+
Comment 16•9 years ago
|
||
mozreview-review |
Comment on attachment 8826697 [details]
Bug 1330454 - Introduce DomainProcessor to split domains
https://reviewboard.mozilla.org/r/104596/#review110238
::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/util/publicsuffix/DomainProcessor.java:55
(Diff revision 3)
> + while (i < prefixes.length) {
> + final String prefix = prefixes[i];
> +
> + if (host.regionMatches(true, offset, prefix, 0, prefix.length())) {
> + offset += prefix.length();
> + // reset the iterator, as described above
> + i = 0;
> + } else {
> + i++;
> + }
> + }
I guess whenever we end up with a public suffix + one segment then we should stop too:
m.www.mobile.com -> stop at mobile.com
However this would make this a bit more expensive and shouldn't be a synchronous call anymore.
Comment 17•9 years ago
|
||
mozreview-review |
Comment on attachment 8832247 [details]
Bug 1330454 - Pre: move stripCommonSubdomains into (package-private) DomainProcessorUtils
https://reviewboard.mozilla.org/r/108584/#review110240
Attachment #8832247 -
Flags: review+
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Reporter | ||
Comment 24•8 years ago
|
||
I decided to make DomainProcessor calls background-thread only.
Some of the clients already use a background thread, and callbacks don't really work well for some of the other locations (e.g. it would be hard to rewrite BrowserSearch to use callbacks, but pushing all of the BrowserSearch code into a background thread does work well). That required some rework of the homepanel context menu - but overall that seems to result in a slightly better UI, so this is probably useful overall.
(Additional patches incoming via reviewboard - I'll try to squash them down after review.)
Reporter | ||
Updated•8 years ago
|
Attachment #8834147 -
Flags: review?(s.kaspari)
Updated•8 years ago
|
Iteration: 1.14 → 1.15
Comment 25•8 years ago
|
||
mozreview-review |
Comment on attachment 8834147 [details]
Bug 1330454 - Allow immediate PublicSuffix access
https://reviewboard.mozilla.org/r/110188/#review112024
Mh. This adds some AsyncTasks and runs code that we normally wouldn't need here. I wonder if we should step back and re-think whether the DomainProcessor is the right approach. Maybe we want to be more specific and not split a domain in all its parts all the time?
Reporter | ||
Comment 26•8 years ago
|
||
Yeah, I've been wondering that too.
I had to add async tasks to:
- HomePanels (i.e. TwoLinePageRow):
We only use the stripped domain as a fallback if an item doesn't have a title. This should very rarely be the case.
Moreover, TwoLinePageRow didn't do any stripping - stripping was only ever performed in the context menu. We could
easily completely remove any domain stripping here, without significant impact.
- Browser Toolbar: I think it's worth doing this here, since it's extremely user visible, so having the best possible
String to show seems worth it. Doing this doesn't make things significantly more complicated. (But we could revert
back to the simpler stripping too.)
- BrowserSearch autocomplete: this is more questionable, but
I definitely want to have the DomainProcessor in place for our DB usage, since that is essentially set in stone once the migration is run (and I'd prefer it to be completely correct). I think I still prefer having a single authoritative implementation to minimise confusion.
One option I considered was having DomainProcessor.processImmediate() method that can be run on the UI thread: if data isn't loaded yet, we just return null (and the caller can just use the raw URL) - if data is loaded we don't need the background thread anymore. This would probably be suitable for all the above cases (and we just need to make sure we try to load the suffix list *eventually* - we could even just post a background runnable from ProcessImmediate if data isn't loaded yet, so we'd hopefully have it ready for the next call). This makes DomainProcessor more complicated, but it might be worth it for simpler UI code?
Reporter | ||
Comment 27•8 years ago
|
||
(In reply to Andrzej Hunt :ahunt from comment #26)
> - BrowserSearch autocomplete: this is more questionable, but
Continuing where I forgot to finish my sentence: BrowserSearch autocomplete is less immediately visible, and also has less user impact (since most sites don't need the better domain stripping), so is also a candidate for reverting to the simpler stripCommonSubdomains().
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Reporter | ||
Updated•8 years ago
|
Attachment #8834148 -
Attachment is obsolete: true
Attachment #8834148 -
Flags: review?(s.kaspari)
Reporter | ||
Comment 33•8 years ago
|
||
I've tried a slightly different approach to the AsyncTask issue - I'm not convinced it's ideal either (and it makes DomainProcessor a bit more complicated), but at least the frontend code stays simple. (I still need to double-check my patches, consider them draft quality.)
Comment 34•8 years ago
|
||
(In reply to Andrzej Hunt :ahunt from comment #26)
> I definitely want to have the DomainProcessor in place for our DB usage,
> since that is essentially set in stone once the migration is run (and I'd
> prefer it to be completely correct). I think I still prefer having a single
> authoritative implementation to minimise confusion.
Yeah, the DomainProcessor itself isn't a problem. I was more concerned about every action needing to go through it - even if they only want to strip some things from the URL and not split it into all components. And then adding the async step in ToolbarDisplayLayout is a bit scary: Can this introduce race conditions? Every time we change the behavior here there's a chance someone opens a new spoofing security bug. :) (This reminds me of bug 1286178).
(In reply to Andrzej Hunt :ahunt from comment #33)
> I've tried a slightly different approach to the AsyncTask issue - I'm not
> convinced it's ideal either (and it makes DomainProcessor a bit more
> complicated), but at least the frontend code stays simple. (I still need to
> double-check my patches, consider them draft quality.)
Nice. I'll clear the review flag and then just reflag me as soon as you are happy. Btw. I'm just looking at ToolbarDisplayLayout and as we have the splitting behind a pref that is disabled by default anyways - maybe we can remove that if this makes the code here easier.
Updated•8 years ago
|
Attachment #8834147 -
Flags: review?(s.kaspari)
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Reporter | ||
Comment 40•8 years ago
|
||
Comment on attachment 8826699 [details]
Bug 1330454 - Migrate domain prefix/suffix stripping to use DomainProcessor
I've tried a new approach which uses the prefix lists *if* loaded, and otherwise falls back to the old string stripping - thoughts? I'm not sure if this is still far too complicated, we could just use the old method anywhere that uses the UI thread.
The new implementation for this case is in:
"String stripCommonSubdomainsFromURL(" in DomainProcessor.java
Attachment #8826699 -
Flags: feedback?(s.kaspari)
Comment 41•8 years ago
|
||
mozreview-review |
Comment on attachment 8832247 [details]
Bug 1330454 - Pre: move stripCommonSubdomains into (package-private) DomainProcessorUtils
https://reviewboard.mozilla.org/r/108584/#review111560
Attachment #8832247 -
Flags: review+
Comment 42•8 years ago
|
||
mozreview-review |
Comment on attachment 8834147 [details]
Bug 1330454 - Allow immediate PublicSuffix access
https://reviewboard.mozilla.org/r/110188/#review116076
Attachment #8834147 -
Flags: review?(s.kaspari) → review+
Comment 43•8 years ago
|
||
(In reply to Andrzej Hunt :ahunt from comment #40)
> I've tried a new approach which uses the prefix lists *if* loaded, and
> otherwise falls back to the old string stripping - thoughts? I'm not sure if
> this is still far too complicated, we could just use the old method anywhere
> that uses the UI thread.
This looks good, I think! :)
Updated•8 years ago
|
Attachment #8826699 -
Flags: feedback?(s.kaspari) → feedback+
Updated•8 years ago
|
Iteration: 1.15 → 1.16
Reporter | ||
Updated•8 years ago
|
Iteration: 1.16 → ---
Priority: P1 → P2
Updated•8 years ago
|
Blocks: as-android-notyet
Updated•8 years ago
|
Assignee: andrzej → nobody
Updated•8 years ago
|
Priority: P2 → P3
I did this at some point: URIUtils.getFormattedDomain(context, uri, /* include public suffix */ true, /* subdomainCount */ 0);
http://searchfox.org/mozilla-central/rev/f6dc0e40b51a37c34e1683865395e72e7fca592c/mobile/android/base/java/org/mozilla/gecko/util/URIUtils.java#88
Status: REOPENED → RESOLVED
Closed: 9 years ago → 8 years ago
Resolution: --- → WORKSFORME
Assignee | ||
Updated•5 years ago
|
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•