Improve post-filtering of dupes in UnifiedComplete
Categories
(Toolkit :: Places, defect, P2)
Tracking
()
People
(Reporter: mak, Assigned: bugzilla)
References
(Blocks 1 open bug)
Details
(Whiteboard: [fxsearch])
Attachments
(1 file)
| Reporter | ||
Comment 7•5 years ago
|
||
The remaining problem here is that we get lots of duplicates of the form:
http://site.com/path
http://www.site.com/path
https://site.com/path
https://www.site.com/path
One thing we should likely do is dedupe https and http, preferring the https entry. An alternative may be to check whether the origin actually supports https, which can be done using moz_origins.prefix. So, if we want to be more aggressive about https (ask Connor), we could modify all the queries to also fetch the prefix from moz_origins and force that prefix on the returned URLs.
The other thing is www, and here we must confirm whether we want to dedupe or not. We ignore www when searching, and we don't show it in results after Bug 1614957, which means that if we don't dedupe we'll end up showing two identical URLs with no way to distinguish them.
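To make the "force that prefix" idea concrete, here is a minimal sketch. It assumes the queries have already produced a host-to-prefix lookup; the `forceOriginPrefix` name and the `preferredPrefixes` map are hypothetical stand-ins for illustration, not existing UnifiedComplete code.

```javascript
// Hypothetical helper: rewrite a URL so it uses the prefix recorded for its host.
// `preferredPrefixes` is an assumed Map of host -> prefix (e.g. "https://"),
// standing in for whatever the queries would fetch from moz_origins.
function forceOriginPrefix(url, preferredPrefixes) {
  // Split into scheme prefix, host, and everything after the host.
  const match = /^(https?:\/\/)([^\/?#]+)(.*)$/.exec(url);
  if (!match) {
    return url; // Not an http(s) URL, leave it untouched.
  }
  const [, prefix, host, rest] = match;
  const preferred = preferredPrefixes.get(host);
  if (!preferred || preferred === prefix) {
    return url; // Nothing known about this host, or already on the preferred prefix.
  }
  return preferred + host + rest;
}
```

With a map containing `"site.com" -> "https://"`, `http://site.com/path` would be returned as `https://site.com/path`, while URLs for hosts missing from the map pass through unchanged.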
| Assignee | ||
Comment 9•5 years ago
|
||
We discussed that this bug should encompass deduping any results with different prefixes that are otherwise identical. This means prioritizing www. and http/https differently for the purposes of deduping, surfacing results with these preferences:
https:// > https://www. > http:// > http://www.
For example, if the user has both http://site.com/path and https://www.site.com/path in their results, we show only https://www.site.com/path.
We should not dedupe if the results have different titles; this is to avoid deduping www.site.com when it is an entirely different site from site.com. We should also not dedupe if a result with a lower-priority prefix is the heuristic result. In the example above, we would show both http://site.com/path and https://www.site.com/path if http://site.com/path were the heuristic result.
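The rules above could be sketched roughly as follows. This is only an illustration of the proposed behavior, not the patch itself: `dedupePrefixes`, `splitPrefix`, and the `{ url, title, heuristic }` result shape are all hypothetical names chosen for the example.

```javascript
// Prefix priority from the list above: a lower index is preferred.
const PREFIX_RANK = ["https://", "https://www.", "http://", "http://www."];

// Split a URL into its prefix (scheme plus optional "www.") and the remainder.
function splitPrefix(url) {
  const match = /^(https?:\/\/(?:www\.)?)(.*)$/.exec(url);
  return match ? { prefix: match[1], rest: match[2] } : null;
}

// `results` is an assumed array of { url, title, heuristic } objects.
function dedupePrefixes(results) {
  // First pass: find the best prefix rank for each (remainder, title) pair.
  // Including the title in the key means results with different titles
  // are never treated as duplicates of each other.
  const best = new Map();
  for (const result of results) {
    const parts = splitPrefix(result.url);
    if (!parts) {
      continue;
    }
    const key = parts.rest + "\n" + result.title;
    const rank = PREFIX_RANK.indexOf(parts.prefix);
    if (!best.has(key) || rank < best.get(key)) {
      best.set(key, rank);
    }
  }
  // Second pass: keep the heuristic result unconditionally, keep non-http(s)
  // results, and otherwise keep only results carrying the best-ranked prefix.
  return results.filter(result => {
    if (result.heuristic) {
      return true;
    }
    const parts = splitPrefix(result.url);
    if (!parts) {
      return true;
    }
    const key = parts.rest + "\n" + result.title;
    return PREFIX_RANK.indexOf(parts.prefix) === best.get(key);
  });
}
```

The sketch only collapses prefix variants; two results with exactly the same URL would both survive, on the assumption that exact duplicates are already handled elsewhere.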
I'm bumping the points on this due to the increase in scope and because quite a few tests will need updating as well.
This is all still pending Product approval.
Comment 12•5 years ago
|
||
Backed out for lint failures on test_swap_protocol.js.
Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=294402630&repo=autoland&lineNumber=277
Backout: https://hg.mozilla.org/integration/autoland/rev/fc4c4e983d42d9bf0600385276e6e23c22a7eaa9
| Assignee | ||
Comment 14•5 years ago
|
||
Fixed lint issues and queued a new patch for landing.
| bugherder | ||
Comment 17•5 years ago
|
||
Updating Fx75 tracking flags to reflect QA triage decision taken with :mdeboer in QA-Search weekly sync meeting.
Comment 18•5 years ago
|
||
Harry, could you please provide some STR in order to manually verify this issue?
Thank you!
| Assignee | ||
Comment 19•5 years ago
|
||
STR:
- Open Firefox with a new profile.
- Visit http://example.com, http://www.example.com, https://example.com, and https://www.example.com, in that order.
- Open a new tab and close the tab in which you opened example.com.