Closed Bug 1046074 Opened 10 years ago Closed 4 years ago

Improve post-filtering of dupes in UnifiedComplete

Categories

(Toolkit :: Places, defect, P2)

defect
Points:
5

Tracking

()

VERIFIED FIXED
mozilla76
Iteration:
76.1 - Mar 9 - Mar 22
Tracking Status
firefox75 --- wontfix
firefox76 --- verified
firefox77 --- verified

People

(Reporter: mak, Assigned: bugzilla)

References

(Blocks 1 open bug)

Details

(Whiteboard: [fxsearch])

Attachments

(1 file)

We can probably improve the post-filtering of duplicate entries slightly by also using the finalCompleteValue of the autoFill entry, and maybe by trying to strip www.
Blocks: 1071461
No longer blocks: UnifiedComplete
Depends on: UnifiedComplete
Summary: Improve post-filtering of dupes in UnifiedComplete → Improve post-filtering of dupes in UnifiedComplete (through www. and finalCompleteValue)
Blocks: 1142709
This is still an issue with the current unified autocomplete implementation.
Priority: -- → P3
Any news on this?
I'm glad to say that Firefox 53 greatly improved deduplication.
Yes, the second part of this has been handled in bug 1322747.
We could still investigate merging www. and non-www. entries, but that's not a priority at the moment; I think the most problematic part has been handled. So I'm dropping this, since I'm already overloaded and doubt I can fix it anytime soon.
Patches with tests are welcome though!
Assignee: mak77 → nobody
Status: ASSIGNED → NEW
Depends on: 1322747
Summary: Improve post-filtering of dupes in UnifiedComplete (through www. and finalCompleteValue) → Improve post-filtering of dupes in UnifiedComplete (ignoring www.)
Blocks: 1222435
Blocks: 1425029
Whiteboard: [fxsearch]
Points: --- → 3
Blocks: 1489367

The remaining problem here is that we have lots of duplicates of the form:
http://site.com/path
http://www.site.com/path
https://site.com/path
https://www.site.com/path

One thing we should likely do is dedupe https and http, preferring the https entry. An alternative may be to check whether the origin actually supports https, which can be done using moz_origins.prefix. So, if we want to be more aggressive about https (ask Connor), we could modify all the queries to also fetch the prefix from moz_origins and force that prefix on the returned URLs.
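To make the "force the prefix" idea concrete, here is a rough sketch. It uses an in-memory SQLite table named `origins` as a simplified stand-in for moz_origins (only `prefix` and `host` columns, which is an assumption for illustration), and the helper names `preferred_prefixes` and `force_prefix` are hypothetical, not existing Places code.

```python
import sqlite3


def preferred_prefixes(conn):
    """Map each host to "https://" if any stored origin uses it, else to a stored prefix."""
    best = {}
    for host, prefix in conn.execute("SELECT host, prefix FROM origins"):
        # An https row always wins; otherwise keep the first prefix seen.
        if prefix == "https://" or host not in best:
            best[host] = prefix
    return best


def force_prefix(url, best):
    """Rewrite the scheme of url to the preferred prefix of its host, if one is known."""
    _scheme, sep, rest = url.partition("://")
    if not sep:
        return url  # not an absolute URL, leave untouched
    host = rest.split("/", 1)[0]
    wanted = best.get(host)
    return url if wanted is None else wanted + rest


# Hypothetical sample data: site.com is known over both schemes,
# legacy.example only over http.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE origins (prefix TEXT, host TEXT)")
conn.executemany(
    "INSERT INTO origins VALUES (?, ?)",
    [
        ("http://", "site.com"),
        ("https://", "site.com"),
        ("http://", "legacy.example"),
    ],
)
best = preferred_prefixes(conn)
print(force_prefix("http://site.com/path", best))
print(force_prefix("http://legacy.example/", best))
```

In this sketch a host is only upgraded when an https origin has actually been recorded for it, so http-only sites keep working.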

The other thing is www. Here we must decide whether we want to dedupe or not. We ignore www when searching, and we don't show it in results after Bug 1614957, so if we don't dedupe we'll end up showing two identical URLs with no way to tell them apart.
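As a small illustration of why these entries collide, the sketch below splits a URL into its prefix and the remainder. The `PREFIXES` tuple and the `split_prefix` helper are hypothetical names used only for this example; the real code may do this differently.

```python
# Longer prefixes come first so that "https://www." is matched before "https://".
PREFIXES = ("https://www.", "http://www.", "https://", "http://")


def split_prefix(url):
    """Return (prefix, rest); prefix is "" when the URL matches none of PREFIXES."""
    for prefix in PREFIXES:
        if url.startswith(prefix):
            return prefix, url[len(prefix):]
    return "", url


variants = [
    "http://site.com/path",
    "http://www.site.com/path",
    "https://site.com/path",
    "https://www.site.com/path",
]
# All four variants reduce to the same remainder, which is what the user ends up seeing.
remainders = {split_prefix(url)[1] for url in variants}
print(remainders)
```

Once the scheme and www. are hidden, the remainder is the only visible part, so the four rows are indistinguishable to the user.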

Assignee: nobody → htwyford
Status: NEW → ASSIGNED
Iteration: --- → 75.1 - Feb 10 - Feb 23
Iteration: 75.1 - Feb 10 - Feb 23 → 76.1 - Mar 9 - Mar 22

We discussed that this bug should encompass deduping any results with different prefixes that are otherwise identical. This means prioritizing www. and http/https differently for the purposes of deduping, surfacing results with these preferences:

https:// > https://www. > http:// > http://www.

For example, if the user has both http://site.com/path and https://www.site.com/path in their results, we show only https://www.site.com/path.

We should not dedupe if the results have different titles; this is to guard against deduping www.site.com when it is an entirely different site from site.com. We should also not dedupe if a result with a lower-priority prefix is the heuristic result. In the example above, we would show both http://site.com/path and https://www.site.com/path if http://site.com/path is the heuristic result.
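The rules above could be sketched roughly as follows. This is only an illustration of the proposal, not the patch itself: the `Result` type, the `RANK` table and the `dedupe` function are hypothetical names, and exact-duplicate URLs are left to whatever filtering already exists.

```python
from typing import NamedTuple

# Lower number means higher priority: https:// > https://www. > http:// > http://www.
RANK = {"https://": 0, "https://www.": 1, "http://": 2, "http://www.": 3}


class Result(NamedTuple):
    url: str
    title: str
    heuristic: bool = False


def split_prefix(url):
    """Return (prefix, rest), trying the longest known prefix first."""
    for prefix in sorted(RANK, key=len, reverse=True):
        if url.startswith(prefix):
            return prefix, url[len(prefix):]
    return "", url


def dedupe(results):
    """Keep, per (rest, title) group, only the results with the best-ranked prefix.

    The heuristic result is always kept, and results with different titles
    never land in the same group, so they are never merged.
    """
    best = {}
    for r in results:
        prefix, rest = split_prefix(r.url)
        if prefix in RANK:
            key = (rest, r.title)
            best[key] = min(best.get(key, len(RANK)), RANK[prefix])

    kept = []
    for r in results:
        prefix, rest = split_prefix(r.url)
        if r.heuristic or prefix not in RANK or RANK[prefix] == best[(rest, r.title)]:
            kept.append(r)
    return kept


same_title = [
    Result("http://site.com/path", "Site"),
    Result("https://www.site.com/path", "Site"),
]
print([r.url for r in dedupe(same_title)])
```

With the two results from the example, only https://www.site.com/path survives; marking the http:// result as heuristic, or giving the two results different titles, keeps both.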

I'm bumping the points on this due to the increase in scope and because quite a few tests will need updating as well.

This is all still pending Product approval.

Points: 3 → 5
Flags: needinfo?(mconnor)
Summary: Improve post-filtering of dupes in UnifiedComplete (ignoring www.) → Improve post-filtering of dupes in UnifiedComplete
Priority: P3 → P2
Pushed by htwyford@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/604ae25cad30
Improve post-filtering of dupes in UnifiedComplete. r=mak
Backout by rmaries@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b320c59d42db
Backed out changeset 604ae25cad30 for Lint failure on test_swap_protocol.js. CLOSED TREE

Fixed lint issues and queued a new patch for landing.

Flags: needinfo?(mconnor)
Flags: needinfo?(htwyford)
Pushed by htwyford@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8e1c0b3f826d
Improve post-filtering of dupes in UnifiedComplete. r=mak
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla76

Updating Fx75 tracking flags to reflect QA triage decision taken with :mdeboer in QA-Search weekly sync meeting.

Harry, could you please provide some STR in order to manually verify this issue?
Thank you!

Flags: needinfo?(htwyford)

STR:

  1. Open Firefox with a new profile.
  2. Visit http://example.com, http://www.example.com, https://example.com, and https://www.example.com, in that order.
  3. Open a new tab and close the tab in which you opened example.com.
  4. Type "ex" in the address bar.
  5. Verify that the first result (the one that would be selected if you hit Enter) is https://www.example.com.
  6. Verify that there is one and only one other history result for example.com. Use the arrow keys to move down and highlight it. The address bar should be filled with https://example.com.
  7. Visit https://example.com several more times. Open a new tab and close the tab in which you opened https://example.com.
  8. Repeat steps 4-6, except the order of the results in steps 5 and 6 should be swapped. https://example.com should be the first result and https://www.example.com should be the other result.
Flags: needinfo?(htwyford)
Flags: qe-verify+
QA Contact: cristian.comorasu

Thank you for the steps.
I can confirm this issue is fixed, I verified using Fx 77.0a1 and Fx 76.0b8 on Windows 10 x64, macOS 10.13 and Ubuntu 18.04.
Updating the flags accordingly.

Status: RESOLVED → VERIFIED
Flags: qe-verify+