Top sites can be unpredictable

RESOLVED FIXED

Status

Firefox for iOS
Home screen
3 years ago
3 years ago

People

(Reporter: rnewman, Assigned: rnewman)

Tracking

unspecified
Other
iOS

Firefox Tracking Flags

(fxios+)

Details

Attachments

(2 attachments)

(Assignee)

Description

3 years ago
I have a page that I viewed once on my device. It 404ed on my second attempt to load it. It's now my number one top site. Why?

I'm going to look into this, but might bump this to someone with more time.
(Assignee)

Comment 1

3 years ago
Questions:

Why does this get a frecency of 28?


6540        https://smile.amazon.com/gp/dmusic/cloudplayer/player?ie=UTF8&ref_=m_ty_cp#stations                    Amazon Mus  SaMyhhLWp6Mh  350         smile.amazon.com  1.43942622691112e+15  1.4375357522649  60               231               28.7456711035592


Why does this row have only one local visit, when there are dozens in the DB that apparently contribute to the frecency of 542?

8214        http://r.duckduckgo.com/l/?kh=-1&uddg=http%3A%2F%2Fwww.amazon.com%2Fb%3Fie%3DUTF8%26node%3D8335758011              JUAkDkBsn2KU  324         r.duckduckgo.com  1.43915351772058e+15  0                1                0                 542.315427090679
(Assignee)

Comment 2

3 years ago
The 'fix' for this is to compute frecency for each individual URL (or even each visit), then aggregate by domain.

Desktop does it on a per-visit basis.

To keep this performant, we need an additional filter on recency. An updated query looks something like:



SELECT
  historyID, url, title, guid, domain_id, domain, localVisitDate,
  remoteVisitDate, localVisitCount, remoteVisitCount, iconID, iconURL,
  iconDate, iconType, iconWidth
FROM (
  -- Aggregate the per-URL scores by domain and keep the top 15 domains.
  SELECT historyID, url, title, guid, domain_id, domain,
         max(localVisitDate) AS localVisitDate,
         max(remoteVisitDate) AS remoteVisitDate,
         sum(localVisitCount) AS localVisitCount,
         sum(remoteVisitCount) AS remoteVisitCount,
         sum(frecency) AS frecencies
  FROM (
    -- Score each URL individually: a local term plus a remote term.
    SELECT *,
      (localVisitCount * (5 + localVisitCount))
        * max(1, 100 * 225 / (((1439576194960535 - (localVisitDate)) / 86400000000.0) * ((1439576194960535 - (localVisitDate)) / 86400000000.0) + 225))
      + remoteVisitCount
        * max(1, 100 * 225 / (((1439576194960600 - (remoteVisitDate)) / 86400000000.0) * ((1439576194960600 - (remoteVisitDate)) / 86400000000.0) + 225))
      AS frecency
    FROM (
      -- Collect visit counts and latest visit dates per history row.
      SELECT history.id AS historyID, history.url AS url, title, guid, domain_id, domain,
             COALESCE(max(case visits.is_local when 1 then visits.date else 0 end), 0) AS localVisitDate,
             COALESCE(max(case visits.is_local when 0 then visits.date else 0 end), 0) AS remoteVisitDate,
             COALESCE(sum(visits.is_local), 0) AS localVisitCount,
             COALESCE(sum(case visits.is_local when 1 then 0 else 1 end), 0) AS remoteVisitCount
      FROM history
      INNER JOIN domains ON domains.id = history.domain_id
      INNER JOIN visits ON visits.siteID = history.id
      WHERE (history.is_deleted = 0) AND (domains.showOnTopSites IS 1)
      GROUP BY historyID
    )
    -- The additional recency filter.
    WHERE ((localVisitCount + remoteVisitCount) > 0) AND
          max(localVisitDate, remoteVisitDate) > 143950424963682
  )
  GROUP BY domain_id
  ORDER BY frecencies DESC
  LIMIT 15
)
LEFT OUTER JOIN view_history_id_favicon ON historyID = view_history_id_favicon.id;
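
For readers who want to poke at the scoring outside SQLite, the per-URL expression in the query above can be restated roughly as follows. This is a minimal Python sketch, not the code that shipped: the function names and the row tuple layout are hypothetical, and the constants (5, 100, 225, and 86400000000 microseconds per day) are copied from the query. It scores each URL and then sums the scores per domain, as the GROUP BY domain_id step does.

```python
from collections import defaultdict

MICROSECONDS_PER_DAY = 86_400_000_000.0


def recency_weight(visit_date_us, now_us):
    """Decay from 100 (visited just now) towards a floor of 1, as in the query."""
    age_days = (now_us - visit_date_us) / MICROSECONDS_PER_DAY
    return max(1.0, 100 * 225 / (age_days * age_days + 225))


def url_frecency(local_count, local_date_us, remote_count, remote_date_us, now_us):
    """Score one history row: the local term grows as n * (5 + n), the remote term as n."""
    local = local_count * (5 + local_count) * recency_weight(local_date_us, now_us)
    remote = remote_count * recency_weight(remote_date_us, now_us)
    return local + remote


def domain_frecencies(rows, now_us):
    """Sum per-URL scores by domain.

    rows holds (domain, local_count, local_date_us, remote_count, remote_date_us)
    tuples; this layout is an assumption made for the sketch.
    """
    totals = defaultdict(float)
    for domain, lc, ld, rc, rd in rows:
        totals[domain] += url_frecency(lc, ld, rc, rd, now_us)
    return dict(totals)
```

With these constants, a visit that is 15 days old gets half the weight of a visit made right now, because 225 is 15 squared.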
(Assignee)

Comment 3

3 years ago
I have a slightly tweaked query that I'll put into code this afternoon.

This needs one other change, which is to load the domain instead of the topmost URL -- the URL is likely to be some particular search engine redirect, rather than the engine itself. We should do this by calling .baseDomain on the returned URL; not only is this simpler, but it'll preserve the protocol.
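
To illustrate the idea of loading the site rather than the top-scoring URL, here is a rough Python sketch. It is not the .baseDomain implementation: site_root is a hypothetical helper that only strips the path, query and fragment while keeping the scheme and the full hostname. Reducing a hostname to its registrable base domain needs a public suffix list, which this sketch does not attempt.

```python
from urllib.parse import urlsplit


def site_root(url):
    """Return scheme://host/ for an absolute URL, dropping path, query and fragment."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.hostname:
        raise ValueError(f"not an absolute URL: {url!r}")
    # The scheme is carried over unchanged, so http stays http and https stays https.
    return f"{parts.scheme}://{parts.hostname}/"
```

Applied to the Amazon row from comment 1, this yields the https://smile.amazon.com/ root instead of the deep cloud player link.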

I also think that before Bug 1194852 is addressed, we should add

(domains.domain NOT LIKE 'r.%')

to the frecency query. That kicks r.search.yahoo.com and r.duckduckgo.com out of my top sites, which is a big help.
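
The effect of that clause can be checked against a throwaway database. The sketch below uses Python's sqlite3 module with a simplified, hypothetical domains table (just id and domain), not the real browser schema; reddit.com and duckduckgo.com are made-up sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE domains (id INTEGER PRIMARY KEY, domain TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO domains (domain) VALUES (?)",
    [
        ("r.duckduckgo.com",),
        ("r.search.yahoo.com",),
        ("reddit.com",),
        ("duckduckgo.com",),
    ],
)

# Only hosts whose name starts with the literal prefix "r." are dropped.
kept = [
    row[0]
    for row in conn.execute(
        "SELECT domain FROM domains WHERE domain NOT LIKE 'r.%' ORDER BY domain"
    )
]
```

Here kept contains duckduckgo.com and reddit.com. The pattern is blunt: it would also hide any legitimate site whose hostname happens to begin with "r.", which fits its role as a stopgap.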

Karen, lemme know if you want me to simulate this against your browser DB, or if you just want it to hit Nightly so you can try it there.
Flags: needinfo?(krudnitski)
(Assignee)

Comment 4

3 years ago
Created attachment 8648238 [details]
justdomain.sql

This is a domains-only version of the query. You can run it against your own DB to see what happens.
Attachment #8648238 - Flags: feedback?(sleroux)
Attachment #8648238 - Flags: feedback?(bnicholson)

Comment 5

Comment on attachment 8648238 [details]
justdomain.sql

I ran the query on my DB and it gives better results. It definitely seems more correct, and it seems to give more weight to the number of times I've visited a domain: some of my more frequently visited sites have moved up.
Attachment #8648238 - Flags: feedback?(sleroux) → feedback+
(Assignee)

Comment 6

3 years ago
N.B. to self and future reviewers: pay close attention to not b0rking history substring search, which uses part of the same query mechanism.

Comment 7

I've been playing around on Nightly all day yesterday and still can't get it to populate with relevant mobile top sites.

In fact, Top Sites isn't updating at all. I have hit Facebook about 10 times in a very short period of time and it's not showing up anywhere on the Top Sites grid. Although it has picked up that I visit the BBC, it's only showing the 'most read' page and not the main 'news' page, which is where I start off.

And after visiting the Telegraph and doing several other mobile searches / browsing, the Top Sites grid hasn't changed at all.

It's stuffed with some desktop sites that aren't used frequently, plus a smattering of open tabs and something I visited once.

The good news is that the Firefox watermark is showing up in thumbnails instead of blank.
Flags: needinfo?(krudnitski)
(Assignee)

Comment 8

3 years ago
Created attachment 8649509 [details] [review]
Pull req.

Read these commits in order.

I split apart the shared history query function, tidied it up for history, and then revised it to reflect the new query.

This slightly regresses strongly-recent top sites data, but doesn't seem to be noticeable for real data.

History panel is now faster, which is nice.
Attachment #8649509 - Flags: review?(sleroux)
(Assignee)

Updated

3 years ago
Attachment #8649509 - Flags: ui-review?(randersen)
Attachment #8649509 - Flags: ui-review?(dhenein)

Comment 9

Comment on attachment 8649509 [details] [review]
Pull req.

Tested on both iPhone and iPad (devices and sims) and everything matches. Wicked fast, too!
Attachment #8649509 - Flags: ui-review?(randersen)
Attachment #8649509 - Flags: ui-review?(dhenein)
Attachment #8649509 - Flags: ui-review+

Comment 10

Comment on attachment 8649509 [details] [review]
Pull req.

Code looks good from what I can see
Attachment #8649509 - Flags: review?(sleroux) → review+
(Assignee)

Comment 11

3 years ago
Steph landed this.
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED

Comment 12

On Nightly now, and it's better, although the Facebook thumbnail isn't resolving (even after tapping it twice), BUT at least it's in the top row!
(Assignee)

Comment 13

3 years ago
The FB favicon will continue to be a problem because it's the desktop URL that wins its group. So the favicon guesser gets it wrong, and we never load www.facebook.com, so we never get an icon.

Changing that involves making favicon lookup domain-aware and/or biasing the internal ordering towards mobile sites. 

Favicons: still the hardest problem in browsing.

Comment 14

FWIW, I don't use desktop [remote] to go to Facebook except once in a blue moon - it's 95% done on my mobile [local]....
(Assignee)

Comment 15

3 years ago
Noting for the historical record: we can only distinguish between 'local' and 'remote' -- local being your iOS device, and remote being _typically_ a desktop or laptop.

But in Karen's case, remote also includes a bunch of phones and tablets. Their 100+ visits to m.facebook.com carry no more weight per visit than her 500 recent desktop visits to www.facebook.com, so www wins in the pool.
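
A rough worked example of that pooling, using the scoring expression from the query in comment 2. This is a Python sketch with illustrative numbers only: the visit counts are loosely taken from this thread, every visit is treated as happening "now", and it is an assumption that the URL with the highest individual score represents its domain group.

```python
MICROSECONDS_PER_DAY = 86_400_000_000.0


def recency_weight(visit_date_us, now_us):
    """Decay from 100 (visited just now) towards a floor of 1."""
    age_days = (now_us - visit_date_us) / MICROSECONDS_PER_DAY
    return max(1.0, 100 * 225 / (age_days * age_days + 225))


def url_frecency(local_count, local_date_us, remote_count, remote_date_us, now_us):
    """Local visits score as n * (5 + n); remote visits score linearly."""
    local = local_count * (5 + local_count) * recency_weight(local_date_us, now_us)
    remote = remote_count * recency_weight(remote_date_us, now_us)
    return local + remote


now = 1439576194960535  # microseconds; borrowed from the query in comment 2

pool = {
    # 10 visits on this device plus 100 from other phones and tablets (all "remote").
    "m.facebook.com": url_frecency(10, now, 100, now, now),
    # 500 desktop visits, also "remote"; no local visits at all.
    "www.facebook.com": url_frecency(0, 0, 500, now, now),
}

# Assumption: the highest-scoring URL stands in for the whole domain group.
winner = max(pool, key=pool.get)
```

Under these made-up numbers m.facebook.com scores 25000 and www.facebook.com scores 50000, so the desktop URL represents the group even though the local visits are boosted.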

It's easy to see three tiers of behavior here:

  1. On-device. I want my browsing on this device to be most accessible.
  2. Same-category. My browsing on mobile devices is similar, and not the same as my desktop browsing. (This is also the split for m./www.)
  3. Everything. Finally, I want to revisit sites that I visited on a desktop.

We're unable to support #2 without reworking Sync and Places.


I filed Bug 1196243 to figure out how to get an icon for this kind of situation.
Attachment #8648238 - Flags: feedback?(bnicholson)