Imported chrome browsing history ignores visit_count and shows recently visited history instead of top sites

NEW
Unassigned

Status


Firefox
Migration
P2
normal
2 months ago
a month ago

People

(Reporter: Mardak, Unassigned)

Tracking

(Depends on: 1 bug)

Trunk

Firefox Tracking Flags

(firefox59 affected)

Details

(Reporter)

Description

2 months ago
A user reported importing from Chrome and seeing Top Sites made up of pages visited today, while an actual top site (Facebook) seemed to be missing. Some imported top sites might have been visited only once in Chrome.
Let's check to see if this is a specific issue with Activity Stream, or if it belongs in another component.
Priority: -- → P2
I've tried importing Chrome data into a fresh profile to see what happens. I can take a guess as to why the user would complain about that behavior: a website visited today will have a frecency value that can be an order of magnitude greater than that of another site you haven't visited today (because of time decay). We have a custom sorting function [0] in which frecency values take precedence. Most likely all of the websites the user accesses frequently were selected, but their order did not fit the user's mental model; they probably got pushed to the second row, which is hidden by default.
I also found it important that the same navigation behavior in both browsers leads to the same frecency value (after you import). The history import helper [1] gets all the navigation details (typed or link transition, counts, etc.).


[0] https://searchfox.org/mozilla-central/rev/f5f1c3f294f89cfd242c3af9eb2c40d19d5e04e7/toolkit/modules/NewTabUtils.jsm#1111-1115
[1] https://searchfox.org/mozilla-central/rev/f5f1c3f294f89cfd242c3af9eb2c40d19d5e04e7/browser/components/migration/ChromeProfileMigrator.js#264-334
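
To illustrate the ordering effect described above, here is a minimal sketch of a comparator in which frecency takes precedence. It is not the code at [0]: the field names (frecency, lastVisitDate, url), the tie-breaking order, the sample numbers, and the ROW_SIZE constant are all assumptions made up for the example.

```javascript
// Hypothetical comparator: higher frecency first, then more recent
// visit, then URL as a stable tie-breaker. Field names are assumed.
function compareLinks(a, b) {
  return (
    b.frecency - a.frecency ||
    b.lastVisitDate - a.lastVisitDate ||
    a.url.localeCompare(b.url)
  );
}

// Made-up data: one page visited today with an inflated frecency,
// two pages that are visited often but not today.
const links = [
  { url: "https://often.example/", frecency: 2000, lastVisitDate: 100 },
  { url: "https://today.example/", frecency: 20000, lastVisitDate: 900 },
  { url: "https://also-often.example/", frecency: 2000, lastVisitDate: 500 },
];

// Hypothetical number of tiles in the first (visible) row.
const ROW_SIZE = 2;

const sorted = [...links].sort(compareLinks);
const visibleRow = sorted.slice(0, ROW_SIZE);
const hiddenRow = sorted.slice(ROW_SIZE);

console.log(visibleRow.map(link => link.url));
console.log(hiddenRow.map(link => link.url));
```

With these numbers the page visited today sorts first, and one of the frequently visited pages falls out of the visible row, which would match the reported symptom.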
(Reporter)

Comment 3

2 months ago
And what were the frecency values for the imported data compared to the page's visit count and type? If a user visits a page many times but not today, it should have a higher frecency value.
Good catch. This is the "urls" table we query when importing history:

0|id|INTEGER|0||1
1|url|LONGVARCHAR|0||0
2|title|LONGVARCHAR|0||0
3|visit_count|INTEGER|1|0|0
4|typed_count|INTEGER|1|0|0
5|last_visit_time|INTEGER|1||0
6|hidden|INTEGER|1|0|0 
(this is just the output given by the table_info command)

We check "typed_count > 0" [0] to decide what kind of visit it was: typed or navigation. There doesn't seem to be any query that fetches the value of the "visit_count" field.

In Google Chrome, a query for url, visit_count returns:
https://news.ycombinator.com/|14
After we import in moz_places I see:
https://news.ycombinator.com/|1

Perhaps this can be easily fixed by inserting "visit_count" entries in the visits array [1] (plus some additional logic to figure out what the visit type was).

[0] https://searchfox.org/mozilla-central/rev/ed212c79cfe86357e9a5740082b9364e7f6e526f/browser/components/migration/ChromeProfileMigrator.js#327
[1] https://searchfox.org/mozilla-central/rev/ed212c79cfe86357e9a5740082b9364e7f6e526f/browser/components/migration/ChromeProfileMigrator.js#333
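
A rough sketch of the behavior described in this comment may help explain the count of 1. It is not the migrator code: rowToPlace, the shape of the returned object, the transition constants, and the sample row values are assumptions for illustration. The only external fact relied on is that Chrome stores timestamps as microseconds since 1601-01-01 UTC.

```javascript
// Illustrative stand-ins for the Places transition constants.
const TRANSITION_LINK = 1;
const TRANSITION_TYPED = 2;

// Milliseconds between 1601-01-01 and 1970-01-01 (UTC).
const CHROME_EPOCH_OFFSET_MS = 11644473600000;

// Convert Chrome's microseconds-since-1601 into a JS Date.
function chromeTimeToDate(chromeTime) {
  return new Date(chromeTime / 1000 - CHROME_EPOCH_OFFSET_MS);
}

// Hypothetical helper: turn one row of Chrome's "urls" table into a
// place record. Only typed_count is consulted; visit_count is ignored,
// so every URL ends up with exactly one visit.
function rowToPlace(row) {
  const transitionType =
    row.typed_count > 0 ? TRANSITION_TYPED : TRANSITION_LINK;
  return {
    url: row.url,
    title: row.title,
    visits: [
      { transitionType, visitDate: chromeTimeToDate(row.last_visit_time) },
    ],
  };
}

// Made-up row; last_visit_time is chosen to land on the Unix epoch.
const place = rowToPlace({
  url: "https://news.ycombinator.com/",
  title: "Example title",
  visit_count: 14,
  typed_count: 0,
  last_visit_time: 11644473600000000,
});

console.log(place.visits.length, place.visits[0].transitionType);
```

Under these assumptions a URL with visit_count 14 in Chrome yields a single link visit after import, regardless of how often it was visited.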
(Reporter)

Comment 5

a month ago
I believe we could just copy the visit_count value from Chrome's urls table to Firefox's moz_places visit_count, and frecency should compute a relatively better number. (I believe it's [visit frequency] * [recent visit types] ?)

mak, any idea if visit_count was previously used but removed? Or just never used to begin with?
Component: Activity Streams: Newtab → Migration
Flags: needinfo?(mak77)
Summary: Imported chrome browsing history shows recently visited history instead of top sites → Imported chrome browsing history ignores visit_count and shows recently visited history instead of top sites
The counts in moz_places are actual; they reflect the real entries in moz_historyvisits. We don't write fake numbers there, and we cannot generate visits because Chrome doesn't even store the visit type and date for each visit, afaik.
Keeping fake numbers would involve quite a few changes to avoid coherency problems and possible privacy hits.
Flags: needinfo?(mak77)
(Reporter)

Comment 7

a month ago
Looks like "visits" is similar to moz_historyvisits:

CREATE TABLE visits(id INTEGER PRIMARY KEY,url INTEGER NOT NULL,visit_time INTEGER NOT NULL,from_visit INTEGER,transition INTEGER DEFAULT 0 NOT NULL,segment_id INTEGER,visit_duration INTEGER DEFAULT 0 NOT NULL, is_indexed BOOLEAN);

id = id
url = place_id
visit_time = visit_date
from_visit = from_visit
transition = visit_type https://chromium.googlesource.com/chromium/src/+/master/ui/base/page_transition_types.h
segment_id = session
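
The column correspondence above could be sketched roughly as follows. This is an illustration under assumptions, not a proposed patch: mapVisit and the transition table are hypothetical, only a few core transition types are mapped (the rest fall back to a link visit), and Chrome's url ids would still need to be remapped to moz_places ids in a real import. The core transition type is taken from the low 8 bits of Chrome's transition value, and timestamps are shifted from the 1601 epoch to the Unix epoch while staying in microseconds.

```javascript
// Low 8 bits of Chrome's transition hold the core type; the high bits
// are qualifiers (see page_transition_types.h).
const PAGE_TRANSITION_CORE_MASK = 0xff;

// Illustrative stand-ins for the Places transition constants.
const TRANSITION_LINK = 1;
const TRANSITION_TYPED = 2;
const TRANSITION_BOOKMARK = 3;
const TRANSITION_RELOAD = 9;

// Assumed partial mapping: Chrome core type -> Firefox visit_type.
const CHROME_TO_FIREFOX = new Map([
  [0, TRANSITION_LINK], // PAGE_TRANSITION_LINK
  [1, TRANSITION_TYPED], // PAGE_TRANSITION_TYPED
  [2, TRANSITION_BOOKMARK], // PAGE_TRANSITION_AUTO_BOOKMARK
  [8, TRANSITION_RELOAD], // PAGE_TRANSITION_RELOAD
]);

// Microseconds between 1601-01-01 and 1970-01-01 (UTC).
const CHROME_EPOCH_OFFSET_US = 11644473600000000;

// Hypothetical helper: reshape one Chrome "visits" row into the
// moz_historyvisits columns listed above.
function mapVisit(row) {
  const core = row.transition & PAGE_TRANSITION_CORE_MASK;
  const mapped = CHROME_TO_FIREFOX.get(core);
  return {
    id: row.id,
    place_id: row.url, // Chrome url id; needs remapping in practice
    visit_date: row.visit_time - CHROME_EPOCH_OFFSET_US,
    from_visit: row.from_visit,
    visit_type: mapped === undefined ? TRANSITION_LINK : mapped,
    session: row.segment_id,
  };
}

// Made-up row: a typed visit with qualifier bits set, one second
// after the Unix epoch.
const visit = mapVisit({
  id: 7,
  url: 3,
  visit_time: 11644473601000000,
  from_visit: 0,
  transition: 0x30000001,
  segment_id: 0,
});

console.log(visit);
```

Note that plain JavaScript numbers above 2^53 can lose a microsecond of precision; a real implementation might prefer BigInt or millisecond resolution.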
Thanks. That means we should import more visits, which is something we don't do today because it's too slow. Doug Thayer is working on improving the time to insert many visits in bug 1332225.
Depends on: 1332225