Closed
Bug 1447891
Opened 6 years ago
Closed 6 years ago
Top Sites keep changing dramatically
Categories
(Firefox :: New Tab Page, defect, P2)
People
(Reporter: mossop, Assigned: Mardak)
Attachments
(1 file)
It seems like over short periods of time there can be sudden dramatic shifts in the ordering of Top Sites that appear to not be related to the pages I visit. All of a sudden a site that was previously in my Top Sites (1 row viewed normally) will be relegated to the second row for no apparent reason.

I'd guess that the vast majority of my page loads are my pinned tabs, twitter, facebook and reddit. I don't particularly notice how my pinned tabs move around in the Top Sites but the other three I sure do, particularly when they fall out of the first row.

For most of today reddit was in the 1st row. Somewhere in the middle if I remember right. Suddenly this evening it isn't. It's now the 4th item in the 2nd row. Above it are:

Wikipedia: 1 visit today
Home depot: Last visited three days ago
Phonebook: Last visited yesterday

Why would a site I visited earlier today suddenly drop below a site I haven't visited for three days?
Comment 1•6 years ago (Assignee)
What's the magnitude of your frecencies?

NewTabUtils.activityStreamLinks.getTopSites().then(v => console.log(v.map(s => [s.frecency, s.url]).join("\n")))

My history has the first few in the millions / 7 digits, then 5 more with 6 digits then last few with 5 digits. I haven't really seen my top sites change for a long time. I wonder if synced data is causing some frecency recalculations?
Comment 2•6 years ago (Reporter)
This is what I get:

455000,https://irccloud.mozilla.com/#!/ircs://irc1.dmz.scl3.mozilla.com:6697/%23developers
192560,https://www.facebook.com/
178640,https://www.reddit.com/
176500,https://twitter.com/
155610,https://www.amazon.com/
128300,https://mozilla.slack.com/messages/C4D3JFF26/
76963,https://mozilla.github.io/mozpdx-lunch/
73620,https://www.washingtonpost.com/
72072,https://www.wikipedia.org/
66721,https://mail.mozilla.org/admindb/firefox-dev
66464,https://phonebook.mozilla.org/
65879,https://www.homedepot.com/

reddit wasn't showing in the first row after I logged that, then I went to settings and turned on the second row, came back, and then reddit was correctly showing in the 3rd spot in the 1st row.
Comment 3•6 years ago (Reporter)
I do sync with my mobile device, none of the sites that show up in the top sites on my android phone are listed in here.
Comment 4•6 years ago (Assignee)
Hrmm... I could see reddit/twitter swapping 3rd and 4th places, but having it disappear from the first row definitely seems odd. Do you have pinned top sites -- in particular reddit?

NewTabUtils.pinnedLinks.links.map((l, i) => [i, l && l.url]).join("\n")

At first I was thinking it could be related to bug 1422867 comment 3 where old Tiles pinned links could be pinned to a position out of view, but this seems to be the opposite…? r1cky, andreio any ideas?
Comment 5•6 years ago (Reporter)
No pinned top sites
Comment 6•6 years ago (Reporter)
About 30 minutes ago I noticed that twitter jumped to the top of my Top Sites:

516780,https://twitter.com/
461000,https://irccloud.mozilla.com/#!/ircs://irc1.dmz.scl3.mozilla.com:6697/%23developers
314496,https://www.facebook.com/
174740,https://www.reddit.com/
151720,https://www.amazon.com/
120158,https://mozilla.slack.com/messages/C4D3JFF26/
75039,https://mozilla.github.io/mozpdx-lunch/
71780,https://www.washingtonpost.com/
70270,https://www.wikipedia.org/
68248,https://phonebook.mozilla.org/
65053,https://mail.mozilla.org/admindb/firefox-dev
64232,https://www.homedepot.com/

Then just now it jumped back down again:

462200,https://irccloud.mozilla.com/#!/ircs://irc1.dmz.scl3.mozilla.com:6697/%23developers
314496,https://www.facebook.com/
179000,https://twitter.com/
174740,https://www.reddit.com/
151720,https://www.amazon.com/
120250,https://mozilla.slack.com/messages/C4D3JFF26/
101525,https://phonebook.mozilla.org/
75039,https://mozilla.github.io/mozpdx-lunch/
71780,https://www.washingtonpost.com/
70270,https://www.wikipedia.org/
65053,https://mail.mozilla.org/admindb/firefox-dev
64232,https://www.homedepot.com/

I'm surprised that the frecency can change so quickly.
Comment 7•6 years ago (Assignee)
(In reply to Dave Townsend [:mossop] from comment #6)
> 516780,https://twitter.com/
> 179000,https://twitter.com/
> I'm surprised that the frecency can change so quickly.

Oh ho ho. Very suspicious. mak, ideas? Maybe it is sync related. I believe that because not all history is synced, some subset is used to estimate the frecency value.
Flags: needinfo?(mak77)
Comment 8•6 years ago
I'm not sure honestly, I don't expect big jumps either, and I can't reproduce something similar. Unfortunately we don't have good logging of Places actions, apart from MOZ_LOG storage, which produces huge logs. We should really have a circular buffer log that one can enable through prefs in Places.

The only things that come to my mind are:
1. Expiration should remove very old pages, and frecency calculation only uses the last 10 visits, so it should not cause big changes, unless the database has tens of thousands of bookmarks, whose data pushes out very recent history data.
2. Sync can surely cause a recalculation, and if added visits are within the last 10 it can pump frecency up or down. For example, if most of the locally stored visits are "typed" and Sync adds recent "link" visits, the value is likely to go down a lot. Conversely, if most of the local visits are "link" and Sync adds recent "typed" visits, the value is likely to go up a lot.

Do you have twitter in a pinned tab, or restored through session restore? IIRC we still don't store any visit if a page is restored, and then if you visit that page on mobile you'd likely get a remote visit quite a bit newer than a local one. This may explain growth, but I'm not sure how it would explain a sudden fall.

Sync has decent logging, so enabling that could be a good first step to see if there's a relation. Ask Thom or Kit for that.
Another interesting thing would be to check coherence of the reddit entry in the DB, you can use this snippet:

(async function() {
  let db = await PlacesUtils.promiseDBConnection();
  let rows = await db.execute(`
    SELECT h.visit_count, count(*),
           h.last_visit_date, MAX(v.visit_date),
           h.frecency, CALCULATE_FRECENCY(h.id)
    FROM moz_places h
    JOIN moz_historyvisits v ON v.place_id = h.id
    WHERE h.url_hash = hash(:url) AND h.url = :url
    GROUP BY h.id
  `, {url: "https://www.reddit.com/"});
  console.log(`count: ${rows[0].getResultByIndex(0)} == ${rows[0].getResultByIndex(1)}\n date: ${rows[0].getResultByIndex(2)} == ${rows[0].getResultByIndex(3)}\n frec: ${rows[0].getResultByIndex(4)} == ${rows[0].getResultByIndex(5)}`);
})();
Flags: needinfo?(mak77)
Comment 9•6 years ago
So, I'm looking at my db, and I have various pages that apparently have very different frecency values from the ones that would normally be calculated:

https://www.reddit.com/r/Amd/new/
count: 637767 == 1195236
https://www.amazon.it/
count: 457758 == 1195236
https://www.facebook.com/
count: 293961 == 1195236
https://www.reddit.com/r/firefox/
count: 228919 == 1195236

Offhand, it looks like there's some common operation that can have a large impact on frecency but doesn't bother to update it. When something else runs and actually updates frecency, we move to the right value, which can be quite different. Now, the hunt is on.
Comment 10•6 years ago
fwiw, expiration doesn't recalculate frecency, and the current frecency algo seems to depend a lot on visit_count. A possible theory is that Sync adds a bunch of visits and increases frecency, then expiration kicks in because the db is at its maximum and removes some old visits, but doesn't recalculate frecency (it's expensive in general). When some other operation happens that causes a recalculate, the value falls to the right one.
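The theory above can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyPlace` and its methods are hypothetical names, the score is simplified to 100 points per stored visit, and none of this is the actual Places code.

```javascript
// Toy model only: hypothetical names, not the Places implementation.
class ToyPlace {
  constructor(visitCount) {
    this.visitCount = visitCount;
    // Cached score, standing in for the stored frecency column.
    this.frecency = this.computeFrecency();
  }

  // Simplified score (assumption): 100 points per stored visit.
  computeFrecency() {
    return this.visitCount * 100;
  }

  // Refresh the cached value from the current visits.
  recalculate() {
    this.frecency = this.computeFrecency();
    return this.frecency;
  }

  // Sync adds visits and triggers a recalculation.
  addSyncedVisits(n) {
    this.visitCount += n;
    this.recalculate();
  }

  // Expiration removes old visits but leaves the cached value alone.
  expireOldVisits(n) {
    this.visitCount = Math.max(0, this.visitCount - n);
  }
}

const place = new ToyPlace(50);
place.addSyncedVisits(20);           // cached frecency rises with the new visits
place.expireOldVisits(30);           // cached frecency is now stale
const staleValue = place.frecency;
const freshValue = place.recalculate(); // a later operation corrects it
console.log({ staleValue, freshValue });
```

In this model the displayed ordering would follow `staleValue` until something unrelated triggers `recalculate()`, which is the kind of delayed drop described in the bug.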
Comment 11•6 years ago (Reporter)
(In reply to Marco Bonardo [::mak] from comment #8)
> Do you have twitter in a pinned tab, or restored through session restore?
> IIRC we still don't store any visit if a page is restored, and then if you
> visit that page on mobile you'd likely get a remote visit quite newer than a
> local one. This may explain a grow, but I'm not sure how it would explain a
> sudden fall.

I don't pin twitter, just visit it a lot. I guess sometimes when I restart for app updates twitter is open so would get restored then.

> Another interesting thing would be to check coherence of the reddit entry in
> the DB, you can use this snippet:
>
> (async function() {
> let db = await PlacesUtils.promiseDBConnection();
> let rows = await db.execute(`
> SELECT h.visit_count, count(*),
> h.last_visit_date, MAX(v.visit_date),
> h.frecency, CALCULATE_FRECENCY(h.id)
> FROM moz_places h
> JOIN moz_historyvisits v ON v.place_id = h.id
> WHERE h.url_hash = hash(:url) AND h.url = :url
> GROUP BY h.id
> `, {url: "https://www.reddit.com/"});
> console.log(`count: ${rows[0].getResultByIndex(0)} ==
> ${rows[0].getResultByIndex(1)}\n
> date: ${rows[0].getResultByIndex(2)} ==
> ${rows[0].getResultByIndex(3)}\n
> frec: ${rows[0].getResultByIndex(4)} ==
> ${rows[0].getResultByIndex(5)}`);
> })();

count: 620 == 620
date: 1521761409575878 == 1521761409575878
frec: 179800 == 179800

(In reply to Marco Bonardo [::mak] from comment #10)
> fwiw, expiration doesn't recalculate frecency, and the current frecency algo
> seems to depend a lot on visit_count.
> A possible theory is that Sync adds a bunch of visits and increases
> frecency, then expiration kicks in because the db is at its maximum and
> removes some old visits, but doesn't recalculate frecency (it's expensive in
> general). When some other operation happens that causes a recalculate, the
> value falls to the right one.

Not sure if it is useful data or not, but I basically never visit reddit or twitter on my phone. The top sites on my phone are a completely different set to those on desktop in that I basically never visit the top sites I have on my desktop on my phone and vice versa.
Comment 12•6 years ago (Reporter)
I noticed Twitter had jumped up to the top again, even though I haven't visited it today. Ran the DB query for it and got this:

count: 1805 == 1812
date: 1521786905015043 == 1521786905015043
frec: 1209350 == 1209350
Comment 13•6 years ago
Well, visit_count can be a little bit smaller than the real visit count, because we exclude a few types; unfortunately that doesn't help. The only clear thing so far is that the frecency value can become stale and be updated after a while. Btw, even if you don't visit twitter on your phone, you may have multiple profiles/computers/mac on the same firefox account?
Comment 14•6 years ago (Reporter)
(In reply to Marco Bonardo [::mak] from comment #13)
> Btw, even if you don't visit twitter on your phone, you may have multiple
> profiles/computers/mac on the same firefox account?

The only things I have signed into my Firefox account right now are my phone and desktop, one profile on each.
Comment 15•6 years ago (Reporter)
Ran into a case where reddit dropped out of top sites again:

477500,https://irccloud.mozilla.com/#!/ircs://irc1.dmz.scl3.mozilla.com:6697/%23developers
464310,https://www.facebook.com/
189100,https://twitter.com/
133200,https://mozilla.slack.com/messages/C4D3JFF26/
110210,https://www.amazon.com/
85647,https://www.wikipedia.org/
69985,https://www.washingtonpost.com/
68281,https://phonebook.mozilla.org/
63400,https://www.reddit.com/
62370,https://treeherder.mozilla.org/
61845,https://mozilla.github.io/mozpdx-lunch/
61060,https://www.homedepot.com/

count: 634 == 634
date: 1522093826365264 == 1522093826365264
frec: 63400 == 63400

I ran multiple syncs on my phone and desktop and it didn't change reddit's position.
Comment 16•6 years ago (Reporter)
I visited reddit once and the numbers changed a bunch:

477700,https://irccloud.mozilla.com/#!/ircs://irc1.dmz.scl3.mozilla.com:6697/%23developers
333600,https://www.facebook.com/
189400,https://twitter.com/
184440,https://www.reddit.com/
133200,https://mozilla.slack.com/messages/C4D3JFF26/
110210,https://www.amazon.com/
85647,https://www.wikipedia.org/
69985,https://www.washingtonpost.com/
68281,https://phonebook.mozilla.org/
62370,https://treeherder.mozilla.org/
61845,https://mozilla.github.io/mozpdx-lunch/
61060,https://www.homedepot.com/

count: 636 == 636
date: 1522096772053442 == 1522096772053442
frec: 184440 == 184440

BUT, reddit still isn't in the top row :s
Comment 17•6 years ago (Reporter)
(In reply to Dave Townsend [:mossop] from comment #16)
> BUT, reddit still isn't in the top row :s

It just appeared!
Comment 18•6 years ago (Assignee)
The delay in reddit moving back to position 4 is expected due to activity stream's caching. The wildly different calculated frecency seems likely dependent on the type of visits. Here's a script that should print out the most recent 100 visit types:

PlacesUtils.promiseDBConnection().then(db => db.execute("SELECT GROUP_CONCAT(visit_type) FROM moz_places h JOIN moz_historyvisits v ON v.place_id = h.id WHERE url = 'https://www.reddit.com/'")).then(v => console.log(v[0].getResultByIndex(0).split(",").slice(-100).join(" ")))

If you typically type in the url / select from the address bar, most likely they'll be "2"s:
https://searchfox.org/mozilla-central/rev/003262ae12ce937950ffb8d3b0fa520d1cc38bff/toolkit/components/places/nsINavHistoryService.idl#1196-1248
Comment 19•6 years ago (Assignee)
(In reply to Dave Townsend [:mossop] from comment #15)
> count: 634 == 634
> frec: 63400 == 63400

(In reply to Dave Townsend [:mossop] from comment #16)
> count: 636 == 636
> frec: 184440 == 184440

The former seems to be visit count * 100 while the latter is visit count * 290… Where I believe frecency looks at each visit's type. So unless all visits to the page were link clicks, the former's * 100 seems to be wrong.
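For reference, the per-visit multiplier can be read straight off the numbers reported in this bug by dividing frecency by visit count. The sketch below only does that division; `impliedPerVisitScore` is a hypothetical helper name, and the data rows are copied from comments 11, 12, 15 and 16.

```javascript
// Count/frecency pairs as reported in this bug.
const snapshots = [
  { comment: 11, site: "reddit", visitCount: 620, frecency: 179800 },
  { comment: 12, site: "twitter", visitCount: 1805, frecency: 1209350 },
  { comment: 15, site: "reddit", visitCount: 634, frecency: 63400 },
  { comment: 16, site: "reddit", visitCount: 636, frecency: 184440 },
];

// Hypothetical helper: the score each visit is effectively worth.
function impliedPerVisitScore({ visitCount, frecency }) {
  return frecency / visitCount;
}

const multipliers = snapshots.map(impliedPerVisitScore);
snapshots.forEach((s, i) => {
  console.log(`comment ${s.comment} (${s.site}): x${multipliers[i]}`);
});
```

The reddit multiplier drops from 290 to 100 and back to 290 while the visit count barely moves, so the swing has to come from the per-visit score rather than from the number of visits.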
Comment 20•6 years ago (Reporter)
> 2 1 1 1 1 1 2 2 1 1 1 2 1 1 1 2 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 2 2 2 1 1 1 1 2 2 1 2 1 2 1 2 1 2 1 1 1 1 2 2 1 2 2 1 2 1 2 1 1 1 2 1 2 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 2 1
I normally visit by clicking the top sites, I assume that just counts as a link click. Is it possible that doing that slowly pushes off url bar visits causing it to suddenly drop and then require a url bar visit to go there causing it to jump up again?
Comment 21•6 years ago (Assignee)
(In reply to Dave Townsend [:mossop] from comment #20)
> 2 1 1 1 1 1 1 1 1 1 1 1 2 1
> I normally visit by clicking the top sites, I assume that just counts as a
> link click.

Oh well would you look at that ;)

https://searchfox.org/mozilla-central/rev/49cc27555d5b7ed2eb06baf156693dd578daa06f/toolkit/components/places/nsNavHistory.cpp#283
  , mNumVisitsForFrecency(10)

https://searchfox.org/mozilla-central/rev/49cc27555d5b7ed2eb06baf156693dd578daa06f/toolkit/components/places/SQLFunctions.cpp#638-643

Looks like it grabs the most recent 10 visits to determine the average visit type. And between your two most recent "2" visit types, there were 11 link click visits!

Should the sampling range be increased? Should activity stream treat top site clicks as TYPED… or some other transition type?
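The effect of the 10-visit window can be checked against the visit types quoted above. This is a rough sketch, not the code in SQLFunctions.cpp: `sampleVisitTypes` is a hypothetical helper, and only the sample size of 10 and the meaning of types 1 (link) and 2 (typed) come from this thread.

```javascript
// Sample size discussed above (mNumVisitsForFrecency).
const NUM_VISITS_FOR_FRECENCY = 10;

// Hypothetical helper: count each visit type among the most recent visits.
function sampleVisitTypes(visitTypes, sampleSize = NUM_VISITS_FOR_FRECENCY) {
  const counts = {};
  for (const type of visitTypes.slice(-sampleSize)) {
    counts[type] = (counts[type] || 0) + 1;
  }
  return counts;
}

// Visit types quoted in comment 21, oldest first (1 = link, 2 = typed).
const visitTypes = [2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1];

// Before the last two visits, the window holds only link clicks.
const beforeRevisit = sampleVisitTypes(visitTypes.slice(0, -2));
// After them, one typed visit is back inside the window.
const afterRevisit = sampleVisitTypes(visitTypes);

console.log(beforeRevisit, afterRevisit);
```

With eleven link clicks in a row, the older typed visit falls just outside the window, which lines up with the drop seen in comment 15 and the recovery in comment 16.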
Flags: needinfo?(mak77)
Comment 22•6 years ago (Assignee)
Oh, for anyone wanting to double check the math :p 9 * 100 (regular link visits) + 1 * 2000 (typed bonus) -> 290 average bonus matching up with comment 16's frecency score.
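The same arithmetic as a small script, for anyone who prefers to run it. It is a simplified model that assumes every sampled visit is recent enough to receive its full bonus, so recency weighting is ignored; `averageBonus` and `estimateFrecency` are hypothetical names, and the bonus values (100 for link, 2000 for typed) are the ones quoted above.

```javascript
// Bonus per visit type, as quoted in this comment (1 = link, 2 = typed).
const BONUS_BY_VISIT_TYPE = { 1: 100, 2: 2000 };

// Average bonus over the sampled visits (simplified: no recency weighting).
function averageBonus(sampledTypes) {
  const total = sampledTypes.reduce((sum, t) => sum + BONUS_BY_VISIT_TYPE[t], 0);
  return total / sampledTypes.length;
}

// Simplified estimate: visit count times the average sampled bonus.
function estimateFrecency(visitCount, sampledTypes) {
  return Math.round(visitCount * averageBonus(sampledTypes));
}

const nineLinksOneTyped = [1, 1, 1, 1, 1, 1, 1, 1, 2, 1];
const tenLinks = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1];

console.log(averageBonus(nineLinksOneTyped));          // (9 * 100 + 2000) / 10
console.log(estimateFrecency(636, nineLinksOneTyped)); // comment 16's case
console.log(estimateFrecency(634, tenLinks));          // comment 15's case
```

Under these assumptions the two estimates come out to 184440 and 63400, the values reported in comments 16 and 15.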
Comment 23•6 years ago (Assignee)
From the activity stream side, I suppose we would need to send a message to main to call something like markPageAsTyped:
https://searchfox.org/mozilla-central/rev/49cc27555d5b7ed2eb06baf156693dd578daa06f/browser/components/places/PlacesUIUtils.jsm#419-428

And then navigate instead of allowing it to do a normal link click navigation. (Or maybe if we allow navigation and send a message and hopefully it sets the "next" type sooner than the actual visit…)
Comment 24•6 years ago
(In reply to Ed Lee :Mardak from comment #23)
> From the activity stream side, I suppose we would need to send a message to
> main to call something like markPageAsTyped:
> And then navigate instead of allowing it to do a normal link click
> navigation. (Or maybe if we allow navigation and send a message and
> hopefully it sets the "next" type sooner than the actual visit…)

Over the years, "typed" has acquired a very relaxed definition. It pretty much means the user either typed the url, clicked from a Places view (like the history menu or sidebar), or used paste & go. Additionally we have bug 1330343, which pretty much causes us to mark as typed anything from the urlbar. At this point I'd say "typed" pretty much means "revisited from the UI", while "link" is just a link click from a page. As such it makes sense for Activity Stream to use it, but it should do something similar to _openNodeIn, where it sets the visit as "typed" or "bookmark" appropriately.

markPageAs must be invoked before history actually adds the visit; it's a synchronous call that just adds to an in-memory hash table.
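The ordering requirement can be illustrated with a toy model. This is a sketch under assumptions, not the Places implementation: `ToyHistory` is hypothetical, and only the idea of a synchronous mark stored in memory and consumed when the visit is added comes from the comment above.

```javascript
// Transition type values as discussed in this bug (1 = link, 2 = typed).
const TRANSITION_LINK = 1;
const TRANSITION_TYPED = 2;

// Toy stand-in for the history service; hypothetical, for illustration only.
class ToyHistory {
  constructor() {
    this.pendingTyped = new Set(); // in-memory marks waiting for a visit
    this.visits = [];
  }

  // Synchronous: just remembers that the next visit to url is "typed".
  markPageAsTyped(url) {
    this.pendingTyped.add(url);
  }

  // Consumes a pending mark if one exists, otherwise records a link visit.
  addVisit(url) {
    const type = this.pendingTyped.delete(url) ? TRANSITION_TYPED : TRANSITION_LINK;
    this.visits.push({ url, type });
    return type;
  }
}

const toyHistory = new ToyHistory();

// Mark first, then navigate: the visit is recorded as typed.
toyHistory.markPageAsTyped("https://www.reddit.com/");
const markedFirst = toyHistory.addVisit("https://www.reddit.com/");

// Navigate first, mark afterwards: the visit was already stored as a link.
const markedLate = toyHistory.addVisit("https://twitter.com/");
toyHistory.markPageAsTyped("https://twitter.com/");

console.log({ markedFirst, markedLate });
```

In this model a late mark is not lost but lingers until the next visit to that URL, which is one reason the "send a message and hope it arrives first" variant from comment 23 looks fragile.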
Flags: needinfo?(mak77)
Comment 25•6 years ago
(In reply to Ed Lee :Mardak from comment #21)
> Should the sampling range be increased?

Probably not. Not because it wouldn't be good, but because frecency is expensive enough as-is, and we can't increase its costs. In the past Jesse proposed a different implementation, a monotonic frecency (https://wiki.mozilla.org/User:Jesse/NewFrecency), that would be much cheaper, but it requires analysis and field comparison with the current frecency to understand whether it fits our needs. We never found the time/resources for that, but the frecency costs are an evergreen problem.
Updated•6 years ago
Iteration: --- → 61.4 - May 7
status-firefox60: --- → wontfix
status-firefox61: --- → affected
Priority: -- → P2
Comment 26•6 years ago
Is there any hope to have AS use markAsTyped in 61? Some users seem to notice frecency becoming less useful recently, and this sounds like part of the problem.
Comment 27•6 years ago (Assignee)
The intention is to get it in during the 61.4 iteration; earlier iterations are focusing on more direct UI changes.
Updated•6 years ago (Assignee)
Assignee: nobody → edilee
Comment 28•6 years ago
Comment 29•6 years ago
Commit pushed to master at https://github.com/mozilla/activity-stream

https://github.com/mozilla/activity-stream/commit/d43f1404071158033527d080359691f8784c01b4
fix(topsites): Give all top site link clicks a typed frecency bonus (#4119)

Fix Bug 1447891 - Top Sites keep changing dramatically
Updated•6 years ago
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Comment 30•6 years ago (Assignee)
https://hg.mozilla.org/mozilla-central/rev/9edd64fc07d3
Target Milestone: --- → Firefox 61
Updated•5 years ago
Component: Activity Streams: Newtab → New Tab Page