Make History aware of userContextId

NEW
Unassigned

Status

()

defect
P2
normal
3 years ago
6 days ago

People

(Reporter: tanvi, Unassigned, NeedInfo)

Tracking

(Depends on 2 bugs, Blocks 4 bugs)

unspecified
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox57 fix-optional)

Details

(Whiteboard: [userContextId][domsecurity-active][OA])

Attachments

(1 attachment, 6 obsolete attachments)

(Reporter)

Description

3 years ago
I would like history items to have userContextId's associated with them in the history database.  In the future, we may want to tailor awesomebar or newtab results to the container the user is in.  In order to do this, history needs to be aware of the container it was loaded in.

This bug is to determine if we already do this (baku thinks it already is from the work he has done on session restore).  And if not, to add userContextId to the history entry database.  Bonus if we can determine this for bookmarks as well; they may use the same database as history but I'm not sure.
(Reporter)

Updated

3 years ago
Whiteboard: [userContextId][domsecurity-backlog] → [userContextId][domsecurity-backlog][OA]
Don't forget Sync in this world - I'm guessing we will need to ensure Sync handles this attribute correctly too.
Baku said origin attributes are added to nsISHEntry but not directly to the history database. As we sometimes need to do DB query like this: http://searchfox.org/mozilla-central/source/browser/components/newtab/PlacesProvider.jsm#167, there's still some work to do.
When we visit a website, transactions on at least two History tables might occur. If this is the first time we visit this website, we insert the information related to this visit into a table called "moz_historyvisits". If this is our first visit to this website, we add a new *place* into a table called "moz_places". Otherwise, we need to update the frequency column of that *place* in the "moz_places" table.

We can add a userContextId column to those tables. But in addition to mere awareness, there will be data segregation because for a single website there might be multiple *places* in the table if it has been visited in more than one containers.
(Reporter)

Comment 4

3 years ago
Thanks Jonathan!

(In reply to Jonathan Hao [:jhao] from comment #3)
> We can add a userContextId column to those tables. But in addition to mere
> awareness, there will be data segregation because for a single website there
> might be multiple *places* in the table if it has been visited in more than
> one containers.

Yeah, this means our frequency counts will be a little off for users using containers.  Which might be okay or it might not, depending on how users use containers (are they visiting the same site in multiple containers, or do they visit different sites in each container).

The best, but most complex, solution would be to increment all frequency counters associated with an origin.  So if the user visits foo.com in the default and the personal context you would have:

User visits foo.com in default context:
Entry added: https://foo.com, usercontextid=0, frequency = 1

Next User visits foo.com in personal context
Existing entry: https://foo.com, usercontextid=0, frequency = 2
Entry added: https://foo.com, usercontextid=1, frequency = 2

This seems quite complex though, and may not be worth the effort.

Perhaps we should reach out to someone familiar with the History code to see if they have some advice on ho we should proceed.
Assignee: nobody → jhao
Status: NEW → ASSIGNED
Whiteboard: [userContextId][domsecurity-backlog][OA] → [userContextId][domsecurity-active][OA]
Although in Comment 4, Tanvi said we would increment all the frequency, but we later decide it might be better to separate the frequency of each container.

Hi, Marco.

We'd like to add origin attributes, which contains the userContextId, to History and Places, so that when a user visits a page, we record the userContextId in the history database.  We also might need to add origin attributes to places database, so that we have separated frequency for each container. 

Do you have any advice on how we should proceed?
Flags: needinfo?(mak77)
(In reply to Jonathan Hao [:jhao] from comment #5)
> Although in Comment 4, Tanvi said we would increment all the frequency, but
> we later decide it might be better to separate the frequency of each
> container.

I wonder if we need to step back and make sure we have the goals correct. Frecency affects a number of things, with the awesomebar and about:newtab being the most visible to the user. This discussion also touches on Sync, where users have expressed frustration at the fact the frecency of their "local" device can be swamped by visits from a remote device (ie, the "top sites" on their mobile suddenly start reflecting their top sites from desktop, when they'd prefer to have 2 different sets of top sites).

However, I don't think that totally segmenting frecency is the right thing to do in either of these cases - I expect we want more isolation in things like top-sites (eg, only show top-sites in this container/device) than we do in the awesomebar (eg, sites visited in only one container or device should probably appear in the awesomebar everywhere).

To put it another way, if we completely isolate frecency, I suspect we will end up breaking user expectations in the awesomebar and other places where sites you recently visited are shown. I think it would confuse users if they switched to a different tab and had a completely different awesomebar experience.

Does that sound correct in the context of containers?
Is it possible to keep both "frequency by containers" and "frequency across containers"? Then, top-sites can use the former and awesomebar can use the latter.
I think the modeling line here is not frecency but visits themselves. Frecency (or, in this case, frecencies) is computed from visits, not adjusted directly.

As Mark notes, we're already aware of some desire to track visits by device (iOS and Android already do, at a coarse level). It's not a big step from there to treat containers as an additional source/marker/whatever.

Bear in mind that anything you want to keep has to be modeled by Sync, otherwise it'll round-trip and be lost.
The problem is quite complex, especially if it involves frecency.
Let's take things one at a time:

1. Adding contextId to visits. This is feasible and probably "easy" from a Places point of view. Just add a contextid column and store the value through docshell. It has only one downside, it may considerably increase the db size (for each place there are a lot of visits, I have almost 200k in my profile), thus it may be better if it would be an integer rather than an alpha numeric string. 8 bytes on 200k entries would increase the db size by 1.6MB that, likely, will double to 3.2MB if we need an index to query on this. Though, more realistically the ids could fit into a couple bytes. I don't see big troubles in this, but I still don't see clear use-cases and it would be better to first have those.

2. Bookmarks, I guess you care where they were opened, rather than where they were created, thus history already covers this. Showing only bookmarks for a certain context is clearly something that requires the most work, our views are not partitioned atm. Plus a bookmark may have multiple visits in different context, who would win? The most common, the most recent? This may require some pre-calculated column as well, to be performant.

3. Storing a contextId for every place... well, I don't know how that would work, the most common contextid in the last N visits? the most rcent contextid? it's again something that needs use-cases before it can be evaluated, it may be expensive to keep it up-to-date, and may require a pre-calculated column.

4. Frecency is a complex and expensive algorithm, that requires precalculating a score every time a visit/bookmark is added/removed. Partitioning frecency by contextid means creating a new frecency value per context, that multiplies that cost (plus it could only be done for a fixed number of contexts or it would become really complex). This is sort-of a show-stopper for frecency, or at least something that would require a lot of brainstorming and designing time.

5. There is a risk of making frecency less general by partitioning it, so that for the user it may instead be harder, rather than easier, to find stuff he is looking for (when that was part of a different context). One of the benefits of the awesomebar is that it has a large reach into general user habits.

6. in case of a profile reset, we copy the db to a new profile, could then these stored contextId point to wrong contexts? Can the user define custom contexts or do we know they are immutable across all the devices/profiles?
Flags: needinfo?(mak77)
FWIW I agree it's good to look at things from a very far point of view and see how many awesome changes we could do, on the other side, it would be simpler if we'd start from a more realistic project like "show in the awesomebar the context for a match and reopen the page in that context".
(Reporter)

Comment 11

3 years ago
Needinfo'ing myself to provide all the use cases for this.
Flags: needinfo?(tanvi)
A use case is: when a user opens a new tab in a certain context, we want about:newtab to show only the sites that the user has visited in this context, and sorted by the frequency in that context only.

Marco, what do you think if we start from that?
Flags: needinfo?(mak77)
(In reply to Jonathan Hao [:jhao] from comment #12)
> A use case is: when a user opens a new tab in a certain context, we want
> about:newtab to show only the sites that the user has visited in this
> context, and sorted by the frequency in that context only.

this is incomplete, what does "has visited in this context" mean:
1. all visits in that context
2. at least one visit in that context
3. the most recent visit in that context

Those are very different from both a result and a coding point of view.

Plus, as I said, "frecency in a specific context" is very complex and expensive to have, afaict.

I don't want to sound too negative of stop-energy, but doesn't look like this has a well-defined feature plan yet.
Flags: needinfo?(mak77)
(Reporter)

Updated

3 years ago
Blocks: 1288504
(In reply to Marco Bonardo [::mak] from comment #13)
> this is incomplete, what does "has visited in this context" mean:
> 1. all visits in that context
> 2. at least one visit in that context
> 3. the most recent visit in that context

The general idea is to assume the visits in other contexts don't exist.  For example, a user visits twitter in the Personal context only. When she opens a new tab in the Work context, she shouldn't see twitter because she never visited twitter in the Work context.  That means (2), we need to have at least one visit in that context.  The page ranking algorithm can be the same, but only considering the visits in current context.

> Plus, as I said, "frecency in a specific context" is very complex and expensive to have, afaict.

I'll try to figure out how that can be done, and discuss with you and Tanvi whether it's worth implementing.

> I don't want to sound too negative of stop-energy, but doesn't look like
> this has a well-defined feature plan yet.

Tanvi may have a more well-defined plan.
(In reply to Jonathan Hao [:jhao] from comment #14)
> The page ranking algorithm can be the
> same, but only considering the visits in current context.

What this means, from a db point of view, is that we have to join to the visits table and check if a visit for the given place and context exist. It is feasible, but it will add a space cost (we need a compund index on place_id and context_id, likely other 3MB to the db size or so) and a perf cost (the history table is large).
The most problematic thing is that we don't have an index to search, that means we'll have to do the above for every single line of the db until we find all the entries. So it's not exactly a join but an EXISTS (visit with given context id). Since contexts will be interleaved, the query will pass through a lot of entries that should be discarded.
I can't predict the price, but off-hand it will be "meaningful", unless we figure a way to flatten the context information in an indexable way on the places table.

> > Plus, as I said, "frecency in a specific context" is very complex and expensive to have, afaict.
> 
> I'll try to figure out how that can be done, and discuss with you and Tanvi
> whether it's worth implementing.

Personally I'd just keep using the global frecency. The cost is too high from both a coding point of view and a perf point of view. I can't see us having partitioned frecency without a a huge overhaul of the system.
(Reporter)

Comment 16

3 years ago
Thank you Marco for putting so much thought into this.  You have raised some questions we hadn't considered or thought about.

Sorry this is so long...

When a user opens a container tab, we'd like about:newtab and the awesomebar to show the user results that are relevent to that container.  So in the Work Container, I want to see bugzilla, google calendar, mdn, etc both in the newtab page and when I start typing in the awesomebar.  In my Shopping container, I want to see amazon.com, safeway.com, target.com, etc.  There may be some overlap; ex: in my Personal container I also want to see google calendar and that instance will load with a different set of cookies.  Part of the reason for this is that we want the browser to help users pick the right websites in containers.  Assume I were to go to a shopping tab and I saw bugzilla in about:newtab.  I may at that moment remember that I had to comment on a bug and click bugzilla.  I'd go to make a comment, and then be confused as to why I wasn't logged in.  Then (maybe) I'd remember it is because I'm in the shopping container and I need to be in the work container for bugzilla.  To avoid this scenario, about:newtab and awesomebar could help me along by being populated with sites from the right context.


In order to naively achieve this experience, each history entry in the database would have to be marked with a "userContextId" which is an integer.  Right now it is an integer from 0-4.  0 means it is a regular/default tab, and 1-4 are for the 4 built in containers.  Then, when awesomebar or about:newtab are being populated, we could (naively) use the existing database query and heuristics, and then throw away anything with a mismatched usercontextid.

That naive approach likely causes huge performance issues, so we'd have to sit down and see if there is a better, more performant way to do this.  One thought is to have 5 different databases, and only query the one for the context in which the user is operating.

One problem with that approach is that we do want to enable custom containers.  So instead of usercontextId's 0-4, we may have 0-10.  Or even more.  We could impose a limit on this if needed.  Custom container options are being added in bug 1267916.

Another thing we could do is only offer this tailored experience for about:newtab, and let that awesomebar behave as normal. In order to do this, we would have to maintain one database, instead of 5 (or more) as mentioned above.  If the user has not visited any sites in a specific container, the about:newtab page wouldn't have any tiles and may look something like this with some introductory/explanation text about what containers are - https://wiki.mozilla.org/images/3/3a/Containers-start-page.png.  In the awesomebar, we could potentially add the container name and icon to the right hand side of the URL to show the user what container they opened the page in previously.

Bookmarks don't necessarily have to be segregated by userContextId.  I assumed (perhaps incorrectly) that bookmarks are part of the history database and just have an extra flag indicating they are a bookmark.  So by adding usercontextid to history, we would get it for free in bookmarks.

Another question you brought up is how this would work with sync.  Containers is a Desktop only feature right now, so if a user has synced with their mobile device, and we aren't careful, they may only get suggestions that are from the default container (userContextId 0).  It sounds like user expectations are unclear here (i.e. maybe they don't want about:newtab to show suggestions based on their usage on other devices, but maybe they do want awesomebar to show cross device suggestions).  When it comes to do browsers on the same type of device, I do believe container information shoudl be synced (just filed bug 1288858)

Anyway, it sounds like there is a lot we have to figure out here.  I want to create a design that balances between performance, user experience, and code complexity here.  And I think the best way to do that is if we meet to discuss what is feasible and what is not.
Per today's discussion on Vidyo, we'll start with adding userContextId to visits table, and let awesomebar tell the user in which context you visited each page.  If that context isn't the current tab's context, the user can open that page in that context just by clicking the awesomebar result.
N.B., awesomebar rows represent aggregations of visits, so you'll have to figure out what to do when a place has been visited in more than one context. This approach sidesteps the frecency scoring problem, but you'll still need some kind of counter/bitfield/additional chunk of query in order to do what you want.
(Reporter)

Comment 19

3 years ago
Clearing the needinfo I set for myself.
Flags: needinfo?(tanvi)
Tanvi suggested we start by using history with context on about:newtab.  I plan to add a column "last_visit_user_context" to moz_places table, and in about:newtab we can try to show only the results that was last opened in the current container.

I thought I could just add a condition to the SELECT operation.  But looking at the newtab code today, I found that firefox only reads from the database once and stores a map of urls to frequencies.  So, in order to make about:newtab container-aware, I may need to maintain a map for each container.  I haven't figured out exactly how to do that yet.
Attachment #8779726 - Attachment is obsolete: true
Hi Drew.  Marco suggested us to consult you about places and visits.  

In order for about:newtab to show top sites based on the last_visit_user_context, I tried adding a column to moz_places, and try to get each place's userContext here: http://searchfox.org/mozilla-central/rev/b92ae9a263a05efefc0110eaad5e56369c9f1b10/toolkit/modules/NewTabUtils.jsm#659

>            let title = row.getResultByIndex(2);
>            let frecency = row.getResultByIndex(12);
>            let lastVisitDate = row.getResultByIndex(5);

But I can't figure out where these magical numbers come from.  They don't seem to match the order in db schema.

Could you enlighten me, please?
Flags: needinfo?(adw)
Hi Jonathan, sorry for the delay.  `row` is a row in the result set of the underlying SQL query.  The query in this case is a SELECT statement, so the result set is a SQL projection.  Those numbers are the zero-based indexes of columns in that projection.

Long story short, the SELECT statement for this particular query is here: https://dxr.mozilla.org/mozilla-central/source/toolkit/components/places/nsNavHistory.cpp#1508

If you map the indexes in the NewTabUtils caller to the columns of the SELECT, you can see:

> let url = row.getResultByIndex(1);              //  1 => url
> if (LinkChecker.checkLoadURI(url)) {
>   let title = row.getResultByIndex(2);          //  2 => title
>   let frecency = row.getResultByIndex(12);      // 12 => frecency
>   let lastVisitDate = row.getResultByIndex(5);  //  5 => last visit date

If you need to know why that particular SELECT statement is being used here, it's because that PlacesSQLQueryBuilder class is used to build the SQL statements that underly the history query API that that NewTabUtils code is calling.

On this line, NewTabUtils is getting a new options object: http://searchfox.org/mozilla-central/rev/b92ae9a263a05efefc0110eaad5e56369c9f1b10/toolkit/modules/NewTabUtils.jsm#642

Options objects have a "result type".  NewTabUtils didn't change the default, which is nsINavHistoryQueryOptions::RESULTS_AS_URI.  If you look at PlacesSQLQueryBuilder::Select, there's a switch statement that chooses which SELECT statement to use based on the result type, and in the RESULTS_AS_URI case, it calls SelectAsURI(), which is where that SELECT statement I linked to earlier is.
Flags: needinfo?(adw)
While we have bug 1288858 for the longer term, we should consider the short-term story with Sync. ISTM the safest thing to do is to have desktop Sync ignore records with the new column - otherwise we will blindly sync them to mobile devices as "normal" history, which feels wrong. I don't think this would be difficult. We can later work out how to coordinate the different Sync implementations having a sane story. Tanvi, what do you think?
Flags: needinfo?(tanvi)
Now I can get each place's last_visit_user_context, but I'm stuck at how to let newtab.xhtml itself know which container it's in.
Attachment #8781454 - Attachment is obsolete: true
I added a column to moz_places to record the last_visit_user_context to each place.
Attachment #8783478 - Attachment is obsolete: true
In each non-default about:newtab, we only show those pages that the user last visited in the same userContext.
(In reply to Mark Hammond [:markh] from comment #25)
> While we have bug 1288858 for the longer term, we should consider the
> short-term story with Sync. ISTM the safest thing to do is to have desktop
> Sync ignore records with the new column - otherwise we will blindly sync
> them to mobile devices as "normal" history, which feels wrong. I don't think
> this would be difficult. We can later work out how to coordinate the
> different Sync implementations having a sane story. Tanvi, what do you think?

The patch as written doesn't track per-visit context at all, so perhaps there's nothing to worry about here -- there's no data to lose!

If we do want to start syncing a subset, then this might still be a little tricky.

If visits themselves were annotated, then we could easily sync a 'view' of places, but as the current patch is implemented we'll see history records flickering in and out of the set we care about as they're visited in different contexts. Incoming records will still collide with these and need to be merged.

Observed behavior would most likely be: no changes are uploaded until the one time you open your work site in the default context by mistake (Cmd-T and start typing!), at which point the most recent context would switch back to 0, and Sync would upload _every_ visit.
Hi Marco, in order to customize about:newtab for non-default containers, does this last_visit_user_context in previous patches look like a good way to start?  Would there be a technical debt if we add this column, but later want to change the design?

The column can be a TINY_INT or something similar if you're concerned about the size increase.

Also, we might also want to add userContextId to each entries in moz_historyvisits, and show them in "Show All History" if container pref is on.  There will be soon be a UI bug for that.
Flags: needinfo?(mak77)
(In reply to Jonathan Hao [:jhao] from comment #30)

> The column can be a TINY_INT or something similar if you're concerned about
> the size increase.

FWIW, SQLite doesn't care about numeric size restrictions. If your value always fits within a byte, it'll take one byte on disk for each table or index in which the value appears.


> Also, we might also want to add userContextId to each entries in
> moz_historyvisits, and show them in "Show All History" if container pref is
> on.  There will be soon be a UI bug for that.

If this happens, all of the syncing considerations from Bug 1288858 (and Comment 29 and so on earlier in this bug) will start to apply, and more thought is required.
Depends on: 1297973
(In reply to Jonathan Hao [:jhao] from comment #30)
> Hi Marco, in order to customize about:newtab for non-default containers,
> does this last_visit_user_context in previous patches look like a good way
> to start?  Would there be a technical debt if we add this column, but later
> want to change the design?

Well, Sqlite doesn't allow to remove columns from a table, so one technical debt is that we may end up with a badly named column. Maybe we should just name it context_id and its contents are an internal implementation detail, that could change in future.
The other possible debt is that, when in future you decide to change the contents, users downgrading may end up with wrong contents. But provided it will still be an id, it will somewhat work.
Finally, we could end up with a fully null column if the feature is completely removed, but that should not be a big trouble.
In the end, provided this will always be a numeric id, and we'll pick a generic enough column name, the price is acceptable, imo.

> The column can be a TINY_INT or something similar if you're concerned about
> the size increase.

As Richard already said, it doesn't matter, in Sqlite you can just use INTEGER or TEXT and it will adapt itself. It's also not mandatory, you could add text to an integer column (you should not though)...

(In reply to Richard Newman [:rnewman] from comment #31)
> > Also, we might also want to add userContextId to each entries in
> > moz_historyvisits, and show them in "Show All History" if container pref is
> > on.  There will be soon be a UI bug for that.
> 
> If this happens, all of the syncing considerations from Bug 1288858 (and
> Comment 29 and so on earlier in this bug) will start to apply, and more
> thought is required.

I don't see a big risk here, honestly.
History is different from bookmarks, you can only add and remove from it (no updates), even if we store context ids for each visit, Sync could just ignore those, and start tracking those in the future when it is teached to do so. It could also just ignore any previous ids and only consider new ones. Dataloss would be the same as starting immediately to store and sync them in the future, afaict.
What's the problem exactly?
Flags: needinfo?(mak77) → needinfo?(rnewman)
If visits are annotated with a context ID, the following user expectations will not be true unless additional work is done:

* Connecting a brand new desktop will have the same context annotations as your existing device. Instead, your work history will appear to be default history on the other device.
* Wiping and re-syncing your device will get it back to the state it was in.

As you note, there's no updating of visits, so it's not straightforward to fill in historical data later.

Further, if annotations are later added to sync, then all platforms need to know about them, else the platforms that don't will drop the extra data from the server each time they write a record, leading to divergence, even between platforms that support context annotations. Just implementing this for desktop Sync won't really work.


For those reasons, if the plan is indeed to start annotating visits at some point soon, it's a good idea to start thinking about what that means in a Sync record format, and what work is required on each platform. History has shown that "figure out the problems later" doesn't really work.

Additionally, the expected user experience is not what the user gets in the presence of Sync, which makes the feature inconsistent and kinda useless for users with more than one machine -- ignoring syncing in the definition of a feature like this is very 2007-era thinking, albeit convenient.
Flags: needinfo?(rnewman)
(In reply to Richard Newman [:rnewman] from comment #33)
> If visits are annotated with a context ID, the following user expectations
> will not be true

I think it all boils down to the scope here. If it's just to experiment on desktop, I don't see why we should be creating user expectations anywhere, it will be a desktop experiment. I suppose the experiment is needed to figure out if it is useful, how it works, how it can be improved, before moving it to other platforms.

> Further, if annotations are later added to sync, then all platforms need to
> know about them

I was assuming Sync could ignore and drop any kind of context id information, until it's ready across all the platforms.

> Additionally, the expected user experience is not what the user gets in the
> presence of Sync, which makes the feature inconsistent and kinda useless for
> users with more than one machine -- ignoring syncing in the definition of a
> feature like this is very 2007-era thinking, albeit convenient.

No one is ignoring Sync, it was brought up multiple times (included Comment 1), but so far Containers is a desktop only feature, isn't it? So no users expectations on mobile and no Sync short term.
To me, sounds like we should just clarify what we can do to allow in the future to eventual port the feature to mobile (and Sync).
I didn't see anything in the proposal preventing a future port to mobile and Sync (At the same time). But maybe there are undisclosed problems? Who can fill the gap?
Re Sync, I'm also thinking about leakage - a desktop user only ever visiting a site in a container may not want it on their mobile device at all - and before that device is container aware, those visits will be forever global visits.
Thanks to Mark, Marco, Drew, Richard.  Your inputs are very helpful.  We do need to have a well-thought plan for Sync.  Maybe Tanvi will have some more ideas after the design sprint this week.

Updated

3 years ago
Depends on: 1302597
Hi, Marco.  Would you please review this patch?  I added a context_id column to moz_places and moz_historyvisits.  I decided not to do any UI in this bug.  Instead, I wrote a browser test to test if I wrote correct value to the database.

Thank you.
Attachment #8797033 - Flags: review?(mak77)
Attachment #8783905 - Attachment is obsolete: true
Attachment #8783907 - Attachment is obsolete: true
Sorry for the delay, I plan to look into this tomorrow.
Comment on attachment 8797033 [details] [diff] [review]
Add userContextId to moz_places and moz_historyvisits.

Review of attachment 8797033 [details] [diff] [review]:
-----------------------------------------------------------------

Please note that Kit is also about to land a Places migration task in bug 1302901, and I think he's close to landing, so one of you may need to unbitrot.

Likely you may want to also update the CREATE_HISTORYVISITS_AFTERDELETE_TRIGGER in nsPlacesTriggers.h to update the context_id when a visit is removed? Something similar to what we do for last_visit_date I think?
That will also need a test.

::: browser/components/contextualidentity/test/browser/browser_history.js
@@ +39,5 @@
> +    let db = yield PlacesUtils.promiseDBConnection();
> +    let rows = yield db.executeCached(
> +      "SELECT context_id FROM moz_places WHERE url_hash = hash(:url) AND url = :url",
> +      { url: URI });
> +    is(rows[0].getResultByIndex(0), userContextId, "Check the last visited userContextId of this URI.");

you may want to actually add 2 visits with 2 different contexts and see the moz_places value is the last one, not the first one (if for now you intend to use this as last-visit-context-id

::: docshell/base/nsDocShell.cpp
@@ +12977,5 @@
>        visitURIFlags |= IHistory::UNRECOVERABLE_ERROR;
>      }
>  
> +    (void)history->VisitURI(aURI, aPreviousURI, visitURIFlags,
> +                            mOriginAttributes.mUserContextId);

is this correct even if the feature is disabled? what happens in such a case, we still get a valid context id but don't use it in the ui, or always get 0?

::: toolkit/components/places/Database.cpp
@@ +1852,5 @@
> +    "SELECT context_id FROM moz_places"
> +  ), getter_AddRefs(stmt));
> +  if (NS_FAILED(rv)) {
> +    rv = mMainConn->ExecuteSimpleSQL(NS_LITERAL_CSTRING(
> +      "ALTER TABLE moz_places ADD COLUMN context_id INTEGER DEFAULT 0 NOT NULL"

shouldn't you also alter moz_historyvisits?

::: toolkit/components/places/Database.h
@@ +17,5 @@
>  #include "Shutdown.h"
>  
>  // This is the schema version. Update it at any schema change and add a
>  // corresponding migrateVxx method below.
> +#define DATABASE_SCHEMA_VERSION 35

when bumping the schema version you must also bump CURRENT_SCHEMA_VERSION in toolkit/components/places/tests/head_common.js, create places_vNN.sqlite in toolkit/components/places/tests/migration/ and eventually create a simple migration test to check the columns exist. You can use bug 1302901 as an example or just check the other tests there.
The db can be created just by launching a build with your new schema version, then opening the db and issuing a VACUUM command.

::: toolkit/components/places/nsPlacesTables.h
@@ +34,5 @@
>      ", place_id INTEGER" \
>      ", visit_date INTEGER" \
>      ", visit_type INTEGER" \
>      ", session INTEGER" \
> +    ", context_id INTEGER" \

is there a reason this is not DEFAULT 0 NOT NULL?
Attachment #8797033 - Flags: review?(mak77)
(In reply to Marco Bonardo [::mak] from comment #39)
> Comment on attachment 8797033 [details] [diff] [review]
> Add userContextId to moz_places and moz_historyvisits.
> 
> Review of attachment 8797033 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Please note that Kit is also about to land a Places migration task in bug
> 1302901, and I think he's close to landing, so one of you may need to
> unbitrot.
> 
> Likely you may want to also update the
> CREATE_HISTORYVISITS_AFTERDELETE_TRIGGER in nsPlacesTriggers.h to update the
> context_id when a visit is removed? Something similar to what we do for
> last_visit_date I think?
> That will also need a test.
> 
> ::: browser/components/contextualidentity/test/browser/browser_history.js
> @@ +39,5 @@
> > +    let db = yield PlacesUtils.promiseDBConnection();
> > +    let rows = yield db.executeCached(
> > +      "SELECT context_id FROM moz_places WHERE url_hash = hash(:url) AND url = :url",
> > +      { url: URI });
> > +    is(rows[0].getResultByIndex(0), userContextId, "Check the last visited userContextId of this URI.");
> 
> you may want to actually add 2 visits with 2 different contexts and see the
> moz_places value is the last one, not the first one (if for now you intend
> to use this as last-visit-context-id

Yes, that's already what I did in the test.  The loop goes through three different contexts, add visits, and make sure that the moz_places value is the last one.

> ::: docshell/base/nsDocShell.cpp
> @@ +12977,5 @@
> >        visitURIFlags |= IHistory::UNRECOVERABLE_ERROR;
> >      }
> >  
> > +    (void)history->VisitURI(aURI, aPreviousURI, visitURIFlags,
> > +                            mOriginAttributes.mUserContextId);
> 
> is this correct even if the feature is disabled? what happens in such a
> case, we still get a valid context id but don't use it in the ui, or always
> get 0?

If privacy.userContext.enabled is never turned on, mUserContextId will always be 0.

> ::: toolkit/components/places/Database.cpp
> @@ +1852,5 @@
> > +    "SELECT context_id FROM moz_places"
> > +  ), getter_AddRefs(stmt));
> > +  if (NS_FAILED(rv)) {
> > +    rv = mMainConn->ExecuteSimpleSQL(NS_LITERAL_CSTRING(
> > +      "ALTER TABLE moz_places ADD COLUMN context_id INTEGER DEFAULT 0 NOT NULL"
> 
> shouldn't you also alter moz_historyvisits?
> 
> ::: toolkit/components/places/Database.h
> @@ +17,5 @@
> >  #include "Shutdown.h"
> >  
> >  // This is the schema version. Update it at any schema change and add a
> >  // corresponding migrateVxx method below.
> > +#define DATABASE_SCHEMA_VERSION 35
> 
> when bumping the schema version you must also bump CURRENT_SCHEMA_VERSION in
> toolkit/components/places/tests/head_common.js, create places_vNN.sqlite in
> toolkit/components/places/tests/migration/ and eventually create a simple
> migration test to check the columns exist. You can use bug 1302901 as an
> example or just check the other tests there.
> The db can be created just by launching a build with your new schema
> version, then opening the db and issuing a VACUUM command.
> 
> ::: toolkit/components/places/nsPlacesTables.h
> @@ +34,5 @@
> >      ", place_id INTEGER" \
> >      ", visit_date INTEGER" \
> >      ", visit_type INTEGER" \
> >      ", session INTEGER" \
> > +    ", context_id INTEGER" \
> 
> is there a reason this is not DEFAULT 0 NOT NULL?

I'll address your other comments.  Thanks, Marco.
Hi Marco, I fixed what you said in the last comment.  If Kit lands his patch first, I think I only need to add one to the version numbers, right?
Attachment #8800040 - Flags: review?(mak77)
Attachment #8797033 - Attachment is obsolete: true
(In reply to Jonathan Hao [:jhao] from comment #41)
> Created attachment 8800040 [details] [diff] [review]
> Add userContextId to moz_places and moz_historyvisits.
> 
> Hi Marco, I fixed what you said in the last comment.  If Kit lands his patch
> first, I think I only need to add one to the version numbers, right?

right, and I think it landed.
Before moving on, I want to be sure we have clearance from Sync for the Desktop only addition and that previous doubts are resolved. The patch is adding context to every history entry and coalesced last_context to each uri.
Flags: needinfo?(markh)
I think the moz_historyvisits change is problematic in the context of Sync - for example, 2 desktop devices syncing will not carry that contextId (ie, a visit with a non-default contextId on one device will appear as a visit on the default contextId on the second device). This might be solvable by having Sync ignoring any visit with a non-default context ID.

Similarly, a user who successfully limits visits to a site to a non-default context on desktop, but still uses that site on mobile will find that Sync will "pollute" their history on desktop by adding visits against the default context ID. It *might* be possible to address this by, eg, applying an incoming record from a mobile device that's not contextID-aware by using the contextId from the most recent visit on the local device.

These are all variations of the points already made above.

(In reply to Jonathan Hao [:jhao] from comment #36)
> Thanks to Mark, Marco, Drew, Richard.  Your inputs are very helpful.  We do
> need to have a well-thought plan for Sync.  Maybe Tanvi will have some more
> ideas after the design sprint this week.

This comment hasn't been addressed and the needinfo on Tanvi remains in place. Best I can tell it will not break Sync itself but runs the risk of making this new feature behave unexpectedly and causing what the user may perceive as "corruption" of their history visits if they use Sync. IMO it would be short-sighted to land this without that well-thought plan.
Flags: needinfo?(markh)
(In reply to Mark Hammond [:markh] from comment #44)

> I think the moz_historyvisits change is problematic in the context of Sync -
> for example, 2 desktop devices syncing will not carry that contextId (ie, a
> visit with a non-default contextId on one device will appear as a visit on
> the default contextId on the second device). This might be solvable by
> having Sync ignoring any visit with a non-default context ID.

Which is a painful tradeoff to make. We have to choose between:

  * Your other device has all of your contextualized history dumped into the default bucket.
  * Your other device has no idea at all about many of the pages you visited.

Furthermore, if a user wipes their machine and re-syncs, the result will be either:

  * Large chunks of history are missing entirely. Data loss!
  * All of their contextualized history now appears to be in the default bucket. Data loss!

In short: this feature would be kinda painfully useless for Sync users. They wouldn't be able to rely on context data.


> IMO it would be short-sighted to land this without that well-thought plan.

Agreed.
Comment on attachment 8800040 [details] [diff] [review]
Add userContextId to moz_places and moz_historyvisits.

Review of attachment 8800040 [details] [diff] [review]:
-----------------------------------------------------------------

while I could proceed with the review, Looks like you have a pending problem to resolve first.
I'd suggest to organize a meeting or a threaded mail discussion with the Sync team to find a way out of it, and then unbitrot the patch and re-ask for review. From a technical point of view I don't see roadblock and I'll be happy to review this (hopefully faster than I did so far). Unbitrotting and reviewing now, sounds like it may end up doubling the work if then the other discussion takes a long time and other changes hit central.
From a quick look the patch looks ok, we will go into finer details when ready.
Attachment #8800040 - Flags: review?(mak77) → feedback+
Will this patch work for isolating visited links on the page itself across containers or should we file a follow bug for this?
Flags: needinfo?(jhao)
No, this patch tags the database entries with usercontextid when users visit pages but it hasn't used that information to query whether a page is visited.
Flags: needinfo?(jhao)
Blocks: 1325874
Unassign myself because I won't be able to work on this anymore.
Assignee: jhao → nobody
Status: ASSIGNED → NEW

Comment 51

2 years ago
As I see things … 

… thoughts in and around this bug 1283320 might be helped, maybe reshaped, by a relatively recent (2016-12-15) NEW P5 DOM:security enhancement: 

bug 1323873 - Allow tabs to change user contexts when navigating
<https://bugzilla.mozilla.org/show_bug.cgi?id=1323873>

– if that allowance will be made, then will it become _simpler_ to think about history? 

And so on, and not only developers' thinking. There's also the wish for end users to have the potential for non-complex education, maybe tacit learning, about how Firefox containers can shape the history of a tab.
Bug 1323873 is merely a way to change the tab from one container to another upon navigating to a new URL.
The other bug needs to account for history across containers but it doesn't mean the database would be tagged correctly which is the issue here. This bug might make the other simpler to implement though.
Depends on: 1374788
Blocks: 1374788
No longer depends on: 1374788
Duplicate of this bug: 1428598

Comment 54

a year ago
This feature is very needed for extensions like Private Tab and Containers On The Go, that must clean-up all history on container remove, and now this is impossible without having userContextId in each History record. So this is not only security thing, but large usability problem for many users.

Comment 55

11 months ago
Any progress? I'm still waiting for upgrade from firefox 57 to firefox quantum to use the two addons: "Private Tab" and "Proxy Privacy Ruler"

https://github.com/Infocatcher/Private_Tab_WE#issues  the addon Private Tab WE is blocked by issue 1283320(https://bugzilla.mozilla.org/show_bug.cgi?id=1283320) and issue 1353726(https://bugzilla.mozilla.org/show_bug.cgi?id=1353726)
Comment hidden (advocacy)

Comment 57

11 months ago
It also block "Temporary Containers" add-on.

Comment 58

6 months ago
This is what is preventing the awesome containers feature from going further, as well as preventing the add-ons already mentioned above by others.

Is there anyone working on this? This is key!

Comment 59

3 months ago

Is there something we can do to advance this?

(In reply to oransterf from comment #59)

Is there something we can do to advance this?

Who is "we" here? To move this forward, we need to come up with a plan which takes all the above comments into account. The comments above aren't all that easy to follow, so something concrete might be to start a document which attempts to list the challenges in a sane, structured way, which can then be followed by an implementation plan (which is likely to require work on all of our sync implementations)

You need to log in before you can comment on or make changes to this bug.