Closed Bug 1127894 Opened 11 years ago Closed 2 years ago

Gather interest with partners about a content personalization experiment

Categories

(Content Services Graveyard :: Classification Engine, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: kghim, Assigned: mruttley)

References

Details

(Whiteboard: .?)

No description provided.
Goal: Get community involvement, with people able to build extensions upon the interest signal.

Process:
1. Design a web standard format for representing the UP signal, probably as some kind of highly extensible JSON.
2. Integrate that standardized signal into navigator.interests, in the same way that we have navigator.geolocation.

Considerations:
1. UI-based personalization: Given a list of articles/links, re-order them based on a user's interests. Have declarative syntax in the markup denoting personalizable content, e.g.

    <ul personalizable=true>
      <li topic="football">...</li>
      <li topic="basketball">...</li>
      ...
    </ul>

   This content can be re-ordered according to the user's preference. The topic list could be:
   - self-declared
   - obtained through meta-tags (used for SEO)
   - prompted to the user via a door-hanger for each domain, as determined by rules in HSTS or CORS, with scope and expiry

2. Content Jail: Have a portion of a page declared to be open to personalization (could be similar to the markup example above). This portion of the page would be treated differently by the browser:
   - After rendering the markup un-personalized, create a duplicate of that DOM fragment and give it its own JavaScript runtime.
   - All resources for the personalizable area are pre-loaded (perhaps they are declared in a CSP fashion?).
   - That JavaScript runtime will have limited capabilities (e.g. no network access). It runs the code supplied by the site. Do not expose things like the visited CSS selector, etc.
   - Only transforms are allowed in that content jail.

   The two JS runtimes would not be allowed to talk to each other.
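A minimal sketch of the re-ordering described in consideration 1, assuming a hypothetical JSON shape for the UP signal (topic name mapped to a score between 0 and 1). The `InterestSignal` and `PersonalizableItem` types and the `reorderByInterest` function are illustrative names, not part of any existing API; in a browser the items would come from the `<li topic="...">` children of the personalizable list.

```typescript
// Hypothetical shape for a standardized interest ("UP") signal.
interface InterestSignal {
  version: number;
  // Topic name -> score in [0, 1]; higher means stronger interest.
  interests: Record<string, number>;
}

// One entry of a personalizable list, e.g. <li topic="football">.
interface PersonalizableItem {
  topic: string;
  label: string;
}

// Return a re-ordered copy, highest-scoring topics first.
// Topics missing from the signal score 0; ties keep the site's original order.
function reorderByInterest(
  items: PersonalizableItem[],
  signal: InterestSignal
): PersonalizableItem[] {
  const score = (item: PersonalizableItem): number =>
    signal.interests[item.topic] ?? 0;
  return items
    .map((item, index) => ({ item, index }))
    .sort((a, b) => score(b.item) - score(a.item) || a.index - b.index)
    .map((entry) => entry.item);
}

// Dummy data for illustration only.
const exampleSignal: InterestSignal = {
  version: 1,
  interests: { basketball: 0.9, football: 0.4 },
};

const exampleItems: PersonalizableItem[] = [
  { topic: "football", label: "Football headlines" },
  { topic: "cooking", label: "Recipes" },
  { topic: "basketball", label: "Basketball headlines" },
];

const reordered = reorderByInterest(exampleItems, exampleSignal);
console.log(reordered.map((item) => item.topic).join(", "));
```

With this dummy data the basketball item moves to the front, football follows, and the unscored cooking item stays last. The input array is left untouched, which matches the idea of rendering the un-personalized markup first and transforming a copy.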
Summary: Interest Signal Browser Attribute, Microformat Study → Gather interest with partners about a content personalization experiment
We tried exposing interests as a web API a couple of years ago, both as a general API and as a limited API. In either case, there's always the concern of data leaks: this gives any page far more access to a user's browsing history than before, and the page could then track that on its servers forever. Even if the "API" is automatic reordering of content, it's similar to the CSS visited-link issue, where a site could read out the styling to determine whether a user has visited a page, so there was a lot of special casing to prevent that data leakage. To be clear, this problem is quite complex, because there are also ways to trick users into clicking on certain things. For example, a site could color both the background and the text black, so that the text becomes visible only if the interest API transforms it to white; users who interact with the visible white text would then reveal their interest.
Another side of what we ran into with the interest API was setting defaults and figuring out user prompting. Would sites get default access to a tiny bit of the API? Could users allow more access to certain sites? Would Firefox enforce some type of limited access based on how much data has been shared so far? All of those had various deeper questions as well, especially the Firefox enforcement. How do we quantify how much has been exposed versus how much remains to be exposed? What's the limit? Are there certain combinations of data that are much more revealing, e.g., accidental classification of someone's street address (e.g., Gouda Ave -> accidentally very high interest in cheese)?
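One way to picture the "limited access based on how much data has been shared so far" question is a per-site exposure budget. The sketch below is purely illustrative: the `ExposureBudget` class, counting one unit per distinct interest disclosed, and the limit of 3 are all assumptions made for the example. It does not address the harder question of combinations of data that are more revealing together.

```typescript
// Hypothetical per-site accounting of which interests have been disclosed.
class ExposureBudget {
  private limitPerSite: number;
  private disclosed = new Map<string, Set<string>>();

  constructor(limitPerSite: number) {
    this.limitPerSite = limitPerSite;
  }

  // Returns true if the interest may be shared with the site.
  // Re-sharing an already disclosed interest costs nothing extra.
  tryDisclose(site: string, interest: string): boolean {
    const seen = this.disclosed.get(site) ?? new Set<string>();
    if (seen.has(interest)) return true;
    if (seen.size >= this.limitPerSite) return false;
    seen.add(interest);
    this.disclosed.set(site, seen);
    return true;
  }

  // How many more distinct interests the site could still receive.
  remaining(site: string): number {
    return this.limitPerSite - (this.disclosed.get(site)?.size ?? 0);
  }
}

// Assumed limit of 3 distinct interests per site (arbitrary for the example).
const exampleBudget = new ExposureBudget(3);
for (const interest of ["sports", "cooking", "travel", "finance"]) {
  const allowed = exampleBudget.tryDisclose("news.example", interest);
  console.log(interest, allowed ? "shared" : "blocked");
}
```

Here the fourth distinct interest is blocked for `news.example`, while other sites start with a fresh budget. Whether a simple count is a meaningful measure of exposure is exactly the open question raised above.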
But comment 2 and comment 3 were both in the context of putting the API into Firefox. If we're just looking to experiment to measure potential benefit of having the API, we can have the appropriate privacy policy and disclaimers to let add-on users be aware of what would be going on in the experiment. A lot of the questions from the previous posts don't really need to be addressed until we know for sure this is something that we want to get into Firefox for all users.
The considerations portion of the conversation is part of an email I wrote. Let me clarify. The UI-based recommendation approach is something we could use as a low-cost way to get publishers involved in an experiment. We would need to ship an add-on, possibly as a Telemetry Experiment, that would be able to interpret the personalizable content. This experiment would not address concerns about privacy leaks. It would be limited to re-ordering elements. The Sandbox idea, or Content Jail, would not be part of a conversation with partners. It would be a conversation to be had with the broader Mozilla community. It's a longer-term effort that could be supported by the findings from the UI-based experiment. The scope could be greater than just re-ordering elements; further transformations could be allowed. This description is out of scope for this bug.
The goal of the Sandbox idea would be to address information leak concerns outlined in comment 2. It is also a more abstract idea. There is no need to be limited to an "UP" signal. The data available in the sandbox could be anything: e.g. site visit history, purchase history, ad impression history for conversion tracking, etc...
Pasting in relevant communication to Ben regarding which parts of the site need to be personalized to get the engagement lift.

Based on nytimes.com study: the table below shows visits made by non-subscribers vs. subscribers to various sections of the site.

+------------+-------+-------+-----------+--------------+----------+
| subscribed | users | total | home_page | section_page | articles |
+------------+-------+-------+-----------+--------------+----------+
|          0 |   875 | 30724 |     14268 |         2629 |    13827 |
|          1 |    28 |  2965 |       877 |          305 |     1783 |
+------------+-------+-------+-----------+--------------+----------+

Note that:
- subscribers visit a disproportionately larger number of pages
- yet visit volume from random web folks is 10x more
- the home page gets close to half of the visit volume (about 45% across both groups)
- section pages seem insignificant

If we are looking for "personalization lift" (however it's measured), personalizing the home page is the main target:
- it's much easier to do than the full site
- being unspecific, it should benefit most from personalization
- if "personalization lift" exists at all, it should surface at home-page level (and prove our usefulness assumption)
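The ratios behind those notes can be recomputed directly from the table above. This is only a worked-arithmetic sketch; the `VisitRow` type and `summarize` function are names made up for the example, and the figures are copied from the table.

```typescript
interface VisitRow {
  subscribed: boolean;
  users: number;
  total: number;
  homePage: number;
  sectionPage: number;
  articles: number;
}

// Figures copied from the nytimes.com table.
const visitRows: VisitRow[] = [
  { subscribed: false, users: 875, total: 30724, homePage: 14268, sectionPage: 2629, articles: 13827 },
  { subscribed: true, users: 28, total: 2965, homePage: 877, sectionPage: 305, articles: 1783 },
];

// Visits per user and the share of visits going to each part of the site.
function summarize(row: VisitRow) {
  return {
    visitsPerUser: row.total / row.users,
    homeShare: row.homePage / row.total,
    sectionShare: row.sectionPage / row.total,
    articleShare: row.articles / row.total,
  };
}

for (const row of visitRows) {
  const summary = summarize(row);
  console.log(
    row.subscribed ? "subscribers" : "non-subscribers",
    "visits/user:", summary.visitsPerUser.toFixed(1),
    "home:", (100 * summary.homeShare).toFixed(1) + "%",
    "section:", (100 * summary.sectionShare).toFixed(1) + "%",
    "articles:", (100 * summary.articleShare).toFixed(1) + "%"
  );
}
```

Non-subscribers come out at roughly 35 visits per user against roughly 106 for subscribers, and the home page accounts for about 46% of non-subscriber visits but only about 30% of subscriber visits, so the home-page argument applies most strongly to the non-subscriber traffic that dominates the volume.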
In my humble opinion, we should go to Yahoo:News, figure out how sandboxing will work with them, take their RSS and their trending/popularity data, and put together a personalized Yahoo:News home page. Then run a Telemetry experiment to measure the "lift". If it works for Yahoo:News, we could take it to the rest of them.
Iteration: 38.2 - 9 Feb → 38.3 - 23 Feb
OS: Mac OS X → All
Hardware: x86 → All
Whiteboard: [story] → .?
Iteration: 38.3 - 23 Feb → 39.1 - 9 Mar
Iteration: 39.1 - 9 Mar → 39.2 - 23 Mar
Iteration: 39.2 - 23 Mar → 39.3 - 30 Mar
Intent Casting is now working! (Albeit with dummy data)

Branch: https://github.com/mozilla/interest-dashboard/commits/InterestSignal
Main Commits: https://github.com/mozilla/interest-dashboard/commit/0598ecacb776385384ff5d397c1a8566163a82c3
Testing Page: https://github.com/mozilla/interest-dashboard/commit/dc37ef8d9319568a07bad87ee295aa96947dc3df
Minor Correction later: https://github.com/mozilla/interest-dashboard/commit/5b64aa2234d19fc8be5029f9acc3767d9e801661

Next steps involve hooking in actual data (I'm guessing from SimpleStorage) and creating the contentJail style prototype.
Screenshots of it in action:
- https://i.imgur.com/ZvGcYQ9.png (Permissions prompt)
- https://i.imgur.com/OWknf2c.png (Result)

I've tried to make it as similar in functionality to navigator.geolocation as possible.
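To illustrate what "similar in functionality to navigator.geolocation" could mean, the sketch below loosely mimics the success/error callback shape of `navigator.geolocation.getCurrentPosition`. Everything in it is a stand-in: `InterestProvider`, `getCurrentInterests`, and the `askUser` prompt callback are hypothetical names rather than the code in the InterestSignal branch, the data is dummy data, and the callbacks run synchronously for simplicity where a real browser API would be asynchronous.

```typescript
type InterestScores = Record<string, number>;

interface InterestError {
  code: "PERMISSION_DENIED";
  message: string;
}

// Hypothetical provider that gates interest data behind a permission prompt.
class InterestProvider {
  private scores: InterestScores;
  private askUser: (origin: string) => boolean;

  constructor(scores: InterestScores, askUser: (origin: string) => boolean) {
    this.scores = scores;
    this.askUser = askUser;
  }

  getCurrentInterests(
    requestingOrigin: string,
    onSuccess: (scores: InterestScores) => void,
    onError?: (error: InterestError) => void
  ): void {
    if (this.askUser(requestingOrigin)) {
      // Hand out a copy so the page cannot modify the stored scores.
      onSuccess({ ...this.scores });
    } else if (onError) {
      onError({ code: "PERMISSION_DENIED", message: "User declined to share interests" });
    }
  }
}

// Dummy data; the prompt is simulated by allowing a single origin.
const exampleProvider = new InterestProvider(
  { sports: 0.8, cooking: 0.3 },
  (origin) => origin === "https://news.example"
);

exampleProvider.getCurrentInterests(
  "https://news.example",
  (scores) => console.log("shared:", JSON.stringify(scores)),
  (error) => console.log("error:", error.code)
);
exampleProvider.getCurrentInterests(
  "https://other.example",
  (scores) => console.log("shared:", JSON.stringify(scores)),
  (error) => console.log("error:", error.code)
);
```

The first call reaches the success callback and the second reaches the error callback, mirroring the permissions prompt followed by the result shown in the screenshots.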
What is this work for? We have existing code back from 2013 that was implemented directly in Firefox to share interests: https://github.com/Mardak/up-central/blob/8dffdc699140e0c022e5d36b53d24b82bc989004/toolkit/components/interests/Interests.js There were Nightly builds that allowed users to share interests from Firefox using the usual Firefox interface for permissions, etc.
In particular, we had an API that allowed getting some number of top interests: https://github.com/Mardak/up-central/blob/8dffdc699140e0c022e5d36b53d24b82bc989004/toolkit/components/interests/Interests.js#L801 as well as an API to get scores of interests by name: https://github.com/Mardak/up-central/blob/8dffdc699140e0c022e5d36b53d24b82bc989004/toolkit/components/interests/Interests.js#L767 Both were exposed through navigator.interests on a webpage.
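The two capabilities mentioned here, top interests and scores by name, can be sketched over a plain score map. `topInterests` and `scoresByName` are illustrative names chosen for this example and are not claimed to match the function names in Interests.js; the scores are made up.

```typescript
type ScoreMap = Record<string, number>;

// Return the `count` highest-scoring interests, breaking ties by name.
function topInterests(
  scores: ScoreMap,
  count: number
): Array<{ name: string; score: number }> {
  return Object.entries(scores)
    .map(([name, score]) => ({ name, score }))
    .sort((a, b) => b.score - a.score || a.name.localeCompare(b.name))
    .slice(0, Math.max(0, count));
}

// Return scores for the requested names only; unknown names score 0.
function scoresByName(scores: ScoreMap, names: string[]): ScoreMap {
  const result: ScoreMap = {};
  for (const name of names) {
    result[name] = scores[name] ?? 0;
  }
  return result;
}

// Dummy data for illustration.
const sampleScores: ScoreMap = { sports: 0.2, cooking: 0.9, travel: 0.5 };
console.log(JSON.stringify(topInterests(sampleScores, 2)));
console.log(JSON.stringify(scoresByName(sampleScores, ["sports", "autos"])));
```

The first call returns the two strongest interests in descending order, and the second returns a score only for the names the page asked about, which keeps the amount of data handed over proportional to the request.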
There's more information in the various bugs that block bug 839132.
Mardak: Thanks for the links, I've never seen that file before. However, this implementation is very different since it involves integrating data from the Interest Dashboard and is combined with the extension itself. It will also be combined with the container approach (e.g. putting personalizable=true on things like <ul> tags), and the two will be released to see what the developer community thinks.
Iteration: 39.3 - 30 Mar → 40.1 - 13 Apr
Iteration: 40.1 - 13 Apr → 40.2 - 27 Apr
Iteration: 40.2 - 27 Apr → ---
Blocks: 1104322
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → INCOMPLETE