1159571 - Use server provided frequency caps for daily and lifetime totals

Reporter

Description

•

9 years ago

We'll want two types of persistent frequency caps: a daily cap and a total cap. These caps come from the server instead of being hardcoded as initially implemented in bug 1140496.

Ed Lee :Mardak

Reporter

Comment 1

•

9 years ago

jterry mentioned it would be nice to be able to spread out usage of frequency capping, i.e., show each suggestion once before showing the same thing a second time.

And to confirm, we do want daily and lifetime totals, e.g., 3 daily and 5 total -- this could result in a user seeing a suggestion for 2 days (3, 2) or 5 days (1, 1, 1, 1, 1) or even longer if some days had no impressions (1, 0, 0, 1, 0, 0, 3).

For now, we'll continue frequency capping by url as per bug 1140496.

There's potentially multiple ways to frequency cap on a daily basis:
- show at most daily-cap within 24 hours from first showing
- reset the counter at midnight
- 24 hour rolling (e.g., if we reset at time T, potentially the user sees a tile 3 times at T-1 then 3 more times at T+1)

Kevin Ghim

Comment 2

•

9 years ago

Ideally, we'd want flexibility for frequency capping for various type combinations. The factors we need to consider are:
- is the sponsorship for awareness or response?
- is the revenue type for the sponsorship by impressions or by clicks?

If the campaign is CPC, we should need tighter control in setting impression and click caps. 

If the revenue type is by impressions, there's a fair balance we need to strike between the possibility of conversion and profitability. The less we display the sponsorship, the effectiveness will rise, however, it might take longer to reach revenue goal. 

There are a number of studies done on this topic which is worth looking into:
https://advertising.microsoft.com/user/en-us/researchlibrary/researchreport/OptimalFrequency.pdf
http://static.googleusercontent.com/media/www.google.fr/fr/fr/doubleclick/pdfs/Better-Brand-Engagement-with-Display-Formats-Feb-2012.pdf

Generally, 3 per 24 hours per user is optimal, but these study are based on IAB standard size display ads which is quite different than our ad products. We should test out what is optimal for our audience depending on product type, revenue type and eventually channel type.

Ed Lee :Mardak

Reporter

Comment 3

•

9 years ago

(In reply to Kevin Ghim from comment #2)
> If the campaign is CPC, we should need tighter control in setting impression
> and click caps. 
What kind of tighter controls?

Kevin Ghim

Comment 4

•

9 years ago

(In reply to Ed Lee :Mardak from comment #3)
> (In reply to Kevin Ghim from comment #2)
> > If the campaign is CPC, we should need tighter control in setting impression
> > and click caps. 
> What kind of tighter controls?

As in, if the campaign is CPC, cap is 2 vs. CPM cap is 5. Not sure what the number is for us, but we'd want to back up our assumptions from frequency vs CTR reporting. Is that possible?

Ed Lee :Mardak

Reporter

Comment 5

•

9 years ago

I'm confused about what additional controls you need over using server provided frequency cap values.

Kevin Ghim

Comment 6

•

9 years ago

(In reply to Ed Lee :Mardak from comment #5)
> I'm confused about what additional controls you need over using server
> provided frequency cap values.

Perhaps I can answer by asking some questions I have. For 39, can we manually set a frequency cap amount per impression and/or clicks per window? If not, we should consider that for future versions. Additionally, can we get reporting on frequency cap performance? Both questions should provide direction towards optimal capping amount per variable type (product, revenue, audience, etc).

Kevin Ghim

Updated

•

9 years ago

Whiteboard: .? → .002

Ed Lee :Mardak

Reporter

Updated

•

9 years ago

Assignee: nobody → edilee

Iteration: --- → 40.3 - 11 May

Points: --- → 8

Ed Lee :Mardak

Reporter

Comment 7

•

9 years ago

I'm not sure what "frequency cap performance" means. With server provided frequency caps for daily and total views, we can set that on a per tile basis. We can measure impressions and clicks to calculate CTR as usual.

So this means we can serve a tile with frequency caps of 3daily+10total and another tile with 5daily+5total or whatever combination and see if that affects CTR.

Ed Lee :Mardak

Reporter

Updated

•

9 years ago

Blocks: 1160372

maxim zhilyaev

Assignee

Updated

•

9 years ago

Assignee: edilee → mzhilyaev

maxim zhilyaev

Assignee

Comment 8

•

9 years ago

I believe that we add count of shows to the click ping - we will know the optimal frequency cap early through the campaign.  So we can correct it during the campaign.  Which brings the question of showing doing a round-robin through available suggested tiles rather then showing same suggested tile all the time.

Ed Lee :Mardak

Reporter

Comment 9

•

9 years ago

Round robin probably isn't necessary. A call to _updateSuggested will pick the next one at random at this point and spreads out views more than just repeating the same.

maxim zhilyaev

Assignee

Comment 10

•

9 years ago

Need info 

1) Is directoryId a unique tile identifier?
Do we recycle these id?
Is it possible to get a different tile with same directoryID?

2) If tile contains no daily/total frequency cap, should there be defaults?
If so which ones?

Flags: needinfo?(oyiptong)

Tim Spurway [:tspurway]

Comment 11

•

9 years ago

(In reply to maxim zhilyaev from comment #10)
> Need info 
> 
> 1) Is directoryId a unique tile identifier?
> Do we recycle these id?
> Is it possible to get a different tile with same directoryID?
> 
The directoryId is unique and will never be reused.

Olivier Yiptong [:oyiptong]

Comment 12

•

9 years ago

> 2) If tile contains no daily/total frequency cap, should there be defaults?
If so which ones?

We should ask :jterry what a sensible default is.
I think 3/24 is the standard for the daily freq. cap.

Flags: needinfo?(oyiptong)

Olivier Yiptong [:oyiptong]

Updated

•

9 years ago

Depends on: 1161192

maxim zhilyaev

Assignee

Comment 13

•

9 years ago

Attached patch v1. new frequency capping (obsolete) — Details — Splinter Review

Attachment #8601629 - Flags: review?(msamuel)

maxim zhilyaev

Assignee

Updated

•

9 years ago

Blocks: 1161656

maxim zhilyaev

Assignee

Comment 14

•

9 years ago

Attached patch v2. frequency caps - updated documentation (obsolete) — Details — Splinter Review

updated documentation at browser/docs/DirectoryLinksProvider.rst

Attachment #8601629 - Attachment is obsolete: true

Attachment #8601629 - Flags: review?(msamuel)

Attachment #8601751 - Flags: review?(msamuel)

Marina Samuel [:emtwo]

Comment 15

•

9 years ago

Comment on attachment 8601751 [details] [diff] [review]
v2. frequency caps - updated documentation

Review of attachment 8601751 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/modules/DirectoryLinksProvider.jsm
@@ +762,5 @@
> +   * @param json file path
> +   * @return a promise resolved to a valid object or undefined upon error
> +   */
> +  _readJsonFile: function DirectoryLinksProvider_readJsonFile(filePath) {
> +    let jsonObj;

`let jsonObj = {}` here ensures when an error occurs we still return a valid json object rather than `undefined`

@@ +792,5 @@
> +      // check _frequencyCapFilePath for existence
> +      let doesFileExists = yield OS.File.exists(this._frequencyCapFilePath);
> +      if (doesFileExists) {
> +        let fileObject = yield this._readJsonFile(this._frequencyCapFilePath);
> +        if (fileObject) {

I think you can skip this check if you make the change I mention above in _readJsonFile(). So you won't need line 791: `this._frequencyCaps = {};`

And you can directly write `this._frequencyCaps = yield this._readJsonFile(this._frequencyCapFilePath);`

@@ +852,5 @@
> +   *        all cap objects with lastUpdated less than (now() - timeDelta)
> +   *        will be removed. This is done to remove frequency cap objects
> +   *        for unused tile urls
> +   */
> +  _pruneFrequencyCapUrls: function DirectoryLinksProvider_pruneFrequencyCapUrls(timeDelta = DEFAULT_PRUNE_TIME_DELTA) {

Pruning based on a time delta seems a little arbitrary to me. Why is this delta chosen? What happens if a URL was last updated more than 10 days ago so we delete its frequency cap info and stop showing the tile even though it didn't reach its cap? Maybe we should prune for urls that have already reached their caps?

Ed Lee :Mardak

Reporter

Comment 16

•

9 years ago

(In reply to Marina Samuel [:emtwo] from comment #15)
> Pruning based on a time delta seems a little arbitrary to me.
This is more of a failsafe of making sure we don't store frequency counts for old urls forever.

> Maybe we should prune for urls that have already reached their caps?
If we prune too quickly, then we'll end up showing something that reached its total cap -> prune -> now uncapped again.

maxim zhilyaev

Assignee

Comment 17

•

9 years ago

Talked to Marina and we agreed to keep _readJsonFile() unchanged because it may be useful to check if file read fails.  We also decided to keep file existence check logic in _readFrequencyCapFile() method.

maxim zhilyaev

Assignee

Comment 18

•

9 years ago

Attached patch v3. minor documentation tweak (obsolete) — Details — Splinter Review

Attachment #8601751 - Attachment is obsolete: true

Attachment #8601751 - Flags: review?(msamuel)

Attachment #8602235 - Flags: review?(adw)

maxim zhilyaev

Assignee

Comment 19

•

9 years ago

Attached patch v4. yet another minor doc change (obsolete) — Details — Splinter Review

yes another one liner

Attachment #8602235 - Attachment is obsolete: true

Attachment #8602235 - Flags: review?(adw)

Attachment #8602238 - Flags: review?(adw)

Drew Willcoxon :adw

Comment 20

•

9 years ago

Comment on attachment 8602238 [details] [diff] [review]
v4. yet another minor doc change

Review of attachment 8602238 [details] [diff] [review]:
-----------------------------------------------------------------

Technically this looks fine, but I think there's kind of a big problem around leaking the user's history.  Sorry for not thinking of this earlier when you sketched the idea to me, Max.  (I'd ask you about it in IRC first but it's the end of the day and I see you're not there, and I promsied a review today.)  I also have code comments below that I typed up earlier.

There are a couple of aspects.  You're storing capObject.clicked, so it's possible to tell that the user visited capObject's URL.  But even without that, you're storing the simple fact that a suggested link was shown in newtab at all.  (Because the URLs that are the keys in the _frequencyCaps object are suggested URLs, not URLs from the user's history, correct?)  So while it's not possible to tell what the user has in their history (setting aside the capObject.clicked problem), it is possible to guess at the kinds of pages that might be in their history based on the ones that were suggested.

Am I wrong about any of that?

I think it's OK to store all that information as long as the `clicked` suggested pages remain in the user's history and as long as the pages that prompted the suggested sites remain in the user's history, but once those pages are deleted from history, I don't think it's acceptable to keep that info on disk, not even for the 10-day window in the patch.

There are a couple of technical options, but it probably gets complicated when you think about purging suggested URLs when the pages that prompted the suggestions are deleted.  You might be able to use the Places annotation service.  Then your data would be hooked into the history service automatically and you wouldn't have to worry about leaking anything (with the aforementioned complication around suggested URLs).  But I don't think the annotation service is used very much.  We could ask Marco for his opinion.  http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/nsIAnnotationService.idl

The other option is to do what you're doing but listen for history deletions so you can update your data.  In that case I think we should add onDeleteURI (from nsINavHistoryObserver) to PlacesProvider and have it broadcast onLinkChanged.  DirectoryLinksProvider already observes PlacesProvider, so it would be notified when a link is deleted.  Then you would update your data.

You and Ed should figure something out, if what I'm saying is reasonable.

::: browser/docs/DirectoryLinksProvider.rst
@@ +183,5 @@
>    Suggested Tile if the user has the site in one of the top 100 most-frecent
>    pages. Only preapproved array of strings that are hardcoded into the
>    DirectoryLinksProvider module are allowed.
> +- ``frequency_caps`` - an object consisting of daily and total frequency caps
> +  that limit a number of times a Suggested Tile can be shown in the new tab

a number -> the number

::: browser/modules/DirectoryLinksProvider.jsm
@@ +448,5 @@
>      let ping = new XMLHttpRequest();
>      ping.open("POST", pingEndPoint + (action == "view" ? "view" : "click"));
>      ping.send(JSON.stringify(data));
>  
> +    return Task.spawn(function() {

function* ()

@@ +542,5 @@
>  
>        // Allow for one link suggestion on top of the default directory links
>        this.maxNumLinks = links.length + 1;
>  
> +      // prune frequency caps off outdated urls

off -> of (I think?)

@@ +761,5 @@
> +   * Reads json file, parses its content, and returns resulting object
> +   * @param json file path
> +   * @return a promise resolved to a valid object or undefined upon error
> +   */
> +  _readJsonFile: function DirectoryLinksProvider_readJsonFile(filePath) {

Nit: You could use Task.async to make this function a little nicer.  That's basically a style choice though, so up to you.

https://developer.mozilla.org/en-US/docs/Mozilla/JavaScript_code_modules/Task.jsm#async%28%29

@@ +785,5 @@
> +   * @return a promise resolved upon load completion
> +   *         on error or non-exstent file _frequencyCaps is set to empty object
> +   */
> +  _readFrequencyCapFile: function DirectoryLinksProvider_readFrequencyCapFile() {
> +    return Task.spawn(function() {

Nit: Same nit here about Task.async.  Instead of

foo: function(arg) {
  return Task.spawn(function* () {
  });
}

you can write

foo: Task.async(function* (arg) {
}) // note you don't need bind() here

@@ +789,5 @@
> +    return Task.spawn(function() {
> +      // empty _frequencyCaps object - it will be set to file content
> +      this._frequencyCaps = {};
> +      // check _frequencyCapFilePath for existence
> +      let doesFileExists = yield OS.File.exists(this._frequencyCapFilePath);

Just read the file without checking if it exists first.  OS.File.read will reject its promise with an error in that case, which you handle.

@@ +870,5 @@
> +  _isTodayTimestamp: function DirectoryLinksProvider_isTodayTimestamp(timestamp) {
> +    let showOn = new Date(timestamp);
> +    let today = new Date();
> +    // call timestamps identical if both day and month are same
> +    return showOn.getDate() == today.getDate() && showOn.getMonth() == today.getMonth();

Please check the year too just to prevent the possibility of embarrassing bugs.  And "timestamps identical" isn't right, is it?  It's "days identical."

@@ +891,5 @@
> +      // update lastShownDate
> +      capObject.lastShownDate = Date.now();
> +    }
> +
> +    // bump both dialy and total counters

dialy -> daily

@@ +893,5 @@
> +    }
> +
> +    // bump both dialy and total counters
> +    capObject.totalViews ++;
> +    capObject.dailyViews ++;

Please remove the space before ++.

Attachment #8602238 - Flags: review?(adw)

Ed Lee :Mardak

Reporter

Comment 21

•

9 years ago

(In reply to Drew Willcoxon :adw from comment #20)
> it is possible to guess at the kinds of pages that might be in their
> history based on the ones that were suggested.
That's true, and to be clear, one could try to infer types of pages as opposed to specific pages or sites. Because the suggestion isn't necessarily tied to a specific page, I'm not sure if there's a good way to watch page deletions to trigger removing the fact that a suggestion was shown.

Would it be better if the url was not in plaintext? Similar to blocked tiles are recorded as a md5 hash?

Ed Lee :Mardak

Reporter

Comment 22

•

9 years ago

maksik, the md5 hash I'm referring to in NewTabUtils:
https://hg.mozilla.org/mozilla-central/file/617dbce26726/toolkit/modules/NewTabUtils.jsm#l61

We could probably reuse that code for some simple/fast-ish hashing of strings.

maxim zhilyaev

Assignee

Comment 23

•

9 years ago

We can md-hash tile urls, but how does it save us from attack?
I believe Drew's concern is a user who deleted a bunch of history, and then his computer got arrested.
The FBI agent gets frequency caps file from user's machine, gets json file from mozilla, run's json file suggested tiles urls through md5 hash and deciphers which tiles where shown to a user and potentially when.

My knee jerk reaction would be to encrypt the frequency cap file, but that does not help either, for if FX can decrypt the file upon restart, so can FBI agent. 

Theoretically we could store urls that triggered a show of a particular tile in frequency cap file.
Then, upon deletion we remove deleted url from the list and remove frequency cap entry when all trigger urls are deleted and we can't find another trigger url in suggestedLinksMap.  This has an issue with false positives: we are essentially removing frequency_cap info for tile completely, so next time FX reloads json file, views and clicks are set to 0, and if the user visits triggering page again, he will see the ad that he might've clicked already.

Drew Willcoxon :adw

Comment 24

•

9 years ago

Max, I agree with all of that, including your suggested fix and its drawback.  (And as we talked about, you don't have to resort to scenarios involving the FBI.  We're Mozilla, and we try to protect the user's privacy as much as we can.  (I'm not implicating you and Ed or anything.  You're good people! :-)))

Sorry to cause more trouble for you guys.

Ed Lee :Mardak

Reporter

Updated

•

9 years ago

No longer blocks: 1160372

Drew Willcoxon :adw

Comment 25

•

9 years ago

Summary of our meeting: I had two concerns, (1) `clicked` reveals which suggested pages you've visited, (2) the URLs by themselves may reveal the kinds of sites you have visited.  I'm dropping (2) because the "kinds of sites" are very broad categories like news sites.  (1) on the other hand reveals your actual history, and we'll try to fix that by observing when those URLs are deleted, and responding by deleting them from the data.

Drew Willcoxon :adw

Comment 26

•

9 years ago

Max, here's an example of how to add a visit to a page: http://mxr.mozilla.org/mozilla-central/source/browser/base/content/test/newtab/browser_newtab_bug722273.js#40  I think `place` has to include a `visits` array with at least one visit.  PlacesUtils comes from PlacesUtils.jsm.  Method definition here: http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/mozIAsyncHistory.idl#175

To remove the visit you add, you have to use this other interface: http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/History.jsm#264  You can access that from PlacesUtils.history.remove().

maxim zhilyaev

Assignee

Comment 27

•

9 years ago

Attached patch v5. fix reviewer comments, handle page deletion and clear history events (obsolete) — Details — Splinter Review

Attachment #8603174 - Flags: review?(adw)

Ed Lee :Mardak

Reporter

Updated

•

9 years ago

Attachment #8603174 - Attachment is patch: true

Attachment #8603174 - Attachment mime type: text/x-patch → text/plain

Drew Willcoxon :adw

Comment 28

•

9 years ago

Comment on attachment 8603174 [details] [diff] [review]
v5. fix reviewer comments, handle page deletion and clear history events

Review of attachment 8603174 [details] [diff] [review]:
-----------------------------------------------------------------

I was expecting PlacesProvider.onDeleteURI to call PlacesProvider._callObservers("onLinkChanged") like we talked about.  I think you're right though that clearing recent history does not trigger individual calls to onDeleteURI for each of the deleted pages, so listening for that too is the right thing to do.  But in that case I would expect you to call PlacesProvider._callObservers("onManyLinksChanged").

But this is fine, it should work, it's good enough to land.  Thanks!

Attachment #8603174 - Flags: review?(adw) → review+

maxim zhilyaev

Assignee

Comment 29

•

9 years ago

(In reply to Drew Willcoxon :adw from comment #28)

> I was expecting PlacesProvider.onDeleteURI to call
> PlacesProvider._callObservers("onLinkChanged") like we talked about. 

I took a harder look into calling onLinkChange upon page deletion or clear history, and decided against it.
Apart from removing clicked flag, nothing really may change in Directory provider upon page deletion.
So, I felt perhaps we should not bother the rest of the code recomputing suggested tiles, if all we do is removing a flag from click tracking object.  Thanks for the review

maxim zhilyaev

Assignee

Comment 30

•

9 years ago

Attached patch v6. HG changeset patch with commit message — Details — Splinter Review

Attachment #8602238 - Attachment is obsolete: true

Attachment #8603174 - Attachment is obsolete: true

Pulsebot

Comment 31

•

9 years ago

https://hg.mozilla.org/integration/fx-team/rev/ded137b69283

Ed Lee :Mardak

Reporter

Comment 32

•

9 years ago

ni? per https://wiki.mozilla.org/Firefox/Data_Collection

(In reply to Pulsebot from comment #31)
> https://hg.mozilla.org/integration/fx-team/rev/ded137b69283
There's no additional data being collected, but the server provides an extra object containing frequency caps for daily/total.

Flags: needinfo?(benjamin)

Ed Lee :Mardak

Reporter

Updated

•

9 years ago

Blocks: 1163142

Phil Ringnalda (:philor)

Comment 33

•

9 years ago

https://hg.mozilla.org/mozilla-central/rev/ded137b69283

Status: NEW → RESOLVED

Closed: 9 years ago

status-firefox40: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → Firefox 40

Benjamin Smedberg

Comment 34

•

9 years ago

f+

Flags: needinfo?(benjamin)

Ed Lee :Mardak

Reporter

Comment 35

•

9 years ago

Attached patch for beta (Mardak will land) — Details — Splinter Review

Marina Samuel [:emtwo]

Comment 36

•

9 years ago

STR:

1) Go to about:config and set browser.newtabpage.directory.src = https://people.mozilla.org/~elee/suggested.unicorn3.json

2) Visit addons.mozilla.org and several other sites until the first 8 tiles (not including suggestions) are history tiles. A suggested tile should appear.

3) Open 9 more tabs - verify that the same suggested tiles appear.

4) On the 11th tab, verify the suggested tile changes (different wording)

5) Open 9 more tabs. On the 21st tab, verify no more suggested tiles appear.

6) Change the system date to another date in the future. Open 2 more tabs. Verify there is a suggested tile. Repeat steps 3-5

Status: RESOLVED → VERIFIED

status-firefox41: --- → verified

Cornel Ionce [:noni] [Hubs QA]

Comment 37

•

9 years ago

I can confirm that the fix is now live on latest Aurora, build ID: 20150528004000, using the STR from the above comment.

I've tested on Windows 7 64-bit, Ubuntu 14.04 32-bit and Mac OS X 10.9.5.

status-firefox40: fixed → verified

v1. new frequency capping 9 years ago maxim zhilyaev 25.08 KB, patch		Details \| Diff \| Splinter Review
v2. frequency caps - updated documentation 9 years ago maxim zhilyaev 26.39 KB, patch		Details \| Diff \| Splinter Review
v3. minor documentation tweak 9 years ago maxim zhilyaev 27.13 KB, patch		Details \| Diff \| Splinter Review
v4. yet another minor doc change 9 years ago maxim zhilyaev 27.14 KB, patch		Details \| Diff \| Splinter Review
v5. fix reviewer comments, handle page deletion and clear history events 9 years ago maxim zhilyaev 34.38 KB, patch	adw : review+	Details \| Diff \| Splinter Review
v6. HG changeset patch with commit message 9 years ago maxim zhilyaev 27.08 KB, patch		Details \| Diff \| Splinter Review
for beta (Mardak will land) 9 years ago Ed Lee :Mardak 34.41 KB, patch		Details \| Diff \| Splinter Review