Closed Bug 1271313 Opened 9 years ago Closed 9 years ago

Measure the number of total URIs and unique domains visited in a "session fragment"

Categories

(Toolkit :: Telemetry, defect, P1)

defect
Points:
2

Tracking

()

RESOLVED FIXED
mozilla50
Tracking Status
firefox49 --- affected
firefox50 --- fixed

People

(Reporter: Dexter, Assigned: Dexter)

References

Details

(Whiteboard: [measurement:client])

Attachments

(1 file, 5 obsolete files)

This is about implementing the count of the number of total URIs visited in a session fragment. URI fragments and reloads are equally counted. (E.g. www.example.com/#part1 is distinct from www.example.com AND www.example.com/#part2) We'll want this to utilize the places code for this, we already have some code like PLACES_PAGES_COUNT for guidance. Context: https://docs.google.com/spreadsheets/d/1G33YEBL2-hSaF0-HrHPi0HVwANoTpYHsnevmMd1eZ1U/edit#gid=0
Rebecca, does "AJAX changes should not reflect a separate count." (found in the context spreadsheet) mean that AJAX requests should contribute to this count? Or should they be excluded?
Blocks: 1252625
Flags: needinfo?(rweiss)
Priority: -- → P3
Whiteboard: [measurement:client]
We want them to be excluded from this count. Basically, the idea is to capture the user intent of navigating around the web, so if the click links for www.example.com www.example.com/#part1 www.example.com/#part2 they have taken three distinct actions that we want to count, and likewise if they click/hotkey to reload www.example.com, that is a distinct action. However, if www.example.com requests a bunch of resources in the background via AJAX or otherwise, that is not an action triggered by the user, and likewise if the page is reloaded automatically by a script, so we do not wish to count those in this probe. (If that becomes necessary we'd file a separate bug I guess.) I'm sure there are other edge cases we have not thought of (or that we don't even have the technical sophistication to know about), so we appreciate you help in finding them and coming up with sensible ways to handle them. We are also aware that it may be impossible to capture this notion of "user intent" perfectly, and we're comfortable having this measure as a rough proxy that comes close but is not perfect (even if it's rough, it'll be better than what we have now: nothing!).
Agree with Brendan in Comment #2 above.
Flags: needinfo?(rweiss)
Attached patch bug1271313.patch (obsolete) — Splinter Review
This patch also contain the domain count, but I have a question about that: should we consider mail.foo.com and bar.foo.com two different domains? Other examples: www.test.com vs test.com
Status: NEW → UNCONFIRMED
Ever confirmed: false
Priority: P3 → P1
Status: UNCONFIRMED → NEW
Ever confirmed: true
Points: --- → 1
Points: 1 → 2
Hi Alessio, I talked to rweiss; we think that it's better to count all subdomains as one domain-- e.g. all visits to anything under *.test.com can just be counted as a visit to test.com. FTR, the thinking here is that we want to know how broadly a user browses the internet, and for this purpose, if users spend a lot of their time in one ecosystem controlled by one company/organization, it's fine to group the pages controlled by that entity. (Also ps: I don't think this is exactly a dupe of 1271310 at least not conceptually, but I guess you are consolidating them b/c the code is the same?) Thanks!
(In reply to brendan c from comment #7) > Hi Alessio, I talked to rweiss; we think that it's better to count all > subdomains as one domain-- e.g. all visits to anything under *.test.com can > just be counted as a visit to test.com. > > FTR, the thinking here is that we want to know how broadly a user browses > the internet, and for this purpose, if users spend a lot of their time in > one ecosystem controlled by one company/organization, it's fine to group the > pages controlled by that entity. That makes sense, thanks! > (Also ps: I don't think this is exactly a dupe of 1271310 at least not > conceptually, but I guess you are consolidating them b/c the code is the > same?) Yeah, I'm consolidating them because of the code.
Assignee: nobody → alessio.placitelli
Summary: Measure the number of total URIs visited in a "session fragment" → Measure the number of total URIs and unique domains visited in a "session fragment"
(In reply to :Gijs Kruitbosch from bug 1271304 comment #21) > @@ +86,5 @@ > > + } > > + > > + // Don't count "about:*" pages. > > + if (/^about:\S+/.test(uriSpec)) { > > + this._log.trace("onLocationChange - Skipping about:* URIs."); > > What about file URIs? resource? chrome? data: ? jar:file ? feed: ? moz-icon > ? javascript: ? > > Can we just only allow http(s) and bail for everything else? That seems > saner if we're looking at domains anyway. @Brendan, @rweiss: should we only stick to counting http/https URIs? That basically means excluding local files, ftp, etc. from the total URI count. Consider that, for example, local files do not have a "domain": so a file URI would add +1 to the URI count and nothing to the domain count.
Flags: needinfo?(rweiss)
Flags: needinfo?(bcolloran)
For keeping things simple, i'd stick with http/https here (which is presumably what we are interested in). We could later keep track of the no. of user navigations per protocol type (i.e. keyed count) if we wanted to know which protocols are used (and to what degree).
To make this precise: "anything under *.test.com" means "aggregating by eTLD+1".
Yeah, I concur with Georg, we can stick to http+https.
Flags: needinfo?(rweiss)
Flags: needinfo?(bcolloran)
Attached patch bug1271313.patch (obsolete) — Splinter Review
Attachment #8768671 - Attachment is obsolete: true
Attached patch bug1271313.patch (obsolete) — Splinter Review
Gijs, this are the URI/Domain probes that were part of bug 1271304. browser.engagement.total_uri_count The count of the total URIs visited in a session fragment. Don’t count: - Error pages. - Background page requests (e.g. AJAX). - Everything that’s not HTTP, HTTPs. - URIs from embedded pages (i.e. iframes). - URI in private browsing. Count: - Links to external pages. - Page fragments. - User initiated browsing. The count resets to 0 on subsession splits. browser.engagement.unique_domains_count The count of the unique domains visited in a session fragment, capped to 100. Don’t count: - Error pages. - Background page requests. - Everything that’s not HTTP, HTTPs. - Domains from embedded pages (i.e. iframes). - The same domain twice. - URI in private browsing. Count: - Everything else. - The t.example.com and s.example.com must be counted just once (aggregate by eTLD + 1). The count resets to 0 on subsession splits.
Attachment #8771375 - Attachment is obsolete: true
Attachment #8771434 - Flags: review?(gijskruitbosch+bugs)
Rebecca, who should be owning these probes and when should they expire? I set them as I did for the tab/window ones in bug 1271304 (i.e. you own them and they expire in Firefox 55). If that's correct, I'll flag Benjamin for the data-review.
Flags: needinfo?(rweiss)
Comment on attachment 8771434 [details] [diff] [review] bug1271313.patch Review of attachment 8771434 [details] [diff] [review]: ----------------------------------------------------------------- ::: browser/modules/BrowserUsageTelemetry.jsm @@ +84,5 @@ > + } catch (e) { > + return; > + } > + > + Services.telemetry.scalarSet(UNIQUE_DOMAINS_COUNT_SCALAR_NAME, this._domainSet.size); Shouldn't this use setmaximum or whatever that is? ::: browser/modules/test/browser_UsageTelemetry.js @@ +203,5 @@ > + yield ContentTask.spawn(newWin.gBrowser.selectedBrowser, XHR_URL, function(url) { > + return new Promise(r => { > + var xhr = new content.window.XMLHttpRequest(); > + xhr.open("GET", url); > + xhr.onload = () => r(); nit: we're not penalizing you on bytes used, please just call the param "resolve" :-) @@ +230,5 @@ > + // embedded iframes). > + yield ContentTask.spawn(newWin.gBrowser.selectedBrowser, null, function* () { > + let doc = content.document; > + let iframe = doc.createElement("iframe"); > + let promise = ContentTaskUtils.waitForEvent(iframe, "load", false); Nit: make the naming of the local var here clearer please. @@ +231,5 @@ > + yield ContentTask.spawn(newWin.gBrowser.selectedBrowser, null, function* () { > + let doc = content.document; > + let iframe = doc.createElement("iframe"); > + let promise = ContentTaskUtils.waitForEvent(iframe, "load", false); > + iframe.setAttribute("src", "https://example.org/test"); Nit: iframe.src =
Attachment #8771434 - Flags: review?(gijskruitbosch+bugs) → review+
Attached patch bug1271313.patch (obsolete) — Splinter Review
Attachment #8771434 - Attachment is obsolete: true
Attachment #8771958 - Flags: review+
(In reply to :Gijs Kruitbosch from comment #16) > Comment on attachment 8771434 [details] [diff] [review] > bug1271313.patch > > Review of attachment 8771434 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: browser/modules/BrowserUsageTelemetry.jsm > @@ +84,5 @@ > > + } catch (e) { > > + return; > > + } > > + > > + Services.telemetry.scalarSet(UNIQUE_DOMAINS_COUNT_SCALAR_NAME, this._domainSet.size); > > Shouldn't this use setmaximum or whatever that is? This should be identical in this case, as |this._domainSet.size| will never be less than the value stored within the scalar.
:Dexter, yes that is correct. The only distinction is that instead of Tab Center, this will be needed for Universal Search and Activity Stream KPIs (have these feature increased or decreased general web usage as measured via site usage).
Flags: needinfo?(rweiss)
Comment on attachment 8771958 [details] [diff] [review] bug1271313.patch Flagging Rebecca for the data-review, since she's a data-review steward.
Attachment #8771958 - Flags: review+ → review?(rweiss)
The JP failure is an interesting one. It fails on [1] due to |weakWindows[i].get()| throwing at [2]. > [5407] WARNING: No inner window available!: file /home/dexter/mozilla-central/dom/base/nsGlobalWindow.cpp, line 9694 > --DOCSHELL 0x7fe2825ca000 == 9 [pid = 5407] [id = 7] > --DOMWINDOW == 21 (0x7fe29fc7d000) [pid = 5407] [serial = 2] [outer = (nil)] [url = about:blank] > --DOMWINDOW == 20 (0x7fe28c6d6400) [pid = 5407] [serial = 9] [outer = (nil)] [url = about:blank] > --DOCSHELL 0x7fe2854ef000 == 8 [pid = 5407] [id = 8] > --DOCSHELL 0x7fe28553e000 == 7 [pid = 5407] [id = 10] > --DOCSHELL 0x7fe2854f0000 == 6 [pid = 5407] [id = 9] > --DOMWINDOW == 19 (0x7fe28c0d7400) [pid = 5407] [serial = 12] [outer = (nil)] [url = about:blank] > --DOMWINDOW == 18 (0x7fe2854ef800) [pid = 5407] [serial = 19] [outer = (nil)] [url = ] > --DOMWINDOW == 17 (0x7fe287c62000) [pid = 5407] [serial = 20] [outer = (nil)] [url = ] > [5407] WARNING: NS_ENSURE_TRUE(currentInner) failed: file /home/dexter/mozilla-central/dom/base/nsGlobalWindow.cpp, line 8937 > --DOMWINDOW == 16 (0x7fe28962ec00) [pid = 5407] [serial = 21] [outer = (nil)] [url = about:blank] > --DOMWINDOW == 15 (0x7fe28977f800) [pid = 5407] [serial = 22] [outer = (nil)] [url = about:blank] > --DOMWINDOW == 14 (0x7fe288e11800) [pid = 5407] [serial = 15] [outer = (nil)] [url = about:blank] > --DOMWINDOW == 13 (0x7fe2825ca800) [pid = 5407] [serial = 17] [outer = (nil)] [url = chrome://browser/content/browser.xul] > TEST-PASS | jetpack-package/addon-sdk/source/test/leak/test-leak-event-dom-closed-window.js.test sdk/event/dom does not leak when attached to closed > window | should see at least one new window > TEST-UNEXPECTED-FAIL | jetpack-package/addon-sdk/source/test/leak/test-leak-event-dom-closed-window.js.test sdk/event/dom does not leak when attached to closed window | Component returned failure code: 0x8057000a (NS_ERROR_XPC_BAD_CONVERT_NATIVE) [xpcIJSWeakReference.get] > TEST-INFO | Traceback (most recent call last): > File "resource://gre/modules/Promise-backend.js", line 816, in this.PromiseWalker.walkerLoop > this.handlers.shift().process(); > File "resource://gre/modules/Promise-backend.js", line 940, in Handler.prototype.process > nextValue = this.onReject.call(undefined, nextValue); > File "resource://gre/modules/commonjs/sdk/test/assert.js", line 88, in fail > this._log.fail(message); > File "resource://gre/modules/commonjs/sdk/deprecated/unit-test.js", line 104, in fail > this.console.testMessage(false, false, this.test.name, message); > File "resource://gre/modules/commonjs/sdk/test/harness.js", line 541, in testMessage > this.trace(); > TEST-PASS | jetpack-package/addon-sdk/source/test/leak/test-leak-event-dom-closed-window.js.test sdk/event/dom does not leak when attached to closed window | This test is done. This seems to happen because of the |win.gBrowser.removeTabsProgressListener(URICountListener);| call added by this patch in BrowserUsageTelemetry.jsm. Or at least, commenting out that line makes the test pass. Surprisingly enough, that happens whether |addTabsProgressListener| is called or not. Commenting the |removeTabsProgressListener| and adding a dump like dump("\n**** DEBUG Is Referencing win.gBrowser a problem?" + !!win.gBrowser+ "\n"); still triggers the failure. Any other reference to just |win| or |win.document| isn't triggering the failure: it seems that only referencing gBrowser causes the issue. One other interesting bit is that NS_ENSURE_TRUE(currentInner) failed: ... from [3] seems to only be present in the logs when there's a failure. So it might be related. Any clue about what's going on there? Could this be related to bug 1288006? [1] - https://hg.mozilla.org/mozilla-central/diff/865803c7d5f8/addon-sdk/source/test/leak/test-leak-event-dom-closed-window.js [2] - https://hg.mozilla.org/mozilla-central/diff/ebc0f45e1ea5/addon-sdk/source/test/leak/leak-utils.js#l70 [3] - https://dxr.mozilla.org/mozilla-central/rev/4c05938a64a7fde3ac2d7f4493aee1c5f2ad8a0a/dom/base/nsGlobalWindow.cpp#8937
:bkelly, I saw you added this test for bug 1267693. Do you have any clue/suggestion about how to debug the issue I highlighted in comment 22?
Flags: needinfo?(bkelly)
Bug 1287235 just landed. Does that help with this error?
Flags: needinfo?(bkelly)
Actually, that bug is unrelated here. This is essentially throwing from this code: let weakWindows = []; function windowObserver(subject, topic) { let supportsWeak = subject.QueryInterface(Ci.nsISupportsWeakReference); if (supportsWeak) { weakWindows.push(Cu.getWeakReference(supportsWeak)); } } Services.obs.addObserver(windowObserver, "domwindowopened", false); /* ... */ // Check to see if any of the windows we saw survived the GC. We consider // these leaks. assert.ok(weakWindows.length > 0, "should see at least one new window"); for (let i = 0; i < weakWindows.length; ++i) { assert.equal(weakWindows[i].get(), null, "window " + i + " should be GC'd"); } I don't understand why a weak reference we just created would then throw on .get(). Boris, do you understand whats going on here? It seems deep down in the xpconnect stuff which I don't understand.
Flags: needinfo?(bzbarsky)
I'm also debugging this locally.
(In reply to Ben Kelly [:bkelly] from comment #26) > I'm also debugging this locally. Thanks for helping with that (and as you anticipated, unfortunately bug 1287235 isn't helping).
Is it throwing from the nsContentUtils::WrapNative call or the JS_WrapObject call?
Flags: needinfo?(bzbarsky)
(In reply to Boris Zbarsky [:bz] from comment #28) > Is it throwing from the nsContentUtils::WrapNative call or the JS_WrapObject > call? Its failing here: https://dxr.mozilla.org/mozilla-central/source/js/xpconnect/src/XPCConvert.cpp?#789 From this code: // First, see if this object supports the wrapper cache. // Note: If |cache->IsDOMBinding()| is true, then it means that the object // implementing it doesn't want a wrapped native as its JS Object, but // instead it provides its own proxy object. In that case, the object // to use is found as cache->GetWrapper(). If that is null, then the // object will create (and fill the cache) from its WrapObject call. nsWrapperCache* cache = aHelper.GetWrapperCache(); RootedObject flat(cx, cache ? cache->GetWrapper() : nullptr); if (!flat && cache && cache->IsDOMBinding()) { RootedObject global(cx, xpcscope->GetGlobalJSObject()); js::AssertSameCompartment(cx, global); flat = cache->WrapObject(cx, nullptr); if (!flat) return false; }
Flags: needinfo?(bzbarsky)
I think this is detecting a window leak in your patch. Its just throwing because we're trying to get a window object for a closed window.
Alessio, in your _unregisterWindow() you ignore certain targets, but you seem to add the progress listener to all windows in _registerWindow(). Should you be only adding the progress listener to the same subset of windows you check in _unregisterWindow()?
Flags: needinfo?(alessio.placitelli)
Oh, I missed that we only call _registerWindow() on certain windows.
Flags: needinfo?(alessio.placitelli)
Attached patch bug1271313_fix.patch (obsolete) — Splinter Review
So for this test we never fully create the window. We do still get the DOMWindowClosed observer topic, though. So your patch is only executing this code: win.gBrowser.removeTabsProgressListener(URICountListener); Accessing the .gBrowser getter seems to initialize some things on the window causing it to leak. I think the best solution here is not to use DOMWindowClosed observer at all. Instead, just listen for the unload event on a window when you start observing it. Thats what this patch does and it seems to avoid the leak. Alessia, does this work for you?
Flags: needinfo?(bzbarsky) → needinfo?(alessio.placitelli)
This is very similar to https://bugzilla.mozilla.org/show_bug.cgi?id=1271304 in scope and application, as described above. The major difference is that the initial use of this measurement is to determine whether we have seen an increase or decrease (or no change) in URIs/domains visited for Activity Stream and Universal Search Test Pilot participants. We should follow the same expiration standard and evaluate further usefulness of these measurements upon expiration. So as with 1271304, data-review=me with expires: 55. Note that I have not reviewed the code of the patch.
Attachment #8771958 - Flags: review?(rweiss) → review+
Attached patch bug1271313.patchSplinter Review
(In reply to Ben Kelly [:bkelly] from comment #33) > Created attachment 8772884 [details] [diff] [review] > bug1271313_fix.patch > > So for this test we never fully create the window. We do still get the > DOMWindowClosed observer topic, though. So your patch is only executing > this code: > > win.gBrowser.removeTabsProgressListener(URICountListener); > > Accessing the .gBrowser getter seems to initialize some things on the window > causing it to leak. Mh I see, that makes sense. Thanks for the pointers. > I think the best solution here is not to use DOMWindowClosed observer at > all. Instead, just listen for the unload event on a window when you start > observing it. Thats what this patch does and it seems to avoid the leak. > > Alessia, does this work for you? Yes, I think that would work. I've updated the original patch with your changes and slightly tweaked it. @Gijs, is listening to "unload" instead of the "domwindowclose" an acceptable change or do you want to review tha patch again?
Attachment #8771958 - Attachment is obsolete: true
Attachment #8772884 - Attachment is obsolete: true
Flags: needinfo?(alessio.placitelli) → needinfo?(gijskruitbosch+bugs)
Comment on attachment 8772934 [details] [diff] [review] bug1271313.patch Review of attachment 8772934 [details] [diff] [review]: ----------------------------------------------------------------- ::: browser/modules/BrowserUsageTelemetry.jsm @@ +174,5 @@ > /** > * Adds listeners to a single chrome window. > */ > _registerWindow(win) { > + win.addEventListener("unload", this); From painful experience: you need to make sure this event handler doesn't get hit by unload events in parent-process browsers (easiest to test by forcing e10s off) in this window. I *think* it doesn't bubble and you're not registering a capturing listener, so you should be OK, but you should check.
Attachment #8772934 - Flags: review+
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to :Gijs Kruitbosch from comment #36) > Comment on attachment 8772934 [details] [diff] [review] > bug1271313.patch > > Review of attachment 8772934 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: browser/modules/BrowserUsageTelemetry.jsm > @@ +174,5 @@ > > /** > > * Adds listeners to a single chrome window. > > */ > > _registerWindow(win) { > > + win.addEventListener("unload", this); > > From painful experience: you need to make sure this event handler doesn't > get hit by unload events in parent-process browsers (easiest to test by > forcing e10s off) in this window. I *think* it doesn't bubble and you're not > registering a capturing listener, so you should be OK, but you should check. Thanks for the tip: I just checked it seems to behave correctly forcing e10s off.
https://hg.mozilla.org/integration/mozilla-inbound/rev/eb74a01c8dc0bc508ea8b492e6a4c179e1c2ff19 Bug 1271313 - Measure the number of total URIs and unique domains visited in a "session fragment". r=gijs, data-review=rweiss
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
Depends on: 1506664
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: