Closed Bug 1329157 Opened 3 years ago Closed 3 years ago

Custom tabs: Collect the source names of 3rd party app that is using custom tabs

Categories

(Firefox for Android :: Metrics, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
Firefox 55
Tracking Status
firefox55 --- fixed

People

(Reporter: jcheng, Assigned: cnevinchen)

References

(Blocks 2 open bugs)

Details

Attachments

(4 files)

As Mozilla, we want to understand which apps are using custom tabs
Assignee: nobody → cnevinchen
Comment on attachment 8839807 [details]
Bug 1329157 - Safely collect caller app information.  data-r?bsmedberg

https://reviewboard.mozilla.org/r/114372/#review116542

r- for data review (although this isn't because of the code itself!) - I don't think we can collect this kind of specific data.

::: mobile/android/base/java/org/mozilla/gecko/LauncherActivity.java:86
(Diff revision 1)
> +        final String url = intent.getDataString();
> +        final String host = this.getReferrer().getHost();
> +
> +        JSONObject telemetryExtra = new JSONObject();
> +        telemetryExtra.put("url",url);
> +        telemetryExtra.put("host",host);

Okay, I definitely don't think that we can collect a specific url based on the Mozilla/Firefox privacy terms. Regarding host/app collection, I'm also hesitant on that, but depending on how we plan to use that, maybe we could find an acceptable alternative. Would you manually read through all the apps collected? If there is a specific set of apps that we are interested in (e.g. top 100 apps, or may urls from the top 100 Alexa websites) we might be able to do that, for planning to integrate better with those apps.

I know that's kind of an annoying ask and I'm sorry because I know it's a lot more work than just collecting url and host. I'll also flag bsmedberg for some clarification or ideas on how to collect alternative forms of data.
Attachment #8839807 - Flags: review?(liuche) → review-
Hi Benjamin, I'm trying to think of some alternative ways to collect data for custom tabs, which is an Android API that allows other apps to use Firefox when opening web content links inside the app (e.g. reading an article from the Facebook app).

Basically, it'd be useful to know 1) what app is opening Firefox, and 2) what url it wants to open. I think we can't just collect any app-name and any url, but would it be allowed if it were, say, one of the top 100 websites (ranked by Alexa) or top 100 apps in the store? Or do you have any other alternative suggestions?
Flags: needinfo?(benjamin)
1) what app is opening Firefox; 

this is the data I'd really love to know so we can work towards integrating better with top apps that opens Firefox. if we need to set a limit to this like Chenxia suggested, it should already give us enough information

2) what url it wants to open; 

i'm not so sure about this one as we may cross the line too much? from a product perspective, i think it's fine we don't collect this initially if this is against our policy.
Comment on attachment 8839807 [details]
Bug 1329157 - Safely collect caller app information.  data-r?bsmedberg

I'll keep the patch as is and cancel the review flag. If the final decision is to enable the tracking, I'll request it again.
Attachment #8839807 - Flags: review?(liuche)
I'm a *little* worried about collecting what app is opening Firefox, although I see the value it can provide. For mainstream/common applications it's not a big deal, but for uncommon apps or apps which are somehow sensitive that could be a big deal. Are there ways to mitigate the risk, such as:

* not attaching a client ID to this information
* use a whitelist of apps we care about measuring and don't present a big privacy risk

At the present time there is no way to record anything about the URLs it wants to open. We explicitly promise not to collect that as part of telemetry.
Flags: needinfo?(benjamin)
is it possible we do this only?
* not attaching a client ID to this information

we don't quite know what apps we care about the most for sure before we have initial data (though we can take a guess from other data sources but sometimes we never know what we will get from actual data)
* use a whitelist of apps we care about measuring and don't present a big privacy risk
Hi Georg, 
Sorry that I don't quite understand the code here.
http://searchfox.org/mozilla-central/rev/90d1cbb4fd3dc249cdc11fe5c3e0394d22d9c680/toolkit/components/telemetry/TelemetryController.jsm#964

Could you please shed us some lights if "not attaching a client ID for certain ping" possible?
Thanks!
Flags: needinfo?(gfritzsche)
Yes, you can use .submitExternalPing() with the option addClientId:false (which is the default).
http://gecko.readthedocs.io/en/latest/toolkit/components/telemetry/telemetry/collection/custom-pings.html

Questions:

(1) Do you want to submit this for all users or only those with Telemetry opt-in?
For opt-in users (a small percentage), you could probably just use a custom ping (shouldn't matter much next to what we already collect).
For the whole user-base you'd have to look for what the right option is for bandwidth, battery, ... usage.
There is a Java ping generator & uploader that mcomella wrote; that might be the better option.

(2) Do you need to have numbers per user?
Without attaching a client-id you only get overall usage numbers, which are biased towards heavy users.
You can't say how many users used a feature or app.

(3) How often are you planning to send this information?
With the current size of the mobile user-base it will probably be ok cost-wise if you keep the ping size down.
However, you should run the numbers and think about what user & device impact is acceptable here.

(4) What other information do you need to correlate this to?
If you use a custom ping, this dictates what else you need to put in it.
Flags: needinfo?(gfritzsche)
Hi Joe
Please help look at comment 11 and help answer above questions.

About the ping size/frequency, we should definetely take that in consideration. 
Maybe we only need to collect the urls for certain releases (ex. 53~55).
Flags: needinfo?(jcheng)
Talked to Joe. I'll help answer below questions:

(In reply to Georg Fritzsche [:gfritzsche] from comment #11)
> (1) Do you want to submit this for all users or only those with Telemetry opt-in?
We'll only collect the URLs for opt-in users.

> (2) Do you need to have numbers per user?
bug 1329155 is for that purpose.

> (3) How often are you planning to send this information?
Everytime the user use a custom tab. Since only opt-in users will send this ping, the size should be fine.
 
> (4) What other information do you need to correlate this to?
That's all for now.


Questions:
for question(1) ,What should we do for "only collect the URLs for opt-in users"
Should we
1. Using custom ping(with the option addClientId:false)? But I can't find how to set opt-in for custom ping.
2. Using histogram? But what kind of histogram can I use? categorical? The caller app could be any string like org.mozilla.fennec or com.example.abc . I can't find a suitable type..
3. Using UITelemetry with event and add url in it? Since we are using https as endpoint, PI maynot be leaked easily...But I still don't know how to set opt-in for this.
Flags: needinfo?(jcheng)
Flags: needinfo?(gfritzsche)
Flags: needinfo?(benjamin)
I'm hoping we can collect and have counts on all custom tab sessions so we can understand how many of the total Firefox open sessions are contributed by custom tabs. This should be no different from what we collect regarding firefox launches thru intents. This happens regardless of whether opt-in or not.

However, if user chooses to opt-in, then we collect additional information on which app is using custom tabs so we can understand what apps are mostly used to launch firefox through custom tabs so we can improve the experience with those popular apps.
We cannot collect URLs according to our privacy notice. The answer to that is a simple "no".

I cannot help you decide *how* to collect this data, but I believe that Frank Bertsch is the data engineer who advises and assists the android team, so you should reach out to him. A custom ping would require more analysis support but have greater privacy. But I'm not sure we've tested that on Android before. If you wanted to submit it only for the opt-in (prerelease) population, you'd have to write that check yourself. A keyed histogram could possibly work.

If this is in any of UITelemetry/histograms, then it is associated with the client ID and so I'd want you to implement the whitelist or other risk mitigation techniques.
Flags: needinfo?(benjamin)
Hi Benjamin.
What if we only collect the application ID like "com.facebook.katana" or "org.mozilla.webmaker" , is that ok? We just want to know which app invokes Fennec... is that also "no"?
btw, Thanks for the advice!
Flags: needinfo?(benjamin)
I believe comment 8 still applies when talking about collecting which app opened Firefox.
Flags: needinfo?(benjamin)
Going back to comment 8;
// For mainstream/common applications it's not a big deal, but for uncommon apps or apps which are somehow sensitive that could be a big deal. 

Can you elaborate a bit more on the reasoning behind uncommon apps being sensitive if we collect its name? Not exactly sure if I get the sensitive part.


// Are there ways to mitigate the risk, such as:
* not attaching a client ID to this information

For those who opt-in to share the app names, not collecting the client ID should still give us some interesting insights initially.
Component: General → Metrics
Comment on attachment 8839807 [details]
Bug 1329157 - Safely collect caller app information.  data-r?bsmedberg

https://reviewboard.mozilla.org/r/114372/#review123408

::: mobile/android/base/java/org/mozilla/gecko/LauncherActivity.java:81
(Diff revision 7)
>          Intent intent = new Intent(getIntent());
>          intent.setClassName(getApplicationContext(), CustomTabsActivity.class.getName());

I'd prefer if you add whatever info you need into the intent for CustomTabsActivity and then handle sending telemetry etc. there

::: mobile/android/base/java/org/mozilla/gecko/LauncherActivity.java:84
(Diff revision 7)
>  
>      private void dispatchCustomTabsIntent() {
>          Intent intent = new Intent(getIntent());
>          intent.setClassName(getApplicationContext(), CustomTabsActivity.class.getName());
> +        final String host = getReferrerHost();
> +        if (host !=null ) {

We ignore "null" here, but I wonder whether we want to still add something, so that we see "Custom tab, but we do not know where it came from".

::: mobile/android/base/java/org/mozilla/gecko/LauncherActivity.java:97
(Diff revision 7)
>  
>          startActivity(intent);
>      }
>  
> +    private String getReferrerHost() {
> +        if (Build.VERSION.SDK_INT >= 22) {

nit: Please use AppConstants

::: mobile/android/base/java/org/mozilla/gecko/LauncherActivity.java:97
(Diff revision 7)
> +        if (Build.VERSION.SDK_INT >= 22) {
> +            return this.getReferrer().getHost();
> +        }
> +        Intent intent = this.getIntent();
> +        Uri referrer = intent.getParcelableExtra("android.intent.extra.REFERRER");
> +        if (referrer != null) {
> +            return referrer.getHost();
> +        }
> +        String referrerName = intent.getStringExtra("android.intent.extra.REFERRER_NAME");
> +        if (referrerName != null) {
> +            return Uri.parse(referrerName).getHost();
> +        }
> +        return null;

EXTRA_REFERRER_NAME is API 22 and EXTRA_REFERRER is API 17. Our minSdkLevel is 15. I guess as long as custom tabs doesn't add the package name we can't be sure to get one? Is our plan to ignore older Android versions?

::: mobile/android/base/java/org/mozilla/gecko/LauncherActivity.java:98
(Diff revision 7)
>          startActivity(intent);
>      }
>  
> +    private String getReferrerHost() {
> +        if (Build.VERSION.SDK_INT >= 22) {
> +            return this.getReferrer().getHost();

This method seems to read REFERER and REFERRER_NAME too. So guess there is no need to call it if we have code to read those anyways.

::: mobile/android/base/java/org/mozilla/gecko/LauncherActivity.java:100
(Diff revision 7)
> +        Intent intent = this.getIntent();
> +        Uri referrer = intent.getParcelableExtra("android.intent.extra.REFERRER");

nit: final

::: mobile/android/chrome/content/browser.js:415
(Diff revision 7)
>        "Session:Forward",
>        "Session:GetHistory",
>        "Session:Navigate",
>        "Session:Reload",
>        "Session:Stop",
> +      "Telemetry:custom-tab",

Let's use CamelCase like the other events do. Maybe something like

Telemetry:CustomTabsPing

::: mobile/android/chrome/content/browser.js:1779
(Diff revision 7)
> +      case "Telemetry:custom-tab": {
> +        TelemetryController.submitExternalPing("anonymous", { client: data.client });
> +        break;
> +      }

I wonder why we decided to name this "anonymous" ping. This isn't descriptive at all?

::: toolkit/components/telemetry/docs/data/anonymous-ping.rst:1
(Diff revision 7)
> +
> +"anonymous" ping
> +============
> +

I'll add frank to review this part and whether it makes sense from the server side.

As said before "anonymous" doesn't make sense to me if this one is only about the app opening a custom tab.
Attachment #8839807 - Flags: review?(s.kaspari)
Comment on attachment 8839807 [details]
Bug 1329157 - Safely collect caller app information.  data-r?bsmedberg

Frank: See the documentation in this patch. This is about adding a new ping (without clientId because of sensitive information) to track which third-party apps use this specific Firefox feature.
Attachment #8839807 - Flags: feedback?(fbertsch)
Comment on attachment 8839807 [details]
Bug 1329157 - Safely collect caller app information.  data-r?bsmedberg

https://reviewboard.mozilla.org/r/114372/#review124374

Code-wise this looks good to me. But please ask Frank for feedback so that we are sure that we can use this data in this form.

I also wonder if we can limit sending this data to a couple of releases. Do we need to send this forever?

::: mobile/android/base/java/org/mozilla/gecko/customtabs/CustomTabsActivity.java:112
(Diff revision 8)
> +    private void recordCustomTabUsage(final String host) {
> +        final GeckoBundle data = new GeckoBundle(1);
> +        if (host != null) {
> +            data.putString("client", host);
> +        } else {
> +            data.putString("client", "unkown");

typo: unkown -> unknown

::: mobile/android/base/java/org/mozilla/gecko/customtabs/CustomTabsActivity.java:316
(Diff revision 8)
> +        }
> +        String referrerName = intent.getStringExtra("android.intent.extra.REFERRER_NAME");
> +        if (referrerName != null) {
> +            return Uri.parse(referrerName).getHost();
> +        }
> +        return null;

I just looked at some custom tabs intents and saw that at least some of them have APPLICATION_ID set. Maybe that's another value we could use here?
https://developer.android.com/reference/android/provider/Browser.html#EXTRA_APPLICATION_ID

::: mobile/android/chrome/content/browser.js:1780
(Diff revision 8)
>          this.scrollToFocusedInput(browser, false);
>          break;
>        }
>  
> +      case "Telemetry:CustomTabsPing": {
> +        TelemetryController.submitExternalPing("anonymous", { client: data.client });

Let's explicitly set addClientId=false here.

::: toolkit/components/telemetry/docs/data/anonymous-ping.rst:42
(Diff revision 8)
> +~~~~~~
> +It could be ``com.example.app``, which is the identifier of the app.
> +
> +Version history
> +---------------
> +* v1: initial version - Will be shipped in `Fennec 54 <https://bugzilla.mozilla.org/show_bug.cgi?id=1329157>`_.

54 -> Are we going to uplift this (Nightly is 55 now).
Attachment #8839807 - Flags: review?(s.kaspari)
Comment on attachment 8839807 [details]
Bug 1329157 - Safely collect caller app information.  data-r?bsmedberg

https://reviewboard.mozilla.org/r/114372/#review124376
Attachment #8839807 - Flags: review+
Comment on attachment 8839807 [details]
Bug 1329157 - Safely collect caller app information.  data-r?bsmedberg

See comment 24.
Attachment #8839807 - Flags: feedback?(fbertsch)
At this point the data will only be available in ATMO through the Dataset API.

I would like there to be a schema available for this new ping type. Nechen, can you put a json schema in at [0]? This ping is very simple so it should be straightforward.

How are you wanting to analyze this? If you want to use STMO, we need direct-to-parquet or an ETL job. Either way that needs to be another bug. If so please create one in Metrics: Pipeline and cc me and I'll take care of it.

[0] https://github.com/mozilla-services/mozilla-pipeline-schemas
Flags: needinfo?(cnevinchen)
I want to analyze this in STMO.
PR for schema is here [0]

And bug 1349431 is created for the etl job.

[0] https://github.com/mozilla-services/mozilla-pipeline-schemas/pull/32
Flags: needinfo?(cnevinchen)
Comment on attachment 8839807 [details]
Bug 1329157 - Safely collect caller app information.  data-r?bsmedberg

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/114372/diff/8-9/
Attachment #8839807 - Flags: feedback?(fbertsch) → review?(fbertsch)
Hi Frank.
Please see my answer in comment 30 and let me know what I should do. Thank you!
Flags: needinfo?(fbertsch)
Hey Nevin, sorry for the delay -- that looks good to me. You need to update the JSON schema, as that's the schema I'm going to use for direct-to-parquet and bug 1349431 (which happens to be a palindrome!). I left you a review in mozilla-serices/mozilla-pipeline-schemas.
Flags: needinfo?(fbertsch)
Attached file core.json
Attached file anonymous.json
Hi Frank.
Thank you for your comments in my PR : https://github.com/mozilla-services/mozilla-pipeline-schemas/pull/32

But the anonymous ping sent out from my code here is different from the core ping(I've provided a sample in Attachment 8851494 [details]). There's more attributes like "application", "version", "creationDate","type","id" in my ping.

Maybe it's because core ping is sent to : https://incoming.telemetry.mozilla.org/submit/telemetry/<docID>/core/Fennec/55.0a1/default/20170322154215 using Java directly , 

and this bug I use JS API :
TelemetryController.submitExternalPing("anonymous", { client: data.client }, { addClientId: false });

Maybe I should add other options in JS API?
Hi Georg, could you please advicse?
Flags: needinfo?(fbertsch)
(In reply to Nevin Chen [:nechen] from comment #35)
> But the anonymous ping sent out from my code here is different from the core
> ping(I've provided a sample in Attachment 8851494 [details]). There's more
> attributes like "application", "version", "creationDate","type","id" in my
> ping.
>
> Maybe it's because core ping is sent to :
> https://incoming.telemetry.mozilla.org/submit/telemetry/<docID>/core/Fennec/
> 55.0a1/default/20170322154215 using Java directly , 

Drive-by comment since Georg is on PTO :)

That's exactly the reason. The core ping is sent through a different API [1] and has a different format compared to the "Common Ping Format" [2].
 
> and this bug I use JS API :
> TelemetryController.submitExternalPing("anonymous", { client: data.client },
> { addClientId: false });
> 
> Maybe I should add other options in JS API?
> Hi Georg, could you please advicse?

If you send a ping using |TelemetryController.submitExternalPing| [3], Telemetry will generate a ping that conforms to the "Common Ping Format [2]. That's the reason why you're seeing the additional fields that you didn't expect.

[1] - http://searchfox.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/telemetry/pingbuilders/TelemetryCorePingBuilder.java
[2] - https://gecko.readthedocs.io/en/latest/toolkit/components/telemetry/telemetry/data/common-ping.html
[3] - http://searchfox.org/mozilla-central/rev/7419b368156a6efa24777b21b0e5706be89a9c2f/toolkit/components/telemetry/TelemetryController.jsm#206
Hi Nevin,

In light of that information, can we update the documentation to include all of the fields that the ping will contain?
Flags: needinfo?(fbertsch)
Comment on attachment 8839807 [details]
Bug 1329157 - Safely collect caller app information.  data-r?bsmedberg

Document and bug 1349431 are both updated. Thank you!
Hi Frank. Please feel free to let me know what's missing. Thank you!
Flags: needinfo?(fbertsch)
Sorry Nevin, that looks good. I would like to clean up the commit to mozilla-pipeline-schemas, check out my comment there. I'll be using that schema for bug 1349431 direct-to-parquet.
Flags: needinfo?(fbertsch)
Comment on attachment 8839807 [details]
Bug 1329157 - Safely collect caller app information.  data-r?bsmedberg

https://reviewboard.mozilla.org/r/114372/#review128694
Attachment #8839807 - Flags: review?(fbertsch) → review+
We're sorry, Autoland could not rebase your commits for you automatically. Please manually rebase your commits and try again.

hg error in cmd: hg rebase -s 800c05aa8428 -d 6c828cc61fa1: rebasing 386779:800c05aa8428 "Bug 1329157 - Safely collect caller app information. r=frank,sebastian data-r?bsmedberg" (tip)
merging mobile/android/base/java/org/mozilla/gecko/customtabs/CustomTabsActivity.java
merging mobile/android/chrome/content/browser.js
warning: conflicts while merging mobile/android/base/java/org/mozilla/gecko/customtabs/CustomTabsActivity.java! (edit, then use 'hg resolve --mark')
unresolved conflicts (see hg resolve, then hg rebase --continue)
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/77ff01bb6e2a
Safely collect caller app information. r=frank,sebastian data-r?bsmedberg
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/77ff01bb6e2a
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 55
Flags: needinfo?(gfritzsche)
You need to log in before you can comment on or make changes to this bug.