Closed Bug 1273940 Opened 4 years ago Closed 2 years ago

Make stub attribution script available on the server

Categories

(Cloud Services :: Operations, task)

task
Not set

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ckprice, Assigned: oremj)

References

(Blocks 1 open bug, )

Details

In bug 1261140, mhowell created a script which placed an arbitrary value into the stub installer.

Bug 1259612 comment 6 details the structure the tag will take.

This bug tracks the work to move the script to a server, and make it available.
What incoming parameters should we expect? I'm assuming something like: https://stubservice/?os=win&lang=en-US&product=Firefox-stub-44.0&code=astubattributioncode. From there we would download the stub installer from https://download.mozilla.org/?os=win&lang=en-US&product=Firefox-stub-44.0, modify it with code and then return it to the user. Is that assumption correct?

Is that correct?
Flags: needinfo?(cprice)
Assignee: nobody → oremj
:cmore and I discussed this in bug 1259612 comment 10. The final structure for the `attribution_code` is:

https://stubservice/?os=win&lang=en-US&product=Firefox-stub-46.0&attribution_code=source%3Dgoogle.com%26medium%3Dorganic%26campaign%3D(not%20set)%26content%3D(not%20set)

The moz.org team will update its buttons to point users to the new service URL.
Flags: needinfo?(cprice)
A few more things we came up with:

1) We'd like the format of the attribution_code, so we can validate it before writing it into the stub installer. We'll need the name of each parameter, the separator and a valid character set for each parameter.

2) Some sort of validation that the attribution_code came from a known source, so people can't spam the service or trick users in to downloading stubs with random information written. This is slightly less critical (and more difficult) than #1.
(In reply to Jeremy Orem [:oremj] from comment #3)
> A few more things we came up with:
> 
> 1) We'd like the format of the attribution_code, so we can validate it
> before writing it into the stub installer. We'll need the name of each
> parameter, the separator and a valid character set for each parameter.

Does comment 2 provide this? Note that this string will be urlencoded before it's sent to the service.

parameters are: source, medium, campaign, content
separator: &
charset: we probably won't be doing a huge amount of validation on the moz.org side. The entire string will be encoded and sent in the `attribution_code` parameter.

> 2) Some sort of validation that the attribution_code came from a known
> source, so people can't spam the service or trick users in to downloading
> stubs with random information written. This is slightly less critical (and
> more difficult) than #1.

:cmore, can you please follow-up with the bedrock team on this (I started an email last week).
Flags: needinfo?(chrismore.bugzilla)
Can we limit this in any way to letters and numbers or maybe characters, numbers and a few approved special characters?
(In reply to Jeremy Orem [:oremj] from comment #5)
> Can we limit this in any way to letters and numbers or maybe characters,
> numbers and a few approved special characters?

For the attribution_code value?

It should just be:

letters: a-zA-Z
numbers: 0-9
special: characters: -.%()_
Flags: needinfo?(chrismore.bugzilla)
:cmore - Do you need the special characters? It doesn't look like the subset is large enough to craft code injections with it, but if we could do without, it would make the input validation code better.
Flags: needinfo?(chrismore.bugzilla)
(In reply to Julien Vehent [:ulfr] from comment #7)
> :cmore - Do you need the special characters? It doesn't look like the subset
> is large enough to craft code injections with it, but if we could do
> without, it would make the input validation code better.

This should be suffice:

a-z A-Z 0-9 - . % ( ) _ ~

If there was any special character, it would should be urlcoded before passed.

So if you try to pass <script>, it would be encoded as %3Cscript%3E. Any special characters will be % encoded.

Does that work?

We can urldecode later to turn the query parameter value back into the original values.

Even though we are passing the string urlencoded, can we store the values in certificate and Firefox decoded?
Flags: needinfo?(chrismore.bugzilla)
(In reply to Chris More [:cmore] from comment #8)
>...
> Even though we are passing the string urlencoded, can we store the values in
> certificate and Firefox decoded?
If we want the stub to send this data as part of the stub installer ping which is the current plan I would very much prefer it if the values were already encoded since the stub installer doesn't have built-in encoding and decoding. Also, handing the data off to Firefox encoded shouldn't be a problem since Firefox has the ability to decode built-in.
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #9)
> (In reply to Chris More [:cmore] from comment #8)
> >...
> > Even though we are passing the string urlencoded, can we store the values in
> > certificate and Firefox decoded?
> If we want the stub to send this data as part of the stub installer ping
> which is the current plan I would very much prefer it if the values were
> already encoded since the stub installer doesn't have built-in encoding and
> decoding. Also, handing the data off to Firefox encoded shouldn't be a
> problem since Firefox has the ability to decode built-in.

Good point and makes sense.
> For the attribution_code value?
> 
> It should just be:
> 
> letters: a-zA-Z
> numbers: 0-9
> special: characters: -.%()_

I'd also like to limit length. I believe we get 1024 bytes for the attribution code, but something smaller would limit the risk of someone creative managing to fit an exploit in the installer. [a-zA-Z0-9] on a 10 char string gives you 62^10  =839299365868340224 possibilities. Do you need more or would that be enough?

Was the attribution code meant to be an intelligible string rather than a random value?
(In reply to Julien Vehent [:ulfr] from comment #11)
> > For the attribution_code value?
> > 
> > It should just be:
> > 
> > letters: a-zA-Z
> > numbers: 0-9
> > special: characters: -.%()_
> 
> I'd also like to limit length. I believe we get 1024 bytes for the
> attribution code, but something smaller would limit the risk of someone
> creative managing to fit an exploit in the installer. [a-zA-Z0-9] on a 10
> char string gives you 62^10  =839299365868340224 possibilities. Do you need
> more or would that be enough?
> 
> Was the attribution code meant to be an intelligible string rather than a
> random value?

We made the decision a few weeks ago that it is not an encrypted code/hash and it will be plain text and that length of the string is not factor (within reason).

See conversation here and below: https://bugzilla.mozilla.org/show_bug.cgi?id=1259612#c4

I checked the data that we have in google analytics and the longest combination of source+medium+campaign+content is 108 characters. If we need a limit 200 characters should be suffice.

Thoughts?
That seems reasonable to me. If we ever need to increase the 200 char limit, it will be a simple change on the server side.
How about we do something like:


if len(attribution_code) > 200:
  fail
unquoted = urllib.unquote(attribution_code)
code_parts = urlparse.parse_qs(unquoted)
code_to_write_to_binary = urllib.urlencode(code_parts)
I'm working on this now on the bedrock side. It seems from the logic doc[0] that in some cases you'd also like the domain component of the referrer. So with all of the parameter names and encoded separation characters we're up to 62 characters just for that. So we should still be fine, but with the addition of referrer we may well reach 200 a lot more easily.

I'd also propose that we keep the data coming from bedrock as simple and minimal as possible. If the stub service needs to change I think it'd be best if we didn't have to coordinate deployments with bedrock. So I'd like to alter the structure of the attribution_code data to not include any parameters we don't gather. Currently the proposal is that we add "(not set)" or sometimes "(direct)" or "(none)"[0]. This is logic that I believe belongs in the service since it should be the central place that ensures that the data going into the installers is what you want.

Also, would you rather the HMAC signature be in a separate parameter or as part of the attribution_code param?

attribution_code=source%3Dbrandt%26medium%3Daether&signature=b13e5f394f284bbe8afc099cb60ccfb7296b3fec

or

attribution_code=source%3Dbrandt%26medium%3Daether%26signature%3Db13e5f394f284bbe8afc099cb60ccfb7296b3fec

I'd vote for the former, but it's up to you.

[0] https://docs.google.com/document/d/1DzIg19kAdtYEzS_waQNCQBfi8CSGj4cl9N8T24WSiyc/edit

PS: the hash algorithm I'm using with HMAC at present is sha1. If we'd like to use a different one just let me know.
> we may well reach 200

That was an arbitrary value. Feel free to increase it to fit your needs, while keeping it reasonable.

> PS: the hash algorithm I'm using with HMAC at present is sha1. If we'd like to use a different one just let me know.

Please use SHA-256.
(In reply to Julien Vehent [:ulfr] from comment #16)
> Please use SHA-256.

Done. Thanks.
(In reply to Paul [:pmac] McLanahan from comment #15)
> If the stub service needs to change I think it'd be
> best if we didn't have to coordinate deployments with bedrock. So I'd like
> to alter the structure of the attribution_code data to not include any
> parameters we don't gather. Currently the proposal is that we add "(not
> set)" or sometimes "(direct)" or "(none)"[0]. This is logic that I believe
> belongs in the service since it should be the central place that ensures
> that the data going into the installers is what you want.
NI :oremj and :cmore to comment on this.
Flags: needinfo?(oremj)
Flags: needinfo?(chrismore.bugzilla)
Per our meeting today (https://public.etherpad-mozilla.org/p/st2u5U0HyH)

Add an `overflow mechanism` to the service to queue unique requests, serving a vanilla version.

(In reply to Cory Price [:ckprice] from comment #18)
> (In reply to Paul [:pmac] McLanahan from comment #15)
> > If the stub service needs to change I think it'd be
> > best if we didn't have to coordinate deployments with bedrock. So I'd like
> > to alter the structure of the attribution_code data to not include any
> > parameters we don't gather. Currently the proposal is that we add "(not
> > set)" or sometimes "(direct)" or "(none)"[0]. This is logic that I believe
> > belongs in the service since it should be the central place that ensures
> > that the data going into the installers is what you want.
> NI :oremj and :cmore to comment on this.

This conversation is continuing in the document in comment 15.
Flags: needinfo?(oremj)
Flags: needinfo?(chrismore.bugzilla)
(In reply to Paul [:pmac] McLanahan from comment #15)
> So we should still be fine, but with the
> addition of referrer we may well reach 200 a lot more easily.
Chiming in here. The module code Matt is writing in Firefox is also using a 200-character limit. Let's stick to it.
Blocks: 1279291
:oremj - can you give us the final URL (or what the final URL will be) for the stub service?
Flags: needinfo?(oremj)
Does stubdownloader.prod.mozaws.net work? Do we want to use a mozilla.org address instead?

Do we have any stub service compatible stub binaries in bouncer yet, so we can do a full test?
Flags: needinfo?(oremj) → needinfo?(cprice)
(In reply to Jeremy Orem [:oremj] from comment #22)
> Does stubdownloader.prod.mozaws.net work?
Yep.

> Do we have any stub service compatible stub binaries in bouncer yet, so we
> can do a full test?
I am not sure. NI :mhowell to see if he can help answer.
Flags: needinfo?(cprice) → needinfo?(mhowell)
I'm assuming that a "stub service compatible stub binaries" mean one that's been signed with the dummy certificate? I don't see any of those in the places I know to look, but I don't really keep tabs on what's going on with bouncer; somebody who actually works with it can give you better information.
Flags: needinfo?(mhowell)
(In reply to Jeremy Orem [:oremj] from comment #22)
> Does stubdownloader.prod.mozaws.net work? Do we want to use a mozilla.org
> address instead?

Are we doing anything special with the cert on download.m.o to ensure it works for everyone? I believe we're at least using a sha1 cert so that users of Win XP < SP3 can still get the download right? Will we be able to get a sha1 cert for any other domains? I think we have a wildcard one for *.mozilla.org, so perhaps we should do that. There's also the fact that people do (okay, maybe just nerds like me) look at the domain of the link they're about to visit. It may appear fishy if it's not *.mozilla.org, especially for a download.
Do we need a SHA1 certificate for this service? I don't think CAs will issue them anymore. Will XP users be directed at stub installers?
(In reply to Jeremy Orem [:oremj] from comment #26)
> Do we need a SHA1 certificate for this service? I don't think CAs will issue
> them anymore. Will XP users be directed at stub installers?

I think they are. Alex do you know? I know that we solved the XP detection and product serving on bouncer, so if we're sending all links that would go to stub not to bouncer then these people may well be lost if the new thing doesn't handle it. I believe that bedrock isn't doing any special detection for winxp, but I'll research that more and get back to you.
Let's avoid sending XP users to this service by either:

* detecting XP users on mozorg
* passing through bouncer, which could ignore attribution_code if it is an XP user
(In reply to Jeremy Orem [:oremj] from comment #28)
> Let's avoid sending XP users to this service by either:
> 
> * detecting XP users on mozorg
> * passing through bouncer, which could ignore attribution_code if it is an
> XP user

:agibson,

Do we currently or can we reliably detect XP on the client side?

I'd obviously prefer we do the latter because I think it'd be more robust and we'd avoid any odd download URLs on mozorg.
Flags: needinfo?(agibson)
(In reply to Paul [:pmac] McLanahan from comment #29)
> (In reply to Jeremy Orem [:oremj] from comment #28)
> > Let's avoid sending XP users to this service by either:
> > 
> > * detecting XP users on mozorg
> > * passing through bouncer, which could ignore attribution_code if it is an
> > XP user
> 
> :agibson,
> 
> Do we currently or can we reliably detect XP on the client side?
> 
> I'd obviously prefer we do the latter because I think it'd be more robust
> and we'd avoid any odd download URLs on mozorg.

We could match whatever logic is is already in place for detecting XP in bouncer, however if Bouncer already has this logic I would favor using it for consistency, as opposed to start adding another layer of detection on mozorg. It would leave less room for error and also keep things simpler in bedrock.
Flags: needinfo?(agibson)
The production url will now be https://stubdownloader.services.mozilla.com/. We aren't supposed to use mozaws.net for user facing URLs.
(In reply to Jeremy Orem [:oremj] from comment #31)
> The production url will now be https://stubdownloader.services.mozilla.com/.
> We aren't supposed to use mozaws.net for user facing URLs.

Does this mean we're no longer using the regular bouncer urls on the mozorg side? Or is this something bouncer will forward to if it sees the stub attribution params? I thought this was what we agreed to doing in our last discussions.

Can someone please clarify?
This is something bouncer will forward to. The flow will still be mozorg -> bouncer -> stubdownloader.
(In reply to Jeremy Orem [:oremj] from comment #33)
> This is something bouncer will forward to. The flow will still be mozorg ->
> bouncer -> stubdownloader.

Thanks for confirming
It looks like there are still no builds with the dummy certificate published to bouncer. We won't be able to test the service until a build is in place.
(In reply to Jeremy Orem [:oremj] from comment #35)
> It looks like there are still no builds with the dummy certificate published
> to bouncer. We won't be able to test the service until a build is in place.

Sorry for the double need-info, Jeremy, but is this addressed too, now that bug 1318456 is resolved FIXED?
Flags: needinfo?(oremj)
It looks like you can test the "test-stub" product, but bug 1320773 will be required before going live.
Flags: needinfo?(oremj)
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.