Closed Bug 1001619 Opened 10 years ago Closed 10 years ago

Security Review: Directory Tiles Server-side

Categories

(mozilla.org :: Security Assurance: Review Request, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: oyiptong, Assigned: st3fan)

References

(Blocks 1 open bug, )

Details

Attachments

(1 file, 1 obsolete file)

Who is/are the point of contact(s) for this review?
oyiptong@mozilla.com

Please provide a short description of the feature / application (e.g. problem solved, use cases, etc.):
This is a webservice that supports the Directory Tiles features of Firefox Desktop and Fennec (eventually).
It serves Directory Links when clients call on an API endpoint.
It also captures impression and click data pings.

Please provide links to additional information (e.g. feature page, wiki) if available and not yet included in feature description

Server application server can be found at: https://github.com/oyiptong/onyx

We plan on riding the Firefox release trains and would like to have this launched for nightly within the next few weeks.

To help prioritize this work request, does this project support a goal specifically listed on this quarter's goal list? If so, which goal?

It fits in "Invest in Sustainability"/Content Services and "Get Firefox on a Growth Trajectory"/Firefox

Does this feature or code change affect Firefox, Thunderbird or any product or service the Mozilla ships to end users?
Yes

Are there any portions of the project that interact with 3rd party services?
Yes. We will be using Amazon Web Services for hosting the application, assets and will be using AWS services.

Will your application/service collect user data? If so, please describe
Yes. Impression data, clicks, unique user identifiers
Blocks: tiles-dev
Blocks: 993908, 993909, 993906
Blocks: 986521
Whiteboard: [triage needed]
QA Contact: sarentz
Assignee: nobody → sarentz
QA Contact: sarentz
Whiteboard: [triage needed]
Summary: Security Review for Directory Tiles Server-side → Security Review: Directory Tiles Server-side
Couple of quick questions:

* Is the current code stable/finished enough to start doing this review?
* Is there a development or staging instance running somewhere?
* Is there API documentation or some high level documentation about what this service implements and how it interacts with Firefox?
Flags: needinfo?(oyiptong)
Because this project is relatively small, I am going to add comments inline here as comments.
This is not really a security issue per se, but I think we should not encrypt the session data.

There are only two things in the cookie, a session id and a timestamp. I think for the sake of transparency of this project, which should be as high as possible, those two values should be in the cookie in plain text.

If you are worried about people tampering with those values, I would suggest to only add a simple hash to the cookie. This is actually what Flask does by default. It's session values are stored in a cookie as:

> base64(json(session_dict)) + "." + hash(session_dict, SECRET_KEY)

If you don't want to lock in to the Flask method then I would suggest to do something very similar manually and use a simple HMAC-SHA1 to sign the data.
I think Flask already does HMAC-SHA1 signing by default via the itsdangerous package.

SecureSessions use itsdangerous by default.
Flags: needinfo?(oyiptong)
(In reply to Stefan Arentz [:st3fan] from comment #1)
> Couple of quick questions:
> 
> * Is the current code stable/finished enough to start doing this review?
Yes
> * Is there a development or staging instance running somewhere?
No
> * Is there API documentation or some high level documentation about what
> this service implements and how it interacts with Firefox?
Yes, there are a couple:

https://docs.google.com/a/mozilla.com/document/d/1_9pTBxmabAMLxNESfchTOsY1U5E1BzFQiJgdVsQJQ20

and

https://docs.google.com/a/mozilla.com/document/d/1ItO5SuUQpPr0Sls0GK-TXxSP_3ZhEEB86z4UjyGeC0w
This is more about privacy than security: I see the IP address is sent to Heka. Is that actually used in the backend?
(In reply to Stefan Arentz [:st3fan] from comment #6)
> This is more about privacy than security: I see the IP address is sent to
> Heka. Is that actually used in the backend?

Yes, we are building reporting infrastructure. One of the uses of the IP address is to obtain geo data.
We plan on sending just the approx geolocation to sponsors, and also for our own bean counting.
Ok I do not know what our policies are WRT storing IP addresses. You should probably check that with someone on the privacy team.
Bryan, what is our position about storing IP addresses?

The server is currently aggregating IP addresses to compute geo location. We will be doing that outside of the response/request cycle, meaning that the data will be stored elsewhere as implemented.

Does that comply with our privacy policy? How about the end-user agreement?
Flags: needinfo?(clarkbw)
My only other concern is that the SESSION_COOKIE_SECURE setting in Flask defaults to False. I wonder if it would make sense to programmatically set that based on the extra headers that the front-end web server or proxy sets.
I was planning to have a config that overrides the default settings.
The production config would have SESSION_COOKIE_SECURE set to True
Attached file production config sample (obsolete) —
Attachment #8416534 - Flags: feedback?(sarentz)
Attachment #8416534 - Attachment is obsolete: true
Attachment #8416534 - Flags: feedback?(sarentz)
Attachment #8416538 - Flags: feedback?(sarentz)
Great. I'm going to close this to RESOLVED/FIXED. Please reopen if things change and need a new look.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
(In reply to Olivier Yiptong [:oyiptong] from comment #9)
> Bryan, what is our position about storing IP addresses?
> 
> The server is currently aggregating IP addresses to compute geo location. We
> will be doing that outside of the response/request cycle, meaning that the
> data will be stored elsewhere as implemented.
> 
> Does that comply with our privacy policy? How about the end-user agreement?

I'll try to gather the FHR policy on this because we're lining up with their usage but my understanding from that team is that we don't store IP addresses at all.  We can use the IP address in our geoIP service to gather a region but once we've detected the region we drop the IP address.  The region is required for Directory Tiles so we'll need to use the IP address to determine region but beyond that need the IP address has no value to us and likely leaks too much additional information.
Flags: needinfo?(clarkbw)
(In reply to Bryan Clark (Firefox Search PM) [:clarkbw] from comment #15)
> I'll try to gather the FHR policy on this because we're lining up with their
> usage but my understanding from that team is that we don't store IP
> addresses at all.  We can use the IP address in our geoIP service to gather
> a region but once we've detected the region we drop the IP address.  The
> region is required for Directory Tiles so we'll need to use the IP address
> to determine region but beyond that need the IP address has no value to us
> and likely leaks too much additional information.

Is it OK to obtain the location from the geoIP service outside of the front-end server?

The IP addresses will then go from the front-end server, to a message queue, then will ship to a worker that will translate this data and strip out the IP address. The data would not be stored permanently, but will be stored temporarily, in a work queue.
(In reply to Olivier Yiptong [:oyiptong] from comment #16)
> (In reply to Bryan Clark (Firefox Search PM) [:clarkbw] from comment #15)
> > I'll try to gather the FHR policy on this because we're lining up with their
> > usage but my understanding from that team is that we don't store IP
> > addresses at all.  We can use the IP address in our geoIP service to gather
> > a region but once we've detected the region we drop the IP address.  The
> > region is required for Directory Tiles so we'll need to use the IP address
> > to determine region but beyond that need the IP address has no value to us
> > and likely leaks too much additional information.
> 
> Is it OK to obtain the location from the geoIP service outside of the
> front-end server?
> 
> The IP addresses will then go from the front-end server, to a message queue,
> then will ship to a worker that will translate this data and strip out the
> IP address. The data would not be stored permanently, but will be stored
> temporarily, in a work queue.

That's a good question I'm not sure I'm qualified to answer well enough.
Attachment #8416538 - Flags: feedback?(sarentz) → feedback+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: