Closed Bug 752997 Opened 12 years ago Closed 11 years ago

[tracker] Create a Phonebook API

Categories

(Participation Infrastructure :: Phonebook, defect)

defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: aakashd, Unassigned)

Developers in the Mozilla community should be able to use the Phonebook to authorize contributors as Vouched Mozillians (and other, more granular methods in the future) as well as share profile data across our universe.

Features:
* Privacy controls - Fields set to "Admin" are not available to developers via the API.
* Authorization - Community sites, tools and systems will be able to authorize accounts as "Vouched Mozillians" using the Phonebook API.
* Profile Data Sharing - Community sites and Vouched Mozillians will be allowed to get Mozillian Phonebook profile data that a user has allowed the API to show. 
* API Key Generator - Only Mozillian Phonebook users within the Staff, Stewards and Reps groups will be able to use the API (and see the key generator). Over time, we will open up the API to all Vouched Mozillians once we have better systems in place to support the number of API calls likely to be made from the app.
Assignee: nobody → timw
we'd like to have API access for an IRC bot so people can use it (like we have used firebot) to look up a person's contact info.

Person would ping the bot with the person's IRC nick, bot would return info for the person who has that nick listed on their mozillians profile.

eg: "Mozilliansbot: Lucy" would return something like "Lucy is Majken "Lucy" Connor, majken@gmail.com see mozillians.org/lucy for more info"
(In reply to Majken "Lucy" Connor from comment #1)
> we'd like to have API access for an IRC bot so people can use it (like we
> have used firebot) to look up a person's contact info.
> 
> Person would ping the bot with the person's IRC nick, bot would return info
> for the person who has that nick listed on their mozillians profile.
> 
> eg: "Mozilliansbot: Lucy" would return something like "Lucy is Majken "Lucy"
> Connor, majken@gmail.com see mozillians.org/lucy for more info"

This raises some serious privacy concerns. The person running the bot may be a vouched Mozillian but the person USING the bot (and the people in the channel seeing the response) may not. Thus, the bot might be returning data that Mozillians have made available to VOUCHED (or above) users but not public, and then displaying that data publicly.

I think we need to be very clear that such use of the API is not OK, and even go so far as to prevent it.
Comment 2 is why Mozillians needs (and needed from the beginning) proper data privacy controls, so people can choose to make information public.

Why do we need API keys? For uses of the API such as Bugzilla integration (i.e. providing people in Bugzilla with an "Info" popup populated with Mozillians data), surely it's sufficient just to be logged into Mozillians, and for Mozillians to check the Origin header to make sure the requesting domain is trusted?

How would something like Bugzilla integration work if each user needed an API key?

Gerv
(In reply to James Socol [:jsocol, :james] from comment #2)
> I think we need to be very clear that such use of the API is not OK, and
> even go so far as to prevent it.

Well what sort of API is this bug about? An OAuth-style one (so, for apps which users can authorise with their account, like how the Army of Awesome works with Twitter) or a REST-style one, for publicly accessible data (like the Bugzilla and MediaWiki APIs).

While both kinds of APIs are needed, I feel that the one which would be more useful in the short-term would be the REST-style, public, one. (Of course this means that permissions also need to be sorted out.)
(In reply to James Socol [:jsocol, :james] from comment #2)
> (In reply to Majken "Lucy" Connor from comment #1)
> > we'd like to have API access for an IRC bot so people can use it (like we
> > have used firebot) to look up a person's contact info.
> > 
> > Person would ping the bot with the person's IRC nick, bot would return info
> > for the person who has that nick listed on their mozillians profile.
> > 
> > eg: "Mozilliansbot: Lucy" would return something like "Lucy is Majken "Lucy"
> > Connor, majken@gmail.com see mozillians.org/lucy for more info"
> 
> This raises some serious privacy concerns. The person running the bot may be
> a vouched Mozillian but the person USING the bot (and the people in the
> channel seeing the response) may not. Thus, the bot might be returning data
> that Mozillians have made available to VOUCHED (or above) users but not
> public, and then displaying that data publicly.
> 
> I think we need to be very clear that such use of the API is not OK, and
> even go so far as to prevent it.

I don't see how serious it could be. The bot can be restricted to only giving out x information. If it only gives out the mozillians link for example then there is no privacy concern. I'm also not sure giving out the primary email is a _big_ deal either considering all the other places where people can find it.

I also think the win in ease of use is huge (which is why we were putting email addresses in firebot to begin with) and the potential for abuse is extremely low. Someone malicious would have to figure out that there is an IRC network, and that there is a bot somewhere on it. But maybe further discussion belongs on the mozillians thread?
(In reply to Gervase Markham [:gerv] from comment #3)
> Comment 2 is why Mozillians needs (and needed from the beginning) proper
> data privacy controls, so people can choose to make information public.

I 100% agree with that. We're talking with Data Safety soon. For K9o, we may need to make an API available before privacy controls can be finished, but if we do, access to that API is going to be *extremely* restricted.

> Why do we need API keys? For uses of the API such as Bugzilla integration
> (i.e. providing people in Bugzilla with an "Info" popup populated with
> Mozillians data), surely it's sufficient just to be logged into Mozillians,
> and for Mozillians to check the Origin header to make sure the requesting
> domain is trusted?

This only works for things that allow you to log in via BrowserID. For Bugzilla, this *might* work (though recall that some accounts aren't allowed to log in that way) but for something like an IRC bot, which cannot authenticate via BrowserID, there needs to be some way of authenticating the actor.

> 
> How would something like Bugzilla integration work if each user needed an
> API key?

The APP would need an API key, not the user. This is getting into some nitty-gritty we haven't solved yet, but there are a couple concerns:

1) Bugzilla is making the request to the API, not a specific user of Bugzilla. (This may not be true in the example you gave, but if I am, say, ReMo, and want to populate profile data from the Phonebook, or MDN and want to check someone's vouched-status to decide if they can edit a page, it is true.) API keys also give us the ability to revoke API keys, in the case of such a non-human actor behaving badly (e.g. the IRC bot above, using elevated access to post information publicly).

2) Even when users are making requests to the API, BrowserID authentication is not always possible. If I am using a Python Mozillians-Phonebook client, and I am a vouched Mozillian so am doing a (for future example) search on all Mozillians who've shared location data with Vouched members and are in Argentina, I need a non-BrowserID way of identifying myself.

3) Even to take your example, in a form that tries to keep everything client-side, if I sign in to Bugzilla with BrowserID, I'm not automatically signed in to Mozillians with BrowserID, so I'm not entirely sure how that would work at all.

The idea of API keys is granting access to the API from things that aren't Browsers, and thus can't use BrowserID, even if they are accessed from browsers but are not accessing the API via a browser (e.g. a server-to-server REST or RPC call for Bugzilla, MDN, SUMO, etc).
(In reply to Majken "Lucy" Connor from comment #5)
> I don't see how serious it could be. The bot can be restricted to only
> giving out x information. If it only gives out the mozillians link for
> example then there is no privacy concern. I'm also not sure giving out the
> primary email is a _big_ deal either considering all the other places where
> people can find it.

Even a bot built in good-faith that has access to data users would rather not make public could accidentally put that info in the public. This is why, as Gerv said, we need proper privacy controls. You may not think giving out the primary email or real name is a big deal, but that isn't really your call. That's the call of the person whose information is being pasted in a public IRC channel.

And another bot or actor built in bad-faith could just run through all the IRC nicks it sees and collect email addresses. Even if it just used IRC to communicate with a bot built in good-faith.

> I also think the win in ease of use is huge (which is why we were putting
> email addresses in firebot to begin with) and the potential for abuse is
> extremely low. Someone malicious would have to figure out that there is an
> IRC network, and that there is a bot somewhere on it. But maybe further
> discussion belongs on the mozillians thread?

Putting email addresses in firebot is opt-in. Putting an email address in Mozillians is required, or you opt out of the phonebook (and later, potentially other things) completely.

It's no secret that Mozilla's IRC network exists. It certainly would be no secret if someone built this bot. That is security by obscurity without very much obscurity.
(In reply to Leo McArdle [:leo] from comment #4)
> (In reply to James Socol [:jsocol, :james] from comment #2)
> > I think we need to be very clear that such use of the API is not OK, and
> > even go so far as to prevent it.
> 
> Well what sort of API is this bug about? An OAuth-style one (so, for apps
> which users can authorise with their account, like how the Army of Awesome
> works with Twitter) or a REST-style one, for publicly accessible data (like
> the Bugzilla and MediaWiki APIs).

We've been working through the use cases we have so far. This is almost certainly going to be a tracking bug for a number of API endpoints.

The main 4 use cases so far seem to be:

1) Is a user vouched/in-a-group? E.g., should a user have permission to do X on my site. (This is probably not going to seem terribly RESTful, as we've brainstormed ideas. The goal is to return the minimum amount of information to answer the question "does email address X match criteria Y?")
2) Get information about a single user, most likely by email address. This will look more RESTful (GET /some/path/jsocol%40mozilla.com) and the information returned will need to be restricted by a combination of who (or what) is asking, and the user's privacy settings.
3) Find multiple users, search by some criteria. We're still working this out. We may have multiple ways to do this (e.g. "get a list of users in a group" may have a shorter, optional syntax than "get a list of vouched users in a country"). Again, we must be careful what information we return and to whom/what.
4) Update user data. We're pushing this off for now. It's a valid use case but not a very high priority compared to the others, especially #3.
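
As a rough illustration of use cases 1 and 2, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the in-memory PROFILES store, the field names, the two access levels, and the helper names are hypothetical and are not part of any actual Phonebook API. The /some/path/ prefix is just the placeholder used in the list above.

```python
from urllib.parse import quote

# Hypothetical access levels, lowest to highest (assumption, not the real model).
LEVELS = {"public": 0, "vouched": 1}

# Hypothetical in-memory stand-in for the Phonebook. Each profile field is a
# (value, minimum level needed to see it) pair.
PROFILES = {
    "user@example.com": {
        "vouched": True,
        "groups": {"reps"},
        "fields": {
            "full_name": ("Example User", "public"),
            "irc_nick": ("exampleuser", "vouched"),
        },
    },
}


def matches_criteria(email, vouched=None, group=None):
    """Use case 1: answer only yes or no, returning nothing else about the user."""
    profile = PROFILES.get(email)
    if profile is None:
        return False
    if vouched is not None and profile["vouched"] != vouched:
        return False
    if group is not None and group not in profile["groups"]:
        return False
    return True


def get_profile(email, requester_level):
    """Use case 2: return only the fields visible at the requester's level."""
    profile = PROFILES.get(email)
    if profile is None:
        return None
    rank = LEVELS[requester_level]
    return {
        name: value
        for name, (value, needed) in profile["fields"].items()
        if LEVELS[needed] <= rank
    }


def profile_path(email):
    """Build the request path, percent-encoding the '@' as %40."""
    return "/some/path/" + quote(email, safe="")
```

The point of keeping matches_criteria separate from get_profile is that a site which only needs a permission decision never receives profile data at all.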

> While both kinds of APIs are needed, I feel that the one which would be more
> useful in the short-term would be the REST-style, public, one. (Of course
> this means that permissions also need to be sorted out.)

In the short term, the REST-style, more public one is what we'll build, because it's important for K9o, but access to it will be severely restricted until privacy controls are finished.
Assignee: timw → james
Summary: Create a Phonebook API → [tracker] Create a Phonebook API
Mozillians is actually more opt in than Firebot since anyone can teach Firebot anything and you're not alerted when they do.

There's a difference here between never using an API for an IRC bot and doing this when the API can respect the privacy controls.

There are also other ways. The bot could only PM the answer, for example. The bot could limit how many results it gives to a single person, and/or in a specific time frame. 
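
A minimal sketch of the "limit how many results per person in a time frame" idea, in Python. The class name, the parameters, and the injectable clock are all hypothetical; this is one possible sliding-window approach, not a description of any existing bot.

```python
import time
from collections import defaultdict, deque


class ReplyLimiter:
    """Allow at most max_replies lookups per nick within window_seconds."""

    def __init__(self, max_replies, window_seconds, clock=time.monotonic):
        self._max = max_replies
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._history = defaultdict(deque)  # nick -> timestamps of recent replies

    def allow(self, nick):
        now = self._clock()
        recent = self._history[nick]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self._window:
            recent.popleft()
        if len(recent) >= self._max:
            return False
        recent.append(now)
        return True
```

A bot would call allow(nick) before answering and stay silent (or reply with a short refusal) when it returns False. Note that this only slows down harvesting by a single nick; it does nothing about someone rotating nicks.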

Giving out the name and the email IMO isn't a big deal because I think that as participants in an open community there is a certain amount that needs to be available. However I think we both agree the ideal situation is an API that can respect the private/public settings and build a bot off that.
(In reply to James Socol [:jsocol, :james] from comment #6)
> (In reply to Gervase Markham [:gerv] from comment #3)
> > Why do we need API keys? For uses of the API such as Bugzilla integration
> > (i.e. providing people in Bugzilla with an "Info" popup populated with
> > Mozillians data), surely it's sufficient just to be logged into Mozillians,
> > and for Mozillians to check the Origin header to make sure the requesting
> > domain is trusted?
> 
> This only works for things that allow you to log in via BrowserID. For
> Bugzilla, this *might* work (though recall that some accounts aren't allowed
> to log in that way) but for something like an IRC bot, which cannot
> authenticate via BrowserID, there needs to be some way of authenticating the
> actor.

I don't think that's quite right. (Perhaps it would help to clarify that I'm thinking about client-side use of the API by JS.) Let's say I'm logged into Mozillians, and not into Bugzilla. I visit a Bugzilla page. I find that each name has a little [?] icon by it. I mouse over one. The JS in-page goes off and asks Mozillians about the person with that email address. Mozillians says:

- Hmm, Origin says this request is coming from bugzilla.mozilla.org, whose integration code I 
  trust
- This person is logged in to Mozillians, so I can supply them with data appropriate to their 
  permissions
- <supplies data>

It doesn't matter whether the person is logged into Bugzilla or not.
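
The flow above could be sketched server-side roughly as follows, in Python. This is only an illustration under stated assumptions: the function name, the TRUSTED_ORIGINS set, the session_user argument, and the load_visible_data callback are hypothetical. The response headers are the standard CORS ones a browser needs before in-page JS can read a credentialed cross-origin response.

```python
# Hypothetical allow-list of sites whose integration code is trusted.
TRUSTED_ORIGINS = {"https://bugzilla.mozilla.org"}


def handle_profile_request(request_headers, session_user, load_visible_data):
    """Return (status, response_headers, body) for a cross-origin lookup.

    session_user is whoever is logged in to Mozillians (None if nobody is).
    load_visible_data(user) returns the data that user is permitted to see.
    """
    origin = request_headers.get("Origin")
    if origin not in TRUSTED_ORIGINS:
        # Unknown or missing Origin: refuse without any CORS headers.
        return 403, {}, None

    cors_headers = {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Credentials": "true",
        "Vary": "Origin",
    }
    if session_user is None:
        # Trusted site, but the visitor is not logged in to Mozillians.
        return 401, cors_headers, None

    return 200, cors_headers, load_visible_data(session_user)
```

Nothing here depends on the visitor's Bugzilla login state; only the Origin of the page and the Mozillians session matter.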

> 1) Bugzilla is making the request to the API, not a specific user of
> Bugzilla. (This may not be true in the example you gave, but if I am, say,
> ReMo, and want to populate profile data from the Phonebook, or MDN and want
> to check someone's vouched-status to decide if they can edit a page, it is
> true.) API keys also give us the ability to revoke API keys, in the case of
> such a non-human actor behaving badly (e.g. the IRC bot above, using
> elevated access to post information publicly).

Why would the IRC bot have access to information that it's not allowed to post? The only reason I can think of is because we decide that the Mozillians permission model is currently broken, and we are working around it until we fix it. That's a fine reason, but we shouldn't design the API around that use case.

> 2) Even when users are making requests to the API, BrowserID authentication
> is not always possible. If I am using a Python Mozillians-Phonebook client,
> and I am a vouched Mozillian so am doing a (for future example) search on
> all Mozillians who've shared location data with Vouched members and are in
> Argentina, I need a non-BrowserID way of identifying myself.

Why could a Python app not log in with BrowserID?

(In reply to Majken "Lucy" Connor from comment #9)
> There's a difference here between never using an API for an IRC bot and
> doing this when the API can respect the privacy controls.

The trouble we have here is that privacy controls were not built into Mozillians from the start in the design which got implemented, and so I suspect the privacy folks will say that we have no informed consent to disclose anything :-|

Gerv
Let's step back a little. We don't HAVE an API right now, we're still designing it.

Unfortunately, due to k9o priorities and severely constrained developer time, we'll need parts of the API before we can get privacy controls where they need to be. So the initial versions of the API will be both incomplete and extremely restricted. As privacy controls get built, we'll continue to build out the API with them in mind.

This is all really valuable feedback and input, and we'll definitely keep it all in mind.

I don't want to get into a gigantic argument about a thing that doesn't exist yet, but to answer three questions:

> It doesn't matter whether the person is logged into Bugzilla or not.

For your very specific example, yes that works.

> Why would the IRC bot have access to information that it's not allowed to post? 

It shouldn't, which is really my point. But some apps will need higher levels of access to the same API. Hence we need some way to identify the app and its level of access: API keys are one very obvious solution.

If it's using a vouched Mozillian's level of access, it would see things that a vouched Mozillian could see. But then it has the potential to put that into a public space. So something like an IRC bot should be using the API with only "public" access, access to data that is marked "public", NOT the access level of the person who created the bot, who might be staff or an admin or whatever. And we should be able to revoke the bot's API access if it misuses it or violates the privacy levels. (E.g. if the bot or another app has a VOUCHED level of access and creates a publicly-visible list of data that only VOUCHED users should be able to see.)

(And it shouldn't be the access level of the person asking the question, because there's really no way to verify that the /nick asking is, in fact, the same person, and it still might dump the data into a public channel. IRC doesn't have that deep a security model and most of us use pretty weak passwords on IRC, anyway.)
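
To make the "identify the app and its level of access, and be able to revoke it" idea concrete, here is a minimal Python sketch. The ApiKeyRegistry class, its method names, and the level strings are hypothetical, and a real system would persist keys rather than hold them in memory.

```python
import secrets


class ApiKeyRegistry:
    """Map API keys to an app name and a fixed access level."""

    def __init__(self):
        self._keys = {}  # key -> {"app": str, "level": str, "revoked": bool}

    def issue(self, app, level):
        """Create a key for an app. The level belongs to the app, not its author."""
        key = secrets.token_hex(16)
        self._keys[key] = {"app": app, "level": level, "revoked": False}
        return key

    def revoke(self, key):
        """Cut off an app that misuses its access."""
        if key in self._keys:
            self._keys[key]["revoked"] = True

    def access_level(self, key):
        """Return the key's level, or None if the key is unknown or revoked."""
        record = self._keys.get(key)
        if record is None or record["revoked"]:
            return None
        return record["level"]
```

Under this sketch, an IRC bot would be issued a key at the "public" level regardless of who wrote it, and revoking that key takes effect on the next request.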

This is technically true of humans, too. As a vouched, staff member, and/or admin, you have access to data that other people don't make public (even with privacy controls in place). The same issue exists on Facebook. If we're FB friends, I can see info you've shared with me and you're trusting me not to copy/paste and share it outside that. But I can, because I can see it. We probably can't fix humans, but for non-human apps, we can and should be more careful.

> Why could a Python app not log in with BrowserID?

Because it's not a browser. BrowserID only works in a context that can execute JS, with a 'window.navigator' object, where users can click on things, sign in... It was a PITA to automate even for our webapp tests in Selenium.

Several of us have raised this with the Identity team, that BrowserID as the sole authentication path means we have to implement a second form of access for non-browser APIs. There doesn't seem to be a best practice or recommendation for how to do that, yet.
Thanks for this! 

A bot could only respond to people with registered nicks, but I think the better solution is to only use information that's public. That discussion can wait until we're there though. Should I file a spinoff bug and mark it depending on this one?
(In reply to Majken "Lucy" Connor from comment #12)
> A bot could only respond to people with registered nicks, but I think the
> better solution is to only use information that's public.

IRC just isn't designed to be secure this way. Even if it only PMed people with registered nicks, all it takes is one vouched mozillian with a weak IRC password to tear through it, or to sniff the IRC traffic. And registered nick != vouched mozillian. &c. The latter approach is more API work but better, I think.

> Should I file a spinoff bug and mark it
> depending on this one?

Not yet. I don't think there's really a describable task, yet. We'll get there.
(In reply to James Socol [:jsocol, :james] from comment #11)
> > It doesn't matter whether the person is logged into Bugzilla or not.
> 
> For your very specific example, yes that works.

But my example isn't specific; it's how you'd integrate with almost any website client-side. You can imagine doing the same thing for identities in hg.mozilla.org, tbpl, or any other place where you see the email addresses or other identifiers of Mozillians.

This kind of thing was the primary use case when I was designing the way Domesday would work originally. OK, the codebase has moved on, but the stuff we want the Phonebook to do is still the same.

I'm not opposed to API keys, I just can't see how you could do this major use case if they were the only way to do things. How would this kind of integration Just Work(TM) in an API key world? You can't have a secret API key appear in Bugzilla's client-side JS...
 
> If it's using a vouched Mozillian's level of access, it would see things
> that a vouched Mozillian could see. But then it has the potential to put
> that into a public space. So something like an IRC bot should be using the
> API with only "public" access, access to data that is marked "public", NOT
> the access level of the person who created the bot, who might be staff or an
> admin or whatever.

Of course. But I thought the point of the first part of your comments was that we currently don't have any data marked "public", and aren't going to have any, because there are no privacy controls and won't be any time soon?

> > Why could a Python app not log in with BrowserID?
> 
> Because it's not a browser. BrowserID only works in a context that can
> execute JS, with a 'window.navigator' object, where users can click on
> things, sign in... It was a PITA to automate even for our webapp tests in
> Selenium.

I agree that you'd need to reimplement some of browser.js in Python. But I'd have thought this was a generally useful task, for the reasons you state.
 
Gerv
Depends on: 775556
Assignee: james → giorgos
Status: NEW → ASSIGNED
Blocks: 700832
No longer blocks: 700832
No longer depends on: 747524
Depends on: 795380
Depends on: 795382
Depends on: 795385
Depends on: 795386
Depends on: 795388
Depends on: 795390
Depends on: 799136
Depends on: 799140
Depends on: 751661
No longer depends on: 799140
Depends on: 799276
Depends on: 768539
Depends on: 768536
Depends on: 805180
Depends on: 805449
Depends on: 808744
Depends on: 808827
Depends on: 809156
Depends on: 809204
Depends on: 809575
Depends on: 810063
Assignee: giorgos → nobody
Marking this bug as resolved because the API launched in December. Thanks for the fantastic work here. We have an API! :)
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
No longer depends on: 805180
Resolution: --- → FIXED
Bumping to verified live and loud.
Status: RESOLVED → VERIFIED