Closed
Bug 1119545
Opened 11 years ago
Closed 8 years ago
[spam] Disable spam-attracting fields in the profile
Categories
(developer.mozilla.org Graveyard :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: aspivak, Unassigned)
References
Details
(Whiteboard: [LOE:?])
User Story
* A user viewing the profile page of another user who has fewer than 10 revisions will not see the biography or the website URL fields
* Users who have fewer than 10 revisions will see the contents of these fields on their own profiles (if they exist), accompanied with a warning: "Until you have made 10 revisions, these fields are hidden from others."
* An administrator viewing a profile page of another user will see the contents of these fields accompanied by a warning: "Until this user has made 10 revisions, these fields are hidden from others."
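The three rules in this user story can be sketched as a single visibility check. This is a minimal Python sketch, not MDN's actual implementation: the `User` class, its fields, and `spam_field_visibility` are hypothetical names. Only the 10-revision threshold and the two warning strings come from the user story.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MIN_REVISIONS = 10  # threshold taken from the user story

OWN_WARNING = "Until you have made 10 revisions, these fields are hidden from others."
ADMIN_WARNING = "Until this user has made 10 revisions, these fields are hidden from others."


@dataclass
class User:
    # Hypothetical stand-in for a profile record.
    username: str
    revision_count: int = 0
    is_admin: bool = False


def spam_field_visibility(viewer: User, owner: User) -> Tuple[bool, Optional[str]]:
    """Return (show_fields, warning) for the biography and website URL fields."""
    if owner.revision_count >= MIN_REVISIONS:
        return True, None  # established contributor: always visible
    if viewer.username == owner.username:
        return True, OWN_WARNING  # owners always see their own fields
    if viewer.is_admin:
        return True, ADMIN_WARNING  # admins can still review hidden fields
    return False, None  # hidden from everyone else
```

In this sketch an administrator looking at their own low-revision profile gets the owner warning, since the ownership check runs first; the user story does not say which warning should win in that case.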
Disable problematic, spam-attracting fields in the user profile.
Spammers place links in the website field and the free-form About me field, and occasionally create free-form tags that are off-topic for MDN.
Specifically, we need to disable:
1. the website form field - this is the most abused field, and rarely used by legitimate contributors.
2. free-form About me field - also mostly used for spam
It would also help to restrict the interest tags to a pre-approved set, to avoid things like the following (from actual profiles):
profile:interest:com, profile:interest:garyschollmeier, profile:interest:http://garyschollmeier.com, profile:interest:www.garyschollmeier.com
link: https://developer.mozilla.org/en-US/profiles/Alisha1feKr
"profile:interest:best moving company", "profile:interest:full service movers", "profile:interest:local movers", "profile:interest:moving services", "profile:interest:professional moving company", "profile:interest:residential movers", "profile:interest:top rated moving company"
"profile:interest:nike air max 90 shoes", "profile:interest:nike air max", "profile:interest:nike flyknit air max"
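Restricting interest tags to a pre-approved set could look roughly like the following. This is only a sketch: `APPROVED_INTERESTS` and `filter_interest_tags` are hypothetical, and the actual approved set would have to be chosen by MDN admins. The `profile:interest:` prefix is the one shown in the examples above.

```python
INTEREST_PREFIX = "profile:interest:"

# Hypothetical allowlist; the real set would be chosen by MDN admins.
APPROVED_INTERESTS = {"javascript", "css", "html", "accessibility"}


def filter_interest_tags(tags):
    """Keep only interest tags whose value is in the approved set."""
    kept = []
    for tag in tags:
        if not tag.startswith(INTEREST_PREFIX):
            kept.append(tag)  # not an interest tag; leave it alone
            continue
        value = tag[len(INTEREST_PREFIX):].strip().lower()
        if value in APPROVED_INTERESTS:
            kept.append(INTEREST_PREFIX + value)
    return kept
```

With an allowlist like this, free-form values such as URLs or product names are dropped rather than stored.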
Discussion needed around the proposal that we disable these fields only until the user passes a "check" that they are a valid site contributor.
We also need to decide on the right vetting process. Ideas so far have been:
- # of edits (after 10 edits, these fields are enabled)
- Vouching, either via a link to a vouched Mozillians profile or via vouching/approval by an MDN admin.
I'd like to see if we can determine how often these fields are used legitimately before we build a vetting process/feature - if it is rarely used legitimately then I think we would be better off just removing the fields entirely...
Comment 1•11 years ago
I think many people use this field to point to their blog, for example (I know I do). Removing the ability for users to point at their blogs is kind of unfair, and IMHO damages the viability of MDN profiles. I would imagine that many users only fill out their profile to help encourage people to visit their blogs and whatnot.
Maybe I'm wrong. But that's certainly an impression I've gotten through conversations with admins of other sites.
Comment 2•11 years ago
What about displaying these fields only for vouched Mozillians?
Flags: needinfo?(jswisher)
Comment 3•11 years ago
(In reply to Jean-Yves Perrier [:teoli] from comment #2)
> What about displaying these fields only for vouched Mozillians?
:groovecoder, can you estimate the level-of-effort for...
scenario 1: remove the fields from the profile edit screen and the profile view screen
scenario 2: show the fields on the profile view screen only if the profile being viewed is a vouched mozillian; show the fields on the profile edit screen only if the profile being edited is a vouched mozillian.
Flags: needinfo?(lcrouch)
Whiteboard: [LOE:?]
Comment 4•11 years ago
Using Mozillians.org to manage access control to MDN features is certainly one way to get access management without implementing it all in MDN. For example, air.mozilla.org uses mozillians.org to control who gets access to what videos (e.g., you must be in the "NDA" group on mozillians.org to view videos that require an NDA).
We'd need to have a process to ensure that we regularly vouch active, trusted MDN users. We need that in any case.
Flags: needinfo?(jswisher)
Comment 5•11 years ago
Removing fields from edit & view pages: LOE 1 day
Checking for vouched Mozillians status: LOE 5 days (at least; first day would be Mozillians API research)
Flags: needinfo?(lcrouch)
Comment 6•10 years ago
I don't have a good read on how many people a Mozillians-only feature would serve. Does anyone? Absent that data, here's my take on this:
* There are ~85 Mozillians in the MDN group: https://mozillians.org/en-US/group/mdn/
* There are ~220 in the Developer Documentation group: https://mozillians.org/en-US/group/developer-documentation/
* There are ~700 MDN profiles with something in the "Mozillians" field
My guess is the number of vouched Mozillians in MDN is between 220 and 700, or about 0.5% of MDN users. I don't think we should invest the time to build a feature just for them right now, especially this feature.
(In reply to ali spivak from comment #0)
> I'd like to see if we can determine how often these fields are used
> legitimately before we build a vetting process/feature - if it is rarely
> used legitimately then I think we would be better off just removing the
> fields entirely...
There are ~24,000 website URLs in the database (approximately 25% of all profiles have one). I wrote a query that will extract them[0]. It is often easy to spot a spammy address through visual inspection (for example, http://garagedoorrepairapollobeachfl.com/).
Unfortunately, that is the only way I can see to review them. I wrote a script[1] to learn whether these addresses appear in SURBL[2] lists, but a couple of obvious candidates were not listed. So I'm not terribly optimistic that we could use a blocklist approach to learning the answer to Ali's question.
A quick review of the results of the query in [0] gives me the impression that at least half of the websites in the website field are spammy.
[0] https://gist.github.com/hoosteeno/9d392f330a8026bb6ed9
[1] https://gist.github.com/hoosteeno/8d7d3d416d37489ea693
[2] https://en.wikipedia.org/wiki/SURBL
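For readers unfamiliar with how a SURBL check works, here is a minimal Python sketch of the lookup. It is not the script in [1]. It assumes the combined `multi.surbl.org` zone, where a listed domain resolves to an address in 127.0.0.0/8 and an unlisted one does not resolve. The function names and the injectable `resolve` parameter are hypothetical.

```python
import socket
from urllib.parse import urlparse

SURBL_ZONE = "multi.surbl.org"


def surbl_query_name(url):
    """Build the DNS name to look up for the host part of `url`."""
    host = urlparse(url).hostname
    if not host:
        raise ValueError(f"no hostname in {url!r}")
    # Simplification: SURBL lists registered domains, so only a leading
    # "www." is stripped here; real code would use a public-suffix list.
    if host.startswith("www."):
        host = host[4:]
    return f"{host}.{SURBL_ZONE}"


def is_listed(url, resolve=socket.gethostbyname):
    """Return True if the lookup resolves to a 127.x.x.x answer."""
    try:
        answer = resolve(surbl_query_name(url))
    except socket.gaierror:
        return False  # name not found: not listed
    return answer.startswith("127.")
```

A `False` result only means the domain is not on the list, which matches the observation above that obviously spammy addresses can still be missing from it.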
Comment 7•10 years ago
"Half" of links from profiles being spam makes it tough to make a call about whether to allow them. What is the LOE to set a threshold for displaying them, by either account age or number of edits the user has made (should be a high number)?
Comment 8•10 years ago
I created a query to get the bio, URL, count of revisions and the link to the profile out of the data export I have:
https://gist.github.com/hoosteeno/bb11f1b0ac8032b07ec9
Using this I have produced a spreadsheet with all of those fields in it, which I will share once I know the right venue for sharing it.
Of the 23,913 profiles that include a URL, 22,647 have 0 revisions. Reviewing the spreadsheet, it seems like a _very large_ majority of the 0-revision profiles with URLs are spam.
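To put those two counts in proportion, here is a small Python sketch. `share_below_threshold` is a hypothetical helper for trying other cut-offs against per-profile revision counts from an export like the one above. The only real figures used are the two counts just quoted.

```python
def share_below_threshold(revision_counts, threshold):
    """Fraction of profiles whose revision count is below `threshold`."""
    counts = list(revision_counts)
    if not counts:
        return 0.0
    return sum(1 for c in counts if c < threshold) / len(counts)


# Figures reported above: 22,647 of 23,913 URL-bearing profiles have 0 revisions.
zero_revision_share = 22647 / 23913
print(f"{zero_revision_share:.1%}")  # prints 94.7%
```

So roughly 19 in 20 profiles with a URL belong to accounts that have never made a revision.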
I believe if we restricted display of bio and URL to people with X edits, we would eliminate a huge amount of spammy content from the site. And based on the content I can see, I think if we already had honeypots (as in bug 1119532) we would have prevented a majority of these profiles from being created.
Getting rid of the existing spammy profiles will probably require manually weeding through them, unfortunately.
Updated•10 years ago
User Story: (updated)
Updated•10 years ago
User Story: (updated)
Comment 9•8 years ago
Here's the current situation on MDN profile pages:
* We've removed most of the free text fields, such as website, bio and "About Me", from the profile
* We're more aggressive about removing links to banned users
* We still allow creating new tags, which may be used for spammy purposes
* We allow links to GitHub, LinkedIn, etc., and validate that they are valid URLs.
* robots.txt allows scraping the profile pages, and contributor links do not include "nofollow"
* The meta tag on profile pages is <meta name="robots" content="noindex, nofollow">, so spammy profile stuff should not make it to Google, etc.
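The robots meta tag mentioned in the last bullet can be checked mechanically. The following Python sketch uses only the standard library to extract the directives from a page's HTML. `RobotsMetaParser` and `robots_directives` are hypothetical names, and fetching the page is left out.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html):
    """Return the set of robots directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives
```

For the tag quoted above, `robots_directives` returns a set containing `noindex` and `nofollow`.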
There could be additional work (tags are a mess), but I think that would be best served with new bugs.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Updated•5 years ago
Product: developer.mozilla.org → developer.mozilla.org Graveyard