Bug 841095 - Vote abuse coming from single machine
Status: RESOLVED FIXED
Whiteboard: u=sumo-team c=kpidash p=0 s=2013.4
Product: support.mozilla.org
Classification: Other
Component: Knowledge Base Software
Version: unspecified
Platform: All All
Importance: P3 normal
Target Milestone: 2013Q1
Assigned To: Ricky Rosario [:rrosario, :r1cky]
Depends on: 846399
Reported: 2013-02-13 11:26 PST by Ibai Garcia [:ibai]
Modified: 2013-03-18 17:33 PDT


Attachments
Spike in votes (52.33 KB, image/jpeg)
2013-03-09 10:33 PST, Verdi [:verdi]
Another spike (63.94 KB, image/jpeg)
2013-03-09 10:34 PST, Verdi [:verdi]

Description Ibai Garcia [:ibai] 2013-02-13 11:26:36 PST
I stumbled upon a case of fairly obvious vote abuse (probably generated by a script) in the following article:

https://support.mozilla.org/en-US/kb/getting-started-firefox-os

A machine with the following configuration:
Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)

Generated 66 positive votes between:
2013-02-09 02:47:15
and
2013-02-09 03:02:06

The votes don't show up on Google Analytics. I will investigate and report back if I can find similar patterns in other articles.
Comment 1 Ibai Garcia [:ibai] 2013-02-13 11:38:46 PST
Some other articles that suffer from the same effect:

Firefox for Android Crashes On Startup - How To Fix
Mozilla News
Sådan bruger du Java, hvis det er blevet blokeret
Superheroes Wanted!

It seems that any article linked from the Home Page was hit by this effect.
Comment 2 Kadir Topal [:atopal] 2013-02-19 08:00:23 PST
I guess we have to move forward with rate limiting. I assume that there are similar issues in the support forums. Also, I guess that this is not necessarily malicious. 

Usually the GA vote numbers are about 10% lower than our own numbers, which might also be due to time zone differences, but in the last two days our own numbers exploded while GA stayed level:

Date	GA	Kitsune
2/8/2013	14048	15252
2/9/2013	13153	15947
2/10/2013	12764	14046
2/11/2013	13121	14308
2/12/2013	12980	14429
2/13/2013	12851	14024
2/14/2013	12515	13807
2/15/2013	12289	13373
2/16/2013	12339	13600
2/17/2013	12219	96351
2/18/2013	12501	118311

I'll look into it.
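
The comparison in the table above can be sketched as a small check on the Kitsune/GA ratio. This is only an illustration: the counts are copied from the table, while the 1.5 threshold and the function name are assumptions chosen so that the usual gap of roughly 10% stays well below the cutoff.

```python
# Daily helpful-vote counts copied from the table above: (date, GA, Kitsune).
DAILY_COUNTS = [
    ("2/8/2013", 14048, 15252),
    ("2/9/2013", 13153, 15947),
    ("2/10/2013", 12764, 14046),
    ("2/11/2013", 13121, 14308),
    ("2/12/2013", 12980, 14429),
    ("2/13/2013", 12851, 14024),
    ("2/14/2013", 12515, 13807),
    ("2/15/2013", 12289, 13373),
    ("2/16/2013", 12339, 13600),
    ("2/17/2013", 12219, 96351),
    ("2/18/2013", 12501, 118311),
]

# Assumption: a Kitsune/GA ratio above 1.5 counts as suspicious.
# The normal ratio in this data sits around 1.1.
RATIO_THRESHOLD = 1.5


def suspicious_days(rows, threshold=RATIO_THRESHOLD):
    """Return (date, ratio) pairs for days where Kitsune far outpaces GA."""
    flagged = []
    for date, ga, kitsune in rows:
        ratio = kitsune / ga
        if ratio > threshold:
            flagged.append((date, ratio))
    return flagged


if __name__ == "__main__":
    for date, ratio in suspicious_days(DAILY_COUNTS):
        print(f"{date}: Kitsune/GA = {ratio:.2f}")
```

With this threshold only 2/17/2013 and 2/18/2013 are flagged; every other day in the table stays below 1.25.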
Comment 3 Kadir Topal [:atopal] 2013-02-19 08:07:52 PST
Ah, mystery solved, the excess votes were created by: 
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; Netsparker)

This might be our own security testing team. I'll see if we can confirm their use of Netsparker. I guess GA filters Netsparker automatically. I know I do that for the log analysis.
Comment 4 Ibai Garcia [:ibai] 2013-02-20 14:17:41 PST
There's a crazy spike in some articles last Saturday

https://support.mozilla.org/en-US/kb/getting-started-firefox-os/history
https://support.mozilla.org/en-US/kb/get-started-firefox-overview-main-features/history

Can we remove them?
Comment 5 Ricky Rosario [:rrosario, :r1cky] 2013-02-25 12:49:23 PST
(In reply to Ibai Garcia [:ibai] from comment #4)
> Can we remove them?

If there is a way to identify all the bogus votes, then we could write a migration to delete them.

(In reply to Kadir Topal [:atopal] from comment #3)
> This might be our own security testing team. I'll see if we can confirm
> their use of Netsparker. I guess GA filters Netsparker automatically. I know
> I do that for the log analysis.

Gah. That's not good. We shouldn't have automated scripts causing data to be created on production.
Comment 6 Ibai Garcia [:ibai] 2013-02-25 15:08:44 PST
It is hard to tell which votes are bogus from the DB without doing a manual check... for example, the UA that Kadir has shared has created a lot of bad votes, but it has also created what looks like a legit set of votes.

I have compiled the votes generated by that UA in the last 2 months and the majority are single votes here and there:

https://docs.google.com/spreadsheet/ccc?key=0AmCjyDM0fEFgdHh2eFFkcG5SYUVNdXpxWDlWZFd3ZlE#gid=0

Hard to tell if they are "malicious".

What is true is that when the same UA generates more than a couple of votes per minute per article, this tends to indicate abuse (i.e. they don't stop at two; they go on to dozens). So maybe we could take that machine configuration and block it when we detect abuse? Do we have something more tangible, such as a cookie with a unique identifier? I'm thinking that with FX OS and everyone running the same device, the UA trick is going to be a really bad idea, but maybe we can do it for machines that are not Firefox.
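
The "more than a couple of votes per minute per article" heuristic could be sketched roughly as follows. The tuple layout, the function name, and the limit of 2 are assumptions for illustration, not Kitsune's schema or an agreed threshold.

```python
from collections import Counter
from datetime import datetime

# Assumption: "more than a couple" is read as more than 2 votes per minute.
MAX_VOTES_PER_MINUTE = 2


def find_bursts(votes, limit=MAX_VOTES_PER_MINUTE):
    """Group votes by (user agent, article, minute) and return the busy groups.

    `votes` is an iterable of (user_agent, article_slug, created) tuples,
    where `created` is a datetime. The result maps each group that exceeds
    `limit` to its vote count.
    """
    per_minute = Counter()
    for user_agent, article, created in votes:
        minute = created.replace(second=0, microsecond=0)
        per_minute[(user_agent, article, minute)] += 1
    return {key: count for key, count in per_minute.items() if count > limit}


if __name__ == "__main__":
    # Made-up sample: five votes within one minute, plus one unrelated vote.
    ua = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)"
    sample = [
        (ua, "getting-started-firefox-os", datetime(2013, 2, 9, 2, 47, s))
        for s in (15, 20, 25, 30, 35)
    ]
    sample.append(("SomeOtherUA", "getting-started-firefox-os",
                   datetime(2013, 2, 9, 2, 47, 40)))
    for (agent, article, minute), count in find_bursts(sample).items():
        print(minute, article, count)
```

As noted above, the user agent is a weak identifier: many devices can share one UA string, so a flagged group is a candidate for manual review rather than proof of abuse.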
Comment 7 Ricky Rosario [:rrosario, :r1cky] 2013-02-25 15:53:17 PST
(In reply to Ibai Garcia [:ibai] from comment #6)
> Do we have something more
> tangible as a cookie with a unique identifier?

Yes, we set a cookie on all anonymous votes to (try to) prevent multiple votes per user. A script just works around that by not sending the cookie.
Comment 8 Ibai Garcia [:ibai] 2013-02-25 15:54:37 PST
How crazy is not to record the vote if we can write in the cookie?
Comment 9 Ricky Rosario [:rrosario, :r1cky] 2013-02-25 16:39:19 PST
(In reply to Ibai Garcia [:ibai] from comment #8)
> How crazy is not to record the vote if we can write in the cookie?

I am not sure I fully understand this. We record the vote and set the cookie in the response. If they try to vote again, the cookie will be there and the vote won't count. At the time of recording the vote, we have no idea what is going to happen to the cookie. We could make it a little harder to vote by requiring a CSRF token. This would just make their script slightly harder to write, but it would still be possible to cheat the system.

The only way to make the system cheat-proof is to only allow votes from auth'd users. But that would probably not work so well on the KB where I assume most votes are from anonymous users.
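
To illustrate why the cookie check only stops cooperating clients, here is a toy model of the flow described above. It is a simplified sketch and not Kitsune's actual view code: the class, the cookie name, and the single global cookie (rather than one per article) are all assumptions.

```python
class VoteEndpoint:
    """Toy stand-in for an anonymous vote handler."""

    COOKIE_NAME = "voted"  # hypothetical cookie name

    def __init__(self):
        self.votes = 0

    def handle(self, request_cookies):
        """Record a vote unless the marker cookie was sent back.

        Returns the cookies the response would set.
        """
        if request_cookies.get(self.COOKIE_NAME) == "1":
            return {}  # repeat vote from a client that kept the cookie
        self.votes += 1
        return {self.COOKIE_NAME: "1"}


def browser_votes(endpoint, attempts):
    """A browser stores response cookies and sends them on later requests."""
    jar = {}
    for _ in range(attempts):
        jar.update(endpoint.handle(jar))


def script_votes(endpoint, attempts):
    """A script that never sends the cookie looks like a new visitor each time."""
    for _ in range(attempts):
        endpoint.handle({})


if __name__ == "__main__":
    browser, script = VoteEndpoint(), VoteEndpoint()
    browser_votes(browser, 66)
    script_votes(script, 66)
    print("browser:", browser.votes, "script:", script.votes)
```

In this model the server decides whether to count the vote before it can know whether the client will keep the cookie, which is why the first vote of every cookie-less request is always recorded.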
Comment 10 Ibai Garcia [:ibai] 2013-02-25 16:52:58 PST
I meant can't instead of can...my mistake. So your first paragraph answers my suggestion.

We need a method that reduces the "cheating" (I don't think that it is necessarily cheating...it's more like trolling) but still allows votes from unregistered users. 

And it needs to be friendly (a Captcha doesn't seem like a good option). 

I'm inclined to remove votes coming from "fishy" UAs. We can refine the method a little bit.
Comment 11 Kadir Topal [:atopal] 2013-02-27 05:42:50 PST
Sorry, meant to comment here after my meeting with Ricky.

Ibai, Netsparker is a tool that is used to probe sites for security issues. No normal user will have that in their UA. So, if we remove those votes we should not be removing any legitimate votes. 

Also, I'm only seeing it come up on the 17th and 18th, and a little bit on the 26th:

2013-02-17	82401
2013-02-18	104329
2013-02-26	83


Ricky, here is the SQL query I'd suggest: 

First, removing all vote_metadata, so we won't be stuck with it after removing the actual votes.

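-- Delete metadata rows first, while the votes they point to still exist.
DELETE
FROM `wiki_helpfulvotemetadata`
WHERE `wiki_helpfulvotemetadata`.`vote_id` IN (
    SELECT `wiki_helpfulvote`.`id` FROM `wiki_helpfulvote`
    WHERE `wiki_helpfulvote`.`user_agent` LIKE '%Netsparker%'
    AND `wiki_helpfulvote`.`created` >= '2013-02-17 00:00:00'
    AND `wiki_helpfulvote`.`created` < '2013-02-19 00:00:00');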

Then deleting the actual votes should be quite straightforward:

DELETE FROM `wiki_helpfulvote`
WHERE `wiki_helpfulvote`.`user_agent` LIKE '%Netsparker%'
AND `wiki_helpfulvote`.`created` >= '2013-02-17 00:00:00'
AND `wiki_helpfulvote`.`created` < '2013-02-19 00:00:00';
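
The two-step delete can be rehearsed against a throwaway database before touching production. The sketch below uses Python's `sqlite3` with an in-memory database rather than MySQL, and the table definitions are cut down to the columns the queries above mention; the sample rows, the helper names, and the reduced schema are assumptions for illustration.

```python
import sqlite3

# Half-open window covering 2013-02-17 and 2013-02-18.
WINDOW = ("2013-02-17 00:00:00", "2013-02-19 00:00:00")


def purge_votes(conn, ua_pattern, start, end):
    """Delete matching votes plus their metadata in one transaction.

    Returns (metadata_rows_deleted, vote_rows_deleted).
    """
    where = "user_agent LIKE ? AND created >= ? AND created < ?"
    params = (ua_pattern, start, end)
    with conn:  # commit on success, roll back if either DELETE raises
        meta_deleted = conn.execute(
            "DELETE FROM wiki_helpfulvotemetadata WHERE vote_id IN "
            "(SELECT id FROM wiki_helpfulvote WHERE " + where + ")",
            params,
        ).rowcount
        votes_deleted = conn.execute(
            "DELETE FROM wiki_helpfulvote WHERE " + where, params
        ).rowcount
    return meta_deleted, votes_deleted


def build_demo_db():
    """In-memory stand-in with made-up rows; the real tables have more columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wiki_helpfulvote "
        "(id INTEGER PRIMARY KEY, user_agent TEXT, created TEXT)"
    )
    conn.execute(
        "CREATE TABLE wiki_helpfulvotemetadata "
        "(id INTEGER PRIMARY KEY, vote_id INTEGER)"
    )
    scanner = "Mozilla/4.0 (compatible; MSIE 6.0; Netsparker)"
    conn.executemany(
        "INSERT INTO wiki_helpfulvote VALUES (?, ?, ?)",
        [
            (1, scanner, "2013-02-17 10:00:00"),  # inside the window
            (2, scanner, "2013-02-26 09:00:00"),  # outside the window
            (3, "Mozilla/5.0 (compatible; MSIE 9.0)", "2013-02-17 11:00:00"),
        ],
    )
    conn.executemany(
        "INSERT INTO wiki_helpfulvotemetadata (vote_id) VALUES (?)",
        [(1,), (2,), (3,)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    db = build_demo_db()
    print(purge_votes(db, "%Netsparker%", *WINDOW))
```

Running both deletes inside one transaction means a failure in the second statement does not leave votes behind without their metadata.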


Next step is getting rate limiting activated on SUMO; see bug 785850.
Comment 12 Ricky Rosario [:rrosario, :r1cky] 2013-02-28 10:16:54 PST
I'll test the SQL out and add it to a migration for deploying.
Comment 13 Ricky Rosario [:rrosario, :r1cky] 2013-02-28 10:52:59 PST
Willkg convinced me to just ask IT to do it. Filed bug 846399.
Comment 14 Stephen Donner [:stephend] - PTO; back on 5/28 2013-02-28 18:14:26 PST
BTW, I've talked to most of you, but just so you know -- we would never (at least intentionally, and definitely not in this case) run automation that would change values in production.
Comment 15 Ricky Rosario [:rrosario, :r1cky] 2013-03-01 08:05:49 PST
The votes have been deleted.
Comment 16 Stephen Donner [:stephend] - PTO; back on 5/28 2013-03-05 12:20:51 PST
Meant to comment here, but just so everyone is aware: this IP/DDOS wasn't an accidental or malicious event from Web QA's side -- the attacker just happened to use NetSparker, a tool we (I, mostly) use quite often.
Comment 17 Verdi [:verdi] 2013-03-09 10:33:47 PST
Created attachment 723099 [details]
Spike in votes

(In reply to Ricky Rosario [:rrosario, :r1cky] from comment #15)
> The votes have been deleted.

I still see spikes on some articles (see attachments).
Comment 18 Verdi [:verdi] 2013-03-09 10:34:21 PST
Created attachment 723100 [details]
Another spike
Comment 19 Ibai Garcia [:ibai] 2013-03-11 16:09:17 PDT
The votes for these two articles come from this machine:

Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)

This matches the articles that I referred to in my previous comment. It seems that we removed part of the effect, but not all of it. And, as in the other case, the votes don't show up on GA, so they may be coming from a script or something similar.

I can't understand how somebody can be doing this....
Comment 20 Kadir Topal [:atopal] 2013-03-18 17:33:20 PDT
Apparently the Netsparker votes had not been removed. They are now. But also, we are adding rate limiting, so at least in the future this should not be an issue anymore. Unless people are launching a sophisticated attack.
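
For readers unfamiliar with the idea, rate limiting can be sketched as a sliding-window counter per client. This is a generic illustration and not the mechanism SUMO adopted: the class name, the choice of key (for example a client IP), and the limits are all assumptions.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_hits` events per key within `window_seconds`."""

    def __init__(self, max_hits, window_seconds):
        self.max_hits = max_hits
        self.window = window_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of accepted hits

    def allow(self, key, now):
        """Return True and record the hit if `key` is under its limit at `now`."""
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_hits:
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    # Hypothetical policy: 3 votes per 60 seconds per client.
    limiter = SlidingWindowLimiter(max_hits=3, window_seconds=60)
    for t in (0, 1, 2, 3, 60):
        print(t, limiter.allow("203.0.113.7", t))
```

A limiter like this caps how fast a single client can add votes, but a script that spreads requests across many addresses would still get through, which matches the caveat about a more sophisticated attack.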
