Closed Bug 1173112 Opened 9 years ago Closed 9 years ago

Deploy Infernyx 0.1.41 to Stage

Categories

(Content Services Graveyard :: Tiles: Ops, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

Status: VERIFIED FIXED
Iteration: 41.3 - Jun 29

People

(Reporter: tspurway, Unassigned)

Details

(Whiteboard: .?)

This release contains the IP blacklist jobs for detecting suspicious IPs and daily jobs to negate metrics attributed to blacklisted IPs.
Iteration: --- → 41.3 - Jun 29
Whiteboard: .?
Note that this release is dependent on migrations in Splice 1.1.19
Does this require that Splice be deployed *first*, or will it simply do nothing until Splice 1.1.19 is deployed?
:relud, it depends on the database migrations in 1.1.19, which I believe have already been deployed to stage
Yes, the migrations for Splice 1.1.19 are already in stage.

Building/deploying Infernyx tag 0.1.41 via Jenkins, assuming the version of disco hasn't changed from 0.5.4.
This is deployed into stage.  Tim, please test that it behaves as you would expect (and it would be nice if you could write up what you use to test it, as well) and then mark this bug VERIFIED when you're satisfied that it works as it's supposed to.
Status: NEW → RESOLVED
Closed: 9 years ago
Flags: needinfo?(tspurway)
Resolution: --- → FIXED
This release contains two new 'daily' rules (so called because they run once per day on the previous day's data): ip_click_counter and blacklisted_impression_stats.

To test these (manually), you first need some 'fraudulent' data, which would be repeated clicks and/or impressions from the same IP address. You might start with a record like:

{"tiles": [{"score": 3}, {"score": 3}, {"score": 1}, {"score": 1}, {"score": 1}, {}, {}, {}, {}, {"id": 469}, {"id": 470}, {"id": 471}, {"id": 472}, {"id": 473}, {"id": 474}], "locale": "es-ES", "ip": "92.186.132.150", "timestamp": 1424476606289, "date": "2015-06-15", "ua": "Mozilla/5.0 (Windows NT 6.3; rv:35.0) Gecko/20100101 Firefox/35.0", "click": 9}

(The record above uses a randomized IP for privacy; choose a date that is less than 7 days old.) Duplicate it a bunch of times into a file called 'the_test_file' (the name is arbitrary). We consider a fraudulent IP to be one that produces more than 50 clicks or more than 150 impressions in a single day, so make sure your duplicates cross one of those thresholds, and also put a bunch of other records in the file with different IPs that stay below them (a sketch for generating such a file follows the command below). Then push the file into a temporary tag in DDFS with the following command (run on infernyx_tiles.stage.mozaws.net):

ddfs chunk test:impression:<date> ./the_test_file

Replace <date> with the date you chose in the test file. Let's assume the tag is 'test:impression:2015-06-15'.
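
For reference, here is a minimal Python sketch for generating such a test file. It assumes, as the sample record suggests, one JSON record per line; the template is a trimmed copy of the record above, and the extra IPs and record counts are arbitrary illustrative choices:

import json

# trimmed version of the sample record above; the date must be less than 7 days old
TEMPLATE = {"tiles": [{"score": 3}, {"id": 469}], "locale": "es-ES",
            "timestamp": 1424476606289, "date": "2015-06-15",
            "ua": "Mozilla/5.0 (Windows NT 6.3; rv:35.0) Gecko/20100101 Firefox/35.0",
            "click": 9}

with open("the_test_file", "w") as f:
    # one 'fraudulent' IP: 60 click records in a single day, over the 50-click threshold
    for _ in range(60):
        f.write(json.dumps(dict(TEMPLATE, ip="92.186.132.150")) + "\n")
    # several 'honest' IPs with 10 records each, well below both thresholds
    for i in range(1, 6):
        for _ in range(10):
            f.write(json.dumps(dict(TEMPLATE, ip="10.0.0.%d" % i)) + "\n")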

Now, run the 'ip_click_counter' Inferno rule on this data by invoking the command:

inferno -i rules.ip_click_counter -t test:impression:2015-06-15

After the rule runs, you can check that it inserted the 'fraudulent' IPs into the blacklisted_ips table by running the following SQL on redshift.tiles.stage.mozaws.net:

select * from blacklisted_ips;

You should then run the rule again and ensure that duplicate IP addresses are entered (you want to test that the DB allows duplicate IP addresses).
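
A quick way to confirm the duplicates after the second run, assuming the table keeps the address in a column named ip (the column name is a guess):

select ip, count(*) from blacklisted_ips group by ip having count(*) > 1;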

Keep track of your current database impression and click counts with this SQL:

select sum(impressions), sum(clicks) from impression_stats_daily;

You should then insert the fraudulent impressions from this new tag into the database by issuing:

inferno -i rules.impression_stats_daily -t test:impression:2015-06-15

Re-run the SQL for counting impressions and clicks - you should verify that your fraudulent data has been recorded (the sums should have grown by the totals in your test file).

Then we want to run the blacklisted_impression_stats rule, which will insert *negative* impressions and clicks into the database to account for fraudulent data:

inferno -i rules.blacklisted_impression_stats -t test:impression:2015-06-15
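
Assuming the negated rows land in the same impression_stats_daily table (a reasonable inference, since the counting SQL above runs against it), you can also eyeball the negative entries directly:

select * from impression_stats_daily where clicks < 0;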

Re-run the SQL for counting impressions and clicks - you should verify that the fraudulent data has been negated: the totals should match what they were before you inserted the fraudulent records, apart from any below-threshold test records, which are not negated.
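
To scope that check to the test day only, assuming impression_stats_daily has a date column (the column name is a guess, quoted since date is a reserved word in Redshift):

select sum(impressions), sum(clicks) from impression_stats_daily where "date" = '2015-06-15';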
Status: RESOLVED → VERIFIED
Flags: needinfo?(tspurway)