Closed Bug 1188029 Opened 9 years ago Closed 7 years ago

[meta] Scan posted revisions with Akismet

Categories

(developer.mozilla.org Graveyard :: Editing, defect)

All
Other
defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: groovecoder, Assigned: jwhitlock)

References

Details

(Keywords: in-triage, Whiteboard: [specification][type:change])

What feature should be changed? Please provide the URL of the feature if possible.
==================================================================================
New Document: https://developer.mozilla.org/en-US/docs/new
Edit Document: e.g., https://developer.mozilla.org/en-US/docs/Web/HTTP/Access_control_CORS$edit

What problems would this solve?
===============================
Recent & on-going spike in spam titles

Who would use this?
===================
Everyone creating or editing an article on MDN

What would users see?
=====================
When a user submits a revision with a title that is flagged by Akismet as spam, return a form validation error.

What would users do? What would happen as a result?
===================================================
Legitimate users will create or edit articles as usual and will see no change.

Spammers will create or edit articles and get a form validation error.

If Akismet is down or does not respond promptly, we will ignore it and save the revision anyway.
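
A minimal sketch of the behaviour requested above, not the Kuma implementation. The names `AkismetUnavailable`, `revision_errors`, and `check_spam` are hypothetical and exist only for illustration; `check_spam` stands in for whatever call reaches Akismet.

```python
class AkismetUnavailable(Exception):
    """Hypothetical error: Akismet is down or did not answer in time."""


def revision_errors(title, check_spam):
    """Return the form validation errors for a submitted revision title.

    check_spam is an assumed callable that returns True for spam,
    False for ham, and raises AkismetUnavailable on outage or timeout.
    """
    try:
        is_spam = check_spam(title)
    except AkismetUnavailable:
        # Akismet is ignored when unavailable, so the revision is saved anyway.
        return []
    if is_spam:
        # Spammers get a form validation error instead of a saved revision.
        return ["This revision was flagged as spam and was not saved."]
    # Legitimate users see nothing unusual.
    return []
```

The design choice here is to fail open: an Akismet outage never blocks a legitimate edit, at the cost of letting spam through while the service is unreachable.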

Is there anything else we should know?
======================================
Assignee: nobody → jezdez
Status: NEW → ASSIGNED
See Also: → 1124358
Component: General → Editing
Depends on: 1188033
Blocks: 1188038
I don't think we should send only titles to Akismet. I can think of no reason to exclude any non-PII fields that might make Akismet more accurate; doing so arbitrarily caps Akismet's effectiveness. The Akismet API docs[0] are very clear about this:

"The more data you send with each comment check, the better chance Akismet has of avoiding missed spam and false positives."

But that's just an opinion. For the sake of scientific rigor, how about a quick experiment? In tests where we send the body of the article, Akismet has identified spam at approx. 80% accuracy and ham at approx. 98% accuracy. Just now, when I removed the article body from the payload (still sending the title, the IP, the time of day, and the username), it identified spam with 3% accuracy and ham with 100% accuracy.

I don't believe that is the effect we want. Let's give Akismet what it needs to achieve what we intend.

[0] http://akismet.com/development/api/#detailed-docs
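
For illustration, the two payloads compared in the experiment above might be built roughly as follows. This is a sketch under assumptions: the keys are parameter names from Akismet's comment-check API, but the function `build_payload`, its arguments, and the choice to place the title and body together in `comment_content` are hypothetical and not taken from the Kuma code.

```python
from datetime import datetime, timezone


def build_payload(title, body, user_ip, username, submitted_at,
                  include_body=True):
    """Build a dict of comment-check parameters for one revision.

    submitted_at is assumed to be a timezone-aware datetime.
    """
    # Assumption: the title travels in comment_content, with the body
    # appended when it is included.
    content = f"{title}\n{body}" if include_body else title
    return {
        "user_ip": user_ip,
        "comment_author": username,
        # Akismet expects this timestamp in UTC, ISO 8601 format.
        "comment_date_gmt": submitted_at.astimezone(timezone.utc).isoformat(),
        "comment_content": content,
    }
```

Calling it once with `include_body=True` and once with `include_body=False` gives the "with body" and "without body" variants of the same submission.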
I agree with Justin. The only reason we talked about titles alone was to be minimalist. If there is no significant performance burden, we should send as much information as possible, as long as accuracy does not suffer.
Severity: normal → major
Keywords: in-triage
Summary: New & Edit revisions: send Titles thru Akismet → [meta] Scan posted revisions with Akismet
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/da9ad8caf07e331fe66c74df8b0e96ecd170d9a3
Bug 1188029 - Upgrade django-constance.

This also gets rid of our custom test utility to override constance config values.

https://github.com/mozilla/kuma/commit/a21c8b63886902dade64c775b15692ca308439a7
Bug 1188029 - Stop using custom test client wrapper.

https://github.com/mozilla/kuma/commit/8e27d6a8776bd561dfd5db66cdb5dcc3f98a302a
Bug 1188029 - Stylistic and cosmetic code cleanup.

https://github.com/mozilla/kuma/commit/1c22d9e7060d3a5dc9c4315f8624a8dd02e60ea3
Bug 1188029 - Use native taggit API to get tag names.

https://github.com/mozilla/kuma/commit/6386ba7d44dab3bbb13b4f1494fe06dbd78fa999
Merge pull request #3566 from mozilla/bug1188029-cleanup

Some cleanup commits from #3482.
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/928b1e5eda3b0b3f05f74ffdd7e4bad2f49d60d1
Bug 1188029 - Add Akismet based spam checks.

Fix bug 1194766 - Create new "Trusted Writers" group.

Fix bug 1203989 - Integrate form validation error into frontend code.

Fix bug 1194771 - Use Akismet's "comment check" API on incoming revisions.

Bug 1215199 - Draft email to be sent when Akismet check fails.

https://github.com/mozilla/kuma/commit/853a50f03ca8799ce7d5dab24d80d510ad54109a
Merge pull request #3482 from mozilla/bug1188029

Bug 1188029 - Spam prevention via Akismet. r=jwhitlock r=robhudson.
Depends on: 1221967
The revision form code sends the content to Akismet (when each of the several settings that enable the check is on), but does not stop submission when Akismet identifies the content as spam.

In kuma/spam/akismet.py [1], check_comment returns a boolean for the spam check, and raises an exception if things went wrong. According to the Akismet docs [2], True means the comment appears to be spam, False means it is not spam.

The AkismetCheckFormMixin, used in the revision form, calls check_comment, but does nothing with the response [3]. If an exception is raised, then a form validation error occurs and the submission stops [4]. Similar code could be used to stop new content when Akismet identifies it as spam.

[1] https://github.com/mozilla/kuma/blob/master/kuma/spam/akismet.py#L140-L183
[2] https://akismet.com/development/api/#comment-check
[3] https://github.com/mozilla/kuma/blob/master/kuma/spam/forms.py#L79-L83
[4] https://github.com/mozilla/kuma/blob/master/kuma/spam/forms.py#L32-L37
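
A rough sketch of the change suggested above, using stand-ins rather than the real Kuma or Django classes. `ValidationError`, `AkismetError`, and `SpamCheckSketch` are hypothetical names defined here for illustration; only the contract of `check_comment` (True means spam, False means not spam, exception on failure) comes from the description above.

```python
class ValidationError(Exception):
    """Stand-in for a form validation error (hypothetical)."""


class AkismetError(Exception):
    """Stand-in for the exception raised when the check goes wrong."""


class SpamCheckSketch:
    """Illustrative form-like object that acts on the spam verdict."""

    def __init__(self, check_comment):
        # check_comment: callable returning True (spam) or False (ham),
        # raising AkismetError if the request failed.
        self.check_comment = check_comment

    def clean(self, params):
        try:
            is_spam = self.check_comment(params)
        except AkismetError:
            # Behaviour described above: an error stops the submission.
            raise ValidationError("The spam check could not be completed.")
        if is_spam:
            # The missing step: use the return value instead of dropping it.
            raise ValidationError("The submission was identified as spam.")
        return params
```

The only new logic compared with the behaviour described above is the `if is_spam` branch, which reuses the same validation-error path that already handles exceptions.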
I have found that the filter is not blocking spam from being posted to the public on MDN.

I logged in as user SheppyNoob (who is unprivileged). I then created a new page and pasted in the contents of a spam message I had received. When I clicked the save button, the spammy text was saved normally and the page was created with the spam presented on it.

Instead, I should be getting a rejection message when I click save, and no page should be created.
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/25c0bb023f9cc5e387666c37f75ab078bb938d00
Bug 1188029 - Fix API issue in Akismet spam check.

This fixes a design issue in the spam package that was overlooked and never surfaced in the unit tests, since they only test the error cases and not specifically the success cases.

https://github.com/mozilla/kuma/commit/095a1b0b760d531632fdaa7389f1b173ce522d82
Merge pull request #3781 from mozilla/bug1188029-fixup

Bug 1188029 - Fix API issue in Akismet spam check.
Depends on: 1255609
Unassigning myself since it's time.
Assignee: jezdez → nobody
Status: ASSIGNED → NEW
Assignee: nobody → jwhitlock
Depends on: 1257003
Depends on: 1259151
Depends on: 1259173
Depends on: 1259233
Depends on: 1259870
Blocks: 1260253
Depends on: 1260197
Depends on: 1260832
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/3f7d3d87eb6d02da9efbd8dfd8a9d71eba7a7145
bug 1188029 - Capture Akismet error data

If Akismet returns an error from check-comment, continue to block the
edit, but save the details of the error for debugging, and adjust the
notification sent to the spam watchers list.
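
One way the behaviour in this commit message could look, as a sketch only. The class `AkismetError`, its attributes, the function `check_submission`, and the shape of the returned dict are all hypothetical; the actual commit may store and report the error quite differently.

```python
class AkismetError(Exception):
    """Hypothetical error carrying details of a failed comment check."""

    def __init__(self, status_code, body):
        super().__init__(f"Akismet returned an error (HTTP {status_code})")
        self.status_code = status_code
        self.body = body


def check_submission(params, check_comment, error_log):
    """Decide whether to block an edit, keeping error details for debugging.

    error_log is an assumed list-like store for captured error data.
    """
    try:
        is_spam = check_comment(params)
    except AkismetError as exc:
        # Save the details so the failure can be investigated later.
        error_log.append({"status_code": exc.status_code, "body": exc.body})
        # The edit is still blocked; the reason lets the notification
        # distinguish an Akismet error from a spam verdict.
        return {"blocked": True, "reason": "error"}
    if is_spam:
        return {"blocked": True, "reason": "spam"}
    return {"blocked": False, "reason": None}
```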

https://github.com/mozilla/kuma/commit/e1d3502acb07fd8a23ab920954f04c6715a908d0
Merge pull request #3826 from mozilla/akismet-errors-1188029

bug 1188029 - Capture Akismet error data

+r jezdez
No longer blocks: 1260253
Depends on: 1265495
Depends on: 1267713
Depends on: 1268511
Depends on: 1271282
All dependent bugs are closed.  There's some additional work to avoid over-zealous checks with Akismet (bug 1358541), but I think we can call the initial effort closed.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Product: developer.mozilla.org → developer.mozilla.org Graveyard