Closed Bug 1188029 Opened 6 years ago Closed 4 years ago
[meta] Scan posted revisions with Akismet
What feature should be changed? Please provide the URL of the feature if possible.
==================================================================================
New Document: https://developer.mozilla.org/en-US/docs/new
Edit Document: e.g., https://developer.mozilla.org/en-US/docs/Web/HTTP/Access_control_CORS$edit

What problems would this solve?
===============================
Recent and ongoing spike in spam titles

Who would use this?
===================
Everyone creating or editing an article on MDN

What would users see?
=====================
When a user submits a revision with a title that is flagged by Akismet as spam, return a form validation error.

What would users do? What would happen as a result?
===================================================
Legitimate users will create or edit articles and nothing will happen. Spammers will create or edit articles and get a form validation error. If Akismet is down or does not respond in time, we will ignore it and save the revision anyway.

Is there anything else we should know?
======================================
I don't think we should send only titles to Akismet. I can think of no reason to exclude any non-PII fields that might make Akismet more accurate; doing so arbitrarily caps Akismet's effectiveness. The Akismet API docs are very clear about this: "The more data you send with each comment check, the better chance Akismet has of avoiding missed spam and false positives." But that's just an opinion. For the sake of scientific rigor, how about a quick experiment? In tests where we send the body of the article, Akismet has identified spam at approx. 80% accuracy and ham at approx. 98% accuracy. Just now, when I removed the article body from the payload (still sending the title, the IP, the time of day, and the username), it identified spam with 3% accuracy and ham with 100% accuracy. I don't believe that is the effect we want. Let's give Akismet what it needs to achieve what we intend. Akismet API docs: http://akismet.com/development/api/#detailed-docs
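To make the difference between the two payloads concrete, here is a minimal sketch of what "title only" versus "title plus body" could look like. The field names (`blog`, `user_ip`, `user_agent`, `comment_type`, `comment_author`, `comment_content`) are taken from Akismet's comment-check parameters as I understand them and should be checked against the docs linked above. The function name, the `wiki-revision` type label, and the way title and body are joined are assumptions for illustration, not what kuma does.

```python
def build_comment_check_payload(site_url, user_ip, user_agent, author, title, body):
    """Build the form fields for an Akismet comment-check request.

    Hypothetical helper: joining title and body into comment_content is an
    assumption, not something taken from the kuma code base.
    """
    payload = {
        "blog": site_url,
        "user_ip": user_ip,
        "user_agent": user_agent,
        "comment_type": "wiki-revision",  # hypothetical label
        "comment_author": author,
        # Send every non-empty piece of text we have, not just the title.
        "comment_content": "\n".join(part for part in (title, body) if part),
    }
    # Drop empty values so we never send blank fields.
    return {key: value for key, value in payload.items() if value}


if __name__ == "__main__":
    with_body = build_comment_check_payload(
        "https://developer.mozilla.org", "203.0.113.5", "Mozilla/5.0",
        "someuser", "A title", "<p>Article body</p>",
    )
    print(sorted(with_body))
```

Passing an empty `body` reproduces the title-only payload from the experiment, so the two variants can be compared against the same set of known spam and ham revisions.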
I agree with Justin. The only reason we talked only about title was to be minimalist. If there is no significant performance burden, we should send as much information as possible (if accuracy is not impacted).
Severity: normal → major
Summary: New & Edit revisions: send Titles thru Akismet → [meta] Scan posted revisions with Akismet
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/da9ad8caf07e331fe66c74df8b0e96ecd170d9a3
Bug 1188029 - Upgrade django-constance.
This also gets rid of our custom test utility to override constance config values.

https://github.com/mozilla/kuma/commit/a21c8b63886902dade64c775b15692ca308439a7
Bug 1188029 - Stop using custom test client wrapper.

https://github.com/mozilla/kuma/commit/8e27d6a8776bd561dfd5db66cdb5dcc3f98a302a
Bug 1188029 - Stylistic and cosmetic code cleanup.

https://github.com/mozilla/kuma/commit/1c22d9e7060d3a5dc9c4315f8624a8dd02e60ea3
Bug 1188029 - Use native taggit API to get tag names.

https://github.com/mozilla/kuma/commit/6386ba7d44dab3bbb13b4f1494fe06dbd78fa999
Merge pull request #3566 from mozilla/bug1188029-cleanup
Some cleanup commits from #3482.
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/928b1e5eda3b0b3f05f74ffdd7e4bad2f49d60d1
Bug 1188029 - Add Akismet based spam checks.
Fix bug 1194766 - Create new "Trusted Writers" group.
Fix bug 1203989 - Integrate form validation error into frontend code.
Fix bug 1194771 - Use Akismet's "comment check" API on incoming revisions.
Bug 1215199 - Draft email to be sent when Akismet check fails.

https://github.com/mozilla/kuma/commit/853a50f03ca8799ce7d5dab24d80d510ad54109a
Merge pull request #3482 from mozilla/bug1188029
Bug 1188029 - Spam prevention via Akismet. r=jwhitlock r=robhudson.
The revision form code sends the content to Akismet (when several layers of enabling are on), but does not stop submission when Akismet identifies the content as spam. In kuma/spam/akismet.py [1], check_comment returns a boolean for the spam check, and raises an exception if things went wrong. According to the Akismet docs [2], True means the comment appears to be spam, and False means it is not spam. The AkismetCheckFormMixin, used in the revision form, calls check_comment but does nothing with the response [3]. If an exception is raised, a form validation error occurs and the submission stops [4]. Similar code could be used to stop new content when Akismet identifies it as spam.

[1] https://github.com/mozilla/kuma/blob/master/kuma/spam/akismet.py#L140-L183
[2] https://akismet.com/development/api/#comment-check
[3] https://github.com/mozilla/kuma/blob/master/kuma/spam/forms.py#L79-L83
[4] https://github.com/mozilla/kuma/blob/master/kuma/spam/forms.py#L32-L37
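A rough sketch of the missing step, for illustration only. It assumes nothing about the real kuma classes beyond what is described above: `check_comment` returns True for spam and raises when the check itself fails. The `ValidationError` and `AkismetError` classes, the stub `check_comment`, and `clean_revision` are hypothetical stand-ins so the example runs without Django or a network connection.

```python
class AkismetError(Exception):
    """Hypothetical stand-in for the exception raised when the check fails."""


class ValidationError(Exception):
    """Stand-in for a form validation error, so the sketch runs without Django."""


def check_comment(content):
    """Stub check: True means spam, False means ham, raises on failure."""
    if content is None:
        raise AkismetError("Akismet could not be reached")
    return "buy cheap" in content.lower()


def clean_revision(content):
    """Stop the submission on a failed check AND on a spam verdict."""
    try:
        is_spam = check_comment(content)
    except AkismetError as exc:
        # Existing behavior described above: a failed check blocks the save.
        raise ValidationError("Spam check failed: %s" % exc)
    if is_spam:
        # The missing piece: act on the boolean instead of discarding it.
        raise ValidationError("This revision looks like spam.")
    return content


if __name__ == "__main__":
    print(clean_revision("A perfectly ordinary edit"))
```

The only change relative to the behavior described above is the `if is_spam` branch; everything else mirrors the existing "exception means validation error" path.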
I have found that the filter is not blocking spam from being posted to the public on MDN. I logged in as user SheppyNoob (who is unprivileged), created a new page, and pasted in the contents of a spam message I had received. When I clicked the save button, the spammy text was saved normally and the page was created with the spam presented on it. Instead, I should have gotten a rejection message when I clicked save, and no page should have been created.
There is https://github.com/mozilla/kuma/pull/3781 now
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/25c0bb023f9cc5e387666c37f75ab078bb938d00
Bug 1188029 - Fix API issue in Akismet spam check.
This fixes a design issue in the spam package that was overlooked and never surfaced as part of the unit tests, since they only test the error cases and not specifically the success cases.

https://github.com/mozilla/kuma/commit/095a1b0b760d531632fdaa7389f1b173ce522d82
Merge pull request #3781 from mozilla/bug1188029-fixup
Bug 1188029 - Fix API issue in Akismet spam check.
Unassigning myself since it's time.
Assignee: jezdez → nobody
Status: ASSIGNED → NEW
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/3f7d3d87eb6d02da9efbd8dfd8a9d71eba7a7145
bug 1188029 - Capture Akismet error data
If Akismet returns an error from check-comment, continue to block the edit, but save the details of the error for debugging, and adjust the notification sent to the spam watchers list.

https://github.com/mozilla/kuma/commit/e1d3502acb07fd8a23ab920954f04c6715a908d0
Merge pull request #3826 from mozilla/akismet-errors-1188029
bug 1188029 - Capture Akismet error data +r jezdez
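The behavior described in that commit message (block the edit on an Akismet error, but keep the error details) could look roughly like the sketch below. This is not the code from the pull request. `AkismetError`, its `status_code` and `response_body` attributes, `CheckResult`, and `check_revision` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


class AkismetError(Exception):
    """Hypothetical error carrying whatever the failed request returned."""

    def __init__(self, message, status_code=None, response_body=None):
        super().__init__(message)
        self.status_code = status_code
        self.response_body = response_body


@dataclass
class CheckResult:
    allowed: bool                 # may the edit be saved?
    reason: str                   # "ham", "spam", or "error"
    error_details: dict = field(default_factory=dict)


def check_revision(content, check_comment):
    """Run the spam check; on error, block the edit and keep the details."""
    try:
        is_spam = check_comment(content)
    except AkismetError as exc:
        details = {
            "message": str(exc),
            "status_code": exc.status_code,
            "response_body": exc.response_body,
        }
        # Still blocked, but the details survive for debugging and for
        # wording the notification differently from a real spam verdict.
        return CheckResult(allowed=False, reason="error", error_details=details)
    if is_spam:
        return CheckResult(allowed=False, reason="spam")
    return CheckResult(allowed=True, reason="ham")


if __name__ == "__main__":
    def failing_check(content):
        raise AkismetError("bad response", status_code=500, response_body="oops")

    print(check_revision("some edit", failing_check))
```

Keeping `reason` separate from `allowed` is what lets a notification distinguish "Akismet said spam" from "Akismet broke", even though both block the save.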
All dependent bugs are closed. There's some additional work to avoid over-zealous checks with Akismet (bug 1358541), but I think we can call the initial effort closed.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Product: developer.mozilla.org → developer.mozilla.org Graveyard