What problem would this feature solve?
======================================
When Akismet incorrectly identifies an edit as spam (a "false positive" in spam-fighting terminology), there is no way to tell Akismet that it was incorrect. A method is needed so that staff can train Akismet and reduce the false positive rate.

Who has this problem?
=====================
All contributors to MDN

How do you know that the users identified above have this problem?
==================================================================
Non-staff contributors have had valid edits blocked by the spam check, and told us about it. From February 18th to March 10th, Akismet blocked 324 contributions and allowed 6174. Some portion of the 324 blocked contributions are false positives.

How are the users identified above solving this problem now?
============================================================
Some complain to the mdn-admins list [1], as suggested by the message they get for blocked contributions. Others have opened bugs like bug 1254180 to request edits.

[1] https://mail.mozilla.org/private/mdn-admins/ (archives are private to list members)

Do you have any suggestions for solving the problem? Please explain in detail.
==============================================================================
When Akismet marks a revision as spam, store information about the attempted commit in the database, in the DocumentSpamAttempt table. Create a dashboard so that MDN staff can review the details, and either confirm Akismet's decision or identify the change as "Ham". Send the "Ham" results to Akismet to further train spam detection and reduce future false positives.

The spam submission information will contain sensitive information, such as IP addresses. The dashboard should be restricted to staff, and the data deleted 1) when a staff member has confirmed or denied the decision, and 2) after a short time period. It should be omitted from any anonymized databases.
In order to get it deployed quickly, the first version will show the raw content as submitted, and not a diff against the previous content. Interface improvements can be made in future iterations.

Is there anything else we should know?
======================================
WordPress uses Akismet for checking comments. The official Akismet plugin [2] stores all comments in the database, and marks the ones that Akismet identified as spam. The blog admin can then review the comments and decide whether a comment is legitimate (false positive) or should have been marked as spam (false negative). The admin's decision is sent to Akismet for training, and the blog's displayed contents are updated accordingly.

Our use case, checking shared wiki content, is different from blog comments. We can train Akismet in order to reduce the false positive rate, but it is hard or impossible to re-submit the content so that it displays on MDN as if it had never been blocked. We will have to ask the user to attempt to resubmit their edit.

[2] https://plugins.svn.wordpress.org/akismet/trunk/class.akismet-admin.php
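The two deletion rules proposed in the suggestions section can be sketched roughly as follows. This is only an illustration and not kuma code: the `SpamAttempt` class, the `should_drop_data` and `purge` helpers, and the 30-day `RETENTION` window are all assumptions (the bug only says "a short time period").

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical retention window; the bug only says "a short time period".
RETENTION = timedelta(days=30)


@dataclass
class SpamAttempt:
    """Illustrative stand-in for a DocumentSpamAttempt row."""
    created: datetime
    data: Optional[str]                  # submission details, may include an IP address
    reviewed: Optional[datetime] = None  # set once staff confirm or reject the decision


def should_drop_data(attempt: SpamAttempt, now: datetime) -> bool:
    """Return True when the sensitive submission data should be deleted."""
    if attempt.data is None:
        return False  # nothing left to delete
    if attempt.reviewed is not None:
        return True   # rule 1: a staff member has made a decision
    return now - attempt.created > RETENTION  # rule 2: the record is too old


def purge(attempts, now: datetime) -> int:
    """Clear sensitive data in place and return how many records were cleared."""
    dropped = 0
    for attempt in attempts:
        if should_drop_data(attempt, now):
            attempt.data = None
            dropped += 1
    return dropped
```

Keeping the decision in a small pure function makes it easy to call from both a periodic task and the review path.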
2 years ago
Assignee: nobody → jwhitlock
Status: NEW → ASSIGNED
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/ded0f661d707a5279c1082527a48c1d305b3ab50
bug 1255609 - Enable training on false positives

Add additional data to wiki's DocumentSpamAttempt:
* data - JSON-serialized data sent to Akismet
* review - Staff review status (default 'Needs Review')
* reviewer - Staff reviewer
* reviewed - Review date

Old DocumentSpamAttempts do not have this additional data, so their status is set to "Review Unavailable".

Update the Django admin for DocumentSpamAttempt so that it works better as a dashboard for second-guessing Akismet:
* Limit the length of title and slug on the list display
* Allow filtering by "Needs Review", to treat the admin as a review backlog.
* Add an editable dropdown to quickly review multiple edits from known users
* When a review of "Spam" is chosen, it confirms Akismet's choice, and no further API call is needed.
* When a review of "Ham / False Positive" is chosen, submit the original data to Akismet's submit_ham API endpoint. On failure, revert to unreviewed status so it can be tried again.

Add a management command and task to drop submission data for old records, saving database resources and removing unneeded PII.

Update the anonymization scripts to drop submission data.

https://github.com/mozilla/kuma/commit/0c8efe4778e72e9c9229c27d7a7ee330b559fb16
Merge pull request #3816 from mozilla/spam_dashboard_1255609

bug 1255609 - Enable training on false positives r=jezdez,robhudson
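The review behaviour described in the commit message above (no API call for "Spam", submit and possibly revert for "Ham / False Positive") might look something like the sketch below. It is not the code from the commit: the `SpamAttempt` class and `apply_review` function are hypothetical, and `submit_ham` is an assumed callable that posts the stored data to Akismet and returns True on success.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Review states named in the commit message.
NEEDS_REVIEW = "Needs Review"
SPAM = "Spam"
HAM = "Ham / False Positive"


@dataclass
class SpamAttempt:
    """Illustrative stand-in for a DocumentSpamAttempt row."""
    data: str                      # JSON-serialized data originally sent to Akismet
    review: str = NEEDS_REVIEW
    reviewer: Optional[str] = None


def apply_review(attempt: SpamAttempt, choice: str, reviewer: str,
                 submit_ham: Callable[[str], bool]) -> bool:
    """Record a staff review and return True if the review was stored.

    submit_ham is a hypothetical callable that sends the data to Akismet
    and returns True on success.
    """
    if choice == SPAM:
        # Confirms Akismet's decision, so no API call is needed.
        attempt.review, attempt.reviewer = SPAM, reviewer
        return True
    if choice == HAM:
        try:
            ok = submit_ham(attempt.data)
        except Exception:
            ok = False  # treat any error from the submitter as a failed submission
        if not ok:
            # Revert to unreviewed so the submission can be tried again.
            attempt.review, attempt.reviewer = NEEDS_REVIEW, None
            return False
        attempt.review, attempt.reviewer = HAM, reviewer
        return True
    raise ValueError("unknown review choice: %r" % (choice,))
```

Passing the submitter in as a callable keeps the state handling testable without network access.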
A few issues are apparent from testing in staging:
* The Document column in the list view is too wide to fit in a desktop browser, and needs to be limited
* Reviewing a spam attempt sends a second email to the mailing list

I'll do this work against this bug before closing it.
Commits pushed to master at https://github.com/mozilla/kuma

https://github.com/mozilla/kuma/commit/5c584dda13eeccec059e88515713fbb317917963
bug 1255609 - Don't send email on spam review

Only send an email when a DocumentSpamAttempt is created (when a contributor's edit is blocked as spam), instead of when the DocumentSpamAttempt is updated (such as when a reviewer submits an edit as ham). Also, reuse the constance config EMAIL_LIST_FOR_FIRST_EDITS.

https://github.com/mozilla/kuma/commit/0abdf2bb9a4659754d6a5cf4232103eadcc7a1a3
bug 1255609 - Doc in DocumentSpamAttempt admin

In the DocumentSpamAttempt admin list, shorten the document name so that there are no more than 25 characters before a line break. This will help make it fit well in a desktop browser.

https://github.com/mozilla/kuma/commit/d40a53fd00db94569ea4268b97eb0fbe2996d066
bug 1255609 - Rename to EMAIL_LIST_SPAM_WATCH

Rename the constance config "EMAIL_LIST_FOR_FIRST_EDITS" to the more generic "EMAIL_LIST_SPAM_WATCH". Production and staging have not customized it, so this should be a safe change.

https://github.com/mozilla/kuma/commit/5167c6ebc611eb58352be1b774dcce742dd6f9a0
Merge pull request #3825 from mozilla/spam_dashboard_fixes_1255609

bug 1255609 - Fixes for spam dashboard
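The "email only on creation" fix described above can be illustrated with a handler shaped like a Django post_save receiver, which is passed a `created` flag that is True only for the first save. This is a sketch under assumptions, not the code from the commit: `notify_on_save`, the `send_mail` callable, the dict-shaped attempt, and the placeholder address are all hypothetical.

```python
# Placeholder value; in kuma the address comes from the constance config
# EMAIL_LIST_SPAM_WATCH, whose real value is not shown in this bug.
EMAIL_LIST_SPAM_WATCH = "spam-watch@example.com"


def notify_on_save(attempt, created, send_mail):
    """Send a notification only when the spam attempt is first created.

    attempt is a dict with a "title" key (illustrative), created mirrors the
    flag Django passes to post_save receivers, and send_mail is a
    hypothetical callable taking subject, body, and recipients.
    Returns True if an email was sent.
    """
    if not created:
        # Updates, such as a staff review, must not trigger a second email.
        return False
    send_mail(
        subject="Blocked edit: %s" % attempt["title"],
        body="An edit was blocked as spam and needs review.",
        recipients=[EMAIL_LIST_SPAM_WATCH],
    )
    return True
```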
Changes are deployed to production. I suggest that future changes are incorporated into an interface that isn't the Django admin.
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED