Closed Bug 598060 Opened 14 years ago Closed 7 years ago

Implement changes to discussion groups configuration to fight spam

Categories

(Infrastructure & Operations :: Infrastructure: Mail, task)

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: gerv, Assigned: limed)

Details

(Whiteboard: [tracking])

After several years of relative happiness, the discussion forums now have a significant spam problem, most of it coming from Google Groups. After exhaustive discussion of the technical possibilities, the plan to fix this is here:
https://wiki.mozilla.org/Discussion_Forums/Proposal

This bug tracks the changes necessary to implement the plan. Many people from across Mozilla have impressed upon me that this is a matter of urgency.

AIUI from justdave, for the groups in question (see below), the workflow should be as follows:

* Blog that the change is about to be made, noting the possible negative 
  effects as outlined in the wiki page
* Set the Google Groups and newsgroups to "moderated" so all messages get sent 
  to us before being posted anywhere
* Set up SpamAssassin on the incoming messages (from Groups, news _and_ mail)
* Discard or bounce anything marked as spam
* Then apply a global whitelist of posters
* If the person is not in the whitelist, hold for moderation
* Otherwise, post it
* Allow moderators to free or delete held messages, and to add people to the 
  global whitelist

The global whitelist should be pre-populated with, or include, the subscribers to all the mailing lists and the existing "can post even though not a member" whitelists for each mailing list. Making this whitelist and plugging it into the system may be a small matter of programming.
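As a rough illustration of the decision order in the workflow above (a sketch only, not the actual Mailman or SpamAssassin setup): the `Action` enum, the function names, and the boolean `is_spam` input are hypothetical stand-ins for whatever the real filter chain provides.

```python
from enum import Enum


class Action(Enum):
    DISCARD = "discard"  # flagged as spam: drop (or bounce)
    HOLD = "hold"        # unknown poster: wait for a moderator
    POST = "post"        # whitelisted poster: publish immediately


def decide(sender: str, is_spam: bool, global_whitelist: set[str]) -> Action:
    """Apply the proposed order: spam check first, then the global whitelist."""
    if is_spam:
        return Action.DISCARD
    if sender.strip().lower() in global_whitelist:
        return Action.POST
    return Action.HOLD


def moderator_approve(sender: str, global_whitelist: set[str],
                      add_to_whitelist: bool) -> Action:
    """A moderator frees a held message and may whitelist its sender."""
    if add_to_whitelist:
        global_whitelist.add(sender.strip().lower())
    return Action.POST
```

Once a moderator approves a first post with `add_to_whitelist=True`, later calls to `decide` for the same sender return `Action.POST` without a hold.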

This should be applied to all discussion forums _except_:

mozilla.support.*
mozilla.feedback.*

Please don't break the moderation system on existing moderated groups like mozilla.announce. :-)

Gerv
I think you meant:

* If the poster is in the whitelist, post it
* Otherwise, hold for moderation

To that, I'd like to add that the existing filters that list moderators have
already created that filter on sender, subject, and any other header, should
then be honored, whether the message originated from email or as a news 
posting.   SpamAssassin should not be the ONLY filter.  Messages flagged as
spam by SpamAssassin should not end up in the moderation queue if the 
moderator has set up filters that would discard them for other reasons.
Assignee: server-ops → justdave
OK, so what's the next step, apart from Nelson raising the issue that existing filters should be kept available? When does Dave have time to work on this?

I am tempted to suggest first testing the new system on a guinea-pig group. I'd suggest mozilla.dev.tech.java, which hasn't received a single non-spam message since last May, and where the few messages before that did not really receive useful answers.
I'd happily volunteer mozilla.governance.mpl-update, which has troll problems.
Blocks: 602184
Spam in the newsgroups is a big problem and one I believe is hurting the project.  Dave, Matthew, can we get an ETA on this?
Got a ticket opened with Giganews on this finally today, to get a feeler on what kind of time they need to implement on their end.  If they're game, we may have this going this weekend, or perhaps Monday.
Giganews is on-board.  We can do this either Monday or Tuesday night at approximately 11pm PDT (timed just ahead of when Google polls our config so we have a minimum amount of dataloss across the switchover).  Doing it Tuesday gives us a little additional time to blog about it and such and have people know it's coming.  I'll let Gerv pick the date I guess.
Dave: I have a bunch of questions about this:

- Does this mean _any_ Monday or Tuesday, or are this coming Monday and Tuesday (18th and 19th of October) special? Why Mondays and Tuesdays particularly?

- Are you able to do all the steps outlined in the proposal at this time, including having a global posting whitelist? I'm sure some of them required code to be written - is it written?

- What mechanisms will be available for adding names to or removing names from the global posting whitelist?

- Are you planning to do all the groups in one go or just test it with a few first? It might be wiser to do the latter, so we can make sure the global posting whitelist stuff is working properly.

Both Monday and Tuesday at 11pm PDT seem somewhat quick; I'd rather give people a week's notice. But if this Monday and Tuesday are very special, then let's do it on Tuesday - and someone can announce it in the Monday meeting (I'm not there, but we can email Jono) and blog about it today.

Gerv
Again, I'd volunteer the mpl list as a guinea pig for this Tuesday, even if every other list gets a week or two notice.
(In reply to comment #7)
> - Does this mean _any_ Monday or Tuesday, or are this coming Monday and
> Tuesday (18th and 19th of October) special? Why Mondays and Tuesdays
> particularly?

They can do whenever we want.  I was aiming for Monday or Tuesday because I was under the impression we were in a hurry and those were the two upcoming days after a business day when everyone would see this bug update. :)  If we'd rather wait longer, that's fine and won't be an issue.

So how about we do like Luis is suggesting and do the mpl list tomorrow and wait a week on everything else?

> - Are you able to do all the steps outlined in the proposal at this time,
> including having a global posting whitelist? I'm sure some of them required
> code to be written - is it written?

The global whitelist based on subscribers is already in place. Including the overrides from all of the lists has not been implemented yet, but will only take a few minutes to set up.

> - What mechanisms will be available for adding names to or removing names from
> the global posting whitelist?

The current proposal was to combine the whitelists from each of the individual mailing lists.  This would be quite easy to implement (so that if someone got whitelisted for any list, they would be for all lists).  The downside of this method is if we decide someone needs to be removed from it, we would need to figure out which individual list's whitelist they were on to be able to remove them.  Perhaps we could have a global blacklist as well, which is manually maintained, that would override the global whitelist.  Doing it that way would allow us to blacklist someone globally and allow their individual whitelist on the list they got whitelisted in to still work, too (it would essentially go back to the existing behavior for that user).
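The combination described here could look roughly like the following. This is a minimal sketch under assumptions: the dict-of-lists inputs and both function names are hypothetical, and the real script reads its data from Mailman rather than from in-memory dicts.

```python
def _norm(address):
    """Compare addresses case-insensitively and ignore stray whitespace."""
    return address.strip().lower()


def build_global_whitelist(subscribers, per_list_whitelists, global_blacklist=()):
    """Union all subscriber lists and per-list whitelists, minus the blacklist.

    subscribers and per_list_whitelists map a list name to an iterable
    of email addresses.
    """
    allowed = set()
    for source in (subscribers, per_list_whitelists):
        for addresses in source.values():
            allowed.update(_norm(a) for a in addresses)
    # The manually maintained blacklist overrides everything above.
    return allowed - {_norm(a) for a in global_blacklist}


def may_post_unmoderated(sender, list_name, global_whitelist,
                         subscribers, per_list_whitelists):
    """A globally blacklisted sender falls back to that list's own settings."""
    sender = _norm(sender)
    if sender in global_whitelist:
        return True
    local = list(subscribers.get(list_name, ())) + list(
        per_list_whitelists.get(list_name, ()))
    return sender in {_norm(a) for a in local}
```

With this shape, blacklisting someone globally removes their cross-list posting rights, while the list that whitelisted them originally keeps its existing behavior for that user.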
(In reply to comment #9)
> The global whitelist based on subscribers is already in place. Including the
> overrides from all of the lists has not been implemented yet, but will only
> take a few minutes to set up.

The code hack within mailman for processing the global whitelist is already there, the only change would be the data sources used by the script which generates that whitelist (which is run hourly by a cron job).
Be aware that I do expect to need to get a particular person (not a spammer, but an aggressive troll) off the list, so if he's whitelisted on some other list, we'll need to be able to look him up and nuke him from there too :)
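A lookup of that kind could be sketched as below, assuming the per-list whitelists are available as a plain mapping from list name to addresses (a hypothetical representation; both helper names are invented for illustration).

```python
def lists_whitelisting(address, per_list_whitelists):
    """Return the sorted names of every list whose whitelist contains address."""
    target = address.strip().lower()
    return sorted(
        name
        for name, addresses in per_list_whitelists.items()
        if target in {a.strip().lower() for a in addresses}
    )


def remove_everywhere(address, per_list_whitelists):
    """Drop address from every per-list whitelist; return the lists that changed."""
    target = address.strip().lower()
    changed = lists_whitelisting(address, per_list_whitelists)
    for name in changed:
        per_list_whitelists[name] = [
            a for a in per_list_whitelists[name] if a.strip().lower() != target
        ]
    return changed
```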
If this is happening to all dev groups within the next 48 hours, I think there should be some sort of "heads up" announcement.
OK, let's kick over mozilla.governance.mpl-update on Tuesday, plus the mozilla.community.philippines list (who were also concerned and are involved in the process). And do mozilla.tools, as that's mostly just me. That's three, and that should be enough for testing.

Tomorrow morning my time, in 8 hours, I'll post a heads-up to those groups.

Gerv
Gerv: add mozilla.dev.tech.crypto to that list for tomorrow as well. (bug 577732)
Question: do list owners have any extra responsibility to manage the white list? (or other?)
Dave: We are go for switching those four groups tonight. I've posted the below message to the groups in question; please let me know if a corrective follow-up is necessary.

Please can you explain here:

- What effects your average regular reader/poster might expect to see
- What someone attempting to post for the first time will see
- How moderators are notified about things in the queue
- What they do to post or disapprove those things
- How people are added to or removed from the global whitelist

If it's not currently possible to "blacklist" people automatically, that's OK, but as Luis says, there's one particular person we'd want to remove manually.

Thanks :-)

Gerv

Message to groups:

At 11pm Pacific Time on Tuesday night (6am UTC on Wednesday morning) we are implementing[0] the new discussion forums anti-spam plan[1] on the following guinea pig groups:

mozilla.community.philippines
mozilla.governance.mpl-update
mozilla.dev.tech.crypto
mozilla.tools

This has been agreed with those responsible for those groups.

The effect you can expect to see is a longer lag between posting a message and seeing it show up - both in the mode you posted it in (mail, news or Google Group), and in other modes.

However, if you do not get an email back to the effect that your message was held for moderation, _and_ you do not see the message appear after 30 minutes, then there may be a problem. 

If you see that, or anything else strange about posting and reading behaviour in these four groups (in any mode), then please file a bug:

https://bugzilla.mozilla.org/enter_bug.cgi?product=mozilla.org&component=Discussion%20Forums

If all goes well, we will roll out the change to all the other groups (except mozilla.support.* and mozilla.feedback.*) next week.

Gerv

[0] https://bugzilla.mozilla.org/show_bug.cgi?id=598060
[1] https://wiki.mozilla.org/Discussion_Forums/Proposal
(In reply to comment #16)
> [...] we will roll out the change to all the other groups (except
> mozilla.support.* and mozilla.feedback.*) next week.

Seeing that list, I wondered if it wouldn't make sense to also keep mozilla.general non moderated.

So I sent a short warning post on that group and everyone there thinks moderation would not be a good idea (including Jay Garcia for example).
In particular, it was argued that it would impair redirecting off-topic discussions from other groups to that group.

So if the active and sensible participants in a group all believe it shouldn't be moderated, I think it's better to keep it that way, and include mozilla.general in the list of groups that won't be moderated.
While I hope my newsgroup messages are sufficiently on-topic and cogent that I would be in the global whitelist, I nevertheless support leaving mozilla.general unmoderated for the reason given in comment #17.  (I believe I may have been the first in the mozilla.general discussion of moderation to raise the issue of redirecting discussions to that newsgroup.)
If anyone didn't already notice, this change did *not* take place on Tuesday night.  The Firefox release that day, combined with my being sick for part of the day, accidentally dropped this off the plate until it was too late to sync up with Giganews in time to catch the Tuesday night Google sync.  I will try again for Wednesday night.

As for answering some of the questions that came up here in the meantime:

(In reply to comment #16)
> Please can you explain here:
> 
> - What effects your average regular reader/poster might expect to see

A slight delay in posts showing up, in most cases (5 or 10 minutes instead of 2 or 3 minutes that you commonly see now). If you're a first-time poster (or first time since this system is implemented) your post might get held until someone clears it and adds you to the whitelist.

> - What someone attempting to post for the first time will see

If you used a real email address to post, you'll get back a message stating that only subscribers are allowed to post, and directing you to mailman's subscription page for the mailing list version of the list.  If you used a fake address to post (quite common on the news side it seems) you'll get back nothing.

> - How moderators are notified about things in the queue

Depends on the settings the moderator has made on their group.  By default they get an email every time a post gets held, with a link on it to approve or disapprove of the message.  Some moderators elect to get a daily summary instead of a mailing for each post.

> - What they do to post or disapprove those things

Click the link in the above-mentioned email and follow the directions on the resulting page.

> - How people are added to or removed from the global whitelist

There's a box to check next to the message when you approve it to add the person to the whitelist.

(In reply to comment #17)
> So I sent a short warning post on that group and everyone there thinks
> moderation would not be a good idea (including Jay Garcia for example).
> In particular, it was argued that it would impair redirecting off-topic
> discussions from other groups to that group.

I'm not sure everyone understood what was going on, by the sound of the responses.  This is using Mailman to auto-moderate the groups.  It's not a full moderation where a moderator has to approve every message.  Mailman is attempting to keep the spam out (since Google won't do it for us).  Once this is implemented across the board, if a user has successfully posted to another group, they'll already be approved to post here (unless their previous posts were on the support or feedback groups).
(In reply to comment #19)
> > - What someone attempting to post for the first time will see
> 
> If you used a real email address to post, you'll get back a message stating
> that only subscribers are allowed to post, and directing you to mailman's
> subscription page for the mailing list version of the list.  If you used a
> fake address to post (quite common on the news side it seems) you'll get back
> nothing.

I take that back... the above is what you get on the mailing list side of the support groups currently, because the moderators of those groups didn't want to deal with huge moderator queues full of spam (so it rejects the messages with the above notice rather than queuing them for a moderator).

With the rest of the lists, the message in question informs you that your message has been held for a moderator and the post will show up when the moderator approves it.  It will offer a link to cancel the post if you change your mind about posting it before the moderator gets to it.  As mentioned above, if you use a fake address to post, you won't get anything back at all, and it just won't show up until it's been approved if you weren't a mail-side subscriber anywhere and haven't posted before.
Yes: I think the occupants of mozilla.general have not understood what is going on. Moved discussions, by definition, will have no problem with moderation. 

However, if they want to be left at the mercy of the spammers, I have no particular objection. I don't read that group.

Gerv
(In reply to comment #19)
> > - What someone attempting to post for the first time will see
> 
> If you used a real email address to post, you'll get back a message stating
> that only subscribers are allowed to post, and directing you to mailman's
> subscription page for the mailing list version of the list.  If you used a fake
> address to post (quite common on the news side it seems) you'll get back
> nothing.

And on the Google Groups side, they'll get the message back to their associated email address?

> > - How moderators are notified about things in the queue
> 
> Depends on the settings the moderator has made on their group.  By default they
> get an email every time a post gets held, with a link on it to approve or
> disapprove of the message.  Some moderators elect to get a daily summary
> instead of a mailing for each post.

Do we know that every group has an active and attentive moderator? If not, is there any way of telling which groups are unowned?

> > - How people are added to or removed from the global whitelist
> 
> There's a box to check next to the message when you approve it to add the
> person to the whitelist.

As in, it adds them to the local whitelist, but your backend work means they are also added to the global whitelist, because it's made up of the sum total of the local ones?

How are we going to manage blacklisting? A manual process where you write a script to search all the component whitelists for the person's email address and remove it? Or is there some web-based way? Can we make a global blacklist out of all the list blacklists, and use that to override the whitelist?

Gerv
Thanks for the headsup, Dave. If you're going to do it tonight I'll put off the Alpha until tomorrow morning.
Thanks for putting m.d.t.crypto on the initial list.

As the moderator of that list/group, what will I need to do differently than before, if anything?

I have the list set up with a rather large black list of senders, a black list 
of strings in subjects, and a small white list (of course, all list sub's are
also white listed).  Black listed messages are discarded, not rejected. 
Messages that go into the moderation queue also generate no reply.  They are
either 
(a, most commonly) silently discarded, 
(b, less commonly) rejected with a nice "how to subscribe" message, or 
(c, rarely) accepted, with sender added to white list (for on-topic messages
that originate at nabble.)

I NEVER send reject messages to spam, because spammers use such rejects to 
inundate their victims with reject notices, by sending their spam appearing
to have come from their victims.  

I hope none of this will be forced to change under the new scheme, but I fear
that I may lose the ability to keep the list's own black list or white list,
or to silently drop list spam.  Please advise.
The individual settings you have on your list will remain the same with the exception of the news gateway type.  It's currently set to "open list, open newsgroup", and it will be changed to "open list, moderated newsgroup", which will make mailman stick an approval stamp on everything it submits to the news server, but otherwise won't change the behavior of the list.  The main change you will see as a moderator is that messages posted via the news servers will now have to get through whatever list filters you've set up on your list in order to get sent out to users or posted to the news servers. (Previously, anything posted via the news servers would show up everywhere, regardless of the filters, with the exception of the spamassassin hack for news->mail reed added last summer).
Blocks: 57732
Blocks: 577732
No longer blocks: 57732
Config change made on the mailman side.

According to email from Giganews earlier today, they'll be making the change on their side sometime between now and midnight (about 52 min from now).

Google usually polls us for the config about 2 minutes past midnight.

--- /var/www/html/newsgroups.txt        2010-10-20 23:02:03.000000000 -0700
+++ /root/.listconfig/newsgroups.txt    2010-10-20 23:05:23.000000000 -0700
@@ -11 +11 @@
-mozilla.community.philippines
+mozilla.community.philippines moderated
@@ -61 +61 @@
-mozilla.dev.tech.crypto
+mozilla.dev.tech.crypto moderated
@@ -105 +105 @@
-mozilla.governance.mpl-update
+mozilla.governance.mpl-update moderated
@@ -120 +120 @@
-mozilla.tools
+mozilla.tools moderated
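
For readers who want to check such a change mechanically, here is a small sketch that reads the format implied by the diff above: one group name per line, optionally followed by the word "moderated". That format is inferred from the diff alone, and both function names are hypothetical.

```python
def parse_newsgroups(text):
    """Map each group name to True if its line carries the 'moderated' flag."""
    groups = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        groups[fields[0]] = "moderated" in fields[1:]
    return groups


def newly_moderated(old_text, new_text):
    """List groups that are moderated in new_text but were not in old_text."""
    old = parse_newsgroups(old_text)
    new = parse_newsgroups(new_text)
    return sorted(g for g, flag in new.items() if flag and not old.get(g, False))
```

Feeding the old and new file contents to `newly_moderated` should list exactly the groups that were switched.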
Got an email back from Giganews that they ran into issues trying to change the status of the group, and he needs to get with their developers to find out why it's not working.  Unfortunately Google already pulled the config change, which means at least 24 hours to revert.  So it'll probably be late morning before Giganews is set up correctly, and posts from Giganews won't propagate to mail or Google in the meantime, for lack of moderator approval stamps.
No news from Giganews today? Google already reverted to non-moderated I suppose.

I see two new spam messages on mozilla.dev.tech.crypto
from Giganews:

>A group change of this kind was once done using an automated process and
>could be completed in a couple of hours. We are now required to manually
>update every server in our cluster making the total turn around time for a
>request of this nature about 48 hours. We should be finishing up with this
>first round tonight sometime.

Does anyone know if the change took on Google's side at all?  I never did revert on our end, so if their polling script is working like it's supposed to, they should be calling those groups moderated since a day or two ago.
OK, I just posted a test message to mozilla.tools from Google Groups, and it posted directly, so apparently Google isn't doing these quite correctly.  Let's see if I can actually track down someone to complain to over there... :|
All I can say at this point is with both Giganews and Google having issues trying to accomplish this, I'm glad we only did a couple and didn't switch everything over at once after all.
justdave: if you need more help from anyone here, just ask for it. 

Brendan recently went into the js-engine newsgroup, and nearly "went blind" from all the spam:
https://bugzilla.mozilla.org/show_bug.cgi?id=606189#c30

I think there is consensus throughout the organization that we need to return these lists to good order ASAP.

Gerv
Wondering whether Erik reads his old bugmail account, and whether he can help from within Google.

/be
Gerv, I mentioned that the mailman archive handles format=flowed message content (I think that is what it was) terribly, with super-long unwrapped lines. It's important that this not happen upstream of Google Groups in a way that affects more than the mailman archive (if there is one).

/be
FTR, the stuff that zimbra sends out looks like format=flowed, but isn't. That's a bug on their side, https://bugzilla.zimbra.com/show_bug.cgi?id=22951
I have not yet heard back from Google, if someone knows anyone they can ping over there I'd appreciate it.  Subject line that went into their ticket system was "[mozilla.org 598060] mozilla.* configuration changes not working".  I received neither a confirmation message nor a bounce.
Oh, I just realized I didn't post it here before, but the changes went through on Giganews' side on Friday last week, for those four groups.

Going forward (once we're sure Google's back on board again) it's best for Giganews if we do the rest all at once, and they'll need 48 hours notice to do it because of the way their system is at the moment (the problem that caused it to not go through right away last week).
Dave: can you currently update us on what we should expect to be seeing in terms of functionality in those four groups? Is interoperability going to be OK, or will some messages from some sources not get through (along the lines of comment 27)?

Gerv
In theory messages coming from Google Groups on those four groups should be getting ignored by Giganews and Mailman for lack of moderator approval stamps on them.  If Google now has it set up correctly, those messages will get mailed by Google directly to the mailman server, which will then post them to Giganews on its behalf (with a moderator approval stamp).
I have succeeded in posting a message to mozilla.tools from Google Groups, and having it appear on the Giganews news server.
news://news.mozilla.org:119/a56b198d-c7cb-40ea-9164-63958495bf0e@g25g2000yqn.googlegroups.com

However, after 10 minutes, it still has not appeared on Google Groups. 
http://groups.google.com/group/mozilla.tools/topics
By contrast, a message posted via Google Groups to mozilla.test appeared on Google Groups almost instantaneously.
http://groups.google.com/group/mozilla.test/browse_thread/thread/5ebc33d5a65aefb4#

I can't tell if the message to mozilla.tools went to the mailing list, as I wasn't subscribed and mailman doesn't keep archives. But if it's got to Giganews, presumably via moderation (as it hasn't shown up at Google Groups), surely it has got to the mailing list too?

Gerv
It wasn't moderated. If it went through the list server, it would have the "To:" header, saying tools@lists.mozilla.org.
Giganews says if they get the article from a trusted peered news server they assume the peered server has already ascertained that it was moderated properly and accept the message.  So basically we're screwed until Google gets the config fixed on their end.  But we're not losing messages at least (which is unfortunate on the part of the spam messages).
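The acceptance rule described here can be modeled as follows. This is an illustrative sketch, not Giganews' actual code: it assumes the "moderator approval stamp" is the standard Usenet `Approved` header, and the function name and boolean parameters are hypothetical.

```python
from email import message_from_string


def server_accepts(raw_article, group_is_moderated, from_trusted_peer):
    """Decide whether a news server following the described rule takes an article."""
    if not group_is_moderated:
        return True
    if from_trusted_peer:
        # The peer is assumed to have checked moderation already,
        # so even an unapproved article is accepted.
        return True
    article = message_from_string(raw_article)
    return article.get("Approved") is not None
```

The second branch is the gap in question: an unstamped article arriving from a trusted peer is accepted even though the group is marked moderated.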
This past week, there's been a rather dramatic upturn in spam originating at google getting through to the dev.tech.crypto list.  

We have it in our power to fix this unilaterally, without waiting for Google 
or Giganews, by forcing the news->mail gateway to apply the filters to that 
content, just as it does for any content received by email.
(In reply to comment #43)
> We have it in our power to fix this unilaterally, without waiting for Google 
> or Giganews, by forcing the news->mail gateway to apply the filters to that 
> content, just as it does for any content received by email.

It's already doing that.  If it's getting to the list it's getting past the spam filters.
Dave,
The filters on headers that are under the list admin's control are NOT being 
applied.  This is demonstrable.
... not being applied to messages coming from google groups, that is.
Gerv or Dave- do we have any way to clear out the spam from the Google Groups archives?  I realize that this request may be a different bug but I wanted to raise it here first.
Nope, we have no way to remove the existing spam.  All we can do is try to prevent new stuff from showing up.
I tried sending messages to mozilla.tools through nmo and google. Both instances went through mailman.
Google gave me a "New topic submitted to moderators of mozilla.tools for review. Your post will appear in this group after it is approved by moderators." message.

However, it appears the mailing list is set to reject messages from non-members instead of holding them, because in both instances I got a "we require that all
posters be subscribers of the list to which they are posting" auto-response.
(In reply to comment #49)
> [...] it appears the mailing list is set to reject messages from non-members
> instead of holding them, because in both instances I got a "we require that 
> all posters be subscribers of the list to which they are posting"

For mozilla.dev.tech.crypto, it works properly, so it sounds like it's really just that setting that needs to be corrected. 

(In reply to comment #31)
> All I can say at this point is with both Giganews and Google having issues
> trying to accomplish this, I'm glad we only did a couple and didn't switch
> everything over at once after all.

Now that it's working properly at least for some groups (and that's *huge* progress for them), is it possible to set a plan forward for the rest? Unfortunately, I'm not sure you had the opportunity to learn a lot about how to do it better.

OTOH if I take for example mozilla.dev.tech.crypto, as bumpy as it looked from the administrative side, I believe the process hasn't really been disruptive for users (from the users' point of view, posting has always been possible; it's just that the start of effective moderation took longer than expected, with nobody able to say exactly when it would really take effect).
justdave: can you please tell us how we can check whether this is now working, and give us a recommended set of settings for the mailman interface which best implements the system we want?

Once we have confirmed it is working fully, we can look at rolling it out everywhere.

Dave: do we have any confidence yet that a cross-Mozilla rollout will succeed with fewer hitches than the test rollout? Do you have contacts at Giganews and Google you can go to if there's a problem?

Gerv
Considering the fiasco when we attempted these four, I'm hesitant to do the rest until we hear from Google saying it's actually going to work.  Perhaps we can try with one or two more groups to find out if they fixed the automation, since they won't tell us.  Brendan attempted to pull a few strings for us, but we've still gotten nowhere.  If anyone at Google has actually heard our cries and done anything about it, they haven't told us they've done so.

So, here's how it's supposed to work:  If you attempt to post to Google Groups, your message will get emailed (by Google, not by your client) to our mailman interface, which will then accept/reject/hold the message based on that mailing list's settings.  If the message is accepted, it'll be posted to the news server with a stamp from mailman saying that it's been moderated so the news server will accept it, and mailed to the list.  It will then propagate from there back to Google at some point for display and reading (hopefully that whole process takes under 5 or 10 minutes if things are working well).

There is evidence in the logs that the following groups are currently working correctly:

mozilla.community.philippines
mozilla.dev.tech.crypto
mozilla.tools

There is no evidence of any posts being made from Google that correctly came through this interface to mozilla.governance.mpl-update, however.  This may only mean nobody's posted to that group via Google since they fixed it, or it may mean that group is still broken.
Fwiw, I am the admin for mozilla.community.philippines and I am getting moderation requests.  I also have not seen any spam in that group since the bits were flipped.
I do have one (minor) gripe about the spam handling.  As you know, each 
mail alias has a whitelist and a blacklist, that is, a list of email 
addresses from which mails are supposed to be accepted or discarded 
regardless of other considerations.  Spam filtering should be done AFTER 
those lists are applied, IMO.  Spam filtering should CERTAINLY be applied 
AFTER the blacklist is applied.  But it's not.  Consequently, it doesn't help 
reduce the administrator's burden to put email addresses of repeat spammers 
into the blacklist.  Doing so does not prevent the spam from those spammers 
from showing up in the daily moderation queue.  It should, but it doesn't.

Here is ONE example (of numerous): spammer success@yahoo.com sends daily spam with subject containing the host name texastrue.com.  Long ago, 
I added success@yahoo.com to the black list, so all those emails should just
be discarded, and not appear in the moderation queue, yet they all appear in
the moderation queue.  The moderation queue does not give me the option of 
adding this email address to the black list, because it's already in the 
black list.  I'll leave today's spam from that source there in the queue if someone wants to have a look.
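The ordering being asked for could be sketched like this. It is a hypothetical model, not how Mailman is actually wired: the function name, the plain sets, and the `is_spam` flag are assumptions, and spam is discarded at the last step as in the original plan.

```python
def triage(sender, subject, blacklist, whitelist, banned_subject_strings, is_spam):
    """Return 'discard', 'accept', or 'hold' for one incoming message."""
    sender = sender.strip().lower()
    # 1. The list's own blacklists come first, so known spammers
    #    never reach the moderation queue.
    if sender in blacklist:
        return "discard"
    if any(s.lower() in subject.lower() for s in banned_subject_strings):
        return "discard"
    # 2. Explicitly trusted senders are accepted regardless of spam score.
    if sender in whitelist:
        return "accept"
    # 3. Only now is the spam verdict consulted.
    if is_spam:
        return "discard"
    return "hold"
```

Under this order, a blacklisted sender's message is dropped before it can appear in the daily moderation queue.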
(In reply to comment #52)
> There is no evidence of any posts being made from Google that correctly came
> through this interface to mozilla.governance.mpl-update, however.  This may
> only mean nobody's posted to that group via Google since they fixed it, or it
> may mean that group is still broken.

I just tried posting to mozilla.governance.mpl-update and I got the same result as in comment 49.

I also haven't seen anyone test the master whitelist. My test posts get automatically rejected. Theoretically, they should be held for moderation, and when approved, my posting address added to the whitelist allowing me to post to one of the other test groups without moderation.
For what it is worth, I haven't gotten any notification of Chris's attempt to post to mpl-governance.
I just changed the default non-member action from "Reject" to "Hold" on these lists:
mozilla.community.philippines
mozilla.tools
mozilla.dev.tech.js-engine.rhino
mozilla.governance.mpl-update

All other moderated lists were either already set that way or are only designed to be posted to by specific users anyway.
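To make the difference between the two settings concrete, here is a toy model. The enum, the function, and the abbreviated reply strings are hypothetical stand-ins and do not use Mailman's real attribute names or templates.

```python
from enum import Enum


class NonMemberAction(Enum):
    HOLD = "hold"
    REJECT = "reject"


def handle_non_member(action, moderation_queue, message_id):
    """Return the auto-reply a non-member poster would get for this setting."""
    if action is NonMemberAction.REJECT:
        # Nothing is queued, so the moderator never sees the post.
        return "we require that all posters be subscribers of the list"
    moderation_queue.append(message_id)
    return "your message awaits moderator approval"
```

With "Reject", the queue stays empty and nobody can be approved into the whitelist; with "Hold", the post waits for a moderator.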
Gerv: got suggestions for what to attempt on the next batch?  We should probably try one or two just to see if the engine is working to propagate the changes.  It's evident that Google eventually fixes stuff and never acknowledges that they've done so, so we should test and see if they did or not on a small set.
justdave: I think you are right, the only thing to do is test again. Let's look at doing:

mozilla.legal
mozilla.jobs
mozilla.community.uk

I'll post in those groups and check it's OK.

Can you check that those groups do have moderators and they are people known to be still involved with the project and responsive?

Gerv
As far as I understood, the filter is using a whitelist (at least as significant factor).

mozilla.jobs is inappropriate for that, because the interesting posters are often not part of the community and would be posting for the first time. The damage of a false positive is also big.
Recent traffic in mozilla.jobs shows one job a month; assuming the moderator is not asleep at the wheel (and I asked Dave to check that), I think it's an OK choice.

Gerv
I wouldn't want to have to depend on a moderator. My food depends on that.
justdave: who is the current moderator of mozilla.jobs?

Gerv
Looking at <https://lists.mozilla.org/listinfo/jobs>, it's Frank Hecker.
Now that the new system has been in place for a while in m.d.t.crypto, 
I'm seeing a growing number of messages in the moderation queue each day
that are from email addresses that are either subscribers or in the 
whitelist of non-subscribers, that go into the moderation queue with the 
explanation "message has implicit destination".  I'm not sure what that 
means.  The messages all are clearly addressed to the mailing list or to
the newsgroup or both. 

Since they're from subscribers and/or whitelisted addresses, why don't 
they go right through??   

I don't mind when a first-time poster's on-topic message gets delayed for up to 
24 hours, but when a long-time subscriber's message does, something needs
to be fixed.
(In reply to comment #57)
> I just changed the default non-member action from "Reject" to "Hold" on these
> lists:
> mozilla.community.philippines
> mozilla.tools
> mozilla.dev.tech.js-engine.rhino
> mozilla.governance.mpl-update

Okay, I've sent a message to mozilla.tools via Google Groups. I got the same "New topic submitted to moderators of mozilla.tools for review. Your post will appear in this group after it is approved by moderators" message, but this time the mailing list sent me a message saying that my message awaits moderator approval.

Hopefully Gerv received the moderator request.
(In reply to comment #65)
> "message has implicit destination"

OK, I know the problem here.  The moderator addresses don't exactly match the list posting address (they have an extra "mozilla-" in front because it was an easier regexp for giganews and google to automatically configure).  There's a script on a cron job that ensures the moderator addresses all exist in the aliases file.  I'll see if I can hook that same cron job to ensure they're also added to the "allowed recipients" list on each of the lists at the same time.
(In reply to comment #62)
> I wouldn't want to have to depend on a moderator. My food depends on that.

I understand your concern, but if a would-be poster looks at the group and sees nothing but a large amount of spam, the deterrent effect is likely to be at least as strong as the one of moderation.

But it would be a good idea to have a large moderation team for that list. 
IMO you could be a member of the moderation team.

(In reply to comment #65)
> I'm seeing a growing number of messages [...] go into the moderation queue 
> with the explanation "message has implicit destination".

http://wiki.list.org/pages/viewpage.action?pageId=4030676
1) Your list has a different domain name (FQDN) than you think it does, or than was used as the target address of the message. [...]
2) The message was BCC'ed (blind carbon copied) to the list, [...]

=> Solutions :
[...] go to Privacy Options / Recipient Filters / Alias Names, and enter the name of the umbrella list as a valid alias.
[...] get rid of this message entirely, go to Privacy Options / Recipient Filters and set the require_explicit_destination option to be "No".
Is there an option to subscribe to the unmoderated traffic? (ideal)

If I was a moderator, could I get the new posts per email, individually, as they come in, before moderation? (workaround)
The moderator by default gets an email with a link to approve/disapprove for every post attempt to the list which needs moderation.  There is an option in the admin config to opt out of those mailings and only get a daily summary.
(In reply to comment #68)
> (In reply to comment #65)
> > I'm seeing a growing number of messages [...] go into the moderation queue 
> > with the explanation "message has implicit destination".
> 
> http://wiki.list.org/pages/viewpage.action?pageId=4030676
> 1) Your list has a different domain name (FQDN) than you think it does, or than was used as the target address of the message. [...]

Yes, I said this already in comment 67. :)  This will be automatically taken care of for all of the lists as soon as I finish tweaking the script (working on it now)
(In reply to comment #71)
> > (In reply to comment #65)
> > > I'm seeing a growing number of messages [...] go into the moderation
> > > queue with the explanation "message has implicit destination".
> 
> This will be automatically taken care of for all of the lists as soon as I
> finish tweaking the script

Done, and it's been run once, so they should all work now.  It's part of the cron job that generates the newsgroups file, so it should stay up-to-date automatically from here on out.
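A rough sketch of what such an "ensure the aliases exist" step could look like, for anyone trying to picture it. This is not the actual cron script: the function names are made up for illustration, and the two pattern shapes are simply copied from the recipient filter expressions quoted elsewhere in this bug.

```python
LIST_DOMAIN = "lists.mozilla.org"  # domain as it appears in the quoted filter expressions


def _escape_dots(address: str) -> str:
    """Escape literal dots so the address can be used inside a regexp."""
    return address.replace(".", r"\.")


def required_aliases(newsgroup: str) -> list:
    """Build the dotted and dashed address patterns for one newsgroup (hypothetical helper)."""
    dotted = _escape_dots(f"{newsgroup}@{LIST_DOMAIN}")
    dashed = _escape_dots(f"{newsgroup.replace('.', '-')}@{LIST_DOMAIN}")
    return [f"^{dotted}$", f"^{dashed}$"]


def ensure_aliases(existing: list, newsgroup: str) -> list:
    """Return the alias list with any missing required pattern appended.

    Entries that an admin added by hand are kept untouched and in order.
    """
    result = list(existing)
    for pattern in required_aliases(newsgroup):
        if pattern not in result:
            result.append(pattern)
    return result
```

Running it twice gives the same list as running it once, which is the property you want from something that fires on a schedule.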
Ben: I see no reason why you could not take over the moderation of mozilla.jobs from Frank. He is not so involved with the project now and I'm sure would be glad to hand over to you. The only thing would be that you would have to avoid possible conflicts of interest by promising not to fail to post a job because you wanted to apply for it :-))

Gerv
Gerv: obviously
To avoid exactly that, I proposed to allow anybody to subscribe to the unmoderated stream. I don't know whether mailman (or whatever you use for moderation) allows that.
The idea of several moderators is not bad either.
(In reply to comment #74)
> To avoid exactly that, I proposed to allow anybody to subscribe to the
> unmoderated stream. I don't know whether mailman (or whatever you use for
> moderation) allows that.

It doesn't.  Only the moderators get the unmoderated stuff.
In reply to comment 68, 

The "To:" line in all the messages that get the "implicit destination" treatment is always the same, namely
     To: mozilla-dev-tech-crypto@lists.mozilla.org

The recipient filter allows messages that match any of these expressions:

mozilla-crypto@mozilla.org
^mozilla\.dev\.tech\.crypto@lists\.mozilla\.org$
^mozilla-dev-tech-crypto@lists\.mozilla\.org$

It appears to me that the actual To address in the messages should match the last of the above 3 expressions.  But evidently it does not.  I have suspected that the problem is that those expressions do not allow extra space at the beginning or end, so twice I have changed the above filter expressions to:

mozilla-crypto@mozilla.org
^.*mozilla.dev.tech.crypto@lists\.mozilla\.org

Something keeps changing those recipient expressions back to the three I cited above.   Grrr.
Nelson, did you read comment 71 and 72?
Besides, your regexp not only allows extra space, but arbitrary characters at beginning and end.
Yes, extra characters like "To: ".  
I don't know why the existing expressions don't work.  
The fact that they don't work, even though they appear to match the "To:" name,
suggests to me that there is more in the string that is being matched by these
expressions than the To address alone.  Perhaps it is also the "To:" string.

My big beef is that some script creator thinks his script has more 
right to administer my list than I do, and so keeps undoing my changes.
> some script creator thinks his script has more 
> right to administer my list than I do

Not "your" list, but ours ;-) I think justdave can figure out why the rule doesn't match, and fix it for all lists, if you give him some time.
There is a cron job regenerating those daily (around 11pm PST).  However, all it's doing is ensuring that those last two exist in the list.  If you add anything else to it, those should stick.  If you remove either of those last two they should get put back.  The regexp is not supposed to match the full header, it's supposed to match each email address individually (because there can be more than one in either of the To or CC lists)
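To illustrate the "match each address individually" point, here is a minimal sketch. It is not Mailman's own code, and `has_explicit_destination` is a name invented for this example. It assumes each pattern is tried with `re.match` against every address parsed out of the To and Cc headers, never against the raw header line.

```python
import re
from email.utils import getaddresses


def has_explicit_destination(header_values, patterns):
    """Return True if any single To/Cc address matches any allowed pattern.

    header_values: raw values of the To and Cc headers (without the "To: " prefix).
    patterns: regular expressions, each tried with re.match against one address.
    """
    for _display_name, address in getaddresses(header_values):
        if not address:
            continue  # unparsable or empty entry
        if any(re.match(p, address, re.IGNORECASE) for p in patterns):
            return True
    return False


# The three expressions quoted earlier in this bug.
PATTERNS = [
    "mozilla-crypto@mozilla.org",
    r"^mozilla\.dev\.tech\.crypto@lists\.mozilla\.org$",
    r"^mozilla-dev-tech-crypto@lists\.mozilla\.org$",
]
```

Under that assumption, a header such as `Someone <someone@example.com>, mozilla-dev-tech-crypto@lists.mozilla.org` passes because the second address matches on its own, whereas the full line `To: mozilla-dev-tech-crypto@lists.mozilla.org` would not match an expression anchored with `^`.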
Actually, I'm wrong, it runs once an hour.
I just looked, and the one you added is indeed still in there.
Next time you get one of those, can you leave it in the moderator queue and ping me about it?
Gerv, did you get a moderator request for my post to mozilla.tools? See comment 66.
Sorry Chris; I was expecting a moderator mail and didn't get one. But now I check the interface, yes, you are there. I approved the message and whitelisted you.

Gerv
Chris: the message doesn't seem to have turned up on the news server. Can you send another one, and see if my adding you to the whitelist made a difference? Ping me a private mail so I can check the queue immediately.

Gerv
Justdave: please go ahead and convert over the groups in comment 59, and we'll see if the process goes any smoother. We can do this even while we are debugging the exact correct mailing list config.

Gerv
Please do not change mozilla.jobs until the moderation question is clear and I (and whoever else has a justified interest) can get unmoderated traffic in any case.


----

Sorry if that's answered above, but: If a posting is held for moderation, is it held on all channels, including the original source (e.g. news.mozilla.org and google groups)? Somebody who posts via news.mozilla.org will only check there and once he sees it there, he will assume it reached all subscribers. In other words, there must not be any circumstance where the groups are split and only some subscribers get the post, but others don't. This is critical for reliability.
For the record, I would be willing to do moderation of mozilla.jobs, if you trust me with that. As somebody suggested, you could allow other contractors to be co-moderators, to avoid any doubts.
Approved.

Looks like there's a checkbox as well as a radio button which needs to be checked to whitelist you. I missed that last time. I've done it this time, so let's see if a) this post gets through and b) whether you get moderated next time.

Gerv
Yep, looks like the test got through, to News at any rate. Post another from the same email address and see if you get moderated.

Gerv
Test results:
* after being added to the whitelist, posting via google to mozilla.tools again was auto-approved.
* after being added to the whitelist, posting via NNTP to mozilla.tools was auto-approved.
* I then tried posting to mozilla.governance.mpl-update via NNTP, and the post was held for moderation.

So it appears the whitelist is still per-newsgroup.
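For reference, the difference between the observed behaviour and the intended one can be sketched like this. It is only a model: the `Forum` class, the `needs_moderation` function and the example address are invented for illustration and say nothing about how the real configuration stores its lists.

```python
from dataclasses import dataclass, field


@dataclass
class Forum:
    """One discussion forum with its own (per-list) whitelist."""
    name: str
    local_whitelist: set = field(default_factory=set)


def needs_moderation(sender: str, forum: Forum, global_whitelist: set) -> bool:
    """Hold the post unless the sender is whitelisted locally or globally."""
    sender = sender.lower()
    return sender not in forum.local_whitelist and sender not in global_whitelist


tools = Forum("mozilla.tools")
mpl_update = Forum("mozilla.governance.mpl-update")
global_whitelist = set()
poster = "poster@example.com"  # placeholder address

# What the test above observed: approval on one list only updates that list.
tools.local_whitelist.add(poster)
held_elsewhere = needs_moderation(poster, mpl_update, global_whitelist)

# What the plan calls for: approval feeds a whitelist shared by all lists.
global_whitelist.add(poster)
held_after_global = needs_moderation(poster, mpl_update, global_whitelist)
```

With only the per-list entry, `held_elsewhere` is `True`. Once the address is in the shared set, `held_after_global` is `False`.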
Please add mozilla.test to the next batch.
Testing groups need to work reliably; mozilla.test is therefore a bad candidate IMO.

Dave: please go ahead, but replace mozilla.jobs with mozilla.governance (which has been requested by Mitchell).

Gerv
(In reply to comment #92)
> So it appears the whitelist is still per-newsgroup.

Yeah, looks like the global thing isn't working right.  I'll try to poke at it today.
FWIW, the last couple new group creations we've done have failed to show up on Google Groups, so we're still having problems getting the config changes to propagate.
(In reply to comment #96)
> FWIW, the last couple new group creations we've done have failed to show up on
> Google Groups, so we're still having problems getting the config changes to
> propagate.

Well :
- The current spam level is a huge problem for many groups : when the number of messages is low, the spam makes them unusable
- In the previous attempt, we've seen that the consequence of the change not propagating to Google is just that moderation is not applied for the posts coming from Google. As soon as the change propagates, it starts being applied.
- I think that the changes in the status of the group might propagate more easily than the creation of a new group. Maybe that's because Google wants to manually review each group creation. Last time, the status changes ended up propagating in less time than we have since spent waiting for answers about what is happening at Google.

If there's any visibility about when more info might become available, or sensible hopes for news in the next few weeks, I'd be in favor of converting a batch of the groups that have in effect already become unusable because of the spam. For example, all the groups that have, over the past two months, received more than 3 spam messages for every useful message, or where no useful conversation is currently happening because of the spam.
I think J-M is right. We should convert the next batch of groups and see what happens. If it succeeds, awesome. If it fails, then we have something specific to get our top brass to escalate to Google.

Gerv
Should we create a wiki doc or dependent bugs (making this a meta bug) to track which groups have been changed and any issues with specific newsgroups? Maybe an outline of what changes each mailing list admin needs to make in the admin panel, and perhaps a short tutorial on how to add someone to the auto-approve list?

I'd be happy to create that documentation.
(In reply to comment #96)
> FWIW, the last couple new group creations we've done have failed to show up on
> Google Groups

dev-tech-js-engine-internals has recently been created, and now appears on google.

May I suggest that from now on all new groups be created moderated from the start, so that they don't need to be changed again later?
(In reply to comment #90)
> Approved.

Was that in reply to my comment 89?

Has mozilla.jobs been switched to moderated? I see that Frank Hecker is still the moderator. Given his involvement levels, that and moderation would mean that the newsgroup is basically locked down. How do we go forward on this?
(In reply to comment #101)
> (In reply to comment #90)
> > Approved.
> 
> Was that in reply to my comment 89?

Comment 86, I believe.
Blocks: 620085
Re comment #101:  This is why bug #620085 is not only important but also why the Web page should be updated incrementally as soon as each additional newsgroup becomes moderated.
Who do we need to escalate this to within Mozilla to get to the right person at Google?
Gen: we need a documented account of what is not working, which justdave was going to get when he set the next round of groups to be 'moderated'. After that, Mitchell (who is aware of the situation) can take something to Google.

Dave: is there an ETA on switching the next batch of groups?

Gerv
This is getting pretty ridiculous again. Perhaps, as a stop-gap for now, any messages with ALL CAPS titles can be discarded?
I suspect that if we had any way of applying filters to those messages, we'd be running them through a full spam filter. But Dave: can we do that?

And please give us an update on progress on switching the next batch of groups! :-)

Gerv
Dave, any updates please?
It's getting escalated way up the ranks right now (just within the last day or two), we're waiting on the results of that.
(In reply to comment #107)
> I suspect that if we had any way of applying filters to those messages, we'd be
> running them through a full spam filter. But Dave: can we do that?

We are and most of those messages haven't made it to the mailing list side of most of the lists because they've been getting caught.  I've been having great fun cleaning them all out of the moderator queues on the lists that are still generically moderated by list-admin@mozilla.org.  "Reason: SpamAssassin identified this message as possible spam"  When we finally get Google Groups to acknowledge our moderation status most of that will probably go away.
(In reply to comment #110)
> 
> We are and most of those messages haven't made it to the mailing list side of
> most of the lists because they've been getting caught.  I've been having great
> fun cleaning them all out of the moderator queues on the lists that are still
> generically moderated by list-admin@mozilla.org.  "Reason: SpamAssassin
> identified this message as possible spam"  When we finally get Google Groups to
> acknowledge our moderation status most of that will probably go away.

The list I manage, community-philippines, is one of the three that was placed under moderation first as a test.

I can say as the sole moderator, that the SpamAssassin moderation duties come to the list owner, so there is more work for the list owner to do to moderate messages that may be spam.  Of course SpamAssassin may be able to be adjusted per list, but I don't know where we'd do that (or if that's under Mozilla's control or Google's.)
If you're confident that SpamAssassin hasn't given you any false positives, I can set a cron-job to auto-nuke anything flagged by SpamAssassin in your moderator queue.
(In reply to comment #112)
> If you're confident that SpamAssassin hasn't given you any false positives, I
> can set a cron-job to auto-nuke anything flagged by SpamAssassin in your
> moderator queue.

Thank you but SpamAssassin is not perfect.  A few (out of hundreds) of the flagged posts did need to be approved.  Please keep the settings as they are for the time being.

SpamAssassin also sends a notice when it automatically deletes messages (when they're over a certain threshold), which is nice too.  You can filter those out for review if you want to.
Maybe the SpamAssassin auto-discard thresholds could be lowered? As it currently stands, my list's moderation queue is a mess because so much spam is getting through, which is causing me to pay less attention to it, which is eventually going to cause a real poster to get screwed.
Spam assassin has never given me a false positive in m.d.t.crypto. 
I would like the OPTION to be able to cause all the messages it identifies
to be silently dropped, before the daily email that I get telling me what's
in the moderation queue.
Perhaps setting SpamAssassin to automatically reject anything with a relatively high score (like 8) would be a good setting for all lists. (This is how I have my personal mail server configured, and not a single person has ever said their legitimate message to me bounced back to them.)
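A two-threshold policy along those lines could be sketched as follows. The discard threshold of 8 is the value suggested above. The lower flag threshold of 5.0 is an assumption (it is SpamAssassin's stock `required_score`, but the value actually configured on these lists is not stated in this bug), and the `Action` and `triage` names are made up for the example.

```python
from enum import Enum


class Action(Enum):
    DISCARD = "discard"    # drop silently, never reaches the moderator
    HOLD = "hold"          # goes to the moderation queue as possible spam
    CONTINUE = "continue"  # carries on to the whitelist check


def triage(score: float, flag_threshold: float = 5.0,
           discard_threshold: float = 8.0) -> Action:
    """Map a spam score to an action using two thresholds."""
    if score >= discard_threshold:
        return Action.DISCARD
    if score >= flag_threshold:
        return Action.HOLD
    return Action.CONTINUE
```

The gap between the two thresholds is where a moderator would still look at the message, so the few false positives mentioned above could be rescued while the high-scoring bulk never reaches the queue.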
(In reply to comment #109)
> It's getting escalated way up the ranks right now (just within the last day or
> two), we're waiting on the results of that.

So I'm happy to report that we are actually in communication with people inside the Google Groups unit now, and they are now actively working on the problem with group creation and updates.  Hopefully we'll have some results really soon now.
Blocks: 635078
I'm going to use this as a tracking bug/discussion of the overall project for this conversion and start filing individual bugs (which will be set as blocking this one) for each batch of changes we send along, just to keep track of them better.
Whiteboard: [tracking]
Depends on: 638209
The following are now on bug 638209

mozilla.community.uk
mozilla.governance
mozilla.jobs
mozilla.legal
mozilla.support.other
The config polling automation on Google's end appears to be operational again as of Tuesday night. :)

Time to start pushing more batches through.  (bug 638209 is up first)
No longer blocks: 635078
Depends on: 635078
One of the latest spam messages in support.other has the header:
X-Spam-Score: 0.231

A recent valid message had:
X-Spam-Score: 2.344

So it looks like spam score isn't entirely dependable.
I also noticed that my reply to that message, which was sent via NNTP, did not have a "X-Spam-Score:" header.
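A small sketch for anyone who wants to check these headers in bulk. It only assumes that the score appears as a plain number in an `X-Spam-Score:` header, as in the two examples above. The `spam_score` function name and the sample messages are invented for illustration.

```python
from email import message_from_string


def spam_score(raw_message: str):
    """Return the X-Spam-Score of a raw message as a float, or None if absent or unparsable."""
    message = message_from_string(raw_message)
    value = message.get("X-Spam-Score")
    if value is None:
        return None  # e.g. a post that never passed through the scoring step
    try:
        return float(value.strip())
    except ValueError:
        return None


# Made-up messages carrying the two scores quoted above, plus one without the header.
spam = "Subject: spam example\nX-Spam-Score: 0.231\n\nbody\n"
valid = "Subject: valid example\nX-Spam-Score: 2.344\n\nbody\n"
via_nntp = "Subject: reply\n\nbody\n"

scores = [spam_score(m) for m in (spam, valid, via_nntp)]
```

Any single cut-off low enough to catch the 0.231 message would also catch the valid 2.344 one, which is the point being made above. A missing header has to be treated as "not scored" rather than as a score of zero.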
Dave- any status on new groups being moderated?  There's a fair amount of spam getting through recently (moz.dev.apps.mobile for one, although the same spammer is hitting many of our groups).
Yeah, time to get another batch going. Gerv: what groups do you suggest we try next?  Given the issues we had on the last batch I'm still thinking we do 5 or 10 at a time until we get a batch that goes through without any issues.
Assignee: justdave → dgherman
Here's my current view of what's needed:
https://wiki.mozilla.org/Discussion_Forums/ToDo

Comments?

Gerv
C.2. probably isn't going to happen unless we get someone to volunteer to hack on mailman for us (the source is out there - it's all Python).
mozilla.dev.accessibility is particularly bad.  If that can be put into the next batch, that would be great.
More groups: bug 660675.

Gerv
Dave/Dumitru:  any chance of closing this out this week??
:cshields: to close this out, bug 658972 (global blacklist for mailman) needs to be implemented, and then I need to find moderators for all the groups, and then we need to flip all the switches. So it can't be done in a week, and it's blocked by bug 658972. It would be awesome if you tasked someone to fix that bug.

Gerv
Assignee: dgherman → rbryce
QA Contact: mrz → shyam
Assignee: rbryce → mburns
Assignee: mburns → limed
Component: Server Operations → Server Operations: Infrastructure
QA Contact: shyam → jdow
Component: Server Operations: Infrastructure → Infrastructure: Other
Product: mozilla.org → Infrastructure & Operations
Component: Infrastructure: Other → Infrastructure: Mail
QA Contact: jdow → limed
Resetting this from critical to normal because ... this bug hasn't been touched in over a year. That said, what is needed to close this?
Severity: critical → normal
Flags: needinfo?(gerv)
I could do with having a chat with whichever member of the IT team is currently responsible for this system. It's been a few different people over the time. 

achavez: is it you, right now?

Gerv
Flags: needinfo?(gerv) → needinfo?(achavez)
(In reply to Gervase Markham [:gerv] from comment #131)
> I could do with having a chat with whichever member of the IT team is
> currently responsible for this system. It's been a few different people over
> the time. 
> 
> achavez: is it you, right now?
> 
> Gerv

For this, you'd want to ask limed.
Flags: needinfo?(achavez)
Mailman will be maintained as is
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX