It seems like posts made from a newsgroup will get the wrong headers when the newsgroup gateway sends out a corresponding email for them. I'm not sure what component is the right one to file this in...
Could you cite an instance of this? The new Google Groups interface breaks threading. Maybe that's what you're seeing?
Ehsan told me that my messages (coming from the newsgroup side) seem to break threading on his side in Thunderbird, using the mailing list side. I'll let him point to specific examples - Ehsan, if you have such a message that breaks threading, could you save it and attach it here so we can look into its headers?
I'm getting the opposite problem. I use Thunderbird to read our groups as newsgroups, and threading is horribly broken. For example, Ted sent a message with this Message-ID on the newsgroup server:
  To: "mozilla.dev.planning group" <firstname.lastname@example.org>
  Message-ID: <email@example.com>
timeless sent this response:
  To: Ted Mielczarek <firstname.lastname@example.org>
  Cc: "mozilla.dev.planning group" <email@example.com>
  In-Reply-To: <BANLkTinCLa4XJMOWeWmyVO09u_DO6C-MCA@mail.gmail.com>
  References: <BANLkTinCLa4XJMOWeWmyVO09u_DO6C-MCA@mail.gmail.com>
  Message-ID: <firstname.lastname@example.org>
This broke the thread. Meanwhile Zack's response, which appears to originate over NNTP (no "To:"):
  Message-ID: <iMednZQzkMdQ6V3QnZ2dnUVZ_gmdnZ2d@mozilla.org>
  References: <email@example.com>
  In-Reply-To: <firstname.lastname@example.org>
threaded correctly. It seems like the gateway is generating new message IDs instead of using the old ones when ferrying messages.
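A minimal sketch of why the mail reply ends up orphaned on the news side, assuming a reader that parents each message on the nearest ancestor ID it actually knows about. The IDs and the `find_parents` helper are made up for illustration; this is not Thunderbird's actual threading code.

```python
def find_parents(messages):
    """Map each Message-ID to its parent's ID, or None if it starts a thread."""
    known = {m["id"] for m in messages}
    parents = {}
    for m in messages:
        # Nearest ancestor first: References is ordered oldest to newest.
        candidates = list(reversed(m.get("references", [])))
        if m.get("in_reply_to"):
            candidates.append(m["in_reply_to"])
        parents[m["id"]] = next((c for c in candidates if c in known), None)
    return parents


# What the news side sees (hypothetical IDs): the gateway gave Ted's mail a
# new ID, but the reply sent by mail still points at the original one.
news_side = [
    {"id": "<rewritten-ted@gateway.example>"},
    {"id": "<rewritten-reply@gateway.example>",
     "references": ["<original-ted@mail.example>"],
     "in_reply_to": "<original-ted@mail.example>"},
    {"id": "<zack@news.example>",
     "references": ["<rewritten-ted@gateway.example>"],
     "in_reply_to": "<rewritten-ted@gateway.example>"},
]

parents = find_parents(news_side)
for message_id, parent in parents.items():
    print(message_id, "->", parent)
```

The reply that came in by mail has no parent the news side can resolve, so it starts a new thread, while the reply posted over NNTP attaches to the rewritten ID.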
(In reply to comment #4)
> It seems like the gateway is generating new message IDs instead of using the
> old ones when ferrying messages.
Yes, that is indeed exactly what it's doing. From the Mailman FAQ:
-----8<-----
4.59. Why is the Mailman mail-to-USENET-news gateway munging the Message-Id: header?
There are two parts to this answer. First, the RFCs require that the Message-ID: header be globally unique throughout the world, and the only way that Mailman can be reasonably sure of this is to generate its own header that it should be able to guarantee will meet this requirement.
Secondly, Mailman uses the value in the Message-ID: header to determine whether or not it has seen this newsgroup posting before, and whether it should copy that message back to the mailing list. This is to try to avoid duplicate postings to the list and loops between multiple bi-directional mail/news gateway systems.
If your mail program or newsreader implements message threading correctly (see http://www.jwz.org/doc/threading.html), then it should be able to deal with these differences. If not, then you should complain to the people responsible for implementing and supporting your mail program or newsreader.
As far as Mailman is concerned, this is an old issue. The algorithm that Mailman uses today was developed years ago, to deal with the problems that were being created at the time by the mail programs, newsreaders, and other mail/news gateways. However, many broken programs remain in use today, and these problems haven't gone away.
This is not an ideal solution to the problem, but it does appear to cause the least overall confusion and disruption to the affected communities. If you have a better solution, one that takes into account the problems that have been experienced in the past, please feel free to post your patch on the Mailman tracker at https://bugs.launchpad.net/mailman.
For more information, see the threads at http://mail.python.org/pipermail/mailman-users/2005-January/041884.html, and http://mail.python.org/pipermail/mailman-users/2001-November/015730.html, among others.
-----8<-----
What the FAQ fails to explain, and what is mentioned on the bug report (https://bugs.launchpad.net/mailman/+bug/266263) about finding some other solution to the problem, is:
-----8<-----
the conflicts occur when a user cross-posts a message to two lists. the newsrunner won't turn it into a cross-post, instead it tries to post it separately to both groups, and the news server rejects the second one for a duplicate message-id (tested with innd).
-----8<-----
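To make the cross-post conflict concrete, here is a toy model of it. `ToyNewsServer`, the group names, and the message ID are all hypothetical, and the only behavior modeled is the one quoted above: a second article with an already-used Message-ID is refused.

```python
class ToyNewsServer:
    """Toy stand-in for a news server that refuses duplicate Message-IDs."""

    def __init__(self):
        self.seen_ids = set()
        self.groups = {}

    def post(self, message_id, newsgroups):
        """Store one article; return False if its Message-ID was already used."""
        if message_id in self.seen_ids:
            return False
        self.seen_ids.add(message_id)
        for group in newsgroups:
            self.groups.setdefault(group, []).append(message_id)
        return True


# One mail sent to two gatewayed lists, posted once per list (group names are
# placeholders): the second post collides with the first.
server = ToyNewsServer()
first = server.post("<mail-1@mail.example>", ["example.group.a"])
second = server.post("<mail-1@mail.example>", ["example.group.b"])
print(first, second)

# The same mail posted once as a real cross-post reaches both groups.
crosspost_server = ToyNewsServer()
crossposted = crosspost_server.post(
    "<mail-1@mail.example>", ["example.group.a", "example.group.b"])
print(crossposted, sorted(crosspost_server.groups))
```

In the first case the second group never receives the article, which is the failure the bug report describes; in the second case a single article carries both groups.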
FWIW, this may be partially fixed when we finish converting all of the lists in bug 598060, as all of the newsgroups will be moderated, and even if mailman just auto-approves stuff, the message IDs will *all* be munged, instead of just the ones posted via email.
Reopening because I have a suggestion for a fix. But first: modifying Message-IDs is a pretty evil thing to do, and Mailman ought to make every effort to avoid it. Even jwz's algorithm can't fully recover the situation, because the way Mailman does it really loses information. The cross-posting thing is actually another bug in my opinion: Mailman should be able to recognize that situation and handle it properly by converting a mail cross-post into a proper newsgroup cross-post.
However, without significantly changing Mailman's behavior, there's one simple fix that would improve things a lot (by solving the information-loss part of the problem):
- When forwarding messages to the newsgroups, Mailman could change the References header to add the unmangled version of the message-id in front of it, before replacing it with a modified version inside the Message-ID: header.
This means that replies to this message would contain both the mangled ID and the unmangled one, so the unmangled version would be properly referenced as a direct ancestor of the response. Together with an error recovery algorithm similar to jwz's, this should make threading work properly in most cases (it may still be broken in mail if a mail client only uses the In-Reply-To header and not the References one, but I believe Thunderbird/SeaMonkey Mail actually uses the same code to do threading in both mail and newsgroups).
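A rough sketch of that fix using Python's standard `email` package rather than Mailman's own code. `munge_for_news` and the `gateway.example` domain are hypothetical names. One assumption to flag: the sketch appends the original ID at the end of References, since that header lists ancestors oldest first and the last entry is the nearest one; the exact position is my reading of the proposal.

```python
from email.message import Message
from email.utils import make_msgid


def munge_for_news(msg, domain="gateway.example"):
    """Give msg a fresh Message-ID but keep the original one in References."""
    original_id = msg["Message-ID"]
    references = (msg["References"] or "").split()
    if original_id and original_id not in references:
        # Assumption: the original ID goes last, i.e. as the nearest ancestor.
        references.append(original_id)
    del msg["References"]
    del msg["Message-ID"]
    if references:
        msg["References"] = " ".join(references)
    msg["Message-ID"] = make_msgid(domain=domain)
    return msg


mail = Message()
mail["Message-ID"] = "<original@mail.example>"
mail["References"] = "<root@mail.example>"
article = munge_for_news(mail)
print(article["Message-ID"])
print(article["References"])
```

A reply to the resulting article would normally copy its References and add the new Message-ID, so a reader that only knows the original ID can still find a parent by walking back through the list.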
Are we willing to run a patched mailman on our gateway or do we need to get an official release? There is a patch available on https://bugs.launchpad.net/mailman/+bug/266263 that seems quite reasonable and I can update it if need be. It is probably worth noting that when the Message-ID destruction does not manifest itself as fragmenting the thread, it will manifest as incorrect parenting of replies in the thread hierarchy.
We already run a patched mailman, IIRC. CCing rbryce. Gerv
Which one of the patches are you talking about, Andrew?
- The "do not rewrite the message-id unless the NNTP server reports an error when posting it" patch is imperfect but already a major improvement over the current situation, and one we have been waiting too long for. It means cross-posted messages are still broken (the way they are now), but at least only the cross-posted message threads will have a problem, rather than every message thread.
- I'm not convinced the "NNTP-MsgID-Handling" patch solves anything for Mozilla. The NNTP error is caused by a duplicated "Message-ID", and that patch doesn't de-duplicate them at all. It seems to solve loop problems, where the gateway injects messages and the same message later comes back to the gateway. But here, the root cause is instead that the original unmodified message sent by the user arrives twice at the gateway because of the cross-posting. So adding a new header to what you post will change nothing; you'll still receive the second instance unmodified (I hope I'm clear enough about what's happening).
- The solution jcranmer proposes solves the cross-posting issue, which is the root cause of most of the problem here. But he did not post a patch implementing it. And if at some point you end up with an incomplete list of the mailing lists/corresponding newsgroups in your gateway configuration, those messages will fail hard (not being posted to the newsgroup because of a duplicated ID). Adding the first patch to his solution would mean that almost every message would be handled perfectly, and the remaining ones would be broken but just the way they are now (which wouldn't be a big issue anymore).
I still think my solution above wouldn't be that bad: both basic and quite effective. Otherwise, there is the solution of always rewriting the Message-ID and using it for both mail and news (option 2 of the initial message of mailman's bug report).
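For the first option, the control flow would look roughly like this. It is a sketch under assumptions, not the patch from the Launchpad bug: `post_with_fallback` and `toy_post` are hypothetical, and `post` stands in for whatever call actually hands the article to the news server.

```python
from email.utils import make_msgid


def post_with_fallback(message_id, group, post, domain="gateway.example"):
    """Post under the original ID; mint a new one only if that is refused.

    `post(message_id, group)` is a hypothetical callable returning True on
    success and False when the server rejects the article.
    """
    if post(message_id, group):
        return message_id
    new_id = make_msgid(domain=domain)
    if not post(new_id, group):
        raise RuntimeError("server rejected the rewritten article as well")
    return new_id


seen = set()


def toy_post(message_id, group):
    """Toy server: accept an article unless its Message-ID was seen before."""
    if message_id in seen:
        return False
    seen.add(message_id)
    return True


kept = post_with_fallback("<mail-1@mail.example>", "example.group.a", toy_post)
rewritten = post_with_fallback("<mail-1@mail.example>", "example.group.b", toy_post)
print(kept)
print(rewritten)
```

Only the second copy of the cross-posted mail gets a new ID, so ordinary threads keep their original Message-IDs and the breakage stays confined to cross-posts.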
BTW, I think Mozilla's infrastructure group would gain a lot from publicly documenting everything it does and uses (here, the exact configuration of the patched mailman in use). Only the machines' passwords should be secret. This means outsiders could help with testing every aspect of the setup and suggesting corrections. If you think this would endanger the security of Mozilla's infrastructure, then you should realize this means you believe in security by obscurity.
(In reply to Jean-Marc Desperrier from comment #10)
> Which one of the patch are you talking about Andrew ?
I was referring to the NNTP-MsgID-Handling patch, but I had forgotten about the cross-posting issue that you raise. Thanks for summarizing all the options! I agree that your proposal sounds best because it is straightforward/minimal without regressing the cross-posting behavior. I can confirm that Thunderbird's threading logic will work much better with your proposal implemented; both in the threaded view and in the global database. Presumably mailman 3 will do things the 'right way' as jcranmer proposes and we can then eventually upgrade to that (but with less urgency).
Is there a patch corresponding to your proposal? If not, are you planning to write one, or should I do so?
I've built an updated mailman rpm (to be rolled out soon) that includes the NNTP-MsgID-Handling patch, but I agree with Jean-Marc that this is not a complete solution. I've reached out to :jcranmer via Launchpad, but haven't heard back yet. Jean-Marc or Andrew, if you are interested in helping write a patch to implement the fix in comment 7, we are not able to roll that patch out with considerably more ease.
(In reply to Michael Burns [:mburns] from comment #12)
> I've reached out to :jcranmer via Launchpad, but
> haven't heard back yet.
Or just CC him to this bug. ;)
Thanks reed :) Correction, we are *now* able to roll that patch out with more ease. I've looked at a couple of similar patches that attempted to address the threading/message-id munging problem and tried to apply them to a modern Mailman codebase, but they have bit-rotted over the years and appear to cause as many problems as they attempt to solve, if not more. :jcranmer, if you have interest in helping author the patch described in https://bugs.launchpad.net/mailman/+bug/266263/comments/6 or similar, I'd love to get a more elegant solution to this problem pushed into production.
I'm out of the running for a fix for the next few weeks (FxOS Gaia e-mail app still needs a lot of love). Is there a source control repo you are using to track the state of mozilla's patched branch, either as a proper branch or a stack of patches? Also, is there a staging server or other mechanism to facilitate testing that would help whoever is writing the patch either test it or maybe need to be less paranoid about things?
There is a staging host for Mailman that we can do trial deployments to. I've avoided importing the code into the sysadmin SVN repo which is expected to get open sourced at some point (sooner than later, but unsure when), but maybe that is the best place to put it. I'll see if we can get this cleared to go up on Github, along with the documentation requested in comment 10.
(In reply to Michael Burns [:mburns] from comment #14)
> :jcranmer, if you have interest in helping author the patch described in
> https://bugs.launchpad.net/mailman/+bug/266263/comments/6 or similar, I'd
> love to get a more elegant solution to this problem pushed into production.
I have an (admittedly untested) patch for mailman 3 that fixes the threading but not the crosspost issue. Fixing the crosspost issue was more difficult since I really don't know my way around the mailman APIs, but the message-ID munging is smarter and fixes threading in a few more cases.
FWIW, Joshua's patch does both:
- The mailman bug 266263 fix: don't rewrite the message-id unless the NNTP server reports an error when posting it
- The comment #7 fix: when munging the id, also insert the original value in the references to allow threading with error recovery
What it doesn't do is the "ideal" cross-post handling, as described here: https://bugs.launchpad.net/mailman/+bug/266263/comments/6
I'm not sure, but it could mean a risk of message duplication in some cases. But as there's already a lot of that currently, it's certainly already much better.
The dependency for this bug is installing mailman 3, which I am not going to go ahead with until it has reached stable. At this time it appears that mailman3 is still in a beta state; once mailman 3 is stable we can revisit this bug. For now, as stated in comment 12, the NNTP message handling patch is rolled into the mailman rpm that we have right now. This solves part of the problem but not all, and based on the launchpad bug it looks like there is a fix in mailman3, so we'll just wait for that: https://bugs.launchpad.net/mailman/+bug/266263/comments/4
(In reply to Ed Lim [:limed] from comment #20)
> For now as stated on comment 12 the NNTP message handling patch is rolled in
> to the mailman rpm that we have right now. This solves part of the
> problem but not all, and based on the launchpad bug it looks like there is a
> fix in mailman3, so we'll just wait for that.
Having the source code to mailman3 in front of me, it is not fixed. The patch was posted, but no action has been taken on it.
The good news is that the patch that we're currently using drastically improves the threading situation for people subscribed via the mailing list side of the house. All threading issues should now be gone! Hooray! (Breakage should instead be confined to messages sent by people cross-posting to multiple mailing lists. Theoretically the messages will only make it to one of the newsgroups mirroring the mailing list, assuming the newsgroup server is actually a hard-ass about the message-id being globally unique rather than unique to the list.)
> Theoretically the messages will only make it to one of the newsgroups mirroring the
> mailing list.
It's not theoretical, and it's totally screwing up communications. See bug 901188.