Bug 651527 - Our newsgroup gateway breaks threading in Thunderbird
Status: RESOLVED WONTFIX
[pending mailman-3]
Product: Infrastructure & Operations
Classification: Other
Component: Infrastructure: Mail
: other
: x86 Mac OS X
: -- enhancement (vote)
Assigned To: Ed Lim [:limed]
: Dave Miller [:justdave] (justdave@bugzilla.org)
Depends on: 770721
Blocks: 711375
Reported: 2011-04-20 09:25 PDT by :Ehsan Akhgari (out sick)
Modified: 2013-08-02 22:05 PDT

Attachments
Sample email (6.39 KB, text/plain)
2011-04-20 12:12 PDT, :Ehsan Akhgari (out sick)
Mailman 3 message-ID patch (4.41 KB, patch)
2013-03-31 20:51 PDT, Joshua Cranmer [:jcranmer]

Description :Ehsan Akhgari (out sick) 2011-04-20 09:25:45 PDT
It seems like posts made from a newsgroup will get the wrong headers when the newsgroup gateway sends out a corresponding email for them.

I'm not sure what component is the right one to file this in...
Comment 1 Chris Ilias [:cilias] 2011-04-20 09:32:02 PDT
Could you cite an instance of this? The new Google Groups interface breaks threading. Maybe that's what you're seeing?
Comment 2 Robert Kaiser (not working on stability any more) 2011-04-20 09:52:44 PDT
Ehsan told me that my messages (coming from the newsgroup side) seem to break threading on his side in Thunderbird, using the mailing list side.

I'll let him point to specific examples - Ehsan, if you have such a message that breaks threading, could you save it and attach it here so we can look into its headers?
Comment 3 :Ehsan Akhgari (out sick) 2011-04-20 12:12:54 PDT
Created attachment 527347
Sample email

This is one example. Does this help?
Comment 4 fantasai 2011-05-20 16:06:23 PDT
I'm getting the opposite problem. I use Thunderbird to read our groups as newsgroups, and threading is horribly broken.

For example, Ted sent a message with this Message-ID on the newsgroup server:
  To: "mozilla.dev.planning group" <dev-planning@lists.mozilla.org>
  Message-ID: <mailman.2627.1304454152.20808.dev-planning@lists.mozilla.org>

timeless sent this response:
  To: Ted Mielczarek <ted@mielczarek.org>
  Cc: "mozilla.dev.planning group" <dev-planning@lists.mozilla.org>
  In-Reply-To: <BANLkTinCLa4XJMOWeWmyVO09u_DO6C-MCA@mail.gmail.com>
  References: <BANLkTinCLa4XJMOWeWmyVO09u_DO6C-MCA@mail.gmail.com>
  Message-ID: <mailman.2627.1304454152.20808.dev-planning@lists.mozilla.org>

This broke the thread.

Meanwhile Zack's response, which appears to originate over NNTP (no "To:"):
  Message-ID: <iMednZQzkMdQ6V3QnZ2dnUVZ_gmdnZ2d@mozilla.org>
  References: <mailman.2627.1304454152.20808.dev-planning@lists.mozilla.org>
  In-Reply-To: <mailman.2627.1304454152.20808.dev-planning@lists.mozilla.org>
threaded correctly.

It seems like the gateway is generating new message IDs instead of using the old ones when ferrying messages.
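The effect described above can be reproduced with a toy parent lookup. This is only a sketch: `find_parent` is a hypothetical helper, not Thunderbird's threading code, and it assumes (as the quoted headers suggest but do not prove) that the gmail.com ID was the original post's Message-ID before the gateway replaced it with the mailman one.

```python
# Illustrative only: a toy parent lookup, NOT Thunderbird's actual threading code.
def find_parent(message, known_ids):
    """Return the Message-ID of the nearest known ancestor, or None."""
    # Check References from newest to oldest, then fall back to In-Reply-To.
    candidates = list(reversed(message.get("references", [])))
    if message.get("in_reply_to"):
        candidates.append(message["in_reply_to"])
    for msg_id in candidates:
        if msg_id in known_ids:
            return msg_id
    return None


# IDs copied from the headers quoted in this comment.
# Assumption: ORIGINAL_ID is what the first post carried on the mail side.
ORIGINAL_ID = "<BANLkTinCLa4XJMOWeWmyVO09u_DO6C-MCA@mail.gmail.com>"
MUNGED_ID = "<mailman.2627.1304454152.20808.dev-planning@lists.mozilla.org>"

# A newsreader only ever sees the munged ID for the first post.
news_side_ids = {MUNGED_ID}

reply_via_mail = {"references": [ORIGINAL_ID], "in_reply_to": ORIGINAL_ID}
reply_via_nntp = {"references": [MUNGED_ID], "in_reply_to": MUNGED_ID}

print(find_parent(reply_via_mail, news_side_ids))  # no known ancestor
print(find_parent(reply_via_nntp, news_side_ids))  # finds the munged ID
```

The reply sent by mail points at an ID the news side has never seen, so it starts a new thread; the reply sent over NNTP points at the munged ID and attaches correctly.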
Comment 5 Dave Miller [:justdave] (justdave@bugzilla.org) 2011-05-21 13:55:21 PDT
(In reply to comment #4)
> It seems like the gateway is generating new message IDs instead of using the
> old ones when ferrying messages.

Yes, that is indeed exactly what it's doing.

From the Mailman FAQ:

-----8<-----
4.59. Why is the Mailman mail-to-USENET-news gateway munging the Message-Id: header?

There are two parts to this answer.

First, the RFCs require that the Message-ID: header be globally unique throughout the world, and the only way that Mailman can be reasonably sure of this is to generate its own header that it should be able to guarantee will meet this requirement.

Secondly, Mailman uses the value in the Message-ID: header to determine whether or not it has seen this newsgroup posting before, and whether it should copy that message back to the mailing list. This is to try to avoid duplicate postings to the list and loops between multiple bi-directional mail/news gateway systems.

If your mail program or newsreader implements message threading correctly (see http://www.jwz.org/doc/threading.html), then it should be able to deal with these differences. If not, then you should complain to the people responsible for implementing and supporting your mail program or newsreader.

As far as Mailman is concerned, this is an old issue. The algorithm that Mailman uses today was developed years ago, to deal with the problems that were being created at the time by the mail programs, newsreaders, and other mail/news gateways. However, many broken programs remain in use today, and these problems haven't gone away.

This is not an ideal solution to the problem, but it does appear to cause the least overall confusion and disruption to the affected communities.

If you have a better solution, one that takes into account the problems that have been experienced in the past, please feel free to post your patch on the Mailman tracker at https://bugs.launchpad.net/mailman.

For more information, see the threads at http://mail.python.org/pipermail/mailman-users/2005-January/041884.html, and http://mail.python.org/pipermail/mailman-users/2001-November/015730.html, among others.
-----8<-----

What the FAQ fails to explain, and what is mentioned on the bug report (https://bugs.launchpad.net/mailman/+bug/266263) about finding some other solution to the problem, is:

-----8<-----
the conflicts occur when a user cross-posts a message to
two lists. the newsrunner won't turn it into a cross-post,
instead it tries to post it separately to both groups, and
the news server rejects the second one for a duplicate
message-id (tested with innd).
-----8<-----
Comment 6 Dave Miller [:justdave] (justdave@bugzilla.org) 2011-05-21 13:58:47 PDT
FWIW, this may be partially fixed when we finish converting all of the lists in bug 598060, as all of the newsgroups will be moderated, and even if mailman just auto-approves stuff, the message IDs will *all* be munged, instead of just the ones posted via email.
Comment 7 Jean-Marc Desperrier 2011-12-16 04:41:10 PST
Reopening because I have a suggestion for a fix.

But first: Modifying Message-IDs is a pretty evil thing to do, and Mailman ought to make every effort to avoid that. Even jwz's algorithm can't recover the situation fully, because there really is information loss in the way Mailman does that.
The cross-posting thing is actually another bug in my opinion : Mailman should be able to recognize that situation, and handle it properly by converting mail cross-posting to a proper newsgroup cross-posting.

However without significantly changing Mailman's behavior, there's one simple fix that would enhance things a lot (by solving the information loss part of the problem) :
- When forwarding messages to the newsgroups, Mailman could change the References header to add the unmangled version of the message-id in front of it, before replacing it with a modified version inside the Message-ID: header.

This means that replies to this message would contain both the mangled ID and the unmangled one, so the unmangled version would be properly referenced as a direct ancestor of the response.
Together with an error recovery algorithm similar to jwz's, this should make threading work properly in most cases (it may still be broken in mail, if mail only uses the In-Reply-To header, and not the References one. But I believe Thunderbird/SeaMonkey Mail actually use the same code to do threading both in mail and newsgroups).
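A minimal sketch of this proposal, using only Python's standard `email` package. It is not Mailman code: `munge_keeping_original`, the `new_id` parameter and the `lists.example.org` domain are all hypothetical. One assumption deserves a loud label: the sketch appends the original ID at the end of References (the "nearest ancestor" position), which is only one reading of "in front of it".

```python
from email.message import Message
from email.utils import make_msgid


def munge_keeping_original(msg, new_id=None, domain="lists.example.org"):
    """Give msg a new Message-ID but record the original one in References.

    Hypothetical helper for illustration; not part of Mailman.
    """
    original_id = msg["Message-ID"]
    references = (msg.get("References") or "").split()
    # Assumption: the original ID goes last, i.e. as the closest ancestor.
    if original_id and original_id not in references:
        references.append(original_id)
    del msg["References"]  # deleting a missing header is a no-op
    if references:
        msg["References"] = " ".join(references)
    del msg["Message-ID"]
    msg["Message-ID"] = new_id or make_msgid(domain=domain)
    return msg


post = Message()
post["Message-ID"] = "<orig.1@mail.example.com>"
post["References"] = "<root.0@mail.example.com>"

munge_keeping_original(post, new_id="<mailman.1@lists.example.org>")
print(post["References"])
print(post["Message-ID"])
```

A reply to the munged message would normally copy its References and append its Message-ID, so the reply carries both the original and the munged ID and can be attached to whichever one a given reader has seen.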
Comment 8 Andrew Sutherland [:asuth] 2012-04-03 16:50:31 PDT
Are we willing to run a patched mailman on our gateway or do we need to get an official release?  There is a patch available on https://bugs.launchpad.net/mailman/+bug/266263 that seems quite reasonable and I can update it if need be.

It is probably worth noting that when the Message-ID destruction does not manifest itself as fragmenting the thread it will manifest as incorrect parenting of replies in the thread hierarchy.
Comment 9 Gervase Markham [:gerv] 2012-04-04 01:13:19 PDT
We already run a patched mailman, IIRC.

CCing rbryce.

Gerv
Comment 10 Jean-Marc Desperrier 2012-04-04 10:30:32 PDT
Which one of the patches are you talking about, Andrew?

- The "do not rewrite the message-id unless the NNTP server reports an error when posting it" patch is imperfect but already a major improvement over the current situation, and one which we have been waiting too long for.
It means cross-posted messages are still broken (the way they are now), but at least only the cross-posted message threads will have a problem, and not anymore every message thread.

- I'm not convinced the "NNTP-MsgID-Handling" patch solves anything for Mozilla. The NNTP error is caused by duplicated "Message-ID" values, and that patch doesn't de-duplicate them at all. This patch seems to solve loop problems, where the gateway injects messages and the same message later comes back again to the gateway. But here, the root cause is instead that the original unmodified message sent by the user arrives twice at the gateway because of the cross-posting. So adding a new header to what you post will change nothing; you'll still receive the second instance unmodified (I hope I'm clear enough about what's happening).

- The solution jcranmer proposes solves the cross-posting issue, which is the root cause of most of the problem here. But he did not post a patch implementing it. And if at some point you end up with an incomplete list of the mailing lists/corresponding newsgroups in your gateway configuration, those messages will fail hard (not being posted to the newsgroup because of a duplicated ID).
Adding the first patch to his solution would mean that almost every message would be handled perfectly, and the remaining ones would be broken only the way they are now (which wouldn't be a big issue anymore).

I still think my solution above wouldn't be that bad, both basic and quite effective. Or then the solution of always rewriting the Message-ID and using it for both mail and news (option 2 of the initial message of mailman's bug report).

BTW I think Mozilla's infrastructure group would gain a lot from publicly documenting everything it does and uses (here the exact configuration of the patched mailman used). Only the machine's password should be secret. This means outsiders could help with testing every aspect of the setup, and suggesting corrections.
If you think this would endanger the security of Mozilla's infrastructure, then you should realize this means you believe in security by obscurity.
Comment 11 Andrew Sutherland [:asuth] 2012-04-04 13:48:45 PDT
(In reply to Jean-Marc Desperrier from comment #10)
> Which one of the patch are you talking about Andrew ?

I was referring to the NNTP-MsgID-Handling patch, but I had forgotten about the cross-posting issue that you raise.  Thanks for summarizing all the options!

I agree that your proposal sounds best because it is straightforward/minimal without regressing the cross-posting behavior.  I can confirm that Thunderbird's threading logic will work much better with your proposal implemented; both in the threaded view and in the global database.

Presumably mailman 3 will do things the 'right way' as jcranmer proposes and we can then eventually upgrade to that (but with less urgency).

Is there a patch corresponding to your proposal?  If not, are you planning to write one, or should I do so?
Comment 12 Michael Burns [:mburns] 2013-03-31 13:13:30 PDT
I've built an updated mailman rpm (to be rolled out soon) that includes the NNTP-MsgID-Handling patch, but I agree with Jean-Marc that this is not a complete solution. I've reached out to :jcranmer via Launchpad, but haven't heard back yet.

Jean-Marc or Andrew, if you are interested in helping write a patch to implement the fix in comment 7, we are not able to roll that patch out with considerably more ease.
Comment 13 Reed Loden [:reed] (use needinfo?) 2013-03-31 13:22:42 PDT
(In reply to Michael Burns [:mburns] from comment #12)
> I've reached out to :jcranmer via Launchpad, but
> haven't heard back yet.

Or just CC him to this bug. ;)
Comment 14 Michael Burns [:mburns] 2013-03-31 14:03:08 PDT
Thanks reed :)

Correction, we are *now* able to roll that patch out with more ease.

I've looked at a couple of similar patches that attempted to address the threading/message-id munging problem and tried applying them to a modern Mailman codebase, but they have bit-rotted over the years and appear to cause as many problems as they attempt to solve, if not more.


:jcranmer, if you have interest in helping author the patch described in https://bugs.launchpad.net/mailman/+bug/266263/comments/6 or similar, I'd love to get a more elegant solution to this problem pushed into production.
Comment 15 Andrew Sutherland [:asuth] 2013-03-31 19:46:35 PDT
I'm out of the running for a fix for the next few weeks (FxOS Gaia e-mail app still needs a lot of love).

Is there a source control repo you are using to track the state of mozilla's patched branch, either as a proper branch or a stack of patches?  Also, is there a staging server or other mechanism to facilitate testing that would help whoever is writing the patch either test it or maybe need to be less paranoid about things?
Comment 16 Michael Burns [:mburns] 2013-03-31 20:15:10 PDT
There is a staging host for Mailman that we can do trial deployments to.

I've avoided importing the code into the sysadmin SVN repo which is expected to get open sourced at some point (sooner than later, but unsure when), but maybe that is the best place to put it. I'll see if we can get this cleared to go up on Github, along with the documentation requested in comment 10.
Comment 17 Joshua Cranmer [:jcranmer] 2013-03-31 20:49:11 PDT
(In reply to Michael Burns [:mburns] from comment #14)
> :jcranmer, if you have interest in helping author the patch described in
> https://bugs.launchpad.net/mailman/+bug/266263/comments/6 or similar, I'd
> love to get a more elegant solution to this problem pushed into production.

I have an (admittedly untested) patch for mailman 3 that fixes the threading but not the crosspost issue. Fixing the crosspost issue was more difficult since I really don't know my way around the mailman APIs, but the message-ID munging is smarter and fixes threading in a few more cases.
Comment 18 Joshua Cranmer [:jcranmer] 2013-03-31 20:51:30 PDT
Created attachment 731755
Mailman 3 message-ID patch
Comment 19 Jean-Marc Desperrier 2013-04-09 03:34:19 PDT
FWIW Joshua's patch does both :
- The mailman bug 266263 fix : don't rewrite the message-id unless the NNTP server reports an error when posting it 
- The comment #7 fix : When munging the id, insert also the original value in the references to allow threading with error recovery

What it doesn't do is the "ideal" cross-post handling, as described here : https://bugs.launchpad.net/mailman/+bug/266263/comments/6
I'm not sure, but it could mean a risk of message duplication in some cases. But as there's already a lot of that currently, it's certainly already much better.
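The two behaviours listed above can be sketched together. Everything here is hypothetical and stands in for the real pieces: `FakeNewsServer` and `DuplicateArticle` imitate a news server that rejects a Message-ID it has already accepted, and `post_with_fallback` is not the attached patch, only an illustration of "keep the ID unless the server refuses it, and remember the original in References when it has to change".

```python
class DuplicateArticle(Exception):
    """Stand-in for a news server rejecting an already-seen Message-ID."""


class FakeNewsServer:
    """Hypothetical in-memory server; real NNTP posting is not modelled."""

    def __init__(self):
        self.seen = set()
        self.articles = []

    def post(self, group, headers):
        if headers["Message-ID"] in self.seen:
            raise DuplicateArticle(headers["Message-ID"])
        self.seen.add(headers["Message-ID"])
        self.articles.append((group, dict(headers)))


def post_with_fallback(server, group, headers, make_new_id):
    """Post unchanged if possible; munge the ID only after a rejection."""
    try:
        server.post(group, headers)
        return headers
    except DuplicateArticle:
        munged = dict(headers)
        refs = munged.get("References", "").split()
        refs.append(headers["Message-ID"])  # keep the original for threading
        munged["References"] = " ".join(refs)
        munged["Message-ID"] = make_new_id()
        server.post(group, munged)
        return munged


counter = iter(range(1, 100))


def new_id():
    return "<mailman.%d@lists.example.org>" % next(counter)


server = FakeNewsServer()
cross_post = {"Message-ID": "<orig.1@mail.example.com>", "Subject": "hello"}

first = post_with_fallback(server, "example.group.a", cross_post, new_id)
second = post_with_fallback(server, "example.group.b", cross_post, new_id)

print(first["Message-ID"])   # original ID kept for the first group
print(second["Message-ID"])  # fresh ID only for the rejected copy
print(second["References"])  # original ID preserved for threading
```

In this sketch the cross-posted copy still reaches the second group as a separate article rather than a true NNTP cross-post, which matches the limitation noted above.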
Comment 20 Ed Lim [:limed] 2013-08-01 15:54:44 PDT
The dependency for this bug is installing mailman 3, which I am not going to go ahead with until it has reached a stable release. At this time it appears that mailman3 is still in beta; once mailman 3 is stable we can revisit this bug.

For now as stated on comment 12 the NNTP message handling patch is rolled in to the mailman rpm that we have right now[1]. This solves part of the problem but not all, and based on the launchpad bug it looks like there is a fix in mailman3, so we'll just wait for that.

[1]: https://bugs.launchpad.net/mailman/+bug/266263/comments/4
Comment 21 Joshua Cranmer [:jcranmer] 2013-08-01 16:25:02 PDT
(In reply to Ed Lim [:limed] from comment #20)
> For now as stated on comment 12 the NNTP message handling patch is rolled in
> to the mailman rpm that we have right now[1]. This solves part of the
> problem but not all, and based on the launchpad bug it looks like there is a
> fix in mailman3, so we'll just wait for that.

Having the source code to mailman3 in front of me, it is not fixed. The patch was posted, but no action has been taken on it.
Comment 22 Andrew Sutherland [:asuth] 2013-08-01 16:35:45 PDT
The good news is that the patch that we're currently using drastically improves the threading situation for people subscribed via the mailing list side of the house.  All threading issues should now be gone!  Hooray!

(Breakage should be instead confined to messages sent by people cross-posting to multiple mailing lists.  Theoretically the messages will only make it to one of the newsgroups mirroring the mailing list.  Assuming the newsgroup server is actually a hard-ass about the message-id being globally unique rather than unique to the list.)
Comment 23 Boris Zbarsky [:bz] (Out June 25-July 6) 2013-08-02 22:05:12 PDT
> Theoretically the messages will only make it to one of the newsgroups mirroring the
> mailing list. 

It's not theoretical, and it's totally screwing up communications.  See bug 901188.
