Closed
Bug 891906
Opened 11 years ago
Closed 11 years ago
mailing-list / newsgroup mirroring is broken
Categories
(Infrastructure & Operations :: Infrastructure: Mail, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: dbaron, Assigned: justdave)
References
Details
(Whiteboard: affected lists listed in comment 12)
https://groups.google.com/forum/#!topic/mozilla.dev.platform/UCio5fB4VJo was posted to dev.platform nearly 24 hours ago but has not appeared to subscribers who read dev-platform as a mailing list. This is a blocker for communication within the project -- it prevents people from sending and receiving information and knowing who has received information, and needs to be fixed immediately (or at the very least announced as an unexpected outage to all@, etc.)
Updated•11 years ago
Assignee: server-ops → infra
Severity: blocker → normal
Component: Server Operations → Infrastructure: Mail
Product: mozilla.org → Infrastructure & Operations
QA Contact: shyam → limed
Comment 1•11 years ago
David, thanks for bringing this to our notice. We'll take a look and let you know what we find out.
Comment 3•11 years ago
This appears to be specific to dev-platform: I continue to receive email from dev-planning and a few other lists correctly. As moderator of dev-platform I have checked the mailman subscription info for myself and several others who are not receiving mail, and everyone is subscribed correctly.
Assignee
Comment 4•11 years ago
That link goes to an entire thread. Was there a specific message that didn't go through, or was it all messages in that thread?
Reporter
Comment 5•11 years ago
All messages in that thread.
Comment 6•11 years ago
I think it's all messages to dev-platform since 9-July.
Assignee
Comment 7•11 years ago
OK, I've determined that the news gateway process is skipping this mailing list when checking for new newsgroup messages for some reason. I haven't yet found any configuration differences between it and any of the other lists to determine why, and it's not logging any errors (it's just not doing it to begin with). I'm continuing to play with it...
Assignee: infra → justdave
Assignee
Comment 8•11 years ago
fwiw, the fact that it's outright skipping the group when checking for messages gives me high hopes that the missing messages will all go through at once when we get it fixed...
Assignee
Comment 9•11 years ago
OK, this appears to be fixed now. The root cause sickens me. :(
The mozilla.community.hungary group has corrupted pointers on Giganews' servers and, as best as I can tell, has had them since May 4th, 2013 (because that's when this appears to have broken). Giganews is claiming there are 2.15 billion new messages in that newsgroup. Mailman was running out of memory trying to create a data structure to grab the headers for that many messages, which caused it to crash and to fail to sync any newsgroups that came after it in the run order of the news gateway script.
There are a *LOT* of incoming messages from the news side getting pulled in and re-sent to the mailing lists right now. The script is still running. I'll post back here with a complete list of the affected mailing lists as soon as it's done (there were more than just this one).
The only way we would have caught this is by monitoring mailman's crash logs. This has typically been an unpalatable thing to monitor, because it crashes a lot, and 99% of the crashes are completely innocuous things that we wouldn't actually care about; alerting on them would only cause us to start ignoring the alerts anyway.
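The failure mode described above (one bad group crashing the run and starving every group after it) can be illustrated with a minimal sketch. This is not Mailman's gateway code: `sync_all`, `backlog_of`, `sync_group`, and the `MAX_BACKLOG` limit are all hypothetical names and values chosen for illustration. It assumes the gateway can ask the server how many messages are pending for a group before fetching any headers.

```python
from typing import Callable, Dict, List, Tuple

# Assumed sanity limit for illustration only; not a value from this bug.
MAX_BACKLOG = 100_000


def sync_all(
    groups: List[str],
    backlog_of: Callable[[str], int],
    sync_group: Callable[[str], None],
) -> Tuple[List[str], Dict[str, str]]:
    """Sync every group in run order, recording failures instead of aborting."""
    synced: List[str] = []
    failed: Dict[str, str] = {}
    for group in groups:
        try:
            pending = backlog_of(group)
            if pending > MAX_BACKLOG:
                # Refuse to build a header list for an implausible backlog.
                raise ValueError(f"implausible backlog: {pending} messages")
            sync_group(group)
            synced.append(group)
        except Exception as exc:
            # One bad group must not block the groups that come after it.
            failed[group] = str(exc)
    return synced, failed
```

With this shape, a group reporting 2.15 billion pending messages would land in `failed` while later groups in the run order still get synced.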
Assignee
Comment 10•11 years ago
FWIW, this was fixed by telling mailman to perform a one-time mass catchup on community-hungary, telling it to ignore those 2.15 billion pending new messages in that group.
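As a rough illustration of what a catchup does, here is a minimal sketch. It assumes the gateway keeps a per-group high-water mark of the last article number it has processed and compares it with the last article number the news server reports. `GroupState`, `pending`, and `catch_up` are hypothetical names, not Mailman's actual API.

```python
from dataclasses import dataclass


@dataclass
class GroupState:
    watermark: int  # highest article number already gatewayed (assumed model)


def pending(state: GroupState, server_last: int) -> int:
    """Number of articles the gateway still believes it has to fetch."""
    return max(0, server_last - state.watermark)


def catch_up(state: GroupState, server_last: int) -> int:
    """Mark everything up to server_last as seen; return how many were skipped."""
    skipped = pending(state, server_last)
    state.watermark = server_last
    return skipped
```

The trade-off is that any real messages inside the skipped range are never delivered to the list, so a catchup only makes sense when the backlog is known to be bogus.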
Assignee
Comment 11•11 years ago
It's still running (a lot of catching up to do). In the meantime we are brainstorming on IRC about ways to detect if this starts failing again in the future.
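One possible shape for such a check, sketched purely as an illustration (the thread does not say which approach was chosen): rather than alerting on crash logs, record the time of the last successful sync for each group and flag any group that has gone too long without one. `stale_groups` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def stale_groups(
    last_success: Dict[str, datetime],
    now: datetime,
    max_age: timedelta,
) -> List[str]:
    """Return the groups whose last successful sync is older than max_age."""
    return sorted(
        group for group, ts in last_success.items() if now - ts > max_age
    )
```

A check like this is quiet while the gateway is healthy and does not depend on the crash being logged at all, which sidesteps the noisy-crash-log problem mentioned in comment 9.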
Assignee
Comment 12•11 years ago
OK, it's done. Of the 214 total mailing lists we have gatewayed to news.mozilla.org, the following 74 lists were affected by this issue:
bugmasters
community-games
community-india
community-ireland
community-mexico
community-switzerland
community-tunisia
community-turkey
dev-apps-bugzilla
dev-apps-calendar
dev-apps-chatzilla
dev-apps-firefox
dev-apps-seamonkey
dev-apps-thunderbird
dev-b2g
dev-builds
dev-gaia
dev-identity
dev-js-sourcemap
dev-l10n
dev-l10n-de
dev-l10n-fa
dev-l10n-in
dev-l10n-new-locales
dev-l10n-pt-br
dev-l10n-sr
dev-l10n-ta
dev-l10n-vi
dev-l10n-web
dev-mdc
dev-mdc-es
dev-mdn
dev-mozilla-org
dev-pdf-js
dev-platform
dev-popcorn
dev-ports-os2
dev-privacy
dev-security-policy
dev-shumway
dev-static-analysis
dev-tech-crypto
dev-tech-dom
dev-tech-js-engine
dev-tech-js-engine-internals
dev-tech-layout
dev-tech-plugins
dev-tech-svg
dev-tech-xml
dev-tech-xpcom
dev-tech-xul
dev-tree-management
dev-webapi
dev-webapps
dev-webdev
general
governance
mozillians
privacy
reps-general
reps-mentors
reps-webdev
support-bugzilla
support-other
support-seamonkey
support-thunderbird
support-webtools
test
tools
tools-l10n
webapps
webmaker
webmaker-canada-bc
wishlist
The missed messages have all now been downloaded from the news server, and are spooling out to the mailing lists now.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Assignee
Updated•11 years ago
Whiteboard: affected lists listed in comment 12
Assignee
Comment 13•11 years ago
bug 892051 has been filed to track our progress on coming up with a way to monitor for this in the future.
Reporter
Comment 14•11 years ago
Could you send an unexpected downtime notice explaining this, so that people understand what happened? It's important both for the folks on the mailing list side receiving a flood of messages, and for the folks on the newsgroup side who need to understand that everything they've posted to these lists for the past few months has only been read by a part of the expected audience.
Assignee
Comment 16•11 years ago
I filed a ticket with Giganews this afternoon about the mozilla.community.hungary newsgroup pointers. They replied that they were unable to resolve the issue without deleting and re-creating the newsgroup from scratch. As best as I could tell prior to them doing so (from using an NNTP reader client), there was only one real message on that newsgroup anyway (and the pointer issue may have been preventing people from using it).
Comment 18•11 years ago
shyam/justdave: this issue was reported in bug 877134 on 29th May. dbaron reported it again 20 hours ago, and it is now fixed. For future reference, what special magic did he apply to get such prompt and excellent service, that all the people CCed on bug 877134 could use next time they have a discussion forum problem? :-)
Gerv
Reporter
Comment 19•11 years ago
While I'm not them, I'd note two factors:
(1) using a bug summary that was a reasonably accurate description of the actual problem. Also see http://dbaron.org/log/20100426-bug-summary .
(2) mentioning that the problem was an issue related to newsgroup -> mailing list mirroring rather than a purely mailing list issue (which was never explicitly mentioned in bug 877134, although I'd think it should have been considered)