Closed Bug 388919 Opened 17 years ago Closed 12 years ago

Thunderbird 2 corrupts feed data

Categories

(MailNews Core :: Feed Reader, defect)

1.8 Branch
x86
Windows XP
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ehsan.akhgari, Unassigned)

References

Details

Attachments

(2 files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.5) Gecko/20070713 Firefox/2.0.0.5 XpcomViewer/0.9 Creative ZENcast v1.02.10
Build Identifier: Thunderbird 2.0.0.5

From time to time, Thunderbird 2 corrupts the feeds.rdf file contained in the profile directory, and loses some of my feeds.  I'd have to restore the file manually each time.  I will attach a sample of the correct and corrupted feeds.rdf file to this bug.  This has happened in Thunderbird 2.0.0.0, 2.0.0.4, and 2.0.0.5.

I have not been able to tie this event to anything that I do.  I usually leave Thunderbird open, and after some time I realize that I am not receiving any feeds in the Mozilla directory (which usually gets a new feed at least once an hour).  Then I inspect feeds.rdf and see that it's been corrupted.

This does not happen every time I run Thunderbird, but it keeps recurring.

Reproducible: Sometimes

Actual Results:  
The feeds.rdf file gets corrupted.

Expected Results:  
The feeds.rdf file must not get corrupted.
Should it block Thunderbird 2.0.0.6?
Flags: blocking1.8.1.6?
Version: unspecified → 2.0
Attached file Correct feeds.rdf
The correct feeds.rdf file (before corruption).
Attached file Corrupted feeds.rdf
The feeds.rdf file after corruption.
We need some reproducibility. Still haven't corrupted them for me. 
Blocks: 389369
We were hoping the fix for bug 375102 would solve this. Are you sure it has happened in 2.0.0.5 or 2.0.0.6?

Since we don't know why or when this happens it's not realistic to say we'd stop ship ("block") on this one for a branch release. If we can figure it out I'm sure we'd happily approve a patch, though.
Flags: wanted1.8.1.x+
Flags: blocking1.8.1.7?
Flags: blocking1.8.1.7-
perhaps related to bug 390351?
(In reply to comment #5)
> We were hoping the fix for bug 375102 would solve this, are you sure it has 
> happened in 2.0.0.5 or 2.0.0.6?

Yeah, it happens about once or twice a week for me in both 2.0.0.5 and 2.0.0.6...

I'm happy to help debug this issue, I just need pointers as to where I should be looking.
(In reply to comment #6)
> perhaps related to bug 390351?
> 

Hmmm, the description of bug 390351 is not too well-written, but if I'm reading it right, these are different issues.  In the case of this bug, some of the feed subscriptions are deleted, not the whole RSS account.
Ever since I upgraded from TBird 2.0 to 2.0.0.6, I can't read or subscribe to new feeds.  I can still see the entries that I saved on my computer, but I can't get new posts.  I also can't export my feeds, as noted in bug 395525.  I checked my old feed.rdf file, and it's blank.  The only way I can read new feeds is by creating a new Blog & News account and manually typing in every RSS address.  This works for some feeds, but in one case, Thunderbird said, "You already have a subscription to this feed."  I can't see that feed anywhere.  I tried deleting that folder and subscribing again.  No luck.  Ugh!  Now I can't even re-subscribe to my feeds.
I'm having similar trouble here, with TB 2.0.0.6 on WinXPMCE. Clicking to get messages gets me a few spins of the green thermometer, but then nothing happens... and I *know* the feeds have things to download. When this first started, I got a status message telling me that one particular feed wasn't valid. Thinking that the invalid feed was somehow stopping the whole works (why, I don't know), I deleted the feed and tried again. And nothing happened. 

I too have experienced this problem - except my entire feeds.rdf and feeditems.rdf were completely empty. Prior to finding this, I thought my feeds were taking too much space (the news/blogs folder was 99.8MB in size). I thought this was pretty convenient - but clearing out the space didn't solve anything.

I then noticed these files were empty. I deleted them and then thunderbird worked again. I am using 2.0.0.6.
(In reply to comment #13)
> I too have experienced this problem - except my entire feeds.rdf and
> feeditems.rdf were completely empty. [...]
> 
> I then noticed these files were empty. I deleted them and then thunderbird
> worked again. I am using 2.0.0.6.
> 
For me, both files were not empty, but also caused this bug. Deleting them solved the problem for me as well.

While I had this bug, I was not able to add any feeds to the corrupted folders. Thunderbird started checking the newly added feeds and never finished, although the process was not hanging. Updating to Thunderbird 3 also fixed that problem for me, but Thunderbird 3 seems to download all feed items every time it updates a feed, even if I have already read them.

I also want to say that I have three rss accounts and they all broke at once, although they have individual feeds.rdf and feeditems.rdf files.
(In reply to comment #4)
> We need some reproducibility. Still haven't corrupted them for me. 

I've managed to get my account data corrupted on two separate occasions. The weird thing was that the first time, what happened to Jan-Niklas in comment 14 didn't happen to me: of my six accounts, only three of them got corrupted. The second time all of them got corrupted.

I first noticed the first corruption when it seemed that TB was forgetting that I preferred to see only the summaries of some of my feeds. On closer inspection, editing subscription failed:
Error: currentFolder has no properties
Source File: chrome://messenger-newsblog/content/feed-subscriptions.js
Line: 621

There were a lot more problems in trying to fix via the subscription window and judicious use of rm that eventually resulted in me nuking the account and starting over again, happy that it didn't happen to my mega blog account.

I'm not sure that all of these corruptions are the same bugs though, as the circumstances seem different. The first corruption I got, I could still get new messages, but I couldn't the second time.

In any case, I still have four borked RSS accounts that I can/will freely give to anyone who wants to diagnose more.
In the meantime, I also had the problem that only a few feeds were broken.

I also noticed that TB does not show only the summaries for some feeds (although summaries are provided in the RSS feeds). I actually have 3 RSS accounts. Since I now know that not all feeds of my TB got corrupted at once, does TB corrupt all feeds of an account (for me, the "only show summaries" feature is non-functional for only some feeds of an account)?
I've done some self-diagnosis. Since I think the problem is related to multiple accounts, I'll briefly explain my setup. I have 17 accounts, numbered in my accounts pref as 1-4, 6-11, 5, 13, 14, 17, 15, 16, 18. In order, these are Local folders, POP-1, IMAP-1, RSS-4, RSS-1, NNTP-1, NNTP-2, RSS-2, RSS-3, RSS-0, IMAP-2, RSS-5, NNTP-3, NNTP-4, IMAP-3, NNTP-5, and NNTP-6 (RSS-* are numbered based on the News & Blogs-* folders they are stored in; everything else is just for identification purposes).

At this point, I manually recreated 1 account and restored another by deletion of feeds.rdf and feeditems.rdf (RSS-4 and RSS-1, resp.). In preparing for this bug, I deleted the feeditems.rdf from RSS-5. It didn't appear to cause the server to uncorrupt itself, so I checked the error console. There are two prominent errors:

Error: syntax error
Source File: file:///C/Users/jcranmer/AppData/Roaming/Thunderbird/Profiles/201w0zr4.default/Mail/News%20&%20Blogs-3/feeditems.rdf
Line: 1, Column: 1
Source Code:
9652990)>[-1B8:m(^9C=a7d)(^8F=a7d)(^91^36CA)(^92=2)^
Error: [Exception... "Component returned failure code: 0x8000ffff (NS_ERROR_UNEXPECTED) [nsIRDFService.GetDataSourceBlocking]"  nsresult: "0x8000ffff (NS_ERROR_UNEXPECTED)"  location: "JS frame :: chrome://messenger-newsblog/content/utils.js :: getItemsDS :: line 219"  data: no]
Source File: chrome://messenger-newsblog/content/utils.js
Line: 219

RSS-3's feeditems was corrupted. Later on, there was a warning about RSS-0's feeditems being ill-formed. Noticing that RSS-3's error looks suspiciously like a mork file, I took a look inside. The file appears to consist of portions of several files, concatenated:

First was a portion of the .msf for the first folder of NNTP-1 (259 lines; it appears to start at what would be the 87th line of the actual msf). Then there are 2571 null characters, so that the first malformed fragment is in total 12288 or 0x3000 characters.

The next hunk was a portion of a feeditems.rdf for RSS-1 (NOT RSS-3). There were 58 complete lines of this file (the end, to be specific), followed by another hunk of null characters, 1424 this time. The total to the end of this fragment is a suspicious 16384.

The final hunk was a portion of a feeditems.rdf for RSS-5. This had 280 lines, but the feeditems.rdf of RSS-5 at the time appears to be hopelessly lost.

I went to check for feeds.rdf for RSS-3, and it appears that this one is corrupted. Most of the feeds actually appear to be feeds for RSS-5.

I do not know if RSS-1's or RSS-4's files were ever corrupted (but neither updated, and I recall (I can't be entirely sure of this) that they "forgot" [1] their preferences telling them that I wanted to see the summary, not the web page). I am guessing that one or both of them were corrupted.

RSS-5's feeds appear to be intact, although the feeditems seem to have been overwritten. RSS-0's feeds also appear to be intact, but its feeditems is a concatenation of what appears to be the end of RSS-5's feeds.rdf and the middle of my panacea.dat file. The feeds.rdf hunk to the end of null characters was 28672 characters, or 0x7000 for those of you who like hexadecimal.

Summaries of findings from the files:
* The feeditems.rdf are overwritten with a mixture of files. These files are RDF files related to distinct accounts, an msf file, and the panacea.dat cache. The lengths of all of the hunks excluding the last one are all multiples of 1024. The last hunks' lengths are not interesting in hexadecimal notation.
* Two of the RDF files whose contents were mixed into the corrupt files were malformed, the third one was destroyed in an attempt to restore normalcy to my accounts. It is possible, but unverifiable, that the other two files included were destroyed as part of normal recovery operation. The first group commit may have happened after this event (I can narrow down the time frame to Jan. 9, but a few messages were posted that day, so I do not know if this would have happened after the corruption or before), so it is possible that the msf rebuilt itself as a result of corruption.
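For anyone who wants to check their own files for the same pattern, here is a minimal sketch of the scan I did by hand. It looks for long runs of null bytes and reports whether each run ends on a block boundary. The function names, the 16-byte run threshold, and the default block size are my own choices for illustration; none of this comes from Thunderbird.

```python
import re

# Runs of 16 or more NUL bytes; the threshold is an arbitrary choice
# meant to skip stray single NULs.
NUL_RUN = re.compile(rb"\x00{16,}")


def find_nul_padded_hunks(data: bytes, block_size: int = 1024):
    """Return (start, end, aligned) for each long run of NUL bytes.

    `start` is the offset where the padding begins, `end` is the offset
    just past it, and `aligned` is True when `end` is a multiple of
    `block_size`.
    """
    hunks = []
    for match in NUL_RUN.finditer(data):
        end = match.end()
        hunks.append((match.start(), end, end % block_size == 0))
    return hunks


def scan_file(path: str, block_size: int = 1024):
    """Read a file as raw bytes and scan it for NUL-padded hunks."""
    with open(path, "rb") as fh:
        return find_nul_padded_hunks(fh.read(), block_size)
```

Run against a copy of a suspect feeditems.rdf, padding that ends at offsets such as 12288 or 16384 would show up with `aligned` set to True. Those two offsets and the 28672 mentioned above are also multiples of 4096, so passing `block_size=4096` is another check worth trying.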

There is another problem that just crossed my mind as I was trying to work out exactly which operating system I was using (I share data betwixt a Windows and Linux dual-boot system). At some point around the early weekend, I somehow came to the realization that my NTFS partition was having problems, specifically on a feeds.rdf and a feeditems.rdf for one of my accounts (I *want* to say RSS-3, the one that was corrupted, but I can't say for sure). I therefore rebooted to Windows to have it run its chkdsk to clean up the problem. The chkdsk run may be why files for RSS-3 are so messed up.

It is therefore quite likely that this problem could be entirely a problem with an NTFS driver, but, even if that is what is causing the issue, it would be advisable to have TB attempt to reconstruct a corrupted file or, at the very least, have a UI-accessible way to reconstruct corrupted files à la Rebuild Index.
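As a rough illustration of the detection half of that suggestion, here is a standalone sketch (not Thunderbird code) that checks whether an RDF file still parses as XML and moves it aside if it does not. The function names and the `.corrupt` suffix are hypothetical. It only tests XML well-formedness, so a file that parses but holds another account's feeds would pass unnoticed.

```python
import os
import shutil
import xml.etree.ElementTree as ET


def is_well_formed_xml(path):
    """Return True if the file at `path` parses as XML."""
    try:
        ET.parse(path)
    except (ET.ParseError, OSError):
        # Empty files, Mork fragments, and NUL padding all end up here.
        return False
    return True


def quarantine_if_corrupt(path, suffix=".corrupt"):
    """Rename `path` out of the way if it is not well-formed XML.

    Returns the new path, or None when the file is missing or parses
    cleanly. An existing quarantine file is never overwritten.
    """
    if not os.path.exists(path) or is_well_formed_xml(path):
        return None
    target = path + suffix
    counter = 1
    while os.path.exists(target):
        target = f"{path}{suffix}.{counter}"
        counter += 1
    shutil.move(path, target)
    return target
```

I would only run something like this on the files while Thunderbird is closed, and keep the quarantined copies around for diagnosis.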

At this stage, I have made a backup of the [MailD] directory so that all of these borked files still exist.
Final report:
Removing the two malfunctioning feeditems.rdf files and the suspected-incorrect feeds.rdf and feeditems.rdf files seems to have fixed the issues. On the other hand, I have a lot of duplicates to go wading through...
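For others trying the same workaround, here is a minimal sketch that moves the two files into a timestamped backup folder instead of deleting them outright. The function name, the `feed-backup-` folder prefix, and the parameters are hypothetical, and I am assuming it is run on an account directory while Thunderbird is closed.

```python
import os
import shutil
import time

# The two per-account files discussed in this bug.
FEED_FILES = ("feeds.rdf", "feeditems.rdf")


def move_feed_files_aside(account_dir, backup_root=None, stamp=None):
    """Move feeds.rdf and feeditems.rdf into a backup directory.

    Returns (backup_dir, moved), where `moved` lists the file names
    that were actually present and moved.
    """
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    backup_dir = os.path.join(backup_root or account_dir, "feed-backup-" + stamp)
    moved = []
    for name in FEED_FILES:
        src = os.path.join(account_dir, name)
        if os.path.isfile(src):
            # Create the backup directory only when there is something to move.
            os.makedirs(backup_dir, exist_ok=True)
            shutil.move(src, os.path.join(backup_dir, name))
            moved.append(name)
    return backup_dir, moved
```

Keeping the old files means a corrupted pair can still be attached to a bug later. As noted above, expect duplicates once the feeds are fetched again.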
(In reply to comment #16)
> I also noticed that TB doesn't simply show only the summaries for some feeds
> (although they are provided in the RSS feeds). I actually have 3 RSS accounts.
> Since I now know that it were not all feeds of my TB that got corrupted at
> once, does TB corrupt all feeds of an account (for me, the "only show
> summaries" feature is only non-functional for some feeds of an account)?

It appears that corruption is localized to one or a few accounts, but some aspects of corruption can appear to spread. The way that RSS accounts are processed means that one account failing to load due to corruption can cause all subsequent accounts to fail to get messages.
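To illustrate the failure mode in general terms (this is not the actual newsblog code, and `load` is a stand-in for whatever opens an account's feed data), compare a loop that stops at the first error with one that isolates each account:

```python
def refresh_all(accounts, load):
    """Naive loop: the first exception aborts every later account."""
    return [load(account) for account in accounts]


def refresh_all_isolated(accounts, load):
    """Load each account independently and collect the failures.

    Returns (loaded, failed), where `failed` holds (account, exception)
    pairs for accounts whose data could not be loaded.
    """
    loaded, failed = [], []
    for account in accounts:
        try:
            loaded.append(load(account))
        except Exception as exc:  # one corrupt account must not block the rest
            failed.append((account, exc))
    return loaded, failed
```

With the second form, a corrupt feeditems.rdf in one account would be reported while the accounts after it still get their messages.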
I can't explain what causes the corruption, but my current working thesis is that the RDF datasource is messing something up along the line. I've seen a mild form of corruption a few times, but it's difficult to reproduce, so it'll be a while before I can track down whether or not RDF is the culprit.

In the meantime, it should be possible for someone to hack something up to rescue feeds.rdf if something fails.
Just happened to me as well.
Two things I can recall having done before the corruption of the feeditems file occurred:

- upgrading thunderbird to Version 2.0.0.21
- exporting my feeds
Component: RSS → Feed Reader
Product: Thunderbird → MailNews Core
Version: 2.0 → 1.8 Branch
(In reply to comment #21)
> I can't explain what causes the corruption, but my current working thesis is
> that the RDF datasource is messing something up along the line. I've seen a
> mild form of corruption a few times, but it's difficult to reproduce, so it'll
> be a while before I can track down whether or not RDF is the culprit.

Joshua, any promising ideas for fixing this?
bug 390351 is a duplicate?
Depends on: 450543
to be brief, there were a number of paths and conditions to what has generally become the 'feed corruption' problem.  i believe they are all fixed by bug 705504, bug 709247, and bug 711173.

in addition, bug 716706 does a bit of work with the Subscribe dialog, and along the way fixes the issue of partial feeds.rdf entries (seed entries left lying around after an unsuccessful subscribe, due to invalid feed or error on valid feed).
Depends on: 711173
No longer depends on: 450543
fixed by bug 711173/bug 705504.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED