Closed Bug 1259040 Opened 9 years ago Closed 6 years ago

For maildir format, pick a more stable name for mail files

Categories

(MailNews Core :: Database, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 66.0

People

(Reporter: infinity0, Assigned: benc)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [maildir])

Attachments

(2 files, 7 obsolete files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:44.0) Gecko/20100101 Firefox/44.0 Iceweasel/44.0.2 Build ID: 20160214092551 Steps to reproduce: Look at your mail files under cur/*, and notice that they are named according to [*] the time that they were retrieved from the IMAP server. Then, run "File -.> Compact Folders" or "Folder Properties -> Repair Folder". Actual results: All mails are re-downloaded, and the mails are stored under new file names. This makes it very hard for incremental/copy-on-write backup schemes to save space. Expected results: Since one of the motivation factors of maildir is to save space in backups, it would be better to pick a more stable name for these files. For example, the SHA256 hash of the contents (below the first line that unfortunately includes [*] above). If I run $ for i in *; do tail -n+2 $i | sha256sum; done part-way through a compact/repair operation, or after several renames and re-renames (see bug 1259035), then I can see that some of these files are duplicate mails. The disadvantage of this hash-based method is that you can only do it *after* the mail is actually downloaded. I'm not an expert at IMAP but perhaps there is something equivalent that can be done *before* the mail is downloaded? This would allow Compact/Repair to be much more efficient too, since one can see what mails one already has.
Whiteboard: [maildir]
While we're here, we should consider bug 1144478.
See Also: → 1144478
Message-ID is supposed to be unique, but of course those can be missing, and you could have duplicates due to duplicates of the same message being in the same folder. Still, I think that would be pretty sweet. For duplicates we could just add #N to their names.
Status: UNCONFIRMED → NEW
Ever confirmed: true
And giving the files an extension (bug 1144478) would allow Windows to index them.
Yes we'd want to do that at the same go.
(In reply to Magnus Melin from comment #2) > Message-ID is supposed to be unique, but of course those can be missing, and > you could have duplicates due to duplicates of the same message being in the > same folder. The Message-ID is anything but reliable. For example if you detach an attachment, you have already two different emails in the folder. One with the /deleted flag and attachment. And one without those.
(In reply to infinity0 from comment #0) > $ for i in *; do tail -n+2 $i | sha256sum; done That is not enough. Copy the same email twice from folder A to folder B. Then both copies are usually identical and would have the same hash. > *after* the mail is actually downloaded. I'm not an expert at IMAP but > perhaps there is something equivalent that can be done *before* the mail is > downloaded? This would allow Compact/Repair to be much more efficient too, > since one can see what mails one already has. For IMAP, an email is identified by the combination of UID and UIDVALIDITY. UIDVALIDITY is unique for every folder in the mailbox. The UID is a sequential number for each e-mail stored in the folder. <https://tools.ietf.org/html/rfc3501#section-2.3.1.1>
Yes not totally reliably. But it doesn't have to be. Since we have pop, imap, news, feeds and some more, message-id is the only common thing that makes a mesagge somewhat unique. The hashing suggestion would be nice in a way, but there are too many cases where you only have part of the message for that to give real advantages.
Here's a first pass at using the Message-ID as a base for maildir filenames. At the works-for-me-YMMV stage. A couple of corner-cases handle (eg >10000 messages in a folder with the same Message-ID :-), and some rough edges round off. The main missing chunk is fixing up the mbox->maildir conversion. But the main thing is that it means the maildir filenames should be a lot more stable than the old timestamp-named ones.
Assignee: nobody → benc
What about a file extension, comment #3, bug 1144478?
(In reply to Jorg K (GMT+1) from comment #9) > What about a file extension, comment #3, bug 1144478? Oh, yes - forgot to mention, this patch adds the ".eml" extension to the name too. I guess that means it should probably also cull out the .mozmsgs support too - assuming the windows search proves to Just Work(tm) with the .eml files...
OK, changes new to this patch: Tidies up the unique-filename generation (and as a side effect, it should now cope with far-fetched-but-nasty corner-case where you've got >10000 messages in a folder all with the same Message-ID!). mbox->maildir conversion now outputs message files with the ".eml" extension. maildir->mbox conversion can now handle non-integer filenames. Caveat: The mbox->maildir conversion still generates timestamp-based filenames (albeit now with an ".eml" extension). It'd be possible to get it to generate messageID-based filenames, but I didn't think it was worth the risk for this bug. It'd mean big changes in the already-complicated conversion code. I'm happy to look at it, but it can be done separately, and I think it'd be better to land this one first.
Attachment #9027103 - Attachment is obsolete: true
Attachment #9027796 - Flags: review?(jorgk)
Quick braindump on search-integration implications while I'm thinking about it: The extra .mozmsgs dir is still required for mbox, but not for maildir (maybe). There might be some GUI implications here (eg ghost out the search-integration option if the mailstore doesn't need it). I guess the plugable mailstore interface needs to support an attribute the search-integration can query to decide if it needs to maintain .mozmsgs or not. Windows: Will windows automatically search the mail folders or do we have to register the location via some API? Mac: the profile directory is not searched by default. OSX expects searchable data to be in a different location (outside the TB profile). Maybe there's a way to add the mail folders under the TB profile to the search list? .eml files are not understood/indexed by default by OSX search. But we have an extension which adds support (I think). OSX does know about .xeml files (which is what mail.app uses, I think). In theory we could use xeml files on OSX instead of eml... (not wild about this idea though).
Comment on attachment 9027796 [details] [diff] [review] use_msgid_for_maildir_files_2.patch Review of attachment 9027796 [details] [diff] [review]: ----------------------------------------------------------------- Sorry about the delay which is inverse proportional to my affinity towards a certain area of code. That said, the review was a whole lot easier than I thought. I'll test this now, planning to convert mbox to maildir and then do a few operations on the result. Maybe Magnus wants to see it was well. ::: mailnews/local/src/nsMsgMaildirStore.cpp @@ +47,5 @@ > +{ > + const char* badChars = "/\\?%*:|\"<> "; > + const char* end = in.EndReading(); > + const char* cur; > + out.SetCapacity(in.Length()); But that can grow beyond that length if more space is required, right? @@ +760,5 @@ > NS_ERROR("FinishNewMessage - oops! file does not exist!"); > return NS_ERROR_FILE_TARGET_DOES_NOT_EXIST; > } > > + // By now we'll have the Message-ID, which we'll base the final filename on Nit. Full stop. @@ +777,5 @@ > + nsCString toName(baseName); > + toName.Append(".eml"); > + toPath->AppendNative(toName); > + > + // Yes, you were trying to say? ;-) @@ +801,5 @@ > + } > + > + rv = fromPath->MoveToNative(curPath, toName); > + NS_ENSURE_SUCCESS(rv, rv); > + // Update the db to reflect the final filename Full stop.
Attachment #9027796 - Flags: review?(mkmelin+mozilla)
Attachment #9027796 - Flags: review?(jorgk)
Attachment #9027796 - Flags: review+
(In reply to Jorg K (GMT+1) from comment #14) > I'll test this now, planning to convert mbox to maildir and then do > a few operations on the result. Thanks Jorg! I keep thinking I've maybe overlooked some obscure aspect of this stuff, so the more scrutiny this gets, the better. > ::: mailnews/local/src/nsMsgMaildirStore.cpp > @@ +47,5 @@ > > +{ > > + const char* badChars = "/\\?%*:|\"<> "; > > + const char* end = in.EndReading(); > > + const char* cur; > > + out.SetCapacity(in.Length()); > > But that can grow beyond that length if more space is required, right? Yes, exactly. I just added a comment to try and clarify that point.
Attachment #9027796 - Attachment is obsolete: true
Attachment #9027796 - Flags: review?(mkmelin+mozilla)
Attachment #9029335 - Flags: review?(mkmelin+mozilla)
Comment on attachment 9029335 [details] [diff] [review] use_msgid_for_maildir_files_3.patch I've got side-tracked into more release issues, so I'll test this tomorrow.
Attachment #9029335 - Flags: review+
I finally go around to testing this. I started with "Local Folders" as maildir. Adding new message gave the expected names, or something based on time when there wasn't a message ID. Next I converted to mbox and got "Local Folders-mbox". I converted back and got "Local Folders-maildir". All the converted messages had "date names" since I suppose we don't parse the message upon conversion. Hmm. Converting back to mbox worked, I assumed it overwrote the files in "Local Folders-mbox", then converting back to maildir thoroughly hung. I assume it tried to gain access to the previous "Local Folders-maildir" folder. Cancelling hung conversion and removing that folder made the mbox->maildir conversion work again. We'll have to fix that when we do a test for the round trip. All files have .eml extension and seem to live happily next to their older sisters and brothers without the extension. Importing a message twice adds -1 to the message ID. All good. But ... I couldn't believe my eyes when importing messages without message ID. I got filenames like: md5%3aCvtU6HwNSk7kAaT5Uk4lWA==.eml No kidding. Yes, I can read the code: baseName.AppendInt(static_cast<int64_t>(PR_Now())); I added some debug and saw that GetMessageId() was returning md5:vLmOfRoTaHLjm/YuToiqVA==. Nice, huh? Comes from somewhere here: https://searchfox.org/comm-central/search?q=md5%3A&case=false&regexp=false&path= When I say import, I mean dragging a message onto a folder. Maybe we want to add StringBeginsWith(..., "md5:").
(In reply to Jorg K (GMT+1) from comment #17) > I finally go around to testing this. I started with "Local Folders" as > maildir. Adding new message gave the expected names, or something based on > time when there wasn't a message ID. Next I converted to mbox and got "Local > Folders-mbox". I converted back and got "Local Folders-maildir". All the > converted messages had "date names" since I suppose we don't parse the > message upon conversion. Hmm. Yes. The converter would have to parse at least some of the message to grab the message-id, then implement the same filename-encoding rules I implemented in this patch. Actually probably not tooooo hard, but I figured there was enough churn going on that it could be shunted into it's own bug. (philosophically I don't like how the converter is replicating all the mbox and maildir code anyway - I'd like to think of a way to factor it all out into a single place - but that's a separate issue). > Converting back to mbox worked, I assumed it > overwrote the files in "Local Folders-mbox", then converting back to maildir > thoroughly hung. I assume it tried to gain access to the previous "Local > Folders-maildir" folder. Cancelling hung conversion and removing that folder > made the mbox->maildir conversion work again. We'll have to fix that when we > do a test for the round trip. I'm just wrapping up a big refactor of the whole conversion process (in bug 1491228) which I _think_ might fix this hanging - there were some knotty issues with the way the completion of the conversion was indicated to the main thread. So definitely a good test. > All files have .eml extension and seem to live happily next to their older > sisters and brothers without the extension. > > Importing a message twice adds -1 to the message ID. All good. Hooray! > But ... I couldn't believe my eyes when importing messages without message > ID. I got filenames like: > md5%3aCvtU6HwNSk7kAaT5Uk4lWA==.eml > > No kidding. Yes, I can read the code: > baseName.AppendInt(static_cast<int64_t>(PR_Now())); > > I added some debug and saw that GetMessageId() was returning > md5:vLmOfRoTaHLjm/YuToiqVA==. Nice, huh? Comes from somewhere here: > https://searchfox.org/comm-central/ > search?q=md5%3A&case=false&regexp=false&path= > > When I say import, I mean dragging a message onto a folder. Maybe we want to > add StringBeginsWith(..., "md5:"). Ha - that's fantastic :-) Always something else to keep us on our toes! Sounds like some part of the codebase (drag and drop maybe?) relies on MessageID being present and helpfully synthesises missing ones. Ideally we can just remove this behaviour at source... I'll investigate. Anyway, thanks for checking it out - we're creeping toward a finished patch now!
I've written up my findings as Bug 1513093 (TLDR: there is code that relies on the md5 hack). Thinking about it, I think that from the point of view of _this_ patch, the `md5:....` messageId is fine to use as the basis for the .eml filename. Rationale: - If the fake md5 IDs remain, then the .eml filenames will be stay stable and consistent, which was one of the motivations of this work: stable filenames for backup. - If the fake md5 IDs are removed, this patch will handle it, using a timestamp instead. - If the fake ID is retained as a different attribute, we can use that to regain stable & consistent filenames (check for messageID, if missing fall back to fakeID. And no timestamp hack.). So I'm happy with this patch as it stands.
Personally I prefer not to have a third class of names: Message IDs, timestamps and now md5: thingies. Timestamps are already unfortunate since they are in fact not stable. Repair folder, if comment #0 is correct, will re-download and then use the message ID instead. I should try that. So far, we only use timestamps during conversion mbox->maildir (which is also unfortunate) and if the message doesn't have an ID. I think adding md5: names to the mix makes things even worse. Let alone explaining this to the inquisitive user.
(In reply to Jorg K (GMT+1) from comment #20) > Personally I prefer not to have a third class of names: Message IDs, > timestamps and now md5: thingies. I agree. (my third option above doesn't really relate to this patch - I was thinking about things which currently use MessageID as a unique key, like nsImapMailDatabase::UpdatePendingAttributes()). For the context of this patch, I'm suggesting that we don't add a special-case check for md5-thingies - just take the MessageID as given. Real or md5-fake, it'll still be stable. If the md5-thingies are removed in future, this patch will still work, falling back to timestamps. Which sucks, but it's only for emails missing the Message-ID header, and I don't have a better suggestion. > Repair folder, if comment #0 is correct, will > re-download and then use the message ID instead. I should try that No, I don't think it'll work: the md5-thingy is only generated if the Message-Id header is missing in the first place (and it _is_ technically optional). So re-downloading won't help.
(In reply to Ben Campbell from comment #21) > If the md5-thingies are removed in future, this patch will still work, > falling back to timestamps. Which sucks, but it's only for emails missing > the Message-ID header, and I don't have a better suggestion. Actually, while I was typing that I _did_ think of a better suggestion: I could munge together the `Subject:` and `Date:` headers. That'd be pretty stable I think.
But you can't trust either of those to exist either. I think in the end you'll find the Message-ID is as good as it gets. That at least offers some possible advantages too (wrt lookups, debugging and more). The only other stable thing you could do is do a checksum (perhaps not md5) of the whole message. But, we don't always have the whole message available so there are not that many advantages there unfortunately.
(In reply to Ben Campbell from comment #21) > > Repair folder, if comment #0 is correct, will > > re-download and then use the message ID instead. I should try that > No, I don't think it'll work: the md5-thingy is only generated if the > Message-Id header is missing in the first place (and it _is_ technically > optional). So re-downloading won't help. You're right. > For the context of this patch, I'm suggesting that we don't add a > special-case check for md5-thingies - just take the MessageID as given. Real > or md5-fake, it'll still be stable. So if you used a time stamp, repairing/re-downloading would create a new name, I've tested that: if (!msgID.IsEmpty() && !StringBeginsWith(msgID, NS_LITERAL_CSTRING("md5"))) { So let's go with what we have, not totally happy, but convinced.
Comment on attachment 9029335 [details] [diff] [review] use_msgid_for_maildir_files_3.patch Review of attachment 9029335 [details] [diff] [review]: ----------------------------------------------------------------- ::: mailnews/base/util/mailstoreConverter.jsm @@ +233,5 @@ > + return a.lastModifiedTime - b.lastModifiedTime; > + }); > + // Build the list of msg files to send to the worker. > + for (let elem of msgFiles) { > + let f = elem.leafName.toString(); we can remove the extra f temporary here ::: mailnews/local/src/nsMsgMaildirStore.cpp @@ +41,5 @@ > > +// Helper function to percent-encode any characters which might > +// be a problem in filenames. > +// Assumes input is ASCII, which should be fine for the intended > +// use here. I think the input could end up being non-ascii (malformed/malicious mail) @@ +42,5 @@ > +// Helper function to percent-encode any characters which might > +// be a problem in filenames. > +// Assumes input is ASCII, which should be fine for the intended > +// use here. > +static void percentEncode(nsACString const& in, nsACString& out) there is NS_EscapeURL @@ +774,5 @@ > } > > + nsCOMPtr<nsIFile> toPath; > + curPath->Clone(getter_AddRefs(toPath)); > + nsCString toName(baseName); we should make sure it's not too short/long (win max path 252ch) either. If there's no @, perhaps just append @invalid - to avoid problems with special file names like "con". @@ +787,5 @@ > + break; // Done! > + } > + // Craft a new filename. > + // We could use nsIFile::CreateUnique() for this, but that bails out > + // after 10000 attempts. By doing it manually, we'll cope with the 10000 is a lot, so I'd use nsIFile::CreateUnique. You can still handle errors by changing the name if that fails.
Attachment #9029335 - Flags: review?(mkmelin+mozilla)
Attachment #9029335 - Attachment is obsolete: true
(In reply to Magnus Melin [:mkmelin] from comment #25) > @@ +42,5 @@ > > +// Helper function to percent-encode any characters which might > > +// be a problem in filenames. > > +// Assumes input is ASCII, which should be fine for the intended > > +// use here. > > +static void percentEncode(nsACString const& in, nsACString& out) > > there is NS_EscapeURL There are still a bunch of invalid characters that NS_EscapeURL() doesn't filter out: '/', '\' ,'*' etc... > I think the input could end up being non-ascii (malformed/malicious mail) Yes, it'll cope with that. Any invalid bytes will be individually percent-encoded. I _did_ forget to filter out non-printable and extended ASCII values, so this new patch adds that. It also tries to be clearer in the comment describing percentEncode(). > @@ +774,5 @@ > > } > > > > + nsCOMPtr<nsIFile> toPath; > > + curPath->Clone(getter_AddRefs(toPath)); > > + nsCString toName(baseName); > > we should make sure it's not too short/long (win max path 252ch) either. Ooo - yes, I hadn't thought of that. New patch clips filenames to an arbitrary 128 chars (I felt 64 was a tad short and 128 was the next Nice Round Number(tm)). > If there's no @, perhaps just append @invalid - to avoid problems with > special file names like "con". All the special filenames we want to avoid are less than 9 chars (eg "LPTR12.TXT"), so I just treat message-id shorter than that as suspicious. And ignore it, using the timestamp instead. Happy to add the "@invalid" approach too, if you think it's worthwhile, but seems like a corner case. > 10000 is a lot, so I'd use nsIFile::CreateUnique. You can still handle > errors by changing the name if that fails. I've left it in the current patch, but I can change it. Did you mean that if CreateUnique() fails it should fall back to using the timestamp as filename and try again? Or that it should just fail and let the user deal with it?
Note to self - potential unit tests for maildir: Make sure we get safe filenames from Message-IDs with: - Binary data, multi-byte and extended-ASCII text - extra long values (>MAX_PATH, say 256 chars) - forbidden windows filenames (COM,AUX etc)
(In reply to Ben Campbell from comment #27) > > there is NS_EscapeURL > > There are still a bunch of invalid characters that NS_EscapeURL() doesn't > filter out: '/', '\' ,'*' etc... Would have to verify, but I think passing the right flags (esc_FilePath, or esc_Directory + esc_Forced?) escapes what you want. > All the special filenames we want to avoid are less than 9 chars (eg > "LPTR12.TXT"), so I just treat message-id shorter than that as suspicious. > And ignore it, using the timestamp instead. Sounds good. > I've left it in the current patch, but I can change it. Did you mean that if > CreateUnique() fails it should fall back to using the timestamp as filename > and try again? That sounds good. Would also avoid problems (as much as we can) when the total file path would get too long.
Please let us know when it's time to have another look at the patch.
(In reply to Magnus Melin [:mkmelin] from comment #29) > (In reply to Ben Campbell from comment #27) > > > there is NS_EscapeURL > > > > There are still a bunch of invalid characters that NS_EscapeURL() doesn't > > filter out: '/', '\' ,'*' etc... > > Would have to verify, but I think passing the right flags (esc_FilePath, or > esc_Directory + esc_Forced?) escapes what you want. No joy after a load of fiddling about with NS_EscapeURL(). `esc_FilePath|esc_Colon` (where esc_FilePath = esc_Directory | esc_FileBaseName | esc_FileExtension) ...got me the closest, but it's still no good. It doesn't encode '*', '|' or '/' (all no-nos on windows). I also got tripped up by the fact that unless `esc_AlwaysCopy` is set, NS_EscapeURL will fail when characters are escaped. Which could be argued as reasonable, if it were documented somewhere :- ). So I'm sticking with my function. Retrojustification: filenames aren't URLs, so I think it's a little dangerous to blur the lines like that anyway :-) But all the futzing about did get me thinking and I've revised my function. The aim is to strip out: - anything illegal (eg all the windows restricted chars) - anything that might be annoying to deal with in hacky little shell scripts (spaces, semicolons, quotes...). But to try and keep most sane Message-IDs free of encoding. Ends up as a pretty small set - mainly base64 and a few extras: [-+.%=@_0-9a-zA-Z]. I also found and fixed a sign-related bug, so binary/multichar now works as expected, rather than outputting overly-verbose encoding. So now, using utf8: percentEncode("日本語") => `%e6%97%a5%e6%9c%ac%e8%aa%9e`
New to this patch: - Revised filename-sanitising - Back to using CreateUnique() to handle name clashes, falling back to timestamp filename if the 10000 limit is hit (and failing that... failing :-)
Attachment #9030625 - Attachment is obsolete: true
Magnus, if this looks sane to you I'll ask Jorg to review it.
Flags: needinfo?(mkmelin+mozilla)
Comment on attachment 9031032 [details] [diff] [review] use_msgid_for_maildir_files_5.patch Review of attachment 9031032 [details] [diff] [review]: ----------------------------------------------------------------- Looks ok to me
Flags: needinfo?(mkmelin+mozilla)
Attachment #9031032 - Attachment is obsolete: true
Attachment #9035553 - Flags: review?(jorgk)
Comment on attachment 9035553 [details] [diff] [review] use_msgid_for_maildir_files_6.patch Review of attachment 9035553 [details] [diff] [review]: ----------------------------------------------------------------- Thanks for the persistence ;-) ::: mailnews/local/src/nsMsgMaildirStore.cpp @@ +796,5 @@ > + toName.Append(".eml"); > + toPath->AppendNative(toName); > + > + // Using CreateUnique in case we have duplicate Message-Ids > + rv = toPath->CreateUnique(nsIFile::NORMAL_FILE_TYPE, 0600); A bit clumsy, but this should be: if (NS_FAILED(rv) && rv != NS_ERROR_FILE_TOO_BIG) { NS_ENSURE_SUCCESS(rv,rv); else if (rv == NS_ERROR_FILE_TOO_BIG) { Open to better ideas. @@ +806,5 @@ > + toName.Append(".eml"); > + toPath->SetNativeLeafName(toName); > + rv = toPath->CreateUnique(nsIFile::NORMAL_FILE_TYPE, 0600); > + } > + NS_ENSURE_SUCCESS(rv,rv); Please move this after the call you want to check, or else the line number it reports will be wrong.
Attachment #9035553 - Flags: review?(jorgk) → review+

Thanks, Jorg!
Minor tweakage as per your comments.
(I'm assuming new patch => new review?)

Attachment #9035553 - Attachment is obsolete: true
Attachment #9035808 - Flags: review?(jorgk)
Comment on attachment 9035808 [details] [diff] [review] use_msgid_for_maildir_files_7.patch Review of attachment 9035808 [details] [diff] [review]: ----------------------------------------------------------------- I'm afraid you'll need to rebase this, mostly in mailstoreConverter.jsm after bug 1491228 which just got backed out :-( ::: mailnews/local/src/nsMsgMaildirStore.cpp @@ +801,5 @@ > + if (NS_FAILED(rv)) { > + if (rv != NS_ERROR_FILE_TOO_BIG) { > + NS_ENSURE_SUCCESS(rv, rv); > + } > + // NS_ERROR_FILE_TOO_BIG means CreateUnique() bailed out at 10000 attempts. Nit: Move explanation up.
Attachment #9035808 - Flags: review?(jorgk) → review+

Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/277f7d1fd552
use Message-ID as basis for maildir file names, and add ".eml" extension. r=jorgk DONTBUILD

Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED

OK, I've landed this now since the merge conflict went away after backing out bug 1491228. You'll have to do the rebasing there now, but that patch needs more work anyhow.

Target Milestone: --- → Thunderbird 66.0
Comment on attachment 9035808 [details] [diff] [review] use_msgid_for_maildir_files_7.patch Review of attachment 9035808 [details] [diff] [review]: ----------------------------------------------------------------- ::: mailnews/local/src/nsMsgMaildirStore.cpp @@ +58,5 @@ > + const char c = *cur; > + bool whitelisted = (c >= '0' && c <= '9') || > + (c >= 'A' && c <= 'Z') || > + (c >= 'a' && c <= 'z') || > + c == '-' || c == '+' || c == '.' ||c == '%' || c == '=' | c == '@' | c == '_'; Is this actually right with the bitwise | ? Also I get compiler warnings on this: warning: comm/mailnews/local/src/nsMsgMaildirStore.cpp:62:56 [-Wparentheses] suggest parentheses around comparison in operand of ‘|’ warning: comm/mailnews/local/src/nsMsgMaildirStore.cpp:62:78 [-Wparentheses] suggest parentheses around comparison in operand of ‘|’ warning: comm/mailnews/local/src/nsMsgMaildirStore.cpp:66:55 [-Wignored-qualifiers] type qualifiers ignored on cast result type And we must not have warnings as those sometimes fail the build in some cases (m-c sets -Werror (warnings as errors)).
Attachment #9035808 - Flags: feedback?(jorgk)

Comment on attachment 9035808 [details] [diff] [review]
use_msgid_for_maildir_files_7.patch

Hmm, is f? a new way to attract attention? Sorry, those typos slipped the review, I'll fix it.

Attachment #9035808 - Flags: feedback?(jorgk)
Pushed by mozilla@jorgk.com: https://hg.mozilla.org/comm-central/rev/28b72562d11e Follow-up: Correct typos in logical or. r=me DONTBUILD

(In reply to Jorg K (GMT+1) from comment #42)

Hmm, is f? a new way to attract attention? Sorry, those typos slipped the review, I'll fix it.

And what would be better?

Thanks for the fix, also the compiler warnings are gone (as now the intention is clearer), except this one:
comm/mailnews/local/src/nsMsgMaildirStore.cpp:66:55: warning: type qualifiers ignored on cast result type [-Wignored-qualifiers]
out.AppendPrintf("%%%02x", (const unsigned char)c);
^

Either the 'const' or 'unsigned' is superfluous, c already is 'const char'.

NI would be better. Can you please attach a patch, on Windows there are no warnings. Why not make it const unsigned char c = ...? Also, can we fix the loop to for (cur = in.BeginReading(); cur < end; ++cur) {.

(In reply to Jorg K (GMT+1) from comment #45)

NI would be better. Can you please attach a patch, on Windows there are no warnings. Why not make it const unsigned char c = ...?

There are just single uses of 'const unsigned char' in c-c, but thousands of 'const char'. Just removing the cast in the AppendPrintf() seems to work for me.

Also, can we fix the loop to for (cur = in.BeginReading(); cur < end; ++cur) {.

OK

Attachment #9036205 - Flags: review?(jorgk)
Comment on attachment 9036205 [details] [diff] [review] 1259040.patch maildir gcc warning Review of attachment 9036205 [details] [diff] [review]: ----------------------------------------------------------------- ::: mailnews/local/src/nsMsgMaildirStore.cpp @@ +62,5 @@ > c == '-' || c == '+' || c == '.' || c == '%' || c == '=' || c == '@' || c == '_'; > if (whitelisted) { > out.Append(c); > } else { > + out.AppendPrintf("%%%02x", c); I guess the unsigned here tries to avoid that - say - 0x80 is printed badly, well, I don't know what badly would be. I have patches for tonight, so Ben can look at this one.
Attachment #9036205 - Flags: review?(jorgk) → review?(benc)
Comment on attachment 9036205 [details] [diff] [review] 1259040.patch maildir gcc warning Review of attachment 9036205 [details] [diff] [review]: ----------------------------------------------------------------- ::: mailnews/local/src/nsMsgMaildirStore.cpp @@ +62,5 @@ > c == '-' || c == '+' || c == '.' || c == '%' || c == '=' || c == '@' || c == '_'; > if (whitelisted) { > out.Append(c); > } else { > + out.AppendPrintf("%%%02x", c); Badly looks like: "%ffffff80" rather than "%80". So it needs to be fed into the printf as unsigned. BeginReading()/EndReading() deal with `const char*`, so I figured it was least intrusive to perform the cast at the last moment. Infuriatingly, I'm not getting any compiler warnings on this, so I'm not sure how to address it. Worse still, I don't see any warnings for the bitwise `| c=='@'` atrocity that slipped through in my patch! Does mach suppress warnings if there are too many? (if I touch `nsMsgMaildirStore.cpp` and rebuild, it just says `504 compiler warnings present.` which seems both excessive and uninformative...)
Attachment #9036205 - Flags: review?(benc) → review-

How can it squeeze ffffff80 into two a %02x format? Surely the result would be "80", no?

The width specifier in printf format strings is a minimum width.
(And I have an idea that varargs promotes chars to int, so printf is seeing a unsigned int, hence all the extra digits).

This works for me (no warning) and it seem to match what you want.

Attachment #9036205 - Attachment is obsolete: true
Attachment #9036749 - Flags: review?(jorgk)
Attachment #9036749 - Flags: feedback?(benc)

Comment on attachment 9036749 [details] [diff] [review]
1259040.patch maildir gcc warning v2

I'm keeping out of this ;-)

Attachment #9036749 - Flags: review?(jorgk)
Attachment #9036749 - Flags: review?(benc)
Attachment #9036749 - Flags: feedback?(benc)
Attachment #9036749 - Flags: feedback+

Comment on attachment 9036749 [details] [diff] [review]
1259040.patch maildir gcc warning v2

Always a little unsettling seeing const being cast away, but it's not a pointer or anything, so if the compiler's happy, so am I :-)

Attachment #9036749 - Flags: review?(benc) → review+
Keywords: checkin-needed
OS: Unspecified → All
Hardware: Unspecified → All

Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/a77b11bc0dfc
fix compiler warning at nsMsgMaildirStore.cpp:66 related to const unsigned cast. r=BenC

Keywords: checkin-needed
Blocks: 1144478

Is there a concept on and/or plan how to migrate existing Maildir accounts from the prior non-*.eml-files to the new *.eml-files? If not we could have a lot support requests because of not working system integration (eg. not working Windows search) in future.

I don't think we have one yet. Filing it nwo

Blocks: 1526289

Good decision - thanks!

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: