Hide MUTF-7 use deeper in IMAP
Categories
(MailNews Core :: Networking: IMAP, enhancement)
Tracking
(Not tracked)
People
(Reporter: benc, Unassigned)
There are a whole load of places in the IMAP code which deal with folder names and have special cases for handling MUTF-7 or UTF-8.
The patches in Bug 1687727 and Bug 1688782 act as a bit of a roadmap to such handling.
And Bug 771487 might be relevant too (issue with filter rules and MUTF-7) - I suspect there are a bunch of lurking edge-cases waiting there.
I think an approach might be to make sure we always use UTF-8 IMAP folder names, and that the only time MUTF-7 is used is way down at the protocol level, when talking to servers without UTF8=ACCEPT, i.e. converting to/from MUTF-7 right down where IMAP commands are sent and received.
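As a rough illustration of that boundary (a sketch only, not the existing Thunderbird code): the snippet below implements the modified UTF-7 mailbox encoding from RFC 3501 section 5.1.3 and applies it only at a hypothetical wire layer. The names `encode_mutf7`, `decode_mutf7`, `to_wire`, `from_wire` and the `utf8_accepted` flag are made up for this example.

```python
import base64


def encode_mutf7(name: str) -> str:
    """Encode a Unicode folder name as modified UTF-7 (RFC 3501, 5.1.3)."""
    out = []
    pending = []  # run of non-printable-ASCII characters awaiting base64

    def flush():
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii")
            # Modified base64: no "=" padding, "," instead of "/".
            out.append("&" + b64.rstrip("=").replace("/", ",") + "-")
            pending.clear()

    for ch in name:
        if 0x20 <= ord(ch) <= 0x7E:
            flush()
            out.append("&-" if ch == "&" else ch)  # "&" is escaped as "&-"
        else:
            pending.append(ch)
    flush()
    return "".join(out)


def decode_mutf7(wire: str) -> str:
    """Decode a modified UTF-7 folder name back to Unicode."""
    out = []
    i = 0
    while i < len(wire):
        if wire[i] != "&":
            out.append(wire[i])
            i += 1
            continue
        end = wire.index("-", i)  # every "&" shift is closed by "-"
        if end == i + 1:
            out.append("&")  # "&-" is a literal ampersand
        else:
            b64 = wire[i + 1:end].replace(",", "/")
            b64 += "=" * (-len(b64) % 4)  # restore the stripped padding
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)


# Hypothetical wire layer: the only place that knows about MUTF-7.
def to_wire(name: str, utf8_accepted: bool) -> str:
    return name if utf8_accepted else encode_mutf7(name)


def from_wire(wire: str, utf8_accepted: bool) -> str:
    return wire if utf8_accepted else decode_mutf7(wire)
```

With this shape, everything above `to_wire`/`from_wire` would only ever see the Unicode name, whichever kind of server is on the other end.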
A small thought experiment: imagine using an old (MUTF-7 only) IMAP server, and crafting some filter rules. If that IMAP server is upgraded to support UTF8=ACCEPT, will those filter rule destinations still be valid?
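To make the thought experiment concrete, here is a toy sketch. It assumes (and this is only an assumption, not a statement about how filter rules are actually stored) that a rule destination was saved in its MUTF-7 wire form. The folder names and variables are hypothetical.

```python
# Hypothetical filter destination saved while the server only spoke MUTF-7.
stored_destination = "Entw&APw-rfe"  # MUTF-7 wire form of "Entwürfe"

# Hypothetical folder list reported after the upgrade to UTF8=ACCEPT.
folders_after_upgrade = {"INBOX", "Entwürfe"}

# A rule stored in wire form no longer matches any folder.
wire_form_match = stored_destination in folders_after_upgrade

# A rule stored in a canonical UTF-8/Unicode form is unaffected.
canonical_destination = "Entwürfe"
canonical_match = canonical_destination in folders_after_upgrade
```

If destinations are always held in the canonical form, the upgrade should be invisible to the rules.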
This is probably a part of a larger effort to make sure we've got well defined, canonical representations for paths (and folder URIs)... but got to start somewhere.
Other things to consider: path separators, percent encoding.
A good start would be to make sure we've got test coverage for all kinds of IMAP folder naming cases: nested folders with accented characters, non-latin languages, test servers with and without UTF8 support... without good unit tests, this kind of work is just regressions waiting to happen :-)
Updated 4 years ago
Check this beauty: https://hg.mozilla.org/comm-central/rev/7463adb63b60#l7.133 referring to bug 264071. I'm not sure why you say "UTF-8 IMAP folder names". Shouldn't they just be nsString, scrapping the constant UTF-8 to UTF-16 conversions? Unit tests and test servers with/without UTF-8 support sound very ambitious. I thought there were some protocol rewrites, incl. for IMAP, on the cards, so maybe that would be a good point to do it. In JS there are only JS strings, which typically don't carry raw UTF-8 bytes.