Closed
Bug 1381762
Opened 8 years ago
Closed 7 years ago
Replace nsIPlatfromCharset on mailnews (or move it to comm-central)
Categories
(MailNews Core :: Internationalization, enhancement)
MailNews Core
Internationalization
Tracking
(Not tracked)
RESOLVED
FIXED
Thunderbird 58.0
People
(Reporter: m_kato, Assigned: jorgk-bmo)
References
Details
Attachments
(1 file)
2.52 KB,
patch
|
m_kato
:
review+
|
Details | Diff | Splinter Review |
mailnews/base/util/nsMsgI18N.cpp still uses nsIPlatformCharset. But we would like to remove this interface after fixing bug 943284. So comm-central should replace it with other ways, or move this to comm-central.
Assignee | ||
Comment 1•8 years ago
|
||
Thanks for the heads-up. Yes, we use it in nsMsgI18NFileSystemCharset() which has a lot of callers:
// Charset used by the file system.
const char * nsMsgI18NFileSystemCharset()
{
/* Get a charset used for the file. */
static nsAutoCString fileSystemCharset;
if (fileSystemCharset.IsEmpty())
{
nsresult rv;
nsCOMPtr <nsIPlatformCharset> platformCharset = do_GetService(NS_PLATFORMCHARSET_CONTRACTID, &rv);
if (NS_SUCCEEDED(rv)) {
rv = platformCharset->GetCharset(kPlatformCharsetSel_FileName,
fileSystemCharset);
}
if (NS_FAILED(rv))
fileSystemCharset.Assign("ISO-8859-1");
}
return fileSystemCharset.get();
}
Is there a replacement?
If C-C were to adopt it, we'd need to grab the underlying code for all platforms, right?
intl/locale/windows/nsWinCharset.cpp
intl/locale/unix/nsUNIXCharset.cpp
intl/locale/mac/nsMacCharset.cpp
Reporter | ||
Comment 2•7 years ago
|
||
(In reply to Jorg K (GMT+2) from comment #1)
> Thanks for the heads-up. Yes, we use it in nsMsgI18NFileSystemCharset()
> which has a lot of callers:
>
> // Charset used by the file system.
> const char * nsMsgI18NFileSystemCharset()
> {
> /* Get a charset used for the file. */
> static nsAutoCString fileSystemCharset;
>
> if (fileSystemCharset.IsEmpty())
> {
> nsresult rv;
> nsCOMPtr <nsIPlatformCharset> platformCharset =
> do_GetService(NS_PLATFORMCHARSET_CONTRACTID, &rv);
> if (NS_SUCCEEDED(rv)) {
> rv = platformCharset->GetCharset(kPlatformCharsetSel_FileName,
> fileSystemCharset);
> }
>
> if (NS_FAILED(rv))
> fileSystemCharset.Assign("ISO-8859-1");
> }
> return fileSystemCharset.get();
> }
>
> Is there a replacement?
bug 943284's fix is good sample. c-c can use FallbackEncoding instead.
Assignee | ||
Comment 3•7 years ago
|
||
So we do this:
https://hg.mozilla.org/mozilla-central/rev/f32554cf5530#l1.36
Can you please explain a little more about the fallback encoding. Would that return the same result as the code quoted above?
Assignee: nobody → jorgk
Status: NEW → ASSIGNED
Assignee | ||
Comment 4•7 years ago
|
||
OK, this replaces two different calls using kPlatformCharsetSel_PlainTextInFile and kPlatformCharsetSel_FileName with the fallback charset.
Attachment #8920569 -
Flags: review?(m_kato)
Reporter | ||
Updated•7 years ago
|
Attachment #8920569 -
Flags: review?(m_kato) → review+
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/0b0cba8d70bd
Replace nsIPlatfromCharset in mailnews. r=m_kato
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•7 years ago
|
Target Milestone: --- → Thunderbird 58.0
Sorry about the late reply.
The purpose of mozilla::dom::FallbackEncoding is to model the legacy encoding for *consuming* unlabeled HTML content on the Web. While it is strongly correlated with Windows "ANSI code page" concept, it's not the same e.g. for Polish (for questionable historical reasons).
On cursory look, it seems that mailnews uses "file system charset" for operations that involve paths of some kind (unclear to me what kind exactly) e.g. at https://searchfox.org/comm-central/source/mailnews/base/search/src/nsMsgFilter.cpp#814 . For file system paths, the things to use are NS_CopyNativeToUnicode() and NS_CopyUnicodeToNative(). These use the glibc environment on Linux and use the pre-NTFS legacy encoding on Windows. (Windows code really should be using the W variants of APIs for file system paths.)
That is to say: All this stuff could use a review of *why* non-UTF-8 is used and what the legacy source of non-UTF-8ness is in each case. mozilla::dom::FallbackEncoding is unlikely to be the right answer for all of them.
You need to log in
before you can comment on or make changes to this bug.
Description
•