Sending documents from MS Word can break document names

RESOLVED FIXED in Thunderbird 66.0

Status

defect
RESOLVED FIXED
3 months ago
2 months ago

People

(Reporter: mikekaganski, Assigned: mikekaganski)

Tracking

Trunk
Thunderbird 66.0

Thunderbird Tracking Flags

(thunderbird_esr6065+ fixed, thunderbird65 fixed, thunderbird66 fixed)

Details

Attachments

(4 attachments, 1 obsolete attachment)

(Assignee)

Description

3 months ago
Posted patch CP_UTF8.diff (obsolete) — Splinter Review

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36

Steps to reproduce:

Sending documents from MS Word on Windows can make the documents' names unreadable in the composed email, when the current language for non-Unicode applications (ACP) is not UTF-8.

Word uses previously-undocumented feature of MapiMessage struct [1] to take a special value of ulReserved equal to CP_UTF8, when it means "strings are UTF-8-encoded instead of ACP-encoded". Word does not use MAPISendMessageW (which support is being added in bug 1048658, taking UTF-16 strings), even when it is present.

To resolve the issue, TB needs to support this possible special value.

Steps:

  1. Open attachment 8957565 [details] from bug 1048658 in MS Word (tested with Word 2016) on Windows. Its name contains an English, a Cyrillic, and a Greel character, and is only representable in ACP if it is set to UTF-8 (beta function in Windows 10 at the moment); for testing, non-UTF-8 ACP is required.
  2. Use its "Send by email" feature (for Word 2016, it's File->Share->Email->Send as Attachment).

Actual results:

The composer window with the document appears, where subject, and the attached file are filled. In both, the file name if broken, and is not "aбγ.odt" as it should be.

Expected results:

The file name must have been "aбγ.odt" in both Subject, and Attachments fields.

The attachment is a proposed patch. Unfortunately, I couldn't test the code that handles addresses; a Mail Merge test is required.

[1] https://docs.microsoft.com/ru-ru/windows/desktop/api/mapi/ns-mapi-mapimessage

(Assignee)

Updated

3 months ago
See Also: → 1048658
(Assignee)

Updated

3 months ago
Attachment #9037465 - Attachment is patch: true
Attachment #9037465 - Attachment mime type: text/x-patch → text/plain

Comment 1

3 months ago

Hmm, another bug with no audience ;-)

I'll rebase this for bug 1048658 an get it landed, thanks!

Comment 2

3 months ago

Rebased for bug 1048658. I'll comment in a moment.

Attachment #9037465 - Attachment is obsolete: true

Updated

3 months ago
Assignee: nobody → mikekaganski
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true

Comment 3

3 months ago
Comment on attachment 9037532 [details] [diff] [review]
1521007-CP_UTF8.patch

Review of attachment 9037532 [details] [diff] [review]:
-----------------------------------------------------------------

Great, we continue cleaning up the mess ;-)

::: mailnews/mapi/mapihook/src/msgMapiHook.cpp
@@ +577,5 @@
>    if (aMessage->lpOriginator)
>    {
>      nsAutoString From;
> +    if (!bUTF8)
> +        From.Append(NS_ConvertASCIItoUTF16(aMessage->lpOriginator->lpszAddress));

Was this NS_ConvertASCIItoUTF16() correct to start with? Should that have been NS_CopyNativeToUnicode() like elsewhere?
Also three times further down for To, CC, Bcc.
Attachment #9037532 - Flags: review+
(Assignee)

Comment 4

3 months ago
Comment on attachment 9037532 [details] [diff] [review]
1521007-CP_UTF8.patch

Review of attachment 9037532 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/mapi/mapihook/src/msgMapiHook.cpp
@@ +577,5 @@
>    if (aMessage->lpOriginator)
>    {
>      nsAutoString From;
> +    if (!bUTF8)
> +        From.Append(NS_ConvertASCIItoUTF16(aMessage->lpOriginator->lpszAddress));

Well - as I mentioned, I have been unable to check this in action... *possibly* this is expected to come Web-encoded and be actually ASCII? then this "addresses" section would not require special handling wrt CP_UTF8. The Microsoft documentation doesn't mention that this flag affects anything except "lpszSubject, lpszNoteText, lpszMessageType, lpszDateReceived, lpszConversationID", while I *know* (because I see that) that at least file descriptions' file names are affected (but not sure about file paths; would need to setup a temp path in a Unicode-named directory to try to make sure; for now I left filepath unchanged)...
(Assignee)

Comment 5

3 months ago

FWIW, I have left a feedback on the MS documentation page to inform them about missing documentation bits - who knows, nowadays they are quite open.

Comment 6

3 months ago

(In reply to Mike Kaganski from comment #4)

possibly this is expected to come Web-encoded and be actually ASCII?
Hey, you're saying that MS would actually follow the international standard of RFC 6532 that allows raw UTF-8 in headers but nothing else? Sadly I've seen too many bugs where people complain about windows-1252 headers that don't work.

Anyway, we can go with this for now and follow-up later. Now over to bug 594239 or does the patch not need rebasing there? Since we're landing two MAPI bugs today, we can leave that one for another day ;-)

(Assignee)

Comment 7

3 months ago

For the record: I duplicated my question at https://social.msdn.microsoft.com/Forums/en-US/4d029672-e1e6-4d79-9b13-d027e79bdf6e, so possibly some information will appear later.

I'm OK with proceeding to bug 594239 :-) - are you going to rebase it, or should I?

Comment 8

3 months ago

You could, if you woudn't mind. On changed lines, it would be good to fix any white-space sins, and to change the indentation to two spaces as I've done in the SendMailW bug.

(Assignee)

Comment 9

3 months ago

Of course! I'll do it after this change gets merged, is that OK? (It's easier for me to update then continue the patch.)

Comment 10

3 months ago

Of course :-)

Comment 11

3 months ago

Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/41d0ee56c665
Support UTF-8 encoded MAPI data to avoid broken document names when sending documents from MS Word. r=jorgk

Status: ASSIGNED → RESOLVED
Last Resolved: 3 months ago
Resolution: --- → FIXED

Updated

3 months ago
Target Milestone: --- → Thunderbird 66.0

Comment 12

3 months ago
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/e87ac6b1e47e
Follow-up: fix variable name of UTF-8 indicator. r=me DONTBUILD

Comment 13

3 months ago

Sorry, I had to interfere with this a little more. We use 'a' for call parameters/arguments, I've never seen a 'b' prefix.

Looking at the code again, I came across this:
pFile->InitWithNativePath(nsDependentCString((const char*)aFiles[i].lpszPathName));

What happens if the data is UTF-8? The pathname won't be, just the file name? Maybe we need to use InitWithPath instead? I guess you could try on a system with non-UTF-8 system locale and a folder of テスト using MS Word (which I don't have). In case we need to change this, just attach a follow-up patch here.

(Assignee)

Comment 14

3 months ago

(In reply to Jorg K (GMT+1) from comment #13)

Sorry, I had to interfere with this a little more. We use 'a' for call parameters/arguments, I've never seen a 'b' prefix.

Sorry - it's my current habit (a LibreOffice codebase convention for bool params; excuse me).

Looking at the code again, I came across this:
pFile->InitWithNativePath(nsDependentCString((const char*)aFiles[i].lpszPathName));

What happens if the data is UTF-8? The pathname won't be, just the file name? Maybe we need to use InitWithPath instead?

:-) just what I described in comment 4.

Comment 15

3 months ago

(In reply to Mike Kaganski from comment #14)

:-) just what I described in comment 4.

Sorry, I'm unable to memorise all the ~200 bugmails I read/glance at every day (Friday: "only" 160, Thursday 203) :-(

So can you simply create a folder called テスト ?

(Assignee)

Comment 16

3 months ago

(In reply to Jorg K (GMT+1) from comment #13)

Looking at the code again, I came across this:
pFile->InitWithNativePath(nsDependentCString((const char*)aFiles[i].lpszPathName));

What happens if the data is UTF-8? The pathname won't be, just the file name? Maybe we need to use InitWithPath instead? I guess you could try on a system with non-UTF-8 system locale and a folder of テスト using MS Word (which I don't have).

(In reply to Jorg K (GMT+1) from comment #15)

So can you simply create a folder called テスト ?

Word prepares a temp file for sending, with the name like

C:\Users\mikek\AppData\Local\Microsoft\Windows\INetCache\Content.Word~WRD0002.docx.docx

so I'll try first setting LocalAppData to such a directory; if this won't work, I'll create a user with such a funny name.

(Assignee)

Comment 17

3 months ago
Posted patch PathUTF8.diffSplinter Review

(In reply to Jorg K (GMT+1) from comment #13)

You were absolutely right.
I have tested with a user conveniently named テストабв, and was able to see the UTF-8 paths arriving from Word. Here's the patch addressing that.

The only thing left (to a follow-up patch) is to add a DWORD registry value "SupportUTF8" to HKLM\SOFTWARE\Clients\Mail\Mozilla Thunderbird, which I discovered by reading source code of MAPISendMailHelper (see [1] for details). I haven't yet found where the relevant registry value (DLLPath) is set.

[1] https://social.msdn.microsoft.com/Forums/en-US/4d029672-e1e6-4d79-9b13-d027e79bdf6e/what-is-the-true-scope-of-cputf8-in-mapimessageulreserved

(Assignee)

Comment 18

3 months ago

I tried to find the place where the registry settings are written. I supposed that's mail/installer/windows/nsis/shared.nsh (or suite/installer/windows/nsis/shared.nsh) and macro SetClientsMail in them, and that gets into dist\bin\uninstall\helper.exe; but adding WriteRegDWORD HKLM "$0" "SupportUTF8" 1 after WriteRegStr HKLM "$0" "DLLPath" "$6", and running resulting uninstall\helper.exe there does not add SupportUTF8 reg key.

What am I missing?

Comment 19

3 months ago
Comment on attachment 9037756 [details] [diff] [review]
PathUTF8.diff

Excellent :-) - BTW, no spaces between function name and parameters.

As for the installer, I have no idea, last I looked I was searching for 00B0D0F3BAA7, but that found the same file you found. I think it's material for a follow-up bug and we can ask the M-C installer experts.
Attachment #9037756 - Flags: review+
(Assignee)

Comment 20

3 months ago

Oh I found my mistake! It did it right all the time; just I checked 32-bit registry hive when it wrote to 64-bit hive.

So this patch adds the registry setting, so that clients could check that TB supports passing UTF-8 strings when using MAPISendMail API.

Comment 21

3 months ago
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/609c28fdc2a7
Follow-up: also check for UTF-8 when copying file path. r=jorgk DONTBUILD
(Assignee)

Comment 22

3 months ago

(In reply to Jorg K (GMT+1) from comment #19)

BTW, no spaces between function name and parameters.

Sorry for that - and it was my misunderstanding: I seem to have read your review from bug 393302, where I )mis)read it as if you required the space to be present. So I started to add it now, doing thing that I personally dislike (I don;t like those spaces!) because of that. Excuse me.

Comment 23

3 months ago
Comment on attachment 9037773 [details] [diff] [review]
SupportUTF8.diff

OK, let's finish it here then.
Attachment #9037773 - Flags: review+

Comment 24

3 months ago
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/1948a0ff606a
Follow-up: Set SupportUTF8 registry key to indicate that Thunderbird is supporting UTF-8 at the MAPISendMail API. r=jorgk DONTBUILD

Comment 25

3 months ago
Comment on attachment 9037532 [details] [diff] [review]
1521007-CP_UTF8.patch

Approval for all four landed changesets.
Attachment #9037532 - Flags: approval-comm-esr60+
Attachment #9037532 - Flags: approval-comm-beta+

Updated

3 months ago
Attachment #9037532 - Flags: approval-comm-esr60+
Attachment #9037532 - Flags: approval-comm-beta+

Updated

3 months ago
Attachment #9038002 - Flags: approval-comm-beta+

Updated

3 months ago
Duplicate of this bug: 611081

Updated

3 months ago
Component: Untriaged → Simple MAPI
Product: Thunderbird → MailNews Core

Updated

2 months ago
Depends on: 1531724
You need to log in before you can comment on or make changes to this bug.