Closed Bug 1842384 Opened 2 years ago Closed 1 years ago

notifications for new e-mails are not displayed correctly if they contain an umlaut

Categories

(Thunderbird :: OS Integration, defect)

Thunderbird 109
defect

Tracking

(thunderbird_esr102 unaffected, thunderbird_esr115+ fixed, thunderbird115 wontfix, thunderbird116 affected, thunderbird117 affected)

RESOLVED FIXED
118 Branch
Tracking Status
thunderbird_esr102 --- unaffected
thunderbird_esr115 + fixed
thunderbird115 --- wontfix
thunderbird116 --- affected
thunderbird117 --- affected

People

(Reporter: soeren.hentzschel, Assigned: mkmelin)

References

(Regression)

Details

(Keywords: regression, Whiteboard: [snnot3p])

Attachments

(3 files)

Attached image screenshot

Notifications for new e-mails are not displayed correctly if they contain an umlaut. It used to work in Thunderbird 102 and no longer works in Thunderbird 115.

Regression range according to mozregression:
https://hg.mozilla.org/comm-central/pushloghtml?fromchange=d0092ed30b8462f73ab0b7368f234af99e22e137&tochange=5a9e1108b51af7d242104cea842091120b049ccc

It's unfortunately not uncommon for Thunderbird that a push includes more than one change so I can't bisect further. But I guess bug 1802815 is the only bug in this regression range that makes sense as regressor?

Tested on macOS 13.4.

Flags: needinfo?(mkmelin+mozilla)
Whiteboard: [snnot3p]
Version: Thunderbird 117 → Thunderbird 109

Works for me with TB117a1, a correctly MIME encoded email and on Windows.

With an broken email see: Bug 1783499

(In reply to Alfred Peters from comment #1)

With an broken email see: Bug 1783499

That case is somewhat different.

In your example, I notice that some umlauts are quite correct.

Does the bug occur with all emails?
If not, can you upload an example?

I don't know if there are case where it works but I saw this very often in the last few months.

I attached an example e-mail.

(In reply to Sören Hentzschel from comment #3)

Created attachment 9342936 [details]

I attached an example e-mail.

Thanks for that. Unfortunately, I can't reproduce it with that either.
Tested on Windows with TB117a1, TB115b6 and TB115b6 with German settings.

Since it involves native OS notifications I can't rule out that it is macOS specific. I also don't know how to test the problem with the downloaded email, since opening this file in Thunderbird does not generate a notification.

I dropped the email marked as unread into the INBOX using a 2nd IMAP client (a 2nd TB instance with its own profile).

Duplicate of this bug: 1845080

I filed a duplicate report with more information in https://bugzilla.mozilla.org/show_bug.cgi?id=1845080. This is not OS-specific.

Looks like the strings are passed incorrectly to the OS notification, both Windows and Linux. You said in bug 1845080 (quote): "UTF-8 encoding of 🙂 as ISO-8859-1". That's likely not the case. It's likely more a case of confusing UTF-8 and UTF-16. If the strings are happening in the subject, maybe a fall-out from https://hg.mozilla.org/comm-central/rev/327f82032ed2#l7.12

You're right, all we can say is that UTF-8 bytes are being treated as codepoints somehow.

The commit you link does look suspicious, but it is not in the regression range in the OP, and the bug actually only happens with the body of the email (the preview property), not the subject, for me. I think you can also see that in OP, where ü is fine in the subject but „ is garbled in the body.

Oh, yes, the preview property:
https://searchfox.org/comm-central/search?q=hdr.*property.*preview&path=&case=false&regexp=true
And here we have the bug:
https://searchfox.org/comm-central/rev/32cc2360c91edecd5553b76a6237e4c25b51450c/mailnews/base/src/MailNotificationManager.jsm#320
It's retrieved and not decoded from UTF-8 as stored in the DB to UTF-16 as used in a JS string. The author of the regressor bug is already NI'ed here.

The idl needs to use a modern string type. Non-ascii doesn't survive passed through XPCOM when the type is string.

Assignee: nobody → mkmelin+mozilla
Status: NEW → ASSIGNED
Flags: needinfo?(mkmelin+mozilla)
Target Milestone: --- → 118 Branch

Some drive-by comments re. better string processing:

char storeToken[100];
PR_snprintf(storeToken, sizeof(storeToken), "%lld", m_startOfNewMsg);
msgHdr->SetStringProperty("storeToken", nsDependentCString(storeToken));

This could be replaced with:

nsCString storeToken = nsPrintfCString("%lld", m_startOfNewMsg);
msgHdr->SetStringProperty("storeToken", storeToken);

There are four occurrences of this throughout the file.


*aResult = strdup(key.get()); Please use: ToNewCString(key);


  return m_mdb->SetProperty(m_mdbRow, propertyName,
                            PromiseFlatCString(propertyValue).get());

PromiseFlatCString shouldn't be necessary. Doesn't it compile without it?

PromiseFlatCString shouldn't be necessary. Doesn't it compile without it?

Yeah, needed to compile.

Yes, of course.

As for Sean's comments: nsIMsgDatabase.setStringPropertyByHdr() doesn't need to be converted since it's not used in JS. The string is just considered raw (UTF-8) bytes. The changes to the format specifiers appear to be copy/paste errors. Different things compile on different platforms, so better not to change them.

Actually, should that be using PRIu64 or PRId64?

And there is a copy/paste issue here:

-    char storeToken[100];
-    PR_snprintf(storeToken, sizeof(storeToken), "%lld", m_startOfNewMsg);
-    newMsgHdr->SetStringProperty("storeToken", storeToken);
+    nsCString storeToken = nsPrintfCString(PRIu64, m_startOfNewMsg);
+    msgHdr->SetStringProperty("storeToken", storeToken);

Wrong header variable. We're trying PRIu64.

That hunk again:

   m_curSrcHdr = nullptr;
   if (newMsgHdr) {
-    char storeToken[100];
-    PR_snprintf(storeToken, sizeof(storeToken), "%lld", m_startOfNewMsg);
+    nsCString storeToken = nsPrintfCString("%" PRIu64, m_startOfNewMsg);
     newMsgHdr->SetStringProperty("storeToken", storeToken);
     newMsgHdr->SetMessageOffset(m_startOfNewMsg);

More issues. Before the raw UTF-8 bytes were returned from the database. You need to eliminate any "cheap" conversion from raw UTF-8 to UTF-16 (JS string). We found two here:
https://searchfox.org/comm-central/search?q=%2F%2F+Get+the+preview+text+as+a+UTF-8+encoded+string&path=&case=false&regexp=false

Attachment #9348884 - Attachment description: Bug 1842384 - Make setStringProperty work properly for non-ascii values. r=#thunderbird-reviewers → Bug 1842384 - Make setStringProperty work properly for non-ascii values. r=leftmostcat

Magnus, please don't use "%lu" and "%ld" as the don't compile on Windows. Please use "%" PRId64 and "%" PRIu64.

Sample code:

  int64_t i64 = 89955556338558;
  uint64_t u64 = 89955556338558;
  char storeTokeni[100];
  char storeTokenu[100];
  PR_snprintf(storeTokeni, sizeof(storeTokeni), "%lld", i64);
  PR_snprintf(storeTokenu, sizeof(storeTokenu), "%lld", u64);
  nsCString storeTokenii = nsPrintfCString("%" PRId64, i64);
  nsCString storeTokenuu = nsPrintfCString("%" PRIu64, u64);
  nsCString storeTokeniix = nsPrintfCString("%lld", i64);
  nsCString storeTokenuux = nsPrintfCString("%lld", u64);
  nsCString storeTokeniiy = nsPrintfCString("%ld", i64);
  nsCString storeTokenuuy = nsPrintfCString("%lu", u64);
  printf("=== %lld %s %s %s %s\n", i64, storeTokeni, storeTokenii.get(), storeTokeniix.get(), storeTokeniiy.get());
  printf("=== %lld %s %s %s %s\n", u64, storeTokenu, storeTokenuu.get(), storeTokenuux.get(), storeTokeniiy.get());
 0:14.05 C:/mozilla-source/mozilla-central/comm/ (some sample file): error: format specifies type 'long' but the argument has type 'int64_t' (aka 'long long') [-Werror,-Wformat]
 0:14.05   nsCString storeTokeniiy = nsPrintfCString("%ld", i64);
 0:14.05                                              ~~~   ^~~
 0:14.05                                              %lld

Another correction:

      mDatabase->SetStringProperty(msgKey, "junkscore",
                                   PromiseFlatCString(junkScore));

Here the PromiseFlatCString() isn't needed any more.

And while you have the file open, can you please change
https://searchfox.org/comm-central/search?q=nsCString%28aProperty%29&path=&case=false&regexp=false
to use nsDependentCString instead of nsCString().

Pushed by brendan@thunderbird.net:
https://hg.mozilla.org/comm-central/rev/c1ee102f05f7
Make setStringProperty work properly for non-ascii values. r=leftmostcat

Status: ASSIGNED → RESOLVED
Closed: 1 years ago
Resolution: --- → FIXED

Comment on attachment 9348884 [details]
Bug 1842384 - Make setStringProperty work properly for non-ascii values. r=leftmostcat

[Approval Request Comment]
Regression caused by (bug #): bug 1802815
User impact if declined: notifications may display garbled text
Testing completed (on c-c, etc.): c-c, beta
Risk to taking this patch (and alternatives if risky): should be safe

Attachment #9348884 - Flags: approval-comm-esr115?
Duplicate of this bug: 1852486

Comment on attachment 9348884 [details]
Bug 1842384 - Make setStringProperty work properly for non-ascii values. r=leftmostcat

[Triage Comment]
Approved for esr115

Attachment #9348884 - Flags: approval-comm-esr115? → approval-comm-esr115+
Duplicate of this bug: 1849688
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: