Closed Bug 291320 Opened 19 years ago Closed 8 years ago

Reply to a mail with raw 8 bits characters in the subject results in an empty subject line

Categories

(MailNews Core :: Composition, defect, P4)

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 3.0rc1

People

(Reporter: jmdesp, Unassigned)

Details

(4 keywords)

Attachments

(4 files, 1 obsolete file)

If you try to reply to a mail that:
- uses raw 8 bit non-ASCII characters in the subject
- does not define a Content-Type header with the charset
the subject line comes out empty in the reply.

Manually selecting a charset to display the mail before hitting reply avoids
the problem, even if the charset that you manually select is exactly the same
charset that is already used to display the mail.

This is a regression introduced between 31/3/04 and 1/4/04, obviously by the
patch to bug 229399.
The practice of sending mail with raw 8 bits, and no Content-Type header is
unfortunately rather frequent in several countries, so this makes this regression
quite annoying.
That practice is not RFC-compliant. Meanwhile the patch to bug 229399 fixes
issues that occur with RFC-compliant messages. I hope you're not going to
unpatch TBird, are you?
The issue in bug 229399 pushes what is allowed by the RFC to the limits,
whilst what I describe here is what is happening with many, many non-RFC-compliant
clients.

Anyway, the conclusion is just that it should not have been a one-liner, and
that Jungshik should complete his patch by checking that the charset for the
subject has been initialized, and if no valid charset is present for the header,
use the charset of the body.

I expect this not to be too hard, and keep everybody happy.

The issue in bug 229399 was already a regression, I believe. I reported a very
similar problem in bug 115096, which at the time stayed two years without a
patch.
Thanks for the report.

> The practice of sending mail with raw 8 bits, and no Content-Type header is
> unfortunately rather frequent in several countries,

These days, I hardly, if ever, get such an email, but I guess my sample is
certainly skewed. Anyway, I have to make up a couple of them for testing.
 
Status: NEW → ASSIGNED
Sorry for spamming
OS: Windows 2000 → All
Hardware: PC → All
BTW: This also happens with some news articles that do define a Content-Type
header with a charset, see
http://groups-beta.google.com/groups?selm=d4j4oe%2445f%2404%241%40news.t-online.com&hl=en
for an example (the message should turn up in google in the next few hours).
You're right, I don't know exactly why I misinterpreted initially, but it
doesn't help if the body defines the charset.
It seems to be broken every time the subject is not properly encoded
according to RFC 2047.

This is a blocker for news. 
For historical reasons, many, many news clients are not configured to, or just
can't, use RFC 2047 encoding for the subject of messages.
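
To make the distinction concrete, here is a minimal sketch of how one might tell a raw 8-bit subject apart from an RFC 2047 encoded-word. Both helpers (HasRaw8BitBytes, LooksLikeEncodedWord) are hypothetical names made up for this illustration and are not functions from the Mozilla tree; the encoded-word check is deliberately rough.

```cpp
#include <string>

// Hypothetical helper (not Mozilla code): true if the header value
// contains any byte outside 7-bit US-ASCII.
bool HasRaw8BitBytes(const std::string& header) {
  for (unsigned char c : header) {
    if (c >= 0x80) return true;
  }
  return false;
}

// Hypothetical helper (not Mozilla code): rough check for an RFC 2047
// encoded-word of the form "=?charset?B?text?=" or "=?charset?Q?text?=".
bool LooksLikeEncodedWord(const std::string& header) {
  std::string::size_type start = header.find("=?");
  if (start == std::string::npos) return false;
  // End of the charset token.
  std::string::size_type q1 = header.find('?', start + 2);
  if (q1 == std::string::npos || q1 == start + 2) return false;  // no charset
  if (q1 + 2 >= header.size()) return false;
  // Encoding letter must be B or Q (either case), followed by '?'.
  char enc = header[q1 + 1];
  bool encOk = enc == 'B' || enc == 'b' || enc == 'Q' || enc == 'q';
  if (!encOk || header[q1 + 2] != '?') return false;
  // The encoded text must be terminated by "?=".
  return header.find("?=", q1 + 3) != std::string::npos;
}
```

A subject for which HasRaw8BitBytes is true and LooksLikeEncodedWord is false is the kind of header this bug is about: the bytes carry no charset label of their own, so the reply code has to get the charset from somewhere else.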
Severity: normal → major
Flags: blocking-aviary1.1?
Summary: Reply to a mail with non-US ASCII character in the subject and no charset results in an empty subject line → Reply to a mail with raw 8 bits characters in the subject results in an empty subject line
Jungshik, I'm quite interested in that bug and I took a little time to investigate.

With your patch, charsetOverride is now FALSE most of the time, when it was
almost always TRUE before.

This means the following code now gets executed, where it wasn't before:
nsMsgCompose.cpp
1562       if (!charsetOverride)
1563       {
1564         rv = msgHdr->GetCharset(getter_Copies(charset));
1565         if (NS_FAILED(rv)) return rv;
1566       }

AFAIS GetCharset fails to find the charset in *every* case, whether there's one
in the header of the message or not. Therefore it overwrites our 'charset',
formerly set in line 1517, with an empty string.
1517   GetTopmostMsgWindowCharacterSet(mailCharset, &mCharsetOverride);

GetCharset goes down to some very bad code I just don't want to touch:
mailnews/db/msgdb/src/nsMsgDatabase.cpp
3282     if (err == NS_OK)
3283     {
...
3296     else if (err == NS_OK)  // guarantee a non-null result

The consequence is that we arrive at line 1601 with an empty originCharset:
1601       rv = mimeConverter->DecodeMimeHeader(subjectCStr,
1602                 getter_Copies(decodedCString),
1603                 originCharset.get(), charsetOverride);
1604       if (NS_FAILED(rv)) return rv;
1605 
1606       CopyUTF8toUTF16(decodedCString, subject);

The third parameter of DecodeMimeHeader is the default charset, that is now
empty. When the input is raw 8 bit, and that parameter is empty,
DecodeMimeHeader just copies the input without modifying it.

CopyUTF8toUTF16 expects its input to be UTF-8, and does no error recovery.
Raw 8-bit bytes are generally not valid UTF-8, so we get the empty subject.
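
The failure mode described above can be modelled with a small standalone sketch. This is a simplified stand-in, not the actual CopyUTF8toUTF16 implementation: StrictUtf8OrEmpty is a hypothetical name, and the validity check only looks at lead and continuation byte structure (it does not reject overlong forms or surrogates).

```cpp
#include <cstddef>
#include <string>

// Simplified structural UTF-8 check: each lead byte must be followed by
// the right number of continuation bytes (10xxxxxx).
bool IsStructurallyValidUtf8(const std::string& s) {
  std::size_t i = 0;
  while (i < s.size()) {
    unsigned char c = static_cast<unsigned char>(s[i]);
    std::size_t extra;
    if (c < 0x80) extra = 0;
    else if ((c & 0xE0) == 0xC0) extra = 1;
    else if ((c & 0xF0) == 0xE0) extra = 2;
    else if ((c & 0xF8) == 0xF0) extra = 3;
    else return false;  // stray continuation byte or invalid lead byte
    if (s.size() - i <= extra) return false;  // truncated sequence
    for (std::size_t k = 1; k <= extra; ++k) {
      unsigned char cc = static_cast<unsigned char>(s[i + k]);
      if ((cc & 0xC0) != 0x80) return false;
    }
    i += extra + 1;
  }
  return true;
}

// Hypothetical model of a converter with no error recovery: input that
// is not UTF-8 yields an empty result instead of a best-effort string.
std::string StrictUtf8OrEmpty(const std::string& decodedHeader) {
  return IsStructurallyValidUtf8(decodedHeader) ? decodedHeader
                                                : std::string();
}
```

Feeding this model KOI8-R bytes such as 0xD3 0xCB fails at the second byte, because 0xD3 looks like a two-byte lead and 0xCB is not a continuation byte. That is the same shape of failure as the empty subject here: the header was passed through undecoded because no default charset was available.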
Removing the GetCharset call seems to get exactly the required result.

I get a proper subject again for raw 8 bit data in subject, whether or not we
have a Content-Type header, and the case in bug 229399 is still handled
properly.

Jungshik, do you have more cases that you want to test for regressions?
Do you know if that call has some cases where it won't fail?
Assignee: jshin1987 → jmdesp
(In reply to comment #9)

> Removing the GetCharset call seems to get exactly the required result.

> Jungshik, do you have more cases that you want to test for regressions?
> Do you know if that call has some cases where it won't fail?

I've just tested it. It doesn't fail for mail messages with 'charset' specified
but always fails for news articles. Anyway, it appears pretty useless even when
it doesn't fail because GetTopMostMsgWindowCharset gets the charset right. Just to
be safe, how about enclosing it with |!charsetOverride && charset.IsEmpty()|?
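
A minimal sketch of the guarded lookup suggested above, written as standalone C++ rather than as a patch to nsMsgCompose.cpp. ResolveCharset and lookupHeaderCharset are hypothetical names; the callback merely stands in for msgHdr->GetCharset, which may return an empty string.

```cpp
#include <functional>
#include <string>

// Hypothetical model: start from the charset of the message window and
// consult the header database only when nothing is known yet and the
// user has not forced a charset.
std::string ResolveCharset(
    const std::string& windowCharset, bool charsetOverride,
    const std::function<std::string()>& lookupHeaderCharset) {
  std::string charset = windowCharset;
  if (!charsetOverride && charset.empty()) {
    // Only reached when there is no charset to lose, so a failed
    // lookup can no longer wipe out a good value.
    charset = lookupHeaderCharset();
  }
  return charset;
}
```

With this guard, a window charset of KOI8-R survives even when the lookup comes back empty, which is the news-article case from this bug. The lookup still runs when the window charset itself is empty.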

It works well too with the charset.IsEmpty test, so if you think this version
will get approval more easily, let's do it.

But I tested some more with mail messages instead of only news, and
GetTopMostMsgWindowCharset really makes this call to msgHdr->GetCharset
unnecessary.
I tested with messages whose charset is not my default charset. 
If the message has no charset, GetTopMostMsgWindowCharset gets my default
charset, but if the message has a charset defined in the Content-Type header,
it gets that one.
Attachment #182495 - Attachment is obsolete: true
Attachment #183033 - Flags: superreview?
Attachment #183033 - Flags: review?(jshin1987)
Comment on attachment 183033 [details] [diff] [review]
Modified as per jshin comment

r=jshin

Yeah, I'm pretty sure it's useless, but it might make a difference in MAPI.
Attachment #183033 - Flags: superreview?(bienvenu)
Attachment #183033 - Flags: superreview?
Attachment #183033 - Flags: review?(jshin1987)
Attachment #183033 - Flags: review+
Attachment #183033 - Flags: superreview?(bienvenu) → superreview+
Comment on attachment 183033 [details] [diff] [review]
Modified as per jshin comment

I really hope this small, low risk fix can get 1.8b2 approval.

Even knowing about the bug, I was hit by it several times recently. All
non-English users using newsgroups will hit it very often.

So I think there's a strong case for fixing it before it gets in something more
important than nightlies. 1.8b1 did not have the problem.
Attachment #183033 - Flags: approval1.8b2?
Comment on attachment 183033 [details] [diff] [review]
Modified as per jshin comment

a=shaver, please land quickly.
Attachment #183033 - Flags: approval1.8b2? → approval1.8b2+
landed on the trunk 
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
VERIFIED with 2005051706
Status: RESOLVED → VERIFIED
Flags: blocking-aviary1.1?
I've just seen a case with Thunderbird 1.0.6 that looks surprisingly similar to
this bug:
http://groups.google.fr/group/fr.reseaux.telecoms.adsl/msg/1df3e327015a4e9a?dmode=source

But according to cvsgraph, all versions of Thunderbird 1.0.x use the
1.423.2.1.2.14 version of nsMsgCompose.cpp, so at least it doesn't look like a
regression in 1.0.6 WRT 1.0.5 or another 1.0.x release.

With a new stable version coming soon, and a bug that apparently did not
generate that much reaction in all the previous 1.0.x versions, maybe it's not
really worth investigating what happens there.
Still happens on Thunderbird 1.5.0.5 (NetBSD/i386).  Offending message is: http://groups.google.com/group/fido7.mo.dec/msg/1a598041b72fc771?dmode=source
Still happens on Thunderbird 2.0.0.4 (NetBSD/i386).
Do you have an offending message, Sergey?
I'll check on this, might be a regression.
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
Pick any post in fido7.* news hierarchy (use news://news.fido7.ru).  I will attach a sample.
Attached file offending message
I'll try to find out what is happening as soon as I find some time.

I just tested with some posts on fido7.ru.unix and fido7.mo.cars, and do not repro at the moment with SM 1.5a (20070323)/XP.

Or is that really Unix-only? I didn't expect the code concerned to be OS-dependent, but I might have a surprise after all.
Yes, it appears to be unix-only.  I couldn't reproduce on Windows, either.
Still no time to work on it immediately, but I'm noting that it might not be so much Unix-only.
See:
http://groups.google.fr/group/fr.reseaux.telecoms.mobiles/msg/315883cc6e0c5327?dmode=source
responding to:
http://groups.google.fr/group/fr.reseaux.telecoms.mobiles/msg/9547bc10c90f4e60?dmode=source
Flags: blocking-thunderbird3?
QA Contact: security
I might still get to working on that, but I'm probably sending the wrong message by leaving it assigned to me when anybody else should feel free to take it.
Reassigning to nobody and marking helpwanted.
Assignee: jmdesp → nobody
Status: REOPENED → NEW
Keywords: helpwanted
Product: Core → MailNews Core
If this really happens every time an 8-bit subject is replied to, we'd definitely block on it, even if it were only on one platform.  But my impression is that it's not so easy to reproduce as that.

Before marking this as blocking+, we'd need to have a significantly better understanding of exactly when this happens.  Adding the qawanted keyword.

Marking blocking-thunderbird3-, wanted-thunderbird3+ for now.  Feel free to renominate once there's better data about reproducibility / frequency.
Flags: wanted-thunderbird3+
Flags: blocking-thunderbird3?
Flags: blocking-thunderbird3-
Keywords: qawanted
Leaving as wanted3+, hoping we can sort this out for final.
Priority: -- → P4
Target Milestone: --- → Thunderbird 3.0rc1
TB 2.0.0.14 on NetBSD/i386; Charset autodetection is turned off; no charset is selected in 'View -> Character Encoding' menu; charset is set to KOI8-R in folder properties, 'Apply default to all messages in the folder' does not matter.

With these settings, bug happens every time (in fido7.* newsgroups, anyway).  However, once I override charset (via 'View -> Character Encoding' menu) to KOI8-R, it doesn't happen.

Will test in 2.0.0.16, too.
I've tried to run Linux binary of 3.0b1 (using NetBSD's Linux emulation layer), but it requires dbus and asound libraries which aren't in pkgsrc...
I just tried this, seems WFM on linux/trunk.
QA Contact: security → composition
WFM in TB 3.0, same platform as before.
If so, let's close this.
Status: NEW → RESOLVED
Closed: 19 years ago → 14 years ago
Resolution: --- → WORKSFORME
This should by no means be closed.
It is still observable in Seamonkey 2.0.2.
I install 64bit build from https://launchpad.net/~joe-nationnet/+archive/ppa-seamonkey2 on my Ubuntu 9.10.
I go to news://ddt.demos.su/fido7.testing
I pick a message with Cyrillic subject.
I press 'Reply' on it.
In fact, the subject is included in the reply only if you have the lower message-view pane open. If you have closed the lower pane, or if you open the message in a separate window and then reply to it, you still get an empty subject after Re:.

Just as in older Seamonkeys ages ago.
Seamonkey is not Thunderbird and is not supported by Mozilla developers, as far as I understand.
It would be useful to know what patch fixed it in TB.
Do you have to have the relevant font installed to see the issue?

SeaMonkey and Thunderbird share a common core, which means, if the fix is in the core, then it should be fixed for both, if it is in the front end code then, due to similarities, should be easy to port.
(In reply to comment #38)
> SeaMonkey and Thunderbird share a common core, which means, if the fix is in
> the core, then it should be fixed for both, if it is in the front end code
> then, due to similarities, should be easy to port.

Correct, however:

(In reply to comment #36)
> I install 64bit build from
> https://launchpad.net/~joe-nationnet/+archive/ppa-seamonkey2 on my Ubuntu 9.10.

That is an unsupported 64-bit build. It's unclear whether the Thunderbird used to retest was 32-bit or 64-bit, but I would strongly suspect it was a 32-bit build.

Therefore, before trying to investigate this more, let's get it tested on a 32-bit build of SeaMonkey just in case this is actually a separate bug with 64-bit builds.
I've tested using 32bit SM on Fedora 11 and replying to newsgroup messages I get the following:
without content-type header
Re: ÓËÏÒÏÓÔØ ÐÏ ÇÉÇÁÂÉÔÕ
with content-type header set
Re: скорость по гигабиту

So neither blank.
Would be useful if other 64bit versions of SM and TB could be tested to see if they have the issue.
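
The two renderings in that test are consistent with the same subject bytes being read once as ISO-8859-1 and once as KOI8-R. A minimal sketch, assuming the subject bytes are KOI8-R as in the fido7.* reports; the function names are made up for this illustration, and the KOI8-R table only covers the lowercase letters at 0xC0..0xDF.

```cpp
#include <string>

// Append one code point below U+0800 to a UTF-8 string.
void AppendUtf8(std::string& out, unsigned cp) {
  if (cp < 0x80) {
    out += static_cast<char>(cp);
  } else {
    out += static_cast<char>(0xC0 | (cp >> 6));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  }
}

// ISO-8859-1 maps every byte to the code point with the same value.
std::string Latin1ToUtf8(const std::string& bytes) {
  std::string out;
  for (unsigned char b : bytes) AppendUtf8(out, b);
  return out;
}

// KOI8-R lowercase letters (bytes 0xC0..0xDF) as Unicode code points.
// Other non-ASCII bytes are outside this sketch and become '?'.
std::string Koi8rLowerToUtf8(const std::string& bytes) {
  static const unsigned kLower[32] = {
      0x44E, 0x430, 0x431, 0x446, 0x434, 0x435, 0x444, 0x433,
      0x445, 0x438, 0x439, 0x43A, 0x43B, 0x43C, 0x43D, 0x43E,
      0x43F, 0x44F, 0x440, 0x441, 0x442, 0x443, 0x436, 0x432,
      0x44C, 0x44B, 0x437, 0x448, 0x44D, 0x449, 0x447, 0x44A};
  std::string out;
  for (unsigned char b : bytes) {
    if (b < 0x80) out += static_cast<char>(b);
    else if (b >= 0xC0 && b <= 0xDF) AppendUtf8(out, kLower[b - 0xC0]);
    else out += '?';
  }
  return out;
}
```

For the byte 0xD3, Latin1ToUtf8 yields "Ó" while Koi8rLowerToUtf8 yields "с", which matches the first character of the two subjects quoted above. The bytes are intact in both cases; only the charset applied to them differs.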
It would also be very useful if people still observing the bug could attach an affected message.
Still happens in "Mozilla/5.0 (X11; U; NetBSD i386; en-US; rv:1.9.2.17) Gecko/20110623 Lightning/1.0b3pre Lanikai/3.1.10" (unofficial 32-bit build, from pkgsrc).
Attached file sample message
The problem indeed manifests with attachment 544792 [details].
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
(In reply to Rimas Kudelis from comment #45)
> The problem indeed manifests with attachment 544792 [details].

WFM on tb16 and trunk.
Closing since the remaining issue seems gone.
Status: REOPENED → RESOLVED
Closed: 14 years ago → 8 years ago
Resolution: --- → FIXED