Closed Bug 365926 Opened 18 years ago Closed 14 years ago

Attachments mis-served as utf-8

Tracking

Status:

RESOLVED FIXED

Milestone:

Bugzilla 3.6

People

(Reporter: smontagu, Assigned: mkanat)

Bug Flags:

LpSolit

approval

LpSolit

approval3.6

Attachments

(1 file)

v1, 14 years ago, Max Kanat-Alexander, 834 bytes, patch, LpSolit: review+

Simon Montagu :smontagu

Reporter

Description

18 years ago

HTML testcases in bugs are being served with a Content-Type header that says "text/html; charset=UTF-8", even if they have a different encoding specified in a <meta> charset, e.g. attachment 250472 in bug 365922.

Marking as major, because this makes QA much more time-consuming.

Max Kanat-Alexander

Assignee

Comment 1

18 years ago

I suppose attachments should be sent without a charset, unless they specify some charset. I'm not quite sure how to do that, though, since we're serving them through a CGI.

OS: Linux → All

Hardware: PC → All

Target Milestone: --- → Bugzilla 3.0

Version: unspecified → 2.23.3

Jean-Marc Desperrier

Comment 2

18 years ago

After a bit of research into fixing this, I found that in attachment.cgi, CGI::header seems to inherit the UTF-8 charset value from somewhere, but resetting the charset to an empty string with $cgi->header(-charset=>'') just causes the CGI module to use its more usual default, i.e. 'iso-8859-1'. There's no obvious method available to really remove it.

But whenever it's important to view the attachment with a specific encoding, entering the Content Type manually when creating the attachment, with a value like "text/plain; charset=iso-8859-1", is a better solution than having bz send it without a specified charset and relying on the browser to do the right thing.

So I think the best solution might be to close this as invalid and/or find a place to document clearly what the right way to do this is.

Jean-Marc Desperrier

Comment 3

18 years ago

BTW I've already made the change in the originally impacted bug, so that the correct charset will be used when that attachment is viewed. I'm also lowering the severity, though I'll let the component owners decide the final outcome of this bug.

Severity: major → minor

Simon Montagu :smontagu

Reporter

Comment 4

18 years ago

That's all very well for new attachments, but what about all the existing attachments that have a charset specified internally but no charset in the content-type? I have been editing the content-type as I come across them, but it's annoying. I'm sure there are also testcases attached to charset autodetection bugs that depend on not having a charset specified!

Severity: minor → major

Frédéric Buclin

Comment 5

18 years ago

Is this bug related to or a dupe of bug 226404?

Simon Montagu :smontagu

Reporter

Comment 6

18 years ago

Unrelated. That's specifically about patches.

Frédéric Buclin

Comment 8

17 years ago

Max, isn't this bug fixed by bug 408446 on tip?

Max Kanat-Alexander

Assignee

Comment 9

17 years ago

(In reply to comment #8)
> Max, isn't this bug fixed by bug 408446 on tip?

  No, we still send the charset header, as far as I know.

:Hb

Comment 10

16 years ago

This URL https://bugzilla.mozilla.org/attachment.cgi?id=306957 gets a "Content-Type: text/plain; name="patch359083.txt"; charset=UTF-8" header, causing the European umlauts to break.

Bug 226404 comment 1 states that multi-byte source code is used. Is this really an issue? Why aren't all patches delivered with the US-ASCII charset?
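
The breakage described above can be reproduced with a short Python sketch (Python is used here only for illustration; Bugzilla itself is written in Perl). It assumes the attachment was saved as ISO-8859-1, which is a guess about the file in question, and the sample string and the `decode_as` helper are made up for the example:

```python
# Sample text containing umlauts, stored as ISO-8859-1 bytes.
# (Assumption: the real attachment used a legacy single-byte encoding.)
raw = "Größe über".encode("iso-8859-1")


def decode_as(data: bytes, charset: str) -> str:
    """Decode bytes the way a client would for the declared charset,
    substituting U+FFFD for byte sequences that are invalid in it."""
    return data.decode(charset, errors="replace")


as_utf8 = decode_as(raw, "utf-8")         # what "charset=UTF-8" asks for
as_latin1 = decode_as(raw, "iso-8859-1")  # what the file actually is

# Pure ASCII content is unaffected by the mislabeling.
ascii_raw = b"plain ASCII patch text"
ascii_same = decode_as(ascii_raw, "utf-8") == decode_as(ascii_raw, "iso-8859-1")
```

Here `as_latin1` round-trips to the original text, while `as_utf8` contains replacement characters where the umlauts were. ASCII-only patches decode identically either way, so the problem only shows up for files with non-ASCII bytes.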

Frédéric Buclin

Updated

16 years ago

Severity: major → normal

Frédéric Buclin

Comment 11

16 years ago

The Bugzilla 3.0 branch is now locked to security bugs and data loss fixes only. This bug doesn't fit into one of these two categories and is retargeted to 3.2 as part of a mass-change. To catch bugmails related to this mass-change, use lts081207 in your email client filter.

Target Milestone: Bugzilla 3.0 → Bugzilla 3.2

Frédéric Buclin

Updated

15 years ago

Depends on: 504104

Frédéric Buclin

Comment 12

15 years ago

I know how to fix this, but I first need the patch from bug 504104 for a complete fix. As it may be a bit invasive, I'm retargeting this bug to 3.6. For 3.4 and older, you should specify the charset together with the MIME type, e.g. "text/html; charset=euc-jp". The charset will then be passed to the browser and your HTML page will be displayed correctly.

In this bug, I will only focus on the automatic charset detection: if no charset is specified with the MIME type (as shown above), Bugzilla will call HTML::Parser to get the <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=euc-jp"> tag and extract the charset from it.
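
As a rough illustration of the detection step described above, here is a Python sketch using the standard library's `html.parser` in place of Perl's HTML::Parser. It is not the Bugzilla implementation; `MetaCharsetFinder`, `sniff_meta_charset`, and the 4096-byte scan limit are hypothetical choices made for this example:

```python
from email.message import Message
from html.parser import HTMLParser
from typing import Optional


class MetaCharsetFinder(HTMLParser):
    """Records the first charset declared in a <meta> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.charset: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        # HTMLParser lowercases tag and attribute names for us.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("charset"):
            # HTML5 form: <meta charset="...">
            self.charset = attr["charset"].strip().lower()
        elif attr.get("http-equiv", "").lower() == "content-type":
            # Older form: parse the CONTENT value like a Content-Type header.
            msg = Message()
            msg["Content-Type"] = attr.get("content", "")
            found = msg.get_content_charset()
            if found:
                self.charset = found


def sniff_meta_charset(data: bytes, limit: int = 4096) -> Optional[str]:
    """Return the charset named in a <meta> tag near the start of data."""
    finder = MetaCharsetFinder()
    # ISO-8859-1 maps every byte to a character, so decoding cannot fail;
    # the <meta> declaration itself is expected to be ASCII.
    finder.feed(data[:limit].decode("iso-8859-1"))
    return finder.charset
```

For example, `sniff_meta_charset(b'<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=euc-jp">')` yields `euc-jp`, and a document with no declaration yields `None`, in which case no charset would be added to the MIME type.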

Assignee: attach-and-request → LpSolit

Target Milestone: Bugzilla 3.2 → Bugzilla 3.6

:Hb

Comment 13

15 years ago

Bug formerly depended on bug 504104 which became a dupe of bug 477442.

Depends on: 477442
No longer depends on: 504104

Max Kanat-Alexander

Assignee

Comment 14

14 years ago

Attached patch v1

This patch simply sends a blank charset with the file and lets the browser decide. (At least, I believe the browser will decide in this case.)
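
The intended behavior can be sketched in Python as follows. This is not the contents of the patch (which modifies the Perl script attachment.cgi); `attachment_content_type` and `has_explicit_charset` are hypothetical helpers, and the `name="..."` parameter simply mirrors the header quoted in comment 10:

```python
import re


def has_explicit_charset(stored_type: str) -> bool:
    """True if the stored MIME type already carries a charset parameter."""
    return re.search(r";\s*charset\s*=", stored_type, re.IGNORECASE) is not None


def attachment_content_type(stored_type: str, filename: str) -> str:
    """Build a Content-Type value without adding a default charset.

    An explicit charset chosen by the uploader (for example
    "text/html; charset=euc-jp") is passed through untouched; otherwise
    no charset is sent and the browser is left to decide.
    """
    # Escape backslashes and quotes so the name stays a valid quoted string.
    safe_name = filename.replace("\\", "\\\\").replace('"', '\\"')
    return f'{stored_type}; name="{safe_name}"'
```

With this sketch, `attachment_content_type("text/html", "test.html")` gives `text/html; name="test.html"` with no charset, while a stored type of `text/html; charset=euc-jp` keeps its charset in the header.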

Assignee: LpSolit → mkanat

Status: NEW → ASSIGNED

Attachment #427547 - Flags: review?(LpSolit)

Frédéric Buclin

Updated

14 years ago

Attachment #427547 - Flags: review?(LpSolit) → review+

Frédéric Buclin

Comment 15

14 years ago

Comment on attachment 427547 (v1)

Seems to trigger the automatic detection of browsers. But must browsers are pretty bad at detecting the correct encoding, from what I could see. r=LpSolit

Frédéric Buclin

Updated

14 years ago

No longer depends on: 477442

Flags: approval3.6+

Flags: approval+

Frédéric Buclin

Comment 16

14 years ago

(In reply to comment #15)
> Seems to trigger the automatic detection of browsers. But must browsers are

s/must/most/

Max Kanat-Alexander

Assignee

Comment 17

14 years ago

Committing to: bzr+ssh://bzr.mozilla.org/bugzilla/trunk/
modified attachment.cgi
Committed revision 7083.

Committing to: bzr+ssh://bzr.mozilla.org/bugzilla/3.6/
modified attachment.cgi
Committed revision 7046.

Status: ASSIGNED → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

Dylan Hardison [:dylan] (he/him)

Updated

8 years ago