Closed
Bug 163998
Opened 23 years ago
Closed 22 years ago
Image titles should not be URL encoded
Categories
(Core :: Graphics: ImageLib, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: neil, Assigned: pavlov)
References
()
Details
(Keywords: intl)
Attachments
(2 files, 3 obsolete files)
1.60 KB,
patch
|
Details | Diff | Splinter Review | |
2.10 KB,
patch
|
nhottanscp
:
review+
darin.moz
:
superreview+
asa
:
approval1.3b-
asa
:
approval1.3+
|
Details | Diff | Splinter Review |
Using Build ID: 2002082104
Steps to reproduce problem:
1. Rename an image to use non-ASCII characters
2. Open the image in Mozilla
Expected results: non-ASCII characters appear in window title
Actual results: window title is URL encooded
For instance, the URL should display as Håkan.ico (ICON image, 32x32 pixels)
Comment 1•23 years ago
|
||
well, the problem is: which charset to use...
IIRC, some standard says to assume UTF-8... though that would not work for the
testcase in this bug.
We can not just assume latin1.
Comment 2•23 years ago
|
||
bug 9949 wants to display Unicode titles in the titlebar (assuming they're UTF-8
Christian said - that won't help you in this case).
Comment 3•23 years ago
|
||
Actually, if I go to <http://www.nrr.co.uk/Håkan.ico> with Mozilla, I get
<http://www.nrr.co.uk/H%C3%A5kan.ico>, which looks like UTF-8... :)
OS: Windows 95 → All
![]() |
||
Comment 4•23 years ago
|
||
A few weeks ago I filed another bug about the fact that I cannot go to
www.håkan.nu with Mozilla (it works in IE6). Maybe this is related?
Reporter | ||
Comment 5•23 years ago
|
||
Intersting... when I hover over the URL link that I entered into Bugzilla the
status bar displays http://www.nrr.co.uk/Håkan.ico and also when I hover over
Tuukka's bugmail URL link the status bar displays http://www.nrr.co.uk/Håkan.ico
but when I hover over the URL link in the bugzilla page the status bar displays
http://www.nrr.co.uk/HÃ¥kan.ico although all of the links work...
Comment 6•23 years ago
|
||
Jo, that bug# seems to be wrong...
Comment 7•22 years ago
|
||
As for comment #3, Mozilla converted U+00E5 in the URL
in comment #3 encoded in ISO 8859-1/Windows-1252(0xE5)
to UTF-8 because the encoding of the page with the URL you cited
is *known* (to Mozilla) to be ISO 8859-1/Windows-1252.
Anyway, I guess it's a bug NOT to 'unescape' '%-escape' sequence
back to UTF-8 before handing it over to 'window titlebar display
method' when the encoding of the URL is known and
is already converted to UTF-8. I'll try to take a look.
As for comment #4, iDNS(using characters outside a subset
of US-ASCII allowed in the present DNS system) is a totally
different issue. You cannot just use a legacy encoding
of your choice(which happened to be ISO-8859-1 in your example)
and expect it to be interpreted as you intended by DNS.
(there are many hacked/non-standard way of supporting
non-ASCII characters in domain name especially
in CJK countries, but I believe iDNS in general is still premature)
I haven't followed the development of iDNS in general
and the status of iDNS support by Mozilla in particular, though.
As for comment #6, it's bug 9449 (one of the most often duped
and misunderstood bug). The bug was fixed on all platforms
(under *nix/X11, a long long time ago. under Windows, recently).
So, Mozilla does have the ability to put any Unicode character
in the title bar in principle.
I'm also adding intl keyword. BTW, this bug may as well be transferred
over to I18N.
Keywords: intl
Comment 8•22 years ago
|
||
This bug does not belong in Intl.
It is entirely within content/html/document/src/nsImageDocument.cpp.
The question from comment 1 is not answered. Which charset should Mozilla use
for the encoded character?
Also, iDNS is entirely unrelated to this bug. iDNS is only for DNS (duh),
however this bug is about the filename part of the image, not the hostname.
Comment 9•22 years ago
|
||
> This bug does not belong in Intl.
> It is entirely within content/html/document/src/nsImageDocument.cpp.
Well, I wouldn't say it should belong to intl no matter what,
but neither would I say it's nothing to do with intl.
> The question from comment 1 is not answered. Which charset should
> Mozilla use for the encoded character?
The answer is UTF-8, but to cope with
non-standard-compliant web pages/user behaviors/web server
behaviros/etc, sometimes it may need to try various heuristics to
'read' the mind of users/web page authors and so forth.
However, in this particular case, that doesn't seem to be of much
relevance judging from comment #3. Mozilla already got the
encoding of the filename right and converted it to UTF-8
before url-encoding it. (as corerctly pointed out in comment
#3, U+00E5 - Latin small letter a with ring above - is 0xC3 0xe5)
Thereofre, the issue is where to url-decode it before handing
it over to the method responsible for the window title bar
rendering.
> Also, iDNS is entirely unrelated to this bug. iDNS is only for
> DNS (duh),
> however this bug is about the filename part of the image, not the
> hostname.
That's exactly what I meant to convey in my comment.
What I wrote in response to comment #4 was meant to
indicate that this bug is NOT related with iDNS because comment #4
was about using non-ASCII chars. in the domain name part of the URL
while this bug is about the rest of the URL.
Comment 10•22 years ago
|
||
> It is entirely within content/html/document/src/nsImageDocument.cpp
A cursory look at the file led me to doubt that this is the case
although I wouldn't rule out that possibility. It seems to me that
it has to be solved 'upstream?' (e.g. possibly implementation of nsIURL).
|UpdateTitle| and |CreateSyntheticDocument| appear to do what they're
supposed to do.
BTW, it's not clear how to reproduce this bug because the following
cases are all different cases and I'm not sure which of them this bug
is about:
1. Type in the URL with non-ASCII characters in the url bar
or the url-open dialog box. Non-ASCII characters are supposed
to be interpreted as the encoding/charset of the current locale
under which Mozilla is running.
2. Make an html document in MIME charset A, put a link
to an image with a URL
with non-ASCII characters in the filename part
as 'raw octet/byte string' in MIME charset/encoding A.
Make it clear that the document is in charset A
in either the http header or the document itself(with
meta tag).
a. view it under Mozilla with MIME charset set to A
b. view it under Mozilla with MIME charset set other than A
3. Same as 2, but url-encode the filename in charset/encoding A.
4. Same as 2, but url-encode the filename in UTF-8
Case 2 and 3 are not compliant to the standard, but in practice
there are many cases like them.
A simple way to check whether this bug is confined to |nsImageDocument| or is
more generic is to test all four
cases above with two links/urls, one to an image and
the other to a non-image (e.g. text/html, text/plain).
Comment 11•22 years ago
|
||
> A cursory look at the file led me to doubt that this is the case
That file is responsible for setting the title when an image document is loaded
(besides lots of other things, of course)
> The answer is UTF-8
ok, I'll attach a patch
Comment 12•22 years ago
|
||
Comment 13•22 years ago
|
||
this one is better, if the string is no valid utf-8, the unescaped filename is
displayed.
Attachment #113293 -
Attachment is obsolete: true
Comment 14•22 years ago
|
||
Distracted by a 'grand' issue (which turned out to have been
taken care of mostly by nsIUR(I|L) implementions),
I contradicted what I wrote in comment #9
> the issue is where to url-decode it before handing
> it over to the method responsible for the window title bar
> rendering.
by saying in comment #10
> A cursory look at the file led me to doubt that this is the case
This realization bothered me and I got out of bed to make a patch, but
you did it already. However, I have a slightly different patch
copied from somewhere in necko
(which may or may not work) so that I'm gonna upload it.
Comment 15•22 years ago
|
||
I haven't tested this yet. If it works as intended,
it falls back to 'charset' of the refering document (originCharset)
if filename is not UTF-8.
Comment 16•22 years ago
|
||
Comment on attachment 113307 [details] [diff] [review]
an alternative
hmmm
interesting
I did not know that URIs had a charset stored on them. This patch does look
like a better solution.
However, imho you should check the return value of GetOriginCharset too.
I'd try to get r=jst, sr=darin on this patch.
Comment 17•22 years ago
|
||
added a check for GetOriginCharset(). I have yet to check
if this (GetOriginCharset) really works for this case, though.
Attachment #113307 -
Attachment is obsolete: true
Comment 18•22 years ago
|
||
Comment on attachment 113323 [details] [diff] [review]
update to my alternative per Christopher's comment
well, the documentation in nsIURI.idl of originCharset definitely sounds like
it should work.
but you could ask darin, who is the networking owner, which is why I suggested
him for super-review.
Comment 19•22 years ago
|
||
Comment on attachment 113323 [details] [diff] [review]
update to my alternative per Christopher's comment
Judging from the behavior of Mozilla, it works as intended with my patch.
I tried to get some
diagnostic output, but failed.
Nonetheless, I'm asking
r=jst and sr=darin as
suggested by Christopher.
Attachment #113323 -
Flags: superreview?(darin)
Attachment #113323 -
Flags: review?(jst)
Comment 20•22 years ago
|
||
I'm now sure that it works as intended with attachment 113323 [details] [diff] [review].
I ran 'xprop' over mozilla and found that_NET_WM_NAME(UTF8_STRING)
had values they're supposed to have when view|char. coding was set to
Western and Korean(EUC).
![]() |
||
Comment 21•22 years ago
|
||
Comment on attachment 113323 [details] [diff] [review]
update to my alternative per Christopher's comment
nsIURI::originCharset is very often empty indicating an origin charset of
UTF-8.
in which case, it looks like this code would enter UnEscapeURIForUI and error
out
as a result of there being no converter registered under the empty string. so,
this case seems to be covered, but please be sure to verify it. thx!
Attachment #113323 -
Flags: superreview?(darin) → superreview+
Comment 22•22 years ago
|
||
Comment on attachment 113323 [details] [diff] [review]
update to my alternative per Christopher's comment
What darin said, r=jst
Attachment #113323 -
Flags: review?(jst) → review+
Comment 23•22 years ago
|
||
Thank you jst and darin for r/sr and pointing out the problem.
I thought that empty originCharset would lead to 'error out' further down the
road
than nsTextToSubURI.cpp and didn't look further. It turned out that it
didn't
get checked. So, I added a check to nsTextToSubURI.cpp
Attachment #113323 -
Attachment is obsolete: true
Comment 24•22 years ago
|
||
Comment on attachment 113463 [details] [diff] [review]
a new patch with darin's concern addressed
Assuming jst's review
on nsImageDocument.cpp
still standing, now I'm asking nhotta for r on
nsTextToSubURI.cpp part.
darin, after that, can you transfer your sr?
Thank you.
Attachment #113463 -
Flags: superreview?(darin)
Attachment #113463 -
Flags: review?(nhotta)
![]() |
||
Comment 25•22 years ago
|
||
Comment on attachment 113463 [details] [diff] [review]
a new patch with darin's concern addressed
nsCOMPtr<nsITextToSubURI> textToSubURI
I think the line can move after getting the origin charset.
Attachment #113463 -
Flags: review?(nhotta) → review+
![]() |
||
Comment 26•22 years ago
|
||
Comment on attachment 113463 [details] [diff] [review]
a new patch with darin's concern addressed
sr=darin
Attachment #113463 -
Flags: superreview?(darin) → superreview+
Comment 27•22 years ago
|
||
Comment on attachment 113463 [details] [diff] [review]
a new patch with darin's concern addressed
change : url-decode image
urls before putting in the titlebar
risk : low on all platforms
test : on Linux, it's confirmed that it worked as intended.
There's no platform dependence.
On nhotta's comment:
I considered checking
|originCharset| before
|nsTextToSubURI| in my first patch, but didn't
because all the implemenations of |GetOriginCharset| always return NS_OK
as they're now.
Attachment #113463 -
Flags: approval1.3b?
Comment 28•22 years ago
|
||
Comment on attachment 113463 [details] [diff] [review]
a new patch with darin's concern addressed
We're trying to get 1.3beta out the door. This is going to have to wait.
Attachment #113463 -
Flags: approval1.3b? → approval1.3b-
Comment 29•22 years ago
|
||
Comment on attachment 113463 [details] [diff] [review]
a new patch with darin's concern addressed
I'm not sure if I can ask for a for 1.3.
Anyway, it'd be nice to put this tiny (very low risk) patch into 1.3.
change : url-decode image
urls before putting in the titlebar
risk : very low on all platforms
test : on Linux, it's confirmed that it worked as intended.
There's no platform dependence.
Attachment #113463 -
Flags: approval1.3?
Comment 30•22 years ago
|
||
Comment on attachment 113463 [details] [diff] [review]
a new patch with darin's concern addressed
a=asa (on behalf of drivers) for checkin to 1.3.
Attachment #113463 -
Flags: approval1.3? → approval1.3+
Reporter | ||
Comment 31•22 years ago
|
||
Looks like this got checked in.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Comment 32•22 years ago
|
||
Thank you everyone.
I forgot to change the status after check-in. I've just
tested under Win2k(with my build) and found that it worked fine.
You need to log in
before you can comment on or make changes to this bug.
Description
•