Closed
Bug 158171
Opened 22 years ago
Closed 14 years ago
DBCS tab title can't be displayed properly except for html file type.
Categories
(SeaMonkey :: Tabbed Browser, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: kasumi, Assigned: nhottanscp)
References
Details
(Keywords: intl)
Attachments
(3 files, 3 obsolete files)
48.73 KB,
image/jpeg
|
Details | |
1.19 KB,
patch
|
Details | Diff | Splinter Review | |
1.60 KB,
patch
|
nhottanscp
:
review+
nhottanscp
:
superreview+
|
Details | Diff | Splinter Review |
tested on 2007-07-17-08-1.0 Win XP pro. JA SP1 beta1 1. create a .txt file having DBCS as body and file name 2. create a .gif/.jpeg file having DBCS as file name 3. Select File/Open Web location 4. Specify those files as New Navigator tab For .txt file, displayed as "(Untitled)" For .gif/.jpeg file, displayed in return code More detail, please refer attachment
Keywords: intl
Summary: DBCS tab title can't be displayed properly except for html file type. → DBCS tab title can't be displayed properly except for html file type.
Comment 2•22 years ago
|
||
We display tab title as html source <title> for html file. And for non-ascii file name, we currently escape them in many case, e.g. file | open and save file...etc.
Comment 4•22 years ago
|
||
This bug has been fixed by my check-in to bug 152127. Resolving fixed.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Comment 5•22 years ago
|
||
hi Kasumi, could you pls verify whether this is fixed for you using recent TRUNK builds? thanks!
2002-07-29-08-trunk on Win XP Pro. JA SP1 beta1 I can still reproduce, didn't fix yet.
Comment 7•22 years ago
|
||
reopening per comment 6. kasumi, i'm handing over qa contact to you since you have the appropriate configuration for testing this (i don't)...
Status: RESOLVED → REOPENED
QA Contact: sairuh → kasumi
Resolution: FIXED → ---
Comment 8•22 years ago
|
||
Comment on attachment 91846 [details]
detail
This attachment looks correct to me, as far as the tabbed browser is concerned;
if you look at the browser title then this should match the tab title which it
does.
Comment 9•22 years ago
|
||
If the browser title for an image is incorrect then that's an imagelib problem, not a tabbed browser problem.
Reporter | ||
Comment 10•22 years ago
|
||
*** Bug 167771 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 11•22 years ago
|
||
Assignee | ||
Comment 12•22 years ago
|
||
Comment on attachment 98741 [details] [diff] [review] patch from bug 167771, added unescape the title, with a charset if non ASCII. copy sr=bzbarsky from bug 167771
Attachment #98741 -
Flags: superreview+
Assignee | ||
Comment 13•22 years ago
|
||
caillon, could you review the patch? The URI needs unescaping (with a charset for non ASCII).
Assignee: caillon → nhotta
Status: REOPENED → NEW
Assignee | ||
Updated•22 years ago
|
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.2beta
Comment 14•22 years ago
|
||
Comment on attachment 98741 [details] [diff] [review] patch from bug 167771, added unescape the title, with a charset if non ASCII. caillon, doesn't this patch conflict with your patch to bug 164006?
Comment 15•22 years ago
|
||
Comment on attachment 98741 [details] [diff] [review] patch from bug 167771, added unescape the title, with a charset if non ASCII. >Index: bindings/tabbrowser.xml >=================================================================== Yes, as Neil points out you will have merge conflicts with the patch I checked in for bug 164006. > title = titleViaGetter > else if (browser.currentURI.spec && browser.currentURI.spec != "about:blank") { > title = browser.currentURI.spec; >+ try { >+ title = unescape(title); >+ } catch (e) { Erm, how will this code within this catch ever be executed? Unless I am reading ECMA-262 Level 3 incorrectly, unescape should never throw any exceptions or errors... Do we implement something different? Deferring review of this patch pending an answer for the above question.
Assignee | ||
Comment 16•22 years ago
|
||
>Unless I am
>reading ECMA-262 Level 3 incorrectly, unescape should never throw any
>exceptions or errors...
If that is the case then I can change it to always unescape with the document
charset instead.
Comment 17•22 years ago
|
||
unescape will certainly throw an exception in our implementation if the result, after unescaping is not valid "something" (dunno whether "something" is ASCII or UTF8 or ISO-8859-1 or what; I did not experiment). I suspect that the ECMA definition is just plain not compatible with existing real-world practice, based on a brief skim.
Comment 18•22 years ago
|
||
Phil, any insight on this? (see comment 15, comment 17). If we do go against the spec, will we keep it that way and try and change the spec? Or is that a bug in our impl (that won't get wontfixed due to compat)? I suppose the former, and that's fine, I just need to know so I can feel confident about reviewing this patch :-)
Comment 19•22 years ago
|
||
From what I know, the escape() function of JS Engine is superseded by the escape() function of the DOM, and I don't know if the DOM function ever generates exceptions - This issue came up, for example, in bug 58813, "escaped string throws JavaScript error" That makes it sound as if Boris is right -
Comment 20•22 years ago
|
||
Oops - my comment treated escape(), when you were asking about unescape(). But I think the same principle applies: the function that exists in the JavaScript Engine (subject to ECMA-262 Ed. 3) is superseded in the browser by the DOM version of the function. I'm not as familiar with the DOM escape() and unescape() functions as a developer would be, but I suspect both can generate exceptions -
Comment 21•22 years ago
|
||
Comment on attachment 98741 [details] [diff] [review] patch from bug 167771, added unescape the title, with a charset if non ASCII. >Index: bindings/tabbrowser.xml >=================================================================== >RCS file: /cvsroot/mozilla/xpfe/global/resources/content/bindings/tabbrowser.xml,v >retrieving revision 1.60 >diff -u -r1.60 tabbrowser.xml >--- bindings/tabbrowser.xml 10 Sep 2002 08:11:53 -0000 1.60 >+++ bindings/tabbrowser.xml 10 Sep 2002 22:35:49 -0000 >@@ -553,6 +553,19 @@ > title = titleViaGetter > else if (browser.currentURI.spec && browser.currentURI.spec != "about:blank") { > title = browser.currentURI.spec; >+ try { >+ title = unescape(title); >+ } catch (e) { The code right above this uses ex as the exception variable (which you can't see because you are diffing against an older revision. Please update and use the same name. >+ try { >+ // the URI might be non ASCII "non-ASCII" >+ // try unescape again with a characterSet "Try to unescape again, with a character set this time" >+ var textToSubURI = Components.classes["@mozilla.org/intl/texttosuburi;1"] make textToSubURI a const? It will never change... >+ .getService(Components.interfaces.nsITextToSubURI); >+ title = textToSubURI.unEscapeURIForUI(browser.contentDocument.characterSet, I would rather you use the .originCharset of one of the nsIURIs we have. browser.currentURI or better, its fixed up equivalent will work nicely here. The reason for this is that browser.contentDocument.characterSet is _not_ guaranteed to be the actual characterSet of the document. A script can easily mess with that and make it be something that you aren't expecting. >+ title); Additionally, I have an issue with the whole thing about unescaping. The point of having URLs in the tab title is so you can see the URL... If a URL of a site is 'http://foo/bar%20baz/' then I want to see 'http://foo/bar%20baz' and not 'http://foo/bar baz/' I don't mind if you want to make it so that DBCS displays, but I'm not sure I like the whole idea of unconditionally unescaping... I'd like to hear what jag says about this. Nonetheless, this patch needs some work as it stands, and I'd like to see the next patch diffed against the current revision. (Sorry for the conflicts)
Attachment #98741 -
Flags: needs-work+
Comment 22•22 years ago
|
||
By the way, thanks Phil for clarifying that up. I didn't realize there was a DOM version of the function as well. It all makes sense now and I bow to your knowledge of the specs. :-)
Comment 23•22 years ago
|
||
Let's see... We want a URI to display in its URI form (escaped, with all its warts), and we also want filenames to be "human readable", which is achieved by unescaping the filename part. We do this unconditionally for "File -> Save Page As..." because we're always dealing with a file there. In the case of tab titles, sometimes it's a file, sometimes it's not. Is the happy golden road here that we look at the protocol, and if it's file, we unescape, if not, we present it as-is? This has the drawback that a file can have two display forms, the raw URI text when loaded from a server, the unescaped one when loading from disk. Is this acceptable? Perhaps we should display 'http://foo/bar%20baz/' as 'http://foo/bar baz/'? After all, it's only a title, a reminder of what's to be found in the tab, not the real URI. It would certainly make 'http://foopy.com/Hello World.html' more readable.
Comment 24•22 years ago
|
||
Jag, whichever you decide, is fine by me. I wanted mainly for you to be aware of this and to have a say in it. My argument is that by displaying http://foo/bar in the first place (escaped or not) it appears to the user as a URL. Would a user get confused and attempt to type the URL as it appears on the tab title? Unlikely, but it's possible. Is it a big deal? I don't know. The current behavior has the potential of confusing users who have several tabs open. Consider 'http://foo/page%202.html' (page 2.html) and 'http://foo/page202.html' (page202.html). With multiple tabs and cropping the current behavior may have the user see 'http://foo...202.html' for both pages. I _do_ think that if we decide to unescape, we should try and make the title not look like a URL if possible. Something like 'http://foo/ - page 2.html' vs. 'http://foo/ - page202.html' The inner stuff will get cropped off anyway on the tab title, and it may make the tooltip more useful. Get the important information at the quick glance you're afforded with tooltips (they dissapear after a certain timeout), and if you want the real URI, click the tab and get it from the URL bar....
Assignee | ||
Comment 25•22 years ago
|
||
>I would rather you use the .originCharset of one of the nsIURIs we have.
I am rewriting the patch, but using "browser.currentURI.originCharset" does not
work because it is UTF-8 instead of a charset of the document. Can I use
"browser.contentDocument.characterSet"?
Comment 26•22 years ago
|
||
> Can I use "browser.contentDocument.characterSet"?
No. If you have to, for now use:
Components.lookupMethod(browser.contentDocument, 'charset')
.call(browser.contentDocument);
A site can easily change the characterSet with a single JS statement. I'm
writing wrappers for the above statement, but in the meantime, it will do.
Assignee | ||
Comment 27•22 years ago
|
||
>A site can easily change the characterSet with a single JS statement.
I am not sure if that happens frequently. When do you think that could happen?
Comment 28•22 years ago
|
||
>I am not sure if that happens frequently. I can tell you for a fact that it does not happen frequently. >When do you think that could happen? Whenever an author wants to do it. But that's the point - that it CAN happen. IF the user changes the characterSet to something that you are not expecting (e.g. an array, a function, or other weird things), the behavior of your code is then pretty much undefined. Let us not even go there.
Comment 29•22 years ago
|
||
Right, frequency doesn't matter, it's the impact of when it happens that counts, and if there's a simple way to work around is, as shown by caillon, then that's the way to go. See bug 159667 for details.
Assignee | ||
Comment 30•22 years ago
|
||
I was curious why/how that is used. Changing the document charset often makes the page not readable.
Comment 31•22 years ago
|
||
One way is to create an HTML page with <img name="characterSet">.
Assignee | ||
Comment 32•22 years ago
|
||
Assignee | ||
Updated•22 years ago
|
Attachment #98741 -
Attachment is obsolete: true
Assignee | ||
Comment 33•22 years ago
|
||
caillon@returnzero.com, please review the patch.
Comment 34•22 years ago
|
||
Comment on attachment 102033 [details] [diff] [review] Unescape non ASCII. r=caillon
Attachment #102033 -
Flags: review+
Assignee | ||
Comment 35•22 years ago
|
||
bzbarsky, could you review the last patch?
Comment 36•22 years ago
|
||
Comment on attachment 102033 [details] [diff] [review] Unescape non ASCII. Actually, thinking back on this, we have several places IIRC that want to get at the characterSet of the document. Jag, what do you think about adding a <property name="characterSet"> or something like we did for contentTitle (document.title)?
Comment 37•22 years ago
|
||
Shouldn't that code go inside the "if (browser.currentURI.spec) {" block? I can't remember the lookupMethod syntax offhand, but it seems odd to me that one would use lookupMethod to get a property.... Is that correct?
Assignee | ||
Comment 38•22 years ago
|
||
There are two issues raised during the review process. * Getting the charset - the obvious "browser.contentDocument.characterSet" may not be correct (comment #26). This can be filed as a separate bug with a real example if possible. * Generic improvment idea of getting characterSet of the document (comment #36). This also does not have to be a part of this bug. I will attach a patch which uses "browser.contentDocument.characterSet".
Assignee | ||
Comment 39•22 years ago
|
||
Attachment #102033 -
Attachment is obsolete: true
Assignee | ||
Comment 40•22 years ago
|
||
caillon/bzbarsky, please r/sr, thanks.
Comment 41•22 years ago
|
||
Comment on attachment 102179 [details] [diff] [review] updated patch using "browser.contentDocument.characterSet" Absolutely positively NO. I already explained why this should NOT happen. The syntax you used is correct, bzbarsky was a little confused (with good reason) because of the way the function was named. Please do not post patches just based on questions. Questions are made to be answered, not patched. Anyway, Components.lookupMethod is correct because it looks up the getter, which is a method.
Attachment #102179 -
Flags: needs-work+
Assignee | ||
Comment 42•22 years ago
|
||
>I already explained why this should NOT happen.
I am not sure which comment. Could you give me the comment # of your explanation?
Comment 43•22 years ago
|
||
Comment 26. Basically, document.characterSet could be set by the site, and it need not even be set to a string. It could be an object that would do icky evil things. sr=me on attachment 102033 [details] [diff] [review] if the code is moved to the place where it is in attachment 102179 [details] [diff] [review]
Assignee | ||
Comment 44•22 years ago
|
||
Attachment #102179 -
Attachment is obsolete: true
Comment 45•22 years ago
|
||
This is the patch I think we should use. If we have the check anywhere, we should do this for URIs which we actually care about the URI (eg any except about:blank) since those go a different route (string bundles). r=caillon for this patch, since it's really nhottas.
Assignee | ||
Comment 46•22 years ago
|
||
Comment on attachment 102235 [details] [diff] [review] Same as Naoki's patch, but with the check moved and a comment addition r=caillon, sr=bzbarsky according to comment #43 and #45
Attachment #102235 -
Flags: superreview+
Attachment #102235 -
Flags: review+
Comment 47•22 years ago
|
||
yeah, sr=me....
Assignee | ||
Comment 48•22 years ago
|
||
checked in to the trunk
Status: ASSIGNED → RESOLVED
Closed: 22 years ago → 22 years ago
Resolution: --- → FIXED
Reporter | ||
Comment 49•22 years ago
|
||
tested 0n 2002-10-09-11-trunk Win XP Pro. JA Still occurred. Did I use wrong build ?
Reporter | ||
Comment 50•22 years ago
|
||
tested on 2002-10-09-11-trunk Win XP Pro. JA Confirmed fix for .txt file but still occur for graphical files such as .gif, .jpg and .bmp.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 51•22 years ago
|
||
There is no good way to get charsets for those types of files.
Status: REOPENED → ASSIGNED
Target Milestone: mozilla1.2beta → ---
Comment 52•22 years ago
|
||
Yeah; such files will never be served with a useful charset. Do we have a way of getting the document URI? We could try falling back on its origin charset if the documentCharacterSet is empty....
Assignee | ||
Comment 53•22 years ago
|
||
Yes, or we can put the fallback mechnism to the unescape funtion instead of the caller to figure out how to fallback. There are two cases, if local file we can fallback to the OS charset, if ftp then fallback to the browser's default. But for the local file case, this may be resovled once the file system switch to use UTF-8 (bug 166735, bug 162361) since the current code already handles UTF-8 URL.
Comment 54•22 years ago
|
||
> Confirmed fix for .txt file but still occur for graphical files such as .gif,
> .jpg and .bmp.
I thought that for graphical files the title was generated by imglib?
Reporter | ||
Comment 55•22 years ago
|
||
Bug 173616 said Technically mozilla renders the image by constructing a small html in #1. If it's true, graphical file name in tab can be handled same as html.
Comment 56•22 years ago
|
||
>I thought that for graphical files the title was generated by imglib?
well, if you consider content/html/document/src/nsImageDocument.cpp as a part of
imagelib then yes. anyway, that file creates the synthetic html document for
images and therefore also the title.
Updated•16 years ago
|
Product: Core → SeaMonkey
Updated•16 years ago
|
QA Contact: kasumi → tabbed-browser
Comment 57•14 years ago
|
||
Something has been fixed here a long time ago. Please file followup bugs for any remaining issues.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago → 14 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•