Closed Bug 77785 Opened 24 years ago Closed 23 years ago

URL: text2HTML: intl characters not recognized as end separators

Categories

(MailNews Core :: Backend, enhancement, P5)

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: tarahim, Assigned: mscott)

Details

(Keywords: intl)

2001042508 Mtrunk. If a URL is enclosed in Japanese parentheses, clicking the URL sends it to the browser with three extra bytes appended. For example, if an ISO-2022-JP message contains the line （http://www.n-kan.com/kouen/2001/0407.html）, the closing parenthesis is underlined when the message is viewed in the Mail window. Clicking this hyperlink results in the browser URL bar showing: http://www.n-kan.com/seityo/20010420.html%EF%BC%89 This was not the case in past builds.
Sorry for the mis-pasting. The original line in the example was: （http://www.n-kan.com/seityo/20010420.html） and the resulting URL in the URL bar was: http://www.n-kan.com/seityo/20010420.html%EF%BC%89
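For reference, the three extra bytes match the UTF-8 encoding of the full-width closing parenthesis (U+FF09), percent-escaped. A minimal Python sketch using only the standard library; it illustrates the escaping and is not the Mozilla code path:

```python
from urllib.parse import quote

closing_paren = "\uFF09"  # FULLWIDTH RIGHT PARENTHESIS

# The character occupies three bytes in UTF-8.
print(closing_paren.encode("utf-8"))  # b'\xef\xbc\x89'

# Percent-escaping those bytes yields the suffix seen in the URL bar.
print(quote(closing_paren))  # %EF%BC%89

# The same thing happens when the character is swallowed into the link.
link = "http://www.n-kan.com/seityo/20010420.html" + closing_paren
escaped = quote(link, safe=":/")
print(escaped)  # http://www.n-kan.com/seityo/20010420.html%EF%BC%89
```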
I can reproduce this with NS6.01. I tried a plain text mail.
Keywords: intl
OS: Mac System 9.x → All
Hardware: Macintosh → All
I think mozITXTToHTMLConv creates the link. Reassign to ben.bucksch@beonex.com, cc to putterman, jenm.
Assignee: nhotta → ben.bucksch
This problem still exists on 6.5 04/26 trunk build.
This problem happens as long as there is a Ja character right after the URL, like http://home.netscape.comXXXX, here X stands for a Ja character.
I suggest WONTFIX. You cannot use any character that looks good to you to delimit the URL. The "right" way is to enclose the URL in < and > (preferred) or double quotes " and ". Whitespace is also acceptable, but more failure-prone. Everything else that might work as a delimiter is either
- coincidence (because the character is invalid in URIs), or
- heuristics adjusted to Roman languages.
It's true that this is a bias, but we can't reasonably introduce heuristics for all the languages of the world.
Unknown characters (not explicitly invalid in URIs and not covered by heuristics) are treated as valid characters of the URI. Thus, all characters up to a recognized delimiter are considered part of the URI. This might include Japanese characters directly following the Japanese quote (which, I assume, is != the Roman ones). These characters might be escaped later, which leads to the "%EF%BC%89".
Actually, it's a coincidence that this URI is recognized at all, considering that there is a Japanese character at the beginning of the scheme. You could argue that *this* is a bug.
Use < and > or at the very least whitespace around the URI. References: RFC2396, especially appendix E.
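The behaviour described above can be sketched roughly as follows. This is a simplified illustration in Python, not the actual mozITXTToHTMLConv logic: the delimiter set and the function names `find_url_end` and `extract_link` are assumptions made for the example, and the scheme detection is deliberately naive.

```python
from urllib.parse import quote

# Hypothetical set of recognized end delimiters: whitespace,
# angle brackets, and double quotes.
DELIMITERS = set(' \t\r\n<>"')

def find_url_end(text, start):
    """Advance until a recognized delimiter or the end of the text."""
    end = start
    while end < len(text) and text[end] not in DELIMITERS:
        end += 1
    return end

def extract_link(text):
    """Return the first http:// link, percent-escaped, or None."""
    start = text.find("http://")
    if start == -1:
        return None
    end = find_url_end(text, start)
    # Unknown characters were kept as part of the URI; they are
    # escaped only at this later step.
    return quote(text[start:end], safe=":/?&=#%")

japanese = "\uFF08http://www.n-kan.com/seityo/20010420.html\uFF09"
bracketed = "<http://www.n-kan.com/seityo/20010420.html>"

print(extract_link(japanese))
# http://www.n-kan.com/seityo/20010420.html%EF%BC%89
print(extract_link(bracketed))
# http://www.n-kan.com/seityo/20010420.html
```

Because the full-width parenthesis is not in the delimiter set, it is absorbed into the link, whereas the `>` stops the scan cleanly.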
Severity: major → normal
Priority: -- → P5
.
Assignee: ben.bucksch → mozilla
I do not get it, Ben. This is a plain text message. Nobody cares what RFC says about correctness.
The RFC in question is relevant because it defines the format and parsing of URLs. Capsule summary of Ben's message: don't expect a URL to work properly if it's enclosed in Roman parentheses, Japanese parentheses, brackets, curly braces, or *anything* except angle brackets <> or maybe whitespace.
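As an illustration of the angle-bracket convention, here is a small Python sketch that pulls out only URIs written as <...>, optionally with a "URL:" prefix, and drops embedded whitespace. The pattern and the function name `bracketed_uris` are assumptions for this example; it does not validate URI syntax and would also match other bracketed text.

```python
import re

# Anything between < and >, with an optional "URL:" prefix.
BRACKETED = re.compile(r"<(?:URL:)?([^<>]+)>")

def bracketed_uris(text):
    """Return bracketed URIs with all embedded whitespace removed."""
    return ["".join(match.split()) for match in BRACKETED.findall(text)]

# Japanese text directly adjacent to the brackets does not leak in.
sample = "\u898b\u3066<http://home.netscape.com>\u3067\u3059"
print(bracketed_uris(sample))  # ['http://home.netscape.com']
```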
Right. WONTFIX. Reopen, if you violently disagree. I'm also reassigning, in case that happens (I don't plan to fix it).
Assignee: mozilla → mscott
Product: MailNews → Browser
.
Severity: normal → enhancement
Status: NEW → RESOLVED
Closed: 23 years ago
Component: Internationalization → Networking
Resolution: --- → WONTFIX
Summary: some characters corrupts the interpretation of URL as hyperlink in plain text message. → Some characters not recognized as URL end separators in plain text message
Summary: Some characters not recognized as URL end separators in plain text message → Some intl characters not recognized as URL end separators in plain text message
-> mailnews/backend, per esther
Component: Networking → Mail Back End
Product: Browser → MailNews
Summary: Some intl characters not recognized as URL end separators in plain text message → URL: text2HTML: intl characters not recognized as end separators
Product: MailNews → Core
Product: Core → MailNews Core