Closed Bug 93419 Opened 24 years ago Closed 22 years ago

Non-ASCII in URL at http://www.elcoteq.fi/

Categories

(Core :: Layout, defect, P3)

Tracking

()

RESOLVED WORKSFORME
mozilla1.2alpha

People

(Reporter: hsivonen, Assigned: attinasi)

Details

(Whiteboard: [SYNTAX-HTML][NEED TESTCASE])

The front page of Elcoteq (http://www.elcoteq.fi/) has a frame whose URL contains the character 'ä'. The string "Pääkehys" gets mapped to "P%C3%A4%C3%A4kehys". Putting non-ASCII in a URL is wrong. See: http://www.w3.org/TR/html4/appendix/notes.html#non-ascii-chars Mozilla implements the UTF-8 recovery method described in the HTML spec. The UTF-8 sequence for 'ä' in hex is: C3 A4. (I'm quite amazed to see 'ä' in a URL. I thought it was common knowledge among Finnish Web authors that putting non-ASCII in a URL will get you in trouble.)
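To make the mapping above concrete, here is a minimal sketch using Python's standard `urllib.parse.quote`. It only illustrates the two candidate escapings of the frame name; it is not Mozilla's code, and the variable names are made up for illustration.

```python
from urllib.parse import quote

# The frame name from the report (assumed to be "Pääkehys" in the page source).
frame_name = "Pääkehys"

# Raw UTF-8 bytes of 'ä', shown as hex.
utf8_hex = "ä".encode("utf-8").hex()

# Escaping after transcoding to UTF-8 (what the report says Mozilla sends).
utf8_escaped = quote(frame_name, encoding="utf-8")

# Escaping in ISO-8859-1, assumed here to be the document's own encoding.
latin1_escaped = quote(frame_name, encoding="iso-8859-1")

print(utf8_hex)        # c3a4
print(utf8_escaped)    # P%C3%A4%C3%A4kehys
print(latin1_escaped)  # P%E4%E4kehys
```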
P3: minor site, major problem. Classifying as [SYNTAX-HTML], because we don't have a [SYNTAX-URL] pseudo-keyword.
Priority: -- → P3
Whiteboard: [SYNTAX-HTML]
-> Intl
Assignee: bclary → nitot
QA Contact: zach → momoi
The direct URI to the page is http://www.elcoteq.fi/modules/portal/view.cfm?setid=1&framename=Pääkehys which is then translated (as indicated in the first comment) to http://www.elcoteq.fi/modules/portal/view.cfm?setid=1&framename=P%C3%A4%C3%A4kehys by Mozilla. This page fails in both Mozilla and IE 5, so I guess that the translation is wrong.
We should try to fix this problem on the client side. The HTML standard recommends transcoding to UTF-8 and escaping the values in such a case, but it also says that if the agent wants to accommodate existing documents, it could use the document encoding. The fact of the matter is that these types of pages exist, and it is not a good idea to break on them when Comm 4 and IE 5.x do not. It is true that Comm 4 and IE 5 will fail on http://www.elcoteq.fi/modules/portal/view.cfm?setid=1&framename=P%C3%A4%C3%A4kehys just like Mozilla, but they succeed on the ISO-8859-1 escaped version: http://www.elcoteq.fi/modules/portal/view.cfm?setid=1&framename=P%E4%E4kehys Mozilla also succeeds on this. Comm 4 and IE 5 successfully load this portion of the frameset while only Mozilla fails, which means that only Mozilla is not trying the native encoding and escaping into it. For a link which points to the same server on which the current document resides, it is reasonable to try that encoding rather than, or in addition to, UTF-8. If the link points to a totally different server, then that is a different issue and only UTF-8 would be reasonable. I don't believe what Mozilla is doing is reasonable, even measured against the HTML 4 recommendations. Who should own this bug? Layout or i18n? Sending it over to i18n first. Also changing the component to Internationalization.
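The same-server heuristic proposed above could be sketched roughly as follows. This is an illustration in Python, not a description of any browser's implementation: `escape_link` is a hypothetical helper, the document charset is assumed to be known to the caller, and `example.org` is a placeholder host.

```python
from urllib.parse import quote, urlsplit

def escape_link(link, document_url, document_charset):
    """Hypothetical helper: percent-escape non-ASCII characters in a link.

    Links on the same host as the document (or relative links) are escaped
    in the document's charset; links to other hosts are escaped as UTF-8.
    Non-ASCII host names are out of scope for this sketch.
    """
    link_host = urlsplit(link).hostname
    doc_host = urlsplit(document_url).hostname
    same_server = link_host is None or link_host == doc_host
    charset = document_charset if same_server else "utf-8"
    # Keep URL delimiters and existing %XX escapes untouched.
    return quote(link, safe="/:?&=#%", encoding=charset)

same = escape_link(
    "http://www.elcoteq.fi/modules/portal/view.cfm?setid=1&framename=Pääkehys",
    "http://www.elcoteq.fi/",
    "iso-8859-1",
)
other = escape_link("http://example.org/?q=ä", "http://www.elcoteq.fi/", "iso-8859-1")
print(same)   # ...framename=P%E4%E4kehys
print(other)  # http://example.org/?q=%C3%A4
```

A variant that tries the document charset first and falls back to UTF-8 on failure would match the "in addition to" wording, at the cost of an extra request.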
Assignee: nitot → ftang
Component: Evangelism → Internationalization
CCing Valeski.
-> dougt who I believe is eventually going to be UTF8'ing our URI iface.
-> dougt
Assignee: ftang → dougt
We already deal with non-ASCII URLs below the domain name and with internal anchor links in the way I suggested above. See the following bug: http://bugzilla.mozilla.org/show_bug.cgi?id=10373 particularly Erik van der Poel's comment (Additional Comments From Erik van der Poel 2000-05-30 14:27) on why it is not a good idea to blindly transcode into UTF-8 in this case. I happen to think that the current problem is one case that was missed when we fixed Bug 10373. The same approach should be applied in parsing the following, which is not a straightforward URL but ultimately should be treated with the same logic: <frame name="mainframe "src="/modules/portal/view.cfm?setid=1&framename=Pääkehys" marginwidth="0" marginheight="0" frameborder="no" noresize>
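As a rough illustration of applying the same logic to a frame's SRC attribute, the sketch below uses Python's standard `html.parser` to pick up `src` values and escape them in the document charset before resolving them against the document URL. `FrameSrcCollector` is a hypothetical name, the sample markup is a cleaned-up version of the frame tag quoted above, and ISO-8859-1 is an assumed document encoding.

```python
from html.parser import HTMLParser
from urllib.parse import quote, urljoin

class FrameSrcCollector(HTMLParser):
    """Hypothetical collector: escapes frame SRC values in the document charset."""

    def __init__(self, base_url, doc_charset):
        super().__init__()
        self.base_url = base_url
        self.doc_charset = doc_charset
        self.frame_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "frame":
            return
        src = dict(attrs).get("src")
        if src is None:
            return
        # Escape non-ASCII in the document's encoding, leaving delimiters alone.
        escaped = quote(src, safe="/:?&=#%", encoding=self.doc_charset)
        self.frame_urls.append(urljoin(self.base_url, escaped))

collector = FrameSrcCollector("http://www.elcoteq.fi/", "iso-8859-1")
collector.feed(
    '<frame name="mainframe" '
    'src="/modules/portal/view.cfm?setid=1&framename=Pääkehys" noresize>'
)
print(collector.frame_urls)
```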
I agree with Katsuhiko Momoi: this problem belongs in the Layout component. Only there is the document encoding known, and I don't think this will change when we are UTF8'ing the URI implementation. Reassigning to Layout ...
Assignee: dougt → karnaze
Component: Internationalization → Layout
QA Contact: momoi → petersen
Need a testcase, please. Not table-specific; reassigning to core owner.
Assignee: karnaze → attinasi
Whiteboard: [SYNTAX-HTML] → [SYNTAX-HTML][NEED TESTCASE]
Target Milestone: --- → mozilla1.2
This bug will be fixed by the following bugs: http://bugzilla.mozilla.org/show_bug.cgi?id=124042 http://bugzilla.mozilla.org/show_bug.cgi?id=127282 The patch for 127282 includes Frame SRC support, which already has super-review. ============== From 127282 Begin ================== -- Additional Comment #56 From nhotta@netscape.com 2002-03-08 16:14 -- Created an attachment (id=73307) Frame SRC support, combined with the last patch. ============== From 127282 End ================== Once that patch goes in, this bug needs to be tested for verification. Thanks.
Depends on: 124042, 127282
The URL no longer exists. Resolving WFM.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → WORKSFORME