Open Bug 706691 Opened 13 years ago Updated 2 years ago

Use separate types for ASCII vs punycode vs UTF-8 strings, especially for hostnames

Status:

NEW

People

(Reporter: briansmith, Unassigned)

Details

(Whiteboard: [necko-would-take])

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Reporter

Description

•

13 years ago

+++ This bug was initially created as a clone of Bug #703508 +++

In bug 703508 comment 10, Kai noticed that the character encoding of nsNSSSocketInfo::mHostName is unclear. When we store hostnames in strings in Necko and PSM, we should make the encoding we expect unambiguous, e.g. by having a "punycode" string type.

Is it the case that all the non-UTF8 8-bit strings used for hostnames in Necko are considered punycode? It seems like we shouldn't have *any* code that is ASCII-but-not-to-be-interpreted-as-punycode, because such code wouldn't support IDNs at all.
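
To illustrate the kind of thing being proposed, here is a minimal standalone sketch (C++17, using std::string rather than ns[A]CString). The names AceHostName, Utf8HostName, FromBytes, Raw and HasPunycodeLabel are hypothetical and do not exist in Necko or PSM. The sketch only shows how distinct wrapper types could make the expected encoding part of a function signature; it does no real IDNA processing.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical type: a hostname known to contain only 7-bit ASCII bytes,
// where "xn--" labels are meant to be read as punycode.
class AceHostName {
 public:
  // Returns std::nullopt if any byte is outside 7-bit ASCII.
  static std::optional<AceHostName> FromBytes(std::string aBytes) {
    for (unsigned char c : aBytes) {
      if (c > 0x7F) {
        return std::nullopt;
      }
    }
    return AceHostName(std::move(aBytes));
  }

  const std::string& Raw() const { return mValue; }

 private:
  explicit AceHostName(std::string aValue) : mValue(std::move(aValue)) {}
  std::string mValue;
};

// Hypothetical type: a hostname expected to be UTF-8. This sketch does not
// validate the bytes; a real type would presumably check well-formedness.
class Utf8HostName {
 public:
  explicit Utf8HostName(std::string aBytes) : mValue(std::move(aBytes)) {}
  const std::string& Raw() const { return mValue; }

 private:
  std::string mValue;
};

// The types never convert into each other (or from a raw string) implicitly,
// so mixing encodings is a compile-time error rather than a review question.
static_assert(!std::is_convertible_v<std::string, AceHostName>);
static_assert(!std::is_convertible_v<Utf8HostName, AceHostName>);
static_assert(!std::is_convertible_v<AceHostName, Utf8HostName>);

// Hypothetical consumer: the parameter type states the expected encoding.
// Simplification: the "xn--" prefix check is case-sensitive here.
inline bool HasPunycodeLabel(const AceHostName& aHost) {
  const std::string& s = aHost.Raw();
  std::size_t start = 0;
  while (start <= s.size()) {
    if (s.compare(start, 4, "xn--") == 0) {
      return true;
    }
    const std::size_t dot = s.find('.', start);
    if (dot == std::string::npos) {
      break;
    }
    start = dot + 1;  // Move to the first byte of the next label.
  }
  return false;
}
```

With types like these, a signature taking "const AceHostName&" would answer the encoding question that "ns[A]CString & hostname" leaves open, and passing a Utf8HostName by mistake would fail to compile.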

Christian :Biesinger (don't email me, ping me on IRC)

Comment 1

•

13 years ago

Necko itself always stores hostnames as punycode or as the original UTF-8 strings (which generally are normalized to UTF-8 even when input as punycode originally). Where would any other encodings come from?

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Reporter

Comment 2

•

13 years ago

(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #1)
> Necko itself always stores hostnames as punycode or as the original UTF-8
> strings (which generally are normalized to UTF-8 even when input as punycode
> originally). 

Good to hear.

> Where would any other encodings come from?

I don't know. My main point is that "ns[A]CString & hostname" doesn't scream "punycode," which leads to confusion and doubt like in Kai's review of my patches in bug 703508 and bug 674147.

Patrick McManus [:mcmanus]

Updated

•

8 years ago

Whiteboard: [necko-would-take]

Firefox Bug Husbandry Bot

Comment 3

•

7 years ago

Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258

Priority: -- → P5

BMO Automation

Updated

•

2 years ago

Severity: normal → S3


Categories

(Core :: Networking, defect, P5)
