Better parity with Safari in non-unescaped characters

NEW
Unassigned

Status

Camino Graveyard
Location Bar & Autocomplete
--
minor
7 years ago
7 years ago

People

(Reporter: Smokey Ardisson (offline for a while; not following bugs - do not email), Unassigned)

Attachments

(1 attachment)

Created attachment 460178
Notes from making the other patch

From bug 572487 comment 4:

> On the extended testcase attachment 341759, though, there are a bunch of places
> where Firefox draws "missing-glyph boxes" (e.g., tries to render the glyph)
> where it seems like it would make sense to not decode at all (e.g., %00 to
> %08).  Safari doesn't decode any of the characters/codepoints in that testcase,
> except, oddly, U+FFFC.
> 
> Also, it seems like it would be a cleaner solution if we could create a list
> that excluded whole Unicode ranges (where applicable) and then add the specific
> other characters we want to exclude, rather than listing each
> character/codepoint individually.  (We could pick up a very few using
> controlCharacterSet and whitespaceAndNewlineCharacterSet, but I'm very leery of
> using illegalCharacterSet, since it's everything that's illegal or was not
> defined in *Unicode 3.2* and we're on Unicode *5.2*, with lots of new
> characters defined.)

We can't use NSCharacterSets, because there is no way to get the contents of an NSCharacterSet into an NSString, and the NSString/CFStringRef functions don't support ranges.  In the meeting, Stuart suggested writing some code that loops through each range, and that's probably what we'll end up needing here.
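
The looping idea can be sketched as follows. This is only a language-neutral illustration in Python, not Camino code: the function names `expand_ranges` and `utf16_unit_count` and the sample range list are hypothetical, and the two sample ranges are simply ones mentioned in this bug (%00 to %08 and U+FFF0 to U+FFF8). A real fix would do the equivalent in Objective-C and append to the string of characters that should stay escaped.

```python
def expand_ranges(ranges):
    """Return a string holding every code point in the given inclusive ranges."""
    chars = []
    for first, last in ranges:
        if first > last:
            raise ValueError("range start %X is after range end %X" % (first, last))
        # Walk the range one code point at a time, as suggested in the meeting.
        for code_point in range(first, last + 1):
            chars.append(chr(code_point))
    return "".join(chars)


def utf16_unit_count(text):
    """Count UTF-16 code units; code points above U+FFFF need two each."""
    return len(text.encode("utf-16-be")) // 2


# Hypothetical sample list; the real list would be much longer.
EXAMPLE_RANGES = [(0x00, 0x08), (0xFFF0, 0xFFF8)]
excluded = expand_ranges(EXAMPLE_RANGES)
```

Because NSString stores UTF-16, a range above U+FFFF (such as U+E0000 to U+E007F) would presumably have to be appended as surrogate pairs; `utf16_unit_count` shows how the length differs in that case.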

I've attached some notes from when I was making attachment 453245 (the non-fix with NSCharacterSet).  That patch has a better collection of ranges, but these notes were somewhat useful while I was compiling those ranges, and I don't want them to be lost.
I think all we're still missing after take 2 from bug 572487 is

* Unassigned (U+FFF0 to U+FFF8) <-- FFEF - FFF8
* U+FDD0 to U+FDD7, U+FDDA to U+FDDF <-- FDC8-FDDF
* Noncharacters U+1FFFF to U+FFFFF, U+10FFFF
* U+E0000 to U+E007F

but when we go to finish this, we should double-check.
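
One way to double-check the noncharacter entries is to generate them rather than list them by hand. The sketch below is not taken from any patch on this bug, and the helper name `is_noncharacter` is hypothetical. It relies on the Unicode Standard's definition of the 66 noncharacters: U+FDD0 to U+FDEF, plus the last two code points of each of the 17 planes (U+FFFE and U+FFFF, U+1FFFE and U+1FFFF, and so on up to U+10FFFE and U+10FFFF).

```python
def is_noncharacter(code_point):
    """True if code_point is one of Unicode's 66 designated noncharacters."""
    # The contiguous block of 32 noncharacters in the BMP.
    if 0xFDD0 <= code_point <= 0xFDEF:
        return True
    # The last two code points of every plane end in FFFE or FFFF.
    return (code_point & 0xFFFE) == 0xFFFE


# Enumerate every noncharacter across the whole code space (U+0000 to U+10FFFF).
noncharacters = [cp for cp in range(0x110000) if is_noncharacter(cp)]
```

Comparing such a generated list against what the existing patch already excludes would show exactly which noncharacters are still unhandled.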