Closed Bug 435002 Opened 16 years ago Closed 16 years ago

DTDParser should eat BOM

Categories

(Core :: Internationalization: Localization, defect)

defect
Not set
normal

RESOLVED FIXED

People

(Reporter: Pike, Assigned: Pike)

Attachments

(1 file)

Apparently, expat eats UTF-8 BOMs in DTDs just fine, so let's add that to the DTDParser. This affects the Python code only.

Filing a bug as there is some naughtiness to be documented:

The UTF-8 BOM '\xef\xbb\xbf' is decoded by the codecs.open line and comes out as u'\ufeff'. The endianness of the platform isn't an issue here; I verified that by testing on a Windows XP PC and a Mac PPC. That is, codecs.BOM differs between the two ('\xff\xfe' and '\xfe\xff', respectively), but the decoded Unicode character is the same.
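
For illustration, here is a minimal sketch of the behaviour described above. It is not the attached patch: `strip_bom` is a hypothetical helper name, the sample entity is made up, and the snippet is written for Python 3, where byte strings need the `b` prefix.

```python
import codecs

# The BOM as a single Unicode character, as it appears after decoding.
BOM = u'\ufeff'


def strip_bom(text):
    """Return text without one leading U+FEFF, if present (hypothetical helper)."""
    if text.startswith(BOM):
        return text[len(BOM):]
    return text


# A made-up DTD fragment prefixed with the UTF-8 BOM bytes '\xef\xbb\xbf'.
raw = codecs.BOM_UTF8 + b'<!ENTITY foo "bar">'

# The plain 'utf-8' codec keeps the BOM and decodes it to u'\ufeff'.
decoded = raw.decode('utf-8')
has_bom = decoded.startswith(BOM)

# What a parser would see once the BOM has been eaten.
cleaned = strip_bom(decoded)

# The UTF-16 BOM bytes differ by byte order, but both decode to the
# same Unicode character, so comparing against u'\ufeff' is portable.
le_char = codecs.BOM_UTF16_LE.decode('utf-16-le')
be_char = codecs.BOM_UTF16_BE.decode('utf-16-be')
```

Comparing against the decoded character rather than the raw bytes is what makes the check independent of the platform's byte order.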

I'll attach a patch for reference.
FIXED.

http://hg.mozilla.org/users/axel_mozilla.com/tooling/index.cgi/rev/f413655cab34
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED