Closed
Bug 435002
Opened 17 years ago
Closed 17 years ago
DTDParser should eat BOM
Categories
(Core :: Internationalization: Localization, defect)
RESOLVED
FIXED
People
(Reporter: Pike, Assigned: Pike)
Attachments
(1 file: patch, 1.05 KB)
Apparently, expat accepts UTF-8 BOMs in DTDs without complaint, so let's make the DTDParser do the same. Python only.
Filing a bug as there is some naughtiness to be documented:
The UTF-8 BOM '\xef\xbb\xbf' is consumed by the codecs.open line and decoded to u'\ufeff'. Platform endianness is not an issue here; I verified this by testing on a Windows XP PC and a Mac PPC. That is, codecs.BOM differs between the two ('\xff\xfe' and '\xfe\xff', respectively), but the decoded Unicode character is the same.
I'll attach a patch for reference.
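For readers without the attachment, here is a minimal sketch of the behaviour described above. It is not the attached patch: it is written for Python 3 rather than the Python 2 of the original report, it decodes bytes directly instead of going through codecs.open, and `decode_dtd` is a hypothetical helper name used only for illustration.

```python
import codecs

# The UTF-8 BOM as raw bytes, and the single character it decodes to.
BOM_CHAR = '\ufeff'
assert codecs.BOM_UTF8 == b'\xef\xbb\xbf'
assert codecs.BOM_UTF8.decode('utf-8') == BOM_CHAR


def decode_dtd(raw):
    """Decode raw DTD bytes as UTF-8 and drop one leading BOM, if present.

    Hypothetical helper for illustration; not the function from the patch.
    """
    text = raw.decode('utf-8')
    if text.startswith(BOM_CHAR):
        text = text[len(BOM_CHAR):]
    return text


sample = codecs.BOM_UTF8 + b'<!ENTITY foo "bar">'
print(decode_dtd(sample))

# codecs.BOM is the UTF-16 BOM in the platform's native byte order, so its
# bytes differ between little- and big-endian machines. The decoded
# character checked above does not depend on that.
print(codecs.BOM in (codecs.BOM_LE, codecs.BOM_BE))
```

An alternative, if only UTF-8 input matters, is to decode with Python's 'utf-8-sig' codec, which skips a leading BOM on its own.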
Comment 1•17 years ago (Assignee)
Comment 2•17 years ago (Assignee)
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED