Open
Bug 977923
Opened 11 years ago
Extended ASCII characters cause importxml.pl to fail on MULTIPLE bugs, but not on ONE
Categories
(Bugzilla :: Bug Import/Export & Moving, defect)
UNCONFIRMED
People
(Reporter: jwiseheart, Unassigned)
Details
User Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0 (Beta/Release)
Build ID: 20140212131424
Steps to reproduce:
I attempted to import multiple bugs with importxml.pl, following the DTD. Here is a stripped-down example XML file, with names changed for anonymity:
<bugzilla version="4.4.2" urlbase="http://bugzilla/" maintainer="admin@ourcompany.com" exporter="bugzilla@ourcompany.com">
<bug>
<bug_id>723</bug_id>
...other bug parameters here...
<long_desc>
<thetext><![CDATA[ This imported text has no special characters. ]]></thetext>
</long_desc>
</bug>
<bug>
<bug_id>724</bug_id>
...other bug parameters here...
<long_desc>
<thetext><![CDATA[ This imported text has extended ASCII characters like Bërt's name and ±5°F ]]></thetext>
</long_desc>
</bug>
<bug>
<bug_id>725</bug_id>
...other bug parameters here...
<long_desc>
<thetext><![CDATA[ This text we import has no special characters either. ]]></thetext>
</long_desc>
</bug>
</bugzilla>
Actual results:
In this example, bug_id 723 imports, but extended ASCII characters (character code > 127) such as ë, ±, and ° cause bug_id 724 to throw an error similar to the following:
"not well-formed (invalid token) at line 3, column 286, byte 410 at /usr/lib/perl5/vendor_perl/5.10.0/i586-linux-thread-multi/XML/Parser.pm line 187
at ./importxml.pl line 1267"
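The error above gives a line, column, and byte offset, which suggests the parser stopped on a specific byte. One possible cause, assuming the file is saved in a single-byte encoding such as Windows-1252 and carries no matching encoding declaration (the report does not say either way), is that the parser treats the input as UTF-8 and rejects bytes like 0xEB (ë in that encoding). The following is a minimal diagnostic sketch, not part of Bugzilla, that lists every byte sequence in a file's contents that is not valid UTF-8. The function name `find_invalid_utf8` is hypothetical.

```python
def find_invalid_utf8(data: bytes):
    """Return (line, column, byte_offset) for each invalid UTF-8 sequence.

    Lines are 1-based; columns and byte offsets are 0-based and counted
    in bytes, not characters.
    """
    problems = []
    pos = 0
    while pos < len(data):
        try:
            # Decode the rest of the input; success means nothing else is wrong.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as err:
            bad = pos + err.start  # absolute offset of the offending byte
            line = data.count(b"\n", 0, bad) + 1
            column = bad - (data.rfind(b"\n", 0, bad) + 1)
            problems.append((line, column, bad))
            pos += err.end  # resume just after the invalid sequence
    return problems


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python find_invalid_utf8.py bugs.xml
    with open(sys.argv[1], "rb") as handle:
        for line, column, offset in find_invalid_utf8(handle.read()):
            print(f"invalid UTF-8 at line {line}, column {column}, byte {offset}")
```

If this reports nothing, the file is already valid UTF-8 and the encoding assumption above does not explain the failure.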
Expected results:
To try to get a better look at what was happening, I copied JUST THE ONE BUG that was causing an error into a new XML file, like so:
<bugzilla version="4.4.2" urlbase="http://bugzilla/" maintainer="admin@ourcompany.com" exporter="bugzilla@ourcompany.com">
<bug>
<bug_id>724</bug_id>
...other bug parameters here...
<long_desc>
<thetext><![CDATA[ This imported text has extended ASCII characters like Bërt's name and ±5°F ]]></thetext>
</long_desc>
</bug>
</bugzilla>
When I try running importxml.pl with one bug at a time like this, it imports fine, extended ASCII characters and all!
Why does this work for a single bug with extended ASCII characters, but fail on an XML file with multiple bugs?
I have 4,000 bugs to import, and a few hundred have this problem...
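As a stopgap for a large import, the manual step described above (copying one bug into its own file) could be automated. This is a rough sketch under stated assumptions, not a fix for importxml.pl: it works on raw bytes so the original encoding is left untouched, it assumes the root element is `<bugzilla ...>` as in the sample, and it assumes no comment text contains a literal `</bug>`. The name `split_bugs` is hypothetical.

```python
import re

# Opening tag of the root element, with whatever attributes it carries.
ROOT_OPEN = re.compile(rb"<bugzilla\b[^>]*>")
# One complete <bug>...</bug> element; does not match <bug_id> or <bugzilla>.
BUG_BLOCK = re.compile(rb"<bug(?:\s[^>]*)?>.*?</bug>", re.DOTALL)


def split_bugs(data: bytes) -> list[bytes]:
    """Split a multi-bug export into one single-bug XML document per bug."""
    root = ROOT_OPEN.search(data)
    if root is None:
        raise ValueError("no <bugzilla> root element found")
    # Keep everything up to and including the root tag (any XML declaration
    # or DOCTYPE included) so each output document has the same preamble.
    header = data[: root.end()]
    return [
        header + b"\n" + match.group(0) + b"\n</bugzilla>\n"
        for match in BUG_BLOCK.finditer(data, root.end())
    ]
```

Each returned document can be written to its own file and passed to importxml.pl one at a time, mirroring the single-bug run that succeeded.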