Closed
Bug 1316339
Opened 8 years ago
Closed 8 years ago
Exception in .parse method reading a .inc file with utf-8 characters in comments
Categories
(Localization Infrastructure and Tools :: compare-locales, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: flod, Assigned: flod)
Attachments
(3 files, 1 obsolete file)
This is the exception I get when parsing the attached file:

Traceback (most recent call last):
  File "./app/scripts/tmx_products.py", line 226, in <module>
    main()
  File "./app/scripts/tmx_products.py", line 221, in main
    extracted_strings.extractStrings()
  File "./app/scripts/tmx_products.py", line 139, in extractStrings
    entities, map = file_parser.parse()
  File "/home/flodolo/transvision/libraries/compare-locales/compare_locales/parser.py", line 224, in parse
    for e in self:
  File "/home/flodolo/transvision/libraries/compare-locales/compare_locales/parser.py", line 248, in walk
    entity, offset = self.getEntity(ctx, offset)
  File "/home/flodolo/transvision/libraries/compare-locales/compare_locales/parser.py", line 492, in getEntity
    return (self.createEntity(ctx, m), offset)
  File "/home/flodolo/transvision/libraries/compare-locales/compare_locales/parser.py", line 280, in createEntity
    pre_comment = str(self.last_comment) if self.last_comment else ''
UnicodeEncodeError: 'ascii' codec can't encode character u'\u010d' in position 216: ordinal not in range(128)
Assignee
Comment 1•8 years ago
I'm not completely sure it's relevant, but for some strange reason the exception is reported multiple times: once for the file with the issue, and once for each later .inc file I analyze, almost as if the parser keeps the exception stored somewhere.
Assignee
Comment 2•8 years ago
I think I've identified the code responsible for the exception: on line 280 of parser.py, str() should be unicode().
https://hg.mozilla.org/l10n/compare-locales/file/tip/compare_locales/parser.py#l280

But I have no clue why the exception keeps piling up.
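To illustrate the diagnosis above: in Python 2, calling str() on a unicode object implicitly encodes it with the ASCII codec, which fails on a character such as U+010D. The sketch below is written for Python 3, where that implicit step has to be spelled out with .encode("ascii"). It is not the compare-locales code: the helper names py2_style_str and safe_pre_comment and the sample comment text are hypothetical.

```python
# Hypothetical sample: a comment containing U+010D, the character from the traceback.
comment = "# Translators: \u010d is a non-ASCII character"


def py2_style_str(text):
    """Mimic Python 2's str(unicode_obj): an implicit encode with the ASCII codec."""
    return text.encode("ascii")


def safe_pre_comment(last_comment):
    """Keep the comment as text instead of forcing it through ASCII.

    This plays the role of replacing str() with unicode() in Python 2.
    """
    return last_comment if last_comment else ""


try:
    py2_style_str(comment)
    failed = False
    error_message = ""
except UnicodeEncodeError as exc:
    # Same exception class as in the report above.
    failed = True
    error_message = str(exc)

print(failed)
print(error_message)
print(safe_pre_comment(comment))
```

Keeping the comment as text end to end avoids the implicit ASCII round trip. Any encoding is then left to the point where the data is actually written out.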
Assignee
Comment 3•8 years ago
I think there might be another issue in the parser with the attached file: since the line

#define seamonkey_l10n_long

doesn't assign a value to 'seamonkey_l10n_long', the following instruction '#unfilter' is lost.
Assignee
Comment 4•8 years ago
(In reply to Francesco Lodolo [:flod] from comment #3)
> I think there might be another issue in the parser with the attached file:
> since the line
>
> #define seamonkey_l10n_long
>
> doesn't assign a value to the 'seamonkey_l10n_long', the following
> instruction '#unfilter' is lost.

Never mind, it works as expected if there's a space/tab after the entity name, and my editor is trimming whitespace.
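The behaviour described here (a #define is only recognised when a space or tab follows the name) can be reproduced with a pair of regular expressions. This is a minimal sketch and not the pattern used in compare_locales/parser.py: STRICT_DEFINE and LENIENT_DEFINE are hypothetical, and the sample text only imitates the attached file.

```python
import re

# Hypothetical pattern: whitespace is mandatory after the key, so a define
# whose trailing space was trimmed by an editor does not match at all.
STRICT_DEFINE = re.compile(
    r"^#define[ \t]+(?P<key>\w+)[ \t]+(?P<val>[^\n]*)$", re.M
)

# Hypothetical pattern: the whitespace-plus-value part is optional, so a
# define with no value is still recognised as an entity.
LENIENT_DEFINE = re.compile(
    r"^#define[ \t]+(?P<key>\w+)(?:[ \t]+(?P<val>[^\n]*))?$", re.M
)

# Sample input imitating the report: no trailing whitespace after the key.
trimmed = "#define seamonkey_l10n_long\n#unfilter emptyLines\n"
# Same input with the trailing space that the editor removed.
untrimmed = "#define seamonkey_l10n_long \n#unfilter emptyLines\n"

strict_on_trimmed = STRICT_DEFINE.search(trimmed)
strict_on_untrimmed = STRICT_DEFINE.search(untrimmed)
lenient_on_trimmed = LENIENT_DEFINE.search(trimmed)

print(strict_on_trimmed)                   # no match object for the trimmed line
print(strict_on_untrimmed.group("key"))    # key found when the space is present
print(lenient_on_trimmed.group("key"))     # key found even without a value
```

Making the value optional removes the dependency on invisible trailing whitespace, which is easy to lose when an editor trims lines on save.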
Comments 5-7 hidden (mozreview-request)
Comment 8•8 years ago
mozreview-review
Comment on attachment 8809771 [details]
Bug 1316339 - Support UTF-8 characters in comments within .inc files;

https://reviewboard.mozilla.org/r/92298/#review92308

r=me with the follow-up
Attachment #8809771 -
Flags: review?(l10n) → review+
Updated•8 years ago
Attachment #8809772 -
Flags: review?(francesco.lodolo)
Assignee
Comment 9•8 years ago
mozreview-review
Comment on attachment 8809772 [details]
bug 1316339, follow up to allow defines with no value

https://reviewboard.mozilla.org/r/92300/#review92312

Thanks, it makes a lot more sense like this.
Attachment #8809772 -
Flags: review?(francesco.lodolo) → review+
Assignee
Updated•8 years ago
Attachment #8809387 -
Attachment is obsolete: true
Attachment #8809387 -
Flags: review?(l10n)
Assignee
Updated•8 years ago
Assignee: nobody → francesco.lodolo
Comment 10•8 years ago
Pushed upstream: https://hg.mozilla.org/l10n/compare-locales/pushloghtml?changeset=0effb60622ea
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED