Bug 723448
Opened 12 years ago
Closed 12 years ago
abcdump misreads string literals that begin with the UTF BOM
Categories
(Tamarin Graveyard :: Tools, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: pnkfelix, Assigned: pnkfelix)
Details
Attachments (2 files)
509 bytes, text/plain
1.97 KB, patch (edwsmith: review+)
Spawned off of Bug 707700.

abcdump mishandles an abc compiled from the source file containing solely:

    '\ufeff1234'

I think this is likely an artifact of this code at the start of ByteArrayObject::readUTFBytes:

    String* ByteArrayObject::readUTFBytes(uint32_t length)
    {
        if (m_byteArray.Available() < length)
            toplevel()->throwEOFError(kEOFError);

        const uint8_t* p = (const uint8_t*)m_byteArray.GetReadableBuffer() + m_byteArray.GetPosition();

        // Skip UTF8 BOM (but it is still counted in the length we consume).
        if (length >= 3 && p[0] == 0xEFU && p[1] == 0xBBU && p[2] == 0xBFU)
        {
            p += 3;
            length -= 3;
        }
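To make the effect of that check concrete, here is a minimal standalone sketch. It is not the Tamarin code: `read_utf_bytes_skipping_bom` is a hypothetical stand-in that applies the same three-byte test to a `std::string` of raw bytes and returns whatever would be left for decoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the BOM check quoted above (not the Tamarin
// function): returns the bytes that remain once a leading UTF-8 BOM,
// if present, has been skipped.
std::string read_utf_bytes_skipping_bom(const std::string& bytes) {
    const uint8_t* p = reinterpret_cast<const uint8_t*>(bytes.data());
    std::size_t length = bytes.size();

    // Same test as in the quoted snippet: EF BB BF at the very start.
    if (length >= 3 && p[0] == 0xEFU && p[1] == 0xBBU && p[2] == 0xBFU) {
        p += 3;
        length -= 3;
    }
    return std::string(reinterpret_cast<const char*>(p), length);
}
```

Under these assumptions, passing the seven bytes `EF BB BF 31 32 33 34` (the UTF-8 encoding of '\ufeff1234') returns only the four bytes of `1234`, so the leading U+FEFF never reaches the caller.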
Comment 1•12 years ago (Assignee)
This code is illustrating that we cannot round-trip a String that starts with a BOM via writeUTFBytes and readUTFBytes.

When I run this code, here is what I get:

    % $AVM testbom.abc
    s1: 1234
    s1.length: 5
    b.length: 7
    read(0) s:  s.length: 0
    read(1) s: ï s.length: 1
    read(2) s: ï» s.length: 2
    read(3) s:  s.length: 0
    read(4) s: 1 s.length: 1
    read(5) s: 12 s.length: 2
    read(6) s: 123 s.length: 3
    read(7) s: 1234 s.length: 4
    s1: 1234 s: 1234
    s1.length: 5 s.length: 4
    thus equal: false
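The s.length column in that output follows the pattern one would expect if the skip only fires once at least three bytes are requested. A rough sketch, assuming the 7 bytes in b are `EF BB BF 31 32 33 34` and counting the bytes left after the skip rather than decoded characters (the two happen to agree in this run, since every leftover byte showed up as one character). Both helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: number of bytes left for decoding when the first n
// bytes of buf are read with the leading-BOM skip applied.
std::size_t bytes_left_after_bom_skip(const std::string& buf, std::size_t n) {
    const uint8_t* p = reinterpret_cast<const uint8_t*>(buf.data());
    std::size_t length = n < buf.size() ? n : buf.size();

    // The skip needs all three BOM bytes inside the requested length.
    if (length >= 3 && p[0] == 0xEFU && p[1] == 0xBBU && p[2] == 0xBFU)
        length -= 3;
    return length;
}

// Hypothetical driver mirroring the read(0) .. read(7) loop in the output.
std::vector<std::size_t> remaining_per_prefix(const std::string& buf) {
    std::vector<std::size_t> out;
    for (std::size_t n = 0; n <= buf.size(); ++n)
        out.push_back(bytes_left_after_bom_skip(buf, n));
    return out;
}
```

For that 7-byte buffer, `remaining_per_prefix` gives 0, 1, 2, 0, 1, 2, 3, 4: the count drops back to zero at n = 3, which is the same point where s.length does in the run above.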
Comment 2•12 years ago (Assignee)
(In reply to Felix S Klock II from comment #1)
> Created attachment 593783 [details]
> code illustrating issue when using readUTFBytes
>
> This code is illustrating that we cannot round-trip a String that starts
> with a BOM via writeUTFBytes and readUTFBytes.

(Of course it is easy to compensate for this behavior. Investigating now.)
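One possible shape for such a compensation, sketched on the caller side: peek at the raw bytes before the BOM-stripping read and, if they start with EF BB BF, put the BOM back in front of the result. This is only an illustration of the idea, not the attached patch. Both functions are hypothetical, and `read_stripping_bom` merely mimics the behavior described in this bug.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical reader that mimics the BOM-stripping behavior described in
// this bug; it stands in for readUTFBytes and is not the real function.
std::string read_stripping_bom(const std::string& bytes) {
    static const std::string kBom = "\xEF\xBB\xBF";
    if (bytes.compare(0, kBom.size(), kBom) == 0)
        return bytes.substr(kBom.size());
    return bytes;
}

// Hypothetical caller-side compensation: check the raw bytes for a leading
// BOM and re-attach it to whatever the stripping reader returned.
std::string read_preserving_bom(const std::string& bytes) {
    static const std::string kBom = "\xEF\xBB\xBF";
    std::string s = read_stripping_bom(bytes);
    if (bytes.compare(0, kBom.size(), kBom) == 0)
        s.insert(0, kBom);
    return s;
}
```

With this wrapper the seven bytes of '\ufeff1234' come back unchanged, while input without a leading BOM passes through untouched.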
Comment 3•12 years ago (Assignee)
(Moved from Bug 707700, comment 8.)

As I said on the original bug:

Ed: redirect review as you like (or rubber stamp if you prefer). I am pretty sure we would prefer to accurately capture the string constant we read in (that is, I am claiming this is a 'real bug').
Updated•12 years ago
Attachment #593816 - Flags: review?(edwsmith) → review+
Comment 4•12 years ago
changeset: 7193:99987b969155
user:      Felix S Klock II <fklockii@adobe.com>
summary:   Bug 723448: abcdump: undo in readUTFBytes leading UTF BOM compensation (r=edwsmith).

http://hg.mozilla.org/tamarin-redux/rev/99987b969155
Updated•12 years ago (Assignee)
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED