Closed
Bug 340714
Opened 19 years ago
Closed 19 years ago
nsIConverterInputStream duplicates the last character in the first buffer
Categories
(Core :: Internationalization, defect)
Tracking
()
VERIFIED
FIXED
People
(Reporter: jgrills, Assigned: smontagu)
Details
(Keywords: verified1.8.1.22)
Attachments
(2 files, 1 obsolete file)
8.09 KB,
patch
|
jag+mozilla
:
superreview+
|
Details | Diff | Splinter Review |
5.32 KB,
text/html
|
Details |
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1)
Build Identifier: Celtx 0.9.7.1RC
When reading a UTF-16 file using the nsIConverterInputStream API with nsIUnicharLineInputStream and calling the readLine() function on a UTF-16 file encoded in LE format with a starting BOM, the last character in the first buffer (where the buffer size was passed to the init call) will be duplicated and end up in the string twice.
Reproducible: Always
Steps to Reproduce:
Here's the code that I'm using:
var captionLocalFile = Components.classes["@mozilla.org/file/local;1"].createInstance(Components.interfaces.nsILocalFile);
captionLocalFile.initWithPath(captionFileName);
var captionFileInputStream = Components.classes["@mozilla.org/network/file-input-stream;1"].createInstance(Components.interfaces.nsIFileInputStream);
captionFileInputStream.init(captionLocalFile, 0x01, 0444, 0);
var captionFile = Components.classes["@mozilla.org/intl/converter-input-stream;1"].createInstance(Components.interfaces.nsIConverterInputStream);
captionFile.init(captionFileInputStream, "UTF-16",1024, Components.interfaces.nsIConverterInputStream.DEFAULT_REPLACEMENT_CHARACTER);
if (!(captionFile instanceof Components.interfaces.nsIUnicharLineInputStream))
throw "not line input stream";
var more = true;
while (more)
{
// read the line and check for eof
var line = {};
more = captionFile.readLine(line);
}
Updated•19 years ago
|
Assignee: smontagu → nobody
Component: Internationalization → Networking
QA Contact: amyy → networking
Updated•19 years ago
|
Assignee: nobody → smontagu
Component: Networking → Internationalization
QA Contact: networking → amyy
Comment 1•19 years ago
|
||
does this happen for all UTF-16 files, or just a specific one? (could you attach one for easier testing?)
Assignee | ||
Updated•19 years ago
|
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Assignee | ||
Comment 2•19 years ago
|
||
This happens because when the various implementations of Convert() for UTF16 detect a BOM they pass the buffer without the BOM to UTF16ConvertToUnicode(), after modifying the aSrcLength passed to them, and then pass the changed value back to the caller.
Attachment #224853 -
Flags: review?(jshin1987)
Comment 3•19 years ago
|
||
ok, so... why was this never an issue for anything else?
Assignee | ||
Comment 4•19 years ago
|
||
Probably because most callers of nsIUnicodeDecoder::Convert() don't do anything with aSrcLength after a successful call.
Comment 5•19 years ago
|
||
Comment on attachment 224853 [details] [diff] [review]
patch
r=jshin
sorry for the delay. I thought I had reviewed this patch (I guess I did, but forgot to log-in after pressing submit button. At home, sometimes bugzilla keeps asking me to log-in for every transaction
Attachment #224853 -
Flags: review?(jshin1987) → review+
Assignee | ||
Updated•19 years ago
|
Attachment #224853 -
Flags: superreview?(jag)
Assignee | ||
Comment 6•19 years ago
|
||
I tested this with all combinations of UTF16-BE/LE with and without BOM, and using odd and even buffer lengths (the previous version failed horribly with an odd buffer length as jag pointed out it would).
Attachment #224853 -
Attachment is obsolete: true
Attachment #229822 -
Flags: superreview?(jag)
Attachment #224853 -
Flags: superreview?(jag)
Comment 7•19 years ago
|
||
Comment on attachment 229822 [details] [diff] [review]
Responding to jag's sr comments on IRC
Looks good. Is there an easy way to add your tests to CVS so the next person to touch this code can run them?
BTW, mFoundBOM doesn't need to be a member variable, it can just be a local. Does anyone actually test for NS_OK_UDEC_NOBOMFOUND though?
Attachment #229822 -
Flags: superreview?(jag) → superreview+
Assignee | ||
Comment 8•19 years ago
|
||
(In reply to comment #7)
> (From update of attachment 229822 [details] [diff] [review] [edit])
> Looks good. Is there an easy way to add your tests to CVS so the next person to
> touch this code can run them?
>
Good idea. I should be able to work out a one-click version of the tests that I can check in.
> BTW, mFoundBOM doesn't need to be a member variable, it can just be a local.
I think it does have to be a member variable. We want to preserve its state across successive buffers for the same input, no?
> Does anyone actually test for NS_OK_UDEC_NOBOMFOUND though?
Not right now: see bug 235090 for why it got added.
Assignee | ||
Comment 9•19 years ago
|
||
Checked in.
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 10•19 years ago
|
||
I checked in a testcase at intl/uconv/tests/test_nsIConverterInputStream.html
Assignee | ||
Comment 11•18 years ago
|
||
The previous testcase has now been replaced by an xpcshell-based test at intl/uconv/tests/test_bug340714.js
Flags: in-testsuite+
Assignee | ||
Comment 12•18 years ago
|
||
intl/uconv/tests/unit/test_bug340714.js, rather.
Assignee | ||
Comment 13•16 years ago
|
||
Checked in to 1.8 branch (with bug 317216)
Keywords: fixed1.8.1.22
Assignee | ||
Comment 14•16 years ago
|
||
A stand-alone version of the unit test for 1.8.x testing
Comment 15•16 years ago
|
||
Verified fixed for 1.8.1.22 with Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.22pre) Gecko/20090604 SeaMonkey/1.1.17pre. Stand-alone testcase passes.
Passes in Trunk as well: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2a1pre) Gecko/20090604 Minefield/3.6a1pre
Status: RESOLVED → VERIFIED
Keywords: fixed1.8.1.22 → verified1.8.1.22
You need to log in
before you can comment on or make changes to this bug.
Description
•