Rhino munges binary output



13 years ago


(Reporter: bugzilla, Unassigned)





13 years ago
User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv: Gecko/20060313 Debian/1.5.dfsg+ Firefox/
Build Identifier: Rhino 1.6r2

If you read a binary file (a PNG, say) with the readFile() function and write it using the print() function, you get a different file. (Noob warning: I'm not that familiar with JavaScript in general, but it seems fairly strange that an elementary read and write would trash a file. My apologies if there's something obvious that I'm missing here.)

I'm using the binary version of Rhino, rhino1_6R2.

Reproducible: Always

Steps to Reproduce:
(livemarks16.png is http://www.mozilla.org/images/livemarks16.png)

$ cat t.js 
$ java -classpath js.jar org.mozilla.javascript.tools.shell.Main t.js > new.png
$ cmp livemarks16.png new.png 
livemarks16.png new.png differ: char 1, line 1

Actual Results:  
As indicated, the files differ.
If you want the actual files, they are:

Note that these URLs may not be around forever.

Expected Results:  
new.png should be the exact same file as livemarks16.png, right? (I can understand that a function called print() might pad the output with a newline, but I don't get the binary mangling part.)


13 years ago
Assignee: igor.bukanov → please_see_bug_288433
Assignee: please_see_bug_288433 → nobody

Comment 1

13 years ago
This seems similar to Bug 337434 -- all of the octets in the range 128-255 are being replaced by a single character (in this case, 0x3f).  The resulting data file also has a spurious newline.  This is most likely the result of a charset conversion that should not be performed on binary files.
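The substitution described above can be reproduced in plain Java, outside Rhino. This is only a sketch: it assumes the text path decodes and re-encodes with US-ASCII (the reporter's actual default charset is not stated in the bug), and the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMunge {
    // Decode bytes to a String and encode them back, the way a
    // text-oriented read/print pair would (hypothetical helper).
    static byte[] roundTrip(byte[] input, Charset cs) {
        return new String(input, cs).getBytes(cs);
    }

    public static void main(String[] args) {
        // First four bytes of the PNG signature; 0x89 is outside the ASCII range.
        byte[] png = {(byte) 0x89, 0x50, 0x4E, 0x47};

        // Assumption: US-ASCII stands in for the system default encoding.
        byte[] out = roundTrip(png, StandardCharsets.US_ASCII);

        for (byte b : out) {
            System.out.printf("0x%02x ", b & 0xff);
        }
        System.out.println(); // prints: 0x3f 0x50 0x4e 0x47
    }
}
```

The first byte comes back as 0x3f ('?'), which would be consistent with cmp reporting a difference at char 1.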

Comment 2

13 years ago
The documentation for readFile clearly states that if you don't specify an encoding, it'll use the Java default encoding for the system, or optionally you can specify an encoding. This means it's meant as a method for reading textual, and not binary content. 

The problem is twofold: reading the file unchanged and writing it unchanged. Even if you can get readFile() to read the file using some encoding that maps bytes 0x00-0xff to Unicode characters U+0000-U+00FF, the print() function will always use the system default character encoding, and if it doesn't have the same property (namely, that it maps U+0000-U+00FF to 0x00-0xff), you're stuck. Actually, it'd be sufficient if whatever the system encoding is has a bijective mapping for all bytes 0x00-0xff to and from the character set, but again, it's hardly portable as you depend on the system default character encoding. I'll therefore mark this as INVALID -- yes, it "munges" binary output, because both print() and readFile() are written to operate on textual content, not binary.
Last Resolved: 13 years ago
Resolution: --- → INVALID
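
The bijectivity condition from Comment 2 can be checked directly in Java. The sketch below tests whether all 256 byte values survive a decode and re-encode in a given charset. The class and method names are hypothetical, and it says nothing about which charset Rhino's print() uses on a particular system.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ByteTransparency {
    // True if every byte value 0x00-0xff survives decode + encode in cs.
    static boolean isByteTransparent(Charset cs) {
        byte[] all = new byte[256];
        for (int i = 0; i < 256; i++) {
            all[i] = (byte) i;
        }
        byte[] back = new String(all, cs).getBytes(cs);
        return Arrays.equals(all, back);
    }

    public static void main(String[] args) {
        // ISO-8859-1 maps each byte to the code point of the same value.
        System.out.println("ISO-8859-1: " + isByteTransparent(StandardCharsets.ISO_8859_1));
        // US-ASCII and UTF-8 cannot represent arbitrary bytes one-to-one.
        System.out.println("US-ASCII:   " + isByteTransparent(StandardCharsets.US_ASCII));
        System.out.println("UTF-8:      " + isByteTransparent(StandardCharsets.UTF_8));
        // The result for the platform default depends on the machine.
        Charset def = Charset.defaultCharset();
        System.out.println("default (" + def + "): " + isByteTransparent(def));
    }
}
```

Only a charset that passes this check on both the reading and the writing side would leave a binary file intact, which is why relying on the system default is not portable.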

Comment 3

13 years ago
Thanks for the explanation. That said, is there any way to read binary content into Rhino? thanks.