Evaluation result of string expression is wrong when the expression contains some special chars

Status: UNCONFIRMED
Product: Rhino
Component: Core
Reported: 8 years ago
Modified: 8 years ago
Reporter: Wenjie Tu
Assignee: Unassigned

(Reporter)

Description

8 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 (.NET CLR 3.5.30729)
Build Identifier: 1.7

Run the following code snippet:

--------------------------------------------------
		// Requires: import org.mozilla.javascript.Context;
		//           import org.mozilla.javascript.Scriptable;
		// The first code unit, 65279, is U+FEFF (byte order mark); the rest are CJK and ASCII letters.
		char[] chars = {65279, 19968, 19969, 19971, 21313, 35961, 31481, 25975, 35377, 21151, 33995, 67, 104, 105, 110, 97};
		String originalStr = new String( chars );
		System.out.println( "Original string to construct string expression:" + originalStr + "; len=" + originalStr.length( ) );
		// Wrap the text in double quotes so that it forms a JavaScript string literal.
		String strExpr = '"' + originalStr + '"';
		System.out.println( "String expression pushed to Rhino:" + strExpr + "; len=" + strExpr.length( ));
		
		Context cx = Context.enter();
		try 
		{
			Scriptable scope = cx.initStandardObjects();
			Object result = cx.evaluateString(scope, strExpr, "<cmd>", 1, null);
			String resultString = (String)result;
			System.out.println( "Rhino evaluation result string:" + resultString + "; len=" + resultString.length( ));
			System.out.println("Does evaluation result string equal to original one:" + resultString.equals( originalStr ));
		}
		finally
		{
			// Context.exit() is static, so call it on the class rather than on the instance.
			Context.exit( );
		}

--------------------------------------------------

It produces the following output:
--------------------------------------------------------
Original string to construct string expression:?一丁七十豹竹敷許功蓋China; len=16
String expression pushed to Rhino:"?一丁七十豹竹敷許功蓋China"; len=18
Rhino evaluation result string:一丁七十豹竹敷許功蓋China; len=15
Does evaluation result string equal to original one:false
--------------------------------------------------------
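
Because U+FEFF is invisible (the console shows it as "?" above), the dropped character is easier to confirm by printing code units rather than the strings themselves. The following is a minimal sketch for that purpose; the class name CodeUnitDump and the method dump are hypothetical helpers and are not part of Rhino or of the snippet above.

```java
class CodeUnitDump {
    // Formats every UTF-16 code unit as U+XXXX, separated by spaces,
    // so that invisible characters become visible in the output.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Shortened sample: BOM, one CJK character, and 'C'.
        char[] chars = {65279, 19968, 67};
        System.out.println(dump(new String(chars))); // prints U+FEFF U+4E00 U+0043
    }
}
```

Calling dump on both originalStr and resultString should show whether the only difference is the leading U+FEFF.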

Reproducible: Always

Steps to Reproduce:
1. Run the code snippet from the description above.
2. Observe the output.
Actual Results:  
resultString does not equal originalStr

Expected Results:  
resultString equals originalStr
(Reporter)

Updated

8 years ago
Version: other → 1.7R1

Comment 1

8 years ago
The problematic character seems to be the first one: 65279, or "\ufeff":

http://en.wikipedia.org/wiki/Byte_order_mark

I can see how the BOM would be removed when a string is encoded from raw bytes, although I'm not sure this is what should be happening here. Disclaimer: I'm in no way a Unicode expert. Maybe someone with greater knowledge will chime in.
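
If the raw U+FEFF is indeed being dropped while the source text is scanned, a possible workaround is to avoid putting the raw character into the script at all and to write it as a backslash-u escape inside the string literal instead. The sketch below illustrates that idea; it is untested against Rhino 1.7R1, and it assumes only that the engine decodes standard ECMAScript Unicode escapes in string literals. The class JsLiteralEscaper and the method toJsStringLiteral are hypothetical names, not Rhino API.

```java
class JsLiteralEscaper {
    // Builds a double-quoted JavaScript string literal from arbitrary text.
    // Quotes and backslashes are backslash-escaped; every code unit outside
    // printable ASCII (including the BOM) is written as a backslash-u escape.
    static String toJsStringLiteral(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20 || c > 0x7E) {
                sb.append(String.format("\\u%04X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        // BOM followed by "China"; the printed literal contains only ASCII.
        char[] chars = {65279, 67, 104, 105, 110, 97};
        System.out.println(toJsStringLiteral(new String(chars)));
    }
}
```

In the reporter's snippet, strExpr would then be built with toJsStringLiteral( originalStr ) instead of concatenating the quote characters by hand. This also covers embedded quotes and backslashes, which the original concatenation does not handle.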