Closed Bug 506053 Opened 15 years ago Closed 6 years ago

Wrong offset in Regular Expression lambda expression if string contains non-latin characters

Categories

(Tamarin Graveyard :: Virtual Machine, defect)

Hardware: x86
OS: All
Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

Status: RESOLVED WONTFIX
Target Milestone: Future

People

(Reporter: cpeyer, Unassigned)

Details

Steps to reproduce: 

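public function main():void
{
    // The first four characters are non-ASCII; "abc" starts at character index 4.
    "ąäčåabcdefghij".replace(/abc/, lambda);
}

// Replacement callback: receives the matched text, its reported offset, and the full string.
private function lambda(matched:String, offset:int, original:String):String
{
    trace(
        "matched: " + matched +
        "\noffset: " + offset +
        "\noriginal.substring (offset, offset + matched.length): " +
        original.substring(offset, offset + matched.length)
    );
    return matched; // leave the matched text unchanged
}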

 Actual Results: 
  matched: abc 
  offset: 8 
  original.substring (offset, offset + matched.length): efg 
  
 Expected Results: 
 "original.substring(offset, offset + matched.length)" should equal "matched", because the actual character offset of "abc" in the string is 4: 

   matched: abc 
   offset: 4 
   original.substring (offset, offset + matched.length): abc 
  
I suspect that the string is stored internally as UTF-8 and that the offset is reported in bytes rather than in characters. Each of the four characters at the beginning of the string ("ąäčå") occupies two bytes in UTF-8, so the reported offset is 8 instead of 4.
This issue might be related to this one: https://bugs.adobe.com/jira/browse/ASC-3252
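
The byte-versus-character arithmetic behind this suspicion can be checked outside the VM. The following is a minimal Python sketch, not Tamarin code: it assumes the four leading characters are the precomposed code points U+0105, U+00E4, U+010D and U+00E5, and the helper name byte_to_char_offset is hypothetical. It only illustrates how a UTF-8 byte offset of 8 corresponds to a character offset of 4, and how such an offset could be converted back.

```python
# Test string from the report, written with escapes so the code points are explicit
# (assumption: the original uses precomposed characters).
text = "\u0105\u00e4\u010d\u00e5abcdefghij"
needle = "abc"

# Offset counted in characters (what the callback is expected to receive).
char_offset = text.find(needle)

# Offset counted in UTF-8 bytes (what the callback appears to receive).
encoded = text.encode("utf-8")
byte_offset = encoded.find(needle.encode("utf-8"))


def byte_to_char_offset(s: str, byte_off: int) -> int:
    """Convert a UTF-8 byte offset into a character offset (hypothetical helper).

    Assumes byte_off falls on a character boundary, as a match offset would.
    """
    return len(s.encode("utf-8")[:byte_off].decode("utf-8"))


print("character offset:", char_offset)
print("byte offset:", byte_offset)
# Using the byte offset as a character index reproduces the wrong substring.
print("substring at byte offset:", text[byte_offset:byte_offset + len(needle)])
print("converted offset:", byte_to_char_offset(text, byte_offset))
```

Indexing the string with the byte offset yields the same wrong substring shown under Actual Results, which is consistent with the hypothesis but does not by itself confirm how the VM stores strings.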
Flags: in-testsuite?
Flags: flashplayer-qrb?
Transferred from: https://bugs.adobe.com/jira/browse/ASC-3253

I get the same behaviour in SpiderMonkey, but this still seems to be a valid bug.
Depends on: 506052
Flags: flashplayer-triage+
Blocks: AS3_Builtins
No longer depends on: 506052
Flags: flashplayer-qrb? → flashplayer-qrb+
Priority: -- → P2
Target Milestone: --- → Future
Priority: P2 → --
Depends on: 535770
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX