Bug 169497: regular expression matching not compatible with Perl 5
Status: Closed (opened 22 years ago, closed 22 years ago)
Categories: Core :: JavaScript Engine, defect
People: Reporter: martin.honnen, Assigned: rogerl
Attachments: 3 files
I am trying to write a regular expression that extracts the content of the
<body> tag from an HTML source string.
I have an expression that does what I want with Perl 5 but doesn't match
anything with the XPCOM shell (JavaScript-C 1.5 pre-release 4a 2002-03-21) that
comes with Mozilla 1.1.
Here is the Perl program:
$html = "";
$html .= "<html>\n";
$html .= "<body onload=\"alert(event.type);\">\n";
$html .= "<p>Kibology for all<\/p>\n";
$html .= "<p>All for Kibology<\/p>\n";
$html .= "<\/body>\n";
$html .= "<\/html>";
($first, $second) = ($html =~ /<body.*>((.*\n?)*?)<\/body>/i);
print "first submatch: $first\n";
print "second submatch: $second\n";
When I run that with Perl (v5.6.1 built for MSWin32-x86-multi-thread) I get the
following result:
first submatch:
<p>Kibology for all</p>
<p>All for Kibology</p>
second submatch: <p>All for Kibology</p>
which means the content of the <body> tag in the source is correctly extracted
as the first match.
The JavaScript version looks as follows:
var html = '';
html += '<html>\n';
html += '<body onload="alert(event.type);">\n';
html += '<p>Kibology for all<\/p>\n';
html += '<p>All for Kibology<\/p>\n';
html += '<\/body>\n';
html += '<\/html>';
var bodyMatch = /<body.*>((.*\n?)*?)<\/body>/i;
function showMatch (re) {
  var r = '';
  var match = re.exec(html);
  if (match) {
    // Collect the full match and every submatch, one per line.
    for (var i = 0; i < match.length; i++)
      r += i + ':||' + match[i] + '||\n';
  }
  print("match with " + re + ":");
  print(r);
  print("");
}
showMatch(bodyMatch);
When I run this with the XPCOM shell I get the following output:
match with /<body.*>((.*\n?)*?)<\/body>/i:
that is, no match is found. If I compile the JavaScript with jsc, the JScript.NET
compiler, and then run the executable, I get the following output:
match with /<body.*>((.*\n?)*?)<\/body>/i:
0:||<body onload="alert(event.type);">
<p>Kibology for all</p>
<p>All for Kibology</p>
</body>||
1:||
<p>Kibology for all</p>
<p>All for Kibology</p>
||
2:||<p>All for Kibology</p>
||
which means the <body> tag and its content are matched, the content of the <body>
tag is the first submatch, and the second submatch is the same as with Perl too.
Thus I think the SpiderMonkey implementation is wrong: it should find a match
and find the submatches for the parenthesized expressions.
I will upload the JavaScript file.
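The expected result described above can be checked with a short self-contained script. This is a minimal sketch, not the attached test case: it assumes a current ECMAScript engine such as Node.js, uses console.log in place of the shell's print, and the helper checkSubmatches is a hypothetical name introduced only for this illustration. The expected strings are taken from the Perl and JScript.NET output quoted in the report.

```javascript
// Test string and pattern copied from the report.
var html = '';
html += '<html>\n';
html += '<body onload="alert(event.type);">\n';
html += '<p>Kibology for all<\/p>\n';
html += '<p>All for Kibology<\/p>\n';
html += '<\/body>\n';
html += '<\/html>';

var bodyMatch = /<body.*>((.*\n?)*?)<\/body>/i;
var match = bodyMatch.exec(html);

// Expected submatches, as printed by Perl 5 and JScript.NET in the report.
var expectedFirst = '\n<p>Kibology for all</p>\n<p>All for Kibology</p>\n';
var expectedSecond = '<p>All for Kibology</p>\n';

// Hypothetical helper (not part of the original test case): describes
// how a match result compares with the expected submatches.
function checkSubmatches(m) {
  if (m === null) return 'no match';
  if (m[1] !== expectedFirst) return 'first submatch differs';
  if (m[2] !== expectedSecond) return 'second submatch differs';
  return 'ok';
}

console.log(checkSubmatches(match));
```

The three failure strings correspond to the three behaviors seen in this report: no match at all, a match with a wrong first submatch, and a match with a wrong second submatch.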
Comment 1 (Reporter) • 22 years ago
Comment 2 (Reporter) • 22 years ago
Comment 3 (Reporter) • 22 years ago
I have also tried the test case with the Rhino shell (rhino1_5R3). Rhino finds
the match but doesn't set the first submatch correctly:
match with /<body.*>((.*\n?)*?)<\/body>/i:
0:||<body onload="alert(event.type);">
<p>Kibology for all</p>
<p>All for Kibology</p>
</body>||
1:||||
2:||<p>All for Kibology</p>
||
At least that would also support the view that SpiderMonkey should find a match.
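For readers who only need the body content, the nested quantifier `(.*\n?)*?` can be avoided altogether. The following is a sketch under stated assumptions, not something proposed in this report: the pattern simplerBodyMatch is a hypothetical alternative, it assumes no `>` character occurs inside the attributes of the <body> tag, and it has not been verified against the affected SpiderMonkey or Rhino builds. It uses console.log, so it assumes an environment such as Node.js.

```javascript
// Same test string as in the report.
var html = '';
html += '<html>\n';
html += '<body onload="alert(event.type);">\n';
html += '<p>Kibology for all<\/p>\n';
html += '<p>All for Kibology<\/p>\n';
html += '<\/body>\n';
html += '<\/html>';

// Hypothetical alternative pattern (an assumption, not from the report):
// [^>]* consumes the tag's attributes up to the first '>', and the lazy
// [\s\S]*? spans line breaks without a nested quantifier.
var simplerBodyMatch = /<body[^>]*>([\s\S]*?)<\/body>/i;

var m = simplerBodyMatch.exec(html);
var bodyContent = m ? m[1] : null;

// JSON.stringify makes the embedded newlines visible in the output.
console.log(JSON.stringify(bodyContent));
```

This variant yields only one submatch, so it does not reproduce the second submatch of the original expression.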
Comment 4 (Reporter) • 22 years ago
Comment 5 (Assignee) • 22 years ago
There's possibly some negative interaction between the concepts of Kibology and
SpiderMonkey. However, applying the patch from bug 85721 fixes this bug so I'm
thinking this should be dup'ed to that - except that Rhino already contains that
fix and should work fine. When I tried Rhino I got the expected results (i.e.
not what Martin saw in comment #3 - maybe you could give this a try on your
build, Phil?)
Comment 6 • 22 years ago
Martin's testcase added to JS testsuite:
mozilla/js/tests/ecma_3/RegExp/regress-169497.js
Roger is correct: the fix for this is covered by the big patch
for bug 85721, which has already been committed to Rhino.
When I run the testcase in the current SpiderMonkey shell,
it produces no match, as Martin has reported. When I run
the testcase in the current Rhino shell, it passes.
Martin: the RegExp fix is not in rhino1_5R3, but is contained
in the current version of Rhino in the Mozilla CVS repository.
If you don't normally build Rhino from the Mozilla CVS repository,
the fix might be in ftp://ftp.mozilla.org/pub/js/rhinoLatest.zip
That is dated July 7 of this year. That should have the fix,
which was checked in on 2002-06-20 (see Rhino bug 125562).
Resolving this bug as a duplicate of SpiderMonkey bug 85721 -
*** This bug has been marked as a duplicate of 85721 ***
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → DUPLICATE
Comment 7 • 22 years ago
Marking Verified.
Martin: thank you for this report. You have been cc'ed on bug 85721
so you can follow progress on this issue.
All that's left is for the patch in that bug to be reviewed and
committed to the SpiderMonkey codebase. One day after that happens,
the fix will be reflected in trunk builds of Mozilla.
Status: RESOLVED → VERIFIED