Closed Bug 355677 Opened 18 years ago Closed 18 years ago

Rhino E4X depends on Apache XMLBeans

Categories

(Rhino Graveyard :: E4X, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: inonit, Assigned: inonit)

References

Details

Attachments

(3 files, 3 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7
Build Identifier: Rhino HEAD 6-Oct-2006 7:34a EDT

As Attila discussed in the newsgroup (http://groups.google.com/group/mozilla.dev.tech.js-engine/browse_thread/thread/1a4eae00d18140c1/477f6da30df203e0?tvc=2&q=new+rhino+maintainer#477f6da30df203e0), ideally Rhino's E4X support would not depend on the XMLBeans library from Apache.  Currently it does, hence this issue.

Reproducible: Always




I have begun work on a local branch of the codebase from which I am slowly removing calls to the XMLBeans library.
Attached file Snapshot of current work (obsolete) —
Feedback welcome; this is the current state of things with all XMLBeans calls removed
Attached file Snapshot 2006-Oct-19 (obsolete) —
Continuing to refactor, fixed a couple of regressions, better, more accurate namespace handling, still more refactoring to do
Attachment #242132 - Attachment is obsolete: true
Attached file Zipped snapshot (obsolete) —
All references to XMLBeans have been removed
More bugs fixed
This build passes 16 tests that the present implementation fails, but fails 9 that the present implementation passes.
Attachment #242865 - Attachment is obsolete: true
Assignee: nobody → inonit
Status: NEW → ASSIGNED
Right now the DOM-only port requires JDK 1.5, as it uses the getUserData/setUserData methods of the DOM Node class, which don't exist in the 1.4 distributions.  It uses other methods specific to 1.5 also, but the get/setUserData would be the most difficult to remove, in my judgment.

There would be ways to emulate this functionality (big Hashtable, weak references, maybe?) but I haven't investigated it.  I will if people think there will be a big demand for 1.4 JDKs running a non-XMLBeans E4X.

My current thought is to deprecate the XMLBeans implementation and move it to the deprecatedsrc tree; I don't see a lot of value in actually _maintaining_ two E4X implementations.  I am open to thoughts there as well.
Well, I'd certainly want Rhino to remain runnable on 1.4 JREs. Would adding an external XML parser that supports the newer DOM spec (i.e. Xerces) to classpath get around the problem?
Yeah, I wasn't going to disallow running on pre-1.5 JREs, but one possibility is to say "you can only use the XMLBeans implementation on pre-1.5 JREs."  I haven't looked into Xerces or the endorsed standards rigamarole or that sort of thing, so I guess I'll have to do more detailed study in order to figure out what the possibilities are.
First, this patch will also attempt to reapply a patch to testsrc which I already applied and I don't know what will happen, but it shouldn't affect building the main distribution jar. :)

Anyway, this patch shows the proposed pluggability mechanism for E4X.  It adds a method to Context and ContextFactory asking them to provide an instance of the new XMLLib.Factory class which is then used to instantiate the XMLLib in ScriptRuntime when called upon.  The default rule for the moment is to use XMLBeans if it's in the classpath (to minimize surprises for existing users) and to use the new DOM implementation if JDK 1.5+ is being used.  (Attila, per your comments, I am considering backporting this to 1.4 but we can discuss further.)  The important thing is whether we approve of the strategy here before I commit the uberpatch.  There are also included proposed buildfile changes as well.

There is still a substantial amount of quality control to be done on the new implementation, but it continues to narrowly lead the old implementation in the conformance tests.
Attachment #246076 - Attachment is obsolete: true
Just to clarify the above, we use XMLBeans in the classpath, DOM if it XMLBeans is absent and JDK 1.5+, and neither if neither condition is true.
Blocks: 320812
Blocks: 263934
Blocks: 277767
Blocks: 263936
Blocks: 264369
Blocks: 289117
Blocks: 291927
Blocks: 291937
Blocks: 352346
Attached file Zipped snapshot
You may want to apply the previous patch (which takes care of changes to other classes, like Context etc.) and then unzip this over top of it.

This implementation is getting close to 'commit' status.  It now fixes 19 bugs from the XMLBeans implementation with only two regressions:

Fixed (19): e4x/Namespace/regress-283972.js, e4x/Regress/regress-263934.js, e4x/Regress/regress-263936.js, e4x/Regress/regress-264369.js, e4x/Regress/regress-291937.js, e4x/Regress/regress-352346.js, e4x/TypeConversion/10.1.1.js, e4x/TypeConversion/10.1.2.js, e4x/XML/13.4.4.11.js, e4x/XML/13.4.4.12-1.js, e4x/XML/13.4.4.13.js, e4x/XML/13.4.4.27.js, e4x/XML/13.4.4.28.js, e4x/XML/13.4.4.32-1.js, e4x/XML/13.4.4.38.js, e4x/XML/13.4.4.39.js, e4x/XML/13.4.4.9.js, e4x/XML/regress-336921.js, e4x/XMLList/13.5.4.10.js

Regressions (2): e4x/GC/regress-280844-1.js (Xerces DOM stack overflow when processing namespace for XML tree 5000 nodes deep), e4x/Regress/regress-350629.js (Regular expression issue with namespace prefixes.)

One of the two regressions is probably wrong -- it actually exercises what appears to be a RegExp bug in Rhino (having to do with greediness) because this E4X emits the attributes under consideration in a different order than XMLBeans, and thus fails the test.

The other is a stack overflow when processing an XML document that is 5,000 layers deep.  Not a showstopper. :)

On the other hand, passing those 19 tests will also wipe out 9 Bugzilla issues when this is committed.

Existing users won't be affected, as the XMLBeans implementation will still be available.  They will only become affected when they remove XMLBeans from their classpath.  At least for now (in the future, I'd think we'd make this implementation the default).
(In reply to comment #7)
> Anyway, this patch shows the proposed pluggability mechanism for E4X. It adds
> a method to Context and ContextFactory asking them to provide an instance of
> the new XMLLib.Factory class which is then used to instantiate the XMLLib in
> ScriptRuntime when called upon.  The default rule for the moment is to use
> XMLBeans if it's in the classpath (to minimize surprises for existing users)
> and to use the new DOM implementation if JDK 1.5+ is being used.  (Attila, per
> your comments, I am considering backporting this to 1.4 but we can discuss
> further.)  The important thing is whether we approve of the strategy here
> before I commit the uberpatch.  

Having XML implementation configurable at ContextFactory/Context level is perfectly okay. As for JDK 1.5 -- wouldn't it be better if you enabled the DOM implementation for JDK 1.4 as well when there's no XmlBeans around? Isn't it possible that some programs would use only such a subset of features that doesn't run into a NoSuchMethodError on JDK 1.4? If it is, then it'd be a pity to preemptively disable it. Also (I'm reiterating my point from an earlier comment here) if it's not a JDK 1.5 issue per se, but rather the fact that your code relies on a more advanced DOM spec implementation that happens to be included with JDK 1.5 but not 1.4, then a 3rd party DOM XML parser on the classpath providing the newer spec itself would solve the problem on 1.4 JDKs too. Rhino - up to 1.6R5 - was 1.1 compliant. I plan to kick it up to 1.3 with the next release, but would really prefer to not introduce real 1.5 dependencies yet (i.e. not compiling with <javac target="1.5"> if possible for now), so we don't get references to StringBuilder instead of StringBuffer etc. Of course, only if it doesn't inconvenience the implementation too much.
>Also (I'm reiterating my point from an earlier
>comment here) if it's not a JDK 1.5 issue per se, but rather the fact that your
>code relies on a more advanced DOM spec implementation that happens to be
>included with JDK 1.5 but not 1.4, then a 3rd party DOM XML parser on the
>classpath providing the newer spec itself would solve the problem on 1.4 JDKs
>too.

I have never played with this, but my understanding is that it's more complicated than that, and we'd need to screw around with the endorsed standards mechanism, etc.  (Or the bootclasspath.)  When we load a DOM class (like org.w3c.dom.Document), it seems to me like it'll use the JRE's version no matter what we do, even if a newer implementation were sitting on the classpath.  So org.foo.supermodern.DocumentImpl might implement the new version of the DOM spec, but the E4X code is still going to use the org.w3c.dom.Document instance loaded by the JRE.  I think this limitation is what the java.endorsed.dirs business is supposed to get around, but I haven't looked at it closely enough to completely understand it.  And one of my goals here is to get reasonably close to zero-config (I am hoping I don't have to explain java.endorsed.dirs to people in the installation documentation, and there are usage scenarios -- like shared servers -- where people might not have the ability to alter them).

Interesting point about gambling with NoSuchMethodError.  Value judgment there -- I think I'd rather have it not start than work partially and then fail spectacularly.  But I see your point and am open to convincing.  I am guessing it would be hard to avoid the 1.5 dependencies at the moment; there are 20 compilation errors when one compiles under 1.4.

Too many unknowns and "I am guessing"s at the moment -- I haven't really investigated the feasibility of just manually backporting it, haven't looked into java.endorsed.dirs, so I ought to stop until I can say something more useful. :)
Thinking about it more I think you might be right -- that if a user didn't build from source (but rather used a precompiled version) that contained bytecode that called out to new DOM methods, and then was using a JAXP-compliant parser that produced objects that implemented all the methods expected by the bytecode, it might work.  My brain's getting a little twisted into knots here, as I've never tried to do anything so cavalier as update an interface on my build platform and then an implementation on my deployment platform and hope it would work.  But in this case it might. :)  I'll try to get a test done to convince myself.  It still suffers from the requires-VMwide-config problem, I think, though I guess I need to look into JAXP more and see what can be done at runtime -- again thinking of the shared server scenario.
Blocks: 361722
Blocks: 369120
OK, I can confirm that merely putting Xerces on the classpath is *not* enough to be able to use DOM Level 3 at runtime under JDK 1.4.  However, using the java.endorsed.dirs system property to allow the Xerces versions of the APIs to be loaded at runtime (and override the JDK's classes) *does* work.  (As would presumably copying the Xerces xml-apis.jar into [JDK]/jre/lib/endorsed; see http://java.sun.com/j2se/1.4.2/docs/guide/standards/, although that doesn't mention that you actually have to *create* the endorsed directory, but it isn't there in my JDK).

So I'll code something in that looks for the DOM3 classes rather than the JDK version in order to determine whether to instantiate this implementation.  Still on the horizon would be back-porting to DOM2 if it seems worth the effort.
I have placed the new implementation in the tree.  Unless an implementation class for E4X is specified on the Context object, the XMLBeans implementation is used if XMLBeans are available on the classpath.  The new implementation is used otherwise, as long as DOM Level 3 is available.  Otherwise, the results are unpredictable.

There will be some cleanup perhaps with those sorts of edge cases (and also tightening up the build process).  But I wanted to commit this to the tree because it appears to work better than the XMLBeans implementation already, and existing clients won't immediately be affected because XMLBeans is still used by default, and it was becoming difficult to maintain my own fork of the codebase (see, e.g., my committing the entire Rhino source tree this morning trying to update my own repository).

Despite fixing many regressions, there are still things to fix about the new implementation, so I'll get to work on those.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Thanks for your great work, David! I tried it out, and the new implementation (without xbean.jar in the classpath) seems to work well, but as soon as I put xbean.jar in the classpath, I get

ReferenceError: "XML" is not defined.

Is this some oddity with my build (I have a slightly patched tree), or are other people seeing the same?
Haven't been able to reproduce this.  Attached is a test I ran with XMLBeans 2.2.0 and JDK 1.6:

Output without xbean.jar on classpath:

$ $jdk/bin/java -classpath `cygpath -wp $RHINO/build/commit/dist/js.jar` org.mozilla.javascript.tools.shell.Main e4x-hello.js       xmlbeans: false
context class: org.mozilla.javascript.xmlimpl.XMLLibImpl
<foo/>


And with:

$ $jdk/bin/java -classpath `cygpath -wp $RHINO/build/commit/dist/js.jar:/opt/java/apache/xmlbeans/2.2.0/lib/xbean.jar` org.mozilla.javascript.tools.shell.Main e4x-hello.js
xmlbeans: true
context class: org.mozilla.javascript.xml.impl.xmlbeans.XMLLibImpl
<foo/>

Suspicions:

1.  You didn't put jsr173_1.0_api.jar on classpath.  Though I didn't either -- I can't explain why it worked anyway. :)

2.  Given your custom build, perhaps something has happened to Context or ContextFactory (two classes with which you might well be munging).

Since I happen to know you've been playing with LazilyLoadedCtor, you may want to do what we talked about in bug 369342 and *not* swallow exceptions (maybe dumping their stack trace instead) and see whether something more helpful than the ReferenceError spits out.
Thanks for your help. The first problem was simply that I had cvs-updated my local copy without the -d option, so the directories for the newly deprecated Xmlbeans implementation never got created. And in fact I didn't have the jsr 173 jar file in the classpath. Shame on me, next time I'll dig some more before asking. But I agree that LazilyLoadedCtor should produce some kind of message on failure.
No shame here. :)  It's good to know what's happening out there -- extra impetus to fix bug 369342, for example.
Adding target milestone of 1.6R6 based on the date this bug was resolved FIXED.
Target Milestone: --- → 1.6R6
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: