Closed
Bug 40672
Opened 24 years ago
Closed 22 years ago
Make transformiix faster
Categories
(Core :: XSLT, defect, P5)
Core
XSLT
Tracking
()
VERIFIED
FIXED
Future
People
(Reporter: Joerg.Brunsmann, Assigned: keith)
References
Details
(Keywords: perf)
Attachments
(4 files)
13.80 KB,
application/octet-stream
|
Details | |
3.61 KB,
patch
|
Details | Diff | Splinter Review | |
1.86 KB,
patch
|
Details | Diff | Splinter Review | |
289.68 KB,
text/xml
|
Details |
I think transformiix and the combination of transformiix with
mozilla is of great value. You've done a terrific job so far.
But I must admit that transformiix performance is improveable.
Even for mid-sized documents it gets rather slow. This
poor performance is inaceptable. I wouldn't recommend to
present a slow xsl processor to the mozilla web browser
community. It would lessen the acceptance of xsl as whole,
I guess.
I've done some profiling with 'gprof'. Please see the result
attached. Perhaps it's of any value for you.
Reporter | ||
Comment 1•24 years ago
|
||
Comment 2•24 years ago
|
||
Hi Joerg,
profiling might be a good idea, and
WOW, we just have performance bugs? Keith, you gotta be good.
Anyway, the profiling data seems to be broken. I guess it's just to large for
bugzilla.
gprof /tmp/build/dist/bin/transformiix /tmp/profl
gprof: unexpected EOF after reading 7056/111538 samples
Is profiling useful yet? I know we just overhauled the Strings, other stuff
pending?
Where does transformiix range in matters of speed? Do we have alternatives to
test against?
Axel
Reporter | ||
Comment 3•24 years ago
|
||
Hi Axel,
I just checked; i can download and gunzip the data without problem. It's 14000
bytes long; shouldn't be too large.
Profiling might be too early. Just wanted the get the message over that - from
my viewpoint - it would be a mistake to give transformiix to the "masses" to
early. I transformiix would have bugs in the implementation, I would suggest to
follow the 'release often, release early' theme. But if the xlst-module suffers
from performance problems, it's a complete different issue, which could handicap
the wide adoption of xslt in client web browser.
I know of two other C++ implementation, although I didn't test them:
http://www.gingerall.com/
http://mdc-xsl.sourceforge.net/
I can say that Xalan/Xerces-liason (http://xml.apache.org/) are faster than
transformiix. They are written in Java.
Assignee | ||
Comment 4•24 years ago
|
||
First off...I'm not sure we want to discuss performance issues via
Bugzilla. Secondly...there are a number of areas I can improve performance that
I am aware of. I've already done a lot of this for XSL:P so I know where to
begin. For the standalone version, I really need to use streamed output. It
currently builds a complete result tree...and then prints it, which is not
necessary as we've discussed a number of times on .xslt. For both standalone and
the mozilla component, I know of a number of areas in my XPath implementation
that I can improve on.
Personally I think we need to concentrate on XSLT conformance, and robustness,
first, as some of the performance improvements will probably make the code less
enjoyable to read.
Updated•24 years ago
|
Status: UNCONFIRMED → NEW
Ever confirmed: true
Assignee | ||
Comment 5•24 years ago
|
||
I'm changing the status of this *bug* to Later, since it's not really a bug,
though important, but shouldn't be dealt with until after some additional
forthcoming changes are in place, and integration with Mozilla is further along.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → LATER
Comment 6•24 years ago
|
||
reopening and marking Future...
Status: RESOLVED → REOPENED
Resolution: LATER → ---
Target Milestone: --- → Future
i think you'll find this useful.
here is XSLT benchmarking results (including transformiix) using XSLTMark - XSLT
benchmarking application:
http://www.datapower.com/XSLTMark/res_2000_11_20.html
they were using M18 though, but this benchmark is downloadable.
Assignee | ||
Comment 8•24 years ago
|
||
Attaching a couple of patches to help speed up sorting by document order.
Status: REOPENED → ASSIGNED
Assignee | ||
Comment 9•24 years ago
|
||
Assignee | ||
Comment 10•24 years ago
|
||
Comment 11•24 years ago
|
||
Looked at the source, these look good. I'll try them out later.
About document order, I recognized that LocationPath reverses the nodes for
some axes. We should fix that. And if we did, do we need to explicitly sort
by document order for anything but nodeset unions?
Axel
Assignee | ||
Comment 12•24 years ago
|
||
Hi Axel,
Yes, we're going to need to preserve the Axis ordering. A reverse axis like the
ancestor axis should remain in reverse order.
So we'll need to be careful about this, which I admit, I curently am not,
because a call to sortByDocumentOrder will rearrange the NodeSet.
--Keith
Comment 13•24 years ago
|
||
Could we add a property to a NodeSet, saying whether it's document order
forward, reverse, or unsorted?
So in case we have a reverse axis, we don't have to sort, but just to reverse
the set.
Axel
Assignee | ||
Comment 14•24 years ago
|
||
Yes, we could do that. It'll help as long as there is only one PathExpr in the
Union. If we can guarantee all NodeSet's are in document order we can do a
simple merge sort.
And I made a mistake in my previous post. I think I am currently being careful
about reverse-axis. sortByDocumentOrder is only called at the end, so it's ok.
Late Night...time to go to bed! :-)
--Keith
Comment 15•24 years ago
|
||
Hi Keith,
I tested your patch, it gave me some 10% speed improvement.
r=me, get this in the tree as soon as it opens ;-).
Axel
Assignee | ||
Comment 16•24 years ago
|
||
Your idea about saving the state of a NodeSet's sort order should also improve
our performance, before I implement that solution I'd like to get the above
patches checked in first. My local tree is getting ugly. So I'll wait until the
tree re-opens, I check in the above patches, before I try out the NodeSet sort
order proposal.
Comment 17•24 years ago
|
||
the two patches are checked in, r=me (peterv).
Axel
Assignee | ||
Comment 18•24 years ago
|
||
I really need to get back into the code, it's been too long! In any case I was
doing some thinking. It might be nice to tweak the XPath expressions to perform
an early return under certain cases. For example:
bar[1] is probably doing too much work, unless only 1 bar exists to begin with.
It first selects all the bar nodes...and then applies the predicate. In this
case the predicate is the position 1, which means just the first node is kept.
This means we created a nodeset unecessarily with all bar children (potentially
needing to resize the nodeset), just to select the first one.
The best way to solve this is to change the way we evaluate expressions. Instead
of selecting all bar children first and then apply the predicate, we can select
and apply as we go. So we select the first bar and apply the predicate. if it
matches we add it to the nodeset and increment a position counter. If it doesn't
matches we simply increment the position counter. We could add a flag to the
predicates...such as returnOnTrue, which would inidicate that if the predicate
matches (like with the position() function) then we know we don't need to check
any further and we are done.
thoughts?
Comment 19•24 years ago
|
||
Funny you say that, had that in my mind tonight.
plus a awkward counter example:
foobartag[position()=last()-5], and I can do worse:
foobartag[mod(position(),3)=mod(last(),2))]
Are there other ways to make the result depend on the nodeset?
Axel
Assignee | ||
Comment 20•24 years ago
|
||
yes, of course....if the expression contains any sub-expressions that need the
entire nodeset, we can't return early...but that doesn't mean we should always
produce the complete nodeset....just because we might need it. We simply need to
label the expressions....
so:
1. incremental
Any expression which can evaluate predicates incrementally.
2. incremental + return-early
Any expression which can evaluate predicates incrementally and
safely return once it finds the first match.
3. non-incremental
Any expression which needs the entire nodeset to evaluate predicates.
4. non-incremental + return-early
Any expression which needs the entire nodeset to evaluate predicates
but can safely return once it finds the first match.
:-)
Comment 21•23 years ago
|
||
Is this strictly a Sun/FreeBSD issue?
hell no, transformiix is slow on all platforms :)
OS: FreeBSD → All
Hardware: Sun → All
I think a nice way to spead up xpath evaluation is to have special classes for
common optimizable (sub)expressions. For example the following would be nice to
have special classes for:
@foo
A step that consists of just an attibute axes and a nametest without wildcards
foo | bar | baz | zap:*
two or more expressions which all consists of a single step using the child axis
foo/bar/baz
a PathExpr consisting only of steps along the child axis. (Not sure if there is
a big win here)
The nice thing about the last is that if they are evaluated properly the result
dosn't need to be sorted. We need to be able to handle these already-sorted
nodesets in the xslt code though.
Updated•23 years ago
|
Priority: P3 → P5
Comment 24•23 years ago
|
||
I might sound out of place, but in support of Joerg Brunsmann's post in which
he mentioned the Java Xerces and Xalan parsers, I'm wondering if you guys have
considered distributing tomcat as a servlet engine for Mozilla, installing
cocoon, and have cocoon do the XSLT for you?
P.S. What programming language are you using?
Comment 25•23 years ago
|
||
The load of the docbook.rng.xml takes approximately 5 minutes on my 600mh
Win200 ith 128M of memory. The Javascript app takes approximately 1min 10
seconds of which only about 3 seconds is file load time. I don't have a
separate load time for the style sheet as I just loaded the file directly using
the browser.
Comment 26•23 years ago
|
||
standalone branch speeds this up by a factor of 3, debug builds.
Module trunk (not really recent) crashed on me after some 15 minutes and 120Mb.
:-( (note that this is still far from out-of-memory conditions on my machine)
we are faster these days. Lets close this and open bugs on specific issues when
we find them
sorry, forgot to actually mark as fixed
Status: ASSIGNED → RESOLVED
Closed: 24 years ago → 22 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•