Closed
Bug 59783
Opened 24 years ago
Closed 22 years ago
Implement dynamic XPath
Categories
(Core :: XSLT, enhancement, P5)
Core
XSLT
Tracking
()
VERIFIED
WONTFIX
Future
People
(Reporter: dr, Assigned: peterv)
References
Details
(Keywords: arch, Whiteboard: [Hixie-PF])
Attachments
(3 files)
3.86 KB,
text/plain
|
Details | |
3.87 KB,
text/plain
|
Details | |
3.51 KB,
patch
|
Details | Diff | Splinter Review |
XBL, dynamic XSLT, and maybe XLink, all need a dynamic implementation of XPath,
in which re-evaluation of a given XPath expression with respect to a given
document is efficient, given the result of the expression and the exact change
to the document.
For example, given a document:
<foo>
<bar/>
<bar/>
<bar/>
</foo>
and an XPath:
count(/foo/bar)
if the document changes to include a fourth bar, the XPath shouldn't re-evaluate
entirely, but should instead use the prior result (3) and the change (addition
of an extra bar) to calculate the new result (4) without having to traverse the
document again.
Other criteria for the XPath implementation include:
- Ability to plug additional functions/axes/etc. into the language based on
context (eg. XSLT, XPointer, XSL-FO, etc. add extensions to the XPath language).
- Ability to use from anywhere, distinguishing between dynamic and one-pass use
(eg. define both nsIXPath and nsIDynamicXPath).
- Fit cleanly with DOM mutation events, and use some simple notification scheme
for clients.
Setting component to XSLT, since that's the umbrella under which this will be
implemented.
Comment 1•24 years ago
|
||
To get a feeling for XPath expression coming up in XSLT, I extraced most (?) of
them from the docbook stylesheet by Norman Walsh.
I fed them thru some perl-fu, to find duplicates, and sort them according to
frequency. I will attach the listing in a minute, don't wonder about
NCName,that's a [a-z][a-z0-9]*, part of perlfu.
We could use that list to evaluate how important performance of specific
expressions is.
Of course this should be done by counting their use, not frequence, I may do
that someday.
Axel
Comment 2•24 years ago
|
||
Axel -- Kickass work! I cleaned up your stats a little bit. There were a lot of
things that looked like:
$NCName.NCName.NCName.NCName
Which are really just variable references, where NCName[.NCName]* is just one
NCName (for example, "$arg.choice.plain.open.str"). I've collected those cases
into $VarRef. Also, instances of NCName:: I've turned into AxisName::. Also
sorted things better (sort -g -r does the numeric fu you want ;).
Very interesting stuff, though. The most common XPath by far (549 occurrences)
is a plain NCName. This is a relief, since that is a trivial case to optimize
for. Second most common (164) is a variable reference, which doesn't even *need*
optimizing with respect to the unnecessary traversals of the source tree. Third
most common (142, but I suspect it would be more common than variable references
in less complex XSLTs) is a single step expression, NCName/NCName. Cases like
[("/"|"//")NCName]* strike me as easy enough -- we can probably match the name
of the element changed against the list of names in the XPath and do something
smart there...
Further common cases are ".", "@NCName", unions, etc. which would reasonably
have been expected as common, and should follow similarly to the cases above...
One thing we might want to exploit is pre-implementing some of what XPath 2.0
might have to offer. What I'm thinking about specifically is the desire to
loosen restriction on step expressions, so that instead of having to write
things like:
a1/b1/c | a1/b2/c | a2/b1/c | a2/b2/c
you can write:
(a1|a2)/(b1|b2)/c
The ability to express things like this, I'm thinking, might be exploitable for
optimization. In the DocBook stylesheets, you see a lot of things that look like
this (for example, "simplesectinfo/authorgroup|docinfo/authorgroup"), and if
those can be logically collapsed in the parsing stage, that would save a decent
amount of work during transformation.
Axel: can you do some more of your magic? It would be great to get a spread of
data for different flavors of XSLT stylesheets. Anybody else still in college
who wants to do some thesis work? *grin*
Updated website to include XPath implementation plans:
http://people.netscape.com/dr/xslt/xpath.html
Comment 6•24 years ago
|
||
Hi Dan,
I read your new doc (sorry, mine is still pending, I know), and some comments:
First of all, schedule, hey, cool ;-).
Concerning the changes of grammar for XSLT and XPointer, can you give the
modified rules? I wonder, because it didn't pop up to my eyes reading the XSLT
spec.
When it comes down to optimizations, can the plugged optimizations be so fine
grained and efficient, that they win over the price of being able to plug?
Do we have more than two kind of grammar rules in XPath?
Lists and Switches? You talked about doing code generation and reuse, does this
look like two sets of macros?
I'll probably whack some profiling before I'll get down to docs. Don't know when
though. Thinking about tonight, but doubting still.
Axel
Comment 7•24 years ago
|
||
Here goes some profiling data, dr, you may rename those again ;-)
This was done by running the docbook samples test.xml and book2.xml once, with
the html stylesheets.
Axel
Comment 8•24 years ago
|
||
reposted xpath planning doc to include Capital Letters (it's a miracle!) and a
little bit of clarifying.
so i've been trying to hack together a parser and tokenizer using bison and
re2c, per scc@mozilla.org's suggestion. re2c has been moderately well-behaved,
but bison does not like generating either c++-friendly or mozilla-friendly code.
another popular tool called antlr seemed a possibility, but it doesn't seem
particularly mozilla-friendly either. i may either have to do some perl-fu to
get the output from antlr to play nicely, or may have to hand-code the parser
for now (at least the tokenizer will still be generated). still looking for help
in this area... any ideas anybody?
Comment 10•24 years ago
|
||
XPath is fairly small. Though not very pretty, I coded the transformiix XPath
lexer/tokenizer by hand. I can probably help...or maybe I can port the existing
one to your XPath object model. I still need to spend some time reading your
ideas. I apologize for not having had a chance to look at them yet.
--Keith
Comment 11•24 years ago
|
||
Hi Dan,
I just cross read a bit of antlr's documentation, and it looked to me as we
would need to link the executable to a antlr library. How far did you test this?
I didn't even bother downloading after reading this ;-).
Could you attach the code you have? I'm curious as to why mozilla won't love
the bison code. Probably lots of reasons, I just wonder if there are a few good
ones :-).
I haven't found any other parsers.
But isn't our grammar rather limited? So how about doing the code gen directly
in perl?
Axel
Axel
Reporter | ||
Comment 12•24 years ago
|
||
Yeah, I was similarly disappointed with antlr. Amazingly, gnu lex (flex) has the
same shortcoming -- you have to compile with -lfl for it to work. No way.
Once I can actually start working on this (i.e. when I manage to transfer to the
XML/DOM team) I'm going to hack together a nice and simple tokenizer interface,
and we can write the parser(s) however we want. If you are able to write a
parser generator in perl, (are you serious?!) that would be very cool...
Otherwise, I'm starting to lean towards hand-coding the parser, because it's
simple enough, and because we need to get this done before any "real" work can
happen.
Reporter | ||
Comment 13•24 years ago
|
||
I've done a bunch more work on this in my spare time. I have a sexy UML diagram
living on my whiteboard, and I was coordinating a little bit with rayw, who was
trying to design a standard API for XPath... I'd love to get a copy of Rational
Rose, to get this stuff down in a decent form. Sooner or later (grumble grumble)
I'll be able to do some real work on this...
Comment 14•24 years ago
|
||
If your interested in a standard API for XPath, I can make available the new one
I will be using in XSL:P. Assaf Arkin and I took what I had initially in XSL:P
and came up a much cleaner
--Keith
Reporter | ||
Comment 15•24 years ago
|
||
I'd be very interested to review that. Could you post it as an attachment here?
Thanks very much!
Comment 16•24 years ago
|
||
Can anyone give a sort of very general, newbie update on how this is
progressing? I am interested in using it in my XULMaker program as an
alternative to IDs.
Reporter | ||
Comment 17•24 years ago
|
||
Michael: I'm working on this only in my spare time right now, due to more
pressing work obligations, but you might be able to make use of peterv's
excellent work (an API is exposed from transformiix to use XPaths). I suggest
you email him.
Reporter | ||
Comment 18•24 years ago
|
||
doesn't look like i'll be doing any xml work any time soon :( ... futuring and
reassigning to peterv.
Assignee: dr → peterv
Status: ASSIGNED → NEW
Target Milestone: mozilla1.1 → Future
Updated•23 years ago
|
Priority: P3 → P5
Updated•23 years ago
|
Whiteboard: [Hixie-PF]
Comment 20•22 years ago
|
||
WONTFIX'ing. dr's document is gone, and we have done alot of optimisations to the
xpath engine since this bug was filed.
There is a XPath js interface by now, for example.
We should evaluate if we want expressions that return nodesets to return iterators,
though. So that results are polled, rather than fed. But that's a different bug.
Which should be filed once somebody has a plan, with predicates and reverse axes
and stuff.
Kai, please don't set alias on bugs which you know nothing about. This is not THE
XPath bug. It might be a good idea to leave aliases up to those responsible for
the module, or the bug.
Alias: xpath
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → WONTFIX
You need to log in
before you can comment on or make changes to this bug.
Description
•