Closed Bug 59783 Opened 24 years ago Closed 22 years ago

Implement dynamic XPath

Tracking

()

Status:

VERIFIED WONTFIX

Milestone:

Future

People

(Reporter: dr, Assigned: peterv)

References

Details

(Keywords: arch, Whiteboard: [Hixie-PF])

Attachments

(3 files)

XPath Expressions in docbook stylesheets, see http://nwalsh.com/docbook/xsl/ 24 years ago Axel Hecht 3.86 KB, text/plain		Details
Revised stats on DocBook XSLT fu 24 years ago Dan Rosen 3.87 KB, text/plain		Details
counting the xpaths evals in docbook samples 24 years ago Axel Hecht 3.51 KB, patch		Details \| Diff \| Splinter Review

Dan Rosen

Reporter

Description

•

24 years ago

XBL, dynamic XSLT, and maybe XLink, all need a dynamic implementation of XPath, in which re-evaluation of a given XPath expression with respect to a given document is efficient, given the result of the expression and the exact change to the document. For example, given a document: <foo> <bar/> <bar/> <bar/> </foo> and an XPath: count(/foo/bar) if the document changes to include a fourth bar, the XPath shouldn't re-evaluate entirely, but should instead use the prior result (3) and the change (addition of an extra bar) to calculate the new result (4) without having to traverse the document again. Other criteria for the XPath implementation include: - Ability to plug additional functions/axes/etc. into the language based on context (eg. XSLT, XPointer, XSL-FO, etc. add extensions to the XPath language). - Ability to use from anywhere, distinguishing between dynamic and one-pass use (eg. define both nsIXPath and nsIDynamicXPath). - Fit cleanly with DOM mutation events, and use some simple notification scheme for clients. Setting component to XSLT, since that's the umbrella under which this will be implemented.

Dan Rosen

Reporter

Updated

•

24 years ago

Blocks: 18722

Severity: normal → enhancement

Status: NEW → ASSIGNED

Keywords: arch

Target Milestone: --- → Future

Axel Hecht

Comment 1

•

24 years ago

To get a feeling for XPath expression coming up in XSLT, I extraced most (?) of them from the docbook stylesheet by Norman Walsh. I fed them thru some perl-fu, to find duplicates, and sort them according to frequency. I will attach the listing in a minute, don't wonder about NCName,that's a [a-z][a-z0-9]*, part of perlfu. We could use that list to evaluate how important performance of specific expressions is. Of course this should be done by counting their use, not frequence, I may do that someday. Axel

Axel Hecht

Comment 2

•

24 years ago

Attached file XPath Expressions in docbook stylesheets, see http://nwalsh.com/docbook/xsl/ — Details

Dan Rosen

Reporter

Comment 3

•

24 years ago

Attached file Revised stats on DocBook XSLT fu — Details

Dan Rosen

Reporter

Comment 4

•

24 years ago

Axel -- Kickass work! I cleaned up your stats a little bit. There were a lot of things that looked like: $NCName.NCName.NCName.NCName Which are really just variable references, where NCName[.NCName]* is just one NCName (for example, "$arg.choice.plain.open.str"). I've collected those cases into $VarRef. Also, instances of NCName:: I've turned into AxisName::. Also sorted things better (sort -g -r does the numeric fu you want ;). Very interesting stuff, though. The most common XPath by far (549 occurrences) is a plain NCName. This is a relief, since that is a trivial case to optimize for. Second most common (164) is a variable reference, which doesn't even *need* optimizing with respect to the unnecessary traversals of the source tree. Third most common (142, but I suspect it would be more common than variable references in less complex XSLTs) is a single step expression, NCName/NCName. Cases like [("/"|"//")NCName]* strike me as easy enough -- we can probably match the name of the element changed against the list of names in the XPath and do something smart there... Further common cases are ".", "@NCName", unions, etc. which would reasonably have been expected as common, and should follow similarly to the cases above... One thing we might want to exploit is pre-implementing some of what XPath 2.0 might have to offer. What I'm thinking about specifically is the desire to loosen restriction on step expressions, so that instead of having to write things like: a1/b1/c | a1/b2/c | a2/b1/c | a2/b2/c you can write: (a1|a2)/(b1|b2)/c The ability to express things like this, I'm thinking, might be exploitable for optimization. In the DocBook stylesheets, you see a lot of things that look like this (for example, "simplesectinfo/authorgroup|docinfo/authorgroup"), and if those can be logically collapsed in the parsing stage, that would save a decent amount of work during transformation. Axel: can you do some more of your magic? It would be great to get a spread of data for different flavors of XSLT stylesheets. Anybody else still in college who wants to do some thesis work? *grin*

Dan Rosen

Reporter

Comment 5

•

24 years ago

Updated website to include XPath implementation plans: http://people.netscape.com/dr/xslt/xpath.html

Axel Hecht

Comment 6

•

24 years ago

Hi Dan, I read your new doc (sorry, mine is still pending, I know), and some comments: First of all, schedule, hey, cool ;-). Concerning the changes of grammar for XSLT and XPointer, can you give the modified rules? I wonder, because it didn't pop up to my eyes reading the XSLT spec. When it comes down to optimizations, can the plugged optimizations be so fine grained and efficient, that they win over the price of being able to plug? Do we have more than two kind of grammar rules in XPath? Lists and Switches? You talked about doing code generation and reuse, does this look like two sets of macros? I'll probably whack some profiling before I'll get down to docs. Don't know when though. Thinking about tonight, but doubting still. Axel

Axel Hecht

Comment 7

•

24 years ago

Here goes some profiling data, dr, you may rename those again ;-) This was done by running the docbook samples test.xml and book2.xml once, with the html stylesheets. Axel

Axel Hecht

Comment 8

•

24 years ago

Attached patch counting the xpaths evals in docbook samples — Details — Splinter Review

Dan Rosen

Reporter

Updated

•

24 years ago

Target Milestone: Future → mozilla1.1

Dan Rosen

Reporter

Updated

•

24 years ago

Blocks: xlink

Dan Rosen

Reporter

Comment 9

•

24 years ago

reposted xpath planning doc to include Capital Letters (it's a miracle!) and a little bit of clarifying. so i've been trying to hack together a parser and tokenizer using bison and re2c, per scc@mozilla.org's suggestion. re2c has been moderately well-behaved, but bison does not like generating either c++-friendly or mozilla-friendly code. another popular tool called antlr seemed a possibility, but it doesn't seem particularly mozilla-friendly either. i may either have to do some perl-fu to get the output from antlr to play nicely, or may have to hand-code the parser for now (at least the tokenizer will still be generated). still looking for help in this area... any ideas anybody?

Keith Visco

Comment 10

•

24 years ago

XPath is fairly small. Though not very pretty, I coded the transformiix XPath lexer/tokenizer by hand. I can probably help...or maybe I can port the existing one to your XPath object model. I still need to spend some time reading your ideas. I apologize for not having had a chance to look at them yet. --Keith

Axel Hecht

Comment 11

•

24 years ago

Hi Dan, I just cross read a bit of antlr's documentation, and it looked to me as we would need to link the executable to a antlr library. How far did you test this? I didn't even bother downloading after reading this ;-). Could you attach the code you have? I'm curious as to why mozilla won't love the bison code. Probably lots of reasons, I just wonder if there are a few good ones :-). I haven't found any other parsers. But isn't our grammar rather limited? So how about doing the code gen directly in perl? Axel Axel

Dan Rosen

Reporter

Comment 12

•

24 years ago

Yeah, I was similarly disappointed with antlr. Amazingly, gnu lex (flex) has the same shortcoming -- you have to compile with -lfl for it to work. No way. Once I can actually start working on this (i.e. when I manage to transfer to the XML/DOM team) I'm going to hack together a nice and simple tokenizer interface, and we can write the parser(s) however we want. If you are able to write a parser generator in perl, (are you serious?!) that would be very cool... Otherwise, I'm starting to lean towards hand-coding the parser, because it's simple enough, and because we need to get this done before any "real" work can happen.

Dan Rosen

Reporter

Comment 13

•

24 years ago

I've done a bunch more work on this in my spare time. I have a sexy UML diagram living on my whiteboard, and I was coordinating a little bit with rayw, who was trying to design a standard API for XPath... I'd love to get a copy of Rational Rose, to get this stuff down in a decent form. Sooner or later (grumble grumble) I'll be able to do some real work on this...

Keith Visco

Comment 14

•

24 years ago

If your interested in a standard API for XPath, I can make available the new one I will be using in XSL:P. Assaf Arkin and I took what I had initially in XSL:P and came up a much cleaner --Keith

Dan Rosen

Reporter

Comment 15

•

24 years ago

I'd be very interested to review that. Could you post it as an attachment here? Thanks very much!

Michael Hearn

Comment 16

•

24 years ago

Can anyone give a sort of very general, newbie update on how this is progressing? I am interested in using it in my XULMaker program as an alternative to IDs.

Dan Rosen

Reporter

Comment 17

•

24 years ago

Michael: I'm working on this only in my spare time right now, due to more pressing work obligations, but you might be able to make use of peterv's excellent work (an API is exposed from transformiix to use XPaths). I suggest you email him.

Heikki Toivonen (remove -bugzilla when emailing directly)

Updated

•

24 years ago

No longer blocks: xlink

Dan Rosen

Reporter

Comment 18

•

24 years ago

doesn't look like i'll be doing any xml work any time soon :( ... futuring and reassigning to peterv.

Assignee: dr → peterv

Status: ASSIGNED → NEW

Target Milestone: mozilla1.1 → Future

Axel Hecht

Updated

•

23 years ago

Priority: P3 → P5

Hixie (not reading bugmail)

Updated

•

23 years ago

Whiteboard: [Hixie-PF]

Kai Lahmann (is there, where MNG is)

Comment 19

•

22 years ago

alias for faster searching

Alias: xpath

Axel Hecht

Comment 20

•

22 years ago

WONTFIX'ing. dr's document is gone, and we have done alot of optimisations to the xpath engine since this bug was filed. There is a XPath js interface by now, for example. We should evaluate if we want expressions that return nodesets to return iterators, though. So that results are polled, rather than fed. But that's a different bug. Which should be filed once somebody has a plan, with predicates and reverse axes and stuff. Kai, please don't set alias on bugs which you know nothing about. This is not THE XPath bug. It might be a good idea to leave aliases up to those responsible for the module, or the bug.

Alias: xpath

URL: http://people.netscape.com/dr/xslt

Status: NEW → RESOLVED

Closed: 22 years ago

Resolution: --- → WONTFIX

Dan Rosen

Reporter

Comment 21

•

22 years ago

Verifying. See comments in 18722.

Status: RESOLVED → VERIFIED

You need to log in before you can comment on or make changes to this bug.