Closed Bug 357276 Opened 15 years ago Closed 4 years ago

Add an RDF compatibility parser

Categories

(Core Graveyard :: RDF, defect, P3)

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: benjamin, Unassigned)

References

Details

Attachments

(2 files)

To remove the "heavy" mozilla/rdf code from the platform, we need a couple things:

1) a way to import data that's currently in RDF files
2) a simple graph-based data model for XUL templating

This bug covers part 1. I have an RDF parser using SAX which can read the RDF produced by the current RDF serializer, as well as "standard" RDF with the following exceptions:

* no handling of relative URIs, xml:base, or rdf:ID on nodes
* no handling of reification (rdf:ID on properties)
* no handling of rdf:parseType (though parseType="Resource" would be pretty easy to add)

It does do lang-typed literals, which should make it possible to solve the "localized extension descriptions" bug.

The code is shipped as a JS file which can be loaded using the subscript loader. I will be attaching two patches: one containing the parser itself and xpcshell unit tests; the other changes the chrome registry to use the new parser.
The unit test depends on the RDF testsuite living in rdf/compat/test/testcases. I have not included the testcases in this patch, but I will land them separately, from this archive: http://www.w3.org/2000/10/rdf-tests/rdfcore/all_20031114.zip
Attachment #242766 - Flags: review?(sayrer)
I just noticed that rdf-parser.js in the first patch didn't have a license header. I have added one in my tree.
Attachment #242769 - Flags: review?(sayrer)
I think that a proposal like this should be discussed in .planning.

Moving away from what we currently have may be a good idea, but I'd like to see a better case being made on where to go to and why.
And I'm honestly more interested to see a plan for localstore.rdf than for contents.rdf in toolkit, which, hrm, is even more out of fashion than RDF or RDF/XML, even.
(In reply to comment #0)
> * no handling of rdf:parseType (though parseType="Resource" would be pretty
> easy to add)
> 

I filed bug 357289 to implement a SAX recorder, since I should have one for feeds too. That should make XML literals easy. 
Comment on attachment 242766 [details] [diff] [review]
RDF-compat parser, rev. 1

If this interface had IDL, this patch would get rejected for lack of documentation. I grok enough to continue reviewing, though.

I don't think docstrings on each function are necessary, but I do think explanations of what each object does would be good, in addition to an introductory comment detailing what the file does at a very high level.
Comment on attachment 242766 [details] [diff] [review]
RDF-compat parser, rev. 1

general nits: 

Pick one brace style for one-line conditionals.
JS code I'm familiar with generally doesn't use the "a" prefix for args. There is a mix in this file.

>+
>+function getRDFAttribute(aAttr, atts)
>+{
>+  var rv = atts.getValueFromName(RDF_NAMESPACE, aAttr);

nit: the rv name clashes weirdly with nsresult usage. use "text" or "value"


>+
>+XML_NAME = /^[A-Za-z_][0-9A-Za-z\.\-_]*$/;
>+

This seems to disallow elements with unicode names. Other regexes in the file might have this issue as well.

js> myTextVar = "\u0046\u004f\u004f"
FOO
js> XML_NAME.test(myTextVar)
true
js> myTextVar = "\u7000\u5000\u5122";
瀀倀儢
js> XML_NAME.test(myTextVar)
false

I think this is fine for a follow up if it's not a regression.
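
For that follow-up, the name pattern could be widened to the BMP ranges of XML 1.0's NameStartChar/NameChar productions. A sketch (not spec-complete: the astral range U+10000-U+EFFFF and the colon are omitted; the names here are illustrative, not from the patch):

```javascript
// The ASCII-only pattern from the patch, for comparison:
var XML_NAME_ASCII = /^[A-Za-z_][0-9A-Za-z.\-_]*$/;

// BMP ranges of XML 1.0 NameStartChar, built as a string so the
// resulting character class stays readable.
var NAME_START = "A-Za-z_" +
  "\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u02FF\\u0370-\\u037D" +
  "\\u037F-\\u1FFF\\u200C-\\u200D\\u2070-\\u218F\\u2C00-\\u2FEF" +
  "\\u3001-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFFD";
// NameChar adds digits, ".", "-", middle dot, and combining ranges.
var NAME_CHAR = NAME_START + "0-9.\\-\\u00B7\\u0300-\\u036F\\u203F-\\u2040";
var XML_NAME_UNICODE = new RegExp("^[" + NAME_START + "][" + NAME_CHAR + "]*$");

console.log(XML_NAME_ASCII.test("\u7000\u5000\u5122"));   // false
console.log(XML_NAME_UNICODE.test("\u7000\u5000\u5122")); // true
```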


>+
>+  r = LiteralCRP.exec(aObject);
>+  if (r != null) {
>+    if (r[2] == "^^") {
>+      return new RDFTypedLiteral(untripleString(r[1]), untripleString(r[3]));
>+    }
>+    return new RDFLiteral(untripleString(r[1]), untripleString(r[4]));
>+  }

This stuff is hard to understand. How about descriptive names for the matches?

  if (r != null) {
    var [,foo,bar,baz,qux] = r; //maybe we can use JS 1.7?
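
To make the suggestion concrete, here is a runnable sketch of destructuring the match groups. The regex is a simplified stand-in for the patch's LiteralCRP (the real one handles N-Triples escape sequences), and the group names are illustrative:

```javascript
// Simplified stand-in for LiteralCRP: "lexical"^^<datatype> or "lexical"@lang.
var LITERAL_RE = /^"([^"]*)"(?:(\^\^)<([^>]*)>|@([A-Za-z][A-Za-z0-9-]*))?$/;

function parseLiteral(token) {
  var r = LITERAL_RE.exec(token);
  if (r === null)
    return null;
  // Skip the full match; name the groups instead of indexing r[1]..r[4].
  var [, lexicalForm, typeMarker, datatypeURI, langTag] = r;
  if (typeMarker === "^^")
    return { value: lexicalForm, datatype: datatypeURI };
  return { value: lexicalForm, lang: langTag };
}
```

With descriptive names, the `r[2] == "^^"` branch in the original reads as "a datatype marker was present" at a glance.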


>+  toNTriples: function rdfs_toNTriples()
>+  {
>+    var lines = [];
>+    for (var subj in this._forward) {
>+      var subjobj = this._forward[subj];
>+      for (var pred in subjobj) {
>+        var predarray = subjobj[pred];
>+        for each (var obj in predarray) {
>+          lines.push(subj + "\t" + pred + "\t" + obj + " .");
>+        }
>+      }
>+    }
>+    lines.sort();
>+    return lines.join("\n") + "\n";
>+  },

nit: Nested for-loops like this are suspect in JS1.7. You could possibly tighten 
this up by getting rid of the key/value assignment stuff and then perhaps look at 
transforming to one or two array comprehensions.

// not tested
for (var [subj, subobj] in Iterator(this._forward))
  for(var [predicate,predarray] in Iterator(subobj))
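
For reference, the same traversal is straightforward in standard JavaScript without JS1.7 `Iterator`; this sketch assumes `_forward` has the `{ subject: { predicate: [objects] } }` shape used in the patch:

```javascript
// Standard-JS sketch of toNTriples over a { subj: { pred: [obj, ...] } } map.
function toNTriples(forward) {
  var lines = [];
  for (var subj of Object.keys(forward)) {
    for (var pred of Object.keys(forward[subj])) {
      for (var obj of forward[subj][pred]) {
        lines.push(subj + "\t" + pred + "\t" + obj + " .");
      }
    }
  }
  // N-Triples output is sorted so serializations compare stably in tests.
  lines.sort();
  return lines.join("\n") + "\n";
}
```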


>+
>+  // @param aParam either a string to be parsed directly, or a nsIFile
>+  //               which is read and then parsed
>+  parseNTriples: function rdfs_parseNTriples(aParam)
>+  {
>+    if (aParam instanceof Ci.nsIFile) {
>+      var fis = Cc["@mozilla.org/network/file-input-stream;1"].
>+        createInstance(Ci.nsIFileInputStream);
>+      fis.init(aParam, 0x01, 0664, 0);

nit: these flags should be descriptive consts.
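
For instance (assuming the 0x01 here is NSPR's PR_RDONLY open flag, as in other Mozilla JS, and 0664 is a Unix permission mode):

```javascript
// Named equivalents of the magic numbers in fis.init(aParam, 0x01, 0664, 0).
const PR_RDONLY  = 0x01;   // NSPR open flag: read-only
const PERMS_FILE = 0o664;  // rw-rw-r--; the patch uses the older 0664 syntax

// fis.init(aParam, PR_RDONLY, PERMS_FILE, 0);
```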

>+      var sis = Cc["@mozilla.org/scriptableinputstream;1"].
>+        createInstance(Ci.nsIScriptableInputStream);
>+      sis.init(fis);
>+
>+      var data = sis.read(sis.available());
>+      this._parseNTriplesFromString(data);
>+    }
>+    else {
>+      this._parseNTriplesFromString(aParam.toString());
>+    }
>+  },

Shouldn't this close the stream in a finally block? 

Will this always read the whole file? (I don't know)

>+
>+  _parseNTriplesFromString: function rdfs_parsentriplesfromstring(aString)
>+  {
>+    this.clear();
>+
>+    var lines = aString.split(/[\n\r]+/);
>+    for each (var line in lines) {
>+      if (COMMENT_OR_EMPTY.test(line))
>+        continue;
>+
>+      var parts = splitNTriple(line, this);
>+      this.addTriple(parts[0], parts[1], parts[2]);
>+    }
>+  },

nit: this stuff can be made easier to follow by using JS features

// not tested
var lines = aString.split(/[\n\r]+/)
                   .filter(function(line) { return !COMMENT_OR_EMPTY.test(line); });
var partslist = [splitNTriple(line, this) for each (line in lines)];

// use loops when there are side-effects
for each (var parts in partslist) { 
  this.addTriple(parts[0], parts[1], parts[2]);
}
 
>+
>+  setSAXHandlers: function(aSAXReader)

This is a judgement call, and I think whichever you prefer is ok. SAX handlers
usually aren't swapped out like this unless they are intended to be composed
in different ways (like one handler calls the next). So, startElement would
usually have something like this:

if (this.state == IN_PROLOG) {
 ...
} else if (this.state == IN_DOC)
 ...
 
etc. You are holding a lot of state in the handlers though, so I can see why 
the current code might be better.

xml:lang and xml:base are easier to handle with the switch-style handlers,
because you can keep sparse arrays keyed to depth for the context of each.
push() on startElement and pop() on closeElement()
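
A minimal sketch of that technique, with a hypothetical helper (the name and shape are not from the patch); the same structure works for xml:base:

```javascript
// Tracks xml:lang context per element depth: push on startElement,
// pop on endElement, so current() always reflects the nearest ancestor.
function LangTracker() {
  this._stack = [];
}
LangTracker.prototype = {
  // xmlLang is the element's xml:lang attribute value, or null if absent.
  startElement: function (xmlLang) {
    var inherited = this.current();
    this._stack.push(xmlLang !== null ? xmlLang : inherited);
  },
  endElement: function () {
    this._stack.pop();
  },
  current: function () {
    return this._stack.length ? this._stack[this._stack.length - 1] : null;
  }
};
```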

r=sayrer with style nits fixed. the rest is up to you.
Attachment #242766 - Flags: review?(sayrer) → review+
(In reply to comment #6)
> Shouldn't this close the stream in a finally block? 

In fact this never closes the file; you have to read beyond the end of the file to get auto-closure, as I assume the constant arguments specify but am too lazy to double-check.

> Will this always read the whole file? (I don't know)

According to the contract of the API, no; it would be entirely free to return data in 5-byte chunks until it hit the end of the file if it wanted.  There's also the rather-meaningless problem of the file being greater than 2**32 bytes in size.  The in-tree implementation returns the number of bytes available to be read beyond the current file pointer location; I doubt that's guaranteed to be the entire file, but it typically would be where this would be used, I expect.
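
Given that contract, the robust pattern is to loop until available() reports no more data and close in a finally block. A sketch against a mock with the same read/available/close shape as nsIScriptableInputStream (the helper names are illustrative):

```javascript
// Reads the whole stream even if available() only reports a partial count,
// and guarantees the stream is closed on error.
function readAll(stream) {
  var data = "";
  try {
    var n;
    while ((n = stream.available()) > 0) {
      data += stream.read(n);
    }
  } finally {
    stream.close();
  }
  return data;
}

// Mock stream that hands out data in 5-byte chunks, as the API contract
// would permit a real implementation to do.
function makeChunkedStream(text) {
  var pos = 0, closed = false;
  return {
    available: function () { return Math.min(5, text.length - pos); },
    read: function (n) { var s = text.substr(pos, n); pos += n; return s; },
    close: function () { closed = true; },
    get closed() { return closed; }
  };
}
```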
Comment on attachment 242769 [details] [diff] [review]
Make the chrome registry use the rdf-compat parser, rev. 1

>? chrome/src/importer-test.js

There are probably a few tests to write for this, especially getResourceName.

>+
>+function getResourceName(resource)
>+{
>+  if (!resource instanceof RDFParser.RDFNode)
>+    throw Error("Unexpected value passed as resource.");
>+
>+  var urispec = resource.uri();
>+  return /:([^:]+)$/.exec(urispec)[1];
>+}

Use nsIURI to get the pieces you need.
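
Note also that the patch's guard `!resource instanceof RDFParser.RDFNode` parses as `(!resource) instanceof RDFParser.RDFNode`, which is always false, so the check never fires. A sketch with the precedence fixed and the regex result checked before use (RDFNode here stands in for RDFParser.RDFNode):

```javascript
// Extracts the final segment after the last colon, e.g.
// "urn:mozilla:package:browser" -> "browser".
function getResourceName(resource, RDFNode) {
  // Parenthesize: "!resource instanceof X" would negate first.
  if (!(resource instanceof RDFNode))
    throw Error("Unexpected value passed as resource.");

  var urispec = resource.uri();
  var match = /:([^:]+)$/.exec(urispec);
  if (match === null)
    throw Error("URI has no name segment: " + urispec);
  return match[1];
}
```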

>+
>+    var fis = Cc["@mozilla.org/network/file-input-stream;1"].
>+      createInstance(Ci.nsIFileInputStream);
>+    fis.init(aInstalledFile, 0x01, 0664, 0);

nit: name the flags

>+    var sis = Cc["@mozilla.org/scriptableinputstream;1"].
>+      createInstance(Ci.nsIScriptableInputStream);
>+    sis.init(fis);
>+
>+    var data = sis.read(sis.available());

close the stream, make sure to read the whole stream

r=sayrer with that
Attachment #242769 - Flags: review?(sayrer) → review+
Priority: -- → P3
Assignee: benjamin → nobody
Blocks: 833098
Duplicate of this bug: 857454
Is this WONTFIX? Given the remaining usages of RDF, the solutions are either to convert them to JSON or to remove the usages.

I don't think adding a compat parser is the way to go though.
Flags: needinfo?(mak77)
(In reply to Tim Nguyen :ntim from comment #10)
> Is this WONTFIX? Given the remaining usages of RDF, the solutions are
> either to convert them to JSON or to remove the usages.

I agree, but the last call should be by the add-ons team since the only remaining usages are install.rdf and update.rdf, so forwarding to jorgev. The last thing I heard was that we planned to completely move to json in 2018, if so a reader would not be useful.
Flags: needinfo?(mak77) → needinfo?(jorge)
install.rdf is going away with legacy add-ons, so we can drop support for it later this year. update.rdf already has a JSON alternative (https://developer.mozilla.org/en-US/Add-ons/Updates), but we need to move AMO to start using it (https://github.com/mozilla/addons-server/issues/7223) and then give non-AMO developers time to migrate to the new format.

Regardless, we're moving away from RDF and at this point I don't think it's worth working on a new parser.
Status: NEW → RESOLVED
Closed: 4 years ago
Flags: needinfo?(jorge)
Resolution: --- → WONTFIX
No longer blocks: 274068
Product: Core → Core Graveyard