minimize size of serialized RDF/XML

RESOLVED FIXED in mozilla0.9.1

Status

defect
P3
normal
RESOLVED FIXED
20 years ago
Last year

People

(Reporter: waterson, Assigned: waterson)

Tracking

({perf})

Trunk
mozilla0.9.1
Attachments

(2 attachments)

The current RDF/XML serializer is pretty verbose. Optimize output for minimum space.
Status: NEW → ASSIGNED
Target Milestone: --- → M18
since I'm not storing the hostinfo.dat (for news) as serialized rdf/xml, I'm not 
blocked by this.  feel free to move it out.
Keywords: helpwanted
Target Milestone: M18 → Future
Reviving, due to data collected in bug 72128. Smaller RDF files mean fewer calls
to PR_Read() on startup.
Keywords: helpwanted → perf
Target Milestone: Future → mozilla0.9.1
Blocks: 7251
The above patch 

- breaks RDF parsing into a helper class nsRDFXMLParser, with its own API so
  that it can be scripted. (Several people have wanted to be able to parse
  RDF/XML strings from JS, e.g.)

- breaks RDF serialization into a helper class nsRDFXMLSerializer, also with its
  own API. (Several people have wanted to be able to serialize arbitrary
  datasources.)

- Cleans up the RDF/XML that's emitted; e.g., the chrome stuff drops by
  10-20% in size, and is actually human-readable now.

- Makes the chrome registry give the serializer a hint about the ``chrome''
  namespace, which helps reduce the namespace churn and improves legibility.
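
To illustrate the namespace hint in the last item, here is a minimal, self-contained sketch of how a registered prefix might be preferred over a generated one when qualifying property URIs. `NameSpaceMap`, `AddHint`, and `Qualify` are hypothetical names, `std::string` stands in for the Mozilla string classes, and the prefix `c` is only an example; this is not the interface the patch adds.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a serializer's namespace table; not the
// actual nsRDFXMLSerializer interface.
class NameSpaceMap {
public:
    // Register a preferred prefix for a namespace URI (the "hint").
    void AddHint(const std::string& prefix, const std::string& uri) {
        entries_.emplace_back(prefix, uri);
    }

    // Return a qualified name for a property URI. A known namespace is
    // reused; otherwise a generic prefix (NS1, NS2, ...) is generated.
    std::string Qualify(const std::string& propertyURI) {
        for (const auto& entry : entries_) {
            const std::string& uri = entry.second;
            if (propertyURI.size() > uri.size() &&
                propertyURI.compare(0, uri.size(), uri) == 0) {
                return entry.first + ":" + propertyURI.substr(uri.size());
            }
        }
        // No match: split after the last '#' or '/' and invent a prefix.
        std::string::size_type cut = propertyURI.find_last_of("#/");
        if (cut == std::string::npos || cut + 1 == propertyURI.size()) {
            return propertyURI;  // nothing sensible to split off
        }
        std::string prefix = "NS" + std::to_string(++generated_);
        entries_.emplace_back(prefix, propertyURI.substr(0, cut + 1));
        return prefix + ":" + propertyURI.substr(cut + 1);
    }

private:
    std::vector<std::pair<std::string, std::string>> entries_;
    int generated_ = 0;
};
```

With a hint registered for the chrome namespace URI, every chrome property comes out under the one short, recognizable prefix; without it, the output falls back to generated prefixes, which is harder to read.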

hyatt, could you r=? shaver, sr=? (or vice versa?) thanks...
Keywords: patch
[s]r=hyatt
I reviewed in IRC, but will transcribe here as well.

+    // XXX Keep a 'cached' copy of the URL; opening it may cause the
+    // spec to be re-written.

why XXX?

  // XXX Replace this with channels someday soon...

is there a bug on that?

+                              nsString& aProperty,
+                              nsString& aNameSpacePrefix,
+                              nsString& aNameSpaceURI)

nsAWritableString?

+static void
+rdf_EscapeAttributeValue(nsString& s)

combine those routines to avoid iterating over the string 4 times?

+    nsIRDFResource* resource;
+    nsIRDFLiteral* literal;

COMPtrs so that you can avoid the manual RELEASE?

+    // can't use |NS_LITERAL_STRING| here until |rdf_BlockingWrite| is fixed to accept readables
+    rdf_BlockingWrite(aStream, NS_LITERAL_STRING("</RDF:RDF>\n"));

uh.
thanks, shaver. I addressed some of your comments but punted on others:

> +    // XXX Keep a 'cached' copy of the URL; opening it may cause the
> +    // spec to be re-written.
> 
> why XXX?

No good reason. Fixed.

>  // XXX Replace this with channels someday soon...
> 
> is there a bug on that?

Yup, bug 78013, mozilla-0.9.2.

> nsAWritableString?

Tried, it's going to be a bit more work than I'd hoped; see below.

> +static void
> +rdf_EscapeAttributeValue(nsString& s)
> 
> combine those routines to avoid iterating over the string 4 times?

I combined the escape-ampersands and escape-angle-brackets routines, and fixed
some of the dumbness in escape-angle-brackets. Fixing this the ``right way'' is
more than I want to bite off right now: I think we should probably make a
``replace substrings'' method, or better, ``replace regexp'' method that can
build a state machine and munch. Put a comment to that effect.

> +    nsIRDFResource* resource;
> +    nsIRDFLiteral* literal;
>
> COMPtrs so that you can avoid the manual RELEASE?

Done.
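
The point of the suggestion, sketched with stand-in types: an owning smart pointer releases its reference on every exit path, so early returns need no manual release. `Countable` and `OwningPtr` below are hypothetical simplifications, not the real `nsISupports` or `nsCOMPtr`.

```cpp
// Hypothetical stand-ins; not the real nsISupports / nsCOMPtr.
struct Countable {
    static int live;         // number of objects currently alive
    int refs = 1;            // the creator holds the first reference
    Countable() { ++live; }
    virtual ~Countable() { --live; }
    void AddRef() { ++refs; }
    void Release() { if (--refs == 0) delete this; }
};
int Countable::live = 0;

// Adopts one reference and releases it when the pointer goes out of scope.
template <class T>
class OwningPtr {
public:
    explicit OwningPtr(T* adopted = nullptr) : ptr_(adopted) {}
    ~OwningPtr() { if (ptr_) ptr_->Release(); }
    OwningPtr(const OwningPtr&) = delete;
    OwningPtr& operator=(const OwningPtr&) = delete;
    T* operator->() const { return ptr_; }
    T* get() const { return ptr_; }
private:
    T* ptr_;
};

// Both return paths release the object without an explicit Release() call.
bool UseResource(bool failEarly) {
    OwningPtr<Countable> resource(new Countable());
    if (failEarly) {
        return false;
    }
    return resource.get() != nullptr;
}
```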

> +    // can't use |NS_LITERAL_STRING| here until |rdf_BlockingWrite| is fixed to accept readables
> +    rdf_BlockingWrite(aStream, NS_LITERAL_STRING("</RDF:RDF>\n"));
>
> uh.

Removed the erroneous comment.


 // XXX this is a hack: any "file:" URI is considered writable. All
 // others are considered read-only.

Is there a bug on that?

[s]r=shaver

Filed bug 80720 to automatically re-serialize RDF/XML back to non-``file:'' 
URLs. Doing that will probably need to wait for some changes to the way that 
Necko works (or, I'll need to do a bunch of per-protocol hackery in 
nsRDFXMLDataSource).

OTOH, the fix for this bug _does_ break the RDF/XML serialization code out into 
an XPC-addressable component. So if you're really hell-bent to send RDF/XML back 
to a server, you can do it yourself by manhandling the Necko APIs.
Checked in.
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
No longer blocks: 7251
Blocks: 7251
QA Contact: tever → nobody
Product: Core → Core Graveyard