Closed Bug 278560 Opened 18 years ago Closed 18 years ago

RDF/XML invalid with "urn:forumzilla" namespace

Categories

(MailNews Core :: Feed Reader, defect)

PowerPC
macOS
defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: sayrer, Assigned: axel)

References

()

Details

Attachments

(2 files, 4 obsolete files)

User-Agent:       Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/125.5.5 (KHTML, like Gecko) Safari/125.12
Build Identifier: version 0.6+ (20050114)

In the Thunderbird 1.0 build, everything works fine. In CVSHEAD, the forumzilla ns is translated 
incorrectly in feeds.rdf and feeditems.rdf. It looks like this:

<?xml version="1.0"?>
<RDF:RDF xmlns:NS2="http://purl.org/dc/elements/1.1/"
         xmlns:NC="http://home.netscape.com/NC-rdf#"
         xmlns:RDF="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <RDF:Description RDF:about="http://scripting.com/rss.xml"
                   urn:forumzilla:quickMode=""
                   NS2:title="Scripting News" />
</RDF:RDF>

Note that the quickMode attribute is invalid. It screws this up wherever the urn:forumzilla ns appears. 
However, it works if the namespace in utils.js is changed to an HTTP URI.

Reproducible: Always

Steps to Reproduce:
1.
2.
3.
Actual Results:  
Causes failure to load RDF store.

Expected Results:  
Fields should be populated, and the js console shoudn't throw RDF Service errors.
I noticed this weird thing too when I moved back to the trunk and couldn't
figure out what was going on at the time. Would love a patch for this. 
Attached patch patch showing console output (obsolete) — Splinter Review
I've narrowed it down a bit. This patch will illustrate the the problem. It
seems to be rdf/base/src/nsRDFXMLSerializer.cpp. This patch will print the
results of MakeQName.
Attached file sample output (obsolete) —
This sample shows that MakeQName insists that it succeeded, but returns the
full URI as the property name. The nameSpacePrefix and nameSpaceURI are empty.
From the source: "this is very bogus... whatever..."
It used to be that this was only checking for "#" and "/", and then rather
optimistically assuming the split would result in a valid local name. This
patch will look for the last reserved character in the URI
("#","/",":","+","@",etc) and then look forward for the first letter or "_" to
start the local name.

Something like this
"http://example.org:2095/ns#asfds?sdf=sdfs/099asdfs-asdf"

will result in 

{http://example.org:2095/ns#asfds?sdf=sdfs/099} asdfs-asdf
Attachment #171592 - Attachment is obsolete: true
Attachment #171594 - Attachment is obsolete: true
Attached patch adjust white space (obsolete) — Splinter Review
Attachment #171790 - Attachment is obsolete: true
Looks like RSS support broke on the trunk after this change to 

nsRDFXMLSerializer::MakeQName by Axel:

http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&file=nsRDFXMLSerializer.cpp&branch=&root=/cvsroot&subdir=mozilla/rdf/base/src&command=DIFF_FRAMESET&rev1=1.21&rev2=1.22

cc'ing axel. He might have some ideas. I'll ask him to look at your proposed fix
as well.

Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Comment on attachment 171791 [details] [diff] [review]
adjust white space

sayer has patch which trys to fix the namespace bustage introduced by Bug
#239369 on the trunk. Is this the way you'd like to fix it or do you have a
different idea Axel?
Attachment #171791 - Flags: review?(axel)
Comment on attachment 171791 [details] [diff] [review]
adjust white space

This patch does not fix all the incarnations of the bug. Like, try to use
urn:x-foo:bla#foo
and then
urn:x-foo:bla#foo:bar.
Apart from that, the code is in it's very own style, severely different from
the rest of the file (and conflicting with my own preferences).
Some of this fixing would be easier with bug 259119 fixed, I could actually try
to get that bug up to speed again.
Attachment #171791 - Flags: review?(axel) → review-
Depends on: 259119
... actually, I didn't break this.

This is coming from our move to the namespaced expat parser. We always serialized 
this this way, since version 1.1 of the serializer.
It's just that we moved over to require XML Namespaces in our input XML.

This may be a problem for profile migration :-/.
thanks for jumping in Axel.
Hi Peter, have you had a chance to look at this regression that was caused by
the trunk's move to the name spaced expat parser?

This is making it really hard for folks to start testing trunk builds because
our feed datasources keep getting wiped out.
Target Milestone: --- → Thunderbird1.1
I see two options to get this stuff back in line:

- back out the namespaced expat

- do profile migration

I'd favour the latter, but I have no clue if we expose hooks for that or if the
profile can cope with it. The actual migration code could explicitly hook up the 
non-namespaced expat and the old version of the rdf sink and serialize the data
back, once I fixed bug 259119.
Though this somewhat calls for a fix of the vocabulary as well. The use of
urn:forumzilla is not really standards conformant (it should at least be
urn:x-forumzill), nor would I advise to use urns for vocabularies, as they are
even more awkward to document than http: urls.
Mental note, new vocabularies should use
#define DEVMO_NAMESPACE_URI_PREFIX "http://developer.mozilla.org/rdf/vocabulary/
"
from rdf.h, so that we have a central entry point.
Axel, if I tried to figure out a way to migrate these two data sources to
something  with a different namespace, what format should that name have? 

Currently we use:

xmlns:fz="urn:forumzilla:"

which obviously is causing problems with the new expat parser.

Should this be an http URL instead. Something like:

http://developer.mozilla.org/rdf/vocabulary/rss-subscriptions#rss-subscriptions
http://developer.mozilla.org/rdf/vocabulary/rss-items#rss-items


If we go the profile migration approach, I'm assuming we can take the old data
source which we seem to be able to load correctly (it's serializing it back to
disk that has all the problems) and dynamically change the name space to the new
name. I need to investigate to make sure that is possible. 

Peterv, do you have comments on backing out the expat name space changes before
I go down the path of forcing a profile migration?
Axel, one more question about the namespace issue. I'm trying to edit a
feeds.rdf file by hand to use a new namespace just so I can see things work
again and I'm running into problems with that. 

Will I avoid the expat issue if I just search and replace "urn:forumzilla:" and
replace it with a different name space by changing:

-xmlns:fz="urn:forumzilla:"
+xmlns:fz="http://developer.mozilla.org/rdf/vocabulary/rss-subscriptions/"
and
-<RDF:Description RDF:about="urn:forumzilla:root">
+<RDF:Description RDF:about="fz:root">

then modifying the content builder UI:

-<tree ref="urn:forumzilla:root"
+<tree ref="fz:root"

I'm trying to convert an old feeds.rdf file by hand into one that will work so I
can better understand the changes a profile migrator is going to need to do. 

Ok, this is actually a regression from peterv's patch.
http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&file=nsRDFContentSink.cpp&branch=&root=/cvsroot&subdir=mozilla/rdf/base/src&command=DIFF_FRAMESET&rev1=1.117&rev2=1.118
shows that, for the opening RDF:RDF tag, we don't go thru the attributes to 
collect namespaces no more.
That one should be fixed, independently of bug 259119. I'm looking into it right
now.

Having the namespace definitions should cure this particular regression.
That doesn't make the serializer xml namespaces conform itself.
Neither does it fix the issues I have with the vocabulary used.
The use of urn: is fine as long as you use it to describe profile data.
It should not be used for predicates.
The difference is, profiles are not globally unique, that is, the concept of
a URL doesn't mix with the concept of a profile, so using some meta scheme is
fine. The predicates on the other hand are globally unique and thus should have
a URL to identify themselves. And then of course, urn:s should be standards
compliant.
Well, I hope we don't end up with yet another bitrotting non-spec'ed vocabulary 
in the mozilla codebase. (As I don't see us going thru profile migration code 
just to make me happy with a vocab.) So, that vocabulary should be spec'ed.
Please provide a document to go to http://developer.mozilla.org/rdf/vocabulary/,
an example of a suitable doc would be http://vocab.org/bio/0.1/.
Assignee: mscott → axel
Status: ASSIGNED → NEW
No longer depends on: 259119
This patch adds back the namespace catching phase to HandleStartElement.
It just stores those prefix namespace pairs in the datasource, which will then
add them to the serializer.
This does not do any magic in terms of catching multiple definitions and stuff
like that, the original code didn't either. I just want to get this regression
out of our way.
Scott, could you review and test? I did some tests with rdfcat on feeds.rdf,
that worked, but I don't do development builds of TB these days.
Attachment #171791 - Attachment is obsolete: true
Attachment #173363 - Flags: superreview?(peterv)
Attachment #173363 - Flags: review?(mscott)
Comment on attachment 173363 [details] [diff] [review]
catch all namespace definitions of the source

>Index: base/src/nsRDFContentSink.cpp
>===================================================================

>      * aLocalName will contain the localname in aExpatName and the returned
>      * value will be the namespace URI.
>+     * The return value is a dependent string containing just the namespace.

No need to say it twice, remove the stuff about the returned value from the
previous sentence.

Thanks for catching this, too bad we can't combine the attribute loops :-(.
Attachment #173363 - Flags: superreview?(peterv) → superreview+
*** Bug 279737 has been marked as a duplicate of this bug. ***
Comment on attachment 173363 [details] [diff] [review]
catch all namespace definitions of the source

so far this patch has fixed things for me and I can use RSS on the trunk again! 

Thanks Axel.
Attachment #173363 - Flags: review?(mscott) → review+
When I was trying to solve this problem on the app side, I was unable to change
the name space of the data source after it had been loaded using
nsIRDFDataSource::Change. So I ended up hacking on a brute force profile
migrator that would stream each file into a buffer and then try to search and
replace the name space into something valid. 

If we end up deciding to do a profile migration for the feed datasources down
the line in order to install a proper name space and a more structured
vocabulary, then we can leverage this patch. Hence I'm posting it here. All it
needs is a clear mapping of what needs to be changed from the old formats to
the new. i.e. the namespace, each piece of new vocabulary, etc.
Checked in.

Regarding the profile migration hack, I was actually hoping that one could have
hooked that up in the startup code that verifies the integrity of themes and
extensions.

Anyway, good to have this one fixed.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Component: RSS → Feed Reader
Product: Thunderbird → MailNews Core
Target Milestone: Thunderbird1.1 → ---
You need to log in before you can comment on or make changes to this bug.