Closed
Bug 231548
Opened 21 years ago
Closed 18 years ago
remove nsRDFParserUtils
Categories
(Core Graveyard :: RDF, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
mozilla1.9alpha1
People
(Reporter: tingley, Assigned: readams)
References
Details
(Keywords: perf)
Attachments
(5 files, 2 obsolete files)
19.98 KB,
patch
|
Details | Diff | Splinter Review | |
364 bytes,
text/rdf
|
Details | |
17.54 KB,
patch
|
axel
:
review-
|
Details | Diff | Splinter Review |
18.45 KB,
patch
|
axel
:
review+
shaver
:
superreview+
|
Details | Diff | Splinter Review |
18.51 KB,
patch
|
Details | Diff | Splinter Review |
nsRDFParserUtils, as far as I can tell, contains legacy left over from before RDF was hooked up to expat. Some of its code is never called, and the rest of performs functions that are redundant and (as a result) incorrect. nsRDFParserUtils contains 3 functions: GetQuotedAttributeValue() is never used, and would be redundant with nsParserUtils::GetQuotedAttributeValue() if it were. EntityToUnicode() is only called from StripAndConvert(). StripAndConvert is applies two transformations to attribute values: - Strip matching single and double quotes from strings. - Convert some character entities. The first behavior is not only unnecessary, it is incorrect. Quotes have been stripped, to the extent that they should be, by the parser before the content sink ever sees them; removing additional pairs of quotes is altering opaque data that we shouldn't touch. When presented with something like this: <RDF:Description RDF:about="..." NC:Name="'In Quotes'" /> The parser will strip the outer "" and pass |'In Quotes'| to RDF. StripAndConvert() will incorrectly remove the '' as well. The second part selectively unescapes certain character entities, notably &, >/<, etc. Again, this is redundant -- entities in the attributes have already been handled by the parser. (Historical note -- I believe this happened as part of bug 57248 -- see harishd's bug 57248 comment #62 and onward, where he discusses moving attribute entity parsing from the sinks to the parser.) Similar to the quote-stripping, this can produce incorrect behavior in fringe cases involving doublely-embedded entities. [As a side note, both these behaviors make a round-trip through the RDF system non-idempotent for some RDF literals, which is bad behavior.] Killing this code will reduce footprint, speed up RDF processing, and pave the way for a string cleanup of the RDF content sink that I've been meaning to do for some time. I had a patch for this at one point that sped up RDF parsing of chrome.rdf by about 8% in an opt build; it needs to be de-rotted, though.
Reporter | ||
Comment 1•21 years ago
|
||
This is missing stuff to cut it out of the build on various other platforms, if that's necessary. It removes the StripAndConvert calls and cleans up some string stuff nearby that I couldn't resist getting rid of. I'm still testing this.
Reporter | ||
Comment 2•21 years ago
|
||
rjc: not looking for a review yet, but tell me if my reasoning sounds bogus.
Reporter | ||
Comment 3•21 years ago
|
||
Basically the same as before, though I've included the contents of the parseutils files, essentially for reference. I've been testing this for a week, and everything seems ok.
Attachment #139530 -
Attachment is obsolete: true
Reporter | ||
Updated•21 years ago
|
Attachment #139937 -
Flags: review?(rjc)
Reporter | ||
Updated•21 years ago
|
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.7alpha
Comment 4•20 years ago
|
||
yikes. loss of data, bumping severity. This is a good thing to do, will attach a testcase that show the loss of data if you do an rdfcat on it. I wonder if we have to use a nsACString, though. rdf_RequiresAbsoluteURI should just need nsCSubstring, rdf_MakeAbsoluteURI(const nsCSubstring& aBaseURI, const nsCSubstring& aURI, nsCString& aResult) No nsA, as we don't wanna cross xpcom boundaries and there's no need to use the binary compat virtual function call interfaces to strings.
Severity: normal → critical
QA Contact: core.rdf
Target Milestone: mozilla1.7alpha → ---
Comment 5•20 years ago
|
||
Reporter | ||
Updated•20 years ago
|
Attachment #139937 -
Flags: review?(rjc)
Reporter | ||
Comment 6•20 years ago
|
||
*** Bug 241228 has been marked as a duplicate of this bug. ***
This patch fixes the double-unescaping that the RDF parser does, so that things like & are interpreted correctly. Note that even with this patch, mozilla still will badly mangle wide characters in RDF data.
Attachment #196368 -
Flags: review?(axel)
Comment 8•19 years ago
|
||
*** Bug 300602 has been marked as a duplicate of this bug. ***
Comment 9•19 years ago
|
||
Comment on attachment 196368 [details] [diff] [review] This updates the already-attached bitrotten patch Let me go for a dive. The content sink stores the nsIURI, then, for each ID or other relative URL, get's the spec, creates a new URI object, calls into necko, which calls Resolve on the new nsIURI object? Sob. Please fix that by directly calling into mDocumentURL->Resolve, everything else is just keeping mad code alive. And you were as far as changing the signature anyway.
Attachment #196368 -
Flags: review?(axel) → review-
Comment 10•19 years ago
|
||
Comment on attachment 196368 [details] [diff] [review] This updates the already-attached bitrotten patch Let me go for a dive. The content sink stores the nsIURI, then, for each ID or other relative URL, get's the spec, creates a new URI object, calls into necko, which calls Resolve on the new nsIURI object? Sob. Please fix that by directly calling into mDocumentURL->Resolve in the three contentsink call sites, everything else is just keeping mad code alive. And you were as far as changing the signature of that "helper" anyway.
Comment 11•19 years ago
|
||
Comment on attachment 196368 [details] [diff] [review] This updates the already-attached bitrotten patch Let me go for a dive. The content sink stores the nsIURI, then, for each ID or other relative URL, get's the spec, creates a new URI object, calls into necko, which calls Resolve on the new nsIURI object? Sob. Please fix that by directly calling into mDocumentURL->Resolve in the three contentsink call sites, everything else is just keeping mad code alive. And you were as far as changing the signature of that "helper" anyway.
Assignee | ||
Comment 12•19 years ago
|
||
This patch is the same as the other one, but removes also rdf_MakeAbsoluteURI completely.
Attachment #196368 -
Attachment is obsolete: true
Attachment #199176 -
Flags: review?(axel)
Comment 13•19 years ago
|
||
Comment on attachment 199176 [details] [diff] [review] not bitrotten; uses nsIURL::Resolve >Index: nsRDFContentSink.cpp >=================================================================== >RCS file: /cvsroot/mozilla/rdf/base/src/nsRDFContentSink.cpp,v >retrieving revision 1.121.8.1 >diff -p -u -r1.121.8.1 nsRDFContentSink.cpp >--- nsRDFContentSink.cpp 30 Aug 2005 02:24:59 -0000 1.121.8.1 >+++ nsRDFContentSink.cpp 11 Oct 2005 16:50:04 -0000 >@@ -858,20 +853,15 @@ RDFContentSinkImpl::GetIdAboutAttribute( > if (aIsAnonymous) > *aIsAnonymous = PR_FALSE; > >- nsAutoString uri(aAttributes[1]); >- nsRDFParserUtils::StripAndConvert(uri); >- >- rdf_MakeAbsoluteURI(NS_ConvertUTF8toUCS2(docURI), uri); Check rdf_RequiresAbsoluteURI here? The cheapest way is probably to just call into gRDFService->GetResource inside the if and after it. >+ nsCAutoString uri; >+ rv = mDocumentURL->Resolve(NS_ConvertUCS2toUTF8(aAttributes[1]), uri); >+ if (NS_FAILED(rv)) return rv; > >- return gRDFService->GetUnicodeResource(uri, aResource); >+ return gRDFService->GetResource(uri, aResource); > } > else if (localName == kIdAtom) { > if (aIsAnonymous) > *aIsAnonymous = PR_FALSE; >- >- nsAutoString name(aAttributes[1]); >- nsRDFParserUtils::StripAndConvert(name); >- > // In the spirit of leniency, we do not bother trying to > // enforce that this be a valid "XML Name" (see > // http://www.w3.org/TR/REC-xml#NT-Nmtoken), as per >@@ -880,11 +870,12 @@ RDFContentSinkImpl::GetIdAboutAttribute( > // Construct an in-line resource whose URI is the > // document's URI plus the XML name specified in the ID > // attribute. >- name.Insert(PRUnichar('#'), 0); >- >- rdf_MakeAbsoluteURI(NS_ConvertUTF8toUCS2(docURI), name); >+ nsCAutoString name; >+ rv = mDocumentURL->Resolve(NS_LITERAL_CSTRING("#") + >+ NS_ConvertUCS2toUTF8(aAttributes[1]), name); nsCAutoString ref('#'); AppendUTF16toUTF8(aAttributes[1], ref); should be a tad faster. No need to call into RequiresAbsolu... here. >+ if (NS_FAILED(rv)) return rv; > >- return gRDFService->GetUnicodeResource(name, aResource); >+ return gRDFService->GetResource(name, aResource); > } > else if (localName == kAboutEachAtom) { > // XXX we don't deal with aboutEach... >@@ -923,17 +914,15 @@ RDFContentSinkImpl::GetResourceAttribute > // first thing that was specified and ignore the other. > > if (localName == kResourceAtom) { >- nsAutoString uri(aAttributes[1]); >- nsRDFParserUtils::StripAndConvert(uri); >- > // XXX Take the URI and make it fully qualified by > // sticking it into the document's URL. This may not be > // appropriate... >- nsCAutoString documentURL; >- mDocumentURL->GetSpec(documentURL); >- rdf_MakeAbsoluteURI(NS_ConvertUTF8toUCS2(documentURL), uri); >+ nsresult rv; check for rdf_RequiresAbsoluteURI here, too. >+ nsCAutoString uri; >+ rv = mDocumentURL->Resolve(NS_ConvertUCS2toUTF8(aAttributes[1]), uri); >+ if (NS_FAILED(rv)) return rv; > >- return gRDFService->GetUnicodeResource(uri, aResource); >+ return gRDFService->GetResource(uri, aResource); > } > } > return NS_ERROR_FAILURE; <...> >Index: rdfutil.cpp >=================================================================== >RCS file: /cvsroot/mozilla/rdf/base/src/rdfutil.cpp,v >retrieving revision 1.60 >diff -p -u -r1.60 rdfutil.cpp >--- rdfutil.cpp 9 Sep 2004 20:17:36 -0000 1.60 >+++ rdfutil.cpp 11 Oct 2005 16:50:05 -0000 >@@ -87,85 +87,13 @@ rdf_MakeRelativeRef(const nsCSubstring& > } > > static PRBool >-rdf_RequiresAbsoluteURI(const nsString& uri) >+rdf_RequiresAbsoluteURI(const nsACString& uri) nsCString is good enough these days. This isn't really for public consumption anyway. > { <...> >- >- return NS_OK; >+ return !(StringBeginsWith(uri, NS_LITERAL_CSTRING("urn:")) || >+ StringBeginsWith(uri, NS_LITERAL_CSTRING("chrome:")) || >+ StringBeginsWith(uri, NS_LITERAL_CSTRING("nc:"), drop the nc: one, we shouldn't have those. >+ nsCaseInsensitiveCStringComparator())); > } > > void >Index: rdfutil.h >=================================================================== >RCS file: /cvsroot/mozilla/rdf/base/src/rdfutil.h,v Expose rdf_RequiresAbsoluteURI here or move it over to the content sink completely. I guess I'd advocate to move it. I'm a yota away from r+ now, but I guess seeing the final patch won't hurt.
Attachment #199176 -
Flags: review?(axel) → review-
Assignee | ||
Comment 14•19 years ago
|
||
some shenanigans involving RequiresAbsoluteURI. Man, converting various strings is a pain in the butt.
Attachment #199191 -
Flags: review?(axel)
Assignee | ||
Comment 15•19 years ago
|
||
Comment on attachment 199191 [details] [diff] [review] the March of Patches continues... ping?
Comment 16•19 years ago
|
||
Comment on attachment 199191 [details] [diff] [review] the March of Patches continues... r=me with the following nit fixed, >Index: nsRDFContentSink.cpp >=================================================================== <...> >+ if (rdf_RequiresAbsoluteURI(relURI)) { >+ nsCAutoString uri; >+ rv = mDocumentURL->Resolve(NS_ConvertUCS2toUTF8(aAttributes[1]), uri); >+ if (NS_FAILED(rv)) return rv; >+ >+ return gRDFService->GetResource(uri, >+ aResource); >+ } else { >+ return gRDFService->GetResource(NS_ConvertUCS2toUTF8(aAttributes[1]), >+ aResource); >+ } and <...> >+ if (rdf_RequiresAbsoluteURI(relURI)) { >+ nsresult rv; >+ nsCAutoString uri; >+ >+ rv = mDocumentURL->Resolve(NS_ConvertUCS2toUTF8(aAttributes[1]), uri); >+ if (NS_FAILED(rv)) return rv; >+ >+ return gRDFService->GetResource(uri, aResource); >+ } else { >+ return gRDFService->GetResource(NS_ConvertUCS2toUTF8(aAttributes[1]), >+ aResource); >+ } > } No "else" after return, please.
Attachment #199191 -
Flags: superreview?(shaver)
Attachment #199191 -
Flags: review?(axel)
Attachment #199191 -
Flags: review+
Comment on attachment 199191 [details] [diff] [review] the March of Patches continues... sr=shaver
Attachment #199191 -
Flags: superreview?(shaver) → superreview+
Assignee | ||
Comment 18•19 years ago
|
||
This patch makes the requested change to the reviewed patch. I don't have commit access so I need someone to commit for me. Whoever does it should be sure to do a cvs remove on nsRDFParserUtils.cpp
Comment 19•19 years ago
|
||
gRDFService->GetResource(NS_ConvertUCS2toUTF8(aAttributes[1]) I'm not sure why this doesn't use GetUnicodeResource... but anyway, I checked this in as-is (and removed nsRDFParserUtils.*) fixed on trunk Checking in rdf/base/src/Makefile.in; /cvsroot/mozilla/rdf/base/src/Makefile.in,v <-- Makefile.in new revision: 1.36; previous revision: 1.35 done Checking in rdf/base/src/nsRDFContentSink.cpp; /cvsroot/mozilla/rdf/base/src/nsRDFContentSink.cpp,v <-- nsRDFContentSink.cpp new revision: 1.123; previous revision: 1.122 done Removing rdf/base/src/nsRDFParserUtils.cpp; /cvsroot/mozilla/rdf/base/src/nsRDFParserUtils.cpp,v <-- nsRDFParserUtils.cpp new revision: delete; previous revision: 1.17 done Removing rdf/base/src/nsRDFParserUtils.h; /cvsroot/mozilla/rdf/base/src/nsRDFParserUtils.h,v <-- nsRDFParserUtils.h new revision: delete; previous revision: 1.11 done Checking in rdf/base/src/rdfutil.cpp; /cvsroot/mozilla/rdf/base/src/rdfutil.cpp,v <-- rdfutil.cpp new revision: 1.61; previous revision: 1.60 done Checking in rdf/base/src/rdfutil.h; /cvsroot/mozilla/rdf/base/src/rdfutil.h,v <-- rdfutil.h new revision: 1.26; previous revision: 1.25 done
Comment 20•19 years ago
|
||
Rob, can you close this bug if it's fixed?
Comment 21•18 years ago
|
||
can someone close this if it should be closed?
Assignee: tingley → readams
Status: ASSIGNED → NEW
Flags: blocking1.9a1? → blocking1.9-
Reporter | ||
Comment 22•18 years ago
|
||
Closing as Fixed per Rob A's patch. Thanks for de-rotting this.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Comment 23•18 years ago
|
||
Is there a reason to not apply this to 1.8?
Comment 24•18 years ago
|
||
Nobody requested to land it, and IMHO it's too late to land it now.
Target Milestone: --- → mozilla1.9alpha
Comment 25•18 years ago
|
||
Fair enough - I applied it to the Flock tree since we were having problems.
Updated•6 years ago
|
Product: Core → Core Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•