Closed Bug 113400 Opened 23 years ago Closed 23 years ago

Investigate loading a light version of XHTML plus MathML DTD

Categories

(Core :: MathML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla0.9.8

People

(Reporter: hsivonen, Assigned: rbs)

References

Details

Attachments

(2 files, 7 obsolete files)

Since Mozilla doesn't support characters outside the Basic Multilingual Plane, Mozilla uses PUA assignment for some math chars. The current mathml.dtd approach looks like an interoperability problem. OTOH, loading the full XHTML 1.1 plus MathML 2.0 DTD looks like a performance problem. This bug is about investigating whether it makes sense to load a light version of the DTD that omits declarations that areof no interest to a non-validating parser.
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9.8
The spec-compliant DOCTYPE is: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN" "http://www.w3.org/TR/MathML2/dtd/xhtml-math11-f.dtd"> So the fix for this bug is to special-case MathML and substitute the URL to "kDTDDirectory/mathml.dtd" in the function where the DTD path is resolved -- the function where this happens is |IsLoadableDTD()| located in mozilla/htmlparser/src/nsExpatTokenizer.cpp.
Depends on: 105137
nsExpatTokenizer is not part of the build anymore. The place to change is now nsExpatDriver. I have made a version that relies on the URL of the DTD instead.
Attached patch proposed fix (obsolete) — Splinter Review
With the proposed fix, I am working in the error code path. So it would have little perf impact on the chrome for example. Not all URLs are tested against the special case. We get there in the course of checking for other errors.
I'd rather bind the DTD to the public id than the system id. The public id is unambiguous. However, authors are permitted to modify the system id to point to a copy of the DTD on their own servers. I took the official DTD (372 KB) and removed all the declarations and comments I thought were uninteresting for a non-validating system. I ended up with 45 KB exclusive of the character entity declarations that already exist in Mozilla's mathml.dtd. However, I'm getting syntax error but I haven't yet figured out why. (Having looked at the issue, I'm back to my old opinion that DTDs are problematic in the browser context in general. I'd prefer a solutions that permitted authors not to rely on external declarations in documents that are made available on the Web. However, due to the character issues, I think loading a local DTD for XHTML + MahtML makes sense. I only hope it doesn't cause problems later on.) By the way, I think it would be good to remove these lines from Mozilla's mathml.dtd: <!-- handy alias for the MathML namespace --> <!ENTITY mathml "http://www.w3.org/1998/Math/MathML"> Introducing an entity that isn't in the official DTD would cause an interoperability problem if authors referenced the entity in their documents.
Attached patch iteration to use the public id (obsolete) — Splinter Review
Iteration following henris' commments, and after digging bugzilla for general information to put the problem in a wider context. I have thus seen bug 113400, and have implemented something significantly less ambitious than bug 113400, but more general enough to to the point that the #ifdefs are of little signficance (SVG might need this too). As in the previous patch, it is a fallback route when the normal attempt failed.
Attachment #61072 - Attachment is obsolete: true
Attachment #64650 - Attachment is obsolete: true
Something that has bugged me is to how to make this forward compatible (without recompilation). But this may not matter in this context since the proper fix via bug 113400 would allow external extensible catalog files anyway.
rbs: this _is_ bug 113400. Which bug did you mean?
...stealing the bug to drive the checkin effort. >However, due to the character issues, I think loading >a local DTD for XHTML + MahtML makes sense. I only hope it doesn't cause >problems later on C.f. the catalog spec in bug bug 98413. That spec is exactly about remapping public IDs to more friendly URIs. So the proposed fix is a tiny step in the right direction (although my ultimate preference would also be to have an efficient DTD cache as I advocated in the long standing bug 44458). harishd/jst, like the patch? Hot r/sr before the impending closure of the trunk?
Assignee: henris → rbs
Status: ASSIGNED → NEW
bbaetz, the proposed fix would be sufficient to cover SVG too for now, right?
Attachment #64747 - Attachment is obsolete: true
Untested, but yes. harishd just removed the svg stuff which did this, because it borke ports with his lastest patch (I haven't looked into that yet). This looks fine, htough. I'd load a dtd for a few of the old CR public ids which did the namespace stuff, but leave the "official" one out, forcing the explicit namespacing.
Attached patch patch (obsolete) — Splinter Review
Here is a tentative fix to patch up SVG. I haven't tested this nor have compiled (!!!!) because the tree that I have on my home computer is very old and therefore wasn't able to compile. Would appreciate if someone could test this patch for me. Thanx.
harishd, take a look at my patch. It aims at addressing the problem in a more general way and doesn't suffer from the drawback of testing every URL against the special-cases before trying to open the URL.
rbs: Just chated with Bradley and he pointed that out.
rbs: Are you aiming to get this in before the tree closure? If so, then I'll do the extra svg bits. Otherwise I'll test harish's patch once my build finishes.
Yes, I at aiming at checkin it sraight after geting r/sr. Just post the mappings that you want for SVG and I will submit another patch with your catalog entries as well.
I need a mapping to a dtd containing: <!ATTLIST svg xmlns CDATA #FIXED "http://www.w3.org/2000/svg" > for public ids: "-//W3C//DTD SVG 20001102//EN" We won't map the CR one, and see if anyone complains - I'd prefer to be standards compilent unless there is too much other content out there.
>We won't map the CR one, and see if anyone complains Do you want a partial match like in the old code? Feel free to ask. (I have been wondering about forward compatibility too.) I can change the struct to: struct nsCatalogEntry { const char* mPublicID; const char* mLocalDTD; eMatchMode {full, partial} // and use Find() == 0 in case of 'partial' };
Attached patch patch to cater for SVG as well (obsolete) — Splinter Review
So all that remains is to create the svg.dtd file to be installed via the makefiles in the bin/dtd. No code-level changes will be required, and the svg.dtd will contain the single line: <!ATTLIST svg xmlns CDATA #FIXED "http://www.w3.org/2000/svg" >
Attachment #64965 - Attachment is obsolete: true
Attachment #64979 - Attachment is obsolete: true
Comment on attachment 64992 [details] [diff] [review] patch to cater for SVG as well Ready for r/sr, tuning/tweaking can come later...
We probably don't want a partial match. That code was written before teh final spec was release - all previous versions (including the final CR) implied that it wasn't needed, and even the svg testsquite didn't have it. The change (to complie with the MXL namespace spec) was added later I'm only matching against the previous public id, since the final CR draft used 1.0 identifier too. If this breaks stuf, we can easily revisti this to match the 1.0 id as well.
Attached patch same patch - with more comments (obsolete) — Splinter Review
Since partial match is not needed, the original patch is good. I just added more comments. I have seen many queries about this special DTD directory, it is reasonably extensible as it stands but this hasn't been well advertised. I added the kind of comments I would give to users asking about it.
Attachment #64992 - Attachment is obsolete: true
Comment on attachment 65057 [details] [diff] [review] same patch - with more comments In RemapDTD(), don't flag the lack of a mapping from public id to dtd filename with an *error* code (NS_ERROR_NOT_AVAILABLE), the lack of a mapping is not an error. I know we do this all over, but it's _wrong_. Just pass back an empty string if there's no mapping, use the nsresults for what they're intended to be used for, not to flag yes/no type things. Actually, since there's no real error codes ever returned from RemapDTD() you could even make it a void method. sr=jst with that fixed.
Attachment #65057 - Flags: superreview+
Attached patch updated patchSplinter Review
I updated to patch following jst's comments.
Attachment #65057 - Attachment is obsolete: true
Comment on attachment 65076 [details] [diff] [review] updated patch sr=jst
Attachment #65076 - Flags: superreview+
harishd, r?
Comment on attachment 65076 [details] [diff] [review] updated patch >+ const nsCatalogEntry* data = kCatalogTable; >+ while (data->mPublicID) { Will data always be true? >+ res = NS_NewURI(getter_AddRefs(dtdURI), dtdFile.GetURLString()); >+ if (NS_SUCCEEDED(res) && dtdURI) { Is the check for res necessary? What kind of performance tests ( Ts/Tp/Txul ) have you done?
> >+ const nsCatalogEntry* data = kCatalogTable; > >+ while (data->mPublicID) { >Will data always be true? |data| will always be non null, for it is initialized from a static constant array. Then, when the field |data->mPublicID| is null, it marks the end of the array. > >+ res = NS_NewURI(getter_AddRefs(dtdURI), dtdFile.GetURLString()); > >+ if (NS_SUCCEEDED(res) && dtdURI) { > Is the check for res necessary? Not really. That was just the old code that was re-indended due to my addition of another if-clause. I can get rid of when checking in. > What kind of performance tests ( Ts/Tp/Txul ) have you done? Unfortunately, my buid environment doesn't have all the nifty perf tools that you people have. I don't anticipate a perf degradation and will be willing to backout and go to back to the drawing board if problems develop. However, I don't see a possible perf spot at the moment since that code is still in the fallback path in case of errors. The chrome will continue to fly as before, plus the original test was always failing (the special dtd directory has always been empty since mathml.dtd has been the only file installed there by default -- and this happens in MathML-enabled builds only).
Status: NEW → ASSIGNED
Comment on attachment 65076 [details] [diff] [review] updated patch r=harishd
Attachment #65076 - Flags: review+
Fix landed.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: