Closed
Bug 424359
Opened 17 years ago
Closed 16 years ago
Html serializer : bugs with OutputBodyOnly and OutputLF/CRLineBreak flags
Categories
(Core :: DOM: Serializers, defect)
Core
DOM: Serializers
Tracking
()
RESOLVED
FIXED
People
(Reporter: laurent, Assigned: laurent)
References
Details
Attachments
(2 files, 9 obsolete files)
79.52 KB,
patch
|
bzbarsky
:
superreview+
|
Details | Diff | Splinter Review |
3.53 KB,
patch
|
bzbarsky
:
superreview+
|
Details | Diff | Splinter Review |
The HTML serializer contains two bugs: 1) with nsIDocumentEncoder::OutputBodyOnly, there is an </html> tag at the end of the string 2) with nsIDocumentEncoder::OutputCRLineBreak, some end of line are not replaced by a CR (ex: end of lines after the doctype, inside comments).
Assignee | ||
Updated•17 years ago
|
Status: NEW → ASSIGNED
Assignee | ||
Comment 1•17 years ago
|
||
Here is a first version of the patch. I added some mochitests (html serializer) and xpcshell test (xml serializer). To resolve the second bug, I modified the XML serializer. So now, the XML serializer supports OutputCR/LFLineBreak flags ;-) I will submit a second version, because I think my tests don't pass on plateforms like Mac and Windows, because of the default line break used on them.
Assignee: nobody → laurent
Assignee | ||
Comment 2•17 years ago
|
||
Here is the updated patch, ready for a review.
Attachment #310987 -
Attachment is obsolete: true
Attachment #312721 -
Flags: superreview?(peterv)
Attachment #312721 -
Flags: review?(jonas)
Assignee | ||
Updated•16 years ago
|
Flags: wanted1.9.0.x?
Assignee | ||
Updated•16 years ago
|
Attachment #312721 -
Flags: review?(jonas) → review?(Olli.Pettay)
Comment 3•16 years ago
|
||
Comment on attachment 312721 [details] [diff] [review] version 2 of the patch >@@ -855,16 +834,20 @@ nsHTMLContentSerializer::AppendElementEn > AppendToString(mLineBreak, aStr); > mMayIgnoreLineBreakSequence = PR_TRUE; > mColPos = 0; > } > else { > MaybeFlagNewline(aElement); > } > >+ if (name == nsGkAtoms::body) { >+ mInBody = PR_FALSE; >+ } >+ Mainly because of this change, could you add tests which have more than one <body> element in the document. And what happens if there are nested <body> elements? (that may happen with non-valid-XHTML or if someone modifies the DOM.) >+++ content/base/test/test_htmlserializer_1.html 2008-03-31 15:12:28.000000000 +0200 >@@ -0,0 +1,157 @@ >+<!DOCTYPE HTML> >+<html> >+<!-- >+--> >+<head> >+ <title>Test for HTML serializer</title> >+ <script type="text/javascript" src="/MochiKit/MochiKit.js"></script> >+ <script type="text/javascript" src="/tests/SimpleTest/SimpleTest.js"></script> >+ <link rel="stylesheet" type="text/css" href="/tests/SimpleTest/test.css" /> >+</head> >+<body> >+<!--<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=">Mozilla Bug </a>--> Could you please rename this file to test_bug424359-1.html and make the link point to this bug. >+++ content/base/test/test_htmlserializer_2.html 2008-03-31 15:12:24.000000000 +0200 >@@ -0,0 +1,96 @@ >+<!DOCTYPE HTML> >+<html> >+<!-- >+--> >+<head> >+ <title>Test HTML serializer with entities</title> >+ <script type="text/javascript" src="/MochiKit/MochiKit.js"></script> >+ <script type="text/javascript" src="/tests/SimpleTest/SimpleTest.js"></script> >+ <link rel="stylesheet" type="text/css" href="/tests/SimpleTest/test.css" /> >+</head> >+<body> >+<!--<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=">Mozilla Bug </a>--> This file could be then test_bug424359-2.html
Assignee | ||
Comment 4•16 years ago
|
||
This new version fixes some issues when there are several body elements * if body elements are nested, only the top body element is generated. However, the content of child body elements are generated inside the top body element (in fact, I follow the same behavior of the HTML parser) * if there are two sibling body elements, only the first one is generated. I renamed also the two test files.
Attachment #312721 -
Attachment is obsolete: true
Attachment #328506 -
Flags: review?(Olli.Pettay)
Attachment #312721 -
Flags: superreview?(peterv)
Attachment #312721 -
Flags: review?(Olli.Pettay)
Comment 5•16 years ago
|
||
Comment on attachment 328506 [details] [diff] [review] Version 3 of the patch Please use -p with diff. > if (name == nsGkAtoms::body) { >- mInBody = PR_TRUE; >+ mInBody ++; >+ if (mInBody > 1) { >+ // we ignore nested body elements >+ return NS_OK; >+ } >+ if (mAlreadyHaveBody) { >+ return NS_OK; >+ } Extra space between mInBody and ++; (and I'd prefer ++mBody) >+ if (name == nsGkAtoms::body && mInBody > 1) { >+ // we ignore nested body elements >+ mInBody--; >+ return NS_OK; >+ } Could this be if (name == nsGkAtoms::body && --mInBody > 1) { // we ignore nested body elements return NS_OK; } Then you don't need to set mInBody explicitly to 0 later in the method. But still some comments about the approach. (In reply to comment #4) > * if body elements are nested, only the top body element is generated. However, > the content of child body elements are generated inside the top body element > (in fact, I follow the same behavior of the HTML parser) Sounds reasonable, though to keep the patch as simple as possible and not change the behavior too much, perhaps serializer could create those inner bodys too. > * if there are two sibling body elements, only the first one is generated. This is tricky. What would be great, IMO, was to be able to push the content of latter sibling body elements to the first body, but that would require quite large changes. What is the current behavior? Generate all the sibling bodys? Maybe there isn't need to change that. So to fix this bug, is it enough to use the version 2 patch but change mInBody to PRInt32 and increase and decrease it when needed?
Assignee | ||
Comment 6•16 years ago
|
||
In my opinion, I think this isn't the role of the serializer to fix problems with invalid elements. I we start to try to fix some things like I do in my patch with the body element, then why not to verify other things like H1 inside P, DIV inside SPAN, P as child of UL etc. ? It would be too complex, and It would be outside of the goals of a serializer. The serializer should only take care about the syntax it generates, so it should provide a valid SGML document or a valid XML document. So ok, I can remove behaviors I added, so nested body elements or sibling body elements will be generated, like in the current behavior. So no need to change mInBody to PRInt32, and changes made in the version 2 of the patch are enough. If you are ok, I will make a new patch with new unit tests.
Comment 7•16 years ago
|
||
(Argh, I wasn't CC'd to this bug) (In reply to comment #6) > So ok, I can remove behaviors I added, so nested body elements or sibling body > elements will be generated, like in the current behavior. So no need to change > mInBody to PRInt32, and changes made in the version 2 of the patch are enough. > > If you are ok, I will make a new patch with new unit tests. As far as I see, there is still need to change mInBody ot PRInt32. Otherwise you set mInBody to PR_FALSE too early (when closing the innermost <body> in the case of nested <body>s.)
Assignee | ||
Comment 8•16 years ago
|
||
Yes, you're right. I'm going to make a new patch today.
Assignee | ||
Comment 9•16 years ago
|
||
Here is the updated patch.
Attachment #329680 -
Flags: review?(Olli.Pettay)
Assignee | ||
Updated•16 years ago
|
Attachment #328506 -
Attachment is obsolete: true
Attachment #328506 -
Flags: review?(Olli.Pettay)
Assignee | ||
Comment 10•16 years ago
|
||
Sorry for the spam. I forgot to create the diff with the -p flag.
Attachment #329680 -
Attachment is obsolete: true
Attachment #329683 -
Flags: review?(Olli.Pettay)
Attachment #329680 -
Flags: review?(Olli.Pettay)
Comment 11•16 years ago
|
||
Comment on attachment 329683 [details] [diff] [review] Version 4 of the patch with function names I guess this is ok.
Attachment #329683 -
Flags: review?(Olli.Pettay) → review+
Assignee | ||
Updated•16 years ago
|
Attachment #329683 -
Flags: superreview?(jst)
Comment 12•16 years ago
|
||
Not sure we actually want this in 1.9.0.x (the stable branch), but probably in 1.9.1?
Flags: wanted1.9.1?
Whiteboard: [needs sr jst]
Assignee | ||
Updated•16 years ago
|
Component: XML → Serializers
QA Contact: xml → dom-to-text
Assignee | ||
Comment 13•16 years ago
|
||
Same patch, but the applying of the previous one fails on the current trunk, because of a change in a C++ comment.
Attachment #329683 -
Attachment is obsolete: true
Attachment #329683 -
Flags: superreview?(jst)
Assignee | ||
Updated•16 years ago
|
Attachment #342864 -
Flags: superreview?(bzbarsky)
Comment 14•16 years ago
|
||
Comment on attachment 342864 [details] [diff] [review] same patch updated for the current trunk So... it seems like some of the "HTML output only" comments describe things as they are, whereas other are things as they "should" be (that is, some things are HTML-only for good reasons, and some just happen to be that way for now). I'd rather not document the things that are HTML-only by accident as being HTML-only... Or at least make it very clear that this can change. Past that, please remove the random whitespace changes and please use -p when diffing like Olli asked you to. I'm not going to spend time reviewing this patch in its current state because it'll be so much easier to review with the above two changes...
Assignee | ||
Comment 15•16 years ago
|
||
Sorry Boris for the format of the patch. I currently have my patch in a hg queue, and it seems that hg qrefresh doesn't use the diff parameters from my .hgrc. I didn't know it and I forgot to verify the output format. I removed also the random whitespace changes.
Attachment #342864 -
Attachment is obsolete: true
Attachment #342882 -
Flags: superreview?(bzbarsky)
Attachment #342864 -
Flags: superreview?(bzbarsky)
Comment 16•16 years ago
|
||
Ah, for mq you want to do: [diff] git = 1 showfunc = true unified = 8 in your .hgrc. That will make things happy. Looking at the patch now.
Comment 17•16 years ago
|
||
Comment on attachment 342882 [details] [diff] [review] same patch with -p -U 8 >+++ b/content/base/src/nsHTMLContentSerializer.h >+ PRInt32 mInBody; Probably better as PRUint32. >+++ b/content/base/src/nsXMLContentSerializer.h >+ // current colonne position s/colonne/column/ > PRPackedBool mAddNewline; >+ > }; No need for that blank line With those nits, looks great. Thanks for writing all those tests!
Attachment #342882 -
Flags: superreview?(bzbarsky) → superreview+
Comment 18•16 years ago
|
||
I pushed changeset 43d8e57c41e9 with those three things changed.
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Comment 19•16 years ago
|
||
Backed out, because this caused test failures on Windows. Note that I left the test files in the tree and just backed out the file changes (by accident, mostly). The failures seem to be due to the XML serializer changes, at least in part: TEST-UNEXPECTED-FAIL | ../../../_tests/xpcshell-simple/content/unit/test_xmlserializer.js | test failed, see log NEXT ERROR *** TEST-UNEXPECTED-FAIL | ../../../_tests/xpcshell-simple/content/unit/test_xmlserializer.js | <?xml version="1.0" encoding="UTF-8"?> <foo/> <!-- é --> == <?xml version="1.0" encoding="UTF-8"?> <foo/> <!-- é --> A number of mochitest failures also, in both test_bug424359-1.html and test_bug424359-2.html. See log at http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1223917750.1223920953.6499.gz
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 20•16 years ago
|
||
This new version includes Boris's changes, plus some changes in tests: I added the use of the OutputLFLineBreak flag to all tests, so the encoder won't use the default line break of the platform. I think tests will then pass on windows. I haven't a windows system here at home to verify it, but I could test at office tomorrow morning...
Attachment #342882 -
Attachment is obsolete: true
Attachment #342941 -
Flags: superreview?(bzbarsky)
Comment 21•16 years ago
|
||
I pushed that patch to the try server with the label "CRLFStuff" so we can try to get some data on it.
Comment 22•16 years ago
|
||
Er.. except try server doesn't run unit tests. :(
Assignee | ||
Comment 23•16 years ago
|
||
Thanks to a patch which fixes a VC7.1 bustage, and which has just landed into the trunk, I finally could compile on my window desktop at the office :-) On this machine, all tests pass with my latest patch (mochitest + xpcshell). And on my linux box, same result: all is green.
Updated•16 years ago
|
Attachment #342941 -
Flags: superreview?(bzbarsky) → superreview+
Comment 24•16 years ago
|
||
Pushed that last diff as changeset 9fde2bdd93ef.
Status: REOPENED → RESOLVED
Closed: 16 years ago → 16 years ago
Resolution: --- → FIXED
Comment 25•16 years ago
|
||
backed out 9fde2bdd93ef due to unit test failures
Comment 26•16 years ago
|
||
(In reply to comment #25) > backed out 9fde2bdd93ef due to unit test failures It caused Windows FF+TB+SM to fail at http://mxr.mozilla.org/mozilla-central/source/content/test/unit/test_xml_serializer.js?mark=276#265
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 27•16 years ago
|
||
Here is an additionnal patch which fixes the test. I launch xpcshell tests in all directories, mochitests in all directories, and it's ok. Some tests failed on my machine but apparently, it is not related to the serialization, it's mainly on some tests which do some calculation (on the layout, SVG...), and some others about network apis... I hope it will be enough. I'm sorry for these failures :-(
Attachment #343551 -
Flags: superreview?(bzbarsky)
Comment 28•16 years ago
|
||
Comment on attachment 343551 [details] [diff] [review] patch for failed tests >+var LB; Please add some comments about what this variable really means? And no need for two newlines after; one is enough. >+ if(("@mozilla.org/windows-registry-key;1" in C) || ("nsILocalFileOS2" in I)) Gah. Can you file a followup bug on having a way to get the platform linebreak string from some service and replacing this ickiness with that?
Assignee | ||
Comment 29•16 years ago
|
||
Yes, I agree this plateform test is ugly. I could use the app-info component with its nsIXULRuntime interface, and add a new property on it (I filled bug 460573), or use at least its OS property. But according to the documentation, http://developer.mozilla.org/En/Writing_xpcshell-based_unit_tests , it is not always available in xpcshell (someone on #developers told me the same thing), although I can use it on my build on windows, and it is used in this test : uriloader/exthandler/tests/mochitest/handlerApps.js So, is the documentation deprecated or is there really a risk to use it in xpcshell ?
Comment 30•16 years ago
|
||
bsmedberg might be able to answer the nsIXULRuntime question.... at the very least, I suspect it's only available if you are running a toolkit app. Mochitests run in browser, so that works. You're writing an xpcshell test, though, so I suspect it's not guaranteed to work.
Assignee | ||
Comment 31•16 years ago
|
||
Thanks Boris, I will ask him. Here is the patch with a new comment on the variable.
Attachment #343551 -
Attachment is obsolete: true
Attachment #343885 -
Flags: superreview?(bzbarsky)
Attachment #343551 -
Flags: superreview?(bzbarsky)
Comment 32•16 years ago
|
||
s/use/uses/, and I'd put the comment where the variable is declared, not where it's assigned.
Comment 33•16 years ago
|
||
nsIXULRuntime is not available by default in xpcshell, I think... but there's no reason you couldn't fail gracefully back to unix-style line endings in that case, is there?
Comment 34•16 years ago
|
||
The problem is that the serializer will create platform line-breaks, period. The reference string needs to have them too, therefore.
Assignee | ||
Comment 35•16 years ago
|
||
I changed the comments.
Attachment #343885 -
Attachment is obsolete: true
Attachment #344043 -
Flags: superreview?(bzbarsky)
Attachment #343885 -
Flags: superreview?(bzbarsky)
Updated•16 years ago
|
Attachment #344043 -
Flags: superreview?(bzbarsky) → superreview+
Comment 36•16 years ago
|
||
Pushed changeset f19fc94f348b
Status: REOPENED → RESOLVED
Closed: 16 years ago → 16 years ago
Resolution: --- → FIXED
Whiteboard: [needs sr jst]
Comment 37•16 years ago
|
||
Unless I'm missing some baby-killing aspect of this bug it doesn't sound like a level of risk we'd want to take in a 1.9.0.x security update. Fixed in FF3.1
Flags: wanted1.9.1?
Flags: wanted1.9.0.x?
Flags: wanted1.9.0.x-
You need to log in
before you can comment on or make changes to this bug.
Description
•