Closed
Bug 225667
Opened 22 years ago
Closed 21 years ago
various HTML constructs are mangled by composer
Categories
(SeaMonkey :: Composer, defect)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 141338
People
(Reporter: ddyer, Unassigned)
Details
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5b) Gecko/20030827
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5b) Gecko/20030827
Some HTML constructs that are interpreted correctly by Mozilla are
mangled by Composer, which means that if you pass a page through Composer,
it is damaged and no longer works. Here's a sample page which exhibits
two independent cases of this. Here is the sample text:
<applet ><!--#include virtual="/cgi-bin/gs_AppletTag.cgi" --> height="550"
width="500" >
<param name=test value=test>
</applet>
<table><caption>Demonstrate that tag structure is mangled</caption>
<form action=/cgi-bin/process.pl>
<tr><td>row 1 1</td></tr>
<tr><td>row 2 1</td></td>
<input type=submit>
</form>
</table>
There are two "odd" things about this. First, the <applet .. > tag has
a server-side include for Apache. In the result, a new > is added
in the wrong place, and the real > is converted to &gt;. Second, in the
original, the <form> ... </form> construct spans a table <tr></tr> pair.
This works perfectly, but Composer rewrites it with a new </form>
immediately after the <form>. This destroys the form.
Reproducible: Always
Steps to Reproduce:
1. Save the sample text as a document. View it in Mozilla.
2. View source; note that all is well.
3. Use "Edit Page", then "View Source" in Composer.
4. Note the new </applet> and </form> tags added.
Actual Results:
<html>
<head>
</head>
<body>
<applet =""><!--#include virtual="/cgi-bin/gs_AppletTag.cgi" -->
height="550" width="500" >
<param value="test" name="test"></applet>
<input type="submit">
<table>
<caption>Demonstrate that tag structure is mangled</caption>
<form action="/cgi-bin/process.pl"></form>
<tbody>
<tr>
<td>row 1 1</td>
</tr>
<tr>
<td>row 2 1</td>
</tr>
</tbody>
</table>
</body>
</html>
Expected Results:
I expect the output to have the same tree structure as the input,
at least if the input tree is well formed, as is the case here.
The worst thing about this bug is that the damage is pretty silent, and
could easily not be noticed until long after the page was edited.
Comment 1•22 years ago
Since the test steps don't involve any actual editing, I believe this means it
is a serializer problem.
Assignee: composer → dom-to-text
Component: Editor: Composer → DOM to Text Conversion
Comment 2•22 years ago
Um... you are loading something that's not HTML. We parse it into a DOM. Then
when you save, we serialize out the DOM.
If you give an app that processes HTML something that's not HTML, it'll try its
best to make it look like HTML, which is what we do.
This is a dup. And it's not a DOM-to-Text bug, since given the DOM we have the
conversion is exactly correct.
Assignee: dom-to-text → composer
Component: DOM to Text Conversion → Editor: Composer
Whiteboard: DUPEME
Comment 3•22 years ago
The testcase seems confusing. Is the first snippet the input or the output? It
looks like a mixture of the two.
And if there's any "exactly correct" conversion of the broken form tag, then
according to a little-implemented corner of the HTML spec, it's something like this:
<form action="">cgi-bin</form>process.pl>
Comment 4•22 years ago
For what it's worth, the greater-than character >
is also converted to &gt; (in some cases) when
attempting to send properly constructed JavaScript in email.
This would point to a common problem in the serializer, no?
Comment 5•22 years ago
No, that would point to a totally different issue from this bug.
Comment 6•22 years ago
What is the doctype associated with the snippet in the original bug description?
Is the comment with the #include correctly formatted or does it need a space
before the #? Is there an extra > in the applet tag's line?
The original documents are plain HTML. There are no "extra"
brackets in the source.
In the case of the applet tag, embedded < > inside the tag is a
syntactically correct comment, which should be parsed and passed
through to the output.
In the case of the table tag, all the constructs are well formed
and properly nested, but the sequence of constructs doesn't match
some preconvieved idea of what can be found below a <table> tag.
The REALLY unacceptable aspect of this bug is that it damages
pages with no warning. An acceptable halfway measure would be to
at least complain that the input was ill-formed and the output is
possibly damaged.
Comment 8•22 years ago
Stewart: NET only applies to tags with no attribute specification list, and the
presence of "action=" would presumably initiate recognition of aforesaid, but
good guess.
"In the case of the applet tag, embedded < > inside the tag is a syntactically
correct comment, which should be parsed and passed through to the output."
No, it isn't. While it would be syntactically correct in document content,
comments are not permitted within start-tags (or end-tags, for that matter).
"...but the sequence of constructs doesn't match some preconvieved [sic] idea of
what can be found below a <table> tag."
Right, that's called a "DTD", and it's par for the course in HTML (although our
parser is considerably more lenient than the standard in that respect). Because
well-formedness is not required in HTML, some limits on the arrangement of tags
are required.
Bear in mind that whatever source is output will produce exactly the same
results in the browser as would feeding it your original source (with the
exception in this case of the SSI processing). As far as generating warnings, I
think that would effectively require a validating parser attached to Composer,
and I don't think anyone's prepared to undertake that level of architectural
rearrangement in the near future.
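A full validating parser is not the only way to produce a warning. As a much narrower illustration, the sketch below flags only the one construct discussed here: an SSI directive written inside a start tag. It is a hypothetical pre-flight check in Python, not anything Composer does; the function name `ssi_inside_start_tag` is invented for this example, and the bracket scan deliberately ignores ordinary comments, scripts, and quoted attribute values that contain angle brackets.

```python
import re

# Matches an Apache-style SSI directive written as an HTML comment.
SSI_DIRECTIVE = re.compile(r"<!--#.*?-->", re.DOTALL)


def ssi_inside_start_tag(source: str) -> list[str]:
    """Return the SSI directives that appear inside a still-open start tag."""
    # Blank out every directive (keeping the length) so that its own "<" and
    # ">" characters do not confuse the bracket scan below.
    masked = SSI_DIRECTIVE.sub(lambda m: " " * len(m.group()), source)
    hits = []
    for match in SSI_DIRECTIVE.finditer(source):
        before = masked[: match.start()]
        # If the nearest bracket to the left is "<", a tag has not been closed.
        if before.rfind("<") > before.rfind(">"):
            hits.append(match.group())
    return hits


# The markup as first posted (stray ">" after "<applet") and the variant
# without the stray ">", where the directive sits inside the tag.
as_posted = '<applet ><!--#include virtual="/cgi-bin/gs_AppletTag.cgi" --> height="550" width="500" >'
inside_tag = '<applet <!--#include virtual="/cgi-bin/gs_AppletTag.cgi" --> height="550" width="500" >'

print(ssi_inside_start_tag(as_posted))   # no directive is flagged
print(ssi_inside_start_tag(inside_tag))  # the #include directive is flagged
```

A check like this could only warn about this specific pattern; it says nothing about whether the rest of the page matches the DTD.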
Comment 9•22 years ago
> Bear in mind that whatever source is output will produce exactly the same
> results in the browser as would feeding it your original source
Actually, not. Not with the screwed up form nesting. Note that the input is no
longer inside the form in the result (we keep out-of-band info on what input
goes with what form in cases like that that's lost at serialization time).
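The change in form nesting can be seen from the source text alone. The following is a minimal Python sketch, assuming only the standard library's `html.parser` tokenizer, which applies no table or form fix-ups. It records whether a `<form>` is open at the point where each `<input>` appears. The class and function names are hypothetical, and the check looks only at source nesting; it does not model the out-of-band form association that the browser keeps.

```python
from html.parser import HTMLParser

# Elements that never take an end tag, so they are not pushed on the stack.
VOID = {"input", "param", "br", "img", "hr", "meta", "link"}


class FormOwnership(HTMLParser):
    """Record, for each <input>, whether a <form> is open around it."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.inputs_in_form = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            self.inputs_in_form.append("form" in self.open_tags)
        if tag not in VOID:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Ignore stray end tags; otherwise close back to the matching start tag.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass


def inputs_in_form(source):
    checker = FormOwnership()
    checker.feed(source)
    checker.close()
    return checker.inputs_in_form


original = """<table><caption>demo</caption>
<form action=/cgi-bin/process.pl>
<tr><td>row 1 1</td></tr>
<input type=submit>
</form>
</table>"""

rewritten = """<input type="submit">
<table><caption>demo</caption>
<form action="/cgi-bin/process.pl"></form>
<tbody><tr><td>row 1 1</td></tr></tbody>
</table>"""

print(inputs_in_form(original))   # [True]
print(inputs_in_form(rewritten))  # [False]
```

Comparing the two lists before and after a save is one way such damage could be noticed early instead of long after the page was edited.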
Comment 10•22 years ago (Reporter)
> Bear in mind that whatever source is output will produce exactly the same
> results in the browser as would feeding it your original source
Actually, not in either case. The <applet > tag that emerges is completely
mangled, and the intended side effect of the comment is completely lost,
since the comment is gone and would have been interpreted by Apache.
Comment 11•22 years ago
(In reply to comment #8)
> Stewart: NET only applies to tags with no attribute specification
> list
What on earth is NET?
> "In the case of the applet tag, embedded < > inside the tag is a
> syntactically correct comment, which should be parsed and passed
> through to the output."
There's no embedded < > in the reporter's example, unless the
immediately preceding > somehow counts as 'embedded'.
> Because well-formedness is not required in HTML, some limits on the
> arrangement of tags are required.
What is meant by well-formedness, exactly?
Comment 12•22 years ago
(In reply to comment #11)
> What on earth is NET?
"Null End Tag". The thing that says <a /> and <a></a>> are the same thing.
> There's no embedded < > in the reporter's example, unless the
> immediately preceding > somehow counts as 'embedded'.
Actually, the original example has a stray '>' after '<applet' that needs to be
removed to get the mangling described under "actual results". So we're looking
at markup like:
<applet <!--#include virtual="/cgi-bin/gs_AppletTag.cgi" --> height="550"
width="500" >
> What is meant by well-formedness, exactly?
Choess meant well-formedness as defined in the XML 1.0 spec.
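To make the distinction concrete, here is a small Python sketch that tests fragments for XML 1.0 well-formedness using the standard library's `xml.etree.ElementTree`. The helper name `is_well_formed` is made up for this illustration. Note that this checks well-formedness only; it is not validation against an HTML DTD, which is a separate question.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as a well-formed XML document."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        return False
    return True


# Properly nested tags and quoted attribute values are well-formed.
print(is_well_formed("<tr><td>row 2 1</td></tr>"))                   # True
print(is_well_formed('<form action="/cgi-bin/process.pl"></form>'))  # True

# A mismatched end tag or an unquoted attribute value is not.
print(is_well_formed("<tr><td>row 2 1</td></td>"))                   # False
print(is_well_formed("<form action=/cgi-bin/process.pl></form>"))    # False
```

By this definition, parts of the sample text as posted (the unquoted attribute values and the second table row) are not well-formed, even though an HTML parser accepts them.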
Updated•21 years ago
Product: Browser → Seamonkey
Comment 13•21 years ago
Sounds like bug 141338 to me, and bz.
*** This bug has been marked as a duplicate of 141338 ***
Status: UNCONFIRMED → RESOLVED
Closed: 21 years ago
Resolution: --- → DUPLICATE