User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:184.108.40.206) Gecko/20100722 Firefox/3.6.8 ( .NET CLR 3.5.30729)
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:220.127.116.11) Gecko/20100722 Firefox/3.6.8 ( .NET CLR 3.5.30729)
When inserting text or html code into an iframe in designmode using execCommand/inserthtml, the content is always inserted as a top-level element under body, i.e. all open DOM nodes are closed before the insertion point and copies of the closed DOM nodes are created after the insertion (i.e. a whole DOM tree gets duplicated!).
It causes major havoc for rich text editors such as the Wikipedia editor wikEd. The duplication of the nodes is also extremely problematic because it creates multiple nodes with the same id (!) with the potential to break scripts. Therefore, it appears to be a real bug and not a well thought-out feature.
The correct (and previous) solution would be to insert at the text/html at exactly the DOM level of the insertion point. If for some reason (not sure which) a script needs to force insertions at the top-level, this could always be done manually in the code.
Here is a way to test this:
1. Go to a rich text editor such as http://www.kevinroth.com/rte/demo.htm
2. Use the preloaded content or paste from a web page
3. Use the DOM inspector to see the DOM tree
4. Let the rich text editor add some content into a deep element, e.g. a special character into italic text
5. Use the DOM inspector to see the DOM tree after the change
6. The new content is added on the top level, all elements are closed before the insertion and reopened as duplicates after the insertion
Unfortunately, there is no work-around for this behaviour without complex and major DOM reorganization after the insertion that will break the built-in undo/redo history.
This bug came with a recent release of Firefox, perhaps with 3.6 (?).
Can you please give me more information on how to reproduce this bug? I tried to get that information from comment 0 and comment 1, but I couldn't reproduce it. Also, please test this in Firefox 4.
(In reply to Ehsan Akhgari [:ehsan] from comment #2)
> Can you please give me more information on how to reproduce this bug? I
> tried to get that information from comment 0 and comment 1, but I couldn't
> reproduce it. Also, please test this in Firefox 4.
Using FireFox 16.0.2:
1. Open the demo page located at http://www.kevinroth.com/rte/demo.htm
2. Right-click in the middle of the word 'preloaded' and click Inspect Element. Note that only a single em tag exists.
3. Close the element inspector.
4. Left click in the middle of the word 'preloaded'
5. Click the Insert Special Character button in the edit toolbar and insert any character. (You must allow the pop-up to select the character)
6. Right-click in the middle of the word 'preloaded' and click Inspect Element. Note that now two em tags exist.
Created attachment 681878 [details]
Test case for bug 587461
Follow the instructions from comment 3 and use the Inserthtml button to add character to the middle of the sample text. Note how the Change Color button does not work correctly after an insertion has been made.
Thanks! Aryeh, can you look into this, please?
I've put it on my list and will look into it if I have time.
document.execCommand("inserthtml", false, "a");
This produces "<b>foo</b>a<b>bar</b>" in us and Chrome 25 dev. Comment 0 requests that the result should be "<b>fooabar</b>", and this is the behavior in Opera 12.50 internal, and what the spec says. The IE version I have (some Windows 8 preview VM) doesn't support insertHTML.
I don't get why we'd want to split ancestors. Ehsan, do you know? Should we change Gecko, or change the spec? This behavior seems rather surprising and not so useful, but WebKit does it too, so maybe there's a compat requirement, or some reason for it that I'm missing.
(Note: I will *not* be around consistently to troubleshoot regressions caused by any patches I write. So if you don't think this is worth the risk of regressions that I won't be able to handle, probably I should stick to a) safer editing stuff, or b) things that other people will want to maintain, like maybe in DOM.)
I'm not sure what the historical reason behind how this works is. Maybe Daniel knows?
I feel like changing the behavior here could be regression prone...
This does not look like a regression. If I take Nvu or KompoZer, based on
Gecko 1.7, the behaviour is the same. So I guess this a design choice made
by the original implementor roughly 13 years ago because insertHTML can insert
just anything, blocks or inlines. The current behaviour is wrong, I agree.
The split is handled by nsHTMLEditor::InsertHTMLWithContext in nsHTMLDataTransfer.cpp
lines 420-427 I think.
The way the spec does it is first it tries inserting it, then for every top-level node that was inserted, it splits it out of any invalid ancestors. I don't see any reason to split parents if the thing you inserted is already okay where it was inserted. I'd guess this behavior was just a mistake, or attempt to simplify the implementation.
The question is, do we want to accept the regression risk from changing this to make more sense?
Punting to Ehsan for a decision on whether we want to change. I think we probably want to stick to current behavior, realistically, particularly since I bet WebKit won't want to change even if we do. If we decide not to change, I'll file a bug on the spec.
(In reply to :Aryeh Gregor from comment #11)
> Punting to Ehsan for a decision on whether we want to change. I think we
> probably want to stick to current behavior, realistically, particularly
> since I bet WebKit won't want to change even if we do. If we decide not to
> change, I'll file a bug on the spec.
I just cannot understand that strategy: preserving a totally incorrect behaviour
is fine in particular because WebKit won't fix the bug? We have a clear behavioural
bug here, with a public API resulting in an unexpected (and clearly wrong) result.
I don't care about what the spec says either; if the spec describes the current
behaviour, then the spec is wrong and putting in a spec a blatantly wrong behaviour
Inserting a single char in the middle of a span should NOT split the span and
honestly, I personally don't care about what WebKit does, will do or will not do.
Looking at WebKit's strategy forgets about one thing: we can lead the way and make
a better web.
I agree with Daniel here. Inserting a character into a span which has an ID makes two spans. This should not happen.
Preserving the behavior is possibly a good idea because
1) We want to maintain compatibility with existing content, which might depend on current behavior in subtle and unintended ways. Even if the new behavior is clearly better, if a change breaks old pages, it's a regression. I'm not sure what the regression potential is here -- probably not too high, but not negligible.
2) We want to maintain compatibility with other browsers. Interoperability is and should be a major goal of ours. Bad behavior that works exactly the same across browsers can be easily worked around by authors; inconsistent behavior is much harder to deal with. Incompatibility between browsers is one of the single biggest headaches in authoring web content, and one of the biggest selling points of competing platforms like Flash (which is thankfully on the decline). Being incompatible with other browsers does not help authors: it just means they have to do more testing and debugging to figure out why their page only works in some browsers. (This point is moot if WebKit is in fact willing to change.)
Generally, sticking to established behavior is the best default answer, for these two reasons. We should only change if the benefits are really worth it. Maybe they are in this case. Ehsan is module owner, so it's his call here, and I won't argue the point further. But I hope this clarifies why I think it might be best to preserve the incorrect behavior.
(In reply to :Aryeh Gregor from comment #14)
> Preserving the behavior is possibly a good idea because
> 1) We want to maintain compatibility with existing content, which might
> depend on current behavior in subtle and unintended ways. Even if the new
> behavior is clearly better, if a change breaks old pages, it's a regression.
> I'm not sure what the regression potential is here -- probably not too high,
> but not negligible.
I doubt we will break anything here. Inserting a "a" in a span resulting in a split of the span is such an unreliable behaviour from an editing point of view I doubt anyone can rely or even actually relies on this. In short: fixing the current bugzilla bug is not an enhancement, it's bug-fixing.
> 2) We want to maintain compatibility with other browsers. Interoperability
I disagree with this. Maintaining compatibility on a visibly broken feature means slowing the Web down and keeping unfixed a broken feature. If we stick to the current behaviour, we become followers, not leaders.
> is and should be a major goal of ours. Bad behavior that works exactly the
> same across browsers can be easily worked around by authors; inconsistent
This is the crux of the problem in the current case: it is not easy for users to work around the issue.
I guess there is no other way to know whether this change will break web content other than by trying it. :/ Accordingly, I'd take a patch which fixes this bug, with the disclaimer that I will advocate to get it backed out if it regresses some content and we can't come up with a fix in a timely manner.
(In reply to comment #15)
> > 2) We want to maintain compatibility with other browsers. Interoperability
> I disagree with this. Maintaining compatibility on a visibly broken feature
> means slowing the Web down and keeping unfixed a broken feature. If we stick to
> the current behaviour, we become followers, not leaders.
Maintaining compatibility where not doing so breaks web content in ways we cannot evangelize around is not acceptable at all. Same argument goes for breaking compatibility. At the end of the day, there are stupid things that the current editor code (and that of the other UAs) does which is known and worked around by authors and fixing those issues is a long process which may or may not prove useful, and in those cases documenting the existing behavior and spec'ing it in a compatible way is the best course of action (case in point: the HTML5 parsing algorithm.) That being said, the trade-offs should clearly be inspected on a case by case basis.
And to me, being perceived as a follower and or a leader is not something to optimize for, as long as we can defend the decisions we make and make them with good intentions and judgement. :-)
Created attachment 733316 [details] [diff] [review]
Okay, so I looked in a debugger and found where the node was being split:
// pasting does not inherit local inline styles
nsCOMPtr<nsIDOMNode> tmpNode =
int32_t tmpOffset = selection->GetAnchorOffset();
rv = ClearStyle(address_of(tmpNode), &tmpOffset, nullptr, nullptr);
That tells us why we're doing it: when the user pastes text, we don't want it to inherit the local styles. This makes sense (and I bet it's why WebKit does it too).
So this patch just disables that for insertHTML. It makes no attempt to do any other fixups on the markup, so it probably does not match spec at all -- but one could argue that if the author really wants to insert a paragraph inside a <b>, that's their lookout.
If you have ideas on how to do this more nicely, let me know. Also, if we want insertHTML to match paste, maybe we want to keep existing behavior.
Comment on attachment 733316 [details] [diff] [review]
Review of attachment 733316 [details] [diff] [review]:
I think making paste and inserthtml do the same this is a non-goal, and we already diverge from that by for example sanitizing pasted input. r=me, but I'd like to know what Daniel thinks as well.
Aryeh requested my feedback here but I won't be able to do that before friday afternoon european time. Is that ok?
I agree with Ehsan: insertHTML and paste are completely different things. If somebody would want the current behavior, he can achieve that by manipulating the text to be inserted. However, it is not possible to make any adjustments after the text has been inserted (at least not for rich-text editors as that would invalidate the undo cache, see bug 458524). Also, the current behavior results in invalid HTML (same ID for multiple elements).
(In reply to comment #20)
> Aryeh requested my feedback here but I won't be able to do that before friday
> afternoon european time. Is that ok?
Of course :-)
Patch seems perfectly fine to me and preserves original behaviour if third-party tools rely on it.