Closed
Bug 613130
Opened 14 years ago
Closed 14 years ago
Paste list structure from MS-Word styles lost
Categories
(Core :: DOM: Editor, defect)
Tracking
()
RESOLVED
FIXED
mozilla2.0b8
People
(Reporter: yaojun85, Assigned: ehsan.akhgari)
References
()
Details
(Keywords: regression)
Attachments
(2 files)
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.12 (KHTML, like Gecko) Chrome/9.0.576.0 Safari/534.12 Build Identifier: 3.6.12 When pasting a list structure from MS Word document on Windows, the "mso:list" style originally presented in a span element that was symboling a list bullet indicator doesn't anymore appear, it's a regression observed since Firefox 3.6. Reproducible: Always Steps to Reproduce: 1.Copy the list structure in attached document 2.Paste it into "contentediable" frame 3.Check the HTML codes in 'Source' mode. Actual Results: <p class="MsoNormal" style="margin-left: 18pt; text-indent: -18pt;"><span style="" lang="EN-US"><span style="">1.<span style="font: 7pt "Times New Roman";"> </span></span></span><span dir="LTR"></span><span lang="EN-US">List item1</span></p> Expected Results: <p class="MsoNormal" style="margin-left: 18pt; text-indent: -18pt;"><span style="mso-list: Ignore" lang="EN-US"><span style="">1.<span style="font: 7pt "Times New Roman";"> </span></span></span><span dir="LTR"></span><span lang="EN-US">List item1</span></p> The 'mso:list' style is widely used by WYSIWYG editors' "paste from word" functionality, to transform paragraphs acts as list item into a semantic list structure, and thus is broken.
Updated•14 years ago
|
Assignee | ||
Comment 2•14 years ago
|
||
(In reply to comment #0) > The 'mso:list' style is widely used by WYSIWYG editors' "paste from word" > functionality, to transform paragraphs acts as list item into a semantic list > structure, and thus is broken. Which WYSIWYG editors are you talking about? To the best of my knowledge, Microsoft Word is the only one using this.
Both TinyMCE and CKEditor have functionality to handle pasting from Word and converting Word list to real semantic UL/OL lists. So there is currently a lot of frustration out there regarding the resent changes in Firefox from our users. I'm not sure how this would be best handled. I think instead of white listing styles it should blacklist the ones that are security issues like "expression" and "behavior".
What recent changes? AFAIK no version of Firefox has ever preserved unknown style rules.
Ah, this is really about whether the contents of the 'style' attribute are preserved. I think we do probably have a bug. We should preserve the original value of the 'style' attribute in the DOM and hand it out on a getAttribute() (until/unless someone modifies the .style properties of course). I'm sure this is already filed somewhere.
Assignee | ||
Comment 6•14 years ago
|
||
(In reply to comment #5) > Ah, this is really about whether the contents of the 'style' attribute are > preserved. > > I think we do probably have a bug. We should preserve the original value of the > 'style' attribute in the DOM and hand it out on a getAttribute() (until/unless > someone modifies the .style properties of course). I'm sure this is already > filed somewhere. That's not what happens. When we're pasting something, we look at all of the style attributes, feed their contents to our CSS parser, look through the parsed styles and remove -moz-binding properties, and serialize the result as the value of the style attribute to be pasted. There is no original value of the style attribute in the DOM for us to use.
Assignee | ||
Comment 7•14 years ago
|
||
This is the code in question, FWIW <http://mxr-test.konigsberg.mozilla.org/mozilla-central/source/content/html/document/src/nsHTMLFragmentContentSink.cpp#1037>. We do this all before injecting the style attribute in the DOM.
Assignee | ||
Comment 8•14 years ago
|
||
I guess if we don't encounter any -moz-binding values we could revert to the original value, but that would mean that we're basically blacklisting, and not white-listing... and that we'll be broken if a style attribute has both mso:list and -moz-binding.
I suppose what you really want here is a whitelist of properties that Gecko doesn't support, but we know are safe, and then we let those pass through. Maybe that can be hacked into the CSS parser somehow...
I'd probably be OK with the approach in comment #8 though.
Assignee | ||
Comment 11•14 years ago
|
||
(In reply to comment #9) > I suppose what you really want here is a whitelist of properties that Gecko > doesn't support, but we know are safe, and then we let those pass through. > Maybe that can be hacked into the CSS parser somehow... I don't have any idea how to generate such a whitelist. I would argue that coming up with a comprehensive list is next to impossible. If I'm not overlooking an obvious way to generate that list, I think comment 8 is the only sensible option here.
Comment 12•14 years ago
|
||
Since I'm not the original reporter of this issue I didn't do to much research on why the paste logic is broken when it comes to lists. I think it's related to conditional comments being stripped not styles. I compared the pasted contents between 3.0.16 and 3.6.13 and the older version didn't have the mso:list style either but the conditional comment "<!--[if !supportLists]-->" is stripped. I'm not sure why it strips these comments but we look for either that comment or the mso:list style in our code at least. Are there security issues with comments as well?
Assignee | ||
Updated•14 years ago
|
Assignee: nobody → ehsan
Status: UNCONFIRMED → ASSIGNED
blocking1.9.1: --- → ?
blocking1.9.2: --- → ?
blocking2.0: --- → ?
Ever confirmed: true
Assignee | ||
Comment 13•14 years ago
|
||
(In reply to comment #12) > Since I'm not the original reporter of this issue I didn't do to much research > on why the paste logic is broken when it comes to lists. I think it's related > to conditional comments being stripped not styles. I compared the pasted > contents between 3.0.16 and 3.6.13 and the older version didn't have the > mso:list style either but the conditional comment "<!--[if !supportLists]-->" > is stripped. I'm not sure why it strips these comments but we look for either > that comment or the mso:list style in our code at least. Are there security > issues with comments as well? Oh, no, there are no security issues with regards to comments. But we've allowed comments since bug 572642, so they shouldn't be a problem.
The other issue with allowing unknown properties is that there may be unsafe values of some unknown properties (e.g., expression()).
Assignee | ||
Comment 15•14 years ago
|
||
(In reply to comment #14) > The other issue with allowing unknown properties is that there may be unsafe > values of some unknown properties (e.g., expression()). True. Catching those is not easy, though, because we don't support the expression property in our parser, and figuring out whether the original string did contain that and it was dropped by the parser is not possible, is it?
That's what I meant in comment #9. Whitelisting "unknown" properties is the only way to fix this bug and also protect against potentially dangerous unknown properties from leaking into pasted content, but it seems a bit quixotic.
Assignee | ||
Comment 17•14 years ago
|
||
(In reply to comment #16) > That's what I meant in comment #9. Whitelisting "unknown" properties is the > only way to fix this bug and also protect against potentially dangerous unknown > properties from leaking into pasted content, but it seems a bit quixotic. Well, whitelisting all of the unknown properties except the unsafe ones is very hard (and it's essentially the same as blacklisting). And we can't use it anyways, until our CSS parser can retain such properties.
Reporter | ||
Comment 18•14 years ago
|
||
@Robert It would not be that ridiculous of whitelisting mso-* if it's sure that clipboard comes from an office application context.
Reporter | ||
Comment 19•14 years ago
|
||
(In reply to comment #12) > Since I'm not the original reporter of this issue I didn't do to much research > on why the paste logic is broken when it comes to lists. I think it's related > to conditional comments being stripped not styles. I compared the pasted > contents between 3.0.16 and 3.6.13 and the older version didn't have the > mso:list style either but the conditional comment "<!--[if !supportLists]-->" > is stripped. I'm not sure why it strips these comments but we look for either > that comment or the mso:list style in our code at least. Are there security > issues with comments as well? This's confirmed also, while I guess it desires a new bug, I'm not sure bug 572642 has tackled it.
Assignee | ||
Comment 20•14 years ago
|
||
(In reply to comment #18) > @Robert > It would not be that ridiculous of whitelisting mso-* if it's sure that > clipboard comes from an office application context. We don't have a reliable way to determine the source of the clipboard data.
Assignee | ||
Comment 21•14 years ago
|
||
(In reply to comment #19) > (In reply to comment #12) > > Since I'm not the original reporter of this issue I didn't do to much research > > on why the paste logic is broken when it comes to lists. I think it's related > > to conditional comments being stripped not styles. I compared the pasted > > contents between 3.0.16 and 3.6.13 and the older version didn't have the > > mso:list style either but the conditional comment "<!--[if !supportLists]-->" > > is stripped. I'm not sure why it strips these comments but we look for either > > that comment or the mso:list style in our code at least. Are there security > > issues with comments as well? > This's confirmed also, while I guess it desires a new bug, I'm not sure bug > 572642 has tackled it. Can you please file a new bug on the comment issue?
Comment 22•14 years ago
|
||
As I tried to explain in bug 596300 comment 42, there are two options to take care of all the problems related to this: 1: Provide the raw HTML in the onBeforePaste event with the clipboardData object. That way, if the application is designed to handle pasting let it do it, the application will know what are the kind of formatting it must allow and with this help it will be able to do it properly. 2: Create a new execCommand switch so an application can opt-out of all this sanitation and keep working as it does in other browsers. As a general rule, if the application isn't doing anything special with the pasted data it won't care about bogus styles that can't be understood by the browser like those mso- that in general can be seen as garbage, so providing the data only to those that request it (as in 1) seems to me the better solution.
Assignee | ||
Comment 23•14 years ago
|
||
(In reply to comment #22) > As I tried to explain in bug 596300 comment 42, there are two options to take > care of all the problems related to this: > > 1: Provide the raw HTML in the onBeforePaste event with the clipboardData > object. That way, if the application is designed to handle pasting let it do > it, the application will know what are the kind of formatting it must allow and > with this help it will be able to do it properly. But that doesn't change the data which is actually pasted in the DOM, so I'm not sure how this would help. > 2: Create a new execCommand switch so an application can opt-out of all this > sanitation and keep working as it does in other browsers. > > As a general rule, if the application isn't doing anything special with the > pasted data it won't care about bogus styles that can't be understood by the > browser like those mso- that in general can be seen as garbage, so providing > the data only to those that request it (as in 1) seems to me the better > solution. Well, the actual concern is to only pass safe data to web applications, in order to protect web applications against cross-domain attacks. It's not really a good option to require web applications to opt in to this protection... :(
Comment 24•14 years ago
|
||
The basis of this ticket is that the application tries to detect the paste event and transform the data provided into something filtered and adjusted to its configuration. If we can get the raw HTML with clipboardData then we will cancel the default paste and proceed to adjust it before inserting ourselves into the DOM. If you provide this sanitation as a toStaticHtml then we can call it in those situations where the user wants it that way (after we have done our clean up switching styles to semantic elements), there are other times when we must allow event handlers, inline scripts...
Assignee | ||
Comment 25•14 years ago
|
||
How does that mitigate the security concerns, then?
Reporter | ||
Comment 26•14 years ago
|
||
(In reply to comment #21) > Can you please file a new bug on the comment issue? Bug 572642 is filed for it. (In reply to comment #18) > We don't have a reliable way to determine the source of the clipboard data. Ok...but is there any known security problem with "mso-*"?
Reporter | ||
Comment 27•14 years ago
|
||
(In reply to comment #26) > (In reply to comment #21) > > Can you please file a new bug on the comment issue? > Bug 572642 is filed for it. Sorry, I actually mean Bug 613912.
I think we should do comment #8 for Firefox 4 and later develop a safer solution based on parsing unknown properties and passing through some subset of them. (Personally I think we could pass through all prefixed properties whose prefix isn't known to be used by browsers.)
blocking2.0: ? → final+
Assignee | ||
Comment 29•14 years ago
|
||
This patch implements comment 8. Asking review from bz on the content part and from roc on the layout part.
Attachment #492879 -
Flags: review?(roc)
Attachment #492879 -
Flags: review?(bzbarsky)
Comment on attachment 492879 [details] [diff] [review] Patch (v1) I don't like it, but it seems like the best of the bad options.
Attachment #492879 -
Flags: review?(roc) → review+
Assignee | ||
Updated•14 years ago
|
Whiteboard: [has patch][needs review bz]
Comment 31•14 years ago
|
||
this is a regression from which landing? guessing bug 520189
Blocks: CVE-2010-2769
Comment 32•14 years ago
|
||
Not going to block a branch release on this, but will look at the patch if you request branch approval (after review and trunk landing).
Assignee | ||
Comment 33•14 years ago
|
||
(In reply to comment #31) > this is a regression from which landing? guessing bug 520189 Yes.
Comment 34•14 years ago
|
||
Comment on attachment 492879 [details] [diff] [review] Patch (v1) r=me. I hate the web. :(
Attachment #492879 -
Flags: review?(bzbarsky) → review+
Assignee | ||
Updated•14 years ago
|
Whiteboard: [has patch][needs review bz] → [needs landing]
Assignee | ||
Comment 35•14 years ago
|
||
Landed the patch: http://hg.mozilla.org/mozilla-central/rev/4da141a1fbff It would be great if someone could verify this fix in tomorrow's nightly.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Whiteboard: [needs landing]
Target Milestone: --- → mozilla2.0b8
Assignee | ||
Comment 36•14 years ago
|
||
Juan, Marcia, is this something you guys can verify please?
Comment 37•14 years ago
|
||
I installed TinyMCE, and I opened one of the examples there (simple.html). Then, using MS Word 2010, I opened the document containing a list and copied all. When I paste this into the example and select one of the entries (a. aasdsd) and I view source, I get this: 4.0 Beta 7: <p class="MsoNormal" style="margin-left: 0.75in; text-indent: -0.25in;"><span style=""><span style="">a.<span style="font: 7pt "Times New Roman";"> </span></span></span>aasdsd</p> 4.0b8pre(12/06): <p class="MsoNormal" style="margin-left: 0.75in; text-indent: -0.25in;" _mce_style="margin-left: .75in; text-indent: -.25in; mso-list: l0 level2 lfo1; tab-stops: list .75in;"><span style="" _mce_style="mso-fareast-font-family: Arial; mso-bidi-font-family: Arial;"><span style="" _mce_style="mso-list: Ignore;">a.<span style="font: 7pt "Times New Roman";" _mce_style="font: 7.0pt "Times New Roman";"> </span></span></span>aasdsd</p> Garry, it doesn't look exactly as the expected results in comment #0. Does this depend on Word version?
Assignee | ||
Comment 38•14 years ago
|
||
(In reply to comment #37) > I installed TinyMCE, and I opened one of the examples there (simple.html). > Then, using MS Word 2010, I opened the document containing a list and copied > all. When I paste this into the example and select one of the entries (a. > aasdsd) and I view source, I get this: > > 4.0 Beta 7: > <p class="MsoNormal" style="margin-left: 0.75in; text-indent: -0.25in;"><span > style=""><span style="">a.<span style="font: 7pt "Times New > Roman";"> </span></span></span>aasdsd</p> > > 4.0b8pre(12/06): > <p class="MsoNormal" style="margin-left: 0.75in; text-indent: -0.25in;" > _mce_style="margin-left: .75in; text-indent: -.25in; mso-list: l0 level2 lfo1; > tab-stops: list .75in;"><span style="" _mce_style="mso-fareast-font-family: > Arial; mso-bidi-font-family: Arial;"><span style="" _mce_style="mso-list: > Ignore;">a.<span style="font: 7pt "Times New Roman";" > _mce_style="font: 7.0pt "Times New Roman";"> > </span></span></span>aasdsd</p> > > Garry, it doesn't look exactly as the expected results in comment #0. Does this > depend on Word version? This is possibly good, but can you please test it in MidasDemo instead (because TinyMCE might be adding its own stuff...)?
Reporter | ||
Comment 39•14 years ago
|
||
>Garry, it doesn't look exactly as the expected results in comment #0. Does this
>depend on Word version?
Not a big issue, the structure might defers from IE and Firefox (I'm actually making the TC refering to IE case) as long as the styles are preserved, confirm to be fixed on 4.0b8pre(12/06), a wonderful fix finally for wysiwygers.
Comment 40•13 years ago
|
||
Any chance to have this landing on Firefox 3.6?
Assignee | ||
Comment 41•13 years ago
|
||
(In reply to comment #40) > Any chance to have this landing on Firefox 3.6? I think that this fix is too risky for Firefox 3.6. Please upgrade to Firefox 4, which has this bug fixed.
Comment 42•13 years ago
|
||
Not to say that this issue is affecting me directly, but all CKEditor users around the world. Anyway, it's up to you guys to decide.
Assignee | ||
Comment 43•13 years ago
|
||
Hmm, well, I gave it a bit more thought, and actually read the patch more closely. The patch touches a lot of code, which makes me nervous, but then again it fixes a regression introduced on the branch... roc: what do you think?
It is not a serious security or stability issue so it doesn't meet branch criteria. I'd have a hard time justifying taking it on the branch.
Assignee | ||
Updated•13 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•