<a class="header-button" href="https://bugzilla.mozilla.org/home" title="Go to home page"> Bugzilla

Reporter

Updated

•

16 years ago

Blocks: 39098

Reporter

Comment 1

•

16 years ago

I discovered this by making the drag-n-drop serialization of the selection use the same code as the kbd Copy command (in bug 39098). This made the "content/events/test/test_dragstart.html" mochitest fail: http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1256695352.1256705398.7411.gz It appears the bug occurs both for the text/plain and text/html serialization.

Assignee

Comment 2

•

16 years ago

for text/html serialization, it is not a regression. It is just that the new html serializer does a better job for pretty printing. so you have to update that test. For text/plain, I don't know yet. I didn't modify the plain text serializer with the patch of bug 422403. I'm investigating on this.

Assignee

Comment 3

•

16 years ago

I understood the issue. nsHTMLCopyEncoder is used for the serialization, and is called into SelectionCopyHelper (nsCopySupport.cpp). There is a first serialization here http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsCopySupport.cpp#134 of the selection. It returns <div id="draggable" ondragstart="doDragStartSelection(event)">This is a <em>draggable</em>\n bit of text.</div> Note the \n added by the serializer, because of the default flag on wrapping in the HTML serializer. As I said before, this \n is added because of the better pretty printing made by the serializer. Then this string is converted by nsHTMLFormatConverter http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsCopySupport.cpp#153 . This converter remove all html tag, and it seems it replaces all \n by a space, this why we have two spaces between "draggable" and "bit". A solution is to disable pretty printing in the nsHTMLCopyEncoder. We have to verify if it doesn't break something somewhere. I notice an other issue in the SelectionCopyHelper function (not related to the bug) : if the selection to copy is not in a plain text context (bIsHTMLCopy = true), then the selection is serialized twice ! The comment for the first serialization says "We always require a plaintext version", and then the kUnicodeMime is passed to the init method of the encoder. However, it is useless, because nsHTMLCopyEncoder always uses text/html, it doesn't use the given mime type : http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsDocumentEncoder.cpp#1116 In the "if (bIsHTMLCopy)" statement, the selection is serialized a second time. The only difference is the called method: EncodeToString vs EncodeToStringWithContext. I think we could optimize this piece of code : only do the first serialization when bIsHTMLCopy == false. Do you agree ? Olly ? Mats ? If ok I can provide a patch to fix these two issues.

Comment 4

•

16 years ago

(In reply to comment #3) > A solution is to disable pretty printing in the nsHTMLCopyEncoder. We have to > verify if it doesn't break something somewhere. Sounds reasonable. > I notice an other issue in the SelectionCopyHelper function (not related to the > bug) I this a regression too? If it is, is it a serious one? And in any case, please file a new bug for this issue.

Assignee

Comment 5

•

16 years ago

Attached patch patch v1 (obsolete) — Details — Splinter Review

here is the patch, which only changes flags for the encoder for the copy. I added also a test file. For the second issue, it is not really a regression, just an optimization we can do. I will create a new bug for that and will attach a patch to it.

Assignee: nobody → laurent

Status: NEW → ASSIGNED

Attachment #409067 - Flags: review?(Olli.Pettay)

Comment 6

•

16 years ago

Comment on attachment 409067 [details] [diff] [review] patch v1 How does this work with long lines? Based on the documentation OutputRaw may wrap. But I'm not sure if the documentation is right, since here you say it doesn't wrap https://bugzilla.mozilla.org/show_bug.cgi?id=422403#c7

Assignee

Comment 7

•

16 years ago

I don't know which documentation you talk about, but as indicated in the comment for the OutputRaw flag in the nsIDocumentEncoder.idl, using these flag disable the wrapping and the formating. And this is the real behavior not only with the patch of bug 422403, but it was also the previous behavior (gecko 1.9.1 and before). Of course, with OutputRaw, there isn't any more new \n, but existing \n are kept. so if in the orignal source code there is a line break like this <div>This is a <em>draggable</em> bit of text.</div> However, with such cases, I think we will still have the issue, (here, a double space between "draggable" and "bit"), because of the behavior of nsHTMLFormatConverter used in nsCopySupport. As I said, it seems it simply replaces \n by a space, after removing html tags. I don't know if this issue in nsHTMLFormatConverter can be fixed (and how, I don't know this part of gecko). I don't know if using the plain text serializer instead of nsHTMLFormatConverter could be an other solution...

Mike Beltzner [:beltzner, not reading bugmail]

Comment 8

•

16 years ago

Ah, oops, I misunderstood the comment in nsIDocumentEncoder.

Updated

•

16 years ago

Flags: blocking1.9.2? → blocking1.9.2+

Priority: -- → P2

Comment 9

•

16 years ago

Does the patch bring back the exactly same behavior what we have on 1.9.1? I wonder especially the removal of flags = 0;

Reporter

Comment 10

•

16 years ago

Attached file Testcase #1 — Details

Reporter

Updated

•

16 years ago

Attachment #409067 - Flags: review?(Olli.Pettay) → review-

Reporter

Comment 11

•

16 years ago

Comment on attachment 409067 [details] [diff] [review] patch v1 Hmm, did this pass regression tests? This patch breaks copy from a plain text context (see Testcase #1). I believe the first value for 'flags' should be: nsIDocumentEncoder::OutputPreformatted | nsIDocumentEncoder::OutputRaw the second should be: nsIDocumentEncoder::OutputRaw (fwiw, that passed regression tests on TryServer together with the patch in bug 39098)

Assignee

Comment 12

•

16 years ago

Attached patch patch v2 — Details — Splinter Review

I added the OutputPreformatted flag and re introduce "flags = 0". You're right, it was useless to remove both. In fact, I didn't see the code of nsHTMLCopyEncoder::setSelection, which select a different serializer in some case. It uses the plaintext serializer when the selection to copy is inside an input and textarea or in a preformatted html element. And it uses the html serializer in the other cases. I thought it uses always the html serializer. And so, the second issue I explained is no more valid for some cases. I updated the unit test file to include your test case.

Attachment #409067 - Attachment is obsolete: true

Attachment #409324 - Flags: review?(matspal)

Reporter

Comment 13

•

16 years ago

Comment on attachment 409324 [details] [diff] [review] patch v2 Looks good, r=mats. It still breaks the "test_dragstart.html" test together with the patch in bug 39098, but that bug isn't for 1.9.2, so I'll add a \n to the expected result on trunk. As far as I know this is a change in behavior, and should be considered when approving this patch for 1.9.2. I don't see a problem with having the extra newline in HTML-to-HTML copy on 1.9.2, OTOH, I don't see the benefit from it either. I'm requesting an explicit approval1.9.2 to consider the above.

Attachment #409324 - Flags: review?(matspal)

Attachment #409324 - Flags: review+

Attachment #409324 - Flags: approval1.9.2?

Assignee

Updated

•

16 years ago

Attachment #409324 - Flags: superreview?(Olli.Pettay)

Updated

•

16 years ago

Attachment #409324 - Flags: superreview?(Olli.Pettay) → superreview+

Assignee

Comment 14

•

16 years ago

checked in: http://hg.mozilla.org/mozilla-central/rev/4ae32b827a7f

Mike Beltzner [:beltzner, not reading bugmail]

Assignee

Updated

•

16 years ago

Status: ASSIGNED → RESOLVED

Closed: 16 years ago

Resolution: --- → FIXED

Comment 15

•

16 years ago

Comment on attachment 409324 [details] [diff] [review] patch v2 (this is a blocker, doesn't need explicit approval, please land ASAP)

Attachment #409324 - Flags: approval1.9.2?

Johnny Stenback (:jst)

Comment 16

•

16 years ago

Jonas, any thoughts on whether we'll be better off with or without this patch for 1.9.2?

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 17

•

16 years ago

Wow, this bug is confusing for someone not familiar with our serializers :) A few questions: 1. So the problem is that when copying, we would first prettyprint the selection, and then concatenate all the textnodes in the result of the prettyprint? And the fix is to not prettyprint but rather *just* concatenate the existing textnodes? 2. Does this patch make us behave as we did in 1.9.1 in all cases? Comment 13 seems to indicate "no". If the answer is "no", then when and how do we differ? 3. If this patch is correct, then why does it combined with the patch in bug 39098 break a existing testcase? Is that a problem with this patch, the patch in bug 39098, or a problem with the testcase?

Jonas Sicking (:sicking) No longer reading bugmail consistently

Assignee

Comment 18

•

16 years ago

1) selection inside input/textarea is serialized with the plaintext serializer (so concatenation of textnode + pretty printing), otherwise, it is serialized with the html serializer. And in this case, tags are then removed and \n are replaced by spaces (by using the html converter) to have a text/plain version in the clipboard. The new html serializer does a better job on wrapping for the pretty printing, so in the example, we have a new \n. The problem is that the \n are replaced by a space, by the html converter. So a solution is to not do pretty printing during the serialization. 2) Yes, for most of case. 3) No, Mats just changed the way the drag-n-drop serialization is made, by using the same code as the kbd Copy command, so it uses the html serializer, and because of the new wrapping behavior, it changes a test, and then Mats discovers the bug 524975. I think we should land my patch to 1.9.2, since kbd copy command doesn't behave anymore as in 1.9.1.

Comment 19

•

16 years ago

(In reply to comment #18) > 1) selection inside input/textarea is serialized with the plaintext serializer > (so concatenation of textnode + pretty printing), otherwise, it is serialized > with the html serializer. And in this case, tags are then removed and \n are > replaced by spaces (by using the html converter) to have a text/plain version > in the clipboard. > > The new html serializer does a better job on wrapping for the pretty printing, > so in the example, we have a new \n. The problem is that the \n are replaced by > a space, by the html converter. So a solution is to not do pretty printing > during the serialization. Ok, so it sounds like the answer is simply "yes" to my question? Or more precisely "yes. Except when copying from input/textarea, in which case none of the code relevant to this bug is used". Is that correct? > 2) Yes, for most of case. Well, that's a "no" then :). So in which cases are we now different from 1.9.1? > 3) No, Mats just changed the way the drag-n-drop serialization is made, by > using the same code as the kbd Copy command, so it uses the html serializer, > and because of the new wrapping behavior, it changes a test, and then Mats > discovers the bug 524975. That still doesn't answer the question: If mats patch were to land, is the problem in his patch, in your patch or in the test. It sounds like you're saying that it's in the test? Another important question here is does this patch change any behavior that webpages see? I.e. is it possible that this patch will break (or fix) websites? Or is there a risk that it will?

Jonas Sicking (:sicking) No longer reading bugmail consistently

Assignee

Comment 20

•

16 years ago

let's summary. There are two issues: 1) this current bug: "copy-paste of simple HTML adds extra space not present in original content". This is clearly a regression after my work on the serializer (bug 422403). I already explains why in previous comments. So should be landed in 1.9.2 (IMHO) since 1.9.2 has the new serializer. However, with the patch I provided on this bug 524975, Copy-paste will behave in a different manner for HTML content. 1.9.1: it was serialized with pretty printing flags. With my patch: without pretty printing. I don't think this difference is important, at least for the text/html flavor for the clipboard. 2) the test on drag-n-drop. The problem is in the test. Since mats change the code of the serialization for drag-n-drop (and then the result of the serialization, as I explained), he has to adapt the test. > Another important question here is does this patch change any behavior that webpages see? Clearly, it doesn't. However, I just found a regression with my patch: in fact, copying HTML produces now non pretty printing, not only for the text/html flavor, but also for text/unicode flavor. So copying this DOM: bla <ul> <li>foo</li> <li>bar</li> </ul> When we paste it, we have bla foo bar instead of bla * foo * bar like in gecko 1.9.1. I reopen the bug. (or should I create a new bug ?) I'm working on a new patch.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Comment 21

•

16 years ago

(In reply to comment #20) > let's summary. There are two issues: > > 1) this current bug: "copy-paste of simple HTML adds extra space not present in > original content". This is clearly a regression after my work on the serializer > (bug 422403). I already explains why in previous comments. So should be landed > in 1.9.2 (IMHO) since 1.9.2 has the new serializer. > > However, with the patch I provided on this bug 524975, Copy-paste will behave > in a different manner for HTML content. 1.9.1: it was serialized with pretty > printing flags. With my patch: without pretty printing. I don't think this > difference is important, at least for the text/html flavor for the clipboard. My gut reaction is that the behavior with your patch is better, yes. However it's a scary change this late in the game. > 2) the test on drag-n-drop. The problem is in the test. Since mats change the > code of the serialization for drag-n-drop (and then the result of the > serialization, as I explained), he has to adapt the test. Ok, if the problem is in the test then that's all good. Let's never talk about it in this bug any more :) I only care about bugs in the code, not in the tests. > However, I just found a regression with my patch: in fact, copying HTML > produces now non pretty printing, not only for the text/html flavor, but also > for text/unicode flavor. So copying this DOM: > > bla > <ul> > <li>foo</li> > <li>bar</li> > </ul> > > When we paste it, we have > > bla > foo > bar > > instead of > > bla > * foo > * bar > > like in gecko 1.9.1. > > I reopen the bug. (or should I create a new bug ?) > I'm working on a new patch. This scares me, a lot. This would have been a bad regression to ship with. I think given how late in the game this is, I don't think we should take anything on the branch. These changes are better done early in the cycle. For example while not prettyprinting copied HTML seems like a good idea, it's a bigger change than i'm not comfortable with right before release. So my recommendation is to not take this for the 1.9.2 release. *Possibly* it's something to consider for 1.9.2.x. This leaves us with the extra space regression. Either we can leave things as is now and consider fixing it in 1.9.2.x, or we can consider backing out bug 422403 on the branch. Do you have an opinion?

Assignee

Comment 22

•

16 years ago

Ok. I'm agree to leave the 1.9.2 branch as its current state. I think it's a good idea to not land the patch in it, since it introduces a more critical regression (described in my previous comment) than the extra space bug IMHO. And I will have more time to provide a better patch for bug 524975 :-) Backing out bug 422403 will be harder, because other changes have been made on the serializer after the check in of the html5 parser. I prefer to spend time on the patch for bug 524975 instead of on the backout of bug 422403 (which is a huge patch).

Johnny Stenback (:jst)

Comment 23

•

16 years ago

blocking1.9.2- per the above comments, we'll work on getting this fixed on trunk for the next release! Thank you Laurent for your contributions here!

Flags: blocking1.9.2+ → blocking1.9.2-

Johnny Stenback (:jst)

Comment 24

•

16 years ago

Laurent, do you think you would have time to work on an updated patch for this bug?

Assignee

Comment 25

•

16 years ago

sorry for the delay. I'm working on it this week.

Assignee

Comment 26

•

16 years ago

Attached patch Additional tests for the trunk — Details — Splinter Review

My apologizes, it seems in fact there is no regression. Perhaps my old builds were bad. I can't reproduce the regression I found. I added some tests to verify that copy paste of a piece HTML produce a well formated text in text/plain. See this patch.

Attachment #424228 - Flags: review?(Olli.Pettay)

Assignee

Comment 27

•

16 years ago

Attached patch Patch for the 1.9.2 branch — Details — Splinter Review

The full patch for the 1.9.2 branch.

Attachment #424229 - Flags: superreview?(Olli.Pettay)

Attachment #424229 - Flags: review?(Olli.Pettay)

Attachment #424229 - Flags: approval1.9.2.1?

Updated

•

16 years ago

Attachment #424228 - Flags: review?(Olli.Pettay) → review+

Boris Zbarsky [:bzbarsky]

Comment 28

•

16 years ago

So why would we take the patch for 1.9.2?

Comment 29

•

16 years ago

This looks like the underlying issue is exactly what bug 549295 is about, which is that the behavior of not passing any wrapping flags at all was totally changed from "wrap if the text is long" to "wrap all the time"....

Comment 30

•

16 years ago

Laurent, does the patch fix bug 549295 on FF 3.6? Or vice-versa?

Mike Beltzner [:beltzner, not reading bugmail]

Assignee

Comment 31

•

16 years ago

no the patch doesn't fix bug 549295. But the patch of bug 549295 probably fixes (or at least, helps to fix) bug 524975. I have to verify it.

Comment 32

•

16 years ago

Comment on attachment 424229 [details] [diff] [review] Patch for the 1.9.2 branch Please renominate when the reviews are complete.

Attachment #424229 - Flags: approval1.9.2.2? → approval1.9.2.2-

Assignee

Comment 33

•

16 years ago

checked in additionnal tests http://hg.mozilla.org/mozilla-central/rev/e9e8eaec0cbb

Assignee

Comment 34

•

16 years ago

fixed a failed test: http://hg.mozilla.org/mozilla-central/rev/d898c3dbdc27

Comment 35

•

16 years ago

(In reply to comment #31) > no the patch doesn't fix bug 549295. But the patch of bug 549295 probably fixes > (or at least, helps to fix) bug 524975. I have to verify it. Any news here?

Assignee

Comment 36

•

16 years ago

As Boris said, the regression comes from the issue described in bug 549295, and the patch of bug 549295 fix the current bug, even without the attachement 409324 already landed into the trunk. However I think the real issue is in the html converter used by SelectionCopyHelper, when it replaces all LF by a space. If a LF is preceded by a space, it should not replace it by a space, but only remove it. However I don't know how the html convert works, so perhaps I say silly things :-)

Depends on: 549295

Comment 37

•

16 years ago

Should we back out attachment 409324 [details] [diff] [review] from trunk, if the real regression fix is bug 549295?

Assignee

Comment 38

•

16 years ago

well, attachment 409324 [details] [diff] [review] just add a flag for the encoder. I'm not sure that this flag does some regression, so we can keep it... In fact, now with the patch for bug 549295, the real question is : do we need to wrap on long lines (old behavior before the new serializer) or not, for serialized content for the clipboard. If yes, we should revert the change. However, if we revert this change, we should keep the test I added, and I can remove the added flag directly with the patch for bug 549295.

Comment 39

•

16 years ago

We should have the old behavior, unless there is some very good reason not to.

Assignee

Comment 40

•

16 years ago

Ok I will remove the flag in my patch for bug 549295

Assignee

Updated

•

16 years ago

Attachment #424229 - Flags: superreview?(Olli.Pettay)

Attachment #424229 - Flags: review?(Olli.Pettay)

Reporter

Updated

•

16 years ago

Blocks: 567362

Anthony DiSante

Comment 41

•

16 years ago

I'm not sure if this is the most relevant bug to my issue; searching bugzilla for "copy paste space" (without the quotes) returns several bugs. I'm seeing this behavior in Thunderbird 3.0.4/Linux, whereas in Tbird 2.x, the bug was not present. What happens is, when you select the last word in a paragraph, you need to be careful not to select "too far" past the end of the word (into the empty area to the right of the word). If you go too far, your selection will not visibly contain any whitespace; but when you copy & paste it, a trailing space appears in the pasted text. This also happens when you double-click to select the word, as long as it's the last word in the paragraph (and there's no punctuation after it). This happens even for plain-text messages (content-type: text/plain). Firefox does a similar thing if the word you're double-clicking is followed by a newline in the source code, as opposed to being immediately followed by an HTML tag. You can plainly see that there's no space in the visible selection, yet upon pasting, it's there. From the comments here about serializers and test cases and regressions, and the several other linked bugs, this seems like a complex issue. I hope my report helps, or is at least relevant; please let me know if there's a better bug for me to look at instead.

Assignee

Comment 42

•

15 years ago

Finally, I forgot to remove the flag in the patch for bug 549295. And because of this nsIDocumentEncoder::OutputRaw flag, it causes some bugs like bug 572543. I really should remove this flag :-) Patch soon.

Assignee

Updated

•

15 years ago

Blocks: 572543

Aaron Digulla

Comment 43

•

15 years ago

I have the same issue with Firefox 4.0.1: Firefox adds an invisible space if there is a newline between the last word and a tag (as in "word
<div>") or if there is a space after the word (as in "word <br/>". This is really annoying if you copy passwords since you can't see the additional space character when you paste.

Scoobidiver (away)

Updated

•

13 years ago

Depends on: 723163

Q.uentin

Comment 44

•

12 years ago

Any news ? I have this bug

Dror

Comment 45

•

12 years ago

Still had this issue with Firefox 28.0. Ironically, the workaround was to add   after the word and then it worked fine.

Tim Guan-tin Chien [:timdream] (please needinfo)

Comment 46

•

12 years ago

Persists to FF 29

Updated

•

10 years ago

Comment 47

•

10 years ago

Should be fixed by bug 333064?

Flags: needinfo?(mats)

Reporter

Comment 48

•

10 years ago

I think it must have been fixed earlier than that. I can't reproduce this bug using FF46 on Linux. QA: please test this bug on all platforms; resolve as worksforme if OK. Thanks.

Flags: needinfo?(mats)

Keywords: qawanted

Comment 49

•

10 years ago

Persists to FF 46.0.1 on OSX 10.10 To reproduce: 1. Open this issue, https://bugzilla.mozilla.org/show_bug.cgi?id=524975 , in a new private window (so that you're not authenticated) 2. Scroll down to this comment 3. Double-click my name ("aaron"), to highlight the entire thing 4. copy it (either ctrl-C or right-click, copy) 5. paste it somewhere and observe the leading space character, not present in the HTML (which is like <span>aaron</span>) See video documentation: https://www.youtube.com/watch?v=HI1LqAUaMrI

Reporter

Comment 50

•

10 years ago

So the DOM range we get when double-clicking the name per STR in comment 49 : let range = window.getSelection().getRangeAt(0) range.commonAncestorContainer is this: <span class="bz_comment_user"><a href="user_profile?user_id=497669"> <img src="https://secure.gravatar.com/avatar/d89dd360c498c1bf4ca9d9a01d08596a?d=mm&size=64" width="32" height="32" align="middle" border="0"></a> <span class="vcard vcard_497669"><span class="ln">aaron</span> </span> </span> range.startContainer is the text node before the vcard_497669 span. range.startOffset is 1. range.endContainer is the text node after the vcard_497669 span (i.e. the newline). range.endOffset is 0. So, given that range, copy-paste should actually result in " aaron". The error seems to be that the double-click included the space in the text node before the vcard_497669 span. It shouldn't do that, at least not on platforms that excludes the surrounding space when double-clicking words in plain text. I'm pretty sure this is a different bug than what I originally reported though, but let's keep the bug open for now until we know if the latter bug is on file.

Federico Bruni

Comment 51

•

10 years ago

I can confirm the problem described in comment #49. I'm using FF 47 on Linux (Fedora 23).

simon

Comment 52

•

9 years ago

Happens to me as well. Pasting into a terminal adds 0~ and 1~ around the copied text. Really **** behavior. Never using Firefox again.

Comment 53

•

9 years ago

This appears to be fixed FF 51.0.1 OSX 10.10.5