Open Bug 1650720 Opened 4 years ago Updated 3 years ago

Copying \r\n inside <pre> converts to \r\r\n

Categories

(Core :: DOM: Serializers, defect)

x86_64
Windows 10
defect

Tracking

()

People

(Reporter: saschanaz, Unassigned)

Details

Attachments

(1 file)

Attached file bug1528442-2.html
  1. Open the attachment
  2. Copy the content
  3. Paste it somewhere

Expected: Single newline per line
Actual: Double newlines per line
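
To check which terminator sequence actually ends up in the pasted text, the line endings can be counted. This is a minimal sketch in Python rather than anything from Gecko; `count_line_endings` is a hypothetical helper name, and it assumes the pasted text is available as a string with its \r characters intact.

```python
import re
from collections import Counter

# Longest alternatives first, so "\r\r\n" is not split into "\r" + "\r\n".
_TERMINATOR = re.compile(r"\r\r\n|\r\n|\n|\r")

def count_line_endings(text: str) -> Counter:
    """Count each kind of line terminator found in text."""
    return Counter(match.group() for match in _TERMINATOR.finditer(text))
```

If the pasted text is saved to a file first, it should be opened with `open(path, newline="")` so that Python does not translate the line endings while reading.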

Assignee: nobody → krosylight
Summary: Copying \r\n inside <pre> converts to \n\n → Copying \r\n inside <pre> converts to \r\n\n

This is confusing.

Summary: Copying \r\n inside <pre> converts to \r\n\n → Copying \r\n inside <pre> converts to \r\r\n

This is confusing: EncodeForTextUnicode() tries to encode the text with the MIME type text/html instead of text/plain. The comment says:

  // note that we assign text/unicode as mime type, but in fact
  // nsHTMLCopyEncoder ignore it and use text/html or text/plain depending where
  // the selection is. if it is a selection into input/textarea element or in a
  // html content with pre-wrap style : text/plain. Otherwise text/html. see
  // nsHTMLCopyEncoder::SetSelection

I'm not sure why the copy would ever be text/html even when it's not pre-wrap style. The comment is wrong anyway: the type only becomes text/plain for <input> and <textarea>, not for pre-wrap style. That means copying <pre> actually uses text/html.

The second confusing part is that EncodeForTextUnicode() tries encoding in raw mode first (which does not affect newlines) and then retries in non-raw mode, which causes the bug by incorrectly replacing \n with \r\n, turning \r\n into \r\r\n. Why try twice?
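
The failure mode described above can be reproduced outside Gecko. The following is a minimal sketch in Python, not the actual serializer code. `naive_to_crlf` and `normalized_to_crlf` are hypothetical names, and the sketch assumes the non-raw pass replaces every \n with the platform line break without checking for a preceding \r.

```python
def naive_to_crlf(text: str) -> str:
    """Replace every \\n with \\r\\n, ignoring any \\r that already precedes it."""
    return text.replace("\n", "\r\n")

def normalized_to_crlf(text: str) -> str:
    """Collapse \\r\\n and lone \\r to \\n first, then expand to \\r\\n once."""
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    return unified.replace("\n", "\r\n")

source = "line 1\r\nline 2\r\n"
print(repr(naive_to_crlf(source)))       # 'line 1\r\r\nline 2\r\r\n'
print(repr(normalized_to_crlf(source)))  # 'line 1\r\nline 2\r\n'
```

An editor that treats both a lone \r and \r\n as line breaks would likely render \r\r\n as two breaks, which is consistent with the doubled newlines in the report.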

Emilio, do you happen to have some idea about this, as one of the relevant functions has your FIXME comment?

Flags: needinfo?(emilio)

Not really. That FIXME is about assuming that the whole selection is in the first selection range, which is not true for table selection, for example.

Flags: needinfo?(emilio)

Oops, okay. I'll keep pinging semi-randomly based on commit history...

Mirko, it seems you refactored some of the code. Do you have any idea about the MIME type issue?

Flags: needinfo?(mbrodesser)

(In reply to Kagami :saschanaz from comment #2)

> This is confusing: EncodeForTextUnicode() tries to encode the text with the MIME type text/html instead of text/plain. The comment says:
>
>   // note that we assign text/unicode as mime type, but in fact
>   // nsHTMLCopyEncoder ignore it and use text/html or text/plain depending where
>   // the selection is. if it is a selection into input/textarea element or in a
>   // html content with pre-wrap style : text/plain. Otherwise text/html. see
>   // nsHTMLCopyEncoder::SetSelection
>
> I'm not sure why the copy would ever be text/html even when it's not pre-wrap style. The comment is wrong anyway: the type only becomes text/plain for <input> and <textarea>, not for pre-wrap style.

I'm not sure how relevant it is, but in another case text/plain is set too.

> That means copying <pre> actually uses text/html.
>
> The second confusing part is that EncodeForTextUnicode() tries encoding in raw mode first (which does not affect newlines) and then retries in non-raw mode, which causes the bug by incorrectly replacing \n with \r\n, turning \r\n into \r\r\n. Why try twice?

I'm unfamiliar with this detail; it needs investigation.

> Emilio, do you happen to have some idea about this, as one of the relevant functions has your FIXME comment?

Flags: needinfo?(mbrodesser)
Severity: -- → S3

Maybe this behavior is also related to this bug.

Steps to reproduce

  • Go to an IEEE Xplore page and open any research paper. For example: https://ieeexplore.ieee.org/document/7919538
  • Click the Cite This button
  • Select the BibTeX option
  • Copy the BibTeX content using mouse and keyboard, that is, select the text with the mouse and then press Ctrl+C
  • Paste it into any text editor

Actual output

Every sentence ends with an extra newline.

Expected output

There shouldn't be any extra newline after each sentence.

Additional

As in the 152844#c4 example, I cannot reproduce this bug in HTML-based editors like LibreOffice Writer; the problem seems to be with text serialization.

No immediate plan to work on this, please feel free to take it.

Assignee: krosylight → nobody