Closed Bug 731896 Opened 12 years ago Closed 12 years ago

text copied from html tables with ctrl+select not to be separated by tabs anymore.

Categories

(Core :: DOM: Serializers, defect)

13 Branch
x86
All
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla14
Tracking Status
firefox13 + verified

People

(Reporter: alice0775, Assigned: hsivonen)

References

()

Details

(Keywords: regression, Whiteboard: [qa+])

Attachments

(1 file, 4 obsolete files)

Build Identifier:
http://hg.mozilla.org/mozilla-central/rev/30b4f99a137c
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20120229 Firefox/13.0a1 ID:20120229031108

@Rodze reports a bug in http://forums.mozillazine.org/viewtopic.php?p=11783897#p11783897

text copied from html tables with ctrl+select not to be separated by tabs anymore.

Reproducible: Always

1. Start Firefox with new profile
2. Open URL
3. Hold Ctrl and select any cells of the table
4. Copy and paste it anywhere

Actual Results:
  everything is together

Expected Results:
  columns separated by tabs and rows by new lines.



Regression window(m-c)
Works:
http://hg.mozilla.org/mozilla-central/rev/08c0c1440171
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20120228 Firefox/13.0a1 ID:20120228070559
Fails:
http://hg.mozilla.org/mozilla-central/rev/dde4e0089a18
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20120228 Firefox/13.0a1 ID:20120228092158
Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=08c0c1440171&tochange=dde4e0089a18

Regression window(m-i)
Works:
http://hg.mozilla.org/integration/mozilla-inbound/rev/fdd55a46661c
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20120226 Firefox/13.0a1 ID:20120227033049
Fails:
http://hg.mozilla.org/integration/mozilla-inbound/rev/8ea9dc2f8570
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20120227 Firefox/13.0a1 ID:20120227035949
Pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=fdd55a46661c&tochange=8ea9dc2f8570

Suspected : Bug 650784
So the problem is this:

The change landed in bug 650784 changed clipboard export to use a mechanism that parses an HTML string as if it was a full HTML document and then serializes its body.

This works for full documents, because everyone expects the plain text serialization to omit <title> text which is outside <body>.

This works for most fragments, because almost all the time the fragment context is a body-like context and the parser happily implies <html><head></head><body> which the serializer ignores.

The problem is that when a <tr> or <td> is seen without a <table> first (i.e. the implied fragment context isn't body-like), the HTML parser that runs in the full doc mode discards those tokens.

AFAICT, the ways forward are:
 1) Making plain text clipboard export happen directly by showing the DOM range to the plain text serializer without going through an intermediate HTML string.
or
 2) Implementing the kind of DWIM insertion mode in the HTML tree builder that chooses the first real insertion mode upon seeing the first start tag token.

#2 has already been proposed for other purposes but #1 seems like a more sensible fix considering that we are trying to generate plain text from a DOM.

bz, smaug: Opinions?
Nevermind. Solution #1 seems rather easy and the right thing to do.
Assignee: nobody → hsivonen
Attached patch First attempt at a fix (obsolete) — Splinter Review
Ehsan, the test case for bug 551704 has expectations about <br> at the end of selection and whether the plain text clipboard flavor ends in a line break. How important is it to keep the exact behavior there? Especially considering that the test case expectations for trailing line break in the HTML clipboard flavor and plain text clipboard flavor are different.
Assignee: hsivonen → nobody
Component: Serializers → Security: UI
QA Contact: dom-to-text → ui
Target Milestone: --- → mozilla13
Adding Ehsan to CC for real. Ehsan, see comment 4.
Status: NEW → ASSIGNED
Component: Security: UI → Serializers
QA Contact: ui → dom-to-text
Target Milestone: mozilla13 → ---
(In reply to Henri Sivonen (:hsivonen) from comment #4)
> Ehsan, the test case for bug 551704 has expectations about <br> at the end
> of selection and whether the plain text clipboard flavor ends in a line
> break. How important is it to keep the exact behavior there? Especially
> considering that the test case expectations for trailing line break in the
> HTML clipboard flavor and plain text clipboard flavor are different.

It is important (see bug 551704 comment 12).  Why do you want to change this behavior?  Note that the expected HTML is compared against innerHTML and the expected text is compared against what gets copied on the clipboard.
(In reply to Ehsan Akhgari [:ehsan] from comment #6)
> It is important (see bug 551704 comment 12).

That comment doesn't really explain why.

> Why do you want to change this behavior?

Due to reasons mentioned in comment 1, I'm changing text plain clipboard export to do a direct DOM to plain text serialization instead of DOM to HTML to plain text. I hoisted the check for invisible BRs from the HTML serializer to nsDocumentEncoder. Still with this change, for reasons I don't know, I get different results for the trailing line break in plain text compared to what the test case for bug 551704. For some cases tested, the trailing line break gets zapped. For some cases it doesn't get zapped.

> Note that the expected HTML is compared against innerHTML and the
> expected text is compared against what gets copied on the clipboard.

Good point.
Can you please attach both test cases here?
Attached patch WIP part 1 (obsolete) — Splinter Review
Assignee: nobody → hsivonen
Attachment #602885 - Attachment is obsolete: true
Attached patch WIP part 2 (obsolete) — Splinter Review
(In reply to Ehsan Akhgari [:ehsan] from comment #8)
> Can you please attach both test cases here?

Both? The test case for bug 551704 is on m-c. These patches show the problem when applied.
OK I'm kind of lost here.  Here's what currently happens with the rationale for it:

If you have an invisible BR at the end of the childlist of a node, innerHTML on its parent returns the <br> at the end, since the serialization algorithm <http://www.whatwg.org/specs/web-apps/current-work/multipage/the-end.html#html-fragment-serialization-algorithm> does not let us strip it out.  However, for copy and paste, we should strive to paste what the user sees on the screen.  If the user pastes into a plaintext field in an app, and sees the trailing new line, that would be puzzling as the newline was invisible in the source, and it would look like we're adding a linebreak upon paste.

Does this make sense?
(In reply to Ehsan Akhgari [:ehsan] from comment #12)
> Does this make sense?

Yes, but any ideas why my patches don't end up removing the trailing break consistently?

Nominating as tracking, since this is a regression in the Firefox 13 cycle. We should figure out a fix while 13 is still in Aurora.
Is bug 650784 critical enough for 13's release that we'd consider taking a forward fix rather than backing it out now?
(In reply to Alex Keybl [:akeybl] from comment #14)
> Is bug 650784 critical enough for 13's release that we'd consider taking a
> forward fix rather than backing it out now?

It's not critical for 13 release, but it's annoying to back out, because it requires first backing out stuff from Firefox, then from Thunderbird, then some more stuff from Firefox.

Also, this follow-up is ready except for trailing line breaks in the plain text clipboard flavor in some special cases.
This should work... Let's see what the tryserver says.
Attachment #607173 - Attachment is obsolete: true
Attachment #607174 - Attachment is obsolete: true
Attachment #608284 - Flags: review?(bugs)
(In reply to Henri Sivonen (:hsivonen) from comment #13)
> (In reply to Ehsan Akhgari [:ehsan] from comment #12)
> > Does this make sense?
> 
> Yes, but any ideas why my patches don't end up removing the trailing break
> consistently?

Nothing jumps at me by just looking over the patch...  Let me know if you want me to debug it for you (I can do that, but I haven't done that so far to make sure that we're not duplicating each others' efforts.  :-)
(In reply to Ehsan Akhgari [:ehsan] from comment #17)
> (In reply to Henri Sivonen (:hsivonen) from comment #13)
> > (In reply to Ehsan Akhgari [:ehsan] from comment #12)
> > > Does this make sense?
> > 
> > Yes, but any ideas why my patches don't end up removing the trailing break
> > consistently?
> 
> Nothing jumps at me by just looking over the patch...  Let me know if you
> want me to debug it for you (I can do that, but I haven't done that so far
> to make sure that we're not duplicating each others' efforts.  :-)

Thanks, but no need anymore. The attached patch works.
So this regresses Bug 564737?
Comment on attachment 608284 [details] [diff] [review]
Export plain text to clipboard directly from the DOM without intermediate HTML

And that bug has lots of dups.
Attachment #608284 - Flags: review?(bugs) → review-
(In reply to Olli Pettay [:smaug] from comment #19)
> So this regresses Bug 564737?

How so? That's about mailnews format conversion. This patch decouples clipboard from HTML source to plain text conversion. Aren't the effects of that fix on the clipboard just collateral damage and not a feature?
This patch emulates the quirk that the leading space is dropped when the selection starts with a space and isn't in a text input, textarea or pre. I filed bug 739537 about removing the quirk.
Attachment #608284 - Attachment is obsolete: true
Attachment #609645 - Flags: review?(bugs)
Attachment #609645 - Flags: review?(bugs) → review+
Thanks. Landed:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b39493b4caa7
Flags: in-testsuite+
Target Milestone: --- → mozilla14
https://hg.mozilla.org/mozilla-central/rev/b39493b4caa7
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment on attachment 609645 [details] [diff] [review]
Export plain text to clipboard directly from the DOM without intermediate HTML, emulate old leading space quirk

[Approval Request Comment]
Regression caused by (bug #): Bug 650784

User impact if declined:

When copying a part of an HTML table, cell boundaries in Firefox 13 wouldn't be represented as tabs in the plain text clipboard flavor, which would have an adverse effect on pasting into spreadsheets that ingest the plain text clipboard flavor as opposed to ingesting the HTML flavor and on pasting into a text editor and running a find and replace for tabs to post-process the data.

Testing completed (on m-c, etc.):

Has baked on m-c for a couple of days without complaints reaching me.

Risk to taking this patch (and alternatives if risky):

This patch makes us export plain text to directly from the DOM when previously we first exported from DOM into HTML, then parsed the HTML and then generated plain text from the output of the HTML parser. A change like this could regress something that our unit tests don't cover. (Indeed, *this* bug happened in the first place, because we didn't have enough test coverage.) However, the plain text serializer is the same as the one previously used after the HTML parsing phase, so that mitigates the opportunity for differences.

The alternatives are shipping Firefox 13 with the regression or backing out changes that were made to HTML to plain text conversion. Backing those changes would be particularly annoying, because Thunderbird relies on the changes and then a further changes were made after the Thunderbird changes. Thus, it would be necessary to first back out stuff from mozilla-central, then to back out stuff from comm-central and then to back out even more stuff from mozilla-central.

I think taking this patch gives less opportunity for accidents than a multi-stage backout from two trees.

String changes made by this patch: None.
Attachment #609645 - Flags: approval-mozilla-aurora?
Comment on attachment 609645 [details] [diff] [review]
Export plain text to clipboard directly from the DOM without intermediate HTML, emulate old leading space quirk

[Triage Comment]
agreed this is better than a) shipping regression b) trying to detangle the html conversion as a backout.  I trust you'll report back once it's baked on Aurora with confirmation of lack of regressions from this.
Attachment #609645 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Thank you for the approval. Landed:
https://hg.mozilla.org/releases/mozilla-aurora/rev/b0fa8e8ffbc0

I've made a note for myself to report back on the Aurora baking situation on April 12th.
(In reply to Lukas Blakk [:lsblakk] from comment #26)
> I trust you'll report back once it's baked on
> Aurora with confirmation of lack of regressions from this.

No reports about regressions have reached me. Additionally, I searched for open and duplicate bugs (across all of b.m.o) filed since 2012-04-04 and having "copy", "paste" or "clipboard" in the summary and found no reports of regressions.
Whiteboard: [qa+]
Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20100101 Firefox/13.0

Verified on Ubuntu 12.04, Windows 7, Mac Os 10.6 in Firefox 13 beta 3. 
Using the URL: rows are separated by tab and columns by line when copying in a text editor.
(In reply to Virgil Dicu [:virgil] [QA] from comment #29)
> Using the URL: rows are separated by tab and columns by line when copying in
> a text editor.

The other way around actually: columns-tab, rows-line
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: