Closed Bug 723163 Opened 8 years ago Closed 7 years ago

about:crashes should allow easy cut & paste

Categories

(Core :: Serializers, defect)

2.0 Branch
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla23
Tracking Status
firefox23 --- disabled

People

(Reporter: sfink, Assigned: adw)

References

(Blocks 2 open bugs)

Details

(Keywords: regression)

Attachments

(3 files)

A cut & paste of about:crashes produces smushed-together text. This is annoying for my typical friends & family support workflow, where I ask them to send me their about:crashes so I can look them up.

For my mother, I ended up running

  perl -lne 'print join(" ", $1, $2, $3) while /(bp-.{36}|.{36})(.*?201[12])(.*?[AP]M)/g' ~/src/MomCrashes.txt

Please don't make me do that again. :)

Example snippet from an about:crashes cut & paste:

bp-9baab16d-1c89-42ca-9f2c-1ab7721201311/31/20121:39 PMbp-9772a04f-8b0e-4d23-8825-5536b21201251/24/20124:13 PMbp-d3250191-1cf6-4401-82ba-f63e521201241/24/20121:34 PMbp-587a49c8-621d-4df5-987f-560be21201191/19/201211:24 AMbp-7eaa1586-dbc8-4101-86fa-d98b521201121/12/20125:10 PMbdce0f3a-59d8-4326-aee3-a0c27591c39f12/24/20119:37 PM34ff8b37-7903-49f1-88f4-d0196b6543fa12/24/20119:37 PM01d5e1a2-9ab8-4284-a825-3167f22aa7ac12/10/201111:28 AM69a0...
Duplicate of this bug: 723161
Product: Core → Firefox
QA Contact: general → general
Version: 12 Branch → Trunk
feel free to correct me, but I could swear this is a regression - it didn't always get mashed.
although I don't know if this is a breakpad fault or selection fault
Component: General → Breakpad Integration
Keywords: regression
Product: Firefox → Toolkit
If you view selection source on it, you can see that it basically just looks like:
<tr><td><a href="http://crash-stats.mozilla.com/report/index/bp-..." id="bp-bcffd34b-2864-4c73-91c7-dca2f2120731">bp-...</a></td><td>7/31/2012</td><td>11:44 AM</td></tr>

Should be easy enough to write a reduced testcase and see if the clipboard code has changed its handling of this. Looking at the changelog for crashes.js, I'm fairly confident that the code for creating this UI hasn't changed. If there's something we could change to make it copy and paste better, we could certainly do that.
Yeah, this is a regression. I just tried in Firefox 3.6.0, and I get a listing with line breaks.
Component: Breakpad Integration → DOM
Product: Toolkit → Core
(Not really sure exactly where this bug belongs, but it's broken on both Windows and OS X)
Attached file Smaller testcase
Saved and chopped down a version of about:crashes. Cut'n'pastes with multiple lines using Firefox 3.6, not Nightly.
Regression window
Good:
http://hg.mozilla.org/mozilla-central/rev/07e74f1f0561
Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.3a1pre) Gecko/20091109 Minefield/3.7a1pre ID:20091109043812
Bad:
http://hg.mozilla.org/mozilla-central/rev/3776e2ddf1a3
Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.3a1pre) Gecko/20091109 Minefield/3.7a1pre ID:20091109103043
Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=07e74f1f0561&tochange=3776e2ddf1a3

Suspected: 
4ae32b827a7f	Laurent Jouanneau — Bug 524975 - Copy-Paste of simple HTML adds extra space not present in original content. r=Mats Palmgren sr=Olli Pettay
Blocks: 524975
Version: Trunk → 2.0 Branch
If this fixed, it will be convenient to copy and paste of selected paragraph in Troubleshooting Information.
(Now, copy and paste resultant is one line.)
Mats, Olli - ping? If bug 524975 regressed this, did it break other copy/paste cases? Even just for the about:crashes case it's a nuisance.
This is a giant pain for those times when you really need to get the info copied into another medium. Is it hard to fix?

/be
I'll look at this soon.
Assignee: nobody → bugs
Attached patch patchSplinter Review
Currently a few concepts are tangled up.  This untangles them into:

* Selections inside text widgets should always be non-pretty-printed text/plain
* All other selections should get pretty-printed text/plain (including XHTML)
* Only HTML should additionally get text/html (since nsHTMLCopyEncoder doesn't really support XHTML)

Related or duplicate:

* bug 726605
* bug 572543
* bug 515464
* bug 270145
Attachment #727041 - Flags: review?(bugs)
Attachment #727041 - Flags: review?(bugs) → review?(hsivonen)
Assignee: bugs → adw
(In reply to Drew Willcoxon :adw from comment #12)
> * Only HTML should additionally get text/html (since nsHTMLCopyEncoder
> doesn't really support XHTML)

This doesn't seem right. The user shouldn't have to see different clipboard behavior depending on how the doc was sent over the network. In most normal cases, the text/html serializer can convert XHTML to HTML just fine. The cases where it can't could in theory arise is scripted text/html DOMs, too.
Comment on attachment 727041 [details] [diff] [review]
patch

>   // note that we assign text/unicode as mime type, but in fact nsHTMLCopyEncoder
>   // ignore it and use text/html or text/plain depending where the selection
>   // is. if it is a selection into input/textarea element or in a html content
>   // with pre-wrap style : text/plain. Otherwise text/html.
>   // see nsHTMLCopyEncoder::SetSelection

That's really sad API design, but I understand it's pre-existing sadness.

>+  // First, prepare the text/plain clipboard flavor.
>+  nsAutoString textPlainBuf;
>+  if (selForcedTextPlain) {
>+    // Nothing to do.  buf contains the final, preformatted, raw text/plain.
>+    textPlainBuf.Assign(buf);

Just to make sure: Is this the bit that actually fixes the reported bug?

>+  // also consider ourselves in a text widget if we can't find an html document
>+  nsCOMPtr<nsIHTMLDocument> htmlDoc = do_QueryInterface(mDocument);
>+  if (!(htmlDoc && mDocument->IsHTML())) {
>+    mIsTextWidget = true;

I think this bit is wrong for the user experience as opined in my earlier comment.

I think at minimum, the mDocument-IsHTML() bit should be removed. Better yet if instead of checking that the doc QIs to nsIHTMLDocument this code checked that the element at the start of the selection is in the HTML namespace. That way, clipboard behavior would depend on what's in the DOM and not on what the input to the parser was when the DOM tree was initially built.
(In reply to Henri Sivonen (:hsivonen) from comment #14)
> That's really sad API design, but I understand it's pre-existing sadness.

It looks like the only callers, other than nsCopySupport.cpp, are nsPlaintextEditor.cpp and nsSelection.cpp.  They each call it once.

> 
> >+  // First, prepare the text/plain clipboard flavor.
> >+  nsAutoString textPlainBuf;
> >+  if (selForcedTextPlain) {
> >+    // Nothing to do.  buf contains the final, preformatted, raw text/plain.
> >+    textPlainBuf.Assign(buf);
> 
> Just to make sure: Is this the bit that actually fixes the reported bug?

Not this part.  The fix is doing the pretty-printed text/plain encoding in all other cases, in the else branch when !selForcedTextPlain.  (selForcedTextPlain == true means the selection is inside a "text widget" like a textarea, and the text/plain encoding should therefore not be pretty-printed.)

> >+  // also consider ourselves in a text widget if we can't find an html document
> >+  nsCOMPtr<nsIHTMLDocument> htmlDoc = do_QueryInterface(mDocument);
> >+  if (!(htmlDoc && mDocument->IsHTML())) {
> >+    mIsTextWidget = true;
> 
> I think this bit is wrong for the user experience as opined in my earlier
> comment.

Just to be clear, I didn't change this part.  bz added a similar check in nsCopySupport::IsPlainTextContext, which this patch removes, in bug 270145 with this comment:

> // also consider ourselves in a text widget if we can't find an html
> // document. Note that XHTML is not counted as HTML here, because we can't
> // copy it properly (all the copy code for non-plaintext assumes using HTML
> // serializers and parsers is OK, and those mess up XHTML).

I don't know much about this, but at one point I did try removing the IsHTML term from the conditional.  The problem was XHTML documents that contained CDATA: when copied and pasted into an app that supports text/html on the clipboard, the CDATAs' closing "]]>" were visible in the resulting paste.
Comment on attachment 727041 [details] [diff] [review]
patch

OK, r+, but please file a follow-up about making the HTML serializer treat CDATA nodes like any other text nodes and removing the HTMLness check.
Attachment #727041 - Flags: review?(hsivonen) → review+
Well, about:memory has a few tests that copy text and check the resulting text/plain encoding, and they fail with the patch.

about:memory inserts text nodes with line feeds in certain places to make its text look nice when you paste it.  The reason that works now is because about:memory is XHTML, and XHTML is not pretty printed, so the line feeds get encoded.

But with the patch, about:memory (and all other XHTML documents) gets pretty-printed, so the extra line feeds are ignored in the encoding.  Although this new pretty-printed encoding automatically inserts line feeds to make the text look pretty, they don't exactly match the strategically placed line feeds in the old, raw encoding.  In fact in this case, the pretty-printed encoding does not look as nice as the raw encoding.

Thoughts?  Possible solutions:

* Remove the about:memory tests and live with the fact that the pretty-printed encoding does not look as nice as the raw encoding with the strategically placed line feeds.

* Forget this patch and modify all the chrome XHTML pages (about:crashes, about:support, etc.) to use strategically placed whitespace like about:memory does.

* Modify the pretty printer, either to be more pretty so that it matches about:memory's raw encoding, or so that it doesn't ignore whitespace, if that even makes sense.
(In reply to Drew Willcoxon :adw from comment #17)
> * Remove the about:memory tests and live with the fact that the
> pretty-printed encoding does not look as nice as the raw encoding with the
> strategically placed line feeds.

This seems like the best option. But can we avoid impact in the common case with an about:memory-specific hack to override the copy command and set custom content on the clipboard, somehow?
Probably, wouldn't a copy event listener work?  But I'd prefer to avoid that, since it would be new code we'd have to write and maintain, and it would set a precedent for other chrome XHTML pages.

To show what we're talking about, I've pasted the expected and actual encodings below.  "Main Process" and "2nd" are H1; "User Compartments", "System Compartments", and "Ghost Windows" are H2; and everything else is PRE.  In the expected encoding, the H1s stand alone, and the H2s abut the PREs.

Expected (raw without the patch):

> Main Process
> 
> User Compartments
> http://foo.com/
> https://bar.com/bar?baz
> https://very-long-url.com/very-long/oh-so-long/really-quite-long.html?a=2&b=3&c=4&d=5&e=abcdefghijklmnopqrstuvwxyz&f=123456789123456789123456789
> 
> System Compartments
> [System Principal] [3]
> atoms
> moz-nullprincipal:{7ddefdaf-34f1-473f-9b03-50a4568ccb06}
> 
> Ghost Windows
> http://foobar.com/foo?bar#baz
> https://very-long-url.com/very-long/oh-so-long/really-quite-long.html?a=2&b=3&c=4&d=5&e=abcdefghijklmnopqrstuvwxyz&f=123456789123456789123456789
> 
> 2nd
> 
> User Compartments
> child-user-compartment
> 
> System Compartments
> child-system-compartment
> 
> Ghost Windows

Actual (pretty printed with the patch):

> Main Process
> User Compartments
> 
> http://foo.com/
> https://bar.com/bar?baz
> https://very-long-url.com/very-long/oh-so-long/really-quite-long.html?a=2&b=3&c=4&d=5&e=abcdefghijklmnopqrstuvwxyz&f=123456789123456789123456789
> 
> System Compartments
> 
> [System Principal] [3]
> atoms
> moz-nullprincipal:{7ddefdaf-34f1-473f-9b03-50a4568ccb06}
> 
> Ghost Windows
> 
> http://foobar.com/foo?bar#baz
> https://very-long-url.com/very-long/oh-so-long/really-quite-long.html?a=2&b=3&c=4&d=5&e=abcdefghijklmnopqrstuvwxyz&f=123456789123456789123456789
> 
> 2nd
> User Compartments
> 
> child-user-compartment
> 
> System Compartments
> 
> child-system-compartment
> 
> Ghost Windows
I think about:memory has special format constraints because of its "Read report from clipboard" functionality. We wouldn't want to break that. Adding some custom code to ensure the right format doesn't seem like that much of a burden to me.
Oh, I didn't know that.  OK, I'll go forward with the plan of landing the patch as-is plus custom copy handling for about:memory.  I'll post a patch for the latter here.
Though I see now that that functionality actually accepts JSON, so that point isn't relevant. Maybe the changes in clipboard formatting are not worth worrying about - I guess you should just ask njn for feedback.
njn, this patch changes how XHTML pages like about:memory are copied to the clipboard.  The result is that the line feeds in about:memory's text/plain clipboard encoding are slightly different from what they currently are (and as a result some of its tests fail).  Comment 19 shows the difference, and comment 17 explains in more detail.

Do you think the about:memory encoding produced by this patch is OK (and the relevant tests should therefore be removed or at least updated), or should it not be changed?  Any other thoughts?
Flags: needinfo?(n.nethercote)
Is it possible to change the about:memory implementation so as to preserve the old linebreaks?

Gavin is right that things won't catastrophically break if we change the plain-text formatting, but on the other hand I kind of like the formatting we have.
It's probably possible, for example by using a copy event listener and then modifying the clipboard on copy.  (Maybe there's a better way, but that's what comes to mind now.)
Could we simply change the markup?  For example, if <h1> is significant, we could use <div class=h1>?
That might work, I'll try.  But if we want a specific format and it's no longer possible to finely tune what the encoder spits out (due to pretty printing), IMO it's better to intercept and modify the copy than to rely on how the encoder happens to work or to bend the markup to manipulate it.
Do you not-like the formatting the patch produces? :) As Drew says it's not that different.
about:memory uses JSON for its imports, but nice copy+paste is still important for copying small sections into bug reports, emails, etc.  It's critical that the formatting of the trees in about:memory is preserved.

The exact whitespace surrounding the headings is less important, but I did spend some effort getting it looking nice, by using careful placement of newlines within the text nodes.  If you search for "paste" in toolkit/components/aboutmemory/content/aboutMemory.js you'll see several places where I did that.  If you can tweak them to preserve the existing formatting that would be ideal.
Flags: needinfo?(n.nethercote)
Well, preserving the exact same line feeds around the H1s and H2s is not easy.  (Formatting in the PRE trees is preserved.)  An empty P after H1 results in an empty line after the H1, so that works.  But PRE sections always get a blank line above them [1], and there's no way to undo that with markup.

Using a copy event listener to modify the text on copy really isn't practical either.  There's no text/html flavor.  text/plain is all there is, so you have to look at this plain text and decide how to change the line feeds, but the text could be anything on the page.  Maybe you could come up with some heuristics, but I don't think it's worth the effort.

An out-there idea would be to look for some attribute on the body element or somewhere, and if it's present, use raw encoding instead of pretty-printing.

So I think the choices are (and I think B is better):

A) Forget this patch and modify all the chrome XHTML pages (about:crashes, about:support (bug 726605), etc.) to use strategically placed whitespace like about:memory does.  But, about:memory's H1s and H2s continue to look exactly like they do now when copied and pasted.

B) Modify or remove the about:memory tests and live with the fact that the pretty-printed encoding looks slightly less nice than the raw encoding with the strategically placed line feeds.  But, all XHTML documents get some level of pretty printing when copied and pasted.

[1] http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsPlainTextSerializer.cpp#526
We're not married to the about:crashes markup, FWIW, it's pretty arbitrary.
I'll probably land this as-is with updated about:memory tests when I get some time next week.  Any objections?
I'd like to see the changes to the about:memory tests before landing.  Thanks!
(In reply to Drew Willcoxon :adw from comment #30)
> But, all XHTML documents get some
> level of pretty printing when copied and pasted.

As noted earlier, I think it's bogus that XHTMLness affects what gets exported to clipboard. I think we should export an HTML serialization of the selection to clipboard even if the DOM was built by the XML parser. The user shouldn't have to see any difference based on what parser built the DOM.

I really hope we aren't baking the assumption that XHTMLness affects the clipboard deeper into test cases here.
Attached patch tests patchSplinter Review
https://tbpl.mozilla.org/?tree=Try&rev=ede26651ea79

Henri, I agree.  I'll file that follow-up bug.  As you can see, the changes to the test cases are merely whitespace changes in the expected encodings.

(In reply to Henri Sivonen (:hsivonen) from comment #34)
> I really hope we aren't baking the assumption that XHTMLness affects the
> clipboard deeper into test cases here.

I think there are a couple of ways to look at that.

The assumption you mention is already baked into these tests, without these patches.  They assume that any extra whitespace in the markup is present in the text/plain encoding.  Right now, that is a property of encodings generated from XHTML.  HTML is encoded differently, and its encodings do not have that property.  Since these patches make the generation of HTML and XHTML text/plain encodings the same, you could say that they remove the assumption you mention.

On the other hand, the point of these tests is to check that the encodings match certain expected strings.  Therefore, the test must make many assumptions about what the encodings look like, specifically about what encodings generated from XHTML look like, since about:memory is XHTML.
Attachment #732525 - Flags: review?(n.nethercote)
Comment on attachment 732525 [details] [diff] [review]
tests patch

Review of attachment 732525 [details] [diff] [review]:
-----------------------------------------------------------------

Uglier, but it'll do.
Attachment #732525 - Flags: review?(n.nethercote) → review+
https://hg.mozilla.org/mozilla-central/rev/a215de599e7f
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
Blocks: 515464
(In reply to Henri Sivonen (:hsivonen) from comment #16)
> OK, r+, but please file a follow-up about making the HTML serializer treat
> CDATA nodes like any other text nodes and removing the HTMLness check.

bug 857915
Blocks: 726605
Component: DOM → Serializers
Blocks: 888839
No longer blocks: 888839
Depends on: 888839
Since the fix for this bug was backed-out per bug #897729, should not this bug report be reopened?
(In reply to David E. Ross from comment #41)
> Since the fix for this bug was backed-out per bug #897729, should not this
> bug report be reopened?
No. It's still fixed in the trunk and Aurora.
In which end-user release is this bug fixed?
(In reply to David E. Ross from comment #43)
> In which end-user release is this bug fixed?

24.
You need to log in before you can comment on or make changes to this bug.