Separate table cells with tabs rather than spaces when converted to text

RESOLVED FIXED in mozilla1.3beta

Status

()

Core
Serializers
RESOLVED FIXED
16 years ago
13 years ago

People

(Reporter: John Woodward, Assigned: Daniel Bratell)

Tracking

Trunk
mozilla1.3beta
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(1 attachment, 2 obsolete attachments)

(Reporter)

Description

16 years ago
User-Agent:       Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-US; rv:1.2a) Gecko/20020910
Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-US; rv:1.2a) Gecko/20020910

Table entries, when copied to the clipboard, are separated by a single space.
Therefore, you cannot differentiate between different table entries and table
entries that contain blanks themselves. In other words, if a table with two
entries, "TableEntryWithNoBlank" and "Table Entry With Blanks" is copied into
the clipboard, it will appear as "TableEntryWithNoBlank Table Entry With
Blanks".  If a tab, or even multiple blanks were used between entries, then you
could distinguish between table entries when looking at the clipboard.

Reproducible: Always

Steps to Reproduce:
1. Visit the URL above.
2. Select all of the text.
3. Copy the selected text into the clipboard.


Actual Results:  
The two table entries appear in the clipboard separated by a single blank.

Expected Results:  
The two table entries should be seperated by a tab character or by multiple
blanks, to distinguish the table entris.

This is a big issue for me because I have a program that processes Brokerage
Transactions copied into the clipboard from transaction history websites, and
converts them into a form that can be imported into Quicken.  This method worked
in Netscape 4.x, and all versions of Internet Explorer, but cannot work in
Mozilla because the various table entires produced in transaction histories
cannot be distinguished because the individual table entires (columns) in the
output can contain blanks.

Comment 1

16 years ago
Confirmed using FizzillaCFM/2002100308 on 10.1.5. Revising summary. See also bug
18012 and bug 51308 (this bug might depend on the latter).
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Tables entries cannot be distinguished in the clipboard because they are separated by a single space, which could also appear in a table entry. → Separate table cells with tabs rather than spaces when converted to text
(Assignee)

Comment 2

16 years ago
I think I agree. TABs are more useful. This is line ~650 in nsPlainTextSerializer:

      // Maybe add something else? Several spaces? A TAB? SPACE+TAB?
      AddToLine(kSpace.get(), 1);
      mInWhitespace = PR_TRUE;

It would be trivial to change and it would be good doing it during the beta
stage rather than later. I'll attach a patch.
OS: MacOS X → All
Hardware: Macintosh → All
(Assignee)

Comment 3

16 years ago
Created attachment 109280 [details] [diff] [review]
Output a tab rather than a space between table cells
(Assignee)

Updated

16 years ago
Blocks: 137450
(Assignee)

Comment 4

16 years ago
Comment on attachment 109280 [details] [diff] [review]
Output a tab rather than a space between table cells

NB: The "solutions"  to the DOM -> text conversion tests will have to be
updated before this is checked in if we don't want to break tinderbox.
(Assignee)

Comment 5

16 years ago
Created attachment 109316 [details] [diff] [review]
A complete patch that is tested and also cleans up a little

There were some more things that had to be done to always get a TAB and to
handle nested tables. To simplify code that were getting somewhat repetetive I
added some methods for handling stacks of PRBools, which we after this patch
will have three of. This also fixes a couple of strict compiler warnings and
removes an unused variable that I added the other day.
(Assignee)

Updated

16 years ago
Attachment #109280 - Attachment is obsolete: true
(Assignee)

Comment 6

16 years ago
Stealing bug. I hope nobody minds. 

Just one thing, this patch adds both a SPACE and a TAB between table cells. I
could easily be talked into just emitting a TAB.
Assignee: harishd → bratell
Target Milestone: --- → mozilla1.3beta
(Assignee)

Updated

16 years ago
Attachment #109316 - Flags: superreview?(jst)
Attachment #109316 - Flags: review?(harishd)
(Assignee)

Comment 7

16 years ago
Created attachment 109342 [details] [diff] [review]
Better

While waiting for review I couldn't stop myself from improving the patch. This
fixes so that the space is compressed away if possible and includes comment
changes, a fix to PopBool and the required test fix.
Attachment #109316 - Attachment is obsolete: true
(Assignee)

Updated

16 years ago
Attachment #109342 - Flags: superreview?(jst)
Attachment #109342 - Flags: review?(harishd)
(Assignee)

Updated

16 years ago
Attachment #109316 - Flags: superreview?(jst)
Attachment #109316 - Flags: review?(harishd)
Comment on attachment 109342 [details] [diff] [review]
Better

- I don't think having the method PopNoReturn() here is justified, it's called
in one place, and I really doubt the speed overhead is even measurable in this
code, and this only *adds* code, even if the callsite is smaller due to the
void return. I'd say take this out and just call PopBool() and ignore the
result.

- In nsPlainTextSerializer::GetLastBool():

+  PRBool returnValue = PR_FALSE;
+  PRUint32 size = aStack.Count();
+  if (size > 0) {
+    returnValue = NS_REINTERPRET_CAST(PRBool, aStack.ElementAt(size-1));
+  }
+  return returnValue;

How about:

+  PRUint32 size = aStack.Count();
+  if (size <= 0) {
+    return PR_FALSE;
+  }
+  return NS_REINTERPRET_CAST(PRBool, aStack.ElementAt(size-1));

Less code, and faster too (not that it matters here either, but still), and
deal with the normal execution flow at end of the function, returning early in
case of an "error".

Same thing goes for PopBool().

As for the whole tab and space thing, do we really need the space? I can see
that breaking more than it would fix. What software that knows how to import
table-like data can *not* deal with tabs? Beats me. Didn't 4x do tabs too? What
about IE? My vote is to do tabs only...

Other than that, sr=jst
Attachment #109342 - Flags: superreview?(jst) → superreview+

Comment 9

16 years ago
Comment on attachment 109342 [details] [diff] [review]
Better

>Index: content/base/src/nsPlainTextSerializer.cpp
>===================================================================
>-      nsAutoString temp(entity);
>+      nsAutoString temp; temp.Append(PRUnichar(entity));

Break this into two lines

r=harishd
Attachment #109342 - Flags: review?(harishd) → review+
(Assignee)

Comment 10

16 years ago
MSIE 6 doesn't seem to use TAB, but it has clearly whitespace problems of its own. 

The reason MSIE works better in some cases is probably because copy/paste
importers use the HTML flavor and our HTML is broken when copying a part of a table.

I'll drop the space. We will see if this causes any problems in the real world.
Status: NEW → ASSIGNED
(Assignee)

Comment 11

16 years ago
Checked in. We'll have to see if the TAB causes any problems.
Status: ASSIGNED → RESOLVED
Last Resolved: 16 years ago
Resolution: --- → FIXED

Comment 12

13 years ago
*** Bug 181937 has been marked as a duplicate of this bug. ***
You need to log in before you can comment on or make changes to this bug.