Closed Bug 228920 Opened 21 years ago Closed 18 years ago

Unable to Paste Single Cell Data From Excel into HTML Compose

Categories

(Core :: DOM: Editor, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla1.9alpha1

People

(Reporter: mscott, Assigned: mrbkap)

References

Details

Attachments

(1 file, 7 obsolete files)

2003/12/18 builds although it has been around since at least November when a
customer first reported it to me. Maybe this has never worked...

1. Select a single cell of arbitrary data (I used 'Scott' as my cell text).

2. Paste that cell into an HTML editor instance.

3. Note that the data does not show up.

However if you save the document and then reload it, the pasted data does show
up. Or if it's a message, I can send the message and the pasted data does show
up in the out going message.
Here's what the raw cfHTML data looks like for a single cell paste:

+	mStr	0x0372d718 "
  <td height=17 width=64 style='height:12.75pt;width:48pt'>Scott</td>
"

nsHTMLDataTransfer trys to force create table elements because the pasted data
is just a td element. And that's where the problem occurss.

If I select two cells and paste, the cfHTML data looks like:

mStr	0x04dd1cf0 "
 <col width=64 style='width:48pt'>
 <tr height=17 style='height:12.75pt'>
  <td height=17 width=64 style='height:12.75pt;width:48pt'>Scott</td>
 </tr>
 <tr height=17 style='height:12.75pt'>
  <td height=17 style='height:12.75pt'>John</"

and that paste works. So the fact that we start out at a col level instead of a
td level makes the paste go through
Last bit of SPAM

During the insert process for the single cell, three assertions come up:

The first assertion complains about Parser and editor disagree on blockness, the
node is a col.

nsHTMLEditor::NodeIsBlockStatic(nsIDOMNode * 0x04e44b44, int * 0x0012c1d8) line
647 + 33 bytes
nsHTMLEditor::IsBlockNode(nsIDOMNode * 0x04e44b44) line 731 + 13 bytes
nsHTMLEditor::NormalizeEOLInsertPosition(nsIDOMNode * 0x04e44b44,
nsCOMPtr<nsIDOMNode> * 0x0012c5a0, int * 0x0012c5c0) line 1946 + 18 bytes
nsHTMLEditor::InsertHTMLWithContext(nsHTMLEditor * const 0x0334c8b8, const
nsAString & {...}, const nsAString & {...}, const nsAString & {...}, const
nsAString & {...}, nsIDOMDocument * 0x00000000, nsIDOMNode * 0x00000000, int 0,
int 1) line 426
nsHTMLEditor::InsertFromTransferable(nsHTMLEditor * const 0x0334c7d0,
nsITransferable * 0x04e1f0c0, nsIDOMDocument * 0x00000000, const nsAString &
{...}, const nsAString & {...}, nsIDOMNode * 0x00000000, int 0, int 1) line 1029
+ 66 bytes


The next assertion happens in nsFrame because we are trying to add child nodes
to element that isn not a container:

nsFrame::SetInitialChildList(nsFrame * const 0x04e42908, nsIPresContext *
0x0223e530, nsIAtom * 0x00000000, nsIFrame * 0x04e429e4) line 564 + 32 bytes
nsCSSFrameConstructor::ConstructTableColFrame(nsIPresShell * 0x03348550,
nsIPresContext * 0x0223e530, nsFrameConstructorState & {...}, nsIContent *
0x04e443c0, nsIFrame * 0x04e427a8, nsStyleContext * 0x04e42870, nsTableCreator &
{...}, int 0, nsFrameItems & {...}, nsIFrame * & 0x04e42908, int & 1) line 2837
nsCSSFrameConstructor::TableProcessChild(nsIPresShell * 0x03348550,
nsIPresContext * 0x0223e530, nsFrameConstructorState & {...}, nsIContent *
0x04e443c0, nsIContent * 0x04e425f0, nsIFrame * 0x04e427a8, nsIAtom *
0x00f51eb0, nsStyleContext * 0x04e42678, nsTableCreator & {...}, nsFrameItems &
{...}, nsIFrame * & 0x00000000) line 3166 + 55 bytes

and the third assertion:

BCMapCellIterator::SetNewRowGroup(int 1) line 4896 + 67 bytes
BCMapCellIterator::First(BCMapCellInfo & {...}) line 4934
nsTableFrame::CalcBCBorders(nsIPresContext & {...}) line 5736 + 18 bytes
nsTableOuterFrame::InitChildReflowState(nsIPresContext & {...},
nsHTMLReflowState & {...}) line 467
nsTableOuterFrame::GetMarginPadding(nsIPresContext * 0x0223e530, const
nsHTMLReflowState & {...}, nsIFrame * 0x04e427a8, int 12480, nsMargin & {...},
nsMargin & {...}, nsMargin & {...}) line 496
nsTableOuterFrame::GetInnerTableAvailWidth(nsIPresContext * 0x0223e530, nsIFrame
* 0x04e427a8, const nsHTMLReflowState & {...}, int * 0x0012a248, nsMargin &
{...}, nsMargin & {...}) line 722
nsTableOuterFrame::Reflow(nsTableOuterFrame * const 0x04e426c8, nsIPresContext *
0x0223e530, nsHTMLReflowMetrics & {...}, const nsHTMLReflowState & {...},
unsigned int & 0) line 1997 + 38 bytes
Blocks: 227205
This could be a clue. When I paste it into composer, I can then view the HTML
source of the document. 

It shows the following for the single cell case. Note the table tag followed by
a col tag. then a tbody. 

<table x:str="" border="0" cellpadding="0" cellspacing="0" width="64"
style="border-collapse: collapse; width: 48pt;">
<col width="64" style="width: 48pt;"><col> <tbody>
<tr>
<td height="17" width="64" style="height: 12.75pt; width: 48pt;">Scott<br>
</td>
</tr>
</tbody><tr height="17" style="height: 12.75pt;">
</tr>
</table>

related: ?
Bug 228879 Bogus Text when pasting from open office spread sheet to Mail Compose
Blocks: 229740
Here is a summary of what gets inserted for the single cell case which does not
work. This is not the actual clipboard cfHTML, but the HTML for the content we
actually end up inserting into the document. Note that we insert an EXTRA <col>
tag after the one that was contained in the clipboard just before the tbody tag:

<table x:str="" border="0" cellpadding="0" cellspacing="0" width="64"
 style="border-collapse: collapse; width: 48pt;">
  <col width="64" style="width: 48pt;"><col> <tbody>
    <tr>
      <td height="17" width="64" style="height: 12.75pt; width: 48pt;">asdfsadf<br>
      </td>
    </tr>
  </tbody><tr height="17" style="height: 12.75pt;">
  </tr>

</table>

Now, here is the table we insert when two cells are selected from Excel. This
insertion works. Note that we don't generate that second empty <col> tag.

<table x:str="" border="0" cellpadding="0" cellspacing="0" width="64"
 style="border-collapse: collapse; width: 48pt;">
  <col width="64" style="width: 48pt;"> <tbody>
    <tr height="17" style="height: 12.75pt;">
      <td height="17" width="64" style="height: 12.75pt; width: 48pt;">asdfsadf</td>
    </tr>
    <tr height="17" style="height: 12.75pt;">
      <td height="17" style="height: 12.75pt;">asdfa</td>
    </tr>
  </tbody>
</table>
The problem is definetly this second COL element that the layout engine inserts
on its own. Why does it think we need this empty column element that does not
have a width and height on it?

For clarity, I wanted to post the RAW HTML that comes straight from the
clipboard for the failure and success case. Comment #5 shows the HTML that
results from the generated dom elements after this raw html is inserted into the
document.

Original context information for success case:

<table x:str border=0 cellpadding=0 cellspacing=0 width=64
style='border-collapse:collpse;width:48pt'>
<!--StartFragment--><!--EndFragment-->.
</table>

Original paste fragment for success case. Note how the col tag is part of the
original paste and not the surrounding context.

<col width=64 style='width:48pt'>
<tr height=17 style='height:12.75pt'> 
  <td height=17 width=64 style='height:12.75pt;width:48pt'>asdfsad</td>
</tr>
<tr height=17 style='height:12.75pt'>
  <td height=17 style='height:12.75pt'>asdfa</td> 
</tr>

Original context information for failure case. Note how the col tag is part of
the context and not the paste fragment

<table x:str border=0 cellpadding=0 cellspacing=0 width=64
style='border-collapse: collapse;width:48pt'> 
  <col width=64 style='width:48pt'>
    <tr height=17 style='height:12.75pt'>
      <!--StartFragment-->
      <!--EndFragment-->.. 
    </tr>
</table>

Original Paste fragment for failure case:

<td height=17 width=64 style='height:12.75pt;width:48pt'>asdfsadf</td>

I'll need help from the layout gurus now I think.
Assignee: mozeditor → mscott
Attached patch temporary work around (obsolete) — Splinter Review
I'm using this as a temporary work around. There is no reason to have the
thunderbird, but maybe it could be a win32 ifdef so it only gets triggered for
windows users.

From my debugging, I think this is going to be a hard layout/parser bug as to
why the context stacks are not getting built correctly, leading to the extra
col tag getting inserted. I will need help from layout folks to fix this as
I've gotten about as far as I can on my own :)
cc'ing bienvenu to keep him in the loop on this bug.
Status: NEW → ASSIGNED
Work around checked into the M4 branch. I would like to put it in the trunk to
as it is a high visiblity issue but there may be objections to the hack. 
This sounds like a bug that's either in editor or perhaps in htmlparser.  It
might be useful to put a breakpoint in
nsHTMLTableColElement::nsHTMLTableColElement to see where the second element is
being created.
If I do the same thing with Open Office 1.1 I get the following HTML source:
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="content-type">
<title></title>
</head>
<body>
<table border="1" cellspacing="0" cols="1" frame="void" rules="groups">
<colgroup><col width="107"></colgroup> <tbody>
<tr>
<td align="left" height="20" width="107">Foo</td>
</tr>
</tbody>
</table>
</body>
</html>

How certain are you that M$ doesnt create bogus html?
Target Milestone: --- → mozilla1.8beta
I'm not quite sure if editor or htmlparser is at fault here (yet). When I try
this on a fairly current CVS debug build, I get the following content model
(which causes table assertions, for obvious reasons, I think):
HTML
  BODY
    TABLE
      COL
        COL
        TBODY
          TR
            TD
              #text
      TR

I'm not sure if the fragment sink is giving a bad document fragment or if the
editor is pasting the document fragment in the wrong place, yet.
After some more debugging, I'm almost certain this is an editor bug. The problem
is that given the context and the new content, the editor inserts the new
content in the wrong place (namely, instead of the <tr> it wants, it puts it in
the <col>). I think
http://lxr.mozilla.org/seamonkey/source/editor/libeditor/html/nsHTMLDataTransfer.cpp#2467
is causing the pain by selecting the <col> (first child of the table) instead of
the <tr> (second child of the table).

I'm not really sure how this can be fixed.
I have a real fix for this, which I'll attach tomorrow.
Assignee: mscott → mrbkap
Status: ASSIGNED → NEW
OS: Windows 2000 → All
Priority: -- → P3
Hardware: PC → All
Target Milestone: mozilla1.8beta1 → mozilla1.9alpha
Depends on: 329007
Attached patch Proposed fix, v1 (obsolete) — Splinter Review
My comment 13 (from not quite a year ago!) was right on the money. The problem is that Excel's paste data ends up looking like:
<html>
  <table>
    <col>
    <tr>
      <!-- paste here -->

But we were only pasting into the first first child that we came across. This patch adds a magical cookie comment into the context string that allows us to figure out the real location for the paste. I've tried to construct FindTargetNode to fall back gracefully to the old behavior.
Attachment #138895 - Attachment is obsolete: true
Attachment #213644 - Flags: review?(brade)
Status: NEW → ASSIGNED
Whiteboard: [patch]
Comment on attachment 213644 [details] [diff] [review]
Proposed fix, v1

Asking Neil to review since I am swamped for the next week.
Attachment #213644 - Flags: review?(brade) → review?(neil)
Attached patch Hack (obsolete) — Splinter Review
I'm not convinced that attachment 213644 [details] [diff] [review] will work as advertised. The reason is that I think that InsertHTMLWithContext needs to find the right node too. In fact, the problem was worse for me, although I don't completely understand why; all I could tell was that the context fragment was missing the <head> node although it did have all of its child nodes. This meant that the old code would pick on the <meta>, while the last patch would fail in InsertHTMLWithContext. This hack simply finds the start and end fragment by scanning the context.
(In reply to comment #17)
> fact, the problem was worse for me, although I don't completely understand why;

I bet it's becuase I have the fix for bug 329007, so the head stuff was getting removed propery and the head stuff was getting close enough.

> although it did have all of its child nodes. This meant that the old code would
> pick on the <meta>, while the last patch would fail in InsertHTMLWithContext.
> This hack simply finds the start and end fragment by scanning the context.

Hmm, you're right -- my patch is not complete. I'll fix my version of InsertHTMLWithContext.
Whiteboard: [patch]
Attached patch Partial patch I'm giving up on (obsolete) — Splinter Review
This patch is mostly a cleaned up version of what Neil proposed, but suffers because we never pass the given HTML through the HTML parser for it to sanitize (the only parse that we do doesn't do any sort of checking), I'm going back to my original plan.
Attachment #213644 - Attachment is obsolete: true
Attachment #213902 - Attachment is obsolete: true
Attachment #213644 - Flags: review?(neil)
Attached patch Proposed patch (obsolete) — Splinter Review
I'm going to hold up on requesting reviews until I've had a chance to test this and read through it a bit more. It's a diff -w since there's some additional whitespace cleanup that was cluttering things.
Attachment #214497 - Attachment is obsolete: true
Attached patch Proposed patch, v2 (obsolete) — Splinter Review
I tested this a bit and this seems to work. I'll test it more as I have time, but I think this is right.
Attachment #214512 - Attachment is obsolete: true
Attachment #214527 - Flags: review?(neil)
Comment on attachment 214527 [details] [diff] [review]
Proposed patch, v2

>+  ReplaceFragComments(contextUTF8, NS_LITERAL_CSTRING("<!--" kInsertCookie "-->"));
I think you should simply insert your cookie when this string is constructed.

It continually confuses me that CreateDOMFragmentFromPaste isn't inlined into InsertHTMLWithContext; this patch strains the existing functionality overlap. Maybe CreateDOMFragmentFromPaste should return the start/end stream parent/offsets directly?
Attached patch Updated to comments (obsolete) — Splinter Review
This also fixes a minor problem with the fallback code, where I wasn't setting the out parameter properly. I'm going to final a but on inlining (or at least cleaning up) CreateDOMFragmentFromPaste.
Attachment #214527 - Attachment is obsolete: true
Attachment #214728 - Flags: review?(neil)
Attachment #214527 - Flags: review?(neil)
(In reply to comment #23)
> final a but

That would be "file a bug"...

Comment on attachment 214728 [details] [diff] [review]
Updated to comments

r=me with fixes as noted on IRC, but copied here for completeness:

>-  for (k = 0; k < rangeStartHint; k++)
This patch doesn't use rangeStartHint any more.

>+        NS_ADDREF(*aResult = aStart);
This leaks the previous value of *aResult (the fallback start node). Although there's an obvious fix for that the fix that I'd prefer to see the separation of the fallback code, but I'll reserve judgement on that for your forthcoming cleanup patch.
Attachment #214728 - Flags: review?(neil) → review+
Attachment #214728 - Attachment is obsolete: true
Attachment #214754 - Flags: superreview?(jst)
Attachment #214754 - Flags: review+
Comment on attachment 214754 [details] [diff] [review]
Addressing Neil's review

sr=jst
Attachment #214754 - Flags: superreview?(jst) → superreview+
Fix checked into trunk.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
I filed bug 333063 on fixing the relationship between nsHTMLEditor::InsertHTMLWithContext and nsHTMLEditor::CreateDOMFragmentFromPaste.
Checkin for this bug may have caused a 15% increase in number of allocations on balsa. See bug 333173.
Depends on: 333229
*** Bug 308751 has been marked as a duplicate of this bug. ***
*** Bug 154142 has been marked as a duplicate of this bug. ***
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: