Closed Bug 137450 Opened 22 years ago Closed 12 years ago

Problem copying and pasting a table from a web page to excel

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla15

People

(Reporter: patgade, Assigned: limweizhong)

References

()

Details

Attachments

(3 files, 13 obsolete files)

13.00 KB, application/vnd.ms-excel
Details
8.78 KB, patch
bbondy
: review+
Details | Diff | Splinter Review
4.80 KB, patch
bbondy
: review+
Details | Diff | Splinter Review
Hi,

When you select and copy a table in Mozilla 9.9 and then paste it into MS Excel
2000, it all appears in one cell as plain text.  However, if you first paste it
into composer and then copy and paste into Excel 2000, it pastes correctly as a
table into cells.  This problem doesn't occur in Excel 97.

For company intranets, its quite common for staff to pull out query results in
html tables and copy and paste into excel.

Off topic - it would be a really great feature to be able to select and copy
specific columns in navigator rather than just rows. eg right click->select
column as you can in composer.

Thanks.
I just copied the entire bugzilla "todays query" and pasted right into excel as
"html": Worked fine with MS Excel 2000.

Pasting as text honors first table rows onløy, however, and does not distinguish
between succeeding cells on same row.

(Build ID 2002041408 on XP)

You can select a table column bu ctrl+drag, but it will paste as a row, not
column. That's bug 76545.
I tried your example.  When I "Select All" and copy the whole page it does go
into cells fine.  However the problem occurs if I select just the table, or
just
specific rows of the table  (see attachment).  It then pastes into only one
cell (this happens
whether I choose 'paste' or 'paste special'->html. 

Thanks for the tip on using Ctrl->Drag to select a column.  I think the Table
Select context menu in composer is very good though, and might be a nice option
to add to the context menu of a table in navigator as well.

Thanks for your help.
Target Milestone: --- → mozilla1.0
This doesn't really sound like a selection bug to me; it sounds like a clipboard
problem.  It sounds like it works as desired in Composer.  Please clarify what
the problem is if you disagree.

-->XPApps
Assignee: mjudge → sgehani
Component: Selection → XP Apps
QA Contact: tpreston → paw
related to bug 69566? (122574 -> bug 92518 ->)
Target Milestone: mozilla1.0 → ---
to DOM to Text conversion.  Note the following from bug 172424:

  "In IE, when you select the contents of a table and paste into another
   document, the cell row contents are separated with a tab. In Mozilla the cell
   row contents are separated with a space (I'm using Mozilla/5.0 (Macintosh; U;
   PPC Mac OS X; en-US; rv:1.2b) Gecko/20021009).

  "The IE behavior is rather nice because when you are pasting data from an html
   table into a spreadsheet each cell from the html table ends up in a separate
   cell in the spreadsheet, the Mozilla behavior is less useful...everything
   ends up in the first column."
Assignee: sgehani → harishd
Status: UNCONFIRMED → NEW
Component: XP Apps → DOM to Text Conversion
Ever confirmed: true
QA Contact: paw → sujay
*** Bug 172424 has been marked as a duplicate of this bug. ***
*** Bug 154073 has been marked as a duplicate of this bug. ***
*** Bug 183154 has been marked as a duplicate of this bug. ***
So switching to TABs would solve this, and MSIE alread does that? I've a patch
in bug 173388 that switches to TAB and if that is a good thing, I will go look
for reviews.
Depends on: 173388
I think this is because when you select table contents in Mozilla and copy/paste
it the receiver get: <html><body><td>cell 1</td><td>cell2></body></html>. There
should probably be some <table> and <tr> to make everyone happy. 

The patch in 173388 inserts TABs between cells and that make it possible to, in
Excel, select "Paste Special..." from the Edit menu and then select Unicode
text, but that is probably to odd for most people to find out. 

I don't think this bug is in the right category, but I don't know which
component is responsible for creating html from a user selection when copying.
*** Bug 191782 has been marked as a duplicate of this bug. ***
This is a problem for me too. If you do any amount of copy/pasting (like I do),
you'll find it crippling. Copy/Paste may work under some of the specific example
provided by R.K.Aa, but it typically does not work for me. 

I tend to select a table, and copy it then paste that into Excel. This never
works for me.
This is a problem on any OS, and any graphical spreadsheet program.
See also 181937 (Mac).  I can attest for Linux using Gnumeric, Xess, etc.

In all cases, it can be solved by putting tabs between cell contents
when copying to the clipboard.

Can we get an ETA on when this will be fixed? FYI, it's the only thing
preventing me from ditching IE altogether...
still an issue....
*** Bug 220501 has been marked as a duplicate of this bug. ***
Still a problem in Firefox 1.0PR
If Mozilla wants corporate types to switch from IE, you'd better fix this bug,
and make copying and pasting data tables work as it does in IE.  If you just
want geeks to use Mozilla, don't bother.
*** Bug 272159 has been marked as a duplicate of this bug. ***
*** Bug 272672 has been marked as a duplicate of this bug. ***
*** Bug 264959 has been marked as a duplicate of this bug. ***
Problem for my corporate client. Switched back to IE coz of this.
*** Bug 247678 has been marked as a duplicate of this bug. ***
*** Bug 276916 has been marked as a duplicate of this bug. ***
*** Bug 276688 has been marked as a duplicate of this bug. ***
Assignee: harishd → dom-to-text
QA Contact: sujay
Hi, our problem is similar to this one, at least I think it has the same reason.
We are producing ERP software that is using browsers as client software. One of
the possibilities we have in almost all the GUI's is a search button that starts
queries for the data one is currently entering or editing. Those queries are
generating a result set displayed in a grid.
On the GUI we have an export button which allows us to print the data, export it
to files and so on. If we choose to export it in cvs format Mozilla (V 1.8b,
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b) Gecko/20050208) and
Firefox (Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b) Gecko/20050208
Firefox/1.0+) do have the problem that the result is put to one cell per record
in Excel. If we use IE, it works absolutely fine.
Our prodict is written in Java and one of our best selling arguments is that we
are plattform independent. So it would be very nice if we could tell our
customers to use independent software to access our application.
I can confirm that this problem exists in Mozilla/5.0 (Windows; U; Windows NT
5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0.

It is quite a big problem for many in our company and is one of two reasons why
we need to still use IE
This bug doesn't seem to be going anywhere - still New after nearly three years.

This is a major problem, and it seemed that someone had a fix for it, but it is
still a problem, for plenty of users.

Is anything being done about this, or has it fallen of the radar of developers?
I note that on sending a comment on this bug amongst the list of exclusions is
dom-to-text@mozilla.bugs.  Is that normal, or does it indicate that comments on
this bug are not making their way to people who might deal with them?
Several potential FireFox users have complained that this bug has existed for
years, without resolution.  I note that the reference to 173388 is probably not
meaningful anymore, since the fix for 173388 is included in the source for
FireFox 1.0.1 and the 137450 problem still exists.

Through further experimentation, I determined that I can copy from a WEB page
table to a NotePad document and then cut/paste to an Excel XP page with some 
more reasonable results (of course this loses formatting).  But at least I get
items into more than one cell.  Still the results are so far from what I need
and I expect that this knowledge would not lead to a respectable solution.  It
does, however, give a clue as to what output FireFox generates.

Perhaps the owner of this bug should follow up on the reference to 173388 and
ask the folks who understood 173388 if they have any ideas about bug 137450.  I
must admit that I'm not enough of a programmer to be able to contruct a patch in
a reasonable amount of time.  But I think the 173388 group could look at
nsPlainTextSerializer.cpp from the standpoint that both their needs and the
Excel community's needs should be met simultaneously.  Of course, I don't have
the slightest idea who dom-to-text@mozilla.bugs is.

Regards,  Bill Rowe
This is caused by missing newline after the last cell.

IE's behavour is like this ('\n' means newline, '\t' means tab):
Windows XP 55.20 \nWindows 2000 18.20 \n

Firefox (20050317)'s behavour is:
Windows XP\t55.20\n\nWindows 2000\t18.20


A Japanese friend points this out in his weblog's entry:
http://diary.noasobi.net/2005/03/diary_050317a.html

He says 'paste this to an editor, add newline after the last cell, then copy and
paste this to excel. It works well'.
As a non-techgeek, I have to add my vote to the legions who remain frustrated
with this bug. I'm on FireFox 1.0.2, Windows XP, and I'm a bit horrified that
it's been three years since this bug was first identified. I cut and paste
primarily from government websites, like the census, BEA and this one
http://www.st.nmfs.gov/st1/commercial/landings/annual_landings.html

They are not going to update their code anytime soon.

It is keeping me on IE for work, and these kind of temporary workaround
solutions are still more difficult than using IE, where this cut and paste works
perfectly. Anything involving intermediate steps, text editors, etc. mean fewer
people will use Firefox.
This happens because Excel chooses the HTML clipboard format over the plain text
one.  If you do Paste Special -> Plain Text, the table will paste fine.  If you
paste normally, you get HTML format, which doesn't work.  If you only select
part of a table, Mozilla doesn't put a <table> tag on the clipboard, which
screws up Excel.  Selecting the entire table (using ctrl+click on the border)
and pasting in Excel works fine.
This isn't DOM, since Excel isn't using our text/plain format.

My proposal to fix this:
Change BuildPlatformHTML to wrap the selection with the context HTML, instead of
just body and html tags.
Definition of BuildPlatformHTML:
http://lxr.mozilla.org/seamonkey/source/widget/src/windows/nsDataObj.cpp#1182
Called from:
http://lxr.mozilla.org/seamonkey/source/widget/src/windows/nsDataObj.cpp#886

This may require changing how the HTML info clipboard format is created, since
it currently seems to always be 0,0.  It's possible we could create a new
clipboard format, "HTML with context" or something, but that seems kind of bogus.

This is where we get the HTML fragment:
http://lxr.mozilla.org/seamonkey/source/content/base/src/nsDocumentEncoder.cpp#1214
Component: DOM to Text Conversion → Widget: Win32
Assignee: dom-to-text → win32
QA Contact: ian
dom-to-text@mozilla.bugs and anything like it is not a person (and does not
receive mail), it's the ability to watch a role, if you care about dom to text
bugs, then go to mail prefs:
https://bugzilla.mozilla.org/userprefs.cgi?tab=email and add that to your users
to watch list.
Added documentation on CF_HTML, showing that we don't follow the intended
implementation.  I threw together a few testcases here:
http://mavra.perilith.com/~luser/clipboardtests/

Unfortunately the results files have trailing null characters in them, so they
won't open as plain text in any recent mozilla.

It appears that we could support most of the cases without too much effort,
however IE also includes the entire contents of the HEAD tag, which might be
difficult to easily support.
Attached patch Possible fix - content changes (obsolete) — Splinter Review
Assignee: win32 → ted.mielczarek
Status: NEW → ASSIGNED
With these two patches, copying HTML to the clipboard adds a new data flavor,
"text/moz_htmlselinfo", which contains one integer indicating the offset into
the HTML context data where the selection should be inserted.  The changes to
widget/win32 changes BuildPlatformHTML so that when the CF_HTML data flavor is
requested, the HTML selection, HTML context, and the new selection info flavor
are used to provide a unified context + selection.
Just for my reference, related bugs: bug 237546 , bug 236546 

Note that this patch won't fix copying individual cells from different rows in a
table using ctrl+click.
peterv: do you know the code in nsDocumentEncoder well enough to take a look at
this, or do I have to keep looking for someone else?
*** Bug 298739 has been marked as a duplicate of this bug. ***
*** Bug 303164 has been marked as a duplicate of this bug. ***
It's been a few months that a potential patch has been available.

Has anyone tested it or know who it can be sent to for a review?
(In reply to comment #43)
> It's been a few months that a potential patch has been available.
> 
> Has anyone tested it or know who it can be sent to for a review?

I tested the patch using my testcases from comment 36.  Any other testcases for
situations I didn't cover are welcome.  I still haven't found anyone that knows
nsDocumentEncoder well enough to review.  Maybe when the 1.8 branch work winds
down I'll have better luck.
I'd like to confirm that a CTRL-click on the table border before the copy/paste
manuver does successfully preserve formatting when moving table data from
Firefox to MS office applications (running WinXP, Firefox 1.0.6).  It's still
frustrating that IE works better for copy/paste text with formatting than Firefox.  
Attached patch Combined and updated to trunk (obsolete) — Splinter Review
Let's see if we can't make some progress here.
Attachment #184850 - Attachment is obsolete: true
Attachment #184851 - Attachment is obsolete: true
Attachment #213963 - Flags: review?(neil)
Comment on attachment 213963 [details] [diff] [review]
Combined and updated to trunk

>+  NS_IMETHOD EncodeToStringWithContextExtra(nsAString& aEncodedString, 
I think it would be better if you added the extra info to the info string (duh), and fixed up the existing consumer (nsHTMLEditor::CreateDOMFragmentFromPaste in nsHTMLDataTransfer.cpp).
Attachment #213963 - Flags: review?(neil) → review-
Ugh, after a little more testing, it turns out that this patch works except if you select one table cell, in which case Excel will not paste anything.  Looks like I need to do some more work...

I'll attach an updated patch shortly anyway, but it's not ready to go yet.
*** Bug 331289 has been marked as a duplicate of this bug. ***
Summary: Problem copying a table from a web page to excel → Problem copying and pasting a table from a web page to excel
*** Bug 338270 has been marked as a duplicate of this bug. ***
A problem, perhaps related?, with copy/pasting from web pages to Excel. It began to happen after one of the new Firefox releases a couple months ago. Sorry I can't remember which one.
Here's an example:
Go to: http://finance.yahoo.com/q/bs?s=ASX&annual
Copy the row beginning "Cash and Cash Equvalents"
Paste it into Excel

The first cell in the row is fine, but the second looks like this:
406,290ash And Cash Equivalents, 
and the real first cell is pushed over to the right by one cell

This never used to happen on the earlier release. It's not a huge deal, but would be nice if someone could fix it.

I'm using Excel 97, and XP
FWIW, when the Windows CF_HTML generation code is used on Mac (bug 332912), the same brokenness is manifested with Mac Office (not surprisingly).
Same problem applies when trying to copy a table and paste into the MS Outlook or Word mail composers.
Workaround of selecting before the start of the table works for me.
Someone mentioned this was being tested for 1.6?  Not sure if I misinterpreted that.  In any case, FF 2.0.0.1 still has this problem (for me, anyway;) even selecting before the start of the table doesn't work.  I'm on a project which has be copying lots of stuff into Excel for the next few weeks, and I'm sorry to say I had to switch back to IE.
John:  Unfortunately this patch had some problems, so I wasn't able to finish it in that timeframe.  Hopefully I will find the time and get this in for Firefox 3.
Assignee: ted.mielczarek → nobody
Status: ASSIGNED → NEW
QA Contact: ian → win32
Blocks: 470384
Component: Widget: Win32 → Content
OS: Windows 2000 → All
QA Contact: win32 → content
Hardware: x86 → All
Attached patch fix v0.5 (obsolete) — Splinter Review
Ted's last patch updated to current trunk, doesn't work but might be a helpful starting point.
Attachment #213963 - Attachment is obsolete: true
Possible workaround: Table2Clipboard extension (https://addons.mozilla.org/firefox/addon/1852).
Josh: it looks like my latest patch didn't actually include the widget changes (not sure why). attachment 184851 [details] [diff] [review] is also necessary to make this fix work on Windows. The BuildPlatformHTML function does most of the work, you could probably uplift that into the Core and just use it from each widget implementation. Note that the Win32 widget code does want to know the length of each particular string, and needs to stick specific HTML comments in to indicate the beginning/end of the selection.
Gecko now exposes HTML to the native clipboard on Mac OS X, and as a result this bug was exposed on Mac OS X. Due to the way Excel handles pulling data from the native clipboard, some row-splitting behavior is regressed. See bug 470384, which is now a duplicate of this bug.
Flags: blocking1.9.1?
Not blocking, but if you get the patch with tests baked we'd take it before code freeze at the end of the month.
Flags: blocking1.9.1? → blocking1.9.1-
Component: Content → DOM
QA Contact: content → general
Any news on the last patch that was submitted? I can't complain as I can't develop, but I check this bug's status after each release. It has been reported more than 7 years ago.
Since Bug 504671 has been marked as a duplicate of this it is worth pointing out what that related bug is.

When you copy and paste a portion of a table from other browsers, the content is not marked as HTML, but FF 3.5 (this is a regression, not a problem in 3.0) includes a HTML type on the clipboard. This breaks the once working functionality.

Safari does a more logical thing and only uses plain text / rich text for this type of data. who could complain about a table being represented as having tabs between it? If you copy a whole table, then i could see the complaint, but a portion?

Kevin Brosnan writes at but 504671, "Till (sic) Excel can handle pasting partial table data or code is written to fix bug 137450 use one of the following." 

But I think it would be more through to say that Excel would have to be able to handle malformed HTML because the current "Apple HTML pasteboard type" (the problem causer) does not open or close a table tag, nor does it open or close a tr tag, just has raw td tabs. Why not just force table and tr tags?
It is also worth pointing out that the raw td tags can have classes that are not defined. These classes in the td tag should either be removed or their definitions included. What possible meaning is there to including them without definitions? what could a parser possibly do with that except deal with the buggy missing class definition? Why stress the other code that way?
I just want to copy/paste, into EXCEL, short phrases, statements, or numerical data.....like I could before this 3.5. This is so____. I'm not pasting a whole chart!! or columns.
Maybe a recipe. PLEEEEEZ somebody HelP!!!!!
We've been able to copy and paste tabular data from IE to Excel from day one. We also can do so with Chrome and Safari. 

I'm amazed that Mozilla has ignored this bug for more than seven years.

There are 400 million Excel users in the world. About a hundred thousand of them visit my site every month. I, for one, will be telling them that Firefox isn't Excel friendly.

Charley Kyd
ExcelUser.com
(In reply to comment #73)
> There are 400 million Excel users in the world. About a hundred thousand of
> them visit my site every month. I, for one, will be telling them that Firefox
> isn't Excel friendly.
> 

As per comment 59 you should tell your site visitors that Firefox + Table2Clipboard extension are a perfect companion for Excel users.
I'm not sure this is the place to have this discussion, and I apologize if it's not, but:
- As said functionality works out of the box with all other browsers (it even worked with very early versions of Chrome), I believe users expect it to work in FF as well without a plugin. So from an end-user perspective, this is a bug.
- If Table2Clipboard does manage it well, why not include its behavior in FF directly?
- I must be totally dumb, but I still can't figure out how to use Table2Clipboard properly.

Finally, there is a proposed fix. I didn't check it, I don't know the procedures, but it seems to me that having a look at the fix, at Chrome's implementation and pushing this for the next release should not take much more time than discussing this here and marking new bug reports as duplicate.

Again, sorry to be the end-user with no coding skills complaining about an otherwise outstanding product.
(In reply to comment #74)
> (In reply to comment #73)
> > There are 400 million Excel users in the world. About a hundred thousand of
> > them visit my site every month. I, for one, will be telling them that Firefox
> > isn't Excel friendly.
> > 
> 
> As per comment 59 you should tell your site visitors that Firefox +
> Table2Clipboard extension are a perfect companion for Excel users.

Not perfect (1) it is not "out of the box" functionality, so it requires knowledge of how to install addins and comfort allowing them and managing them. (2) Table2Clipboard requires that you can control click which is a right click on a Apple laptop. See comments at Bug 504671.
This reproduces in FireFox 4b12.
Go to about:support, select the Hardware Acceleration table at the bottom, press Ctrl+C, open Notepad, press Ctrl+V - One line of text with no spaces between the cells/new lines between the rows.
Related/duplicated issues -
https://bugzilla.mozilla.org/show_bug.cgi?id=638439
https://bugzilla.mozilla.org/show_bug.cgi?id=572543
https://bugzilla.mozilla.org/show_bug.cgi?id=237546
https://bugzilla.mozilla.org/show_bug.cgi?id=303736
https://bugzilla.mozilla.org/show_bug.cgi?id=515464
Assignee: nobody → netzen
This patch fixes copy HTML operations in Firefox and hence also fixes paste operations in:
Word, Excel, Firefox and other programs for partial table selection.

This also fixes things like pasting only a couple of <li> items when they belong to an <ol>.  
Pasting them before would result in bullet points, but pasting them now results in a numbered partial list.

The problem was with the HTML range selection not including enough context.  
We should include parent items (and their attributes) for items like <tbody>, <tr>, <li>, ...

The fix was simply to change the common ancestor code to determine a common ancestor which provides enough context when one like <tbody> was the common ancestor.
Attachment #360522 - Attachment is obsolete: true
Attachment #555160 - Flags: review?(Olli.Pettay)
Comment on attachment 555160 [details] [diff] [review]
Patch for HTML copy not including enough context for paste operations in external apps

You can't change nsContentUtils::GetCommonAncestor this way.
nsContentUtils::GetCommonAncestor is used in many places and it is expected to
return the common ancestor, not something else.
Attachment #555160 - Flags: review?(Olli.Pettay) → review-
Sorry I meant to ask specifically about it in the attachment comment and if I should create a similarily named function.  Ill submit a new patch with that change.
Seems like a pretty straightforward bug with a tiny little patch. Did this just drop off the radar?
> Seems like a pretty straightforward bug with a tiny little patch. 
> Did this just drop off the radar?

I probably won't have time to get back to this for a month or two, if someone else wants to pick it up in the meantime please feel free.  If not I'll get back to it eventually :)
Firefox is the only browser I've found that has this problem. IE, Chrome, Opera, and Safari do not. 

You've been discussing the issue for more than nine years. This is ridiculous.
Firefox is the only browser I've found that has this problem. IE, Chrome, Opera, and Safari do not. 

You've been discussing the issue for more than nine years. This is ridiculous.
Thanks Charley Kyd, I agree it's been too long. I am currently finishing a large project, but I personally do want to see this fixed as well so I will get to it sooner if I can.
Please resolve this problem; it's really a big issue. Using FreeClipViewer I managed to gather that Firefox only copies the <tr>s and <td>s as the HTML fragment and does not surround that fragment with a <table> and <tbody> tag. Is it really difficult to test if the selection is *inside* a table and just surround the HTML fragment with the corresponding <table> and <tbody> tags?
Also, there is already some code to almost implement the behavior above, which happens when you select from inside a table to outside the table. In that case, the <table> and <tbody> tags get placed into the HTML fragment instead of surrounding it, which is logical.
Brian, have you had a chance to take a look at this?
Sorry I haven't had a chance to pick this up again. I had to re-prioritize since Comment 88.  Realistically I still have a few security bugs and a snappy bug to do on the portion of my time available outside of the main project work.  To anyone who can pickup before I get back to it, please feel free to steal.
I can see if I can help; but that is if I can understand the code first... Where's the place in the code that implements the HTML selection copying? I'm thinking of looking it up in http://mxr.mozilla.org instead of downloading the whole code.
Hi,

I have Firefox 12.0 and Excel 11.6.2 om a MacPro using Snowleopard

I, too, have the same problem described here with a simple Paste.

However, I am able to copy and paste entire tables into separate cells with this workaround:

--Highlight area of Foxfire table I wish to copy.
--Copy it.
--Highlight destination cell in Excel.
--Select 'Paste Special'.
--Select 'Unicode Text'.
--Click 'OK'

two extra keystrokes...

Works every time. 

I do it daily from several sites which have my checking and Mastercard purchases listed in table form, or any web table I want to import into Excel.
(In reply to lwz from comment #94)
> I can see if I can help; but that is if I can understand the code first...
> Where's the place in the code that implements the HTML selection copying?
> I'm thinking of looking it up in http://mxr.mozilla.org instead of
> downloading the whole code.

The main function involved in copying something from an HTML document is SelectionCopyHelper <http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsCopySupport.cpp#106>.  If you want to debug this, it's best to set a breakpoint there, and then follow the execution to this SetData call <http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsCopySupport.cpp#287> which is what puts the copied data on the clipboard.

Please let me know if you need help finding some code involved here.  :-)
I have taken a look, and was just thinking if we could just add table and tr tags to the list in the IncludeInContext function at http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsDocumentEncoder.cpp#1487

What do you think?
Oh ignore that... just realised it will not work.
Is it wise to change SerializeRangeToString or will it affect other XML documents that happen to use table/tr/td tags as well? Or I could create a copy of SerializeRangeToString and name it to SerializeSelRangeToString just for use in the "if (mSelection) {" branch of nsDocumentEncoder::EncodeToString.
Ok. I think I know what we can do. We can add nsHTMLCopyEncoder versions of SerializeRangeContextStart and SerializeRangeContextEnd like what was done already for the IncludeInContext. Sorry for the many earlier tries - I was getting orientated.
Attached file Untested "patch" (obsolete) —
Totally untested "patch". I don't have the time to download and compile the code yet. Hope someone can do that for me. I also am absolutely unfamiliar with the patch submission process, so bear with me.
Anyway for the meantime, the best solution I found was to copy the selected cells to an empty HTML editor in Firefox like the Gmail compose window, select all, and copy. Hope this helps for the other users.
But the above only works if you're not selecting ranges of cells using the Ctrl.
If you click "show obsolete" on the attachments table here, you can see some 6-year-old patches I had that were a potential fix: attachment 184850 [details] [diff] [review] and attachment 184851 [details] [diff] [review]. I'm sure they're quite out of date now, but the core concepts probably still work.
Assignee: netzen → limweizhong
@Ted: Could you explain what the core concept for your patch is? It is really very different. But I was hoping someone could try out my patch, which I believe will work. The reason why the table is not pasting properly is that the table and tr tags were not included in the fragment. Ideally, they should be *outside* the fragment, but in this case I put them inside, following the same idea as Brian. This is because I do not want to change http://mxr.mozilla.org/mozilla-central/source/widget/windows/nsDataObj.cpp#1833, which is all hard-coded (*gulps*)!
Attached file Patched nsDocumentEncoder (obsolete) —
I'm trying to make it easier for someone to test my "patch".
Attachment #618965 - Attachment is obsolete: true
It's been a while, but IIRC, the existing code in nsHTMLCopyEncoder builds a "context" string that it puts on the clipboard with a different data flavor. This string is just the stack of HTML tags from the context up through and including the selection. My patch made that code save some extra data pointing to where the context ended and the selection began in that string, so then the Windows BuildPlatformHTML code could extract the actual context tags to wrap around the selection.

It looks like bbondy's attachment 555160 [details] [diff] [review] took a different approach and modified the GetCommonAncestor code to return more context in certain circumstances.
Looking back I think that Ted's way is better.  There is documentation on the HTML format here by the way:
http://msdn.microsoft.com/en-us/library/aa767917%28v=vs.85%29.aspx
Actually, from what I gather, the string that the nsHTMLCopyEncoder generates does not generate the table/tr tags where necessary. It does not output any tr tag when the selection spans the tds of a single row or any table tag when the selection spans the trs of a table. That's the issue I'm solving.
However do I generate a patch file without having to download the *entire* repository!?
It would be best to download mozilla-central source code and then compile your patch to make sure it works.  You can get the source code here:
https://developer.mozilla.org/en/Mozilla_Source_Code_%28Mercurial%29

If you have any build specific questions I would be happy to help on IRC, just message bbondy, or email me, or you can ask in the bug.
Hi everyone, I have no machine here which is capable of this huge CPU-intensive operation -- I've downloaded the hg bundle, and I estimate (after running it for a while) that it will take more than 3 hours just to unbundle, and another few hours to compile. Would appreciate anyone having the time and computing power to just test this patch for me. I'm not a hard-core developer, and just happened to come across this irritating bug. Feel free to steal this bug and my patch from me; just post a message indicating your intention. Sorry Brian.
So if I understand your patch correctly then I think it will not work because you will put a table around every table data cell that is copied.  I tried something similar when I did it but just by modifying the already virtual IncludeInContext.  Let me know if I did not understand your code though. 

I did try it as well and it did not compile because virtual should not be in the definition, just the declaration.  After that it crashes because 
nsCOMPtr<nsIContent> content(do_QueryInterface(node)); can sometimes and is sometimes NULL.  After that it gave as I mentioned above.
Oh yes, you will be right about that. Sorry, I didn't realize that. But at the very least it should solve the case where you are selecting cells without using Ctrl. There's no avenue to do have done this partial patch from within IncludeInContext because then *all* the ancestors that are table-related will appear, whereas if you do this in SerializeRangeContextStart/End you will be able to check if it's the immediate parent. So that's what I did. To make it work with Ctrl, I would probably have to edit the part where the patch for "Bug 236546" is, and somehow disable the portion in SerializeRangeContextStart/End. I will think about this and get back. Will try not to make silly errors like the last one. If you have the "patched" version you could help to verify that if you select without using Ctrl it pastes correctly. Thanks.
Attached file Patched nsDocumentEncoder (obsolete) —
I corrected the bugs and added code to cater for multiple td selections when using Ctrl to highlight cells. I assume that p->GetNodeParent will not completely fail when p is already a tr element. If it returns nsnull, I assume that passing a null node to nsContentUtils::GetAncestors will not fail, but just give me an empty list (which it currently should do, according to its source code). Let me know if I need to put in an extra temporary variable to check this (in case the implementation of nsContentUtils::GetAncestors changes).

I also use a new boolean member called mNoContext, which I reset to false every time after use.
Attachment #619005 - Attachment is obsolete: true
Attached file Patch generated with WinMerge (obsolete) —
This is the patch generated by WinMerge.
For info, I found that Chrome also uses the same "fix" for partial tables. It puts the table tag inside the HTML fragment. However, I agree that Ted's method might be a bit more helpful in the long run. But I can't tell what goes wrong with his patch when you select just one table cell (see Comment 49). We need to see what is actually put on the clipboard by Firefox using FreeClipViewer or some other software. I believe it might be fixed simply by removing the first element of mCommonAncestors (a tr) at this point: http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsDocumentEncoder.cpp#1109, before EncodeToStringWithContextExtra takes mCommonAncestors, so that the tr would not be nested inside another tr.
Attached patch Patch v1. (obsolete) — Splinter Review
Thanks for the patch!

This new patch provided by lwz seems to work correctly.  The raw HTML format viewed by using clipspy.exe seems to look correct and pasting to Excel works. 

Marking ehsan for review on lwz's behalf.  I will implement any review comments but keep the author as lwz.
Attachment #555160 - Attachment is obsolete: true
Attachment #619379 - Attachment is obsolete: true
Attachment #619380 - Attachment is obsolete: true
So, you didn't set the review request.  But I guess bz is a better reviewer here.
Comment on attachment 619614 [details] [diff] [review]
Patch v1.

Marking bz for review on behalf of lwz as per ehsan's suggestion
Attachment #619614 - Flags: review?(bzbarsky)
Thanks for the help.

There's also another bug with EncodeToStringWithContext regarding the usage of mCommonAncestors, mStartDepth and mEndDepth to fill the data object with HTML context and HTML info. This was already present with both IncludeInContext and the other bug fix, so I also didn't adjust these three variables correctly in my patch. I think it should be put into a separate bug because it involves fixing IncludeInContext and the other bug fix too, or maybe changing the way these things are handled in the first place. But I don't really know the impact as I am unclear how mStartDepth and mEndDepth are really used in the pasting code (http://mxr.mozilla.org/mozilla-central/source/editor/libeditor/html/nsHTMLDataTransfer.cpp#2216). I will look at it some other time.
But logically, having a mStartDepth and mEndDepth smaller than the correct values shouldn't affect the paste, and I don't think it is essential for the HTML context to be correct (currently). Therefore, this patch should be OK for the time being.
Thanks lwz. If you think this is a problem in general please post another bug for that. It seems to me like your patch is still good and ready for review.
Yes I was actually also expecting review still.
Comment on attachment 619614 [details] [diff] [review]
Patch v1.

Sorry for the lag here; I had to sort out how this code works...

>+++ b/content/base/src/nsDocumentEncoder.cpp
>+  bool              mNoContext;  

This should be documented.  And perhaps named mSkipContextWhenSerializingRange.

>@@ -1081,40 +1083,60 @@ nsDocumentEncoder::EncodeToString(nsAStr
>         nsCOMPtr<nsIContent> content = do_QueryInterface(node);
>         if (content && content->Tag() == nsGkAtoms::tr) {
>           nsCOMPtr<nsINode> n = do_QueryInterface(node);

You didn't write this code, but...

  nsINode* n = content;

And the content->Tag() == nsGkAtoms::tr should probably be:

  content->IsHTML(nsGkAtoms::tr)

>+nsHTMLCopyEncoder::SerializeRangeContextStart(const nsTArray<nsINode*>& aAncestorArray,

I'm not all that happy wih the duplication of code between this method and the superclass.  Maybe factor the common code out into a method that has an extra integer argument and pass 'j' from here but 0 from the superclass SerializeRangeContextStart?

>+    if (!content || content->Tag() != nsGkAtoms::tr    &&
>+                    content->Tag() != nsGkAtoms::thead &&
>+                    content->Tag() != nsGkAtoms::tbody &&
>+                    content->Tag() != nsGkAtoms::table)

What about tfoot?  Also, should check for !content->IsHTML().

>+nsHTMLCopyEncoder::SerializeRangeContextEnd(const nsTArray<nsINode*>& aAncestorArray,

Similar comments here.  But why are we using a boolean flag here and an int in SerializeContextStart()?  It'd be clearer that they're equivalent if they used the same setup. In particular, you could move determination of 'j' into a helper method to make it clearer what's going on.

r- but a review on the updated patch should happen much faster now that I'm more familiar with this stuff.
Attachment #619614 - Flags: review?(bzbarsky) → review-
Attached file Patch v2 (WinMerge output) (obsolete) —
Here's an updated version. It has not been tested, but you might want to see if the way it's being done is alright.

I'm not comfortable with such extremely long variable names, so mNoContext is now called mContextAlreadySerialized, with some comments.

I get the index of the node that the immediate context starts with a helper function, though it might be slower that way. I've also added more comments here and there.
Attachment #623435 - Flags: review?(limweizhong)
Attachment #623435 - Flags: review?(limweizhong) → review?(bzbarsky)
Comment on attachment 623435 [details]
Patch v2 (WinMerge output)

I'll clean this up a bit over the weekend and request a review from bz on lwz's behalf.  Thanks for the updates lwz.
Attachment #623435 - Flags: review?(bzbarsky)
Attached patch Patch v2. (obsolete) — Splinter Review
Thanks again for quickly addressing the review comments lwz.
Fixed some coding sytle related things.  The patch compiled as you provided it and still works with pasting individually selected cells to Excel.
Attachment #619614 - Attachment is obsolete: true
Attachment #623435 - Attachment is obsolete: true
Attachment #623464 - Flags: review?(bzbarsky)
Comment on attachment 623464 [details] [diff] [review]
Patch v2.

This is much better!

Two nits:

1)  Please rename GetImmediateContextIndex to GetImmediateContextCount, have it return j and 0 instead of j-1 and -1 respectively, and change the <= to < in the SerializeRangeContextStart/End loops.  That would make it clearer what's going on woth the immediate context, I think.

2)  Hoist the mContextAlreadyserialized checks above the array length gets in SerializeRangeContextStart/End, please.

r=me with those.  Thanks for the patch!
Attachment #623464 - Flags: review?(bzbarsky) → review+
I'll fix the nits and see it to landing, thanks so much for the contribution lwz!
Thanks for the review bz.
Attached patch Base patch v3Splinter Review
Implemented nits.
Attachment #623464 - Attachment is obsolete: true
Attachment #624223 - Flags: review+
Pasting to excel worked, but I looked at the clipboard output before landing and noticed there was no ending context.  This patch fixes it.
Attachment #624255 - Flags: review?(bzbarsky)
Comment on attachment 624255 [details] [diff] [review]
Fixed bug with end context not being included

r=me
Attachment #624255 - Flags: review?(bzbarsky) → review+
Thanks for that bug correction! I didn't realise that that would happen with shifting that reset outside the if-loop :O
Sorry, I think we should also change line 1146 (mDisableContextSerialize = false;) to be above the SerializeRangeContextEnd -- my same mistake. The code at that point will be run when you select a table cell range with Ctrl, then continue the multiple selection by selecting something else outside the table (that is not in a table), with Ctrl held down, and then copy.
Oops ya I'll include that as well.
Carrying forward r+.

Tested with ctrl + select table cells + text and could reproduce the missing end tags.  After applying the v2 patch this works properly.
Attachment #624255 - Attachment is obsolete: true
Attachment #624293 - Flags: review+
Attachment #624223 - Attachment description: Patch v3 → Base patch v3
Pushed to try, if all tests pass I'll push this out tomorrow.
Target Milestone: --- → mozilla15
https://hg.mozilla.org/mozilla-central/rev/14e6675d74e4
https://hg.mozilla.org/mozilla-central/rev/2cfe33680b08
Status: NEW → RESOLVED
Closed: 12 years ago
Flags: blocking1.9.1- → in-testsuite-
Resolution: --- → FIXED
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: