Closed Bug 586587 Opened 14 years ago Closed 8 years ago

Drag and Drop of HTML to Firefox

Categories

(Core :: Widget: Win32, enhancement)

x86
Windows 7
enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla45
Tracking Status
firefox45 --- fixed

People

(Reporter: michael.schoendorfer, Assigned: jorgk-bmo)

References

Details

Attachments

(2 files, 30 obsolete files)

2.36 KB, text/html
Details
44.81 KB, patch
jorgk-bmo
: review+
Details | Diff | Splinter Review
User-Agent:       Mozilla/5.0 (Windows; Windows NT 6.1; WOW64; rv:2.0b2) Gecko/20100720 Firefox/4.0b2
Build Identifier: Mozilla/5.0 (Windows; Windows NT 6.1; WOW64; rv:2.0b2) Gecko/20100720 Firefox/4.0b2

Firefox does not support Copy and Paste of HTML from Microsoft Outlook into the browser.
The CF_HTML flavor of the Clipboard would provide the needed HTML.

Reproducible: Always

Steps to Reproduce:
1. Create an HTML document that handles the ondrop event and outputs the content of event.datatransfer.getData of the "text/html" format.
2. Open this document in Firefox.
3. Open Microsoft Outlook 2007.
4. Select some text in an E-Mail and drag-and-drop it to the HTML document
Actual Results:  
Nothing

Expected Results:  
The HTML document outputs the HTML of the selected Text incl. formatting and links.
Attached file Testcase (obsolete) —
Testcase to test the described behaviour with drag and drop.
This patch adds support for the "HTML Format" Clipboard Format. If "text/html" is requested from the clipboard the content will be fetched from IDataObject with the Clipboard-Flavor CF_HTML.
Because the HTML is not saved as a two-byte nsISupportsString but a nsISupportsCString it will be returned as an ACString.
Attachment #465157 - Flags: review?(jmathies)
Severity: minor → enhancement
Summary: Drag and Drop HTML from Outlook to Firefox → Drag and Drop of HTML to Firefox
Status: UNCONFIRMED → NEW
Ever confirmed: true
Hi,

Is there any Timetable, when this feature could be integrated into a release build?

best regards 
martin
Comment on attachment 465157 [details] [diff] [review]
Patch to support Drag and Drop of HTML

Review of attachment 465157 [details] [diff] [review]:

Sorry it took so long to get to this.

When I use this, ms products tend to add a header, something like:

Version:1.0 StartHTML:0000000105 EndHTML:0000021563 StartFragment:0000020937 EndFragment:0000021523 

I wonder if that's just how they format their html data or if that's something we should be dealing with on our end?

Requesting a review from smaug on the small nsDOMDataTransfer.cpp change.

::: widget/src/xpwidgets/nsPrimitiveHelpers.cpp
@@ +84,5 @@
 {
   if ( !aPrimitive )
     return;
 
+  if ( strcmp(aFlavor,kTextMime) == 0 || strcmp(aFlavor,kNativeHTMLMime) == 0 || strcmp(aFlavor,kHTMLMime) == 0 ) {

One nit, this should wrap to ~80 chars. I'll update this since I had to merge the patch anyway.
Attachment #465157 - Flags: review?(jmathies)
Attachment #465157 - Flags: review?(Olli.Pettay)
Attachment #465157 - Flags: review+
Attached patch Rebased to tip (obsolete) — Splinter Review
Comment on attachment 465157 [details] [diff] [review]
Patch to support Drag and Drop of HTML

In general there is some extra 8bit string to 16bit string copying happening,
but this patch as such doesn't cause it.
r+ for the content/ part.
Attachment #465157 - Flags: review?(Olli.Pettay) → review+
(In reply to comment #4)
> Comment on attachment 465157 [details] [diff] [review]
> Patch to support Drag and Drop of HTML
> 
> Review of attachment 465157 [details] [diff] [review]:
> 
> Sorry it took so long to get to this.
> 
> When I use this, ms products tend to add a header, something like:
> 
> Version:1.0 StartHTML:0000000105 EndHTML:0000021563 StartFragment:0000020937
> EndFragment:0000021523 
> 
> I wonder if that's just how they format their html data or if that's something
> we should be dealing with on our end?
> 
> Requesting a review from smaug on the small nsDOMDataTransfer.cpp change.
> 
> ::: widget/src/xpwidgets/nsPrimitiveHelpers.cpp
> @@ +84,5 @@
>  {
>    if ( !aPrimitive )
>      return;
> 
> +  if ( strcmp(aFlavor,kTextMime) == 0 || strcmp(aFlavor,kNativeHTMLMime) == 0
> || strcmp(aFlavor,kHTMLMime) == 0 ) {
> 
> One nit, this should wrap to ~80 chars. I'll update this since I had to merge
> the patch anyway.

So, looks like this isn't quite ready yet. We need to strip the header info out based on the StartHTML value.

http://msdn.microsoft.com/en-us/library/ms649015%28v=vs.85%29.aspx
Hi,  

this patch was deployed by our former employee. Is there anything left we should submit so that this patch will have a chance to get into the release build?

regards,
Martin
(In reply to comment #8)
> Hi,  
> 
> this patch was deployed by our former employee. Is there anything left we
> should submit so that this patch will have a chance to get into the release
> build?
> 
> regards,
> Martin

There was a minor formatting nit in comment 4 and we need to strip out the text based headers ms copies into the data (comment 7), which really shouldn't be too hard.
Comment on attachment 465157 [details] [diff] [review]
Patch to support Drag and Drop of HTML

For reference we already have some code for creating these headers in data obj:

http://mxr.mozilla.org/mozilla-central/source/widget/src/windows/nsDataObj.cpp#1808

setting r- until this is fixed up.
Attachment #465157 - Flags: review+ → review-
Assignee: nobody → netzen
Attachment #465157 - Attachment is obsolete: true
Attachment #528088 - Attachment description: merged to trunk → Rebased to tip
Comment on attachment 528088 [details] [diff] [review]
Rebased to tip

Review of attachment 528088 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/src/windows/nsClipboard.cpp
@@ +109,5 @@
>    else if (strcmp(aMimeStr, kFileMime) == 0 || 
>             strcmp(aMimeStr, kFilePromiseMime) == 0)
>      format = CF_HDROP;
> +  else if (strcmp(aMimeStr, kNativeHTMLMime) == 0 ||
> +           strcmp(aMimeStr, kHTMLMime) == 0)

kHTMLMime (text/html) should not be treated as kNativeHTMLMime (CF_HTML "HTML Format") as they are different types.

This will cause problems for programs that pass data to us through text/html.

@@ +628,5 @@
>                genericDataWrapper = do_QueryInterface(file);
>              nsMemory::Free(data);
>            }
> +        else if ( strcmp(flavorStr, kNativeHTMLMime) == 0 ||
> +                  strcmp(flavorStr, kHTMLMime) == 0) {

Ditto
Assignee: netzen → nobody
The editor parses the CF_HTML format in nsHTMLEditor::ParseCFHTML, and I think uses the header info for context info so we don't want to strip it out.

We probably want to provide both text/html and the native type and have the datatransfer convert the native type when text/html is requested and isn't available directly.
Bumping this one since the Clipboard API is pretty useless for our purposes right now since it doesn't produce the same results as copy/paste into a contentEditable element. Displayed in #850663
Also bumping this one. Appears to be resolved for Mac, but not for windows.
This is an issue with more than just MS Office. Content copied from any external HTML source (tested Chrome, Internet Explorer, OpenOffice) cannot be pasted into the reproduction case provided in 850663. I have a test case that logs the contents of clipboardData.types, and the only data available is text/plain.

Note however that copying HTML content from within Firefox can be pasted successfully as text/html clipboard data. So that's inconsistent.

It's also a cross-platform issue, as Ben described. HTML can be successfully pasted on OS X and Linux when copied from and external HTML source (tested Office 2011 on OS X, OpenOffice and other browsers on both).
[Approval Request Comment]
Regression caused by (bug #): 
User impact if declined: 
Testing completed (on m-c, etc.): 
Risk to taking this patch (and alternatives if risky):
String or IDL/UUID changes made by this patch:

Rebased for 2014 codebase. What are the next steps for getting this merged into release?
Attachment #8409057 - Flags: review+
Attachment #8409057 - Flags: approval-mozilla-release?
Comment on attachment 8409057 [details] [diff] [review]
Rebased for mozilla-central nightly 2014

You need to get somebody to review it.  Maybe Jim Mathies, though he is away for a few more days it looks like.  Then it can land on Nightly 31 (or 32 depending on how long it takes to get reviewed and landed).  From there it will automatically go to Aurora and Beta and eventually release, as time goes forward.
Attachment #8409057 - Flags: review?(jmathies)
Attachment #8409057 - Flags: review+
Attachment #8409057 - Flags: approval-mozilla-release?
Comment on attachment 8409057 [details] [diff] [review]
Rebased for mozilla-central nightly 2014

Review of attachment 8409057 [details] [diff] [review]:
-----------------------------------------------------------------

Ben, generally looks good although the formatting is a bit of a mess. Please note the nits below and post a refresh.

::: dom/events/DataTransfer.cpp
@@ +27,5 @@
>  #include "mozilla/dom/DataTransferBinding.h"
>  #include "mozilla/dom/Element.h"
>  #include "mozilla/dom/BindingUtils.h"
>  
> +

added line feed here, please remove.

@@ +1277,5 @@
>        supportsstr->GetData(str);
>        variant->SetAsAString(str);
>      }
>      else {
> +	  nsCOMPtr<nsISupportsCString> supportscstr = do_QueryInterface(data);

looks like you either introduced tab spacing or just added too much padding, please get the alignment right.

@@ +1283,5 @@
> +        nsAutoCString str;
> +        supportscstr->GetData(str);
> +        variant->SetAsACString(str);
> +      }
> +      else {

nit = } else {

::: widget/windows/nsClipboard.cpp
@@ +103,5 @@
>    else if (strcmp(aMimeStr, kFileMime) == 0 ||
>             strcmp(aMimeStr, kFilePromiseMime) == 0)
>      format = CF_HDROP;
> +  else if (strcmp(aMimeStr, kNativeHTMLMime) == 0 ||
> +	  strcmp(aMimeStr, kHTMLMime) == 0)

nit - alignment

@@ +642,5 @@
>                genericDataWrapper = do_QueryInterface(file);
>              nsMemory::Free(data);
>            }
> +        else if ( strcmp(flavorStr, kNativeHTMLMime) == 0 ||
> +			strcmp(flavorStr, kHTMLMime) == 0) {

nit - alignment

::: widget/windows/nsDragService.cpp
@@ +542,5 @@
>                        TYMED_HGLOBAL | TYMED_FILE | TYMED_GDI);
>          if (mDataObject->QueryGetData(&fe) == S_OK)
>            *_retval = true;                 // found it!
>        }
> +	  else if (!strcmp(aDataFlavor, kHTMLMime)) {

nit - alignment

@@ +547,5 @@
> +       // if the client wants html, maybe it's in "HTML Format"
> +       format = nsClipboard::GetFormat(kHTMLMime);
> +       SET_FORMATETC(fe, format, 0, DVASPECT_CONTENT, -1,
> +                     TYMED_HGLOBAL);
> +       if (mDataObject->QueryGetData(&fe) == S_OK)

lets use SUCCEEDED() here vs. the S_OK compare. Maybe you could touch up the other uses above.

::: widget/xpwidgets/nsPrimitiveHelpers.cpp
@@ +53,5 @@
>      return;
>  
> +  if ( strcmp(aFlavor,kTextMime) == 0 ||
> +	  strcmp(aFlavor,kNativeHTMLMime) == 0 ||
> +	  strcmp(aFlavor,kHTMLMime) == 0) {

nit - alignment
Attachment #8409057 - Flags: review?(jmathies) → feedback+
Attached patch Post-feedback patch (obsolete) — Splinter Review
[Approval Request Comment]
Regression caused by (bug #): 
User impact if declined: 
Testing completed (on m-c, etc.): 
Risk to taking this patch (and alternatives if risky):
String or IDL/UUID changes made by this patch:

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 
User impact if declined: 
Testing completed (on m-c, etc.): 
Risk to taking this patch (and alternatives if risky): 
String or IDL/UUID changes made by this patch:

Post-review follow up. Have updated the patch accordingly. Please let me know if  you need any further modifications
Attachment #8411360 - Flags: review+
Attachment #8411360 - Flags: feedback+
Attachment #8411360 - Flags: approval-mozilla-release?
Attachment #8411360 - Flags: approval-mozilla-beta?
Comment on attachment 8411360 [details] [diff] [review]
Post-feedback patch

First, you need a final approval from mr vs. just feedback+, which I'm giving here. We need to land this on mc first, wait a week or so, then request uplift to aurora and beta. (We would never uplift a change like this to the release channel.)
Attachment #8411360 - Flags: approval-mozilla-release?
Attachment #8411360 - Flags: approval-mozilla-beta?
Attachment #8411360 - Flags: review+
Attachment #8411360 - Flags: review+
Also, ben it would be good if you could add a commit message to the final patch so drivers don't have to fill that information for you. Saves them time since they have to apply a lot of patches each day.
Assignee: nobody → ben
Keywords: checkin-needed
Attachment #528088 - Attachment is obsolete: true
Attachment #8409057 - Attachment is obsolete: true
https://hg.mozilla.org/integration/mozilla-inbound/rev/d5a514b9a0db

Thanks for the patch, Ben! One request, please make sure that future patches include proper commit information before requesting checkin. Makes life much easier for those landing on your behalf!
https://developer.mozilla.org/en-US/docs/Mercurial_FAQ#How_can_I_generate_a_patch_for_somebody_else_to_check-in_for_me.3F
Keywords: checkin-needed
Component: Shell Integration → Widget: Win32
Product: Firefox → Core
Status: NEW → ASSIGNED
Hello Ben,

Were you working on fixing the test failures at https://tbpl.mozilla.org/?rev=ec9e918e47dd&tree=Mozilla-Inbound ?

Steven Shaw (steshaw) was interested in helping.

Thanks
Flags: needinfo?(ben)
Steven tells me he's already talked with you about working on this.
Assignee: ben → steven
Flags: needinfo?(ben)
Matthew: That's correct. We're currently working on getting a patch together that correctly passes tests. We should have something together soon.
Hi! I got here from a duplicate bug 850663. Is someone working on this? Ben?

Thanks!
I've been meaning to post a request for help - we couldn't get the patch working, so we abandoned it (and Steven no longer works with us). We've even considered engaging a contractor with enough C++ experience to get it across the line for us, but none of us have had the time to make it happen.
I'd like to look at those test failures, but it seems results are no longer available (?). I'll see if I can push a build to the try server..
Not too surprising, it has been nearly 12 months. Appreciate your help - let me know if you need anything.
Had some problems applying the patch on a fresh checkout (not surprising.. seems among other things the path to nsPrimitiveHelpers has changed - it's no longer in widget/xpwidgets but directly inside widget?). I also tried to check out the revision that had the patch already, but hg failed me and aborted the checkout after a while.. If somebody could massage the patch a little I'm happy to push it to the try server for some updated test results.
We still have the codebase and build tools from our previous attempt on a VM. I might keep that, do a fresh checkout and see if I can update the patch.
Attached patch Patch update April 2015 (obsolete) — Splinter Review
I'm attaching an updated patch. It's all simple context changes plus a file move; the relevant code looks otherwise identical. I also included the commit information.

I downloaded the linux build VM to update the patch. Our windows box is still doing a fresh checkout and it's the end of my work day, so I haven't had a chance to verify this in a build yet.
The patch doesn't fix paste, only drag&drop. When pasting I see text/html as a clipboard data option, but the string it returns is empty.

The old build from a year ago still works. Looking at the differences from the attached patch, I now remember some discussion about this a year ago.

nsPrimitiveHelpers.cpp needs specific handling for kHTMLMime instead of just casting it like kNativeHTMLMime (the data is UTF8 but needs to be UTF16). There is also some beginnings of handling the windows HTML clipboard format header, which makes more sense to do in the browser than in JS.

I'll see if I can integrate these changes into the current code, verify it on windows and then attach a new patch.
Attached file Paste Test Case
I've been struggling with weird behaviour for a couple of days, and after rolling back to a fresh checkout I think the nightly is broken.

In windows nsClipboard.cpp, during a paste operation the GetDataFromDataObject function is only ever queried with "text/unicode" (confirmed by logging the value of flavorStr). I'm not very good at mercurial but it might be related to bug 1071562?

I'm attaching a test case which includes both my paste replication and the drag&drop replication from the other test case.
> In windows nsClipboard.cpp, during a paste operation the
> GetDataFromDataObject function is only ever queried with "text/unicode"
> (confirmed by logging the value of flavorStr). I'm not very good at
> mercurial but it might be related to bug 1071562?

If you are running the page in a separate process (The tab title will appear underlined if so) then yes. But you can open a regular window from the File menu that won't exhibit that bug.
aha. Thanks for that!
Attached patch Patch update April 2015 (obsolete) — Splinter Review
The patch from 2 days ago works as expected with a non-e10s window and my test case. It would apply with some fuzz but I'm attaching a fresh patch for the couple of lines of context that changed.

The concern I have with this change is that it hands the raw windows HTML clipboard to JS which includes a few lines of header information. This appears to be necessary for nsHTMLEditor::ParseCFHTML (in nsHTMLDataTransfer.cpp) to parse, but it's unexpected in the JS context.

Should I try moving the parsing code out of nsHTMLEditor? Or is there a way to provide different output results to the JS clipboardData API that won't impact the editor?
Attachment #8592108 - Attachment is obsolete: true
Attachment #8593156 - Flags: feedback?(hsteen)
ok. Alternate patch attached - apologies if my code is not up to scratch, I'm not a C programmer.
	
I dug into the code more and decided my concern was caused by an underlying problem with the patch. This hinges on my assumption that the editor is looking for (and handling) kNativeHTMLMime but dom/events/DataTransfer only requests kHTMLMime.

Rather than add kNativeHTMLMime handling to DataTransfer, the original patch updates the windows clipboard to pretend that kHTMLMime is CF_HTML on windows. This is not true, CF_HTML (aka kNativeHTMLMime) has a header and is UTF8.

As such, I think it is best for the clipboard to remove the header in the kHTMLMime scenario. This is now set up such that it should not impact the editor at all.

The last remaining question is the change to nsPrimitiveHelpers handling of kHTMLMime. With this patch it assumes kHTMLMime is equal to kNativeHTMLMime, which seems risky on platforms other than windows. Maybe that's what caused the test failures.
Nice progress! :) With a build including this patch I can drag data from OpenOffice, Chrome and Opera (Presto) into Firefox, and HTML is read from the clipboard. (I can't drag from IE11, but I blame that on IE because dragging to Chrome doesn't work either.)

Functionality-wise, there are two small things that require polish:
1) Encoding. I get a little "mojibake" - looks like we somewhere receive UTF-8 data and handle it as ASCII. Example: 

 «kjøpmannen» 

shows up as

 «kjøpmannen»

2) I think we need to process the CF_HTML clipboard format a little more to pass exactly the right part on to the script. With this patch we include the markup prior to 

<!--StartFragment-->

and after 

<!--EndFragment-->

and we should not. I know this might seem a matter of debate, given that some implementations (IE?) sometimes add significant stuff like STYLE tags outside the fragment markers, but as far as I can tell it's what Chrome does.

Now, I'm "just" a tester (and the editor of the clipboard events spec), so I don't know the implementation architecture in depth. We should most likely have functionality to parse CF_HTML somewhere in platform integration-type code that both the rich text editor and the clipboard code can depend on - if there's code for handling CF_HTML inside our editor or rich text code, it should be refactored out (IMHO - with the caveat that I'm not a developer :)).

Neil, are you the best person to advice and help us move this forward, or can you CC somebody who is? I'm so looking forward to helping this feature land - especially if we can also throw in paste support with a few more lines of code somewhere ;). (The feature detection story here is already very poor - to avoid driving web developers even madder we should not ship DnD-support until we have paste IMO).
Flags: needinfo?(enndeakin)
Thanks :)

1) is likely because of the nsPrimitiveHelpers change I mentioned earlier. I don't think the change there is right, it's no longer converting text/html content to UTF16. Some of the earlier patch code used NS_ConvertUTF8toUTF16 to achieve this but it was in the wrong place, I haven't tried migrating it yet.

2) I'm specifically looking for the content and style data outside of the fragment. You're right, DnD to Chrome seems to extract the fragment, but it definitely provides the full context when pasting. My app depends on this and I'd like to offer the same capability in Firefox :)

I tested both paste and DnD from IE11 on my build machine (paste is the one I really care about).

Architecture wise there are only two places that parse CF_HTML. nsClipboard::FindPlatformHTML does basic extraction, and nsHTMLEditor::ParseCFHTML does the serious extraction for editor handling.
Asking for the type text/html from DataTransfer.getData should return the html data without the header stuff that Windows uses.

Asking for the type application/x-moz-nativehtml should return the original CFHTML data. If application/x-moz-nativehtml is all that is available, we should provide a mechanism that converts it when text/html is asked for. nsHTMLEditor::ParseCFHTML does this conversion in the editor currently; we can move this code as needed.
Flags: needinfo?(enndeakin)
I think that's what this patch is intended to do. x-moz-nativehtml isn't listed as something to look for in DataTransfer, so the (windows) clipboard code now supplies the converted raw data when looking for text/html. If the editor asks for raw CFHTML, it can still get it.

nsHTMLEditor::ParseCFHTML breaks the CFHTML into two strings; the fragment and the before/after context mashed together around an "insert here" cookie. That seems like an implementation detail of the editor.

I really think it is important to return the full HTML document from clipboardData.getData('text/html') when pasting. All browsers (including Firefox) do this on OS X, for example, as does Chrome on Windows. The fragment information on windows is more of a "context within the document" indicator, not an indication to throw away data outside the fragment.

I'll work on the UTF8 issue tomorrow (I'm in Australia) and hopefully have an updated patch for you soon.
Some test results from a try build with that patch are here BTW: https://treeherder.mozilla.org/#/jobs?repo=try&revision=10587d470dd1
I'll dig in to some of these later..
The NS_NOINTERFACE errors could be the UTF8 problem. I still think I can fix that, but I'm reaching beyond the edge of my C knowledge so the code may need some cleanup after I'm done.

Can I run those failing tests locally? Or do I just give you a new patch and hope it works? :)
As you probably have both a clone of mozilla-central and the Mozilla Build stuff set up already, you should be able to run specific tests by tweaking the "./mach mochitest-browser" command line - see
https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/mach
Attached patch Patch update April 2015 (obsolete) — Splinter Review
It's much better now. I couldn't make the code work in the structure I wanted, so I just made it function correctly but with some likely undesirable changes. It will need someone with better C knowledge to improve it if this ends up failing code review.

The tests in dom/base/test and editor/libeditor/tests all pass. I believe the only remaining failure is browser_clipboard.js which is not expecting a full HTML document. This relates to Neil's comment.

However I'm leaning towards changing the test. One of my aims here is to normalise the text/html paste format - do you really want the paste data to be a fragment when the copy source is Firefox, but a full document when it's copied from other browsers?
Attachment #8593156 - Attachment is obsolete: true
Attachment #8593205 - Attachment is obsolete: true
Attachment #8593156 - Flags: feedback?(hsteen)
It would be good to finish this comment:

// This UTF8 -> UTF16 conversion may be better done in nsClipboard::FindPlatformHTML - but then
// the data 

(in nsPrimitiveHelpers.cpp)

My impression at the moment is that pasting (almost) always should produce a HTML fragment, not a full document.. but you seemed to disagree with that above. Do you know what other implementations do?
WebKit produces a whole document with comments around the body for the fragment.

<html>
<body>
<!--StartFragment-->HTML<!--EndFragment-->
</body>
</html>

Not sure if this is what to expect I think it's kind of weird to use comments as tags for the fragment as the WebKit folks do. Also since the current behavior when copy/pasting within Firefox produces a fragment not a whole document it potentially break code that relies on that behavior.
oops, that was supposed to be a hint for future code review. I'll get on that :)

Even the ContentEditable implementation doesn't technically produce a HTML fragment. While it may *appear* to it actually keeps the data surrounding the fragment and uses it to style the content inside the fragment. That's what I want to replicate myself in JS.

My specific use case is copying from Microsoft Word on Windows and OS X, pasting into a browser and using clipboardData.getData('text/html'). The raw HTML from this is an absolute mess but if you know what to look for it has vital data outside the fragment.

My experience - which you can confirm with my Paste Test Case - is:

OS X (Chrome, Safari, Firefox) all return complete documents.
Windows Chrome returns a complete document.
Windows IE doesn't implement the clipboard API (IE11 has internal cleanup of word content).
Windows Firefox has no HTML in the clipboardData (it does if the source content is a Firefox tab).

Due to the lack of data our editor falls back to the IE10 code path on Firefox for Windows (but not OS X). I would like to rectify this situation, and have confirmed the existing patch triggers our high quality import code.
@spocke that's not WebKit, that's the actual content on the windows clipboard (minus header info). You can use a clipboard viewer tool to confirm.

Anyone relying on the text/html clipboardData in Firefox today is giving their users a weird experience. Content copied from within Firefox does one thing (text/html is available), content copied from all other sources does another (no text/html, fall back to ContentEditable paste).

For complex source documents, the fragment is actually a horrible source of data as allowing ContentEditable to handle the paste will merge extra context into the fragment.

I would like to propose that Firefox break the existing behaviour on Windows, match up with what it does on OS X, and always return the full clipboard for text/html.

To be honest though I would also be happy with a separate MIME type that gives me access to the raw clipboard and leave text/html as a fragment. It would mean my paste code needs a specific Firefox Windows check, but at least I'd have the data and not be stuck giving my users the IE10 paste experience.
Both Chrome and Firefox handle copy/paste of HTML very erratic:

If I run this fiddle http://jsfiddle.net/nfg6g3a0/1/ and do the following:
Copy HTML from Firefox to Firefox on Mac OS X produces: Fragment
Copy HTML from Chrome to Chrome on Mac OS X produces: Fragment
Copy HTML from Firefox to Chrome on Mac OS X produces: Document
Copy HTML from Chrome to Firefox on Mac OS X produces: Fragment
Copy HTML from Firefox to Firefox on Windows produces: Fragment
Copy HTML from Chrome to Chrome on Windows produces: Document
Copy HTML from Firefox to Chrome on Windows produces: Document
Copy HTML from Chrome to Firefox on Windows produces: N/A due to this bug

I guess this depends on how the copy operation fills the clipboard but from a JS developers perspective one would wan't the same results in regardless on how people copy HTML and from where. It should either always be a whole document or a fragment normalized between platforms. I guess developers could normalize this manually in JS as we do with the TinyMCE editor but it would be nice if the browsers did that for us.
Spocke, good to see you chime in on this topic :)

So the current state of implementations is "messy" - what is the ideal state? Do you agree with Andrew that getting a full document including the <!--StartFragment--> and <!--EndFragment--> comments is best? That basically means standardising the IE/Microsoft CF_HTML conventions across platforms. 

(One benefit of using CF_HTML is that it includes a SourceURL property in its meta data - at least when IE writes CF_HTML to the clipboard. It might be interesting to explore whether we should build security logic on top of the clipboard content's origin - not saying it's a good idea, but if we know that the origin will be known we can consider it.)

For the spec's test suite, we may also need an evilish test or two that constructs a fragment of HTML that looks like a complete CF_HTML (with meta data) and writes it to the clipboard with setData().
Ah, yes, copying from Chrome does produce a fragment. I saw the meta tag and thought it was a whole document. My bad.

However, my point stands - pasting to Chrome will output the *entire* clipboard. In Spocke's "to Chrome" cases, whether you get Fragment or Document depends on the source. Try copying from the built-in OS X textedit app, or MS Word, and you'll get a Document. I can't think of other sources of HTML off the top of my head.

Perhaps rather than just using paste as the metric, use a clipboard viewer compared to the browser paste.
A full document like the cf_html spec on all paste operations on all platforms makes more sense. The sourceurl meta and possible other meta information could be very useful. In theory a browser could inject it's own meta information in the document for what ever reason.

However this is a nice to have not a need to have. The need to have is getting the html in any form in there from external sources. Once that is fixed I would be one happy puppy since we could then start using the clipboard api for real on Firefox. :)
ok. I'll update the patch today, improve that comment and switch the test case to full documents.
I'm a bit worried there will be protests and incompatibility with other software that reads HTML from the clipboard on other platforms if we really try to push CF_HTML everywhere. But then, I have no idea how other popular platforms organise their clipboard meta data when you copy HTML..

If we provide the full document including the CF_HTML magic comment tags, JS will be (is already?) written to look for those comments - will that JS still works fine if you copy and paste on platforms that does not add the magic comments? Isn't there a risk of ending up with JS that depends on a very obscure Windows quirk to run correctly? I accept that there's important stuff elsewhere in the markup created by certain implementations that you script authors want and need access to, but I'm still not entirely sure this is the right thing to do.. sorry.. :(

(On the other hand, it's somewhat tempting to suggest adding DataTransfer.sourceURL ..)
Anyone using text/html today already has to deal with those magic tags if they support Chrome. I think that alone is reason enough to not be concerned.

We could, if you wanted, implement Neil's suggestion of sticking with the fragment in the Firefox -> Firefox case and falling back to the Document from other sources. However that continues the weird dichotomy between sources of HTML and how they appear when pasted into RTEs that handle text/html themselves.

Let me re-state Spocke's results in terms of what is on the native clipboard (and adding in MS Word as an external source).

OS X, copy from Firefox, places a Document on the clipboard (no magic tags).
  Paste Firefox -> Fragment
  Paste Chrome  -> Document

OS X, copy from Chrome, places a Fragment on the clipboard (no magic tags).
  Paste Firefox -> Fragment
  Paste Chrome  -> Fragment

OS X, copy from MS Word, places a Document on the clipboard (no magic tags).
  Paste Firefox -> Document
  Paste Chrome  -> Document

Windows, copy from Firefox, places a Document on the clipboard.
  Paste Firefox -> Fragment
  Paste Chrome  -> Document

Windows, copy from Chrome, places a Document on the clipboard.
  Paste Firefox -> <no html>
  Paste Chrome  -> Document

Windows, copy from MS Word, places a Document on the clipboard.
  Paste Firefox -> <no html>
  Paste Chrome  -> Document

Everywhere those two paste results are different, JS developers need special handling. I'd like to remove that need.
And I agree completely that we should make it simpler to handle this :)
I've asked for feedback on the public-webapps ML https://lists.w3.org/Archives/Public/public-webapps/2015AprJun/0228.html
I'm also taking a look at what would be involved with keeping the existing behaviour for "text/html" and just providing direct access to "x-moz-nativehtml" in the DataTransfer object. It's promising.

Knowing what I do about the windows clipboard, I'm seeing some *really* interesting code which roughly explains why we're in this situation. I'll try to keep this non-technical. 

Every OS has platform standard clipboard type identifiers. Windows uses English phrases like "Unicode Text Format" and "HTML Format" (aka CF_HTML). Firefox has clipboard code on each platform which is responsible for converting between internal MIME identifiers and platform standards. The Windows clipboard code does this in a lot of cases, but not for HTML. It actually registers a new clipboard format identifier "text/html". Windows will let you register any format you want, but other applications won't see the data.

Instead of doing that conversion, Firefox added "x-moz-nativehtml" as an internal MIME identifier which does get converted to Windows "HTML Format". You can see this happening in a Windows clipboard viewer, Firefox adds both "HTML Format" and "text/html" (among other custom formats).
Copy from Chrome:  http://imagebin.ca/v/1zKJbootfCSr
Copy from Firefox: http://imagebin.ca/v/1zKKEqVfRYRU

But doing this means special case hacks are necessary. There are only really two places that deal with this properly; copying content (which is why both "text/html" and "HTML Format" appear) and the ContentEditable implementation (which is why it can paste "HTML Format" from other applications). The paste/drag (DataTransfer) logic inspects the clipboard for standard MIME types only, not "x-moz-nativehtml", so when other applications put "HTML Format" on the clipboard it isn't picked up due to the lack of conversion.

What the patch does is go back to square one and attempt to provide that conversion. This is re-evaluating a very old base assumption so there is a behaviour change involved. If it was encapsulated well enough, which is way beyond my C knowledge, the entire x-moz-nativehtml concept could be deleted and everybody would get full documents on all platforms.

My position is that nobody can be relying on the current behaviour unless they restrict their users to copying from one Firefox tab to another within their web app (no support for external sources of HTML). Certainly possible, but unlikely.
Andrew, any further progress? Anything I can do? :)
I wasn't sure if I should continue until you made up your mind about whether my approach was correct ;)

I've just asked for a bit more time at work to spend on this, hopefully I'll have an update soon that fixes all of the broken tests I know about. It would be helpful if you could confirm there is only one failure remaining :)
Oh, sorry - didn't notice we were still debating this :)
I tried to test in IE11, but it seems completely broken - can't get any HTML from the clipboard at all (?!).

I accept that returning the full source from the clipboard is better than extracting the part between CF_HTML's magic comment tags. It may make it slightly harder to write cross-platform JS, on the other hand it gives the script author access to potentially important information about the location of copied content inside the document tree and maybe style information. So let's go for that - JS authors must in any case be prepared to handle both fragments and whole documents.
(In reply to Andrew Herron from comment #63)
> It actually registers a new clipboard format identifier "text/html".

Whoa. That sounds like a silly thing to do - we should certainly get rid of that IMO.

> What the patch does is go back to square one and attempt to provide that
> conversion. This is re-evaluating a very old base assumption so there is a
> behaviour change involved. If it was encapsulated well enough, which is way
> beyond my C knowledge, the entire x-moz-nativehtml concept could be deleted
> and everybody would get full documents on all platforms.

+1 for dropping x-moz-nativehtml too.

I'll see if I can make another build with your most recent patch and run those tests..
Well your last message was that you were asking for other feedback!

I'll look into IE11. It definitely worked when I made the patch. I've been given approval to spend work time on it but was too busy on Friday.

Unfortunately I don't have enough knowledge of the Firefox codebase to fix the things I said about the CF_HTML handling and dropping x-moz-nativehtml. That will impact a great many places including the ContentEditable implementation. I'll bring my patch up to scratch, hopefully we can close this 5 year old bug, and perhaps that can be logged as a new issue.
The patch doesn't apply cleanly to tip, I've resolved that locally and paste from IE11 definitely works. Hopefully the tests aren't too hard to fix up.

I've actually found a bug in the test suite! One of the tests calls reject() with no arguments when it fails, and that throws a JavaScript exception. The entire test suite grinds to a halt until it eventually times out.

My code is causing the test failure, which I'll fix, but I'll also fix the bug in the test case so failures are handled normally.
Attached patch Patch update May 2015 (obsolete) — Splinter Review
It turns out the constant test failures I was seeing were due to Remote Desktop interfering with the clipboard. sigh. I've figured that out, and the getData() tests now pass, but other tests are failing due to a problem I wasn't seeing before.

It seems the clipboard changes aren't being handled by the ContentEditable paste code. I'm attaching a patch update but I know it doesn't pass at the moment.

The test failure is due to extra whitespace both before and after the content when copying from ContentEditable and pasting into itself. I can replicate this in a plain ContentEditable test case. I have checked:

* the raw clipboard contents are correct
* Copying from Firefox ContentEditable to Firefox ContentEditable adds an extra space
* Immediately pasting the same clipboard contents into Chrome ContentEditable does not add an extra space
* Copying from Chrome ContentEditable to Firefox ContentEditable works

I wonder if the CF_HTML header format that Firefox is loading onto the clipboard is wrong, and Chrome is searching for the fragment rather than relying on offset locations in the header. I'll take a look next week.
Attachment #8595361 - Attachment is obsolete: true
Flags: needinfo?(hsteen)
Hm.. Just copying and pasting back into this little TC doesn't seem to add any spaces?

data:text/html,<div contentEditable="true">hi</div><pre></pre><script>setInterval(function(){ document.getElementsByTagName('pre')[0].textContent = encodeURIComponent(document.getElementsByTagName('div')[0].textContent) }, 300)</script>

If you could list a couple of failing tests I could try to help investigate it (although I don't know this code either..)
Flags: needinfo?(hsteen)
The test is browser/base/content/test/general/browser_clipboard.js 

* <div contentEditable=true>Test <b>Bold</b> After Text</div>
* select "t Bold"
* copy
* move the cursor elsewhere in the line
* paste
(Regarding the tests, the "extractFragment" method should probably move to head.js - this file has common code shared by multiple tests)
Confirming the test failure: I see an extra newline is added before the inserted HTML and two extra newlines after. This might come from the processing of Moz's own text/html stuff on the clipboard? Since it doesn't happen when pasting from Firefox to Chrome, neither when pasting from Chrome to Firefox - both of those presumably involve the real CF_HTML stuff. Does Firefox prefer "text/html" over CF_HTML when available?
aha, thanks for the note about head.js - I was hoping someone would be able to help me with that. I'll give that a go.

I have confirmed that the CF_HTML on the clipboard has the correct offsets, and using my own parsing code it extracts the correct HTML. I'll take a look at the ContentEditable clipboard parser hopefully tomorrow, and also see if I can figure out why this change triggered it.
Adding the function to head.js doesn't work for me, I get "extractFragment not defined". I'm running the tests like this, is that correct?

./mach test browser/base/content/test/general/browser_clipboard.js

I'm attempting to add some debug printf statements to figure out why the behaviour is different, as near as I can tell at the moment the ContentEditable code is no longer looking for kNativeHTMLMime with my patch applied. I still have some digging to do but that's really not good.
Ehsan, if I may bother you: who knows the editor code well enough to help Andrew understand this? I tried to dig around myself, but I didn't manage to make VisualStudio stop at any breakpoints in paste-related editor code :-/
Flags: needinfo?(ehsan)
This is where we parse CF_HTML in the editor code for the purposes of pasting and/or drag-drop: <https://dxr.mozilla.org/mozilla-central/source/editor/libeditor/nsHTMLDataTransfer.cpp#907>.

As can be seen here <https://dxr.mozilla.org/mozilla-central/source/editor/libeditor/nsHTMLDataTransfer.cpp#1127>, we prefer CF_HTML where available, and if we don't find that, we fall back to looking for normal HTML content in the transferable object.

If you have more specific questions, I would be happy to help!
Flags: needinfo?(ehsan)
Yeah I've made it that far. I just need to track down why that code isn't working with the patch applied; it looks like the magic kHTMLContext mime type isn't on the clipboard so the CF_HTML parsing isn't even used. I haven't had time to work on it yet.
(In reply to Hallvord R. M. Steen [:hallvors] from comment #77)
> Ehsan, if I may bother you: who knows the editor code well enough to help
> Andrew understand this? I tried to dig around myself, but I didn't manage to
> make VisualStudio stop at any breakpoints in paste-related editor code :-/

You may have been attached to the wrong process if you had e10s enabled.
Note that kHTMLContext and CF_HTML are not the same thing.  kHTMLContext is out made-up MIME type for a Mozilla specific clipboard flavor.
Yes, I did notice that - but kHTMLContext is used as a flag to trigger special CF_HTML parsing when copy/pasting from the CE to itself. I've been on holiday but I will hopefully get back onto this soon.
I finally found some time to work on this. I wasted a bunch of it struggling with a D3D11 crash in my VM, I'm not even sure why it works now but I have made a small amount of progress.

I haven't tracked down the full detail yet, none of the printf lines I added to nsHTMLDataTransfer are printing to the shell console which makes understanding what is going on extremely difficult :)

However by disabling pieces of the patch until the test passed, and combined with Ehsan's comment #78, my guess that because the patch aliases CF_HTML to both kNativeHTMLMime and kHTMLMime, the preferred flavor becomes kHTMLMime and this check fails:
https://dxr.mozilla.org/mozilla-central/source/editor/libeditor/nsHTMLDataTransfer.cpp#1127

At that point ParseCFHTML is never called, but the paste code sees that kHTMLContext is available, and everything goes downhill from there.

I have a clearer idea of what I need to tackle now. Hopefully we're not too far away from finishing this!
(In reply to Andrew Herron from comment #84)
> However by disabling pieces of the patch until the test passed, and combined
> with Ehsan's comment #78, my guess that because the patch aliases CF_HTML to
> both kNativeHTMLMime and kHTMLMime, the preferred flavor becomes kHTMLMime
> and this check fails:
> https://dxr.mozilla.org/mozilla-central/source/editor/libeditor/
> nsHTMLDataTransfer.cpp#1127

Hmm, sorry if that's obvious, but why do you need to alias CF_HTML like that?
Well... I didn't :)

This all started as an attempt to clean up an existing (now very old) patch. I've ended up rewriting most of it, and that is one of the last pieces of the original code.

My best guess is that aliasing both to CF_HTML in the windows clipboard was done rather than add windows-specific code to the JS clipboardData and drag&drop logic to provide kNativeHTMLMime as 'text/html'. Let them request kHTMLMime as normal and have the clipboard handle the magic behind the scenes.

I'm hoping I can adjust the 'best flavor' logic to prefer kNativeHTMLMime and everything will go back to normal.
Attached patch Patch update July 2015 (obsolete) — Splinter Review
OK. So my idea worked, I made the browser_clipboard.js tests pass, but looking closer my first attempt just ignored the kHTMLContext in all kNativeHTMLMime cases. This passes the tests because they're all fairly simple.

When I run a clipboard viewer on Mac I'm not seeing context information. So I'm not even sure if the context information is required anymore. But I didn't want to leave it like that, so I moved the "have context" check inside the kNativeHTMLMime code, and where context is available use it instead of the one returned from ParseCFHTML. This results in some duplication but should keep the intended behaviour.

This patch finally matches what Neil said in comment #45 and passes all tests. It's pretty much ready to submit - the only remaining question is the duplicate "extractFragment" test function in browser_clipboard.js, there doesn't seem to be a common JavaScript place to put it and I don't know how to implement it in C.

Hallvord, can you please run the tests on the try server? Or should we set me up with access to do that?
Attachment #8607962 - Attachment is obsolete: true
Flags: needinfo?(hsteen)
Attached patch Patch update July 2015 (obsolete) — Splinter Review
Apologies for the double update, I forgot to add the commit message in that patch.
Attachment #8636377 - Attachment is obsolete: true
Flags: needinfo?(hsteen)
Mike will take care of a try run - thanks, Mike!
Looks like there are some more tests that assert HTML clipboard contents since we last did a try run in April. I'll work on updating them.

I've also been wondering whether these tests should have a different assert with fragment data on windows, instead of just stripping the fragment. I'm not sure how feasible that is given my lack of C knowledge, but I'll give it a go.
Attached patch Patch update July 2015 (obsolete) — Splinter Review
This update fixes the failures in M(1), M(4) and M(JP). That's 25 of the 34 failures. I'm not sure about the others; they don't look like they failed for reasons related to these clipboard changes?

I did end up switching to the assert value having fragment header on windows only, which I think is much more accurate and removes the duplication in browser_clipboard.js.

The 5 updated tests all pass on windows, and still pass on my mac, so we should be good to go for another try run.
Attachment #8636379 - Attachment is obsolete: true
Here ya go: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5da71a5d87cd

The other test failures look unrelated (timeouts and intermittent failures).
>_< Forgot to select tests on that last one.

Here's another run, just building and running test son win32 and win64: https://treeherder.mozilla.org/#/jobs?repo=try&revision=e03ad3a75fc5.
Thanks Mike!

In amongst all of the unrelated failures yesterday I miscounted the clipboard failures, and there's still one error in "pasteTransferable". I haven't seen that code before, but with any luck I can use the original patch (bug 525389) to figure out where to look.
This could take a bit to fix. The pasteTransferable code is creating a custom transferable (as the name suggests), but this isn't playing well with the other code changes. The test result looks like the clipboard contents is being cut in half, which suggests it's hitting a "divide length by 2" case incorrectly (aka UTF8 vs UTF16).

I'll keep digging.
Assignee: steven → hsteen
Oops.. No idea why a "hg bzexport" command decided to attach something here and assign the bug to me. Please ignore.. :-o
Assignee: hsteen → steven
Feel free to assign it to me instead :)

I'll get around to tracking down that last failing test... eventually
After looking at this patch a bit, I'm not sure it is correct. It seems to change our interpretation of the 'text/html' to treat them as single-byte strings, yet doesn't change them for text/html we put on the clipboard/drag buffer which are wide strings.

Am I interpreting this correctly?
Blocks: 1050566
Attachment #8661172 - Attachment is obsolete: true
Ah, I was looking at the older patch. I'll investigate with the newer patch tomorrow.
I'm refreshing the "July 2015" patch and will submit a try run.
Thanks for looking at it.
I've refreshed the patch (will attach in the next comment) and compiled.
Using URL
  data:text/html, <html contenteditable>
I've dragged a link from Chrome as per bug 1050566. That works.

It would be good if we could land this one day soon ;-)
This is the refreshed patch. Note that some predecessors already had r+.

Try run:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d0c47f5f3f90
Attachment #8411360 - Attachment is obsolete: true
Attachment #8637143 - Attachment is obsolete: true
Uff, this has an awful lot of oranges. I'm running the Mochitests again, just to see which failures are real. Test results are usually better before noon GMT.

Try run:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2579937d1142

Also, this contains Windows specific code, so this should also be tested on Windows. But let's tackle the Linux failures first.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2579937d1142

This is slightly better. However, there are some weird errors:

dom/base/test/test_getFeature_with_perm.html | navigator.getFeature should exist
dom/network/tests/test_network_basics.html | navigator.connection should exist
dom/tests/mochitest/geolocation/test_geolocation_is_undefined_when_pref_is_off.html | undefined assertion name - got [object Geolocation], expected undefined
etc.

BTW: Most of these errors are already present in this run of mid July (comment #94):
https://treeherder.mozilla.org/#/jobs?repo=try&revision=e03ad3a75fc5

Who will be looking into this? Neil? Andrew? Or will this be left to rot since it looks "too hard"?
Flags: needinfo?(enndeakin)
Flags: needinfo?(andrew.herron)
OK, I ran one of the failing tests locally:
mach mochitest dom/base/test/test_getFeature_with_perm.html
Fails just the same as in the try run.

I noticed that the patch adds this:
(function () {
  // On windows, HTML clipboard includes extra data (bug 586587). Values from widget/windows/nsDataObj.cpp
  var isWin = navigator.platform.indexOf("Win") >= 0;
  SimpleTest.clipboard = {
    htmlPrefix: isWin ? "<html><body>\r\n<!--StartFragment-->" : "",
    htmlPostfix: isWin ? "<!--EndFragment-->\r\n</body>\r\n</html>" : ""
  };
})();

to testing/mochitest/tests/SimpleTest/SimpleTest.js

Removing this, the test passes. So there is something wrong with this addition, since it makes completely unrelated tests fail.
Quoting Ehsan from IRC:
===
the right way to do that is to put that code in dom/base/test/clipboard_helpers.js or some such and include that file in the tests that need it and leave other tests alone.
===

I'll look into it.

Unrelated:
Also, sorry about the 'Or will this be left to rot since it looks "too hard"?' in comment #106.
This stuff is damn hard for someone unfamiliar with the test system (like myself) ;-)
NB: I've seen so much good work go to waste in patches that didn't move forward, so I got a bit edgy. Especially fixing drag&drop with has 21 votes (!) would be a good thing to do, and it looks like it's almost there!
I removed the modification of SimpleTest.js and adjusted the tests accordingly.

Try run will follow.
Attachment #8679145 - Attachment is obsolete: true
FWIW the reason behind the test failures is accessing navigator.platform as soon as the test page loads.  That makes the WebIDL layer define the methods and properties on Navigator before the test page has had a chance to set prefs that would make things such as navigator.getFeature() become visible.
Assignee: steven → mozilla
Excellent result with the latest patch:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2efd167cea90

Six failures in one test in Mochitests (1):
INFO TEST-UNEXPECTED-FAIL | dom/base/test/test_bug166235.html | text/html value in the clipboard - got "<p id=\"test0\">This text should be copied.</p>", expected "<html><body>\r\n<!--StartFragment--><p id=\"test0\">This text should be copied.</p><!--EndFragment-->\r\n</body>\r\n</html>"

And that's due to my clumsy "correction". I'll fix this straight away.
Another try run, this time for all platforms. Linux should already be OK:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=63327a9240f9
Attachment #8680185 - Attachment is obsolete: true
Thanks for looking into this. I'm sorry I can't help at the moment, I'm actually recovering from surgery for the next couple of weeks.

While it may appear to have become idle and in danger of being left to rot, I had plans to continue soon. The patch has basically been done since May, but by July I became too frustrated with the test failures to spend my personal time on it (hence the weird test code you found, I was struggling to find the right place for it). I haven't had work time available to finish it but that is likely to change in November.

FYI, in regards to the earlier comment about single-byte vs double-byte strings, the reason this patch switches to single-byte so much is that's the defined windows CF_HTML standard. I've attempted to ensure the clipboard contents are single-byte while remaining double-byte internally as the rest of the codebase expects, but the conflation between CF_HTML and 'text/html' makes things quite difficult.

Tacking this on the side is wrong in many ways, but separating the two concepts so they're dealt with correctly is way beyond my level of C knowledge :)
Flags: needinfo?(andrew.herron)
Thank you for your reply. Get better soon! And sorry again for my unfriendly comment.

Yes, I know, those tests can be *very* frustrating. It appears that you don't have Level 1 access rights, so someone else has done the try server runs for you in the past (for example in comment #94). This makes it even more frustrating.

I will look into it further, and if I have questions, I will consult with you. Once complete, your work will be checked in with your name on it, so your contribution will be recognised.

Coming to the latest try run:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=63327a9240f9

Linux: All green as expected.
Mac: Failures in jetpack-package/addon-sdk/source/test/test-clipboard.js
Windows: Not complete yet.

I'll keep looking into the test failures.

Neil: If in the meantime you have a chance to look at the patch, please let us know whether your concerns from comment #100 still stand.
Thanks :)

The lack of access certainly didn't help, but I was working through it. I did manage to run (and eventually fix) the tests failing locally to the point where the last big failure was the jetpack addon-sdk test. If that's the only failure moving forward that's fantastic.

This particular test was fine on mac at the time, the error on that mac try run appears to be using the windows assertion (it's looking for a full HTML document?) which is not good.

This test is a really tough one; it puts 'text/html' on the clipboard directly via the SDK API. I updated it in the patch to include the now-expected start/end fragment headers on windows, but when I last checked I think it was still having issues with what appeared to be cutting the string in half. I figured this meant it was somehow using the single-byte string length but it had a double-byte string.

That was the point where I put it to one side; due to my unfamiliarity with the code figuring out where this might be happening and why likely requires a bit of a deep dive which I would need to set aside a solid block of time for :)

My guess is the fix will be in one of two places:

1)
widget/windows/nsClipboard.cpp assumes CF_HTML is UTF8 single-byte, but it's probably getting double-byte from this SDK code. Maybe it needs to check. This might not be a good fix however as other apps will assume CF_HTML is single-byte which leads me to...
2)
Elsewhere, code which loads data onto the windows clipboard (I forget where) converts double-byte 'text/html' to single-byte CF_HTML before doing so. Maybe this SDK code needs to do the same.
Looks like you know a whole lot more about this than I do ;-)

I just picked up the patch, rebased it and sent it to the try server. While I was at it, I removed the code from SimpleTest.js (where it shouldn't have been) and got the Linux tests to pass.

Yes, on Mac there is this one failure I mentioned in test-clipboard.js. I don't know why this is failing, looks like this comparison fails: "<b>hello there</b>" == "<html><body>
(No idea why there is a quote missing, maybe due to the "cutting in half" you mentioned.)
I don't have access to Mac hardware, so I can't debug that.

Let's wait a few hours to see what the try server run shows for Windows, then we get a clearer picture of the remaining work. I'm afraid I can't help you much with the code, since I'm not familiar with this area at all. But of course I can run the try runs for you.

When I run dom/base/test/test_bug166235.html locally on Windows it fails:
8 INFO TEST-UNEXPECTED-FAIL | dom/base/test/test_bug166235.html | test0.innerHTML: textarea paste - got "", expected "This text should be copied."

12 INFO TEST-UNEXPECTED-FAIL | dom/base/test/test_bug166235.html | uncaught exception - NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsITransferable.getTransferData] at chrome://specialpowers/content/specialpowersAPI.js:162

Particularly the second one doesn't look so good.

Looking at your user profile, this is your first bug to work on, ... and you really picked a hard one.
I picked it because it impacts the product I work on for my day job :)

As I said it could be a few weeks before I get a chance to look at it, thanks for your help getting it up to date.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=63327a9240f9

Looking at the M(JP) failure on Mac. Quoting from the log file:
Running tests on Firefox 44.0a1/Gecko 44.0a1 (Build 20151028162006) ({ec8030f7-c20a-464f-9b0e-13a3a9e97384}) under darwin/x86_64.

Notice something? MacOS X announces itself as "darwin". Sadly, we test for "win" and therefore, Mac is treated as Windows, and the test fails.

That's easily fixed. So with that fix, Linux and Mac are green and we only have to worry about Windows. Sadly the Windows results are still not complete.

The M(1) failure on Windows is an easy win, it's just:
1140 INFO TEST-UNEXPECTED-FAIL | dom/base/test/test_copyimage.html | clipboard has correct [text/html] content - got "<html><body>\r\n<!--StartFragment--><img id=\"logo\" src=\"about:logo\"><!--EndFragment-->\r\n</body>\r\n</html>", expected "<img id=\"logo\" src=\"about:logo\">"

So one more test that needs the "fragment" stuff added. I'll proceed once we have all the results.
Slight fix to testing. Mac should be all green with this. Try run:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=c064f95c0678
Attachment #8680324 - Attachment is obsolete: true
https://treeherder.mozilla.org/#/jobs?repo=try&revision=63327a9240f9 - Linux green, Mac almost green.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=c064f95c0678 - Remaining Mac green.

Windows failures so far:
dom/base/test/test_copyimage.html M(1)
dom/base/test/test_bug166235.html M(1) (fails locally on W7 debug, but not on try so far, strange)
editor/libeditor/tests/test_bug525389.html M(4)
Fixed dom/base/test/test_copyimage.html M(1).
Cleaned-up a lot of "white-space issues" to adhere more to the Mozilla coding standards.

test_bug166235.html is failing locally due to a conflict with a clipboard manager I have installed (Ditto). After closing this, it works.

So we're down to one test not working:
editor/libeditor/tests/test_bug525389.html M(4)

Hooray!!!
Attachment #8681329 - Attachment is obsolete: true
wow, those 'win' checks I've been using were copied from elsewhere in the codebase. I wonder how many other tests are running on mac unexpectedly...

I should've realised 166235 fails due to a clipboard manager. That one caused me a lot of grief but it was so long ago I just don't even think about why I close the clipboard tool now :)

Looking back at comment #95 when I first hit the 525389 failure, that's the doozy involving SDK APIs that I was talking about yesterday. If you can't figure it out I'll have a go when I get back to work.
(In reply to Andrew Herron from comment #123)
> wow, those 'win' checks I've been using were copied from elsewhere in the
> codebase. I wonder how many other tests are running on mac unexpectedly...
There are two types of checks:
In the (doomed/ill-conceived) Jetpack stuff it checks "platform" which returns "winnt", "darwin" and "linux".
Otherwise it checks navigator.platform which returns "Win32" and "Mac...".
They are not interchangeable. I assume it's done right and nothing runs unexpectedly.

> Looking back at comment #95 when I first hit the 525389 failure, that's the
> doozy involving SDK APIs that I was talking about yesterday. If you can't
> figure it out I'll have a go when I get back to work.
Thanks for the reference. I'm just looking at it now.

Isn't it 8 AM in Brisbane now?
Ah, so that was simply my fault then. Good to know.

It was 7am (Saturday) when you wrote that :)
In this patch, the last failing test is fixed:
editor/libeditor/tests/test_bug525389.html M(4)

There were two problems:
1) After converting UTF8 to UTF16, the data length needed to be doubled.
2) The fragment tags needed to be removed.

Try run for Window XP opt only, M(4) and retesting M(1) just to be sure.
Since all the Windows versions have behaved the same, there is no need to burn server resources: https://treeherder.mozilla.org/#/jobs?repo=try&revision=028e897374fb
Attachment #8681467 - Attachment is obsolete: true
Can I ask a (perhaps silly) question:

In your patch, you fix up these tests:
addon-sdk/source/test/test-clipboard.js M(JP)
browser/base/content/test/general/browser_clipboard.js M(bc7)
dom/base/test/test_copypaste.html (copypaste.js) M(1)
dom/base/test/test_bug166235.html M(1)
dom/base/test/test_copyimage.html M(1)
They now check the operating system and then expect other things when run on Windows.

Why?

For test editor/libeditor/tests/test_bug525389.html M(4) the Windows specific stuff is removed in the C++ code, so it never makes it to the surface. Although the test is affected, we don't change it.

Why don't we just strip the undesired extra stuff and leave the tests as they are?

Actually, looking at https://treeherder.mozilla.org/#/jobs?repo=try&revision=028e897374fb, with my stripping, M(4) passes, but the following in M(1) fail now since the Windows specific stuff is not returned any more: test_copypaste.html
test_bug166235.html
test_copyimage.html.

I'll revert these tests to their original state and try it again.
In this patch I've removed all modifications to existing tests. See what happens.
We need to run M(1,4,JP,bc7):

https://treeherder.mozilla.org/#/jobs?repo=try&revision=638927ac0e28
Attachment #8681596 - Attachment is obsolete: true
As for testing:
I'm using you test case from attachment 8593110 [details]. Since I have no MS applications on my machine, I'm dragging/pasting from Chrome (as described in bug 1050566).

That works just fine. I don't think we need to pass the entire "document" back, so:
<html><body>\r\n<!--StartFragment--><span>Hello</span><span>Kitty</span><!--EndFragment-->\r\n</body>\r\n</html>

It's sufficient to pass the "fragment" back, so:
<span>Hello</span><span>Kitty</span>.

Right?
OK, the existing tests work, with one minor issue in test_copypaste.html: <CR><LF> vs. <LF>, example:

got
"<div id=\"alist\">\r\n bla\r\n <ul>\r\n <li>foo</li>\r\n \r\n <li>bar</li>\r\n </ul>\r\n </div>"
expected
"<div id=\"alist\">\n bla\n <ul>\n <li>foo</li>\n \n <li>bar</li>\n </ul>\n </div>"

Change the test or change the C++ code to strip the \r?
Better in the C++ code, right?
Stripping out \r while we copy. This fixes the failing test.

Maybe this it the final solution, so let's give this another good test run or all platforms:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b37ac2054a18
Attachment #8681598 - Attachment is obsolete: true
Flags: needinfo?(enndeakin)
Ack. No, please don't strip down to just the fragment in FindPlatformHTML. ContentEditable does this internally because it grabs the HTML context from other clipboard data, but I very deliberately returned the actual clipboard contents (in this case, an entire HTML document) for the JS APIs. We *need* that context information surrounding the fragment to paste correctly.

It's not required by any test attached to this bug, but in one very specific case discussed some time ago (my summary in comment #61) discuss using the full document via JS APIs (which Chrome does give us).

Sorry about this, I was out of mobile range when you started down the path, please revert FindPlatformHTML and the tests.
(Looks like Brisbane is not on Melbourne time, those Queenslanders doing their own thing ...)

Hmm, my motivation of removing the full <html><body>\r\n<!--StartFragment--> and <!--EndFragment-->\r\n</body>\r\n</html> is that this way no tests have to change.

Changing tests because program behaviour has changed for no apparent reason is somewhat frowned upon and may have trouble to pass reviews. Making tests depend on operating system is not nice and having those ugly "fragment" comments everywhere, is really quite a pain. Even when stripping the stuff, your test case (attachment 8593110 [details]) works just fine.

The try looks all green (with 99% complete): https://treeherder.mozilla.org/#/jobs?repo=try&revision=b37ac2054a18.

Now you're saying you need the whole document (as per comment #61).

So why don't I do this:
Leave the body tags <html><body> alone, but remove the ugly Windows fragment comments. Also remove the \r\n and remove any \r in the UTF-8 string.

Remember, we saw in test_bug525389.html that stripping is necessary since, as you said, the tags get stripped elsewhere making tests fail on the extra newlines.
I'll just stop for a while and let you give me some more input.

Summary:
Returning just the fragment from FindPlatformHTML passes all the tests which can remain unmodified.
It also makes your test from attachment 8593110 [details] work.
However, you're saying that you want more than the fragment, you want the document including the <html><body> tags.

I still haven't quite understood why returning the fragment isn't enough. If we look at the "HelloKitty" test, the expected result at the moment is <span>Hello</span><span>Kitty</span> on all platforms.

If we go ahead with the previous approach, *only* on Windows we would expect
<html><body>\r\n<!--StartFragment--><span>Hello</span><span>Kitty</span><!--EndFragment-->\r\n</body>\r\n</html>

Why is this desirable? Firefox on Mac and Linux would then behave differently. How would that make your life easier?

Let's look at the table again:
OS X - Behaviour won't change
====
OS X, copy from Firefox, places a Document on the clipboard (no magic tags).
  Paste Firefox -> Fragment, Paste Chrome -> Document
OS X, copy from Chrome, places a Fragment on the clipboard (no magic tags).
  Paste Firefox -> Fragment (1), Paste Chrome -> Fragment
OS X, copy from MS Word, places a Document on the clipboard (no magic tags).
  Paste Firefox -> Document (2), Paste Chrome -> Document

Windows
=======
Windows, copy from Firefox, places a Document on the clipboard.
  Paste Firefox -> Fragment, Paste Chrome -> Document
Windows, copy from Chrome, places a Document on the clipboard.
  Paste Firefox -> <new 1>, Paste Chrome -> Document
Windows, copy from MS Word, places a Document on the clipboard.
  Paste Firefox -> <new 2>, Paste Chrome -> Document

The equivalent to "new 1" on OS X is Fragment.
The equivalent to "new 2" on OX X is Document.

Since we can't have it both ways, Fragment doesn't seem so bad, but most likely your argument will be that whatever is on the clipboard should be pasted, so Document in both cases.

Please explain further.

I'll change the names of the attachments a little, don't worry, the previous state is still there.
Comment on attachment 8681615 [details] [diff] [review]
Approach 2 - Fragments: Andrew Herron's patch (Jorg K - take 7)

This approach returns fragments.
All tests pass:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b37ac2054a18
Attachment #8681615 - Attachment description: Proposed solution: Andrew Herron's patch (Jorg K - take 7) → Approach 2 - Fragments: Andrew Herron's patch (Jorg K - take 7)
I wanted to say: All tests pass as they are, no test modified.
Here I'm going back to take 4, but adding the length correction (doubling the length after conversion from UTF-8 to UTF-16) from take 5. So take 8 is really take 4.5 ;-)

With this patch all the *modified* tests should pass, however test_bug525389 M(4) won't pass. One of the errors will be:
got    "\n<span>Hello</span><span>Kitty</span>\n\n",
expected "<span>Hello</span><span>Kitty</span>"

We can see that the <html><body> tags and fragment comments were stripped by the editor, but the newlines weren't. (That's what led me to only returning the fragment in take 5, 6 and 7.)

This can of course be fixed easily by adjusting the test for Windows, like we've changed all the others.

Let's give this option another broader try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=bc266491ba2d
Attachment #465154 - Attachment is obsolete: true
OK, https://treeherder.mozilla.org/#/jobs?repo=try&revision=bc266491ba2d is green apart from the failure in M(4) in test_bug525389.html which we expected.

Three different cases of three '\n' characters left over from the removal of tags and fragment comments:

got    "\n<span>Hello</span><span>Kitty</span>\n\n"
expected "<span>Hello</span><span>Kitty</span>"

got      "<ol><li id=\"paste_here\">X\n<dl><dd>Hello Kitty</dd></dl><span>Hello</span><span>Kitty</span>\n\n</li></ol>"
expected "<ol><li id=\"paste_here\">X<dl><dd>Hello Kitty</dd></dl><span>Hello</span><span>Kitty</span></li></ol>"

got      "<pre id=\"paste_here\">Hello \nKitty<span>Hello</span>\n\n</pre>"
expected "<pre id=\"paste_here\">Hello Kitty<span>Hello</span></pre>"

Now we have two choices:
1) Fix this test and go with Approach 1 which returns the document from the clipboard.
2) Go with Approach 2 which returns the fragment from the clipboard.

I guess I will present both approaches to the powers that be. Maybe the reviewers have a preference.
Fixing test_bug525389.html. Now we have two fully working options.
Attachment #8681636 - Attachment is obsolete: true
Thank you for updating a second copy of the patch the way I requested. Now that I'm not half asleep, I'll try to provide a detailed explanation.

My argument (if it comes up) as to why the tests are platform specific now is that technically, this is testing the actual FireFox platform specific behaviour. Previously it was testing the fake FireFox-specific "text/html" clipboard format which no other Windows application uses. When copying things to the clipboard, FireFox creates CF_HTML and it adds the fragment header and footer when doing so. This is done in function nsDataObj::BuildPlatformHTML (file widget/windows/nsDataObj.cpp).

When pasting, this fragment can't be stripped out automatically because CF_HTML from sources other than FireFox can (and do) include useful data outside of the fragment tags. This is also more consistent with OS X where FireFox does not remove the HTML header tags (there is no fragment information that could do this easily, but I think the point is JS receives the whole HTML document on OS X).

If we want less platform specific tests, we should update the *other* platforms to create full HTML documents instead of partial fragments as they do today.


OK. So specifically, why do *I* want it to work this way?

The discussion in and around comment #61 is that for JS editors with custom clipboard code, we want the absolute raw clipboard regardless of what it is. The less processing done between the native clipboard and our code the better. This is what Chrome does for us on all platforms, what FireFox does for us on OS X and Linux, and with my patch what FireFox does for us on Windows.

Whether we have the fragment comments or not is irrelevant, you can strip those out if it makes the tests easier (although that doesn't seem to be necessary after all). What matters is that when pasting from MS Word there is a <head> tag outside of the fragment body that contains critical information for styling the HTML fragment correctly.

When the ContentEditable C code sees CF_HTML it splits the style information and the fragment contents into two separate chunks and deals with both. This is also what JS editors with custom MS Word processing do (Textbox.io, TinyMCE and according to bug #1183686 CKEditor as well).

If this is merged with just the fragment patch, it will technically fix the original bug here but not the ones marked as a dupe of it. It will be of no use to JS editors (and we'll all probably log new bugs saying please give us the whole document).


Please let me know if anything I've said is unclear, I'm happy to go into further detail or provide actual examples if necessary. I've been working with this sort of clipboard processing for a very long time :)
Comment on attachment 8681615 [details] [diff] [review]
Approach 2 - Fragments: Andrew Herron's patch (Jorg K - take 7)

I'm convinced ;-)

I just went off on a tangent to fix test_bug525389.html.

It is now lazily fixed in Approach 1, I'm thinking about doing it better by stripping the tags *and* the newlines here:
https://dxr.mozilla.org/mozilla-central/source/editor/libeditor/nsHTMLDataTransfer.cpp#892

So let's abandon Approach 2.
Attachment #8681615 - Attachment is obsolete: true
Excellent :)

Removing newlines specifically would probably work, or it might be easier to just trim the resulting string?
Hi Jim,

this bug has come a long way. You approved a previous version of the patch back in April 2014 (!), comment #22. It got landed but caused a bunch of test failures since (foolishly) no one had bothered to submit it to try before presenting to review.

There was no activity from April 2014 until April 2015, when Andrew Herron picked it up. I've picked up the last patch he submitted in July 2015, rebased it and fixed the remaining test failures. I also fixed "coding style" issues. Note that the code we're modifying is old, so you find things like
  result = functionX ( parameter );
and
  }
  else {
which don't adhere to the current coding standard. I tried to match the surrounding code as well as possible.

There have been may try runs for this, so here is the last one. I'm just running opt, since this is faster and previous runs haven't shown any differences between opt and debug. Also, from experience, all versions of Windows behave the same here.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=37dca8570669
Attachment #8681654 - Attachment is obsolete: true
Attachment #8681682 - Flags: review?(jmathies)
Attachment #8681682 - Flags: feedback?(enndeakin)
(In reply to Andrew Herron from comment #142)
> Removing newlines specifically would probably work, or it might be easier to
> just trim the resulting string?
Since you're asking:
This is related to test_bug525389.html where we run into this problem:
got    "\n<span>Hello</span><span>Kitty</span>\n\n"
expected "<span>Hello</span><span>Kitty</span>"
resulting from tags being removed in the editor, but not surrounding newlines.

I tried to remove the newlines together with the fragment comments. Sadly newline conversion happens *after* the fragment comment removal:
https://dxr.mozilla.org/mozilla-central/source/editor/libeditor/nsHTMLDataTransfer.cpp#985
https://dxr.mozilla.org/mozilla-central/source/editor/libeditor/nsHTMLDataTransfer.cpp#997
I didn't want to test for Windows specific stuff in the generic code. Also, removing the newlines preceding and following the fragment comments would only have been half the job. On Windows we also have to remove the newline here: </body>\r\n</html>, which would be another bit to tweak.

In the end I abandoned the idea and went with fixing the test instead.

I think this bug is done. The code changes are 99.99% your changes, I added one single line to double the data length after converting from UTF-8 to UTF-16. Ah yes, and I changed strncpy to memcpy.
Ah, fair enough.

I knew the patch was close in July but I didn't realise it was *that* close. You've been a great help, thank you :)
Try 100% green: https://treeherder.mozilla.org/#/jobs?repo=try&revision=37dca8570669
This is ready for review.
Comment on attachment 8681682 [details] [diff] [review]
Proposed solution: Andrew Herron's patch (Jorg K - take 9)

> const { base64jpeg } = require("./fixtures");
> 
>+const { platform, pathFor } = require("sdk/system");

You don't use 'pathFor'.


>diff --git a/browser/base/content/test/general/browser_clipboard.js b/browser/base/content/test/general/browser_clipboard.js

>-  let results = yield ContentTask.spawn(browser, { modifier: modifier },
>+  // On windows, HTML clipboard includes extra data (bug 586587). Values from widget/windows/nsDataObj.cpp
>+  const htmlPrefix = (content.navigator.platform.indexOf("Win") >= 0) ? "<html><body>\r\n<!--StartFragment-->" : "";
>+  const htmlPostfix = (content.navigator.platform.indexOf("Win") >= 0) ? "<!--EndFragment-->\r\n</body>\r\n</html>" : "";
>+

Just 'navigator.platform' should work. And also remove 'content.' from the line above where modifer is declared. That was cut and paste from when it used to be inside the spawn block where it was needed, but isn't any more.

Also, remove the bug number reference, since it implies that this behaviour is a bug when it isn't.


>+
>+  if (removeHeader) {
>+    // For the JS APIs, we want the full HTML document from CF_HTML but not the header offset information.
>+    char* charData = (char*)*outData;
>+    int32_t contentLength = endOfData - startOfData;
>+
>+    char* contentData = static_cast<char*>(moz_xmalloc(contentLength));
>+    if (contentData) {
>+      // extract the substring
>+      memcpy(contentData, charData + startOfData, contentLength);
>+
>+      free(*outData);
>+      *outData = contentData;
>+      *outDataLen = contentLength;
>+    }
>+  }

It would be simpler and avoid extra memory allocations to remove the 'removeHeader' flag and all of this code, and just return 'startOfData' in an out parameter.
Then just pass buffer + startOfData to CreatePrimitiveForCFHTML.

I think the 'text/html' data should not include \r characters. You can use ConvertPlatformToDOMLinebreaks to strip out the \r characters within nsClipboard::GetDataFromDataObject rather than having tests handle it. It already does this for text types, so look there for how this is done.

This patch seems to remove adding 'text/html' as a type to the clipboard on Windows, and only using CF_HTML. That's good. Did you verify that adding to the clipboard / starting a drag does this correctly? A glance over the code suggests it should work. Does it work when dragging from older versions (or from an unpatched Thunderbird) without this patch to versions with this patch?
Attachment #8681682 - Flags: feedback?(enndeakin) → feedback+
Comment on attachment 8681682 [details] [diff] [review]
Proposed solution: Andrew Herron's patch (Jorg K - take 9)

Removing the review request for now until I fix the issues raised in the feedback. Thanks, Neil!

As for your questions: (It's actually not my code, I just got it to work the way the original author had intended it.)

Copying something in FF/TB currently creates a text/html item on the Windows clipboard.
Copying something in the patched version, doesn't create this entry.

Copying/dragging from an old unpatched version of FF/TB to the patched one works. It uses the CF_HTML as intended.
Attachment #8681682 - Flags: review?(jmathies)
Jorg, if you don't get to this during your day I'm happy to address the feedback during my day and then you can kick off another try run when you wake up :)
Andrew, you can also request try server access as documented on <https://www.mozilla.org/en-US/about/governance/policies/commit/access-policy/>.
Addressing feedback issues:
1) removed pathFor
2) changed content.navigator. ... to navigator. ...
3) changed logic in CreatePrimitiveForCFHTML
4) removing \r now in the code (and thus removed from the tests)

Try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=830239384b79

Who wants to review this?
Attachment #8681682 - Attachment is obsolete: true
Attachment #8682234 - Flags: review?(jmathies)
Attachment #8682234 - Flags: review?(enndeakin)
That's much more elegant solution in FindPlatformHTML... You can tell I'm not a C programmer :)

You left an unnecessary comment here:
https://bugzilla.mozilla.org/attachment.cgi?id=8682234&action=diff#a/dom/base/test/copypaste.js_sec2

> // windows has extra content, and \n instead of \n

Could just be "windows has extra content".
Cosmetic changes: Fixed some comments and used sizeof(char16_t).

I'm a bit worried about the crash in M(bc7), which I don't see locally, so I'm running this again:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=3f6d5e9f7ba0
Attachment #8682234 - Attachment is obsolete: true
Attachment #8682234 - Flags: review?(jmathies)
Attachment #8682234 - Flags: review?(enndeakin)
Comment on attachment 8682333 [details] [diff] [review]
Proposed solution: Andrew Herron's patch (Jorg K - take 10b)

Hmm, I'm a bit confused by the test result:
Mac and Windows XP (32bit) M(bc7) are green now, instead it crashes in Windows 64 bit. Sadly no matching intermittent failure is offered.
I'm not aware of any wrongdoing ;-)
Neil, can you please comment? Who would do the review, you or Jim? Or both?
Maybe it's OK as it is, and you can review straight away.
Attachment #8682333 - Flags: feedback?(enndeakin)
I ran another one: https://treeherder.mozilla.org/#/jobs?repo=try&revision=0415ee102723

Only change I made was change
nsLinebreakHelpers::ConvertPlatformToDOMLinebreaks ( kHTMLMime, (void**)&utf16, &signedLen );
to
nsLinebreakHelpers::ConvertPlatformToDOMLinebreaks ( kHTMLMime, reinterpret_cast<void**>(&utf16), &signedLen );
which shouldn't make a difference. Still crashes in M(bc7) on Windows 8 (64 bit). But the crash is different:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=0415ee102723
PROCESS-CRASH | browser/base/content/test/general/browser_contentSearchUI.js | application crashed [@ DoMarking<JSObject>(js::GCMarker *,JSObject *)]
PROCESS-CRASH | browser/base/content/test/general/browser_contentAreaClick.js | application crashed [@ mozilla::dom::Element::QueryInterface(nsID const &,void * *)]

https://treeherder.mozilla.org/#/jobs?repo=try&revision=3f6d5e9f7ba0
PROCESS-CRASH | browser/base/content/test/general/browser_contentSearchUI.js | application crashed [@ TraversalTracer::onChild(JS::GCCellPtr const &)]
PROCESS-CRASH | browser/base/content/test/general/browser_contentAreaClick.js | application crashed [@ js::PreliminaryObjectArray::sweep()]

Looks like random memory corruption. Oh, I know. Removing the \r most likely allocated a shorter buffer, and when I append the '\0' again, it writes where it shouldn't. Damn!
Attachment #8682333 - Flags: feedback?(enndeakin)
Here we go again, this time the string handling should be right.

Try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=584f5e376098
Attachment #8682333 - Attachment is obsolete: true
Damn, I mistyped this comment: "lenght+1", will fix in the next round.
Comment on attachment 8682479 [details] [diff] [review]
Proposed solution: Andrew Herron's patch (Jorg K - take 10c)

https://treeherder.mozilla.org/#/jobs?repo=try&revision=584f5e376098 is now all green. Previous run https://treeherder.mozilla.org/#/jobs?repo=try&revision=830239384b79. This was green apart from the M(bc7) which are fixed now.

Who needs to review this?

Can the reviewer please note comment #157 and comment #143.
Attachment #8682479 - Flags: review?(jmathies)
Attachment #8682479 - Flags: review?(enndeakin)
Comment on attachment 8682479 [details] [diff] [review]
Proposed solution: Andrew Herron's patch (Jorg K - take 10c)

>+  // The conversion from UTF-8 to UTF-16 creates and nsAutoString, so we need
>+  // to extract the data from it.
>+  nsAutoString a(NS_ConvertUTF8toUTF16(utf8, *aDataLen));
>+  char16_t* utf16 = ToNewUnicode(a);
>+  *aDataLen = a.Length() * sizeof(char16_t);
>+
>+  // We know it's null terminated, so we can pass "lenght+1".
>+  // This way, if the string is shortened, the null is shifted down.
>+  // A little hacky, but the alternative is to realloc it after the call
>+  // to add the null again.
>+  int32_t signedLen = static_cast<int32_t>(*aDataLen) + sizeof(char16_t);
>+  nsLinebreakHelpers::ConvertPlatformToDOMLinebreaks(
>+                        kHTMLMime, reinterpret_cast<void**>(&utf16), &signedLen );
>+  *aDataLen = signedLen - sizeof(char16_t);
>+
>+  primitive->SetData(nsAdoptingString(utf16));
>+  NS_ADDREF(*aPrimitive = primitive);
>+}

If you pass kTextMime to ConvertPlatformToDOMLinebreaks, it will convert single-byte characters. This should simplify this code.


>+        else if ( strcmp(flavorStr, kHTMLMime) == 0 ) {
>+          uint32_t startOfData = 0;
>+          // The JS folks want CF_HTML exactly as it is on the clipboard, but
>+          // minus the CF_HTML header index information.
>+          // It also needs to be converted to UTF16 and have linebreaks changed.
>+          if ( FindPlatformHTML(aDataObject, anIndex, &data, &startOfData, &dataLen) ) {
>+            dataLen -= startOfData;
>+            nsPrimitiveHelpers::CreatePrimitiveForCFHTML ( static_cast<char*>(data)+startOfData,
>+                                                           &dataLen, getter_AddRefs(genericDataWrapper) );

Please add spaces around the + operator.


> //
> // FindPlatformHTML
> //
> // Someone asked for the OS CF_HTML flavor. We give it back to them exactly as-is.
> //
> bool
>-nsClipboard :: FindPlatformHTML ( IDataObject* inDataObject, UINT inIndex, void** outData, uint32_t* outDataLen )
>+nsClipboard :: FindPlatformHTML ( IDataObject* inDataObject, UINT inIndex,
>+                                  void** outData, uint32_t* outStartOfData,
>+                                  uint32_t* outDataLen )

Add a comment indicating that outData is the pointer to the full data and outStartOfData is the pointer within that data after the header portion.


>    {\
>diff --git a/widget/windows/nsDragService.cpp b/widget/windows/nsDragService.cpp
>     format = nsClipboard::GetFormat(aDataFlavor);
>     SET_FORMATETC(fe, format, 0, DVASPECT_CONTENT, -1,
>                   TYMED_HGLOBAL | TYMED_FILE | TYMED_GDI);
>-    if (mDataObject->QueryGetData(&fe) == S_OK)
>+    if (SUCCEEDED(mDataObject->QueryGetData(&fe)))

QueryGetData returns the success code S_FALSE when the data is not present which will give us the wrong result. So don't change this. Similar for the other cases.
Attachment #8682479 - Flags: review?(enndeakin) → review-
Thanks!

(In reply to Neil Deakin from comment #159)
> If you pass kTextMime to ConvertPlatformToDOMLinebreaks, it will convert
> single-byte characters. This should simplify this code.
I thought of that, but it seemed funny to do kTextMime in a function doing kHTMLMime stuff.

> >-    if (mDataObject->QueryGetData(&fe) == S_OK)
> >+    if (SUCCEEDED(mDataObject->QueryGetData(&fe)))
> QueryGetData returns the success code S_FALSE when the data is not present
> which will give us the wrong result. So don't change this. Similar for the
> other cases.

Hmm, just look at what the "other" reviewer said in comment #20:
lets use SUCCEEDED() here vs. the S_OK compare. Maybe you could touch up the other uses above.

Where is SUCCEEDED() defined so I can form my own opinion?
Flags: needinfo?(enndeakin)
(In reply to Jorg K (GMT+2) from comment #160)
> (In reply to Neil Deakin from comment #159)
> > If you pass kTextMime to ConvertPlatformToDOMLinebreaks, it will convert
> > single-byte characters. This should simplify this code.
> I thought of that, but it seemed funny to do kTextMime in a function doing
> kHTMLMime stuff.

It doesn't do anything related to HTML either. I'm suggesting this change to simplify the string handling.


> Hmm, just look at what the "other" reviewer said in comment #20:
> lets use SUCCEEDED() here vs. the S_OK compare. Maybe you could touch up the
> other uses above.
> 
> Where is SUCCEEDED() defined so I can form my own opinion?

S_FALSE is defined as the value '1'.

There's a comment on the documentation for QueryGetData that indicates that S_FALSE is returned:
 https://msdn.microsoft.com/en-us/library/windows/desktop/ms680637%28v=vs.85%29.aspx 

This change is why you're getting the wrong results in comment 17 of bug 938991.
Flags: needinfo?(enndeakin)
Damn, looks like the original reviewer sent us on the garden path:
https://msdn.microsoft.com/en-us/library/windows/desktop/ff485842%28v=vs.85%29.aspx
S_OK	0x0	Success.
S_FALSE	0x1	Success.
https://msdn.microsoft.com/en-us/library/windows/desktop/ms687197%28v=vs.85%29.aspx
#define SUCCEEDED(hr) (((HRESULT)(hr)) >= 0)

So we have to revert that change, thanks!
Addressed the review issues.

New test run on Windows, since only Windows code was changed:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7c44e6f49815

Neil: Maybe you can take another look once the try run is finished.

Note: I wasn't sure whether the UTF-16 length is always the UTF-8 length multiplied by 2, so I went for the safe way:
nsAutoString a(NS_ConvertUTF8toUTF16(reinterpret_cast<const char*>(utf8), *aDataLen));
*aDataLen = a.Length() * sizeof(char16_t);
primitive->SetData(a);
Attachment #8682479 - Attachment is obsolete: true
Attachment #8682479 - Flags: review?(jmathies)
Attachment #8683143 - Flags: review?(enndeakin)
Oops, forgot a 'free()'.
Attachment #8683143 - Attachment is obsolete: true
Attachment #8683143 - Flags: review?(enndeakin)
Attachment #8683156 - Flags: review?(enndeakin)
Just checked, patch (attachment 8683156 [details] [diff] [review]) still applies.
Comment on attachment 8683156 [details] [diff] [review]
Proposed solution: Andrew Herron's patch (Jorg K - take 11b)

>+  nsAutoString a(NS_ConvertUTF8toUTF16(reinterpret_cast<const char*>(utf8), *aDataLen));
>+  free(utf8);
>+  *aDataLen = a.Length() * sizeof(char16_t);
>+  primitive->SetData(a);
>+  NS_ADDREF(*aPrimitive = primitive);

The caller never uses the returned output length, so you can make aDataLen a non-out parameter and simplify the conversion:

  primitive->SetData(NS_ConvertUTF8toUTF16(reinterpret_cast<const char*>(utf8), signedLen));
Attachment #8683156 - Flags: review?(enndeakin) → review+
Hmm, are we talking about the same thing here?

nsClipboard.cpp

  if ( FindPlatformHTML(aDataObject, anIndex, &data, &startOfData, &dataLen) ) {
    dataLen -= startOfData;
    nsPrimitiveHelpers::CreatePrimitiveForCFHTML ( static_cast<char*>(data) + startOfData,
                                                   &dataLen, getter_AddRefs(genericDataWrapper) );
  }
  else
...
NS_ASSERTION ( genericDataWrapper, "About to put null data into the transferable" );
aTransferable->SetTransferData(flavorStr, genericDataWrapper, dataLen);
res = NS_OK;

dataLen is used by the caller.

So, just check it in "as is"?
Flags: needinfo?(enndeakin)
OK, of course. Sorry about that.

+  nsAutoString a(NS_ConvertUTF8toUTF16(reinterpret_cast<const char*>(utf8), *aDataLen));

Just update the name 'a' to be more descriptive. Even 'str' would be ok.

Then this is fine. Thanks!
Flags: needinfo?(enndeakin)
Carrying forward Neil Deakin's r+.
Changed name of temporary variable from 'a' to 'str' as suggested
(I couldn't think of a better name).
Attachment #8683156 - Attachment is obsolete: true
Attachment #8690875 - Flags: review+
Dear Sheriff,
Can you please check this in together with bug 938991.
Can you please check in the this bug here first, so the patches apply cleanly.

Many thanks in advance.
Keywords: checkin-needed
Blocks: 938991
https://hg.mozilla.org/mozilla-central/rev/e1b5927361ac
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla45
Depends on: 1254980
You need to log in before you can comment on or make changes to this bug.