Closed Bug 860857 Opened 11 years ago Closed 8 years ago

Support custom mimetypes via clipboardData for clipboard events

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox48 --- fixed

People

(Reporter: ssaviano, Assigned: enndeakin)

References

Details

(Keywords: dev-doc-complete, testcase)

Attachments

(1 file)

Attached file Testcase
User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.47 Safari/537.36

Steps to reproduce:

The clipboardData object was added to clipboard events via
https://bugzilla.mozilla.org/show_bug.cgi?id=407983

However, this does not support setting a custom mimetype on copy and then reading that custom mimetype on paste. An example usage would be higher-fidelity copy/paste (beyond what can be represented just in HTML/text) for Google Docs.

Attached is a scratch/demo file that uses the clipboardData API to write a custom mimetype on copy. On paste, it just lists the mimetypes/data that clipboardData specifies is available.

The custom mimetype is retained in Safari and Chrome.

Steps:
1) On copy call: e.clipboardData.setData('custom', 'custom data');
2) On paste call: e.clipboardData.getData('custom');
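
The two steps above can be sketched as a pair of event handlers. This is a minimal illustration, not the attached testcase: the `onCopy` and `onPaste` names and the `text/plain` fallback are assumptions added here, and only the `'custom'` type and the `setData`/`getData` calls come from the steps.

```javascript
// Type name taken from the steps above.
const CUSTOM_TYPE = 'custom';

// Copy handler: write a plain-text fallback (an addition for illustration)
// plus the custom type.
function onCopy(e) {
  e.clipboardData.setData('text/plain', 'custom data');
  e.clipboardData.setData(CUSTOM_TYPE, 'custom data');
  // Without preventDefault() the browser ignores the data set above
  // and copies the current selection instead.
  e.preventDefault();
}

// Paste handler: returns the custom data, or '' if the type was not retained.
function onPaste(e) {
  return e.clipboardData.getData(CUSTOM_TYPE);
}

// Only attach the handlers when running inside a page.
if (typeof document !== 'undefined') {
  document.addEventListener('copy', onCopy);
  document.addEventListener('paste', (e) => console.log(onPaste(e)));
}
```

With the behaviour reported here, `onPaste` would return an empty string in Firefox, while Safari and Chrome would return 'custom data'.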


Actual results:

Custom mimetype data is not available


Expected results:

Custom mimetype data is available
Attachment #736421 - Attachment description: cbmon.html → Testcase
Attachment #736421 - Attachment mime type: text/plain → text/html
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: testcase
OS: Linux → All
Hardware: x86_64 → All
Depends on: 407983
No longer depends on: 407983
Olli, what would need to happen here?
Flags: needinfo?(bugs)
That is a question to Neil.
Flags: needinfo?(enndeakin)
But it looks like we currently have a hard-coded list in DataTransfer::CacheExternalClipboardFormats(), for
example. We'd need to extend the clipboard service to let one add more types.
Flags: needinfo?(bugs)
(In reply to comment #3)
> But looks like we have currently hard coded list in
> DataTransfer::CacheExternalClipboardFormats() for
> example.

Yes, that's one thing that we need to do.  Another one is to see if nsDOMDataTransfer can handle those.

> We'd need to extend clipboard service to let one to add more types.

I thought that is already possible?
There isn't an API to retrieve the list of types on the clipboard. The underlying platforms didn't expose such an API at the time the clipboard and data transfer code was originally written. It's possible that they do now, although I don't know for sure.

If that is possible now, CacheExternalClipboardFormats/CacheExternalDragFormats just need to fetch that list instead of using the hard-coded array of types. That should be all that would be needed to support custom types that have string values.

However, other custom types may contain sensitive data, files or other things that we may not want to expose.
Flags: needinfo?(enndeakin)
We may want clipboard to keep some gecko-only data, so that data could be copy-pasted between
webapps or so, but not to OS.
(In reply to Neil Deakin from comment #5)
> However, other custom types may contain sensitive data, files or other
> things that we may not want to expose.

If a user has put something in the clipboard and hits ctrl-v (or Edit -> Paste), why would we assume there is sensitive data under some custom mime types? How can we be sure such sensitive data doesn't actually exist under the whitelisted mime types? How did/will we construct the whitelist to be so sure?

Same for the reverse: if user has hit ctrl-c/x (or Edit -> Copy/Cut), why isn't a webapp allowed to put data at any mime type it wishes to put into the clipboard? How can we be sure the whitelisted mime types are completely safe?

I vote for allowing all mime types. This is what Chrome and Safari (on Mac) are doing.
+1 for all mime types. We'd like to do the same thing the Google Docs people want to do in VisualEditor. We currently have separate code paths for Chrome & Firefox, with the FF one being slower and more error-prone (when HTML gets 'cleaned').
+1 for all mimetypes, we are developing an editor for which we would like to provide different formats, so that we can provide a lossless copy-paste within the editor, as well as a more lossy conversion to plaintext and html.

Also, sensitive information can very well also be in the text/html or the text/plain format (passwords, health insurance information), so the sensitive information argument is moot.
See Also: → 938991
Generally what needs to happen here is that we need to create a special format type for the data on the real OS clipboard.

Known types (text/plain, text/html, image types, etc) would get converted into the right OS format as they do now.

Unrecognized types would be converted into the special format, which would hold the real 'unrecognized type name' and the data. This would get 

This would allow copy/paste within Firefox (and other Mozilla-based applications), but not with other applications.
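
The split described in this comment could look roughly like the following. It is only a sketch: `KNOWN_TYPES` and `partitionForOsClipboard` are hypothetical names, and the list of known types is an assumption standing in for whatever list the platform clipboard code actually uses.

```javascript
// Assumed set of types that have a native OS clipboard mapping.
const KNOWN_TYPES = new Set(['text/plain', 'text/html', 'image/png']);

// Split the items set by a page into natively mapped types and
// entries destined for the single "special format" container.
function partitionForOsClipboard(items) {
  const native = {};
  const custom = [];
  for (const [type, data] of Object.entries(items)) {
    if (KNOWN_TYPES.has(type)) {
      native[type] = data;          // converted to the OS format as today
    } else {
      custom.push({ type, data });  // keeps the real type name with the data
    }
  }
  return { native, custom };
}
```

Everything in `custom` would then be written under one special clipboard format, which is why only applications that know that format could read it back.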
I think there is no specific privacy concern related to custom types.

If we allow JS to write *any type* of content to the clipboard, there is a security concern, and we've had pushback from, among others, Chromium developers against allowing JS to write binary data and label it as a specific format. The discussion was about images but would apply even more to other specific data types like Excel data. (Reference for the discussion: https://lists.w3.org/Archives/Public/public-webapps/2015AprJun/0819.html ) I'm not sure if the implementation can (or even should) distinguish between a "custom" type and something that another native app on the OS might try to read from the clipboard and handle. Thoughts?

What Chrome on Windows does for this custom data demo is to register a new CF (Windows clipboard format) labelled as:
50153: Chromium Web Custom MIME Data Format
(the number is probably system-dependent)

It then packs both the custom MIME type and the data into a clipboard entry labelled as this CF type:

ttext/x-vnd.google-docscustomcustom - special text

(some NUL characters removed - I guess it's just because it's UTF-16 data or something)

If we want to allow this, we should come up with a cross-browser, cross-platform solution and spec everything, including the OS specific parts for major OSes. We likely won't require other browsers to define a format called "*Chromium* Web Custom" whatever, nor look for it if it is on the clipboard. That means this solution is not interop-friendly and prevents custom data copied in one browser from being handled by a script in another browser. That's something we need to fix in the spec.
(Certainly other platforms make life simpler than Windows here :) - but again brings up the safety question: if we allow any JS to create a custom binary payload and dump it as application/x-ms-word or whatever we create a vector for security attacks..)
> What Chrome on Windows does for this custom data demo is to register a new
> CF (Windows clipboard format) labelled as:
> 50153: Chromium Web Custom MIME Data Format
> (the number is probably system-dependent)
> 
> It then packs both the custom MIME type and the data into a clipboard entry
> labelled as this CF type:
> 
> ttext/x-vnd.google-docscustomcustom - special text
> 

This is pretty much what I suggested in comment 10.
(In reply to Neil Deakin from comment #13)

> This is pretty much what I suggested in comment 10.

Indeed, it is *exactly* what you suggested in comment 10. But we need to consider this cross-platform: on *nix and Mac, where you typically have a simpler MIME type->data mapping, shall we invent a new MIME type for embedding a custom type and the data? What happens if a site writes more than one custom type to the clipboard?
To get the bikeshedding going - for example, if on *nix/Mac we defined something like this:

application/org.w3c.custom-clipboard-data [{"foo/custom1":"Custom text"}, {"foo/custom2":"Second piece of custom data"}, {"foo/custom-binary":"ByteArraySomehowSerializedForJSON"}]

and on Windows registered CF_W3C_CUSTOM_CLIPBOARD_DATA with the same serialization?
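
For illustration, the JSON layout floated above could be packed and unpacked like this. The container type name is the one proposed in this comment and is not a registered type; `packCustomTypes` and `unpackCustomTypes` are hypothetical helper names, and binary payloads are left out because the comment does not say how they would be serialized.

```javascript
// Container type name as proposed in the comment above (not a real registered type).
const CONTAINER_TYPE = 'application/org.w3c.custom-clipboard-data';

// entries: array of [type, string] pairs, in the order they were set.
function packCustomTypes(entries) {
  // One single-key object per entry, matching the example layout.
  return JSON.stringify(entries.map(([type, data]) => ({ [type]: data })));
}

// Reverse of packCustomTypes: returns an array of [type, string] pairs.
function unpackCustomTypes(json) {
  return JSON.parse(json).map((obj) => {
    const [type] = Object.keys(obj);
    return [type, obj[type]];
  });
}
```

Because the layout is an array rather than a single object, it preserves the order in which the page set its types and allows more than one custom type per copy.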
(Keep in mind that the implication here is that ALL types the browser does not explicitly recognise - i.e. not on the "mandatory data types" list https://w3c.github.io/clipboard-apis/#mandatory-data-types-1 - and this list may shrink further based on security concerns - would now be written to the clipboard packed inside a CF_W3C_CUSTOM_CLIPBOARD_DATA / application/org.w3c.custom-clipboard-data entry. So other software would have to be written specifically to look for such entries to make use of data in its native format copied by JS.)
I don't think much of that would be in scope for this bug. Making some special interchangeable format for custom data would only be useful if one were to cut and paste from the same web site in two different browsers.

It looks like Chrome just stores the data as:

  <count><type1><data1><type2><data2>...

It seems to interpret 'data' as a string. For our UI usage, we may need to support a length and treat it as a byte-array for each item.

For real custom types that we don't support (such as image/tiff), we can add a mechanism for addons to add such support.
(In reply to Neil Deakin from comment #17)
> Making some
> special interchangable format for custom data would only be useful if one
> were to cut and paste from the same web site in two different browsers.

Neil, interoperability is important. It's the very reason we try to spec browser behaviour in the first place. Defining this properly also means other software in the future can look for that data type and read custom data JS from the web has written to the clipboard - we're not just worrying about browser->browser interop here.

If you think Chrome's approach is a good one, we can spec & copy that. I'm not sure - what if you want to write several custom types with lots of data each, won't it be a big performance hit if the reader of said data must scan through *all* data looking for type declarations?
FWIW, I'm happy to change Chromium to make the custom data format work well with other browsers/whatever's specced in this area (if anything). The way Chromium stores data now is just what happened to be most convenient at the time.

I don't remember the details of the generic Pickling class in Chromium, but I think we write out something like this:
<number of entries in custom data (n)><size of type 1><type 1><size of data 1><data 1><size of type 2><type 2><size of data 2><data 2>...<size of type n><type n><size of data n><data n>
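
A length-prefixed layout of that shape can be sketched as follows. This is not Chromium's actual Pickle byte format: the 32-bit little-endian sizes, the UTF-8 string encoding, and the function names are all assumptions made for the example.

```javascript
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// entries: array of [type, data] string pairs.
// Layout: <count><size of type><type><size of data><data>... with
// 32-bit little-endian sizes (an assumption for this sketch).
function encodeCustomData(entries) {
  const chunks = [];
  for (const [type, data] of entries) {
    chunks.push(encoder.encode(type), encoder.encode(data));
  }
  const size = 4 + chunks.reduce((n, c) => n + 4 + c.length, 0);
  const bytes = new Uint8Array(size);
  const view = new DataView(bytes.buffer);
  let offset = 0;
  view.setUint32(offset, entries.length, true);
  offset += 4;
  for (const chunk of chunks) {
    view.setUint32(offset, chunk.length, true);
    offset += 4;
    bytes.set(chunk, offset);
    offset += chunk.length;
  }
  return bytes;
}

function decodeCustomData(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  let offset = 0;
  // Read one <size><payload> chunk and advance past it.
  const readChunk = () => {
    const len = view.getUint32(offset, true);
    offset += 4;
    const text = decoder.decode(bytes.subarray(offset, offset + len));
    offset += len;
    return text;
  };
  const count = view.getUint32(offset, true);
  offset += 4;
  const entries = [];
  for (let i = 0; i < count; i++) {
    const type = readChunk();
    const data = readChunk();
    entries.push([type, data]);
  }
  return entries;
}
```

Since every chunk carries its size, a reader looking for one type can skip over the data of the others without scanning it.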
Assignee: nobody → enndeakin
Status: NEW → ASSIGNED
The spec is still a bit handwavy about this important topic. Neil and Daniel, could you help me come up with a sufficiently detailed description of how things should work? I suppose we need to cover topics like

1) For platforms that support MIME types to describe clipboard contents natively

 .. we can write clipboard parts described by the custom MIME types
 .. but what are the security concerns? Should we add specific requirements to MIME types (like they must contain x-vnd.foo-org-namespace) to lessen the risk of JS writing binary data some native application will try to read?
 .. what, if anything, do implementations do security-wise? Per my testing, nothing..


2) For platforms that do *not* describe clipboard parts with MIME types natively (*cough* Windows mainly)
 a) What exactly is a "custom" vs a "standard" MIME type? If JS adds a text/csv part, will it be serialized as "custom" or not?
 b) Serialization
 c) String to describe serialized custom format ("Chromium Web Custom MIME Data Format".replace("Chromium", "W3C") ?? ;) )
 d) Something about reading and de-serializing this format for paste events

General question: what, if any, alternate parts should be written? For example, if my JS adds a text/html part, should the browser automatically create a text/plain version?

It seems like security is actually a lesser concern on Windows..?
Flags: needinfo?(enndeakin)
Flags: needinfo?(dcheng)
For Chromium, we chose not to distinguish between (1) and (2). My recollection is X11 has an atom table that stores all the different possible targets that can be written to the X selection, so a bad page could consume quite a bit of space in the atom table by setting lots of custom MIME types.

Distinguishing between 'standard' and 'custom' is a hard problem, and Chromium punted on that by only whitelisting a very small list of types that are actually translated (text/plain, text/html essentially, I don't recall what happens if you try to specify an encoding as well but I suspect we probably don't handle it correctly).

Re: automatic text/html -> text/plain conversion when JS sets text/html: that seems like it'd be tricky. For Chromium, it'd require that we parse the markup and then try to extract some sensible text content from it. I think I'd prefer to avoid that.
Flags: needinfo?(dcheng)
Flags: needinfo?(enndeakin)
https://hg.mozilla.org/integration/mozilla-inbound/rev/4cbc83f4c4b1d7cd3bf9758247dce1f54318fb62
Bug 860857, support custom datatransfer types using a special type, r=smaug,jmathies,mstange
https://hg.mozilla.org/mozilla-central/rev/4cbc83f4c4b1
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Is there going to be a way to feature-detect this? Playing with the nightly build, I couldn't see any way to detect the availability of this feature during the copy event.
It's a very good question and unfortunately the answer is no - at least right now. This feature adds no extra JS-visible API and thus cannot be detected. :/

Perhaps we should add ClipboardData.prototype.supportedTypes = ['text/plain', 'text/html', ... 'custom'] - or something like that?
I've reported an issue for discussing that here: https://github.com/w3c/clipboard-apis/issues/30
Depends on: 1269713
See Also: → 564738
See Also: → CVE-2016-5266
Depends on: 1338328
Depends on: CVE-2017-5401