Open Bug 395218 Opened 17 years ago Updated 2 years ago

Mozilla does not accept text/html clipboard contents with unknown charset

Categories

(Core :: Widget: Gtk, defect, P5)

x86
Linux
defect

People

(Reporter: mslama, Unassigned)

Details

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.6) Gecko/20061201 Firefox/2.0.0.6 (Ubuntu-feisty)
Build Identifier: version 1.5.0.13 (20070824)

When I copy text from the NetBeans IDE, the clipboard offers the following data flavors, and Mozilla (e.g. the new email composer) does not accept the paste:
FINE [org.netbeans.core.NbClipboard]:   0 = java.awt.datatransfer.DataFlavor[mimetype=text/html;representationclass=java.lang.String]
FINE [org.netbeans.core.NbClipboard]:   1 = java.awt.datatransfer.DataFlavor[mimetype=text/html;representationclass=java.io.Reader]
FINE [org.netbeans.core.NbClipboard]:   2 = java.awt.datatransfer.DataFlavor[mimetype=text/html;representationclass=java.io.InputStream;charset=unicode]
FINE [org.netbeans.core.NbClipboard]:   3 = java.awt.datatransfer.DataFlavor[mimetype=text/plain;representationclass=java.lang.String]
FINE [org.netbeans.core.NbClipboard]:   4 = java.awt.datatransfer.DataFlavor[mimetype=text/plain;representationclass=java.io.Reader]
FINE [org.netbeans.core.NbClipboard]:   5 = java.awt.datatransfer.DataFlavor[mimetype=text/plain;representationclass=java.io.InputStream;charset=unicode]
FINE [org.netbeans.core.NbClipboard]:   6 = java.awt.datatransfer.DataFlavor[mimetype=application/x-java-jvm-local-objectref;representationclass=java.lang.String]
FINE [org.netbeans.core.NbClipboard]:   7 = java.awt.datatransfer.DataFlavor[mimetype=application/x-java-serialized-object;representationclass=java.lang.String]


Reproducible: Always

Steps to Reproduce:
1. Start NetBeans IDE.
2. Open an HTML document in the editor, or open the Help window (Help -> Help Contents).
3. Select text and press Ctrl-C to copy it to the clipboard.
4. Go to the email composer and press Ctrl-V to paste the text. Nothing happens.

When I do the same with Java source, paste works fine. In this case the data flavors are:
FINE [org.netbeans.core.NbClipboard]:   0 = java.awt.datatransfer.DataFlavor[mimetype=text/x-java;representationclass=java.lang.String]
FINE [org.netbeans.core.NbClipboard]:   1 = java.awt.datatransfer.DataFlavor[mimetype=text/x-java;representationclass=java.io.Reader]
FINE [org.netbeans.core.NbClipboard]:   2 = java.awt.datatransfer.DataFlavor[mimetype=text/x-java;representationclass=java.io.InputStream]
FINE [org.netbeans.core.NbClipboard]:   3 = java.awt.datatransfer.DataFlavor[mimetype=text/plain;representationclass=java.lang.String]
FINE [org.netbeans.core.NbClipboard]:   4 = java.awt.datatransfer.DataFlavor[mimetype=text/plain;representationclass=java.io.Reader]
FINE [org.netbeans.core.NbClipboard]:   5 = java.awt.datatransfer.DataFlavor[mimetype=text/plain;representationclass=java.io.InputStream;charset=unicode]
FINE [org.netbeans.core.NbClipboard]:   6 = java.awt.datatransfer.DataFlavor[mimetype=application/x-java-jvm-local-objectref;representationclass=java.lang.String]
FINE [org.netbeans.core.NbClipboard]:   7 = java.awt.datatransfer.DataFlavor[mimetype=application/x-java-serialized-object;representationclass=java.lang.String]

Actual Results:  
No text is pasted into composer window.

Expected Results:  
Text should be pasted to composer window.
Other apps accept this clipboard content fine (like NetBeans IDE editor, Gnome text editor, Swing Notepad demo).
I can get this to happen as well.  

With NetBeans IDE 6.0.1, select text in a Help window and then try to paste it into a new composition window in Tb.  Nothing shows up, and the various Paste menu items under Edit are disabled.  Other apps will paste the text, including both plain-text edit controls and style-capable ones such as OOo Writer.

I'm running Ubuntu 7.10 (Gutsy).
Status: UNCONFIRMED → NEW
Component: General → Message Compose Window
Ever confirmed: true
Version: unspecified → Trunk
The problem appears to be that, from the perspective of the nsClipboard code, the clipboard data provided by NetBeans/Java does not have an identifiable character set.  ConvertHTMLtoUCS2 and GetHTMLCharset in nsClipboard.cpp seem to be rather strict: ConvertHTMLtoUCS2 refuses to process data without a known character set, and GetHTMLCharset's detection logic is somewhat limited.  If the data doesn't start with '\xff\xfe' or the transpose (implying UTF-16), it wants to find a string like CONTENT="text/html;charset=${THE_ENCODING}", where ${THE_ENCODING} is a valid encoding name such as 'ascii'.
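
For illustration, the detection behaviour described above can be sketched in Python roughly as follows.  This is only a model of the behaviour as described in this comment, not a transcription of GetHTMLCharset; the function name and the regular expression are made up for the example.

```python
import re

# Hypothetical pattern for the CONTENT="text/html;charset=..." string that
# the detection logic is described as looking for.
_CHARSET_RE = re.compile(
    rb'content\s*=\s*"text/html;\s*charset=([^"\s;]+)"', re.IGNORECASE
)


def guess_html_charset(data: bytes):
    """Return a charset name for clipboard HTML, or None if it is unknown."""
    # A leading 0xff 0xfe (or the transposed pair) is taken to mean UTF-16.
    if data[:2] in (b"\xff\xfe", b"\xfe\xff"):
        return "UTF-16"
    # Otherwise an explicit CONTENT="text/html;charset=..." string is needed.
    match = _CHARSET_RE.search(data)
    if match:
        return match.group(1).decode("ascii", errors="replace")
    # Nothing identifiable: in the described behaviour the paste is refused.
    return None
```

With plain markup such as b"<b>hello</b>" this returns None, which corresponds to the case where the paste is rejected.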

While investigating, I wrote a Python script that grabs the clipboard data as provided by the NetBeans Help window, encodes it in UTF-16 and then exposes that as the clipboard contents.  This made Thunderbird happy.  I also had it inject a CONTENT= string.  This also made it happy.
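
The script itself is not attached here.  As a rough sketch of the two byte-level transformations it performs (leaving out all clipboard handling), something like the following would do; the function names, the choice of little-endian byte order and the exact meta tag are assumptions made for the example.

```python
import codecs


def to_utf16_with_marker(html: str) -> bytes:
    """Encode HTML as little-endian UTF-16 with the 0xff 0xfe marker prepended."""
    # Little-endian is an assumption here; the marker is what lets the
    # receiving side identify the data as UTF-16.
    return codecs.BOM_UTF16_LE + html.encode("utf-16-le")


def inject_content_charset(html: str, charset: str = "utf-8") -> bytes:
    """Prepend a meta tag carrying CONTENT="text/html;charset=..." and encode."""
    # Hypothetical tag layout; only the CONTENT="..." part is taken from
    # the description of what the detection logic looks for.
    meta = '<meta http-equiv="Content-Type" CONTENT="text/html;charset=%s">' % charset
    return (meta + html).encode(charset)
```

Either result could then be offered as the text/html clipboard contents in place of the original data.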
Component: Message Compose Window → General
Version: Trunk → unspecified
Trying to move to Core, Widget/Gtk...
Component: General → Widget: Gtk
Product: Thunderbird → Core
QA Contact: general → gtk
Summary: Mozilla does not accept data from clipboard with text/html data flavor → Mozilla does not accept text/html clipboard contents with unknown charset
Version: unspecified → Trunk
I should probably also note that NetBeans Help/Java actually is willing to provide text/html with explicit charsets.  The list of target atoms it is willing to provide is:

text/html;charset=UTF-16
text/html;charset=UTF-8
text/html;charset=UTF-16BE
text/html;charset=UTF-16LE
text/html;charset=ISO-8859-1
text/html;charset=US-ASCII
text/html
UTF8_STRING
TEXT
STRING
text/plain;charset=UTF-16
text/plain;charset=UTF-8
text/plain;charset=UTF-16BE
text/plain;charset=UTF-16LE
text/plain;charset=ISO-8859-1
text/plain;charset=US-ASCII
text/plain
text/plain;charset=unicode
JAVA_DATAFLAVOR:application/x-java-jvm-local-objectref; class=java.lang.String 

If I tell my Python script to proxy "text/html;charset=UTF-16" as "text/html", the payload starts with the marker sequence 0xfe 0xff (indicating big-endian data) and the paste is garbled: the characters are invisible until I type something, at which point they become visible but are the wrong characters.  I assume the code doesn't treat big-endian UTF-16 any differently from little-endian.

Proxying "text/html;charset=UTF-16LE" as "text/html" fails because no marker sequence is prepended, but works if I inject the little-endian marker sequence.
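
To make the byte-order point concrete, here is a small Python sketch of marker-aware decoding.  It only illustrates how the two marker sequences differ and is not a proposed patch; the function name is made up.

```python
import codecs


def decode_utf16_by_marker(data: bytes) -> str:
    """Decode UTF-16 data, using the leading marker to pick the byte order."""
    if data.startswith(codecs.BOM_UTF16_LE):   # 0xff 0xfe: little-endian
        return data[2:].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF16_BE):   # 0xfe 0xff: big-endian
        return data[2:].decode("utf-16-be")
    # No marker: the byte order cannot be determined from the data alone.
    raise ValueError("no UTF-16 marker sequence found")
```

Reading big-endian bytes as little-endian swaps the two bytes of every code unit, so ASCII text turns into unrelated characters, which would be consistent with the wrong characters seen above.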

I suppose that leaves us with a few potential fixes to the end-to-end cut-and-paste problem:
 * Be more accepting of non-explicit text/html charsets...
 * Consider/check text/html variant clipboard targets with explicit charsets.  (And use that as the charset instead of guessing.)

And I suppose a separate bug:
 * The code accepts big-endian UTF-16 but doesn't actually handle it.
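
As a sketch of the second option (checking for text/html targets with explicit charsets), the selection step might look like this in Python.  The function name and the preference order are assumptions for the example, and nothing here touches the real clipboard API.

```python
def pick_html_target(targets, preferred=("UTF-8", "UTF-16LE", "UTF-16BE", "UTF-16")):
    """Return (target, charset) for the best text/html target, or None."""
    by_charset = {}
    for target in targets:
        mime, _, params = target.partition(";")
        if mime.strip().lower() != "text/html":
            continue
        for param in params.split(";"):
            key, _, value = param.partition("=")
            if key.strip().lower() == "charset" and value.strip():
                # Remember the first target seen for each charset.
                by_charset.setdefault(value.strip().upper(), target)
    # Take the most preferred charset that is actually on offer.
    for charset in preferred:
        if charset in by_charset:
            return by_charset[charset], charset
    # Otherwise fall back to any target that names a charset at all.
    for charset, target in by_charset.items():
        return target, charset
    return None
```

Given the target list above, this would select "text/html;charset=UTF-8", so the charset would be known without any guessing.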
Severity: normal → S3