Drag-n-drop CF_HTML encoding bug.

Status: NEW
Assignee: Unassigned
Product: Core
Component: Drag and Drop
Opened: 14 years ago
Last updated: 5 years ago
Reporter: Andrew Fedoniouk
Keywords: intl
Version: Trunk
Platform: x86, Windows XP
Attachments: 1
Description

14 years ago
User-Agent:       Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Q312461; SV1; .NET CLR 1.1.4322)
Build Identifier: 1.0 PR

Copying to the clipboard in Firefox 1.0 produces UTF-8 encoded CF_HTML
(on Windows), which is the right encoding.

But drag-n-drop uses a different encoding: non-ASCII characters are emitted as
named character references that are not widely recognized.

The problem shows up when non-English text is dragged out of Firefox.

Reproducible: Always
Steps to Reproduce:
1. Open any web site in, e.g., Russian. For example http://www.google.ru
2. Select any Russian text there.
3. Drag-n-drop it into any application that accepts CF_HTML via drag-n-drop.
(You may use my http://blocknote.net or http://www.nvu.com/ (Daniel Glazman, 
Gecko engine) as targets.)
4. Notice the unrecognized character references that appear.

Note: if you copy the same text to the clipboard and then paste it into the
target application, everything is fine: CF_HTML contains pure UTF-8 encoded
HTML.
Actual Results:  
Russian text appears as:
<html><body>
<!--StartFragment -->
&Vcy;&lcy;&acy;&scy;&tcy;&icy; &Scy;&SHcy;&Acy; &pcy;&ocy;&kcy;&acy;
&ncy;&iecy; &pcy;&lcy;&acy;&ncy;&icy;&rcy;&ucy;&yucy;&tcy;



Expected Results:  
Cyrillic characters should be UTF-8 encoded instead of being replaced with
character references for characters outside Latin-1.
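
To illustrate the difference, here is a minimal Python sketch (not Firefox code). It resolves the character references from the actual results above with the standard library's `html.unescape` and produces the UTF-8 bytes a drop target would expect to receive. The variable names are made up for this example.

```python
import html

# Fragment as delivered by drag-n-drop (copied from the actual results).
dragged = (
    "&Vcy;&lcy;&acy;&scy;&tcy;&icy; &Scy;&SHcy;&Acy; &pcy;&ocy;&kcy;&acy; "
    "&ncy;&iecy; &pcy;&lcy;&acy;&ncy;&icy;&rcy;&ucy;&yucy;&tcy;"
)

# html.unescape resolves HTML named character references to Unicode text.
decoded = html.unescape(dragged)

# The bytes a UTF-8 payload would carry instead of the references.
expected_bytes = decoded.encode("utf-8")

print(decoded)
print(len(dragged), "characters of references vs", len(expected_bytes), "UTF-8 bytes")
```

A drop target that does not know these entity names cannot perform the `html.unescape` step, which is why the text shows up as raw references.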

Mozilla 1.7.3 has the same problem.

It would also be nice if Firefox used CF_HTML v.1.0 instead of the outdated
v.0.9. IE 6 uses CF_HTML v.1.0.

Don't hesitate to contact me if you need more info here.

Updated

14 years ago
Keywords: intl

Comment 1

14 years ago
Daniel, feel like taking a shot at this one?

Comment 2

14 years ago
A prominent, if not free, drop target application to demonstrate this is
Microsoft Word.

Comment 3

14 years ago
It seems that in mozilla/content/base/src/nsContentAreaDragDrop.cpp, when
nsContentAreaDragDrop::DragGesture is called, there should be an override
nsIClipboardDragDropHooks defined for win32 that converts the selected text
to UTF-8 in the OnCopyOrDrag function.

From what I can tell, this would be the preferable way to do the translation,
but I'd like somebody who has a better idea to confirm that I'm not completely
confused.

Updated

14 years ago
Component: General → Drag and Drop
Product: Firefox → Core
Version: unspecified → Trunk

Updated

14 years ago
Assignee: firefox → nobody
QA Contact: firefox.general

Comment 4

14 years ago
Created attachment 167339
Don't encode entities

Well, I'm guessing here that we shouldn't be encoding entities.
At least, I seem to be able to drag-n-drop http://www.google.ru/ to Word now.
But I've no idea whom to ask for (super)review.

This is an automated message, with ID "auto-resolve01".

This bug has had no comments for a long time. Statistically, we have found that
bug reports that have not been confirmed by a second user after three months are
highly unlikely to be the source of a fix to the code.

While your input is very important to us, our resources are limited and so we
are asking for your help in focussing our efforts. If you can still reproduce
this problem in the latest version of the product (see below for how to obtain a
copy) or, for feature requests, if it's not present in the latest version and
you still believe we should implement it, please visit the URL of this bug
(given at the top of this mail) and add a comment to that effect, giving more
reproduction information if you have it.

If it is not a problem any longer, you need take no action. If this bug is not
changed in any way in the next two weeks, it will be automatically resolved.
Thank you for your help in this matter.

The latest beta releases can be obtained from:
Firefox:     http://www.mozilla.org/projects/firefox/
Thunderbird: http://www.mozilla.org/products/thunderbird/releases/1.5beta1.html
Seamonkey:   http://www.mozilla.org/projects/seamonkey/

Comment 6

13 years ago
This is still happening in 1.5b1.
It looks like Neil's patch got forgotten about.
Status: UNCONFIRMED → NEW
Ever confirmed: true

Comment 7

13 years ago
Well, maybe bz can suggest some victims^H^H^H^H^H^H^Hreviewers ;-)
I really don't know.... Maybe one of the editor folks?  Because I can see this
screwing up drag/drop within editor pretty badly...

Updated

13 years ago
Attachment #167339 - Flags: review?(daniel)
QA Contact: drag-drop
Assignee: nobody → netzen

The problem here isn't with CF_HTML ("HTML Format"); it is with text/html.
These are two different formats. We currently generate CF_HTML as UTF-8, but we generate text/html as UCS-2.

A similar problem happens when you drag from Chrome onto us: Chrome treats text/html as UTF-8 and provides it to us in that encoding, but we try to read it as UCS-2.
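
A small Python sketch of this mismatch, for illustration only (it is not the Gecko code path). It assumes UCS-2 can be modelled with Python's `utf-16-le` codec, which is equivalent for BMP characters such as Cyrillic; the variable names are invented.

```python
text = "Власти"

utf8_bytes = text.encode("utf-8")
# UCS-2 and UTF-16LE agree for BMP characters like these (assumption above).
ucs2_bytes = text.encode("utf-16-le")

# Source writes UTF-8, target reads 16-bit units (the Chrome onto Firefox case).
misread_as_ucs2 = utf8_bytes.decode("utf-16-le", errors="replace")

# Source writes 16-bit units, target reads UTF-8 (the Firefox onto other apps case).
misread_as_utf8 = ucs2_bytes.decode("utf-8", errors="replace")

print(repr(misread_as_ucs2))
print(repr(misread_as_utf8))
```

Neither misreading raises an error for this input, so the receiver silently ends up with wrong text unless both sides agree on the encoding.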

It should be UTF-8 for both CF_HTML and text/html. (Note: CF_HTML has extra headers and such.)
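
For readers unfamiliar with those extra headers, here is a rough Python sketch of a CF_HTML payload as I understand the format: a short ASCII header with byte offsets, followed by UTF-8 HTML with fragment markers. `build_cf_html` and `read_offsets` are hypothetical helpers written for this example, not Mozilla or Windows APIs, and the exact marker spelling is an assumption.

```python
START_MARK = b"<!--StartFragment-->"
END_MARK = b"<!--EndFragment-->"

HEADER = (
    "Version:{version}\r\n"
    "StartHTML:{start_html:010d}\r\n"
    "EndHTML:{end_html:010d}\r\n"
    "StartFragment:{start_frag:010d}\r\n"
    "EndFragment:{end_frag:010d}\r\n"
)


def build_cf_html(fragment: str, version: str = "0.9") -> bytes:
    """Wrap an HTML fragment in a UTF-8 CF_HTML payload (hypothetical helper)."""
    prefix = b"<html><body>\r\n" + START_MARK
    suffix = END_MARK + b"\r\n</body></html>"
    body = fragment.encode("utf-8")

    # Offsets are zero-padded to a fixed width, so the header length can be
    # measured with placeholder values before the real offsets are known.
    header_len = len(HEADER.format(version=version, start_html=0, end_html=0,
                                   start_frag=0, end_frag=0).encode("ascii"))

    # All offsets are byte counts from the start of the payload.
    start_html = header_len
    start_frag = start_html + len(prefix)
    end_frag = start_frag + len(body)
    end_html = end_frag + len(suffix)

    header = HEADER.format(version=version, start_html=start_html,
                           end_html=end_html, start_frag=start_frag,
                           end_frag=end_frag)
    return header.encode("ascii") + prefix + body + suffix


def read_offsets(payload: bytes) -> dict:
    """Parse the numeric header fields of a CF_HTML payload."""
    offsets = {}
    for line in payload.split(b"\r\n"):
        if line.startswith(b"<"):
            break  # the header is over once the markup starts
        key, _, value = line.partition(b":")
        if value.isdigit():  # skips "Version:0.9"
            offsets[key.decode("ascii")] = int(value)
    return offsets


payload = build_cf_html("Власти США")
offsets = read_offsets(payload)
fragment_bytes = payload[offsets["StartFragment"]:offsets["EndFragment"]]
print(fragment_bytes.decode("utf-8"))
```

Because the offsets count bytes rather than characters, they only line up when the payload really is UTF-8. A plain text/html flavor has no such header, just the markup itself.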

Reference: The 'text/html' Media Type (RFC 2854)

> For the text/html flavor, any registered IANA charset may be used, but UTF-8 is preferred.

This will require changes in a handful of places, but we should be handling this data as UTF-8, so the change is worth it.

I verified that generating the text/html data as UTF-8 makes the paste work for Russian characters in the program provided in the description (BlockNote).

Fix coming tomorrow.

I ran into some other problems with this a couple of weeks ago, which is why there is no fix yet. I have some other work ahead of this before I can get back to this task.
Assignee: netzen → nobody