Closed Bug 212177 Opened 21 years ago Closed 20 years ago

image alt text not ever used in plaintext conversion

Categories

(Core :: DOM: Serializers, defect)

defect
Not set
minor

Tracking

()

VERIFIED FIXED

People

(Reporter: ihok, Assigned: Brade)

Details

(Keywords: fixed1.7, helpwanted, regression)

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4) Gecko/20030624
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4) Gecko/20030624

When I select and copy an image on a web page, I want either the image itself to
be copied or its alt text. This would be especially useful on nytimes.com: that
site begins each story with a leading image that is a graphic representation of
a letter. For example, stories begin like "[D]ETROIT, July 8", where [D] is a
gif (http://graphics7.nytimes.com/images/dropcap/d.gif). When I copy a paragraph
that starts with that gif, I don't want a copy that reads "DETROIT", not
"ETROIT", which is the case now.

A heuristic for determining whether to copy an image or its alt text: If the
selection contains an inline image and preceding or following text, copy the alt
text. Otherwise, copy the image (i.e., place the contents of the gif or whatever
in the clipboard).

The easiest workaround for this bug is to go to the properties of the image and
select the alt text there, but that a) interrupts work flow and b) can get
really annoying if, say, the NY Times story you're copying contains several such
images -- hence, filing as minor, not RFE.

Reproducible: Always

Steps to Reproduce:
1. Select and copy an image.
2. Paste into, say, Word (which can handle either image or text pastes).

Actual Results:  
Paste is empty.
sounds like a dupe of a bug on my plate
-->brade
Assignee: mjudge → brade
Component: Selection → XP Apps: Drag and Drop
OS: Windows XP → All
Hardware: PC → All
I can't find the bug this is a dup of, but I did find bug 66035, "Image urls are
copied with text selection" (fixed).
I can't find a dupe either, but I can reproduce the problem 
(1.4 Final, Windows 2000).

The discussion in bug 66035 indicates that the function used for copy/paste is 
essentially the generic conversion to plain text; updating component, this isn't 
just a drag-and-drop issue.

For instance, saving a web page as text places no indication at all of an image 
in the text, even if there is an 'alt' (or a 'title').

See the closely related bug 180620, about 'alt' text not being displayed when 
images are turned off; and also bug 180991, about the best representation of the 
'alt' text.

Jack Tanner, note that clipboard items can be (and are, for Mozilla) copied as 
multiple parts -- a plain text part, a rich text part, an HTML part, for 
instance.  The 'heuristic' you describe is probably not useful; the problem is 
that the plain-text part needs to get the 'alt' text.
Status: UNCONFIRMED → NEW
Component: XP Apps: Drag and Drop → DOM to Text Conversion
Ever confirmed: true
Summary: image alt text not selected on copy paste → image alt text not ever used in plaintext conversion
Blocks: 147784
No longer blocks: 147784
This bug breaks pasting directions from Yahoo! Maps in a dangerous way: the
direction to turn (left or right) doesn't get pasted.  (Reported at
http://broken.typepad.com/b/2003/11/pasting_yahoo_m.html.)
Do we know when this regressed?  The code should be in nsPlainTextSerializer.cpp.

It looks like bratell@lysator.liu.se changed the relevant block when he moved it
in revision 1.73.  Perhaps that is when this regression started?  Then again,
maybe my eyes are glazed over from looking at various diffs and I missed it?
OK, there are two possibilities for this regression:
 1) moving the code around or rewriting it in nsPlaintextSerializer.cpp
 2) the caller changed the flag(s) used during copy

We're never getting to the img block in DoOpenContainer because it comes after a
check for nsIDocumentEncoder::OutputFormatted which apparently isn't set.

Does anyone know if the image block should be moved up (because it shouldn't be
treated as formatted output) or if the caller should set the flag for
OutputFormatted?
OutputFormatted should definitely not be set during c&p.  In fact, the primary
meaning of OutputFormatted is "this is not from a copy operation", i.e. that
flag being clear means we're copying or doing something otherwise simple.

If I'm reading the diff correctly, Daniel moved that code from DoAddLeaf to
DoOpenContainer; when it was in DoAddLeaf, it got called regardless of the
formatting flag, but in DoOpenContainer, it now lives after the "if not
formatted then return" clause.

Seems like the right thing to do is to move it before the return clause, so it
gets called even in the non-formatted case.

Kathy, you want to do that?  (Since you already did the hard part, the diff
detective work.)  Or I can take it, if you want to reassign to me.
Snarf -- will attach a patch as soon as I finish updating and building so I can
test it.
Assignee: brade → akkzilla
Status: NEW → ASSIGNED
I tried the simple patch I suggested, and it didn't output anything for images.
 Why?  Because an IMG tag isn't a container, and so DoOpenContainer is never
called for img tags.  Having the img tag handling in DoOpenContainer is just wrong.

I'm not sure why it was moved from DoAddLeaf.  I'll look again at the diff
tomorrow (it's late now) and try to understand it, but I'm fairly sure that the
img stuff should move back to DoAddLeaf.  If anyone objects to that, or
understands why it was moved, please speak now!
This is what Akkana said to do.  I tested it and it works for me.  Is it too
late to get this into 1.7?
Assignee: akkzilla → brade
Comment on attachment 146214 [details] [diff] [review]
patch from Akkana

r=brade (Akkana wrote the patch); seeking sr= from bz  or others who are
qualified
Attachment #146214 - Flags: superreview?(bzbarsky)
Attachment #146214 - Flags: review+
Comment on attachment 146214 [details] [diff] [review]
patch from Akkana

I'm sorry, I'm not currently doing srs (see the reviewers page).  Bouncing to
jst, who sred the patch that causes this bug...
Attachment #146214 - Flags: superreview?(bzbarsky) → superreview?(jst)
Comment on attachment 146214 [details] [diff] [review]
patch from Akkana

+    if (!imageDescription.IsEmpty()) {
+      Write(imageDescription);
+    }

Pointless check for empty string, it's fine to call Write() with an empty
string, it's a nop. No need for the code for the extra check here.

sr=jst
Attachment #146214 - Flags: superreview?(jst) → superreview+
fix landed for 1.8a; any interest in having this for 1.7?
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Attachment #146214 - Flags: approval1.7?
This checkin seems to have caused a Tp jump on btek (though not on the other
tinderboxes).  I wouldn't have thought the plaintext serializer was used during
pageload much....
I'd like to hear more on the Tp jump before considering this for 1.7. Kathy, can
you provide any insight or exoneration for this patch?
I have no explanation for the jump on btek.  It does seem to have gone down
slightly and remained unchanged on other computers.  I am surprised this would
have any effect on startup tests and such.

The only thing this patch does is move a block of code back where it originally
was so it now works.  I don't think it should prevented from landing on the 1.7
branch unless we learn of a serious bug related to the checkin (but I'm not paid
to make the decisions of what lands and what doesn't ;-) ).
Comment on attachment 146214 [details] [diff] [review]
patch from Akkana

a=asa (on behalf of drivers) for checkin to 1.7
Attachment #146214 - Flags: approval1.7? → approval1.7+
I can confirm that this problem has been fixed in the following scenarios:
1) Load web page, Save As Text file: 
   'alt' text output to file
2) Load web page, Select portion including image, Paste into Text drop-sink:
   'alt' text included in paste
3) Load HTML email with embedded image, View|Message Body As|Plain:
   'alt' text displayed

Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8a) Gecko/20040510

Has this been checked into the 1.7 branch yet?  It would be nice to see it in 
RC2.  Should I verify now, or wait to check on 1.7?

Now that this is fixed, see bug 180991.
Comment on attachment 146214 [details] [diff] [review]
patch from Akkana

The plan is to land this patch on the 1.7 branch.  It hasn't happened yet
(sorry, I don't have time today).  If someone else would like to land it, go
ahead but please comment here (and add appropriate keyword).
Keywords: fixed1.7
So far as I am concerned, this "bug fix" is a new bug in 1.7./ I frequently
select text that contains numerous "decorative" images and paste it. I most
certainly do *not* want the ALT text for these images to be included in the
selection.

For example, try selecting the article authors' names at
http://dx.doi.org/10.1016/j.patcog.2004.03.016

Since some people seem to want this functionality that I regard as a bug, could
this behaviour at least be controllable via Preferences?

I would like to see this reopened.

Regards,

David
Verifying: fixed in Moz 1.7.2 (also in TB.7)
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: