13873 - Extraneous white space inserted into pasted CGI URL

Reporter

Description

•

25 years ago

Apprunner Build ID: 1999091310

A query was performed with the term "Search Engines" at HotBot. When the URL of
the resultant page is copied in Ancestor 4.6, it is pasted thus:

http://www.hotbot.com/?MT=Search+Engines&SM=MC&DV=0&LG=any&DC=10&DE=2&BT=H&Searc
h.x=31&Search.y=7

When the same URL is copied in Mozilla and then pasted, it appears thus:

http://www.hotbot.com/?BT=H& MT=Search+Engines& Search.x=39& Search.y=4& SM=MC&
DV=0& LG=any& DC=10& DE=2

[Assigned to Parser for efficiency of processing, not because the bug
necessarily belongs to that component]

rickg

Updated

•

25 years ago

Status: NEW → ASSIGNED

Summary: Extraneous white space inserted into pasted CGI URL → Extraneous white space inserted into pasted CGI URL

rickg

Comment 1

•

25 years ago

I'm not sure what the problem is; can you be more specific?

Crysgem

Reporter

Comment 2

•

25 years ago

(Most recently tested with the 1999091713 Apprunner build)
Perform a query on the term "Mozilla" at HotBot. The resultant URL is
http://www.hotbot.com/?BT=H&MT=Mozilla&Search.x=44&Search.y=7&. However, if the
URL is copied from the address field and pasted, it is reproduced as:
http://www.hotbot.com/?BT=H& MT=Mozilla& Search.x=44& Search.y=7&. Note the
inserted spaces after each ampersand.
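
The symptom can be checked mechanically. The following is a minimal sketch in
Python, not part of the Mozilla code base; the helper names
`spaces_after_ampersands` and `strip_spaces_after_ampersands` are hypothetical
and exist only for illustration.

```python
import re
from typing import List

# The URL as shown in the address field, and as it is reproduced after paste.
ORIGINAL = "http://www.hotbot.com/?BT=H&MT=Mozilla&Search.x=44&Search.y=7&"
PASTED = "http://www.hotbot.com/?BT=H& MT=Mozilla& Search.x=44& Search.y=7&"


def spaces_after_ampersands(url: str) -> List[int]:
    """Return the index of each space that directly follows an '&'."""
    return [m.start() + 1 for m in re.finditer(r"& ", url)]


def strip_spaces_after_ampersands(url: str) -> str:
    """Remove any run of spaces that directly follows an '&'."""
    return re.sub(r"& +", "&", url)
```

For the two URLs above, `spaces_after_ampersands(PASTED)` reports three
positions, `spaces_after_ampersands(ORIGINAL)` reports none, and
`strip_spaces_after_ampersands(PASTED)` gives back `ORIGINAL`.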

rickg

Updated

•

25 years ago

Assignee: rickg → akkana

Status: ASSIGNED → NEW

rickg

Comment 3

•

25 years ago

I spoke with SimonF, who suggested this may be in your area.

Akkana Peck

Assignee

Comment 4

•

25 years ago

I'll look at it and see where the spaces are being introduced.

Akkana Peck

Assignee

Updated

•

25 years ago

URL: http://www.hotbot.com/ → http://www.hotbot.com/?MT=Search+Engi...

Akkana Peck

Assignee

Comment 5

•

25 years ago

This appears to be a Windows nsClipboard issue, not a problem in the output
converters; it doesn't happen on Linux.  Changing url to the one giving the
problem, since the one in the comments below is wrapped and therefore not usable
as a test.

Akkana Peck

Assignee

Updated

•

25 years ago

Assignee: akkana → pinkerton

Akkana Peck

Assignee

Comment 6

•

25 years ago

However, I can't reproduce this on Windows, either.  Where are you pasting?  The
browser urlbar still doesn't seem to support pasting on Windows (there's another
longstanding bug on that), and I don't see any spaces when pasting into either
the plaintext editor or the html editor on yesterday's build on NT.  Maybe it's
something to do with the Win98 clipboard?

Assigning to pinkerton, who owns XP clipboard and perhaps knows who owns the
Windows clipboard these days.

Mike Pinkerton (not reading bugmail)

Updated

•

25 years ago

Assignee: pinkerton → akkana

Mike Pinkerton (not reading bugmail)

Comment 7

•

25 years ago

happens on both macOS and windows when the text is copied from mozilla. works
just fine when the text is copied from 4.5. it appears as if that is how it's
being put on the clipboard. i broke in the clipboard code and that is how it is
given to me. not a clipboard bug, probably a xif converter bug in how it converts
to text/plain.

btw, copy/paste works in url bar if you use the cmd-keys. so does select-all.

bouncing back to akkana.

Akkana Peck

Assignee

Updated

•

25 years ago

Status: NEW → ASSIGNED

Depends on: 12214, 14026, 14464

Target Milestone: M12

Akkana Peck

Assignee

Comment 8

•

25 years ago

Mike showed me how he reproduced this: he copied, using the accelerator keys,
from apprunner's urlbar back into the urlbar.  Unfortunately, the accelerator
keys are broken on linux and windows (14464).  Adding bug dependencies.

Akkana Peck

Assignee

Updated

•

25 years ago

Assignee: akkana → harishd

Status: ASSIGNED → NEW

Akkana Peck

Assignee

Comment 9

•

25 years ago

I found a way to reproduce it: paste the above URL into the plaintext editor (it
pastes fine), then select it there, copy, and paste it back into the plaintext
editor -- now it has spaces.

The spaces are getting added in conversion of XIF to plaintext, whenever
entities are encountered.  The XIF fragment looks like this:

<entity value="amp"/><content>SM=MC</content>
<entity value="amp"/><content>DV=0</content>

but somehow this causes the parser to call nsXIFDTD::HandleToken with a bogus
text token (mTypeID = 119) containing a space right after it calls HandleToken
with the entity tag (mTypeID = 15).

Harish, could you take a look at this, or at least give me some clues as to how
the parser builds up its stack of tokens and where to break to examine this
process?  By the time it gets to nsXIFDTD::BuildModel, the token stack is
already built and we're just popping them off one by one, so I'm not sure how to
find out where these extra text nodes are coming from.
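
For reference, the plain text that the fragment above should produce can be
sketched as follows. This is a minimal Python illustration of the expected
output, not the nsXIFDTD or output sink code; the function name
`xif_fragment_to_text`, the small entity table, and the choice to ignore
whitespace between elements are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical entity table; only the entries needed for the illustration.
ENTITIES = {"amp": "&", "lt": "<", "gt": ">", "quot": '"'}


def xif_fragment_to_text(fragment: str) -> str:
    """Concatenate entity and content nodes without adding any whitespace."""
    # Wrap the fragment in a synthetic root so it parses as one XML document.
    root = ET.fromstring(f"<fragment>{fragment}</fragment>")
    parts = []
    for node in root:
        if node.tag == "entity":
            parts.append(ENTITIES[node.get("value")])
        elif node.tag == "content":
            parts.append(node.text or "")
        # Text between elements (node.tail) is deliberately ignored here.
    return "".join(parts)


FRAGMENT = (
    '<entity value="amp"/><content>SM=MC</content>\n'
    '<entity value="amp"/><content>DV=0</content>'
)
print(xif_fragment_to_text(FRAGMENT))  # prints "&SM=MC&DV=0"
```

The point of the sketch is only that an entity followed by content should
yield adjacent characters; the observed output instead has a space after each
ampersand.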

Akkana Peck

Assignee

Comment 10

•

25 years ago

I have a test case for this now, which allows testing this with a standalone
test app.  I will check it into htmlparser/tests/outsinks whenever I can (I may
not be allowed to until the tree opens for M12).

Akkana Peck

Assignee

Updated

•

25 years ago

No longer depends on: 12214, 14026, 14464

Akkana Peck

Assignee

Comment 11

•

25 years ago

Removing dependencies, which are no longer relevant.

harishd

Comment 12

•

25 years ago

Token stack is built up during the tokenization phase, i.e., right before the
DTD call.  The best place to break would be the call to ConsumeToken() by the
parser.  This is the place where the tokenization actually begins.  Since we're
seeing a whitespace it might be a good idea to also break at
nsHTMLTokenizer::ConsumeWhitespace().  Anyway, I'll also take a look into this
bug.

harishd

Updated

•

25 years ago

Assignee: harishd → akkana

harishd

Comment 13

•

25 years ago

The problem seems to be in nsHTMLToTXTSinkStream::AddLeaf(). I discussed it
with akkana, and she is willing to take the bug back.

Reassigning bug to akkana@netscape.com

Akkana Peck

Assignee

Updated

•

25 years ago

Status: NEW → ASSIGNED

Akkana Peck

Assignee

Comment 14

•

25 years ago

Accepting bug.

Akkana Peck

Assignee

Updated

•

25 years ago

Status: ASSIGNED → RESOLVED

Closed: 25 years ago

Resolution: --- → FIXED

Akkana Peck

Assignee

Comment 15

•

25 years ago

Fixed, and I have a test case in the automated output test to watch for
regressions:  run
TestOutput -i text/xif -o text/plain -c OutTestData/entityxif.out
OutTestData/entityxif.xif
(Linux only as yet, after building the output tests in htmlparser/tests/outsinks
-- not part of standard build yet).

Crysgem

Reporter

Updated

•

25 years ago

Status: RESOLVED → VERIFIED

Crysgem

Reporter

Comment 16

•

25 years ago

[Verifying--Fixed, on the originally reported platform].