Closed
Bug 13873
Opened 25 years ago
Closed 25 years ago
Extraneous white space inserted into pasted CGI URL
Categories
(Core :: DOM: HTML Parser, defect, P3)
Tracking
VERIFIED
FIXED
M12
People
(Reporter: Crysgem, Assigned: akkzilla)
Details
Apprunner Build ID: 1999091310
A query was performed with the term "Search Engines" at HotBot. When the URL of the resultant page is copied in Ancestor 4.6, it is pasted thus:
http://www.hotbot.com/?MT=Search+Engines&SM=MC&DV=0&LG=any&DC=10&DE=2&BT=H&Searc h.x=31&Search.y=7
When the same URL data is pasted from that copied in Mozilla, thus it appears:
http://www.hotbot.com/?BT=H& MT=Search+Engines& Search.x=39& Search.y=4& SM=MC& DV=0& LG=any& DC=10& DE=2
[Assigned to Parser for efficiency of processing, not due to any applicability for the component]
Status: NEW → ASSIGNED
Summary: Extraneous white space inserted into pasted CGI URL → Extraneous white space inserted into pasted CGI URL
(Most recently tested with the 1999091713 Apprunner build) Perform a query on the term "Mozilla" at HotBot. The resultant URL is http://www.hotbot.com/?BT=H&MT=Mozilla&Search.x=44&Search.y=7&. However, if the URL is copied from the address field and pasted, it is reproduced as: http://www.hotbot.com/?BT=H& MT=Mozilla& Search.x=44& Search.y=7&. Note the inserted spaces after each ampersand.
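The symptom described above can be checked mechanically. The following is a minimal sketch, not part of the original report: the helper name `inserted_whitespace` is hypothetical, and it assumes the pasted text differs from the expected URL only by extra whitespace.

```python
def inserted_whitespace(expected: str, pasted: str) -> list[int]:
    """Return the indices in `pasted` of whitespace that is absent from `expected`.

    Assumption: `pasted` is `expected` plus extra whitespace and nothing else.
    Any other difference raises ValueError.
    """
    extra = []
    i = 0  # current position in `expected`
    for j, ch in enumerate(pasted):
        if i < len(expected) and ch == expected[i]:
            i += 1  # characters agree, advance in both strings
        elif ch.isspace():
            extra.append(j)  # whitespace present only in the pasted text
        else:
            raise ValueError(f"unexpected character {ch!r} at index {j}")
    if i != len(expected):
        raise ValueError("pasted text is missing characters from expected")
    return extra


# URLs taken from the comment above.
expected = "http://www.hotbot.com/?BT=H&MT=Mozilla&Search.x=44&Search.y=7&"
pasted = "http://www.hotbot.com/?BT=H& MT=Mozilla& Search.x=44& Search.y=7&"

positions = inserted_whitespace(expected, pasted)
print(positions)
# Every inserted space directly follows an ampersand.
print(all(pasted[p - 1] == "&" for p in positions))
```

A check of this kind makes it easy to confirm that the extra characters are tied to the ampersands rather than to line length or wrapping.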
Assignee
Comment 4•25 years ago
I'll look at it and see where the spaces are being introduced.
Assignee
Updated•25 years ago
Assignee
Comment 5•25 years ago
This appears to be a Windows nsClipboard issue, not a problem in the output converters; it doesn't happen on Linux. Changing url to the one giving the problem, since the one in the comments below is wrapped and therefore not usable as a test.
Assignee
Updated•25 years ago
Assignee: akkana → pinkerton
Assignee
Comment 6•25 years ago
However, I can't reproduce this on Windows, either. Where are you pasting? The browser urlbar still doesn't seem to support pasting on Windows (there's another longstanding bug on that), and I don't see any spaces when pasting into either the plaintext editor or the html editor on yesterday's build on NT. Maybe it's something to do with the Win98 clipboard? Assigning to pinkerton, who owns XP clipboard and perhaps knows who owns the Windows clipboard these days.
Updated•25 years ago
Assignee: pinkerton → akkana
Comment 7•25 years ago
Happens on both macOS and Windows when the text is copied from Mozilla. Works just fine when the text is copied from 4.5. It appears that this is how the text is being put on the clipboard: I broke into the clipboard code in the debugger, and that is how it is given to me. Not a clipboard bug; probably a XIF converter bug in how it converts to text/plain. BTW, copy/paste works in the URL bar if you use the cmd-keys. So does select-all. Bouncing back to akkana.
Assignee
Updated•25 years ago
Assignee
Comment 8•25 years ago
Mike showed me how he reproduced this: he copied, using the accelerator keys, from apprunner's urlbar back into the urlbar. Unfortunately, the accelerator keys are broken on Linux and Windows (bug 14464). Adding bug dependencies.
Assignee
Updated•25 years ago
Assignee: akkana → harishd
Status: ASSIGNED → NEW
Assignee
Comment 9•25 years ago
I found a way to reproduce it: paste the above URL into the plaintext editor (it pastes fine), then select it there, copy, and paste it back into the plaintext editor -- now it has spaces. The spaces are getting added in conversion of XIF to plaintext, whenever entities are encountered. The XIF fragment looks like this:
<entity value="amp"/><content>SM=MC</content> <entity value="amp"/><content>DV=0</content>
but somehow this causes the parser to call nsXIFDTD::HandleToken with a bogus text token (mTypeID = 119) containing a space right after it calls HandleToken with the entity tag (mTypeID = 15).
Harish, could you take a look at this, or at least give me some clues as to how the parser builds up its stack of tokens and where to break to examine this process? By the time it gets to nsXIFDTD::BuildModel, the token stack is already built and we're just popping them off one by one, so I'm not sure how to find out where these extra text nodes are coming from.
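To make the described token sequence concrete, here is a toy model of it. This is only an illustration under stated assumptions: it is not the Mozilla parser or sink code, `render` is a hypothetical stand-in for a plaintext sink, the two numeric ids are the mTypeID values quoted in the comment above, and `CONTENT_TYPE` is an invented id. Dropping the space tokens here only shows the expected output; it is not a claim about how the bug was actually fixed.

```python
# mTypeID values quoted in the comment above.
ENTITY_TYPE = 15
TEXT_TYPE = 119
# Hypothetical id for content text, invented for this illustration.
CONTENT_TYPE = 0

# Only the entity that appears in the reported fragment.
ENTITY_CHARS = {"amp": "&"}


def render(tokens):
    """Concatenate tokens into plain text, the way a naive sink might."""
    out = []
    for type_id, value in tokens:
        if type_id == ENTITY_TYPE:
            out.append(ENTITY_CHARS[value])
        else:
            out.append(value)
    return "".join(out)


# Token order as described: each entity is followed by a text token holding a space.
observed = [
    (ENTITY_TYPE, "amp"), (TEXT_TYPE, " "), (CONTENT_TYPE, "SM=MC"),
    (ENTITY_TYPE, "amp"), (TEXT_TYPE, " "), (CONTENT_TYPE, "DV=0"),
]

# The same stream without the single-space text tokens, for comparison only.
without_bogus = [t for t in observed if not (t[0] == TEXT_TYPE and t[1] == " ")]

print(repr(render(observed)))
print(repr(render(without_bogus)))
```

The first rendering reproduces the reported "& " pattern, and the second matches the URL as it should have been pasted.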
Assignee
Comment 10•25 years ago
I have a test case for this now, which allows testing this with a standalone test app. I will check it into htmlparser/tests/outsinks whenever I can (I may not be allowed to until the tree opens for M12).
Assignee
Updated•25 years ago
Assignee
Comment 11•25 years ago
Removing dependencies, which are no longer relevant.
Comment 12•25 years ago
Token stack is built up during the tokenization phase, i.e., right before the DTD call. The best place to break would be the call to ConsumeToken() by the parser. This is the place where the tokenization actually begins. Since we're seeing a whitespace it might be a good idea to also break at nsHTMLTokenizer::ConsumeWhitespace(). Anyway, I'll also take a look into this bug.
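The two-phase arrangement described above (tokenize first, then hand the finished token queue to the DTD) can be sketched with a toy tokenizer. Everything here is hypothetical: `ToyTokenizer`, its method names and `build_model` are simplified stand-ins, not the real nsHTMLTokenizer or nsXIFDTD, and the input is the fragment as it appears in the flattened comment text. The sketch only shows why a breakpoint in the whitespace-consuming step catches the token at its origin, whereas the model-building step sees it too late.

```python
from collections import deque


class ToyTokenizer:
    """Hypothetical stand-in for a tokenizer; not Mozilla's nsHTMLTokenizer."""

    def __init__(self, source):
        self.source = source
        self.pos = 0
        self.tokens = deque()
        self.whitespace_offsets = []  # trace of where whitespace tokens began

    def consume_token(self):
        # Dispatch on the first character of the next token.
        ch = self.source[self.pos]
        if ch == "<":
            self.consume_tag()
        elif ch.isspace():
            self.consume_whitespace()
        else:
            self.consume_text()

    def consume_tag(self):
        end = self.source.index(">", self.pos) + 1
        self.tokens.append(("tag", self.source[self.pos:end]))
        self.pos = end

    def consume_whitespace(self):
        # The analogue of a breakpoint: record where the whitespace run starts.
        start = self.pos
        while self.pos < len(self.source) and self.source[self.pos].isspace():
            self.pos += 1
        self.whitespace_offsets.append(start)
        self.tokens.append(("whitespace", self.source[start:self.pos]))

    def consume_text(self):
        start = self.pos
        while self.pos < len(self.source) and self.source[self.pos] != "<":
            self.pos += 1
        self.tokens.append(("text", self.source[start:self.pos]))

    def tokenize(self):
        while self.pos < len(self.source):
            self.consume_token()
        return self.tokens


def build_model(tokens):
    """Second phase: pop already-built tokens one by one."""
    seen = []
    while tokens:
        seen.append(tokens.popleft())
    return seen


source = ('<entity value="amp"/><content>SM=MC</content> '
          '<entity value="amp"/><content>DV=0</content>')
tokenizer = ToyTokenizer(source)
tokens = tokenizer.tokenize()
# The trace is complete before the model-building phase starts.
offsets_before_model = list(tokenizer.whitespace_offsets)
model = build_model(tokens)
print(offsets_before_model)
print([kind for kind, _ in model])
```

In this toy input the whitespace sits between two elements. The thread does not establish that the real bogus token had the same origin, so the sketch shows the debugging approach rather than the cause.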
Comment 13•25 years ago
The problem seems to be in nsHTMLToTXTSinkStream::AddLeaf(). I discussed it with akkana and she is willing to take the bug back. Reassigning bug to akkana@netscape.com.
Assignee
Updated•25 years ago
Status: NEW → ASSIGNED
Assignee
Comment 14•25 years ago
Accepting bug.
Assignee
Updated•25 years ago
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Assignee
Comment 15•25 years ago
Fixed, and I have a test case in the automated output test to watch for regressions: run
TestOutput -i text/xif -o text/plain -c OutTestData/entityxif.out OutTestData/entityxif.xif
(Linux only as yet, after building the output tests in htmlparser/tests/outsinks -- not part of standard build yet).
Reporter
Comment 16•25 years ago
[Verifying--Fixed, on the originally reported platform].