79929 - IMG tags with a space in the src= path are not loaded

Reporter

Description

•

23 years ago

At the above URL, there are two IMG tags that have a space in the path to the
image e.g. SRC=" /someimage.gif". This causes Mozilla not to load the images in
question. Removing those spaces with the editor and looking in the preview shows
the images load and display correctly.

Conor Lennon

Comment 1

•

23 years ago

I'm seeing this too in Build 2001050921 in Linux.
Changing OS to All.
Only happens when the url is not absolute.
I don't know what the Html standards say on this.

OS: Windows 2000 → All

Asa Dotzler [:asa]

Comment 2

•

23 years ago

updating component

Assignee: asa → clayton

Status: UNCONFIRMED → NEW

Component: Browser-General → HTML Element

Ever confirmed: true

QA Contact: doronr → bsharma

tor

Comment 3

•

23 years ago

*** Bug 84313 has been marked as a duplicate of this bug. ***

Christopher Blizzard (:blizzard)

Updated

•

23 years ago

Whiteboard: want for mozilla 0.9.2

Christopher Blizzard (:blizzard)

Updated

•

23 years ago

Whiteboard: want for mozilla 0.9.2 → critical for 0.9.2

chris hofmann

Comment 4

•

23 years ago

wonder how much content really looks like this...

Heikki Toivonen (remove -bugzilla when emailing directly)

Comment 5

•

23 years ago

I believe we should remove trailing and leading whitespace (but not ws in between).

Clayton is not the correct owner, HTML Element component should be removed from
Bugzilla. Reassigning to layout.

Assignee: clayton → karnaze

Component: HTML Element → Layout

QA Contact: bsharma → petersen

chris hofmann

Comment 6

•

23 years ago

marc, can you have a look?

Assignee: karnaze → attinasi

Whiteboard: critical for 0.9.2 → PDT+, critical for 0.9.2

Marc Attinasi

Assignee

Comment 7

•

23 years ago

I'll look deeper, but removing the leading (and possibly trailing) spaces shoul
dbe pretty easy, although I believe that this should probably be avoided for
standard mode (I need to check the HTML specs to see if the attribute is somehow
special).

Just curious: why is this critical for 0.9.2?

Status: NEW → ASSIGNED

Eric Pollmann

Comment 8

•

23 years ago

See also bug 24346 - same problem, resolved FIXED.

Chris Petersen

Comment 9

•

23 years ago

Changing QA contact

QA Contact: petersen → tpreston

Eric Pollmann

Comment 10

•

23 years ago

This is the checkin for bug 24346:
http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&subdir=mozilla/layout/html/document/src&command=DIFF_FRAMESET&file=nsHTMLContentSink.cpp&rev1=3.304&rev2=3.305&root=/cvsroot

Eric Pollmann

Comment 11

•

23 years ago

CC'ing Harish - ideas?

Eric Pollmann

Comment 12

•

23 years ago

This space is no longer stripped out as of 21-April, rev 3.339 of that file.  If
I remember correctly, that was to fix a problem with:

<select>
 <option value=" foo">

In that case we *do* need to retain the space.  Perhaps the space should be
stripped out in the image frame, or in the url parser?

Marc Attinasi

Assignee

Comment 13

•

23 years ago

I think it is dangerous to change this at the Content sink since it has no clue
about attribute semantics, or even the URLParser level where it cannot
understand the calling-context. If I strip out the leading whitespace (and
trailing I guess) in the ImageFrame, then the DOM is not messed up. Of course,
it is not entirely clear that we are better off having a DOM value for the
attribute and using something else to actually load the image, but at least the
DOM is consistent with the source.

My proposed fix, which I am coding now and testing, is to do the trimming of the
src and lowsrc attribute values in the nsImageFrame::LoadImage method, for
QuirksMode only. I really think that we should NOT trim the whitespace in
standard mode, but I have yet to find anything in the specs to back me up...

Marc Attinasi

Assignee

Comment 14

•

23 years ago

Attached file Simple testcase — Details

Marc Attinasi

Assignee

Comment 15

•

23 years ago

Attached patch Patch — Details — Splinter Review

Stuart Parmenter

Comment 16

•

23 years ago

should this go in nsImageFrame::LoadImage so that it catches the case when 
someone changes the source dynamically?

Eric Pollmann

Comment 17

•

23 years ago

Should a similar patch go in for:

<form action=>
<body background=>
<blockquote/q/del/ins cite=>
<object classid=>
<object/applit codebase=>
<object data=>
<a/area/link/base href=>
<img/frame/iframe longdesc=>
<head profile=>
<script/input/frame/iframe/img src=>
<img/input/object usemap=>

(There are the attributes with type URI at:
http://www.w3.org/TR/html4/index/attributes.html)

Is it ever legal to have whitespace at the beginning of a URI?  Could all this
be done in one place?

Jeffrey Baker

Comment 18

•

23 years ago

According to RFC 2396, the space character is forbidden everywhere within any
URI.  So Moz should just do whatever it is Moz usually does with a bogus
attribute value.  Stripping the whitespace out is probably the wrong thing, as
it may hide the problem from a page author using Mozilla to test his work.  Most
validating tools won't catch this, because they don't include URI parsers.

Regarding the original problem page: when did <image> get in the HTML spec? 
Mozilla seems to understand <image> = <img>, but I don't know why it should.

Eric Pollmann

Comment 19

•

23 years ago

Specs aside, there are existing pages that expect this behavior.  All other
browsers on the market (including Mozilla up to 21-April) do strip leading
whitespace.  Let's not make this bug a repeat of bug 57968 and ship a browser,
then run around telling webmasters why their images are mysteriously not showing
up.  (If we must do this for strict mode, fine...)

Also, this has been our behavior up until 21-April-2001, so there is good
support for the theory that we won't encounter too many problems if we do strip
the whitespace for all URLs...

Heikki Toivonen (remove -bugzilla when emailing directly)

Comment 20

•

23 years ago

But as I said, do not strip whitespace in the middle, unless we were doing that
previously. We can (or at least used to) handle spaces in URIs. There are
websites that actually use spaces in directory and file names... (in the middle,
not in the beginning or end that I know of). Even Microsoft's BizTalk website
had them when it originally appeared, but they changed it fast :)

Marc Attinasi

Assignee

Comment 21

•

23 years ago

First, I strongly oppose putting this change *outside* of Quirks mode.
Thewhitespace is illegal, and the only reason we want to handle it at all is
because legacy browsers did, hence QuirksMode. In Standard mode, we want to
adhere to the specifications as closely as possibly, regardless of legacy
behavior (God save the web!). Furthermore, I can see no reason to muck with
whitespace inside the URL at all, just leading and trailing.

Re: pavlov's comment: Yea, maybe it should be done in LoadImage for the reasons
you stated and because it is actually easier there (less code). I'll redo tha patch.

Marc Attinasi

Assignee

Comment 22

•

23 years ago

Attached patch New Patch that handles dynamic changes to src too — Details — Splinter Review

Marc Attinasi

Assignee

Comment 23

•

23 years ago

I've attached a patch to trim the leading and trailing whitespace in the
LoadImage method (in QuirksMode only). This will handle dynamic changes to the
src attribute as well, but there is the additional cost of a string-copy of the
src attribute values now.

Please review patchID 39521 and lemme know what y'all think.

Also, I still have not heard why this is critical to 0.9.2 and RTM+ - it
mystifies me since it is marked severity:minor and there are no top50 URLs
noted. Anyway, it is easy enough, so marking target milestone 0.9.2

Priority: -- → P3

Target Milestone: --- → mozilla0.9.2

Heikki Toivonen (remove -bugzilla when emailing directly)

Comment 24

•

23 years ago

There is a typo in the comment, and you do not need the quirksmode variable
since you only test once. Otherwise looks ok to me.

Marc Attinasi

Assignee

Comment 25

•

23 years ago

Thanks Heikki, I fixed the typo and removed the quirksMode variable. Also, I
moved the whitespace static array into the block where it is used (could not see
any reason to remove it though).

selmer (gone)

Comment 26

•

23 years ago

Who is the sr= person for this bug?

Whiteboard: PDT+, critical for 0.9.2 → PDT+, r=heikki, need sr=

Marc Attinasi

Assignee

Comment 27

•

23 years ago

Good question - trying shaver (for no good reason)

Marc Attinasi

Assignee

Updated

•

23 years ago

Keywords: patch

Summary: IMG tags with a space in the src= path are not loaded → [PATCH] IMG tags with a space in the src= path are not loaded

Eric Pollmann

Comment 28

•

23 years ago

To reiterate, the checkin on 21-April that broke this one case (<img src=>) also
broke up to 20 other cases, listed above - every other case where a URL is
specified as an attribute.  This patch only addresses one of them.  Should we
open up a separate bug for the others?

I've only tested a few, but, our solution seems to be the least consistent of
any browser out there.  Some strip whitespace, some do not. This makes our
behavior probably all the more confusing/frustrating/misleading for web designers.

<form action=" foo.cgi">
  IE, Nav 4.x, and Mozilla now all handle this differently
  Netscape 6 handles it like IE.

<body background=" foo.gif">
  IE, Nav 4.x, and Netscape 6 all show this image, we do not.

<script/input/img src=>
  IE, Nav 4.x, and Netscape 6 all handle the script src=, input type=image src=,
and img src= cases, we only handle the img src= case

For example, take a look at this page.  Nav 4.x and IE render it one way
(arguably incorrectly), and we go a different route:
http://eric.302easyst.net/mozilla/bugs/attrws.html

My feeling is that if we fix img src= we should fix the other cases too...

Keywords: patch

Summary: [PATCH] IMG tags with a space in the src= path are not loaded → IMG tags with a space in the src= path are not loaded

Whiteboard: PDT+, r=heikki, need sr= → PDT+, critical for 0.9.2

Marc Attinasi

Assignee

Comment 29

•

23 years ago

Eric, I think your reasoning is sound. I'd like to look at the problem at the
sink level and see what I can come up with. Of course, this bug is marked PDT+
critical for some reason, so maybe I'll need to get this in before adressing the
more fundamental problem. In that case, this hack should be removed if/when the
more basic issue is resolved.

Regarding the April change that regressed this: if the sink was stripping
whitespace when it set attributes, then is it true that the dynamic case was not
handled? Is that OK? Also, how much semantic knowledge do we want the sink to
have about attribute values?

Eric Pollmann

Comment 30

•

23 years ago

True, well...  as this is PDT+, I think the patch above is definitely better
than not taking any action at all, as it brings us one step closer to compatability!

Christopher Blizzard (:blizzard)

Comment 31

•

23 years ago

I agree that whitespace should not be stripped except in quirks mode.  The
danger here is that whitepace might be signifigant in the case of an image that
is actually generated by a cgi.

sr=blizzard

Christopher Blizzard (:blizzard)

Comment 32

•

23 years ago

After prodding from Mike.  Can you guys make sure that you open bugs to deal
with and investigate the other isses that Eric has brought up here?  Making sure
that we pick something to be consistent with and making a better long term
solution?  THanks.

selmer (gone)

Comment 33

•

23 years ago

Chofmann didn't say it when he added PDT+, but the reason to want this badly is
that we have so many pages out there where we don't "just work" as other
browsers do (or at least as well as 4.x ;-)  I assume this class of quirks
(including the spinoff bugs suggested by Eric) is a significant contributor to
the problem.  If this patch isn't going to introduce a bunch of risk, then I
agree it's good to get what we have and work on doing better.  If the patch
might cause a bunch of regressions, I'd rather wait (unless current behavior is
*worse* than what we shipped in N6.)

What's the prognosis on fixing the problem in the general case to maximize
compatibility with IE and N4.x across the full range of URIs with whitespace?

chris hofmann

Updated

•

23 years ago

Whiteboard: PDT+, critical for 0.9.2 → PDT+, critical for 0.9.2, have patch, r and sr

Marc Attinasi

Assignee

Comment 34

•

23 years ago

Fixing the general problem is going to be more risky. I think the risk can be
mitigated, however, by getting enough eyes on it. The change I am proposing is
pretty simple, but it adds some semantic knowledge to the ContentSink that I
don't think it should have. That said, there are many of examples of the Sink
already doing this kind of element-specific / attribute-specific processing,
mostly for quirks. I'll put a patch together that takes care of stripping
leading and trailing whitespace from attributes while not regression the option
values (the reason this was changed and regressed in the first place). Once the
patch is there, people can look at it and decide if the risk is reasonable and
the benefits desirable...

Marc Attinasi

Assignee

Comment 35

•

23 years ago

Attached patch Patch to content sink: strips whitespace for all attributes that are not in a special list (back to old behavior without regressing the option value fix) — Details — Splinter Review

Marc Attinasi

Assignee

Comment 36

•

23 years ago

I attached a patch to the content sink that makes it strip out whitespace for
attributes EXCEPT those that are in a special table of non-strippable
attributes. Currently, the table is only built up of attribute names, though in
theory it might need tag names too (since attributes in different contexts might
have different legacy behavior). Currently, the table only has the 'value'
attribute so it does not regress the option value bug (gets us back to where we
were when bug 24346 was fixed, more or less).

The benefit is that all of the cases that Eric Pollman enumerated are handled.

The drawbacks to this approach are: cannot distinguish between Quirks and
Standard mode, and a slight performance hit as each attribute has to be looked
up in a small table.

Reviews? Thoughts? Concerns? Come on, lemme have it!

Jeffrey Baker

Comment 37

•

23 years ago

Sounds backwards.  We know what attributes have a URI type, so why not strip
whitespace for only those attributes.  Actually it always seemed to me that
whitespace stripping should be done by whatever code is responsible for
initiating the HTTP request, or resolving the relative URI.  That code knows it
is working on a URI, and that neither leading nor trailing whitespace belong
there.  That would definitely have less of a performance impact as well. 
Consulting a lookup table for every attribute can add up on pages like LXR, or
W3C recs, others, but stripping whitespace need only happen when navigating to
another page or loading style sheets, scripts, objects, and images.

Marc Attinasi

Assignee

Comment 38

•

23 years ago

Jeffrey, I agree with you, in fact the whole idea of stripping whitespace on
attributes stinks if you ask me. I was trying to minimize the impact, and since
we used to strip spaces for ALL attributes, and then changed it for the 'value'
attribute bug, I coded up a patch that is closest to what we had been doing for
some time. Clearly we were not discriminating between URL attributes and non-URL
attributes, so why should we now? It all seems so wrong, really, but then what
do I know?

Bear in mind that we are EVEN NOW stripping whitespace on all attributes, just
not the 'space' character. The current code has the Trim in there but no space
char in the set. So, the impact of the patch is just to add a check against the
set of attributes...

Johnny Stenback (:jst)

Comment 39

•

23 years ago

I would suggest that we'd move the whitespace stripping code to
HTMLContentSink::AddAttributes() where we can do the name compares by simply
comparing atom pointers and not having to do a ASCII and Unicode string
comparison  for all attributes, we already have the atom in
HTMLContentSink::AddAttributes() so we might as well use it.

PS. Since we're changing this code, we might as well replace Truncate() and
Append() with one call to Assign() in HTMLContentSink::GetAttributeValueAt() too.

Jon Hall

Comment 40

•

23 years ago

Please do not penalize standards mode performance to be compatible with illegal
html. Lowest common denominator is not what standards mode is all about. Leave
compatibility for quirks and Netscape.

Chris Hiner

Comment 41

•

23 years ago

Bug 83603 is caused by spaces in a script src tag...  So if this one is fixed
generically, close that one as well.

Christopher Blizzard (:blizzard)

Comment 42

•

23 years ago

This is a 0.9.2 bug.  Can someone summarize what's going on with this bug?  Thanks.

Marc Attinasi

Assignee

Comment 43

•

23 years ago

Summary: we used to strip leading whitespace from all attributes in the
ContentSink. We still do this, but we do not include the 'space' character in
the set of whitespace chars to strip. So, now if a URL has a leading space, it
is not handled properly. If it is a src attribute for an image, the image is not
displayed. If it is a src attribute for a script tag, the script is not loaded.

My initial stab at fixing this was to handle images only, and in quirks mode
only. Then it was pointed out that lots of attributes need the same processing,
so I made a hack into the content sink to basically revert it back to what it
was doing back in April (i.e. always stripping whitespace, including spaces,
except I excluded the 'value' attribute since that is the one that it was
changed for). The big problem is, this is non-standard behavior - whitespace in
the URL is illegal, and by stripping it we are not conforming to spec. In the
content sink it is not really possible (I think) to distinguish between Standard
and Quirks mode. There is also a small performance penalty in stripping
whitespace from every attribute...

Another proposed solution is to handle it in the URL parser - I was looking at
stripping the whitespace in the NS_NewURI call, but I am not very familiar with
that code, and not sure if it can be condition on Std. vs. Quirks mode.

There is a patch attached that handles <IMG> only. The question is about the
general case...

Christopher Blizzard (:blizzard)

Comment 44

•

23 years ago

Adding dougt since he can give us advice on the URL parsing bits.

Marc Attinasi

Assignee

Comment 45

•

23 years ago

I was looking at the NS_NewURI methods in nsNetUtil and it looks like we could
do the stripping there, but I cannot tell if the better place would be in the
IOService itself. The NS_NewURI method is inline, which seems quite strange
since it is called so often: if it is really being inlined then it seems like a
cause of potential bloat, if it is not being inlined then it is probably
screwing up code locality (anybody know if it is inlined on the various platforms?).

I can take a stab at it, but I keep thinking about johnhall's plea to NOT
penalize standards mode performance by stripping illegal whitespace - and I
agree with him. Stripping at the URI parsing would cause this penalty.

Jeffrey Baker

Comment 46

•

23 years ago

But stripping at URI parsing is going to happen a lot less frequently than
stripping in the attribute parsing.

Marc Attinasi

Assignee

Comment 47

•

23 years ago

PRessure is mounting: should I just checkin the hack in patch 39521 to fix the 
image problem, and then we can work on the general URL-stripping problem for 
0.9.3? Please advise - I have r and sr but need approval from drivers...

Eric Pollmann

Comment 48

•

23 years ago

Attached patch patch in netlib [one more fly in the ointment... :) ] — Details — Splinter Review

Eric Pollmann

Comment 49

•

23 years ago

This patch also fixes the problem, and might address some of the performance
concerns.  I copied the code that was already in nsStdUrl::SetSpec to strip
leading whitespace from absolute url's, to nsStdUrl::Resolve.  It may, in fact,
only need to be in one of the two places.  ???  Unfortunately, I don't really
know this code all that well...

As for performance:
  We are already doing the exact same logic for every URL that gets created in
SetSpec().  In the normal case (no whitespace), the logic reduces to a few
instructions.  As for how often this code is called, in loading the testcase I
posted above:
http://eric.302easyst.net/mozilla/bugs/attrws.html

SetSpec is called 109 times (no change here)
Resove is called 16 times (this patch would add a few extra instructions per call)

I am not advocating this as the correct solution, because I really, Really don't
know what the ramifications would be, but I'm posting it here as yet another
solution to consider, if needed.

Christopher Blizzard (:blizzard)

Updated

•

23 years ago

Whiteboard: PDT+, critical for 0.9.2, have patch, r and sr → PDT+, critical for 0.9.2, have patch, r and sr; need a decision on what patch to use from drivers and people in bug

Asa Dotzler [:asa]

Comment 50

•

23 years ago

at this late stage in the game I'd rather take the lower risk hack (for images
only) on the 0.9.2 branch and put the real fix on the trunk.  a= asa@mozilla.org
(on behalf of drivers) for images only fix on the 0.9.2 branch.

Whiteboard: PDT+, critical for 0.9.2, have patch, r and sr; need a decision on what patch to use from drivers and people in bug → PDT+, critical for 0.9.2, have patch, r and sr; a=drivers for image only fix on 0.9.2 branch

Marc Attinasi

Assignee

Comment 51

•

23 years ago

Patch for images only checked into trunk. The remaining, more general, issue is
being dealt with both here and in bug 87298 - I'm going to open a new bug to
cover the URL problem specifically, and reference it in both bugs (this and 87298).

Marc Attinasi

Assignee

Comment 52

•

23 years ago

Bug 87894 opened to cover remaining, general issue with whitespace in URLs.
Marking this one FIXED

Status: ASSIGNED → RESOLVED

Closed: 23 years ago

Resolution: --- → FIXED

Terri Preston

Comment 53

•

23 years ago

Re-opening, can't see the spinning logo on https://webmail.jadedirect.com, or
that cute little baby in the test case

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Marc Attinasi

Assignee

Comment 54

•

23 years ago

Terri, did you use a BRANCH build? This is BRANCH only...

Marc Attinasi

Assignee

Comment 55

•

23 years ago

Sorry, my comment on  
   Additional Comments From Marc Attinasi 2001-06-26 14:47
was wrong, it should have said BRANCH, not trunk. This hacked fix is for the
BRANCH only.

Terri Preston

Comment 56

•

23 years ago

Alrighty, FIXED ON BRANCH ;-)

Status: REOPENED → RESOLVED

Closed: 23 years ago → 23 years ago

Resolution: --- → FIXED

Terri Preston

Comment 57

•

23 years ago

Verified fixed w2k branch build 2001062803
Verified fixed linux branch build 2001062706
Verified fixed mac branch build 2001062809

Status: RESOLVED → VERIFIED

R.K.Aa.

Comment 58

•

22 years ago

Did this regress? See bug 133536
http://bugzilla.mozilla.org/show_bug.cgi?id=133536
and testcase in
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=76204

Andreas Otte

Comment 59

•

22 years ago

This has only been fixed on the 0.9.2 branch, so nothing regressed, the issue is
still there.

Simple testcase 23 years ago Marc Attinasi 160 bytes, text/html		Details
Patch 23 years ago Marc Attinasi 1001 bytes, patch		Details \| Diff \| Splinter Review
New Patch that handles dynamic changes to src too 23 years ago Marc Attinasi 1.14 KB, patch		Details \| Diff \| Splinter Review
Patch to content sink: strips whitespace for all attributes that are not in a special list (back to old behavior without regressing the option value fix) 23 years ago Marc Attinasi 1.43 KB, patch		Details \| Diff \| Splinter Review
patch in netlib [one more fly in the ointment... :) ] 23 years ago Eric Pollmann 1.88 KB, patch		Details \| Diff \| Splinter Review