Bug 425046 - (CVE-2008-5508) URLs containing 0x01 are interpreted very oddly - possible overflow bug?
Keywords: fixed1.9.1, verified1.8.1.19, verified1.9.0.5
Product: Core
Classification: Components
Component: Networking
: unspecified
: x86 Linux
-- normal
: ---
Assigned To: Nobody; OK to take it and work on it
: Patrick McManus [:mcmanus]
Depends on: 451613
Reported: 2008-03-25 12:19 PDT by Chip Salzenberg
Modified: 2009-01-13 11:09 PST
13 users
samuel.sidler+old: wanted1.9.0.x+
dveditz: wanted1.8.1.x+

Sample file to trigger bug (98 bytes, text/html)
2008-03-25 13:32 PDT, Chip Salzenberg

Description Chip Salzenberg 2008-03-25 12:19:40 PDT
User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv: Gecko/20080129 Iceweasel/ (Debian-
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv: Gecko/20080129 Iceweasel/ (Debian-

This tag:
    <a href=&#1;;space&#32;here>linky</a>
generates this link text:
there's a ^A at the beginning of that, if you can't see it here.

What is wrong with this?  Let me count the ways:

    htt: instead of http:
    three slashes instead of two before "yahoo"
    slash in the middle of "com"
    missing "e" at the very end

I'm a C programmer and this smells like the kind of result that could spring from a potential buffer overflow bug.  However I have no evidence that an overflow is actually happening.  I tag this bug "Security" out of caution.

BTW this link works in Outlook - not that I would suggest for a second that you imitate Outlook!  But that's why I was testing it.

Reproducible: Always

Steps to Reproduce:
1. Paste the above link tag into HTML
2. View it
3. "Copy link location" from link
Actual Results:  
htt:///  (note leading ^A)

Expected Results:  (possibly with leading ^A)
Comment 1 :Gavin Sharp [email:] 2008-03-25 13:18:21 PDT
This looks like it might be related to bug 415034?

Chip: could you attach an HTML test file, rather than inline HTML code? I'm not able to get your example code to act like a normal link...
Comment 2 Chip Salzenberg 2008-03-25 13:32:46 PDT
Created attachment 311646
Sample file to trigger bug
Comment 3 Chip Salzenberg 2008-03-25 13:34:44 PDT
It's not a "normal" link at all - the first char of the URL is a ^A.  But it is a URL that can be copied with "Copy link location".
Comment 4 Christian :Biesinger (don't email me, ping me on IRC) 2008-04-16 04:20:55 PDT
(adjusting summary. quotes don't matter)
Comment 5 Daniel Veditz [:dveditz] 2008-04-22 22:50:20 PDT

We first decide this is a relative URL because we're unable to extract a valid scheme (nsIOService::NewURI calls ExtractScheme). ExtractScheme fails because of the invalid scheme character. If you replace the x01 with '_' you see something like which is fine (probably not what the author intended, but not an unreasonable guess).
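To illustrate the scheme check described above, here is a minimal Python sketch, not the actual ExtractScheme code. It assumes the RFC 3986 scheme grammar (a letter followed by letters, digits, "+", "-" or "."), an assumed "limited whitespace set" of space, tab, CR and LF, and a placeholder example.com URL. The function name `extract_scheme` is hypothetical.

```python
import string

# Assumed character sets, modeled on the RFC 3986 scheme grammar.
SCHEME_FIRST = set(string.ascii_letters)
SCHEME_REST = SCHEME_FIRST | set(string.digits) | set("+-.")
LIMITED_WHITESPACE = " \t\r\n"  # assumption: the "limited whitespace set"


def extract_scheme(spec):
    """Return the lowercased scheme, or None if no valid scheme is present."""
    trimmed = spec.lstrip(LIMITED_WHITESPACE)
    colon = trimmed.find(":")
    if colon <= 0:
        return None
    candidate = trimmed[:colon]
    if candidate[0] not in SCHEME_FIRST:
        return None
    if any(ch not in SCHEME_REST for ch in candidate[1:]):
        return None
    return candidate.lower()


# 0x01 and '_' are both invalid scheme characters, so both specs look relative.
for spec in ("http://example.com/", "\x01http://example.com/", "_http://example.com/"):
    kind = "absolute" if extract_scheme(spec) else "relative"
    print(repr(spec), "->", kind)
```

Under these assumptions only the first spec is classified as absolute; the other two fall through to relative resolution against the base.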

So given the base, nsHttpsHandler::NewURI is called, which creates a nsStandardURL and for some reason calls resolve on the purely "relative" part. 

nsBaseURLParser::ParseURL looks for a scheme, but skips initial space and all control characters, not just the limited whitespace set used earlier. We now think the scheme is a valid "http" instead of bailing out at this point. We record that the scheme starts at 0 and has a length of 4. That is true relative to our modified spec string, which we advanced when skipping the initial chars, but the caller never gets told we changed the start, so all the offsets from that point on are off by one.
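The mismatch can be modeled in a few lines of Python. This is a simplified sketch, not the real parser: `parse_scheme_buggy` is a hypothetical name, the "skip everything at or below 0x20" rule is an assumption about what "space and all control characters" means, and example.com is a placeholder URL.

```python
def parse_scheme_buggy(spec):
    """Return (scheme_pos, scheme_len), measured against a locally trimmed copy."""
    skipped = 0
    # Skips space AND every control character (anything <= 0x20).
    while skipped < len(spec) and ord(spec[skipped]) <= 0x20:
        skipped += 1
    trimmed = spec[skipped:]
    colon = trimmed.find(":")
    if colon <= 0:
        return None
    # Bug being modeled: `skipped` is never added back or reported.
    return (0, colon)


spec = "\x01http://example.com/"
pos, length = parse_scheme_buggy(spec)
print((pos, length))                 # (0, 4): valid for the trimmed copy only
print(repr(spec[pos:pos + length]))  # '\x01htt': what the caller reads
```

In this model the caller slices its own, untrimmed string with offsets that were computed one character later, so the 'p' falls outside the scheme.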

This bit should have used net_FilterURIString() like the original scheme extraction.
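One way to picture that suggestion: filter the spec once and let both the relative-or-absolute decision and the offset computation work on that same filtered string. The Python sketch below is illustrative only. `filter_uri_string` is a hypothetical stand-in whose whitespace set is assumed, not the real behavior of net_FilterURIString, and `scheme_span` and `classify` are made-up helper names.

```python
import re

LIMITED_WHITESPACE = " \t\r\n"  # assumption, not the real filter's exact set
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:")


def filter_uri_string(spec):
    """Hypothetical shared filter: trim only the limited whitespace set."""
    return spec.strip(LIMITED_WHITESPACE)


def scheme_span(filtered):
    """Return (start, length) of the scheme within `filtered`, or None."""
    match = SCHEME_RE.match(filtered)
    return (0, match.end() - 1) if match else None


def classify(spec):
    """Decide and slice against the same filtered string, so offsets agree."""
    filtered = filter_uri_string(spec)
    span = scheme_span(filtered)
    if span is None:
        return ("relative", filtered)
    start, length = span
    return ("absolute", filtered[start:start + length])


print(classify("  http://example.com/"))     # ('absolute', 'http')
print(classify("\x01http://example.com/"))   # ('relative', '\x01http://example.com/')
```

With a single filter, the 0x01 spec stays relative at every stage instead of being reinterpreted as absolute later on.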

At this point, because the scheme doesn't match what it thought the original base scheme was, it assumes it must be an absolute URI, regardless of the fact that it was convinced it was relative earlier.

When it puts the URI back together from its component pieces in BuildNormalizedSpec, all the offsets are off by one (or more, if there were more leading control characters), which explains the odd result. The scheme length is 4, so the 'p' gets dropped. The host is "/" instead of "", and of course it puts "://" between the scheme and host, thus the three slashes, and so on.
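A toy reconstruction shows how stale offsets produce this kind of output. It is a rough Python model under stated assumptions, not BuildNormalizedSpec itself: `parse_offsets` and `rebuild` are hypothetical helpers, the URL is a placeholder, and only scheme, authority and path are modeled.

```python
def parse_offsets(trimmed):
    """Return (start, length) pairs for scheme, authority and path."""
    colon = trimmed.index(":")
    auth_start = colon + 3  # assumes "://" follows the scheme
    slash = trimmed.find("/", auth_start)
    path_start = slash if slash != -1 else len(trimmed)
    return {
        "scheme": (0, colon),
        "authority": (auth_start, path_start - auth_start),
        "path": (path_start, len(trimmed) - path_start),
    }


def rebuild(spec, offsets):
    """Reassemble scheme + "://" + authority + path by slicing `spec`."""
    def piece(name):
        start, length = offsets[name]
        return spec[start:start + length]
    return piece("scheme") + "://" + piece("authority") + piece("path")


original = "\x01http://www.example.com/page"
trimmed = original.lstrip("\x01")
offsets = parse_offsets(trimmed)       # computed after skipping ^A

print(repr(rebuild(trimmed, offsets)))   # 'http://www.example.com/page'
print(repr(rebuild(original, offsets)))  # '\x01htt:///www.example.com/pag'
```

This simplified model reproduces the leading ^A, the dropped 'p', the third slash and the missing final character. It does not reproduce the stray slash inside "com" from the original report, which presumably comes from details of the real code that are not modeled here.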

Because BuildNormalizedSpec uses our string classes to handle potential expansion of percent encoding and IDN conversion to punycode, there's no danger of overflowing any buffers.

The code is clearly broken, but I don't think there's a security problem, since the URL in question was broken to begin with (invalid characters). On the other hand, any time we get confused when parsing URIs I get nervous.
Comment 6 Boris Zbarsky [:bz] (still a bit busy) 2008-04-22 23:01:21 PDT
Yeah, seems to me like ParseURL() should do exactly what ExtractScheme does...  Darin, is there a reason this code skips control chars?
Comment 7 Daniel Veditz [:dveditz] 2008-11-24 14:48:56 PST
This is fixed in bug 451613, checked into the 1.8 and 1.9.0 branches.
Comment 8 Al Billings [:abillings] 2008-11-25 17:35:44 PST
Verified for with  Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv: Gecko/2008112505 GranParadiso/3.0.5pre.

Verified for with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv: Gecko/2008112503 BonEcho/
Comment 9 Jason Duell [:jduell] (needinfo me) 2009-01-09 18:20:24 PST
Since this bug is "fixed" (per Daniel Veditz, comment #7), should we close it?
Comment 10 Samuel Sidler (old account; do not CC) 2009-01-09 22:19:41 PST
Jason: Bug 451613 hasn't landed on the trunk yet, so this bug can't be FIXED. 

(Additionally, it hasn't landed on 1.9.1, but that'd be tracked with the fixed1.9.1 keyword.)
Comment 11 Daniel Veditz [:dveditz] 2009-01-12 17:03:09 PST
jst just landed this for me
