Closed
Bug 104876
Opened 24 years ago
Closed 24 years ago
c:foo wrongly considered valid URL
Categories
(Core :: Networking, defect)
Tracking
()
VERIFIED
FIXED
mozilla0.9.6
People
(Reporter: BenB, Assigned: andreas.otte)
Details
Attachments
(1 file, 2 obsolete files)
|
1.64 KB,
patch
|
jesup
:
review+
dougt
:
superreview+
|
Details | Diff | Splinter Review |
Reproduction:
1. Read plaintext email with "c:foo" in content
Actual result:
It's linked to <c:foo> (which means, given the recognizer code, that Necko
considers "c:foo" to be a valid URL, which is of course wrong).
Expected result:Only valid and usable URI schemes are considered valid.
| Assignee | ||
Comment 1•24 years ago
|
||
Nice example ... :-)
I guess this depends on the recognizer code. The urlparser will make this:
urltest -std c:foo
c:foo
c,,,foo,-1,/,,,,,,c://foo/
which is somewhat legal, since the parser does not know which schemes are vaild.
A creation of a real url would give these results:
urltest c:foo
c:foo
Can not create URL
So it seems that the code that does this recognition inside mailnews has to do
something more than just call the parser, at least, if not creating the full
url, it has to check for a valid protocolhandler. This can be a performance hit.
Where is the recognizer code located? I will take a look.
| Reporter | ||
Comment 2•24 years ago
|
||
> A creation of a real url
That's what I do:
rv = mIOService->NewURI(specStr, nsnull, getter_AddRefs(uri));
<http://lxr.mozilla.org/seamonkey/source/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp#380>
> would give these results: [...] Can not create URL
It doesn't, which is the bug.
| Assignee | ||
Comment 3•24 years ago
|
||
This is really strange. I used the exact same codelayout inside urltest.cpp:
result = pService->NewURI(i_pURL, nsnull, getter_AddRefs(pURL));
which returns the results as seen above. I will dig into this.
Comment 4•24 years ago
|
||
See also bug 104693 (and the C:\\ example)
this is a dup of bug 104693. Punctuation liks dots and colons that don't have
space before and after are treated like http:// URL's
| Assignee | ||
Comment 6•24 years ago
|
||
I will be gone for two days, can't work on it now, but there seems a lot of
coverage on this stuff. This seems to be a recent change, but going through
changes in netwerk over the last week nothing comes to mind except the str?cmp
changes by alecf. Or maybe some caching of protocol handlers gone bad ...
| Reporter | ||
Comment 7•24 years ago
|
||
> going through changes in netwerk over the last week nothing comes to mind
> except the str?cmp changes by alecf.
Same with the converter.
Comment 8•24 years ago
|
||
This isn't a dupe of the recent regression; it's been this way since 2001-09-03
at least.
Gerv
| Assignee | ||
Comment 9•24 years ago
|
||
It seems we call in these cases a default protocol handler which calls the os to
figure out if it can handle the url. Somehow, I don't know why yet, "c" is
considered a protocol that can be executed by (at least) linux. As a consequence
the url can be created ...
| Assignee | ||
Comment 10•24 years ago
|
||
No, it's GetCachedProtocolHandler in nsIOService.cpp. It mistakes c for the
chrome protocol and returns the chrome handler.
| Assignee | ||
Comment 11•24 years ago
|
||
This funny thing was caused by rjesup's checkin to nsIOService.cpp (1.104)
http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&file=nsIOService.cpp&root=/cvsroot&subdir=mozilla/netwerk/base/src&command=DIFF_FRAMESET&rev1=1.103&rev2=1.104
protocol "c" is found to be "chrome" because of the strncasecmp with length 1.
| Assignee | ||
Comment 12•24 years ago
|
||
| Assignee | ||
Comment 13•24 years ago
|
||
| Assignee | ||
Updated•24 years ago
|
Attachment #54241 -
Attachment is obsolete: true
| Assignee | ||
Comment 14•24 years ago
|
||
And now is clear why urltest worked ... there was no chrome url to cache ...
Comment 15•24 years ago
|
||
Attachment #54246 -
Flags: review+
Comment 16•24 years ago
|
||
| Assignee | ||
Comment 17•24 years ago
|
||
Comment on attachment 54502 [details] [diff] [review]
alternate solution w/o the additional strlen
gScheme[i][len] might peek into some memory that is somewhere else if you enter a very long scheme to test with. I can't think of a case where this check fails but some boundary checking code might get upset.
Comment 18•24 years ago
|
||
actually, i believe it is ok. consider these 3 cases:
1) len > strlen(gScheme[i]) and gScheme[i] is a substring of scheme: strncasecmp
would return non-zero, since gScheme[i][k] would equal '\0' for some k < len.
2) len < strlen(gScheme[i]) and scheme is a substring of gScheme[i]: strncasecmp
would return zero, but gScheme[i][len] would be non-zero.
3) len == strlen(gScheme[i]) and scheme is equal to gScheme[i]: strncasecmp
would return zero, and gScheme[i][len] would be zero.
Comment 19•24 years ago
|
||
Comment on attachment 54502 [details] [diff] [review]
alternate solution w/o the additional strlen
r=rjesup@wgate.com
I agree with darin's analysis of the latest patch
Attachment #54502 -
Flags: review+
| Assignee | ||
Comment 20•24 years ago
|
||
I also believe the patch works fine, it's just that for example with a test
scheme of "ccccccccccccccccccccccccccccccccccccc" and a list of cached hadlers
like "http","chrome","file" we would peek with gScheme[i][len] into some memory
that is not allocated by the gScheme array. If you both think that is okay, I
will accept that.
Comment 21•24 years ago
|
||
cccccccccccccccccccccccccccccccc won't cause a past-end reference.
Looking at it again, chromeccccccccccccccccccccccccc will, however.
While this is 99% likely to be "safe" (not cause a page fault), it's not 100% -
if the linker puts the string near the end of a page it could fault.
Also, even worse, gScheme[i][len] could be a \0 in ANOTHER string (or other
data-section constant). So this is inherently wrong. I retract my r= for the
no-strlen patch.
If you want to avoid the strlen you should encode an array of length of the
values in gScheme (gSchemeLen[]). Otherwise do the strlen - the data will be
in L1 cache, and the longest scheme is about 6 characters, so the hit is minimal
(possible cache miss for strlen's code if the compiler doesn't inline it).
Updated•24 years ago
|
Attachment #54502 -
Flags: review+
Comment 22•24 years ago
|
||
Comment on attachment 54502 [details] [diff] [review]
alternate solution w/o the additional strlen
My apologies - it's too early in the morning. I looked at it one more time, and realized that the strncasecmp uses len. Since we don't look at gSchemes[i][len] unless strncasecmp returns 0, we know that the scheme and gScheme[i] are the same for len characters. Thus we know that strlen(gScheme[i]) >= len, so we can look at gScheme[i][len]. Mea Culpa. I need caffiene
Attachment #54502 -
Flags: review+
| Assignee | ||
Comment 23•24 years ago
|
||
Okay, I didn't took the execution order into account. I will go hunting for sr=,
but it will be weekend before I check this in, I'm away for three days.
Comment 24•24 years ago
|
||
yes, the code working correctly depends on execution order, and rjesup explained
it perfectly.
| Assignee | ||
Updated•24 years ago
|
Attachment #54246 -
Attachment is obsolete: true
Comment 25•24 years ago
|
||
Comment on attachment 54502 [details] [diff] [review]
alternate solution w/o the additional strlen
okay by me
Attachment #54502 -
Flags: superreview+
Comment 26•24 years ago
|
||
So we have r=/sr=, let's check in
Keywords: patch
Target Milestone: --- → mozilla0.9.6
| Assignee | ||
Comment 27•24 years ago
|
||
fix checked in ...
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Comment 28•23 years ago
|
||
Using build 20020322 on winxp, mac 9.1 and linux and the original scenario in
plain text compose and html compose, this is fixed. Verified
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•