Closed Bug 104876 Opened 24 years ago Closed 24 years ago

c:foo wrongly considered valid URL

Categories

(Core :: Networking, defect)

Product:

Component:

Platform:

x86

Linux

Type:

defect

Priority:

Not set

Severity:

normal

Tracking

()

Status:

VERIFIED FIXED

Milestone:

mozilla0.9.6

People

(Reporter: BenB, Assigned: andreas.otte)

Details

Attachments

(1 file, 2 obsolete files)

patch to fix the problem ... have to check for the length too ... 24 years ago Andreas Otte 810 bytes, patch		Details \| Diff \| Splinter Review
this is better ... 24 years ago Andreas Otte 811 bytes, patch	jesup : review+	Details \| Diff \| Splinter Review
alternate solution w/o the additional strlen 24 years ago Darin Fisher 1.64 KB, patch	jesup : review+ dougt : superreview+	Details \| Diff \| Splinter Review

Ben Bucksch (:BenB)

Reporter

Description

•

24 years ago

Reproduction: 1. Read plaintext email with "c:foo" in content Actual result: It's linked to <c:foo> (which means, given the recognizer code, that Necko considers "c:foo" to be a valid URL, which is of course wrong). Expected result:Only valid and usable URI schemes are considered valid.

Assignee

Comment 1

•

24 years ago

Nice example ... :-) I guess this depends on the recognizer code. The urlparser will make this: urltest -std c:foo c:foo c,,,foo,-1,/,,,,,,c://foo/ which is somewhat legal, since the parser does not know which schemes are vaild. A creation of a real url would give these results: urltest c:foo c:foo Can not create URL So it seems that the code that does this recognition inside mailnews has to do something more than just call the parser, at least, if not creating the full url, it has to check for a valid protocolhandler. This can be a performance hit. Where is the recognizer code located? I will take a look.

Ben Bucksch (:BenB)

Reporter

Comment 2

•

24 years ago

> A creation of a real url That's what I do: rv = mIOService->NewURI(specStr, nsnull, getter_AddRefs(uri)); <http://lxr.mozilla.org/seamonkey/source/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp#380> > would give these results: [...] Can not create URL It doesn't, which is the bug.

Assignee

Comment 3

•

24 years ago

This is really strange. I used the exact same codelayout inside urltest.cpp: result = pService->NewURI(i_pURL, nsnull, getter_AddRefs(pURL)); which returns the results as seen above. I will dig into this.

Comment 4

•

24 years ago

See also bug 104693 (and the C:\\ example)

Comment 5

•

24 years ago

this is a dup of bug 104693. Punctuation liks dots and colons that don't have space before and after are treated like http:// URL's

Assignee

Comment 6

•

24 years ago

I will be gone for two days, can't work on it now, but there seems a lot of coverage on this stuff. This seems to be a recent change, but going through changes in netwerk over the last week nothing comes to mind except the str?cmp changes by alecf. Or maybe some caching of protocol handlers gone bad ...

Ben Bucksch (:BenB)

Reporter

Comment 7

•

24 years ago

> going through changes in netwerk over the last week nothing comes to mind > except the str?cmp changes by alecf. Same with the converter.

Gervase Markham [:gerv]

Comment 8

•

24 years ago

This isn't a dupe of the recent regression; it's been this way since 2001-09-03 at least. Gerv

Assignee

Comment 9

•

24 years ago

It seems we call in these cases a default protocol handler which calls the os to figure out if it can handle the url. Somehow, I don't know why yet, "c" is considered a protocol that can be executed by (at least) linux. As a consequence the url can be created ...

Assignee

Comment 10

•

24 years ago

No, it's GetCachedProtocolHandler in nsIOService.cpp. It mistakes c for the chrome protocol and returns the chrome handler.

Assignee

Comment 11

•

24 years ago

This funny thing was caused by rjesup's checkin to nsIOService.cpp (1.104) http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&file=nsIOService.cpp&root=/cvsroot&subdir=mozilla/netwerk/base/src&command=DIFF_FRAMESET&rev1=1.103&rev2=1.104 protocol "c" is found to be "chrome" because of the strncasecmp with length 1.

Assignee

Comment 12

•

24 years ago

Attached patch patch to fix the problem ... have to check for the length too ... (obsolete) — Details — Splinter Review

Assignee

Comment 13

•

24 years ago

Attached patch this is better ... (obsolete) — Details — Splinter Review

Assignee

Updated

•

24 years ago

Attachment #54241 - Attachment is obsolete: true

Assignee

Comment 14

•

24 years ago

And now is clear why urltest worked ... there was no chrome url to cache ...

Randell Jesup [:jesup] (needinfo me)

Comment 15

•

24 years ago

Comment on attachment 54246 [details] [diff] [review] this is better ... Doh! r=rjesup@wgate.com

Attachment #54246 - Flags: review+

Comment 16

•

24 years ago

Attached patch alternate solution w/o the additional strlen — Details — Splinter Review

Assignee

Comment 17

•

24 years ago

Comment on attachment 54502 [details] [diff] [review] alternate solution w/o the additional strlen gScheme[i][len] might peek into some memory that is somewhere else if you enter a very long scheme to test with. I can't think of a case where this check fails but some boundary checking code might get upset.

Comment 18

•

24 years ago

actually, i believe it is ok. consider these 3 cases: 1) len > strlen(gScheme[i]) and gScheme[i] is a substring of scheme: strncasecmp would return non-zero, since gScheme[i][k] would equal '\0' for some k < len. 2) len < strlen(gScheme[i]) and scheme is a substring of gScheme[i]: strncasecmp would return zero, but gScheme[i][len] would be non-zero. 3) len == strlen(gScheme[i]) and scheme is equal to gScheme[i]: strncasecmp would return zero, and gScheme[i][len] would be zero.

Randell Jesup [:jesup] (needinfo me)

Comment 19

•

24 years ago

Comment on attachment 54502 [details] [diff] [review] alternate solution w/o the additional strlen r=rjesup@wgate.com I agree with darin's analysis of the latest patch

Attachment #54502 - Flags: review+

Assignee

Comment 20

•

24 years ago

I also believe the patch works fine, it's just that for example with a test scheme of "ccccccccccccccccccccccccccccccccccccc" and a list of cached hadlers like "http","chrome","file" we would peek with gScheme[i][len] into some memory that is not allocated by the gScheme array. If you both think that is okay, I will accept that.

Randell Jesup [:jesup] (needinfo me)

Comment 21

•

24 years ago

cccccccccccccccccccccccccccccccc won't cause a past-end reference. Looking at it again, chromeccccccccccccccccccccccccc will, however. While this is 99% likely to be "safe" (not cause a page fault), it's not 100% - if the linker puts the string near the end of a page it could fault. Also, even worse, gScheme[i][len] could be a \0 in ANOTHER string (or other data-section constant). So this is inherently wrong. I retract my r= for the no-strlen patch. If you want to avoid the strlen you should encode an array of length of the values in gScheme (gSchemeLen[]). Otherwise do the strlen - the data will be in L1 cache, and the longest scheme is about 6 characters, so the hit is minimal (possible cache miss for strlen's code if the compiler doesn't inline it).

Randell Jesup [:jesup] (needinfo me)

Updated

•

24 years ago

Attachment #54502 - Flags: review+

Randell Jesup [:jesup] (needinfo me)

Comment 22

•

24 years ago

Comment on attachment 54502 [details] [diff] [review] alternate solution w/o the additional strlen My apologies - it's too early in the morning. I looked at it one more time, and realized that the strncasecmp uses len. Since we don't look at gSchemes[i][len] unless strncasecmp returns 0, we know that the scheme and gScheme[i] are the same for len characters. Thus we know that strlen(gScheme[i]) >= len, so we can look at gScheme[i][len]. Mea Culpa. I need caffiene

Attachment #54502 - Flags: review+

Assignee

Comment 23

•

24 years ago

Okay, I didn't took the execution order into account. I will go hunting for sr=, but it will be weekend before I check this in, I'm away for three days.

Comment 24

•

24 years ago

yes, the code working correctly depends on execution order, and rjesup explained it perfectly.

Assignee

Updated

•

24 years ago

Attachment #54246 - Attachment is obsolete: true

Doug Turner (:dougt)

Comment 25

•

24 years ago

Comment on attachment 54502 [details] [diff] [review] alternate solution w/o the additional strlen okay by me

Attachment #54502 - Flags: superreview+

Randell Jesup [:jesup] (needinfo me)

Comment 26

•

24 years ago

So we have r=/sr=, let's check in

Keywords: patch

Target Milestone: --- → mozilla0.9.6

Assignee

Comment 27

•

24 years ago

fix checked in ...

Status: NEW → RESOLVED

Closed: 24 years ago

Resolution: --- → FIXED

Comment 28

•

23 years ago

Using build 20020322 on winxp, mac 9.1 and linux and the original scenario in plain text compose and html compose, this is fixed. Verified

Status: RESOLVED → VERIFIED

You need to log in before you can comment on or make changes to this bug.