Can't use colons (:s) in link URLs

RESOLVED FIXED

Status

Webtools Graveyard
Litmus
RESOLVED FIXED
12 years ago
2 years ago

People

(Reporter: Adam Guthrie, Assigned: zach)

Tracking

Details

(URL)

(Reporter)

Description

12 years ago
Seemingly links with colons in their URLs will not get linkified. 

Something like "http://en.wikipedia.org/wiki/Special:Statistics" will not get linkified no matter how I try to link it -- be it "easy" linking or with plain HTML.
(Reporter)

Updated

12 years ago
Summary: Can't use colons (:s) in link → Can't use colons (:s) in link URLs

Comment 1

12 years ago
Where we tell it to, the templating system does some filtering prior to display, including trying to strip out "malformed" links. The HTML::StripScripts::Parser module has a very limited concept of what properly formed links are, i.e. only http(s):// links are considered valid. I've recently subclassed it so we can override the link-checking behavior to allow for things like ftp:// and mailto:// URLs, but there is obviously some work still to be done.

This is perhaps a larger issue of where we choose to filter content and not. IMO, we don't need to filter testcase data at all (aside from markdown syntax) since we implicitly trust the people (admins) who can change that data anyway. The preview system helps with this too now. We likely only need to be draconian in our filtering of test result data, specifically comments and anything that can come in from the internet-at-large.
(Assignee)

Comment 2

12 years ago
I'm wondering how best to strike a balance between security and usability here. Certainly, testcase data need not be subject to the same strict scrutiny as user-generated comments, but displaying unfiltered data from the database still concerns me. Right now, Litmus only has a very general notion of "admins," and anyone who is an admin is considered trusted. In the future, we are presumably going to want more granularity in this regard (i.e. some people get to be admins for Calendar only, only these people get to edit and disable other users, etc...). We may also have testcases or test results that are considered confidential and a privilege system that respects and enforces this. Serving up arbitrary html from testcase data under the litmus.mozilla.org domain opens us up to cross-site scripting, cookie stealing, and privilege escalation attacks. 

I see a couple of solutions: 
1. Use a separate domain for testcase content - Bugzilla hopes to move to this model for attachments and host attachment data at attachments.bugzilla.mozilla.org, keeping Bugzilla cookies hidden from any scripts embedded in attachments. On the downside, this is clunky, hard to maintain, and requires sysadmin by in. 

2. Open up permissions as much as possible for testcase data (tweaking the StripScripts config to allow more tags). We should look at what's being denied now and try to support these cases. Perhaps we should log filtered messages, or display an error message rather than silently filtering?

3. Go wide open and allow any arbitrary HTML. I don't think this is a good idea for all the reasons listed above. 

In this particular case, it seems like Markdown is just being buggy and doesn't consider a colon a valid URL character. I'll look into the regexps and see if this is the case.

Comment 3

12 years ago
(In reply to comment #2)
> I see a couple of solutions: 
> 1. Use a separate domain for testcase content - Bugzilla hopes to move to this
> model for attachments and host attachment data at
> attachments.bugzilla.mozilla.org, keeping Bugzilla cookies hidden from any
> scripts embedded in attachments. On the downside, this is clunky, hard to
> maintain, and requires sysadmin by in. 

That seems like overkill.
 
> 2. Open up permissions as much as possible for testcase data (tweaking the
> StripScripts config to allow more tags). We should look at what's being denied
> now and try to support these cases. Perhaps we should log filtered messages, or
> display an error message rather than silently filtering?

This is the easiet and quickest solution to this particular problem: fix Litmus::StripScripts to allow colons in URLs. Any links to wikis will not be marked up otherwise.
 
> 3. Go wide open and allow any arbitrary HTML. I don't think this is a good idea
> for all the reasons listed above. 

I don't see the problem here. Everyone with admin access can do much worse things to our system than displaying malformed HTML. If you don't trust your admins, you've got bigger problems than bad links.

We're also not returning random data from the database. There are a limited number of vectors by which testcase data can get into the database all of which involve gating through a trusted user. By all means, we should continue to filter any data coming from testers, but I don't think we can rely on HTML template filtering as a linchpin of our security policy.

What we should have is an audit_trail table where we can track admin actions and changes, possibly even with the ability to revert.

> In this particular case, it seems like Markdown is just being buggy and doesn't
> consider a colon a valid URL character. I'll look into the regexps and see if
> this is the case.

Adam tried it both ways (markdown and regular HTML) though. The problem with the testdata filter, and specifically with href validiity checking. We are also chaining filters, so if the testdata filter is barfing on colons, is doesn't matter what the markdown filter is doing.

Zach: are you taking this bug then? Please re-assign if so.
(Assignee)

Comment 4

12 years ago
I've added colons to the regular expression of allowed characters in URLs. This should presumably get migrated upstream to the StripScripts developers, but this should work for now. 

We can revisit the issue of filtering other content as it comes up. 
Assignee: ccooper → zach
(Assignee)

Updated

12 years ago
Status: NEW → RESOLVED
Last Resolved: 12 years ago
Resolution: --- → FIXED
Product: Webtools → Webtools Graveyard
You need to log in before you can comment on or make changes to this bug.