User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.106 Safari/535.2

Steps to reproduce:
Received a (plain text) email containing a URL like this: http://osm.org/go/0EQH0U0iC-- containing two dashes at the end. (These are generated from OpenStreetMap's short link generator, but presumably the same applies to any URL that happens to have a dash at the end.)

Actual results:
Only http://osm.org/go/0EQH0U0iC was recognized as the link. (Actually, in this case following the incorrect link does work, but that's because OSM is interpreting the shorter link the same as the one containing the two dashes, but in general it wouldn't.)

Expected results:
All of http://osm.org/go/0EQH0U0iC-- should have been included in the link, i.e. the pattern matching for links in plain text emails needs some work.
(I see bugzilla also gets its pattern matching wrong in the bug report :-) )
(In reply to David Earl from comment #1)
> (I see bugzilla also gets its pattern matching wrong in the bug report :-) )

True, but this is not a bug. See http://lists.w3.org/Archives/Public/www-validator/2001Jul/0107.html
That is true of *hostnames* (RFC 1034, sec 3.5) but seemingly not of the path component of a URL (RFC 1738, sec 5).
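To make the distinction concrete, here is a rough sketch in Python of the two grammars as I read them. The names `hostname_ok` and `path_ok` are made up for illustration, and the regular expressions are my own transcription of the label rule in RFC 1034 sec 3.5 and the `hsegment` rule in RFC 1738 sec 5, not anything taken from Thunderbird or Bugzilla.

```python
import re
from urllib.parse import urlsplit

# RFC 1034 sec 3.5 label: starts with a letter, ends with a letter or digit,
# hyphens allowed only in the interior. (Transcribed by hand; an assumption.)
LABEL = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?")

# RFC 1738 sec 5 hsegment: uchar plus ; : @ & =, where "-" is one of the
# "safe" characters and may appear anywhere, including at the end.
HSEGMENT = re.compile(r"(?:[A-Za-z0-9$\-_.+!*'(),;:@&=]|%[0-9A-Fa-f]{2})*")


def hostname_ok(host):
    """True if every dot-separated label fits the RFC 1034 label rule."""
    return all(LABEL.fullmatch(label) for label in host.split("."))


def path_ok(path):
    """True if every slash-separated segment fits the RFC 1738 hsegment rule."""
    if path.startswith("/"):
        path = path[1:]
    return all(HSEGMENT.fullmatch(segment) for segment in path.split("/"))


parts = urlsplit("http://osm.org/go/0EQH0U0iC--")
print(hostname_ok(parts.hostname), path_ok(parts.path))
```

Under this reading, a label such as `osm-` is rejected as a hostname part, while the path `/go/0EQH0U0iC--` is accepted with its trailing dashes.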
Oh, and all browsers and web servers handle them, so taking a purist attitude isn't helpful.
(In reply to David Earl from comment #4)
> Oh, and all browsers and web servers handle them, so taking a purist
> attitude isn't helpful.

Please read bug 414579.
I did. Thunderbird doesn't deal with URLs split across multiple lines either. The response to that report seems to be 'it's too hard'. And as for 'avoid URLs with trailing dashes', what an absurd piece of nonsense. Think it through! The problem is receiving URLs in email - from other people. As recipient you don't have control over this.

I appreciate that there is some ambiguity over double dashes and line ends, but if you aren't going to handle URLs split over multiple lines then I see no harm in including trailing dashes in the pattern match: if that is a complete URL it will be correct, and if it isn't you'll get it wrong either way. Of course it would be better to handle URLs split over multiple lines too - indeed there are add-ons that specifically address this deficiency in Thunderbird.

This is the second bug I've reported in the last month where the response has been "we want to hide in an ivory tower". Please, Thunderbird needs some real world pragmatism.
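A small sketch of that argument, in Python. Both patterns are hypothetical stand-ins written for this comment (they are not Thunderbird's actual link detection, which I have not looked at); `STRICT` refuses to end a link on a dash, `LENIENT` allows it, and both still drop ordinary sentence punctuation.

```python
import re

# Hypothetical pattern: a link may not end in a dash or sentence punctuation.
STRICT = re.compile(r'https?://[^\s<>"]*[^\s<>"\-.,;:!?)]')

# Hypothetical pattern: same, except a trailing dash is kept.
LENIENT = re.compile(r'https?://[^\s<>"]*[^\s<>".,;:!?)]')


def find_links(pattern, text):
    """Return every substring of text that the given pattern treats as a link."""
    return pattern.findall(text)


complete = "See http://osm.org/go/0EQH0U0iC-- for the map."
wrapped = "See http://example.com/some-\nlong-path for details."

for name, pattern in (("strict", STRICT), ("lenient", LENIENT)):
    print(name, find_links(pattern, complete), find_links(pattern, wrapped))
```

For the complete URL only the lenient pattern returns the full link. For the URL wrapped across two lines, neither pattern returns the real link, since neither joins lines, so keeping the dash costs nothing there.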
The people I was going to consider copying have already commented in bug 414579, so unless they have changed their minds the response will be the same.