183111 - [mozTXTtoHTMLConv] "be:" gets converted to an URL link

Reporter

Description

•

22 years ago

"be:" (w/out the quotes) gets converted to an URL link

Trunk 2002-11-25-12 build, Windows XP.  HTML compose.

1.  Compose new msg
2.  Type something like this in the body:

The date will be:  December 2

3.  Send it as rich text to yourself
4.  Get the msg
5.  Notice that the be: word is set to URL of be:

be: shouldn't do anything.

Matthias Versen [:Matti]

Comment 1

•

22 years ago

blah: etc matches as URL scheme and it will be converted.

invalid (?)

Matthias Versen [:Matti]

Comment 2

•

22 years ago

*** Bug 183489 has been marked as a duplicate of this bug. ***

lchiang

Reporter

Comment 3

•

22 years ago

From a user's point of view, it makes the email look strange after being sent. 
There are lots of times when I type : after a word in email. Examples:

LisaC:  Can you please make some food?

or

The date will be:  December 25, 2002

etc.

Keywords: nsbeta1

Ray Trent

Comment 4

•

22 years ago

In fact, lots of different things are showing up as links. One particularly
annoying one is Outlook meeting requests, which include dates, as in: "1:00
PM-2:00 PM (GMT-08:00)". Both the "PM-2:00" and "GMT-08:00" are linkified.

The truly inane thing is that if you click on these, you *immediately* get
"protocol not known", so there's no possible reason for these things to be
links, and the tool *knows it*.

Matthias Versen [:Matti]

Comment 5

•

22 years ago

*** Bug 185836 has been marked as a duplicate of this bug. ***

James Paige

Comment 6

•

22 years ago

I have a cron job that regularly mails me the output of an unzip operation, and
all the "Defl:X" text gets converted to a link.

 Length   Method    Size  Ratio   Date   Time   CRC-32    Name
--------  ------  ------- -----   ----   ----   ------    ----
  254452  Defl:X   207869  18%  12-18-02 04:16  6f4a4809  clean.dat
     110  Stored      110   0%  12-18-02 04:16  23842225  file_id.diz
  303167  Defl:X   269684  11%  12-18-02 04:16  3441bbb6  names.dat
    1209  Defl:X      571  53%  12-18-02 04:16  9393b561  packing.lst
     708  Defl:X      364  49%  12-18-02 04:16  5b0e1ce7  pkgdesc.ini
   12169  Defl:X     4543  63%  12-18-02 04:16  8596e468  reseller.txt
 2186479  Defl:X  1942861  11%  12-18-02 04:16  6378d76f  scan.dat
   51200  Defl:X    24211  53%  12-18-02 04:16  9858d0f3  VALIDATE.EXE
   42048  Defl:X    12968  69%  12-18-02 04:16  c75efa3b  readme.txt
   12124  Defl:X     2472  80%  10-15-98 03:20  dc749365  internet.dat
--------          -------  ---                            -------
 2863666          2465653  14%                            10 files

I can understand why only converting common protocols such as http: and ftp:
would exclude some valid links, but nowhere near as many bogus links as get
created by the current anything: link scheme

Benedikt Kantus

Comment 7

•

22 years ago

I am using trunk build 2002122004 on Win 98 and absolutely everything containing
a : will be linkified, even if it is just something like ...:.

It's not just some characters or some combinations, but really everything with a :

Matthias Versen [:Matti]

Comment 8

•

22 years ago

*** Bug 187505 has been marked as a duplicate of this bug. ***

timeless

Comment 9

•

22 years ago

hey, i like having stack traces that look like links :)

Assignee: ducarroz → ben.bucksch

Component: Mail Back End → MIME

Ben Bucksch (:BenB)

Comment 10

•

22 years ago

> One particularly
> annoying one is Outlook meeting requests, which include dates, as in: "1:00
> PM-2:00 PM (GMT-08:00)". Both the "PM-2:00" and "GMT-08:00" are linkified.

> The truly inane thing is that if you click on these, you *immediately* get
> "protocol not known", so there's no possible reason for these things to be
> links, and the tool *knows it*.

I believe this statement is wrong. My converter (mozTXTToHTMLConv, used in
mailnews and AIM) only linkifies URLs which are identified as valid by necko.
This includes (or did include) a check for the URL scheme (the part before the
colons).

Because Mozilla also checks the Windows registry (or comparable) for URL
schemes, the results will heavily depend upon your system.

If your statements are right, this is either a regression or I see no real way
to fix this. "thepalace:" can be a valid URL. But if really the "be:" in " be: "
gets linked, then there should be done something about it. I find it strange,
though, that this bug has never been reported before, although this converter
works in Mozilla since over 3 years now.

Keywords: qawanted

Benedikt Kantus

Comment 11

•

22 years ago

Attached image Screenshot showing that not everything is linkified — Details

As you can see from the screenshot, not everything will be linkified, though I
have not yet discovered which ones will and which ones will not.
See additional comment.

Benedikt Kantus

Comment 12

•

22 years ago

I took the screenshot that is attached to comment #1 using the 2003010205 trunk
build on Win2k.
It shows a mail I received with Mozilla showing two different behaviours: On the
upper image, all words containig a colon will be linkified. On the lower image
there are words that were not linkified.

This happened within the _same_ email and with the _same_ build.


Ben, comment #10 doesn't clearly say that you can reproduce this bug with your
build. Is this so?

lchiang

Reporter

Comment 13

•

22 years ago

This is a regression which I didn't see on the 1.0x branch builds.
Gayatri - can you assign to someone to see if s/he can find a pattern?

Samir Gehani

Comment 14

•

22 years ago

Mail triage team: nsbeta1+/adt2

Keywords: nsbeta1 → nsbeta1+

Whiteboard: [adt2]

Ben Bucksch (:BenB)

Comment 15

•

22 years ago

the following call checks, if the URL is valid:
rv = mIOService->NewURI(NS_ConvertUCS2toUTF8(txtURL), nsnull, nsnull,
getter_AddRefs(uri));
<http://lxr.mozilla.org/seamonkey/source/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp>

I guess that the implementation of NewURI changed, that somebody "optimized" the
"performance" (really speed) of the function and limited the functionality of it
while doing so, so that it no longer checks, if the URL scheme is valid. If
that's the case, this would be a bad move without checking consumers, and I
think that there should be a version of the function which does all checks.

The code was written specifically so that it would have no knowledge of
different URL schemes and instead to delegate all checks to netwerk's generic
URI functions. In fact, that's a major reason for why I wrote this class
originally. So, hardcoding URL scheme knowledge into the converter is not
acceptable.

If I happen to be right, then I'd suggest to create a new function like
NewURIWithChecks, doing what the old implementation of NewURI did and maybe even
more checks, and to use that from the converter.

R.K.Aa.

Comment 16

•

22 years ago

*** Bug 189971 has been marked as a duplicate of this bug. ***

Cavin Song

Assignee

Comment 17

•

22 years ago

Reassigning.

Assignee: ben.bucksch → cavin

Matthias Versen [:Matti]

Comment 18

•

22 years ago

*** Bug 190831 has been marked as a duplicate of this bug. ***

Cavin Song

Assignee

Comment 19

•

22 years ago

The problem occurs only when you put more than one spaces after a colon char and 
is caused by the fact that the body text (in a compose window) returned from the 
editor API (ie, mEditor->OutputToString()) has changed. 

For a string like the following in a compose window:

  be:^^   (^^ represents 2 spaces)

mEditor->OutputToString() would return a string like:

 be:^\A0

meaning that a space followed by a char whose hex code is 'A0'. But now it 
returns:

  be:\A0^

meaning that a char whose hex code is 'A0' followed by a space .

Because of this change it breaks the code in mozTXTToHTMLConv::ScanTXT() where 
we check to see if we need to find an url when we hit char ':'.

Ray Trent

Comment 20

•

22 years ago

Comment 19 may be correct for the original report of this issue, but not all of
the subsequent comments refer to strings with spaces being linkified. In fact,
many of them have nothing to do with messages composed in Mozilla at all
(comments 4 and 6 are examples to the contrary).

Is this because the double space problem isn't the root cause, or do we need 2
bugs here?

Cavin Song

Assignee

Comment 21

•

22 years ago

The problem with "PM-2:00" is caused by the same reason that Ben mentioned in 
Comment #15 (ie, issue with NewURI() call).

(not reading, please use seth@sspitzer.org instead)

Comment 22

•

22 years ago

summarizing a conversation I had with cavin about this bug.

nsIIOService is frozen see 
http://lxr.mozilla.org/mozilla/source/netwerk/base/public/nsIIOService.idl#54

we should see if any other consumers were expecting the same behaviour as 
mozTXTToHTMLConv::CheckURLAndCreateHTML() was.

if not, and only CheckURLAndCreateHTML() was expecting this, we should consider 
moving the functionality into CheckURLAndCreateHTML() if possible.

cavin tells me that the old way used to call HaveProtocolHandler(), but we now 
do that on NewChannel().

I suggested moving the guts of HaveProtocolHandler() to CheckURLAndCreateHTML() 
to fix this bug.

here are the guts:

309     nsCAutoString scheme;
310     aURI->GetScheme(scheme);
311     nsCOMPtr<nsIExternalProtocolService> extProtService (do_GetService
(NS_EXTERNALPROTOCOLSERVICE_CONTRACTID));
312     extProtService->ExternalProtocolHandlerExists(scheme.get(), 
&haveHandler);

but we should make sure that other consumers of NewURI() don't need this as 
well, and run it by darin to see what he thinks.

(not reading, please use seth@sspitzer.org instead)

Updated

•

22 years ago

Keywords: regression

(not reading, please use seth@sspitzer.org instead)

Comment 23

•

22 years ago

cavin determined that this was caused by the checkin for bug #176904

Boris Zbarsky [:bzbarsky]

Comment 24

•

22 years ago

Hm... So it sounds like you just take a string and try to create a URI out of it
and if that succeeds you linkify it?

The fact that that worked at all was a bug in the external helper app service
(see bug 176904 comment 6 and bug 176904 comment 7.

My apologies for not doing a thorough enough check on the "make sure no one
interprets that as being synonymous with having an external protocol handler"
part.  :(

Ben Bucksch (:BenB)

Comment 25

•

22 years ago

> So it sounds like you just take a string and try to create a URI out of it
> and if that succeeds you linkify it?

Yes, and exactly that was suggested by somebody from the netwerk team (I don't
remember who, but it's a very onld post on .netlib) *prior* to the creation of
the converter - this idea to move all validity checking of the URL into netwerk
made a lot of sense to me and is the idea on which (this part of) the converter
is based.

The converter just determines *candidates* for possibly valid URIs out of a raw
text, calls a netwerk function to see, if the URI is indeed valid, and if so,
linkyfies it. This netwerk function happened to be NewURI (that NewURI checks
the validity of the URI were the semantics of the interface, as it has been told
to me - so, no, this was not a "bug"). The checks are *not* supposed to be
limited to the URI scheme - the *whole* URI is supposed to be checked for
validity. It was just that our implementations of the checks were incomplete for
some URI types (e.g. mailto:), and planned to be filled in later (long ago by
now). IIRC, "standard uris" (http, ftp etc.) did indeed check for syntactical
validity of hostname, login, path, query etc..

In other words, I think it's a bad idea to move the checks into the converter -
they are extensive and of general nature, so there should be *some* function
defined in netwerk which thoroughly checks the validity of a given URI string,
and we should use that in the converter. I don't care, if it's NewURI or
something else, as long as it's a stable interface in core netwerk.


As for the A0, that should be trivial to fix. The converter has a set of known
URI delimiters (a different one for beginning and end and for different ways or
writing URIs, see <http://www.bucksch.org/1/projects/mozilla/16507> - QA might
look at <http://www.mozilla.org/quality/mailnews/tests/mn-html-to-txt.txt>), so
just add it to the appropriate sets of delimiters.

Summary: "be:" gets converted to an URL link → [mozTXTtoHTMLConv] "be:" gets converted to an URL link

Ben Bucksch (:BenB)

Comment 26

•

22 years ago

Oh, and while you are at it, please remove the following lines form the code:
  // it would be faster if we could just check to see if there is a protocol
  // handler for the url and return instead of actually trying to create a url.
Somebody added them, and - as explained above - this is a bad idea.

Boris Zbarsky [:bzbarsky]

Comment 27

•

22 years ago

Ben, a uri does _not_ become invalid simply because the scheme is unknown to us,
as far as necko is concerned.  We have embeddors who need to be able to get URI
objects for their private URI schemes without registering them with the OS at
large...

Perhaps we could have a function somewhere in necko (I hesitate to suggest
nsIIOService2...) that would perform the check needed here?  That could be used
in the uri fixup code too...

Ben Bucksch (:BenB)

Comment 28

•

22 years ago

> a uri does _not_ become invalid simply because the scheme is unknown to us,
> as far as necko is concerned.

True, but it's invalid as far as Gecko is concerned - Gecko cannot load it. If
Gecko cannot load it, it doesn't make much sense to link it.
At least that's the way the world was when I wrote the converter.

> We have embeddors who need to be able to get URI
> objects for their private URI schemes without registering them with the OS at
> large...

I'd let them register their URI schemes with Necko (and offer facilities to do
that, if necessary). Maybe even ask them to create URI implementations.

Boris Zbarsky [:bzbarsky]

Comment 29

•

22 years ago

> True, but it's invalid as far as Gecko is concerned - Gecko cannot load it.  If
> Gecko cannot load it, it doesn't make much sense to link it.

It does if the embedding app around Gecko can load it.  Which is true in many
cases, from what I see in the necko bugs.....

There are facilities to register protocol handlers with necko.  The point is
that they want to handle the URI completely outside of Gecko (eg pass it off to
a sound player).  But for that, they need a notification that the URI was
clicked on.  That was the motivating cause for the change in bug 176904 -- it
made us consistently fire such notifications.  Further, gecko internally needs
this functionality so that when someone clicks on a link like:

<a href="unknownprotocol: foo">link</a>

We put up an alert of some sort instead of just sitting there and doing nothing.

I understand what the situation was when you wrote the converter.  The point is
that we should try to find a decent solution that helps both you and other people.

Ben Bucksch (:BenB)

Comment 30

•

22 years ago

Does the embedding app wants its internal URLs linked in e.g. plaintext
documents loaded in Gecko? If so, then there is no way around for the embedding
app to tell us about its URL schemes. Otherwise, we'd have to link *everything*
that complies to the generic URI syntax, which includes "be:", what happened,
what created this bug.

Ben Bucksch (:BenB)

Comment 31

•

22 years ago

BTW: Compare RFC2717.

Boris Zbarsky [:bzbarsky]

Comment 32

•

22 years ago

> Does the embedding app wants its internal URLs linked in e.g. plaintext
> documents loaded in Gecko?

Why would you be loading plaintext documents?  Don't necessarily think browser
here, think "help system" or some other fairly specialized system that needs an
HTML renderer....

RFC 2717 does not apply if none of the data involved ever goes out over the
network and is instead completely internal to the app in question.

Cavin Song

Assignee

Comment 33

•

22 years ago

Attached patch Proposed fix, v1 (obsolete) — Details — Splinter Review

1. Treat nbsp and space the same so that we don't fall into the code to call
FindURL().
2. Validate schemes so that we don't linkify invalid ones.

Cavin Song

Assignee

Updated

•

22 years ago

Attachment #113935 - Flags: superreview?(sspitzer)

Ben Bucksch (:BenB)

Comment 34

•

22 years ago

Comment on attachment 113935 [details] [diff] [review]
Proposed fix, v1

Why did I spend 2 hours yesterday to explain my earlier comment, when I just
get ignored? (Or did I misunderstand something?)

1. The *whole* URI must be validated, not just the scheme. Does NewURI still do
that?
2. I would really like not to see scheme checks in that code. Please create a
core Necko function for that.

3. I don't think that fix for A0 will work. As you can see in the like you
modifed, there is a comment "//performance increase". This is redundant code,
only added to speed up the processing. The same is checked later again.
Your fix might happen to work for " be:\A0 ", but probably won't for " PM-2:00
" and " http://bar\A0 " probably fails as will (\A0 is probably included in the
link).
See calls to nsCRT:IsAsciiSpace(). I think you either want to modify that
function or create an internal function for the same purpose and use that.

As the original developer of that code, review- .

Attachment #113935 - Flags: review-

Cavin Song

Assignee

Comment 35

•

22 years ago

Hmm, I did not see the new comments before I posted the patch (did not expect
comments over the weekend). In any case, your issues will be addressed.

Ben Bucksch (:BenB)

Comment 36

•

22 years ago

Oh, I see. Thank you.

lchiang

Reporter

Updated

•

22 years ago

QA Contact: gayatri → esther

Ben Bucksch (:BenB)

Comment 37

•

22 years ago

> Yes, and exactly that was suggested by somebody from the netwerk team (I don't
> remember who, but it's a very onld post on .netlib) *prior* to the creation of
> the converter

I looked up the post, it's
<nntp://news.mozilla.org/382F2593.2532605D@netscape.com> (.netlib, subject
"recognizing URLs in text", 14 Nov 99) and the subthread following it.

Warren Harris wrote there:
"Yes, I think you/they should look for the pattern suggested above, and then try
calling NS_NewURI. This should only succeed if the protocol exists, and the
string is a syntactically valid URL."

I answered
"I like this idea. Since the main purpose of nsMimeURLUtils is this
recognition, I started rewriting the class."
This converter was the result.

Warren wrote later again
"First, recognize characters followed by a colon (as a protocol name -- not
specific protocol names, just anything that follows this pattern), and take the
text up to the next white space and hand it to NS_NewURI. If that succeeds, make
a link out of it."

NewURI clearly was supposed to not only check, if the URL scheme is known, but
also the syntax of the URI. So, the converter worked exactly as it was supposed
to, until Necko broke :-(. That's why I suggested to create a new function in
Necko to provide the functionality of the former NewURI.

Jsut checking schemes is not good enough, otherwise "/()%78%!!)@(%()%" will
probably be recognized as email address :(.

Jo Hermans

Comment 38

•

22 years ago

*** Bug 192674 has been marked as a duplicate of this bug. ***

Darin Fisher

Comment 39

•

22 years ago

> NewURI clearly was supposed to not only check, if the URL scheme is known, but
> also the syntax of the URI. So, the converter worked exactly as it was 
> supposed to, until Necko broke :-(. That's why I suggested to create a new 
> function in Necko to provide the functionality of the former NewURI.

Ben,

Boris is saying that Warren's original design doesn't hold up well.  It was a
nice idea, because it certainly makes the problem of solving this bug simpler,
but it causes other problems.  We decided that we needed NewURI to always return
an object, provided the URI string isn't malformed (i.e., the URI string must
have the structure of an URI for the specified scheme, and for unknown
protocols, the only requirement is an RFC-2396-compliant-scheme followed by a
colon and then whatever junk).

Finally, note that the external protocol handler only checks the existance of a
registered handler for the URL scheme.  It does not, and never did, check the
body of the URL string.

Here's a suggestion:

How about we introduce the following interface in netwerk/base/public:

  interface nsIExternalProtocolHandler : nsIProtocolHandler
  {
     boolean handlerExists(in string scheme);
  };

Then mozTXTToHTMLConv could do the following:

  boolean shouldLinkify(in string url)
  {
     scheme = ioservice.extractScheme(url)
     handler = ioservice.getProtocolHandler(scheme);

     externalHandler = handler.QI(nsIExternalProtocolHandler);
     if (!externalHandler)
       return true; // handler is built-in, linkify it!

     return externalHandler.handlerExists(scheme);
  }

nsExternalProtocolHandler would then implement nsIExternalProtocolHandler.

Ben Bucksch (:BenB)

Comment 40

•

22 years ago

> Finally, note that the external protocol handler only checks the existance of a
> registered handler for the URL scheme.  It does not, and never did, check the
> body of the URL string.

Yes, sure, it can't, because it can represent arbitary URLs and doesn't know how
they are structured, apart from RFC2396.
For http, ftp, mailto etc. URLs however, we do have a pretty good idea of how
they should look like, and we do (or should) have implementations being able to
check conformity to that.

> Here's a suggestion:
> How about we introduce the following interface in netwerk/base/public:
>      boolean handlerExists(in string scheme);

As outlined above, simply checking the scheme is not good enough for the
purposes of the converter. And checking, if an email address or http URL is
stictly syntactically valid is really not the business of the converter IMHO, it
just requires that functionality.
I would have envisaged something vaguely like the following:
interface nsIURI2 : nsIURI
{
  boolean syntaxOK;
}
or
interface nsIIOSerivce2
{
  boolean isValidURI(in string uriToBeChecked);
}
This would end up invoking syntax checking function in the nsIURI
implementations of the relevant protocol implementation.

(not reading, please use seth@sspitzer.org instead)

Comment 41

•

22 years ago

for http:// and ftp:// etc to linkify, we'd need to make those protocol 
handlers also implement nsIExternalProtocolHandler, right?

ioservice.getProtocolHandler() would return those protocol handlers, and for 
something external (like goim:// or something handled by an external app)
ioservice.getProtocolHandler() would give us nsExternalProtocolHandler

here's a list of the protocol handlers:

nsAboutProtocolHandler
nsDataHandler
nsFileProtocolHandler
nsFtpProtocolHandler
nsGopherHandler
nsHttpHandler
nsJARProtocolHandler
nsKeywordProtocolHandler
nsResProtocolHandler
nsViewSourceHandler
nsExternalProtocolHandler

do we weally want to linkify all these? (I'm wondering about viewsource, res, 
and jar, for example)

handlerExists() makes me think we should be returning true, but maybe we should 
call it something else, so that its valid for some things that we don't want to 
linkify to return false?

(not reading, please use seth@sspitzer.org instead)

Comment 42

•

22 years ago

> here's a list of the protocol handlers:

that was a partial list, I forgot about the ones in mailnews.

speaking of which, we do want to linkify mailto: and news:, but we wouldn't 
want to linkify others (like some imap, and pop3 urls).

what I'm getting at is for protocols that we handle, we only linkify if we 
really intend for the user to click on them.

Boris Zbarsky [:bzbarsky]

Comment 43

•

22 years ago

> I forgot about the ones in mailnews

Not to mention javascript:, irc://, etc, etc.  Anyone can make up a protocol
handler and drop it into the build....

Ben Bucksch (:BenB)

Comment 44

•

22 years ago

> do we weally want to linkify all these?

I don't think stuff like jar: or imap: disturbs in practice. In fact, I see irc:
etc. (depending on if there is actually a handling client) as feature.

Darin Fisher

Comment 45

•

22 years ago

> For http, ftp, mailto etc. URLs however, we do have a pretty good idea of how
> they should look like, and we do (or should) have implementations being able 
> to check conformity to that.

Yes, the URL parsers do (to some extent) enforce conformity above and beyond the
mere existance of a scheme+':' ... and it is entirely protocol dependent.

> As outlined above, simply checking the scheme is not good enough for the
> purposes of the converter. And checking, if an email address or http URL is
> stictly syntactically valid is really not the business of the converter IMHO, 
> it just requires that functionality.

Prior to the fix for bug 176904, we would do the following:

  if scheme is builtin, then parse URI, fail if malformed.
  if scheme is external, then check if OS can handle the scheme.

The patch for bug 176904 only changed the case when the scheme is external. 
And, this bug is really only about that "regression" ... let's focus on that
first.  So, for the purposes of external protocol handlers, checking the scheme
is all you can do (in most cases).


> I would have envisaged something vaguely like the following:
> interface nsIURI2 : nsIURI
> {
>  boolean syntaxOK;
> }
> or
> interface nsIIOSerivce2
> {
>  boolean isValidURI(in string uriToBeChecked);
> }
> This would end up invoking syntax checking function in the nsIURI
> implementations of the relevant protocol implementation.

again, the syntax is checked on construction (because that's when we parse the
URI string).

(not reading, please use seth@sspitzer.org instead)

Comment 46

•

22 years ago

note, I didn't notice:

> return true; // handler is built-in, linkify it!

about irc:, we would want to linkify that one, since clicking on those links 
should launch the irc app.

so if we go with darin's original suggestion, all internal schemes would 
linkify.

I'd prefer something else for handlerExists().

I'd want to spin off a new bug about not linkifying everything that we handle, 
since I don't think we want that.  (and that can be a seperate discussion)

Darin Fisher

Comment 47

•

22 years ago

OK, so to address the problem of built-in protocol handlers not wanting to be
linkified, we could do a number of things.  We could introduce a protocol flag
(see nsIProtocolHandler::protocolFlags).  Or, we could change the name of
nsIExternalProtocolHandler to something any protocol handler could implement. 
However, i dislike this solution since it is quite bloaty.  We could also have a
simple white-list in the TXTtoHTML converter.

Ben Bucksch (:BenB)

Comment 48

•

22 years ago

Ah, so NewURI still does check the "body" (scheme-specific part) of the URI in
the case of protocols we have a protocol implementation for? That's great. Then,
we have no problem. (Assuming nobody has the idea to remove those checks :-(.)
Hardcoding the check for the external handlers would then be OK with me for now.

Avoiding to link certain URL schemes: I think that's unnecessary, it's
irrelevant in practice and just means more code. It's a different bug in any case.

(not reading, please use seth@sspitzer.org instead)

Comment 49

•

22 years ago

I realize this is another bug, but I want to make sure if we add a new 
interface we take future changes into account.

4.x used to have this concept of internal-vs-external urls, for security 
purposes.

certain urls the user could enter in the url bar, or click on, and certain ones 
they could not, but those urls could be run internal to the application.

how about instead of handlerExists(scheme) what about userAllowed(spec)

for the external protocol handler, we'd just see if the scheme was supported.

for internal, we'll allow the protocol handlers to determine if the url 
was "safe".

if userAllowed, we'd linkify.

(not reading, please use seth@sspitzer.org instead)

Comment 50

•

22 years ago

I wish I hadn't brought up the "don't linkify everything" issue, as it's not 
helping resolve this issue.

Since darin's new interface (nsIExternalProtocolHandler) would not be frozen, I 
can bring this up in another bug, and for now linkify everything that we handle 
internally.

now that I fully understand it, I think darin's proposal in 
http://bugzilla.mozilla.org/show_bug.cgi?id=183111#c39 makes sense, will not be 
a lot of bloat, and will address part of this bug.

benb, bz, cavin, do you agree?

Ben Bucksch (:BenB)

Comment 51

•

22 years ago

Seth, my comment 48 still stands. I don't know, why cavin's patch v1 (the uri
check part of it) wouldn't do the same as darin suggested (just without new
interface), but I don't care either :).

Darin Fisher

Comment 52

•

22 years ago

ben: because necko should not depend directly on exthandler =)

Boris Zbarsky [:bzbarsky]

Comment 53

•

22 years ago

Yeah, darin's approach seems reasonable... (and I _do_ apologize for all the
trouble I caused by not noticing this issue before I made my changes....)

(not reading, please use seth@sspitzer.org instead)

Comment 54

•

22 years ago

Comment on attachment 113935 [details] [diff] [review]
Proposed fix, v1

rejecting this patch.

cavin plans on implemented why darin suggested.

thanks to all (darin, benb, bz, and cavin) for working to resolve this
regression.

Attachment #113935 - Flags: superreview?(sspitzer) → superreview-

Matthias Versen [:Matti]

Comment 55

•

22 years ago

*** Bug 193116 has been marked as a duplicate of this bug. ***

Cavin Song

Assignee

Comment 56

•

22 years ago

Attached patch Proposed fix, v2 (obsolete) — Details — Splinter Review

Created new interface "nsIExternalProtocolHandler : nsIProtocolHandler" and new
method "boolean handlerExists" to resolve the issue.

Attachment #113935 - Attachment is obsolete: true

Cavin Song

Assignee

Updated

•

22 years ago

Attachment #114498 - Flags: review?(darin)

Ben Bucksch (:BenB)

Comment 57

•

22 years ago

From shortly looking over patch v2, it looks good to me. Thanks (also to darin)!

3 tiny suggestions, just in case there are other review comments and you need to
change the code again anyways:
- Move NewURI call into ShouldLinkify
- Add a comment that ShouldLinkify checks the validity of the URI
  (as-is, it sounds like a policy decision, like what seth suggested)
- Change "IsAsciiSpace" to "IsSpace"
But feel free to ignore it.

Darin Fisher

Comment 58

•

22 years ago

Comment on attachment 114498 [details] [diff] [review]
Proposed fix, v2

>Index: netwerk/base/public/Makefile.in

> 		nsIMultiPartChannel.idl \
>+    nsIExternalProtocolHandler.idl \
> 		$(NULL)

please keep indentation consistent.  thx!


>Index: netwerk/base/public/nsIExternalProtocolHandler.idl

>+ * Version: NPL 1.1/GPL 2.0/LGPL 2.1

new code should use the MPL.


>+ * Portions created by the Initial Developer are Copyright (C) 2001

2003 :-)


>+    boolean handlerExists(in string scheme);

use "in ACString aScheme" instead of "in string scheme"

though it was my suggestion, i agree with seth that "handlerExists"
is probably not the best method name here.  some alternatives:

  userAllowed
  schemeAllowed
  schemeSupported
  protocolAllowed
  protocolSupported

i'm not crazy about any of these.  though i think "protocolAllowed"
or "protocolSupported" might be my preference.	we're asking: are
we "allowed" / "able" to execute an external application for URIs
of this "protocol" scheme?  "allowed" gets us talking about
permissions, and that's not quite right here.  here, we are only 
asking can an URI be handled by the operating system.  which makes
me consider this method name:

  externalAppExistsForScheme

oh, i don't know ;-)


>Index: netwerk/streamconv/converters/mozTXTToHTMLConv.cpp

>+// Bug 183111, editor now replaces multiple spaces with leading
>+// 0xA0's and a single ending space, so need to treat 0xA0's as spaces.
>+PRBool mozTXTToHTMLConv::IsAsciiSpace(const char aChar)
>+{
>+  return (nsCRT::IsAsciiSpace(aChar) || aChar == 0xA0);
>+}

>-             && !nsCRT::IsAsciiSpace(aInString[PRUint32(i)])
>+             && !IsAsciiSpace(aInString[PRUint32(i)])

>-             && !nsCRT::IsAsciiSpace(aInString[i])
>+             && !IsAsciiSpace(aInString[i])

should this really be part of this patch?


>+  nsresult rv = mIOService->ExtractScheme(NS_ConvertUCS2toUTF8(aURL), scheme);

>+  // See if the url should be linkified.
>+  if (!ShouldLinkify(txtURL))
>+    return PR_FALSE;

>   rv = mIOService->NewURI(NS_ConvertUCS2toUTF8(txtURL), nsnull, nsnull, getter_AddRefs(uri));

looks like you are doing the conversion from UCS2 to UTF8 twice.  why don't you
change the API for ShouldLinkify to take a nsCString instead, and then just do
the conversion once in the calling function.


>+   NS_INTERFACE_MAP_ENTRY_AMBIGUOUS(nsIProtocolHandler, nsIExternalProtocolHandler)

is the ambiguous map really necessary?


>+  nsCOMPtr<nsIExternalProtocolService> extProtService (do_GetService(NS_EXTERNALPROTOCOLSERVICE_CONTRACTID, &rv));

should the external protocol handler class cache a reference to the external
protocol service?  we execute this code once per link, right?


>Index: uriloader/exthandler/nsExternalProtocolHandler.h

> #include "nsIProtocolHandler.h"
>+#include "nsIExternalProtocolHandler.h"

no need to #include "nsIProtocolHandler.h" now.

Attachment #114498 - Flags: review?(darin) → review-

Christian :Biesinger (don't email me, ping me on IRC)

Comment 59

•

22 years ago

>  externalAppExistsForScheme

Isn't there an nsIExternalProtocolHandler which can do exactly this?
See
http://lxr.mozilla.org/seamonkey/source/uriloader/exthandler/nsIExternalProtocolService.idl

Darin Fisher

Comment 60

•

22 years ago

biesi: please see comment #52.

Cavin Song

Assignee

Comment 61

•

22 years ago

Attached patch Proposed patch, v3 (obsolete) — Details — Splinter Review

Incorporated comments.

Attachment #114498 - Attachment is obsolete: true

Cavin Song

Assignee

Updated

•

22 years ago

Attachment #114730 - Flags: review?(darin)

Matthias Versen [:Matti]

Comment 62

•

22 years ago

*** Bug 194128 has been marked as a duplicate of this bug. ***

Darin Fisher

Comment 63

•

22 years ago

Comment on attachment 114730 [details] [diff] [review]
Proposed patch, v3

>Index: netwerk/streamconv/converters/mozTXTToHTMLConv.cpp

>+PRBool mozTXTToHTMLConv::IsSpace(const char aChar)
>+{
>+  return (nsCRT::IsAsciiSpace(aChar) || aChar == 0xA0);
>+}

this function should be declared inline.  in fact, why not just make it a
static
inline function for this source file?  why bother declaring it as a member var?

  static inline PRBool
  IsSpace(const char aChar)
  {
    return (nsCRT::IsAsciiSpace(aChar) || aChar == 0xA0);
  }


>+  rv = mIOService->NewURI(utf8URL, nsnull, nsnull, getter_AddRefs(uri));

this should be unnecessary now, right?


>Index: uriloader/exthandler/nsExternalProtocolHandler.cpp

>+	nsresult rv;
>+	m_extProtService = do_GetService(NS_EXTERNALPROTOCOLSERVICE_CONTRACTID, &rv);

what's this |rv| for?  bag it.


>   PRBool HaveProtocolHandler(nsIURI * aURI);
> 	nsCString	m_schemeName;
>+  nsCOMPtr<nsIExternalProtocolService> m_extProtService;
> };

gr.. tabs!  would be good to replace them with spaces while you're touching
this
file if you don't mind ;-)

r=darin with these changes.

Attachment #114730 - Flags: review?(darin) → review-

Ben Bucksch (:BenB)

Comment 64

•

22 years ago

> > NewURI

> this should be unnecessary now, right?

I thought you said that NewURI still checks the body for built-in URLs like http:?

Darin Fisher

Comment 65

•

22 years ago

true, doh!  nevermind me =)

R.K.Aa.

Comment 66

•

22 years ago

*** Bug 194704 has been marked as a duplicate of this bug. ***

Cavin Song

Assignee

Comment 67

•

22 years ago

Attached patch Proposed patch, v4 — Details — Splinter Review

Incorporated darin's last comments.

Attachment #114730 - Attachment is obsolete: true

Cavin Song

Assignee

Updated

•

22 years ago

Attachment #115430 - Flags: superreview?(sspitzer)

(not reading, please use seth@sspitzer.org instead)

Comment 68

•

22 years ago

Comment on attachment 115430 [details] [diff] [review]
Proposed patch, v4

asking for darin's official r=, before I sr.

Attachment #115430 - Flags: review?(darin)

Darin Fisher

Comment 69

•

22 years ago

Comment on attachment 115430 [details] [diff] [review]
Proposed patch, v4

>Index: uriloader/exthandler/nsExternalProtocolHandler.cpp

>+NS_IMETHODIMP nsExternalProtocolHandler::ExternalAppExistsForScheme(const nsACString& aScheme, PRBool *_retval)
>+{
>+  if (m_extProtService)
>+    return (m_extProtService->ExternalProtocolHandlerExists(PromiseFlatCString(aScheme).get(), _retval));

nit: kill the extra parans


>+  // In case we don't have external protocol service.
>+  if (_retval)
>+    *_retval = PR_FALSE;

nit: you can assume _retval non-null


sorry for not commenting on these last time.  i don't need to see another
patch.	r=darin.

Attachment #115430 - Flags: review?(darin) → review+

(not reading, please use seth@sspitzer.org instead)

Comment 70

•

22 years ago

Comment on attachment 115430 [details] [diff] [review]
Proposed patch, v4

sr=sspitzer

Attachment #115430 - Flags: superreview?(sspitzer) → superreview+

Cavin Song

Assignee

Updated

•

22 years ago

Target Milestone: --- → mozilla1.4alpha

Cavin Song

Assignee

Comment 71

•

22 years ago

Fix checked in.

Status: NEW → RESOLVED

Closed: 22 years ago

Resolution: --- → FIXED

Boris Zbarsky [:bzbarsky]

Comment 72

•

21 years ago

This patch has a problem with some compilers -- bug 195727

Boris Zbarsky [:bzbarsky]

Comment 73

•

21 years ago

This patch also broke the Mac OS9 port -- see bug 195917

Matthias Versen [:Matti]

Comment 74

•

21 years ago

*** Bug 195952 has been marked as a duplicate of this bug. ***

R.K.Aa.

Comment 75

•

21 years ago

*** Bug 196341 has been marked as a duplicate of this bug. ***

Matthias Versen [:Matti]

Comment 76

•

21 years ago

*** Bug 196775 has been marked as a duplicate of this bug. ***

esther

Comment 77

•

21 years ago

Using trunk builds 20030314 on winxp, mac osx and linux and the scenario as
originally described in html and plain text compose windows this is fixed
If any of the commentor need to test specific scenarios,  please comment in this
bug the results. If no one has found a scenario that doesn't work, I will
verfify this as fixed on 3-19.

Ben Bucksch (:BenB)

Comment 78

•

21 years ago

Esther, please check against
<http://www.bucksch.org/1/projects/mozilla/16507> and
<http://www.mozilla.org/quality/mailnews/tests/mn-html-to-txt.txt>
In particular, check for invalid URL schemes. Thanks.

FYI, the following should not link, but also does so in Moz1.0, so it wouldn't
be a regression covered by this bug.
<http://@##>
<mailto:@>
<mailto:#@+>
<http://hj+ü>
<mailto:üdghjr>

benc

Comment 79

•

21 years ago

-> QA to me.
stream converters, of which this is one, is now part of networking qa. 

esther might still run tests in mailnews, and file bugs...

Component: MIME → Networking

Keywords: qawanted

Product: MailNews → Browser

QA Contact: esther → benc

R.K.Aa.

Comment 80

•

21 years ago

*** Bug 199099 has been marked as a duplicate of this bug. ***

benc

Comment 81

•

21 years ago

If I understand this, this is fixed for 1.4a, but was broken in 1.3f because it
came in after the branch date.

MailNews and other dependent components will continue to test this for their
shipping products, but I'll own the general testing technical stuff.

Ideally, I think mailnews should file bugs where mailnews breaks, and then be
depends on a single Networking bug that talks about where the stream converter
breaks. In this case, I did not go back through all the dupes, re-open them, and
make them dependent on this bug. However, I did leave them in mailnews, so the
current QA needs to verify dupe or get confirmation this works now.

I will add tests for "9:00" and "D:\path\file" because they seem exceptionally
common, so you can blindly VERIFY/dupe those bugs.

Jo Hermans

Comment 82

•

21 years ago

*** Bug 199427 has been marked as a duplicate of this bug. ***

Jo Hermans

Comment 83

•

21 years ago

*** Bug 200106 has been marked as a duplicate of this bug. ***

Boris Zbarsky [:bzbarsky]

Comment 84

•

21 years ago

*** Bug 200288 has been marked as a duplicate of this bug. ***

Felix Miata

Comment 85

•

21 years ago

*** Bug 200821 has been marked as a duplicate of this bug. ***

Boris Zbarsky [:bzbarsky]

Comment 86

•

21 years ago

*** Bug 201604 has been marked as a duplicate of this bug. ***

Matthias Versen [:Matti]

Comment 87

•

21 years ago

*** Bug 202431 has been marked as a duplicate of this bug. ***

Torben

Comment 88

•

21 years ago

*** Bug 203047 has been marked as a duplicate of this bug. ***

Jo Hermans

Comment 89

•

21 years ago

*** Bug 204971 has been marked as a duplicate of this bug. ***

Benedikt Kantus

Comment 90

•

21 years ago

*** Bug 204991 has been marked as a duplicate of this bug. ***

Matthias Versen [:Matti]

Comment 91

•

21 years ago

*** Bug 205990 has been marked as a duplicate of this bug. ***

Matthias Versen [:Matti]

Comment 92

•

21 years ago

*** Bug 207316 has been marked as a duplicate of this bug. ***

benc

Comment 93

•

21 years ago

VERIFIED: per esther's results.

Status: RESOLVED → VERIFIED

Matthias Versen [:Matti]

Comment 94

•

21 years ago

*** Bug 209382 has been marked as a duplicate of this bug. ***

Mike Cowperthwaite

Comment 95

•

21 years ago

*** Bug 188289 has been marked as a duplicate of this bug. ***

Mike Cowperthwaite

Comment 96

•

21 years ago

*** Bug 210960 has been marked as a duplicate of this bug. ***

Screenshot showing that not everything is linkified 22 years ago Benedikt Kantus 8.25 KB, image/gif		Details
Proposed fix, v1 22 years ago Cavin Song 2.23 KB, patch	BenB : review- sspitzer : superreview-	Details \| Diff \| Splinter Review
Proposed fix, v2 22 years ago Cavin Song 11.25 KB, patch	darin.moz : review-	Details \| Diff \| Splinter Review
Proposed patch, v3 22 years ago Cavin Song 12.13 KB, patch	darin.moz : review-	Details \| Diff \| Splinter Review
Proposed patch, v4 22 years ago Cavin Song 11.91 KB, patch	darin.moz : review+ sspitzer : superreview+	Details \| Diff \| Splinter Review