Closed Bug 183111 Opened 22 years ago Closed 22 years ago

[mozTXTtoHTMLConv] "be:" gets converted to an URL link

Categories

(Core :: Networking, defect)

x86
Windows XP
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla1.4alpha

People

(Reporter: lchiang, Assigned: cavin)

References

Details

(Keywords: regression, Whiteboard: [adt2])

Attachments

(2 files, 3 obsolete files)

"be:" (w/out the quotes) gets converted to an URL link

Trunk 2002-11-25-12 build, Windows XP.  HTML compose.

1.  Compose new msg
2.  Type something like this in the body:

The date will be:  December 2

3.  Send it as rich text to yourself
4.  Get the msg
5.  Notice that the be: word is set to URL of be:

be: shouldn't do anything.
blah: etc matches as URL scheme and it will be converted.

invalid (?)
*** Bug 183489 has been marked as a duplicate of this bug. ***
From a user's point of view, it makes the email look strange after being sent. 
There are lots of times when I type : after a word in email. Examples:

LisaC:  Can you please make some food?

or

The date will be:  December 25, 2002

etc.

Keywords: nsbeta1
In fact, lots of different things are showing up as links. One particularly
annoying one is Outlook meeting requests, which include dates, as in: "1:00
PM-2:00 PM (GMT-08:00)". Both the "PM-2:00" and "GMT-08:00" are linkified.

The truly inane thing is that if you click on these, you *immediately* get
"protocol not known", so there's no possible reason for these things to be
links, and the tool *knows it*.
*** Bug 185836 has been marked as a duplicate of this bug. ***
I have a cron job that regularly mails me the output of an unzip operation, and
all the "Defl:X" text gets converted to a link.

 Length   Method    Size  Ratio   Date   Time   CRC-32    Name
--------  ------  ------- -----   ----   ----   ------    ----
  254452  Defl:X   207869  18%  12-18-02 04:16  6f4a4809  clean.dat
     110  Stored      110   0%  12-18-02 04:16  23842225  file_id.diz
  303167  Defl:X   269684  11%  12-18-02 04:16  3441bbb6  names.dat
    1209  Defl:X      571  53%  12-18-02 04:16  9393b561  packing.lst
     708  Defl:X      364  49%  12-18-02 04:16  5b0e1ce7  pkgdesc.ini
   12169  Defl:X     4543  63%  12-18-02 04:16  8596e468  reseller.txt
 2186479  Defl:X  1942861  11%  12-18-02 04:16  6378d76f  scan.dat
   51200  Defl:X    24211  53%  12-18-02 04:16  9858d0f3  VALIDATE.EXE
   42048  Defl:X    12968  69%  12-18-02 04:16  c75efa3b  readme.txt
   12124  Defl:X     2472  80%  10-15-98 03:20  dc749365  internet.dat
--------          -------  ---                            -------
 2863666          2465653  14%                            10 files

I can understand why only converting common protocols such as http: and ftp:
would exclude some valid links, but nowhere near as many bogus links as get
created by the current anything: link scheme
I am using trunk build 2002122004 on Win 98 and absolutely everything containing
a : will be linkified, even if it is just something like ...:.

It's not just some characters or some combinations, but really everything with a :
*** Bug 187505 has been marked as a duplicate of this bug. ***
hey, i like having stack traces that look like links :)
Assignee: ducarroz → ben.bucksch
Component: Mail Back End → MIME
> One particularly
> annoying one is Outlook meeting requests, which include dates, as in: "1:00
> PM-2:00 PM (GMT-08:00)". Both the "PM-2:00" and "GMT-08:00" are linkified.

> The truly inane thing is that if you click on these, you *immediately* get
> "protocol not known", so there's no possible reason for these things to be
> links, and the tool *knows it*.

I believe this statement is wrong. My converter (mozTXTToHTMLConv, used in
mailnews and AIM) only linkifies URLs which are identified as valid by necko.
This includes (or did include) a check for the URL scheme (the part before the
colons).

Because Mozilla also checks the Windows registry (or comparable) for URL
schemes, the results will heavily depend upon your system.

If your statements are right, this is either a regression or I see no real way
to fix this. "thepalace:" can be a valid URL. But if really the "be:" in " be: "
gets linked, then there should be done something about it. I find it strange,
though, that this bug has never been reported before, although this converter
works in Mozilla since over 3 years now.
Keywords: qawanted
As you can see from the screenshot, not everything will be linkified, though I
have not yet discovered which ones will and which ones will not.
See additional comment.
I took the screenshot that is attached to comment #1 using the 2003010205 trunk
build on Win2k.
It shows a mail I received with Mozilla showing two different behaviours: On the
upper image, all words containig a colon will be linkified. On the lower image
there are words that were not linkified.

This happened within the _same_ email and with the _same_ build.


Ben, comment #10 doesn't clearly say that you can reproduce this bug with your
build. Is this so?
This is a regression which I didn't see on the 1.0x branch builds.
Gayatri - can you assign to someone to see if s/he can find a pattern?
Mail triage team: nsbeta1+/adt2
Keywords: nsbeta1nsbeta1+
Whiteboard: [adt2]
the following call checks, if the URL is valid:
rv = mIOService->NewURI(NS_ConvertUCS2toUTF8(txtURL), nsnull, nsnull,
getter_AddRefs(uri));
<http://lxr.mozilla.org/seamonkey/source/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp>

I guess that the implementation of NewURI changed, that somebody "optimized" the
"performance" (really speed) of the function and limited the functionality of it
while doing so, so that it no longer checks, if the URL scheme is valid. If
that's the case, this would be a bad move without checking consumers, and I
think that there should be a version of the function which does all checks.

The code was written specifically so that it would have no knowledge of
different URL schemes and instead to delegate all checks to netwerk's generic
URI functions. In fact, that's a major reason for why I wrote this class
originally. So, hardcoding URL scheme knowledge into the converter is not
acceptable.

If I happen to be right, then I'd suggest to create a new function like
NewURIWithChecks, doing what the old implementation of NewURI did and maybe even
more checks, and to use that from the converter.
*** Bug 189971 has been marked as a duplicate of this bug. ***
Reassigning.
Assignee: ben.bucksch → cavin
*** Bug 190831 has been marked as a duplicate of this bug. ***
The problem occurs only when you put more than one spaces after a colon char and 
is caused by the fact that the body text (in a compose window) returned from the 
editor API (ie, mEditor->OutputToString()) has changed. 

For a string like the following in a compose window:

  be:^^   (^^ represents 2 spaces)

mEditor->OutputToString() would return a string like:

 be:^\A0

meaning that a space followed by a char whose hex code is 'A0'. But now it 
returns:

  be:\A0^

meaning that a char whose hex code is 'A0' followed by a space .

Because of this change it breaks the code in mozTXTToHTMLConv::ScanTXT() where 
we check to see if we need to find an url when we hit char ':'.
Comment 19 may be correct for the original report of this issue, but not all of
the subsequent comments refer to strings with spaces being linkified. In fact,
many of them have nothing to do with messages composed in Mozilla at all
(comments 4 and 6 are examples to the contrary).

Is this because the double space problem isn't the root cause, or do we need 2
bugs here?
The problem with "PM-2:00" is caused by the same reason that Ben mentioned in 
Comment #15 (ie, issue with NewURI() call).
summarizing a conversation I had with cavin about this bug.

nsIIOService is frozen see 
http://lxr.mozilla.org/mozilla/source/netwerk/base/public/nsIIOService.idl#54

we should see if any other consumers were expecting the same behaviour as 
mozTXTToHTMLConv::CheckURLAndCreateHTML() was.

if not, and only CheckURLAndCreateHTML() was expecting this, we should consider 
moving the functionality into CheckURLAndCreateHTML() if possible.

cavin tells me that the old way used to call HaveProtocolHandler(), but we now 
do that on NewChannel().

I suggested moving the guts of HaveProtocolHandler() to CheckURLAndCreateHTML() 
to fix this bug.

here are the guts:

309     nsCAutoString scheme;
310     aURI->GetScheme(scheme);
311     nsCOMPtr<nsIExternalProtocolService> extProtService (do_GetService
(NS_EXTERNALPROTOCOLSERVICE_CONTRACTID));
312     extProtService->ExternalProtocolHandlerExists(scheme.get(), 
&haveHandler);

but we should make sure that other consumers of NewURI() don't need this as 
well, and run it by darin to see what he thinks.
cavin determined that this was caused by the checkin for bug #176904
Hm... So it sounds like you just take a string and try to create a URI out of it
and if that succeeds you linkify it?

The fact that that worked at all was a bug in the external helper app service
(see bug 176904 comment 6 and bug 176904 comment 7.

My apologies for not doing a thorough enough check on the "make sure no one
interprets that as being synonymous with having an external protocol handler"
part.  :(

> So it sounds like you just take a string and try to create a URI out of it
> and if that succeeds you linkify it?

Yes, and exactly that was suggested by somebody from the netwerk team (I don't
remember who, but it's a very onld post on .netlib) *prior* to the creation of
the converter - this idea to move all validity checking of the URL into netwerk
made a lot of sense to me and is the idea on which (this part of) the converter
is based.

The converter just determines *candidates* for possibly valid URIs out of a raw
text, calls a netwerk function to see, if the URI is indeed valid, and if so,
linkyfies it. This netwerk function happened to be NewURI (that NewURI checks
the validity of the URI were the semantics of the interface, as it has been told
to me - so, no, this was not a "bug"). The checks are *not* supposed to be
limited to the URI scheme - the *whole* URI is supposed to be checked for
validity. It was just that our implementations of the checks were incomplete for
some URI types (e.g. mailto:), and planned to be filled in later (long ago by
now). IIRC, "standard uris" (http, ftp etc.) did indeed check for syntactical
validity of hostname, login, path, query etc..

In other words, I think it's a bad idea to move the checks into the converter -
they are extensive and of general nature, so there should be *some* function
defined in netwerk which thoroughly checks the validity of a given URI string,
and we should use that in the converter. I don't care, if it's NewURI or
something else, as long as it's a stable interface in core netwerk.


As for the A0, that should be trivial to fix. The converter has a set of known
URI delimiters (a different one for beginning and end and for different ways or
writing URIs, see <http://www.bucksch.org/1/projects/mozilla/16507> - QA might
look at <http://www.mozilla.org/quality/mailnews/tests/mn-html-to-txt.txt>), so
just add it to the appropriate sets of delimiters.
Summary: "be:" gets converted to an URL link → [mozTXTtoHTMLConv] "be:" gets converted to an URL link
Oh, and while you are at it, please remove the following lines form the code:
  // it would be faster if we could just check to see if there is a protocol
  // handler for the url and return instead of actually trying to create a url.
Somebody added them, and - as explained above - this is a bad idea.
Ben, a uri does _not_ become invalid simply because the scheme is unknown to us,
as far as necko is concerned.  We have embeddors who need to be able to get URI
objects for their private URI schemes without registering them with the OS at
large...

Perhaps we could have a function somewhere in necko (I hesitate to suggest
nsIIOService2...) that would perform the check needed here?  That could be used
in the uri fixup code too...
> a uri does _not_ become invalid simply because the scheme is unknown to us,
> as far as necko is concerned.

True, but it's invalid as far as Gecko is concerned - Gecko cannot load it. If
Gecko cannot load it, it doesn't make much sense to link it.
At least that's the way the world was when I wrote the converter.

> We have embeddors who need to be able to get URI
> objects for their private URI schemes without registering them with the OS at
> large...

I'd let them register their URI schemes with Necko (and offer facilities to do
that, if necessary). Maybe even ask them to create URI implementations.
> True, but it's invalid as far as Gecko is concerned - Gecko cannot load it.  If
> Gecko cannot load it, it doesn't make much sense to link it.

It does if the embedding app around Gecko can load it.  Which is true in many
cases, from what I see in the necko bugs.....

There are facilities to register protocol handlers with necko.  The point is
that they want to handle the URI completely outside of Gecko (eg pass it off to
a sound player).  But for that, they need a notification that the URI was
clicked on.  That was the motivating cause for the change in bug 176904 -- it
made us consistently fire such notifications.  Further, gecko internally needs
this functionality so that when someone clicks on a link like:

<a href="unknownprotocol: foo">link</a>

We put up an alert of some sort instead of just sitting there and doing nothing.

I understand what the situation was when you wrote the converter.  The point is
that we should try to find a decent solution that helps both you and other people.
Does the embedding app wants its internal URLs linked in e.g. plaintext
documents loaded in Gecko? If so, then there is no way around for the embedding
app to tell us about its URL schemes. Otherwise, we'd have to link *everything*
that complies to the generic URI syntax, which includes "be:", what happened,
what created this bug.
BTW: Compare RFC2717.
> Does the embedding app wants its internal URLs linked in e.g. plaintext
> documents loaded in Gecko?

Why would you be loading plaintext documents?  Don't necessarily think browser
here, think "help system" or some other fairly specialized system that needs an
HTML renderer....

RFC 2717 does not apply if none of the data involved ever goes out over the
network and is instead completely internal to the app in question.
Attached patch Proposed fix, v1 (obsolete) — Splinter Review
1. Treat nbsp and space the same so that we don't fall into the code to call
FindURL().
2. Validate schemes so that we don't linkify invalid ones.
Attachment #113935 - Flags: superreview?(sspitzer)
Comment on attachment 113935 [details] [diff] [review]
Proposed fix, v1

Why did I spend 2 hours yesterday to explain my earlier comment, when I just
get ignored? (Or did I misunderstand something?)

1. The *whole* URI must be validated, not just the scheme. Does NewURI still do
that?
2. I would really like not to see scheme checks in that code. Please create a
core Necko function for that.

3. I don't think that fix for A0 will work. As you can see in the like you
modifed, there is a comment "//performance increase". This is redundant code,
only added to speed up the processing. The same is checked later again.
Your fix might happen to work for " be:\A0 ", but probably won't for " PM-2:00
" and " http://bar\A0 " probably fails as will (\A0 is probably included in the
link).
See calls to nsCRT:IsAsciiSpace(). I think you either want to modify that
function or create an internal function for the same purpose and use that.

As the original developer of that code, review- .
Attachment #113935 - Flags: review-
Hmm, I did not see the new comments before I posted the patch (did not expect
comments over the weekend). In any case, your issues will be addressed.
Oh, I see. Thank you.
QA Contact: gayatri → esther
> Yes, and exactly that was suggested by somebody from the netwerk team (I don't
> remember who, but it's a very onld post on .netlib) *prior* to the creation of
> the converter

I looked up the post, it's
<nntp://news.mozilla.org/382F2593.2532605D@netscape.com> (.netlib, subject
"recognizing URLs in text", 14 Nov 99) and the subthread following it.

Warren Harris wrote there:
"Yes, I think you/they should look for the pattern suggested above, and then try
calling NS_NewURI. This should only succeed if the protocol exists, and the
string is a syntactically valid URL."

I answered
"I like this idea. Since the main purpose of nsMimeURLUtils is this
recognition, I started rewriting the class."
This converter was the result.

Warren wrote later again
"First, recognize characters followed by a colon (as a protocol name -- not
specific protocol names, just anything that follows this pattern), and take the
text up to the next white space and hand it to NS_NewURI. If that succeeds, make
a link out of it."

NewURI clearly was supposed to not only check, if the URL scheme is known, but
also the syntax of the URI. So, the converter worked exactly as it was supposed
to, until Necko broke :-(. That's why I suggested to create a new function in
Necko to provide the functionality of the former NewURI.

Jsut checking schemes is not good enough, otherwise "/()%78%!!)@(%()%" will
probably be recognized as email address :(.
*** Bug 192674 has been marked as a duplicate of this bug. ***
> NewURI clearly was supposed to not only check, if the URL scheme is known, but
> also the syntax of the URI. So, the converter worked exactly as it was 
> supposed to, until Necko broke :-(. That's why I suggested to create a new 
> function in Necko to provide the functionality of the former NewURI.

Ben,

Boris is saying that Warren's original design doesn't hold up well.  It was a
nice idea, because it certainly makes the problem of solving this bug simpler,
but it causes other problems.  We decided that we needed NewURI to always return
an object, provided the URI string isn't malformed (i.e., the URI string must
have the structure of an URI for the specified scheme, and for unknown
protocols, the only requirement is an RFC-2396-compliant-scheme followed by a
colon and then whatever junk).

Finally, note that the external protocol handler only checks the existance of a
registered handler for the URL scheme.  It does not, and never did, check the
body of the URL string.

Here's a suggestion:

How about we introduce the following interface in netwerk/base/public:

  interface nsIExternalProtocolHandler : nsIProtocolHandler
  {
     boolean handlerExists(in string scheme);
  };

Then mozTXTToHTMLConv could do the following:

  boolean shouldLinkify(in string url)
  {
     scheme = ioservice.extractScheme(url)
     handler = ioservice.getProtocolHandler(scheme);

     externalHandler = handler.QI(nsIExternalProtocolHandler);
     if (!externalHandler)
       return true; // handler is built-in, linkify it!

     return externalHandler.handlerExists(scheme);
  }

nsExternalProtocolHandler would then implement nsIExternalProtocolHandler.
> Finally, note that the external protocol handler only checks the existance of a
> registered handler for the URL scheme.  It does not, and never did, check the
> body of the URL string.

Yes, sure, it can't, because it can represent arbitary URLs and doesn't know how
they are structured, apart from RFC2396.
For http, ftp, mailto etc. URLs however, we do have a pretty good idea of how
they should look like, and we do (or should) have implementations being able to
check conformity to that.

> Here's a suggestion:
> How about we introduce the following interface in netwerk/base/public:
>      boolean handlerExists(in string scheme);

As outlined above, simply checking the scheme is not good enough for the
purposes of the converter. And checking, if an email address or http URL is
stictly syntactically valid is really not the business of the converter IMHO, it
just requires that functionality.
I would have envisaged something vaguely like the following:
interface nsIURI2 : nsIURI
{
  boolean syntaxOK;
}
or
interface nsIIOSerivce2
{
  boolean isValidURI(in string uriToBeChecked);
}
This would end up invoking syntax checking function in the nsIURI
implementations of the relevant protocol implementation.
for http:// and ftp:// etc to linkify, we'd need to make those protocol 
handlers also implement nsIExternalProtocolHandler, right?

ioservice.getProtocolHandler() would return those protocol handlers, and for 
something external (like goim:// or something handled by an external app)
ioservice.getProtocolHandler() would give us nsExternalProtocolHandler

here's a list of the protocol handlers:

nsAboutProtocolHandler
nsDataHandler
nsFileProtocolHandler
nsFtpProtocolHandler
nsGopherHandler
nsHttpHandler
nsJARProtocolHandler
nsKeywordProtocolHandler
nsResProtocolHandler
nsViewSourceHandler
nsExternalProtocolHandler

do we weally want to linkify all these? (I'm wondering about viewsource, res, 
and jar, for example)

handlerExists() makes me think we should be returning true, but maybe we should 
call it something else, so that its valid for some things that we don't want to 
linkify to return false?
> here's a list of the protocol handlers:

that was a partial list, I forgot about the ones in mailnews.

speaking of which, we do want to linkify mailto: and news:, but we wouldn't 
want to linkify others (like some imap, and pop3 urls).

what I'm getting at is for protocols that we handle, we only linkify if we 
really intend for the user to click on them. 
> I forgot about the ones in mailnews

Not to mention javascript:, irc://, etc, etc.  Anyone can make up a protocol
handler and drop it into the build....
> do we weally want to linkify all these?

I don't think stuff like jar: or imap: disturbs in practice. In fact, I see irc:
etc. (depending on if there is actually a handling client) as feature.
> For http, ftp, mailto etc. URLs however, we do have a pretty good idea of how
> they should look like, and we do (or should) have implementations being able 
> to check conformity to that.

Yes, the URL parsers do (to some extent) enforce conformity above and beyond the
mere existance of a scheme+':' ... and it is entirely protocol dependent.

> As outlined above, simply checking the scheme is not good enough for the
> purposes of the converter. And checking, if an email address or http URL is
> stictly syntactically valid is really not the business of the converter IMHO, 
> it just requires that functionality.

Prior to the fix for bug 176904, we would do the following:

  if scheme is builtin, then parse URI, fail if malformed.
  if scheme is external, then check if OS can handle the scheme.

The patch for bug 176904 only changed the case when the scheme is external. 
And, this bug is really only about that "regression" ... let's focus on that
first.  So, for the purposes of external protocol handlers, checking the scheme
is all you can do (in most cases).


> I would have envisaged something vaguely like the following:
> interface nsIURI2 : nsIURI
> {
>  boolean syntaxOK;
> }
> or
> interface nsIIOSerivce2
> {
>  boolean isValidURI(in string uriToBeChecked);
> }
> This would end up invoking syntax checking function in the nsIURI
> implementations of the relevant protocol implementation.

again, the syntax is checked on construction (because that's when we parse the
URI string).
note, I didn't notice:

> return true; // handler is built-in, linkify it!

about irc:, we would want to linkify that one, since clicking on those links 
should launch the irc app.

so if we go with darin's original suggestion, all internal schemes would 
linkify.

I'd prefer something else for handlerExists().

I'd want to spin off a new bug about not linkifying everything that we handle, 
since I don't think we want that.  (and that can be a seperate discussion)

OK, so to address the problem of built-in protocol handlers not wanting to be
linkified, we could do a number of things.  We could introduce a protocol flag
(see nsIProtocolHandler::protocolFlags).  Or, we could change the name of
nsIExternalProtocolHandler to something any protocol handler could implement. 
However, i dislike this solution since it is quite bloaty.  We could also have a
simple white-list in the TXTtoHTML converter.
Ah, so NewURI still does check the "body" (scheme-specific part) of the URI in
the case of protocols we have a protocol implementation for? That's great. Then,
we have no problem. (Assuming nobody has the idea to remove those checks :-(.)
Hardcoding the check for the external handlers would then be OK with me for now.

Avoiding to link certain URL schemes: I think that's unnecessary, it's
irrelevant in practice and just means more code. It's a different bug in any case.
I realize this is another bug, but I want to make sure if we add a new 
interface we take future changes into account.

4.x used to have this concept of internal-vs-external urls, for security 
purposes.

certain urls the user could enter in the url bar, or click on, and certain ones 
they could not, but those urls could be run internal to the application.

how about instead of handlerExists(scheme) what about userAllowed(spec)

for the external protocol handler, we'd just see if the scheme was supported.

for internal, we'll allow the protocol handlers to determine if the url 
was "safe".

if userAllowed, we'd linkify.
I wish I hadn't brought up the "don't linkify everything" issue, as it's not 
helping resolve this issue.

Since darin's new interface (nsIExternalProtocolHandler) would not be frozen, I 
can bring this up in another bug, and for now linkify everything that we handle 
internally.

now that I fully understand it, I think darin's proposal in 
http://bugzilla.mozilla.org/show_bug.cgi?id=183111#c39 makes sense, will not be 
a lot of bloat, and will address part of this bug.

benb, bz, cavin, do you agree?
Seth, my comment 48 still stands. I don't know, why cavin's patch v1 (the uri
check part of it) wouldn't do the same as darin suggested (just without new
interface), but I don't care either :).
ben: because necko should not depend directly on exthandler =)
Yeah, darin's approach seems reasonable... (and I _do_ apologize for all the
trouble I caused by not noticing this issue before I made my changes....)
Comment on attachment 113935 [details] [diff] [review]
Proposed fix, v1

rejecting this patch.

cavin plans on implemented why darin suggested.

thanks to all (darin, benb, bz, and cavin) for working to resolve this
regression.
Attachment #113935 - Flags: superreview?(sspitzer) → superreview-
*** Bug 193116 has been marked as a duplicate of this bug. ***
Attached patch Proposed fix, v2 (obsolete) — Splinter Review
Created new interface "nsIExternalProtocolHandler : nsIProtocolHandler" and new
method "boolean handlerExists" to resolve the issue.
Attachment #113935 - Attachment is obsolete: true
Attachment #114498 - Flags: review?(darin)
From shortly looking over patch v2, it looks good to me. Thanks (also to darin)!

3 tiny suggestions, just in case there are other review comments and you need to
change the code again anyways:
- Move NewURI call into ShouldLinkify
- Add a comment that ShouldLinkify checks the validity of the URI
  (as-is, it sounds like a policy decision, like what seth suggested)
- Change "IsAsciiSpace" to "IsSpace"
But feel free to ignore it.
Comment on attachment 114498 [details] [diff] [review]
Proposed fix, v2

>Index: netwerk/base/public/Makefile.in

> 		nsIMultiPartChannel.idl \
>+    nsIExternalProtocolHandler.idl \
> 		$(NULL)

please keep indentation consistent.  thx!


>Index: netwerk/base/public/nsIExternalProtocolHandler.idl

>+ * Version: NPL 1.1/GPL 2.0/LGPL 2.1

new code should use the MPL.


>+ * Portions created by the Initial Developer are Copyright (C) 2001

2003 :-)


>+    boolean handlerExists(in string scheme);

use "in ACString aScheme" instead of "in string scheme"

though it was my suggestion, i agree with seth that "handlerExists"
is probably not the best method name here.  some alternatives:

  userAllowed
  schemeAllowed
  schemeSupported
  protocolAllowed
  protocolSupported

i'm not crazy about any of these.  though i think "protocolAllowed"
or "protocolSupported" might be my preference.	we're asking: are
we "allowed" / "able" to execute an external application for URIs
of this "protocol" scheme?  "allowed" gets us talking about
permissions, and that's not quite right here.  here, we are only 
asking can an URI be handled by the operating system.  which makes
me consider this method name:

  externalAppExistsForScheme

oh, i don't know ;-)


>Index: netwerk/streamconv/converters/mozTXTToHTMLConv.cpp

>+// Bug 183111, editor now replaces multiple spaces with leading
>+// 0xA0's and a single ending space, so need to treat 0xA0's as spaces.
>+PRBool mozTXTToHTMLConv::IsAsciiSpace(const char aChar)
>+{
>+  return (nsCRT::IsAsciiSpace(aChar) || aChar == 0xA0);
>+}

>-             && !nsCRT::IsAsciiSpace(aInString[PRUint32(i)])
>+             && !IsAsciiSpace(aInString[PRUint32(i)])

>-             && !nsCRT::IsAsciiSpace(aInString[i])
>+             && !IsAsciiSpace(aInString[i])

should this really be part of this patch?


>+  nsresult rv = mIOService->ExtractScheme(NS_ConvertUCS2toUTF8(aURL), scheme);

>+  // See if the url should be linkified.
>+  if (!ShouldLinkify(txtURL))
>+    return PR_FALSE;

>   rv = mIOService->NewURI(NS_ConvertUCS2toUTF8(txtURL), nsnull, nsnull, getter_AddRefs(uri));

looks like you are doing the conversion from UCS2 to UTF8 twice.  why don't you
change the API for ShouldLinkify to take a nsCString instead, and then just do
the conversion once in the calling function.


>+   NS_INTERFACE_MAP_ENTRY_AMBIGUOUS(nsIProtocolHandler, nsIExternalProtocolHandler)

is the ambiguous map really necessary?


>+  nsCOMPtr<nsIExternalProtocolService> extProtService (do_GetService(NS_EXTERNALPROTOCOLSERVICE_CONTRACTID, &rv));

should the external protocol handler class cache a reference to the external
protocol service?  we execute this code once per link, right?


>Index: uriloader/exthandler/nsExternalProtocolHandler.h

> #include "nsIProtocolHandler.h"
>+#include "nsIExternalProtocolHandler.h"

no need to #include "nsIProtocolHandler.h" now.
Attachment #114498 - Flags: review?(darin) → review-
>  externalAppExistsForScheme

Isn't there an nsIExternalProtocolHandler which can do exactly this?
See
http://lxr.mozilla.org/seamonkey/source/uriloader/exthandler/nsIExternalProtocolService.idl
biesi: please see comment #52.
Attached patch Proposed patch, v3 (obsolete) — Splinter Review
Incorporated comments.
Attachment #114498 - Attachment is obsolete: true
Attachment #114730 - Flags: review?(darin)
*** Bug 194128 has been marked as a duplicate of this bug. ***
Comment on attachment 114730 [details] [diff] [review]
Proposed patch, v3

>Index: netwerk/streamconv/converters/mozTXTToHTMLConv.cpp

>+PRBool mozTXTToHTMLConv::IsSpace(const char aChar)
>+{
>+  return (nsCRT::IsAsciiSpace(aChar) || aChar == 0xA0);
>+}

this function should be declared inline.  in fact, why not just make it a
static
inline function for this source file?  why bother declaring it as a member var?

  static inline PRBool
  IsSpace(const char aChar)
  {
    return (nsCRT::IsAsciiSpace(aChar) || aChar == 0xA0);
  }


>+  rv = mIOService->NewURI(utf8URL, nsnull, nsnull, getter_AddRefs(uri));

this should be unnecessary now, right?


>Index: uriloader/exthandler/nsExternalProtocolHandler.cpp

>+	nsresult rv;
>+	m_extProtService = do_GetService(NS_EXTERNALPROTOCOLSERVICE_CONTRACTID, &rv);

what's this |rv| for?  bag it.


>   PRBool HaveProtocolHandler(nsIURI * aURI);
> 	nsCString	m_schemeName;
>+  nsCOMPtr<nsIExternalProtocolService> m_extProtService;
> };

gr.. tabs!  would be good to replace them with spaces while you're touching
this
file if you don't mind ;-)

r=darin with these changes.
Attachment #114730 - Flags: review?(darin) → review-
> > NewURI

> this should be unnecessary now, right?

I thought you said that NewURI still checks the body for built-in URLs like http:?
true, doh!  nevermind me =)
*** Bug 194704 has been marked as a duplicate of this bug. ***
Incorporated darin's last comments.
Attachment #114730 - Attachment is obsolete: true
Attachment #115430 - Flags: superreview?(sspitzer)
Comment on attachment 115430 [details] [diff] [review]
Proposed patch, v4

asking for darin's official r=, before I sr.
Attachment #115430 - Flags: review?(darin)
Comment on attachment 115430 [details] [diff] [review]
Proposed patch, v4

>Index: uriloader/exthandler/nsExternalProtocolHandler.cpp

>+NS_IMETHODIMP nsExternalProtocolHandler::ExternalAppExistsForScheme(const nsACString& aScheme, PRBool *_retval)
>+{
>+  if (m_extProtService)
>+    return (m_extProtService->ExternalProtocolHandlerExists(PromiseFlatCString(aScheme).get(), _retval));

nit: kill the extra parans


>+  // In case we don't have external protocol service.
>+  if (_retval)
>+    *_retval = PR_FALSE;

nit: you can assume _retval non-null


sorry for not commenting on these last time.  i don't need to see another
patch.	r=darin.
Attachment #115430 - Flags: review?(darin) → review+
Comment on attachment 115430 [details] [diff] [review]
Proposed patch, v4

sr=sspitzer
Attachment #115430 - Flags: superreview?(sspitzer) → superreview+
Target Milestone: --- → mozilla1.4alpha
Fix checked in.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
This patch has a problem with some compilers -- bug 195727
This patch also broke the Mac OS9 port -- see bug 195917
*** Bug 195952 has been marked as a duplicate of this bug. ***
*** Bug 196341 has been marked as a duplicate of this bug. ***
*** Bug 196775 has been marked as a duplicate of this bug. ***
Using trunk builds 20030314 on winxp, mac osx and linux and the scenario as
originally described in html and plain text compose windows this is fixed
If any of the commentor need to test specific scenarios,  please comment in this
bug the results. If no one has found a scenario that doesn't work, I will
verfify this as fixed on 3-19. 
Esther, please check against
<http://www.bucksch.org/1/projects/mozilla/16507> and
<http://www.mozilla.org/quality/mailnews/tests/mn-html-to-txt.txt>
In particular, check for invalid URL schemes. Thanks.

FYI, the following should not link, but also does so in Moz1.0, so it wouldn't
be a regression covered by this bug.
<http://@##>
<mailto:@>
<mailto:#@+>
<http://hj+ü>
<mailto:üdghjr>
-> QA to me.
stream converters, of which this is one, is now part of networking qa. 

esther might still run tests in mailnews, and file bugs...
Component: MIME → Networking
Keywords: qawanted
Product: MailNews → Browser
QA Contact: esther → benc
*** Bug 199099 has been marked as a duplicate of this bug. ***
If I understand this, this is fixed for 1.4a, but was broken in 1.3f because it
came in after the branch date.

MailNews and other dependent components will continue to test this for their
shipping products, but I'll own the general testing technical stuff.

Ideally, I think mailnews should file bugs where mailnews breaks, and then be
depends on a single Networking bug that talks about where the stream converter
breaks. In this case, I did not go back through all the dupes, re-open them, and
make them dependent on this bug. However, I did leave them in mailnews, so the
current QA needs to verify dupe or get confirmation this works now.

I will add tests for "9:00" and "D:\path\file" because they seem exceptionally
common, so you can blindly VERIFY/dupe those bugs.
*** Bug 199427 has been marked as a duplicate of this bug. ***
*** Bug 200106 has been marked as a duplicate of this bug. ***
*** Bug 200288 has been marked as a duplicate of this bug. ***
*** Bug 200821 has been marked as a duplicate of this bug. ***
*** Bug 201604 has been marked as a duplicate of this bug. ***
*** Bug 202431 has been marked as a duplicate of this bug. ***
*** Bug 203047 has been marked as a duplicate of this bug. ***
*** Bug 204971 has been marked as a duplicate of this bug. ***
*** Bug 204991 has been marked as a duplicate of this bug. ***
*** Bug 205990 has been marked as a duplicate of this bug. ***
*** Bug 207316 has been marked as a duplicate of this bug. ***
VERIFIED: per esther's results.
Status: RESOLVED → VERIFIED
*** Bug 209382 has been marked as a duplicate of this bug. ***
*** Bug 188289 has been marked as a duplicate of this bug. ***
*** Bug 210960 has been marked as a duplicate of this bug. ***
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: