Open Bug 23114 Opened 25 years ago Updated 13 years ago

Linkify URLs in message headers

Categories

(SeaMonkey :: MailNews: Message Display, defect, P3)

defect

Tracking

(Not tracked)

People

(Reporter: bugzilla, Assigned: markushossner)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

If you have a "organization" mail header that contains a URL the URL should
become clickable.
Like this:
Organization: Tele Danmark Internet (http://www.gemal.dk)
Perhaps even with:
Organization: Tele Danmark Internet (www.gemal.dk)

Netscape 4.7 only does the first one. It could be cool if Mozilla could do both.

Using Build 2000010416
Assignee: phil → mozilla
Component: Front End → MIME
Summary: urls in "organization" should become clickable → Linkify URLs in message headers
I'm going to generalize this bug a little bit. I think the real issue is that
all header values should be run through the text-to-html converter. The Subject
header is another good example, in addition to the Organization header. Ben's
new converter does also pick up www.gemal.dk, in addition to http://www.gemal.dk

Reassigning to Ben, cc rhp and mscott
Hmm....in principal I like the idea, but I'm worried about slowing down message
display times by link-ifying the headers.

If we want to do this, the text to html converter needs to be scriptable for
xpconnect. Is it? (I have no idea)  The work needs to be done in
nsMsgHdrViewOverlay.js where we would take the subject, run it through the text
ti html converter and then create the appropriate anchor elements in the xul
document. These anchor elements then need to be inserted into the div that
contains the subject value.

A good example of this is how I linkify email addresses in the from, to and cc
line in JS. This example is also in nsMsgHdrViewOverlay.js
Severity: normal → enhancement
Component: MIME → Front End
OS: Windows 98 → All
Hardware: PC → All
Phil,
I don't think, ruuning all headers through the HTMLToTXTConv is a good idea. TO
etc. are already linkyfied and might want to be addressbook URLs insteal of
mailto: URLs. There's also some chance, that it messes everything up.

But running it over SUBJECT and ORGANISATION might be useful.

mscott,
If it is only done on msg display (not in thread pane), there should be no
performance problem, as the converter is called for every line in the msg
anyway.

Yes, mozIHTMLToTXTConv is scriptable in theory, but untested.
Status: NEW → ASSIGNED
Mass-ACCEPTing to stop annoying autonotifies
Mass-LATER
Priority: P3 → P5
Target Milestone: --- → M16
Keywords: pp
Sorry for the spam
Keywords: pp4xp
Keywords: helpwanted
Priority: P5 → P3
Target Milestone: M16 → Future
Won't make it for M16 (=> 1.0) :-((, at least not by me.
Changing personal priorities. Giving away most of my bugs :-( (reassigning to
default owner).

I will still track these bugs closely. If you need my input, feel free to ask me.

New owner: Please do *not* close these bugs (as WONTFIX or whatever you may
find) unless they are fixed. Rather, reassign to <nobody@mozilla.org>, if you
don't want to work on them.
Assignee: mozilla → sspitzer
Status: ASSIGNED → NEW
QA Contact: lchiang → esther
Target Milestone: Future → ---
*** Bug 61496 has been marked as a duplicate of this bug. ***
*** Bug 62033 has been marked as a duplicate of this bug. ***
This should also cover the X-Url header -- in fact, that one can be really
simple because it's only supposed to contain a url...
*** Bug 76389 has been marked as a duplicate of this bug. ***
*** Bug 61006 has been marked as a duplicate of this bug. ***
Too many bugs are duped against this one. This one is solely about linking URLs
in headers that we show anyway, similar to what we do in the body of plaintext
messages. Bug 62033, OTOH, for example, is completely different, because neither
is that header normally shown nor do the links match anyhow the text that will
be shown.
Having non-linkified URLs in the subject/organization headers is quite user
unfriendly. E.g. a very common behaviour is to send a URL in the Subject: to
recommend a site. Correct me if I'm wrong but I can't even see any possibility
of copying such a URL. All I can do is type it into the browser location bar by
hand.
Is there any work being done on this bug?
4xp's are usually considered bugs, not RFEs. 

If 61497 is fixed in a reasonable way, we might get this "for free", so adding a
dependency.
Severity: enhancement → normal
Depends on: 61497
*** Bug 150135 has been marked as a duplicate of this bug. ***
I think this linkify feature should be considered for Subject, Organization, and
ALL X- headers. A lot of services add X- headers with URL's in them. For
example, a lot of spam-filtering companies do this. 
*** Bug 161067 has been marked as a duplicate of this bug. ***
There are Delivered-to, Return-path, List-* which usually contain mail addresses
as well.
*** Bug 181006 has been marked as a duplicate of this bug. ***
*** Bug 181734 has been marked as a duplicate of this bug. ***
Is there something in particular that is preventing this from being added, or is
it just that no one has had time to get to it yet?
I think it's because nobody attached a path
> Is there something in particular that is preventing this from being added, or is
> it just that no one has had time to get to it yet?

I'm not sure, because the headers are rendered in XUL and the converter outputs
HTML. If there's no way out output HTML in the header value fields without other
things, maybe my new URL-finding function in bug 116242 will help. In any case,
somebody would have to look into it, and yes, nobody bothered to do it so far.
I've tried to implement that for Mnenhy by using mozTxtToHTMLConv, but ran into
several problems with the smiley conversion - but just linkifying mere URLs
should be no problem. (My current state of affairs are iframes getting fed with
mozTxtToHTMLConv output, but those don't autoresize to content (yet)...)
*** Bug 35689 has been marked as a duplicate of this bug. ***
Blocks: 176238
Product: Browser → Seamonkey
Assignee: sspitzer → mail
Karsten, any change in Mnenhy now 3 years later?

(thunderbird bug 264774)
Blocks: socmn
Attached patch patchSplinter Review
The patch linkifys URLs and email addresses, displays emoticons and enable structs like *bold* or /italic/ in all non-emailaddress and non-tag headerfields using mozTxtToHTMLConv. It also adds prefs to enable and disable the described functionality.
Attachment #267901 - Flags: superreview?(neil)
Attachment #267901 - Flags: review?(mnyromyr)
Assignee: mail → markushossner
Assignee: markushossner → mail
QA Contact: esther
Sorry for the bugspam :|
Assignee: mail → markushossner
Keywords: helpwanted
Comment on attachment 267901 [details] [diff] [review]
patch

Markus and I were in close contact during the preparations for this patch, so I had the chance to make my review comments even before the patch was attached. ;-)
Attachment #267901 - Flags: review?(mnyromyr) → review+
Comment on attachment 267901 [details] [diff] [review]
patch

First, thanks for the patch! (And for using mozITXTToHTMLConv :) .) Seriously, I'm glad somebody took on this bug.

> // clean email address from < and >
> var cleanedAddress = RegExp(/^<(.*)>$/).exec(emailAddress);

uhm. I don't know why you do that, but that's not the right way. Probably you have the email address properly somewhere, e.g. from source or as mailto: URL.

> var needABrowser = /(^http(s)?:|^ftp:|^file:|^gopher:|^chrome:|^resource:|^about:)/i;

Again, uhm. I don't even know what this is supposed to do (looking for links typically opened in a webbrowser?), but you should not hardcode such a list.

>    extProtService = extProtService.QueryInterface(Components.interfaces.nsIExternalProtocolService);
...
> if (isPhishingURL(ceParams.linkNode, false, href))

I can't imagine that Thunderbird doesn't have a central function to open links (in a browser).

> fillHeaderFieldContextMenu()
> <popup id="headerFieldContext"

The context menu looks like copy&paste of a 200 lines of code that is likely to change (adding/removing of menu items/commands for links). No idea whether there's a way to share it.

+            var strNewHTMLHeader = "<div xmlns='http://www.w3.org/1999/xhtml'>" +
+                                   htmlHeaderValue +
+                                   "</div>";

Wait, are you inserting the generated HTML from the remote source into the chrome (system-privilege) XUL? This is probably a security bug.

It most likely emits problematic links, though. I don't know what happens when you click on them.

I don't know whether mozTXTToHTML emits any other HTML that could be problematic, but even if it doesn't right now, it may in the future. It's assumed to work on untrusted content and emit untrusted content, that callers run the generated HTML with webpage privileges, not system privs.

Unfortunately, I don't know a solution offhand. You probably don't want to create an iframe (<browser type="content" />, specifically) for each header.

> (all, esp. createInstance and co)

The code looks a bit inefficient, BTW. Consider that you run through this several times per message opened.

r-
(In reply to comment #32)
> I can't imagine that Thunderbird doesn't have a central function to open links
> (in a browser).

We badly need such a central function on the toolkit side, see also bug 349309 comment #9
> Wait, are you inserting the generated HTML from the remote source into the
> chrome (system-privilege) XUL? This is probably a security bug.

It'd be extremely helpful if you could state _how_.
Just throwing in "probably a security bug" is not helpful.

> It most likely emits problematic links, though. I don't know what happens when
> you click on them.

We probably should filter out data: URIs? What else?
> It'd be extremely helpful if you could state _how_.

You mean how it is a security bug? I said so: You run untrusted content with system privileges. I don't need to supply a concrete exploit to prove it.
Or you mean how to do it better? I don't know, sorry. If I knew, I would have said so.

> Just throwing in "probably a security bug" is not helpful.

Well, it is. It means that I catched what is probably a critical security bug, before it got in.

> We probably should filter out data: URIs? What else?

I don't think you can write browser security by simple filters, that's far too dangerous. You can't run untrusted source with system-privileges.
Let me think about this.
If you can find a way how to run this as untrusted content, that would be great.
So from what I've gathered from IRC and here, headers are getting escaped before linkifying, right?
If that's true, it sounds to me like the only part that can introduce real HTML here is mozTXTToHTML, which means that part is the untrusted component here, not really the remote site, right?
Well, depends on how you see it. Yes, in the sense that the original is plaintext and thus cannot be a threat (short of buffer overflows and co). The converter operates over untrusted data, and emits most or all of that data into the result, by definition and requirements. Furthermore, it assumes that the caller threats the result as untrusted (as webpage or email body), which is true for all current callers (as far as I know), and is also a wise choice.

It's not the implementation of mozTXTToHTML that is insecure. Any code that does the same thing would probably be just as dangerous. It's the fact that it takes untrusted data and makes HTML out of it, which is potentially active code.

For example - I don't know whether that is real, but shows the problem - let's say the source has
<http://bla/bla" onmouseover="alert(Components.interfaces);" foo=">.
It may well be that everything in <> is recognized as URL, and the resulting HTML is <a href="http://bla/bla" onmouseover="alert(Components.interfaces);" foo="">...</a>. If you mouse over this, the alert runs with chrome priviledes in your code.

Now, you can say that mozTXTToHTML protect against that. In this case, we could escape any "<>& in attribute values. But it's really to protect against everything, and security is not the job of the class, it's just link etc. recognition, and it's the job of Gecko to sandbox the result.

This is the problem here.
OK, then we need to either ensure that we catch all of the possible "let's say" cases or just use some poor man's self-spinned linkification code that avoids all possible cases of malicious linkage. Making all headers appear in untrusted <browser> areas is not worth it, I guess, so that's no option.

Oh, BTW, if text-mode email display uses the same code, then your example provided above doesn't work out to be malicous as my bugmail about your comment shows exactly the same linkification as the HTML version on this page.
> catch all of the possible "let's say"

Not feasible.

> just use some poor man's self-spinned linkification code

Well, see above, the problem is not the implementation, but the concept.

You could, however, use the link recognition code from mozTXTToHTML. It doesn't generate HTML, just tells you where URLs start and end (already non-trivial). There's a special function for that, and the spellchecker uses it to skip links. You would insert DOM nodes yourself, avoiding escaping problems.

Then you still have the problem what happens when a user clicks on it. You should not use an html:a, due to the risk that the URL runs as chrome. Just create your own tag, put the URL as an attribute on it (and also as content), and make clicks and content menu use that.
Then it's still important how you load it. Only load it in the external browser, never load a URL in Thunderbird, not even a mail URL.

That's all I can think of right now.
Of cource we all take the security problem very serious.

So lets see what could happen and what we could do to prevent it.


As fas as I see the problem is, could someone cause mozTXTToHTML to produce harmful code:

a) Code which does harm by just displaying it

b) Code which does harm by clicking on a link


a) As Robert said headers are getting escaped before linkifying so the only problem should be the HTML code produced by mozTXTToHTML. As far as I see mozTXTToHTML finds a URL and produces: <a href="url">url</a>, finds an email address and produces <a href="mailto:address">address</a>.

For emoticons and structs the produced HTML is fixed (<span class="moz-smiley-s1">:-)</span>) or independent of the input (<i>italic</i>).


So is there a possibility to create an URL/email address with doesn't result in:

<a href="url">url</a> or <a href="mailto:address">address</a>

which e.g. results in <a href="x" onmouseover="...">url</a>

for the URL = 'x" onmouseover="alert(Components.interfaces);'


Could we, could mozTXTToHTML prevent this case of code injection?! If not every new build linkify code would be as dangerous as mozTXTToHTML.



b) About the clicking problem. To prevent the default html:a clicking routine a special click handling is needed. Therefore the click handling should not be done by a default click-on-a-link function like (contentAreaClick). It is important to controll every step made when clicking on a link.

Also my actual implementation of the clicking routine needs some tuning here. I would suggest to open a new compose window for href="mail:..." and for everything else (href="url") open a browser window with this URL (Exception: URLs which are a known phishing attack). The browser should handle the URL like it does with every URL opened through a link.


And the event handling has to stop after doing so. I think this should do.



> I don't know whether mozTXTToHTML emits any other HTML that could be
> problematic, but even if it doesn't right now, it may in the future.

I guess that future functionallity as today has to be activate by special flags (kURLs, kGlyphSubstitution, kStructPhrase). As today mozTXTToHTML shouldn't do more than the user wants it to do. So future functionallity is possible without affecting existing usage.
> Could we, could mozTXTToHTML prevent this case of code injection?!

I think I covered that above. I'd say don't let it generate HTML, but just let it find the URLs and you insert (into the text) your own element or widget to handle the links.

- Don't use html:a. You may think you override the click action, but then you miss a case, and you're toast.

> To open a new compose window for href="mail:..." [mailto, actually]

Only if you can use the same code path that would be invoked by the browser for mailto: links

> Also my actual implementation of the clicking routine needs some tuning here.

The code needs a lot of other changes apart from the security problem, yes, some of them pointed out above.

Handing all URLs - apart from those specially handled - to the browser would get rid of the hardcoded browser URL scheme list, yes, that's what I'd have done as well.

> future of mozTXTToHTML

That's a different topic, but you cannot assume anything about it other than that it will make useful HTML out of plaintext, but not that it can run with system privileges.
Comment on attachment 267901 [details] [diff] [review]
patch

So, we have to "attack vectors" here:
(a) mozTXTToHTML producing markup being executed automatically 
mozTXTToHTML's URL recognition generates links in the form of
  <a href="URL">Description</a>
To compromise that, you'd have to create a URL string which "escapes" from the "" embracement. This is usually done by planting " characters or 0x00 (etc.) bytes into an unaware conversion routine - but mozTXTToHTML *is* aware of that! Saying "mozTXTToHTML is secure now but might not be in future" while it hasn't changed fundamentally for quite some time is a weird argument, though. "Don't use that program, it might be insecure in future"?

If this would be a real issue, we could demand the implementation to be "safe" and say "unsafe changes need to go into a derivation". (And, btw, the only 'real' consumers of this HTML generation are mailnews and gopher...)

Anyway, the URL recognition is the only part where not exactly known content could creep in (in theory, see above), so doing glyphs etc. should not be the problem. We could find URLs in a second parse step if requested and create (and style) a custom element for that to avoid the html:a default processing. That's in fact relatively easy to do.

(b) clicks on mozTXTToHTML-generated <a> links
This is indeed a major flaw in the current implemention (and that's why I actually like our at-least-6-eyes-have-seen-the-code policy ;-) ).
If I put the following URL into a header, eg. subject:
  javascript:alert(msgHeaderParserContractID)
it will print the content of an *internal* variable when clicked upon in the header. This is of course a security leak (thus revoking r+). 

Given that we actually do have control about how we react upon links, I think we probably should use no automatic processing, but a whitelist: process known mail urls internally, pass other known protocol schemes to a browser content window, drop anything else we don't like or know about.
Attachment #267901 - Flags: review+ → review-
> Saying "mozTXTToHTML is secure now but might not be in future" while it hasn't
> changed fundamentally for quite some time is a weird argument, though. "Don't
> use that program, it might be insecure in future"?

The argument is: mozTXTToHTMLConv emits untrusted HTML, by design, and that's a sane design. Just because we don't see a problem now (the fact that the URL stops at " is more a coincidence than design!) doesn't mean there is none nor will be. As the author of mozTXTToHTMLConv, I cannot and *will not* guarantee that the emited HTML will not contain anything problematic from the source that it's generated from. mozTXTToHTMLConv is not a security component!

To paraphrase you: Yes, you may be able to use your pencil as screwdriver, but don't assume that it's a good idea or will not break things. You use the completely wrong tool.

Again, this whole argument is void, if you use mozTXTToHTMLConv.findURLInPlaintext() to *tell* you where the links are, and then insert nodes yourself via DOM, and handle them carefully (not <html:a>!). This implies not getting smileys in headers, but hey, that's not critical, is it?

> process known
> mail urls internally, pass other known protocol schemes to a browser content
> window, drop anything else we don't like or know about.

Don't hardcode browser URLs. See other review comments above. Handing URLs to the browser should be safe, the browser needs to take care of its own security. So, handle some schemes specifically (and carefully!), drop the phishing, hand the rest to the browser.
You did not reply to some points in my comment #44, so I have to read between the lines again. Feel free to correct me.

> The argument is: mozTXTToHTMLConv emits untrusted HTML, by design, and
> that's a sane design.

Disagreed.
Having an interface that is *only* available on the chrome side, but *must* *not* be used for the chrome side is of pretty limited use.
Thus, it's not surprising that mailnews is the only "serious" consumer...

> Just because we don't see a problem now (the fact that the URL
> stops at " is more a coincidence than design!) doesn't mean there is none

That'd be a bug, then...

> nor will be.

Still, this is no useful argument.
We *could* declare this interface to be security relevant and *require* that no evil stuff will be generated. (This may need an in-depth look at the code before, but nonetheless.) 
I see _no_ justification why creating untrusted content only should be a "sane" limitation! (We could have a flag for "allow untrusted content creation", but that would probably be a nightmare to implement cleanly.)

> As the author of mozTXTToHTMLConv, I cannot and *will not* guarantee
> that the emited HTML will not contain anything problematic from the source
> that it's generated from.

Well, fine be me. Noone said you have to be responsible for anything that happens after you. But you are neither module owner of netwerk or peer or something, last I looked. :-P

> mozTXTToHTMLConv is not a security component!

Well, if it creates markup, it'd better be.

> To paraphrase you: Yes, you may be able to use your pencil as screwdriver, but
> don't assume that it's a good idea or will not break things. You use the
> completely wrong tool.

As almost always, this comparison is not quite right.
We're trying to use a screwdriver to work with electrical installments, so it'd better be a circuit tester as well. And using a circuit tester as a screwdriver would do no harm...

> Again, this whole argument is void, if you use
> mozTXTToHTMLConv.findURLInPlaintext() to *tell* you where the links are, and
> then insert nodes yourself via DOM, and handle them carefully (not <html:a>!).

Sure, for links/urls.

> This implies not getting smileys in headers,

Why? Smilies, glyphs and structs don't even _have_ arbitrary content, their "content" to be converted is completely known! Denying even those is only consequent in the light of rendering mozTXTToHTMLConv a waste of time.

Maybe we need something like mozTXTToXULConv. ;-)

> but hey, that's not critical, is it?

Having smileys is as critical as having links in headers: not at all.
We're living without it (in core at least) for almost 7.5 years.

> > process known
> > mail urls internally, pass other known protocol schemes to a browser content
> > window, drop anything else we don't like or know about.
> 
> Don't hardcode browser URLs.

I didn't say that. I wrote "known protocol schemes" by purpose.
I do not know [almost] anything about XUL, so may be it's completely off - but judging from the above discussion it would seem that the right solution would be to mark the subject field as an "insecure"/"tainted" part of the document. I would imagine that this is already done for the message body - why can't it be done for the subject as well? Once the subject is marked as "untrusted", it would no longer be a problem to use whatever mechanism for converting it to HTML... 
OTOH, it might be appealing to "attack" bug 9942 along with this one and make the messagepane be XUL+HTML and put the header stuff into as well. Then using mozTXTToHTMLConv would be a non-issue. Could get a bit complex now, though, to keep stuff like infobars etc. running with (almost) no chrome access.
(I didn't actually try that yet.)
Karsten, we can discuss about security and design forever, but fact is that the component does not make the guarantees that you expect from it, and is thus not usable in the way you use it.

I did offer you another API, from the same component, that probably would work. I don't see why you refuse to just use that.

Aleksey,
it's not easy to mark something as untrusted. You can only do so for completely new documents in kind-of-iframes, which is a very heavy solution.

> Having smileys is as critical as having links in headers: not at all.

I would say that links in email bodies are critical and in headers useful sometimes. Smileys were always just cute, only useful for moms.

> it might be appealing to "attack" bug 9942

Oh, yes!

Note that the bug just asks headers to scroll with body, not that the headers are untrusted. My patches try *very* *very* hard to keep headers trusted. If they are untrusted, fixing the bug would be trivial.

It's a completely different approach. Having the header as non-chrome has other implications (you can't do certain things anymore that need chrome right), but would of course solve the problem here completely.
Comment on attachment 267901 [details] [diff] [review]
patch

>+      <xul:description class="headerValueHTML headerValue plain"
>+                       anonid="headerValueNode" flex="1"
>+                       onclick="return headerValueOnClick(event);"
>+                       onmouseup="this.focus();"
>+                       context="headerFieldContext"/>
While you lot argue about the implementation, a few coding questions ;-)
What's the difference between headerValueOnClick and messagePaneOnClick?
What's the onmouseup handler for?
What's the difference between headerFieldContext and messagePaneContext?

>+.headerValueHTML html|a {
>+  cursor: pointer;
>+  color: #424F63;
>+  text-decoration: underline;
>+}
Shouldn't this be blue too?
Attachment #267901 - Flags: superreview?(neil)
(In reply to comment #50)
> (From update of attachment 267901 [details] [diff] [review])

> 
> >+.headerValueHTML html|a {
> >+  cursor: pointer;
> >+  color: #424F63;
This should *NEVER* be hardcoded. There's bound to be theme out there with which this won't be visible.

Another invisible piece of UI. AAaaarrgh..
Michal:
The definition you are talking about is in the Modern theme, which uses hardcoded colors all over the place. It's completely OK there.

I wonder a bit about the definitions for Classic we have in the patch - are those system-color-change safe?
(In reply to comment #50)
> (From update of attachment 267901 [details] [diff] [review])

> What's the difference between headerValueOnClick and messagePaneOnClick?

messagePaneOnClick only exists in Seamonkey MailNews not in Thunderbird. headerValueOnClick only handles left clicks and has I wrote above it would be better to change headerValueOnClick even more to just open a new compose window for email addresses and a browser for everything else and not try to handle protocolls by itself like messagePaneOnClick does.


> What's the onmouseup handler for?

The problem is, if you click on a html:a element it takes the focus. But we need the surrounding description to get the focus so that in case of a select all command (cmd_selectAll) the description's controller is called to execute the command. Otherwise if the html:a element has the focus a default controller is used and instead of the header content all threads are selected.


> What's the difference between headerFieldContext and messagePaneContext?

The messagePaneContext offers a huge bunch of possible menuitems but here we only need very few. So you could take the messagePaneContext and write a function to hide all the unneeded stuff or use a new small context menu with the needed stuff. In this case here I don't think it would be good to reuse code if you have to write lots of code to don't use alot of the reused code.


> > >+.headerValueHTML html|a {
> > >+  cursor: pointer;
> > >+  color: #424F63;

I used this to make the links look like the email daddresses in the header:

.emailDisplayButton {
  cursor: pointer;
  color: #424F63;
  text-decoration: underline;
  margin: 0;
}

(messageHeader.css)
Depends on: 388674
No longer depends on: 388674
QA Contact: search
QA Contact: search → message-display
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: