Closed
Bug 108654
Opened 20 years ago
Closed 17 years ago
[mozTXTToHTMLConv] Message-ID interpreted as mail address
Categories
(Core :: Networking, defect)
RESOLVED
WONTFIX
People
(Reporter: 3.14, Assigned: BenB)
Details
Attachments
(1 file, 1 obsolete file)
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.5) Gecko/20011012

It is common practice to cite message-ids in news articles, e.g., <3BE7BB7F.1070204@logic.univie.ac.at>. Mozilla turns it into a mailto: link, which is a very bad guess, so it is not possible to follow the reference. Hence there must be either a very good heuristic to tell the two apart or a user dialog, which must in any case be available (through the right mouse button).

pi
Comment 1•20 years ago
There is no realistic way for the link parser to know whether <3BE7BB7F.1070204@logic.univie.ac.at> is a mailto: address or a message-id, because they look exactly the same. Besides, you usually use full news:// URLs (like <news://news.mozilla.org/3BE7BB7F.1070204@logic.univie.ac.at>) when citing messages in newsgroups, and that's the way it should be. Suggesting wontfix.
| Reporter | ||
Comment 2•20 years ago
> There is no realistic way for the link parser to know whether
> <3BE7BB7F.1070204@logic.univie.ac.at> is a mailto: address or message-id
> because they look the exact same.

They are *formally* the same. But when you look at it, your guess will be right in almost every case. And if you think it is not possible to have a heuristic (as I suggested), you are wrong: Forte Agent has had this for many years.

> Besides, you usually use full news:// urls (like
> <news://news.mozilla.org/3BE7BB7F.1070204@logic.univie.ac.at>)

No. This has been common practice on Usenet for much longer than URLs have been in use. Usenet is not the non-clickable part of the web. And nobody (with an understanding of Usenet) will add the server for normal messages, because you usually need the right to access the server.

> when citing messages in newsgroups, and that's the way it should be.

Do we want a newsreader to work for the people, or the people to adjust to broken software? In the latter case, cancel this project and let people use Outbreak Excess.

> Suggesting wontfix.

You ignored my suggestion to allow the user to decide.

pi
| Assignee | ||
Comment 4•20 years ago
-> me
Assignee: sspitzer → mozilla
Component: Networking - News → Networking
OS: Linux → All
Product: MailNews → Browser
Hardware: PC → All
| Assignee | ||
Comment 5•20 years ago
Note that this "caveat" is well-documented (<http://www.bucksch.com/1/projects/mozilla/16507/>, "Known failures").

> Hence there must be either a very good heuristic to find out

Impossible, to my knowledge. Message-IDs are syntactically identical to email addresses.

> a user dialog, which must in any case be available (thru the right mouse
> button).

Huh? Right mouse button in a dialog?

Anyway, I don't see how that could be achieved technically. The converter converts to generic HTML, and could be used for saved documents or something. So:
- I don't know how to code that at all.
- Even if we did, it would probably depend on Mozilla as the renderer, which might not be desirable for saved docs.
Status: UNCONFIRMED → NEW
Ever confirmed: true
| Reporter | ||
Comment 6•20 years ago
>> Hence there must be either a very good heuristic to find out
>
> Impossible, to my knowledge. Message-IDs are syntactically identical to email
> addresses.

Yes, but you can make an educated guess; Forte Agent is pretty good at it. For example, e-mail addresses tend to be much shorter (in the local part) than message-ids. $ symbols never show up in e-mail addresses (I am not sure whether they would be legal). A long sequence of numbers and consonants is unlikely in an e-mail address. Just a few ideas. Of course, this is not 100% safe.

>> a user dialog, which must in any case be available (thru the right mouse
>> button).
>
> huh? right mouse button in a dialog?

What I had in mind: right-click on the link so that the context menu comes up. It should offer something like "send e-mail to that address" and "get that message-id".

> anyways, I don't see, how that could be achieved, technically.
> The converter converts to generic HTML, and could be used for saved documents
> or something. So,
> - I don't know, how to code that at all.
> - Even if we do, it would probably depend on Mozilla as renderer, which might
> not be desirable for saved docs.

That is a problem. But as long as the messages are displayed, we have access. So the heuristic approach would again be useful for the rest.

pi
| Assignee | ||
Comment 7•20 years ago
> you can make an educated guess

An educated guess is OK, but can we make one? (An educated guess is, for me, one that either strictly follows the spec or is right in 99.9%+ of the cases.)

> For example, e-mail addresses tend to be much shorter (in the local part)

Uh. There are very long email addresses. Just today, I had something like <cornelia.inschweiler-povierski@bigbank.com>.

> $ symbols never show up in e-mail (I am not positive if it would be legal)

They are legal. RFC822: Without quoting (where everything is allowed), the local part must be an atom, which consists of any char except specials, space and ctls. ctls are ASCII <= 31 and 127. specials are
"(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / "\" / <"> / "." / "[" / "]"
(but "." is allowed by special exception).

> A long sequence of numbers and consonants are unlikely in an e-mail address.

Uh, there are indeed valid email addresses of that form, e.g. generated ones.

> Of course, this is not 100% safe.

It's too unsafe for my taste. I think that it would confuse users more than it does today.
Comment 8•20 years ago
To summarize, like I said in my first comment, unless your computer happens to have a human brain attached to it, there is no realistic chance we can change this behaviour and make an "educated guess" without breaking more functionality than we'd add; it would add more pain than gain. Ben, wontfix?
| Assignee | ||
Comment 9•20 years ago
Håkan, yes, I'm leaning towards wontfix, but I'd like to first give Boris a chance to figure out some valid options for fixing this bug.
| Reporter | ||
Comment 10•20 years ago
>> you can make an educated guess
>
> An educated guess is OK, but can we make one? (An educated guess is for me
> one that either strictly follows the spec or is right in 99.9%+ of the
> cases.)

Well, now we are 100% wrong with message-ids.

>> For example, e-mail addresses tend to be much shorter (in the local part)
>
> Uh. There are very long email addresses. Just today, I had something like
> <cornelia.inschweiler-povierski@bigbank.com>.

Right, it is just *a* hint.

>> $ symbols never show up in e-mail (I am not positive if it would be legal)
>
> They are legal.

OK, but I have never seen them in practice. So this gives at least the 99.9% reliability you asked for. I also did a test (with Mozilla): I sent e-mail to $@piology.org, which actually arrived, but the $ changed to "$". The same happened for make$fast@piology.org, which became "make$fast"@piology.org.

>> A long sequence of numbers and consonants are unlikely in an e-mail address.
>
> Uh, there are indeed valid email addresses of that form, e.g. generated ones.

Yes, they are legal. Any message-id would be legal. But this is a question of making the best guess.

>> Of course, this is not 100% safe.
>
> It's too unsafe for my taste. I think that it would confuse users more than
> it does today.

If you think so, that is fine with me; for that case I offered the interactive solution. There must be a nice way of accessing a message-id. So let the user say "I want to access this as a message-id" (using the right-click menu). Not fixing this would be like writing a web browser which only displays links but does not let you click them: you have to copy the link and paste it into the location field, where you also have to modify the address first.

pi
| Assignee | ||
Comment 11•20 years ago
> Well, now we are 100% wrong with message-ids.

The risk is that we would be turning email addresses into msg-ids, which would be extremely confusing for users.

> I offered the interactive solution

Yes, but while that might be a nice UI, it is really hard to implement in the converter.

Please note that the linked msg-ids are not very useful anyway, because we can hardly do anything useful with them. In the best case, we can show a msg that happens to lie in the same mail folder or on the same news server. This could change (there is a bug about it), but it is unlikely to happen in the mid-term future.
| Assignee | ||
Comment 12•20 years ago
I forgot one thing: The $-rule looks OK, I think. Can you come up with more ones? (which don't bear the risk to consider a real email address a msg-id.) Note that someone still has to implement the stuff. I probably won't do it myself.
| Reporter | ||
Comment 13•20 years ago
>> I offered the interactive solution
>
> Yes, but while that might be a nice UI, it is really hard to
> implement in the converter.

You right-click on a mailto link and choose "get this message-id". What would Mozilla do? Just replace mailto: with news://server/, where server is the server you are using right now or else the default news server, and then open this URL. Doing that manually is quite annoying. Having the program do it is easy.

> Please note that the linked msg-ids are not very useful anyway,
> because we can hardly do anything useful with them.

NACK. Someone tells me that the answer is in the message with that id. So I get it and have the answer. Very useful. Happens all the time.

> In the best case, we can show a msg that happens to lie in the
> same mail folder or on the same news server.

That's good enough for practical purposes. Or you could (as an option) also allow trying to get it from groups.google.com; Gnus has this option.

A nice argument for news *instead* of mailto right away:
http://www.cbl.ncsu.edu/DiscussionGroups/MHonArc/MHonArc-1998-08-15/msg00024.html

> I forgot one thing: The $-rule looks OK, I think. Can you come up
> with more ones? (which don't bear the risk to consider a real
> email address a msg-id.)

Some readers add their name, like <pine.somestring@somehost>. I couldn't find a list, but I will keep searching. Forte Agent uses <identifier@4ax.com> for message-ids, and there are no e-mail addresses at this domain; I am not aware of another program using this domain approach. Also, if the domain name has only one part, it is a (broken) message-id and not a mail address, e.g., <something@localhost>, <somethingelse@myhost>. OTOH, everything with a short local part (we would have to find a good limit here) is an e-mail address.

pi
| Assignee | ||
Comment 14•20 years ago
> > same mail folder or on the same news server.
> That's good enough for practical purposes.
[...]
> A nice argument for news *instead* of mailto right away:

I think you are talking mostly about news. However, the vast majority of users never use news at all. So, correct recognition of email addresses has, for me, absolute priority. As for attribution lines like
"In your message <a6758ghd74.8123456@foo.org>, you write:",
I'd argue that they are broken and should read <mid:a6758ghd74.8123456@foo.org>.

> Some readers add their name, like <pine.somestring@somehost>.
> I couldn't find a list, but I keep searching. Forte Agent uses
> <identifier@4ax.com> for message-ids, no e-mails at this domain;
> I am not aware of another program using this domain approach.

Very interesting. This is something safe and implementable.

> Also, if the domain name is only one part, that is a (broken) message-id
> and not a mail address, e.g., <something@localhost>, <somethingelse@myhost>.

Apart from the fact that your assertion is completely wrong (<ben@myserver> is indeed an email address; it is just only meaningful in the local network), it is also unsafe: compare "let's meet@9am".
| Reporter | ||
Comment 15•20 years ago
> As for the attribution lines like
> "In your message <a6758ghd74.8123456@foo.org>, you write:",
> I'd argue that they are broken

Anyway, this happens a lot.

>> Also, if the domain name is only one part, that is a (broken) message-id
>> and not a mail address, e.g., <something@localhost>, <somethingelse@myhost>.
>
> Apart from the fact that your assertion is completely wrong (<ben@myserver>
> is indeed an email address, just that it is only meaningful in the local
> network),

Yes, so this is not really used outside.

> it is also unsafe, compare "let's meet@9am".

Making this mailto isn't any better.

pi
| Assignee | ||
Comment 16•20 years ago
> it is also unsafe, compare "let's meet@9am".
> Making this mailto isn't any better.
We do nothing at all.
| Reporter | ||
Comment 17•20 years ago
>> it is also unsafe, compare "let's meet@9am".
>> Making this mailto isn't any better.
>
> We do nothing at all.

I overlooked earlier that it cannot be a message-id anyway. The angle brackets are part of the message-id, hence the above is no problem.

pi
| Reporter | ||
Comment 18•20 years ago
> I forgot one thing: The $-rule looks OK, I think.
> Can you come up with more ones?

I started a discussion at <3BE915AF.4060708@logic.univie.ac.at>. There are already excellent answers. Even if you don't speak German, you will understand the key points. Google is a few hours behind, but you will be able to find the thread there later:
http://groups.google.com/groups?threadm=3BE915AF.4060708%40logic.univie.ac.at

Next week I will post a summary here.

pi
Comment 19•20 years ago
Sorry if I'm being ignorant but I don't really see the point or need of this... It's much better to incorrectly guess at a *@* address being mailto: than guessing it's a message-id; because mailto: addresses are so much more common. Doing guesses on valid characters, depending on how long an email address is and so on is not good enough for me... So summing up usefulness and reliability I still think this should be WONTFIX.
| Reporter | ||
Comment 20•20 years ago
Sorry, it took longer than expected and there is still no end in sight. But in the meantime there is an alpha version of a Perl script (it should be easy to understand) which guesses pretty well:
http://piology.org/perl/id-or-mail.pl.html

pi
| Reporter | ||
Comment 21•20 years ago
To add to my last comment: I have developed the script further. On a large test base (more than 4 million message-ids and more than 0.5 million e-mail addresses), the error is below 1% for message-ids and below 0.33% for e-mail addresses. Certainly, this could be improved further. So far it is at least a proof of concept.

pi
| Assignee | ||
Comment 22•20 years ago
...for archival purposes.
| Reporter | ||
Comment 23•20 years ago
I think the actual script is more useful than the web page wrapped around it.
This obsoletes attachment 63749 [details], but I cannot mark it as such.
pi
| Assignee | ||
Updated•20 years ago
Attachment #63749 -
Attachment description: CUrrent version of Boris Perl script (copied from Website) → Current version of Boris' Perl script (copied from Website)
Attachment #63749 -
Attachment is obsolete: true
Comment 24•20 years ago
Not even news:3BE7BB7F.1070204@logic.univie.ac.at will work. It will appear to be a link and it will show a link in the status bar on mouseover, but it will NOT spawn a mail compose window when you click on it. Actually, *nothing* happens when you click it. NC4.78 and Outlook will load the news message when you click that link (if it is valid). Is that another bug?
Yup, those are covered by bug 108877 and bug 89939.
Comment 26•19 years ago
I programmed a little XPI add-on for Mozilla which helps to find the message for a given message-id. It adds a menu item to the message pane context menu, so you just have to right-click on the message-id and then choose the appropriate news server in the context menu. You can find the XPI add-on and some explanations at:
http://messageidfinder.mozdev.org/

Markus

PS: Waiting for feedback
| Assignee | ||
Comment 27•19 years ago
Markus Hossner, this is the wrong bug. You probably want bug 37653.
Summary: Message-ID interpreted as mail address → URL: Message-ID interpreted as mail address
| Reporter | ||
Comment 28•19 years ago
Ben, you just changed the summary. Strictly speaking, this is not a URL, this bug is about plain message-ids. news:message-id is bug 108877. pi
| Assignee | ||
Comment 29•17 years ago
I see this question spawned a long thread and Boris' Perl script, but one short look at the script tells me that we probably can't implement it well in Mozilla. There's no way anything like the script could go into Mozilla. It should be 1 or 2 rules (otherwise code bloat), which are correct about 99.9% of the time (otherwise users get *very* confused about why it sometimes says mailto and sometimes msgid). Given the script, it doesn't look like there is such a rule, so I'm closing this as WONTFIX.

Note: We still don't have proper support for actually *using* msg-ids, I think.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → WONTFIX
Summary: URL: Message-ID interpreted as mail address → [mozHTMLToTXTConv] Message-ID interpreted as mail address
| Assignee | ||
Comment 30•17 years ago
Note: If there is indeed a safe rule, please specify it readably, not in Perl code, because Perl looks to me like a bunch of strange characters thrown in a mixer.
| Assignee | ||
Comment 31•17 years ago
Oh, and Boris, sorry for first asking and then not using the result, but this script is so many miles away from what I could or would implement in C++ (even if I wanted to fix this bug). It's not even the same country. I blame myself for asking "Can you come up with more [rules]?" in comment 12, but I also said "that someone still has to implement the stuff. I probably won't do it myself.".
| Assignee | ||
Updated•17 years ago
Summary: [mozHTMLToTXTConv] Message-ID interpreted as mail address → [mozTXTToHTMLConv] Message-ID interpreted as mail address
| Reporter | ||
Comment 32•17 years ago
Ben, part of the discussion was to offer both in a context menu so that people can make the decision themselves. You are right that Mozilla cannot deal with message-ids at all (this includes news and nntp URLs). This is embarrassing. So even if Mozilla cannot decide, it can offer both.

pi
| Assignee | ||
Comment 33•17 years ago
Boris, as I said several times, there is no way for the converter to offer UI (not even iframes with DOM and JS would work, because JS is likely to be disabled). The TXT->HTML conversion is implemented in the backend (in the Gecko network library), and it's normal HTML in the Mailnews frontend.
| Reporter | ||
Comment 34•17 years ago
Well, there is a context menu which works somehow. It should be possible to launch a message-id (once we can do that at all), we can already launch it as mailto. pi