Closed
Bug 19251
Opened 26 years ago
Closed 26 years ago
improve way to recognize URLs in messages
Categories
(MailNews Core :: MIME, enhancement, P1)
Tracking
(Not tracked)
VERIFIED
FIXED
M12
People
(Reporter: warrensomebody, Assigned: BenB)
References
(Blocks 1 open bug)
Details
Attachments
(6 files)
53.59 KB, text/plain
35.12 KB, patch
5.17 KB, patch
5.07 KB, patch
10.91 KB, patch
26.90 KB, patch
The current mechanism for recognizing URLs in mail messages is hard coded to
only look for a few URL schemes. We need to make this extensible so that URLs
associated with all protocol plugins are recognized. For instance, jar: URLs
aren't recognized right now.
To do this, I think all we have to do is first detect something that looks like
a protocol scheme (e.g. "foo:") and then take the text up to the next
whitespace character and hand it to nsIOService::NewURI. If this successfully
constructs a URL, then we know that the protocol scheme does correspond to an
installed protocol plugin, and that the URL should be converted into an actual
link in the text.
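A minimal sketch of the proposal above, written in Python rather than Mozilla C++ so that it stands alone. `KNOWN_SCHEMES` and `new_uri` are hypothetical stand-ins for the installed protocol handlers and for nsIOService::NewURI; the scheme pattern is an assumption, not taken from the Mozilla source. Trailing punctuation is deliberately not handled here.

```python
import re
from typing import List, Optional

# Hypothetical stand-in for the set of installed protocol handlers.
KNOWN_SCHEMES = {"http", "https", "ftp", "news", "mailto", "jar"}

# Something that looks like "foo:" followed by text up to the next whitespace.
SCHEME_CANDIDATE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:\S+")


def new_uri(spec: str) -> Optional[str]:
    """Hypothetical stand-in for nsIOService::NewURI.

    Returns the spec if a handler is registered for its scheme, else None.
    """
    scheme = spec.split(":", 1)[0].lower()
    return spec if scheme in KNOWN_SCHEMES else None


def find_links(text: str) -> List[str]:
    """Return every candidate that the (stand-in) URI constructor accepts."""
    links = []
    for match in SCHEME_CANDIDATE.finditer(text):
        uri = new_uri(match.group(0))
        if uri is not None:
            links.append(uri)
    return links


print(find_links("see jar:file:///a.jar!/x and foo:bar now"))
```

Because acceptance is delegated to the URI constructor, adding a protocol handler would extend recognition without touching the scanner, which is the extensibility asked for here.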
Updated•26 years ago
|
Assignee: phil → rhp
Comment 1•26 years ago
|
||
Reassign to rhp
Comment 2•26 years ago
|
||
Ben had been working on this code (actually working on a rewrite of some of
these routines) so he would be the person to look at this.
- rhp
Comment 3•26 years ago
|
||
That's:
Ben Bucksch
http://www.bucksch.org
| Assignee | ||
Updated•26 years ago
|
Assignee: rhp → mozilla
Status: ASSIGNED → NEW
| Assignee | ||
Updated•26 years ago
|
Status: NEW → ASSIGNED
| Assignee | ||
Comment 4•26 years ago
|
||
Accepting
| Assignee | ||
Updated•26 years ago
|
Severity: normal → enhancement
Component: Front End → MIME
OS: other → All
Priority: P3 → P1
Target Milestone: M19 → M12
| Assignee | ||
Comment 5•26 years ago
|
||
The most basic recognition functionality seems to work. Need to do more testing.
Not working: mailto, abbreviated URLs.
All the other functions in the class (ParseURL etc.) still don't use Necko and
need to be rewritten (by me).
| Assignee | ||
Comment 6•26 years ago
|
||
Some description of my code:
It works mode-based: modes are tested in sequence (defined by a const) and the
first successful one wins.
Modes are the following (copied from source code):
RFC1738,  /* Check, if RFC1738, APPENDIX compliant,
             like <URL:http://www.mozilla.org>. */
RFC2396,  /* RFC2396, APPENDIX E allows angle brackets (like
             <http://www.mozilla.org>) or quotation marks
             (like "http://www.mozilla.org") (w/o "URL:"). */
freetext  /* assume heading scheme
             with "[a-zA-Z0-9]*:" like "news:".
             Certain characters (see code) or any whitespace
             (including linebreaks) end the URL.
             Certain other (punctuation) characters (see code)
             at the end are stripped off.
          */
/* RFC1738 and RFC2396 type URLs may use multiple lines,
   whitespace is stripped. Special characters like ',' stay intact. */
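The mode sequence described in this comment can be illustrated with a rough sketch, assuming Python and regular expressions in place of the actual FindURL code. The patterns, the `TRAILING_PUNCTUATION` set, and the function names are all assumptions made for illustration; the real end characters are in the source ("see code"). For simplicity each mode searches the whole text, so an earlier mode wins even if a later mode's match appears first in the text.

```python
import re
from typing import Callable, Optional, Tuple

SCHEME = r"[A-Za-z][A-Za-z0-9+.-]*:"  # assumed scheme syntax

RFC1738 = re.compile(r"<URL:([^<>]+)>")
RFC2396 = re.compile(r"<(" + SCHEME + r"[^<>]+)>|\"(" + SCHEME + r"[^\"]+)\"")
FREETEXT = re.compile(SCHEME + r"[^\s<>\"]+")
TRAILING_PUNCTUATION = ".,;!?)"  # hypothetical set of stripped characters


def delimited(pattern: "re.Pattern[str]") -> Callable[[str], Optional[str]]:
    """Build a matcher for bracketed or quoted URLs; whitespace is stripped."""
    def match(text: str) -> Optional[str]:
        m = pattern.search(text)
        if m is None:
            return None
        body = next(g for g in m.groups() if g is not None)
        return re.sub(r"\s+", "", body)
    return match


def freetext(text: str) -> Optional[str]:
    """Match "scheme:..." up to whitespace, then drop trailing punctuation."""
    for m in FREETEXT.finditer(text):
        url = m.group(0).rstrip(TRAILING_PUNCTUATION)
        if url.partition(":")[2]:  # something must remain after the scheme
            return url
    return None


# Modes are tested in this order; the first successful one wins.
MODES = (("RFC1738", delimited(RFC1738)),
         ("RFC2396", delimited(RFC2396)),
         ("freetext", freetext))


def recognize(text: str) -> Optional[Tuple[str, str]]:
    for name, matcher in MODES:
        url = matcher(text)
        if url is not None:
            return name, url
    return None


print(recognize("go to http://www.mozilla.org/, then stop."))
```

A caller would still hand the returned string to the URI constructor to decide whether it is valid; this sketch only decides where the URL starts and ends.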
| Reporter | ||
Comment 7•26 years ago
|
||
Sounds like you're saying that you wrote your own recognizer based on the
specs, but I'd rather see us use what necko has for consistency. That way if
the thing is highlighted, we'll be assured that we can handle it.
If necko's url parsing doesn't meet the specs specified, then we should fix it.
| Assignee | ||
Comment 8•26 years ago
|
||
Warren,
all my function does is decide where the URL starts and ends. I don't know
of a Necko function that does this. After that is done, I leave it up to Necko
(NS_NewURI) to decide whether the result is valid or not.
I'll attach the current code (work in progress). If you still think it should
be moved to Necko, please provide me with the necessary background (knowledge)
and I'll integrate it.
| Assignee | ||
Comment 9•26 years ago
|
||
| Assignee | ||
Comment 10•26 years ago
|
||
I forgot to mention: function in question is FindURL.
Comment 11•26 years ago
|
||
*** Bug 7176 has been marked as a duplicate of this bug. ***
| Assignee | ||
Comment 12•26 years ago
|
||
| Assignee | ||
Comment 13•26 years ago
|
||
| Assignee | ||
Comment 14•26 years ago
|
||
| Assignee | ||
Comment 15•26 years ago
|
||
| Assignee | ||
Comment 16•26 years ago
|
||
| Assignee | ||
Comment 17•26 years ago
|
||
The patches/files create a new defunct stream converter with an XPCOM interface
and 3 (static) functions: ScanTXT, ScanHTML and CiteLevel. The latter is
currently unused, ScanTXT is used by mimetpla.cpp and mimetplf.cpp, ScanHTML by
nsMsgSendPart. I changed these functions to use the new class and removed the
old functions from nsMimeURLUtils.
I will ask Shaver, if the licence is OK.
The callers still need some work for I18N and perf checking to pass the right
modes to the functions, but most if not all points are marked with a XXX
comment.
rhp, can you please review the mime parts and the 3 functions?
valeski, can you please review the converter and its integration in Necko? Is
it OK that it registers for text/plain? If not, can you make it register with
the Factory, so libmime can access it? Tnx.
| Assignee | ||
Comment 18•26 years ago
|
||
Typo: "pref(erences) checking", not "perf checking"
Comment 19•26 years ago
|
||
Everyone,
It probably makes sense for one person to land all of these changes. If you
want, I can step up and take that role. I will probably get this stuff ready to
rock over the weekend and look for a Monday landing.
If anyone objects, please let me know.
- rhp
PS: Warren: this will include the other changes we talked about today.
| Assignee | ||
Comment 20•26 years ago
|
||
Note that the license for moz(I)TXTToHTMLConv is possibly invalid. I may
release it under a different license (e.g. a modified MPL or a new BSD-style
license without the ad restriction).
Updated•26 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 26 years ago
Resolution: --- → FIXED
Comment 21•26 years ago
|
||
Ok gang, this is all checked in now. There seems to be an issue with the
emoticon detection that Ben is working on, but other than that, we seem to be
working.
- rhp
| Reporter | ||
Comment 22•26 years ago
|
||
Judging from Ben's comment about just looking at where the url starts and ends
and then calling NS_NewURI, I'm happy. I haven't looked at the code though.
One thing I'd like to see though that I always considered broken in 4.x
releases: If a url is broken across a line, there should be heuristics that
recognize that fact, and pick up the rest of the url as the continuation, e.g.:
bla bla bla bla bla bla bla bla bla bla bla bla bla http://listings.ebay.com/aw/
listings/list/category1497/index.html bla bla bla bla bla bla
The recognizer should notice the "<text>*:" as the start of the url, then notice
that it ends with the newline, and then look on the next line for a string of
text containing slashes, dots, etc. and include it in the url string.
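The heuristic suggested here might look roughly like the following Python sketch. It is only an illustration of the idea in this comment, not code from the patches; the function name, the scheme pattern, and the "contains a slash or dot" test are assumptions.

```python
import re
from typing import Optional

# A scheme-like start ("<text>*:") whose URL text runs to the end of the line.
URL_AT_LINE_END = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:\S+$")


def join_wrapped_url(line: str, next_line: str) -> Optional[str]:
    """Return the URL that ends `line`, extended by the first token of
    `next_line` when that token looks like the rest of a URL.

    Returns None if `line` does not end with a URL candidate.
    """
    head = URL_AT_LINE_END.search(line.rstrip())
    if head is None:
        return None
    tail = re.match(r"\S+", next_line)
    # Assumed test for "looks like a continuation": contains '/' or '.'.
    if tail is None or not re.search(r"[/.]", tail.group(0)):
        return head.group(0)
    return head.group(0) + tail.group(0)


print(join_wrapped_url(
    "bla bla bla http://listings.ebay.com/aw/",
    "listings/list/category1497/index.html bla bla bla",
))
```

A test this loose will produce false positives, for example when the next line begins with an ordinary word followed by a period, so a real implementation would need a stricter check.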
| Assignee | ||
Comment 23•26 years ago
|
||
Warren, bug #5351 (dependent on this) addresses the linebreaks in URLs.
| Assignee | ||
Comment 24•26 years ago
|
||
Comment 25•24 years ago
|
||
Can anyone give me some test URLs for this bug? I have verified that a mailto
link works OK (comment #9), and that the long URL from comment #22, with text in
front and in back of it, is sent and received as a URL OK. I'm not sure what all
the protocols mentioned in the original description are, so if someone can help
with this I would appreciate it.
| Assignee | ||
Comment 26•24 years ago
|
||
Esther, this code is so old and so visible that you can fairly securely mark
this verified. (I won't do so, because I am the one who fixed it.)
Updated•21 years ago
|
Product: MailNews → Core
Updated•17 years ago
|
Product: Core → MailNews Core