Closed Bug 19445 Opened 25 years ago Closed 25 years ago

nsMimeURLUtils - "b-/b@mozilla.org bla/ bla"

Categories

(MailNews Core :: MIME, defect, P3)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: BenB, Assigned: BenB)

Details

Attachments

(1 file)

I created a really strange bug.

Structured Phrases prettifies patterns such as "DELIMITER - "/" - ALPHA [any
string - ALPHA] - "/" - DELIMITER" (where DELIMITER is neither alphanumeric nor
"/") by inserting <em class=txt_slash>.
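
A minimal sketch of the pattern above, for illustration only. The regex and the
function name pretty_slashes are hypothetical, not the actual ScanTXT code. It
assumes that ALPHA means an ASCII letter, that the start or end of the text
counts as a DELIMITER, and that the phrase body contains no further "/".

```python
import re

# Hypothetical regex for the "/" structured phrase:
#   DELIMITER "/" ALPHA [any string ALPHA] "/" DELIMITER
# Assumptions: ALPHA is an ASCII letter, start/end of text counts as a
# DELIMITER, and the phrase body contains no further "/".
SLASH_PHRASE = re.compile(
    r"(?<![A-Za-z0-9/])/([A-Za-z](?:[^/]*[A-Za-z])?)/(?![A-Za-z0-9/])"
)


def pretty_slashes(text: str) -> str:
    """Wrap every complete /phrase/ in <em class=txt_slash>...</em>."""
    return SLASH_PHRASE.sub(r"<em class=txt_slash>/\1/</em>", text)


print(pretty_slashes("a /word/ b"))
print(pretty_slashes("b-/b@mozilla.org bla/ bla"))
print(pretty_slashes("b-/b"))  # no closing "/", so nothing is inserted
```

Note that the second call pairs the "/" in the user part with the "/" after
"bla", while the third call, which sees only the user part, finds no pair.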

FindURL is triggered in the ambiguous mail case by the "@" symbol. It runs
backwards through the string to find the start of the address and calculates
the number of chars that the link will replace. This is necessary because the
user part of the address is already in the output string and has to be replaced
by the link (e.g. "<a href="mailto:user@host">user").

The user part might include characters that have already been replaced, like
"&" (escaped to "&amp;"), so FindURL calls ScanTXT (new name) again with the
user part (and without the flag for URL scanning) and takes the length of the
result as the basis for the number of chars to replace. (Actually it is even a
bit more complicated.) This ensures that the number of chars to replace is
correct even with already escaped chars.
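
The length calculation can be sketched as follows. This is an illustration
under assumptions, not the real implementation: scan_txt here is a hypothetical
stand-in that only escapes HTML special chars, and replace_count is an invented
helper name.

```python
import html


def scan_txt(text: str) -> str:
    """Hypothetical stand-in for ScanTXT without URL scan: escaping only."""
    return html.escape(text, quote=False)


def replace_count(user_part: str) -> int:
    """Number of chars the user part occupies in the already-scanned output."""
    return len(scan_txt(user_part))


buffered = scan_txt("mail tom&jerry")   # output so far, just before the "@"
count = replace_count("tom&jerry")      # length of "tom&amp;jerry", not of the raw text
print(count, len("tom&jerry"))
print(repr(buffered[: len(buffered) - count]))  # what remains after the deletion
```

Using the raw length of the user part here would leave part of the escaped
text behind in the output.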

The problem is: Structured Phrases does not just look at the "current" chars
but, as described above, looks ahead to see whether it can find a *pair* of
(special) chars in the string passed to ScanTXT.

So, for "b-/b/-@mozilla.org", everything is fine, because the pair is completely
within the user part and is recognized both by the ScanTXT instance working on
the whole line and by the ScanTXT instance called by FindURL to revert the
prettying up.

But for "b-/b@mozilla.org bla/ bla", the line-ScanTXT instance has "b-<em
class=txt_slash>/b" in the (buffered) resulting string when FindURL is called
for the "@". FindURL recognizes the email address and passes "b-/b" to another
instance of ScanTXT to calculate the number of chars that have to be deleted
from the output of the line instance of ScanTXT. But it gets the string "b-/b"
back unchanged, because there's no second "/" in the user part. It returns "<a
href="b-/b@mozilla.org">b-/b@mozilla.org</a>" and tells the caller to delete 4
chars from the resulting string and skip 11 chars after the "@". The result of
the line instance of ScanTXT is "b-<em class=txt_slas<a
href="b-/b@mozilla.org">b-/b@mozilla.org</a>".
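
The arithmetic of this failure can be replayed with plain string slicing. The
helper splice_link is hypothetical and only mimics "delete N chars from the end
of the buffered output, then append the link"; the strings are the ones quoted
above.

```python
def splice_link(buffered: str, delete_count: int, link: str) -> str:
    """Drop the last delete_count chars of the buffered output, append link."""
    return buffered[: len(buffered) - delete_count] + link


buffered = "b-<em class=txt_slash>/b"  # line-level output when "@" is reached
user_part = "b-/b"
rescanned = user_part  # nested scan finds no second "/", so nothing is inserted
link = '<a href="b-/b@mozilla.org">b-/b@mozilla.org</a>'

# The nested scan reports 4 chars, but the line-level output for the user part
# is longer because it already contains the <em class=txt_slash> insertion.
broken = splice_link(buffered, len(rescanned), link)
print(broken)

# Deleting everything the line-level scan emitted for the user part would
# have removed the half-open tag as well.
intact = splice_link(buffered, len(buffered), link)
print(intact)
```

The mismatch between the two lengths is exactly the size of the inserted tag.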

This appears when 1. an unbalanced number of "plain text tags" (which are
currently "/", "*", "_" and "|" with a non-alpha char on one side and an alpha
char on the other) is in the user part of an email address without a leading
"mailto:" AND 2. a corresponding plain text tag is in the rest of the
line/paragraph.
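
A rough sketch of the "plain text tag" test described above, to make the two
conditions concrete. The names TAG_CHARS, is_plain_text_tag and tag_positions
are hypothetical, and treating the start or end of the string as a non-alpha
neighbour is an assumption.

```python
TAG_CHARS = "/*_|"


def is_plain_text_tag(text: str, i: int) -> bool:
    """True if text[i] is a tag char with alpha on exactly one side."""
    if text[i] not in TAG_CHARS:
        return False
    # Assumption: positions outside the string behave like a non-alpha char.
    before = text[i - 1] if i > 0 else " "
    after = text[i + 1] if i + 1 < len(text) else " "
    return before.isalpha() != after.isalpha()


def tag_positions(text: str) -> list[int]:
    """Indices of all plain text tags in text."""
    return [i for i in range(len(text)) if is_plain_text_tag(text, i)]


line = "b-/b@mozilla.org bla/ bla"
user_part = line[: line.index("@")]

print(tag_positions(user_part))  # tags inside the user part
print(tag_positions(line))       # tags in the whole line
```

Here the user part holds an odd number of tags (condition 1) and the rest of
the line holds the matching one (condition 2).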

Apart from this obviously completely screwed up email address (even in the
display, not just the href part of the link), the worst I noticed was a message
that was completely emphasized from this point on.

This bug should appear very rarely. But fixing it would mean 1. reorganizing
ScanTXT to first escape, then do URL recognition and then more prettying up and
2., what is even worse, rewriting the FindURL function to work on the escaped
string rather than plain text. I don't know what to do.

Corrections:
This example of an email address was actually *not* screwed up, but I have
examples that are.
The result is "b-<em class=txt_slas<a
href="b-/b@mozilla.org">b-/b@mozilla.org</a> bla/</em> bla".

*sigh* Correction of correction (I should get some sleep :-) ):
The example email address *is* screwed up a bit, as everyone with open eyes (not
me) can see :-). The link itself is ok, but it has a leading "b-", which is the
real bug in this case.
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
I have a workaround for this: I can check whether any structured phrases are
"open" and skip ambiguous mail recognition in that case.
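
One way such a check could look, as a sketch only. The OpenPhrases counters and
the function should_try_ambiguous_mail are hypothetical names, and keeping one
counter per plain text tag is an assumption about how "open" phrases might be
tracked.

```python
from dataclasses import dataclass


@dataclass
class OpenPhrases:
    """Hypothetical counters of currently open structured phrases."""
    slash: int = 0
    star: int = 0
    underscore: int = 0
    pipe: int = 0

    def any_open(self) -> bool:
        return (self.slash + self.star + self.underscore + self.pipe) > 0


def should_try_ambiguous_mail(ch: str, phrases: OpenPhrases) -> bool:
    """Only attempt "@"-triggered recognition when no phrase is open."""
    return ch == "@" and not phrases.any_open()


print(should_try_ambiguous_mail("@", OpenPhrases()))         # nothing open
print(should_try_ambiguous_mail("@", OpenPhrases(slash=1)))  # "/" phrase open
```

The trade-off is that an address inside an open phrase is left unlinked (not
recognized at all) rather than linked incorrectly.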

Note for later implementers: Currently, this is only a problem for ambiguous
mail addresses, because in the other cases (where FindURL is triggered by the
colon or dot), only "[A-Za-z0-9+-.<:]" appears before the trigger, and these
chars are not problematic ("<" is a bit problematic, but is caught by the nested
call to ScanTXT). If ScanTXT is changed in a way that is unfortunate in the
light of this bug, a similar problem may appear.
Ben - how might we verify this bug?
QA Contact: lchiang → asj
Try freaky URLs like the one in the summary (the text after the address is
necessary) and see if they are
1. Recognized correctly
2. Not recognized at all
3. Incorrectly recognized
Only 3. is a bug. It may manifest as a wrong link (i.e. the display is OK, but
the wrong URL is opened after a click on the link), as an altered or garbled
display (very bad), or similar.
Product: MailNews → Core
Product: Core → MailNews Core