When inserting a link into HTML composition, the text part sometimes gets linkified upon send resulting in a double link. Regression from bug 964024.

RESOLVED FIXED in Firefox 49

Status

()

Core
Networking
RESOLVED FIXED
11 months ago
8 months ago

People

(Reporter: Jorg K (GMT+2), Assigned: Jorg K (GMT+2))

Tracking

({regression})

Trunk
mozilla49
regression
Points:
---

Firefox Tracking Flags

(firefox49 fixed, thunderbird_esr38 wontfix, thunderbird_esr4547+ fixed)

Details

(Whiteboard: [necko-would-take])

Attachments

(1 attachment, 3 obsolete attachments)

(Assignee)

Description

11 months ago
This bug started originally as bug 1274242 were we discovered that linkify stops at the first | it finds.

Hoever, that was not what the reporter of bug 1274242 had originally reported. He said that after upgrading to TB 45, the following problem occurs:

STR:
- enter
    http://www.test.com&tag=a||b
  into the body of an email.
- highlight the link and click Insert ==> Link (<ctrl> L).
- The long link turns all blue.
- Send the email (or just sent to outbox, <ctrl><shift><enter>).

Works in TB 31, but not TB 38 and TB 45.

Alice, could you please find the regression. Thanking you in advance.
Flags: needinfo?(alice0775)
(Assignee)

Comment 1

11 months ago
Sorry, I got confused. Test case is: http://www.test.com&tag=//a||b.
(Assignee)

Comment 2

11 months ago
This is about the weirdest thing I've seen in a long time. I tried:
http://www.test.com?tag=//a|b and sometimes it works in TB 38, sometimes it doesn't.

What always fails in TB 38 is:
http://www.mappingsupport.com/p/gmap4.php?t=aerial&label=on&tilt=off&markers=//Not||description=plm2

Alice, when testing this, please make sure that the e-mail gets sent in HTML format since the auto-downgrade to plain text causes more confusion.

Comment 3

11 months ago
Here is a full link that opens the map.  I adjusted the t parameter.
This will not make any difference for purpose of testing TB.
But I did want to provide a working link.

http://www.mappingsupport.com/p/gmap4.php?t=s&label=on&tilt=off&markers=//Not_a_survey___Coordinates_are_approximate||description=plm2||line=on||44.810123,-62.871949^1||44.807591,-62.874422^2||44.806832,-62.874392^3||44.807001,-62.874934^4||44.807466,-62.874531^5||44.807781,-62.875128^6||44.80779,-62.875145^7||44.807846,-62.875091^8||44.810321,-62.872686^9
(Assignee)

Comment 4

11 months ago
Something has gone terribly wrong here.

TB 31 inserts this:
<a href="http://www.mappingsupport.com/p/gmap4.php?t=aerial&amp;label=on&amp;tilt=off&amp;markers=//Not%7C%7Cdescription=plm2">
http://www.mappingsupport.com/p/gmap4.php?t=aerial&amp;label=on&amp;tilt=off&amp;markers=//Not||description=plm2</a>

TB 38 inserts:
<a href="http://www.mappingsupport.com/p/gmap4.php?t=aerial&amp;label=on&amp;tilt=off&amp;markers=//Not%7C%7Cdescription=plm2">
<a class="moz-txt-link-freetext"
href="http://www.mappingsupport.com/p/gmap4.php?t=aerial&amp;label=on&amp;tilt=off&amp;markers=//Not">
http://www.mappingsupport.com/p/gmap4.php?t=aerial&amp;label=on&amp;tilt=off&amp;markers=//Not</a>||description=plm2</a>

Take a good look:
The original link is there and complete. But Linkify was run on the text part of the link producing yet another link and some link text. Both the inner link and its text are wrong.

Let me restate it in some other way:
TB 31 leaves the original link alone: <a href="http://www.example.com">http://www.example.com</a>

From TB 38 we linkify the text as well, and the result is:
<a href="http://www.example.com"><a href="http://www.example.com">http://www.example.com</a></a>
Note that this example doesn't trigger the bug.

Since the linkify stops at the first "|" due to bug 1274242, we get bad inner links and bad text.

So I'll change the bug summary.
Summary: Bug to investigate why sending a link that contains a || doesn't work in TB 38 and TB 45 when it worked in TB 31 → When inserting a link into HTML composition, the text part sometimes gets linkified upon send resulting in a double link. Worked in TB 31, broken in TB 38 and beyond.
(Assignee)

Comment 5

11 months ago
Joseph, I hope you follow. Your report detected two problems:
Bug 1274242: Linkify stops at "|" where it shouldn't. That never worked from day one.
This bug here: Text of links inserted manually gets linkified where it shouldn't thus exposing the faulty behaviour in the other bug. This linkifying of link text is wrong.
No erroneous linkifying took place in TB 31, but later versions were affected.

Comment 6

11 months ago
Yes, I follow.

Since the map links I send to clients are complex I have always used insert ==> link.
It is possible I tried to rely on linkify way back when, saw that it did not work, and didn't care (and didn't report a bug) since I was willing to do insert ==> link.

Linkify needs to ease off and not mess with stuff that is already a link.
(Assignee)

Comment 7

11 months ago
(In reply to Joseph from comment #6)
> Linkify needs to ease off and not mess with stuff that is already a link.
That is exactly the point. But we need to find where that changed somewhere between TB 31 and TB 38.

The thing is, that this was reported before in bug 1240418, but we just brushed it off. With the observation that it works in TB 31 and not TB 38 we can get to the cause of the problem quicker, I hope.
Whiteboard: [necko-would-take]
(Assignee)

Updated

11 months ago
Duplicate of this bug: 1240418
(Assignee)

Comment 9

11 months ago
OK, looking at the log of the source file in question:
https://hg.mozilla.org/mozilla-central/log/tip/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp

That code is almost static with only a bit of refactoring taking place, however, bug 964024 landed there, and it landed right on TB 38, meaning that TB 38 would have been the first broken version.

That also modified the link processing <a> so I put my bet on that ;-)
https://hg.mozilla.org/mozilla-central/rev/a8c47d8abcba

Comment 10

11 months ago
Regression window:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=c2359a6a6958&tochange=940118b1adcd
https://hg.mozilla.org/comm-central/pushloghtml?fromchange=338986d73c02&tochange=f0194350ebf3

Suspect : https://hg.mozilla.org/mozilla-central/rev/a8c47d8abcba
Flags: needinfo?(alice0775)
(Assignee)

Comment 11

11 months ago
Thanks, Alice. We both came to the same result.
Magnus, you broke that somehow. I'm staring at it, but I can't see it:

-      uint32_t start = uint32_t(i);
-      if (nsCRT::ToLower((char)aInString[uint32_t(i) + 1]) == 'a')
-           // if a tag, skip until </a>
+      int32_t start = i;
+      if (Substring(aInString, i + 1, 2).LowerCaseEqualsASCII("a "))
+           // if a tag, skip until </a>.
+           // Make sure there's a space after, not to match "abbr".

OK, found it!!

The HTML gets wrapped (used ThunderHTMLedit to see it):
    <p><a
href="http://www.mappingsupport.com/p/gmap4.php?t=aerial&amp;label=on&amp;tilt=off&amp;markers=//Not%7C%7Cdescription=plm2"><tt>http://www.mappingsupport.com/p/gmap4.php?t=aerial&amp;label=on&amp;tilt=off&amp;markers=//Not||description=plm2</tt></a><br>

There is NO space behind the "<a" !!!
(Assignee)

Updated

11 months ago
Blocks: 964024
Keywords: regressionwindow-wanted
(Assignee)

Comment 12

11 months ago
Created attachment 8754967 [details] [diff] [review]
Fixing the regression from bug 964024: Not only test for space after <a
Assignee: nobody → mozilla
Status: NEW → ASSIGNED
Attachment #8754967 - Flags: review?(honzab.moz)
(Assignee)

Updated

11 months ago
status-thunderbird46: wontfix → ---
status-thunderbird47: affected → ---
status-thunderbird48: affected → ---
status-thunderbird49: affected → ---
Component: General → Networking
Product: Thunderbird → Core
(Assignee)

Updated

11 months ago
Summary: When inserting a link into HTML composition, the text part sometimes gets linkified upon send resulting in a double link. Worked in TB 31, broken in TB 38 and beyond. → When inserting a link into HTML composition, the text part sometimes gets linkified upon send resulting in a double link. Regression from bug 964024.

Comment 13

11 months ago
Great detective work!

Comment 14

11 months ago
Wowie zowie - I am amazed at the light speed response.
Many thanks!
(Assignee)

Comment 15

11 months ago
I don't like regressions. Other changes take longer ;-)
(Assignee)

Comment 16

11 months ago
Created attachment 8755001 [details] [diff] [review]
Fixing the regression from bug 964024: Not only test for space after <a (v2).

This is a little more elegant.
Attachment #8754967 - Attachment is obsolete: true
Attachment #8754967 - Flags: review?(honzab.moz)
Attachment #8755001 - Flags: review?(honzab.moz)
Attachment #8755001 - Flags: feedback?(mkmelin+mozilla)
Please explain the patch in detail.
Flags: needinfo?(mozilla)
(Assignee)

Comment 18

11 months ago
Sure.

Before bug 964024
https://hg.mozilla.org/mozilla-central/rev/a8c47d8abcba
we simply checked for "<a" to detect a link.

In bug 964024 that was changed to checking for "<a ".

In serialised DOM (converted to HTML as text) this fails if the text presented is not
<a href=some site ...>
but instead
<a                         <=== line break after the a
href=usually some very long link>.

See sample link at the bottom of comment #11.

So instead of checking for "<a " I now check for "<" followed by "a" followed by space, \n or \r.
Flags: needinfo?(mozilla)

Comment 19

11 months ago
Comment on attachment 8755001 [details] [diff] [review]
Fixing the regression from bug 964024: Not only test for space after <a (v2).

Review of attachment 8755001 [details] [diff] [review]:
-----------------------------------------------------------------

Could also (in theory) be \t or \f - https://www.w3.org/TR/html5/infrastructure.html#space-character

Could you also fix the similar "<head " case?
Attachment #8755001 - Flags: feedback?(mkmelin+mozilla)
(Assignee)

Comment 20

11 months ago
Created attachment 8755141 [details] [diff] [review]
Fixing the regression from bug 964024: Not only test for space after <a (v3).

More comprehensive fix repairing the other tests as well.
Attachment #8755001 - Attachment is obsolete: true
Attachment #8755001 - Flags: review?(honzab.moz)
Attachment #8755141 - Flags: review?(honzab.moz)
Attachment #8755141 - Flags: feedback?(mkmelin+mozilla)

Updated

11 months ago
Attachment #8755141 - Flags: feedback?(mkmelin+mozilla) → feedback+
(Assignee)

Comment 21

11 months ago
(In reply to Honza Bambas (:mayhemer) from comment #17)
> Please explain the patch in detail.
Since I made more changes, here another explanation.

Before the code tested the following strings to detect a start tag:
|<a |
|<style |
|<style>|
|<script |
|<script>|
|<head |
|<head>|

Since we had cases of |<a\r|, we decided to unify the approach and now detect
|<a_|
|<style_|
|<script_|
|<head_|
where _ can be any of space, \f, \n, \r, \t or >.

There is a text for this:
./mach xpcshell-test --verbose netwerk/test/unit/test_mozTXTToHTMLConv.js

I haven't added more test cases, but I can, if you wish.
Jorg, before I start reviewing this, would you be willing to rewrite the parsing code of this class using mozilla::Tokenizer?  It's a lexical-analyzer-like kind of tokenizer, customizable, and designed to also be derived from (when suitable) to impl a parser of any level of complexity.  It would be a bit larger task, but would make this class much more maintainable.  At least just the ScanHTML method would be enough, as it's more sensitive than just text walk through.

If not, I will jump on the review.
Flags: needinfo?(mozilla)
(Assignee)

Comment 23

11 months ago
We need this bug fixed quickly since it is a clear regression of bug 964024 and we have users who can't include the links they need into their messages.

As you can see here
https://hg.mozilla.org/mozilla-central/log/tip/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp
this code is rather dormant.

The only use of it I could find in M-C is in the spell checker
https://dxr.mozilla.org/comm-central/source/mozilla/extensions/spellcheck/src/mozEnglishWordUtils.cpp#177
and then it's also part of Necko (but I don't know how much it's used there). I've actually played with the idea to move the code to comm-central, but since it's used in M-C that's out of the question.

So what I'm trying to say is that in the short term we need to land this small fix. In the long term we can think about rewriting it. However, I'm a Thunderbird developer (compose peer), so my efforts need to go into TB.

So please proceed with the review.
Flags: needinfo?(mozilla)
Good rational, thanks.  Going to review.
Comment on attachment 8755141 [details] [diff] [review]
Fixing the regression from bug 964024: Not only test for space after <a (v3).

Review of attachment 8755141 [details] [diff] [review]:
-----------------------------------------------------------------

unhappily giving r+

::: netwerk/streamconv/converters/mozTXTToHTMLConv.cpp
@@ +1234,5 @@
>    {
>      if (aInString[i] == '<')  // html tag
>      {
>        int32_t start = i;
> +      if (i+2 < lengthOfInString &&

i + 2 (spaces around signs, applies on more places too)

can overflow when MAX_UINT32 - 2 < i, also applies to other places...  exactly what Tokenizer takes care of, between else..

@@ +1236,5 @@
>      {
>        int32_t start = i;
> +      if (i+2 < lengthOfInString &&
> +          nsCRT::ToLower(aInString[i+1]) == 'a' &&
> +          canFollow.FindChar(aInString[i+2]) >= 0)

!= kNotFound


so, let's assume a string "<a":
- i == 0
- lengthOfInString == 2
- i + 2 == 2
- i + 2 < lengthOfInString --> 2 < 2 == false

so, we are safe


personally I hate these maths.. but comment 23 has good arguments to leave it in this horrible form.
Attachment #8755141 - Flags: review?(honzab.moz) → review+
(Assignee)

Comment 26

11 months ago
Thanks for the quick review. Since I'm not at my desk right now, I'll fix the review issues tomorrow.

(In reply to Honza Bambas (:mayhemer) from comment #25)
> personally I hate these maths.. 
There really are no maths involved. If we want to access at index (i+k) then i+k simply needs to be < length ;-)

Comment 27

11 months ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/b05086812022
Keywords: checkin-needed
(Assignee)

Comment 28

11 months ago
Created attachment 8756317 [details] [diff] [review]
Fixing regression from bug 964024: Test for white-space after <a and other tags (v4)

Fixed review issues.

Carrying forward Honza Bambas' r+
Attachment #8755141 - Attachment is obsolete: true
Attachment #8756317 - Flags: review+
(Assignee)

Comment 29

11 months ago
Dear Sheriff,
I know that you'd like to see a successful try run, but I don't have one.
This code is tested via
./mach xpcshell-test --verbose netwerk/test/unit/test_mozTXTToHTMLConv.js
which I have run locally. Of course I made sure it compiles, too ;-)
Please land the patch.
Keywords: checkin-needed

Comment 30

11 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/b05086812022
Status: ASSIGNED → RESOLVED
Last Resolved: 11 months ago
status-firefox49: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
(Assignee)

Comment 31

11 months ago
Kent, please include into TB 45.x at your earliest convenience. This is a regression and our users can't include the links they want into their e-mail even if they add the link manually. There is one duplicate of this. The change is harmless, I just changed the comparison a little.

Three people (including myself) looked at it and there is an automated test showing that we didn't break anything.
Flags: needinfo?(rkent)

Comment 32

11 months ago
Pushed to a release branch on mozilla-beta for Thunderbird 47.0b1
Flags: needinfo?(rkent)

Comment 33

10 months ago
Pushed http://hg.mozilla.org/releases/mozilla-esr45/rev/c18c3446bd34 to THUNDERBIRD_45_VERBRANCH
status-thunderbird_esr38: affected → wontfix
status-thunderbird_esr45: affected → fixed
tracking-thunderbird_esr45: ? → 47+
(Assignee)

Comment 34

10 months ago
Pushed to a release branch on mozilla-beta for Thunderbird 48.0b1
https://hg.mozilla.org/releases/mozilla-beta/rev/f4165cb03763
(Assignee)

Comment 35

10 months ago
Pushed to a release branch on mozilla-beta for Thunderbird 48.0b1 - take 2:
https://hg.mozilla.org/releases/mozilla-beta/rev/69f8c8df54e4

Comment 36

8 months ago
Thanks to all involved for fixing this bug.  I am the reporter.
I am now using Tunderbird 45.2.0 and I can email my clients links like the following:

https://mappingsupport.com/p/gmap4.php?t=h&label=on&tilt=off&markers=//Not_a_survey___Coordinates_are_approximate||description=plm2||label=on||line=on||37.320377,-122.267308^1||37.320395,-122.266774^2||37.320126,-122.266792^3||37.320122,-122.26672^4||37.320173,-122.266717^5||37.320019,-122.266137^6||37.319689,-122.26643^7||37.319923,-122.266733^8||37.320377,-122.267308||line=on||37.320122,-122.26672||37.319923,-122.266733

I am manually making this into a link when I compose my email and linkify no longer breaks the link.

I note that Gmail also allows me to use links like the above.  However I prefer to use Thunderbird when communicating with clients.

Joseph Elfelt
https://MappingSupport.com
(Assignee)

Comment 37

8 months ago
If you like the service, please consider a donation ;-)
https://donate.mozilla.org/en-US/thunderbird/about/

Comment 38

8 months ago
Donation sent.
Thanks again!

Joseph
You need to log in before you can comment on or make changes to this bug.