Message filter with "Body contains" does not ignore tag content when tag is broken across lines.

Status: NEW
Assignee: Unassigned
Product: Thunderbird
Component: Filters
Opened: 2 years ago
Updated: 10 months ago
Reporter: ski
Blocks: 1 bug
Version: 38 Branch
Hardware: x86
OS: Windows Vista

Description

2 years ago
User Agent: Mozilla/5.0 (Windows NT 6.0; rv:42.0) Gecko/20100101 Firefox/42.0
Build ID: 20151029151421

Steps to reproduce:

I can always reproduce the bug in the following way. It needs a POP account in order to use the "Body contains" filter option.

1.) Create the message filter:
 In the Menu go to Tools/Message Filters.
 Create a new filter via the button [New...].
 Give it some name. I did not change the checkboxes, so "Manually Run" and "Getting New Mail: Filter before Junk Classification" are checked. The next two checkboxes are unchecked. I set the radio button to "Match any of the following", although this does not matter for a single entry.
 For the filter condition, choose "Body" "contains" "spamerwebsjte".
 (The "j" is there only to make a word that does not otherwise occur.)
 Under "Performs these actions:" choose "Move Message to" "Trash on GMX"
 Press [OK]

2.) Switch to sending pure HTML-mails without plain text in the following way:
 In the Menu go to Tools/Options/Composition/[Send Options...].
 In the drop-down-menu choose "Send the message in HTML anyway".
 Press [OK] [OK].

3.) Send the following HTML-mail to yourself:
  Create a new message and address it to any POP account you can read. A sensible subject would be "Test with short line".
  Click the message body and then choose Insert/HTML... in the menu.
  Paste the following content:

<html>
  <body>
  Dear Reader,<br> 
  <a href="http://www.some.spamerwebsjte.tat.i.want.to.filter.out.com">Click Me !</a>
  </body>
</html>

  Click [Insert]. Before sending, you may want to save the message as a template for further tests.
  Send the message.

4.) Get new messages from the POP account you sent the mail to.


Actual results:

When the message is received, it is not filtered, i.e. it is not moved to the Trash but stays in the Inbox. In the message source there is no line break before the "href" attribute:

[...]
Dear Reader,<br>
    <a href="http://www.some.spamerwebsjte.tat.i.want.to.filter.out.com">Click

      Me !</a>
[...]


Expected results:

The word "spamerwebsjte" should have been found by the filter, and the message should have been moved to the Trash.

However, the filter does work if the line with the link is slightly longer in the test mail. Use this test message, which differs only by one additional letter ("that" instead of "tat"):
 (a sensible subject would be "Test with long line")

<html>
  <body>
  Dear Reader,<br> 
  <a href="http://www.some.spamerwebsjte.that.i.want.to.filter.out.com">Click Me !</a>
  </body>
</html>

After this message is sent and received, it is moved to the Trash, and in the source the longer line is broken before the "href" attribute:

[...]
Dear Reader,<br>
    <a
      href="http://www.some.spamerwebsjte.that.i.want.to.filter.out.com">Click

      Me !</a>
[...]

Apparently, this automatically inserted line break is what lets the filter match.
But the filter should find the search word in both test mails.
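
To see at a glance which of the two received sources contains a tag that spans a line break, a small check like the following could be run on the saved message source. This is only a sketch: `split_tag_lines` is a hypothetical helper, not part of Thunderbird, and it deliberately ignores HTML comments and ">" characters inside quoted attribute values.

```python
# The two source excerpts quoted in this report.
SHORT_SOURCE = """Dear Reader,<br>
    <a href="http://www.some.spamerwebsjte.tat.i.want.to.filter.out.com">Click

      Me !</a>"""

LONG_SOURCE = """Dear Reader,<br>
    <a
      href="http://www.some.spamerwebsjte.that.i.want.to.filter.out.com">Click

      Me !</a>"""


def split_tag_lines(source):
    """Return the 1-based numbers of lines that end while a tag is still open."""
    in_tag = False
    open_at_end = []
    for number, line in enumerate(source.splitlines(), start=1):
        for ch in line:
            if ch == "<":
                in_tag = True
            elif ch == ">":
                in_tag = False
        if in_tag:
            # The tag opened on this or an earlier line has not been closed yet.
            open_at_end.append(number)
    return open_at_end


print(split_tag_lines(SHORT_SOURCE))  # []
print(split_tag_lines(LONG_SOURCE))   # [2]
```

Only the long-line message has a tag left open at the end of a line, which matches the one difference between the two sources shown above.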
(Reporter)

Updated

2 years ago
OS: Unspecified → Windows Vista
Hardware: Unspecified → x86

Comment 1

a year ago
Confirmed.

Adding:
  <body>
  Dear Reader,<br> 
  <a href="http://www.some.spamerwebsjte.tat.i.want.to.filter.out.com">Click Me !</a>
  </body>
the link line is broken before the href and the message is filtered, regardless of the filter configuration
(filter before or after Junk Classification).

Adding:
  <body bgcolor="#FFFFFF" text="#000000">
    <p> Dear Reader,<br>
      <a href="http://www.some.spamerwebsjte">Click Me !</a> </p>
  </body>
results in the link line not being broken and the message is NOT filtered correctly.

I wonder whether filtering should happen only on visible text or on all text in the message.
Status: UNCONFIRMED → NEW
Ever confirmed: true

Updated

a year ago
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1211128

Comment 3

a year ago
OK, after some discussion in bug 1211128 we concluded that text in HTML tags should be ignored in the search. That this doesn't happen when the tag is broken across lines is a bug.
Processing is here:
http://mxr.mozilla.org/comm-central/source/mailnews/base/search/src/nsMsgBodyHandler.cpp#310

So I am re-opening the bug for this and correcting the summary.

The discussion of whether to search inside the tags can happen in the other bug 1211128.
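
The behaviour described in this comment can be illustrated with a toy model. The sketch below is not the code in nsMsgBodyHandler.cpp; it only assumes a scanner that strips tags line by line, and `visible_text` and its `carry_state` flag are hypothetical names. With `carry_state=False` the scanner forgets at every line break that it was inside a tag, so the continuation line of a broken tag is treated as body text. With `carry_state=True` the tag state survives the line break and the attribute is skipped.

```python
def visible_text(lines, carry_state):
    """Strip <...> tags from the given lines and return the remaining text.

    carry_state=False models a scanner that resets its "inside a tag" flag
    at the start of each line; carry_state=True keeps the flag across lines.
    """
    in_tag = False
    kept_lines = []
    for line in lines:
        if not carry_state:
            in_tag = False  # assumed faulty behaviour: tag state is lost here
        kept = []
        for ch in line:
            if ch == "<":
                in_tag = True
            elif ch == ">":
                in_tag = False
            elif not in_tag:
                kept.append(ch)
        kept_lines.append("".join(kept))
    return "\n".join(kept_lines)


# Source of the "long line" test mail, where the <a> tag is broken before href.
BROKEN_TAG = [
    "Dear Reader,<br>",
    "    <a",
    '      href="http://www.some.spamerwebsjte.that.i.want.to.filter.out.com">Click',
    "",
    "      Me !</a>",
]

WORD = "spamerwebsjte"
print(WORD in visible_text(BROKEN_TAG, carry_state=False))  # True
print(WORD in visible_text(BROKEN_TAG, carry_state=True))   # False
```

Under these assumptions, the first result corresponds to the reported behaviour (the word inside the broken tag is matched) and the second to the behaviour agreed on in bug 1211128 (tag content is ignored).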
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Summary: Message filter with "Body contains" fails in some HTML-mails depending on line length → Message filter with "Body contains" does not ignore tag content when tag is broken across lines.

Updated

a year ago
Status: REOPENED → NEW

Updated

10 months ago
Blocks: 519202