Closed Bug 882917 Opened 11 years ago Closed 10 years ago

[email] "Body" search of HTML messages regressed by worker thread migration

Categories

(Firefox OS Graveyard :: Gaia::E-Mail, defect)

defect
Not set
normal

Tracking

(blocking-b2g:1.4+, b2g1828+ wontfix, b2g18-v1.0.1 wontfix, b2g-v1.3 affected, b2g-v1.3T affected, b2g-v1.4 fixed, b2g-v2.0 fixed)

VERIFIED FIXED
1.4 S5 (11apr)
blocking-b2g 1.4+
Tracking Status
b2g18 28+ wontfix
b2g18-v1.0.1 --- wontfix
b2g-v1.3 --- affected
b2g-v1.3T --- affected
b2g-v1.4 --- fixed
b2g-v2.0 --- fixed

People

(Reporter: asuth, Assigned: asuth)

References

Details

(Keywords: regression, Whiteboard: permafail [priority][c= p=2])

Attachments

(1 file)

The body search filter attempts to create an HTML DOM implementation in order to only search the (non-quoted) text content of HTML mails.

This has been broken since we landed the worker thread enhancements in bug 814257, although the regression may have been somewhat concealed by the bug 868270 regression from when we started lazily performing body fetches.
Andrew Sutherland deleted the linked story in Pivotal Tracker
Target Milestone: --- → 1.2 FC (16sep)
Notes from IRC that it's not sure mcav received, posting so they don't get lost:

We have an HTML parser in bleach.js that we can use at https://github.com/mozilla-b2g/bleach.js
the logic basically overlaps with what we do for snippet extraction, except we don't want to stop at a limit
in practice, we probably want to build a flattened string representation anyways, so that we don't get tripped up by markup like "magic <span>dance</span>" which should still match "magic dance" as our search term
and then we can use that in the future if we do anything clever like building a pre-computed inverted index, etc.
Whiteboard: burirun1
Whiteboard: burirun1 → burirun1, burirun2
Tagging this bug to put it in the productivity backlog
Target Milestone: 1.2 FC (16sep) → ---
Assignee: nobody → mcav
Status: NEW → ASSIGNED
blocking-b2g: --- → backlog
(Unassigning for now since I'm not actively working on it)
Assignee: mcav → nobody
Status: ASSIGNED → NEW
Whiteboard: burirun1, burirun2 → burirun1, burirun2 [priority][p=8]
Assignee: nobody → bugmail
Status: NEW → ASSIGNED
Whiteboard: burirun1, burirun2 [priority][p=8] → burirun1, burirun2 [priority][c= p=2]
Target Milestone: --- → 1.4 S5 (11apr)
As noted on bug 838843, this patch lives on top of that patch because that patch provides the core search slice unit tests.  This one is simpler and just makes text/html work again and adds the test case variants to exercise the path.  I also cleaned up/improved some of the more unit-testy search cases in test_search.js which included improving the snippet/excerpt logic and its coverage.

As also noted on the other bug, if running this against Gaia you are still going to see bug 989116 rear its head and generate a whole bunch of (theoretically) harmless (from the perspective of the search card) errors involving listPerson.  I am working to fix that too but wanted to break things up sanity-wise.

Note that there is an expected failure of that attachment thing still, and I want to fix that in a separate bug.
Attachment #8399518 - Flags: review?(m)
And point-wise, the reason we went from p=8 to p=2 is that:
- Bug 838843 involved most of the legwork for the unit tests that this needed.
- It turns out it is insanely easy to naively convert the entire HTML document to a flattened textual representation.  While there are more performant things we could do so we could give up early, we absolutely don't need to do them right now.
- Our test coverage is okay, but not exceedingly amazing for the specific use-case.  However, since we're basically using already-tested snippet-generating logic and already have reasonably good bleach.js coverage, I don't think we need much more than what we have.  As/if regressions are detected, we'll of course want to flesh things out further.
Comment on attachment 8399518 [details] [review]
fix up html body search and add test GELAM test coverage, also clean up snippet logic

Cool. I tested search locally with this patch, and it works as expected. It's a little unfortunate that we have to copy/paste code for the slice logic, but it's at least obvious where it came from so shouldn't be a big deal in practice. Test coverage looks to be sufficient and passes-ish for me too. Nice work. I'm so used to search being broken on all of my mail clients (IMAP search never works on my phone, and Thunderbird's search interface leaves something to be desired) that I was actually somewhat surprised to see that it worked properly.
Attachment #8399518 - Flags: review?(m) → review+
Yeah, I don't feel great about the copy-paste-modify logic there.  The code exists in a weird state where it's close enough to basically be the same code but because of the MatchHeader wrapping scenario and the case where start/end does does not exactly line up with the contents of headers that it's also different enough that it's hard to cleanly reuse.  I did try things the other way first where the SearchSlice was just wrapping calls to the MailSlice prototype.  I found myself writing some long comment blocks explaining why I was altering the control flow path in weird looking ways.

I think it would be something easier/cleaner if we had operator overloading support, hygienic macros, or maybe if I just added some helper functions even if they are tiny.  When we end up adding unified-folder slices I expect is when the refactoring will become obvious/compulsory.
Depends on: 838843
landed in GELAM/master:
https://github.com/mozilla-b2g/gaia-email-libs-and-more/pull/297
https://github.com/mozilla-b2g/gaia-email-libs-and-more/commit/3b9cc82fe17b1bb5a476a8e2207f23cf89c4260e

landed in gaia/master:
https://github.com/mozilla-b2g/gaia/pull/17929
https://github.com/mozilla-b2g/gaia/commit/c854579a247114aa13e4980fb124fae3cbd2d5cf


Note that this migrated our bleach.js dependency from chewlayer.js to worker-bootstrap.js which is a startup-time concern.  I have spun this off to be tracked as bug 991383 since it's complexity creep to do the lazy-loading here and I really want the UI integration tests to be active to protect us from lazy-loading screw-ups.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Blocks: 989116
Whiteboard: burirun1, burirun2 [priority][c= p=2] → permafail [priority][c= p=2]
[Environment]
Gaia      553da99ab09b6b894d9f95bb06b16b6e1ddbf0a1
Gecko     https://hg.mozilla.org/mozilla-central/rev/5b6e82e7bbbf
BuildID   20140414160205
Version   31.0a1
ro.build.version.incremental=eng.archermind.20131114.105818
ro.build.date=Thu Nov 14 10:58:33 CST 2013

[Result]
PASS
Status: RESOLVED → VERIFIED
Asking formally for 1.4+ on this bug as it will be uplifted because the fix is depended on by bug 989116, a 1.4+ blocker, and that bug will be marked fix once the changeset for bug 796474, another 1.4+ bug fix is uplifted.
blocking-b2g: backlog → 1.4?
1.4+ per comment #12.
blocking-b2g: 1.4? → 1.4+
Keywords: regression
Fixed on v1.4 branch:
https://github.com/mozilla-b2g/gaia/commit/81e97c3ca58be0487292011bc59efa4cebab30be

from pull request:
https://github.com/mozilla-b2g/gaia/pull/18808

That pull request was a roll up commit of the following bugs, since bug 796474, a 1.4+ bug, depended on changes in the other bugs, and the other bugs also block another 1.4 bug, bug 989116.

Commits that were part of the roll up:

bug 838843: fc4a74a8400838e5fd18da6b7d8851a5a4380019
bug 882917: 0a8ccfdb26a33789bec754d769b0786570ceb28c
bug 796474: 67868019817334815ebb881ef8cd1b478989aa01
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: