Closed
Bug 868233
Opened 12 years ago
Closed 9 years ago
Allow XPCOM access to body handler
Categories
(MailNews Core :: Search, enhancement)
MailNews Core
Search
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: rkent, Assigned: opera.wang)
Details
Attachments
(2 files, 1 obsolete file)
20.42 KB,
patch
|
Details | Diff | Splinter Review | |
20.99 KB,
patch
|
Details | Diff | Splinter Review |
A very common request that people have in custom searches and filters is to search the actual message body. In built-in filters, this is done using a helper class nsMsgBodyHandler. To ease searches involving the body, make that class available as a js-accessible XPCOM object.
Reporter | ||
Updated•12 years ago
|
Assignee: nobody → kent
Severity: normal → enhancement
Status: NEW → ASSIGNED
Reporter | ||
Comment 1•12 years ago
|
||
Comment 2•11 years ago
|
||
My private conversation with Opera Wang:
Opera:
The patch C++ code requires you to download the source code of TB, patch it and then rebuild
whole TB, it would be hard if you were not an IT. Also the patch is about half year ago
which means it is a bit rotted ( can't get patched directly, & needs some merging effort ).
ukrainianconsular:
It means if i use the patch, i should never update thunderbird & condemn thunderbird
to the latest actual stable build & never update it again, since i can't expect devs
at mozilla to incorporate it for good .. . could you please add it,
since i have no coding knowledge ((
Opera:
As I said, without the patch, it's (almost) impossible to do regex search in the body.
I actually tried it once, with a lot of dirty code, and still can't make it work without
the possibility of crashing TB.
Reporter | ||
Comment 3•11 years ago
|
||
Unfortunately once someone starts down the path of comments like "the man seems to have an ego & practices abuse of trust pretending" (Bug 19442 comment 71) then I assume that all expectations of any response from me or cooperation with me are gone. So at this point any communication between me and Mr. Consular seems to be impossible. But I would still be willing to pursue this with a friendly third party.
I'd completely forgotten that I had done this WIP patch, which I suppose is why I posted it here in the first place, knowing I would forget. The patch should not be all that controversial. It simply adds capability for a javascript-based extension to hook into some code that would allow extensions such as FiltaQuilla to implement much-requested functionality.
But I don't recall if the patch works. What this needs at this point is for someone to compile and test it with a recent TB build, and confirm that it does what is expected. There might still be time to move this forward prior to Thunderbird 31, though that window is fairly narrow.
But in the meantime I will unassign myself since I am not actively working on this.
Assignee: kent → nobody
Status: ASSIGNED → NEW
Comment 4•11 years ago
|
||
"Unfortunately once someone starts down the path of comments like "the man seems to have an ego & practices abuse of trust pretending" (Bug 19442 comment 71) then I assume that all expectations of any response from me or cooperation with me are gone"
Wrong Kent .. .
They were gone since i've made that request 2 years ago, & you've then told me you didn't have time for that after pretending it was "easy". if it's not abuse of trust, then the term "easy" was for my part, displaced, cause it gives good hopes to the person who contacts you.
Then you ignored all my attempts to re-ask that fix .. .
But if you ever had a sincere will, i'll address you my apologize today.
Assignee | ||
Comment 5•11 years ago
|
||
This is the patch based on Kent's patch with small modifications:
1. unbitrotted
2. use localAccountUtils for the test case so it can pass
3. Add one more interface 'AUTF8String read(in unsigned long aCount)', if extensions author want to look into the raw body (jsmime?), this interface might be useful, as 'getNextMessageLine' will strip all attachments.
With this patch, Extensions like FiltaQuilla or 'Expression Search / GmailUI' can doing more complex filter like 'regular expression search for body' (bug 19442), or 'filter by attachment name' (bug 224392)
I also modified my addon to verify the functional of this patch and seems it works.
BTW, I work on this just because I'm the maintainer of 'Expression Search / GmailUI' and now I'm using a dirty method (thread.processNextEvent) to query body which might crash TB.
Attachment #8379618 -
Flags: review?(kaie)
Comment 6•11 years ago
|
||
I wonder which mozilla devs are following this thread, maybe:
kaie@kuix.de
&
mkmelin+mozilla@iki.fi
Whoever is willing to update thunderbid's nightly or stable build with rkent's patch
updated by opera wang, please let me know when this process will be done.
& i guess filtaquilla would need an update too in order to work with that patch
cause rkent uses personalized "internet headers properties" of search like bccList:
ccList: & recipients:
Like the character set is usually found in the "Content-Type:" header,
& his property for that is "msgCharSet:"
So i guess a "body:" or "msgBody:" property has to be added yet.
Assignee | ||
Comment 7•11 years ago
|
||
Comment on attachment 8379618 [details] [diff] [review]
bug_868233.patch
change reviewer to :Irving Reid as Kai is mainly focused on security.
And please assign to me.
Attachment #8379618 -
Flags: review?(kaie) → review?(irving)
Reporter | ||
Updated•11 years ago
|
Assignee: nobody → opera.wang
Assignee | ||
Updated•11 years ago
|
Status: NEW → ASSIGNED
OS: Windows 7 → All
Hardware: x86_64 → All
Comment 8•11 years ago
|
||
Comment on attachment 8379618 [details] [diff] [review]
bug_868233.patch
Review of attachment 8379618 [details] [diff] [review]:
-----------------------------------------------------------------
Can you solve the same problem with nsIMsgFolder::getMsgInputStream(), possibly combined with nsIMsgFolder::getMsgTextFromStream()? If so, I'd rather support that instead of exposing another interface.
http://dxr.mozilla.org/comm-central/source/mailnews/base/public/nsIMsgFolder.idl#550
http://dxr.mozilla.org/comm-central/source/mailnews/base/public/nsIMsgFolder.idl#740
Aside from that the patch is fine, with a few white space fixups - I'm going to give f+ rather than r+ until we decide what to do about the duplication of APIs.
::: mailnews/base/search/public/nsMsgBodyHandler.h
@@ +14,5 @@
> //---------------------------------------------------------------------------
> // nsMsgBodyHandler: used to retrieve lines from POP and IMAP offline messages.
> // This is a helper class used by nsMsgSearchTerm::MatchBody
> //---------------------------------------------------------------------------
> +class nsMsgBodyHandler : public nsIMsgBodyHandler
Please remove the trailing white space from this line
@@ +69,5 @@
>
> nsIMsgSearchScopeTerm *m_scope;
> nsCOMPtr <nsILineInputStream> m_fileLineStream;
> nsCOMPtr <nsIFile> m_localFile;
> + nsCOMPtr<nsIInputStream> m_inputStream;
Space between nsCOMPtr and <nsIInputStream> to match style of lines above
::: mailnews/base/search/src/nsMsgBodyHandler.cpp
@@ +32,5 @@
> + m_headers = NULL;
> + m_headersSize = 0;
> + m_Filtering = false; // make sure we set this before we call initialize...
> + InitLocal(scope, numLines, msg, db);
> +
Please remove spaces from this line
@@ +57,5 @@
> + return NS_OK;
> +}
> +
> +NS_IMETHODIMP nsMsgBodyHandler::Init(nsIMsgDBHdr* aMsgHdr, bool aStripHeaders, bool aStripHtml)
> +{
White space
@@ +85,4 @@
>
> + OpenLocalFolder();
> + return NS_OK;
> +}
White space
::: mailnews/base/test/unit/test_searchCustomBody.js
@@ +76,5 @@
> +{
> + localAccountUtils.loadLocalMailAccount();
> + MailServices.filters.addCustomTerm(customTerm);
> +
> + var copyListener =
White space
Attachment #8379618 -
Flags: review?(irving) → feedback+
Assignee | ||
Comment 9•11 years ago
|
||
white spaces change.
Also I did a quick test for nsIMsgFolder::getMsgInputStream() + nsIMsgFolder::getMsgTextFromStream(), it can give me the text part of the body, but can't get the msg header or attachment part, and seems can't get the html part too. So I still would like to have this interface.
An alternate way is to make nsIInputStream::read scriptable?
Attachment #8379618 -
Attachment is obsolete: true
Attachment #8385858 -
Flags: review?(irving)
Comment 10•11 years ago
|
||
Joshua, Mark, Kent - what do you think about making nsIInputStream::read() scriptable?
Flags: needinfo?(standard8)
Flags: needinfo?(kent)
Flags: needinfo?(Pidgeot18)
Comment 11•11 years ago
|
||
(In reply to :Irving Reid from comment #10)
> Joshua, Mark, Kent - what do you think about making nsIInputStream::read()
> scriptable?
There is no need to: nsIScriptableInputStream is the proper way to go.
(Although there is a mild amount of issues if you have messages with embedded NUL characters and you don't code carefully).
(In reply to opera wang from comment #9)
> Also I did a quick test for nsIMsgFolder::getMsgInputStream() +
> nsIMsgFolder::getMsgTextFromStream(), it can give me the text part of the
> body, but can't get the msg header or attachment part, and seems can't get
> the html part too. So I still would like to have this interface.
If you want headers, then nsIMsgBodyHandler is absolutely not what you want. You can use nsIMimeHeaders if you want a sucky interface or mimeParser.jsm if you want a non-sucky interface.
If you want both the text and the HTML part, you want to truly speak mime, and the body handler is the wrong thing for you. JSMime or GLoda's MimeMsg are arguably both better features. In fact, when JSMime 0.2 lands, JSMime will be able to replicate libmime's magic charset conversion functionality which does far better than any of our crappy ad-hoc MIME parsers. The downside is you have to figure out which MIME parts are body parts yourself.
I see no compelling reason to make this interface public if nsIMsgFolder::getMsgInputStream exists.
Flags: needinfo?(standard8)
Flags: needinfo?(kent)
Flags: needinfo?(Pidgeot18)
Assignee | ||
Comment 12•11 years ago
|
||
Thanks Joshua, your suggestions helps me a lot.
Kent, my preliminary test shows that both NetUtil.readInputStreamToString or getMsgTextFromStream works, while streamMessage + getMsgTextFromStream crash TB
bodyRegex.match = function _match(aMsgHdr, aSearchValue, aSearchOp) {
let folder = aMsgHdr.folder;
let got;
if ( 0 ) { // will crash
let messenger = Cc["@mozilla.org/messenger;1"].createInstance(Ci.nsIMessenger);
let listener = Cc["@mozilla.org/network/sync-stream-listener;1"].createInstance(Ci.nsISyncStreamListener);
let uri = folder.getUriForMsg(aMsgHdr);
messenger.messageServiceFromURI(uri).streamMessage(uri, listener, null, null, false, "");
got = folder.getMsgTextFromStream(listener.inputStream, aMsgHdr.Charset, aMsgHdr.messageSize /*read*/, aMsgHdr.messageSize /*max output*/, false/*compressQuotes*/, true/*strip HTML*/, { }/*contentType*/);
} else {
let result = {};
let stream = folder.getMsgInputStream(aMsgHdr, result);
let mimeType = {};
try {
//got = NetUtil.readInputStreamToString(stream, aMsgHdr.messageSize); // read the whole message
got = folder.getMsgTextFromStream(stream, aMsgHdr.charset || '', aMsgHdr.messageSize, aMsgHdr.messageSize, false, false, mimeType);
dump("mimeType:" + mimeType.value);
} catch (err) { }
if ( !result.value ) stream.close(); // close if not reusable
}
return false;
};
Comment 13•11 years ago
|
||
(In reply to opera wang from comment #12)
> let uri = folder.getUriForMsg(aMsgHdr);
> messenger.messageServiceFromURI(uri).streamMessage(uri, listener,
> null, null, false, "");
I'm guessing the crash happens somewhere in this time frame, which isn't surprising, since the message header hasn't been added to the database at this point in time yet.
Assignee | ||
Comment 14•11 years ago
|
||
I'm try the 'quick filter' mode, not 'Message Filters', so the issue might be:
The streamMessage/nsISyncStreamListener call processNextEvent or just suspend current thread and return. But the match for the current header is not finished yet, that will cause re-enter issue for nsMsgSearchOfflineMail::Search(bool *aDone) @ nsMsgLocalSearch.cpp and crash
Reporter | ||
Comment 15•11 years ago
|
||
If you can accomplish a body regex custom search successfully with getMsgTextFromStream then I have no issues with abandoning this bug, and just using that.
Comment 16•11 years ago
|
||
Getting rid of filtaquilla which becomes now obsolete to (regex) filter by
headers
to
from
subject
& the missing one: body
Many thanks to opera wang to introduce all those options in the addon he is maintaining:
Expression Search / GMailUI 0.8.8 beta
http://www.sendspace.com/file/dsfhu4
Assignee | ||
Comment 17•11 years ago
|
||
The 2 addons are for different purpose so no one is obsolete.
But anyone who is interested can try the beta version and maybe Kent can add to filtaquilla later.
Thanks.
Comment 18•11 years ago
|
||
Comment on attachment 8385858 [details] [diff] [review]
bug_868233_v2.patch
Review of attachment 8385858 [details] [diff] [review]:
-----------------------------------------------------------------
(In reply to Joshua Cranmer [:jcranmer] from comment #11)
> (In reply to :Irving Reid from comment #10)
> > Joshua, Mark, Kent - what do you think about making nsIInputStream::read()
> > scriptable?
>
> There is no need to: nsIScriptableInputStream is the proper way to go.
...
> I see no compelling reason to make this interface public if
> nsIMsgFolder::getMsgInputStream exists.
I agree; if getMsgInputStream + nsIScriptableInputStream provides equivalent access to the message data, we don't need nsIMsgBodyHandler.idl. See http://dxr.mozilla.org/comm-central/source/mailnews/test/resources/mailTestUtils.js#73 for the one place we test that code path...
As Joshua points out, you really need JSMime to get the character encodings etc. correct, but that would be true of either approach.
I'm clearing the review request for now; we can bring this back up if getMsgInputStream doesn't work out.
Attachment #8385858 -
Flags: review?(irving)
Comment 19•11 years ago
|
||
Hey Opera,
i'm missing this to be more precise:
contains
doesn't contain
is
isn't
& not "matches" "doesn't match"
for regular expressions.
I'm using expression search 0.8.8 beta.
ex:
if i want to create a regex rule where there's only urls in the body,
i can't use "is"
as well as i want to have a "contains" option for regex, so it doesn't ignore
everything else.
Thanks
Comment 20•11 years ago
|
||
Hey Opera, there’s a serious issue. gmail ui is not recognizing my regular expression
that works everywhere else online...
Expression:
(?ism)([>]|\s|^|"|<<)(-?i|i'll|i'd|i'm|im|id)([<]|\s)
you can see a test here
http://regexr.com/v1?38mmb
step: create an email with for ex:
Im
In the body
& create for body (regex) a filter to move it to trash
(?ism)([>]|\s|^|"|<<)(-?i|i'll|i'd|i'm|im|id)([<]|\s)
Nothing happens..
Comment 21•11 years ago
|
||
Reporter | ||
Comment 22•9 years ago
|
||
Not needed since we have a working alternative.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
You need to log in
before you can comment on or make changes to this bug.
Description
•