Open Bug 1562737 Opened 5 years ago Updated 11 months ago

Allow body search in encrypted messages ("regular search", QFB, filters)


(MailNews Core :: Search, enhancement, P3)



(Not tracked)


(Reporter: jorgk-bmo, Unassigned)




(1 file)

Body search uses its own "mini MIME" parser to interpret a message stream:

We could smarten that up to detect application/pkcs7-mime parts and run them through decryption before processing them further. That might even be possible for application/pgp-encrypted parts.

Kai, what do you think?

P.S.: This has been on my mind for a while, but today I wasted hours searching for an e-mail and it turned out to be encrypted and the search didn't find it :-(

Flags: needinfo?(kaie)

This is a dupe of bug 188988. Thanks for your idea for a new approach to solve the problem.
I don't mind having that functionality. Decrypting "only for searching" is better than other suggested that had been made in the other bug. (I don't like the idea to store all bugs unencrypted, and I don't like the idea of a plaintext fulltext index for encrypted messages.) However, your search would probably be quite slow, and while searching, you might get prompts for the master password, or for GnuPG key passwords. We'd have to invent a way to skip all decryption attempts during the search, if the user cannot enter the correct password.

Flags: needinfo?(kaie)

I didn't know that bug. Yes, the search would be slow, but better slow than wrong. Sure, we need to skip cases were we can't decrypt.

So, are you aware of a handy interface to pass in the encrypted part and get it decrypted. I haven't dug through MIME yet.

Flags: needinfo?(kaie)

The decryption of S/MIME messages is triggered in mailnews/mime/src/mimecms.cpp.
We have a streaming decoder, see how the decoder_context object is used.

nsCOMPtr<nsICMSDecoder> decoder_context = do_CreateInstance(NS_CMSDECODER_CONTRACTID, &rv);

You initialize it with a callback function that will be called with decrypted content.

decoder_context->Start(content_callback, data);

Then you have to feed chunks of the encrypted input data into decoder_context->Update(encrypted_input), which will decrypt the chunks, and call content_callback(decrypted_content). When you're done with all input, call decoder_context->Finish(info_object). You can ignore the contents of the info_object for your purpose.

The implementation of content_callback can search for the needle, but doesn't need to do anything else with the haystack it's receiving. However, a single call to content_callback might contain only a subset of needle. So to be really certain, you'd have to search across boundaries of chunks. A possible solution is to grow a large buffer of all decrypted content, and delay the search until you have processed all input.

This will also help with S/MIME (CMS) messages that are signed-only (not encrypted), but uses the CMS opaque encoding, so a simple search through it doesn't work.

Flags: needinfo?(kaie)

OK, and for application/pgp-encrypted parts?

I haven't yet studied how PGP processing in Enigmail works.

It goes via the PGP Proxy. I tried to document it a bit once upon a time:

But there's most likely not a handy call like for S/MIME.

See Also: → 280588

(In reply to Kai Engert (:KaiE:) from comment #1)

This is a dupe of bug 188988. Thanks for your idea for a new approach to solve the problem.

Actually, no, that bug is for Gloda search which "should" already work, but doesn't.

See Also: → 188988
Summary: Allow body search in encrypted messages → Allow body search in encrypted messages ("regular search", QFB, filters)
Severity: normal → N/A
Priority: -- → P3


is there an ETA for this patch? will it make it to the June/July 2023 release of TB?

I use TB108 latest, and the "global"/gloda search is not finding any encrypted email with search-terms which are used in those encrypted emails.


I tried 1562737-body-search-encrypted.patch
It didn't build with Beta 114, I had to tweak it.

It worked, and gave me a search result, but then it's stuck in an endless loop, blocking Thunderbird.
Stuck is: nsMsgBodyHandler2::GetNextLine, while (eatThisLine).

This will need more work.

Did you use the latest version?
That's rebased to trunk but we haven't tested it yet.

(In reply to BB from comment #12)

Did you use the latest version?

Same endless loop.

The end-of-buffer detection isn't right in the new code. I see nsMsgBodyHandler2::GetNextLocalLine is called *m_currInput == 0, but m_currInputEnd points 3 bytes later.

I think we don't need m_currInputEnd, I think we can check for the zero terminator.
I'm experimenting and will attach a patch that works for me.

Attached patch ontop.patchSplinter Review

I've applied this fix on top of the latest betterbird patch, and that works for me.

Apologies for not looking at this earlier.
I had not noticed this patch implements deep searching, which is fine.

(Earlier I had incorrectly assumed this is still the enable-plaintext-gloda-index-for-encrypted-messages, which I don't like.)

We constantly update our patches. This patch has shipped since version 91, but wasn't tested in 115 yet. Thanks for looking into it. You might also want to study the history of this code which you can find in the patch header, there have been crash issues. Altogether, the patch has aspects that should be improved, for example including a CPP file is not the best way to do things, but offered itself for a downstream project.

Note that bug 1719121 will impact this code.

A personal anecdote, when I originally joined the Mozilla project in 2001, one the things I wanted to get done was searching in encrypted emails, but I had never been able to focus on it. This patch is the closest I have seen yet, and I'm happy that we might be close to implementing it - even if it's slow. So I'd support an effort to get this patch improved and landed in the near future.

(In reply to Kai Engert (:KaiE:) from comment #13)

The end-of-buffer detection isn't right in the new code. I see nsMsgBodyHandler2::GetNextLocalLine is called *m_currInput == 0, but m_currInputEnd points 3 bytes later.

We've looked at it. The actual string and its length shouldn't get out of step. The problem is likely the way the decryption result is passed around in a static variable which is prone to overwriting, so the string changes while we still hold the length of the previous content: (note the comment).
SetMimeCallback() requires a static function which forces the class variables it can access down to static as well.
We've updated our solution which a simple change. So please download the latest version for further testing.

We tested this now based on latest trunk on a folder with real life data and saw no hang but instead the correct search results. It's possible that the crashes we mentioned were also related to the "static issue".

The ontop.patch appears to hide the real issue.

You need to log in before you can comment on or make changes to this bug.