Open Bug 2016935 Opened 2 months ago Updated 12 hours ago

Searching emails in Thunderbird should include searching in content of (.pdf) attachments.

Categories

(MailNews Core :: Search, enhancement)

Thunderbird 140
enhancement

Tracking

(Not tracked)

UNCONFIRMED

People

(Reporter: stig, Unassigned)

Details

Actual results:

When searching in Thunderbird for terms that appear only in the (pdf) attachment of an email, the email is not found. That is really annoying. Some webshops no longer name the items you have bought in the bodies of the mails they send, but only in attached pdf receipts and order confirmations. I have several times found myself searching for old receipts for things I know I bought online, but don't remember when or where. Such receipts are then really hard to find, and I have often given up.

It works on online email services like Gmail and Mailfence. It should work in the Thunderbird client too. It is a very useful feature.

You could of course include support for formats other than pdf, but I would assume that for most people pdf support is all that is really needed.

I'm using the Thunderbird ESR client on Windows.

Does pdf.js have functionality that would let us easily extract the document text for searching?

Flags: needinfo?(cdenizet)
Summary: Searching emails in Thunderbird should include searching in content of (pdf) attachments. → Searching emails in Thunderbird should include searching in content of (.pdf) attachments.

Yes it's possible.
If you want to get all the pdf text:
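A minimal sketch of that extraction, assuming a document object shaped like the one pdf.js resolves from `getDocument(...).promise` (a `numPages` count, 1-based `getPage(n)`, and `getTextContent()` returning items with a `str` field). The `DocLike`, `PageLike` and `TextItemLike` types and the `extractPdfText` name are made up for illustration; they only mirror the parts of the pdf.js API used here so the snippet compiles on its own.

```typescript
// Local structural types mirroring the small part of the pdf.js API used below.
// They are illustrative stand-ins, not types exported by pdf.js.
interface TextItemLike {
  str?: string; // marked-content items carry no text, so this is optional
}
interface PageLike {
  getTextContent(): Promise<{ items: TextItemLike[] }>;
}
interface DocLike {
  numPages: number;
  getPage(pageNumber: number): Promise<PageLike>;
}

// Returns the plain text of each page, one string per page.
async function extractPdfText(doc: DocLike): Promise<string[]> {
  const pages: string[] = [];
  // Page numbers are 1-based in pdf.js.
  for (let n = 1; n <= doc.numPages; n++) {
    const page = await doc.getPage(n);
    const content = await page.getTextContent();
    const words: string[] = [];
    for (const item of content.items) {
      // Skip items without text (for example marked-content markers).
      if (typeof item.str === "string" && item.str !== "") {
        words.push(item.str);
      }
    }
    // Joining with a space is a simplification; it ignores layout.
    pages.push(words.join(" "));
  }
  return pages;
}
```

With pdf.js itself, `doc` would come from something like `await pdfjsLib.getDocument({ data }).promise`, where `data` holds the attachment bytes, so no viewer has to be opened. How Thunderbird would hand those bytes over is not covered by this sketch.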

But if you want to be able to highlight the results in the pdf (like for a normal search), it could be slightly more work, depending on how the search is made. If we search for a single string, then it's just a matter of almost replicating what we currently have for a normal search. If we search using a regexp, then our find controller has to be updated:

Flags: needinfo?(cdenizet)

Thanks! Preferably there would be a way to search without loading the .pdf in a browser, just by obtaining the text from the file content.

If Thunderbird were able to just find and list all e-mails with attached PDFs containing the search term, that would be enough! Even if the PDF is large enough to be hard to read, the user could just do a Ctrl-F to find the string inside the PDF.
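As a rough illustration of that "just list the matching e-mails" behaviour, here is a minimal sketch that assumes the text of each PDF attachment has already been extracted and stored next to the message. The `IndexedMessage` shape and the `findMessagesByAttachmentText` function are hypothetical and are not part of Thunderbird's search code; the match is a plain case-insensitive substring test with no highlighting.

```typescript
// Hypothetical record: a message plus the already extracted text of its PDF attachments.
interface IndexedMessage {
  id: string;
  subject: string;
  attachmentTexts: string[];
}

// Returns the messages where at least one attachment contains the search term.
function findMessagesByAttachmentText(
  messages: IndexedMessage[],
  term: string
): IndexedMessage[] {
  const needle = term.trim().toLocaleLowerCase();
  // An empty search term matches nothing rather than everything.
  if (needle === "") {
    return [];
  }
  return messages.filter((message) =>
    message.attachmentTexts.some((text) =>
      text.toLocaleLowerCase().includes(needle)
    )
  );
}
```

The result is only the list of messages; locating the term inside a given PDF is left to the viewer's own Ctrl-F, as suggested above.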
