1885505 - Screen readers will not read highlighted text inside a PDF

This bug is not an easy one.
On the PDF side, the highlight annotation should (I'd prefer "must") contain the boxes used to highlight.
So if we want to know which text is behind the highlight, we must compute the intersection of the glyph bounding boxes with the boxes used for highlighting; only then can we know exactly which text is highlighted.
That said, some PDF producers (like Acrobat) add the highlighted text to the struct tree when it exists.
So there are two things we can do right now to slightly improve the situation:

- Add the highlighted text to the struct tree.
- Use the text we have in the struct tree and link it to the text layer (if possible).
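
The intersection step described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not how the PDF viewer does or will do it: the `Rect` and `Glyph` types, the function names, and the 0.5 coverage threshold are all hypothetical, the boxes are assumed to be axis-aligned rectangles in page space (real highlight annotations may use rotated quadrilaterals), and the glyphs are assumed to arrive in reading order.

```typescript
// Axis-aligned rectangle [x1, y1, x2, y2] with x1 <= x2 and y1 <= y2 (assumption).
type Rect = [number, number, number, number];

// Hypothetical glyph record: one character plus its bounding box.
interface Glyph {
  char: string;
  box: Rect;
}

// Area of the overlap between two rectangles (0 when they do not intersect).
function overlapArea(a: Rect, b: Rect): number {
  const width = Math.min(a[2], b[2]) - Math.max(a[0], b[0]);
  const height = Math.min(a[3], b[3]) - Math.max(a[1], b[1]);
  return width > 0 && height > 0 ? width * height : 0;
}

// Keep the glyphs for which at least one highlight box covers at least
// `minCoverage` of the glyph's own area, and join their characters in order.
function highlightedText(
  glyphs: Glyph[],
  highlightBoxes: Rect[],
  minCoverage: number = 0.5 // arbitrary threshold chosen for this sketch
): string {
  return glyphs
    .filter((glyph) => {
      const area = (glyph.box[2] - glyph.box[0]) * (glyph.box[3] - glyph.box[1]);
      if (area <= 0) {
        return false; // degenerate box: cannot measure coverage
      }
      return highlightBoxes.some(
        (hl) => overlapArea(glyph.box, hl) / area >= minCoverage
      );
    })
    .map((glyph) => glyph.char)
    .join("");
}

// Usage: three 10x10 glyphs on one line, with a highlight over the last two.
const sampleGlyphs: Glyph[] = [
  { char: "a", box: [0, 0, 10, 10] },
  { char: "b", box: [10, 0, 20, 10] },
  { char: "c", box: [20, 0, 30, 10] },
];
const sampleText = highlightedText(sampleGlyphs, [[10, 0, 30, 10]]); // "bc"
```

A coverage threshold is used instead of a plain "boxes touch" test so that a glyph that only grazes the edge of a highlight box is not counted as highlighted.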

Marco Castelluccio [:marco]

Reporter

Comment 4

3 months ago

What is the expected behavior here?
Should we read the whole text, and then say which part is highlighted?

Flags: needinfo?(nlapre)

Nathan LaPré

Comment 5

3 months ago

The section of text that's highlighted should have attached formatting information in a way that screen readers can report. For instance, if you load a text fragments URL like data:text/html,abc<div>def</div>ghi#:~:text=def, and navigate with a screen reader to the highlighted text "def", you should be able to get it to report the formatting. In VoiceOver, the shortcut is VO+T. On my machine, it reports "highlighted" as you'd expect. (Side note: the same should work for the <mark> element, but that's slightly bugged in Firefox on macOS currently.) You could give a section of highlighted text the ARIA mark role to communicate the highlighted state.

Usually, screen readers won't report that something is highlighted unless you ask them to. In VO and NVDA, it's a special keyboard shortcut. Ultimately, it may not be hugely important to users to know about highlights, but it's good to make it possible to access for SR users.
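
As a rough illustration of the ARIA `mark` role suggestion above, the sketch below wraps a highlighted character range of a text run in a `<span role="mark">`. It only builds an HTML string; the helper names `escapeHtml` and `markHighlighted` are hypothetical, and whether the PDF viewer's text layer can be annotated this way, or how each screen reader reports it, is an assumption rather than something confirmed in this thread.

```typescript
// Escape characters that are special in HTML text and attribute values.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Return HTML for `text` in which the characters in [start, end) are wrapped
// in an element carrying the ARIA mark role. Out-of-range indices are clamped.
function markHighlighted(text: string, start: number, end: number): string {
  const s = Math.max(0, Math.min(start, text.length));
  const e = Math.max(s, Math.min(end, text.length));
  const before = escapeHtml(text.slice(0, s));
  const marked = escapeHtml(text.slice(s, e));
  const after = escapeHtml(text.slice(e));
  if (marked === "") {
    return before + after; // nothing highlighted: emit no empty span
  }
  return `${before}<span role="mark">${marked}</span>${after}`;
}

// Usage, mirroring the "abc def ghi" example from the comment above:
const markedHtml = markHighlighted("abcdefghi", 3, 6);
// markedHtml is 'abc<span role="mark">def</span>ghi'
```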

Flags: needinfo?(nlapre)

Marco Castelluccio [:marco]

Reporter

Comment 6

3 months ago

(In reply to Nathan LaPré from comment #5)
> The section of text that's highlighted should have attached formatting information in a way that screen readers can report. For instance, if you load a text fragments URL like data:text/html,abc<div>def</div>ghi#:~:text=def, and navigate with a screen reader to the highlighted text "def", you should be able to get it to report the formatting. In VoiceOver, the shortcut is VO+T. On my machine, it reports "highlighted" as you'd expect. (Side note: the same should work for the <mark> element, but that's slightly bugged in Firefox on macOS currently.) You could give a section of highlighted text the ARIA mark role to communicate the highlighted state.

Thanks!

> Usually, screen readers won't report that something is highlighted unless you ask them to. In VO and NVDA, it's a special keyboard shortcut. Ultimately, it may not be hugely important to users to know about highlights, but it's good to make it possible to access for SR users.

Should we still consider it S2 then? I'm wondering which one is more important between this one and https://connect.mozilla.org/t5/ideas/reader-mode-for-pdf-files/idc-p/70393.

Nathan LaPré

Comment 7

3 months ago

It being an S2 is a bit of a technicality, but is consistent with our triage guidelines. Ultimately, the existence of highlights isn't knowable for people with low or no vision. If you can see the screen, you get the information, otherwise you don't. So we would call the implementation of highlights "completely inaccessible."

That's not to say anything about the actual importance of the feature to people. Honestly I don't know whether it's a huge deal to users. Maybe, maybe not. I don't know of any user research on the topic. Anecdotally, I've heard lots of people complain about inaccessible/confusing text in PDFs (that reader view could help address?), but I've never heard anyone complain about highlights. If you have to choose between the two I think reader mode probably wins handily.

James Teh [:Jamie]

Comment 8

3 months ago

(In reply to Nathan LaPré from comment #5)
> Usually, screen readers won't report that something is highlighted unless you ask them to.

NVDA does. I've also seen VoiceOver do this, but we seem to be getting inconsistent results there.

(In reply to Marco Castelluccio [:marco] from comment #6)
> Should we still consider it S2 then?

A way to think about this is: how would you triage it if fully sighted users couldn't see highlights? NVDA does read these automatically, but even if it didn't, it might still be just as important to a screen reader user to perceive these as it is for a sighted user. The disability doesn't change the severity.

> I'm wondering which one is more important between this one and https://connect.mozilla.org/t5/ideas/reader-mode-for-pdf-files/idc-p/70393.

Regardless of the answer to that question, I don't think this impacts the severity of the bug. It might impact priority though. One thing to take into account is that Reader Mode for PDF doesn't exist yet. The PDF viewer does, and right now, part of it isn't accessible to some users.

Marco Castelluccio [:marco]

Reporter

Comment 9

3 months ago

(In reply to James Teh [:Jamie] from comment #8)
> (In reply to Nathan LaPré from comment #5)
> > Usually, screen readers won't report that something is highlighted unless you ask them to.
>
> NVDA does. I've also seen VoiceOver do this, but we seem to be getting inconsistent results there.
>
> (In reply to Marco Castelluccio [:marco] from comment #6)
> > Should we still consider it S2 then?
>
> A way to think about this is: how would you triage it if fully sighted users couldn't see highlights? NVDA does read these automatically, but even if it didn't, it might still be just as important to a screen reader user to perceive these as it is for a sighted user. The disability doesn't change the severity.

If NVDA / VoiceOver didn't read them automatically, then highlights would have been practically undiscoverable anyway. That's why I was suggesting it might not be S2.
The fact that NVDA and VoiceOver do read them automatically of course completely changes the perspective.

> > I'm wondering which one is more important between this one and https://connect.mozilla.org/t5/ideas/reader-mode-for-pdf-files/idc-p/70393.
>
> Regardless of the answer to that question, I don't think this impacts the severity of the bug. It might impact priority though. One thing to take into account is that Reader Mode for PDF doesn't exist yet. The PDF viewer does, and right now, part of it isn't accessible to some users.

Indeed, this is only a question of prioritization. As we must choose what to work on next, there's a choice to be made between the two options.
Should making those PDFs more accessible come before making the highlighting feature more accessible, or the opposite? They are both on our roadmap and currently this bug comes first, but I'm wondering whether reader mode should take its place instead. Had this bug been S3 instead of S2, the answer would have been clear. Given that it is S2, the answer is more difficult.
Fixing this bug will be quicker than implementing reader mode, but it would benefit only users who come across highlighted PDFs.
Implementing reader mode will be slower, so it will take longer to deliver value, but it would benefit far more users, as most PDFs are not tagged.

Bugzilla

Categories: Firefox :: PDF Viewer, defect, P3
People: Reporter: marco, Unassigned
Keywords: access
Security: public
