727819 - Make PDF.js accessible

Reporter

Description

12 years ago

This is a follow-up thread from our Mozilla Central integration discussion in:

https://bugzilla.mozilla.org/show_bug.cgi?id=714712

Basically the main challenge is that we use a 2d Canvas to render the PDF content, and we're not sure what the best or recommended solution is for making a Canvas app accessible.

At the moment we have a text selection backend that replicates the Canvas text into regular divs in the DOM. However, this is mostly a hack, so we're considering going all-out Canvas - see rationale below:

https://github.com/mozilla/pdf.js/pull/1205

Because of this new proposed backend, ideally any A11Y solution would not depend on our current text selection layer - although in principle we could create one just for the sake of A11Y.
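
For readers unfamiliar with the approach, the text selection layer mentioned above can be pictured roughly as follows. This is only a minimal sketch under stated assumptions, not the actual PDF.js code: the `TextItem` shape, the `escapeHtml` and `buildTextLayerMarkup` helpers, and the `textLayer` class name are hypothetical names chosen for illustration. The idea is that each run of text painted on the Canvas gets a transparent, absolutely positioned div at the same coordinates, so the browser (and a screen reader) sees real text.

```typescript
// Hypothetical shape for one run of text painted on the canvas
// (an assumption for this sketch, not the PDF.js API).
interface TextItem {
  str: string;      // the text that was painted
  x: number;        // left edge in CSS pixels
  y: number;        // top edge in CSS pixels
  fontSize: number; // font size in CSS pixels
}

// Escape characters that are special in HTML so the text is inserted literally.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build markup for a layer of divs meant to sit on top of the canvas.
// The text is transparent: it is there for selection and assistive
// technology, while the visible glyphs still come from the canvas.
function buildTextLayerMarkup(items: TextItem[]): string {
  const divs = items.map(
    (item) =>
      `<div style="position:absolute;left:${item.x}px;top:${item.y}px;` +
      `font-size:${item.fontSize}px;color:transparent">` +
      `${escapeHtml(item.str)}</div>`
  );
  return `<div class="textLayer">${divs.join("")}</div>`;
}
```

A layer like this keeps the DOM in sync with the Canvas only as long as every painted string is mirrored, which is part of why it is described above as a hack.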

I see there's some sort of support for caret selection in the W3C Canvas spec, but without any examples or experience in the subject it's hard to know what exactly the proposed API does:

http://www.w3.org/TR/2dcontext/#caret-selection

It seems to suggest we can navigate the content of a Canvas with a caret (F7?), but it's hard to imagine how that works since Canvas is rasterized. I must be missing something.

We're open to suggestions and words of wisdom from folks who have experience with making Canvas content accessible (Or perhaps PDF.js is going to be the very first accessible Canvas app? That'd be awesome :))

I guess a good starting point is: What A11Y addon/software should we use to test PDF.js against?

David Bolter [:davidb] (NeedInfo me for attention)

Comment 1

12 years ago

Thanks for filing this Artur. I'm cc'ing
Marco - to QA any builds we can throw his way.
Alexander, Ehsan - for comments on caret/selection.

For end user testing we generally recommend using:
http://www.nvda-project.org/wiki/Download (on Windows)

Other tools and dev details are mentioned here:
https://wiki.mozilla.org/Accessibility/Contribute

Marco Zehe (:MarcoZ)

Comment 2

12 years ago

To me, this sounds like it heavily depends on the work done in bug 495912, right?

Masatoshi Kimura [:emk]

Comment 3

12 years ago

Focus management / Caret and selection management were removed from latest Editor's draft.
http://dev.w3.org/html5/2dcontext/Overview.html
See public-html-a11y about discussions.
http://lists.w3.org/Archives/Public/public-html-a11y/

arky [:arky]

Comment 4

12 years ago

I did some early a11y testing on Linux using Orca. There were some UI issues with keyboard navigation, unnamed buttons and a lack of accessible notifications. I believe that the pdf.js UI was planned to be revamped.

The PDF content was accessible to the screen reader for the first few pages. There was a bug related to this somewhere.

Currently on community building road trip in India. Will pull the latest sources and continue the testing.

Michael[tm] Smith [:sideshowbarker]

Comment 5

12 years ago

(In reply to Masatoshi Kimura [:emk] from comment #3)
> Focus management / Caret and selection management were removed from latest
> Editor's draft.
> http://dev.w3.org/html5/2dcontext/Overview.html

I realize there have been some claims they were removed, but if you examine the actual editor's draft, you'll see that's simply not true. They were never removed. The editor's draft in fact has included text that attempts to address them since before May of last year; see the following:

http://dev.w3.org/html5/2dcontext/#dom-context-2d-drawsystemfocusring
http://dev.w3.org/html5/2dcontext/#dom-context-2d-drawcustomfocusring
http://dev.w3.org/html5/2dcontext/#dom-context-2d-scrollpathintoview

Whether or not that spec text actually addresses the use cases and requirements sufficiently is something that would probably be better discussed in the W3C Bugzilla.

Marco Zehe (:MarcoZ)

Comment 6

12 years ago

I ran a test with this URL:
http://mozilla.github.com/pdf.js/web/viewer.html

First thing I noticed was that there actually is text in the HTML that a screen reader can work with. At the top there are the controls, below that the text begins by listing all the authors.

What I noticed was that there do not seem to be any headings. I don't know the original of the PDF shown here, so I don't know whether its tags (if any) provide that information. If we are dealing with a tagged PDF, though, those tags should definitely be translated by pdf.js, since they exist specifically for accessibility.
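
To make the idea of translating tags concrete, here is a rough sketch. Structure types such as `Sect`, `H1`, `P`, `L`, `LI`, `Table`, `TR`, `TH` and `TD` are standard structure types in the PDF specification, but the HTML pairing, the `StructNode` shape and the `htmlTagFor`, `escapeText` and `renderStructure` helpers are assumptions made for illustration, not anything pdf.js exposes.

```typescript
// A few standard PDF structure types paired with rough HTML equivalents.
// The pairing itself is an illustrative assumption.
const STRUCTURE_TO_HTML: { [structType: string]: string } = {
  Sect: "section",
  H1: "h1",
  H2: "h2",
  H3: "h3",
  P: "p",
  L: "ul",
  LI: "li",
  Table: "table",
  TR: "tr",
  TH: "th",
  TD: "td",
};

// Hypothetical, simplified node of a tagged PDF's structure tree.
interface StructNode {
  type: string;
  text?: string;
  children?: StructNode[];
}

// Look up the HTML tag for a structure type, falling back to a plain div
// for anything the table does not cover.
function htmlTagFor(structType: string): string {
  const known = Object.prototype.hasOwnProperty.call(STRUCTURE_TO_HTML, structType);
  const tag = known ? STRUCTURE_TO_HTML[structType] : undefined;
  return tag || "div";
}

// Escape text content so it is inserted literally into the markup.
function escapeText(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Recursively turn a structure tree into nested HTML markup.
function renderStructure(node: StructNode): string {
  const tag = htmlTagFor(node.type);
  const text = node.text ? escapeText(node.text) : "";
  const children = (node.children || []).map((child) => renderStructure(child)).join("");
  return `<${tag}>${text}${children}</${tag}>`;
}
```

With headings emitted as real `h1` to `h3` elements, a screen reader user could jump between them with the usual heading navigation keys.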

If the PDF has no tags, we can make an approximation. Adobe, for example, can deduce tables and some other structures by analyzing untagged PDFs and providing an educated guess on accessibility.

In fact, much of what this PDF shows, esp in the code examples, looks like it would in such an educated guess scenario of Adobe Reader.

I also noticed that the PDF never seems to load fully. Whenever I hit Ctrl+End to go to the end of the document, new portions were loaded and appended.

Responsiveness was OK, there was only a slight lag when new stuff was loaded.

So, whatever mirroring takes place currently is actually quite useable with a screen reader such as NVDA that walks our accessibility tree.

David Bolter [:davidb] (NeedInfo me for attention)

Comment 7

12 years ago

(In reply to Marco Zehe (:MarcoZ) from comment #6)
> I ran a test with this URL:
> http://mozilla.github.com/pdf.js/web/viewer.html
> 
> First thing I noticed was that there actually is text in the HTML that a
> screen reader can work with. At the top there are the controls

Marco, are the controls all labelled well?

David Bolter [:davidb] (NeedInfo me for attention)

Comment 8

12 years ago

I'll note that for a sighted keyboard user the controls seem to work perfectly.

Marco Zehe (:MarcoZ)

Comment 9

12 years ago

(In reply to David Bolter [:davidb] from comment #7)
> Marco, are the controls all labelled well?

Yes, no problem there!

(no longer active)

Comment 10

12 years ago

Can somebody please file a new bug only about the caret/selection stuff, and provide a clear description of what needs to happen there?  It's very hard to extract this information from this bug.

David Bolter [:davidb] (NeedInfo me for attention)

Comment 11

12 years ago

(In reply to Ehsan Akhgari [:ehsan] from comment #10)
> Can somebody please file a new bug only about the caret/selection stuff, and
> provide a clear description of what needs to happen there?  It's very hard
> to extract this information from this bug.

And cc me.

I heard we are changing the selection backend?

Bill Walker [:bwalker] [@wfwalker]

Comment 12

12 years ago

(In reply to David Bolter [:davidb] from comment #11)
> (In reply to Ehsan Akhgari [:ehsan] from comment #10)
> > Can somebody please file a new bug only about the caret/selection stuff, and
> > provide a clear description of what needs to happen there?  It's very hard
> > to extract this information from this bug.
> 
> And cc me.
> 
> I heard we are changing the selection backend?


Although we have investigated Canvas-based text selection as a prototype, we have not committed to it, no.

David Bolter [:davidb] (NeedInfo me for attention)

Comment 13

11 years ago

Marco do we have a bug filed for tagged pdf?

(BTW I'm not sure this bug is useful anymore?)

Flags: needinfo?(marco.zehe)

James Teh [:Jamie]

Comment 14

11 years ago

Two things worth noting:
* Support for tagged PDF (and guessing where there aren't tags) will very much change the structure of the HTML representation of the content. Aside from headings, tables, etc., text should also flow better. That is, a single block of content (e.g. a paragraph) should appear in a single block element instead of multiple block elements. Right now, text breaks in awkward places.
* Tagged PDF can specify the reading order of the content. In extreme cases, the reading order can actually mix content from different pages. There are valid use cases for this; e.g. a 2-page brochure where you are meant to read some parts across both pages instead of reading all of one page and then the other.
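
The two points above can be illustrated with a small sketch. It assumes, hypothetically, that each text fragment carries the id of the block-level structure element it belongs to, and that a separate list gives the logical reading order of those blocks. Neither the `Fragment` shape nor `assembleInReadingOrder` is an existing pdf.js API.

```typescript
// Hypothetical text fragment as it might come out of page rendering.
interface Fragment {
  page: number;    // page the fragment was drawn on
  blockId: string; // id of the paragraph-level element it belongs to
  text: string;
}

// Merge all fragments of a block into one string, then emit the blocks in
// the logical reading order rather than in page order.
// Joining with a single space is a simplification for this sketch.
function assembleInReadingOrder(fragments: Fragment[], readingOrder: string[]): string[] {
  // Object.create(null) avoids accidental hits on inherited keys.
  const byBlock: { [id: string]: string[] } = Object.create(null);
  fragments.forEach((fragment) => {
    const parts = byBlock[fragment.blockId] || [];
    parts.push(fragment.text);
    byBlock[fragment.blockId] = parts;
  });

  const blocks: string[] = [];
  readingOrder.forEach((id) => {
    const parts = byBlock[id];
    if (parts) {
      blocks.push(parts.join(" "));
    }
  });
  return blocks;
}

// A 2-page brochure whose headline runs across both pages.
const brochure: Fragment[] = [
  { page: 1, blockId: "headline", text: "Welcome to" },
  { page: 1, blockId: "left", text: "Left column text." },
  { page: 2, blockId: "headline", text: "our brochure" },
  { page: 2, blockId: "right", text: "Right column text." },
];
const brochureBlocks = assembleInReadingOrder(brochure, ["headline", "left", "right"]);
```

Here the headline ends up as one block even though its two halves live on different pages, which is the cross-page case described in the second bullet.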

Marco Zehe (:MarcoZ)

Comment 15

11 years ago

(In reply to David Bolter [:davidb] from comment #13)
> Marco do we have a bug filed for tagged pdf?

Yes, bug 861157.

Flags: needinfo?(marco.zehe)

David Bolter [:davidb] (NeedInfo me for attention)

Updated

11 years ago

Depends on: tagged-pdf

Brendan Dahl [:bdahl]

Updated

3 years ago

Depends on: 1708041

BMO Automation

Updated

2 years ago

Severity: normal → S3

Marco Castelluccio [:marco]

Comment 16

11 months ago

This tracking bug has served its purpose.

Status: NEW → RESOLVED

Closed: 11 months ago

Resolution: --- → INVALID

Categories

(Firefox :: Disability Access, defect)

People

(Reporter: aadib, Unassigned)

References

(Depends on 1 open bug)

Details

Security

(public)
