Open Bug 1950656 Opened 10 months ago Updated 1 month ago

[a11y] Automatically create outline (table of contents) metadata from page's headings (<h1>, <h2>, etc.) when printing/saving a page to PDF

Categories

(Core :: Printing: Output, enhancement)

Firefox 135
enhancement

UNCONFIRMED
Accessibility Severity s3

People

(Reporter: nekohayo, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: access)

Summary

For both convenience and accessibility compliance, when "printing" (saving) a page to PDF, the web browser engine should automatically create the outline (the dynamic table of contents shown in PDF readers' sidebars), which lists the headings and their corresponding page positions, so that both software and users can easily see the entire document structure and navigate through it. This would be in addition to preserving the page title in the PDF title metadata (bug #1939532).

Here are two examples of pages that can serve as easy testcases, as they make good use of h1/h2/h3 headings:

Info about implementation in Skia

Searching for "pdf heading print" in Chromium's issue tracker, I found https://issues.chromium.org/issues/41387522, which mentions that the Skia graphics library already supports this feature, including:

Additional things I found:

See Also: → 1939532, 1322653, 1321689

This would be one aspect of Tagged PDF generation.

(Currently, our PDF output is generated via cairo, not Skia; presumably the cairo_tag_{begin,end} APIs could be used to generate this.)

Blocks: 1657973

With SkPDF, once we have the structure tree including headings, it should be trivial to generate the document outline. See SkPDF::Metadata::fOutline. Thus, I'm reversing the dependency relationship here, since bug 1657973 is about building the structure tree.
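To illustrate why the outline follows so directly from the headings, here is a minimal sketch of the nesting step in Python. It is not Gecko or Skia code: `OutlineNode`, `build_outline`, and the `(level, title, page)` input tuples are hypothetical names made up for this example, and it assumes the headings have already been collected in document order along with the page each one lands on.

```python
from dataclasses import dataclass, field


@dataclass
class OutlineNode:
    """One outline entry (hypothetical type, for illustration only)."""
    title: str
    level: int   # 1 for <h1>, 2 for <h2>, ...
    page: int    # 1-based page on which the heading was printed
    children: list = field(default_factory=list)


def build_outline(headings):
    """Nest a flat, document-ordered list of (level, title, page) tuples."""
    roots = []
    stack = []  # chain of currently open ancestors, shallowest first
    for level, title, page in headings:
        node = OutlineNode(title, level, page)
        # Close every open heading at the same or a deeper level.
        while stack and stack[-1].level >= level:
            stack.pop()
        if stack:
            stack[-1].children.append(node)
        else:
            roots.append(node)
        stack.append(node)
    return roots


def print_outline(nodes, depth=0):
    """Print the outline the way a PDF reader's sidebar might show it."""
    for node in nodes:
        print("  " * depth + f"{node.title} (p. {node.page})")
        print_outline(node.children, depth + 1)
```

A skipped level (for example an <h3> directly under an <h1>) is simply attached to the nearest shallower heading in this sketch; a real implementation may want to handle that case differently.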

No longer blocks: 1657973
Depends on: 1657973
Accessibility Severity: --- → s3
Keywords: access

Unfortunately, setting SkPDF::Metadata::fOutline to StructureElementHeaders isn't sufficient by itself. It seems that accumulating the text from headings relies on the SkTextBlob being constructed using text, but Gecko only provides glyph indexes.

I got this working in my prototype, but it required a small patch to Skia and I'm not sure it will be accepted upstream. I suspect the cleaner solution (as far as Skia is concerned) of fixing Gecko to provide text instead of just glyph indexes might be a much bigger lift, but I'm not familiar enough with the graphics code to have any real clue.
