For a11y ENG review:
Please provide an explanation of the feature or change. Include a description of the user scenario in which it would be used and how the user would complete the task(s).
The feature is called Text Recognition, known more broadly as Optical Character Recognition (OCR). In short, this feature extracts text from images shown on a webpage in Firefox and presents the extracted text in a visual modal.
Users will be able to trigger the feature via a context menu item on an image. While the feature will automatically copy ALL of the processed text to the user's clipboard, users can also read the text presented in the modal with the VoiceOver screen reader, or select a smaller portion to copy and paste, look up, or share with others.
Please note that our current implementation is macOS-only, built on the core macOS Vision API revision VNRecognizeTextRequestRevision2.
The Apple Developer documentation is here: https://developer.apple.com/documentation/vision/vnrecognizetextrequestrevision2?language=objc
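For reviewers unfamiliar with the Vision framework, the sketch below shows roughly how that API is driven. This is a hedged illustration, not Firefox's actual implementation (which lives in platform C++/Objective-C glue, not app-level Swift); the function name recognizeText and its callback shape are hypothetical.

```swift
import Vision
import CoreGraphics

// Hypothetical sketch: extract lines of text from an image using the
// Vision framework's text-recognition request (revision 2).
func recognizeText(in image: CGImage, completion: @escaping ([String]) -> Void) {
    let request = VNRecognizeTextRequest { request, _ in
        guard let observations = request.results as? [VNRecognizedTextObservation] else {
            completion([])
            return
        }
        // Each observation carries ranked candidate strings; take the top one.
        let lines = observations.compactMap { $0.topCandidates(1).first?.string }
        completion(lines)
    }
    request.revision = VNRecognizeTextRequestRevision2
    request.recognitionLevel = .accurate

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try? handler.perform([request])
}
```

The recognized lines are what a feature like this would then join together for the clipboard and the modal.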
If users select the "Copy Text from Image" context menu item, the text extracted from the image will automatically be copied to their clipboard.
How do we test this?
There is a Nightly Experiments pref called "Text Recognition" that you can enable. Please consult Greg Tatum for additional instructions. This is only available on macOS 10.15 or newer.
When will this ship?
Tracking bug/issue: https://bugzilla.mozilla.org/show_bug.cgi?id=1782574 and https://mozilla-hub.atlassian.net/browse/FFXP-1977
Design documents (e.g. Product Requirements Document, UI spec):
Figma Prototypes: https://www.figma.com/file/Yn8yqzRvYaTxBiXdzZaw28/Text-Recognition-in-Images?node-id=413%3A33222
Engineering lead: Greg Tatum (Maire Reavy for senior oversight)
Product manager: Karen Kim
The accessibility team has developed the Mozilla Accessibility Release Guidelines which outline what is needed to make user interfaces accessible:
Please describe the accessibility guidelines you considered and what steps you've taken to address them:
Morgan Rae Reschenberg and Asa Dotzler have been giving us invaluable feedback on the UX design for this feature to make it more accessible.
Ryan Casey has updated the above linked Figma, which includes several a11y sections: a contrast audit via Figma plugins, High Contrast and Dark Mode mockups, and tab ordering.
Describe any areas of concern to which you want the accessibility team to give special attention:
There are several screen reader unknowns that we want to work through during the Firefox 106 cycle, mainly how to get VoiceOver specifically to read the extracted text presented in the modal before reading the "Close" label on the button at the very bottom of the modal. Will we need to establish a dedicated accessibility node for Text Recognition modal content (similar to alt text fields)?
Thank you so much in advance!