Bug 674612 - [10.7] Text to Speech on Lion not reading highlighted text.
Status: VERIFIED FIXED
Whiteboard: [qa+][gs][gssolved]
Keywords: regression
Product: Core
Classification: Components
Component: Disability Access APIs
Version: Trunk
Platform: All Mac OS X
Importance: -- major with 2 votes
Target Milestone: mozilla10
Assigned To: Steven Michaud [:smichaud] (Retired)
URL: http://getsatisfaction.com/mozilla_me...
Blocks: osxa11y
Reported: 2011-07-27 11:38 PDT by Doug Otis
Modified: 2011-12-04 18:14 PST

Attachments
Simple page to test with (224 bytes, text/html)
2011-09-20 11:06 PDT, Steven Michaud [:smichaud] (Retired)
Possible fix (4.57 KB, patch)
2011-09-24 13:12 PDT, Steven Michaud [:smichaud] (Retired)
Fix rev1 (special-case titlebar buttons) (4.66 KB, patch)
2011-09-24 14:56 PDT, Steven Michaud [:smichaud] (Retired)
surkov.alexander: review+
mzehe: feedback+

Description Doug Otis 2011-07-27 11:38:46 PDT
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:5.0.1) Gecko/20100101 Firefox/5.0.1
Build ID: 20110707182747

Steps to reproduce:

Highlighted text in the body of a reply being composed, but the highlighted text was not sent to the text to speech conversion utility.


Actual results:

Instead the Subject header field is read. 


Expected results:

The highlighted text should be read.

Worked well with Snow Leopard.
Comment 2 Doug Otis 2011-08-03 11:58:10 PDT
The difference between these two reports is that different (and incorrect) fields are being converted to speech instead of the highlighted text within the message body.  It should be noted that the other report did not indicate whether the text being highlighted was within the message body.
Comment 3 Steven Michaud [:smichaud] (Retired) 2011-09-06 08:48:37 PDT
Marco, do you have any ideas about this?
Comment 4 Marco Zehe (:MarcoZ) 2011-09-06 09:17:59 PDT
I suppose this is the Text-To-Speech menu item found in many applications' Edit menus. It allows highlighted text or other parts of documents etc. to be sent to the OS X text to speech system. This is NOT VoiceOver, the screen reader for the blind, but rather a simpler auditory interface that allows basic reading of certain text elements.

What this report is implying is that, if you select a chunk of text in a message with the mouse, then click Edit, Text to Speech, and in the sub menu choose the option to read the highlighted text, this works in Snow Leopard, but not in Lion.
Comment 5 Doug Otis 2011-09-06 10:18:48 PDT
(In reply to Marco Zehe (:MarcoZ) from comment #4)
> I suppose this is the Text-To-Speech menu item found in many applications'
> Edit menus. 

No. OS X allows users to set a specific key combination that converts any highlighted text to speech.

> It allows highlighted text or other parts of documents etc. to be sent
> to the OS X text to speech system. This is NOT VoiceOver, the screen
> reader for the blind, but rather a simpler auditory interface that
> allows basic reading of certain text elements.

Correct.
 
> What this report is implying is that, if you select a chunk of text in a
> message with the mouse, then click Edit, Text to Speech, and in the sub menu
> choose the option to read the highlighted text, this works in Snow Leopard,
> but not in Lion.

Wrong. This has nothing to do with an Edit menu.

This is a built-in function of OS X based upon specific user-defined keys that do not interfere with the application.  The selected key combination can still access the OS X text to speech function, but on 10.7, unlike on 10.6, the highlighted text is _not_ what is being read.  It appears to read the subject line instead of a selection within the message body.
Comment 6 Marco Zehe (:MarcoZ) 2011-09-06 10:55:12 PDT
Doug, I suspect that the key combination does nothing different than the "Edit menu/Speech/Start speech" menu item. So in essence we're talking about the same thing.
Comment 7 Blake Winton (:bwinton) (:☕️) 2011-09-06 15:14:28 PDT
I also see this behaviour in Firefox Aurora (8.0a2).

I suspect it's a larger problem in Thunderbird, since people use text-to-speech to double-check outgoing email, but it's not just Thunderbird's bug.  :(

Oh, and it seems to be the window title, not the subject header.  (Although they're the same in the Compose window.)

Thanks,
Blake.
Comment 8 Steven Michaud [:smichaud] (Retired) 2011-09-07 08:13:47 PDT
Blake, please give precise, detailed steps-to-reproduce for this bug in both Thunderbird and Firefox.

If additional extensions/plugins/etc are required, please let me know which ones (together with references to where they can be found).  Likewise if you need to change any OS-level settings from their defaults.
Comment 9 Steven Michaud [:smichaud] (Retired) 2011-09-07 12:32:34 PDT
This is an interesting bug.

Here are steps to reproduce.  (Blake Winton pointed me to the Speech
preference panel.)

1) In System Preferences : Speech, make sure "Speak selected text when
   the key is pressed" is selected. 

   The default "key" is "Option+Esc".

2) Run any trunk (mozilla-central) nightly dated 2010-03-24 or later.

3) Select some text in the browser window and press Option+Esc (or
   whatever "key" you've chosen to make the OS "speak the selected
   text").

   What will be spoken is the window title, not the selected text.

So the regression range is as follows:

firefox-2010-03-23-03-mozilla-central
firefox-2010-03-24-03-mozilla-central

By further testing I found that it was triggered by this patch:

Bug 553073 - "CFBundleIdentifier in /browser/app/macbuild/Contents/Info.plist.in"
             "is hardcoded as "org.mozilla.firefox"" [r=joshmoz]
author       Reed Loden <reed@reedloden.com>
             Tue Mar 23 23:45:40 2010 -0500 (at Tue Mar 23 23:45:40 2010 -0500)
changeset 39766:6387e3a7dd75
Comment 10 Steven Michaud [:smichaud] (Retired) 2011-09-07 14:28:13 PDT
This is an Apple bug -- really quite a bizarre one.

As best I can tell, only an app whose CFBundleIdentifier is in a very
short list can use Text to Speech on OS X 10.7.X.  Firefox releases'
"org.mozilla.firefox" is in that list.  "org.mozilla.thunderbird"
isn't.  Neither is the "org.mozilla.nightly" CFBundleIdentifier used
by nightlies since the patch for bug 553073 landed.

Since at least OS X 10.5, there's been a SpeechSynthesisServer.app in
the Resources directory of the SpeechSynthesis framework, which in
turn is inside the ApplicationServices framework.  On OS X 10.7.X (but
not on 10.6 or 10.5) the SpeechSynthesisServer binary (in
Contents/MacOS) contains a number of strings matching well-known
CFBundleIdentifiers:

org.mozilla.firefox
com.microsoft.Word
com.microsoft.Excel
com.operasoftware.Opera
com.apple.Safari
com.apple.mail

I strongly suspect that Text to Speech will only work properly on OS X
10.7.X for apps whose CFBundleIdentifier is one of these.
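
A rough way to repeat the string search described above, sketched in
Python (chosen only for illustration; nothing in this bug uses it).
The regular expression is a heuristic for reverse-DNS identifiers, not
an Apple rule, so it can miss or over-match strings.  The path passed
to scan_file() is whatever copy of the binary you want to inspect.
Since a universal binary carries one slice per architecture, the same
identifier may turn up at more than one offset.

```python
import re
from pathlib import Path

# Heuristic for reverse-DNS identifiers such as "org.mozilla.firefox".
# This pattern is an assumption made for the sketch, not an Apple rule.
BUNDLE_ID = re.compile(rb"(?:com|org|net)\.[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def find_bundle_ids(data: bytes) -> dict[str, list[int]]:
    """Map each identifier-looking string to the byte offsets where it occurs."""
    hits: dict[str, list[int]] = {}
    for match in BUNDLE_ID.finditer(data):
        hits.setdefault(match.group().decode("ascii"), []).append(match.start())
    return hits


def scan_file(path: str) -> dict[str, list[int]]:
    """Run the same search over a file on disk (for example, a copied binary)."""
    return find_bundle_ids(Path(path).read_bytes())
```

An identifier that is listed with two offsets is consistent with the
"exists in two different places" note in test C below.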

Here are some tests you can run, which seem to confirm this:

A) Change Thunderbird's CFBundleIdentifier to one from the above
   list.

   1) Edit Thunderbird's Contents/Info.plist file to change its
      CFBundleIdentifier.

   2) Run lsregister (in the /System/Library/Frameworks/
      CoreServices.framework/Frameworks/LaunchServices.framework/
      Support/ directory) with the following parameters to rebuild
      your launch services database:

      -kill -r -domain local -domain system -domain user

   3) Run Thunderbird and try to use Text to Speech -- it should work
      properly.

B) Change a Firefox release's CFBundleIdentifier so that it no longer
   matches any from the above list, then rebuild your launch services
   database.

   Text to Speech will no longer work properly in the FF release.

C) Change the "org.mozilla.firefox" string(s) in the
   SpeechSynthesisServer binary to something like
   "org.mozilla.firefix" and restart your computer.

   Text to Speech will no longer work properly in an unaltered FF
   release.  But it *will* work properly after you change its
   CFBundleIdentifier to "org.mozilla.firefix" and rebuild your launch
   services database.

   This test isn't for the faint of heart.  Of course you should make
   a backup copy of the original SpeechSynthesisServer binary.  I used
   "sudo emacs SpeechSynthesisServer" to make the change (after
   changing to hexl-mode).  Note that since you're dealing with a
   universal binary, the string to be changed exists in two different
   places.
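
Tests A and B above both come down to rewriting one key in an app's
Contents/Info.plist and then rebuilding the launch services database.
Here is a minimal Python sketch of those two steps.  The function
names are made up for this example; the lsregister path and arguments
are the ones given in test A.  The sketch only builds the lsregister
command line and does not run it.

```python
import plistlib
from pathlib import Path

# Path and arguments as given in test A above.
LSREGISTER = (
    "/System/Library/Frameworks/CoreServices.framework/Frameworks/"
    "LaunchServices.framework/Support/lsregister"
)
LSREGISTER_ARGS = ["-kill", "-r", "-domain", "local",
                   "-domain", "system", "-domain", "user"]


def set_bundle_identifier(info_plist: Path, new_id: str) -> str:
    """Rewrite CFBundleIdentifier in an Info.plist and return the old value."""
    with info_plist.open("rb") as fh:
        info = plistlib.load(fh)
    old_id = info.get("CFBundleIdentifier", "")
    info["CFBundleIdentifier"] = new_id
    with info_plist.open("wb") as fh:
        plistlib.dump(info, fh)  # written back as XML, plistlib's default
    return old_id


def rebuild_command() -> list[str]:
    """Command line that rebuilds the launch services database (not executed)."""
    return [LSREGISTER, *LSREGISTER_ARGS]
```

Keeping the returned old identifier makes it easy to restore the app
once the test is finished.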
Comment 11 Steven Michaud [:smichaud] (Retired) 2011-09-07 15:05:28 PDT
> org.mozilla.firefox
> com.microsoft.Word
> com.microsoft.Excel
> com.operasoftware.Opera
> com.apple.Safari
> com.apple.mail
>
> I strongly suspect that Text to Speech will only work properly on OS X
> 10.7.X for apps whose CFBundleIdentifier is one of these.

Actually it's a little more complicated than I thought.  Text to Speech works fine in Google Chrome, whose CFBundleIdentifier is com.google.Chrome.  Maybe the above list is some kind of exceptions list -- CFBundleIdentifiers for which Text to Speech is guaranteed to work, even if it wouldn't normally work.
Comment 12 Steven Michaud [:smichaud] (Retired) 2011-09-08 15:02:21 PDT
Before reporting this to Apple, I'm trying to figure out why Text to Speech works with Google Chrome (even if you change its CFBundleIdentifier to an arbitrary value).  Whatever Chrome does differently, we may be able to imitate it.

One thing I've ruled out, though -- Chrome *isn't* using its own text-to-voice extension API (as implemented on the Mac in extension_tts_api_mac.mm).
Comment 13 Steven Michaud [:smichaud] (Retired) 2011-09-08 15:04:49 PDT
> text-to-voice extension API

text-to-speech extension API
Comment 14 Steven Michaud [:smichaud] (Retired) 2011-09-20 10:12:57 PDT
The plot thickens:

As best I can now tell, this is actually a Mozilla bug (an
accessibility bug).  The OS X 10.7 SpeechSynthesisServer's exception
list for this bug contains just one item -- org.mozilla.firefox.  So
it works around the bug in Firefox releases, but not in apps that have
a different CFBundleIdentifier (like org.mozilla.nightly or
org.mozilla.thunderbird).

With any other app than org.mozilla.firefox, the 10.7
SpeechSynthesisServer makes a number of AX calls, including the
following (documented at
http://developer.apple.com/library/mac/#documentation/Accessibility/Reference/AccessibilityLowlevel/AXUIElement_h/CompositePage.html
and
http://developer.apple.com/library/mac/#documentation/Accessibility/Reference/AccessibilityLowlevel/AXValue_h/CompositePage.html):

extern AXUIElementRef AXUIElementCreateApplication (
    pid_t pid);
extern AXError AXUIElementCopyAttributeValue (
    AXUIElementRef element,
    CFStringRef attribute,
    CFTypeRef *value);
extern CFTypeID AXUIElementGetTypeID (
    void);
extern AXValueType AXValueGetType(
    AXValueRef value);

But these calls don't work properly in the Mozilla tree on OS X
(whether or not --enable-accessibility is specified when building).
So the 10.7 SpeechSynthesisServer doesn't use them when the foreground
app's CFBundleIdentifier is org.mozilla.firefox.  Instead it
(probably) falls back to the methods used on earlier versions of OS X.

In my next comment, I'll describe in detail how the above mentioned AX
calls are used with Chrome (where they work properly) and with a
Firefox nightly (where they don't).
Comment 15 Marco Zehe (:MarcoZ) 2011-09-20 10:45:12 PDT
david, Surkov, looks like Lion uses Accessibility features in ways other than strict VoiceOver support, unlike Snow Leopard and earlier. See Steven's comment #14 for a description, and a follow-up comment he'll post soon.
Comment 16 Steven Michaud [:smichaud] (Retired) 2011-09-20 11:06:49 PDT
Created attachment 561243 [details]
Simple page to test with

Unusually for Apple, the 10.7 SpeechSynthesisServer has all its
symbols stripped, so it's not possible to break (in gdb) on any
symbols that are defined in SpeechSynthesisServer.  But it's still
possible to break on symbols imported from other libraries (whose
symbols haven't been stripped) -- including the four methods mentioned
in comment #14.

(Interestingly, though, you can still use class-dump to get full
Objective-C header information from 10.7's SpeechSynthesisServer.)

Testing with a very simple HTML page, here's what happens (in the
SpeechSynthesisServer, somewhat simplified) when you select a word in
Google Chrome and press the Text-to-Speech key combination:

1) AXUIElementCreateApplication() is called to get the top-level
   accessibility object for the foreground application (i.e. Chrome).

2) AXUIElementCopyAttributeValue() is called to get the AXRole
   attribute for the top-level accessibility object returned in step
   1.  It returns AXApplication.

3) AXUIElementCopyAttributeValue() is called to get the
   AXFocusedUIElement attribute for the top-level accessibility
   object.  It returns an AXUIElement object whose role is
   AXScrollArea and which has no children (whose AXChildren attribute
   is an empty array).

4) From this point no further AX calls are made, and (as I said above)
   it's not possible to break on internally defined symbols.  But I
   suspect that, having failed to find a suitable string with the AX
   calls, the 10.7 SpeechSynthesisServer now falls back to the "old"
   (pre-10.7) protocol for finding a suitable string to "speak".

Here's what happens in the SpeechSynthesisServer when you do
Text-to-Speech from a Firefox nightly:

1) 2) These steps are the same as with Chrome.

3) AXUIElementCopyAttributeValue() is called to get the
   AXFocusedUIElement attribute for the top-level accessibility
   object.  It returns an AXUIElement object whose role is AXWindow
   and which has four children (also AXUIElement objects).  Three of
   the children's roles are AXButton, and the last is AXStaticText.

4) AXUIElementCopyAttributeValue() is called on each of the
   AXUIElement/AXButton objects to get their children -- none of them
   have any.

5) AXUIElementCopyAttributeValue() is called on the
   AXUIElement/AXStaticText object to get its AXValue attribute.  This
   turns out to be "Simple Page" -- the window's title.

6) No further AX calls are made.  But the 10.7 SpeechSynthesisServer
   goes on to speak the window's title.  Which appears to mean that
   the AX calls have "successfully" found the string to speak --
   though it's not the correct string.
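
The two traces above can be replayed on a toy model.  The Python
sketch below is not Apple's algorithm (SpeechSynthesisServer is
stripped, so its real logic is unknown); it assumes a depth-first
search for the first AXStaticText element with a non-empty value,
which is enough to reproduce both observed outcomes.  The Element
class and function name are invented for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Element:
    """Toy stand-in for an AXUIElement: a role, a value, and children."""
    role: str
    value: Optional[str] = None
    children: list["Element"] = field(default_factory=list)


def first_static_text(element: Element) -> Optional[str]:
    """Depth-first search for the first non-empty AXStaticText value (assumed)."""
    if element.role == "AXStaticText" and element.value:
        return element.value
    for child in element.children:
        found = first_static_text(child)
        if found:
            return found
    return None


# Focused elements as described in the two traces above.
chrome = Element("AXScrollArea")  # no children at all
nightly = Element("AXWindow", children=[
    Element("AXButton"), Element("AXButton"), Element("AXButton"),
    Element("AXStaticText", "Simple Page"),  # the titlebar's title
])
```

With these inputs the search finds nothing for Chrome, which would
leave room for a fallback, and finds the window title for the nightly.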
Comment 17 Steven Michaud [:smichaud] (Retired) 2011-09-20 11:12:58 PDT
(Following up comment #14 and comment #16)

> In my next comment, I'll describe in detail how the above mentioned
> AX calls are used with Chrome (where they work properly) and with a
> Firefox nightly (where they don't).

I'm no longer so sure this is really a Mozilla bug.  Maybe the 10.7
SpeechSynthesisServer isn't using the AX calls correctly/appropriately
(after all they do fail to find any text to speak with Google Chrome).

But at least the information from comment #16 should allow us to work
around the problem (whoever's fault it is).
Comment 18 Steven Michaud [:smichaud] (Retired) 2011-09-20 11:47:02 PDT
One thing I forgot to mention, which may be relevant:

All the SpeechSynthesisServer's AX calls are made on a secondary thread.
Comment 19 Steven Michaud [:smichaud] (Retired) 2011-09-20 15:58:57 PDT
I've rerun my test from comment #16 using a custom mozilla-central build with accessibility turned on (compiled with --enable-accessibility).  The SpeechSynthesisServer found many more AXUIElement objects while traversing the tree (and called AXUIElementCopyAttributeValue() on all of them).  But of these only three were AXUIElement/AXStaticText objects.  And only one had a non-empty AXValue attribute -- the object corresponding to the browser window's title ("Simple Page").
Comment 20 Steven Michaud [:smichaud] (Retired) 2011-09-20 16:07:11 PDT
I've also found that Text-to-Speech can "spontaneously" start working correctly in an accessibility-enabled build.

One way to make this happen is to type something into the Google search bar and select it, then press the Text-to-Speech key combination.  The word you've typed will get spoken.  And Text-to-Speech will from now on work anywhere, until you quit the browser and restart it.
Comment 21 Steven Michaud [:smichaud] (Retired) 2011-09-20 16:09:23 PDT
Is there a bug for turning on accessibility on OS X?  This bug will need to depend on it.

Also, I've probably reached the limit of what I can do here.  Someone from accessibility will need to take it over from this point.
Comment 22 Steven Michaud [:smichaud] (Retired) 2011-09-20 16:59:46 PDT
As I keep digging up new evidence, I keep changing my mind about this
bug.  So a newcomer will probably find it pretty confusing.

Here's a summary of the current state of my knowledge:

Prior to OS X 10.7, Apple's SpeechSynthesisServer always used the
following hack to find out what text the user has selected when he/she
presses the Text-to-Speech key combination:

It (somehow) causes a Command-C event to be synthesized to the
foreground app, which causes the app to write any selected text to the
system clipboard.  Then SpeechSynthesisServer reads the system
clipboard and "speaks" it.

The 10.7 SpeechSynthesisServer can still perform this hack, and often
does so.  But first it tries a more "correct" approach -- it uses AX
calls to find out what text the user has selected.  Only if this fails
does it fall back to the Command-C hack.

OS X accessibility support is still incomplete, so it's still turned
off by default in all Mozilla builds for OS X.  But even with
accessibility turned off at the app level, the OS still provides
minimal accessibility support for *every* app.  It's Mozilla's bad
luck that this support ends up sending misleading information to the
SpeechSynthesisServer.

Apple is aware of this problem in Firefox, and so the 10.7
SpeechSynthesisServer special-cases Firefox (which it identifies from
its org.mozilla.firefox CFBundleIdentifier).  If SpeechSynthesisServer
thinks it has Firefox for a "client", it only uses the Command-C hack
(and never uses the AX calls).  But Apple neglected to also special
case other apps built from the Mozilla tree, which have different
CFBundleIdentifiers -- like FF nightlies and Thunderbird releases.

So far this really does look like Apple's bug.

But the AX calls can return misleading results even when Firefox's
accessibility code is turned on.  This is because accessibility isn't
yet finished on OS X, and is pretty clearly a Firefox accessibility
bug.
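
The strategy summarized above can be written down as a small decision
function.  This Python sketch models behavior inferred from testing,
not Apple's code.  The two callables stand in for the AX lookup and
the Command-C hack and are hypothetical, as is the function name; the
one-item exception set comes from comment #14.

```python
from typing import Callable, Optional, Tuple

# Per comment #14, the exception list holds a single identifier.
SPECIAL_CASED = {"org.mozilla.firefox"}


def text_to_speak(
    bundle_id: str,
    ax_lookup: Callable[[], Optional[str]],
    copy_hack: Callable[[], Optional[str]],
) -> Tuple[str, Optional[str]]:
    """Return (method used, text found) for the foreground app."""
    if bundle_id not in SPECIAL_CASED:
        text = ax_lookup()
        if text:
            # Whatever the AX calls find is spoken, even if it is wrong.
            return ("ax", text)
    # Special-cased app, or the AX calls found nothing usable.
    return ("command-c", copy_hack())
```

The failure mode in this bug is the first return: a nightly's AX
lookup "succeeds" with the window title, so the Command-C fallback
never runs.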
Comment 23 Steven Michaud [:smichaud] (Retired) 2011-09-20 17:05:30 PDT
Other non-Apple apps have the same "speak the window title" bug that Thunderbird and FF nightlies have -- for example TextWrangler.
Comment 24 Trevor Saunders (:tbsaunde) 2011-09-20 18:24:22 PDT
(In reply to Steven Michaud from comment #21)
> Is there a bug for turning on accessibility on OS X?  This bug will need to
> depend on it.

well, I'd guess turning a11y on and bug 499931 might well be enough to fix this.  iirc we have a few make voice over work / make mac a11y work bugs which you should be able to find here
https://bugzilla.mozilla.org/buglist.cgi?cmdtype=dorem&remaction=run&namedcmd=All%20access%20bugs&sharer_id=285656&list_id=1331567

unfortunately I don't have the time to read 1100 summaries to figure out exactly which.
Comment 25 alexander :surkov 2011-09-20 21:16:37 PDT
Accessible objects implement only the NSAccessibility protocol, so I'm not sure how AX methods can be called on an accessible object.

Steve, could you point out what's expected from Gecko accessible tree (which interfaces/protocols/methods/whatever) we should expose/implement?
Comment 26 Steven Michaud [:smichaud] (Retired) 2011-09-21 08:28:25 PDT
> Accessible objects implement only the NSAccessibility protocol, so
> I'm not sure how AX methods can be called on an accessible object.

Clearly the AX calls *do* work -- they find many more objects when
accessibility is turned on than when it's turned off, and seem to
return accurate (if misleading) information.  But maybe this is new
with OS X 10.7 (maybe they don't work on earlier versions of OS X).

> Steve, could you point out what's expected from Gecko accessible
> tree (which interfaces/protocols/methods/whatever) we should
> expose/implement?

I don't know the accessibility code very well, but I'll see what I can
turn up in a quick look through the tree.
Comment 27 alexander :surkov 2011-09-21 08:33:35 PDT
(In reply to Steven Michaud from comment #26)

> I don't know the accessibility code very well, but I'll see what I can
> turn up in a quick look through the tree.

neither of us knows. thank you!
Comment 28 Steven Michaud [:smichaud] (Retired) 2011-09-21 09:26:28 PDT
Continuing to test in SpeechSynthesisServer, I've now found what
happens when Text-to-Speech works correctly in an
accessibility-enabled trunk build (as per comment #20):

When you type a word in the Google search bar, select that, and press
the Text-to-Speech key-combo, the "application"'s (the application
AXUIElement's) AXFocusedUIElement attribute is an
AXUIElement/AXTextField object whose AXSelectedText attribute is the
text you've selected.  In this case the AX calls "work", and the
Command-C hack isn't needed.

When you subsequently select a word in the Simple Page testcase and
press the Text-to-Speech key-combo, the application's
AXFocusedUIElement is an AXUIElement/AXGroup object.  Among this
object's children the AX calls find an AXUIElement/AXStaticText
object, but this object's AXValue attribute is NULL.  So the
SpeechSynthesisServer falls back to using the Command-C hack.
Comment 29 Steven Michaud [:smichaud] (Retired) 2011-09-21 09:32:44 PDT
I forgot to mention that, in the second case from comment #28, the AX
calls also query the focused AXGroup object for AXSelectedText and
AXSelectedTextMarkerRange properties, but both are NULL.
Comment 30 Steven Michaud [:smichaud] (Retired) 2011-09-21 09:49:47 PDT
Here's what happens (in SpeechSynthesisServer) when you select text
(in my "Simple Page" testcase) in Safari and press the Text-to-Speech
key-combo:

The application's AXFocusedUIElement attribute is an
AXUIElement/AXWebArea object, which has a non-NULL AXTextMarkerRange
attribute.  AX calls use this to find the selected text, then the
SpeechSynthesisServer speaks it.  (So there's no need for the
Command-C hack.)

Before this, the AX calls query the AXUIElement/AXWebArea object for
an AXSelectedText object, but find that it's NULL.
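
Putting comments #28 through #30 together, the queries on the focused
element seem to follow a rough precedence.  The Python sketch below
models that precedence with plain dictionaries.  Only
"AXSelectedText first" is backed by the traces; the order of the
other two steps is an assumption, and the marker range is modeled as
already-resolved text rather than an opaque range.  The function name
is invented for this example.

```python
from typing import Optional


def selected_text(focused: dict, children: list[dict]) -> Optional[str]:
    """Toy model of the attribute lookups seen in comments #28 to #30."""
    # 1. Direct attribute: the search-field case in comment #28.
    if focused.get("AXSelectedText"):
        return focused["AXSelectedText"]
    # 2. Marker range: the Safari case, simplified to resolved text here.
    if focused.get("AXSelectedTextMarkerRange"):
        return focused["AXSelectedTextMarkerRange"]
    # 3. Static-text children: the AXGroup case, where AXValue was NULL.
    for child in children:
        if child.get("AXRole") == "AXStaticText" and child.get("AXValue"):
            return child["AXValue"]
    return None  # nothing found, so the Command-C hack would be next
```

A focused AXGroup whose only static-text child has no AXValue falls
through all three steps, which matches the second case in comment #28.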
Comment 31 Steven Michaud [:smichaud] (Retired) 2011-09-21 15:28:48 PDT
(In further reply to comment #25)

> Steve, could you point out what's expected from Gecko accessible
> tree (which interfaces/protocols/methods/whatever) we should
> expose/implement?

The AX calls made from SpeechSynthesisServer already "work" (they
return answers, though not necessarily the "right" ones).  So we
already expose/implement at least some of what's needed ... though
possibly not everything.

If you add logging to the following methods (as I have), you'll find
that all of them are called when you try to do Text-to-Speech in an
accessibility-enabled build:

[mozAccessible accessibilityAttributeValue:]
[mozAccessible accessibilityFocusedUIElement]
[mozAccessible children]
[mozAccessible role]
[mozAccessible value]

What's puzzling is that not every call to
AXUIElementCopyAttributeValue() from SpeechSynthesisServer results in
a call to one of these methods.  I don't know why.

I've found an undocumented NSObject method (present only on OS X 10.7)
that *does* appear to be called once for every call to
AXUIElementCopyAttributeValue().  Sometimes it calls "down" to a
mozAccessible object; sometimes it doesn't.

[NSObject(NSAccessibilityInternal) _accessibilityValueForAttribute:clientError:]

I swizzled it, then logged its parameters and results.

If I had to find the answer entirely from reverse engineering, this
method is where I'd start.  But we shouldn't depend on reverse
engineering unless we have to.  In other words, I suspect this problem
would be easier to figure out for someone who knows more about
accessibility than I do, and can more quickly guess which other
methods it might make sense for mozAccessible to implement.
Comment 32 alexander :surkov 2011-09-21 20:05:25 PDT
From what I read, I guess they use the accessible value to obtain the text to announce. If so, the problem is that the accessible value is non-null only for controls like textboxes. Also, if they get the focused element and then get the accessible value on it, then in the case of selected text within a document the focused element is the document accessible. So implementing accessible value on the document accessible, so that it returns the selected text, should fix the problem.

Steve, does it make sense?
Comment 33 Steven Michaud [:smichaud] (Retired) 2011-09-24 13:12:23 PDT
Created attachment 562256 [details] [diff] [review]
Possible fix

Here's a patch that fixes this bug, in builds with accessibility
turned on or turned off.  But it has a significant side effect, so I'm
not sure I'm pursuing the right strategy.  Marco, I'll be asking your
advice on this (in my next comment).

This patch actually has two different fixes, one of which only affects
builds with accessibility turned on:

1) As mentioned in the last (3rd) paragraph of comment #28, the
   SpeechSynthesisServer can query the AXValue attribute on an
   AXStaticText object.  But in current code this query fails (it
   returns NULL), because a mozTextAccessible object only supports the
   AXSelectedText and AXSelectedTextRange attributes.

   My patch adds support for the AXValue attribute, and makes it
   synonymous with the AXSelectedText attribute.

   I don't begin to know whether this is correct, or how to do it more
   correctly.  But Mac accessibility code is in pretty bad shape
   (judging by what happens when you try to use VoiceOver).  So I
   don't think it'll hurt to add one more rough spot that may need to
   be smoothed over later.

   This change is in accessibility code, and only affects builds with
   accessibility turned on.

2) Even when accessibility is turned off, OS X provides accessibility
   support for at least an app's titlebar:  Its close, minimize and
   zoom buttons, and the titlebar's "title", are all "accessible" --
   for example they can be seen by VoiceOver.

   These objects are children of the browser window (an AXWindow
   object), and are only seen by the SpeechSynthesisServer if the
   application's AXFocusedUIElement object is the browser window.  As
   best I can tell, Mozilla apps are the only cases in which this is
   true.  It's always true when accessibility is off.  And when
   accessibility is on, it's true until the user has placed the focus
   somewhere else in the browser window.

   In principle, one way to fix this problem would be to ensure that
   Mozilla apps never return an AXWindow object as the application's
   AXFocusedUIElement attribute.  But the several ways I tried to do
   this all had severe problems, and doing it correctly would take me
   farther into Mac accessibility code than I have time for.

   So I fell back to ensuring that the browser window, when queried
   for its AXChildren attribute, never returns any of the titlebar's
   accessible objects.  This has the side effect of making these
   objects invisible to VoiceOver.  Marco, I'll be asking you about
   this in my next comment.

   This change affects builds with accessibility turned on or off.
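
The first fix can be pictured with a toy model.  This Python sketch is
not the patch (which changes the Objective-C accessibility code); the
class and its "patched" flag are invented to show the before and
after behavior of an AXValue query on a text accessible that only
knows about its selection.

```python
from typing import Optional, Tuple, Union


class TextAccessibleModel:
    """Toy model of a text accessible's attribute lookup (hypothetical)."""

    def __init__(self, text: str, sel_start: int, sel_length: int, patched: bool):
        self.text = text
        self.sel_start = sel_start
        self.sel_length = sel_length
        self.patched = patched

    def attribute_value(self, name: str) -> Union[None, str, Tuple[int, int]]:
        selected = self.text[self.sel_start:self.sel_start + self.sel_length]
        if name == "AXSelectedText":
            return selected or None
        if name == "AXSelectedTextRange":
            return (self.sel_start, self.sel_length)
        if name == "AXValue" and self.patched:
            # The fix described above: AXValue becomes a synonym
            # for AXSelectedText.
            return self.attribute_value("AXSelectedText")
        return None  # unsupported attribute: the query "fails"
```

Without the alias an AXValue query returns nothing; with it, the
query returns whatever text is currently selected.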

I've started tryserver builds of this patch -- one with accessibility
on and one with accessibility off.  They should be available in a
kalpa or two :-)

(Following up comment #31)

> [mozAccessible accessibilityAttributeValue:]
> [mozAccessible accessibilityFocusedUIElement]
> [mozAccessible children]
> [mozAccessible role]
> [mozAccessible value]
>
> What's puzzling is that not every call to
> AXUIElementCopyAttributeValue() from SpeechSynthesisServer results
> in a call to one of these methods.  I don't know why.

I've figured this out.  mozAccessible has a bunch of subclasses I
hadn't noticed.  Every non-trivial call from the SpeechSynthesisServer
to AXUIElementCopyAttributeValue() results in a call to one of these
objects.
Comment 34 Steven Michaud [:smichaud] (Retired) 2011-09-24 13:36:37 PDT
Comment on attachment 562256 [details] [diff] [review]
Possible fix

As I mentioned in comment #33, my patch makes the titlebar's "children"
(the close button, the minimize button, the zoom button, and the
titlebar's "label" (which contains the window's title)) invisible to
VoiceOver.  Somehow (at least in my tests on OS X 10.7.1) VoiceOver
can still see the window's title.  But it can no longer see any of the
buttons.

I'm not sure how serious this is.  But VoiceOver can see the close,
minimize and zoom buttons in other apps that have them (like Safari),
so what my patch does is technically incorrect.

Marco, what do you think?

Should we live with this bug until the Mac accessibility code is
finished and is turned on?  Or until Apple fixes this bug for
non-accessible apps (very unlikely)?

Or should we make the titlebar's buttons invisible to VoiceOver
(perhaps only on OS X 10.7)?
Comment 35 Steven Michaud [:smichaud] (Retired) 2011-09-24 13:39:29 PDT
Comment on attachment 562256 [details] [diff] [review]
Possible fix

Come to think of it, it wouldn't be too hard to special case the buttons (and keep them visible).  The only thing I *need* to get rid of is the titlebar's "label".

I'll submit another patch that does that.
Comment 36 Steven Michaud [:smichaud] (Retired) 2011-09-24 14:56:59 PDT
Created attachment 562263 [details] [diff] [review]
Fix rev1 (special-case titlebar buttons)

This patch still changes VoiceOver's behavior, but much less than the
previous patch.  (And I've now limited the changes to OS X 10.7.)

VoiceOver (on Lion) no longer sees the titlebar's "label", and can no
longer "interact" with it.  But the label is static text, and this
doesn't seem like a great loss.  VoiceOver somehow still knows what
the window's title is.

Marco and Alexander, please test my patch and let me know if there are
any problems I missed.  You can build the patch yourself, or wait an
eon or two for my new tryserver builds to finish (once again two
builds -- with accessibility turned on or off).
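
For readers who don't want to open the diff, here is the idea behind
rev1 as a toy Python model.  It is not the patch itself: the function
name and dictionary layout are invented, and treating "limited to OS
X 10.7" as an exact major.minor match is an assumption of this
sketch.

```python
from typing import Dict, List, Tuple


def exposed_children(
    children: List[Dict[str, str]], os_version: Tuple[int, ...]
) -> List[Dict[str, str]]:
    """Children a window reports for AXChildren in this toy model."""
    if os_version[:2] != (10, 7):
        # Assumption: only OS X 10.7 gets the special treatment.
        return list(children)
    # Hide the titlebar's static-text "label"; keep the three buttons.
    return [child for child in children if child["role"] != "AXStaticText"]


titlebar = [
    {"role": "AXButton", "name": "close"},
    {"role": "AXButton", "name": "minimize"},
    {"role": "AXButton", "name": "zoom"},
    {"role": "AXStaticText", "name": "title"},
]
```

On (10, 7, 1) the model returns only the three buttons, so a search
for static text under the window finds nothing; on (10, 6, 8) all
four children are still reported.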
Comment 37 alexander :surkov 2011-09-25 21:08:28 PDT
Steven, do you need to change the exposed children to fix the bug when accessibility is on? In other words, are the changes in mozTextAccessible.mm enough to get it working for an accessibility-enabled build?
Comment 38 Marco Zehe (:MarcoZ) 2011-09-25 23:07:05 PDT
I'm unsure what our current state of accessibility is anyway. When I build a regular nightly build with accessibility turned on, I currently only see the window title and the three buttons when a regular web page is loaded. VoiceOver doesn't see any of the web contents, and even when I switch focus, I don't get speech. This used to work at some point, even though it had huge performance problems. Other windows such as the "not default browser" dialog are more accessible than the main window when accessibility is turned on. But judging from this, accessibility on mac is currently broken pretty badly, even worse than it used to be.

So I'd be fine with fixing the speech server side for now, and leaving a proper fix for the time when we finally get to dealing with Mac accessibility the right way. As long as there's a comment in code saying "watch bug 674612 when you make changes here", it'll be OK with me. But Alex, as the module owner, has the final say on this.
Comment 39 Marco Zehe (:MarcoZ) 2011-09-25 23:23:12 PDT
Comment on attachment 562263 [details] [diff] [review]
Fix rev1 (special-case titlebar buttons)

See my comment #38 for a more detailed explanation for this f+.
Comment 40 Steven Michaud [:smichaud] (Retired) 2011-09-26 08:38:13 PDT
(In reply to comment #37)

> In other words, are the changes in mozTextAccessible.mm enough to get
> it working for an accessibility-enabled build?

No, unfortunately.  Even with accessibility turned on, the bug still
happens when an AXWindow is the application's AXFocusedUIElement
attribute -- which is the default until the user explicitly places the
focus somewhere else.

Like I said in comment #33, it's probably better to ensure that the
AXFocusedUIElement is never an AXWindow.  But the current state of the
Mac accessibility code makes this difficult, and I don't have the time
to figure out why.

The two parts of my patch are both hacks.  But I doubt that we can do
better until we have someone working specifically on Mac
accessibility.
Comment 41 alexander :surkov 2011-09-26 20:01:05 PDT
Comment on attachment 562263 [details] [diff] [review]
Fix rev1 (special-case titlebar buttons)

r=me
Comment 42 Steven Michaud [:smichaud] (Retired) 2011-09-27 08:42:34 PDT
Landed on mozilla-inbound:
http://hg.mozilla.org/integration/mozilla-inbound/rev/3794007f4f5a
Comment 43 Marco Bonardo [::mak] 2011-09-28 02:00:34 PDT
https://hg.mozilla.org/mozilla-central/rev/3794007f4f5a
Comment 44 Marco Zehe (:MarcoZ) 2011-09-29 06:57:43 PDT
Requesting someone from the QA team to verify this bug on the 2011-09-29 nightly or later on OS X. STR are:
1. Load a page.
2. Highlight something using the mouse.
3. Press Option+Escape.

Expected: The highlighted text should be read to you by Apple's Text-to-speech system.

I cannot visually select something from the screen on mac, so requesting some help here. Thank you!
Comment 45 Steven Michaud [:smichaud] (Retired) 2011-09-29 07:50:26 PDT
Note that Text-to-Speech is off by default, so you may need to turn it on:

1) In System Preferences : Speech, make sure "Speak selected text when
   the key is pressed" is selected.
Comment 46 Marcia Knous [:marcia - use ni] 2011-09-29 09:15:27 PDT
Verified fixed using  Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:10.0a1) Gecko/20110929 Firefox/10.0a1

I verified using the steps in Comment 44.
