Closed Bug 1520177 Opened 1 year ago Closed 5 months ago

mingw-clang has broken accessibility

Categories

(Core :: Disability Access APIs, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla68

People

(Reporter: tjr, Assigned: tjr)

References

(Blocks 1 open bug)

Details

(Whiteboard: [tor])

Attachments

(3 files)

As reported here: https://bugzilla.mozilla.org/show_bug.cgi?id=1430149#c11

[The user] reports that the chrome parts are accessible again now but the content window just shows "unknown" instead of web content. It turns out that this is related to e10s (but not to sandboxing) as setting browser.tabs.remote.autostart to false makes accessibility work for content as well.

It's worth noting that there are certain systems where content accessibility doesn't work even in official Firefox builds. We believe this to be a registry issue, but we don't know what's causing it yet. It'd be worth verifying with affected Tor users (e.g. onionsoup) whether content accessibility works with official Firefox builds, just to make sure we're not dealing with a "broken" system here.

(In reply to James Teh [:Jamie] from comment #1)

It's worth noting that there are certain systems where content accessibility
doesn't work even in official Firefox builds. We believe this to be a
registry issue, but we don't know what's causing it yet. It'd be worth
verifying with affected Tor users (e.g. onionsoup) whether content
accessibility works with official Firefox builds, just to make sure we're
not dealing with a "broken" system here.

Good point. I got one confirmation from a regular Firefox user that the problem in this bug is indeed only happening on Tor Browser while Firefox works fine: disabling browser.tabs.remote.autostart in the test build with accessibility support enabled gives a similar experience in Tor Browser compared to vanilla Firefox. But with e10s on it is broken.

I'm also one of those users, I can confirm that FF 64.0.2 x64 is working fine.

I tested this with a mingw-clang build of esr60 and central; and regular Firefox Nightly with the NVDA screen reader on Windows 10.

On esr60; I could use the browser normally; but NVDA would not dictate any of the tab content; as described.

On central (mingw-clang) - all tabs would crash. I could not load any webpage.

On Nightly; everything worked as expected.

On -central; we are asserting here: https://searchfox.org/mozilla-central/rev/465dbfe030dfec7756b9b523029e90d48dd5ecce/ipc/mscom/ActivationContext.cpp#33 so I'm guessing the windows function CreateActCtx probably has some sort of mingw header issue...?

Flags: needinfo?(jacek)

(In reply to Tom Ritter [:tjr] from comment #5)

On -central; we are asserting here: https://searchfox.org/mozilla-central/rev/465dbfe030dfec7756b9b523029e90d48dd5ecce/ipc/mscom/ActivationContext.cpp#33 so I'm guessing the windows function CreateActCtx probably has some sort of mingw header issue...?

Actually; looking around MinGW; I don't see anything that jumps out to me as incorrect...

https://sourceforge.net/p/mingw-w64/mingw-w64/ci/master/tree/mingw-w64-headers/include/winbase.h#l2606

It doesn't seem to be mingw fault. It looks like we're still missing some resources. CreateActCtx loads manifests from resources. I can see that xul.dll has only VERSION resources, no manifest.

Flags: needinfo?(jacek)

I suspect missing accessible/ipc/win/IAccessible64.manifest. It's mentioned in moz.build, but I don't see any .rc file including it. Is it somehow included in MSVC by mt.exe?

(In reply to Jacek Caban from comment #8)

I suspect missing accessible/ipc/win/IAccessible64.manifest. It's mentioned in moz.build, but I don't see any .rc file including it. Is it somehow included in MSVC by mt.exe?

What about https://searchfox.org/mozilla-central/source/toolkit/library/xulrunner.rc#9 ?

I sent in a build that changes RT_MANIFEST to 24 like Bug 1515982; I'l test it when I have a chance but that time is uncertain...

https://treeherder.mozilla.org/#/jobs?repo=try&revision=60398b80a34548a32b2f290c063e592629ead22c

(In reply to Tom Ritter [:tjr] (PTO) from comment #9)

I sent in a build that changes RT_MANIFEST to 24 like Bug 1515982; I'l test it when I have a chance but that time is uncertain...

This build has the same crash issue; and if I'm using Resource Hacker correctly on xul.dll - I also only see a Version resource and nothing else.

Oh, right, I missed it. The file is, however, explicitly not used in mingw builds (which looks wrong):
https://searchfox.org/mozilla-central/source/toolkit/library/moz.build#50

(In reply to Jacek Caban from comment #11)

Oh, right, I missed it. The file is, however, explicitly not used in mingw builds (which looks wrong):
https://searchfox.org/mozilla-central/source/toolkit/library/moz.build#50

https://treeherder.mozilla.org/#/jobs?repo=try&revision=9e395b787654bb41751b3c30c2e4c2060eb705b4

This now has resources in xul.dll; but still crashes in the same place. Probably missing another include...

Yeah, now it needs ia2marshal.dll manifest. Adding something like
2 24 IA2Marshal.dll.manifest
to IA2Marshal.rc should help.

(In reply to Jacek Caban from comment #13)

Yeah, now it needs ia2marshal.dll manifest. Adding something like
2 24 IA2Marshal.dll.manifest
to IA2Marshal.rc should help.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=c5ef984f2726adce22144c7a0d843ae761b94c6e

This seems to get further but still crashes. I won't be able to investigate for a bit.

It crashes in CreateInterceptor, which uses ITypeInfo. I don't know relevant Gecko code, but I guess it uses typelib generated by widl, so it's likely a widl bug. Accessible.tlb looks indeed weird.

(In reply to Tom Ritter [:tjr] (PTO) from comment #5)

On -central; we are asserting here:
https://searchfox.org/mozilla-central/rev/
465dbfe030dfec7756b9b523029e90d48dd5ecce/ipc/mscom/ActivationContext.cpp#33
so I'm guessing the windows function CreateActCtx probably has some sort of
mingw header issue...?

I actually see this with Tor Browser debug builds for ESR60 as well (still with the mingw-w64/gcc toolchain).

(In reply to Tom Ritter [:tjr] (PTO) from comment #16)

Created attachment 9039768 [details] [diff] [review]
accessibility patches so far

And those patches are enough to stop the crashing on ESR 60 debug builds.

(In reply to Jacek Caban from comment #15)

It crashes in CreateInterceptor, which uses ITypeInfo. I don't know relevant
Gecko code, but I guess it uses typelib generated by widl, so it's likely a
widl bug. Accessible.tlb looks indeed weird.

FWIW: we seem to hit that one with ESR 60 + accessibility tools installed, too.

Attached file Accessible.tlb

This is the result of running

> set INCLUDE="C:\\PROGRA~2\\MIB055~1\\2017\\COMMUN~1\\VC\\Tools\\MSVC\\1413~1.261\\include;C:\\PROGRA~2\\MIB055~1\\2017\\COMMUN~1\\VC\\Tools\\MSVC\\1413~1.261\\atlmfc\\include;C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.15063.0\\shared;C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.15063.0\\um;C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.15063.0\\winrt;C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.15063.0\\ucrt;C:\\PROGRA~2\\MIB055~1\\2017\\COMMUN~1\\DIA SDK\\include"
C:\mozilla-unified\accessible\ipc\win\typelib> C:/PROGRA~2/WI3CF2~1/10/bin/100150~1.0/x64/midl.exe -env x64 -cpp_cmd C:/PROGRA~2/MIB055~1/2017/COMMUN~1/VC/Tools/MSVC/1413~1.261/bin/HostX64/x64/cl.exe -Oicf Accessible.idl

On a Windows 10 machine

(In reply to Jacek Caban from comment #15)

It crashes in CreateInterceptor, which uses ITypeInfo. I don't know relevant Gecko code, but I guess it uses typelib generated by widl, so it's likely a widl bug. Accessible.tlb looks indeed weird.

Actually, Accessible.tlb generated by widl is probably fine. It's different than midl generated and Windows, in separated simple test case, can successfully load it. So the next step is finding where the invalid ITypeInfo comes from.

The actual failing function is CoGetInterceptorFromTypeInfo. I suspect that it's because ITypeInfo we pass is somehow broken (eg. typelib it's based on is invalid), but simple tests didn't confirm that. The typelib seems to come from IA2Marshal.dll. It may be interesting to experiment with replacing that DLL with MSVC build and see if it helps.

So I've a hacked together (ie not built through our deterministic build system, but built with our deterministically built gcc, mingw, fxc2 and rust) build of Tor Browser working as expected with Tom's patches applied. I suppose that would point to an issue with clang perhaps?

Can't find the edit button, but this is for an i686 build (not x64).

(In reply to Jacek Caban from comment #22)

The actual failing function is CoGetInterceptorFromTypeInfo. I suspect that it's because ITypeInfo we pass is somehow broken (eg. typelib it's based on is invalid), but simple tests didn't confirm that. The typelib seems to come from IA2Marshal.dll. It may be interesting to experiment with replacing that DLL with MSVC build and see if it helps.

We don't have MSVC builds anymore; but we have clang-cl. I used the ooold build from Comment 14 (which I had to regenerate) with NVDA.

I reproduced the failure from CoGetInterceptorFromTypeInfo in a pure mingw-clang build. Then I copied IA2Marshal.dll over from the clang-cl build. I don't see a crash; but neither does the browser load a page. (It's debug, so it's slow but it should have loaded by now...

By 'working as expected' I mean page contents are now read on mouse-over, but navigation is not working. Same behaviour for both 32 and 64 bit. Dropping in the vanilla ESR60 version of the IA2Marshall.dll over my custom build results in the same behaviour. Regardless, no crashes.

This is with just the ' accessibility patches so far' patch applied and built with mingw-gcc (not mingw-clang).

It looks like the attached patches are enough to fix things for the mingw-gcc build for esr60; but not enough to fix it for the mingw-clang build on central.

I'll clean up the patch and request review and landing and then we can continue to investigate for mingw-clang.

This patch contains three fixes.

  1. As in Bug 1515982, we use the constant for RT_MANIFEST instead of
    the define, which needs winuser.rh to be included

  2. We stop exempting the mingw builds from RCINCLUDE in
    toolkit/library/moz.build which causes us to miss all of the
    resources in xul.dll

  3. We explicitly include IA2Marshal.dll instead of relying on
    compiler magic to include it for us.

Attachment #9056090 - Attachment description: Bug 1520177 - Fixed MinHW build's accessibility r?froydnj → Bug 1520177 - Fixed MinGW build's accessibility r?froydnj

(In reply to Tom Ritter [:tjr] (On Leave) from comment #27)

It looks like the attached patches are enough to fix things for the mingw-gcc build for esr60; but not enough to fix it for the mingw-clang build on central.

I'll clean up the patch and request review and landing and then we can continue to investigate for mingw-clang.

Things aren't completely fixed (navigation using NVDA does not work) but there are no crashes.

Pushed by dluca@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/675ac787a5d3
Fixed MinGW build's accessibility r=froydnj

Keywords: checkin-needed

It would have been nice if we had just defined RT_MANIFEST when not present so that we didn't have to replace it with a magic number in xulrunner.rc.

Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68

Re-opened, as it turns out this does crash mingw-gcc tabs when the --enable-debug flag is added to the build configuration (though behaviour still is as described without the flag)

Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Duplicate of this bug: 1543120

(In reply to Jacek Caban from comment #22)

The actual failing function is CoGetInterceptorFromTypeInfo. I suspect that it's because ITypeInfo we pass is somehow broken (eg. typelib it's based on is invalid), but simple tests didn't confirm that. The typelib seems to come from IA2Marshal.dll. It may be interesting to experiment with replacing that DLL with MSVC build and see if it helps.

So I can confirm that the typelib generated by widl from IA2Typelib.idl is a bit wonky. Using oleview.exe that comes with the Windows SDK indicates the generated tlb is invalid, and cannot be fully decompiled. The specific issue seems to be that various IA2 interfaces are missing their inherited interfaces (for instance, IAccessibleRelation is missing it's IUnknown definition). The strange thing is that if you build the tlb with only a subset of the IA2 interfaces (for instance, if you create tlb with just IAccessibleRelation the library decompiles fine and the IUnknown interface is present).

If you just swap out IA2Marshal.dll, AccessibleHandler.dll, AccessibleMarshal.dll and Accessible.tlb from vanilla ESR60 to ESR60 built with mingw, than everything seems to be working fine (ie, child proc ia2 interfaces are actually exposed to screen readers).

(In reply to Richard Pospesel (Tor Browser Dev) from comment #37)

Oh and the cool thing is that different interfaces will have a missing interface depending on how your order the interfaces in the 'library IAccessible2Lib' of IA2Typelib.idl (assuming you inline the IA2TypeLibrary.idl include)

Created a min-repro of our invalid tlb issue: https://github.com/pospeselr/mingw-widl-bug

Filed bugs with wine (which owns widl) and mingw about this invalid tlb issue.

Wine: https://bugs.winehq.org/show_bug.cgi?id=47031
MingW: https://osdn.net/projects/mingw/ticket/39139#preview

Hopefully someone over there has an idea of how to fix this.

Priority: -- → P3

The test case is great, thanks. Zebediah already sent a patch to Wine, here is a try push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b7ee84fed3f1ee97d04ecfaeb9ebb46d1256dffd

So I've been going back and forth with the wine devs fixing the various bugs in widl that prevent us from generating an identical tlb as with midl. Here are the ones filed so far for those following along at home:

widl generates invalid typelib when importing types from multiple typelibs (FIXED) : https://bugs.winehq.org/show_bug.cgi?id=47031
widl generates typelib file with enums multiply defined : https://bugs.winehq.org/show_bug.cgi?id=47035
wire_marshal attribute ignored and the underlying type is used instead : https://bugs.winehq.org/show_bug.cgi?id=47041

Hey Richard; when all your changes are upstreamed to mingw let me know and I'll do a version bump.

Flags: needinfo?(richard)

(In reply to Tom Ritter [:tjr] from comment #43)

Hey Richard; when all your changes are upstreamed to mingw let me know and I'll do a version bump.

I think we are good now, see: https://trac.torproject.org/projects/tor/ticket/31458. We bumped the mingw-w64 we use to current HEAD (2d52c4b3433e55b1c454f9567c0ae9adc4b83b41) which seems to contain all the necessary fixes. So, go ahead bumping the commit you use on your side and then let's close this bug. :)

Flags: needinfo?(richard)

It was bumped in bug 1575670

Yay!

Status: REOPENED → RESOLVED
Closed: 1 year ago5 months ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.