Closed Bug 915558 Opened 11 years ago Closed 11 years ago

XUL UI completely broken in Windows: Tabs, menus, etc.

Categories

(Core :: Disability Access APIs, defect, P1)

x86_64
Windows 7
defect

Tracking

()

VERIFIED FIXED
mozilla27
Tracking Status
firefox26 --- verified
firefox27 --- verified
firefox-esr17 - ---

People

(Reporter: MarcoZ, Assigned: tbsaunde)

References

(Blocks 1 open bug)

Details

(Keywords: regression)

Attachments

(1 file, 3 obsolete files)

Accessibility of menu bars, context menus, tabs, labels for XUL textboxes and possibly many other ares are completely broken in Windows. My suspect is bug 846185 since that massively changed the way XUL accessibles are created.

Firefox i largely unusable right now on Windows.
(In reply to Marco Zehe (:MarcoZ) from comment #0)
> Accessibility of menu bars, context menus, tabs, labels for XUL textboxes
> and possibly many other ares are completely broken in Windows. My suspect is
> bug 846185 since that massively changed the way XUL accessibles are created.

ideas what have changed?
Symptoms are:
1. In the awesome bar, NVDA no longer says "Enter an address" or whatever the label is currently. Same for the search box, no more "Search using Google" label spoken. The name of the toolbar is also no longer spoken when focus moves to either of these controls.
2. The tabs (shift+tab twice from the awesome bar) no longer are announced as tabs. Some document title is spoken, some other stuff too, but it feels like the tabs themselves are not at all exposed any more.
3. Pressing Alt to open the menu bar no longer yields any speech. Same with context menus.
it's unlikely bug 846185 because no mochitest failure and Marco said that Trev said Linux is ok.
No longer blocks: 846185
When I started hunting for a regression range, suddenly my nightly builds started working again. I suspect this to have been a fluke in one of my nightly updates. I'm now on the 11th of September build, and it is working. Closing as invalid. Sorry for the noise!
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → INVALID
I'm seeing this with the current nightly (2013-09-12). However, I'm pretty sure I had this same nightly working earlier today without problems, so I think something really weird is going on here. I'll try to investigate further.
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
I can't find a commonality in what goes wrong here. For example, further to Marco's observations, the navigation toolbar goes completely missing from the hierarchy (so its children move up), the location bar has the wrong name and many controls have a role of unknown.

It seems that the first few times you use a profile, everything works as it should. However, after the first few sessions, things break. Once a profile breaks, it seems to remain broken. So, create a new profile, then keep opening and closing Firefox with that profile, checking for XUL a11y problems each time.

Even more bizarre, deleting compatibility.ini from the profile seems to fix the problem. This doesn't make any sense to me, but it worked every time here. The problem will reoccur in some future session, at which point you delete the file again and it is back to working.
Do we have Mr Wolf at Mozilla? I run out of ideas.
(In reply to alexander :surkov from comment #3)
> it's unlikely bug 846185 because no mochitest failure and Marco said that
> Trev said Linux is ok.
Trev, did you test multiple restarts of Firefox in Linux? As I noted in comment 6, this doesn't happen the first time (perhaps first few times) you use an unaffected profile, but it may happen after several restarts of Firefox and then continue to happen.

Btw, I've now seen this in Thunderbird. Again, deleting compatibility.ini fixed it, but I wager it'll be back after a few restarts.
(In reply to James Teh [:Jamie] from comment #8)
> (In reply to alexander :surkov from comment #3)
> > it's unlikely bug 846185 because no mochitest failure and Marco said that
> > Trev said Linux is ok.
> Trev, did you test multiple restarts of Firefox in Linux? As I noted in
> comment 6, this doesn't happen the first time (perhaps first few times) you
> use an unaffected profile, but it may happen after several restarts of
> Firefox and then continue to happen.

weird, now menus are busted for me if I make firefox start with a compatibility.ini so it always breaks on the first restart after rming that file.

I'd still this is a really weird effect of an a11y patch, anyone want to go through inbound tinderbox builds to see if they can find a good regression range?
So, I actually think I understand this now, and I think bug 846185 may very well be at fault.  If I disable the xul cache by setting nglayout.debug.disable_xul_cache = true then I can't reproduce this. so my theory is that role= on the binding elements isn't getting into the cache and so when we start using the cache we can't get roles for anything and so xul is broken.  Smaug sound like a reasonable theory?  How can I get attributes into the xul cache?
Flags: needinfo?(bugs)
Does the bug happen with new profiles or only with already existing profiles?

If only with old profiles, updating xul-cache version might help...except that I don't know how
update the version these days. Going through the code...
Flags: needinfo?(bugs)
(In reply to Olli Pettay [:smaug] from comment #11)
> Does the bug happen with new profiles or only with already existing profiles?

According to a community member on Twitter, it happened shortly after he created a new profile, too. I definitely can confirm it happening on existing profiles.
I think this is a bug in XBL proto cache. We just store some small set of attributes on
binding element.
Blocks: 888531
Comment on attachment 810106 [details] [diff] [review]
bug 915558 - save attributes of binding element to startup cache


>+  if (mBinding) {
>+    uint32_t attributes = mBinding->GetAttrCount();
>+    nsAutoString attrValue;
>+    for (uint32_t i = 0; i < attributes; ++i) {
>+      const nsAttrName* attr = mBinding->GetAttrNameAt(i);
>+      if (attr->Atom() == nsGkAtoms::id || attr->Atom() == nsGkAtoms::extends) {
>+        continue;
>+      }
Please check the namespace.

>+
>+      nsDependentAtomString attrName = attr->Atom();
>+      int32_t attrNameSpace = attr->NamespaceID();
>+      mBinding->GetAttr(attrNameSpace, attr->Atom(), attrValue);
>+      rv = aStream->Write8(XBLBinding_Serialize_Attribute);
>+      NS_ENSURE_SUCCESS(rv, rv);
>+
>+      rv = aStream->Write32(attrNameSpace);
We can't serialize namespace as int32, since namespace_as_string -> int32 mapping happens on runtime.
So, need to serialize as string
Attachment #810106 - Flags: review?(bugs) → review-
Comment on attachment 810779 [details] [diff] [review]
bug 915558 - save attributes of binding element to startup cache

>+      nsAutoString attrNamespace;
>+      rv = aStream->ReadString(attrNamespace);
>+      NS_ENSURE_SUCCESS(rv, rv);
Hey, I noticed we have ReadNamespace and WriteNamespace methods. Use those.


>+    for (uint32_t i = 0; i < attributes; ++i) {
>+      const nsAttrName* attr = mBinding->GetAttrNameAt(i);
>+      if (attr->Equals(nsGkAtoms::id) || attr->Equals(nsGkAtoms::extends)) {
>+        continue;
>+      }
Actually, we could just copy these too. So remove this if check.
Makes the code a bit simpler to read.
Attachment #810779 - Flags: review?(bugs) → review-
Attachment #810106 - Attachment is obsolete: true
Attachment #810779 - Attachment is obsolete: true
Attachment #811273 - Flags: review?(bugs)
Comment on attachment 811273 [details] [diff] [review]
bug 915558 - save attributes of binding element to startup cache


>+  if (mBinding) {
>+    nsCOMPtr<nsINameSpaceManager> nsm = do_GetService(NS_NAMESPACEMANAGER_CONTRACTID);
You're not using nsm at all, so please remove this

>+++ b/content/xbl/src/nsXBLSerialize.h
Update the version
http://mxr.mozilla.org/mozilla-central/source/content/xbl/src/nsXBLSerialize.h?rev=9e98958b5e50#18
Attachment #811273 - Flags: review?(bugs) → review+
https://hg.mozilla.org/mozilla-central/rev/734a282006c2
Assignee: nobody → trev.saunders
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
Depends on: 926427
15:52:22 - bsmedberg: edmorley|sheriffduty: since the canadians are away today, would you be willing to back out bug 915558 for topcrash bug 926427 and respin nightlies?

Backed out:
remote:   https://hg.mozilla.org/mozilla-central/rev/ab8e70fb76a8
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Please can we get some tests added to cover the crash, since our testsuite clearly doesn't cover this at the moment... :-)
Unfortunately testing startupcache is a bit hard with our current setup.
I'm not aware of any good tests for it.
tbsaunde, you were asking me yesterday about atoms returning a null string, which I suppose was in reference to this bug, but it doesn't appear that the crashes were at a null string pointer: the crash addresses are things like 0x104455f so it's an invalid pointer or buffer but not a null pointer.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #25)
> tbsaunde, you were asking me yesterday about atoms returning a null string,

yeah, I was thinking about this (I didn't actually look at the crash address hanging head in embarrassment).

> which I suppose was in reference to this bug, but it doesn't appear that the
> crashes were at a null string pointer: the crash addresses are things like
> 0x104455f so it's an invalid pointer or buffer but not a null pointer.

interesting!  I have no idea how that could happen can nsAttrName::Atom() return garbage in some funny case?
Unless this is a temporary atom and we freed it, I don't see an obvious way for this to happen. Assuming, of course that mBinding->GetAttrCount() is correct and we aren't walking off the end of the array, and that nothing is mutating the binding, releasing the atom, or other things like that in the middle of this loop. But I don't see any suspects like that here.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #27)
> Unless this is a temporary atom and we freed it, I don't see an obvious way

nsAttrName holds a ref to the atom so I don't see how it could have been freed.

> for this to happen. Assuming, of course that mBinding->GetAttrCount() is
> correct and we aren't walking off the end of the array, and that nothing is

it seems like that being wrong would break a whole bunch of stuff other than this...

> mutating the binding, releasing the atom, or other things like that in the
> middle of this loop. But I don't see any suspects like that here.

other than a custom js nsIObjectOutputStream I don't see how that could happen.

hopefully I'll have a running windows vm to poke at this with a dbugger later today.
So, I looked at this with ehsan a bit thisafternoon, and he pointed out that my memory of nsAttrName::Atom() handling the case the attr name is holding a node info is incorrect.  So I suspect I just need to change Atom() to LocalName() because it does infact handle the attr name holding a node info, but we weren't really clear on if that will work with prefixes.

Interestingly if this theory is true we could probably sort of  test this by adding some xbl with appropriate weird namespacing stuff.

smaug sound reasonable?
Oh dear, nsAttrName is just very error prone API :/

Yes, the theory makes sense to me. Sorry that I didn't catch the problem
Attachment #811273 - Attachment is obsolete: true
Attachment #821105 - Flags: review?(bugs)
Comment on attachment 821105 [details] [diff] [review]
bug 915558 - save attributes of binding element to startup cache

Could you move prefix writing/reading before name writing/reading.
Attachment #821105 - Flags: review?(bugs) → review+
landed as 512ee2e8691e
Comment on attachment 821105 [details] [diff] [review]
bug 915558 - save attributes of binding element to startup cache

[Approval Request Comment]
Bug caused by (feature/regressing bug #):  846185
User impact if declined:  UI (menus status bars etc are inaccessible
Testing completed (on m-c, etc.):  I've tested this patch locally, and its on inbound now.
Risk to taking this patch (and alternatives if risky): moderate? smaug would be a better judge than me the alternative would be to backout bug 846185 (see below)
String or IDL/UUID changes made by this patch:none

So, this bug is pretty serious, it means that the chrome UI becomes inaccessible when firefox is restarted.  I think we need to fix it in firefox 26 :/  The options I can see our to land this patch there, which would seem fairly safe, but as we saw turned out to have a bug, or I can back out bug 846185 on the branch.  That bug had a much larger patch, so backing it out is non-trivial, and it changed a API used a fair bit in the front end so I'm somewhat concerned about how safe it is, the biggest risk is probably missing restoring new frontend bits to the old API which would make them inaccessible.  I'd rather not prepare that backout unless we've decided that's the way to fix this on the branch.
Attachment #821105 - Flags: approval-mozilla-aurora?
https://hg.mozilla.org/mozilla-central/rev/512ee2e8691e
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → FIXED
Attachment #821105 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Since ESR 17 is EOL, not tracking for that branch.
I was unable to reproduce the initial issue on Nightly from 2013-09-12 using a new profile and a dirty one. Did not encounter the behaviors from comment 2 on latest Aurora 27.0a2 nor Firefox 26 beta 1. 
Can you or anyone else who initially came on this issue please verify that the issue is fixed on Aurora 27.0a2 and Firefox 26 beta 1?
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mozilla-aurora/
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/26.0b1-candidates/build1/win32/en-US/
Flags: needinfo?(marco.zehe)
Verified fixed in 26.0b1candidate, 27.0a2 (2013-10-30), and Nightly 28.0a1 (2013-10-30).
Status: RESOLVED → VERIFIED
Flags: needinfo?(marco.zehe)
Keywords: verifyme
Thank you Marco.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: