Open Bug 1603956 Opened 6 years ago Updated 2 years ago

Racing for the same prototype document causes others to receive an untranslated prototype and then cache it

Categories

(Core :: XUL, defect, P3)

71 Branch
x86_64
macOS
defect

Tracking

()

REOPENED
Tracking Status
firefox-esr68 --- unaffected
firefox71 --- wontfix
firefox72 --- wontfix
firefox73 --- wontfix
firefox74 --- wontfix
firefox75 --- wontfix
firefox76 --- wontfix

People

(Reporter: mstaelen, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: regression)

Attachments

(16 files, 1 obsolete file)

126.90 KB, image/png
125.17 KB, image/png
197.35 KB, image/png
19.89 KB, text/plain
945.08 KB, image/png
924.44 KB, image/png
22.35 KB, text/plain
239.00 KB, image/png
18.64 KB, text/plain
1.15 MB, image/png
121.46 KB, image/png
18.37 KB, text/plain
817.15 KB, image/png
34.83 KB, text/plain
139.87 KB, image/png
309.00 KB, image/png
Attached image example of this issue

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:71.0) Gecko/20100101 Firefox/71.0

Steps to reproduce:

The issue often happens when I click a link in my email client (Mail or Outlook).

Actual results:

Sometimes the menu text is not displayed (see the attached screenshot for an example). Note: I have this issue on macOS only.

Component: Untriaged → Menus
OS: Unspecified → macOS
Hardware: Unspecified → x86_64

Hi,

Thanks for submitting this bug to us!

I was unable to reproduce this issue on my end. I tried on macOS 10.14.5 with both Firefox Nightly version 73.0a1 (2019-12-12) (64-bit) and Firefox Release v71.0.

Does this issue occur with a fresh profile as well? You can find the steps here: https://support.mozilla.org/en-US/kb/profile-manager-create-and-remove-firefox-profiles?redirectlocale=en-US&redirectslug=Managing-profiles#w_starting-the-profile-manager

You can also try testing if the issue is reproducible in safe mode, here is a link that can help you:
https://support.mozilla.org/en-US/kb/troubleshoot-firefox-issues-using-safe-mode

Or you can also download Firefox Nightly from here: https://nightly.mozilla.org/ , to see if the issue still occurs there as well.

Thanks in advance.

Best regards, Clara.

Flags: needinfo?(mstaelen)

Zibi, any idea what could be causing this? Perhaps the strings are not yet there when we pass the menu info to cocoa and we never update it, or something? Or something to do with langpacks or missing strings, or something?

Reporter: also, if you open the browser console (not the regular devtools console; use cmd-shift-j as a shortcut to open it), are there any errors there? Also, from an affected Firefox instance, can you open about:support, click "copy raw data to clipboard" and attach as a file (by pasting) at https://bugzilla.mozilla.org/attachment.cgi?bugid=1603956&action=enter ? Thank you!

Flags: needinfo?(gandalf)

Nothing obvious comes to mind. If it were persistent, I'd expect langpack misalignment, but since it's intermittent, I'd rather assume a race. And since macOS menus update on change (we can switch languages on the fly and the macOS menu picks up the updated DOM), I'd expect that to happen here.

I can try to reproduce once I get a macOS machine, but if Clara was unable to, it's likely not trivial to reproduce.

As a supplement: my problem occurs when Firefox is not already running. I tried with version 73.0a1 (2019-12-19) and the problem is the same on the page displaying the link, while the second page is correct. I hope this helps.

Flags: needinfo?(mstaelen)
Attached image with issue

Capture d’écran 2019-12-19 à 22.19.16.png => is my second page (started automatically ?) and "with issue" is my first page

Amended subject, since this seems to be the case.

Summary: the menus text is not displayed (sometimes) → Menu labels missing when cold opening Firefox from link or other app

I would love to see the browser console errors, I suspect it's something about I/O for FTL files, but I don't have a macos around so I can't reproduce it :(

Mathieu / Crudo, can you both please go to about:support, click the "copy raw data" button, and attach the result at https://bugzilla.mozilla.org/attachment.cgi?bugid=1603956&action=enter (using paste in the "File" box) ?

Also, can you press cmd+shift+j after Firefox has opened in this way, and then take a screenshot of the console that pops up and provide it here? Thank you!

Tentatively moving to Core::Intl as this seems fluent-related, but it may be that actual patching needs to happen in cocoa-land. Also, bug 1568518 seems related, but we explicitly disallowed the early blank window option on mac from 70 onwards, so there must be some other issue - still, perhaps that can provide clues as to what's going on here...

Status: UNCONFIRMED → NEW
Component: Menus → Internationalization
Ever confirmed: true
Flags: needinfo?(mstaelen)
Flags: needinfo?(crudo.daniele)
Product: Firefox → Core
See Also: → 1580799

I'd also appreciate browser console log, esp if there's anything related to L10nRegistry or Fluent.

Attached file RawData about:support
Flags: needinfo?(mstaelen)
Attached image console : first page
Attached image console : last page

Thanks,

Hmm, the errors might be same as bug 1605489, or from some missing menubar.ftl?

I tried to reproduce this morning and I can't (from Nightly to Release), the build always starts normally.

Based on the supportURL, I would assume Matthieu is using a full localized build, not a language pack. At that point, the only difference would be the OS (mine is still Mojave 10.14.6).

Attached file raw data
Flags: needinfo?(crudo.daniele)

Hi all,
I have another mac with firefox 71.0 with mojave 10.14.6 18G2022 and the top menu is ok, so I think this is os specific (bug can be replicated in Catalina 10.15.2 19C57)

Attached image console

:spohl, I don't suppose you have any ideas about what might have affected this behaviour on catalina?

Blocks: catalina
Flags: needinfo?(spohl.mozilla.bugs)

Unfortunately not, :(

If it can help this bug doesn't happen in 10.15.2 when firefox is started in safe mode (i.e. option+double click on the link shortcut --> start in safe mode).
Even if already two windows are opened, both have a consistent top menu.

I also noticed a difference when firefox is started in safe mode vs normal mode.
In normal mode a first window is opened, then a second one: the second window (foreground) is the window with the link, the first one is the firefox homepage (background).
In safe mode it is the opposite: the second window is the firefox homepage (foreground), the first one is the window with the link (background).

Same problem on macOS 10.15.2. Menu bar is completely gone when launching URLs from other applications (like Thunderbird or Microsoft Excel).

Interestingly, opening a new window correctly renders the Menu bar. I attached a screencapture on this bug report (closed as duplicate): https://bugzilla.mozilla.org/show_bug.cgi?id=1605853

Could someone who can reproduce this verify that this isn't a regression? Unfortunately, since this has to do with launching the application, we can't use mozregression in this case. There is an archive with our builds here: https://ftp.mozilla.org/pub/firefox/releases/

Flags: needinfo?(spohl.mozilla.bugs)

(In reply to Stephen A Pohl [:spohl] from comment #28)

Could someone who can reproduce this verify that this isn't a regression? Unfortunately, since this has to do with launching the application, we can't use mozregression in this case. There is an archive with our builds here: https://ftp.mozilla.org/pub/firefox/releases/

Tried build 70.0, bug is not present on that build. So it's something that changed between 70.0 and 71.0.

(Side note for testing: if you want to set the default browser to old Firefox, quit all instances of Firefox, launch Safari, set as default browser, then launch the old Firefox and click on set as default.)

Confirmed, this is a regression, I started with v. 69.0 and there's no bug.

(In reply to crudo.daniele from comment #30)

Confirmed, this is a regression, I started with v. 69.0 and there's no bug.

Damn, I'm typing too fast..sorry for spamming: this is NOT a regression

(In reply to crudo.daniele from comment #31)

(In reply to crudo.daniele from comment #30)

Confirmed, this is a regression, I started with v. 69.0 and there's no bug.

Damn, I'm typing too fast..sorry for spamming: this is NOT a regression

A regression means the behaviour used to be correct, and in more recent versions of Firefox it is not correct anymore. It seems like you're saying that 69 didn't have the bug, and 71 does, which would mean it is a regression. Or are you saying you're seeing the same problem in 69?

Flags: needinfo?(crudo.daniele)

Sorry :) this means I don't know what a regression is...
v.69.0 works ok, there is no bug

Flags: needinfo?(crudo.daniele)

for your information, I've the same issue with firefox73

Summary: Menu labels missing when cold opening Firefox from link or other app → macOS Catalina: Menu labels missing when cold opening Firefox from link or other app

Too late for a fix in 71 and likely in 72 though I'll keep this affected in case we come up with a fix that's upliftable for an RC2 or later dot release

Any ideas here Haik?

Flags: needinfo?(haftandilian)

We could check for Mac Console.app errors being logged and also check if the problem reproduces with the sandbox disabled.

Matthieu or anyone else that can reproduce this issue, could you try the following tests (they are two different things)?

  1. Check for errors on the Mac Console.app application (/Applications/Utilities/Console).

    1. Quit Firefox
    2. Open /Applications/Utilities/Console
    3. Enter Firefox in the search box (We can also try filtering with plugin-container or with no search term at all, but there are a lot of messages streamed to the Console application.)
    4. Click the Clear button
    5. Reproduce the problem (click the Clear button before each attempt at reproducing)
    6. If the problem reproduces, collect the logged entries in Console. You can do this in the Console program with Edit->Select All, Copy, then Paste into a text file and attach the file to bug.
  2. Another test to try is to check if this problem is caused by sandboxing:

    1. Use Nightly
    2. Open about:config in the browser
    3. Set security.sandbox.content.level to 0
    4. Quit Nightly (restart is required for the change to take effect).

Note: disabling the sandbox does affect the security of the browser. The change will be tied to the profile used and the sandbox should be re-enabled after the tests. After testing, remember to set security.sandbox.content.level back to 3.

Flags: needinfo?(haftandilian) → needinfo?(mstaelen)

I have a reproducible test case on 10.15.1 (not updated to latest) with Nightly 74 (2020-01-17) where I open a link from Apple Mail.

The problem also reproduces with Firefox Beta 73.0b6 and Firefox 72.0.1.

The problem reproduces with the sandbox disabled. I don't see any messages in the Mac Console.app. Some errors in the Browser Console-- see attached image on comment 45.

Flags: needinfo?(mstaelen)
Flags: needinfo?(haftandilian)

Note: this is very easy to replicate, at least for me (had a chance to test a loaner with Catalina), without setting up any external apps.

With Firefox set as default (I used Nightly), open a terminal and run open https://www.mozilla.org. This will result in two windows being open. One has all the menus, the one with the new link has none.

For what it's worth, this doesn't happen if I open an URL running Firefox from the command line with --new-tab or --new-window and a URL.

Tried to reproduce the issue with Nightly 74 (2020-01-21) on macOS 10.15 & 10.15.2, but without success.
Attempted with en-us and fr locales.
Not even steps from Francesco's previous comment were of any help.

That's surprising. I've just updated the Macbook I use at work, and I can replicate it with Nightly. Note that the browser needs to be completely closed.

I've also realized that it doesn't happen if you have the profile manager showing up (i.e. you uncheck the "Use the selected profile without asking at startup" option in the profile manager).

Attached file console log

Bingo, the profile manager was it.
10.15 also suffers from this issue.

Now that we can easily reproduce this with the US locale, does that rule out internationalization?

(In reply to Francesco Lodolo [:flod] from comment #46)

Note: this is very easy to replicate, at least for me (had a chance to test a loaner with Mojave), without setting up any external apps.

This bug blocks the catalina metabug, but this seems to imply this reproduces on 10.14 as well for you?

With Firefox set as default (I used Nightly), open a terminal and run open https://www.mozilla.org. This will result in two windows being open. One has all the menus, the one with the new link has none.

Does it also repro with something like open -a /path/to/FirefoxNightly.app/ --args -P profilename https://www.mozilla.org/ (ie when we're not the default browser) ?

Flags: needinfo?(francesco.lodolo)

(In reply to :Gijs (he/him) from comment #52)

(In reply to Francesco Lodolo [:flod] from comment #46)

Note: this is very easy to replicate, at least for me (had a chance to test a loaner with Mojave), without setting up any external apps.

This bug blocks the catalina metabug, but this seems to imply this reproduces on 10.14 as well for you?

No, only Catalina. That's just me being very bad at remembering version names of macOS (corrected to avoid further confusion).

With Firefox set as default (I used Nightly), open a terminal and run open https://www.mozilla.org. This will result in two windows being open. One has all the menus, the one with the new link has none.

Does it also repro with something like open -a /path/to/FirefoxNightly.app/ --args -P profilename https://www.mozilla.org/ (ie when we're not the default browser) ?

/Applications/Firefox\ Nightly.app/Contents/MacOS/firefox -P test https://www.mozilla.org

This works correctly, it opens the URL in a tab. That's consistent with comment 46 (I can't reproduce if I call the Firefox executable, only when the OS opens the browser).

Flags: needinfo?(francesco.lodolo)
See Also: → 1600153

Missed the open part of the command.

open -a /Applications/Firefox\ Nightly.app/Contents/MacOS/firefox --args -P test https://www.mozilla.org

This also works correctly: one window opens with the URL, and there's a menu.
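
For anyone scripting these checks, the invocations reported so far can be collected in one place. The sketch below is only a convenience helper in Python and is not part of Firefox or any Mozilla tooling: the function name `repro_commands` and the dictionary keys are made up for illustration, and the default path and profile name are simply the ones quoted above. It builds argument lists and does not launch anything.

```python
import shlex

# Path and profile name as quoted in the comments above; adjust for your install.
NIGHTLY = "/Applications/Firefox Nightly.app/Contents/MacOS/firefox"


def repro_commands(url, firefox=NIGHTLY, profile="test"):
    """Return the invocations from this thread, keyed by a descriptive (made-up) name."""
    return {
        # Reported to reproduce when Firefox is the default browser and fully closed.
        "os_open_default_browser": ["open", url],
        # Reported NOT to reproduce: calling the executable directly.
        "direct_executable": [firefox, "-P", profile, url],
        # Reported NOT to reproduce: open -a with explicit arguments.
        "open_with_args": ["open", "-a", firefox, "--args", "-P", profile, url],
    }


if __name__ == "__main__":
    for name, argv in repro_commands("https://www.mozilla.org").items():
        # shlex.join quotes the path containing a space so the line can be pasted into a shell.
        print(f"{name}: {shlex.join(argv)}")
```

Passing one of the lists to `subprocess.run` on a Mac would execute it; that step is left out here so the helper stays safe to run anywhere.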

In an attempt to get the regression range, only manual "bisection" was possible with a fresh profile created for each check.
This goes all the way back to 63.0a1.

Has Regression Range: --- → yes

(In reply to Cristian Fogel, QA [:cfogel] from comment #56)

In an attempt to get the regression range, only manual "bisection" was possible with a fresh profile created for each check.
This goes all the way back to 63.0a1.

Pushlog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=d57a89840dbb4ae0611d0d9a1e6d27e3d0a99e00&tochange=af6a7edf0069549543f2fba6a8ee3ea251b20829

Not sure off-hand what in there could have tripped this... Perhaps the nsTArray refactor makes a difference for some l10n or cocoa API? Stab in the dark though...

It feels unlikely to be bug 1479606 but I see no other explanation. Can someone who can reproduce it try to back it out and try to repro?

Ooh, and if it's this - did anyone reproduce it on debug build? It would trip on MOZ_ASSERT in debug!

Flags: needinfo?(gandalf)

(In reply to Zibi Braniecki [:zbraniecki][:gandalf] from comment #58)

It feels unlikely to be bug 1479606 but I see no other explanation. Can someone who can reproduce it try to back it out and try to repro?

That bug seemed to be content-process only? And the menus are only in the parent process, so I don't think that's possible. Am I missing something?

Flags: needinfo?(gandalf)

Am I missing something?

I don't think you are. I'm just scanning the range and can't see anything else that I could connect to this, so I started thinking of some scenario where the parent process registers the wrong locales because the content process gets spun up differently and triggers some negotiation that leads to a different outcome? I'm really blind here without being able to reproduce :(

I doubt it's that, but I also don't see any better candidate in that range.

Flags: needinfo?(gandalf)

In case this helps, it might be a clue that some menu items are always displayed while others are not (at least in my case). For example, in the File menu, "Close Tab" is always displayed. In the Bookmarks menu, "Bookmark This Page" is always displayed.

Flags: needinfo?(haftandilian)

(In reply to Haik Aftandilian [:haik] from comment #61)

In case this helps, it might be a clue that some menu items are always displayed while others are not (at least in my case). For example, in the File menu, "Close Tab" is always displayed. In the Bookmarks menu, "Bookmark This Page" is always displayed.

These are the items that have not yet been converted to fluent. I think we know the issue has something to do with fluent, but we don't know why or how.

Priority: -- → P3

Given the number of duplicates and severity of the problem (for instance bug 1600153 and bug 1580799 seems to be related) can we bump the priority of this one?

Flags: needinfo?(m_kato)
See Also: → 1610926

(In reply to Sean Voisen (:svoisen) from comment #65)

Given the number of duplicates and severity of the problem (for instance bug 1600153 and bug 1580799 seems to be related) can we bump the priority of this one?

OK, we should raise to P1 (we should fix on 74. 73 is too late). I find it interesting that the first window sets the menu labels, but the 2nd window doesn't use the data-l10n-id strings even in English. (New Fission Window doesn't use fluent, so it is shown correctly.)

Flags: needinfo?(m_kato)
Priority: P3 → P1

(In reply to Makoto Kato [:m_kato] from comment #66)

(In reply to Sean Voisen (:svoisen) from comment #65)

Given the number of duplicates and severity of the problem (for instance bug 1600153 and bug 1580799 seems to be related) can we bump the priority of this one?

OK, we should raise to P1 (we should fix on 74. 73 is too late). I find it interesting that the first window sets the menu labels, but the 2nd window doesn't use the data-l10n-id strings even in English. (New Fission Window doesn't use fluent, so it is shown correctly.)

Zibi, as far as I can see, Gecko sets empty text on menu items that use an FTL resource. When this occurs, DOMLocalization::TranslateElements seems to return a promise (mIsSync is false). Could you investigate this? I guess that this promise isn't resolved.

Flags: needinfo?(gandalf)

I'll take a look at this tomorrow. From comment 13 and comment 14 it seems that FTL files are not loaded maybe? Some Resource protocol I/O issue during startup in this scenario?

Ok, I did repro using steps from comment 46. It now looks more like some race. Maybe XUL Cache?

Zibi, would you want to assign yourself to this bug (it's a P1 regression for 74) ?

Yeah, I'll give it a shot.

Assignee: nobody → gandalf
Status: NEW → ASSIGNED

I spent today looking at this, and narrowed it down to the XUL Cache.

Basically, if I switch this if to be always false - https://searchfox.org/mozilla-central/rev/7e92a667e3829831c31e8d46aefe7ef67ad5be1c/dom/l10n/DocumentL10n.cpp#125 - and then open the window with

open -a /Users/zbraniecki/projects/mozilla-unified/objdir/dist/Firefox.app/MacOS/firefox https://www.mozilla.org

both windows are fully localized.

Mossop - I'm sorry to bother you but I assume you may be more familiar with MacOS than I am and in particular how the open command works in relation to our cache. Also, since you designed the original cache for Fluent, you may have an easier time reasoning on why would it happen and how to adjust it.

My best guess is that we open two windows "in parallel" and when the second window is localized it already "has proto", but the proto is not localized, and yet because it does have a proto, we skip localization of it.

One workaround would be to check if we're in this "open" mode on Mac and always force-localize the document, but I hope there's a better way to address it.
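
The guess above can be illustrated with a toy model. This is a hedged Python sketch, not Gecko code: `open_window`, `simulate`, the `FLUENT` table and the dictionary used as a cache are all hypothetical stand-ins, and the only behavior modeled is "skip localization when a prototype already exists", plus the force-localize workaround.

```python
# Toy model only: these names are hypothetical and do not mirror Gecko's classes.
FLUENT = {"menu-file": "File", "menu-edit": "Edit"}  # stand-in translations


def open_window(cache, key, force_localize=False):
    """Return the menu labels a new window ends up with."""
    proto = cache.get(key)
    had_proto = proto is not None
    if not had_proto:
        # First opener: "parse" the document and register an untranslated prototype.
        proto = {l10n_id: "" for l10n_id in FLUENT}
        cache[key] = proto
    labels = dict(proto)  # the window's DOM is built from the prototype
    if force_localize or not had_proto:
        # Localization only runs when there was no prototype, or when forced.
        labels = {l10n_id: FLUENT[l10n_id] for l10n_id in labels}
    return labels


def simulate(force_localize=False):
    cache = {}
    first = open_window(cache, "browser.xhtml")
    # The second window opens before the first writes its translation back.
    second = open_window(cache, "browser.xhtml", force_localize)
    cache["browser.xhtml"].update(first)  # write-back arrives too late for window 2
    return first, second


if __name__ == "__main__":
    print(simulate())                     # the second window keeps its empty labels
    print(simulate(force_localize=True))  # both windows end up labeled
```

In this model the second window only goes wrong because it sees a prototype that exists but has not been translated yet, which is the situation the guess describes.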

Flags: needinfo?(gandalf) → needinfo?(dtownsend)

Could reproduce the issue on Mac 10.15.2 with Firefox Nightly 75.0a1 (2020-02-13). Will add the according flag.

(In reply to Zibi Braniecki [:zbraniecki][:gandalf] from comment #73)

Mossop - I'm sorry to bother you but I assume you may be more familiar with MacOS than I am and in particular how the open command works in relation to our cache. Also, since you designed the original cache for Fluent, you may have an easier time reasoning on why would it happen and how to adjust it.

My best guess is that we open two windows "in parallel" and when the second window is localized it already "has proto", but the proto is not localized, and yet because it does have a proto, we skip localization of it.

The way this is supposed to work is that the first window starts creating the prototype and any other windows for the same document then wait for the first window to finish loading and then use the created prototype. Last I looked we had applied the localization to the prototype before we notify other windows to start loading (https://searchfox.org/mozilla-central/source/dom/xul/nsXULPrototypeDocument.cpp#411) but from debugging it looks like that has changed now.

Flags: needinfo?(dtownsend)

Thank you!

Olli - is that accurate? Did something change within the last year that could cause this race? If so, can you recommend a way for fluent xul cache to avoid loading a window with no translation? (I think it should be as easy as adjusting the if to not use cache if we're not done caching)

Flags: needinfo?(bugs)

So which bug regressed this? We have a range, but did anyone go through that range and compile patches possibly one by one to see which patch regressed this? That would be the first thing to do with a regression bug ;)

Last year we went through XUL -> XHTML conversion, so lots of code moved around.

Flags: needinfo?(bugs)

Oh, hmmm.. I initially thought that it was the issue since the beginning, so bug 1517880. But now I think that maybe it used to work and then regressed?

Ah, @cfogel pointed out that we already have a regression window in comment 57: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=d57a89840dbb4ae0611d0d9a1e6d27e3d0a99e00&tochange=af6a7edf0069549543f2fba6a8ee3ea251b20829

Olli - does anything in that list look likely to cause such race regression?

Flags: needinfo?(bugs)

I did look at that yesterday, and no, nothing was obvious. So someone should go through the patches one by one.

Flags: needinfo?(bugs)

@cfogel confirmed that the QA team cannot further bisect the range because builds from that range are not available on treeherder unfortunately.

I tried to bisect it manually, but it seems like the revision d57a89840dbb (last known good one) cannot be built on macOS with the recent toolchain anymore.

First, there was an error in build/moz.configure/init.configure which I was able to fix by applying the change that happened within this range: https://paste.mozilla.org/O2NwPAYZ#L3

Then, the errors kept sprinkling:

 0:03.24 In file included from /Users/zbraniecki/projects/mozilla-unified/obj-x86_64-apple-darwin19.3.0-opt/modules/libmar/verify/Unified_cpp_libmar_verify0.cpp:2:
 0:03.24 /Users/zbraniecki/projects/mozilla-unified/modules/libmar/verify/MacVerifyCrypto.cpp:19:18: error: typedef 'OpaqueSecKeyRef' cannot be referenced with a struct specifier
 0:03.24   typedef struct OpaqueSecKeyRef* SecKeyRef;
 0:03.24                  ^
 0:03.24 /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/System/Library/Frameworks/Security.framework/Headers/SecBase.h:114:25: note: declared here
 0:03.24 typedef struct __SecKey OpaqueSecKeyRef;
 0:03.24                         ^
 0:03.24 1 error generated.
 0:10.38 error[E0713]: borrow may still be in use when destructor runs
 0:10.38    --> /Users/zbraniecki/projects/mozilla-unified/third_party/rust/url/src/form_urlencoded.rs:261:40
 0:10.38     |
 0:10.38 259 | impl<'a> Target for ::UrlQuery<'a> {
 0:10.38     |      -- lifetime `'a` defined here
 0:10.38 260 |     fn as_mut_string(&mut self) -> &mut String { &mut self.url.serialization }
 0:10.38 261 |     fn finish(self) -> &'a mut ::Url { self.url }
 0:10.38     |                                        ^^^^^^^^ - here, drop of `self` needs exclusive access to `*self.url`, because the type `UrlQuery<'_>` implements the `Drop` trait
 0:10.38     |                                        |
 0:10.38     |                                        returning this value requires that `*self.url` is borrowed for `'a`
 0:10.38 error: aborting due to previous error
 0:10.38 For more information about this error, try `rustc --explain E0713`.

I don't think I can cherrypick the fixes or invest time to set up a macOS system with an Xcode appropriate for that revision's time to be able to bisect the range and find the revision that caused it :(

I'm unassigning myself to indicate that this bug needs an owner.

I also am moving it to DOM::Core because it is not related to intl/l10n or intl/locale directories, but to dom/xul or even dom/base directories based on comment 76 and comment 78.

Assignee: gandalf → nobody
Status: ASSIGNED → NEW
Component: Internationalization → DOM: Core & HTML

(In reply to Zibi Braniecki [:zbraniecki][:gandalf] from comment #82)

I don't think I can cherrypick the fixes or invest time to set up a macOS system with an Xcode appropriate for that revision's time to be able to bisect the range and find the revision that caused it :(

It looks like you might just need an older version of rust. You can temporarily downgrade your rust version for a directory/repo using the rustup override command. See rustup help override.

Not DOM core. More like XUL, or localization.

Component: DOM: Core & HTML → XUL

(In reply to Zibi Braniecki [:zbraniecki][:gandalf] from comment #73)

I spent today looking at this, and narrowed it down to the XUL Cache.

Basically, if I switch this if to be always false - https://searchfox.org/mozilla-central/rev/7e92a667e3829831c31e8d46aefe7ef67ad5be1c/dom/l10n/DocumentL10n.cpp#125 - and then open the window with

open -a /Users/zbraniecki/projects/mozilla-unified/objdir/dist/Firefox.app/MacOS/firefox https://www.mozilla.org

both windows are fully localized.

I'm catching up on backscroll here so I might be missing something. But is it possible that this regressed at some point in 2018, was fixed, and then regressed again in 2019 (and possibly from Bug 1517880)? A few things make me think that:

  1. The regression range from Comment 57 is Last good: 2018-07-31-22-02-08, First bad: 2018-08-01-10-01-16
  2. Comment 33 says this was working fine in Firefox 69 (which had a merge date of 2019-05-20)
  3. The line of code you referenced in Comment 73 was added in Bug 1517880 (on 2019-07-26)

If so, then chasing down the regressor from the range in Comment 57 won't help us since whatever broke it then was already fixed, and then something different broke it subsequently. We'd want a new regression range starting with Firefox 69 as a known good.
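
The date reasoning in the three points above can be written out as a small check. This is just an illustrative Python sketch using the dates quoted in this comment; the helper name `good_after_bad` is made up, and it assumes both observations (the 2018 range and the Firefox 69 report) are reliable.

```python
from datetime import date

# Dates exactly as quoted in the comment above.
FIRST_BAD_2018 = date(2018, 8, 1)        # first bad build in the Comment 57 range
FIREFOX_69_GOOD = date(2019, 5, 20)      # Firefox 69 merge date, reported as working
CACHE_CHECK_ADDED = date(2019, 7, 26)    # line referenced in Comment 73 (Bug 1517880)


def good_after_bad(first_bad, known_good):
    """True when a known-good build is newer than an earlier 'first bad' build.

    If both observations hold, the earlier breakage must have been fixed
    somewhere in between, so a later failure needs its own regression range.
    """
    return known_good > first_bad


if __name__ == "__main__":
    print(good_after_bad(FIRST_BAD_2018, FIREFOX_69_GOOD))
    # The referenced check landed after the known-good build, which is
    # consistent with a second, separate regression.
    print(CACHE_CHECK_ADDED > FIREFOX_69_GOOD)
```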

Thank you Brian! Requesting a new regression range starting with Firefox 69 as a known good!

It looks like you might just need an older version of rust. You can temporarily downgrade your rust version for a directory/repo using the rustup override command. See rustup help override.

Yeah, that fixed the rust, but not the C++ errors. I probably would also need to downgrade my XCode SDK. But as Brian pointed out, maybe we can avoid having to go through that?

(In reply to Zibi Braniecki [:zbraniecki][:gandalf] from comment #88)

It looks like you might just need an older version of rust. You can temporarily downgrade your rust version for a directory/repo using the rustup override command. See rustup help override.

Yeah, that fixed the rust, but not the C++ errors. I probably would also need to downgrade my XCode SDK. But as Brian pointed out, maybe we can avoid having to go through that?

I'll point out that the only SDK version that we officially support for building Firefox at the moment is 10.11 [1].

[1] https://wiki.developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions/Mac_OS_X_Prerequisites#macOSSDK

Rechecked after the previous comments, and the note in comment 33.
Confirmed with 69.0a1 as being a good build.
Last Good: 2019-10-13-21-36-50
First Bad: 2019-10-14-09-52-34

Could you use mozregression, with 2019-10-13-21-36-50 as last good, to narrow this down further? Thank you

Flags: needinfo?(cristian.fogel)

I'm not sure chasing a regression range is necessarily helpful as it may only reveal what caused this particular case to start hitting this race condition. I think it will be easier to just fix the race condition itself.

Assignee: nobody → dtownsend

Managed to find a workaround and get mozregression usable.

Got to the following results:
Pushlog URL

Last good revision: 3bdfb7bc00a06a84c676d4d72ce47ea16b4b0042
First bad revision: 585fe45563356a10a9d53cddfeda2bd699e46dd5

Flags: needinfo?(cristian.fogel)

What's going on here is that the OSX open request is causing us to open a second window (ideally we wouldn't do this but that is a separate bug). The two windows open at basically the same time and so this happens:

  1. Window 1 starts opening and attempts to get browser.xhtml from the startup cache.
  2. The document isn't cached and so we start to parse the file from disk into a prototype document.
  3. Window 2 starts opening and attempts to get browser.xhtml from the startup cache.
  4. The cache reports there is a pending prototype so it waits for the parse to complete.
  5. Parsing of the prototype completes and both windows are given the built prototype.
  6. Both windows construct a DOM document from the prototypes and each sees that their document needs a localization pass and so fires up fluent.
  7. Whichever finishes first applies the translation to the prototype.

I'm not entirely sure why the empty menus are happening but it's most likely a result of both windows doing the same translation and attempting to apply it to the same prototype. Fixing the race will solve that and so I'm finishing up a patch that makes window 2 not get the prototype until after window 1 has translated it.
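
The ordering described above, and the proposed change to it, can be sketched with a small asyncio model. This is a hedged illustration in Python, not the actual patch and not Gecko code: `PendingPrototype`, `first_window`, `second_window` and the `FLUENT` table are hypothetical names. It does not try to reproduce the empty menus (whose exact cause is unknown at this point); it only counts how many translation passes touch the shared prototype under each ordering.

```python
import asyncio

FLUENT = {"menu-file": "File", "menu-edit": "Edit"}  # stand-in translations


class PendingPrototype:
    """Hypothetical stand-in for a prototype document that is still being built."""

    def __init__(self):
        self.labels = None
        self.parsed = asyncio.Event()      # set when parsing is done
        self.translated = asyncio.Event()  # set when a translation was applied
        self.translation_passes = 0


async def translate(proto):
    proto.translation_passes += 1
    await asyncio.sleep(0)                 # pretend the Fluent pass is asynchronous
    proto.labels = dict(FLUENT)
    proto.translated.set()


async def first_window(proto):
    await asyncio.sleep(0)                 # pretend to parse from disk
    proto.labels = {l10n_id: "" for l10n_id in FLUENT}
    proto.parsed.set()
    await translate(proto)


async def second_window(proto, wait_for_translation):
    if wait_for_translation:
        # Proposed ordering: only receive the prototype once it is translated.
        await proto.translated.wait()
    else:
        # Described ordering: receive the prototype as soon as it is parsed,
        # then run a second localization pass against the same prototype.
        await proto.parsed.wait()
        await translate(proto)
    return dict(proto.labels)


async def run(wait_for_translation):
    proto = PendingPrototype()
    _, labels = await asyncio.gather(
        first_window(proto), second_window(proto, wait_for_translation)
    )
    return proto.translation_passes, labels


if __name__ == "__main__":
    print(asyncio.run(run(wait_for_translation=False)))  # two passes on one prototype
    print(asyncio.run(run(wait_for_translation=True)))   # a single pass
```

Gating the second window on the translated state removes the duplicate pass in this model, which is the property the proposed fix relies on.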

But there is something odd. If browser.xhtml is in the startup cache then none of this should happen. Both windows should just get the translated prototype directly at step 1 and 3 and there should be no race so I was pretty confused for a while about how this was reproducible normally and not just when the cache had been purged.

Well it turns out that the XUL prototype cache has been broken since early October and so right now we're parsing and translating browser.xhtml on every startup!

Fixing bug 1617092 will make this bug much less of an issue since it would only occur in cases where the startup cache is empty (every Firefox upgrade basically).

The range in comment 93 doesn't look right, nothing in there is a likely candidate but it is around a week away from when the startup cache broke so my suspicion is that that is what actually revealed this. It is possible that this bug has existed for longer, maybe even since we started caching fluent passes, but it's just been hidden by the working startup cache.

Depends on: 1617092

Oh wow! Thank you Mossop! This seems like a fun investigation and maybe it will also explain why I am seeing tspaint wins in bug 1613705!

The second window when opening a URL thing is probably effectively bug 1580799 (see particularly comments 4 through 13); there's some kind of race in how we collect URLs we were asked to open by the OS on mac, and the exact effects vary depending on what wins that race.

Status: NEW → ASSIGNED

The mitigation in bug 1617092 should be enough to stop tracking this bug. Marking 74 as wontfix given where we are in the cycle.

This is actually not happening anymore for me (I only get one window, and the menu is localized). Is there a point in keeping this bug open?

(In reply to Francesco Lodolo [:flod] from comment #98)

This is actually not happening anymore for me (I only get one window, and the menu is localized). Is there a point in keeping this bug open?

Based on Dave's comment #94 (see penultimate paragraph) this could still happen if you stage an update, close Firefox, and then the next time it starts you do so from an external link, ie if the external link case is combined with having to apply updates (thus blowing away the cache).

(In reply to :Gijs (he/him) from comment #99)

(In reply to Francesco Lodolo [:flod] from comment #98)

This is actually not happening anymore for me (I only get one window, and the menu is localized). Is there a point in keeping this bug open?

Based on Dave's comment #94 (see penultimate paragraph) this could still happen if you stage an update, close Firefox, and then the next time it starts you do so from an external link, ie if the external link case is combined with having to apply updates (thus blowing away the cache).

Yes. If the startup cache is missing or gets purged when you start in this way you'll still hit it. Much less of an issue (gonna argue P3, could even be P5) but there is definitely still a bug in how we handle races to parse the same XUL/XHTML document right now.
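As a rough illustration of why the remaining window for the race is the first start after an update, here is a small Python sketch. It assumes, purely for illustration, that the cache is discarded whenever the build identifier changes; `StartupCache`, `race_possible` and the build ID strings are hypothetical and are not the real startup cache implementation.

```python
class StartupCache:
    """Toy cache that is only valid for the build that wrote it (assumption)."""

    def __init__(self):
        self.build_id = None
        self.entries = {}

    def lookup(self, build_id, uri):
        if build_id != self.build_id:
            # A different build is starting: drop everything cached so far.
            self.entries.clear()
            self.build_id = build_id
        return self.entries.get(uri)

    def store(self, uri, prototype):
        self.entries[uri] = prototype


def race_possible(cache, build_id, uri="browser.xhtml"):
    """The race needs a cache miss, so that two windows share one parse."""
    if cache.lookup(build_id, uri) is None:
        # Miss: the document is parsed and translated, then cached.
        cache.store(uri, "translated prototype")
        return True
    # Hit: every window gets the translated prototype directly.
    return False


cache = StartupCache()
first_start = race_possible(cache, "build-A")    # empty cache
second_start = race_possible(cache, "build-A")   # warm cache
after_update = race_possible(cache, "build-B")   # cache purged by update
```

Under these assumptions only the first start and the start right after the update are exposed, which matches the "stage an update, then start from an external link" scenario above.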

Priority: P1 → P3

Apologies for posting a duplicate bug. Read the whole thread, looks like it's fixed in 75?

Just noticed an interesting secondary effect -- in the primary window opened with this issue, right clicking produces a context menu with blank entries as well.

(In reply to scott.feier from comment #103)

Just noticed an interesting secondary effect -- in the primary window opened with this issue, right clicking produces a context menu with blank entries as well.

This is to be expected. Basically all static strings will be missing as the UI has not had any localization applied.

(In reply to JC from comment #108)

When testing I did note a minor bug, however. On first access of the "Bookmarks" sub-menu of the Top Menu, the option to "Add bookmark" is missing. Performing any operation after this causes the Add bookmark option to appear.

Please file a new bug for this issue.

I am still experiencing this issue with v73.0.1 (64-bit) on Catalina.

The original launch page will load the menus correctly and all are populated. However, clicking on a link in another page or from an email will launch a second window and all menu text is missing. Clicking on where the menus are SUPPOSED to be will create their corresponding drop down menus, but there is no text, only the short cuts as shown in the above posting by Haik Aftandilian.

I have disabled all plugins, extensions, etc. but nothing has resolved the problems.

Thanks for following up with this issue.

-DON-

(In reply to DMatthews251@gmail.com from comment #112)

I am still experiencing this issue with v73.0.1 (64-bit) on Catalina.

Bug 1617092 should fix most of the cases, and will be in 74 next Tuesday.

I no longer have this problem since the macOS update (version 10.15.4). Can you confirm?
Take care

(In reply to Matthieu S. from comment #116)

I no longer have this problem since the macOS update (version 10.15.4). Can you confirm?
Take care

I am on FF ver. 74.0 (64 bit) OSx and have just run the OS update. All seems OK for the moment.

Thanks for your help.

(In reply to DMatthews251@gmail.com from comment #117)

(In reply to Matthieu S. from comment #116)

I no longer have this problem since the macOS update (version 10.15.4). Can you confirm?
Take care

I am on FF ver. 74.0 (64 bit) OSx and have just run the OS update. All seems OK for the moment.

Thanks for your help.

Yes, things seem to suddenly work the way they were intended now. How weird. ;)

Thanks for verifying. Closing as resolved, presumably fixed by Apple in the latest dot release.

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME

It's back - in the new release of Firefox 74.0.1 (64-bit): no menus, and Forward/Back button hover tooltips are missing. Keyboard shortcuts such as Cmd-D don't work either. Mozilla really needs to get in better sync with Catalina changes.

The menus are missing both when opening Firefox 74.0.1 from the Mac dock, and when opening windows from e-mail links.

I'm pretty sure this has never been gone, but it now only happens if you update Firefox as part of the restart. See comment #100. It should be fine after a Firefox restart (for Firefox 74 and above). If you're still seeing it then, please provide more details about your system.

Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---

(In reply to :Gijs (he/him) from comment #122)

I'm pretty sure this has never been gone, but it now only happens if you update Firefox as part of the restart. See comment #100. It should be fine after a Firefox restart (for Firefox 74 and above). If you're still seeing it then, please provide more details about your system.

I'm not sure what you mean by a "restart". I updated Firefox last evening, and the computer was shut down at the end of the day. After booting the Mac today the problem remains.

(In reply to Frank from comment #123)

(In reply to :Gijs (he/him) from comment #122)

I'm pretty sure this has never been gone, but it now only happens if you update Firefox as part of the restart. See comment #100. It should be fine after a Firefox restart (for Firefox 74 and above). If you're still seeing it then, please provide more details about your system.

I'm not sure what you mean by a "restart". I updated Firefox last evening, and the computer was shut down at the end of the day. After booting the Mac today the problem remains.

If you create a fresh dummy profile ( https://support.mozilla.org/en-US/kb/profile-manager-create-and-remove-firefox-profiles ), set it as the default, quit Firefox, and then start Firefox with that from the dock, does the dummy profile have the same problem? (you can then set your "normal" profile back to being the default and restart again, then delete the dummy profile)

Flags: needinfo?(pherankh)

If you create a fresh dummy profile ( https://support.mozilla.org/en-US/kb/profile-manager-create-and-remove-firefox-profiles ), set it as the default, quit Firefox, and then start Firefox with that from the dock, does the dummy profile have the same problem? (you can then set your "normal" profile back to being the default and restart again, then delete the dummy profile)

I created a new profile, quit Firefox, and then restarted. The window displayed with menus in place and working (context menu too). Forward/Back buttons tooltips displayed normally and keyboard shortcuts behaved normally. I then switched back to the older "default" profile and that appears to work normally now - both cold starting and opening windows from e-mail links.

Thanks for the tip.

Flags: needinfo?(pherankh)

The problem has returned.
After cold starting Firefox, and in browser windows opened by clicking on email links - all active windows are missing navigation menus. Right-click context menu text is missing. Keyboard shortcuts not working. Forward/Back button tooltips missing. Changing profile (I have 3 existing profiles) doesn't change the results.

Closing all windows sometimes leaves me with a normal Firefox menu bar, but opening a new window to browse gives me a window with no menus, no keyboard shortcuts, etc. Firefox is pretty much unusable.

Mac OS X Catalina Version 10.15.4
Firefox 75.0
iMac Retina 5K, 27-inch, 2019

See Also: → 1629291
Summary: macOS Catalina: Menu labels missing when cold opening Firefox from link or other app → Racing for the same prototype document causes others to receive an untranslated prototype and then cache it
See Also: → 1632486

In Firefox 80.0.1, I'm still seeing this issue. Is there a fix intended? I don't see any tracking status beyond firefox76.

Assignee: dtownsend → nobody

Same issue: two Firefox windows open when I click on an external link.

Firefox 94 - macOS Monterey

(In reply to Dave Townsend [:mossop] from comment #94)

What's going on here is that the OSX open request is causing us to open a second window (ideally we wouldn't do this but that is a separate bug). The two windows open at basically the same time and so this happens:

  1. Window 1 starts opening and attempts to get browser.xhtml from the startup cache.
  2. The document isn't cached and so we start to parse the file from disk into a prototype document.
  3. Window 2 starts opening and attempts to get browser.xhtml from the startup cache.
  4. The cache reports there is a pending prototype so it waits for the parse to complete.
  5. Parsing of the prototype completes and both windows are given the built prototype.
  6. Both windows construct a DOM document from the prototypes and each sees that their document needs a localization pass and so fires up fluent.
  7. Whichever finishes first applies the translation to the prototype.

I'm not entirely sure why the empty menus are happening but it's most likely a result of both windows doing the same translation and attempting to apply it to the same prototype. Fixing the race will solve that and so I'm finishing up a patch that makes window 2 not get the prototype until after window 1 has translated it.

Did you end up writing this patch, or part of this patch?

(In reply to Salvatore from comment #131)

Same issue: two Firefox windows open when I click on an external link.

Firefox 94 - macOS Monterey

Every time? That seems like it might be a separate issue, given that per Dave's comment this should now only happen if you start Firefox externally when it needed to upgrade.

Flags: needinfo?(firefoxaccenture)
Flags: needinfo?(dtownsend)

(In reply to :Gijs (he/him) from comment #132)

(In reply to Dave Townsend [:mossop] from comment #94)

What's going on here is that the OSX open request is causing us to open a second window (ideally we wouldn't do this but that is a separate bug). The two windows open at basically the same time and so this happens:

  1. Window 1 starts opening and attempts to get browser.xhtml from the startup cache.
  2. The document isn't cached and so we start to parse the file from disk into a prototype document.
  3. Window 2 starts opening and attempts to get browser.xhtml from the startup cache.
  4. The cache reports there is a pending prototype so it waits for the parse to complete.
  5. Parsing of the prototype completes and both windows are given the built prototype.
  6. Both windows construct a DOM document from the prototypes and each sees that their document needs a localization pass and so fires up fluent.
  7. Whichever finishes first applies the translation to the prototype.

I'm not entirely sure why the empty menus are happening but it's most likely a result of both windows doing the same translation and attempting to apply it to the same prototype. Fixing the race will solve that and so I'm finishing up a patch that makes window 2 not get the prototype until after window 1 has translated it.

Did you end up writing this patch, or part of this patch?

See attached. I can't really remember where I left off though, I think I was having problems testing it. Also I attempted to rebase this and resolve the conflicts but haven't even attempted to build it in its current state.

Flags: needinfo?(dtownsend)
Attachment #9252078 - Attachment is obsolete: true
Severity: normal → S3

The severity field for this bug is relatively low, S3. However, the bug has 14 duplicates and 5 See Also bugs.
:enndeakin, could you consider increasing the bug severity?

For more information, please visit auto_nag documentation.

Flags: needinfo?(enndeakin)
Flags: needinfo?(enndeakin)

Clear a needinfo that is pending on an inactive user.

Inactive users most likely will not respond; if the missing information is essential and cannot be collected another way, the bug maybe should be closed as INCOMPLETE.

For more information, please visit auto_nag documentation.

Flags: needinfo?(firefoxaccenture)