Closed Bug 1574322 Opened 5 years ago Closed 4 years ago

Thunderbird freezes/hangs upon opening folder Properties window on Mac

Categories

(Thunderbird :: Folder and Message Lists, defect)

All
macOS
defect
Not set
critical

Tracking

(thunderbird_esr68 unaffected, thunderbird69 affected)

RESOLVED FIXED
Thunderbird 74.0
Tracking Status
thunderbird_esr68 --- unaffected
thunderbird69 --- affected

People

(Reporter: de.berberich, Assigned: emilio)

References

(Regression)

Details

(Keywords: hang, regression, Whiteboard: smoketest69.0b3, Revised STR in comment #41)

Attachments

(3 files)

Steps to reproduce:
Right-click on a folder in the folder pane > choose "Properties" in the contextual menu.
The Properties drop-down window opens, displaying the "General Information" tab. Thunderbird 69 immediately becomes unresponsive (freezes) and macOS displays the "spinning beach ball".
One has to use the force-quit command to quit Thunderbird.

Expected behavior: Thunderbird should not freeze when user opens the "Properties" drop-down window.

Working fine on Windows. Mac owners ahoy!

Flags: needinfo?(richard.marti)
Flags: needinfo?(alessandro)
Summary: Thunderbird freezes upon opening Properties window → Thunderbird freezes upon opening folder Properties window on Mac
See Also: → 1574249

Confirmed!
No errors show up in the terminal though, it just freezes.
I'll keep it running and see if it crashes with some log info.

Flags: needinfo?(alessandro)
No longer blocks: tb68found
Severity: normal → critical
Keywords: hang

The hang issue is already present in build 1. I didn't (re)test build 2.

No problem here. Tried with beta3b2 too on a new profile with an IMAP account and local folders. The dialog is shown and I can change the tab and also close the dialog.

Flags: needinfo?(richard.marti)

So why can Alex see the issue (comment #2)? Some add-on crashing it?

I tested in new profiles and Lightning disabled. TB 69 still hangs when opening a "Properties" box.

But what about beta 2? If the issue is already in beta 2, I have no trouble shipping beta 3.

I just tested in new profiles, the issue is already present in both TB 69.0b2 and TB 69.0b1.

Can't find the bug with only "freezes".

Summary: Thunderbird freezes upon opening folder Properties window on Mac → Thunderbird freezes/hangs upon opening folder Properties window on Mac

For what it's worth, I'm not using any add-on, and it also hangs on trunk with Lightning disabled.

69.0b4 available for testing

Flags: needinfo?(de.berberich)

The bug is still present in 69.0b4.

Flags: needinfo?(de.berberich)

Still working. What am I doing wrong? Tried on an empty folder and a folder with a message open. What type of folder does it happen on? I have IMAP and Local Folders. On Daily additionally Feed- and newsgroups folders.

Or is it system-related? I have a 13" MBP on macOS 10.14 with a Retina screen, and there are no problems on an external low-res monitor either.

(In reply to Richard Marti (:Paenglab) from comment #14)

Still working. What am I doing wrong? Tried on an empty folder and a folder with a message open. What type of folder does it happen on? I have IMAP and Local Folders. On Daily additionally Feed- and newsgroups folders.

Or is it system-related? I have a 13" MBP on macOS 10.14 with a Retina screen, and there are no problems on an external low-res monitor either.

Also, what version of macOS?

Flags: needinfo?(de.berberich)

10.14.6

It happens with any folder: POP and IMAP accounts, news accounts, feed accounts
(iMac 21,5" Retina 4K display, macOS 10.14.6).

Flags: needinfo?(de.berberich)

Adam or Antony, have you seen this?

Flags: needinfo?(adam)
Flags: needinfo?(acdp)

(In reply to Wayne Mery (:wsmwk) from comment #18)

Adam or Antony, have you seen this?

I'm still running 60 because I haven't had time to update my own add-on for 69 yet, and I can't really function without it. :)

For what it's worth, I ran the STR in 60 and didn't get a hang.

Flags: needinfo?(adam)

I do not see this issue and using 71.0b1 (64-bit) at the moment.

I opened and closed a number of different folder properties in different email addresses, and TB is stable on the latest Mojave beta - still the same Mac (late 2013) - rebuilt, almost!

Flags: needinfo?(acdp)

This issue is still present in Thunderbird 70.0b1 and Thunderbird Daily.
Running macOS 10.14.6 (Mojave)

Let's see who can post a stack first :)

Flags: needinfo?(de.berberich)
Flags: needinfo?(alessandro)
Keywords: stackwanted

(In reply to Wayne Mery (:wsmwk) from comment #22)

Let's see who can post a stack first :)

Unfortunately I don't know what "stack" means in this environment and how to do it.
The only report I have is a crash report (stackshots) my Mac generated tonight when I had to force-quit Thunderbird Daily which had frozen after opening the Properties window.

Flags: needinfo?(de.berberich)

https://developer.mozilla.org/en-US/docs/Mozilla/How_to_report_a_hung_Firefox describes how to get a stacktrace (also known as a backtrace)

That is not the page that describes how to get a stacktrace, but in the meantime I've found another page: https://developer.mozilla.org/en-US/docs/Mozilla/How_to_get_a_stacktrace_for_a_bug_report
However, the instructions on the latter page don't make it clear to me how to get a stacktrace (or backtrace). And when I go to "~/Library/Logs" and open "CrashReporter", there are no logs for "thunderbird-bin".

Don't forget that I'm a simple user and no developer.
Eventually I found that I had to look up the Thunderbird Daily PID in the Activity Monitor and modify the command line in Terminal to provoke the crash. The crash report has been sent to Mozilla, what now?

Crash IDs:
bp-4483f238-d834-486b-a86b-781e10191104 [@ nsFrameIterator::GetFirstChild ]
bp-7e318902-8adb-4387-bd0e-c7d370191104 [@ __CFRunLoopDoSources0 ]

My mac is out for maintenance for a couple of weeks, so I'm kinda useless on this, sorry.

Flags: needinfo?(alessandro)

m_kato can you make anything out of the stacks in comment 27?
Is more detail needed?

The first signature doesn't exist as a crash for Thunderbird or Firefox. For __CFRunLoopDoSources0 only Firefox crashes exist, but for the crashes I checked the stacks don't match up.

Flags: needinfo?(m_kato)

Maybe I should add that I provoked the first crash while Daily ran normally and the second crash when it already hung after opening a folder properties window - bp-7e318902-8adb-4387-bd0e-c7d370191104 [@ __CFRunLoopDoSources0 ]

0 libsystem_kernel.dylib mach_msg_trap context
1 CoreFoundation __CFRunLoopDoSources0 frame_pointer
2 CoreFoundation __CFRunLoopRun frame_pointer
3 CoreFoundation __CFTSRToDispatchTime frame_pointer
4 HIToolbox RunCurrentEventLoopInMode frame_pointer
5 HIToolbox ReceiveNextEventCommon frame_pointer
6 HIToolbox _BlockUntilNextEventMatchingListInModeWithFilter frame_pointer
7 AppKit _DPSNextEvent frame_pointer
8 AppKit -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] frame_pointer
9 XUL -[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] widget/cocoa/nsAppShell.mm:169 frame_pointer
10 AppKit -[NSApplication run] frame_pointer

Just an FYI that I can consistently produce this hang on macOS 10.15.1, Thunderbird 72.0b2. It happens on any folder as far as I can tell, all my accounts are IMAP/SMTP. Let me know if additional stack traces would help!

ben can you make anything out of the stack in comment 30?

I'm a newbie to Mac, 10.15.1 - I can't reproduce with 72.0b1, for any folder, and any account type.

Flags: needinfo?(benc)

(In reply to Wayne Mery (:wsmwk) from comment #32)

ben can you make anything out of the stack in comment 30?

Not much, I'm afraid :-(

I don't know much about OSX internals, but my rough web searching hints that EXC_SOFTWARE is used to wrap up unix-style signals, and that SIGABRT is used by libc (as an example) when it wants to freak out because it's detected that its heap bookkeeping has been corrupted. So my first guess would be a memory overrun somewhere.
(a run through valgrind might shed some light on memory corruption, but I suspect it'd be obscured by huge amounts of other (valid) complaints...)

Flags: needinfo?(benc)

Aleca, are you on 10.14, or Catalina?

By my accounting, everyone reporting the problem so far is on 10.14.

Flags: needinfo?(alessandro)

Never mind, comment 30 cites Catalina.

smichaud, any advice?

Flags: needinfo?(alessandro) → needinfo?(smichaud)

None, I'm afraid.

I can't reproduce these hangs (I tested with Thunderbird 69.0 b4). The stack from comment #30 is completely normal. As I understand it, this is how Thunderbird/Firefox records a hang -- as a "crash" on the main thread, whatever happens to be running there.

I'm currently running on macOS 10.12.6. I'll reboot into Catalina and try again.

Flags: needinfo?(smichaud)

I'm currently running on macOS 10.12.6. I'll reboot into Catalina and try again.

Oops, I can't for now. I'm away from home, and don't have all my testing servers. I'll be back in a week or so.

Bug reproduced on every folder properties panel (whether inboxes or subfolders of "sent mail", "archive", etc.) on Thunderbird 72.0b2 on MacOS Catalina (10.15.3 (19D49f)).

(In reply to anfire@nym.hush.com from comment #38)

Bug reproduced on every folder properties panel (whether inboxes or subfolders of "sent mail", "archive", etc.) on Thunderbird 72.0b2 on MacOS Catalina (10.15.3 (19D49f)).

Does it happen to you when using version 68? https://www.thunderbird.net/en-US/

Flags: needinfo?(anfire)

(In reply to Wayne Mery (:wsmwk) from comment #39)

(In reply to anfire@nym.hush.com from comment #38)

Bug reproduced on every folder properties panel (whether inboxes or subfolders of "sent mail", "archive", etc.) on Thunderbird 72.0b2 on MacOS Catalina (10.15.3 (19D49f)).

Does it happen to you when using version 68? https://www.thunderbird.net/en-US/

Just did a quick install/test with a single email account on TB 68.3.1 (still MacOS Catalina 10.15.3 (19D49f)) and the bug does not reproduce. The properties panel displays properly on the various folders, and I am able to switch between the various tabs without hang/freeze.

Flags: needinfo?(anfire)

(Following up comment #37)

I can now reproduce this bug on both macOS 10.12.6 and 10.15.2, testing with Thunderbird 69.0 b4. The STR from comment #0 is missing a step, at least for me.

STR

  1. Right-click on a folder in the folder pane and choose Properties in the contextual menu.
  2. Click on any of the tabs aside from the current one (General Information).
  3. Thunderbird hangs -- the spinning beachball appears.

This doesn't happen with Thunderbird 68.

Whiteboard: smoketest69.0b3 → smoketest69.0b3, Revised STR in comment #41

I'm also able to reproduce this bug with Thunderbird 73.0 b1, using the STR from comment #41.

Attached file lldb stack of hang

Here's an lldb stack of this bug's hang, made using today's comm-central nightly. I needed to use a nightly, because other distros have their symbols stripped.

Can you get the mozilla-central range for that? The changesets should be somewhere in the build info of the two versions. None of the C-C changes in this range look like they could have caused this issue.

I've already tried using the exact same revision numbers on mozilla-central, and it didn't work:

https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=f86f48fda7bd&tochange=3550d1f37dd5

I also strongly suspect that even if I could do what you ask, it wouldn't help. Before we do anything else, someone should try backing out each of the patches in the comm-central regression range. I'm afraid that won't be me :-)

They will be different changesets. Alice, can you please get the M-C range for the C-C range in comment #44. You can do it on a Windows build.

As I said, nothing in that C-C range looks suspicious. Those changesets were committed to the code base more than two years ago, you won't be able to simply back them out. As your stack shows, and thanks for that(!!), the issue is in nsFocusManager.

Flags: needinfo?(alice0775)

Hold on. Your range doesn't look right. You're saying it stopped working in TB 57 in August 2017. But TB 68 is not affected and the problem was reported for TB 69 which was in Daily between 2019-05-20 and 2019-07-08.

Flags: needinfo?(alice0775)

Try out the two builds from comment #44 on any recent version of macOS (I tested on 10.12.6 and 10.15.2). You'll find that the second one has the bug while the first one doesn't.

Possibly the bug disappeared later, then reappeared.

I don't have a Mac. But we need to check the 69 cycle.

Start here: http://ftp.mozilla.org/pub/thunderbird/nightly/2019/05/2019-05-20-10-41-36-comm-central/ (last 68 version) and end here:
http://ftp.mozilla.org/pub/thunderbird/nightly/2019/07/2019-07-10-11-54-19-comm-central/ (first 70 version).
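A range like this can be narrowed by bisecting over the nightly dates. The following is only a sketch in Python: `first_bad_nightly` and the `is_bad` callback are hypothetical names, and it assumes that a nightly exists for every date in the range and that the bug appears exactly once inside the window (which does not hold if a bug disappears and later reappears).

```python
from datetime import date, timedelta
from typing import Callable


def first_bad_nightly(good: date, bad: date,
                      is_bad: Callable[[date], bool]) -> date:
    """Return the earliest date after `good` whose nightly shows the bug.

    Assumes `good` is a known-good nightly, `bad` is a known-bad one,
    and there is a single good-to-bad transition between them.
    """
    while (bad - good).days > 1:
        # Test the nightly halfway between the two known endpoints.
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_bad(mid):
            bad = mid    # the regression is at or before mid
        else:
            good = mid   # the regression is after mid
    return bad
```

Here `is_bad` would wrap the manual step: install the nightly for that date, run the STR, and report whether it hangs.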

Possibly the bug disappeared later, then reappeared.

Yes, that seems to have happened. The 2019-05-20 comm-central nightly doesn't have the bug. I'll try to find another, later, regression range.

Excellent, we're cooking with gas. Thanks a lot. We certainly need to get this fixed before the next major release.

Here's another, later regression range:

http://ftp.mozilla.org/pub/thunderbird/nightly/2019/06/2019-06-04-09-18-47-comm-central/thunderbird-69.0a1.en-US.mac.dmg
http://ftp.mozilla.org/pub/thunderbird/nightly/2019/06/2019-06-05-10-11-01-comm-central/thunderbird-69.0a1.en-US.mac.dmg

https://hg.mozilla.org/comm-central/pushloghtml?fromchange=098d414457e2&tochange=4a37dadf6819

By the way, neither build shows a mozilla-central revision number. That capability was only added later.

Note that when testing with these builds, you need to prevent them from automatically updating. I did that by starting them with the network off, then changing their "update" setting to "Check for updates, but let me choose whether to install them".
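For anyone comparing other revision pairs, the pushlog link above follows a simple pattern. A small Python sketch, assuming a hypothetical helper name `pushlog_url` and that the `fromchange` and `tochange` query parameters are all that is needed, as in the link above:

```python
from urllib.parse import urlencode


def pushlog_url(repo: str, from_rev: str, to_rev: str) -> str:
    """Build a pushloghtml URL for a repository path on hg.mozilla.org."""
    query = urlencode({"fromchange": from_rev, "tochange": to_rev})
    return f"https://hg.mozilla.org/{repo}/pushloghtml?{query}"


# Rebuilds the comm-central range link from the two short revisions.
print(pushlog_url("comm-central", "098d414457e2", "4a37dadf6819"))
```

The same helper should work for a mozilla-central range by passing `"mozilla-central"` as the repository.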

https://hg.mozilla.org/comm-central/pushloghtml?fromchange=098d414457e2&tochange=4a37dadf6819

I can't see any obvious culprit in this range, either. I strongly suspect that this bug is "weird" -- that it involves some kind of very low level bad interaction between macOS and the Firefox/Thunderbird code base. It may only be possible to tease it out by trial and error -- for example by backing out or disabling patches in the range one by one. Once you find which patch is "at fault", I can start digging into the problem with my HookCase debugger.

Thanks again. As I said, the crash is in M-C code in nsFocusManager. Of course the build information is present in version 68 and later. On Windows that is: Help > Troubleshooting Information, about:buildconfig, see here for the TB 68.4.1 I'm currently using:
Thunderbird source
https://hg.mozilla.org/releases/comm-esr68/rev/afc61dccd44560fac1f8647821e55b110e3fe370
Platform source
https://hg.mozilla.org/releases/mozilla-esr68/rev/5c3329fb2b7d52fa06d00b6b1b384b0ef7c4a279

The Daily should have a platform source of mozilla-central. If not, we can ask Alice as I did in comment #47.

(In reply to Steven Michaud [:smichaud] (Retired) from comment #53)

Here's another, later regression range:

http://ftp.mozilla.org/pub/thunderbird/nightly/2019/06/2019-06-04-09-18-47-comm-central/thunderbird-69.0a1.en-US.mac.dmg
http://ftp.mozilla.org/pub/thunderbird/nightly/2019/06/2019-06-05-10-11-01-comm-central/thunderbird-69.0a1.en-US.mac.dmg

https://hg.mozilla.org/comm-central/pushloghtml?fromchange=098d414457e2&tochange=4a37dadf6819

By the way, neither build shows a mozilla-central revision number. That capability was only added later.

Note that when testing with these builds, you need to prevent them from automatically updating. I did that by starting them with the network off, then changing their "update" setting to "Check for updates, but let me choose whether to install them".

mozilla-central range:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=c909c105f914f69054b9a7c6b189ee39fa1cad44&tochange=155a7e2117e575ff6de6caa3dfe5b076cb455ae1

(You can find the m-c changeset in the same way as in comment #56.)

Can we narrow this down even more with some TaskCluster builds? Could be caused by bug 1519948 or bug 1544916 or bug 836176 ... or something else altogether.

Emilio, looking at nsFocusManager.cpp, I found your name a few times. Maybe it would be best to simply debug this crash on Mac since it's easily reproducible. Are you on Mac?

Flags: needinfo?(emilio)

Of those three my bets would be on bug 1544916, but that's just a guess...

I'm not on Mac unfortunately, and I won't have access to mac until after the all hands (away from the office till then for family reasons :/).

Does this repro on trunk? If so it might be reproducible on Linux with some of the mac-specific focus code enabled...

Flags: needinfo?(emilio)

Does this repro on trunk?

I believe that it does.

I've confirmed that this bug doesn't happen on Windows (Windows 10) or Linux (Fedora 31), even with my revised STR from comment #41. I tested with Thunderbird 69.0 b4.

Can we narrow this down even more with some TaskCluster builds?

I'd be willing to help with this. Where are the task cluster builds?

Damn, in principle they are here:
https://treeherder.mozilla.org/#/jobs?repo=comm-central&098d414457e2&tochange=4a37dadf6819
but they only get kept for a while, maybe six months or so, so all the stuff from June 2019 is already gone ... unless Alice knows a better trick.

< 2019.11.09
https://tools.taskcluster.net/index/comm.v2.comm-central.revision

>= 2019.11.10
https://firefox-ci-tc.services.mozilla.com/tasks/index/comm.v2.comm-central.revision

then find changeset > thunderbird > macosx64-opt > public/build/target.dmg and public/build/target.txt

https://tools.taskcluster.net/index/comm.v2.comm-central.revision

I couldn't find https://hg.mozilla.org/mozilla-central/rev/5e7c2d9420574ed055721a9685ff0d048d15399c, the revision that landed just before bug 1544916's revision. I suspect that even this server doesn't retain task cluster builds old enough to be useful here.

If I remember correctly, it's possible to use hg to roll back a current tree to any previous revision, no matter how old. Then one can build the tree and test with it. I could try this in the next few days. It sounds like there's no-one else available, at least for the time being.

However, I don't know how to roll back a comm-central tree to a given mozilla-central revision. And in any case this may not be possible. That leaves the option of trying to back out a patch, or to disable it if that's not possible. Slow work. But it sounds like nobody else (besides me) will be able to get to it anytime soon.

https://tools.taskcluster.net/index/comm.v2.comm-central.revision

I couldn't find https://hg.mozilla.org/mozilla-central/rev/5e7c2d9420574ed055721a9685ff0d048d15399c, the revision that landed just before bug 1544916's revision. I suspect that even this server doesn't retain task cluster builds old enough to be useful here.

Oops, it's probably a mistake to look for mozilla-central revisions there.

(In reply to Steven Michaud [:smichaud] (Retired) from comment #65)

However, I don't know how to roll back a comm-central tree to a given mozilla-central revision. And in any case this may not be possible. That leaves the option of trying to back out a patch, or to disable it if that's not possible.

Yes it's unfortunately not possible to automate the matching of comm-central and mozilla-central versions. Of course it only needs to match if there are breaking changes in between, which there might not be too much of in this case since the window is so small. Thanks for looking into it!

(In reply to Alice0775 White from comment #64)

< 2019.11.09
https://tools.taskcluster.net/index/comm.v2.comm-central.revision

Thanks, Alice, we're looking at June 2019 here and those builds are likely gone due to the TC move. Or can you see any .dmg files? Then please paste the links here.

How, exactly, does new mozilla-central code get into the comm-central "branch"? Looking at https://hg.mozilla.org/comm-central/shortlog, I don't see any "merge" revisions (like at https://hg.mozilla.org/mozilla-central/shortlog).

It doesn't. M-C and C-C are two separate repositories. M-C is all the Mozilla platform code and Firefox code, and C-C is a 5% mail add-on (baroque) plaster balcony. But since M-C constantly changes APIs, C-C needs to match. So each TB binary is the product of two repositories which were compiled together, see comment #55 and about:buildconfig. Since the crash is in M-C code we assume that the problem got introduced there, so we need the M-C changesets with which the binary was built. Is that clear now? EDIT: No offence intended, I can explain it further. Or further reading here: https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions/Simple_Thunderbird_build#Get_the_source

Is that clear now?

Not really, I'm afraid.

In the next few days I'll start building comm-central, and fiddling with the code. I hope things will become clearer then.

The key is to understand that the code we compile lives in two repositories. You don't build comm-central, you pull two repositories and then build Thunderbird. To build Firefox, you only need to pull M-C.

The build instructions linked in comment #72 are old, they are superseded by https://developer.thunderbird.net/the-basics/building-thunderbird, but sadly, as I've just noticed, that page has been badly refactored from the old Wiki page, see:
http://lists.thunderbird.net/pipermail/maildev_lists.thunderbird.net/2020-January/002031.html

You don't build comm-central, you pull two repositories and then build Thunderbird.

That makes things a lot clearer.

I'll play around with building Thunderbird. I'm quite used to dealing with iffy documentation. Usually I'm able to figure things out.

FWIW I can confirm the steps of comment 41.

Thanks, so that looks like bug 1519948 or bug 1544916 with Emilio putting his money on the latter. The strange thing is that that's purely JS/CSS/XML changes https://hg.mozilla.org/mozilla-central/rev/843c150636f81fb2f13196844206514f8a938bd9 and shouldn't cause a crash. Hmm. I think some direct debugging is needed here.

(In reply to Jorg K (GMT+1) (PTO to 26th Jan 2020, sporadically reading bugmail) from comment #74)

The key is to understand that the code we compile lives in two repositories. You don't build comm-central, you pull two repositories and then build Thunderbird. To build Firefox, you only need to pull M-C.

The build instructions linked in comment #72 are old, they are superseded by https://developer.thunderbird.net/the-basics/building-thunderbird, but sadly, as I've just noticed, that page has been badly refactored from the old Wiki page, see:
http://lists.thunderbird.net/pipermail/maildev_lists.thunderbird.net/2020-January/002031.html

Thanks, Jorg, for this information. I had no problem building Thunderbird.

Now I'm trying to use hg rewind on each of the repositories (mozilla-central and comm-central), to roll them back to given revisions. But it seems to be completely broken, though it worked just fine a few years ago. Anyone here with recent experience using hg rewind? I'll probably eventually figure it out, but it may take me several days.

how about this?

hg update -r <changeset>

Thanks, Alice! Your comment jogged my memory, and I now realize that it was hg update -r REV that I used a few years ago, and not hg rewind.

I've got another problem:

hg update -r REV works fine. But ./mach build fails with various errors if I try to build the resulting trees. I assume this is because the revised trees are so out of date that the current build environment doesn't work on them.

Any ideas, Alice?

Otherwise I'm going to have to read through each patch in the regression ranges and figure out how to disable them "by hand".

Just for some background. Alice works mostly on Windows, not at all on Mac.

The issue is that older code sometimes needs to be compiled with older compiler versions. Since you're only going back to June 2019, that shouldn't be so much of a problem.

Personally I'd set M-C to 21d22b2f541258d3d1cf96c7ba5ad73e96e616b5 and C-C to b12420c26104dd203307856e5086cb3139493c89 (from comment #77) and start from there. You need to do a mach clobber.

What are the errors?

What are the errors?

Things like this:

    mozbuild.configure.ConfigureError: `--enable-app-system-headers`, emitted from `/Volumes/Emma/Users/smichaud/Documents/Mozilla/NewBugs/bugzilla1574322/build/source/comm/mail/moz.configure` line 91, is unknown.

I'm going to sleep on this, and maybe take a different tack tomorrow. For example, it'd be nice to know exactly where, in the source for nsFocusManager::GetNextTabbableContent(), we go into an infinite loop.

Did you do a mach clobber? Something is wrong with the build configuration, not the compiler version, see moz.configure. Hmm.

Did you do a mach clobber?

Yes.

Something wrong with the build configuration

Yes. That's why I concluded that the build configuration no longer matches the trees after hg update.

Overnight I got some ideas, but I'm not going to try them until I've used ./mach run --debug to find out where in the source for nsFocusManager::GetNextTabbableContent() we go into an infinite loop. I've got a current working build, and it takes almost an hour to rebuild it from scratch.

As best I can tell, it's here that nsFocusManager::GetNextTabbableContent() goes into an infinite loop:

https://searchfox.org/mozilla-central/source/dom/base/nsFocusManager.cpp#3471

This block of code keeps getting called (currentTopLevelScopeOwner is always non-zero). But GetNextTabbableContentInScope() always returns nullptr, and the block always does a continue.

A bunch of local variables (like aForward) aren't available. I'll rebuild with --enable-debug and try to get at them.

Emilio, can you do something with comment #87?

Flags: needinfo?(emilio)

(Following up comment #88)

It worked. Here are some local variables. Whoever's interested, let me know if you want more. I checked with the instruction pointer here.

(bool) aForward = true
(bool) focusableHostSlot = true
(int32_t) aCurrentTabIndex = 1
(nsXULElement *) currentTopLevelScopeOwner = [non-null value]
(nsBoxFrame *) frame = [non-null value]

Here are some that still aren't available, for unknown reasons:

(int32_t) tabIndex = <variable not available>
(bool) aIgnoreTabIndex = <no location, value may have been optimized out>
(nsIContent *) currentContent = <no location, value may have been optimized out>
(bool) aForDocumentNavigation = <no location, value may have been optimized out>
(nsIContent *) oldTopLevelScopeOwner = <variable not available>

So... not really without reproing... Need to try to repro here on Linux somehow.

So frame->mContent->mParent should be the document I guess?

This time, after running hg update -r REV on both trees, I re-ran ./mach bootstrap. That seems to have done the trick. The build is now proceeding normally. Fingers crossed, it should finish in about an hour.

So frame->mContent->mParent should be the document I guess?

Give me a list of local variables you want me to check. I can't guarantee I'll be able to get them all. And it will have to wait for about an hour -- I hesitate to run ./mach run --debug in my previous build while my new test build is still building.

I'd love to know:

  • frame
  • frame->DumpFrameTreeLimited()
  • frame->mContent
  • Output of frame->mContent->List((FILE*)stderr, 0)
  • Output of frame->mContent->mSlots->mShadowRoot (or such)
  • And List() for that.
  • The output of DumpJSStack()
  • And ideally which branches / calls does the loop take / make.

It'd be good to know if calls to nsIFrame::IsFocusable end up in some mac-specific code like this sTabFocusModel check.

I'm building too, though I'm in a CSSWG face-to-face meeting this week so a bit hard to get a build locally... And I can't debug this over ssh to my fast machine unfortunately :/

(Following up comment #92)

I'm still not out of the woods. Sigh.

Now I'm seeing rustc errors:

    11:25.78 error[E0277]: `baldrapi::BD_ConstantValue__bindgen_ty_1` doesn't implement `std::fmt::Debug`
    11:25.78  --> /Volumes/Emma/Users/smichaud/Documents/Mozilla/NewBugs/bugzilla1574322/build/test/obj-x86_64-apple-darwin18.7.0/x86_64-apple-darwin/release/build/baldrdash-550a74f37802cf38/out/bindings.rs:3:15257

...

    11:25.82 error: aborting due to previous error
    11:25.82 For more information about this error, try `rustc --explain E0277`.
    11:25.82 error: could not compile `baldrdash`.

I assume the current version of rustc is fussier than the version that was current as of early June 2019. Any suggestions about how I can use an earlier version of rust, and which version I should use?

(In response to comment #93)

I got some of what you requested, but not a lot:

    (lldb) frame variable frame
    (nsBoxFrame *) frame = 0x000000011ca63640
    
    (lldb) expr (void) frame->DumpFrameTreeLimited()
    error: Couldn't lookup symbols:
      nsIFrame::DumpFrameTreeLimited() const
    
    (lldb) frame variable frame->mContent
    (nsCOMPtr<nsIContent>) frame->mContent = {
      mRawPtr = 0x000000012cfb81f0
    }
    
    (lldb) expr (void) frame->mContent->List((FILE*)stderr, 0)
    error: reference to 'stderr' is ambiguous
    candidate found by name lookup is 'std::io::stdio::stderr'
    candidate found by name lookup is 'std::io::stdio::stderr'
    
    (lldb) frame variable frame->mContent->mSlots->mShadowRoot
    error: "frame->mContent" is not a pointer and -> was used to attempt to access "mSlots". Did you mean "frame->mContent.mSlots->mShadowRoot"?
    (lldb) frame variable frame->mContent.mSlots->mShadowRoot
    error: "mSlots" is not a member of "(nsCOMPtr<nsIContent>) frame->mContent"
    
    (lldb) expr (void) DumpJSStack()
    0 _selectNewTab(aNewTab = "[object XULElement]") ["chrome://global/content/elements/tabbox.js":754:35]
        this = [object XULElement]
    1 on_mousedown(event = "[object MouseEvent]") ["chrome://global/content/elements/tabbox.js":345:27]
        this = [object XULElement]
    2 handleEvent(event = "[object MouseEvent]") ["chrome://global/content/customElements.js":468:26]
        this = [object XULElement]
    3 editFolder() ["chrome://messenger/content/folderPane.js":2977:11]
        this = [object Object]
    4 oncommand(event = "[object XULCommandEvent]") ["chrome://messenger/content/messenger.xhtml":1:22]
        this = [object XULElement]

One more:

    (lldb) expr frame->mContent->mSlots->mShadowRoot
    error: no member named 'mShadowRoot' in 'nsINode::nsSlots'

(Following up comment #94)

I assume the current version of rustc is fussier than the version that was current as of early June 2019. Any suggestions about how I can use an earlier version of rust, and which version I should use?

I've now gotten past this obstacle. It was surprisingly easy.

I grepped through the hg update-ed tree on "rustc_min_version" and found that, in this older code, it was set to "1.34.0". Then I ran rustup override set 1.34.0 in the directory where I'd put my hg update-ed tree, to make rust use that version while running in that directory.
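Pulling the steps from this thread together, the sequence for building an older revision pair might look like the sketch below (Python, only assembling the commands). The function name, the `source_dir` layout with comm-central checked out in a `comm` subdirectory, and the ordering of the steps are assumptions; the individual commands are the ones mentioned in earlier comments.

```python
from typing import List, Tuple

Step = Tuple[str, List[str]]  # (working directory, command)


def old_build_steps(mc_rev: str, cc_rev: str, rust_version: str,
                    source_dir: str = "source") -> List[Step]:
    """List the commands for building an older M-C / C-C revision pair."""
    comm_dir = f"{source_dir}/comm"  # assumed location of the C-C checkout
    return [
        (source_dir, ["hg", "update", "-r", mc_rev]),    # pin mozilla-central
        (comm_dir, ["hg", "update", "-r", cc_rev]),      # pin comm-central
        (source_dir, ["rustup", "override", "set", rust_version]),
        (source_dir, ["./mach", "bootstrap"]),
        (source_dir, ["./mach", "clobber"]),
        (source_dir, ["./mach", "build"]),
    ]


# Print the steps for the revision pair suggested earlier in this bug.
for cwd, cmd in old_build_steps(
        "21d22b2f541258d3d1cf96c7ba5ad73e96e616b5",
        "b12420c26104dd203307856e5086cb3139493c89",
        "1.34.0"):
    print(f"(cd {cwd} && {' '.join(cmd)})")
```

Each step could then be run with `subprocess.run(cmd, cwd=cwd, check=True)` so that the sequence stops at the first failure.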

The build still hasn't finished, but baldrdash did compile.

Now I'm waiting to trip over the next obstacle :-)

Here's the trigger for this bug from the reduced regression range from comment #77:

Bug 1544916 - migrate dialog binding to Custom Element r=bgrins,whimboo
https://hg.mozilla.org/mozilla-central/rev/843c150636f81fb2f13196844206514f8a938bd9

Now I'm going to try to find the trigger in the earlier regression range from comment #56.

Thanks for the confirmation.

Now I'm going to try to find the trigger in the earlier regression range from comment #56.

As a volunteer you can do what you like, but I don't think it will help proceedings to look at something that happened two years before on a very different code base. I could be wrong of course. IMHO, it would make sense to get Emilio the necessary information. You can modify the code and add print statements or those List() calls. They don't actually need any argument:
https://searchfox.org/mozilla-central/rev/e878e5b81bb319c141900ce9cfcde732df5c8449/dom/base/nsIContent.h#810

Regressed by: 1544916
Hardware: Unspecified → All
Attached file Hack to repro on Linux

I could repro on Linux with this, will check it out on a debug build later today... I suspect in the end only the CSS change is necessary.

(In reply to Emilio Cobos Álvarez (:emilio) from comment #100)

Created attachment 9122606 [details]
Hack to repro on Linux

I could repro on Linux with this, will check it out on a debug build later today... I suspect in the end only the CSS change is necessary.

I'm glad to hear it. So this isn't a low-level Apple bug, and I don't need to use HookCase here.

(In reply to Emilio Cobos Álvarez (:emilio) from comment #100)

Created attachment 9122606 [details]
Hack to repro on Linux

I could repro on Linux with this, will check it out on a debug build later today... I suspect in the end only the CSS change is necessary.

Steven, could you try a build with only the CSS change from Emilio's hack, to see if this fixes the issue on Mac?

To be clear that won't fix it. That's necessary to break it on Linux.

I suspect enabling full keyboard access on MacOS or such would fix it.

Flags: needinfo?(emilio)

Err, sorry didn't want to cancel the ni?

Flags: needinfo?(emilio)

(In reply to Alice0775 White from comment #56)

mozilla-central range:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=d10c97627b51a226e19d0fa801201897fe1932f6&tochange=1b4c59eef820b46eb0037aca68f83a15088db45f

This is wrong. It's actually:

https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=1b4c59eef820b46eb0037aca68f83a15088db45f&tochange=ab2d700fda2b4934d24227216972dce9fac19b74

But even so, I can't reproduce this bug when I do hg update -r ab2d700fda2b4934d24227216972dce9fac19b74 for mozilla-central and hg update -r 3550d1f37dd5be9bad2345031928264294436443 for comm-central. I don't know why, and I'm going to leave the problem aside, at least for the time being. With luck, what I've already found out about the second regression range (from comment #57) will suffice.

This was missing a check and as a result we were hanging. The simplified setup
is as follows:

<dialog>
#shadow-root
<panel>
<!-- nothing focusable here -->

And chrome code calling advanceFocusIntoSubtree with <panel> as the start
content.

We first try starting at <panel>, then from the root. It's in this second try
where we fail to find anything in the <panel> subtree, but we don't realize we
wrapped around and should stop.

I think we should really look into unifying the shadow dom and non-shadow dom
focus code, these bugs are somewhat nasty :/

I don't think this is observable from the web because this second iteration
happens from here:

And in the web case we'd hit the TabToTreeOwner call above...
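As a rough illustration of the missing check (a toy model in Python, not the actual nsFocusManager code): treat the tab order as a ring of nodes and search forward from the start content. The function name, the `stop_on_wrap` flag and the `step_budget` guard are all made up for this sketch; the budget exists only so that the broken variant raises instead of hanging.

```python
from typing import Callable, Optional, Sequence


def find_next_focusable(nodes: Sequence[str], start: int,
                        focusable: Callable[[str], bool],
                        stop_on_wrap: bool = True,
                        step_budget: int = 1000) -> Optional[str]:
    """Search forward from `start`, wrapping from the last node to the first."""
    if not nodes:
        return None
    index = start
    for _ in range(step_budget):
        index = (index + 1) % len(nodes)  # step forward, wrapping to the root
        if stop_on_wrap and index == start:
            # We are back where we began: nothing is focusable, so give up.
            return None
        if focusable(nodes[index]):
            return nodes[index]
    # Only reachable without the wrap check: the search never terminates.
    raise RuntimeError("step budget exhausted: search kept wrapping around")


nodes = ["dialog", "panel", "label"]
print(find_next_focusable(nodes, 1, lambda node: False))
```

With `stop_on_wrap=False` and nothing focusable, the same call keeps circling until the budget runs out, which is the toy equivalent of the hang.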

Assignee: nobody → emilio
Status: NEW → ASSIGNED

Ok sorry for the lag again, finally got back home. This patch should fix it, though note the commit message...

Flags: needinfo?(emilio)
Pushed by ealvarez@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1e1496ab6597
We may wrap around when looking into what to focus if the start content is in shadow DOM. r=smaug
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 74.0

I can confirm that this bug is fixed on the Mac. I tested with today's comm-central nightly on macOS 10.15.2.

Just tested the same version as Steven on macOS 10.14.6 and confirm that the bug is fixed in the Mac version.
Thanks to all of you who have invested time and effort to find a solution.

Flags: needinfo?(m_kato)
Regressions: 1614224