hitting assertion "selection count is wrong" in nsMsgDBView::GetIndicesForSelection
Categories
(MailNews Core :: Backend, defect)
Tracking
(thunderbird_esr78 wontfix)
Tracking | Status | |
---|---|---|
thunderbird_esr78 | --- | wontfix |
People
(Reporter: mkmelin, Assigned: benc)
Details
Attachments
(1 file)
Looking at a Z1 orange, like https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=277564114&repo=comm-central&lineNumber=26977
We're hitting this assertion: https://searchfox.org/comm-central/rev/f40fe7e4be1d5dc5bd90c831a0bf87d354f3e74b/mailnews/base/src/nsMsgDBView.cpp#2335
Regressed by bug 1562158? Or related to bug 1158471?
Assignee | ||
Comment 1•4 years ago
•
|
||
Bug 1562158 would have been my first suspect too.
But looking through the code, I can't see anything obviously borked there.
I can't get that line to assert when I run the composition/test-send-button.js
mozmill test locally (see below for the asserts I do get)
The assert occurs when the number of selected items we get from the selection ranges of a nsITreeSelection differs from the nsITreeSelection
s count
attribute.
This could happen if:
- the nsITreeSelection has elements selected which are out of the view's range (ie non-existent elements)
- the nsITreeSelection is returning overlapping ranges.
My money is on 1. The nsTreeSelection implementation code aims to cope with merging overlapping ranges and it looks solid. I'm sure it's been well exercised over the years. I think overlapping ranges would have been noticed before. The implementation has a couple of oddities, but overall looks good.
I think things are being selected outside the view. Maybe something's being deleted and the selection hasn't been updated accordingly...
Anyway, I've just added a bunch of asserts for a try run to catch both 1 and 2 above:
https://hg.mozilla.org/try-comm-central/rev/9f5877ad6f66e63d35894d199e56b9a60059d65f
The asserts I do get on composition/test-send-button.js
:
[36909, Main Thread] ###!!! ASSERTION: Table inline-size is less than the sum of its columns' min inline-sizes: '!(aISizeType == BTLS_FINAL_ISIZE && aISize < guess_min)', file /fast/ben/tb/mozilla/layout/tables/BasicTableLayoutStrategy.cpp, line 760
[36909, Main Thread] ###!!! ASSERTION: didn't subtract all that we added: '(space == 0 || space == nscoord_MAX) && ((l2t == FLEX_PCT_LARGE) ? (-0.001f < basis.f && basis.f < 0.001f) : (basis.c == 0 || basis.c == nscoord_MAX))', file /fast/ben/tb/mozilla/layout/tables/BasicTableLayoutStrategy.cpp, line 987
Could conceivably be masking things. I haven't dug into these in detail, but from experience I've noticed treeview widgets tend to use table layouts... so it could be the same bug manifesting differently.
Comment 2•4 years ago
•
|
||
(In reply to Ben Campbell from comment #1)
Bug 1562158 would have been my first suspect too.
But looking through the code, I can't see anything obviously borked there.I can't get that line to assert when I run the
composition/test-send-button.js
mozmill test locally (see below for the asserts I do get)The assert occurs when the number of selected items we get from the selection ranges of a nsITreeSelection differs from the
nsITreeSelection
scount
attribute.This could happen if:
- the nsITreeSelection has elements selected which are out of the view's range (ie non-existent elements)
- the nsITreeSelection is returning overlapping ranges.
My money is on 1. The nsTreeSelection implementation code aims to cope with merging overlapping ranges and it looks solid. I'm sure it's been well exercised over the years. I think overlapping ranges would have been noticed before. The implementation has a couple of oddities, but overall looks good.
I think things are being selected outside the view. Maybe something's being deleted and the selection hasn't been updated accordingly...
Anyway, I've just added a bunch of asserts for a try run to catch both 1 and 2 above:
https://hg.mozilla.org/try-comm-central/rev/b65b47c9a47990e275f80d1bd2f80c3515c1e57dThe asserts I do get on
composition/test-send-button.js
:[36909, Main Thread] ###!!! ASSERTION: Table inline-size is less than the sum of its columns' min inline-sizes: '!(aISizeType == BTLS_FINAL_ISIZE && aISize < guess_min)', file /fast/ben/tb/mozilla/layout/tables/BasicTableLayoutStrategy.cpp, line 760 [36909, Main Thread] ###!!! ASSERTION: didn't subtract all that we added: '(space == 0 || space == nscoord_MAX) && ((l2t == FLEX_PCT_LARGE) ? (-0.001f < basis.f && basis.f < 0.001f) : (basis.c == 0 || basis.c == nscoord_MAX))', file /fast/ben/tb/mozilla/layout/tables/BasicTableLayoutStrategy.cpp, line 987
Could conceivably be masking things. I haven't dug into these in detail, but from experience I've noticed treeview widgets tend to use table layouts... so it could be the same bug manifesting differently.
My observation in bug 1158471 comment 6
is related to case (1) above.
Basically, I observed that
- selection count is misused a little bit.
It stands for different numbers in different places.
In some places, it seems it stands (?) for the selected lines as in selection of message headers in the header pane. If you select, say, 30 headers, then this should be 30.
However, in some places, this number seems to be used for
the lines that are currently *visible* in, say, header pane and selected.
So even if we may have selected 30 header lines as a whole, suppose the header pane shows only
10 headers, only 10 selected lines can be shown at maximum,
and if we scroll the headers, fewer selected lines are visible depending on the position of selected lines and where we look at the headers through the header pane.
No wonder assertion failures occur (!)
Good to know that someone has finally paid attention to this Assertion.
The last observation was done manually: you select some header lines in message list.
Shrink the message list window, or scroll back and forth.
Voila. You get assertions with DEBUG BUILD.
Assignee | ||
Comment 3•4 years ago
•
|
||
(In reply to ISHIKAWA, Chiaki from comment #2)
- selection count is misused a little bit. It stands for different numbers in different places. In some places, it seems it stands (?) for the selected lines as in selection of message headers in the header pane. If you select, say, 30 headers, then this should be 30. However, in some places, this number seems to be used for the lines that are currently *visible* in, say, header pane and selected.
Hmm. I don't think I've seen this case. In all the cases I've seen, nsMSgDBView
is a list of all the messages, not just the ones onscreen (it has to be all because it reorders them according to the current primary/secondary sorting criteria).
So even if we may have selected 30 header lines as a whole, suppose the header pane shows only 10 headers, only 10 selected lines can be shown at maximum, and if we scroll the headers, fewer selected lines are visible depending on the position of selected lines and where we look at the headers through the header pane.
The last observation was done manually: you select some header lines in message list.
Shrink the message list window, or scroll back and forth.
I haven't been able to replicate this, at least not on the main Thunderbird window (I picked an Inbox, then highlighted a bunch of messages and resized the window). How should I go about reproducing this one?
Assignee | ||
Comment 4•4 years ago
|
||
Yay - my test asserts from comment 1 seem to have triggered on a few tests!
Assertion failure: startRange<viewSize, at /builds/worker/workspace/build/src/comm/mailnews/base/src/nsMsgDBView.cpp:2330
They are all startRange<viewSize
, which means the treeSelection has got some selection(s) which are outside the range of nsMsgDBView
. Which I'd guess implies something hasn't updated the treeSelection to keep it in sync with the nsMsgDBView...
Hopefully I'll be able to get the assert triggering locally... (otherwise: printf debugging via try builds.... shudder...)
Updated•4 years ago
|
Comment 5•3 years ago
|
||
(In reply to Ben Campbell from comment #4)
Yay - my test asserts from comment 1 seem to have triggered on a few tests!
...
Hopefully I'll be able to get the assert triggering locally... (otherwise: printf debugging via try builds.... shudder...)
any luck?
Assignee | ||
Comment 6•3 years ago
|
||
OK - finally making some progress on this. No fix yet, but notes so far...
Easy way to replicate:
- set up a local folder with a single message in it.
- click on that message and delete it.
- Viola.
The main issue is that FolderDisplayWidget.selectedMessages
(https://searchfox.org/comm-central/source/mail/base/content/folderDisplay.js#2248) is being accessed after the message has been deleted from the nsMsgDBView, but before the view's tree selection has been updated (via OnMessagesRemoved().)
This is getting a little to much into GUI land for my comfort here ;-)
There might also be an issue with nsMsgDBView::GetMsgToSelectAfterDelete(). It's called before the deletion, then the value returned is used in OnMessagesRemoved to set the selection afterwards. But in this case, of deleting the last message it returns 0, rather than nsMsgViewIndex_None (-1). I think the rest of the code would be tolerant of this out-of-bounds selection (0 is no longer a valid message after the delete), but it still seems like it should be fixed.
Comment 7•3 years ago
|
||
(In reply to Ben Campbell from comment #6)
OK - finally making some progress on this. No fix yet, but notes so far...
Easy way to replicate:
- set up a local folder with a single message in it.
- click on that message and delete it.
- Viola.
Great simple repeatable case (!)
I have had a bad days during April and early May due to cold-like symptoms (not Covid-19) and am trying to catch up on some remaining patches, but once I take care of them I would like to look into this. This has bothered me for quite a long time when I analyze the local test log file.
Assignee | ||
Comment 8•3 years ago
|
||
(In reply to ISHIKAWA, Chiaki from comment #7)
I have had a bad days during April and early May due to cold-like symptoms
Hope you're feeling much better now!
No fix yet, but I know what's happening - when you delete a message, it causes the various nsIFolderListener notifications to fire in order to update the message counts.
The GUI makes heavy use of these, and iterates over the various commands to see if they should be enabled/disabled. Most of these try to count the selected messages, which ends up in nsMsgDBView::GetIndicesForSelection(). But because the selection hasn't yet been updated to reflect the now-deleted message, the selection is invalid and that assert triggers.
I'm certain the actual fix will be something really simple, but I want to understand what's going on and that means picking through the copying code, which is rather... involved. So many listeners and notifications!
I'm most of the way there now, and hoping to sort out a fix tomorrow.
Assignee | ||
Comment 9•3 years ago
|
||
Assignee | ||
Comment 10•3 years ago
|
||
Well, I've got a WIP fix for it. There was already a mechanism for suppressing nsIFolderListener count notifications (totalmessages and unread count, via OnIntPropertyChanged()). But the notifications were being switched back on too soon. Handlers in the GUI for these notifications caused the problem, by accessing the view selection before it had been updated to account for the now-missing messages.
Obviously, there are other bits of code that depend on those notifications firing at the wrong time. In particular:
./mach xpcshell-test comm/mailnews/local/test/unit/test_pop3MultiCopy2.js
... now fails if this patch is applied. Sigh.
Anyway it doesn't look tooooo serious, so I'm confident of revising my patch.
There'll likely be a bunch of follow-up patches also, to smooth out a few of the rough edges I've encountered along the way so far.
The whole copy/move/delete mechanism is mindbendingly complex and brittle, spread out over large numbers of objects, jumping back and forth from C++ to javascript, and async using at least 5 or 6 separate listener systems. Just complexity creeping in over time.
Definitely some massive simplification is in order. I'll be trying to come up with some steps toward that goal.
Updated•3 years ago
|
Reporter | ||
Updated•3 years ago
|
Comment 11•3 years ago
|
||
Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/216ae92d0317
Tweak folder notification during deletion to keep view selection in sync. r=mkmelin
Description
•