Possible use-after-free with C-C TB (a new twist unlike bug 1298103)
Categories
(MailNews Core :: Backend, defect)
Tracking
(Not tracked)
People
(Reporter: ishikawa, Unassigned)
References
(Depends on 1 open bug)
Details
(Keywords: regression, regressionwindow-wanted, Whiteboard: [regression:TB??])
Attachments
(1 file)
7.00 KB,
text/plain
|
Details |
Reporter | ||
Comment 1•8 years ago
|
||
Updated•8 years ago
|
Updated•7 years ago
|
Reporter | ||
Comment 2•7 years ago
|
||
Reporter | ||
Comment 3•7 years ago
|
||
Reporter | ||
Comment 5•5 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #4)
Chiaki, does this issue still exist?
I will check this again in a few days.
|make mozmill| is gone and I had to modify all my local test scripts accordingly.
PLUS, the clumsy handling of corona virus infection by government here in Japan has really diverted my attention to this literally life and death issue for the last several weeks.
Finally, I have begun telecommuting most of the week since last week,
and thus have more time (less commute time) to use for TB debuggin/development.
I will report my finding hopefully later this week.
Reporter | ||
Updated•5 years ago
|
Reporter | ||
Comment 6•5 years ago
|
||
I have not been able to run |mach mochitest|, the replacement of |make mozmill| test suite under varlgrind due to a strange issue.
Bug 1629433: Running thunderbird binary under valgrind while trying to run mochitest requires way TOO MANY threads (> 1500) !?
https://bugzilla.mozilla.org/show_bug.cgi?id=1629433
Once that is fixed, and I can run TB under valgrind during |mach mochitest| I can say for sure if the problem still persists or not.
Reporter | ||
Comment 7•5 years ago
|
||
I am still stuck.:
bug 1630345
./mach bootstrap fails with python-pip dependency issue: python-pip : Depends: python-pip-whl (= 18.1-5) but 20.0.2-4 is to be installed
bug 1629433
Running thunderbird binary under valgrind while trying to run mochitest requires way TOO MANY threads (> 1500) !?
(Well, actually, I can increase the # of threads to 5,000 and the test proceeded. But once a stack trace is printed, and external stack address decoder is invoked, and that external stack address decoder needs to be installed by |./mach bootstrap| which unfortunately fails.
.
Comment 8•5 years ago
|
||
(In reply to ISHIKAWA, Chiaki from comment #7)
I am still stuck.:
bug 1630345
With a priority of P3, that bug doesn't have a great chance of being fixed in coming months, unless something moves the priority needle.
Reporter | ||
Comment 9•5 years ago
|
||
I got stuck with another bug: bug 1633092
TB build failure: GLSL optimizer output causes a compiler error (GCC-9) error: comparison of integer expressions of different signedness: ‘long int’ and ‘size_t’ {aka ‘long unsigned int’} [-Werror=sign-compare]
Hmm...
Reporter | ||
Updated•5 years ago
|
Reporter | ||
Comment 10•5 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #4)
Chiaki, does this issue still exist?
Wayne, I don't see it any more. (I have figured out to work around the build bugs for now.)
Bug 1578951 might have fixed the underlying cause.
So we can close it for now.
Comment 11•5 years ago
|
||
-> WFM then
Reporter | ||
Comment 12•5 years ago
|
||
Sorry, I was wrong.
The mochitest framework is not quite valgrind-friendly.
I realize that I may not see the bug BECAUSE the test times out prematurely due to the global mochitest timeout value, which I need to enlarge somehow.
Bug 1636945 - make timeout of mochitest much longer for running mochitest under valgrind successfully.
So until I can cover the whole of mochitest under valgrind, this bug needs to be reopened.
Comment 13•2 years ago
|
||
(In reply to ISHIKAWA, Chiaki from comment #12)
...
So until I can cover the whole of mochitest under valgrind, this bug needs to be reopened.
How is that going? What, if anything, from the Thunderbird team would help your work moved forward?
Reporter | ||
Comment 14•2 years ago
|
||
The problem was reported by mozmill.
But we now have switched to mochitest framework.
From my old logs, I found that the errors were reported by
two tests.
mail/test/mozmill/content-tabs/test-about-support.js
mail/test/mozmill/content-tabs/test-plugin-crashing.js
But they are no longer in mochitest.
I wonder if someone can recreate the tests for mochitest framework, or if the equivalent of ech already exists under a different name.
Running mochitest under valgrind takes more than 24 hours including the failed timeout tests.
And thus the past test did NOT cover all the tests due to timeout (setting proper timeout value for each test is rather difficult. I have been trying to see if I can set reasonable timeout, but it is on the back burner.)
Oh yes, there will be a very useful feature if we can set
timeout for each mochitest test as in xpcshell.ini for xpcshell-test.
Actually, something better than xpcshell.ini.: xpcshell.ini is for general processing (not under valgrind).
We specifically want to state that valgrind timeout for each test.
I also find xpcshell.ini gets edited quite often and my effort to put in proper xpcshell timout locally for valgrind is
really difficult to sync with the latest changes.
Thus I am not so sure if I am setting proper timeout value for valgrind execution of xpcshell-test now.
(But I have more other bugs/paches on hand right now.)
So if somone can re-create the equivlalent tests above in mochitest framework, I can try
running them individually.
I searched for PR_SetEnv that appears in the valgrind stack in my local logs.
The test name that caused valgrind is inserted.
log842-mozmill.txt.gz
TEST-START | /NREF-COMM-CENTRAL/comm-central/ | test_display_about_supportmail/test/mozmill/content-tabs/test-about-support.js
...
==28905== by 0x40B5DB8: PR_SetEnv (prenv.c:110)
fun:PR_SetEnv
log973-mozmill-memcheck.txt
TEST-START | /NREF-COMM-CENTRAL/comm-central/mail/test/mozmill/content-tabs/test-plugin-crashing.js | setupTest
...
==25564== by 0x5B2A2D1: __add_to_environ (setenv.c:143)
==25564== by 0x40B2998: PR_SetEnv (prenv.c:110)
fun:PR_SetEnv
log974-mozmill-memcheck-opt.txt
TEST-START | /NREF-COMM-CENTRAL/comm-central/mail/test/mozmill/content-tabs/test-plugin-crashing.js | setupTest
==4154== by 0x5B2A2D1: __add_to_environ (setenv.c:143)
==4154== by 0x40B2998: PR_SetEnv (prenv.c:110)
fun:PR_SetEnv
This one is just FYI:
log637-mozmill-memcheck.txt.gz
--11108-- used_suppression: 9 PR_SetEnv requires its argument to be leaked, but does not appear on stacks. (See bug 793534 and see bug 793549.) /home/ishikawa/TB-NEW/TB-3HG/new-src/mozilla/build/valgrind/cross-architecture.sup:8 suppressed: 160 bytes in 9 blocks
TIA
Comment 15•2 years ago
|
||
Tests were all converted over, changing the naming convetion
mail/test/mozmill/content-tabs/test-about-support.js -> browser_aboutSupport.js
mail/test/mozmill/content-tabs/test-plugin-crashing.js (deleted since plugins do not work at all anymore?)
Reporter | ||
Comment 16•2 years ago
|
||
(In reply to Magnus Melin [:mkmelin] from comment #15)
Tests were all converted over, changing the naming convetion
mail/test/mozmill/content-tabs/test-about-support.js -> browser_aboutSupport.js
mail/test/mozmill/content-tabs/test-plugin-crashing.js (deleted since plugins do not work at all anymore?)
Thank you, Magnus!
I will check running browser_aboutSupport.js test under valgrind and see how far I can debug.
Reporter | ||
Comment 17•2 years ago
|
||
Short story:
It seems that this particular problem is no longer reported by valgrind. Good.
Long story:
What is not so good is this.
I could run valgrind + TB on August 15.
Something has changed since then.
I think the binary tool chain (probably rust compiler?) has changed.
Suddenly today when I tried running the particular test in the mochitest test suite, during the execution of mochitest with "valgrind + TB" combination terminated because valgrind could not read certain dwarf2 data in libxul.a.
It did not happen on August 15 with then compiled binaries.
There have been problems of reading dwaref2 data by valgrind.
https://bugzilla.mozilla.org/buglist.cgi?quicksearch=dwarf&list_id=16216759
One of them is reported by me (!) : Bug 1534109
But luckily, for the last several months, valgrind did not crash.
Well, for the last several months I have been compiling C-C TB without -gsplit-dwarf, but I recompiled the whole binary
WITH -gpslit-dwarf, and the fatal crash of valgrind disappeared.
(There are still cases where valgrind could not read some debug info and reported as such.
But it does not get killed due to fatal error any more.)
In the log, only the previously reported valgrind warnings are visible now.
So, this is problem is no longer visible, but I have a bitter taste in my mouth.
I wonder if valgrind is working perfectly these days.
We need more people testing TB under valgrind.
My previous comment. : Maybe I should submit a bugzilla as Request for Enhancment.
Oh yes, there will be a very useful feature if we can set
timeout for each mochitest test as in xpcshell.ini for xpcshell-test.
Actually, something better than xpcshell.ini.: xpcshell.ini is for general processing (not under valgrind).
We specifically want to state that valgrind timeout for each test.
I also find xpcshell.ini gets edited quite often and my effort to put in proper xpcshell timout locally for valgrind is
really difficult to sync with the latest changes.
Thus I am not so sure if I am setting proper timeout value for valgrind execution of xpcshell-test now.
(But I have more other bugs/paches on hand right now.)
Description
•