Closed
Bug 1236830
Opened 8 years ago
Closed 6 years ago
[emulator-x86-kk][mochitest] Run valgrind on emulator tests
Categories
(Firefox OS Graveyard :: Emulator, defect)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: cyu, Unassigned)
Attachments (9 files)
- 7.91 KB, text/plain
- 981.73 KB, text/plain
- 25.06 KB, text/plain
- 7.21 KB, patch
- 155.27 KB, application/x-bzip
- 3.55 MB, application/octet-stream
- 696 bytes, patch
- 890 bytes, patch
- 442 bytes, patch
I got this idea from bug 1234981, a crash in jemalloc where the arena magic contains an unexpected value. One possible cause is memory corruption. We may consider enabling valgrind on b2g tests to detect memory bugs. The target would be local emulator x86 runs, which can run at near-native speed with kvm support, making valgrind's slowdown more acceptable.
Comment 1•8 years ago (Reporter)
I got this crash in the linker. It seems to fail in calling soinfo_alloc().

ADB Location: adb
remount succeeded
Compressing libxul.so...
==1749== Memcheck, a memory error detector
==1749== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==1749== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==1749== Command: /data/valgrind-b2g/b2g
==1749==
WARNING: linker: vgpreload_memcheck-x86-linux.so has text relocations. This is wasting memory and is a security risk. Please fix.
==1749== Invalid read of size 4
==1749==    at 0x40049A7: add_vdso (bionic/linker/linker.cpp:1596)
==1749==    by 0x40049A7: __dl__ZL29__linker_init_post_relocationR19KernelArgumentBlockj (bionic/linker/linker.cpp:1733)
==1749==    by 0x40078C7: __dl___linker_init (bionic/linker/linker.cpp:1855)
==1749==    by 0x4008703: __dl__start (bionic/linker/arch/x86/begin.c:38)
==1749==  Address 0x1c is not stack'd, malloc'd or (recently) free'd
==1749==
==1749==
==1749== Process terminating with default action of signal 11 (SIGSEGV)
==1749==  Access not within mapped region at address 0x1C
==1749==    at 0x40049A7: add_vdso (bionic/linker/linker.cpp:1596)
==1749==    by 0x40049A7: __dl__ZL29__linker_init_post_relocationR19KernelArgumentBlockj (bionic/linker/linker.cpp:1733)
==1749==    by 0x40078C7: __dl___linker_init (bionic/linker/linker.cpp:1855)
==1749==    by 0x4008703: __dl__start (bionic/linker/arch/x86/begin.c:38)
==1749==  If you believe this happened as a result of a stack
==1749==  overflow in your program's main thread (unlikely but
==1749==  possible), you can try to increase the size of the
==1749==  main thread stack using the --main-stacksize= flag.
==1749==  The main thread stack size used in this run was 8388608.
==1749==
==1749== HEAP SUMMARY:
==1749==     in use at exit: 0 bytes in 0 blocks
==1749==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==1749==
==1749== All heap blocks were freed -- no leaks are possible
==1749==
==1749== For counts of detected and suppressed errors, rerun with: -v
==1749== ERROR SUMMARY: 2 errors from 1 contexts (suppressed: 0 from 0)
Comment 2•8 years ago
(In reply to Cervantes Yu [:cyu] [:cervantes] from comment #1)
> I got this crash in the linker. It seems to fail in calling soinfo_alloc().
Cervantes, can you re-run with -v added to the flags for Valgrind and post the results as an attachment?
Comment 3•8 years ago (Reporter)
Comment 4•8 years ago (Reporter)
Comment 5•8 years ago (Reporter)
Comment 6•8 years ago (Reporter)
I cross-checked the linker on KK against the one on L and found a bug in add_vdso(): it doesn't check ehdr_vdso for null. Adding the check, as L does, fixes this bug.
Comment 7•8 years ago
(In reply to Cervantes Yu [:cyu] [:cervantes] from comment #6)
> I cross-checked linker on KK with the one on L and found that add_vdso() has
> a bug that it doesn't perform nullity check against ehdr_vdso. Adding a
> check as L does fixes this bug.
So .. am I correct to understand that -- at least for the crash -- there is nothing that needs to be changed in Valgrind?

I should point out one other thing, though. In the log from comment 5 I see a lot of false errors in calls to __dl_strlen. You might be able to get rid of these by adding

#  if defined(VGPV_x86_linux_android)
   add_hardwired_spec(
      "NONE", "__dl_strlen",
      (Addr)&VG_(x86_linux_REDIR_FOR_strlen),
      NULL
   );
#  endif

in the section guarded by "# if defined(VGP_x86_linux)" in VG_(redir_initialise) in m_redir.c.
Comment 8•8 years ago (Reporter)
(In reply to Julian Seward [:jseward] from comment #7)
> (In reply to Cervantes Yu [:cyu] [:cervantes] from comment #6)
> > I cross-checked linker on KK with the one on L and found that add_vdso() has
> > a bug that it doesn't perform nullity check against ehdr_vdso. Adding a
> > check as L does fixes this bug.
>
> So .. am I correct to understand that -- at least for the crash --
> there is nothing that needs to be changed in Valgrind?
Yes. It can be fixed in the linker. Nothing needs to be done in valgrind.
Comment 9•8 years ago (Reporter)
I got b2g on emulator-x86-kk to boot to the homescreen with the following changes:
- Apply the changes to gonk-misc/default-gecko-config and .userconfig described in https://developer.mozilla.org/en-US/docs/Mozilla/Firefox_OS/Debugging/Debugging_B2G_using_valgrind
- Increase the system and user data partition image sizes
- Increase the qemu memory size
- Rebuild the goldfish kernel with CONFIG_HIGHMEM and CONFIG_HIGHMEM4G so the kernel can use more than 895 MB of memory
- Build valgrind from source and adb push it to the system partition after the emulator starts (it's supposed to be built when B2G_VALGRIND=1 is set, but emulator-x86-kk doesn't do that, so I built it manually)
- Apply 2 fixes to the bionic linker: one for the startup crash and the other for a SIGFPE that deadlocks Nuwa
Then b2g starts with run-valgrind.sh. The next step is integration with mochitest and other tests.
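As a quick sanity check on the HIGHMEM item above, the memory the guest kernel actually sees can be read from /proc/meminfo. This is only a minimal sketch, assuming the emulator is reachable over adb; the helper name mem_total_mb and the file name meminfo.txt are hypothetical and not part of the B2G tree.

```shell
#!/bin/sh
# Hypothetical helper (not part of B2G): print MemTotal in whole MB from a
# file in /proc/meminfo format, where the value is reported in kB.
mem_total_mb() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' "$1"
}

# Example use against a running emulator (assumes adb is on PATH):
#   adb shell cat /proc/meminfo > meminfo.txt
#   mem_total_mb meminfo.txt
```

If the reported value stays below roughly 895 even after qemu has been given more memory, that would suggest the kernel was built without the HIGHMEM options.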
Comment 10•8 years ago (Reporter)
A quick hack for running mochitest with valgrind on emulator-x86-kk. This patch modifies the command line used by "./mach mochitest" so that b2g is started under valgrind. I can now run a single folder (tested with dom/ipc/tests/). Running a chunk passes mochitest-plain but gets stuck in mochitest-plain with the webgl subsuite.
Comment 11•8 years ago (Reporter)
Emulator and adb logcat logs for running "./mach mochitest -f plain --total-chunks 9 --this-chunk 2"
Comment 12•8 years ago (Reporter)
The kernel source was downloaded with
> git clone https://github.com/mozilla-b2g/kernel_goldfish.git
> git checkout remotes/origin/b2g-goldfish-3.4.67
and built with the following configs enabled:
> CONFIG_HIGHMEM=y
> CONFIG_HIGHMEM4G=y
so that we can give qemu more memory for running valgrind.
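Before rebuilding, it may help to confirm that both options really ended up in the kernel configuration. A minimal sketch, assuming the configuration lives in a standard .config file in the kernel source directory; the helper name kernel_config_has is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every named option is set to "y"
# in the given kernel config file. Lines are matched exactly, so
# CONFIG_HIGHMEM4G=y does not count as CONFIG_HIGHMEM=y.
kernel_config_has() {
    config_file=$1
    shift
    for opt in "$@"; do
        grep -qx "${opt}=y" "$config_file" || return 1
    done
    return 0
}

# Example use from the kernel source directory:
#   kernel_config_has .config CONFIG_HIGHMEM CONFIG_HIGHMEM4G && echo "highmem enabled"
```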
Comment 13•8 years ago (Reporter)
This patches $B2G_HOME/build/target/board/generic_x86/BoardConfig.mk to create larger images that can hold the valgrind and b2g binaries with symbols.
Comment 14•8 years ago (Reporter)
This patches $B2G_HOME/bionic/linker/linker.cpp to fix:
1. The valgrind startup crash.
2. The Nuwa deadlock in SIGFPE when calling dlsym().
Comment 15•8 years ago (Reporter)
To run emulator-x86-kk with valgrind:
1. Download the B2G source from git://github.com/mozilla-b2g/B2G.git to $B2G_HOME.
2. Run under $B2G_HOME: ./config.sh emulator-x86-kk
3. Apply attachment 8712624 under $B2G_HOME/build.
4. Apply attachment 8712628 under $B2G_HOME/bionic.
5. Download attachment 8712620 and overwrite $B2G_HOME/prebuilts/qemu-kernel/x86/kernel-qemu with it.
6. Build valgrind for android x86 as described in http://valgrind.org/docs/manual/dist.readme-android.html
7. Add the following to $B2G_HOME/.userconfig, as described in https://developer.mozilla.org/en-US/docs/Mozilla/Firefox_OS/Debugging/Debugging_B2G_using_valgrind
> export B2G_VALGRIND=1
> export DISABLE_JEMALLOC=1
and add the following to $B2G_HOME/gonk-misc/default-gecko-config:
> ac_add_options --enable-optimize="-g -O2"
> ac_add_options --enable-valgrind
> ac_add_options --disable-jemalloc
> ac_add_options --disable-sandbox
8. Build the emulator by running ./build.sh under $B2G_HOME. If the build fails because -lX11 or -lGL cannot be found (as on Ubuntu 14.04), creating symbolic links as follows works around the error:
> cd /usr/lib
> sudo ln -s ./i386-linux-gnu/libX11.so.6 libX11.so
> sudo ln -s ./i386-linux-gnu/mesa/libGL.so.1 libGL.so.1
9. After the emulator is successfully built, start it by running
> ./run-emulator.sh
under $B2G_HOME.
10. Push the valgrind binaries to the system partition:
> adb remount
Run under the valgrind source directory:
> adb push Inst /
11. b2g should then be able to start with the script:
> ./run-valgrind.sh
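Because step 7 above appends lines to existing config files, re-running the setup can leave duplicate entries. A minimal sketch of an idempotent append, assuming B2G_HOME points at the checkout; the helper name ensure_line is hypothetical and not something the B2G scripts provide.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact line is
# not already present, so the setup can be re-run safely. A missing file
# is created by the first append.
ensure_line() {
    grep -qxF -e "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Example use for step 7 (assumes B2G_HOME is set):
#   ensure_line "$B2G_HOME/.userconfig" 'export B2G_VALGRIND=1'
#   ensure_line "$B2G_HOME/.userconfig" 'export DISABLE_JEMALLOC=1'
#   ensure_line "$B2G_HOME/gonk-misc/default-gecko-config" 'ac_add_options --enable-valgrind'
```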
Comment 16•8 years ago (Reporter)
There are lots of false errors caused by Nuwa's stack tricks, and we need to do for android x86 what was done in bug 1125091. Julian, can we have your help fixing the false errors? Thanks.
Flags: needinfo?(jseward)
Comment 17•8 years ago (Reporter)
Comment 18•8 years ago (Reporter)
(In reply to Cervantes Yu [:cyu] [:cervantes] from comment #15)
> To run emulator-x86-kk with valgrind:
> 1. download B2G source from git://github.com/mozilla-b2g/B2G.git to $B2G_HOME
> 2. run under $B2G_HOME: ./config.sh emulator-x86-kk
> 3. apply attachment 8712624 under $B2G_HOME/build
> 4. apply attachment 8712628 under $B2G_HOME/bionic
> 5. download attachment 8712620 and overwrite
> $B2G_HOME/prebuilts/qemu-kernel/x86/kernel-qemu
> 6. build valgrind for android x86 as in
> http://valgrind.org/docs/manual/dist.readme-android.html
> 7. add the following to $B2G_HOME/.userconfig as in
> https://developer.mozilla.org/en-US/docs/Mozilla/Firefox_OS/Debugging/
> Debugging_B2G_using_valgrind
> > export B2G_VALGRIND=1
> > export DISABLE_JEMALLOC=1
> and add the following to $B2G_HOME/gonk-misc/default-gecko-config:
> > ac_add_options --enable-optimize="-g -O2"
> > ac_add_options --enable-valgrind
> > ac_add_options --disable-jemalloc
> > ac_add_options --disable-sandbox
> 8. build the emulator by running ./build.sh under $B2G_HOME. If there are
> errors that -lX11 or -lGL not found (like on ubuntu 14.04), creating
> symbolic links as the following works the error around
> > cd /usr/lib
> > sudo ln -s ./i386-linux-gnu/libX11.so.6 libX11.so
> > sudo ln -s ./i386-linux-gnu/mesa/libGL.so.1 libGL.so.1
8.1 Apply attachment 8713005 to $B2G_HOME/run-emulator.sh to grant more memory (2GB) to qemu
> 9. After emulator is successfully built, start it by running
> > ./run-emulator.sh
> under $B2G_HOME
> 10. push valgrind binaries to the system partition:
> > adb remount
> Run under valgrind source directory.
> > adb push Inst /
> 11. Then b2g should be able to start with the script:
> > ./run-valgrind.sh
Updated•8 years ago
Flags: needinfo?(jseward)
Comment 19•6 years ago
Firefox OS is not being worked on
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX