Closed
Bug 1185244
Opened 9 years ago
Closed 9 years ago
Improve mach support for running mochitests on Valgrind
Categories
(Testing :: Mochitest, defect)
Tracking
(firefox42 affected, firefox43 affected, firefox44 fixed)
RESOLVED
FIXED
mozilla44
People
(Reporter: jseward, Assigned: jseward)
References
Details
Attachments
(1 file, 7 obsolete files)
35.01 KB,
patch
|
jgraham
:
review+
|
Details | Diff | Splinter Review |
Currently it is possible to use mach to run mochitests on Valgrind using the --debugger and --debugger-args flags. However, piggybacking on the --debugger machinery isn't ideal. Instead, add --valgrind and --valgrind-args flags, and separate support for Valgrinded runs. This faciliates the following: * Better selection of default flags for Valgrind, including selection of correct valgrind-tool-specific flags, and target specific suppression files. * If necessary, adjust mochitest timeout values for the run. * Check whether the build is actually suitable for running on V; ditto whether this is feasible given the platform. * Parsing of the Valgrind output, so as to be able to insert TEST-UNEXPECTED-FAIL messages in the output in the case of Valgrind detected errors, that in turn treeherder will notice. In the style of existing build/valgrind/output_handler.py
Assignee | ||
Comment 1•9 years ago
|
||
WIP patch. Implements the first of the four points in comment #0, excluding tool-specific flags. Does not implement build-suitability checking, timeout adjustment, or output parsing.
Comment 2•9 years ago
|
||
Re: output parsing, I think the right approach is to add a new structured log action for valgrind errors like: {action:"valgrind_error", error_type:string error_details:dict} And then to teach the tbpl, mach and errorsummary formatters about this new action. mozharness would also have to learn that such a message in a structured log should be considered an error (I don't think this is as much work as it sounds). Since actually producing such a message requires parsing the valgrind output which will presumably come in as action=log messages, it's necessary to create a log handler that can take a single message and produce an additional message. This currently isn't too easy because the logger can't be called reentrantly. I can look at making this possible.
Assignee | ||
Updated•9 years ago
|
QA Contact: jseward
Assignee | ||
Updated•9 years ago
|
QA Contact: jseward
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → jseward
Assignee | ||
Comment 3•9 years ago
|
||
Rebased to current m-c. Implements parsing of the output and insertion of lines like TEST-UNEXPECTED-FAIL | valgrind-test | Conditional jump or move depends on uninitialised value(s) at main / ?!? / ?!? / ?!? when Valgrind errors are detected. This is done almost entirely in testing/mozbase/mozlog/mozlog, by subclassing StructuredLogger to StructuredLoggerValgrindWrapper, and using that instead of the vanilla StructuredLogger when setting up logging in testing/mozbase/mozlog/mozlog/commandline.py. But only when --valgrind is given to mochitest. So far there is no addition of a new log action 'valgrind_error' as has been discussed. Am unclear if that is still necessary. It would be easy to do so if required. The parsing code is a marginally modified version of build/valgrind/output_handler.py. In the fullness of time it may be possible to remove that file.
Attachment #8635707 -
Attachment is obsolete: true
Assignee | ||
Comment 4•9 years ago
|
||
New version, that adds testing/mozbase/mozlog/mozlog/handlers/valgrindhandler.py and doesn't do any subclassing. 1. Ignore changes to browser/app/nsBrowserApp.cpp. These are just to generate a valgrind error early on, for quick/reliable testing. 2. I had some difficulty figuring out how to integrate this into commandline.py. In the end I added 'valgrind':True as an option for all formatters if --valgrind was present on the command line. One area of concern is setup_handlers() in commandline.py. The way I have it now, if 'valgrind' is an option then it overrides 'buffer' as an option. Is it possible to chain the handlers, so that they don't interact badly? 3. A second area of concern is finding the top of the mozilla source tree in testing/mozbase/mozdebug/mozdebug/mozdebug.py, so as to be able to get the correct suppression files for valgrind -- see get_default_valgrind_suppression_args(). I did this by getting testing/mochitest/runtests.py runApp() to pass SCRIPT_DIR to get_default_valgrind_args(), but this seems a bit of a kludge (to say the least). Is there a better way?
Attachment #8641731 -
Attachment is obsolete: true
Attachment #8644291 -
Flags: feedback?(james)
Comment 5•9 years ago
|
||
(In reply to Julian Seward [:jseward] from comment #4) > 3. A second area of concern is finding the top of the mozilla source > tree in testing/mozbase/mozdebug/mozdebug/mozdebug.py, so as to be > able to get the correct suppression files for valgrind -- see > get_default_valgrind_suppression_args(). I did this by getting > testing/mochitest/runtests.py runApp() to pass SCRIPT_DIR to > get_default_valgrind_args(), but this seems a bit of a kludge > (to say the least). Is there a better way? mozdebug (and the other mozbase modules) don't have a concept of topsrcdir because they can be used independently of a source checkout. (In fact that's how we run most of our tests, from the test package produced by the build.) You'll probably want to add a --valgrind-suppression-file option, and you can make it have a sensible default when we're running in-tree. You can see an example of an option like that in mochitest_options.py: https://dxr.mozilla.org/mozilla-central/source/testing/mochitest/mochitest_options.py#84
Comment 6•9 years ago
|
||
Comment on attachment 8644291 [details] [diff] [review] bug1185244-6.cset Review of attachment 8644291 [details] [diff] [review]: ----------------------------------------------------------------- I think this is looking good. I suppose in an ideal world there would be a better API for other code to register bits of the logging pipeline that are influenced by command line arguments but since that doesn't exist I think this approach is OK. ::: testing/mochitest/mochitest_options.py @@ +470,5 @@ > {"dest": "debuggerArgs", > "default": None, > "help": "Arguments to pass to the debugger.", > }], > + [["--valgrind"], So I feel like mozdebug should maybe get some command line arguments support like mozlog has, which can be used to add these arguments to an existing argparse ArgumentParser. Then it would be easier to add the command line options to all different harnesses rather than just mochitest. That might be a little out of scope for this bug though. ::: testing/mozbase/mozlog/mozlog/commandline.py @@ +156,5 @@ > formatter = formatter_cls() > handler_wrapper, handler_option = None, "" > for option, value in formatter_options[fmt].iteritems(): > + if option == "valgrind": > + handler_wrapper = valgrind_handler_wrapper It seems like instead of having one wrapper you should create a list of wrappers and then apply them sequentially i.e. for wrapper, options in reversed(handler_wrappers): handler = wrapper(handler, *options) (this might require changing the existing handler so that options can be passed in this way). ::: testing/mozbase/mozlog/mozlog/formatters/tbplformatter.py @@ +150,5 @@ > else: > return " ".join(test_id) > + > + def valgrind_error(self, data): > + rv = "TEST-VALGRIND-ERROR | " + data['primary'] + "\n" Some jobs still use regexp against the string here to detect job failures. Assuming this should fail the job, I think that you want to check that this string will be recognised by those regexp.
Attachment #8644291 -
Flags: feedback?(james) → feedback+
Assignee | ||
Comment 7•9 years ago
|
||
Addresses review comments in Comment 5 and Comment 6. ------------ Removes the kludge of passing SCRIPT_DIR around (per Comment 5). Instead, adds a flag --valgrind-supp-files=/foo/bar.supp,/xyzzy/qqq.supp and encodes the knowledge of "default" suppression files in the default for that flag. Now known only by testing/mochitest/mochitest_options.py, so that testing/mozbase is not polluted by knowledge of the tree layout. ------------ > ::: testing/mozbase/mozlog/mozlog/commandline.py > @@ +156,5 @@ > > formatter = formatter_cls() > > handler_wrapper, handler_option = None, "" > > for option, value in formatter_options[fmt].iteritems(): > > + if option == "valgrind": > > + handler_wrapper = valgrind_handler_wrapper > > It seems like instead of having one wrapper you should create a list > of wrappers and then apply them sequentially i.e. Done. ------------ > Some jobs still use regexp against the string here to detect job > failures. Assuming this should fail the job, I think that you want to > check that this string will be recognised by those regexp. Done.
Attachment #8644291 -
Attachment is obsolete: true
Assignee | ||
Comment 8•9 years ago
|
||
Try results: https://treeherder.mozilla.org/#/jobs?repo=try&revision=e9957481be5b
Attachment #8649900 -
Attachment is obsolete: true
Assignee | ||
Comment 9•9 years ago
|
||
With a very minor improvement to command line arg checking.
Attachment #8651091 -
Attachment is obsolete: true
Assignee | ||
Updated•9 years ago
|
Attachment #8652251 -
Flags: review?(james)
Comment 10•9 years ago
|
||
Comment on attachment 8652251 [details] [diff] [review] bug1185244-11.cset Review of attachment 8652251 [details] [diff] [review]: ----------------------------------------------------------------- Please add output support to MachFormatter. This looks pretty good. I'm a little unhappy about the way that mozlog, mozdebug and the harness all have to know about each other, but I think we can save fixing that for future work. Do you have try runs with test failures to demonstrate that it works as expected. ::: build/automation.py.in @@ +529,5 @@ > xrePath = None, certPath = None, > debuggerInfo = None, symbolsPath = None, > timeout = -1, maxTime = None, onLaunch = None, > + detectShutdownLeaks = False, screenshotOnFail=False, testPath=None, bisectChunk=None, > + valgrindPath=None, valgrindArgs=None, valgrindSuppFiles=None): Do these extra arguments get used? ::: layout/tools/reftest/remotereftest.py @@ +441,5 @@ > > def runApp(self, profile, binary, cmdargs, env, > timeout=None, debuggerInfo=None, > + symbolsPath=None, options=None, > + valgrindPath=None, valgrindArgs=None, valgrindSuppFiles=None): Are these used? ::: layout/tools/reftest/runreftest.py @@ +574,5 @@ > > def runApp(self, profile, binary, cmdargs, env, > timeout=None, debuggerInfo=None, > + symbolsPath=None, options=None, > + valgrindPath=None, valgrindArgs=None, valgrindSuppFiles=None): Are these used? ::: layout/tools/reftest/runreftestb2g.py @@ +503,5 @@ > > def runApp(self, profile, binary, cmdargs, env, > timeout=None, debuggerInfo=None, > + symbolsPath=None, options=None, > + valgrindPath=None, valgrindArgs=None, valgrindSuppFiles=None): Are these used? ::: testing/mochitest/mochitest_options.py @@ +34,5 @@ > > +def get_default_valgrind_suppression_files(): > + # We are trying to locate files in the source tree. So if we > + # don't know where the source tree is, we must give up. > + if build_obj is None or build_obj.topsrcdir is None: Should these files be put in the test packages so that they are available to automation? @@ +46,5 @@ > + supp_for_plat = supps_path + "/x86_64-redhat-linux-gnu.sup" > + elif mozinfo.processor == "x86": > + supp_for_plat = supps_path + "/i386-redhat-linux-gnu.sup" > + > + supp_cross_arch = supps_path + "/cross-architecture.sup" It looks like this option will always be used? @@ +499,5 @@ > {"dest": "debuggerArgs", > "default": None, > "help": "Arguments to pass to the debugger.", > }], > + [["--valgrind"], I find it pretty weird that there is code in mozlog to interpret this command line argument, but it's actually defined here. That feels like an incorrect factoring. You don't necessarily have to fix that for this round, but I'm pretty sure we will want to do so soon in order to use valgrind with other test types. ::: testing/mochitest/runtests.py @@ +1715,5 @@ > timeout = self.DEFAULT_TIMEOUT > > + # Note in the log if running on Valgrind > + if valgrindPath: > + self.log.info("runtests.py | Running on Valgrind. " ultra nit: "on" seems like the wrong preposition here. "under" or "in" sound better to me. ::: testing/mozbase/mozdebug/mozdebug/mozdebug.py @@ +153,5 @@ > return None > > return None > + > +# Defines default values for Valgrind flags. I am totally the wrong person to review this part for correctness. ::: testing/mozbase/mozlog/mozlog/commandline.py @@ +159,2 @@ > for option, value in formatter_options[fmt].iteritems(): > + a_wrapper, a_option = None, "" Let's call this wrapper and wrapper_args and make wrapper_args a tuple so that later we call wrapper(handler, **wrapper_args) @@ +171,3 @@ > for value in streams: > handler = handlers.StreamHandler(stream=value, formatter=formatter) > + for a_wrapper, a_option in handler_wrappers_and_options: Again change the names. ::: testing/mozbase/mozlog/mozlog/formatters/tbplformatter.py @@ +151,5 @@ > return " ".join(test_id) > + > + def valgrind_error(self, data): > + rv = "TEST-VALGRIND-ERROR | " + data['primary'] + "\n" > + for str in data['secondary']: Call this "line" rather than "str" (which is a builtin) ::: testing/mozbase/mozlog/mozlog/handlers/valgrindhandler.py @@ +7,5 @@ > +import re > + > + > +class ValgrindHandler(BaseHandler): > + No space here @@ +19,5 @@ > + if tmp is not None: > + self.inner(tmp) > + > + > +class ValgrindFilter(object): Having two different objects here seems rather excessive, it would at least be slightly faster to remove one. But I'm not set on it if you prefer this way. @@ +55,5 @@ > + def __init__(self): > + # The regexps in this list match all of Valgrind's errors. Note that > + # Valgrind is English-only, so we don't have to worry about > + # localization. > + self.re_error = \ Seems like it should be helpful to compile these regexps (here and the two below)? I assume they are just copied from the old code, so I haven't read them closely. @@ +86,5 @@ > + if msg['action'] != 'log': > + return msg > + > + line = msg['message'] > + to_return = None I guess I would prefer a better name here like "output_msg" or similar. ::: testing/mozharness/mozharness/mozilla/testing/errors.py @@ +78,5 @@ > 'full_regex': re.compile(r"(?:TEST-UNEXPECTED-FAIL|PROCESS-CRASH) \| .* \| (application crashed|missing output line for total leaks!|negative leaks caught!|\d+ bytes leaked)"), > 'minimum_regex': re.compile(r'''(TEST-UNEXPECTED|PROCESS-CRASH)'''), > 'retry_regex': re.compile(r'''FAIL-SHOULD-RETRY''') > }, > + "valgrind_error": { I think this is OK, but maybe get emorley to cross-check.
Attachment #8652251 -
Flags: review?(james)
Assignee | ||
Comment 11•9 years ago
|
||
(In reply to James Graham [:jgraham] from comment #10) > Please add output support to MachFormatter. I can do that, but how can I test it? I added a "def valgrind_error" method in testing/mozlog/mozlog/formatters/machformatter.py which prints some obvious text, but it doesn't seem to get called during my runs of ./mach mochitest --valgrind= ... > ::: build/automation.py.in > @@ +529,5 @@ > > xrePath = None, certPath = None, > > debuggerInfo = None, symbolsPath = None, > > timeout = -1, maxTime = None, onLaunch = None, > > + detectShutdownLeaks = False, screenshotOnFail=False, testPath=None, bisectChunk=None, > > + valgrindPath=None, valgrindArgs=None, valgrindSuppFiles=None): > > Do these extra arguments get used? > [ditto question for layout/tools/reftest/remotereftest.py, layout/tools/reftest/runreftest.py, > layout/tools/reftest/runreftestb2g.py] As I understand it, the various runApp implementations all have have the same arg types. The one in testing/mochitest/runtests.py changed to accept these 3 extra args, and I found I had to change at least two of the other implementations to stop it failing on Treeherder. For example, if I back out the change in build/automation.py.in, it fails thusly: 10:12:12 INFO - TypeError: runApp() got an unexpected keyword argument 'valgrindPath' 10:12:12 INFO - 0 ERROR Automation Error: Received unexpected exception while running application and I had that also with at least one of the other instances you pointed out. So I thought it safest to add those formal parameters to all runApp instances I could find. > @@ +46,5 @@ > > + supp_for_plat = supps_path + "/x86_64-redhat-linux-gnu.sup" > > + elif mozinfo.processor == "x86": > > + supp_for_plat = supps_path + "/i386-redhat-linux-gnu.sup" > > + > > + supp_cross_arch = supps_path + "/cross-architecture.sup" > > It looks like this option will always be used? I did the logic how it is so that get_default_valgrind_suppression_files returns an empty list if get_default_valgrind_suppression_files() gets called on a non-valgrind-supported target. It would be possible to lift "supp_cross_arch =" out of the |if| but then the above property wouldn't hold. > ::: testing/mozbase/mozlog/mozlog/commandline.py > @@ +159,2 @@ > > for option, value in formatter_options[fmt].iteritems(): > > + a_wrapper, a_option = None, "" > > Let's call this wrapper and wrapper_args and make wrapper_args a tuple so > that later we call > > wrapper(handler, **wrapper_args) Done > @@ +171,3 @@ > > for value in streams: > > handler = handlers.StreamHandler(stream=value, formatter=formatter) > > + for a_wrapper, a_option in handler_wrappers_and_options: > > Again change the names. Done > Call this "line" rather than "str" (which is a builtin) Done > No space here Done > Seems like it should be helpful to compile these regexps (here and the two > below)? Done > > + if msg['action'] != 'log': > > + return msg > > + > > + line = msg['message'] > > + to_return = None > > I guess I would prefer a better name here like "output_msg" or similar. Done.
Flags: needinfo?(james)
Assignee | ||
Comment 12•9 years ago
|
||
(In reply to James Graham [:jgraham] from comment #10) (remaining review comments) > Do you have try runs with test failures to demonstrate that it works > as expected. No. Running in automation is out of scope for this bug. This is just for local ./mach mochitest runs. What I was hoping is that this patch can land as-is and in Q4 I or others can start to add Mochitest runs on Valgrind to the Tier-2 set of tests. What I do have is a try run which shows that the patch doesn't mess up the existing infrastructure: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d435497822e8 > Should these files be put in the test packages so that they are > available to automation? Similarly, that is out of scope for this bug. Clearly that will be need to be fixed for running in automation. > :: testing/mozharness/mozharness/mozilla/testing/errors.py > @@ +78,5 @@ > 'full_regex': re.compile(r"(?:TEST-UNEXPECTED-FAIL|PROCESS-CRASH) \| .* \| (application crashed|missing output line for total leaks!|negative leaks caught!|\d+ bytes leaked)"), > > 'minimum_regex': re.compile(r'''(TEST-UNEXPECTED|PROCESS-CRASH)'''), > > 'retry_regex': re.compile(r'''FAIL-SHOULD-RETRY''') > > }, > > + "valgrind_error": { > > I think this is OK, but maybe get emorley to cross-check. Again, not needed until this is run in automation.
Assignee | ||
Comment 13•9 years ago
|
||
Nick: pls can you review just the V flag changes in testing/mozbase/mozdebug/mozdebug/mozdebug.py.
Attachment #8652251 -
Attachment is obsolete: true
Attachment #8657428 -
Flags: review?(n.nethercote)
Attachment #8657428 -
Flags: review?(james)
Comment 14•9 years ago
|
||
Comment on attachment 8657428 [details] [diff] [review] bug1185244-13.cset Review of attachment 8657428 [details] [diff] [review]: ----------------------------------------------------------------- ::: testing/mozbase/mozdebug/mozdebug/mozdebug.py @@ +153,5 @@ > return None > > return None > + > +# Defines default values for Valgrind flags. You have detailed docs for some but not all flags. Perhaps document --trace-children=yes and --child-silent-after-fork=yes --trace-children-skip=... together. @@ +160,5 @@ > +# patching by the various JITS. Note that this is only necessary on > +# x86 and x86_64, but not on ARM. > +# > +# --vex-iropt-register-updates=allregs-at-mem-access is required so that > +# valgrind generates correct register values whenever there is a Nit: capitalize "valgrind". @@ +195,5 @@ > + + get_default_valgrind_tool_specific_args()) > + > +def get_default_valgrind_tool_specific_args(): > + return [ > + '--partial-loads-ok=yes' Why did you remove --leak-check=full and --show-possibly-lost=no? Is it because LSAN is likely to detect such leaks?
Attachment #8657428 -
Flags: review?(n.nethercote) → review+
Comment 15•9 years ago
|
||
Comment on attachment 8657428 [details] [diff] [review] bug1185244-13.cset Review of attachment 8657428 [details] [diff] [review]: ----------------------------------------------------------------- Looks fine, just a little bit of cleanup required. ::: testing/mochitest/mochitest_options.py @@ +38,5 @@ > + # don't know where the source tree is, we must give up. > + if build_obj is None or build_obj.topsrcdir is None: > + return [] > + > + supps_path = build_obj.topsrcdir + "/build/valgrind" using os.path.join makes the code more idiomatic, even though there is only one possible platform in this case. @@ +40,5 @@ > + return [] > + > + supps_path = build_obj.topsrcdir + "/build/valgrind" > + supp_for_plat = None > + supp_cross_arch = None I think this would be mildly clearer if you had rv = [] if mozinfo.os == "linux": if mozinfo.processor == "x86_64": rv.append(os.path.join(supps_path, "x86_64-redhat-linux-gnu.sup")) elif mozinfo.processor == "x86": rv.append(os.path.join(supps_path, "i386-redhat-linux-gnu.sup")) rv.append(os.path.join(supps_path, "cross-architecture.sup")) return rv ::: testing/mozbase/mozlog/mozlog/commandline.py @@ +38,5 @@ > else: > buffer_limit = int(buffer_limit) > return handlers.BufferHandler(handler, buffer_limit) > > +def valgrind_handler_wrapper(handler, handler_option): This shouldn't have a handler_option argument @@ +159,2 @@ > for option, value in formatter_options[fmt].iteritems(): > + wrapper, wrapper_args = None, ("",) The default wrapper args should be an empty tuple, spelled (,) ::: testing/mozbase/mozlog/mozlog/formatters/machformatter.py @@ +232,5 @@ > rv = rv[:-1] > return rv > > + def valgrind_error(self, data): > + rv = " " + data['primary'] + "\n" nit: I would probably use a format string here, and build up the rest using a list. Not critical to fix though. ::: testing/mozbase/mozlog/mozlog/handlers/valgrindhandler.py @@ +6,5 @@ > + > +import re > + > +class ValgrindHandler(BaseHandler): > + nit: remove this blank line
Attachment #8657428 -
Flags: review?(james) → review+
Updated•9 years ago
|
Flags: needinfo?(james)
https://hg.mozilla.org/mozilla-central/rev/feceb41f1c3c
Status: NEW → RESOLVED
Closed: 9 years ago
status-firefox44:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla44
Backed out in https://hg.mozilla.org/mozilla-central/rev/19b4265d0d56 at billm's request since it apparently broke running tests locally. I'll merge this back around to the integration branches now to unbreak them.
Assignee | ||
Comment 19•9 years ago
|
||
With a fix for breaking local test runs, per comment 18.
Attachment #8657428 -
Attachment is obsolete: true
Attachment #8665418 -
Flags: review?(james)
Comment 20•9 years ago
|
||
Comment on attachment 8665418 [details] [diff] [review] bug1185244-17.cset Review of attachment 8665418 [details] [diff] [review]: ----------------------------------------------------------------- This seemed to work for me locally, and looks right.
Attachment #8665418 -
Flags: review?(james) → review+
https://hg.mozilla.org/mozilla-central/rev/a62c1724002a
Status: REOPENED → RESOLVED
Closed: 9 years ago → 9 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•