Closed Bug 1124220 Opened 10 years ago Closed 10 years ago

concurrent.futures hangs intermittently

Categories: Webtools Graveyard :: DXR (defect)
Platform: x86, macOS
Priority: Not set
Severity: normal

Tracking: (Not tracked)

Status: RESOLVED FIXED

People

(Reporter: erik, Unassigned)

Details

Once in a while on my Vagrant VM (but never in production), the worker pool hangs while indexing. It can happen at any point in the process. Here's an example where it hung at the end:

227 of 229 jobs done.
228 of 229 jobs done.

...and then it sits here forever. Control-C-ing the parent process gives you this (the worker tracebacks are interleaved):

^CProcess Process-7:
Process Process-5:
Process Process-8:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self.run()
    self.run()
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python2.7/dist-packages/concurrent/futures/process.py", line 128, in _process_worker
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python2.7/dist-packages/concurrent/futures/process.py", line 128, in _process_worker
  File "/usr/local/lib/python2.7/dist-packages/concurrent/futures/process.py", line 128, in _process_worker
    call_item = call_queue.get(block=True)
  File "/usr/lib/python2.7/multiprocessing/queues.py", line 115, in get
    call_item = call_queue.get(block=True)
  File "/usr/lib/python2.7/multiprocessing/queues.py", line 115, in get
    call_item = call_queue.get(block=True)
  File "/usr/lib/python2.7/multiprocessing/queues.py", line 117, in get
    self._rlock.acquire()
    self._rlock.acquire()
KeyboardInterrupt
KeyboardInterrupt
    res = self._recv()
KeyboardInterrupt
^CProcess Process-6:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python2.7/dist-packages/concurrent/futures/process.py", line 128, in _process_worker
    call_item = call_queue.get(block=True)
  File "/usr/lib/python2.7/multiprocessing/queues.py", line 115, in get
    self._rlock.acquire()
KeyboardInterrupt
Traceback (most recent call last):
  File "/usr/local/bin/dxr-build.py", line 10, in <module>
    execfile(__file__)
  File "/home/vagrant/dxr/bin/dxr-build.py", line 60, in <module>
    exit(main())
  File "/home/vagrant/dxr/bin/dxr-build.py", line 53, in main
    verbose=options.verbose)
  File "/home/vagrant/dxr/dxr/build.py", line 112, in build_instance
    index_tree(tree, es, verbose=verbose)) for tree in trees]
  File "/home/vagrant/dxr/dxr/build.py", line 273, in index_tree
    index_files(tree, tree_indexers, index, pool, es)
  File "/usr/local/lib/python2.7/dist-packages/concurrent/futures/_base.py", line 573, in __exit__
    self.shutdown(wait=True)
  File "/usr/local/lib/python2.7/dist-packages/concurrent/futures/process.py", line 354, in shutdown
    self._queue_management_thread.join()
  File "/usr/lib/python2.7/threading.py", line 949, in join
    self.__block.wait()
  File "/usr/lib/python2.7/threading.py", line 339, in wait
    waiter.acquire()
KeyboardInterrupt
[1]+  Terminated              dxr-build.py
vagrant@vagrant-ubuntu-trusty-64:~$

It's reminiscent of http://bugs.python.org/issue9205, but that was supposed to have been fixed in concurrent.futures years ago.
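The parent traceback above is blocked in the executor's __exit__, which calls shutdown(wait=True) and joins the queue-management thread forever. A minimal sketch of one way to make such hangs visible (this is illustrative, not DXR's actual code; run_jobs and the timeout value are assumptions, and a thread pool stands in for the process pool used in the build) is to hold the futures explicitly and iterate with as_completed plus a timeout, so a stalled job raises TimeoutError and names itself instead of wedging silently:

```python
import concurrent.futures

def run_jobs(func, items, timeout=60):
    """Run func over items in a pool, failing loudly if any job stalls."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
        futures = {pool.submit(func, item): item for item in items}
        results = {}
        try:
            for fut in concurrent.futures.as_completed(futures, timeout=timeout):
                # A worker stuck in an infinite loop never completes, so
                # as_completed raises TimeoutError rather than blocking forever.
                results[futures[fut]] = fut.result()
        except concurrent.futures.TimeoutError:
            stuck = [item for fut, item in futures.items() if not fut.done()]
            raise RuntimeError('jobs still running after %ss: %r' % (timeout, stuck))
    return results
```

Note that shutdown still waits for genuinely wedged workers on exit; with a process pool you can at least kill the stuck PIDs once you know which jobs they hold.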
Perhaps it's a worker that hangs forever. When I run the same job on the same data with disable_workers = true, it also hangs. Aborting, I get this traceback:

- Skipping rebuild (due to 'build' in 'skip_stages')
^CTraceback (most recent call last):
  File "/usr/local/bin/dxr-build.py", line 10, in <module>
    execfile(__file__)
  File "/home/vagrant/dxr/bin/dxr-build.py", line 60, in <module>
    exit(main())
  File "/home/vagrant/dxr/bin/dxr-build.py", line 53, in main
    verbose=options.verbose)
  File "/home/vagrant/dxr/dxr/build.py", line 112, in build_instance
    index_tree(tree, es, verbose=verbose)) for tree in trees]
  File "/home/vagrant/dxr/dxr/build.py", line 273, in index_tree
    index_files(tree, tree_indexers, index, pool, es)
  File "/home/vagrant/dxr/dxr/build.py", line 605, in index_files
    swallow_exc=False)
  File "/home/vagrant/dxr/dxr/build.py", line 560, in index_chunk
    index_file(tree, tree_indexers, path, es, index, jinja_env)
  File "/home/vagrant/dxr/dxr/build.py", line 542, in index_file
    annotations_by_line) if is_text else [],
  File "/home/vagrant/dxr/dxr/lines.py", line 360, in build_lines
    tags = list(tag_boundaries(refs, regions))
  File "/home/vagrant/dxr/dxr/lines.py", line 234, in tag_boundaries
    for start, end, data in intervals:
  File "/home/vagrant/dxr/dxr/plugins/pygmentize.py", line 90, in regions
    for index, token, text in lexer.get_tokens_unprocessed(self.contents):
  File "/usr/local/lib/python2.7/dist-packages/pygments/lexers/c_cpp.py", line 160, in get_tokens_unprocessed
    RegexLexer.get_tokens_unprocessed(self, text):
  File "/usr/local/lib/python2.7/dist-packages/pygments/lexer.py", line 629, in get_tokens_unprocessed
    m = rexmatch(text, pos)
KeyboardInterrupt

Let's improve progress indication so we can always see where things grind to a halt.
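For hangs like this, one low-effort watchdog (a sketch using Python 3's stdlib faulthandler, also available as a backport on 2.7; do_indexing is a hypothetical stand-in for the real indexing loop) prints every thread's stack if the process is still busy after a deadline, pinpointing where it ground to a halt without needing Control-C:

```python
import faulthandler

def do_indexing():
    # Hypothetical stand-in for the real per-file indexing loop.
    return sum(len(p) for p in ['a.cc', 'b.cc'])

# If we're still running after `timeout` seconds, dump every thread's
# traceback to stderr; repeat=True re-arms the timer, so a hang prints
# the exact stuck frame (e.g. inside the Pygments lexer) periodically.
faulthandler.dump_traceback_later(timeout=300, repeat=True)
try:
    result = do_indexing()
finally:
    faulthandler.cancel_dump_traceback_later()
```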
Btw, as suspected, the run completed when I disabled pygmentize. It looks like there's a pygments bug that results in an infinite loop during C++ lexing.
The hang happens while pygmentizing mozilla-central/gfx/harfbuzz/src/hb-ot-tag.cc, which looks fairly harmless. Let's see if we can get a reduction and then fix Pygments. If I'm lucky, it's already fixed on master.
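Pathological slowness in regex-based lexers such as Pygments' RegexLexer typically comes down to catastrophic backtracking: a pattern with nested quantifiers can take exponential time to *fail* on certain inputs. A minimal illustration of the effect (this is a textbook example, not the actual pattern from the Pygments C++ lexer):

```python
import re
import time

def timed_match(pattern, text):
    # Return (match, elapsed seconds) so pattern behaviour can be compared.
    start = time.time()
    match = re.match(pattern, text)
    return match, time.time() - start

text = 'a' * 18 + 'b'   # almost matches, then fails at the trailing 'b'

# Nested quantifiers: the engine tries every way of splitting the 'a's
# between the inner and outer '+' before giving up -- exponential work.
slow_match, slow_secs = timed_match(r'(a+)+$', text)

# A single quantifier matches the same strings but fails in linear time.
fast_match, fast_secs = timed_match(r'a+$', text)
```

Fixes for this class of bug rewrite the offending pattern so that a failed match doesn't require exhaustive backtracking.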
Commit pushed to es at https://github.com/mozilla/dxr

https://github.com/mozilla/dxr/commit/aecf94ead9a2cbea96a67b9558b3bc2be361f97b

Add logging for indexing workers. Refs bug 1124220.

This helped me track down which files pygments was hanging on.
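The general pattern behind that commit can be sketched like this (logged_index_file and index_file are illustrative names, not DXR's actual functions): log before and after each file, so the last "started" line with no matching "finished" line identifies the file a hung worker was chewing on.

```python
import logging
from concurrent.futures import ProcessPoolExecutor  # submit the wrapper, not the raw worker

logging.basicConfig(level=logging.INFO)
log = logging.getLogger('dxr.indexer')

def index_file(path):
    # Hypothetical stand-in for the real per-file indexing work.
    return len(path)

def logged_index_file(path):
    # A 'started' line with no matching 'finished' line names the file
    # the worker got stuck on.
    log.info('started indexing %s', path)
    result = index_file(path)
    log.info('finished indexing %s', path)
    return result
```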
Commit pushed to es at https://github.com/mozilla/dxr (which the GitHub bot somehow missed):

https://github.com/mozilla/dxr/commit/597991a0befbbb1e0808f6d957fe4f9d7c1e3429

Switch to a patched Pygments that dodges a pathological-slowness bug in the C++ lexer. Fix bug 1124220.

When https://bitbucket.org/birkenfeld/pygments-main/issue/1081/cpplexer-pathological-slowness-case or something better is merged, we can get back to the mainline.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Product: Webtools → Webtools Graveyard