Closed
Bug 1136896
Opened 9 years ago
Closed 9 years ago
Speed up fill() and dedent()
Categories
(Core :: DOM: Core & HTML, defect)
Tracking
()
RESOLVED
FIXED
mozilla39
Tracking | Status | |
---|---|---|
firefox39 | --- | fixed |
People
(Reporter: bzbarsky, Assigned: bzbarsky)
References
(Blocks 1 open bug)
Details
Attachments
(1 file)
5.32 KB,
patch
|
jorendorff
:
review+
|
Details | Diff | Splinter Review |
I was looking into why codegen for all bindings takes ~15 seconds, and ~5 seconds of that is under fill()/dedent(), mostly doing regexpy bits and whatnot. In practice, we end up calling these a bunch of times on the same strings and redoing a lot of work. Memoizing a bit cuts the number of calls to textwrap.dedent from 86000-some to 400-some, and saves 3-3.5 seconds of codegen time for me.
Assignee | ||
Comment 1•9 years ago
|
||
Jason, you're probably most familiar with this code, so do you mind taking a look? If you'd rather I found another reviewer, let me know.
Attachment #8569415 -
Flags: review?(jorendorff)
Comment 2•9 years ago
|
||
Comment on attachment 8569415 [details] [diff] [review] Speed up fill() and dedent() by memoizing some of the work they currently end up doing on each call Review of attachment 8569415 [details] [diff] [review]: ----------------------------------------------------------------- I would do this a little differently, because I think it leaves the code reading better to separate the caching code from functionality. But feel free to take or leave this suggestion. import functools def memoize(fn): """ Decorator to memoize a function of one argument. The cache simply grows without bound. """ cache = {} @functools.wraps(f) def wrapper(arg): r = cache.get(arg) if r is None: r = cache[arg] = fn(arg) return r return wrapper @memoize def dedent(s): ...body and docstring of dedent unchanged... @memoize def compile_fill_template(template_str): """ Helper function for fill(). Given the template string passed to fill(), do the reusable part of template processing and return a pair (t, argsToIndent) that can be used every time fill() is called with that template argument. """ t = dedent(template) assert t.endswith("\n") or "\n" not in t argsToIndent = [] def replace(match): ... t = re.sub(fill_multiline_substitution_re, replace, t) return string.Template(t), argsToIndent def fill(template, **args): """ ...long docstring... """ t, argsToIndent = compile_fill_template(template) for name, modified_name, depth in argsToIndent: args[modified_name] = indent(args[name], depth) return t.substitute(args) ::: dom/bindings/Codegen.py @@ +172,4 @@ > def replace(match): > """ > Replaces a line like ' $*{xyz}\n' with '${xyz_n}', > where n is the indent depth, and add a corresponding entry to args. "to argModList", not args @@ +174,5 @@ > Replaces a line like ' $*{xyz}\n' with '${xyz_n}', > where n is the indent depth, and add a corresponding entry to args. > + > + Note that this needs to close over argModList, so it has to be > + defined inside fill(). BTW, defining a replace() function that closes over local variables and gets passed to re.sub() is a very common idiom in Python. It wouldn't typically merit a comment; you just put the def right next to the re.sub() call that uses it, and the reader will catch on. But in this file (maintained by C++ hackers) it doesn't hurt to explain a bit more. @@ +197,5 @@ > + # Call dedent_internal here, since we don't really need to memoize the > + # intermediate state. > + t = dedent_internal(template) > + assert t.endswith("\n") or "\n" not in t > + argModList = list() Change `list()` to `[]`. @@ +210,5 @@ > indented_value = indent(args[name], depth) > if modified_name in args: > assert args[modified_name] == indented_value > else: > args[modified_name] = indented_value Pre-existing code, but I don't think the if-statement and the assert are useful anymore. Now that we have separate "compile" and "run time" phases, we can do better. At compile time, eliminate duplicates: entry = (name, modified_name, depth) if entry not in argModList: argModList.append(entry) Then at run time, you can just for name, modified_name, depth in argsToIndent: args[modified_name] = indent(args[name], depth)
Attachment #8569415 -
Flags: review?(jorendorff) → review+
Assignee | ||
Comment 3•9 years ago
|
||
Did the @memoize thing, and https://hg.mozilla.org/integration/mozilla-inbound/rev/58f17236b187
https://hg.mozilla.org/mozilla-central/rev/58f17236b187
Status: NEW → RESOLVED
Closed: 9 years ago
status-firefox39:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
Updated•5 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•