Skip to content

Speed up graph objs #678

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Feb 2, 2017
Merged

Speed up graph objs #678

merged 7 commits into from
Feb 2, 2017

Conversation

theengineear
Copy link
Contributor

Introduces a utility decorator to memoize functions along with some other performance improvements surrounding `graph_reference.

This was something I tinkered on over the weekend. I spent a bit more time today formalizing it so that the work didn't go to waste.

Closes #497

^^ I'm closing that issue, but it's likely that there are other improvements to be had. My suggestion would be to separate make_subplots so that users can choose whether they'd like to use graph_objs or native Python objects to create a plot. That can/should be handled as a different issue though.

@theengineear
Copy link
Contributor Author

@Kully, wanna put some 👀 on this?

//cc @depettit, thanks for the clear issue documentation. This targets the make_subplots portion. It's quite possible that the append_trace is sill going to be problematic for you. I'll likely ask for your help in manually testing this after this PR is merged into master.

@Kully
Copy link
Contributor

Kully commented Jan 30, 2017

@Kully, wanna put some 👀 on this?
Sure, tomorrow most likely 👍

@dhirschfeld
Copy link

FWIW I always use the decorator module to implement signature/docstring preserving decorators. If the decorated functions aren't user-facing then it's possibly not a big issue however I've found it useful when debugging my own code to have the decorated functions have meaningful signatures and docstrings.

The added utility does have to be weighed up against another dependency but these days with conda I don't see that as such a big issue and it's already a dep for sever other widely used packages.

@theengineear
Copy link
Contributor Author

Oops, thanks @dhirschfeld , I meant to preserve the docstrings but forgot 😓.

I typically use something like wraps. What benefits beyond something like wraps, does the decorator package provide?

@theengineear
Copy link
Contributor Author

FWIW, these decorated functions aren't really meant to be user-facing, but, it was my intention to make a more general-purpose decorator that would preserve names/docs.

@theengineear
Copy link
Contributor Author

theengineear commented Jan 31, 2017 via email

@dhirschfeld
Copy link

dhirschfeld commented Jan 31, 2017

The decorator module will preserve docstring and signature on both py2/3 and will ensure inspect.getargspec also works correctly.

From a quick test, much of this is mitigated on py3.5 where wraps also preserves the signature and where getargspec is deprecated in favour of inspect.signature which does work with wrapped functions.

My mantra of always using the decorator module comes from a long history of Py2 usage 😛 . Still, if you have to support both it's probably worth considering...

The bottleneck for the `make_subplots` function was due to excessive
lookups in the `plot_schema`. These lookups are actually pretty
computation-intensive so caching these computations can give us a large
performance boost.

Note that Python 3.2+ has a new [`functools.lru_cache`](https://docs.python.org/3/library/functools.html#functools.lru_cache) which
can be used for this. HOWEVER, we support Python 2.7+ and I didn’t see a backport for it.

There are numerous `memoize` packages on PyPI, so many that I didn’t
want to commit to one. It’s fairly simple to write this and then we
don’t need another dependency.
The `value` argument currently allows for any value type. This isn’t
very helpful for memoization. Instead, the top-level function now
categorizes `value` as a string so that we *can* memoize the function.

Also, we need hashable args, so we `tuple` the `parent_object_names`.
The `__init__` method ends up doing some additional validation, it’s
more performant (and exactly the same), to set an item in layout as a
dict instead of as a *graph_obj*.
@theengineear
Copy link
Contributor Author

@dhirschfeld , I took you up on the suggestion. Thanks!

@chriddyp the decorator module is a dep for ipython, so it's likely that folks will have it already. It's fairly small and provides some nice Python-version-independent functionality. Given it's size, I'm a proponent of adding it as a dep, but I figured I'd ping you on it.

@theengineear
Copy link
Contributor Author

Additionally, I should note that instead of letting callers specify a key_function, I decided it was better form to nudge developers to just create functions that are naturally memoizable (all args/kwargs hashable).

@theengineear
Copy link
Contributor Author

@Kully unless there's pushback on the new decorator dep, I'm done futzing with this PR.

@@ -1000,8 +1000,7 @@ def _get_anchors(r, c, x_cnt, y_cnt, shared_xaxes, shared_yaxes):
# Function pasting x/y domains in layout object (2d case)
def _add_domain(layout, x_or_y, label, domain, anchor, position):
name = label[0] + 'axis' + label[1:]
graph_obj = '{X_or_Y}Axis'.format(X_or_Y=x_or_y.upper())
axis = getattr(graph_objs, graph_obj)(domain=domain)
axis = {'domain': domain}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose you did this because axis will never be equal to anything other than {'domain': domain}, and so it's like a trivial case of memoization? :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this isn't about memoization as much as it's about performance. the line graph_obj = '{X_or_Y}Axis'.format(X_or_Y=x_or_y.upper()) was eating up a ton of cpu time because it was instantiating that graph_object and then immediately setting it inside layout, which effectively was instantiating it again.


:param (int|None) maxsize: Limit the number of cached results. This is a
simple way to prevent memory leaks. Setting this
to `None` will remember *all* calls. The 128
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it's a good idea to allow None - i.e. remembering all calls? Perhaps to avoid huge memory leaks, maybe have the cap at a really high number.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yah, this mimics lru_cache. I'd have tried to use this, but it's not available in Python < 3.2.

if key in keys:
return cache[key]

if maxsize is not None and len(keys) == maxsize:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition is set out very clearly! 👍


self.assertEqual(name_space.call_count, 0)
for i, (inputs, result) in enumerate(tests, 1):
for _ in range(10):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, is _ the canonical (or a typical) variable used for loops where the looping variable is not used inside?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, _ is the common placeholder when you don't intent to use the value.

@theengineear
Copy link
Contributor Author

@Kully , what's the word, 🐦? Can this 💃?

@chriddyp , can I get a confirmation that adding the decorator dep is OK? Again, IPython uses it, it's PY2/3 agnostic, and it's quite small (nudge, nudge, nudge).

@chriddyp
Copy link
Member

chriddyp commented Feb 2, 2017

can I get a confirmation that adding the decorator dep is OK? Again, IPython uses it, it's PY2/3 agnostic, and it's quite small (nudge, nudge, nudge).

yup!

@Kully
Copy link
Contributor

Kully commented Feb 2, 2017

Can this 💃?

Yup! 💃 🕺 👯

@theengineear theengineear merged commit 6f9621a into master Feb 2, 2017
@theengineear theengineear deleted the speed_up_graph_objs branch February 2, 2017 21:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants