Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lazy loading of service descriptions #11479

Merged
merged 21 commits into from
Jan 7, 2018

Conversation

amelchio
Copy link
Contributor

@amelchio amelchio commented Jan 6, 2018

Description:

With this PR the loading of services.yaml is postponed until it is needed, namely when listing services in websocket_api and api.

This enables significant code cleanup in platforms but my main motivation was to allow custom_components testing without having to copy services.yaml around.

Checklist:

  • The code change is tested and works locally.

If the code does not interact with devices:

  • Local tests with tox run successfully.
  • Tests have been added to verify that the new code works.

'fields': description.get('fields', {})
}

@bind_hass

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expected 2 blank lines, found 1

@@ -112,3 +116,44 @@ def extract_entity_ids(hass, service_call, expand_group=True):
return [service_ent_id]

return service_ent_id

@bind_hass

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expected 2 blank lines, found 1

'fields': description.get('fields', {})
}

@bind_hass

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expected 2 blank lines, found 1

@@ -112,3 +116,44 @@ def extract_entity_ids(hass, service_call, expand_group=True):
return [service_ent_id]

return service_ent_id

@bind_hass

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expected 2 blank lines, found 1

Copy link
Member

@MartinHjelmare MartinHjelmare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the clean up. Downside is that it's less clear that the services have descriptions.


if domain == ha.DOMAIN:
import homeassistant.components as components
file = components.__file__
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file is a python builtin. Please use another variable name, eg fil, comp_file or similar.


descriptions = DESCRIPTION_CACHE.get(file)
if descriptions is None:
descriptions = load_yaml_config_file(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call will do I/O. Since get_description will be called from async_get_all_descriptions, the function that does I/O needs to be scheduled on the executor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. This was moved from the LIFX async_setup_platform but I guess that was also wrong.

I am a bit confused by the main/executor transitions available. What is the proper way to schedule the function in this situation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the function that schedules the job for the executor, via hass.async_add_job, needs to be a coroutine to be able to yield and wait for the result, ie to have the descriptions ready. Not 💯 though. I'd wait for a core review before changing anything.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct Martin

return {domain: {key: value.as_dict() for key, value
in self._services[domain].items()}
for domain in self._services}
return {domain: self._services[domain].copy() for domain in self._services}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line too long (83 > 79 characters)

descriptions[domain] = {}
for service in services[domain]:
description = yield from hass.async_add_job(
get_description, hass, domain, service)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we know that we're going to load all the descriptions, it would help if we preload the catch all services file (/components/services.yaml). That file is pretty heavy if we have to parse it X times.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should only be read and parsed once, that's what the DESCRIPTION_CACHE is for.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see now that you cache that file. I'm not sure if that is a great idea, it's big and a lot will be unused.

@balloob
Copy link
Member

balloob commented Jan 6, 2018

I love this ❤️

It doesn't make sense to keep it in core.

@balloob
Copy link
Member

balloob commented Jan 6, 2018

Once removed from all components/platforms, should really speed up the startup and tests 👍

@@ -21,6 +23,8 @@

_LOGGER = logging.getLogger(__name__)

DESCRIPTION_CACHE = {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't store the description cache in a global. Instead store it in hass.data, which is a dict for globals during the runtime of a hass instance.


descriptions = DESCRIPTION_CACHE.get(comp_file)
if descriptions is None:
descriptions = load_yaml_config_file(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file might not exist

if descriptions is None:
descriptions = load_yaml_config_file(
path.join(path.dirname(comp_file), 'services.yaml'))
DESCRIPTION_CACHE[comp_file] = descriptions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to keep a cache per file or would a cache per service be better?

"""Return descriptions (i.e. user documentation) for all service calls."""
FILE_CACHE = {}

if not SERVICE_DESCRIPTION_CACHE in hass.data:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test for membership should be 'not in'

@amelchio amelchio changed the title RFC: Lazy loading of service descriptions WIP: Lazy loading of service descriptions Jan 7, 2018
for domain in services:
descriptions[domain] = {}
for service in services[domain]:
descriptions[domain][service] = yield from hass.async_add_job(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your code would be a lot faster if you test if the key exists in the cache inside the async context (before this line). No need to yield and wait for a thread to run our job.

@balloob
Copy link
Member

balloob commented Jan 7, 2018

Super excited about this PR ! It looks like we're saving around ~1 minute on a full test run! (8m23s -> 7m23s)

@amelchio
Copy link
Contributor Author

amelchio commented Jan 7, 2018

I reworked it once again, this time with more attention to performance. So now we only async_add_job the actual load_yaml call and we do so concurrently (loading all files at once).

Can't say I notice any difference on my dev setup, though. Maybe it will be more significant on slow machines with a cold disk cache, or maybe not.

@balloob Are the timings from your local setup? I still only use Travis and it is so erratic that I cannot tell whether there is any improvement at all.

@amelchio amelchio changed the title WIP: Lazy loading of service descriptions Lazy loading of service descriptions Jan 7, 2018
msg['id'], self.hass.services.async_services()))
@asyncio.coroutine
def call_service_helper(msg):
"""Call a service and fire complete message."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function name and docstring are stale. This function doesn't call a service.


def format_cache_key(domain, service):
"""Build a cache key."""
return "{}.{}".format(domain, service)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd just store the format string as a constant instead of returning it inside a function. You'll save one function call like that.

@balloob
Copy link
Member

balloob commented Jan 7, 2018

Those were timings from Travis. I took the build times from your branch and compared it to the latest dev branch. Not very scientific as container speeds fluctuate.

@balloob balloob merged commit 8267a21 into home-assistant:dev Jan 7, 2018
@balloob
Copy link
Member

balloob commented Jan 7, 2018

🎉 🐬 🌮 💃 🍻

@amelchio
Copy link
Contributor Author

amelchio commented Jan 7, 2018

The device tracker tests failed frequently with this change. I think I finally figured out that it was a couple of existing races that got easier to hit after the YAML loading was removed.

@balloob
Copy link
Member

balloob commented Jan 7, 2018

Did a test run on my Macbook Pro Retina 2012:

Before:

Results (276.76s):
    3077 passed
       1 failed
         - tests/test_core.py:800 TestConfig.test_is_allowed_path
      52 skipped

After:

Results (220.09s):
    3078 passed
       1 failed
         - tests/test_core.py:797 TestConfig.test_is_allowed_path
      52 skipped

That's a great improvement! 👍

@balloob
Copy link
Member

balloob commented Jan 7, 2018

(is_allowed_path always fails on Mac because /tmp is a symlink)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants