Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for limiting which entities are recorded #4733

Merged
merged 2 commits into from
Jan 3, 2017

Conversation

mjg59
Copy link
Contributor

@mjg59 mjg59 commented Dec 4, 2016

Description:
Allow filtering of entities in the recorder component where there may be privacy concerns.

Pull request in home-assistant.github.io with documentation: home-assistant/home-assistant.io#1539

Example entry for configuration.yaml (if applicable):

recorder:
  exclude_entities:
    - sun.sun

Checklist:

If user exposed functionality or configuration variables are added/changed:

If the code communicates with devices, web services, or third-party tools:

  • Local tests with tox run successfully. Your PR cannot be merged unless tests pass
  • New dependencies have been added to the REQUIREMENTS variable (example).
  • New dependencies are only imported inside functions that use them (example).
  • New dependencies have been added to requirements_all.txt by running script/gen_requirements_all.py.
  • New files were added to .coveragerc.

If the code does not interact with devices:

  • Local tests with tox run successfully. Your PR cannot be merged unless tests pass
  • Tests have been added to verify that the new code works.

For privacy reasons, it may be desirable to either exclude some entities
from being recorded or limit the recorded devices to a specific list. This
patch adds support. The exclude_entities list in the recorder configuration
provides a set of entities that will be excluded. The include_entities list,
if present, will avoid recording all entities except for those explicitly
included in the list.

@mention-bot
Copy link

@mjg59, thanks for your PR! By analyzing the history of the files in this pull request, we identified @rhooper, @fabaff and @kellerza to be potential reviewers.

Copy link
Member

@kellerza kellerza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is the correct approach, since the entity lists might become quite long. Also for some components it might be critical not to log it and the reason why I proposed to make this a property of the Entity rather (or even attribute of Entity so that it can be customized) in #4576

try:
entity_id = event.data[ATTR_ENTITY_ID]
if self.exclude and entity_id in self.exclude:
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should probably complete your queue task before continuing

@mjg59
Copy link
Contributor Author

mjg59 commented Dec 5, 2016

@kellerza Making it a property of the entity involves quite a lot of additional configuration with auto-discovery setups. In that scenario the easiest option might be to add an exclude option at the entity level but leave include at the recorder object?

@kellerza
Copy link
Member

kellerza commented Dec 5, 2016

Actually the oppsite, if an Entity contains its sane defaults, it will ok on discovery.

Customizing can then force/change the behaviour.

@mjg59
Copy link
Contributor Author

mjg59 commented Dec 5, 2016

@kellerza I don't think I'm quite clear on this. If I want to limit recording to a single entity, I probably don't want to have to add explicit configuration for everything that's currently autodiscovered in order to exclude those and leave behind the one I want - similarly, it would feel odd if adding configuration for a single entity were to change the behaviour of every other entity. Having that configuration at the recorder level feels more obvious in that respect. Am I misunderstanding your proposal?

@balloob
Copy link
Member

balloob commented Dec 6, 2016

@kellerza I disagree, I think this is well fit to be part of the recorder component.

@mjg59 could you follow the same configuration format as history ?

@balloob balloob self-assigned this Dec 6, 2016
@kellerza
Copy link
Member

kellerza commented Dec 6, 2016

The idea was that an entity indicates its recorder level and then you can, in rare cases, overwrite it with customize

Say you have three states for the Entity's recorder attr (or recorder key in attributes dict): All (default), None, States [wont log changes to attributes if the state did not change]

In the recorder component you can see how the entity should be handled by using:
my_entity.attributes.get('recorder', getattr(my_entity, 'recorder', 'All')

Most entities will default to All. sun will have a Python @Property that returns 'States'. Any fast changing sensor might have a @Property returning None and you can always overwrite any entity with customize

It will cover your exclude_entities use case nicely and allow sane defaults for sun for example, Unfortunately not cover your 'include_entities' scenario, but I do not see a big demand for including only 1/2, since it will just limit history and restore capabilities of HA

@kellerza
Copy link
Member

kellerza commented Dec 6, 2016

Ah, ok, only saw @balloob 's reply now

@kellerza
Copy link
Member

kellerza commented Dec 6, 2016

@balloob I see this as a similar property to visible. On a per entity level, the entity's programmer, component doing load_platform, a slave instance and the user through config has control over visible. If visible was under the frontend, only the user doing the config would have had control.

Examples:

  • template sensors. Generally they duplicate some other source of information we dont need to store, potential candidate to be OFF for recording by default.

  • Recorder functionality can take more states than simply on/off. For example, how would we handle only recording state changes and not attribute changes? (i.e. sun) Would we add another config option for CONF_STATE_ONLY (together with CONF_INCLUDE and CONF_EXCLUDE). It will get bloated soon

  • In a future when we have many micro HA instances, how do we exclude all Entities that originates from a slave instance? (Say slaves write their own DB records) Once again the central config seems inappropriate

  • Say I build a clock sensor (or some other high speed sensor) The HA architecture is fast enough to do this, this should really always be excluded from recording, but very nice to see in the frontend (clocks, immedaite power usage etc). Would be good for the creator of such a sensor to pick the default record level.

In summary, I have no objections adding it here, it answers a immediate question after all. Just feel that we could go down a path that can answer at least two of the questions raised in #4576 more elegantly.

@balloob
Copy link
Member

balloob commented Dec 7, 2016

I think that recording is not a concern that the core should even be aware of that it exists.

Entities should not have to care about anything outside of themselves, their state and their attributes. An entity has a few properties that go beyond this, but they are exceptions to give hints to our UI how they should be rendered: icon, entity_picture, hidden.

Recording can have so many use cases, allowing that to sip into the core will open the door for more attributes and properties to add. Sure, the idea that a sensor can tell that it will update frequently and a switch that it won't is already engrained in their domain. If the domain is not clearly representing the entity that it holds, the entity should be part of another component (or maybe it's own).

When it comes to a distributed setup of HASS: let's figure that out when we get there. But adding an origin like we already do for event (remote/local) would be an easy fix there. Entity properties would not be the answer there, because how would a local recorder know to record it if it is saved in the entity?

@kellerza
Copy link
Member

kellerza commented Dec 7, 2016

Maybe making it a property or attribute is the wrong approach, but if we make it an attribute only we can get the same behaviour

In a way we will already go the attribute route. In #4614 we will add an attribute called 'restored' to indicate to the frontend this state was restored from the DB. Part of this will be adding logic not to record entities with the 'restored' attribute or at least on the first shot.

In this way the core does not have to be concerned about it at all, but it is still an interaction between the Entity and the recorder that can be customized by the user through the attribute structure already in place

(The recorder is actually also the perfect place to clear the 'restored' attribute, but that is a side topic)

@balloob
Copy link
Member

balloob commented Dec 9, 2016

The difference with the restored property is that the entity is not in control of it. It's not part of the Entity class or any component base classes. It is something that gets added by the new restore_entity component when it restores a state.

A property will also introduce problems if you would want to use the recorder component and the InfluxDB component and record different things for each.

@kellerza
Copy link
Member

kellerza commented Dec 9, 2016

If we think about it as two different recorder components with different scope of entities to record, having local config makes sense.

Still would have liked a way to indicate that an entity is a spammer (like the sun's elevation attribute), but will ponder it a bit more...

@balloob
Copy link
Member

balloob commented Dec 9, 2016

@kellerza 👍

@mjg59 we got to a conclusion. Could you follow the same configuration format as history? You don't have to add all the options that the history implements (whitelist, blacklist domains), but at least follow the format so we can add the other options later if needed. Thanks!

@@ -17,7 +17,7 @@
import voluptuous as vol

from homeassistant.core import HomeAssistant, callback
from homeassistant.const import (EVENT_HOMEASSISTANT_START,
from homeassistant.const import (ATTR_ENTITY_ID, ATTR_DOMAIN, EVENT_HOMEASSISTANT_START,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line too long (88 > 79 characters)

@mjg59
Copy link
Contributor Author

mjg59 commented Dec 10, 2016

Done - domain support seemed easy enough so I added that as well.

Copy link
Member

@kellerza kellerza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, do you mind adding a test for this logic, especially the include part?

I think the way the recorder unit tests are structured it might be easier to set recorded.INSTANCE.include_* and test the event in your test case and not try the full init cycle.

Not squashing commits from here on will help subsequent reviews

CONF_EXCLUDE = 'exclude'
CONF_INCLUDE = 'include'
CONF_ENTITIES = 'entities'
CONF_DOMAINS = 'domains'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you import these from history? Or better yet if only in history move it to consts and import from there?

@@ -165,6 +184,20 @@ def __init__(self, hass: HomeAssistant, purge_days: int, uri: str) -> None:
self.db_ready = threading.Event()
self.engine = None # type: Any
self._run = None # type: Any
self.include = include
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to save both entities dict and the expanded lists

entity_id = event.data.get(ATTR_ENTITY_ID)
domain = event.data.get(ATTR_DOMAIN)

if self.exclude and entity_id in self.exclude_entities or \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only need to test if not in list

self.queue.task_done()
continue

if self.include and entity_id not in self.include_entities and \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Self.include should always be true here, because of voluptuous passing in a dict of empty lists

@balloob balloob assigned kellerza and unassigned balloob Dec 12, 2016
@kellerza
Copy link
Member

@mjg59 shout if you get stuck with the unit tests

Matthew Garrett and others added 2 commits January 3, 2017 23:34
For privacy reasons, it may be desirable to either exclude some entities
or domains from being recorded or limit the recorded devices to a specific
list. This patch adds support. The exclude list allows for domains and
entities to be blacklisted such that they will never be recorded. The include
list, if present, will avoid recording all entities and domains unless they
appear in the list.
@kellerza kellerza merged commit 2970196 into home-assistant:dev Jan 3, 2017
@mjg59
Copy link
Contributor Author

mjg59 commented Jan 3, 2017

@kellerza Thanks for fixing that up - I'm still out of town, but will fix up docs this week!

@Maxr1998
Copy link

Maxr1998 commented Jan 4, 2017

@mjg59 I updated the docs to your code, and added some extra stuff from history docs in a new PR. Hope my wording is correct ;)

@Maxr1998
Copy link

Maxr1998 commented Jan 6, 2017

@mjg59 I updated hass to the dev version, but it seems domain filtering doesn't work.. entities get filtered properly though.

Configuration:

recorder:
  db_url: !secret db_url
  purge_days: 3
  exclude:
    domains:
      - group
      - scene
      - updater
      - weblink
      - zone
    entities:
      - sensor.time
      - sensor.since_last_boot
      - sensor.disk_free_

image

@kellerza
Copy link
Member

kellerza commented Jan 6, 2017

@Maxr1998 thanks for the feedback. This line is probably wrong

Do you mind changing to domain = entity_id.split('.')[0] in your testing?

Let me try and work on a unit test as well before the next release

@Maxr1998
Copy link

Maxr1998 commented Jan 7, 2017

Sorry for the late feedback, I updated the code this morning, but also horribly broke my MySQL server while trying to reset the DB, so I need to fix that before I can report back.

@home-assistant home-assistant locked and limited conversation to collaborators Apr 30, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants