HalShami/Home-Occupancy-Simulation-Using-Machine-Learning

Home Occupancy Simulation Using Machine Learning

This project learns time-series patterns in IoT device usage and predicts future activity patterns.
The current FastAPI implementation accepts CSV data exported from Home Assistant and outputs YAML that is ready to be saved into automations.yaml in Home Assistant OS.

FastAPI_Service

A small FastAPI service that predicts daily on/off events for devices from historical CSV usage data and generates Home Assistant automation YAML as output.

This service is part of the GhostAI project. It provides a single POST endpoint (/predict/) that accepts a historical CSV, a date range, and a device id. It returns the predicted on/off events formatted as Home Assistant automation YAML, packaged in a JSON response and exposed with an attachment header for download.

Important Notes

The original Home Occupancy Simulation Using Machine Learning paper made its predictions with an XGBoost classifier over fixed time intervals of n minutes.
This version takes a different approach: it uses Kernel Density Estimation (KDE) to find the times at which activity is most probable, and generates automations for those times.

Contents / Key Modules

  • main.py
    • FastAPI application with the /predict/ endpoint.
    • Uploads the CSV to a temporary file, invokes preprocessing, runs prediction logic over each day in the requested date range, and generates YAML automations.
  • Preprocessing.py
    • Contains data-loading and cleaning utilities used to convert the raw CSV into the format expected by the prediction model.
  • KDE_Model.py
    • Implements a kernel density estimation–based algorithm to infer likely on/off minute ranges (including a fallback mechanism).
  • Yaml_Generator.py
    • Builds Home Assistant automation YAML from the predicted on/off timestamps.
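
To make the last step concrete, here is a minimal sketch of what one generated automation might look like. It is not the code in Yaml_Generator.py: the function name automation_yaml, the id and alias naming scheme, and the choice of a time trigger with the homeassistant.turn_on / homeassistant.turn_off services are all assumptions made for illustration. The sketch builds the YAML text by hand so that it has no dependencies; the actual module may well use PyYAML instead.

```python
def automation_yaml(device_id, action, at_time):
    """Render one time-triggered Home Assistant automation as YAML text.

    Hypothetical helper: `action` is "on" or "off", `at_time` is "HH:MM:SS".
    """
    if action not in ("on", "off"):
        raise ValueError("action must be 'on' or 'off'")
    slug = device_id.replace(".", "_")
    return (
        f"- id: ghost_{slug}_{action}_{at_time.replace(':', '')}\n"
        f"  alias: \"Simulated {action} for {device_id} at {at_time}\"\n"
        f"  trigger:\n"
        f"    - platform: time\n"
        f"      at: \"{at_time}\"\n"
        f"  action:\n"
        f"    - service: homeassistant.turn_{action}\n"
        f"      target:\n"
        f"        entity_id: {device_id}\n"
    )


# One predicted "on" event at 18:30 for the device used in the curl example.
print(automation_yaml("light.living_room", "on", "18:30:00"))
```

Entries of this shape can be appended to automations.yaml, one per predicted on or off event.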

Quickstart (local)

Prerequisites

  • Python 3.8+
  • pip

Install dependencies:

pip install fastapi uvicorn pandas pyyaml numpy scikit-learn

Run the service:

uvicorn main:app --reload --host 0.0.0.0 --port 8000

API available at:
http://127.0.0.1:8000

API

POST /predict/

Description:
Accepts historical CSV data and returns predicted on/off events as YAML automations.

Query Parameters

  • start_date — YYYY-MM-DD (required)
  • end_date — YYYY-MM-DD (required)
  • device_id — device identifier (required)

Form Data

  • file — CSV file upload (required)

Example:

curl -X POST "http://127.0.0.1:8000/predict/?start_date=2025-12-01&end_date=2025-12-07&device_id=light.living_room" -F "file=@history.csv" -H "accept: application/json"
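
The same request can be prepared from Python. The sketch below only builds the request URL from the three query parameters documented above; build_predict_url is a hypothetical helper name, and the check that end_date is not before start_date is an assumption about what the service expects rather than documented behavior.

```python
from datetime import date
from urllib.parse import urlencode


def build_predict_url(base_url, start_date, end_date, device_id):
    """Build the /predict/ URL with its three required query parameters."""
    if end_date < start_date:
        raise ValueError("end_date must not be earlier than start_date")
    query = urlencode(
        {
            "start_date": start_date.isoformat(),  # YYYY-MM-DD
            "end_date": end_date.isoformat(),      # YYYY-MM-DD
            "device_id": device_id,
        }
    )
    return f"{base_url.rstrip('/')}/predict/?{query}"


url = build_predict_url(
    "http://127.0.0.1:8000",
    date(2025, 12, 1),
    date(2025, 12, 7),
    "light.living_room",
)
print(url)
```

If the third-party requests package is installed, the CSV can then be uploaded as the file form field with requests.post(url, files={"file": open("history.csv", "rb")}).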

Expected CSV Format

Required / typical fields:

  • timestamp — ISO 8601 datetime
  • State — "on" or "off"
  • device_id (optional)
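
As a rough illustration of how a file in this format could be read, the sketch below parses the timestamp and State columns with the standard library and converts each event to a minute of the day. It is not the code in Preprocessing.py: load_events is a hypothetical name, skipping states other than "on" and "off" is an assumption, and the sample rows are made up for the example.

```python
import csv
import io
from datetime import datetime


def load_events(csv_text):
    """Split rows into minute-of-day lists for the "on" and "off" states."""
    events = {"on": [], "off": []}
    for row in csv.DictReader(io.StringIO(csv_text)):
        state = row["State"].strip().lower()
        if state not in events:
            continue  # assumption: ignore states such as "unavailable"
        ts = datetime.fromisoformat(row["timestamp"].strip())
        events[state].append(ts.hour * 60 + ts.minute)
    return events


# Made-up sample data in the documented column layout.
sample = """timestamp,State
2025-11-30T18:32:00,on
2025-11-30T23:05:00,off
2025-12-01T07:15:00,unavailable
"""
events = load_events(sample)
print(events)
```

Note that datetime.fromisoformat in Python 3.8 does not accept a trailing "Z"; timestamps exported in that form would need the suffix replaced with "+00:00" first.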

How Prediction Works (High Level)

  1. main.py saves the uploaded CSV to a temporary file and calls preprocessing.
  2. For each day in the selected range, the service:
    • Extracts temporal features.
    • Uses KDE to infer likely minute ranges for on/off transitions.
  3. The predicted on/off timestamps are passed to Yaml_Generator to produce automation-ready YAML.
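
The KDE step can be sketched as follows. This is a dependency-free illustration of the general technique, not the algorithm in KDE_Model.py: it fits a Gaussian kernel density over minute-of-day values, takes the densest minute as the peak, and expands around it while the density stays above a percentile threshold. The function names, the default bandwidth of 15 minutes, the 95th-percentile cutoff, and the sample data are all assumptions, and the repository's fallback mechanism is not reproduced. Because scikit-learn is listed as a dependency, the real module may rely on its KernelDensity estimator instead.

```python
import math


def gaussian_kde_minutes(samples, bandwidth=15.0):
    """Estimate a density value for every minute of the day (0..1439)."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    density = []
    for minute in range(1440):
        total = sum(
            math.exp(-0.5 * ((minute - s) / bandwidth) ** 2) for s in samples
        )
        density.append(norm * total)
    return density


def likely_range(density, percentile=95.0):
    """Return (start, peak, end): the run of minutes around the peak
    whose density is at or above the given percentile of all minutes."""
    ranked = sorted(density)
    cutoff = min(len(ranked) - 1, int(len(ranked) * percentile / 100.0))
    threshold = ranked[cutoff]
    peak = max(range(len(density)), key=density.__getitem__)
    start = end = peak
    while start > 0 and density[start - 1] >= threshold:
        start -= 1
    while end < len(density) - 1 and density[end + 1] >= threshold:
        end += 1
    return start, peak, end


# Made-up history: a light switched on between 18:00 and 18:20 on five days.
on_minutes = [1080, 1085, 1090, 1095, 1100]
density = gaussian_kde_minutes(on_minutes)
start, peak, end = likely_range(density)
print(f"peak at {peak // 60:02d}:{peak % 60:02d}, range {start}..{end}")
```

This sketch treats the day as a straight line from 00:00 to 23:59, so activity clustered around midnight would be split across the two ends; a circular kernel would be needed to handle that case.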

Tuning & Customization

Adjustable parameters include:

  • day_weight, month_weight
  • KDE bandwidth
  • percentile
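
The README does not say how day_weight and month_weight enter the model. One plausible reading, shown here purely as an assumption, is that historical events falling on the same weekday or in the same month as the target date count more heavily in the density estimate. The function sample_weight and its default values are hypothetical.

```python
from datetime import date


def sample_weight(event_day, target_day, day_weight=2.0, month_weight=1.5):
    """Weight a historical event by its similarity to the target date.

    Assumed scheme: multiply by day_weight when the weekday matches and
    by month_weight when the month matches; otherwise the weight is 1.0.
    """
    weight = 1.0
    if event_day.weekday() == target_day.weekday():
        weight *= day_weight
    if event_day.month == target_day.month:
        weight *= month_weight
    return weight


target = date(2025, 12, 1)  # a Monday
for event_day in (date(2025, 12, 8), date(2025, 11, 24), date(2025, 12, 2)):
    print(event_day, sample_weight(event_day, target))
```

Under this reading, larger weights make the predictions follow weekday or seasonal habits more closely, while weights of 1.0 treat all historical days equally.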

Testing & Debugging

  • Error responses include a traceback.
  • Use small CSV samples for debugging.
  • Add logging in preprocessing, KDE, and YAML generation modules.

Contributing

Contributions are welcome, including:

  • CSV parsing improvements
  • Prediction model adjustments
  • YAML structure customization
  • Home Assistant integration extensions

Citation

If you find this work helpful in your research, please cite:

APA citation:

Al-Shami, H. A. (2024). Home Occupancy Simulation Using Machine Learning. In K. Arai (Ed.), Proceedings of the Future Technologies Conference (FTC) 2024, Volume 1 (Lecture Notes in Networks and Systems, Vol. 1154). Springer, Cham. https://doi.org/10.1007/978-3-031-73110-5_33

BibTeX:

@inproceedings{alshami2024home,
  author    = {Al{-}Shami, H. A.},
  title     = {Home Occupancy Simulation Using Machine Learning},
  booktitle = {Proceedings of the Future Technologies Conference (FTC) 2024, Volume 1},
  editor    = {Arai, K.},
  series    = {Lecture Notes in Networks and Systems},
  volume    = {1154},
  publisher = {Springer},
  address   = {Cham},
  year      = {2024},
  doi       = {10.1007/978-3-031-73110-5_33}
}

License

This project is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
