Feature/abstract data collector #156
Merged
adamamer20
merged 40 commits into
projectmesa:main
from
Ben-geo:feature/abstract-data-collector
Jun 8, 2025
7799b45
abstract
Ben-geo ad86972
list removed
Ben-geo 8ec3dd5
descriptions
Ben-geo 2a9ef69
fleshed out flush
Ben-geo 50a59f7
ent
Ben-geo 18028e6
reset functionality
Ben-geo dd24042
ent
Ben-geo 2e3bc0f
removed register stats func
Ben-geo 0b19475
doc fixes
Ben-geo ec04fd4
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] bc6e9b1
more descriptive
Ben-geo 5847323
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 444a863
resolve
Ben-geo 2ac2b23
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 83724f4
removed doc
Ben-geo 7a77cf2
add doc to flush
Ben-geo 1f2a748
removed load_data
Ben-geo 03d4784
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 870a7d6
condtitional collect
Ben-geo 8d3643c
trigger default is pass
Ben-geo a88d35b
added seed and fixed docs
Ben-geo 063adfd
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 3ced8d3
adding pre-commit and ruff to dev dependencies
adamamer20 9cac504
Merge branch 'main' into feature/abstract-data-collector
adamamer20 9a05e68
precommit
Ben-geo 7b32764
Merge branch 'feature/abstract-data-collector' of https://github.com/…
adamamer20 3d2afb3
Merge branch 'main' of https://github.com/projectmesa/mesa-frames int…
adamamer20 cb50066
Merge branch 'main' into feature/abstract-data-collector
adamamer20 aa8da74
Merge branch 'main' of https://github.com/projectmesa/mesa-frames int…
adamamer20 7843a76
Merge branch 'feature/abstract-data-collector' of https://github.com/…
adamamer20 65277f2
fix: uv.lock was outdated
adamamer20 0b55cb6
Merge branch 'main' into feature/abstract-data-collector
adamamer20 a14af67
spell check
Ben-geo 108bfbb
periods
Ben-geo 15e1585
precommit
Ben-geo 6be3274
uv
Ben-geo efdd04d
Merge branch 'main' into feature/abstract-data-collector
Ben-geo fa788e9
suggested changes
Ben-geo 5123bb7
sync
Ben-geo 1329439
Merge branch 'main' into feature/abstract-data-collector
adamamer20
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,210 @@ | ||
""" | ||
Abstract base classes for data collection components in mesa-frames. | ||
|
||
This module defines the core abstractions for data collection in mesa-frames. | ||
It provides a standardized interface for collecting model- and agent-level | ||
data during simulation runs, supporting flexible triggers, custom statistics, | ||
and optional external storage. | ||
|
||
Classes: | ||
AbstractDataCollector: | ||
An abstract base class defining the structure and core logic for | ||
all data collector implementations. It supports flexible reporting | ||
of model and agent attributes, conditional data collection using | ||
triggers, and pluggable backends for storage. | ||
|
||
These classes are designed to be subclassed by concrete implementations that | ||
handle the specifics of data collection and storage such as in-memory, CSV, | ||
or database-backed collectors, potentially using Polars for high-performance | ||
tabular operations. | ||
|
||
Usage: | ||
These classes should not be instantiated directly. Instead, they should be | ||
subclassed to create a concrete DataCollector: | ||
|
||
from mesa_frames.abstract.datacollector import AbstractDataCollector | ||
|
||
class DataCollector(AbstractDataCollector): | ||
def _collect(self): | ||
# Implementation using Polars DataFrame to collect model and agent data | ||
... | ||
|
||
@property | ||
def data(self): | ||
# Returns the data currently in memory | ||
... | ||
|
||
def _flush(self): | ||
# Persists collected data to the configured storage backend | ||
... | ||
|
||
For more detailed information on each class, refer to their individual docstrings. | ||
""" | ||
|
||
from abc import ABC, abstractmethod | ||
from typing import Any, Literal | ||
from collections.abc import Callable | ||
from mesa_frames import ModelDF | ||
import polars as pl | ||
|
||
|
||
class AbstractDataCollector(ABC): | ||
""" | ||
Abstract Base Class for Mesa-Frames DataCollector. | ||
|
||
This class defines methods for collecting data from both model and agents. | ||
Subclasses must implement the logic for the abstract methods. | ||
""" | ||
|
||
_model: ModelDF | ||
_model_reporters: dict[str, Callable] | None | ||
_agent_reporters: dict[str, str | Callable] | None | ||
_trigger: Callable[..., bool] | ||
_reset_memory: bool | ||
_storage_uri: Literal["memory:", "csv:", "postgresql:"] | ||
_frames: list[pl.DataFrame] | ||
|
||
def __init__( | ||
self, | ||
model: ModelDF, | ||
model_reporters: dict[str, Callable] | None = None, | ||
agent_reporters: dict[str, str | Callable] | None = None, | ||
trigger: Callable[[Any], bool] | None = None, | ||
reset_memory: bool = True, | ||
storage: Literal["memory:", "csv:", "postgresql:"] = "memory:", | ||
): | ||
""" | ||
Initialize a DataCollector. | ||
|
||
Parameters | ||
---------- | ||
model : ModelDF | ||
The model object from which data is collected. | ||
model_reporters : dict[str, Callable] | None | ||
Functions to collect data at the model level. | ||
agent_reporters : dict[str, str | Callable] | None | ||
Attributes or functions to collect data at the agent level. | ||
trigger : Callable[[Any], bool] | None | ||
A function(model) -> bool that determines whether to collect data. | ||
reset_memory : bool | ||
Whether to reset in-memory data after flushing. Default is True. | ||
storage : Literal["memory:", "csv:", "postgresql:"] | ||
Storage backend URI (e.g. 'memory:', 'csv:', 'postgresql:'). | ||
""" | ||
self._model = model | ||
self._model_reporters = model_reporters or {} | ||
self._agent_reporters = agent_reporters or {} | ||
self._trigger = trigger or (lambda model: False) | ||
self._reset_memory = reset_memory | ||
self._storage_uri = storage or "memory:" | ||
self._frames = [] | ||
|
||
def collect(self) -> None: | ||
""" | ||
Trigger data collection. | ||
|
||
This method calls _collect() to perform actual data collection. | ||
|
||
Example | ||
------- | ||
>>> datacollector.collect() | ||
""" | ||
self._collect() | ||
|
||
def conditional_collect(self) -> None: | ||
""" | ||
Trigger data collection if condition is met. | ||
|
||
This method calls _collect() to perform the actual data collection | ||
only if the configured trigger returns True. | ||
|
||
Example | ||
------- | ||
>>> datacollector.conditional_collect() | ||
""" | ||
if self._should_collect(): | ||
self._collect() | ||
|
||
def _should_collect(self) -> bool: | ||
""" | ||
Evaluate whether data should be collected at current step. | ||
|
||
Returns | ||
------- | ||
bool | ||
True if the configured trigger condition is met, False otherwise. | ||
""" | ||
return self._trigger(self._model) | ||
|
||
@abstractmethod | ||
def _collect(self): | ||
""" | ||
Perform the actual data collection logic. | ||
|
||
This method must be implemented by subclasses. | ||
""" | ||
pass | ||
|
||
@property | ||
@abstractmethod | ||
def data(self) -> Any: | ||
""" | ||
Return the collected data currently in memory as a dataframe. | ||
|
||
Example | ||
------- | ||
>>> df = datacollector.data | ||
>>> print(df) | ||
""" | ||
pass | ||
|
||
def flush(self) -> None: | ||
""" | ||
Persist all collected data to configured backend. | ||
|
||
After flushing, the in-memory data buffer is cleared if | ||
`reset_memory` is True (the default). | ||
|
||
Use this method to save collected data. | ||
|
||
Example | ||
------- | ||
>>> datacollector.flush() | ||
>>> # Data is saved externally and in-memory buffers are cleared if configured | ||
""" | ||
self._flush() | ||
if self._reset_memory: | ||
self._reset() | ||
|
||
def _reset(self): | ||
""" | ||
Clear all collected data currently stored in memory. | ||
|
||
Use this to free memory or start fresh without affecting persisted data. | ||
""" | ||
self._frames = [] | ||
|
||
@abstractmethod | ||
def _flush(self) -> None: | ||
""" | ||
Implement persistence of collected data to external storage. | ||
|
||
This method must be implemented by subclasses to handle | ||
backend-specific data saving operations. | ||
""" | ||
pass | ||
|
||
@property | ||
def seed(self) -> int: | ||
""" | ||
Return the model seed. | ||
|
||
Example | ||
-------- | ||
>>> seed = datacollector.seed | ||
""" | ||
return self._model._seed | ||
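The split between the public methods (`collect`, `conditional_collect`, `flush`) and the abstract hooks (`_collect`, `_flush`, `data`) can be illustrated with a small standalone sketch. It does not import mesa-frames: `SketchCollector` is a trimmed, hypothetical stand-in for `AbstractDataCollector`, and `InMemoryCollector`, `ToyModel`, and the `persisted` list are assumptions made for illustration. Plain dicts are used in place of Polars DataFrames so the snippet runs with the standard library only.

```python
from abc import ABC, abstractmethod
from collections.abc import Callable
from typing import Any, Optional


class SketchCollector(ABC):
    """Trimmed stand-in for AbstractDataCollector (hypothetical, stdlib only)."""

    def __init__(
        self,
        model: Any,
        model_reporters: Optional[dict[str, Callable]] = None,
        trigger: Optional[Callable[[Any], bool]] = None,
        reset_memory: bool = True,
    ):
        self._model = model
        self._model_reporters = model_reporters or {}
        # Same default as in the diff: without a trigger, never collect conditionally.
        self._trigger = trigger or (lambda model: False)
        self._reset_memory = reset_memory
        self._frames: list[dict[str, Any]] = []

    def collect(self) -> None:
        self._collect()

    def conditional_collect(self) -> None:
        if self._trigger(self._model):
            self._collect()

    def flush(self) -> None:
        self._flush()
        if self._reset_memory:
            self._frames = []

    @abstractmethod
    def _collect(self) -> None: ...

    @abstractmethod
    def _flush(self) -> None: ...

    @property
    @abstractmethod
    def data(self) -> Any: ...


class InMemoryCollector(SketchCollector):
    """Hypothetical subclass: only the three abstract hooks are implemented."""

    def __init__(self, *args: Any, **kwargs: Any):
        super().__init__(*args, **kwargs)
        self.persisted: list[dict[str, Any]] = []  # stands in for an external backend

    def _collect(self) -> None:
        row = {name: fn(self._model) for name, fn in self._model_reporters.items()}
        self._frames.append(row)

    def _flush(self) -> None:
        self.persisted.extend(self._frames)

    @property
    def data(self) -> list[dict[str, Any]]:
        return list(self._frames)


class ToyModel:
    """Minimal model with a step counter (assumption, not ModelDF)."""

    def __init__(self) -> None:
        self.step_count = 0

    def step(self) -> None:
        self.step_count += 1


model = ToyModel()
collector = InMemoryCollector(
    model,
    model_reporters={"step": lambda m: m.step_count},
    trigger=lambda m: m.step_count % 2 == 0,  # collect on even steps only
)
for _ in range(4):
    model.step()
    collector.conditional_collect()

in_memory_before_flush = collector.data
collector.flush()  # persists the rows, then clears memory because reset_memory=True
```

In this sketch the subclass never overrides `collect`, `conditional_collect`, or `flush`; the trigger check and the reset-after-flush behaviour stay in the base class, which appears to be the intent of the abstract class in this diff.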