Loom provides a structured way to weave together ideas and concepts using information and time constructs.
It offers a pragmatic approach to modeling complex realities.
- Lower mental load: Abstract away boilerplate and repetitive tasks, letting developers focus on logic.
- Promote simple, elegant code: Prioritize clarity and simplicity over exhaustive features. We are willing to sacrifice capability for a simpler, more focused codebase. For example, Loom is intentionally built only for MongoDB to avoid the complexity of supporting multiple database paradigms. Supporting other databases would likely compromise this simplicity.
- Make the safe way the easy way: The framework guides developers away from common pitfalls, especially in data and time management. While it adds friction to unsafe practices, it does not build exhaustive guardrails at the cost of simplicity.
For detailed usage examples and comprehensive guides, see USAGE.md.
Loom aims to minimize the cognitive load on developers. While Python offers endless ways to accomplish tasks, Loom provides a collection of harmonized approaches that follow our philosophy.
It promotes, and glues together, mental models that complement one another and align with our needs.
The core of Loom's information management is a declarative persistence layer built on top of MongoDB, Pydantic, and PyMongoArrow.
PyMongoArrow is a PyMongo extension that serves as the essential bridge between MongoDB and the Arrow ecosystem. Its purpose is to load MongoDB query result-sets directly into high-performance analytical structures. It is, in fact, the "recommended way to materialize MongoDB query result-sets as contiguous-in-memory typed arrays suited for in-memory analytical processing applications." PyMongoArrow can materialize data into several key data structures favored by data scientists and analysts:
- Apache Arrow tables
- NumPy arrays
- Pandas DataFrames
- Polars DataFrames
By providing a direct conversion path to these industry-standard formats, PyMongoArrow dramatically simplifies the data access layer for analytical applications built on top of MongoDB.
The data flow begins with a query to a MongoDB database. This design bypasses the costly, row-by-row object construction common in traditional database drivers. Instead, PyMongoArrow materializes the entire result set directly into the Arrow columnar format in memory. The critical outcome is the elimination of the data serialization and deserialization steps that plague traditional data transfer architectures. By creating Arrow-native structures directly from the database results, applications can fully leverage the format's support for zero-copy reads. This directly enables the "lightning-fast data access" that is Arrow's core promise, removing a significant performance bottleneck and creating a highly efficient pipeline for in-memory computation.
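To make this pipeline concrete, here is a minimal sketch using PyMongoArrow directly (outside of Loom); the database, collection, and field names are illustrative:

```python
from pymongo import MongoClient
from pymongoarrow.api import Schema, find_arrow_all

client = MongoClient()  # assumes a local MongoDB instance
coll = client.my_app.users

# Materialize the query result-set directly as an Arrow table;
# no per-document Python dicts are constructed along the way.
schema = Schema({"name": str, "age": int})
table = find_arrow_all(coll, {"age": {"$lt": 40}}, schema=schema)

# Downstream conversions can reuse the same Arrow buffers.
df = table.to_pandas()
print(df.head())
```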
Getting started with Loom is simple. Define your model, decorate it, and start interacting with the database.
```python
import loom as lm
from loom.info import Persistable, declare_persist_db, StrLower
# 1. Define a Pydantic model and make it Persistable
# The `_id`, `created_at`, and `updated_time` fields are handled automatically.
@declare_persist_db(db_name="my_app", collection_name="users")
class User(Persistable):
    name: str
    email: StrLower  # StrLower is an annotated str type normalized to lowercase (query and filter inputs are lowercased as well)
    age: int
# 2. Create and save a new user
new_user = User(name="Alice", email="alice@example.com", age=30)
new_user.persist()
print(f"Saved user with ID: {new_user.id}")
# 3. Load a user from the database using its ID
retrieved_user = User.from_id(new_user.id)
if retrieved_user:
print(f"Found user: {retrieved_user.name}")
# 4. Update the user's age
retrieved_user.age = 31
retrieved_user.persist()
print(f"User's updated time: {retrieved_user.updated_time}")
# 5. Load multiple users with a filter
young_users = User.filter(lm.fld['age'] < 40).load_many()
print(f"Found {len(young_users)} users under 40.")
# 6. Create other users to persist multiple documents at once
user_bob = User(name="Bob", email="bob@example.com", age=25)
user_charlie = User(name="Charlie", email="charlie@example.com", age=35)
User.persist_many([user_bob, user_charlie])
# 7. Load one user with a filter
bob_user = User.filter(lm.fld['email'] == "BOB@example.COM").load_one()
if bob_user:
print(f"Found Bob: {bob_user.name} with email: {bob_user.email}")Persistable: Your data models inherit fromlm.Persistableto gain database-aware capabilities.@declare_persist_db: A class decorator that links your model to a specific MongoDB database and collection. It keeps your persistence logic clean and co-located with your data definition.- Specialized Persistence Models:
LedgerModel: For append-only data. Everypersist()call creates a new document, ensuring an immutable history.TimeSeriesLedgerModel: ExtendsLedgerModelfor use with MongoDB's native time-series collections, configured with the@declare_timeseriesdecorator.
- Fluent Query API: Loom provides a fluent and chainable API for building queries, starting with
YourModel.filter()orYourModel.aggregation(). This returns aLoadDirectiveobject that you can use to build and execute your query. - Expressive Filtering with
Model.fields(): TheYourModel.fields()class method returns a dictionary-like object that allows you to create filter expressions in a more Pythonic way (e.g.,User.fields()['age'] < 40). - Rich Querying and Loading: The
LoadDirectiveobject provides a rich set of methods for loading data, includingload_one,load_many,load_latest,exists, and methods for loading data intopandas,polars, andpyarrowdata structures. - Bulk Operations: Use
persist_manyandinsert_dataframeto efficiently save multiple model instances or a whole DataFrame at once.
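As a sketch of the append-only behavior described above (assuming `LedgerModel` is importable from `loom.info` and shares `Persistable`'s decorator and `persist()` API; the model and collection names are illustrative):

```python
from loom.info import LedgerModel, declare_persist_db

# Illustrative append-only ledger; every persist() appends a new document.
@declare_persist_db(db_name="my_app", collection_name="price_ticks")
class PriceTick(LedgerModel):
    symbol: str
    price: float

tick = PriceTick(symbol="ACME", price=101.5)
tick.persist()      # writes document #1
tick.price = 102.0
tick.persist()      # writes document #2; document #1 is preserved as history
```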
Loom 2.0 introduces a new fluent API for building and executing queries. This API is designed to be more expressive, more Pythonic, and easier to use than the previous query API. It borrows conventions from the Polars API so the learning curve feels familiar.
To start building a query, use the `filter()` or `aggregation()` class methods on your `Persistable` model. These methods return a `LoadDirective` object, which you can use to build and execute your query.
```python
# Start a query with a filter
directive = User.filter(lm.fld['age'] > 30)
# Start a query with an aggregation
directive = User.agg(lm.AggregationStages().group({"_id": "$name", "sum_age": {"$sum": "$age"}}))
```

The `LoadDirective` object provides a number of methods for building queries, including:
- `filter(filter)`: Add a filter to the query.
- `sort(sort)`: Add a sort to the query.
- `limit(limit)`: Add a limit to the query.
- `aggregation(aggregation)`: Add an aggregation pipeline to the query.
These methods are chainable, so you can build complex queries in a single expression.
```python
# Build a query with a filter, sort, and limit
users = User.filter(lm.fld['age'] > 30).sort('age', descending=True).limit(10).load_many()
```

The `Model.fields()` method provides a more Pythonic way to create filter expressions (the examples below use the `lm.fld` helper). You can use standard Python comparison operators to create filters.
```python
# Find users with age between 30 and 40
users = User.filter((lm.fld['age'] >= 30) & (lm.fld['age'] <= 40)).load_many()
# Find users with name 'Alice' or 'Bob'
users = User.filter(lm.fld['name'].is_in(['Alice', 'Bob'])).load_many()
```

Once you have built your query, you can use one of the `load_*` methods to execute it and get the results.
- `load_one()`: Load a single document.
- `load_many()`: Load multiple documents.
- `load_latest()`: Load the most recently updated document.
- `exists()`: Check if a document exists.
- `load_dataframe()`: Load the results into a pandas DataFrame.
- `load_polars()`: Load the results into a polars DataFrame.
- `load_table()`: Load the results into a pyarrow Table.
```python
# Load a single user
user = User.filter(lm.fld['name'] == 'Alice').load_one()
# Load all users into a pandas DataFrame
df = User.filter().load_dataframe()
# Load all users into a polars DataFrame
polars_df = User.filter().load_polars()
# Load all users into a pyarrow Table
table = User.filter().load_table()
```

Loom uses Python's `Annotated` type to attach special behaviors to your model fields, reducing boilerplate and ensuring consistency.
- `TimeInserted` & `TimeUpdated`: Automatically manage `created_at` and `updated_time` timestamps. `TimeInserted` is set once on creation, and `TimeUpdated` is refreshed on every `persist()` call.
- `StrUpper` & `StrLower`: Automatically normalize string fields to uppercase or lowercase.
- `IncrCounter`: An integer type for atomic increments. Use the `+=` operator on the field, and Loom will translate it into an `$inc` operation in the database (see the sketch below).
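A sketch of these annotated types in a model definition (assuming `StrUpper` and `IncrCounter` are importable from `loom.info` alongside `StrLower`; the model, field, and collection names are illustrative):

```python
from loom.info import Persistable, declare_persist_db, StrUpper, IncrCounter

# Illustrative model exercising the annotated field types.
@declare_persist_db(db_name="my_app", collection_name="tickers")
class Ticker(Persistable):
    symbol: StrUpper          # "acme" is normalized and stored as "ACME"
    visit_count: IncrCounter = 0

ticker = Ticker(symbol="acme")
ticker.visit_count += 1       # translated into an atomic $inc on persist()
ticker.persist()
```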
- Support for MongoDB only.
- Secret management is handled via `google-cloud-secret-manager`, local `keyring`, or environment variables.
Loom provides a structured way to reason about time through `TimeFrame` objects, which represent discrete, human-understandable intervals.
- `TimeFrame`: The base class for all time frames. It defines a `floor` (inclusive) and a `ceiling` (exclusive).
- Resolutions: Loom provides concrete `TimeFrame` implementations for common resolutions: `HourlyFrame`, `DailyFrame`, `WeeklyFrame`, `MonthlyFrame`, `QuarterlyFrame`, and `YearlyFrame`.
- Timezone Aware: Time frames are timezone-aware, defaulting to UTC but easily configurable for any timezone.
```python
import loom as lm
# Get the current daily frame in US/Eastern time
today_eastern = lm.DailyFrame.create(tzone=lm.EASTERN_TIMEZONE)
print(f"Today (ET): {today_eastern.get_pretty_value()}")
print(f"Floor: {today_eastern.get_floor()}")
print(f"Ceiling: {today_eastern.get_ceiling()}")
# Get the previous frame
yesterday_eastern = today_eastern.get_previous_frame()
print(f"Yesterday (ET): {yesterday_eastern.get_pretty_value()}")
# Get the weekly frame for a specific moment
weekly_frame = lm.WeeklyFrame.create(moment=yesterday_eastern.floor)
print(f"Weekly Frame: {weekly_frame.get_pretty_value()}")Moment: A corePersistablemodel that represents a single, timestamped event or data point, identified by asymbol.MomentWindow: A container for a sequence ofMomentobjects, sorted by time. It provides powerful methods for analysis, such as:sliding_window(size): Yields consecutive windows of a fixed number of moments.sliding_time_cone(past_size, future_size): Yields a tuple of (past_window, future_window), ideal for creating training data for predictive models.
- `insight` Toolkit: A collection of Pydantic models for performing statistical analysis on time-series data (a sketch follows below):
  - `Spread`: Calculates percentiles, standard deviation, and mean.
  - `Regression`: Performs linear regression to find the slope and statistical significance of a trend.
  - `PdfValue`: Uses Kernel Density Estimation (KDE) to find the most likely value (peak) and confidence interval of a dataset.
  - `Trending`: A composite model that combines `Spread`, `Regression`, and `PdfValue` for a comprehensive trend analysis.
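A heavily hedged sketch of the toolkit (the constructors and keyword arguments below are assumptions inferred from the descriptions above, not a confirmed API):

```python
import loom as lm

# A small series of observations to analyze.
values = [1.0, 1.2, 1.1, 1.4, 1.6, 1.5, 1.8]

spread = lm.Spread(values=values)      # percentiles, standard deviation, mean
trending = lm.Trending(values=values)  # composite: Spread + Regression + PdfValue
print(spread, trending)
```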