TidyBear

A tidier approach to pandas.

This package began as a collection of functions, routines, and processes that I found myself repeating often. It has since grown into a project to work my way through the tidyverse and reimplement my favorite tidy features in Python. It does not aim to be a better experience for every pandas task, just a different one that sometimes feels more natural to me. I hope something here is useful to you.

Installation

pip install tidybear

Usage

import pandas as pd
import tidybear as tb

Verbs

# rename columns
tb.rename(data, old="new")

# select columns
tb.select(data, ["col1", "col2"])

# count number of rows across multiple columns
tb.count(data, ["col1", "col2"])

# pivot long to wide or wide to long
tb.pivot_longer(data, ["val1", "val2"], names_to="val_type")
tb.pivot_wider(data, names_from="val_type", values_from="value")

# slice rows
tb.slice_max(data, order_by="val1", n=10)
tb.slice_min(data, order_by="val1", n=10, groupby="col1")

# join dataframes
tb.left_join(data1, data2, "colA")  # use "colA" as key
tb.right_join(data1, data2, col1A="col1B")  # use "col1A" from left and "col1B" from right

tb.cross_join(data1, data2)
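
For readers coming from plain pandas, here is a rough sketch of what several of these verbs correspond to. It uses only pandas calls, not tidybear itself. The sample frames, the `value` and `n` output column names, and the `lookup` table are assumptions made for illustration, so tidybear's actual output may differ in column names or ordering.

```python
import pandas as pd

# Hypothetical sample data for illustration
data = pd.DataFrame({
    "col1": ["a", "a", "b", "b"],
    "val1": [1, 4, 2, 3],
    "val2": [10, 40, 20, 30],
})
lookup = pd.DataFrame({"key": ["a", "b"], "label": ["alpha", "beta"]})

# Wide to long, similar in spirit to pivot_longer(..., names_to="val_type")
long = data.melt(
    id_vars="col1",
    value_vars=["val1", "val2"],
    var_name="val_type",
    value_name="value",  # assumed name for the value column
)

# Rows with the largest val1, similar in spirit to slice_max(order_by="val1", n=2)
top2 = data.nlargest(2, "val1")

# Row counts per group, similar in spirit to count(data, ["col1"])
counts = data.groupby(["col1"]).size().reset_index(name="n")

# Left join on differently named keys, then a cross join
joined = data.merge(lookup, how="left", left_on="col1", right_on="key")
crossed = data.merge(lookup, how="cross")
```

The point of the verbs above is that each of these becomes a single call with a consistent argument order, rather than a mix of `melt`, `nlargest`, `groupby`, and `merge`.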

Groupby and Summarise API

with tb.GroupBy(df, "group_var") as g:
    g.n()
    g.sum("value", name="total_value")
    g.n_distinct("ids", name="n_unique_ids")

    summary = g.summarise()
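
As a point of comparison, the same summary can be sketched in plain pandas with named aggregation. This is not how tidybear implements `GroupBy`. The sample data, the default column name `n` for the row count, and returning `group_var` as a regular column are all assumptions.

```python
import pandas as pd

# Hypothetical sample data for illustration
df = pd.DataFrame({
    "group_var": ["x", "x", "y"],
    "value": [1.0, 2.0, 5.0],
    "ids": [101, 101, 102],
})

summary = (
    df.groupby("group_var")
    .agg(
        n=("value", "size"),              # rows per group (name "n" is assumed)
        total_value=("value", "sum"),     # sum of "value" per group
        n_unique_ids=("ids", "nunique"),  # distinct "ids" per group
    )
    .reset_index()  # keep group_var as a column (assumed)
)
```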

TidySelectors

everything() - Select all columns
last_col() - Select the last column
first_col() - Select the first column
contains(pattern) - Select columns that contain the literal string
matches(pattern) - Select columns that match the regular expression pattern
starts_with(pattern) - Select columns that start with the literal string
ends_with(pattern) - Select columns that end with the literal string
num_range - Select columns that match a numeric range like x01, x02, x03

These can be used in a variety of tidybear verbs:

from tidybear.selectors import contains, everything

# select all columns that contain "foo"
tb.select(data, contains("foo"))

# pivot all columns to long format
tb.pivot_longer(data, everything())
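
To make the difference between the selectors concrete, here is a small sketch that picks the same kinds of columns using pandas string methods on the column index. It only approximates the behaviour described in the list above and does not call tidybear. The column names are made up for illustration.

```python
import pandas as pd

# Hypothetical column names for illustration
data = pd.DataFrame(columns=["id", "foo_a", "b_foo", "x01", "x02", "bar"])
cols = data.columns

# Literal substring match, like contains("foo")
contains_foo = list(cols[cols.str.contains("foo", regex=False)])

# Regular expression match, like matches(r"^x\d+$")
matches_x = list(cols[cols.str.contains(r"^x\d+$", regex=True)])

# Literal prefix and suffix, like starts_with("foo") and ends_with("foo")
starts_foo = list(cols[cols.str.startswith("foo")])
ends_foo = list(cols[cols.str.endswith("foo")])

# Positional selection, like last_col()
last = [cols[-1]]
```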

You can also negate these, so if you wanted everything except one column, you could do:

from tidybear.selectors import last_col

tb.select(data, -last_col())
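
For comparison, a plain pandas sketch of the same "everything except" idea is shown here. The frame and the `selected` list are hypothetical, and this is not tidybear's implementation of negation.

```python
import pandas as pd

# Hypothetical sample data for illustration
data = pd.DataFrame({"a": [1], "b": [2], "c": [3]})

# Everything except the last column, by position
without_last = data.iloc[:, :-1]

# More generally: drop whichever columns a selection picked
selected = ["c"]
without_selected = data.drop(columns=selected)
```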

Coming Soon (maybe)

Method chaining
Tribbles!
