This is a teaching repository that demonstrates how Python solves real-world finance problems that Excel cannot.
Using a single Jupyter notebook as its entry point, the program ingests raw household transactions of a fictional couple (“Jack & Jill”), cleans and enriches them, produces summary statistics and charts, and writes one consolidated CSV. It also lets users investigate the result conversationally with an OpenAI model equipped with the Code Interpreter (Python) tool.
This repo is for finance professionals and students who want a clear, hands-on example of Python’s value beyond spreadsheets.
Run the top-level run() function to execute the full pipeline:
- Load raw data.
- Tag sources.
- Clean data.
- Combine separate sources into one table with:
  - A custom matching algorithm that reconciles irregularly billed Amazon product purchases with the corresponding credit-card charges, enabling precise categorization of otherwise generic “AMZN” expenses. This operation is not replicable in Excel.
- Label transactions with:
  - Vendor and category inference from bank statement descriptions using the OpenAI API. This program issues hundreds of calls asynchronously with `asyncio` for speed (sketched below).
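The labeling code itself lives in src/analysis/; the snippet below is only a minimal sketch of the concurrency pattern described above, assuming the official openai Python client. The prompt, helper names, and model string are illustrative, not the repo's actual code.

```python
# Minimal sketch of the concurrency pattern; helper names and prompt are illustrative, not the repo's code.
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()  # Reads OPENAI_API_KEY from the environment

async def label_description(description: str) -> str:
    """Ask the model for a vendor/category guess for one bank-statement description."""
    response = await client.chat.completions.create(
        model="gpt-5-mini",  # The repo's default model; the real setting lives in src/analysis/config/
        messages=[
            {"role": "system", "content": "Infer the vendor and spending category."},
            {"role": "user", "content": description},
        ],
    )
    return response.choices[0].message.content

async def label_all(descriptions: list[str]) -> list[str]:
    # asyncio.gather sends every request concurrently instead of waiting on each one in turn
    return await asyncio.gather(*(label_description(d) for d in descriptions))

# In a script: asyncio.run(label_all([...])); inside a notebook, use: await label_all([...])
```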
Outputs
- Statistics: Total spending and spending by account
- Charts: Spending by spender, category, and vendor
- CSV: One fully tagged / cleaned / combined / labeled dataset for downstream analysis
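As a rough illustration of the kind of summary these outputs contain, here is a hedged sketch of computing spending by category from the consolidated CSV with pandas; the file path and column names are placeholders, not the program's actual output schema.

```python
# Illustrative only: the path and column names are placeholders, not the pipeline's real schema.
import pandas as pd

df = pd.read_csv("data/combined.csv")  # the consolidated CSV written by run() (placeholder path)
by_category = df.groupby("category")["amount"].sum().sort_values(ascending=False)
print(by_category)  # total spending per category
by_category.plot(kind="bar", title="Spending by category")
```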
After the pipeline runs, users can chat with the data (inside the notebook) using an OpenAI model (itself equipped with Python) to answer ad hoc analytical questions.
Python is uniquely useful for enhancing and automating data analysis. This repo demonstrates the following advantages:
- Automation: Repeatably execute a complex workflow in one call (`run()`)
- Custom Logic: Solve a difficult matching problem with a purpose-built algorithm that spreadsheets cannot replicate
- Low-Level Control of AI: Deploy LLMs with precision for reliable results tailored to your needs
- Speed: Process hundreds of API calls in parallel with `asyncio`
- Quality: Iterate using robust GitHub version control and flag problems with automatic testing
- Modularity: Combine bite-sized components into infinitely complex (and capable) systems
For a thorough comparison of Python and its alternatives, see the results of our research in this 14-minute video on our YouTube channel.
```
data/              # sample Jack & Jill inputs and program outputs
notebooks/
    demo.ipynb     # start here
src/
    analysis/
        config/    # settings (e.g. model choice)
        inspect/   # chat interface & helpers for notebook analysis
        run/       # the data processing pipeline
        utilities/ # small, focused helpers
tests/             # automated tests
```
The codebase is composed of many small, minimally complex files to make each piece approachable for non-programmers.
To run the program yourself, you will need to:
- Create a free OpenAI developer account by visiting OpenAI's developer site,
- Add your payment information to your OpenAI account by going to Your profile > Billing,
- Get an OpenAI API key by going to Your profile > API keys, and
- Store your key in a GitHub Codespaces secret named OPENAI_API_KEY (it is exposed to the codespace as an environment variable) by logging into GitHub and going to Settings > Codespaces > Secrets.
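Because the secret reaches the notebook as an environment variable, a quick sanity check from Python looks like the following; this snippet is just standard-library usage, not part of the repo.

```python
import os

# The OpenAI client libraries pick up this variable automatically.
assert os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY is not set in this environment"
```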
Running the whole program once (including asking one chat question) with GPT-5 mini (the default model) costs about $0.03. Loading $5.00 onto your OpenAI developer account will give you more than enough credit to explore this repo.
Install the program by opening notebooks/demo.ipynb and executing the first code cell:

```
pip install ..
```

To run the data processing pipeline, execute:

```python
import analysis

analysis.run()  # Loads, tags, cleans, combines, labels, prints stats, shows charts, and writes a CSV
```

The function prints summary stats, renders charts, and writes one consolidated CSV containing the processed dataset (which the chat tool uses to answer questions).
The Chat object allows you to ask questions of the CSV with an OpenAI model that can execute Python against the file. Each Chat instance represents one continuous conversation.
```python
from analysis.inspect import Chat

chat = Chat()  # Uses the pipeline’s CSV output
chat.msg("What is the combined spending in the top three categories?")
chat.msg("How about the top five?")
```

To change the model, adjust the configuration (see next section).
In src/analysis/config/, you can find program settings such as:
- LLM model selection
- LLM instructions
- Expense categories
- Chart modifiers
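The real values live in src/analysis/config/; purely as a hypothetical illustration of what such a settings module can look like, the names and values below are invented, not the repo's.

```python
# Hypothetical settings module; the actual names and values are in src/analysis/config/.
MODEL = "gpt-5-mini"  # LLM used for labeling and chat
LABEL_INSTRUCTIONS = "Assign each transaction a vendor and one category from the list."
CATEGORIES = ["Groceries", "Dining", "Transport", "Housing", "Entertainment", "Other"]
CHART_STYLE = {"figsize": (10, 6)}  # modifiers applied to generated charts
```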
The analysis.inspect module includes the load_matcher() function for reviewing the Amazon reconciliation. This function loads the Matcher instance output by the run() function. This Matcher instance has a counter attribute that records the activity of the matching algorithm:
```python
from analysis.inspect import load_matcher

matcher = load_matcher()
matcher.counter  # A record of the number of charges reconciled with each matching rule
```

This is an educational aid for understanding the custom algorithm.
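The actual rules are part of the pipeline; the sketch below only illustrates the general idea of a matching rule, pairing an Amazon order with a card charge of the same amount inside a small date window and tallying which rule fired. The rule name, tolerance, and field names are invented for illustration.

```python
# Simplified illustration of the idea behind a matching rule; not the repo's implementation.
from collections import Counter
from datetime import timedelta

counter = Counter()  # mirrors the role of Matcher.counter: how often each rule fires

def match_exact_amount(order: dict, charges: list[dict], window_days: int = 7) -> dict | None:
    """Pair an Amazon order with a card charge of equal amount posted within a few days."""
    for charge in charges:
        same_amount = charge["amount"] == order["total"]
        close_in_time = abs(charge["date"] - order["date"]) <= timedelta(days=window_days)
        if same_amount and close_in_time:
            counter["exact_amount_within_window"] += 1
            return charge
    return None
```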
- API cost: Labeling uses hundreds of OpenAI API calls. Changing the model will change the cost of running the program. You may want to review OpenAI's pricing page before adjusting this setting.
- Privacy: This program only sends the fields necessary for labeling to the OpenAI API. The included dataset is fictional.
- Speed: The pipeline uses `asyncio` to issue requests concurrently. If you hit rate limits, wait a minute before running the program again.
PyFi students may use the contents of this repository for private, personal projects but may not publicly redistribute any part of this repository (or derivatives) or use any part of this repository for commercial projects. For full license details, see this repository's LICENCE.txt.