Wirehead

A caching system for scaling synthetic data generators, built on MongoDB

I. Installation

1. MongoDB Setup (For Development/Testing Only)


Important Note: The following instructions are for development and testing purposes only. For production deployments, please refer to the official MongoDB documentation for secure and proper installation guidelines.

a. Quick MongoDB Setup (Ubuntu):

sudo apt-get install gnupg curl
curl -fsSL https://www.mongodb.org/static/pgp/server-7.0.asc | \
   sudo gpg -o /usr/share/keyrings/mongodb-server-7.0.gpg \
   --dearmor
echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-7.0.gpg ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/7.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-7.0.list
sudo apt-get update
sudo apt-get install -y mongodb-org

# Run MongoDB
sudo systemctl start mongod

# Stop MongoDB
sudo systemctl stop mongod

b. Quick MongoDB Setup (macOS):

Install Homebrew if you haven't already:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

brew tap mongodb/brew
brew update
brew install mongodb-community@7.0

# Run MongoDB
brew services start mongodb-community@7.0

# Stop MongoDB
brew services stop mongodb-community@7.0
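
After starting MongoDB on either platform, you may want to confirm that something is listening before moving on. The following is a minimal sketch, assuming a local install on the default port 27017. `mongo_port_open` is a hypothetical helper and not part of wirehead. It only checks that a TCP connection can be opened; it does not verify that the listener is actually MongoDB.

```python
import socket


def mongo_port_open(host="localhost", port=27017, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration only. The defaults assume a
    local development install on MongoDB's default port.
    """
    try:
        # create_connection raises OSError (including timeouts) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("MongoDB port reachable:", mongo_port_open())
```

If this prints False right after a start command, the service may still be starting up or may have failed to launch.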

2. Create virtual environment

# Note:
# python version doesn't necessarily have to be 3.10
# but this gives better support for some generation pipelines

# Conda
conda create -n wirehead python=3.10
conda activate wirehead

# venv
python3.10 -m venv wirehead
source wirehead/bin/activate

3. Install wirehead:

git clone git@github.com:neuroneural/wirehead.git
cd wirehead
pip install -e .

4. Run the test

cd examples/unit
chmod +x test.sh
./test.sh

II. Usage

See examples/unit for a minimal example

1. Generator

import numpy as np
from wirehead import WireheadGenerator 

def create_generator():
    while True: 
        img = np.random.rand(256,256,256)
        lab = np.random.rand(256,256,256)
        yield (img, lab)

if __name__ == "__main__":
    brain_generator     = create_generator()
    wirehead_runtime    = WireheadGenerator(
        generator = brain_generator,
        config_path = "config.yaml" 
    )
    wirehead_runtime.run_generator()

2. Dataset

import torch
from wirehead import MongoheadDataset

dataset = MongoheadDataset(config_path = "config.yaml")

idx = [0] 
data = dataset[idx]
sample, label = data[0]['input'], data[0]['label']

III. Config guide

All wirehead configs live in YAML files, and a config file must be specified when declaring wirehead manager, generator, and dataset objects. For the system to work, all components must use the same config.

1. Basic configs:

MONGOHOST -- IP address or hostname for machine running MongoDB instance
DBNAME -- MongoDB database name
PORT -- Port for MongoDB instance. Defaults to 27017
SWAP_CAP -- Size cap for the read and write collections. A larger value means a larger cache and less frequent swaps. The total memory used by wirehead can be estimated as:
        SWAP_CAP * (size of yielded tuple) * 2
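
As a worked example of the formula above, here is a small sketch that estimates the footprint for the generator shown in the Usage section, which yields two 256x256x256 arrays. It assumes float64 data (8 bytes per element, which is what np.random.rand produces) and a SWAP_CAP of 10 chosen only for illustration. `wirehead_memory_bytes` is a hypothetical helper, not part of wirehead, and the estimate ignores any storage overhead added by MongoDB.

```python
import math


def wirehead_memory_bytes(swap_cap, sample_shapes, itemsize=8):
    """Estimate SWAP_CAP * (size of yielded tuple) * 2, in bytes.

    sample_shapes: one shape per array in the yielded tuple.
    itemsize: bytes per element (8 assumes float64).
    """
    tuple_bytes = sum(math.prod(shape) * itemsize for shape in sample_shapes)
    return swap_cap * tuple_bytes * 2


shapes = [(256, 256, 256), (256, 256, 256)]  # (img, lab) from the example
total = wirehead_memory_bytes(swap_cap=10, sample_shapes=shapes)
print(f"{total / 2**30:.1f} GiB")  # prints "5.0 GiB"
```

Each array is 128 MiB, so one tuple is 256 MiB, and 10 * 256 MiB * 2 gives 5 GiB.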

2. Advanced configs:

SAMPLE             -- Array of strings naming each element of the yielded data tuple
WRITE_COLLECTION   -- Name of write collection (generators push to this)
READ_COLLECTION    -- Name of read collection (dataset reads from this)
COUNTER_COLLECTION -- Name of counter collection for manager metrics
TEMP_COLLECTION    -- Name of temporary collection used for moving data during swap
CHUNKSIZE          -- Number of megabytes used for chunking data
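
To make the key names above concrete, the sketch below writes out a config as a Python dict and checks it for the basic keys. All values are placeholders chosen for illustration, not wirehead defaults (except PORT, which the list above gives as 27017). Treating MONGOHOST, DBNAME, and SWAP_CAP as required is an assumption, and `missing_basic_keys` is a hypothetical helper, not part of wirehead.

```python
# Assumption: these basic keys are required; PORT is optional because it
# is documented to default to 27017.
BASIC_KEYS = ("MONGOHOST", "DBNAME", "SWAP_CAP")


def missing_basic_keys(config):
    """Return the basic keys that are absent from a config dict."""
    return [key for key in BASIC_KEYS if key not in config]


# Placeholder values for illustration only.
example_config = {
    "MONGOHOST": "localhost",
    "DBNAME": "wirehead_test",
    "PORT": 27017,
    "SWAP_CAP": 10,
    "SAMPLE": ["input", "label"],
}

print(missing_basic_keys(example_config))  # prints []
```

The same keys and values would appear as top-level entries in config.yaml, in the same form as the SAMPLE line shown in the Generator example section.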

IV. Generator example

See a simple example in examples/unit/generator.py or a SynthSeg example in examples/synthseg/generator.py

Wirehead's WireheadGenerator object takes a Python generator, that is, the object returned by calling a generator function. Each item it yields must be a tuple of numpy arrays, and the number of arrays in the tuple must match the number of strings specified in SAMPLE in config.yaml.

1. Set SAMPLE in "config.yaml" (note the number of keys)

SAMPLE: ["a", "b"]

2. Create a generator function, which yields the same number of objects

def create_generator():
    while True: 
        a = np.random.rand(256,256,256)
        b = np.random.rand(256,256,256)
        yield (a, b)
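
Since a mismatch between SAMPLE and the yielded tuple is easy to introduce, a quick check before handing the generator to wirehead can help. This is a minimal sketch: `check_generator_matches_sample` is a hypothetical helper, not part of wirehead, and it consumes the items it inspects, so it is best run on a fresh generator created just for the check.

```python
from itertools import islice


def check_generator_matches_sample(generator, sample_keys, n_checks=1):
    """Raise ValueError if a yielded tuple's length differs from SAMPLE.

    Inspects (and consumes) the first n_checks items of the generator.
    """
    for item in islice(generator, n_checks):
        if len(item) != len(sample_keys):
            raise ValueError(
                f"generator yielded {len(item)} objects, "
                f"but SAMPLE names {len(sample_keys)}: {sample_keys}"
            )
    return True


def tiny_generator():
    # Stand-in for create_generator(); small values keep the check cheap.
    while True:
        yield (0.0, 1.0)


print(check_generator_matches_sample(tiny_generator(), ["a", "b"]))  # prints True
```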

3. Insert config file path and generator function into WireheadGenerator

generator = create_generator()
runtime = WireheadGenerator(
    generator = generator,
    config_path = "config.yaml" 
)

4. Press play

runtime.run_generator() # runs an infinite loop

Citation/Contact

This code is released under the MIT license.

If you have any questions specific to the Wirehead pipeline, please raise an issue or contact us at mdoan4@gsu.edu
