PaliVLA

This is a framework for training multimodal vision-language-action (VLA) models for robotics in JAX. It primarily supports PaliGemma for now, though more base models will be added in the future.

Installation

We develop with uv, but other environment managers should work fine. To install the dependencies, run:

uv venv
uv sync
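
Commands then need to run inside the environment that uv created. The following is a minimal sketch, assuming a POSIX shell and that uv placed the environment in its default directory, `.venv`, at the repository root. Neither assumption is stated by this repository, and the `VENV_DIR` variable is only illustrative.

```shell
# Assumption: uv created the environment at its default location, ./.venv
VENV_DIR=".venv"

if [ -f "$VENV_DIR/bin/activate" ]; then
    # Activate the environment so that `python` resolves to the synced interpreter
    . "$VENV_DIR/bin/activate"
    echo "activated: $VIRTUAL_ENV"
else
    echo "no environment found at $VENV_DIR; run 'uv venv' and 'uv sync' first"
fi
```

Alternatively, prefixing a command with `uv run` executes it in the project environment without activating it first.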

Training

To train a model, run:

python -m palivla.train --config_file palivla/configs/bridge_config.py

This repository is (for now) a fork of big_vision.

Citation

If you use PaliVLA in your own project, please cite this repository:

@misc{palivla,
  author       = {Kyle Stachowicz},
  title        = {PaliVLA},
  year         = {2024},
  url          = {https://github.com/kylestach/bigvision-palivla},
  note         = {GitHub repository}
}