🚀 Launching Data Engine for powerful pandas-like management of file datasets! 🚀
DagsHub is a platform where machine learning and data science teams can build, manage, and collaborate on their projects. With DagsHub you can:
- Version code, data, and models in one place. Use the free storage DagsHub provides, or connect your own cloud storage
- Track Experiments using Git, DVC or MLflow, to provide a fully reproducible environment
- Visualize pipelines, data, and notebooks in an interactive, diff-able, and dynamic way
- Label your data directly on the platform using Label Studio
- Share your work with your team members
- Stream and upload your data in an intuitive and easy way, while preserving versioning and structure.
DagsHub is built firmly around open, standard formats for your project. In particular:
- Git
- DVC
- MLflow
- Label Studio
- Standard data formats like YAML, JSON, CSV
Therefore, you can work with DagsHub regardless of your chosen programming language or frameworks.
This client library is meant to help you get started quickly with DagsHub. It is made up of two components: Experiment tracking, and Direct Data Access (DDA), which lets you stream and upload your data.
For more details on the different functions of the client, check out the docs segments:
Some functionality is supported only in Python.
To read about some of the awesome use cases for Direct Data Access, check out the relevant doc page.
pip install dagshub
Direct Data Access (DDA) functionality requires authentication, which you can easily do by running the following command in your terminal:
dagshub login
The easiest way to start using DagsHub is via the Python Hooks method. To do this:
- Your DagsHub project,
- Copy the following 2 lines of code into your Python code which accesses your data:
from dagshub.streaming import install_hooks
install_hooks()
- That’s it! You now have streaming access to all your project files.
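As a rough sketch of what the steps above look like in practice: once the hooks are installed, ordinary Python file access is meant to work on project files. The helper names `enable_streaming` and `count_lines` below are hypothetical, and the example assumes the code runs from inside your DagsHub project after `dagshub login`.

```python
def enable_streaming() -> bool:
    """Install the DagsHub streaming hooks if the client is available.

    Hypothetical helper: returns False when the `dagshub` package is not
    installed, so the rest of the script can still run on local files.
    """
    try:
        from dagshub.streaming import install_hooks
    except ImportError:
        return False
    install_hooks()  # the same call as in the two-line snippet above
    return True


def count_lines(path: str) -> int:
    """Count lines in a text file using the built-in open().

    Nothing here is DagsHub-specific: with the hooks installed, the same
    open() call is expected to also reach files that are streamed.
    """
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)
```

Call `enable_streaming()` once at the start of your script, then use `count_lines` (or any other code based on `open()`) with a path inside your project.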
🤩 Check out this colab to see an example of Data Streaming working end to end:
You can dive into the expanded documentation to learn more about data streaming, data upload, and experiment tracking with DagsHub.
To improve your experience, we collect analytics on client usage. If you want to disable analytics collection, set the DAGSHUB_DISABLE_ANALYTICS environment variable to any value.
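If you prefer to opt out from within a Python script rather than from your shell, a minimal sketch could look like this. The helper name `disable_dagshub_analytics` is hypothetical, and setting the variable before importing the client is an assumption made here to be safe, not documented behaviour.

```python
import os


def disable_dagshub_analytics(value: str = "1") -> None:
    """Set DAGSHUB_DISABLE_ANALYTICS for the current process.

    Any value is described as sufficient; "1" is just a convention.
    """
    os.environ["DAGSHUB_DISABLE_ANALYTICS"] = value


# Assumption: do this before the client is imported, so the setting is
# already in place by the time any DagsHub code runs.
disable_dagshub_analytics()
```

The variable set this way only applies to the current process and any child processes it starts.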
Made with 🐶 by DagsHub.