Move datasets to MLDatasets.jl #22

Closed
4 tasks
darsnack opened this issue Apr 2, 2021 · 1 comment
Labels
good first issue Good for newcomers gsoc-proposal Good issues to tackle for GSoC proposals help wanted Contributions welcome!

Comments

@darsnack
Member

darsnack commented Apr 2, 2021

In the long term, we'd like most of the src/datasets code to move to MLDatasets.jl. To make this happen, MLDatasets.jl needs to be refactored so that it is more extensible and builds on top of LearnBase.jl. Below is the structure envisioned for MLDatasets.jl:

  1. Low-level API: structs for different types of I/O (e.g. FileDataset) that support reading from the underlying I/O via getobs and nobs from LearnBase.jl
  2. High-level API: specific datasets (e.g. CIFAR10) implemented using the low-level API
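
As a rough illustration of the two layers above, here is a minimal sketch. It assumes local stand-ins for `getobs` and `nobs` so that it runs without any packages; a real implementation would extend the LearnBase.jl functions instead. The `FileDataset` fields, the `loadfn` argument, and the `ToyDataset` type are hypothetical and only meant to show the shape of the idea, not the actual MLDatasets.jl design.

```julia
# Local stand-ins for the LearnBase.jl interface functions (assumption:
# real code would add methods to LearnBase.getobs / LearnBase.nobs).
function getobs end
function nobs end

# Hypothetical low-level container: one observation per file path.
struct FileDataset{F}
    loadfn::F              # function that turns a path into an observation
    paths::Vector{String}  # one file per observation
end

nobs(d::FileDataset) = length(d.paths)
getobs(d::FileDataset, i::Integer) = d.loadfn(d.paths[i])
getobs(d::FileDataset, is::AbstractVector{<:Integer}) = [getobs(d, i) for i in is]

# Hypothetical high-level dataset that delegates all I/O to the container.
struct ToyDataset
    files::FileDataset
    labels::Vector{Int}
end

nobs(d::ToyDataset) = nobs(d.files)
getobs(d::ToyDataset, i::Integer) = (getobs(d.files, i), d.labels[i])

# Usage: write three small text files and read them back as observations.
dir = mktempdir()
paths = map(1:3) do i
    p = joinpath(dir, "obs$i.txt")
    write(p, "sample $i")
    p
end
files = FileDataset(p -> read(p, String), paths)
data = ToyDataset(files, [10, 20, 30])
```

The point of the split is that `ToyDataset` never touches the file system itself; swapping `FileDataset` for another container with the same two functions would leave the high-level code unchanged.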

To achieve this goal, we need to complete the following stages:

  • Move data containers (e.g. FileDataset) to MLDatasets.jl
  • Move data container transformations (e.g. mapobs, groupobs, etc.) to MLDataPattern.jl (these transformations apply generically to any iterator of observations, not just data containers)
  • Refactor existing datasets in MLDatasets.jl to utilize the low-level APIs
  • Move FastAI.jl datasets to MLDatasets.jl
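
To make the idea of a generic transformation concrete, here is a minimal sketch of a lazy `mapobs`-style wrapper. It again uses local stand-ins for `getobs` and `nobs` rather than the LearnBase.jl functions, and the names `MappedData` and `mapobs_sketch` are hypothetical; the actual `mapobs` may be implemented differently.

```julia
# Local stand-ins for the LearnBase.jl interface functions (assumption).
function getobs end
function nobs end

# Treat a plain vector as the simplest possible data container.
nobs(v::AbstractVector) = length(v)
getobs(v::AbstractVector, i::Integer) = v[i]

# Hypothetical lazy wrapper: applies `f` only when an observation is requested.
struct MappedData{F,D}
    f::F
    data::D
end

mapobs_sketch(f, data) = MappedData(f, data)

nobs(m::MappedData) = nobs(m.data)
getobs(m::MappedData, i::Integer) = m.f(getobs(m.data, i))

# Usage: wrappers compose because each one is itself a data container.
raw = [1, 2, 3, 4]
squared = mapobs_sketch(x -> x^2, raw)
shifted = mapobs_sketch(x -> x + 1, squared)
```

Because the wrapper relies only on `getobs` and `nobs`, it works with anything that implements those two functions, which is why such transformations do not need to live alongside specific datasets.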
@darsnack darsnack added good first issue Good for newcomers gsoc-proposal Good issues to tackle for GSoC proposals help wanted Contributions welcome! labels Apr 2, 2021
@lorenzoh
Member

Closed by #229
