
Clarify what is *needed* vs what is *recommended* in terms of working with data packages and sprout #1575

@joelostblom

Something like this:

  1. One could manually create all the files needed to make a data package, but that would make the package hard to maintain and invite errors from manual edits. Describing the problems that can occur without sprout also helps motivate it more clearly.
  2. One could install sprout globally (or in a non-project virtual env) via pip, pipx, etc., which would solve all the shortcomings of working with a data package manually. However, it would introduce a new issue: the Python environment used to run sprout is not reproducible.
  3. One could create a Python project per data package, which records which version of sprout is used, etc., to make it as reproducible as possible, e.g. via uv (which is more robust than creating the Python project files manually). However, this does not create websites, GitHub releases, etc. for easy browsing of project info and the metadata.
  4. One could install the data package template, which comes with the Python project files but also extra config for GitHub, changelog, releases, typos, linting, etc.
    • This is the recommended way to set up a package so that the data, the metadata, and sprout's management of both are fully reproducible.
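
To make the risk in option 1 concrete, here is a rough sketch of the kind of mistake a hand-edited `datapackage.json` can carry without anyone noticing. It is not sprout's API: `find_descriptor_problems` is a hypothetical helper, and it only checks the few properties I understand the Data Package standard to require (a non-empty `resources` list, and for each resource a `name` plus either `path` or `data`).

```python
import json


def find_descriptor_problems(descriptor: dict) -> list[str]:
    """Return human-readable problems found in a data package descriptor.

    Hypothetical helper for illustration only; it covers a small subset of
    the checks a real tool would run.
    """
    problems = []
    resources = descriptor.get("resources")
    if not isinstance(resources, list) or not resources:
        # Without resources there is nothing further to check.
        return ["package: 'resources' must be a non-empty list"]
    for index, resource in enumerate(resources):
        if not isinstance(resource, dict):
            problems.append(f"resources[{index}]: must be an object")
            continue
        if not resource.get("name"):
            problems.append(f"resources[{index}]: missing 'name'")
        if "path" not in resource and "data" not in resource:
            problems.append(f"resources[{index}]: needs 'path' or 'data'")
    return problems


# A descriptor typed by hand, where the resource's name was forgotten.
hand_edited = json.loads(
    '{"name": "example-package", "resources": [{"path": "data.csv"}]}'
)
problems = find_descriptor_problems(hand_edited)
print(problems)
```

The point for the docs is that each of these checks has to be remembered and repeated by hand in option 1, while options 2 to 4 hand that job to sprout.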

Status: In Progress
