BagIt Descriptions to Libraries #5159

Closed
jmchilton opened this issue Dec 7, 2017 · 3 comments

Comments

@jmchilton
Member

jmchilton commented Dec 7, 2017

I'm trying to determine if this makes sense - maybe it is too heavy though?

Would it make sense to use BagIt to describe groups of files, for instance those needed for training, and to allow uploads of many heterogeneous files with their metadata to libraries?

Individual manual uploads when setting up a training instance have been identified as a training pain point. We should probably either develop a YAML format that Galaxy can consume to describe these files, or use an existing format and create entities for the training-material trainings.

If this works well, we could also consider these for collections and histories.

  • The following from Wikipedia makes me believe we can create bags that link out to Zenodo datasets.

    A bag can specify payload content indirectly via a "fetch.txt" file that lists URLs for content that can be fetched over the network to complete the bag.
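
To make the "fetch.txt" idea concrete, here is a minimal sketch of writing a bag whose payload lives elsewhere. It only uses the Python standard library and follows the BagIt layout as I understand it: a `bagit.txt` declaration, a `fetch.txt` with `URL LENGTH FILENAME` lines (`-` for unknown length), and a `manifest-md5.txt` listing checksums for the files to be fetched. The function name `write_holey_bag`, the `example.org` URL, and the checksum in the usage example are placeholders for illustration, not anything Galaxy or Zenodo provides.

```python
import tempfile
from pathlib import Path


def write_holey_bag(bag_dir, entries):
    """Write a bag whose payload is listed in fetch.txt instead of stored locally.

    entries: iterable of (url, size_in_bytes_or_None, md5_hex, relative_path).
    This is a hypothetical helper, not part of any BagIt library.
    """
    bag = Path(bag_dir)
    # The payload directory must exist even if it is empty until fetched.
    (bag / "data").mkdir(parents=True, exist_ok=True)
    # Bag declaration; 0.97 is assumed here as the spec version.
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    fetch_lines = []
    manifest_lines = []
    for url, size, md5, rel_path in entries:
        payload_path = f"data/{rel_path}"
        length = "-" if size is None else str(size)  # "-" means unknown length
        fetch_lines.append(f"{url} {length} {payload_path}\n")
        # Every fetched file still needs a manifest entry so it can be verified.
        manifest_lines.append(f"{md5}  {payload_path}\n")
    (bag / "fetch.txt").write_text("".join(fetch_lines), encoding="utf-8")
    (bag / "manifest-md5.txt").write_text("".join(manifest_lines), encoding="utf-8")
    return bag


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        bag = write_holey_bag(
            Path(tmp) / "training-bag",
            # Placeholder URL and checksum, for illustration only.
            [("https://example.org/reads.fastq", None,
              "d41d8cd98f00b204e9800998ecf8427e", "reads.fastq")],
        )
        print((bag / "fetch.txt").read_text(encoding="utf-8"))
```

A consumer would then download each URL into the listed path and check it against the manifest, so the bag itself stays small while still pinning down exactly which remote files belong to it.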

@bgruening
Member

Also interesting: http://www.researchobject.org. @jgoecks and @jxtx know more about this and about plans to use this or similar archives to import and export data.

@jmchilton
Member Author

In response to RO from https://github.com/ResearchObject/bagit-ro:

A BagIt bag can be considered a mechanism for serialization and transport consistency, while a Research Object can be considered a way to capture identity, annotations and provenance of the resources. As such, the two formats complement each-other. They are however not directly compatible.

I think RO has an important role to play in tracking Galaxy analyses, but I'm not sure it would be the appropriate format to describe data libraries. I'll admit there is much I don't understand, though.

@jmchilton
Member Author

This is now possible in dev thanks to #5220. I'll put together an issue listing follow-up things we can do for this work.
