Funding opportunity for Bioconda?! #15405
Comments
I guess the lowest-hanging fruit is to support the continuation of the bot work that @epruesse has been spearheading. We'd need to outline some tangible goals for that, of course. In hindsight, maybe we should have submitted a GSoC proposal and had some master's students working on this already.
The main issue we've had is wanting access to something new in the bleeding-edge Bioconductor while Bioconda lags behind, for very good reasons related to labour-intensive update processes. I feel like there could be a pitch around additional support for including cutting-edge software and reducing the lag between, e.g., R/Bioconductor releases and Bioconda updates. As @dpryan79 suggested, this might just involve work on tooling and automation. I also feel like there could be some tools for recipe creation to encourage submissions.
I'm not sure how labor-intensive Bioconductor releases are; it's mostly just hitting the "rerun" button every 5 hours on CircleCI :) There does need to be some additional retooling done (also on the Bioconductor side) to better automate system requirements. I'm trying to keep track of that stuff for the most recent release, which is as done as it can be until R 3.6.1 is released and conda-forge migrates. Then again, given how annoying it is to get R to build properly in conda-forge, maybe we could pay someone for their time :) Supporting a custom
Sorry @dpryan79, I'm showing my ignorance of how things work; obviously I've been more a user than a developer here ;-). Integration with workflow tools already works pretty well. I use the Nextflow integration all the time (though not currently with containers), but maybe that's another angle to work on if people can think of improvements. Paolo was here last week and I asked him about the possibility of conda -> container resolution, so that someone could, e.g., specify a conda package plus one small additional setting to get the associated biocontainer, rather than having to specify that container manually. Is that something others would find useful?
I think that conda -> container resolution within applications is particular to each workflow environment, so I'm not sure it is something that Bioconda-funded hours should be fuelling or diverted to. It already happens in other workflow environments, such as Galaxy, so I think it is more a matter for the Nextflow community to sort out. Even though the work might be just pressing buttons every few hours, the fact that people need to dedicate time to maintaining Bioconda at the different levels is something that should be put in a grant, as maintenance of resources is more and more recognised as important by funders (albeit slowly). Also, I'm sure there must be occasions when things go south, and the amount of time that @dpryan79 and others spend is probably not negligible. Let's consider as well that CZI, for instance, is not a typical funder (in the sense that they won't only be pursuing scientific novelty); they are more engineering-involved and might even be willing to put some of their engineers' time into helping to maintain Bioconda. For HCA they put in both funding and engineering time from their staff. What about also applying for small or special grants from Software Carpentry, or asking large institutes that might recognise the value of Bioconda to chip in with labour time, as large organisations and companies do with large open source projects of interest to them, where they commit an FTE or a fraction of one for a year? What about ELIXIR? This seems very much like relevant research infrastructure to me. Or pre-competitive setups in industry like the Pistoia Alliance (not sure if this is still alive, but I'm sure there must be equivalents).
What about arguing that Bioconda needs space for container storage in geographically distributed locations, especially close to areas of the world with slower connectivity, and to protect Bioconda from a potential shutdown or denial of service at quay.io? Quay.io is currently the only storage of containers that we have, right?
No, Docker containers have a backup at EBI and Singularity containers have backups in 5+ places. But in general, the more backups the better :)
This works already. Biocontainers are designed to have this match. Galaxy supports it and CWL supports it. I talked to @pditommaso recently and it's just a matter of implementing it in Nextflow.
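To illustrate the match mentioned above, here is a minimal sketch of how a fully pinned conda package could be mapped to a container reference. It assumes the usual BioContainers convention of publishing images on quay.io under the `biocontainers` organisation with a `<version>--<build>` tag. The names `CondaPackage`, `parse_spec` and `container_for` are hypothetical helpers for this example, not part of Bioconda, Nextflow or any other tool, and the build string in the usage note is only illustrative.

```python
from typing import NamedTuple


class CondaPackage(NamedTuple):
    """A fully pinned package: name, version and build string are all required."""
    name: str
    version: str
    build: str


# Assumption: images mirroring Bioconda packages live under this prefix.
REGISTRY = "quay.io/biocontainers"


def parse_spec(spec: str) -> CondaPackage:
    """Parse a conda-style 'name=version=build' spec into its three parts."""
    parts = spec.split("=")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected 'name=version=build', got {spec!r}")
    return CondaPackage(*parts)


def container_for(pkg: CondaPackage, registry: str = REGISTRY) -> str:
    """Build the image reference using the '<version>--<build>' tag convention."""
    return f"{registry}/{pkg.name}:{pkg.version}--{pkg.build}"
```

For example, `container_for(parse_spec("samtools=1.9=h8571acd_11"))` yields `quay.io/biocontainers/samtools:1.9--h8571acd_11`. A workflow engine would still need to check that such an image actually exists before using it.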
Please note that ELIXIR is already funding Biocontainers and has funded people to work on Bioconda recipes in the past. The Bioconductor case: the reason we cannot support bleeding-edge Bioconductor in this release cycle is that it depends on a new R version, which is too cutting-edge for the conda community at the moment. So we wait for the first point release. This is debatable, but the decision was taken because the .0 release often has some major bugs. The real solution to this problem is to get the point releases of R to be ABI compatible, so that we do not need to rebuild all packages against every point release, like we do for Python: we build against Python 3.6, but not against 3.6.1 and so on. In my opinion we should not push for faster, bigger, higher releases; we should concentrate on quality. Software needs time, and if people need bleeding-edge technologies or beta software they could use a separate channel, or we can spin up something like bioconda-test. In my view conda and Bioconda are for production systems: easily deployable, transferable, reproducible. People can always install from upstream/master if they need bleeding-edge or beta packages. Here is my small todo list, maybe it's useful:
Just a quick idea: as the call includes outreach and community engagement work, what about developing something like a Software Carpentry lesson that teaches (bio)conda basics? It would teach how to set up and use Bioconda, but also how to contribute. Even though the Contribution Guide is already good, I think this is often still a very daunting step for many. If there were a half-day or full-day lesson for on-boarding people to that process, taking or assisting with that step could be scaled more easily. Also, as a nice side effect, developing such a lesson would probably stress-test the Contribution Guide for weaknesses...
One thing I could suggest for Bioconda is improving reproducibility (even further). Another thing, as pointed out by @dlaehnemann, would be to extend communication with a MOOC or something similar, or even support the development of local user communities. This goes together with @bgruening's codefest. Also, I like the suggestion of @dpryan79 regarding a Not mentioning the rest of your list @bgruening, all points are relevant.
This is not necessarily funding for Bioconda, but a fellowship for someone affiliated with a U.S.-based institution who is interested in promoting scientific software: https://bssw.io/pages/bssw-fellowship-program. The description says that "Each 2020 BSSw Fellow will receive up to $25,000 for an activity that promotes better scientific software. Activities can include organizing a workshop, preparing a tutorial, or creating content to engage the scientific software community." Judging by the previous fellows, one could organize workshops to teach researchers how to create conda recipes and how to package and distribute software with conda. The deadline for applying is October 15, 2019.
Hi,
@pinin4fjords pointed us to: https://chanzuckerberg.com/rfa/essential-open-source-software-for-science
Sounds like the Bioconda community could apply and get some funding for interesting projects.
Please use this issue to sketch out some ideas. How can we make Bioconda better? :)