Hi, @HajkD

When using the function listGenomes on Ensembl Genomes instances, it returns all genomes from all Ensembl instances, regardless of what the user specifies in the subset argument.
Below is an example for Ensembl Fungi, but the behavior is the same for all other instances. Obviously, there are not 33791 species on Ensembl Fungi.
library(biomartr)
fungi <- listGenomes(db = "ensembl", subset = "EnsemblFungi", skip_bacteria = TRUE)
#> Starting information retrieval for: EnsemblVertebrates
#> Starting information retrieval for: EnsemblPlants
#> Starting information retrieval for: EnsemblFungi
#> Starting information retrieval for: EnsemblMetazoa
#> Starting information retrieval for: EnsemblBacteria
#> Starting information retrieval for: EnsemblProtists
length(fungi)
#> [1] 33791
head(fungi)
#> [1] "leptobrachium_leishanense"  "mus_musculus_pwkphj"
#> [3] "strigops_habroptila"        "sus_scrofa_hampshire"
#> [5] "struthio_camelus_australis" "latimeria_chalumnae"
I know that support for Ensembl Genomes was added recently, so it's still at an experimental stage, but this kind of bug could easily be caught by comprehensive unit tests for each function. Maybe that's something to consider for the future.
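As a rough sketch of the kind of check meant here, the snippet below tests that a subset filter returns only entries from the requested division. It does not use biomartr's internals: the data frame, its column names (name, division), and the helper filter_by_division() are all hypothetical stand-ins, and base R's stopifnot() is used in place of a testing framework to keep the example self-contained.

```r
# Hypothetical stand-in for a table of genomes gathered across Ensembl divisions.
# The column names `name` and `division` are assumptions, not biomartr internals.
genomes <- data.frame(
  name = c("mus_musculus_pwkphj", "saccharomyces_cerevisiae", "arabidopsis_thaliana"),
  division = c("EnsemblVertebrates", "EnsemblFungi", "EnsemblPlants"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return only the names whose division is listed in `subset`.
# With subset = NULL, every name is returned.
filter_by_division <- function(genomes, subset = NULL) {
  if (is.null(subset)) return(genomes$name)
  genomes$name[genomes$division %in% subset]
}

fungi <- filter_by_division(genomes, subset = "EnsemblFungi")

# Checks that would fail if the subset argument were ignored.
stopifnot(identical(fungi, "saccharomyces_cerevisiae"))
stopifnot(!"mus_musculus_pwkphj" %in% fungi)
```

A test of this shape, run against the real function with a small fixture, would flag a vertebrate name such as mus_musculus_pwkphj appearing in a fungi-only result.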
I also do not understand why {biomartr} has to download data for all Ensembl instances beforehand, even when users specify that they only want one instance. This could also be fixed to improve efficiency.
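One way this could look, sketched under assumptions: resolve the requested divisions first and only then call the per-division retrieval. The division names are taken from the messages in the example above; fetch_division() is a hypothetical stub that records which divisions were requested instead of downloading anything, and list_genomes_lazy() is an illustrative name, not a biomartr function.

```r
# Division names as they appear in the retrieval messages above.
all_divisions <- c("EnsemblVertebrates", "EnsemblPlants", "EnsemblFungi",
                   "EnsemblMetazoa", "EnsemblBacteria", "EnsemblProtists")

# Hypothetical stub for the per-division download: it only records the call
# and returns two placeholder names.
fetched <- character(0)
fetch_division <- function(division) {
  fetched <<- c(fetched, division)
  paste0(tolower(division), "_genome_", 1:2)
}

# Illustrative wrapper: decide which divisions are needed before fetching.
list_genomes_lazy <- function(subset = NULL) {
  divisions <- if (is.null(subset)) all_divisions else intersect(all_divisions, subset)
  unlist(lapply(divisions, fetch_division), use.names = FALSE)
}

fungi <- list_genomes_lazy(subset = "EnsemblFungi")

# Only the requested division was touched.
stopifnot(identical(fetched, "EnsemblFungi"))
```

Filtering before retrieval means the cost scales with the number of requested divisions rather than with the total number of instances.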
Best,
Fabricio
Created on 2023-11-10 with reprex v2.0.2