-
Couldn't load subscription status.
- Fork 79
fix #1236 #1644
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix #1236 #1644
Conversation
qiita_db/analysis.py
Outdated
| from biom.util import biom_open | ||
| import pandas as pd | ||
| from skbio.util import find_duplicates | ||
| from skbio.util import flatten |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flatten has been removed in later versions of skbio. In other parts of the code, we have been using the construction below to use flatten:
from itertools import chain
list(chain.from_iterable(ITERABLE))There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
K
|
thanks @antgonza - some comments |
qiita_db/analysis.py
Outdated
| # different stages and possible errors. | ||
| samples = self.samples | ||
| # figuring out if we are going to have duplicated samples, again | ||
| # doing it here cause it's computetional cheaper |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
computetional -> computational
|
Super minor typos. 👍 otherwise. |
qiita_db/analysis.py
Outdated
| to_concat = [] | ||
| for aid, samps in viewitems(samples): | ||
| qdb.sql_connection.TRN.add(sql, [aid]) | ||
| qm_fp = qdb.sql_connection.TRN.execute_fetchindex()[0][1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just realized - if you are accession only to [0][1], doesn't it mean that the above query is retrieving more info than needed? I think it should only be SELECT filepath and then here you can use execute_fetchlast. I think it will make the code cleaner and also removes the magic numbers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
|
Few minor comments. |
|
👍 given tests pass |
This fixes #1236 as discussed in there and during the last call. The best way to test this was to add a new biom artifact in a study that already had one => added in populate_test_db.sql (artifact 5). This change will also be helpful as we move to test more than one artifact at the time. However, this change also make some scripts to failed and I'm fixing them. I check all errors in my laptop and flake8 so we should be OK to review ASAP.