Merge relax-md-req into master #1122

Merged 98 commits on Apr 29, 2015
Commits
94bf806
Modifying the database
josenavas Apr 17, 2015
b3aa78d
Fixing populate_test_db.sql
josenavas Apr 17, 2015
fd13c59
Renaming tables as the old names do not make sense now
josenavas Apr 17, 2015
cf85cfd
Merge branch 'fix-extend' into fix-metadata-creation
josenavas Apr 17, 2015
b677ba0
Fixing sample obj tests
josenavas Apr 17, 2015
7896e22
Correctly merging the column values in the sample template
josenavas Apr 17, 2015
86f6474
Merge branch 'relax-db' into fix-sample-obj
josenavas Apr 17, 2015
cc3b359
Fixing latitude/longitude types on populate
josenavas Apr 17, 2015
de3659e
Merge branch 'relax-db' into fix-sample-obj
josenavas Apr 17, 2015
6472cf3
Fixing ReadOnly tests for PrepSample
josenavas Apr 17, 2015
4e87d20
Fixing ReadWrite tests for PrepSample
josenavas Apr 17, 2015
8ec3b27
Adding the column_restriction module
josenavas Apr 17, 2015
a13a7b8
Fixing SampleTemplate ReadOnly tests
josenavas Apr 17, 2015
c64ab6c
Fixng all SampleTemplate tests
josenavas Apr 17, 2015
5f23a55
Removing column restriction as it can be added in constants
josenavas Apr 17, 2015
ef284f3
Fixing TestPrepTemplateReadOnly
josenavas Apr 17, 2015
694f4ab
Fixing all prep template object tests
josenavas Apr 17, 2015
3728f63
All tests under metadata_template passing
josenavas Apr 17, 2015
b145fca
Fixing flake8
josenavas Apr 17, 2015
85a32dc
Merge branch 'master' of https://github.com/biocore/qiita into relax-db
josenavas Apr 21, 2015
4ea0312
Merge branch 'master' of https://github.com/biocore/qiita into fix-sa…
josenavas Apr 21, 2015
2990e23
Merge remote-tracking branch 'upstream/cart-branch' into relax-md-req
josenavas Apr 22, 2015
af248cf
Preparing files for the merge
josenavas Apr 22, 2015
b2721eb
Solving the hell of the merge conflict
josenavas Apr 22, 2015
771c27d
Merge branch 'relax-db' into fix-sample-obj
josenavas Apr 22, 2015
78f763a
Fix merge conflicts
josenavas Apr 23, 2015
fbdf463
Fixing test_setup.py
josenavas Apr 23, 2015
ff4aae6
Fixing util.py
josenavas Apr 23, 2015
f7e62bd
Fixing data.py
josenavas Apr 23, 2015
10d9a0e
Fixing search.py
josenavas Apr 23, 2015
299c52f
Chaning queue name
josenavas Apr 23, 2015
18efa9c
Adding the missing prep template to the DB
josenavas Apr 23, 2015
e845794
Fixing type on populate_test_db.sql
josenavas Apr 23, 2015
a8b7d98
Doing all the patch in SQL
josenavas Apr 24, 2015
8772f32
Merge branch 'relax-db' into fix-sample-obj
josenavas Apr 24, 2015
013d126
Merge branch 'fix-sample-obj' into fix-metadata-obj
josenavas Apr 24, 2015
e65a41d
Merge branch 'fix-metadata-obj' into fix-qiita-db-tests
josenavas Apr 24, 2015
c6147b0
Merge branch 'fix-qiita-db-tests' into fix-analysis-tests
josenavas Apr 24, 2015
0ddb98a
Atatching the new prep template file to the prep template
josenavas Apr 24, 2015
5d7a752
fixing test_setup.py
josenavas Apr 24, 2015
e9fab3e
Fixing test_reference.py by removing magic numbers
josenavas Apr 24, 2015
ff305f9
Fixing test_job.py by removing magic numbers
josenavas Apr 24, 2015
01ec8bc
Fixing test_prep_template.py by removing magic numbers
josenavas Apr 24, 2015
69cfcc1
Fixing test_meta_util.py
josenavas Apr 24, 2015
c5f3107
Adding the qiime mapping file
josenavas Apr 24, 2015
4ef8b42
Fixing all the analysis tests. Fixes partially #247. Fixes #465
josenavas Apr 24, 2015
847485b
Fixing tests due to the addition of the mapping file
josenavas Apr 25, 2015
5ee643f
addressing @squirrelo's comments
josenavas Apr 25, 2015
3d4cd42
Removing patch as per @squirrelo's suggestion
josenavas Apr 25, 2015
e75c372
Fixing qiita ware test util
josenavas Apr 25, 2015
fa4a68b
Fixing qiita ware tests
josenavas Apr 25, 2015
b43c3cc
Flake8
josenavas Apr 25, 2015
69395bf
Merge branch 'relax-db' into fix-sample-obj
josenavas Apr 25, 2015
ba49dc5
Reading python patch as now it has the needed functionality
josenavas Apr 25, 2015
3d4d7dd
Merge branch 'fix-metadata-obj' into fix-qiita-db-tests
josenavas Apr 25, 2015
e7cefb0
Merge branch 'fix-qiita-db-tests' into fix-analysis-tests
josenavas Apr 25, 2015
658a5b5
Merge branch 'fix-analysis-tests' into fix-qiita-ware-tests
josenavas Apr 25, 2015
6177e4e
Merge pull request #1073 from josenavas/relax-db
antgonza Apr 27, 2015
0439de9
Merge branch 'relax-md-req' of https://github.com/biocore/qiita into …
josenavas Apr 27, 2015
157b60a
Adding qiime-map property to the prep template
josenavas Apr 27, 2015
2c5027e
Merge pull request #1074 from josenavas/fix-sample-obj
adamrp Apr 27, 2015
57ed605
Merge branch 'master' of https://github.com/biocore/qiita into fix-me…
josenavas Apr 27, 2015
2a29012
Merge branch 'relax-md-req' of https://github.com/biocore/qiita into …
josenavas Apr 27, 2015
b2c4f70
Merge branch 'master' of https://github.com/biocore/qiita into relax-…
josenavas Apr 27, 2015
0cfab56
Merge branch 'relax-md-req' of https://github.com/biocore/qiita into …
josenavas Apr 27, 2015
62f0d3e
Merge branch 'relax-md-req' of https://github.com/biocore/qiita into …
josenavas Apr 27, 2015
f81caa9
Fixing call to clean validate template
josenavas Apr 27, 2015
134678d
Removing all warnings from tests
josenavas Apr 27, 2015
6aa5335
Fixing create qiime mapping file add_filepath call
josenavas Apr 27, 2015
1cc26e9
Addressing comments
josenavas Apr 27, 2015
2d3df08
Reducing the rename_cols dict
josenavas Apr 28, 2015
79256ff
Merge pull request #1075 from josenavas/fix-metadata-obj
antgonza Apr 28, 2015
0829f95
Merge branch 'relax-md-req' of https://github.com/biocore/qiita into …
josenavas Apr 28, 2015
d777469
Addressing comments from @ElDeveloper and @squirrelo
josenavas Apr 28, 2015
b26a867
Merge pull request #1099 from josenavas/fix-qiita-db-tests
squirrelo Apr 28, 2015
e76a10c
Merge branch 'relax-md-req' of https://github.com/biocore/qiita into …
josenavas Apr 28, 2015
d1bee5e
Addressing @ElDeveloper comments
josenavas Apr 28, 2015
8be914f
Removing inference - good catch @ElDeveloper!
josenavas Apr 28, 2015
b604ae1
Merge pull request #1106 from josenavas/fix-analysis-tests
ElDeveloper Apr 28, 2015
a663d33
Merge branch 'relax-md-req' of https://github.com/biocore/qiita into …
josenavas Apr 28, 2015
b19b60c
Merge branch 'relax-md-req' of https://github.com/biocore/qiita into …
josenavas Apr 28, 2015
e7363a3
Merge branch 'add-qiime-map-func' into fix-qiita-ware-tests
josenavas Apr 28, 2015
d464237
Fixing bug in add filepath
josenavas Apr 28, 2015
84beb81
Fixing _get_qiime_minimal_mapping
josenavas Apr 28, 2015
c30dd8c
Fixing qiime map parsing
josenavas Apr 28, 2015
e5ae3e4
Removing warnings from tests:
josenavas Apr 28, 2015
14b0c49
Addressing comments
josenavas Apr 29, 2015
0c9a47a
Adding test with reverse linker primer
josenavas Apr 29, 2015
6c83ba4
Fixing WTF failures
josenavas Apr 29, 2015
7390538
Merge pull request #1107 from josenavas/fix-qiita-ware-tests
ElDeveloper Apr 29, 2015
a546233
Reverting the comment from @squirrelo on search for JOIN...ON -> USIN…
josenavas Apr 29, 2015
402003a
Removing the ambiguity
josenavas Apr 29, 2015
d87890b
Removing join as now all the columns are in the dynamic table...
josenavas Apr 29, 2015
f73887f
Execute the tests even if you change the format... there might be a s…
josenavas Apr 29, 2015
4830b24
Merge branch 'master' of https://github.com/biocore/qiita into join-u…
josenavas Apr 29, 2015
7a29adc
Fixing bug in _build_mapping_file and create a test for it
josenavas Apr 29, 2015
fa2392c
Adding tests for test_build_biom_tables
josenavas Apr 29, 2015
1a163fb
Merge pull request #1126 from josenavas/join-using-instead-on-db-issues
antgonza Apr 29, 2015
125 changes: 55 additions & 70 deletions qiita_db/analysis.py
@@ -24,15 +24,17 @@
 from future.utils import viewitems
 from biom import load_table
 from biom.util import biom_open
+import pandas as pd
+from skbio.util import find_duplicates
 
 from qiita_core.exceptions import IncompetentQiitaDeveloperError
 from .sql_connection import SQLConnectionHandler
 from .base import QiitaStatusObject
-from .data import ProcessedData, RawData
+from .data import ProcessedData
 from .study import Study
-from .exceptions import QiitaDBStatusError  # QiitaDBNotImplementedError
+from .exceptions import QiitaDBStatusError, QiitaDBError
 from .util import (convert_to_id, get_work_base_dir,
-                   get_mountpoint, get_table_cols, insert_filepaths)
+                   get_mountpoint, insert_filepaths)
 
 
 class Analysis(QiitaStatusObject):
@@ -719,78 +721,61 @@ def _build_mapping_file(self, samples, conn_handler=None):
            Code modified slightly from qiime.util.MetadataMap.__add__"""
         conn_handler = conn_handler if conn_handler is not None \
             else SQLConnectionHandler()
-        # We will keep track of all unique sample_ids and metadata headers
-        # we have seen as we go, as well as studies already seen
+
         all_sample_ids = set()
-        all_headers = set(get_table_cols("required_sample_info", conn_handler))
-        all_studies = set()
+        sql = """SELECT filepath_id, filepath
+                 FROM qiita.filepath
+                    JOIN qiita.prep_template_filepath USING (filepath_id)
+                    JOIN qiita.prep_template_preprocessed_data
+                        USING (prep_template_id)
+                    JOIN qiita.preprocessed_processed_data
+                        USING (preprocessed_data_id)
+                    JOIN qiita.filepath_type USING (filepath_type_id)
+                 WHERE processed_data_id = %s
+                    AND filepath_type = 'qiime_map'
+                 ORDER BY filepath_id DESC"""
+        _id, fp = get_mountpoint('templates')[0]
+        to_concat = []
 
-        merged_data = defaultdict(lambda: defaultdict(lambda: None))
         for pid, samples in viewitems(samples):
-            if any([all_sample_ids.intersection(samples),
-                    len(set(samples)) != len(samples)]):
-                # duplicate samples so raise error
-                raise ValueError("Duplicate sample ids found: %s" %
-                                 str(all_sample_ids.intersection(samples)))
-            all_sample_ids.update(samples)
-            study_id = ProcessedData(pid).study
-
-            # create a convenience study object
-            s = Study(study_id)
-
-            # get the ids to retrieve the data from the sample and prep tables
-            sample_template_id = s.sample_template
-            # you can have multiple different prep templates but we are only
-            # using the one for 16S i. e. the last one ... sorry ;l
-            # see issue https://github.com/biocore/qiita/issues/465
-            prep_template_id = RawData(s.raw_data()[0]).prep_templates[-1]
-
-            if study_id in all_studies:
-                # samples already added by other processed data file
-                # with the study_id
-                continue
-            all_studies.add(study_id)
-            # add headers to set of all headers found
-            all_headers.update(get_table_cols("sample_%d" % sample_template_id,
-                               conn_handler))
-            all_headers.update(get_table_cols("prep_%d" % prep_template_id,
-                               conn_handler))
-            # NEED TO ADD COMMON PREP INFO Issue #247
-            sql = ("SELECT rs.*, p.*, ss.* "
-                   "FROM qiita.required_sample_info rs JOIN qiita.sample_{0} "
-                   "ss USING(sample_id) JOIN qiita.prep_{1} p USING(sample_id)"
-                   " WHERE rs.sample_id IN {2} AND rs.study_id = {3}".format(
-                       sample_template_id, prep_template_id,
-                       "(%s)" % ",".join("'%s'" % s for s in samples),
-                       study_id))
-            metadata = conn_handler.execute_fetchall(sql)
-            # add all the metadata to merged_data
-            for data in metadata:
-                sample_id = data['sample_id']
-                for header, value in viewitems(data):
-                    if header in {'sample_id'}:
-                        continue
-                    merged_data[sample_id][header] = str(value)
-
-        # prep headers, making sure they follow mapping file format rules
-        all_headers = list(all_headers - {'linkerprimersequence',
-                           'barcodesequence', 'description', 'sample_id'})
-        all_headers.sort()
-        all_headers = ['BarcodeSequence', 'LinkerPrimerSequence'] + all_headers
-        all_headers.append('Description')
-
-        # write mapping file out
+            if len(samples) != len(set(samples)):
+                duplicates = find_duplicates(samples)
+                raise QiitaDBError("Duplicate sample ids found: %s"
+                                   % ', '.join(duplicates))
+            # Get the QIIME mapping file
+            qiime_map_fp = conn_handler.execute_fetchall(sql, (pid,))[0][1]
+            # Parse the mapping file
+            qiime_map = pd.read_csv(
+                join(fp, qiime_map_fp), sep='\t', keep_default_na=False,
+                na_values=['unknown'], index_col=False,
+                converters=defaultdict(lambda: str))
+            qiime_map.set_index('#SampleID', inplace=True, drop=True)
+            qiime_map = qiime_map.loc[samples]
+
+            duplicates = all_sample_ids.intersection(qiime_map.index)
+            if duplicates or len(samples) != len(set(samples)):
+                # Duplicate samples so raise error
+                raise QiitaDBError("Duplicate sample ids found: %s"
+                                   % ', '.join(duplicates))
+            all_sample_ids.update(qiime_map.index)
+            to_concat.append(qiime_map)
+
+        merged_map = pd.concat(to_concat)
+
+        cols = merged_map.columns.values.tolist()
+        cols.remove('BarcodeSequence')
+        cols.remove('LinkerPrimerSequence')
+        cols.remove('Description')
+        new_cols = ['BarcodeSequence', 'LinkerPrimerSequence']
+        new_cols.extend(cols)
+        new_cols.append('Description')
+        merged_map = merged_map[new_cols]
+
+        # Save the mapping file
         _, base_fp = get_mountpoint(self._table)[0]
         mapping_fp = join(base_fp, "%d_analysis_mapping.txt" % self._id)
-        with open(mapping_fp, 'w') as f:
-            f.write("#SampleID\t%s\n" % '\t'.join(all_headers))
-            for sample, metadata in viewitems(merged_data):
-                data = [sample]
-                for header in all_headers:
-                    l_head = header.lower()
-                    data.append(metadata[l_head] if
-                                metadata[l_head] is not None else "no_data")
-                f.write("%s\n" % "\t".join(data))
+        merged_map.to_csv(mapping_fp, index_label='#SampleID',
+                          na_rep='unknown', sep='\t')
 
         self._add_file("%d_analysis_mapping.txt" % self._id,
                        "plain_text", conn_handler=conn_handler)
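
The hunk above replaces the per-study SQL query and hand-written TSV writer with a pandas-based merge of the QIIME mapping files. A minimal standalone sketch of that idea follows, under stated assumptions: the database lookup is replaced by caller-supplied file handles, `merge_mapping_files` and its `sources` argument are hypothetical names that do not exist in Qiita, `dtype=str` stands in for the `converters` argument used in the diff, and a plain `ValueError` stands in for `QiitaDBError`.

```python
import io

import pandas as pd


def merge_mapping_files(sources):
    """Merge QIIME-style mapping files into one DataFrame.

    `sources` is a list of (path_or_file_handle, sample_ids) pairs.
    This is an illustrative helper, not part of the Qiita code base.
    """
    seen_ids = set()
    frames = []
    for handle, samples in sources:
        samples = list(samples)
        # Reject repeated ids within a single request
        if len(samples) != len(set(samples)):
            raise ValueError("Duplicate sample ids requested")

        # Read every column as text; only the literal 'unknown' is missing
        frame = pd.read_csv(handle, sep='\t', dtype=str,
                            keep_default_na=False, na_values=['unknown'],
                            index_col=False)
        frame = frame.set_index('#SampleID')
        frame = frame.loc[samples]

        # Reject ids already contributed by an earlier mapping file
        duplicates = seen_ids.intersection(frame.index)
        if duplicates:
            raise ValueError("Duplicate sample ids found: %s"
                             % ', '.join(sorted(duplicates)))
        seen_ids.update(frame.index)
        frames.append(frame)

    merged = pd.concat(frames)

    # QIIME expects these two columns first and Description last
    fixed = ['BarcodeSequence', 'LinkerPrimerSequence', 'Description']
    middle = [c for c in merged.columns if c not in fixed]
    return merged[fixed[:2] + middle + fixed[2:]]


def write_mapping_file(merged, handle):
    """Write the merged table back out as a tab-separated mapping file."""
    merged.to_csv(handle, sep='\t', index_label='#SampleID',
                  na_rep='unknown')
```

Columns present in only one input end up as missing values for the other inputs' samples, and are written back as `unknown`, which mirrors the `na_values`/`na_rep` pairing in the diff.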
4 changes: 2 additions & 2 deletions qiita_db/data.py
@@ -369,9 +369,9 @@ def delete(cls, raw_data_id, study_id):
             """
             SELECT EXISTS(
                 SELECT * FROM qiita.prep_template AS pt
-                    LEFT JOIN qiita.common_prep_info AS cpi ON
+                    LEFT JOIN qiita.prep_template_sample AS cpi ON
                     (pt.prep_template_id=cpi.prep_template_id)
-                    LEFT JOIN qiita.required_sample_info AS rsi ON
+                    LEFT JOIN qiita.study_sample AS rsi ON
                     (cpi.sample_id=rsi.sample_id)
                 WHERE raw_data_id = {0} and study_id = {1}
             )