
ENH CompCor #1599


Merged
66 commits merged on Sep 13, 2016
Changes from 64 commits
Commits (66)
50da63f
compcor skeleton
Sep 1, 2016
9e0d327
incremental changes, baby test
Sep 2, 2016
9f7886c
add doctest
Sep 2, 2016
55aec65
add doctest
Sep 2, 2016
b45b7b1
Merge branch 'compcor1594' of http://github.com/shoshber/nipype into …
Sep 2, 2016
f454cbe
fix imports
Sep 3, 2016
ca930d3
better comments, minor re-arranging
Sep 3, 2016
414f11a
set up test prereq's
Sep 4, 2016
bca197e
complete _run_interface implementation
Sep 4, 2016
124d346
implement _list_outputs
Sep 4, 2016
ef50858
pretty ok test
Sep 4, 2016
5e8b997
compcore -> compcor
Sep 6, 2016
6127924
Merge http://github.com/nipy/nipype into compcor1594
Sep 6, 2016
ac8dbfe
TCompCor skeleton, tests
Sep 7, 2016
a5402f7
make python2 happy
Sep 7, 2016
8f5eabc
factor out tSTD calculation
Sep 7, 2016
d87ac71
tCompCor first pass
Sep 7, 2016
757bd0e
alias ACompCor => CompCor
Sep 7, 2016
a54120f
prevent name collision in tests
Sep 7, 2016
c1b0a87
better unique file names
Sep 7, 2016
9a88a11
refactor toy maker function
Sep 7, 2016
61da367
testing skeleton for tsnr
Sep 8, 2016
89b530a
factor out deleting temp files
Sep 8, 2016
87108d9
pre-refactor tests
Sep 8, 2016
33d2f30
move tsnr to own file
Sep 8, 2016
1511bea
naive refactor
Sep 8, 2016
b66db5f
make compcor tests more sensitive to changes, simplify regress_poly()
Sep 8, 2016
4b9dbe4
fix test oops
Sep 8, 2016
bd9113a
m rm unnecessary/confusing test msgs
Sep 8, 2016
e78b0ae
make regress_poly take any kind of input data
oesteban Sep 8, 2016
5bf2d54
improve code readability
Sep 9, 2016
fd45c05
fixed for real this time?
Sep 9, 2016
62bb3bd
change absolute imports to explicit relative imports
Sep 9, 2016
da1e59a
add use_regress_poly Boolean to compcor
Sep 9, 2016
6f76129
Merge branch 'fix-shoshber-compcor1594' of https://github.com/oesteba…
Sep 9, 2016
045bfdc
Merge branch 'oesteban-fix-shoshber-compcor1594' into compcor1594
Sep 9, 2016
ee726a1
undo accidental changes in merge
Sep 9, 2016
0a5bc7a
move tsnr back into misc
Sep 9, 2016
61c96ad
remove rogue file
Sep 9, 2016
4fd169a
fix lingregress (swapped z/y! )
Sep 9, 2016
dfdd6b0
more intelligent testing
Sep 9, 2016
1995a01
use regress_poly from TSNR in compcor
Sep 9, 2016
13838bd
add mask.nii back
Sep 9, 2016
027d47e
smarter tests
Sep 9, 2016
5464902
rsfmri workflow test skeleton
Sep 9, 2016
f8cea61
set up mocks
Sep 10, 2016
1c655f8
mock only what isn't being tested
Sep 10, 2016
08bccfa
run compcor tests in tempfile
Sep 10, 2016
4401bb8
set up 2 out of 3 inputs for compcor node
Sep 10, 2016
29c8b74
Merge http://github.com/nipy/nipype into compcor1594
Sep 10, 2016
402b6b1
add __init__.py file
Sep 11, 2016
c18a780
mocked-up rsfmri workflow runs; use tempfiles
Sep 11, 2016
083c16f
completed rsfmri wf test
Sep 11, 2016
be215f5
refactor rsfmri fsl workflow
Sep 11, 2016
5f39bfc
make python 3 happy
Sep 11, 2016
6869a5f
use mkdtemp for test_tsnr
Sep 11, 2016
c9aa6d6
enable extra_regressors (needed by rsfmri_surface_vol...)
Sep 12, 2016
9aa2300
make percentile threshold a parameter for tCompCor
Sep 12, 2016
276fb57
use CompCor interface in example
Sep 12, 2016
ea60782
forgot to regress pre-computing std
Sep 12, 2016
6a1497d
use relative paths in doctests
Sep 12, 2016
e245073
update documentation
Sep 12, 2016
6a88b3a
robuster tests
Sep 12, 2016
5347881
fix dumb error, comments
Sep 12, 2016
cb2faa2
clarify documentation
Sep 13, 2016
bb45785
remove stuff
Sep 13, 2016
2 changes: 1 addition & 1 deletion examples/README
@@ -1,4 +1,4 @@
A dataset for use with these scripts can be downloaded from the nipype
website. At the time of writing, it's at:

http://nipy.org/nipype/users/pipeline_tutorial.html
http://nipype.readthedocs.io/en/0.12.0/users/pipeline_tutorial.html
62 changes: 8 additions & 54 deletions examples/rsfmri_vol_surface_preprocessing_nipy.py
@@ -52,7 +52,9 @@
from nipype.interfaces.base import CommandLine
CommandLine.set_default_terminal_output('allatonce')

# https://github.com/moloney/dcmstack
from dcmstack.extract import default_extractor
# pip install pydicom
from dicom import read_file

from nipype.interfaces import (fsl, Function, ants, freesurfer, nipy)
@@ -64,6 +66,7 @@

from nipype.algorithms.rapidart import ArtifactDetect
from nipype.algorithms.misc import TSNR
from nipype.algorithms.compcor import ACompCor
from nipype.interfaces.utility import Rename, Merge, IdentityInterface
from nipype.utils.filemanip import filename_to_list
from nipype.interfaces.io import DataSink, FreeSurferSource
@@ -228,51 +231,6 @@ def build_filter1(motion_params, comp_norm, outliers, detrend_poly=None):
out_files.append(filename)
return out_files


def extract_noise_components(realigned_file, mask_file, num_components=5,
extra_regressors=None):
"""Derive components most reflective of physiological noise

Parameters
----------
realigned_file: a 4D Nifti file containing realigned volumes
mask_file: a 3D Nifti file containing white matter + ventricular masks
num_components: number of components to use for noise decomposition
extra_regressors: additional regressors to add

Returns
-------
components_file: a text file containing the noise components
"""
imgseries = nb.load(realigned_file)
components = None
for filename in filename_to_list(mask_file):
mask = nb.load(filename).get_data()
if len(np.nonzero(mask > 0)[0]) == 0:
continue
voxel_timecourses = imgseries.get_data()[mask > 0]
voxel_timecourses[np.isnan(np.sum(voxel_timecourses, axis=1)), :] = 0
# remove mean and normalize by variance
# voxel_timecourses.shape == [nvoxels, time]
X = voxel_timecourses.T
stdX = np.std(X, axis=0)
stdX[stdX == 0] = 1.
stdX[np.isnan(stdX)] = 1.
stdX[np.isinf(stdX)] = 1.
X = (X - np.mean(X, axis=0)) / stdX
u, _, _ = sp.linalg.svd(X, full_matrices=False)
if components is None:
components = u[:, :num_components]
else:
components = np.hstack((components, u[:, :num_components]))
if extra_regressors:
regressors = np.genfromtxt(extra_regressors)
components = np.hstack((components, regressors))
components_file = os.path.join(os.getcwd(), 'noise_components.txt')
np.savetxt(components_file, components, fmt=b"%.10f")
return components_file


def rename(in_files, suffix=None):
from nipype.utils.filemanip import (filename_to_list, split_filename,
list_to_filename)
@@ -592,7 +550,7 @@ def create_workflow(files,
realign.inputs.slice_info = 2
realign.plugin_args = {'sbatch_args': '-c%d' % 4}

# Comute TSNR on realigned data regressing polynomials upto order 2
# Compute TSNR on realigned data regressing polynomials up to order 2
tsnr = MapNode(TSNR(regress_poly=2), iterfield=['in_file'], name='tsnr')
wf.connect(realign, "out_file", tsnr, "in_file")

@@ -694,14 +652,10 @@ def merge_files(in1, in2):
filter1, 'out_res_name')
wf.connect(createfilter1, 'out_files', filter1, 'design')

createfilter2 = MapNode(Function(input_names=['realigned_file', 'mask_file',
'num_components',
'extra_regressors'],
output_names=['out_files'],
function=extract_noise_components,
imports=imports),
createfilter2 = MapNode(ACompCor(),
iterfield=['realigned_file', 'extra_regressors'],
name='makecompcorrfilter')
createfilter2.inputs.components_file = 'noise_components.txt'
createfilter2.inputs.num_components = num_components

wf.connect(createfilter1, 'out_files', createfilter2, 'extra_regressors')
@@ -717,7 +671,7 @@ def merge_files(in1, in2):
wf.connect(filter1, 'out_res', filter2, 'in_file')
wf.connect(filter1, ('out_res', rename, '_cleaned'),
filter2, 'out_res_name')
wf.connect(createfilter2, 'out_files', filter2, 'design')
wf.connect(createfilter2, 'components_file', filter2, 'design')
wf.connect(mask, 'mask_file', filter2, 'mask')

bandpass = Node(Function(input_names=['files', 'lowpass_freq',
@@ -923,7 +877,7 @@ def get_names(files, suffix):
wf.connect(smooth, 'out_file', datasink, 'resting.timeseries.@smoothed')
wf.connect(createfilter1, 'out_files',
datasink, 'resting.regress.@regressors')
wf.connect(createfilter2, 'out_files',
wf.connect(createfilter2, 'components_file',
datasink, 'resting.regress.@compcorr')
wf.connect(maskts, 'out_file', datasink, 'resting.timeseries.target')
wf.connect(sampleaparc, 'summary_file',
178 changes: 178 additions & 0 deletions nipype/algorithms/compcor.py
@@ -0,0 +1,178 @@
# -*- coding: utf-8 -*-
# emacs: -*- mode: python; py-indent-offset: 4; indent-tabs-mode: nil -*-
# vi: set ft=python sts=4 ts=4 sw=4 et:
'''
Miscellaneous algorithms

Change directory to provide relative paths for doctests
>>> import os
>>> filepath = os.path.dirname(os.path.realpath(__file__))
>>> datadir = os.path.realpath(os.path.join(filepath, '../testing/data'))
>>> os.chdir(datadir)

'''
from ..interfaces.base import (BaseInterfaceInputSpec, TraitedSpec,
BaseInterface, traits, File)
from ..pipeline import engine as pe
from ..interfaces.utility import IdentityInterface
from .misc import regress_poly

import nibabel as nb
import numpy as np
from scipy import linalg, stats
import os

class CompCorInputSpec(BaseInterfaceInputSpec):
realigned_file = File(exists=True, mandatory=True,
desc='already realigned brain image (4D)')
mask_file = File(exists=True, mandatory=False,
desc='mask file that determines ROI (3D)')
components_file = File('components_file.txt', exists=False,
mandatory=False, usedefault=True,
desc='filename to store physiological components')
num_components = traits.Int(6, usedefault=True) # 6 for BOLD, 4 for ASL
use_regress_poly = traits.Bool(True, usedefault=True,
desc='use polynomial regression'
'pre-component extraction')
regress_poly_degree = traits.Range(low=1, default=1, usedefault=True,
desc='the degree polynomial to use')
extra_regressors = File(exists=True, mandatory=False,
desc='additional regressors to add')
Member:
Do we need this? In preprocessing, regressors from different sources are usually combined in a later step.

Contributor Author (@berleant, Sep 13, 2016):
I added extra_regressors here because https://github.com/nipy/nipype/blob/master/examples/rsfmri_vol_surface_preprocessing_nipy.py had an extra_regressors input to its compcor implementation. I assumed that was normal ... I'm not averse to leaving it out, but then I'm not sure how to refactor the above file.

Member:
Let's keep it "lean and mean" and remove this.
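For reference, a minimal sketch (not part of this PR) of how regressors from different sources could be combined in a later step instead; the merge_regressors helper and file names are hypothetical:

import numpy as np

def merge_regressors(regressor_files, out_file='merged_regressors.txt'):
    """Hypothetical helper: column-wise merge of regressor text files."""
    blocks = []
    for fname in regressor_files:
        arr = np.genfromtxt(fname)
        if arr.ndim == 1:
            arr = arr[:, np.newaxis]  # single regressor -> column vector
        blocks.append(arr)
    np.savetxt(out_file, np.hstack(blocks), fmt='%.10f')
    return out_file

This mirrors what _add_extras does, but keeps the combination outside the CompCor interface, as the reviewer suggests.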


class CompCorOutputSpec(TraitedSpec):
components_file = File(exists=True,
desc='text file containing the noise components')

class CompCor(BaseInterface):
'''
Interface with core CompCor computation, used in aCompCor and tCompCor

Example
-------

>>> ccinterface = CompCor()
>>> ccinterface.inputs.realigned_file = 'functional.nii'
>>> ccinterface.inputs.mask_file = 'mask.nii'
>>> ccinterface.inputs.num_components = 1
>>> ccinterface.inputs.use_regress_poly = True
>>> ccinterface.inputs.regress_poly_degree = 2
'''
input_spec = CompCorInputSpec
output_spec = CompCorOutputSpec

def _run_interface(self, runtime):
imgseries = nb.load(self.inputs.realigned_file).get_data()
mask = nb.load(self.inputs.mask_file).get_data()
voxel_timecourses = imgseries[mask > 0]
# Zero-out any bad values
voxel_timecourses[np.isnan(np.sum(voxel_timecourses, axis=1)), :] = 0

# from paper:
# "The constant and linear trends of the columns in the matrix M were
# removed [prior to ...]"
if self.inputs.use_regress_poly:
voxel_timecourses = regress_poly(self.inputs.regress_poly_degree,
voxel_timecourses)
voxel_timecourses = voxel_timecourses - np.mean(voxel_timecourses,
axis=1)[:, np.newaxis]

# "Voxel time series from the noise ROI (either anatomical or tSTD) were
# placed in a matrix M of size Nxm, with time along the row dimension
# and voxels along the column dimension."
M = voxel_timecourses.T
numvols = M.shape[0]
numvoxels = M.shape[1]

# "[... were removed] prior to column-wise variance normalization."
M = M / self._compute_tSTD(M, 1.)

# "The covariance matrix C = MMT was constructed and decomposed into its
# principal components using a singular value decomposition."
u, _, _ = linalg.svd(M, full_matrices=False)
components = u[:, :self.inputs.num_components]
if self.inputs.extra_regressors:
components = self._add_extras(components,
self.inputs.extra_regressors)

components_file = os.path.join(os.getcwd(), self.inputs.components_file)
np.savetxt(components_file, components, fmt="%.10f")
return runtime

def _list_outputs(self):
outputs = self._outputs().get()
outputs['components_file'] = os.path.abspath(self.inputs.components_file)
return outputs

def _compute_tSTD(self, M, x):
stdM = np.std(M, axis=0)
# set bad values to x
stdM[stdM == 0] = x
stdM[np.isnan(stdM)] = x
stdM[np.isinf(stdM)] = x
Member:
When does this happen? Why are those set to zero?

Contributor Author (@berleant, Sep 13, 2016):
I don't know! This was in the original implementation for aCompCor, though there they are set to 1. In tCompCor ROI selection, I set them to 0 because setting them to 0 guaranteed that they would not be selected for the ROI.

Member:
It's a bit counterintuitive. If the SD is infinite (as a numerical approximation of a very large value), the voxel should be included. For the sake of safety I would remove this line.
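A minimal sketch of the variant suggested above (illustration only, not part of the diff): replace only zero and NaN standard deviations, leaving infinite values alone so such voxels remain eligible:

import numpy as np

def compute_tstd(M, fill_value):
    """Temporal standard deviation per column; zeros and NaNs replaced, infs kept."""
    stdM = np.std(M, axis=0)
    stdM[stdM == 0] = fill_value
    stdM[np.isnan(stdM)] = fill_value
    return stdM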

return stdM

def _add_extras(self, components, extra_regressors):
regressors = np.genfromtxt(self.inputs.extra_regressors)
return np.hstack((components, regressors))

class TCompCorInputSpec(CompCorInputSpec):
# and all the fields in CompCorInputSpec
percentile_threshold = traits.Range(low=0., high=1., value=.02,
exclude_low=True, exclude_high=True,
usedefault=True, desc='the percentile '
'used to select highest-variance '
'voxels. By default the 2% of voxels '
Member:
The default value is ".02", not "2%".

Member:
There should be some clarity that this should be expressed as a value ranging from 0.0 to 1.0 rather than 0 to 100.

Contributor Author:
I'll fix the documentation.

'with the highest variance are used.')

class TCompCor(CompCor):
'''
Interface for tCompCor. Computes a ROI mask based on variance of voxels.

Example
-------

>>> ccinterface = TCompCor()
>>> ccinterface.inputs.realigned_file = 'functional.nii'
>>> ccinterface.inputs.mask_file = 'mask.nii'
>>> ccinterface.inputs.num_components = 1
>>> ccinterface.inputs.use_regress_poly = True
>>> ccinterface.inputs.regress_poly_degree = 2
>>> ccinterface.inputs.percentile_threshold = .03
'''

input_spec = TCompCorInputSpec
output_spec = CompCorOutputSpec

def _run_interface(self, runtime):
imgseries = nb.load(self.inputs.realigned_file).get_data()

# From the paper:
# "For each voxel time series, the temporal standard deviation is
# defined as the standard deviation of the time series after the removal
# of low-frequency nuisance terms (e.g., linear and quadratic drift)."
imgseries = regress_poly(2, imgseries)
imgseries = imgseries - np.mean(imgseries, axis=1)[:, np.newaxis]

time_voxels = imgseries.T
num_voxels = np.prod(time_voxels.shape[1:])

# "To construct the tSTD noise ROI, we sorted the voxels by their
# temporal standard deviation ..."
tSTD = self._compute_tSTD(time_voxels, 0)
sortSTD = np.sort(tSTD, axis=None) # flattened sorted matrix

# use percentile_threshold to pick voxels
threshold_index = int(num_voxels * (1. - self.inputs.percentile_threshold))
threshold_std = sortSTD[threshold_index]
mask = tSTD >= threshold_std
mask = mask.astype(int)

# save mask
mask_file = 'mask.nii'
nb.nifti1.save(nb.Nifti1Image(mask, np.eye(4)), mask_file)
self.inputs.mask_file = mask_file

super(TCompCor, self)._run_interface(runtime)
return runtime

ACompCor = CompCor
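
Following up on the percentile_threshold discussion above, a quick usage sketch (illustration only, not part of the diff): the threshold is a fraction in (0, 1), so the top 2% of highest-variance voxels is requested as 0.02, not 2:

from nipype.algorithms.compcor import TCompCor

tcc = TCompCor()
tcc.inputs.realigned_file = 'functional.nii'
tcc.inputs.percentile_threshold = 0.02  # fraction in (0, 1), i.e. top 2% of voxels
# tcc.run() would write components_file.txt and mask.nii to the working directory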
40 changes: 28 additions & 12 deletions nipype/algorithms/misc.py
@@ -35,6 +35,7 @@
BaseInterfaceInputSpec, isdefined,
DynamicTraitedSpec, Undefined)
from ..utils.filemanip import fname_presuffix, split_filename

iflogger = logging.getLogger('interface')


@@ -257,7 +258,6 @@ def _list_outputs(self):
outputs['nifti_file'] = self._gen_output_file_name()
return outputs


class TSNRInputSpec(BaseInterfaceInputSpec):
in_file = InputMultiPath(File(exists=True), mandatory=True,
desc='realigned 4D file or a list of 3D files')
@@ -308,17 +308,7 @@ def _run_interface(self, runtime):
data = data.astype(np.float32)

if isdefined(self.inputs.regress_poly):
timepoints = img.shape[-1]
X = np.ones((timepoints, 1))
for i in range(self.inputs.regress_poly):
X = np.hstack((X, legendre(
i + 1)(np.linspace(-1, 1, timepoints))[:, None]))
betas = np.dot(np.linalg.pinv(X), np.rollaxis(data, 3, 2))
datahat = np.rollaxis(np.dot(X[:, 1:],
np.rollaxis(
betas[1:, :, :, :], 0, 3)),
0, 4)
data = data - datahat
data = regress_poly(self.inputs.regress_poly, data)
img = nb.Nifti1Image(data, img.get_affine(), header)
nb.save(img, op.abspath(self.inputs.detrended_file))

@@ -343,6 +333,32 @@ def _list_outputs(self):
outputs['detrended_file'] = op.abspath(self.inputs.detrended_file)
return outputs

def regress_poly(degree, data):
''' returns data with degree polynomial regressed out.
The last dimension (i.e. data.shape[-1]) should be time.
'''
datashape = data.shape
timepoints = datashape[-1]

# Rearrange all voxel-wise time-series in rows
data = data.reshape((-1, timepoints))

# Generate design matrix
X = np.ones((timepoints, 1))
for i in range(degree):
polynomial_func = legendre(i+1)
value_array = np.linspace(-1, 1, timepoints)
X = np.hstack((X, polynomial_func(value_array)[:, np.newaxis]))

# Calculate coefficients
betas = np.linalg.pinv(X).dot(data.T)

# Estimation
datahat = X[:, 1:].dot(betas[1:, ...]).T
regressed_data = data - datahat

# Back to original shape
return regressed_data.reshape(datashape)
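
A quick sanity check of the refactored regress_poly (illustration only, not part of misc.py): it accepts data of any shape with time as the last axis and returns an array of the same shape:

import numpy as np
from nipype.algorithms.misc import regress_poly

data = np.random.rand(2, 2, 2, 50) + np.linspace(0, 5, 50)  # synthetic 4D data with a linear drift
detrended = regress_poly(2, data)
assert detrended.shape == data.shape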

class GunzipInputSpec(BaseInterfaceInputSpec):
in_file = File(exists=True, mandatory=True)