Commit ec8705c

[DATALAD RUNCMD] Remove trailing whitespaces
=== Do not change lines below ===
{
 "chain": [],
 "cmd": "sed -i -e 's, *$,,g' 00-overview.md 01-shell-basics.md 02-vcs.md 03-packages.md 04-legalities.md 05-misc.md 11-wrap-up.md",
 "exit": 0,
 "extra_inputs": [],
 "inputs": [],
 "outputs": [],
 "pwd": "_episodes"
}
^^^ Do not change lines above ^^^
1 parent f8b0fe6 commit ec8705c
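
As a rough illustration of what the recorded `cmd` does, the following sketch applies the same substitution to a throwaway file. The file name `demo.md` and its contents are hypothetical. The sketch writes to a new file rather than using `-i`, since in-place editing syntax differs between GNU and BSD sed, and it omits the trailing `g` flag on the assumption that an end-anchored pattern matches at most once per line.

```shell
#!/bin/sh
# Hypothetical demo input: two lines with trailing spaces, one without.
tmpdir=$(mktemp -d)
printf 'first line   \nsecond line \nthird line\n' > "$tmpdir/demo.md"

# Same substitution as in the run record, written to a new file instead of
# editing in place, so it behaves the same with GNU and BSD sed.
sed -e 's, *$,,' "$tmpdir/demo.md" > "$tmpdir/demo.clean.md"

# Count the lines that end in a space, before and after the substitution.
# grep -c exits non-zero when the count is 0, hence the "|| true".
before=$(grep -c ' $' "$tmpdir/demo.md" || true)
after=$(grep -c ' $' "$tmpdir/demo.clean.md" || true)
echo "lines with trailing spaces: before=$before after=$after"

rm -r "$tmpdir"
```

The pattern matches space characters only, so trailing tabs would be left in place.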

6 files changed, 93 additions and 93 deletions

_episodes/00-overview.md

Lines changed: 7 additions & 7 deletions
@@ -18,16 +18,16 @@ keypoints:
 
 The term "reproducibility" conjures a mental image of dedicated systems
 conducting automated and repeatable computations. However, **you** can
-embrace reproducibility in your day-to-day research activities.
-Neuroimaging is a heavily data- and software-driven field of science.
+embrace reproducibility in your day-to-day research activities.
+Neuroimaging is a heavily data- and software-driven field of science.
 As a result, by learning best practices for the tools you already use daily,
 you will discover ways to improve your efficiency and increase the
 reproducibility of your research.
 
 Reproducibility requires us to know the **what**, **when**, and **how**
-for any particular analysis we carry out. The lessons in this module will
-help us answer those questions. Before addressing these specific questions,
-consult the referenced external materials (tutorials, lessons, etc.) to get a
+for any particular analysis we carry out. The lessons in this module will
+help us answer those questions. Before addressing these specific questions,
+consult the referenced external materials (tutorials, lessons, etc.) to get a
 more generic and thorough treatment of the topics.
 
 
@@ -83,9 +83,9 @@ or recommended practices.**
 ### What are the lessons in this module?
 
 This module introduces three somewhat independent topics at the heart of
-efficient and reproducible scientific computing: command line/shell,
+efficient and reproducible scientific computing: command line/shell,
 version control systems (for code and data), distribution package managers,
-and a few additional aspects such as bug reporting and licensing. It's
+and a few additional aspects such as bug reporting and licensing. It's
 unlikely that you've managed to completely avoid those tools so far,
 but it's possible that you've under-utilized their capabilities.
 Gaining additional skills in any of these topics can not only help

_episodes/01-shell-basics.md

Lines changed: 10 additions & 10 deletions
@@ -58,7 +58,7 @@ for loops, functions and conditions. So, in contrast to GUIs (graphical
 user interfaces), such automation via scripting is a native feature of
 a CLI shell. Unlike GUI-integrated environments with lots of
 functionality exposed in menu items and icons, shell is truly a "black
-box", with lots of powerful underlying features integral to efficient use.
+box", with lots of powerful underlying features integral to efficient use.
 Since manipulating files is one of the main tasks in a shell, a shell usually
 comes with common commands (such as `cp`, `mv`, etc.) built in
 or provided by an additional package (e.g., `coreutils` in Debian).
@@ -95,7 +95,7 @@ failed interim execution.
 >
 > Relevant Books:
 >
-> - [Data Science at the Command Line](http://datascienceatthecommandline.com) --
+> - [Data Science at the Command Line](http://datascienceatthecommandline.com) --
 > contains a list of
 > command line tools useful for “data science”
 {: .callout}
@@ -152,7 +152,7 @@ failed interim execution.
 
 
 > ## What is a shebang?
-> It is the first line in the script, which starts with `#!` and is
+> It is the first line in the script, which starts with `#!` and is
 > followed by the command interpreting the script; e.g.,
 > if a file `blah` begins with the following:
 > ~~~
@@ -224,7 +224,7 @@ unintentionally run a different version than intended and end up with different
 > ~~~
 > {: .bash}
 > Do not confuse this with the `locate` command, which (if available) would
-> find a file containing the specified word somewhere in the file name/path.
+> find a file containing the specified word somewhere in the file name/path.
 {: .solution}
 
 
@@ -290,7 +290,7 @@ unintentionally run a different version than intended and end up with different
 
 > ## Why is ${variable} is preferable over $variable?
 >
-> You use ${variable} to safely concatenate a variable with another string.
+> You use ${variable} to safely concatenate a variable with another string.
 > For instance, if you had a variable `filename` that contains the value
 > `preciousfile`, `$filename_modified` would refer to the value of the
 > possibly undefined `filename_modified` variable; on the other hand, `${filename}_modified`
@@ -405,12 +405,12 @@ modules.
 
 ## Efficient use of the interactive shell
 
-A shell can be used quite efficiently once you become familiar with its
+A shell can be used quite efficiently once you become familiar with its
 features and configure it to simplify common operations.
 
 ### aliases
 
-Aliases are shortcuts for commonly used commands and can add
+Aliases are shortcuts for commonly used commands and can add
 options to calls for most common commands. Please review useful aliases presented in
 [30 Handy Bash Shell Aliases For Linux / Unix / Mac OS X](https://www.cyberciti.biz/tips/bash-aliases-mac-centos-linux-unix.html).
 
@@ -497,7 +497,7 @@ Some shortcuts can not only edit command line text, but also control the executi
 
 By default, a shell stores in memory a history of the commands you
 have run. You can access this log using the `history` command. When you exit
-the shell, those history lines are appended to a file (by default in
+the shell, those history lines are appended to a file (by default in
 `~/.bash_history` for bash shell). This not
 only allows you to quickly recall commands you have run recently, but
 can effectively provide a "lab notebook" of the actions you have
@@ -664,7 +664,7 @@ your script performs as expected.
 [Unit-testing](https://en.wikipedia.org/wiki/Unit_testing) is a
 powerful paradigm to verify that pieces of your code (units) operate
 correctly in various scenarios, and that these assumptions are represented in
-the code. An interesting observation is that everyone does at least some
+the code. An interesting observation is that everyone does at least some
 "testing" by simply running their code/scrip on an input and checking
 that the output matches their expectations. Unit-testing just takes this
 workflow one step further: code such tests in a separate file so you can run
@@ -673,7 +673,7 @@ that your script still performs correctly. In the simplest case, you can
 just copy your test commands into a separate script that would fail if
 any command within it fails (therfore effectively testing your target
 script(s)).
-
+
 For example, the following script could be used to test basic correct operations
 of AFNI's `1dsum` command:
 

_episodes/02-vcs.md

Lines changed: 26 additions & 26 deletions
@@ -3,7 +3,7 @@ title: "Version control systems"
 teaching: 300
 exercises: 40
 questions:
-- How do version control systems facilitate reproducibility, and
+- How do version control systems facilitate reproducibility, and
 which systems should be used?
 objectives:
 - Become familiar with version control systems for
@@ -38,7 +38,7 @@ we do it in an ad-hoc manner:
 
 In general, a VCS helps you track **versions** of digital artifacts,
 such as code (scripts, source files), configuration files, images,
-documents, and data -- both original or derived (e.g., the outcome of an
+documents, and data -- both original or derived (e.g., the outcome of an
 analysis). With proper annotation of changes, a VCS becomes the lab notebook
 for changing content in the digital world. Since all versions are stored,
 VCS makes it possible to provide any previous version at a later
@@ -122,7 +122,7 @@ control over your digital research artifacts and notes
 > subject from release 1.0.0 to 1.1.0.
 > > ## Answer
 > > git diff allows us to see the differences between points in the Git history
-> > and to optionally restrict the search to the specific file(s), so the answers to the
+> > and to optionally restrict the search to the specific file(s), so the answers to the
 > > challenge were `git tag` and `git grep`:
 > > ~~~
 > > % git diff 1.0.0..1.1.0 -- expected_output/AnnArbor_sub16960/segstats.json
@@ -143,11 +143,11 @@ control over your digital research artifacts and notes
 ## Third-party services
 
 As you learned in the [Remotes in GitHub](http://swcarpentry.github.io/git-novice/07-github/) section of the [Software Carpentry Git course](http://swcarpentry.github.io/git-novice/), the [GitHub] website provides you with public (or private) storage for your Git repositories on the web.
-The GitHub website also allows third-party websites to interact with your repositories
+The GitHub website also allows third-party websites to interact with your repositories
 to provide additional services, typically in response to new changes
 to your repositories. Visit [GitHub Marketplace](https://github.com/marketplace) for an
 overview of the vast collection of such additional services. Some services are free,
-some are "pay-for-service". Students can benefit from obtaining a
+some are "pay-for-service". Students can benefit from obtaining a
 [Student Developer Pack](https://education.github.com/pack) to gain free access to
 some services which otherwise would require a fee.
 
@@ -167,25 +167,25 @@ For example, see [simple workflow](https://github.com/ReproNim/simple_workflow)
 with [GitHub], and is free for publicly available projects.
 
 > ## External teaching materials
-> - [A quick Travis CI Tutorial for Node.js developers (full: 20m)](https://github.com/dwyl/learn-travis) --
+> - [A quick Travis CI Tutorial for Node.js developers (full: 20m)](https://github.com/dwyl/learn-travis) --
 > a good description of all necessary steps to enable Travis CI for your GitHub project;
 > although geared toward Node.js projects, the same principles apply to other platforms/languages.
-> - [Shablona - A template for small scientific python projects (review: 5m, optional)](https://github.com/uwescience/shablona) --
+> - [Shablona - A template for small scientific python projects (review: 5m, optional)](https://github.com/uwescience/shablona) --
 > a template for scientific Python projects; review its `.travis.yml` for an example
 > of a typical setup for a Python-based project.
-> - [Travis CI Documentation (familiarize: 10m, canonical reference)](https://docs.travis-ci.com/) --
+> - [Travis CI Documentation (familiarize: 10m, canonical reference)](https://docs.travis-ci.com/) --
 > documentation for Travis CI; review sections relevant to your language/platform.
 {: .callout}
 
 #### CircleCI
 
 > ## External teaching materials
-> - [CircleCI 1.0 Documentation (familiarize: 10m, canonical reference)](https://circleci.com/docs/1.0) --
+> - [CircleCI 1.0 Documentation (familiarize: 10m, canonical reference)](https://circleci.com/docs/1.0) --
 > documentation for CircleCI; review sections relevant to your language/platform.
 {: .callout}
 
 > ## External review materials
-> - [Continuous Integration in the Cloud: Comparing Travis, Circle and Codeship (review: 10m)](https://strongloop.com/strongblog/node-js-travis-circle-codeship-compare/) --
+> - [Continuous Integration in the Cloud: Comparing Travis, Circle and Codeship (review: 10m)](https://strongloop.com/strongblog/node-js-travis-circle-codeship-compare/) --
 > having acquainted yourself with the basics of two CIs, review the differences.
 > - [Side-by-side comparison of CI services: review 5m](https://www.slant.co/versus/625/2481/~circleci_vs_appveyor)
 {: .callout}
@@ -204,21 +204,21 @@ with [GitHub], and is free for publicly available projects.
 without committing the (large) content of those files directly under git.
 In a nutshell, [git-annex]
 
-- moves actual data file(s) under `.git/annex/objects` into a file typically
-named according to the [checksum](https://en.wikipedia.org/wiki/Checksum) of
+- moves actual data file(s) under `.git/annex/objects` into a file typically
+named according to the [checksum](https://en.wikipedia.org/wiki/Checksum) of
 the file's content, and in its place creates a [symbolic link](https://en.wikipedia.org/wiki/Symbolic_link) (symlink) pointing to that new location
 - commits that symlink (not actual data) under git, so a file of any size
 has the same small footprint within git
 - within `git-annex` branch, the location of the data file (on which machine, clone, or
 web URL) is recorded
-
+
 Later on, if you have access to the clones of the repository containing
 the file, you can easily `get` it (which will download/copy that file
-under `.git/annex/objects`) or `drop` it (which will remove that file from
+under `.git/annex/objects`) or `drop` it (which will remove that file from
 `.git/annex/objects`).
 
-Since Git doesn't contain the actual content of large files, but
-instead just contains symlinks and information in the `git-annex` branch, it
+Since Git doesn't contain the actual content of large files, but
+instead just contains symlinks and information in the `git-annex` branch, it
 becomes possible to
 
 - have very lean Git repositories pointing to arbitrarily large files
@@ -230,16 +230,16 @@ becomes possible to
 ### Note
 
 Never manually `git merge` a `git-annex` branch; [git-annex] uses a special merge
-algorithm to merge data availability information, and you should use
-[git annex merge](https://git-annex.branchable.com/git-annex-merge/)
+algorithm to merge data availability information, and you should use
+[git annex merge](https://git-annex.branchable.com/git-annex-merge/)
 or [git annex sync](https://git-annex.branchable.com/git-annex-sync/)
 commands to merge the `git-annex` branch correctly.
 
 > ## External teaching materials
-> - [git-annex walkthrough from a cognitive scientist (full: 30 min)](https://github.com/jhamrick/git-annex-tutorial/blob/master/Tutorial%20on%20git-annex.ipynb) --
+> - [git-annex walkthrough from a cognitive scientist (full: 30 min)](https://github.com/jhamrick/git-annex-tutorial/blob/master/Tutorial%20on%20git-annex.ipynb) --
 > a Jupyter notebook; please go through all the items by running
 > the notebook cells or copy/pasting them into a terminal.
-> - [git-annex walkthrough (full: 10 min)](http://git-annex.branchable.com/walkthrough/) --
+> - [git-annex walkthrough (full: 10 min)](http://git-annex.branchable.com/walkthrough/) --
 > original git-annex walkthrough; go through all sections to see
 > which aspects previous walkthroughs did not cover.
 > - (optional) [Another walkthrough on a typical use-case for sync'ing)](https://writequit.org/articles/getting-started-with-git-annex.html)
@@ -248,7 +248,7 @@ commands to merge the `git-annex` branch correctly.
 > ## Exercise: getting data files controlled by git-annex
 >
 > Using git/git-annex commands
->
+>
 > 1. “Download" a [BIDS](http://bids.neuroimaging.io) dataset from https://github.com/datalad/ds000114
 > 2. `get` all non-preprocessed T1w anatomicals
 > 3. Try (and fail) to get all `T1.mgz` files
@@ -281,7 +281,7 @@ commands to merge the `git-annex` branch correctly.
 > {: .bash}
 >
 > ### Advanced method (for all future `git annex add` calls)
-> If you want to
+> If you want to
 > [automate such "decision making"](http://git-annex.branchable.com/tips/largefiles/)
 > based on either file extensions
 > and/or their sizes, you can specify those rules within a `.gitattributes` file.
@@ -306,7 +306,7 @@ commands to merge the `git-annex` branch correctly.
 The [DataLad] project relies on Git and git-annex to establish an
 integrated data monitoring, management, and distribution environment.
 As data distribution capitalizing on a number of "data
-crawlers" for existing data portals, it provides unified access to over
+crawlers" for existing data portals, it provides unified access to over
 50TB of data from various initiatives (such as CRCNS, OpenfMRI, etc.).
 
 > ## External teaching materials
@@ -339,11 +339,11 @@ crawlers" for existing data portals, it provides unified access to over
 >
 > Using DataLad commands, and starting with your existing clone of `ds000114`
 > from the preceding exercise, do the following:
->
+>
 > 1. Create sub-dataset `derivatives/demo-bet`
 > 2. Using a skull-stripping tool (e.g., `bet` from FSL) to produce a
 > skull-stripped anatomical for each subject under the subdirectory
-> `derivatives/demo-bet`; use the `datalad run` command
+> `derivatives/demo-bet`; use the `datalad run` command
 > (available in DataLad 0.9 or later) to keep a record of your analysis
 > 3. [Publish](http://docs.datalad.org/en/latest/generated/man/datalad-publish.html)
 > your work to your fork of the repository on GitHub and upload data to your
@@ -363,7 +363,7 @@ crawlers" for existing data portals, it provides unified access to over
 > > % git annex initremote box.com type=webdav url=https://dav.box.com/dav/team/ds000114--demo-bet chunk=50mb encryption=none
 > > % datalad create-sibling-github --publish-depends box.com --access-protocol https ds000114--demo-bet
 > > % datalad publish --to github sub\* # 3/
-> > %
+> > %
 > > ~~~
 > > {: .bash}
 > {: .solution}
