
CommandlineBasics

Bonder-MJ edited this page Aug 8, 2014 · 2 revisions

###Basics for working with the commandline

Here we give a short introduction to working on the command line. We start with paths, then introduce the Java virtual machine, and finally get you started with the QTL mapping pipeline.

###Path definitions and commands

Please note that our software expects full paths, although shorter relative paths will also work in most cases. On Windows, a full path to a genotype directory looks like c:\path\to\genotype\dir\ and a full path to a file looks like c:\path\to\genotype\dir\file.txt. Linux and Mac OS use a different path separator: on these systems the same paths look like /path/to/genotype/dir/ and /path/to/genotype/dir/file.txt. The main point is that a path pointing to a directory should end with a trailing slash.
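
As a small illustration of the trailing-slash convention on Linux or Mac OS, the sketch below appends a slash to a directory path only when it is missing. The function name `as_dir_path` is hypothetical and not part of the software.

```shell
# Hypothetical helper: make sure a directory path ends with exactly one trailing slash.
as_dir_path() {
    case "$1" in
        */) printf '%s\n' "$1" ;;    # already ends with a slash: leave it alone
        *)  printf '%s/\n' "$1" ;;   # otherwise append one
    esac
}

as_dir_path /path/to/genotype/dir
as_dir_path /path/to/genotype/dir/
```

Both calls print the same directory path, so the helper can be applied to user input without checking first.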

This manual will combine references to paths with commands that need to be issued for a certain task. For example, at some point in this manual we refer to your phenotype data as traitfile, which will be printed in a grey box. Commands will also be in grey boxes, and can make references to paths defined earlier (to keep the manual readable), as follows:

java -jar eqtl-mapping-pipeline.jar --mode metaqtl --inexp traitfile

To run the command in the example above, replace the string traitfile after the command line switch --inexp with the full path of your traitfile. So if the full path to your traitfile is /path/to/traitfile.txt, the final command is:

java -jar eqtl-mapping-pipeline.jar --mode metaqtl --inexp /path/to/traitfile.txt
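
If you use the same path in several commands, one option is to store it in a shell variable once and let the shell do the substitution. The sketch below only assembles and prints the command, so it can be tried without the jar-file being present; the variable names `traitfile` and `cmd` are chosen here for illustration.

```shell
# Store the full path once; the value is the example path used above.
traitfile="/path/to/traitfile.txt"

# Assemble the command; the shell replaces $traitfile with the stored path.
cmd="java -jar eqtl-mapping-pipeline.jar --mode metaqtl --inexp $traitfile"

# Print the command for inspection instead of running it.
echo "$cmd"
```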

###Java Virtual Machine

Our QTL mapping software is written in Java, which makes the software both fast and portable across multiple operating systems. Executing a Java program is similar to executing a normal program or app, although there are some considerations:

  • QTL mapping relies heavily on available memory. Make sure your machine is 64-bit and has plenty of memory installed (at least 20 GB).

  • Please make sure your version of Java is up to date. Use at least the 64-bit version of Java version 6 (also called version 1.6.0). The software should also work on version 7 and higher. If you are running Windows, you can download the so-called Java Runtime Environment (JRE) from: http://www.java.com. If you are running Linux, the virtual machine may be present in the proprietary section of your package manager, or may be available via http://www.java.com.

  • Java executables are called jar-files. The meQTL mapping pipeline is such a jar-file. You can execute it by using the following command from a terminal / console:

    java -jar eqtl-mapping-pipeline.jar
  • You need to specify the maximum amount of memory available to the program using the command-line switch -Xmx (case-sensitive!). It is also wise to set the initial amount of memory available to the program, which can be specified with the -Xms switch (case-sensitive!). The amount of memory can be specified in megabytes (using an m suffix) or in gigabytes (using a g suffix). To be sure your computer is running Java at 64-bit, please add the switch -d64. These Java VM switches (-Xmx, -Xms, -d64 and others) must come before the -jar switch. To make sure there is enough room to store the SNP and probe information, you should also set -XX:StringTableSize=; choosing a prime number that is slightly higher than the number of SNPs and probes combined yields optimal performance. An example command, which was used to map meQTLs in 100 samples on 400,000 probes and 10,000,000 SNPs, looks like this:
    java -d64 -XX:StringTableSize=15485867 -Xms20g -Xmx40g -jar eqtl-mapping-pipeline.jar
  • Try increasing the -Xmx amount when you get out-of-memory errors or errors involving 'heap space'.
  • If the program returns a "java.lang.OutOfMemoryError: PermGen space" error, try adding -XX:MaxPermSize=512m before the -jar switch. In later releases of the software we use another representation of SNPs, which needs another memory setting to be specified.
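
The advice on -XX:StringTableSize= asks for a prime number slightly above the combined number of SNPs and probes. As a minimal sketch, assuming you know both counts, the shell functions below find the first prime above their sum by trial division. The names `is_prime` and `next_prime` are hypothetical helpers, not part of the software, and any larger prime (such as the one in the example command) serves the same purpose.

```shell
# Succeed (exit status 0) if $1 is prime, using simple trial division.
is_prime() {
    n=$1
    [ "$n" -lt 2 ] && return 1
    i=2
    while [ $((i * i)) -le "$n" ]; do
        [ $((n % i)) -eq 0 ] && return 1
        i=$((i + 1))
    done
    return 0
}

# Print the smallest prime strictly greater than $1.
next_prime() {
    candidate=$(( $1 + 1 ))
    until is_prime "$candidate"; do
        candidate=$((candidate + 1))
    done
    echo "$candidate"
}

# Counts taken from the example in the text: 10,000,000 SNPs and 400,000 probes.
snps=10000000
probes=400000
table_size=$(next_prime $((snps + probes)))
echo "-XX:StringTableSize=$table_size"
```

The printed switch can then be placed before -jar, together with the memory switches.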

IMPORTANT NOTE: In this manual, we assume you understand that you need to allocate a sufficient amount of RAM, and we have therefore left the VM switches out of the example commands. Please be aware that you should still use them, as some of the commands may require a substantial amount of memory (depending on your dataset and settings)!
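
One way to avoid forgetting the VM switches is to assemble every command through a small helper that always places them before -jar. The sketch below is an assumption about how you might organise this yourself: `pipeline_cmd` is a hypothetical function, it only covers -Xms and -Xmx, and it prints the command rather than running it.

```shell
# Hypothetical helper: print a pipeline command with the heap switches placed before -jar.
# Usage: pipeline_cmd <initial heap> <maximum heap> [pipeline arguments...]
pipeline_cmd() {
    xms=$1
    xmx=$2
    shift 2
    echo "java -Xms$xms -Xmx$xmx -jar eqtl-mapping-pipeline.jar $*"
}

# Example: the heap sizes used earlier, combined with the metaqtl example command.
pipeline_cmd 20g 40g --mode metaqtl --inexp /path/to/traitfile.txt
```

Choose heap sizes that fit your own machine and dataset; the values here are just the ones from the earlier example.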

###General information about software

  • The meQTL mapping pipeline is a command line program. To help you, an overview of the available switches is displayed when a command is incomplete or incorrect. Furthermore, each mode of the program has its own overview of available switches, with a short description of what each one does. For example: java -jar eqtl-mapping-pipeline.jar produces a list of available modes, while each individual mode gives information on how to use that specific mode.
  • The software is able to process GZipped text files for most of the input files (not files in .tar archives however), which allows you to save space on your hard drive.
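
As a sketch of how a text file can be compressed with the standard gzip tool, the commands below create a small stand-in file, compress it, and read the first line back without unpacking it to disk. The file name and its contents are placeholders for illustration, not a description of the real input format.

```shell
# Create a small stand-in trait file in a temporary directory (illustrative only).
workdir=$(mktemp -d)
traitfile="$workdir/traitfile.txt"
printf 'probe\tsample1\tsample2\n' > "$traitfile"

# Compress it; gzip replaces traitfile.txt with traitfile.txt.gz.
gzip "$traitfile"

# Peek at the compressed content without writing an uncompressed copy.
gzip -dc "$traitfile.gz" | head -n 1
```

Note that this compresses a single file; as stated above, files packed into .tar archives are not supported.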