This package is intended to help overcome limitations when estimating a model's evolutionary history. It takes user-provided files and parameters and creates several random topologies for the user to then test using fastsimcoal28.
- Clone the CoalMiner repo onto your machine:
    git clone https://github.com/raywray/CoalMiner.git
- Move into the CoalMiner directory:
    cd CoalMiner
- Create a conda environment with all necessary conda or mamba packages (Anaconda or Miniconda must already be installed):
    conda env create -f environment.yml -y
- Activate the conda environment:
    conda activate coalminer_env

CoalMiner requires only two types of input files: an SFS and a .yml file with user parameters.
Users can create their SFSs from .vcf files using several packages, including PPP and easySFS (see footnote 1).
You can specify the paths to your .obs files directly in the user parameter .yml file (see below), or alternatively place them directly in the CoalMiner directory using:
cp [prefix]_joint*.obs CoalMiner/
# Where [prefix] is your chosen internal prefix for your files

The user parameter .yml file must contain the following information:
- INPUT_PREFIX: the output prefix the user would like to use
- NUM_POPS: the number of populations being tested
- SAMPLE_SIZES: as a bulleted list, the number of sampled individuals from each population (haploid)
Additionally, prior distributions, ranges, and types must be provided under the MODEL_PARAMS section for:
- mutation rates: mutation_rate_dist
- effective population sizes: effective_pop_size_dist
- migration rates: migration_dist
- time in generations: time_dist
- and an optional value for the maximum number of generations between events (default=1000): max_time_between_events
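The required keys listed above can be checked before a run. The following is a minimal sketch, not CoalMiner's own validation: it assumes the .yml file has already been parsed into a Python dict (for example with a YAML loader), and the function name `find_problems` is hypothetical. The inner fields of each `*_dist` entry are not checked, because their layout is not described here.

```python
# Hypothetical pre-flight check for a parsed CoalMiner parameter dict.
REQUIRED_TOP_LEVEL = ("INPUT_PREFIX", "NUM_POPS", "SAMPLE_SIZES", "MODEL_PARAMS")
REQUIRED_MODEL_PARAMS = (
    "mutation_rate_dist",
    "effective_pop_size_dist",
    "migration_dist",
    "time_dist",
)  # max_time_between_events is optional, so it is not listed


def find_problems(params):
    """Return human-readable descriptions of missing or inconsistent entries."""
    problems = [key for key in REQUIRED_TOP_LEVEL if key not in params]

    model_params = params.get("MODEL_PARAMS") or {}
    problems += [
        f"MODEL_PARAMS.{key}"
        for key in REQUIRED_MODEL_PARAMS
        if key not in model_params
    ]

    # SAMPLE_SIZES gives one sample size per population.
    sizes = params.get("SAMPLE_SIZES")
    num_pops = params.get("NUM_POPS")
    if isinstance(sizes, list) and isinstance(num_pops, int) and len(sizes) != num_pops:
        problems.append(f"SAMPLE_SIZES ({len(sizes)} entries, but NUM_POPS is {num_pops})")
    return problems


# Illustrative input only; the sample sizes are made up.
example = {
    "INPUT_PREFIX": "hom_sap",
    "NUM_POPS": 3,
    "SAMPLE_SIZES": [10, 10],
    "MODEL_PARAMS": {"mutation_rate_dist": {}, "time_dist": {}},
}
for problem in find_problems(example):
    print("problem:", problem)
```

An empty list from `find_problems` means every required key is present and SAMPLE_SIZES has one entry per population.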
Optional Parameters:
- OUTPUT_DIR: path for output
- NUM_RANDOM_MODELS: the number of random topologies to generate (default=100)
- OBS_FILES: list of paths to your .obs files. If not provided, CoalMiner will look for files matching INPUT_PREFIX*.obs in the current directory. Supports absolute paths, relative paths, and ~ for the home directory. Example:
OBS_FILES:
- /absolute/path/to/hom_sap_DSFS.obs
- relative/path/to/hom_sap_jointDAFpop1_0.obs
- ~/data/hom_sap_jointDAFpop2_0.obs

Example input .yml files can be found in the tutorial/example_input_files/ directory.
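The OBS_FILES lookup rules described above (explicit paths first, otherwise files matching INPUT_PREFIX*.obs) can be sketched as follows. This is an illustration under stated assumptions, not CoalMiner's actual code: the function name `resolve_obs_files` and the `base_dir` argument are hypothetical, and resolving relative paths against `base_dir` (the current directory by default) is an assumption.

```python
import glob
import os


def resolve_obs_files(params, base_dir="."):
    """Return absolute paths to the .obs files described by a parsed parameter dict."""
    listed = params.get("OBS_FILES")
    if listed:
        resolved = []
        for raw in listed:
            path = os.path.expanduser(raw)  # "~" becomes the home directory
            if not os.path.isabs(path):
                # Assumption: relative paths are taken relative to base_dir.
                path = os.path.join(base_dir, path)
            resolved.append(os.path.abspath(path))
        return resolved

    # Fallback: every file named INPUT_PREFIX*.obs in base_dir.
    pattern = os.path.join(
        glob.escape(base_dir), glob.escape(params["INPUT_PREFIX"]) + "*.obs"
    )
    return sorted(os.path.abspath(p) for p in glob.glob(pattern))


# With OBS_FILES present, the listed paths are used and no directory search is done.
print(resolve_obs_files({"INPUT_PREFIX": "hom_sap", "OBS_FILES": ["~/data/hom_sap_DSFS.obs"]}))
```

The sketch only builds paths; it does not check that the listed files exist.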
After the input files have been created, run CoalMiner with the following command:
python3 [path_to_coalminer.py] [path_to_user_input_yml]
For example:
python3 /Users/foo/Projects/CoalMiner/coalminer.py /Users/foo/Projects/coalminer_input.yml

CoalMiner generates random .est and .tpl files and saves them in directories titled {prefix}_random_model_1, {prefix}_random_model_2, etc., in the output directory. It also copies the provided SFS files into the respective model directories. Example output files can be seen in the tutorial/example_output_files directory.
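After a run, it can be useful to confirm that every model directory holds the files described above. The following is a rough sketch that relies only on the {prefix}_random_model_N directory naming and the .est, .tpl, and .obs extensions; the function name `summarize_models` is hypothetical and is not part of CoalMiner.

```python
from pathlib import Path


def summarize_models(output_dir, prefix):
    """Map each {prefix}_random_model_N directory to the file types found in it."""
    summary = {}
    for model_dir in sorted(Path(output_dir).glob(f"{prefix}_random_model_*")):
        if not model_dir.is_dir():
            continue
        suffixes = {p.suffix for p in model_dir.iterdir() if p.is_file()}
        # True/False per expected file type, keyed by the directory name.
        summary[model_dir.name] = {
            ext: ext in suffixes for ext in (".est", ".tpl", ".obs")
        }
    return summary


# Report any model directory in the current directory that lacks an expected file type.
for name, found in summarize_models(".", "hom_sap").items():
    missing = [ext for ext, present in found.items() if not present]
    if missing:
        print(f"{name}: missing {', '.join(missing)}")
```

Directories are visited in lexicographic order, so `_random_model_10` is listed before `_random_model_2`.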
All example files can be found in the tutorial/example_input_files directory. These files are used in the video tutorial. To see how the example files work, run the following commands from inside the CoalMiner directory:
# Run coalminer (the example YAML already specifies the paths to the .obs files)
python3 coalminer.py tutorial/example_input_files/hom_sap_3_pop_model.yml

Note: The example now uses the OBS_FILES parameter in the YAML to specify file paths. If you prefer the old method, you can copy the .obs files to the current directory and remove the OBS_FILES section from the YAML:
# 1. Copy the Homo sapiens example .obs files into the CoalMiner directory
cp tutorial/example_input_files/hom_sap_joint*.obs .
# 2. Run coalminer
python3 coalminer.py tutorial/example_input_files/hom_sap_3_pop_model.yml

Footnotes
1. fastsimcoal will only run with specific SFS suffix names. See the OBSERVED SFS FILE NAMES section of the fastsimcoal manual.
