crukci-bioinformatics/pod5split

Pod5 Split

This tool fills a gap in the ONT Pod5 tools suite: it splits a large Pod5 file into chunks, each containing a given number of reads.
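
To make the idea of fixed-size chunks concrete, here is a minimal sketch in plain Python. It is not taken from pod5split.py: the helper names `chunked` and `chunk_name`, and the `<base>_<index>.pod5` naming pattern, are assumptions made purely for illustration, and the real tool may name and order its output differently.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunked(read_ids: Iterable[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most chunk_size read IDs."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(read_ids)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return  # input exhausted
        yield chunk


def chunk_name(base: str, index: int) -> str:
    """Build an output file name (hypothetical naming scheme)."""
    return f"{base}_{index:04d}.pod5"


# Example: five placeholder read IDs split into chunks of two.
ids = ["read-a", "read-b", "read-c", "read-d", "read-e"]
for i, chunk in enumerate(chunked(ids, 2)):
    print(chunk_name("sample", i), chunk)
```

The last chunk simply holds whatever reads remain, so it can be smaller than the requested size.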

Running the Tool

Assuming your environment is set up correctly (see below), the program is run on the command line thus:

python3 pod5split.py [-h] [-b <name>] [-o <dir>] [-r <int>] [-t <int>] <pod5 file>

The required argument is the path to the Pod5 file you want to split. Other options are:

-b/--base: The base name for the output file chunks. If this is not provided, the base name of the input pod5 file is used.

-o/--out: The directory to write the chunk files to. Defaults to the current working directory if not given. The directory is created if it does not exist.

-r/--reads: The number of reads to put in each chunk. The default is 25,000.

-t/--threads: The number of threads to use for concurrent processing. The default is 4, or the number of cores if your computer has fewer than 4.
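
The options above could be declared with `argparse` roughly as follows. This is a sketch under stated assumptions, not the tool's actual source: the function names (`default_threads`, `build_parser`, `resolve_base`), the metavar spellings, and the choice to strip the file extension when deriving the base name are all illustrative guesses.

```python
import argparse
import os
from pathlib import Path


def default_threads() -> int:
    """Return 4, or the core count if the machine has fewer than 4 cores."""
    return min(4, os.cpu_count() or 1)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="pod5split.py",
        description="Split a Pod5 file into chunks of a given number of reads.",
    )
    parser.add_argument("-b", "--base", metavar="NAME",
                        help="base name for the output chunk files")
    parser.add_argument("-o", "--out", metavar="DIR", type=Path, default=Path.cwd(),
                        help="directory to write the chunk files to")
    parser.add_argument("-r", "--reads", metavar="INT", type=int, default=25000,
                        help="number of reads per chunk")
    parser.add_argument("-t", "--threads", metavar="INT", type=int,
                        default=default_threads(),
                        help="number of threads for concurrent processing")
    parser.add_argument("pod5_file", type=Path, help="the Pod5 file to split")
    return parser


def resolve_base(args: argparse.Namespace) -> str:
    """Use --base if given, else the input file name without its extension.

    Dropping the extension is an assumption about what "base name" means.
    """
    return args.base if args.base else args.pod5_file.stem


args = build_parser().parse_args(["-r", "1000", "runs/input.pod5"])
print(resolve_base(args), args.reads, args.threads)
```

Creating the output directory when it does not exist can be done with `args.out.mkdir(parents=True, exist_ok=True)` before any chunk is written.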

Creating the Virtual Environment

This tool has dependencies on the ONT Pod5 Python package, which in turn has its own dependencies. If you are using the tool outside of a container, you will probably want to create and activate a Python virtual environment within which you can run the tool.

% python3 -m venv venv
% source venv/bin/activate
% pip install -r requirements.txt

Unit Tests

The unit test relies on pytest, which is installed in the virtual environment along with the other dependencies. To run the test, type the following at the top level of the repository:

pytest
