Commit

release 0.9
seanlaidlaw committed Apr 7, 2019
1 parent 59d49c4 commit 27fc0e7
Showing 12 changed files with 47 additions and 42 deletions.
35 changes: 14 additions & 21 deletions README.md
@@ -1,17 +1,14 @@
[![Known Vulnerabilities](https://snyk.io/test/github/SL-LAIDLAW/Metallaxis/badge.svg?targetFile=requirements.txt)](https://snyk.io/test/github/SL-LAIDLAW/Metallaxis?targetFile=requirements.txt)
[![Maintainability](https://api.codeclimate.com/v1/badges/636053f63e1587622300/maintainability)](https://codeclimate.com/github/SL-LAIDLAW/Metallaxis/maintainability)
[![Documentation Status](https://readthedocs.org/projects/metallaxis/badge/?version=latest)](https://metallaxis.readthedocs.io/en/latest/?badge=latest)
[![PyPI version](https://badge.fury.io/py/Metallaxis.svg)](https://badge.fury.io/py/Metallaxis)

[![PyPI version](https://badge.fury.io/py/Metallaxis.svg)](https://badge.fury.io/py/Metallaxis)
[![Python 3.6](https://img.shields.io/badge/python-3.6-blue.svg)](https://www.python.org/downloads/release/python-360/)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)

# Metallaxis

Metallaxis is a Python graphical interface for viewing and annotating VCF
files. On loading a VCF file or compressed variant (vcf.gz,vcf.xz) it will make
an API call to EMSEMBL's [VEP](https://www.ensembl.org/vep) with the IDs of the
VCF.
files. On loading a VCF file or compressed variant (vcf.gz,vcf.xz) it will generate statistics, graphs, and open a table view where the variants can be sorted, filtered, and their position visualised compared to the location of genes, based on data retrieved from API requests to ENSEMBL.

Additionally, there is the option to annotate the VCF, whereby the dbSNP and clinVar databases are downloaded and SnpEff is used to annotate the VCF, to view the impact of the variant.

On the interface basic statistics and graphs are displayed in the
statistics pane, and a Table pane is also avaliable on which variants can be
@@ -20,12 +17,12 @@ filtered by data from VCF, from annotation, or parsed form the INFO column.
## Features
- INFO column splitting into columns that can be sorted
- Filtering of all VCF columns
- Automatic annotation from VEP (provided VCF is human)
- Automatic annotation from dbSNP, ClinVar, and ENSEMBL (provided VCF is human)
- Automatically generated statistics and graphs
- Savable analysis as a HDF5 data store
- Savable analysis as a portable sqlite database

## Authors
Sean Laidlaw & Qiqi He
Sean Laidlaw

## Requirements:
Python:
@@ -35,10 +32,10 @@ Libraries
- python-magic : 0.4.15
- pandas : 0.23.4
- numpy : 1.15.4
- tables : 3.4.4
- PyQt5 : 5.11.2
- requests : 2.20.1
- matplotlib : 3.0.2
- wget : 3.2


## Installation
@@ -76,38 +73,34 @@ python3 -m metallaxis ../samples/1000_genomes_extract.vcf.gz
Or to load a previously saved Metallaxis session, by using the saved HDF5 as
argument:
```bash
python3 -m metallaxis ../saves/big_saved_analysis.h5
python3 -m metallaxis ../saves/big_saved_analysis.sqlite
```


## Screenshots

Changing the Species to "Human" in the Settings window allows you to check "Annotate Variants", meaning that the next loaded file will have its IDs sent to be annotated from the VEP API.

![settings_annot_off](img/interface_settings_annotation_off.png)
![settings_annot_on](img/interface_settings_annotation_on.png)

Example variant data displaying statistics of variants by chromosome, types of variants, and averages. As well as number of variants by position for each chromosome.
Example variant data displaying statistics of variants by chromosome, as well as number of variants by position for each chromosome.
![window_statistics](img/interface_window_statistics.png)
![window_statistics_pos](img/interface_window_statistics_pos.png)

Example variant data with VEP annotation listing consequence terms, biotype, gene id, impact, etc. for some of our variants.
Example variant data with annotation listing consequence terms, biotype, gene id, impact, etc. for some of our variants.

![window_annotation](img/interface_window_annotation.png)

Example data showing off filtering ability of _Metallaxis_, not just limited to normal VCF columns but works equally well on recently obtained annotation columns:

![window_filter](img/interface_window_filter.png)

Visualising variant position compared to nearby genes is also easy, with a customisable window, allowing zooming in or out.

![gene_view](img/interface_window_gene_view.png)
![gene_view](img/interface_window_gene_view_zoom.png)


## TODO
1. Rewrite API request to retry in case of server error
1. Generate statistics off of annotation data
1. Write documentation to be generated by Sphinx
1. Split \_\_main__ into separate files for easy importing and readability
1. Optimise writing annotation data to HDF5 to reduce performance bottleneck
1. Add secondary annotation to get ontology and phenotype information
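
Review note: the README change above moves saved analyses from HDF5 to a SQLite file, which suggests a saved session could be inspected outside Metallaxis with Python's standard `sqlite3` module. The following is a minimal sketch under that assumption; `list_tables` is a hypothetical helper, not part of Metallaxis, and the tables a real save file contains are not confirmed by this README.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of the tables stored in a SQLite file, sorted by name.

    Hypothetical helper for illustration; not part of Metallaxis.
    """
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]
```

Called on a saved analysis (for example the `../saves/big_saved_analysis.sqlite` path used in the README), this would list whatever tables the save contains.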


4 changes: 2 additions & 2 deletions docs/conf.py
@@ -26,7 +26,7 @@
# The short X.Y version
version = ''
# The full version, including alpha/beta/rc tags
release = '0.8'
release = '0.9'


# -- General configuration ---------------------------------------------------
@@ -176,4 +176,4 @@
epub_exclude_files = ['search.html']


# -- Extension configuration -------------------------------------------------
# -- Extension configuration -------------------------------------------------
Binary file removed img/interface_settings_annotation_off.png
Binary file not shown.
Binary file removed img/interface_settings_annotation_on.png
Binary file not shown.
Binary file added img/interface_window_gene_view.png
Binary file added img/interface_window_gene_view_zoom.png
Binary file modified img/interface_window_statistics.png
13 changes: 10 additions & 3 deletions metallaxis/MetallaxisGui.ui
@@ -30,7 +30,7 @@
<enum>Qt::LeftToRight</enum>
</property>
<property name="currentIndex">
<number>2</number>
<number>1</number>
</property>
<property name="usesScrollButtons">
<bool>false</bool>
@@ -245,14 +245,21 @@
<number>0</number>
</property>
<item>
<widget class="QLabel" name="label_3">
<widget class="QLabel" name="chrom_selection_label">
<property name="enabled">
<bool>false</bool>
</property>
<property name="text">
<string>Choose a chromosome to view variants by position:</string>
</property>
</widget>
</item>
<item>
<widget class="QComboBox" name="chrom_selection_stat_comboBox"/>
<widget class="QComboBox" name="chrom_selection_stat_comboBox">
<property name="enabled">
<bool>false</bool>
</property>
</widget>
</item>
</layout>
</item>
3 changes: 2 additions & 1 deletion metallaxis/SVGClasses.py
@@ -1,4 +1,5 @@
#!/usr/bin/env python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""\
SVG.py - Construct/display SVG scenes.
25 changes: 15 additions & 10 deletions metallaxis/__main__.py
@@ -35,7 +35,7 @@
from PyQt5.QtSvg import QSvgWidget

# Import SVG Drawing Classes
import SVGClasses
from metallaxis import SVGClasses

# for plotting graphs
from matplotlib.backends.backend_qt5agg import FigureCanvasQTAgg as FigureCanvas
@@ -565,9 +565,13 @@ def database_encode(decompressed_file, variant_stats, metadata_dict):
(metadata_dict). Then encodes them as tables into a database.
Returns filename of created sqlite database.
"""
os.remove(sqlite_output_name)
sqlite_output = sqlite3.connect(sqlite_output_name)
# conn = sqlite_connection.cursor()
cursor = sqlite_connection.cursor()
cursor.execute("DROP TABLE IF EXISTS df;")
cursor.execute("DROP TABLE IF EXISTS stats;")
cursor.execute("DROP TABLE IF EXISTS metadata;")
cursor.execute("DROP TABLE IF EXISTS chrom_genes;")
sqlite_output.commit()

# write each entry from metadata_dict to a new "metadata" table in database
for metadata_line_nb in metadata_dict:
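
Review note: the hunk above replaces `os.remove(sqlite_output_name)`, which raises `FileNotFoundError` when the file does not yet exist, with `DROP TABLE IF EXISTS` statements that are safe on a first run. The hunk opens the connection as `sqlite_output` but takes the cursor from `sqlite_connection`; whether that name is defined elsewhere is not visible here. The sketch below shows the same reset pattern using a single connection object. `reset_tables` and `TABLES` are hypothetical names for illustration; only the four table names come from the diff.

```python
import sqlite3

# Table names taken from the DROP statements in the hunk above.
TABLES = ("df", "stats", "metadata", "chrom_genes")


def reset_tables(connection, tables=TABLES):
    """Drop each listed table if present, leaving the database file in place.

    Hypothetical helper for illustration; not the author's implementation.
    """
    cursor = connection.cursor()
    for table in tables:
        # Identifiers cannot be bound as SQL parameters, so the name is
        # formatted in; it must come from a trusted, fixed tuple like TABLES.
        cursor.execute('DROP TABLE IF EXISTS "{}";'.format(table))
    connection.commit()
```

Because every statement uses `IF EXISTS`, calling the function on a fresh or partially populated database does not raise.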
@@ -754,14 +758,15 @@ def __init__(self):
self.setWindowTitle("Metallaxis")
# initialise progress bar
self.MetallaxisProgress = MetallaxisProgress()
self.MetallaxisProgress.show()

# Setup inital GUI
self.graphicsView.setMaximumHeight(0)

# Center GUI on screen
qt_rectangle = self.frameGeometry()
center_point = QDesktopWidget().availableGeometry().center()
qt_rectangle.moveCenter(center_point)

self.progress_bar(1, "setting up gui")
# buttons on interface
self.open_vcf_button.clicked.connect(self.select_and_parse)
# menus on interface
@@ -851,6 +856,7 @@ def select_and_parse(self, cli_arg=False):
of the database file directly. Accepts no arguments, and returns nothing. This function exists solely to call other
functions, as menu items in PyQt can only call one function.
"""
self.MetallaxisProgress.show()

if not cli_arg:
selected_file = self.select_file()
@@ -1154,7 +1160,8 @@ def write_database_to_interface(self, loaded_database):
self.filter_lineedit.setEnabled(True)
self.filter_box.setEnabled(True)
self.view_variant_btn.setEnabled(True)
self.graphicsView.setMaximumHeight(0)
self.chrom_selection_stat_comboBox.setEnabled(True)
self.chrom_selection_label.setEnabled(True)

# get column numbers for ID, POS, etc.
self.progress_bar(47, "Extracting column data")
@@ -1194,12 +1201,10 @@ def write_database_to_interface(self, loaded_database):
var_counts_value = stats_sql_result['Result'][i]
var_counts[var_counts_key] = var_counts_value

if "ALT_Types" in var_counts:
ALT_Types = eval(var_counts["ALT_Types"])

self.progress_bar(49, "Plotting Statistics")

if "ALT_Types" in var_counts:
ALT_Types = eval(var_counts["ALT_Types"])
# plot piechart of proportions of types of ALT
# get the value for each ALT_Types key in order, per type of Alt so it can be graphed
alt_values_to_plot = []
@@ -1244,8 +1249,8 @@ def write_database_to_interface(self, loaded_database):
plt.ylabel('Number of Variants')
total_figure.tight_layout()
self.stat_plot_layout.addWidget(FigureCanvas(total_figure))
self.chrom_selection_stat_comboBox.addItems(graph_df.index)

self.chrom_selection_stat_comboBox.addItems(graph_df.index)
# setup variants by position graph for first chromosome in list
if list_chromosomes[0]:
self.changed_chrom_stat_combobox(list_chromosomes[0])
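
Review note: the `write_database_to_interface` hunks above call `eval(var_counts["ALT_Types"])` on a value read back from the database, and `eval` will execute any expression stored there. If the stored value is the string form of a plain dict (an assumption; the stored format is not shown in this diff), `ast.literal_eval` from the standard library could parse it without executing code. `parse_alt_types` is a hypothetical helper name used only for this sketch.

```python
import ast


def parse_alt_types(raw_value):
    """Parse a stored dict literal without executing arbitrary code.

    Hypothetical helper; assumes the value is the repr of a plain dict.
    """
    # literal_eval accepts only Python literals (dicts, lists, strings,
    # numbers, ...) and raises ValueError or SyntaxError on anything else.
    parsed = ast.literal_eval(raw_value)
    if not isinstance(parsed, dict):
        raise ValueError(
            "expected a dict literal, got {}".format(type(parsed).__name__)
        )
    return parsed
```

A call such as `parse_alt_types("{'SNP': 3, 'Indel': 1}")` yields an ordinary dict, while a string containing a function call is rejected instead of being run.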
1 change: 0 additions & 1 deletion requirements.txt
@@ -1,7 +1,6 @@
python-magic==0.4.15
pandas==0.23.4
numpy==1.15.4
tables==3.4.4
PyQt5==5.11.2
requests==2.20.1
matplotlib==3.0.2
8 changes: 4 additions & 4 deletions setup.py
@@ -8,22 +8,22 @@

setup(
name="Metallaxis",
version="0.8",
version="0.9",
license='GPLv3',
author="Sean Laidlaw",
author_email="seanlaidlaw95@gmail.com",
url='https://github.com/SL-LAIDLAW/Metallaxis',
description="A graphical python-based VCF viewer with optional VEP annotation",
description="A graphical python-based VCF viewer with optional annotation",
packages=['metallaxis'],
package_data={'':['*.ui']},
package_data={'': ['*.ui', 'annotation/*']},
include_package_data=True,
install_requires=[
'python-magic',
'pandas',
'numpy',
'tables',
'PyQt5',
'requests',
'wget',
'matplotlib'
],
classifiers=[
