CurFu is an interactive curation tool for describing and representing gene fusions in a computable manner. It's developed to support the VICC Fusion Guidelines project.
Clone the repo:
git clone https://github.com/cancervariants/fusion-curation
cd fusion-curation
Ensure that the following data sources are available:
- the VICC Gene Normalization database, accessible from a DynamoDB-compliant service. Set the endpoint address with environment variable GENE_NORM_DB_URL; default value is http://localhost:8000.
- the Biocommons SeqRepo database, used by Cool-Seq-Tool. The precise file location is configurable via the SEQREPO_ROOT_DIR variable, per the documentation.
- the Biocommons Universal Transcript Archive, by way of Genomic Med Lab's Cool Seq Tool package. Connection parameters to the Postgres database are set most easily as a Libpq-compliant URL under the environment variable UTA_DB_URL.
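As a quick sanity check before starting the server, the three environment variables above can be inspected with a small script. This is only a sketch: the helper name resolve_data_source_settings is hypothetical and not part of CurFu, and only the GENE_NORM_DB_URL default is taken from this README. The other two variables are assumed to have no default here; the underlying libraries may supply their own.

```python
import os

# The GENE_NORM_DB_URL default is the one stated in this README. No default is
# stated here for the other two variables, so they are treated as unset (None).
DEFAULTS = {
    "GENE_NORM_DB_URL": "http://localhost:8000",
    "SEQREPO_ROOT_DIR": None,
    "UTA_DB_URL": None,
}


def resolve_data_source_settings(env=None):
    """Return (settings, missing) for the data source variables.

    `settings` maps each variable name to its value or assumed default;
    `missing` lists the variables that have neither.
    """
    env = os.environ if env is None else env
    settings = {}
    missing = []
    for name, default in DEFAULTS.items():
        # Treat an empty string the same as an unset variable.
        value = env.get(name) or default
        settings[name] = value
        if value is None:
            missing.append(name)
    return settings, missing


if __name__ == "__main__":
    settings, missing = resolve_data_source_settings()
    for name, value in settings.items():
        print(f"{name} = {value if value is not None else '(unset)'}")
    if missing:
        print("Unset (library defaults, if any, will apply):", ", ".join(missing))
```

Run it from the same shell in which you intend to start the server, so that it sees the same environment.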
Create a virtual environment for the server and install. Note: there's also a Pipfile, so you can skip the virtualenv steps if you'd rather use a Pipenv instance instead of virtualenv/venv. I have been sticking with the latter because Pipenv doesn't play well with entry points in development, but if you aren't editing them in setup.cfg, then the former should be fine.
cd server # regardless of your environment decision, build it in server/
virtualenv venv
source venv/bin/activate
python3 -m pip install -e ".[dev,tests]" # make sure to include the extra dependencies!
Acquire two sets of static assets and place all of them within the server/src/curfu/data directory:
- Gene autocomplete files, providing legal gene search terms to the client autocomplete component. One file each is used for entity types aliases, assoc_with, xrefs, prev_symbols, labels, and symbols. Each should be named according to the pattern gene_<type>_<YYYYMMDD>.tsv. These can be regenerated with the shell command curfu_devtools genes.
- Domain lookup file, for use in providing possible functional domains for user-selected genes in the client. This should be named according to the pattern domain_lookup_YYYYMMDD.tsv. It can be regenerated with the shell command curfu_devtools domains, although this is an extremely time- and storage-intensive process.
Your data/ directory should look something like this:
server/src/curfu/data
├── domain_lookup_2022-01-20.tsv
├── gene_aliases_suggest_20211025.tsv
├── gene_assoc_with_suggest_20211025.tsv
├── gene_labels_suggest_20211025.tsv
├── gene_prev_symbols_suggest_20211025.tsv
├── gene_symbols_suggest_20211025.tsv
└── gene_xrefs_suggest_20211025.tsv
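To confirm that the data directory contains one file per expected asset, a check along the following lines may help. It is a sketch under stated assumptions: find_missing_assets is a hypothetical helper, not a CurFu command, and it matches file names loosely (gene_<type>_*.tsv and domain_lookup_*.tsv) because the listing above and the stated naming patterns differ slightly in their suffixes and date formats.

```python
from pathlib import Path

# Entity types listed in this README for the gene autocomplete files.
GENE_TYPES = ("aliases", "assoc_with", "xrefs", "prev_symbols", "labels", "symbols")


def find_missing_assets(data_dir):
    """Return the glob patterns for which no file exists in data_dir."""
    data_dir = Path(data_dir)
    missing = []
    for gene_type in GENE_TYPES:
        pattern = f"gene_{gene_type}_*.tsv"
        # any() is True as soon as the glob yields at least one path.
        if not any(data_dir.glob(pattern)):
            missing.append(pattern)
    if not any(data_dir.glob("domain_lookup_*.tsv")):
        missing.append("domain_lookup_*.tsv")
    return missing


if __name__ == "__main__":
    # Path taken from this README; adjust if you run from another directory.
    missing = find_missing_assets("server/src/curfu/data")
    if missing:
        print("No files found for:", ", ".join(missing))
    else:
        print("All expected static assets are present.")
```

The script only checks that matching files exist; it does not validate their contents or dates.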
Finally, start the backend service with curfu, by default on port 5000. To use a different port, pass the number to the -p option:
curfu -p 5000
In another shell, navigate to the repo's client/ directory and install frontend dependencies:
cd client
yarn install
If you get the following error:
error api@3.4.2: The engine "node" is incompatible with this module. Expected version "^12 || ^14 || ^16". Got "18.0.0"
error Found incompatible module.
info Visit https://yarnpkg.com/en/docs/cli/install for documentation about this command.
You can run:
yarn install --ignore-engines
Then start the development server:
yarn start
The frontend utilizes TypeScript definitions generated from the backend pydantic schema. These can be refreshed, from the server environment, with the command curfu_devtools client-types. This will only work if json2ts has been installed in the client's node_modules binary directory.
Python code style is enforced by flake8 and Black, and frontend style is enforced by ESLint and Prettier. Conformance is ensured by pre-commit. Before your first commit, run
pre-commit install
This will require installation of dev dependencies on the server side.
In practice, Prettier and Black will do most of the formatting work for you to be in accordance with ESLint and flake8. In the backend, run python3 -m black path/to/file, and in the frontend, run yarn run prettier --write path/to/file to autoformat a file.
Backend tests require installation of tests dependencies. Run with pytest.
requirements.txt is used by Elastic Beanstalk to install the dependencies. Anytime you update package requirements, be sure to update requirements.txt. To regenerate it, run the command below from the server directory (ensure you have activated the venv):
pip freeze --exclude-editable > ../requirements.txt