added maximum TR length parameter for trf #22

Merged on Nov 28, 2023 (2 commits)


atotickov
Contributor

Dear aaranyue,

Thank you for creating the wonderful quarTeT program! I've been testing it for the past few days and getting excellent results. However, among the parameters currently exposed for trf, I couldn't find the -l parameter, which is crucial for mammals, whose genome assemblies contain a large amount of repetitive sequence. Without this parameter, trf can get stuck and may still not finish after several weeks of running. I took the liberty of adding this parameter myself. If you don't mind, please consider accepting the merge request.

With respect and gratitude,
atotickov.

@Echoring
Collaborator

Dear atotickov,

Thank you so much for your generous suggestion! I had noticed that trf can get stuck for a long time on several chromosomes, but I had not realized that this option could help.
I also made a small adjustment to your code to prevent an error caused by the default value of None. The default is now set to 3 million.
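
For readers following along, here is a minimal sketch of how a None default could be replaced before the value is passed to trf. It is not the code merged in this pull request: the function name, the constant, and the seven numeric arguments (the values commonly suggested in the TRF documentation) are assumptions for illustration, and it assumes that -l is given in millions of bases, so 3 stands for 3 million.

```python
from typing import List, Optional

# Assumed default, in millions of bases (so 3 means 3 million).
DEFAULT_MAX_TR_LENGTH = 3


def build_trf_command(fasta: str, max_tr_length: Optional[int] = None) -> List[str]:
    """Build a trf argument list that always carries an explicit -l value."""
    # Fall back to the default instead of passing None through to the command line.
    if max_tr_length is None:
        max_tr_length = DEFAULT_MAX_TR_LENGTH
    if max_tr_length < 1:
        raise ValueError("max_tr_length must be a positive number of millions")
    # match, mismatch, delta, PM, PI, minscore, maxperiod: illustrative values,
    # not necessarily the ones quarTeT uses.
    scoring = ["2", "7", "7", "80", "10", "50", "500"]
    # -h suppresses HTML output; -l sets the maximum expected TR length.
    return ["trf", fasta, *scoring, "-h", "-l", str(max_tr_length)]


if __name__ == "__main__":
    print(" ".join(build_trf_command("genome.fasta")))
```

The returned list can be handed to subprocess.run directly, which avoids shell quoting problems with unusual file names.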

Thank you again for your help!
Echoring

@Echoring Echoring merged commit 69e3159 into aaranyue:main Nov 28, 2023
@atotickov
Contributor Author

atotickov commented Nov 28, 2023

Dear Echoring,

Thank you for adding the parameter!
I've come across a minor issue when attempting to install quarTeT in a conda environment via pip. The quarTeT repository lacks the setup.py file required for such installation methods. Would you mind adding it? This file won't impact the functionality of quarTeT; it would simply provide an additional installation method.

Example setup.py file:

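__author__ = 'name'

from pathlib import Path
from setuptools import setup, find_packages

# Placeholder names; replace with the real dependencies, if any.
dependencies = ['dependencies_name1', 'dependencies_name2']

# Directory that contains this setup.py.
here = Path(__file__).parent

setup(name='quarTeT',
      version='1.1.6',
      packages=find_packages(),
      author='name',
      author_email='email',
      install_requires=dependencies,
      # Read the README relative to this file; read_text closes the file.
      long_description=(here / 'README.md').read_text(encoding='utf-8'),
      # Install every .py file found under the current directory as a script.
      scripts=list(map(str, sorted(Path('./').rglob("*.py")))))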

I would greatly appreciate it if you could add this file.
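
One detail of the example is that its scripts argument collects every .py file under the current directory, which would include setup.py itself. The sketch below shows one possible way to gather the list while skipping that file. The helper name collect_scripts is hypothetical, and excluding setup.py is an assumption about what is wanted, not something the example does.

```python
from pathlib import Path
from typing import List


def collect_scripts(root: str = ".") -> List[str]:
    """Return all .py files under root as relative paths, skipping setup.py."""
    root_path = Path(root)
    return [
        str(path.relative_to(root_path))
        for path in sorted(root_path.rglob("*.py"))
        # Assumption: the installer script itself should not be installed.
        if path.name != "setup.py"
    ]


if __name__ == "__main__":
    for script in collect_scripts():
        print(script)
```

The result could then be passed as scripts=collect_scripts() in the setup() call.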

Thank you!

Best regards,
atotickov

@Echoring
Collaborator

Dear atotickov,

Thanks for your advice. However, I don't quite see the point of using setup.py.
As I understand it, setup.py is used to install the required Python packages via pip and to copy the executable scripts to $PATH.
But quarTeT uses no Python packages beyond the standard library, and the third-party software it requires cannot be found on pip. So I think setup.py is unnecessary here?
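
Since the external tools cannot be installed by pip, a small check that they are reachable on $PATH can be done with the standard library alone. This is only a sketch: the helper name find_missing_tools is hypothetical, and the tool list in the usage part is an assumption for illustration (only trf is named in this thread).

```python
import shutil
from typing import Iterable, List


def find_missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the tools that cannot be found on the current PATH."""
    # shutil.which returns None when no matching executable is found.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Assumed tool list for illustration; adjust to the modules actually used.
    missing = find_missing_tools(["trf", "minimap2"])
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed tools were found.")
```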

Meanwhile, I tried to create a setup.py like the one below, but I found that quarTeT.egg-info/scripts is not generated automatically after build and install, which results in the error pkg_resources.ResolutionError: Script 'scripts/quartet.py' not found in metadata at ....../quarTeT.egg-info. When I manually create this directory and copy the scripts into it, it works, with the warning DeprecationWarning: pkg_resources is deprecated as an API.

import setuptools
import os
import sys

if sys.version_info.major != 3:
    raise EnvironmentError("quarTeT requires python3, and is not compatible with python2.")

setuptools.setup(
    name="quarTeT",
    version="1.1.6",
    author="Yunzhi Lin",
    author_email="linyunzhi20@gmail.com",
    description="A telomere-to-telomere toolkit for gap-free genome assembly and centromeric repeat identification.",
    long_description=open('README.md').read(),
    url="http://www.atcgn.com:8080/quarTeT/home.html",
    packages=setuptools.find_packages(),
    py_modules=['quartet_util'],
    scripts=['quartet.py', 'quartet_assemblymapper.py', 'quartet_centrominer.py', 'quartet_gapfiller.py', 'quartet_teloexplorer.py'],
)

I would greatly appreciate it if you could point out the error here.

Thank you!

Best regards,
Echoring

@atotickov
Contributor Author

Dear @Echoring,

Since quarTeT is currently unavailable in conda, utilizing it in various pipelines seems challenging from my perspective. As an alternative, installing quarTeT into the conda environment using a .yaml file is feasible, but it cannot be done without setup.py in the GitHub repository.

I haven't encountered this error before, so I can't offer any specific advice. However, I'm also getting the "DeprecationWarning: pkg_resources is deprecated as an API." warning, but it doesn't seem to affect the program's functionality.

As a temporary measure, I added a setup.py file specifying different packages as dependencies to the forked quarTeT branch. This helped me use the program within the conda environment.

Best regards,
atotickov
