target branch for v0.9.1 #368

Merged
merged 56 commits into from
Aug 13, 2023
Commits
ae25d6e
bump version
daler Jun 5, 2022
13ed037
allow plotting of lengths of intervals (#367)
yunfeiguo Jun 5, 2022
e939086
address #354 to support newer bedtools versions
daler Jun 5, 2022
5e7b83c
Merge branch 'v0.9.1' of github.com:daler/pybedtools into v0.9.1
daler Jun 5, 2022
40d4b9b
test much larger numbers of -b files
daler Jun 6, 2022
bca7eac
Merge branch 'master' into v0.9.1
daler Oct 17, 2022
d6db78f
merge dpi changes
daler Oct 17, 2022
300cf1c
add missing test data for narrowpeak
daler Oct 17, 2022
71b4751
update test for many files
daler Oct 17, 2022
49f9afa
version bump and changelog
daler Oct 17, 2022
b3794ae
add rationale for test_issue_365
daler Oct 17, 2022
06a8227
try workaround for github actions ulimit
daler Oct 17, 2022
ad45b36
print ulimit for diagnostics
daler Oct 17, 2022
bfb093b
go up to reported ulimit files; could take a while on CI
daler Oct 17, 2022
c83d48b
only use ulimit
daler Oct 17, 2022
6e60591
whitespace to trigger tests
daler Jul 1, 2023
8bf8f1a
don't test on deprecated python versions
daler Jul 1, 2023
90ea89a
relax number of files tested
daler Jul 1, 2023
57ff09c
use mamba for faster installation
daler Jul 1, 2023
043f184
catch OSError as well
daler Jul 1, 2023
700c706
try using shorter prefix to get under ARG_MAX
daler Jul 1, 2023
3132c82
test more versions of python
daler Jul 1, 2023
96e271d
fix #381
daler Jul 1, 2023
39f3505
fix python versions
daler Jul 1, 2023
b82ebc4
try building for py 3.11, but not using dependencies in bioconda
daler Jul 1, 2023
b15e210
further conditional tests for 3.11
daler Jul 1, 2023
d4e2bda
fix bash if in main.yml
daler Jul 1, 2023
b5271c2
install test deps for 3.11
daler Jul 1, 2023
2a41ae5
now add in 3.11
daler Jul 1, 2023
2143ede
be more careful about python/cython versions when testing
daler Jul 1, 2023
8944712
install matplotlib for 3.11
daler Jul 1, 2023
288f782
more special-casing of py 3.11
daler Jul 1, 2023
7bde715
add pandas to optional requirements
daler Jul 1, 2023
4b386cb
Add minimal pyproject.toml (#386)
afg1 Jul 1, 2023
5350e38
Fixes #381 closes all .tmp files opened by .save_seqs (#382)
PeterRobots Jul 1, 2023
db30a31
Add genome arguments to BedTool.sort() (#380)
mgperry Jul 1, 2023
b14b1b7
update tests for new change in sorting behavior
daler Jul 23, 2023
a120eeb
update changelog
daler Jul 23, 2023
969b770
update more tests to reflect sorting
daler Jul 23, 2023
a79ab73
another test fix
daler Jul 23, 2023
f007fb9
doctest fix
daler Jul 23, 2023
588c76b
cython language level 2 (from #393, thanks @daz10000)
daler Aug 12, 2023
936f10a
fix #390
daler Aug 12, 2023
f8e1319
ensure numpy is installed at cythonization time
daler Aug 12, 2023
abad241
update changelog
daler Aug 12, 2023
cba97bc
allow tests to run from PRs from forks
daler Aug 12, 2023
c26b4d2
keep pandas in requirements, not optional
daler Aug 12, 2023
17be5e7
be more explicit about push/PR actions
daler Aug 12, 2023
a1dfb39
require numpy
daler Aug 12, 2023
dafde27
install directly from tarball that would be uploaded to pypi
daler Aug 12, 2023
e72556f
cd up a directory before trying to import
daler Aug 12, 2023
44c763d
still need to extract tarball for running tests
daler Aug 12, 2023
3618285
activate env
daler Aug 12, 2023
2732542
pip -e
daler Aug 12, 2023
1926078
more messing around with location
daler Aug 13, 2023
a18b34e
more messing around with location
daler Aug 13, 2023
106 changes: 69 additions & 37 deletions .github/workflows/main.yml
@@ -1,11 +1,19 @@
name: main
on: [push]
on:
push:
branches:
- master
pull_request:
types:
- opened
- reopened
- synchronize

jobs:
build-and-test:
strategy:
matrix:
python-version: [3.6, 3.7, 3.8, 3.9]
python-version: ["3.8", "3.9", "3.10", "3.11"]
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
@@ -30,7 +38,7 @@ jobs:
# This only requires Cython, no other dependencies.
run: |
eval "$(conda shell.bash hook)"
conda create -p ./cython-env -y cython
conda create -p ./cython-env -y cython python=${{ matrix.python-version }} numpy
conda activate ./cython-env
python setup.py clean cythonize sdist
(cd dist && pip install pybedtools-*.tar.gz && cd $TMPDIR && python -c 'import pybedtools; print(pybedtools.__file__)')
@@ -53,31 +61,53 @@ jobs:
# Tests below will operate in this newly-installed directory.
run: |
eval "$(conda shell.bash hook)"
conda create -y -p ./test-env \
--channel conda-forge \
--channel bioconda python=${{ matrix.python-version }} \
--file requirements.txt \
--file test-requirements.txt \
--file optional-requirements.txt
conda install mamba python=${{ matrix.python-version }} -y --channel conda-forge

if [ ${{ matrix.python-version }} != "3.11" ]; then
mamba create -y -p ./test-env \
--channel conda-forge \
--channel bioconda python=${{ matrix.python-version }} \
--file requirements.txt \
--file test-requirements.txt \
--file optional-requirements.txt
conda activate ./test-env
else
# Only install bedtools; let pip take care of the rest for 3.11 until
# bioconda catches up.
#
# We still install the test requirements though, and the optional
# requirements except for genomepy which is in bioconda.
grep -v "genomepy" optional-requirements.txt > optional-requirements-3.11.txt
mamba create -y -p ./test-env \
--channel conda-forge \
--channel bioconda \
bedtools \
python=${{ matrix.python-version }} \
--file test-requirements.txt \
--file optional-requirements-3.11.txt
conda activate ./test-env
pip install genomepy

fi
conda activate ./test-env

mkdir -p /tmp/pybedtools-uncompressed
cd /tmp/pybedtools-uncompressed
tar -xf $WORKDIR/dist/pybedtools-*.tar.gz
cd pybedtools-*
pip install -e .
python -c 'import pybedtools; print(pybedtools.__file__)'
ls *
pip install -e /tmp/pybedtools-uncompressed/pybedtools-*

# Trying import in the same directory will complain that cbedtools
# can't be imported
(cd / && python -c 'import pybedtools; print(pybedtools.__file__)')

- name: tests
# Run pytest and sphinx doctests
run: |
eval "$(conda shell.bash hook)"
cd $WORKDIR
eval "$(conda shell.bash hook)"
conda activate ./test-env

# Move to extracted tarball dir, see above notes
# Extract the package tarball built above, and use that for running the tests.
cd /tmp/pybedtools-uncompressed/pybedtools-*
pytest -v --doctest-modules
pytest -v pybedtools/test/genomepy_integration.py
@@ -89,29 +119,31 @@ jobs:
# Build docs and commit to gh-pages branch. Note that no push happens
# unless we're on the master branch
run: |
eval "$(conda shell.bash hook)"
conda activate ./test-env

# Move to extracted tarball dir, see above notes
cd /tmp/pybedtools-uncompressed/pybedtools-*
(cd docs && make html)

git clone \
--single-branch \
--branch gh-pages "https://x-access-token:${{ secrets.GITHUB_TOKEN }}@github.com/$GITHUB_REPOSITORY" \
/tmp/docs

rm -rf /tmp/docs/*
cp -r docs/build/html/* /tmp/docs
touch /tmp/docs/.nojekyll
cd /tmp/docs
git add .
if git diff --cached --quiet; then
echo "no changes, nothing to commit"
else
git commit -m 'update docs'
if [ ${{ matrix.python-version }} != "3.11" ]; then
eval "$(conda shell.bash hook)"
conda activate ./test-env

# Move to extracted tarball dir, see above notes
cd /tmp/pybedtools-uncompressed/pybedtools-*
(cd docs && make html)

git clone \
--single-branch \
--branch gh-pages "https://x-access-token:${{ secrets.GITHUB_TOKEN }}@github.com/$GITHUB_REPOSITORY" \
/tmp/docs

rm -rf /tmp/docs/*
cp -r docs/build/html/* /tmp/docs
touch /tmp/docs/.nojekyll
cd /tmp/docs
git add .
if git diff --cached --quiet; then
echo "no changes, nothing to commit"
else
git commit -m 'update docs'
fi
cd $WORKDIR
fi
cd $WORKDIR


- name: docs artifact
1 change: 1 addition & 0 deletions README.rst
@@ -1,3 +1,4 @@

Overview
--------

20 changes: 20 additions & 0 deletions docs/source/changes.rst
@@ -3,6 +3,26 @@
Changelog
=========

Changes in v0.9.1
-----------------

2023-07-23

* Drop support for Python 3.6 and 3.7
* Respect sorting of chromsizes files (thanks @mgperry)
* Update setup.py to correctly reflect the MIT license change elsewhere (`#374
  <https://github.com/daler/pybedtools/issues/374>`_, thanks @hyandell)
* Support plotting lengths of intervals and custom DPI (`#367
  <https://github.com/daler/pybedtools/issues/367>`_, `#366
  <https://github.com/daler/pybedtools/issues/366>`_, thanks @yunfeiguo)
* Remove outdated hard-coded check for 510 files in ``intersect`` and instead
  defer to local machine's ``ulimit``
* Enable building/installing on Python 3.11 (thanks @daz10000)
* Allow np.int64 start/stop positions to be used when creating Interval objects (`#390 <https://github.com/daler/pybedtools/issues/390>`_)
* Properly close filehandles in ``.save_seqs`` (thanks @PeterRobots)
* Include minimal pyproject.toml file (thanks @afg1)
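
For the ``ulimit`` entry above, here is a minimal sketch of how a process can inspect its own open-file limit from Python instead of relying on a fixed number such as 510. The helper names `open_file_limit` and `fits_under_limit`, and the `reserve` allowance, are hypothetical and not part of pybedtools; the sketch is Unix-only because it uses the standard-library `resource` module.

```python
import resource


def open_file_limit():
    """Return the soft limit on open file descriptors for this process.

    Hypothetical helper, not part of pybedtools. This is the value that
    `ulimit -n` reports in a shell on Unix-like systems.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def fits_under_limit(n_files, reserve=16):
    """Roughly check whether `n_files` files could be open at once.

    `reserve` is an assumed allowance for descriptors already in use
    (stdin, stdout, stderr, and so on); the right value is a guess.
    """
    soft = open_file_limit()
    if soft == resource.RLIM_INFINITY:
        return True
    return n_files + reserve <= soft


print("soft limit on open files:", open_file_limit())
print("510 files fit:", fits_under_limit(510))
```

A check like this is only advisory: the limit that matters is the one in effect for the process that actually opens the files, which may differ from the limit of the calling Python process.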


Changes in v0.9
---------------

12 changes: 5 additions & 7 deletions docs/source/topical-genome.rst
@@ -98,20 +98,18 @@ will create a file from a dictionary or string:
'dm3.genome'
>>> print(open('dm3.genome').read())
chr2L 23011544
chr2LHet 368872
chr2R 21146708
chr2RHet 3288761
chr3L 24543557
chr3LHet 2555491
chr3R 27905053
chr3RHet 2517507
chr4 1351857
chrX 22422827
chr2LHet 368872
chr2RHet 3288761
chr3LHet 2555491
chr3RHet 2517507
chrM 19517
chrU 10049037
chrUextra 29004656
chrX 22422827
chrXHet 204112
chrYHet 347038
<BLANKLINE>


1 change: 1 addition & 0 deletions pybedtools/_Window.pyx
@@ -1,4 +1,5 @@
# cython: profile=True
# cython: language_level=2

import os
from collections import deque
28 changes: 8 additions & 20 deletions pybedtools/bedtool.py
@@ -338,20 +338,6 @@ def wrapped(self, *args, **kwargs):
if check_for_genome:
kwargs = self.check_genome(**kwargs)

# TODO: should this be implemented as a generic function that can
# be passed in for a each tool to check kwargs? Currently this is
# the only check I can think of.
if prog in ("intersect", "intersectBed"):
if (
isinstance(kwargs["b"], list)
and len(kwargs["b"]) > 510
and all([isinstance(i, str) for i in kwargs["b"]])
):
raise pybedtoolsError(
"BEDTools intersect does not support > 510 filenames for -b "
"argument. Consider passing these as BedTool objects instead"
)

# For sequence methods, we may need to make a tempfile that will
# hold the resulting sequence. For example, fastaFromBed needs to
# make a tempfile for 'fo' if no 'fo' was explicitly specified by
@@ -2130,7 +2116,7 @@ def shuffle(self):
"""

@_log_to_history
@_wraps(prog="sortBed", implicit="i")
@_wraps(prog="sortBed", implicit="i", uses_genome=True, genome_if=["g", "genome"])
def sort(self):
"""
Wraps `bedtools sort`.
@@ -2320,8 +2306,8 @@ def complement(self):
chr1 0 1
chr1 500 900
chr1 950 249250621
chr10 0 135534747
chr11 0 135006516
chr2 0 243199373
chr3 0 198022430
"""

@_log_to_history
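
The `genome_if=["g", "genome"]` arguments added to `sort()` earlier in this file let a chromsizes file define the chromosome order. The following is a pure-Python sketch of that idea, not the pybedtools or bedtools implementation: `sort_by_genome` and its tuple-based intervals are hypothetical, and the sketch assumes every chromosome in the intervals appears in `chrom_order`.

```python
def sort_by_genome(intervals, chrom_order):
    """Sort (chrom, start, stop) tuples by a caller-supplied chromosome order.

    Hypothetical helper for illustration only. Raises KeyError if an
    interval uses a chromosome that is missing from `chrom_order`.
    """
    rank = {chrom: i for i, chrom in enumerate(chrom_order)}
    return sorted(intervals, key=lambda iv: (rank[iv[0]], iv[1], iv[2]))


chrom_order = ["chr1", "chr2", "chr10"]
intervals = [
    ("chr10", 5, 10),
    ("chr2", 1, 5),
    ("chr1", 100, 200),
    ("chr1", 1, 50),
]

# Plain sorting is lexicographic, so chr10 lands before chr2.
print(sorted(intervals))

# Sorting by the genome order puts chr2 before chr10.
print(sort_by_genome(intervals, chrom_order))
```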
@@ -2726,9 +2712,11 @@ def save_seqs(self, fn):

if not hasattr(self, "seqfn"):
raise ValueError("Use .sequence(fasta) to get the sequence first")
fout = open(fn, "w")
fout.write(open(self.seqfn).read())
fout.close()

with open(fn, "w") as fout:
with open(self.seqfn) as seqfile:
fout.write(seqfile.read())

new_bedtool = BedTool(self.fn)
new_bedtool.seqfn = fn
return new_bedtool
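
The `save_seqs` change above closes both handles by using `with` blocks, but it still reads the whole sequence file into memory. As a possible alternative (not what this PR does), the copy could be streamed in chunks with the standard library. `copy_text_file` is a hypothetical helper name.

```python
import os
import shutil
import tempfile


def copy_text_file(src, dst):
    """Copy `src` to `dst` in chunks; both handles are closed on exit,
    even if an exception is raised partway through."""
    with open(src) as fin, open(dst, "w") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


# Usage: copy a small FASTA-like file inside a temporary directory.
with tempfile.TemporaryDirectory() as tmpdir:
    src = os.path.join(tmpdir, "seqs.fa")
    dst = os.path.join(tmpdir, "copy.fa")
    with open(src, "w") as fh:
        fh.write(">chr1:1-10\nACGTACGTAC\n")
    copy_text_file(src, dst)
    with open(dst) as fh:
        copied = fh.read()

print(copied)
```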
8 changes: 5 additions & 3 deletions pybedtools/cbedtools.pyx
@@ -1,4 +1,5 @@
# distutils: language = c++
# cython: language_level=2

# String notes:
#
@@ -15,6 +16,7 @@

from cpython.version cimport PY_MAJOR_VERSION
from libcpp.string cimport string
import numpy as np

# Python byte strings automatically coerce to/from C++ strings.

@@ -23,7 +25,7 @@ cdef _cppstr(s):
#
# C++ uses bytestrings. PY2 strings need no conversion; bare PY3 strings
# are unicode and so must be encoded to bytestring.
if isinstance(s, int):
if isinstance(s, integer_types):
s = str(s)
if isinstance(s, unicode):
s = s.encode('UTF-8')
@@ -36,9 +38,9 @@ cdef _pystr(string s):
return s.decode('UTF-8', 'strict')

if PY_MAJOR_VERSION < 3:
integer_types = (int, long)
integer_types = (int, long, np.int64)
else:
integer_types = (int,)
integer_types = (int, np.int64)

"""
bedtools.pyx: A Cython wrapper for the BEDTools BedFile class
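
The `integer_types` change above lists `np.int64` explicitly. One possible alternative, shown here as a plain Python 3 sketch rather than the Cython code in this file, is to test against the abstract `numbers.Integral` class, which NumPy's integer scalar types are registered with as far as I know. `to_bytes_field` is a hypothetical stand-in for `_cppstr`, and the example avoids importing NumPy so that it runs with the standard library alone.

```python
import numbers


def to_bytes_field(value):
    """Convert an integer or str field to UTF-8 bytes.

    Hypothetical stand-in for _cppstr. Anything registered as
    numbers.Integral is accepted; note that this includes bool.
    """
    if isinstance(value, numbers.Integral):
        value = str(int(value))
    if isinstance(value, str):
        value = value.encode("UTF-8")
    return value


print(to_bytes_field(100))
print(to_bytes_field("chr1"))
print(to_bytes_field(b"already-bytes"))
```

The trade-off is breadth: `numbers.Integral` would also admit other integer-like types such as `np.int32`, which may or may not be desirable here.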
1 change: 1 addition & 0 deletions pybedtools/featurefuncs.pyx
@@ -1,3 +1,4 @@
# cython: language_level=2
# distutils: language = c++
from cbedtools cimport Interval
from cbedtools import create_interval_from_list
2 changes: 1 addition & 1 deletion pybedtools/helpers.py
@@ -815,7 +815,7 @@ def chromsizes_to_file(chrom_sizes, fn=None):
if isinstance(chrom_sizes, str):
chrom_sizes = chromsizes(chrom_sizes)
fout = open(fn, "wt")
for chrom, bounds in sorted(chrom_sizes.items()):
for chrom, bounds in chrom_sizes.items():
line = chrom + "\t" + str(bounds[1]) + "\n"
fout.write(line)
fout.close()
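
Dropping `sorted()` above means rows are written in the dictionary's own order, which in Python 3.7+ is insertion order. This small sketch shows the difference; `write_chromsizes` is a hypothetical stand-in for `chromsizes_to_file`, and the three chromosome sizes are taken from the dm3 example in the docs diff above.

```python
import os
import tempfile


def write_chromsizes(chrom_sizes, fn):
    """Write one 'chrom<TAB>stop' line per entry, in dict insertion order.

    Hypothetical stand-in for chromsizes_to_file, using a context
    manager so the handle is always closed.
    """
    with open(fn, "wt") as fout:
        for chrom, bounds in chrom_sizes.items():
            fout.write(chrom + "\t" + str(bounds[1]) + "\n")
    return fn


sizes = {
    "chr2L": (0, 23011544),
    "chrX": (0, 22422827),
    "chr2LHet": (0, 368872),
}

fd, path = tempfile.mkstemp(suffix=".genome")
os.close(fd)
write_chromsizes(sizes, path)
with open(path) as fin:
    written = [line.split("\t")[0] for line in fin]
os.remove(path)

print(written)        # new behavior: insertion order is kept
print(sorted(sizes))  # old behavior: lexicographic order
```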
25 changes: 15 additions & 10 deletions pybedtools/scripts/venn_mpl.py
@@ -17,8 +17,7 @@
import os
import pybedtools


def venn_mpl(a, b, c, colors=None, outfn="out.png", labels=None, dpi=300):
def venn_mpl(a, b, c, colors=None, outfn="out.png", labels=None, by_length=False, dpi=300):
"""
*a*, *b*, and *c* are filenames to BED-like files.

@@ -30,7 +29,10 @@ def venn_mpl(a, b, c, colors=None, outfn="out.png", labels=None, dpi=300):

*labels* is a list of labels to use for each of the files; by default the
labels are ['a','b','c']


*by_length* if True, then instead of plotting number of intervals, plot combined
lengths of intervals

*dpi* is the dpi setting passed to matplotlib savefig
"""
try:
@@ -46,6 +48,9 @@ def venn_mpl(a, b, c, colors=None, outfn="out.png", labels=None, dpi=300):
a = pybedtools.BedTool(a)
b = pybedtools.BedTool(b)
c = pybedtools.BedTool(c)
count_features = lambda x:x.count()
if by_length:
count_features = lambda x:x.total_coverage()

if colors is None:
colors = ["r", "b", "g"]
@@ -91,31 +96,31 @@ def venn_mpl(a, b, c, colors=None, outfn="out.png", labels=None, dpi=300):
kwargs = dict(horizontalalignment="center")

# Unique to A
ax.text(center - 2 * offset, center + offset, str((a - b - c).count()), **kwargs)
ax.text(center - 2 * offset, center + offset, str(count_features(a - b - c)), **kwargs)

# Unique to B
ax.text(center + 2 * offset, center + offset, str((b - a - c).count()), **kwargs)
ax.text(center + 2 * offset, center + offset, str(count_features(b - a - c)), **kwargs)

# Unique to C
ax.text(center, center - 2 * offset, str((c - a - b).count()), **kwargs)
ax.text(center, center - 2 * offset, str(count_features(c - a - b)), **kwargs)

# A and B not C
ax.text(
center, center + 2 * offset - 0.5 * offset, str((a + b - c).count()), **kwargs
center, center + 2 * offset - 0.5 * offset, str(count_features(a + b - c)), **kwargs
)

# A and C not B
ax.text(
center - 1.2 * offset, center - 0.5 * offset, str((a + c - b).count()), **kwargs
center - 1.2 * offset, center - 0.5 * offset, str(count_features(a + c - b)), **kwargs
)

# B and C not A
ax.text(
center + 1.2 * offset, center - 0.5 * offset, str((b + c - a).count()), **kwargs
center + 1.2 * offset, center - 0.5 * offset, str(count_features(b + c - a)), **kwargs
)

# all
ax.text(center, center, str((a + b + c).count()), **kwargs)
ax.text(center, center, str(count_features(a + b + c)), **kwargs)

ax.legend(loc="best")
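
To make the `by_length` option concrete, here is a pure-Python sketch of the two measures it switches between: the number of intervals and the combined length covered. It assumes, as I understand `BedTool.total_coverage()` to behave, that overlapping bases are counted once. The functions operate on plain `(chrom, start, stop)` tuples and are illustrations, not pybedtools code.

```python
def count_intervals(intervals):
    """Number of intervals, analogous to calling .count()."""
    return len(intervals)


def total_coverage(intervals):
    """Number of bases covered by at least one interval.

    Overlapping or nested intervals on the same chromosome are only
    counted once (an assumption about the intended semantics).
    """
    covered = 0
    prev_chrom, prev_end = None, 0
    for chrom, start, stop in sorted(intervals):
        if chrom != prev_chrom:
            prev_chrom, prev_end = chrom, 0
        # Skip the part of this interval that is already covered.
        start = max(start, prev_end)
        if stop > start:
            covered += stop - start
            prev_end = stop
    return covered


def pick_measure(by_length=False):
    """Choose the measuring function once, as venn_mpl does."""
    return total_coverage if by_length else count_intervals


intervals = [("chr1", 1, 100), ("chr1", 50, 150), ("chr2", 10, 20)]
print(pick_measure(by_length=False)(intervals))  # 3
print(pick_measure(by_length=True)(intervals))   # 159
```

Selecting a named function up front keeps each `ax.text(...)` call identical regardless of the mode, which is the same design the diff uses with its `count_features` lambdas.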

5 changes: 5 additions & 0 deletions pybedtools/test/data/example.narrowPeak
@@ -0,0 +1,5 @@
track type=narrowPeak visibility=3 db=hg19 name="nPk" description="ENCODE narrowPeak Example"
browser position chr1:9356000-9365000
chr1 9356548 9356648 . 0 . 182 5.0945 -1 50
chr1 9358722 9358822 . 0 . 91 4.6052 -1 40
chr1 9361082 9361182 . 0 . 182 9.2103 -1 75