MRG: add --set-name to sig intersect and sig subtract #3162

Merged: 26 commits, Jun 4, 2024
333832b
support more general sig loading in sig overlap
ctb May 12, 2024
9a45829
fix up subtract as well
ctb May 12, 2024
6da2d3a
add sourmash_args.load_one_signature
ctb May 12, 2024
0082c45
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 12, 2024
093ed65
fix one more call to load_one_signature
ctb May 13, 2024
6d7ac35
add tests
ctb May 13, 2024
9a6a61c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 13, 2024
76aaafe
rename load/save fns
ctb May 13, 2024
64d2786
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 13, 2024
fb28a1b
rename to use _json names
ctb May 13, 2024
dcf5920
ignore zipfile UserWarnings about dup file names
ctb May 13, 2024
266d78a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 13, 2024
35dc642
Merge branch 'latest' of github.com:sourmash-bio/sourmash into upgrad…
ctb May 14, 2024
e3e7d28
Merge branch 'upgrade_sig_cmds' into rename_sig_json_fns
ctb May 14, 2024
1501704
stop unnecessary renaming ;)
ctb May 14, 2024
066d984
add --name to sig intersect and sig subtract
ctb May 14, 2024
db1114d
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 14, 2024
053662b
add name output tests to filter, flatten, downsample
ctb May 15, 2024
5070bee
Merge branch 'latest' of github.com:sourmash-bio/sourmash into provid…
ctb May 15, 2024
0791c5d
use --set-name
ctb May 15, 2024
ba3ea49
Merge branch 'provide_name_on_sig' of github.com:sourmash-bio/sourmas…
ctb May 15, 2024
23230e4
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 15, 2024
0ce0d8d
update docs for --set-name on intersect, subtract, and merge
ctb May 24, 2024
94406d8
add/prioritize --set-name for sketch dna/protein/translate
ctb May 24, 2024
521bb80
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 24, 2024
d3caff1
Merge branch 'latest' of github.com:sourmash-bio/sourmash into provid…
ctb Jun 4, 2024
support more general sig loading in sig overlap
ctb committed May 12, 2024
commit 333832bacd3d3e40f291619017200c17bdb14a89
15 changes: 13 additions & 2 deletions src/sourmash/sig/__main__.py
@@ -384,12 +384,23 @@ def overlap(args):

     moltype = sourmash_args.calculate_moltype(args)

-    sig1 = sourmash.load_one_signature(
+    sig1 = sourmash.load_file_as_signatures(
         args.signature1, ksize=args.ksize, select_moltype=moltype
     )
-    sig2 = sourmash.load_one_signature(
+    sig1 = list(sig1)
+    if len(sig1) != 1:
+        notify(f"ERROR: 'sig overlap' needs exactly one signature per file; found {len(sig1)} in '{args.signature1}'")
+        sys.exit(-1)
+    sig2 = sourmash.load_file_as_signatures(
         args.signature2, ksize=args.ksize, select_moltype=moltype
     )
+    sig2 = list(sig2)
+    if len(sig2) != 1:
+        notify(f"ERROR: 'sig overlap' needs exactly one signature per file; found {len(sig2)} in '{args.signature2}'")
+        sys.exit(-1)
+
+    sig1 = sig1[0]
+    sig2 = sig2[0]

     notify(f"loaded one signature each from {args.signature1} and {args.signature2}")
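The load-then-check sequence above is written out twice, once per input file. A later commit in this PR (6da2d3a) adds sourmash_args.load_one_signature, presumably to factor this out. The following is only a minimal sketch of what such a helper could look like, not the actual sourmash implementation: the names `require_exactly_one` and `NotExactlyOneError` are hypothetical, and the sketch works on any iterable so that it runs without sourmash installed.

```python
from typing import Iterable, TypeVar

T = TypeVar("T")


class NotExactlyOneError(ValueError):
    """Hypothetical error: a source did not yield exactly one item."""


def require_exactly_one(
    items: Iterable[T], source: str, command: str = "sig overlap"
) -> T:
    """Return the only item in `items`; raise if there are zero or several.

    `items` stands in for the result of a loader such as
    sourmash.load_file_as_signatures(...); `source` is the filename
    used in the error message.
    """
    found = list(items)  # materialize: the loader may return a generator
    if len(found) != 1:
        raise NotExactlyOneError(
            f"'{command}' needs exactly one signature per file; "
            f"found {len(found)} in '{source}'"
        )
    return found[0]
```

With a helper of this shape, overlap() would call it once per input and handle the exception in one place (for example by passing the message to notify() and calling sys.exit(-1)), which keeps the error text identical for both files.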

22 changes: 20 additions & 2 deletions tests/test_cmd_signature.py
@@ -3839,8 +3839,8 @@ def test_sig_describe_3_manifest_fails_when_moved(runtmp):
runtmp.sourmash("sig", "describe", "mf.csv")


-@utils.in_tempdir
-def test_sig_overlap(c):
+def test_sig_overlap(runtmp):
+    c = runtmp
     # get overlap details
     sig47 = utils.get_test_data("47.fa.sig")
     sig63 = utils.get_test_data("63.fa.sig")
@@ -3857,6 +3857,24 @@ def test_sig_overlap(c):
     assert "number of hashes in common: 2529" in out


+def test_sig_overlap_2(runtmp):
+    c = runtmp
+    # get overlap details
+    sig47 = utils.get_test_data("47.fa.sig.zip")
+    sig63 = utils.get_test_data("63.fa.sig.zip")
+    c.run_sourmash("sig", "overlap", sig47, sig63)
+    out = c.last_result.out
+
+    print(out)
+
+    # md5s
+    assert "09a08691ce52952152f0e866a59f6261" in out
+    assert "38729c6374925585db28916b82a6f513" in out
+
+    assert "similarity: 0.32069" in out
+    assert "number of hashes in common: 2529" in out
+
+
 @utils.in_tempdir
 def test_import_export_1(c):
     # check to make sure we can import what we've exported!