Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] "consume_masked" mode for consume_seqfile_with_mask #1867

Merged
merged 10 commits into from
May 29, 2018

Conversation

standage
Copy link
Member

@standage standage commented May 25, 2018

In branch feature/mask-comp I implemented a new complement consume_masked option for the consume_seqfile_with_mask method. By default, the function will not count any k-mers present in the mask, but with complement=True consume_masked=True the function will count only the k-mers present in the mask/filter.

In that branch I also putzed around with some Python code (utils and scripts) to get that functionality working in abundance-dist-single.py.

This PR splits the C++/Cython changes from the Python changes, the latter of which aren't ready for review yet. The C++/Cython functionality is blocking my progress on a feature in kevlar, whereas I don't have a pressing need to bake this in to khmer scripts at the moment.

  • Is any new functionality in tested? (This can be checked with
    make clean diff-cover or the CodeCov report that is automatically
    generated following a successful CI build.)
  • Was a spellchecker run on the source code and documentation after
    changes were made?

@standage standage changed the title Feature/mask comp only [WIP] "Complement" mode for consume_seqfile_with_mask May 25, 2018
@codecov-io

This comment has been minimized.

@standage standage changed the title [WIP] "Complement" mode for consume_seqfile_with_mask [MRG "Complement" mode for consume_seqfile_with_mask May 25, 2018
@standage standage changed the title [MRG "Complement" mode for consume_seqfile_with_mask [MRG] "Complement" mode for consume_seqfile_with_mask May 25, 2018
@standage
Copy link
Member Author

Ready for review and merge! @betatim @camillescott @ctb @luizirber

@ctb
Copy link
Member

ctb commented May 27, 2018

Hi @standage the code looks good but I am not sure why you chose complement here. I can't find a definition of the word that makes sense with the usage here - help?

@standage
Copy link
Member Author

I explained this briefly in the first paragraph of the PR, but I am open to using different terminology for the variable names if it communicates intent more clearly.

The old behavior of consume_seqfile_with_mask is IMO pretty self explanatory: consume all k-mers except those in the mask. The new behavior is an option that complements this behavior, consuming all k-mers that are in the mask.

Let me know what you think regarding terminology.

@ctb
Copy link
Member

ctb commented May 29, 2018

I finally remembered to search for "set complement" and that is the right definition :). Glad I found it! What about consume_masked=True|False (False by default)?

@standage
Copy link
Member Author

What about consume_masked=True|False (False by default)?

Roger that. I like it.

@standage standage changed the title [MRG] "Complement" mode for consume_seqfile_with_mask [MRG] "consume_masked" mode for consume_seqfile_with_mask May 29, 2018
Copy link
Member

@ctb ctb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@ctb ctb merged commit 998ca3c into master May 29, 2018
@standage standage deleted the feature/mask-comp-only branch May 29, 2018 03:08
@standage
Copy link
Member Author

Thank you

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants