Sampling single reads failed #111

Closed
suzuki-shm opened this issue Apr 12, 2018 · 3 comments

@suzuki-shm

Hi,

When I ran the seqtk sample command with 1 as the number of sequences, I did not get a file containing a single read. Instead I got all of the reads (i.e. the whole input sequence file).
This was reproduced with multiple test sequence files and seeds.
The command I used is:

seqtk sample test_reads.fq 1

I ran it both with a CWL-wrapped seqtk and with the raw software inside a Docker container.

@shenwei356

Try

 seqtk sample test_reads.fq 1.1

It works

@tseemann

tseemann commented May 3, 2018

@TaskeHAMANO The number can be a fraction OR an exact number.
You have discovered an ambiguity with the string 1.
It could mean 1 read, or ALL the reads (fraction 1.0).
@shenwei356 has a nice workaround: because a fraction must be between 0 and 1, if you give it 1.1 it must treat it as a number of reads, and it rounds it down to 1.
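
To make the ambiguity concrete, here is a minimal Python sketch of the rule as described in this thread. It is not seqtk's actual source: the function name `interpret_sample_arg` and the exact threshold are assumptions made for illustration only.

```python
def interpret_sample_arg(arg: str):
    """Hypothetical model of the rule described above (not seqtk's code).

    A value strictly greater than 1 is read as a count of reads and
    truncated to an integer; anything else is read as a fraction.
    """
    value = float(arg)
    if value > 1.0:
        return ("count", int(value))   # "1.1" becomes a count of 1
    return ("fraction", value)         # "1" becomes fraction 1.0 (all reads)


print(interpret_sample_arg("1"))     # ('fraction', 1.0)
print(interpret_sample_arg("1.1"))   # ('count', 1)
print(interpret_sample_arg("0.25"))  # ('fraction', 0.25)
```

Under this assumed rule, the string 1 can never mean "one read", which matches the behavior reported in the first comment.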

@lh3 closed this as completed in 8152017 on Jun 17, 2018

@eboyden

eboyden commented Dec 22, 2021

I would actually prefer that 1.0 be treated as a float (returning all reads) rather than an int, whereas 1 would return a single read. I'd recommend enforcing this behavior, so that integers are treated as numbers and floats are treated as fractions. This could also be used to allow oversampling, e.g. 10 would return 10 reads but 10.0 would oversample 10X.
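
The proposal above could be sketched roughly as follows in Python. This is only an illustration of the suggested rule, not anything seqtk implements; the function name `interpret_by_literal` is hypothetical, and the sketch does no validation of negative or non-numeric input beyond what `int` and `float` already do.

```python
def interpret_by_literal(arg: str):
    """Hypothetical parser for the proposal: decide by how the number is written.

    An argument that parses as an integer literal ("1", "10") is a count
    of reads; anything else that parses as a float ("1.0", "0.5", "10.0")
    is a fraction, where values above 1 would mean oversampling.
    """
    try:
        return ("count", int(arg))
    except ValueError:
        # Not an integer literal, so fall back to a fraction.
        return ("fraction", float(arg))


print(interpret_by_literal("1"))     # ('count', 1)
print(interpret_by_literal("1.0"))   # ('fraction', 1.0)
print(interpret_by_literal("10"))    # ('count', 10)
print(interpret_by_literal("10.0"))  # ('fraction', 10.0)
```

The design choice here is that the decision depends on the text of the argument rather than its numeric value, so 1 and 1.0 are no longer interchangeable.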
