gds2bgen: Format Conversion from BGEN to GDS

GNU General Public License, GPLv3

Description

This package provides functions for converting BGEN files to SeqArray GDS files.

Version

v0.9.0

Package Maintainer

Dr. Xiuwen Zheng (zhengxwen@gmail.com)

Installation

Requires R (≥ v3.5.0), gdsfmt (≥ v1.20.0), SeqArray (≥ v1.24.0)

Installation from GitHub:

library("devtools")
install_github("zhengxwen/gds2bgen")

install_github() builds the package from source, so make and compilers must be installed on your system (see the R FAQ for your operating system); you may also need to install dependencies manually.
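Before building from source, it can help to confirm that the build tools are on your PATH. The following is a minimal sketch, not part of the package: the helper name `check_tool` is hypothetical, and the tool list (R, make, gcc, g++) is an assumption that may differ on your platform (for example, clang on macOS).

```shell
#!/bin/sh
# check_tool is a hypothetical helper, not part of gds2bgen.
# It prints whether a command is available on PATH and
# returns non-zero when the command is missing.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Assumed tool list; adjust it to the compilers your R installation uses.
missing=0
for tool in R make gcc g++; do
    check_tool "$tool" || missing=$((missing + 1))
done
echo "$missing tool(s) missing"
```

If any tool is reported missing, install it as described in the R FAQ for your operating system before running install_github() or R CMD INSTALL.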

Or manually install the package:

git clone https://github.com/zhengxwen/gds2bgen
cd gds2bgen/src
tar -vxzf gavinband-bgen-0b7a2803adb5.tar.gz
cd gavinband-bgen-0b7a2803adb5
./waf configure
./waf
cp build/libbgen.a ..
cp build/3rd_party/zstd-1.1.0/libzstd.a ..
rm -rf build
cd ../../..
R CMD INSTALL gds2bgen

Copyright Notice

This package includes the sources of the bgen library written by Gavin Band and Jonathan Marchini (https://bitbucket.org/gavinband/bgen), Boost (the C++ libraries, https://www.boost.org) and Zstandard (https://zstd.net).

Citations for GDS

Zheng X, Levine D, Shen J, Gogarten SM, Laurie C, Weir BS (2012). A High-performance Computing Toolset for Relatedness and Principal Component Analysis of SNP Data. Bioinformatics. DOI: 10.1093/bioinformatics/bts606.

Zheng X, Gogarten S, Lawrence M, Stilp A, Conomos M, Weir BS, Laurie C, Levine D (2017). SeqArray -- A storage-efficient high-performance data format for WGS variant calls. Bioinformatics. DOI: 10.1093/bioinformatics/btx145.

Examples

library(gds2bgen)

bgen_fn <- system.file("extdata", "example.8bits.bgen", package="gds2bgen")
# or bgen_fn <- "your_bgen_file.bgen"
seqBGEN_Info(bgen_fn)

## bgen file: gds2bgen/extdata/example.8bits.bgen
## # of samples: 500
## # of variants: 199
## compression method: zlib
## layout version: v1.2
## sample id: sample_001, sample_002, sample_003, sample_004, ...


# example.8bits.bgen ==> example.gds, using 4 cores
seqBGEN2GDS(bgen_fn, "example.gds",
    storage.option="LZMA_RA",  # compression option, e.g., ZIP_RA for zlib or LZ4_RA for LZ4
    float.type="packed8",      # 8-bit packed real numbers
    geno=FALSE,     # 2-bit integer genotypes, stored in 'genotype/data'
    dosage=TRUE,    # numeric alternative allele dosages, stored in 'annotation/format/DS'
    prob=FALSE,     # numeric genotype probabilities, stored in 'annotation/format/GP'
    parallel=4      # the number of cores
)


# show file structure
library(SeqArray)
(f <- seqOpen("example.gds"))
seqClose(f)

## File: example.gds (137.7K)
## +    [  ] *
## |--+ description   [  ] *
## |--+ sample.id   { Str8 500 LZMA_ra(7.02%), 393B } *
## |--+ variant.id   { Int32 199 LZMA_ra(33.9%), 277B } *
## |--+ position   { Int32 199 LZMA_ra(60.6%), 489B } *
## |--+ chromosome   { Str8 199 LZMA_ra(15.7%), 101B } *
## |--+ allele   { Str8 199 LZMA_ra(11.8%), 101B } *
## |--+ genotype   [  ] *
## |  |--+ data   { Bit2 2x500x0 LZMA_ra, 18B } *
## |  |--+ extra.index   { Int32 3x0 LZMA_ra, 18B } *
## |  \--+ extra   { Int16 0 LZMA_ra, 18B }
## |--+ phase   [  ]
## |  |--+ data   { Bit1 500x0 LZMA_ra, 18B } *
## |  |--+ extra.index   { Int32 3x0 LZMA_ra, 18B } *
## |  \--+ extra   { Bit1 0 LZMA_ra, 18B }
## |--+ annotation   [  ]
## |  |--+ id   { Str8 199 LZMA_ra(18.6%), 321B } *
## |  |--+ qual   { Float32 199 LZMA_ra(11.8%), 101B } *
## |  |--+ filter   { Int32 199 LZMA_ra(11.3%), 97B } *
## |  |--+ info   [  ]
## |  \--+ format   [  ]
## |     |--+ DS   [  ] *
## |     |  \--+ data   { PackedReal8U 500x199 LZMA_ra(55.6%), 54.0K } *
## |     \--+ GP   [  ] *
## |        \--+ data   { PackedReal8U 500x398 LZMA_ra(38.8%), 75.3K } *
## \--+ sample.annotation   [  ]

Also See

seqVCF2GDS() in the SeqArray package, conversion from VCF files to GDS files.

seqBED2GDS() in the SeqArray package, conversion from PLINK BED files to GDS files.
